Everybody is talking about big data today. It is probably one of the most overhyped and confusing terms around. It shows up everywhere and means different things depending on who you are talking to. It can be data gathered from mobile devices, traffic data, or activity data from social media and social networks. The expectation is that the size of big data will go through the roof. Read the Forbes article Extreme Big Data: Beyond Zettabytes And Yottabytes. The main point of the article: we produce data faster than we can invent names for its scale. Here is the scale we are more or less familiar with: TB terabyte, PB petabyte, EB exabyte, ZB zettabyte, YB yottabyte…
However, the article also introduces some interesting lingo for data sizes. Here are some examples: Hellabytes (a hell of a lot of bytes), Ninabytes, Tenabytes, etc. Wikipedia provides a different option for extending the prefix system: zetta, yotta, xona, weka, vunda, uda, treda, sorta, rinta, quexa, pepta, ocha, nena, minga, luma, … Another interesting comparison came from an itknowledgeexchange article. Navigate here to read more. Here is my favorite passage. The last comparison, to Facebook, is the most impressive.
Beyond what-do-we-call-it, we also have the obligatory how-to-put-it-in-terms-we-puny-humans-can-understand discussion, aka the Flurry of Analogies that came up when IBM announced a 120-petabyte hard drive a year ago. Depending on where you read about it, that drive was: 2.4 million Blu-ray disks; 24 million HD movies; 24 billion MP3s; 6,000 Libraries of Congress (a standard unit of data measure); Almost as much data as Google processes every week; Or, four Facebooks.
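To get a feel for these analogies, here is a minimal Python sketch that divides the 120-petabyte figure by each count quoted above and prints the unit size each analogy implies. It assumes decimal (SI) prefixes, where each step is a factor of 1,000. The helper names (`to_bytes`, `humanize`, `implied_unit_size`) are made up for this illustration, and the per-unit sizes it prints are simply what the quoted numbers imply, not official figures.

```python
# Decimal (SI) byte prefixes, matching the TB/PB/EB/ZB/YB scale.
# Assumption: each step is a factor of 1000 (not 1024).
SI_PREFIXES = ["", "k", "M", "G", "T", "P", "E", "Z", "Y"]


def to_bytes(value, prefix):
    """Convert a value with an SI prefix (e.g. 120, "P") to bytes."""
    return value * 1000 ** SI_PREFIXES.index(prefix)


def humanize(num_bytes):
    """Format a byte count using the largest SI prefix that fits."""
    value = float(num_bytes)
    idx = 0
    while value >= 1000 and idx < len(SI_PREFIXES) - 1:
        value /= 1000
        idx += 1
    return f"{value:g} {SI_PREFIXES[idx]}B"


# The drive size and the counts come from the quoted passage.
DRIVE_BYTES = to_bytes(120, "P")

ANALOGIES = {
    "Blu-ray disk": 2_400_000,
    "HD movie": 24_000_000,
    "MP3": 24_000_000_000,
    "Library of Congress": 6_000,
    "Facebook": 4,
}


def implied_unit_size(count):
    """Size of one unit if `count` of them add up to the whole drive."""
    return DRIVE_BYTES / count


for name, count in ANALOGIES.items():
    print(f"one {name}: about {humanize(implied_unit_size(count))}")
```

Under these assumptions the quoted counts work out to roughly 50 GB per Blu-ray disk, 5 GB per HD movie, 5 MB per MP3, 20 TB per Library of Congress and 30 PB per Facebook.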
The Forbes article made me think about the sizes of PLM data, engineering data and design data. It is not unusual to hear CAD data and/or design data described as something very big. Talk to any engineering IT manager and you will hear about oversized CAD files in libraries. Large enterprise companies (especially in regulated industries) are concerned about how to store data for 40-50 years, what format to use, how much space it will take and how it can remain accessible. At the same time, I've seen complete libraries of CAD components, together with all the design data of a mid-size company, backed up on a simple 1TB USB drive. I believe software such as simulation tools can produce lots of data, but today this data is not controlled and simply gets lost on desktops. One of the most popular requirements engineers had for PDM was the ability to delete old revisions. PLM repositories for Items and Bills of Materials can reach a certain size, but I can hardly see how they compete with the media libraries of Google and Facebook. At the same time, engineering is just about to start exploring the richness of online data and the internet of things. So, the size of engineering repositories will only grow.
What is my conclusion? Compared to the scale of Google, Twitter and Facebook, the majority of engineering repositories today are modest in size. After all, even very large CAD files can hardly compete with the volume of photos and video streams uploaded by a billion people on social networks. Also, the tracking data captured from mobile devices dwarfs any possible set of Engineering Change Order (ECO) records. However, engineering data has the potential to become big. An increased interest in simulation and analysis, as well as in design options, can bump up the size of engineering data significantly. Another potential source of information is the growing ability to capture customer interests and requirements, as well as product behavior, online. Just my thoughts. So, how fast will PLM grow to yottabytes? What is your take?