In today’s piece, we’ll focus all our attention on some of the most mind-boggling big data statistics. For anyone who’s new to the concept of big data, TechJury has prepared a brief intro on the topic.
Big data refers to enormous data sets gathered from numerous sources. These data sets cannot be collected, stored, or processed using any of the existing conventional tools due to their quantity and complexity.
So, there is a variety of tools used to analyze big data – NoSQL databases, Hadoop, and Spark, to name a few. With the help of big data analytics tools, we can gather many different types of data from a wide range of sources – digital media, web services, business apps, machine log data, etc.
Big Time Big Data Statistics
- The big data analytics market is set to reach $103 billion by 2023
- In 2019, the big data market is expected to grow by 17%
- By 2020, every person will generate 1.7 megabytes in just a second
- Internet users generate about 2.5 quintillion bytes of data each day
- In 2019, there are 2.3 billion active Facebook users, and they generate a lot of data
- 97.2% of organizations are investing in big data and AI
- Using big data, Netflix saves $1 billion per year on customer retention
Now, why is big data important? Once analyzed, this data helps in a multitude of ways. In healthcare, it helps avoid preventable diseases by detecting them in their early stages. It is also immensely useful in the banking sector, where it aids in recognizing illegal activities such as money laundering. Finally, in meteorology, it helps study global warming.
Alright! Now that we’ve covered the basics, let’s check out some interesting statistics about big data.
1. By 2020, there will be around 40 trillion gigabytes of data (40 zettabytes).
Measuring the amount of data we have today is not an exact science. When going through the numbers related to big data, we found numerous predictions and estimates, but very few figures we could hold on to. Big data growth statistics were especially plentiful. One of the many predictions about the amount of big data came from IDC’s study – “The Digital Universe in 2020”. According to the source, next year we should have around 40 trillion gigabytes of data, i.e., 40 zettabytes.
The same study put the size of big data in 2010 at 1.2 zettabytes. Additionally, IDC gave us another useful piece of information that helped us answer the question of how fast data is growing. The IDC report said that the digital universe would double every two years until 2020. So, we decided to test this. We took the amount of data from 2010 (1.2 zettabytes) and doubled it five times – once for every two years in the decade.
The result we got was 38.4 zettabytes, which is pretty much in line with IDC’s forecast for 2020 (40 zettabytes). Now, these are all rough estimates – in 2012, for instance, there were 2.8 zettabytes of data instead of the projected 2.4 zettabytes. Each consecutive estimate is slightly off as well, so take these numbers with a grain of salt.
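Our doubling check takes only a few lines to reproduce; here is a minimal sketch of the compounding:

```python
# Sanity-check IDC's "doubling every two years" rule of thumb:
# start from 1.2 zettabytes in 2010 and double five times (2010 -> 2020).
def project_data_volume(start_zb: float, doublings: int) -> float:
    """Return the projected data volume after a number of two-year doublings."""
    return start_zb * 2 ** doublings

projected_2020 = project_data_volume(1.2, 5)
print(f"Projected 2020 volume: {projected_2020:.1f} zettabytes")  # 38.4
```

That lands just under IDC's 40-zettabyte forecast, which is what you would expect from a rule of thumb rather than an exact growth curve.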
2. 90% of all data has been created in the last two years.
One of the big data stats that caught our attention came from an IBM study from 2017. It outlined that 90% of all the data in the world back then had been created in the last two years. At first, we were quite surprised to learn that we’ve generated so much data in such a relatively short timeframe. However, once we analyzed the incredible growth of the internet, this started to make sense. In 2012, we had 2.5 billion internet users. In 2014, this number reached the three-billion mark, and in 2019 we have 4.1 billion people online.
Now, one thing is for sure – the amount of data is increasing exponentially over time, and we could say the same for internet users. So, could it really be that we’ve created 90% of all data in just two years? The answer is a resounding yes.
3. Today it would take a person approximately 181 million years to download all the data from the internet.
An interesting piece of information about big data comes from Physics.org, which answered the question of how long it would take to download all the data from the internet. The source used the following values: 0.55 zettabytes for all the information on the internet, and 44 Mbps as the average download speed. However, since these big data statistics have changed, we redid the calculation with 33 zettabytes of data and an average download speed of 46 Mbps. The result we got was around 181.3 million years. Impressive, right?
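Re-running that calculation is straightforward; the sketch below uses the updated figures (33 zettabytes, 46 Mbps) and lands within about a million years of the 181.3 million quoted above (the small difference comes down to rounding and the year length used):

```python
# Rough re-run of the download-time estimate: 33 zettabytes of data
# pulled down over a single 46 Mbps connection.
DATA_BYTES = 33e21                    # 33 zettabytes
SPEED_BPS = 46e6                      # 46 megabits per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600

seconds = DATA_BYTES * 8 / SPEED_BPS  # bytes -> bits, then divide by speed
years = seconds / SECONDS_PER_YEAR
print(f"{years / 1e6:.1f} million years")
```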
4. In 2012, only 0.5% of all data was analyzed.
(Source: The Guardian)
The vast quantity of big data has no value unless it is tagged or analyzed. So, the question is: how much of it actually is? According to IDC’s Digital Universe Study from 2012, only 0.5% of data was analyzed, while the share of tagged data was slightly higher, at 3%. By researching these data analytics statistics further, we discovered that not all data has the potential to bring value.
In 2017, the Economist claimed that data had replaced oil as the world’s most valuable resource. Many sources compared data to oil while neglecting one big difference between the two. Unlike oil, data can be easily extracted, and the supply is endless. What’s more, unlike oil, we can use data multiple times and keep getting new insights from it. The comparison between oil and data leads us to the conclusion that we should collect and store as much data as possible. However, if that’s all we do, without tagging or analyzing the information we have, its value will be far less significant than that of oil.
According to big data statistics from IDC, in 2012 only 22% of all the data had the potential for analysis. This includes data from different fields such as surveillance, entertainment and social media, etc. The same source said that by 2020, the percentage of useful data, i.e., the information that has the potential for analysis, would jump to 37%.
5. Internet users generate about 2.5 quintillion bytes of data each day.
(Source: Data Never Sleeps 5.0)
With the estimated amount of data we should have by 2020 (40 zettabytes), we have to ask ourselves what our part in creating all that data is. So, how much data is generated every day? 2.5 quintillion bytes. That number seems rather high, but expressed in zettabytes (0.0025 ZB), it doesn’t seem like all that much. Measured against the 40 zettabytes we should have in 2020, we’re generating data at a steady pace.
However, there are other ways to look at the amount of data we generate on a daily basis. 2.5 quintillion is 100 times the estimated number of ants on the planet. Moreover, with one quintillion pennies, we could cover the entire surface of the earth 1.5 times; with 2.5 quintillion of them – nearly four times. It’s really fascinating what we can learn from big data facts and figures. 2018 was quite interesting big data-wise, and we expect 2019 to be just as exciting and data-rich.
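To put 2.5 quintillion bytes in perspective against the zettabyte-scale totals above, here is a minimal conversion sketch:

```python
# Convert the daily figure (2.5 quintillion bytes) into zettabytes,
# then project it over a year for comparison with the 40 ZB total.
BYTES_PER_DAY = 2.5e18  # 2.5 quintillion bytes
ZETTABYTE = 1e21        # bytes in one zettabyte

zb_per_day = BYTES_PER_DAY / ZETTABYTE
print(f"{zb_per_day} ZB per day")              # 0.0025
print(f"{zb_per_day * 365:.2f} ZB per year")   # ~0.91
```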
6. In 2018, internet users collectively spent 2.8 million years online every day.
(Source: Global Web Index)
Just imagine how much data internet users can generate in a million years of combined time online, let alone 2.8 million. Now, before we continue, let us explain how we arrived at this figure. According to the Global Web Index report, internet users spend around 6.5 hours a day online. With roughly four billion people online in 2018, that adds up to about 2.8 million years of combined time spent on the internet every single day, which clearly illustrates rapid big data growth.
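As a sanity check, the 2.8-million-years figure works out as combined time per day. Here is a sketch of the arithmetic, assuming roughly 3.9 billion internet users in 2018 (a contemporary estimate, not a figure from the report itself):

```python
# Collective time spent online per day, expressed in years.
USERS = 3.9e9                  # assumed internet users in 2018
HOURS_PER_DAY = 6.5            # average daily time online (Global Web Index)
HOURS_PER_YEAR = 24 * 365.25

collective_years_per_day = USERS * HOURS_PER_DAY / HOURS_PER_YEAR
print(f"{collective_years_per_day / 1e6:.1f} million years per day")  # ~2.9
```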
7. Social media accounts for 33% of the total time spent online.
(Source: Global Web Index)
Before we give you some numbers on how users generate data on Facebook and Twitter, we wanted to paint a picture of general social media usage first. Global Web Index published a piece on the average number of social accounts per user in 2016. Comparing the figures for 2012 and 2016, we got some interesting social media big data statistics. Namely, in 2012, social media users had three social accounts on average, while by 2016 that number had risen to seven.
Apart from the rise of the multi-networking trend, the average time users spend on social media platforms also saw a significant increase. In 2012, digital users spent an hour and a half of their spare time a day on social media sites, while by 2017 the average had risen to 2 hours and 15 minutes.
Lastly, the same source discovered that out of the total time digital users spend online, 33% is reserved for social media. This is no doubt a large part of why the data growth statistics are what they are today. Apart from social media, 16% of the time users spend online goes to online TV and streaming, and another 16% to music streaming. Online press takes a 13% share of total online time, whereas the remaining 22% of the time is reserved for other online activities.
8. In 2019, there are 2.3 billion active Facebook users, and they generate a lot of data.
(Source: Data Never Sleeps)
Next on our agenda are Facebook big data stats. There are 2.3 billion Facebook users in 2019. Now, the question we want to answer is how much data these users generate in just one minute. To help us with that, we gathered data from Domo, which publishes annual reports on the amount of data digital users create in 60 seconds.
Facebook stats from 2012 showed users sharing 684,478 pieces of content every minute. In 2014, that number nearly quadrupled to 2.46 million pieces of content per minute. As for 2015, Domo’s data shows that in just 60 seconds, Facebook users liked 4.1 million posts.
Apart from Facebook stats, Domo provided us with some rather fascinating United States big data statistics. According to the source, Americans used 2,657,700 GB of internet data in every minute of 2017. The next year, the amount of internet data used per minute reached 3,138,420 GB, which is an impressive jump for a single year.
9. Twitter users send nearly half a million tweets every minute.
Facebook’s internet data usage stats are only the tip of the iceberg. Social data coming from Domo’s Data Never Sleeps 6.0 report gives us some insights about user activity on Twitter as well. The source suggests the number of tweets per minute increased from 456,000 in 2017 to 473,400 in 2018.
We also looked at Internet Live Stats to see how many tweets were sent in 2019 alone. In just under a month and a half, Twitter users sent more than 30 billion tweets. Taking into account that it took Twitter the first three years of its existence to reach its billionth tweet, the numbers we have today show just how much this social network has grown over the years.
Furthermore, Twitter is one of the big companies that use big data and artificial intelligence. Stats and facts about Twitter show us that the social network uses AI not only for its image-cropping tool but also for preventing the spread of inappropriate content.
10. 97.2% of organizations are investing in big data and AI.
(Source: New Vantage)
In 2018, New Vantage published its sixth Executives Survey with a primary focus on big data and artificial intelligence. The study recorded the executives’ answers from approximately 60 Fortune 1000 companies including Motorola, American Express, NASDAQ, etc. Aside from indicating a strong presence of big data in leading companies, the New Vantage study also answered the question: How much do companies spend on data analytics? So, here’s what we’ve learned.
62.5% of participants said their organization appointed a Chief Data Officer (CDO), which indicates a fivefold increase since 2012 (12%). Additionally, a record number of organizations participating in the study have invested in big data and artificial intelligence initiatives – 97.2%. The highest percentage of organizations (60.3%) invested under $50 million. More than a quarter of participants (27%) said their companies’ cumulative investments in big data and AI fall into the range between $50 million and $500 million. Lastly, only 12.7% of participants said their companies invested more than $500 million.
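The three investment ranges reported in the survey should account for every respondent; here is a quick sanity check (the bucket labels are our paraphrase of the survey’s ranges):

```python
# Verify that NewVantage's three investment buckets cover all respondents.
buckets = {
    "under $50M": 60.3,
    "$50M to $500M": 27.0,
    "over $500M": 12.7,
}
total = sum(buckets.values())
print(f"Total: {total:.1f}%")  # 100.0%
```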
So, is big data the future? If we focus on the big data investments from companies such as Goldman Sachs, IBM, and Bank of America, we could answer this question with a “yes.”
11. Using big data, Netflix saves $1 billion per year on customer retention.
(Source: Inside Big Data)
Today, many companies use big data to expand and enhance their businesses, and Netflix is a perfect example. The digital users’ favorite streaming service has 139 million subscribers as of January 2019. Now, the California-based company can help us answer the question: what are the benefits of big data? Well, one of the benefits of using big data in streaming services is customer retention, thanks to lower subscription cancelation rates. Netflix has a strategy for keeping its audience glued to their screens, and big data is a big part of it.
Some of the information Netflix collects includes searches, ratings, re-watched programs, and so on. This data helps Netflix provide its users with personalized recommendations, show videos similar to the ones they’ve already watched or suggest various titles from a specific genre. Plus, we have to admit that the company’s “Continue Watching” feature improves the user experience a lot.
While going through various big data statistics, we discovered that back in 2009, Netflix invested $1 million in enhancing its recommendation algorithm. What’s even more interesting is that the company’s budget for technology and development stood at $651 million in 2015. Last year, the budget reached $1.3 billion.
As for the $1 billion in savings from customer retention, this was just a rough estimate Carlos Uribe-Gomez and Neil Hunt made in 2016. We believe that number is significantly higher now, as, among other reasons, Netflix spent over $12 billion on content in 2018, and that number is projected to reach $15 billion this year.
12. What is the big data analytics market worth in 2019? $49 billion, says Wikibon.
We’ve already covered how Netflix benefited from big data, but that’s only the beginning. Big data has found its place in various industries, as it helps detect patterns, reveal consumer trends, and enhance decision-making, among other things. So, the question is: how much is the big data industry worth, and what can we expect in the next couple of years? In its 2018 Big Data Analytics Trends and Forecast, Wikibon answered these questions.
So, how much is big data worth? According to Wikibon, the big data analytics (BDA) market is expected to reach $49 billion in 2019, growing at a compound annual growth rate (CAGR) of 11% – roughly $7 billion in added value each year. Based on this forecast, the BDA market should reach $103 billion by 2023.
13. In 2019, the big data market is expected to grow by 17%.
While exploring the global data market growth forecast from Statista, we discovered that big data had its highest growth rates in 2012 (61%) and 2013 (60%). The big data market grew by 20% in 2018, and this year it should grow by 17%. As Statista points out, the market’s growth will slow over time, settling at around 7% per year from 2025 to 2027.
14. Job listings for data science and analytics will reach around 2.7 million by 2020.
One of the biggest problems in the big data industry is the shortage of people with deep analytical skills. Looking at the data growth statistics, it’s clear there are not enough people trained to work with big data. According to RJMetrics, in 2015 there were between 11,400 and 19,400 data scientists worldwide. McKinsey predicted that by 2018 there would be approximately 2.8 million people with analytical talent, while the number of jobs for data science and analytics is expected to reach 2.7 million by 2020. So, there’s a big gap between the supply of and demand for data science and analytics talent.
15. By 2020, every person will generate 1.7 megabytes in just a second.
If we assume that the big data growth projections from Domo are accurate, by 2020 every person on the planet should generate 146,880 MB (roughly 147 GB) a day. If we take into account that the world population will reach 8 billion people by that time, it’s easy to conclude that the amount of data we create on a daily basis will rise dramatically. Moreover, IDC forecasts that we will be producing 165 zettabytes per year by 2025.
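A quick back-of-the-envelope conversion shows what 1.7 MB per second amounts to over a day:

```python
# Per-person daily data generation implied by the 1.7 MB/s projection.
MB_PER_SECOND = 1.7
SECONDS_PER_DAY = 24 * 3600

mb_per_day = MB_PER_SECOND * SECONDS_PER_DAY
gb_per_day = mb_per_day / 1000
print(f"{mb_per_day:,.0f} MB (~{gb_per_day:.0f} GB) per person per day")
```

Note that the figure is in megabytes, not gigabytes – a gigabyte-scale reading of the same number is a common misquote of this projection.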
Now, let’s jump to 2020 technology predictions and future trends related to big data.
16. Automated analytics will be vital to big data by 2020.
(Source: Flat World Solutions)
One of the many predictions in the big data field is that automating the processes behind frameworks such as Hadoop and Spark will be inevitable within a year. Another prediction concerns smart wearables, which will help accelerate big data growth. We can also expect machine learning to develop further in the near future. Combined with data analytics, we expect it to produce predictive models that forecast the future with an even higher level of accuracy. Lastly, Flat World Solutions forecasts that businesses will gain $430 billion by 2020 if they opt for a data-driven approach.
We hope we succeeded in our quest to find some of the most impressive big data statistics. One of the key takeaways is that the big data market is expanding quickly, and with every passing day we have more information at our disposal. The ultimate goal, though, is not to collect as much data as possible, but to extract real value from the data we collect.