How do you know if the data you have is considered big data? There are generally four characteristics that must be part of a dataset to qualify it as big data—volume, velocity, variety and veracity. Value is a fifth characteristic that is also important for big data to be useful to an organization.
Our world has become datafied. From our Google searches and online shopping habits, to our communication through text messages, smartphones and virtual assistants, to the pictures and videos we take and the sensor data collected by internet-of-things devices, 2.5 quintillion bytes of data are created each day. The better companies and organizations manage and secure this data, the more successful they are likely to be. How do you know if the data you have has the characteristics that qualify it as “big”? Most people consider data “big” if it has the four Vs: volume, velocity, variety and veracity. But for data to be useful to an organization, it must also create value, a critical fifth characteristic of big data that can’t be overlooked.
The first V of big data is all about the amount of data: the volume. Today, every single minute we create as much data as was created from the beginning of time until the year 2000. We now use terms such as terabytes and petabytes to discuss the size of the data that needs to be processed. The quantity of data is certainly an important part of what classifies it as big data. Because of the amount of data we deal with daily, new technologies and strategies, such as multitiered storage media, have been developed to collect, analyze and store it securely.
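To put the 2.5 quintillion bytes per day quoted earlier into the units mentioned above, here is a small back-of-the-envelope sketch. It assumes decimal (SI) units, where a terabyte is 10^12 bytes and a petabyte is 10^15 bytes; the variable names are illustrative only.

```python
# Figure quoted in this article: 2.5 quintillion bytes created per day.
BYTES_PER_DAY = 2_500_000_000_000_000_000

# Decimal (SI) unit sizes, assumed for this sketch.
TERABYTE = 10**12
PETABYTE = 10**15
EXABYTE = 10**18

SECONDS_PER_DAY = 24 * 60 * 60

tb_per_day = BYTES_PER_DAY // TERABYTE   # whole terabytes per day
pb_per_day = BYTES_PER_DAY // PETABYTE   # whole petabytes per day
eb_per_day = BYTES_PER_DAY / EXABYTE     # exabytes per day

# Average rate if the daily total were spread evenly over the day.
tb_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / TERABYTE

print(f"{tb_per_day:,} TB per day")
print(f"{pb_per_day:,} PB per day")
print(f"{eb_per_day} EB per day")
print(f"about {tb_per_second:.1f} TB per second on average")
```

In other words, the daily figure works out to 2.5 million terabytes, or roughly 29 terabytes every second if averaged evenly.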
Velocity, the second V of big data, is all about the speed at which new data is generated and moves around. When you send a text, check your social media feed, react to posts on Facebook, Instagram or Twitter, or make a credit card purchase, you create data that needs to be processed instantaneously. Multiply these activities by all the people in the world doing the same and more, and you can start to see how velocity is a key attribute of big data.
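One simple way to put a number on velocity is to count how many events arrive within a recent time window. The sketch below is only an illustration: the RateWindow class, the 10-second window and the timestamps are all hypothetical, and it assumes events are recorded in time order.

```python
from collections import deque


class RateWindow:
    """Tracks how many events arrived in the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        # Assumes timestamps are recorded in non-decreasing order.
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def events_per_second(self, now):
        self._evict(now)
        return len(self._timestamps) / self.window_seconds


# Hypothetical event times, in seconds.
window = RateWindow(window_seconds=10)
for t in [0.0, 1.0, 2.0, 11.5, 12.0]:
    window.record(t)

rate = window.events_per_second(now=12.0)
print(rate)
```

Only the events at 11.5 and 12.0 seconds are still inside the 10-second window at time 12.0, so the rate is 2 events over 10 seconds.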
Today, data is generally one of three types: unstructured, semi-structured and structured. The algorithms required to process this variety of data vary based on the type of data to be processed. In the past, data was nicely structured: think Excel spreadsheets or relational databases. A key characteristic of big data is that it includes not only structured data but also text, images, videos, voice files and other unstructured data that doesn’t fit easily into the framework of a spreadsheet. Unstructured data isn’t bound by rules the way structured data is. Again, this variety has helped put the “big” in data. Today we can use technology to make sense of unstructured data in a way that wasn’t possible in the past. This ability has opened up a tremendous amount of data that was previously not accessible or useful.
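To make the three types concrete, here is a minimal sketch of the same kind of information arriving in structured, semi-structured and unstructured form. The order data, field names and customer text are all made up for illustration, and the keyword check at the end is a deliberately naive stand-in for real text analysis.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same shape (like a spreadsheet).
structured = "order_id,customer,amount\n1001,Ana,19.99\n"
rows = list(csv.DictReader(io.StringIO(structured)))
amount = rows[0]["amount"]  # values are read as strings

# Semi-structured: self-describing, but fields can nest or vary per record.
semi_structured = '{"order_id": 1001, "customer": {"name": "Ana"}, "tags": ["gift"]}'
doc = json.loads(semi_structured)
customer_name = doc["customer"]["name"]

# Unstructured: free text with no schema; meaning has to be extracted.
unstructured = "Ana wrote: the parcel arrived late but the gift was lovely."
words = unstructured.lower().split()
mentions_late = "late" in words  # naive keyword check, for illustration only

print(amount, customer_name, mentions_late)
```

The first two forms can be read with standard parsers, while the free text needs some form of interpretation before it yields anything comparable.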
The veracity of big data denotes the trustworthiness of the data. Is the data accurate and high-quality? When talking about big data that comes from a variety of sources, it’s important to understand the chain of custody, the metadata and the context in which the data was collected to be able to glean accurate insights. The higher the veracity of the data, the more it is worth analyzing and the more it can contribute to meaningful results for an organization.
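As one small, hedged illustration of checking veracity, the sketch below measures how many records carry both a value and the basic provenance metadata described above. The field names, the sample sensor records and the completeness function are hypothetical, and completeness is only one of many possible quality signals.

```python
# Fields a record must carry to be considered complete (an assumption for this sketch).
REQUIRED_FIELDS = ("value", "source", "collected_at")


def completeness(records):
    """Return the share of records where every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)
        for record in records
    )
    return complete / len(records)


# Made-up sensor readings; two of the four are missing a value or a source.
records = [
    {"value": 21.5, "source": "sensor-a", "collected_at": "2020-05-01T10:00:00"},
    {"value": None, "source": "sensor-a", "collected_at": "2020-05-01T10:01:00"},
    {"value": 22.1, "source": "", "collected_at": "2020-05-01T10:02:00"},
    {"value": 22.0, "source": "sensor-b", "collected_at": "2020-05-01T10:03:00"},
]

score = completeness(records)
print(score)
```

A low score here would suggest the dataset needs cleaning, or its sources need investigating, before any insights drawn from it can be trusted.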
While this article is about the 4 Vs of data, there is an important fifth element we must consider when it comes to big data: the need to turn our data into value. Organizations that have not created a data strategy to yield insights and drive data-driven decision-making are going to fall behind competitors. Big data that’s analyzed effectively can provide an important understanding of customers and their behaviors and desires, show how to optimize business processes and operations, and improve a nearly endless number of applications. Whether you use data to create a new product or service or to find a way to cut costs, it is incredibly important that big data creates value. This is why organizations of every size must have a data strategy in place to ensure that the data needed to achieve their business objectives is being collected and analyzed.
Bernard Marr is a world-renowned futurist, influencer and thought leader in the field of business and technology. He is the author of 18 best-selling books, writes a regular column for Forbes and advises and coaches many of the world’s best-known organisations. He has 2 million social media followers and was ranked by LinkedIn as one of the top 5 business influencers in the world and the No 1 influencer in the UK.