The Risks of “Too Much Data”

We mentioned in our 2022 predictions the immense amount of data being created on a daily, monthly, and yearly basis. The numbers are staggering, with an estimated 46 billion IoT (Internet of Things) devices connected globally. Individuals, companies, organizations, and governments certainly benefit from this growing data, and it is fair to say that our entire global infrastructure has become dependent on data access. But with the increasing benefits of big data also comes risk. Consider the following risks as we continue to create zettabytes of new data annually:

Security: Much of the data created by the 46 billion devices noted above is discoverable. In other words, when you search on Google, save a document to the cloud, or buy something online, that data becomes accessible to any number of people and machines. For more on this topic, check out our recent post on the current state of Data Privacy and Security.

Data Standardization & Validation: With ever-increasing data flows, organizations are wrestling with how to standardize and combine their data. “Data Lakes” have become “Data Swamps,” and companies are struggling to get meaningful, trustworthy data out of their reporting, analytics, and AI tools. For example, look at the contacts on your own phone. How many of these people do you actually know or have a record of the last time you spoke? Some have last names, some just first, and some may have companies associated but the individual no longer works there.

Data Shoveling: Data shoveling occurs when a company shovels massive quantities of data into a new system without taking care to clean, standardize, and test that data before moving it. Much like a pile of shredded documents, the shoveled data becomes nearly impossible to navigate and use. This is becoming the norm, as businesses tend to overlook data planning as a key component of their implementation plans.

This is also an emerging issue in the Private Equity arena and with acquisition-hungry companies as M&A activity rises. Acquirers are struggling to assess and digest the continued proliferation of data, so data assimilation is either significantly delayed or the data is simply shoveled onto an even larger pile. One guardrail is to put a quality gate in front of any migration, as sketched below.
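As a minimal sketch, assuming hypothetical field names and thresholds, a pre-migration quality gate might refuse to load a batch of records until enough of them pass basic validation:

```python
# Illustrative quality gate run before migrating records into a new system,
# rather than shoveling everything in as-is. The required fields and the
# threshold are assumptions for the example, not a real migration tool.

REQUIRED_FIELDS = ["id", "name", "email"]

def is_clean(record: dict) -> bool:
    """A record is migratable only if required fields are present and non-empty."""
    return all(record.get(field) for field in REQUIRED_FIELDS)

def quality_gate(records: list[dict], min_clean_ratio: float = 0.95) -> list[dict]:
    """Refuse to migrate a batch whose clean ratio falls below the threshold."""
    clean = [r for r in records if is_clean(r)]
    ratio = len(clean) / len(records) if records else 0.0
    if ratio < min_clean_ratio:
        raise ValueError(
            f"Only {ratio:.0%} of records pass validation; "
            "clean the source data before migrating."
        )
    return clean

batch = [{"id": 1, "name": "Ann", "email": "ann@example.com"},
         {"id": 2, "name": "", "email": None}]
migratable = quality_gate(batch, min_clean_ratio=0.5)
print(f"{len(migratable)} of {len(batch)} records ready to load")
```

The specific threshold matters less than the principle: cleaning and testing happen before the shovel, not after.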

API Abuse: APIs are the most popular way to pass data between systems; a vendor provides a tool to connect one system to another, and it seems like a quick and easy solution to a problem. However, APIs are like a dumbwaiter: they move data in a predetermined way from one place to another.

They typically do not do anything to continually monitor and test the data moving through them, so all of that checking must be done elsewhere. In a typical company, the high-level view of API connections looks like my 2-year-old grandson just dumped a couple of boxes of uncooked spaghetti onto the floor! Consider, then, that this mess is duplicated for each and every connected application (I ran out of spaghetti, so I couldn't show the full effect here!).

All of that data moves around the company in an inefficient, circuitous manner, while decision-making executives simply cross their fingers and hope it makes it into their reporting and analytics without mishap.
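Because the API itself typically won't test what passes through it, that checking has to live on the receiving side. Here is a minimal sketch, assuming a hypothetical order payload and schema, of validating incoming data before it flows onward into reporting:

```python
# Since an API usually just moves data without testing it, the receiving side
# has to do its own checking. This validates each incoming payload against an
# expected schema; the schema and payload below are hypothetical.

EXPECTED_SCHEMA = {"order_id": int, "amount": float, "currency": str}

def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload is usable."""
    problems = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems

incoming = {"order_id": "1001", "amount": 49.99}  # as received from an API
for problem in validate_payload(incoming) or ["payload OK"]:
    print(problem)
```

A check like this at each connection point is what turns the plate of spaghetti back into something executives can actually trust.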

In our next post, we will cover ways companies can navigate these growing risks of “Too Much Data.” If you have questions in the meantime or would like to discuss your own situation, please reach out!