What Portion of Your Big Data Should be Deleted?
Big data is the fuel that drives many aspects of business today, and the ability to manage large datasets has become critical to a company's success. Yet even with the proliferation of data management methods, functions and tools, most companies remain far behind in implementing a sound big data strategy. Those that do have a strategy in place often struggle to use their data to make decisions. A cross-industry study shows that less than half of the structured data in organizations is used in making decisions, and less than 1% of unstructured data is. The same study notes that more than 70% of employees have access to data they should not be able to access.
With the rising adoption of big data, data breaches have become common, rogue datasets sit in silos, and the technologies many organizations use are not up to the demands placed on them. Big data is a critical innovation for modern companies, but what happens when you need to delete some of it? What portion of your big data should be deleted?
To begin with, don’t collect or store data that you don’t need. Such data gives you no value, and keeping it not only takes up storage space but also adds risk. Keep only the data your organization will benefit from and delete the rest, which is just a burden. Keeping data may seem easy, but it is an expensive undertaking that increases the cost of maintaining your big data systems. The chance that a dataset is useful goes down over time, while its vulnerability stays the same. Therefore, you should delete outdated data that is no longer useful.
Keeping data that is no longer useful to your business and decision-making can be actively harmful. Losing this data to hackers, however useless it is to you, can still affect you substantially. To make matters worse, government agencies and privacy crusaders will not take it lightly if personally identifiable information (PII) is lost to hackers. The less data you keep, the fewer people you will have to compensate if their critical information is lost.
Considering that almost a third of the data stored in your business is redundant, you should proactively delete redundant files. For example, you should filter out ex-employee and ex-customer data, which presents a big risk. Such data may contain PII, so it is worth keeping only when there is a legal reason to do so. You also need to manage financial records properly, since such information is often a target for hackers.
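As a rough illustration of this kind of filtering, the sketch below picks out records of former employees or customers that are not being kept for legal reasons. The record fields (`status`, `legal_hold`) and the function name `deletion_candidates` are hypothetical; they stand in for whatever your own systems actually use.

```python
from dataclasses import dataclass


@dataclass
class PersonRecord:
    # Hypothetical fields; real systems will name and store these differently.
    record_id: str
    status: str       # "active" or "former"
    legal_hold: bool  # True if the record must be kept for legal reasons


def deletion_candidates(records):
    """Return records of former employees/customers that are not under a legal hold."""
    return [r for r in records if r.status == "former" and not r.legal_hold]


# Made-up sample records for illustration only.
records = [
    PersonRecord("c-001", "active", False),
    PersonRecord("c-002", "former", False),
    PersonRecord("c-003", "former", True),
]

candidates = deletion_candidates(records)
print([r.record_id for r in candidates])
```

In this sketch the function only produces a list of candidates. Actual deletion would normally be a separate, audited step.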
Data that is no longer useful should be deleted only after a thorough identification process. To begin with, businesses need to identify what their data contains and categorize it based on its risk and potential value. Understand what data is stored, who can access it, and how often they access it. Only then will you understand the data that exists in your databases and be able to act accordingly. Also, delete data associated with systems that are no longer in use in the organization. For example, delete data from old websites so that it doesn’t unnecessarily increase your attack surface and hand hackers an advantage. Make data deletion a serious, regular operation that happens at least once a quarter.
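The identification step above can be sketched as a small inventory pass that labels each dataset before anything is deleted. This is a minimal sketch under stated assumptions: the `Dataset` fields, the one-year staleness cutoff and the three labels are all illustrative choices, not an established standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Dataset:
    # All fields are illustrative assumptions, not a standard schema.
    name: str
    contains_pii: bool          # proxy for risk
    source_system_active: bool  # False if the originating system is retired
    last_used: date             # proxy for remaining value


STALE_AFTER = timedelta(days=365)  # assumed cutoff for "no longer useful"


def categorize(ds: Dataset, today: date) -> str:
    """Label a dataset 'delete', 'review', or 'keep' based on risk and value."""
    stale = (today - ds.last_used) > STALE_AFTER
    if not ds.source_system_active:
        return "delete"  # data tied to a system that is no longer in use
    if stale and ds.contains_pii:
        return "delete"  # high risk, low value
    if stale:
        return "review"  # low value, but worth a human check first
    return "keep"


# Made-up inventory for illustration only.
today = date(2024, 1, 1)
inventory = [
    Dataset("old_website_logs", False, False, date(2020, 6, 1)),
    Dataset("former_customer_contacts", True, True, date(2021, 3, 15)),
    Dataset("archived_reports", False, True, date(2022, 5, 1)),
    Dataset("current_orders", True, True, date(2023, 12, 20)),
]

labels = {ds.name: categorize(ds, today) for ds in inventory}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Running a pass like this on a quarterly schedule gives the deletion operation a repeatable starting point. The thresholds would need to reflect your own legal and business retention requirements.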

Scott Koegler
Scott Koegler is Executive Editor for Big Data & Analytics Tech Brief
scottkoegler.me