Working Around Big Data Issues

Working Around Big Data Issues Christopher Burns

Big data has become a critical part of our lives and perhaps one of the most important areas of study in the past few years. With the rise in data coming from various sources, such as individuals and organizations, the demand for effective data analysis has never been greater. This article explores some common problems and strategies for working around big data issues.

Effects of bad data on analytics

Modern businesses rely on real-time collection or generation of data to enhance operations. Most of these businesses have specialized analytics divisions that analyze vast amounts of data from different sources. However, the data collected can be of poor quality, which leads to poor decisions. Here are some reasons why bad data can be a significant problem:

  1. Produces misleading insights

Businesses use various analytical tools to gather insights from vast amounts of data. However, such insights may be unreliable if the collected data contains duplicates. For example, if data gathered from 20 different sources and locations is duplicated, the output may show 40 distinct data points when only 20 exist. Scale this up to millions of data points and duplicates, and the resulting insights will be inaccurate.
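The 20-versus-40 example above can be reproduced in a few lines. This is a minimal sketch with made-up records (the `source_id` and `value` fields are assumptions for illustration, not taken from any real system). It shows how duplicates inflate both the record count and a simple total, and how removing exact duplicates restores the true figures.

```python
# Hypothetical data: 20 unique records, each delivered twice by overlapping feeds.
unique_records = [{"source_id": i, "value": i * 10} for i in range(20)]
collected = unique_records + [dict(r) for r in unique_records]


def deduplicate(records):
    """Keep the first occurrence of each exact record, preserving order."""
    seen = set()
    result = []
    for record in records:
        # A dict is not hashable, so build a hashable key from its items.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            result.append(record)
    return result


cleaned = deduplicate(collected)

raw_total = sum(r["value"] for r in collected)
clean_total = sum(r["value"] for r in cleaned)

print(len(collected), len(cleaned))  # 40 20
print(raw_total, clean_total)        # 3800 1900
```

Exact matching only catches byte-for-byte duplicates. Records that differ in spelling or formatting need the entity-resolution style of matching discussed later in the article.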

  2. Leads to costly corrections

According to Gartner's 2017 Data Quality Market Survey, poor data quality costs businesses an average of $15 million per year. Losses may have increased in subsequent years, as more than 90% of the data in circulation today was created in the past two to three years. Much of this data may contain inconsistencies, inaccuracies and duplicates.

  3. Unreliability of data

Data must be captured continuously from different sources, and the collected data can be transmitted over long distances. Transmission can compromise data integrity through contamination. This reduces the reliability of the data, which then cannot be trusted for forecasting.
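One common way to detect data that was altered in transit is to send a checksum alongside the payload and recompute it on arrival. The sketch below uses SHA-256 from Python's standard library. The payload format and the `checksum` and `verify` helpers are assumptions made for illustration, not a method described in the article.

```python
import hashlib


def checksum(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def verify(payload: bytes, expected: str) -> bool:
    """Return True if the received payload still matches the sender's digest."""
    return checksum(payload) == expected


# The sender computes the digest before transmission (hypothetical payload).
original = b"sensor=42;temp=21.5"
digest = checksum(original)

# The receiver recomputes it; a single changed character is detected.
corrupted = b"sensor=42;temp=27.5"

print(verify(original, digest))   # True
print(verify(corrupted, digest))  # False
```

A checksum only shows that the data changed. It does not say what changed, so a failed check would normally trigger a retransmission rather than a repair.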

How can big data issues be fixed?

  1. Verifying data from the source

Most quality issues originate at the sources from which data is gathered or generated. They can therefore be mitigated by cleaning the data at the source, before it is sent to the point of processing. Verification entails putting the freshly gathered data through checks for correctness and completeness.
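A source-side check for completeness and correctness might look like the sketch below. The schema (`customer_id`, `email`, `postcode`) and the rules are hypothetical and deliberately simple. A real pipeline would apply whatever rules fit its own data.

```python
# Hypothetical schema: the fields every record must carry.
REQUIRED_FIELDS = ("customer_id", "email", "postcode")


def validate(record: dict) -> list:
    """Return a list of problems found in the record (empty if it passes)."""
    problems = []

    # Completeness: every required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing {field}")

    # Correctness: a very rough sanity check on the email format.
    email = record.get("email") or ""
    if email and "@" not in email:
        problems.append("malformed email")

    return problems


good = {"customer_id": 1, "email": "a@example.com", "postcode": "12345"}
bad = {"customer_id": 2, "email": "not-an-email", "postcode": ""}

print(validate(good))  # []
print(validate(bad))   # ['missing postcode', 'malformed email']
```

Records that fail validation can be rejected or flagged at the source, so they never reach the processing stage.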

  2. Fix quality issues at the ETL phase

Customer data is gathered from different sources in the Extract, Transform and Load (ETL) phase before businesses can carry out analytics on it. Your business can use various tools and applications to “find” and “fix” quality issues at this stage, before the data enters storage databases.
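As a rough illustration of "find" and "fix" during ETL, the sketch below normalizes names and emails in the transform step, drops duplicates and sets aside rows it cannot repair. The `extract`, `transform` and `load` functions and the sample rows are assumptions for illustration. The in-memory list stands in for a real storage database.

```python
def extract():
    """Stand-in for reading rows from several source systems."""
    return [
        {"name": "  Ada Lovelace ", "email": "ADA@Example.com"},
        {"name": "Ada Lovelace", "email": "ada@example.com"},
        {"name": "Alan Turing", "email": None},
    ]


def transform(rows):
    """Normalize rows, drop duplicates, and set aside unfixable rows."""
    clean, rejected = [], []
    seen_emails = set()
    for row in rows:
        # Fix: trim and lowercase the email, collapse whitespace in the name.
        email = (row.get("email") or "").strip().lower()
        name = " ".join((row.get("name") or "").split())
        if not email:
            rejected.append(row)  # found, but needs manual follow-up
            continue
        if email in seen_emails:
            continue              # duplicate of a row already kept
        seen_emails.add(email)
        clean.append({"name": name, "email": email})
    return clean, rejected


def load(rows, store):
    """Append cleaned rows to the target store."""
    store.extend(rows)


warehouse = []
clean_rows, rejected_rows = transform(extract())
load(clean_rows, warehouse)

print(warehouse)           # [{'name': 'Ada Lovelace', 'email': 'ada@example.com'}]
print(len(rejected_rows))  # 1
```

Keeping the rejected rows instead of silently dropping them makes it possible to trace quality problems back to the source that produced them.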

  3. Use precision identity or entity resolution

This is the most powerful way of fixing data quality issues. A common marketing-related issue with customer records and databases is that the identity or residential location of customers is not verified. As a result, customers living in the same household, or several records of the same customer, are stored as separate entries, and those customers or households may receive the same marketing information several times. This duplication can be prevented by using precision identity or entity resolution to identify customers or households, so that no more than one marketing email or other communication is sent to each.
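The idea of grouping records by household can be sketched very roughly as follows. This is not how a commercial precision identity or entity resolution product works. It is a simplified stand-in that matches on a normalized surname and address, and the record fields and helper names are hypothetical. Production systems typically add fuzzy matching and verified reference data.

```python
from collections import defaultdict


def household_key(record):
    """Build a crude matching key from the surname and a normalized address."""
    # Lowercase the address and keep only letters, digits and spaces.
    address = "".join(
        ch for ch in record["address"].lower() if ch.isalnum() or ch.isspace()
    )
    surname = record["name"].split()[-1].lower()
    # Collapse repeated whitespace so "12  oak st" equals "12 oak st".
    return (surname, " ".join(address.split()))


def resolve_households(records):
    """Group records that share the same household key."""
    groups = defaultdict(list)
    for record in records:
        groups[household_key(record)].append(record)
    return groups


def mailing_list(records):
    """Pick one contact email per household."""
    return [members[0]["email"] for members in resolve_households(records).values()]


customers = [
    {"name": "Jane Smith", "address": "12 Oak St.", "email": "jane@example.com"},
    {"name": "John Smith", "address": "12  oak st", "email": "john@example.com"},
    {"name": "Jane Smith", "address": "12 Oak St", "email": "jane@example.com"},
    {"name": "Maria Lopez", "address": "7 Elm Ave", "email": "maria@example.com"},
]

print(mailing_list(customers))  # ['jane@example.com', 'maria@example.com']
```

Four records collapse into two households, so each household receives a single message instead of several.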

From the above, it is evident that the best way to resolve big data problems is by scaling up investment in technology. Most of the problems revolve around data collection, storage, analysis and sharing, and around drawing insights and conclusions from the data. As organizations and smart-city administrations rely more on big data to make decisions, intelligent technologies such as AI and IoT will help them move forward.

Scott Koegler

Scott Koegler is Executive Editor for Big Data & Analytics Tech Brief