
Shedding light on your company's dark data

We all know there are mountains upon mountains of data out there. Generating it is what our beloved Internet does, along with our smartphones and other connected devices. While having every conceivable bit of information available at your fingertips can be highly beneficial, the existence of massive volumes of data carries an inherent risk of losing proper control over it.

For organisations, this problem is often called ‘dark data’: information assets obtained during various regular organisational processes that are not used in any meaningful way to derive insights or empower decision making. Perhaps the most troubling aspect of dark data is that there are usually no plans to ever use it and, yet, organisations opt not to discard it.

Naturally, this creates all sorts of issues for the digital and operational pillars such as data compliance, endpoint security, privacy and productivity. There are increased risks and liabilities in information governance, too, which can, in turn, impact data quality.

Even beyond the burden of regulatory compliance, dark data is costing businesses vast sums. That may be even more jarring with extreme changes in circumstances such as the ongoing pandemic, where countless businesses are thrust into the unknown and forced to adapt to the best of their abilities.

This is why reevaluating important data and adding structure to unstructured content stored in log files, documents, emails, photos, ex-employee records and the like is key to better understanding its long-term impact.

So how can businesses get the most value from dark data?

Shining light with the help of AI

Delivering insights on complex data should start with a process of unifying that data from multiple sources. On-premises and cloud data sources (both private and public) first need to be located, recovered, parsed and aggregated before any insights can be gained and distributed. Tools like Apache Hadoop and SAP HANA act as the main frameworks for data storage and processing, handling everything from log files produced by multiple applications to user behaviour data.
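As a rough illustration of the parse-and-aggregate step, here is a minimal Python sketch. It assumes two hypothetical sources, a CSV export and a JSON-lines application log, with made-up field names (`user`, `event`) and source labels. A real pipeline on Hadoop or SAP HANA would look very different; the point is only to show records from different formats landing in one common shape.

```python
import csv
import io
import json


def parse_csv_source(text, source_name):
    """Parse CSV text into dicts, tagging each record with its source."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"source": source_name, **row} for row in reader]


def parse_jsonl_source(text, source_name):
    """Parse JSON-lines text (one JSON object per line) into tagged dicts."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            records.append({"source": source_name, **json.loads(line)})
    return records


def aggregate(*batches):
    """Combine record batches from several sources into one list."""
    combined = []
    for batch in batches:
        combined.extend(batch)
    return combined


# Hypothetical sample data standing in for real exports and logs.
crm_export = "user,event\nalice,login\nbob,download\n"
app_log = (
    '{"user": "alice", "event": "logout"}\n'
    '{"user": "carol", "event": "login"}\n'
)

unified = aggregate(
    parse_csv_source(crm_export, "crm_export"),
    parse_jsonl_source(app_log, "app_log"),
)

# A simple cross-source question: how many events per user?
events_per_user = {}
for record in unified:
    events_per_user[record["user"]] = events_per_user.get(record["user"], 0) + 1
```

Once every record carries the same keys plus a `source` tag, questions that span systems become simple loops or queries rather than manual reconciliation.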

The idea is to get the big picture of existing systems and processes, understand what they are being used for and why, and then utilise them efficiently.

Once the spotlight is on the type of data being collected within the organisation, the next step is to provide access to it, wherever it may be. As is often the case with large companies, data is siloed in systems and/or dependent on dashboards, creating challenges when it comes to data consistency and a comprehensive view of it.

To make the most of your data in a scalable manner, artificial intelligence will have to lead the way. Potentially, the solution to illuminating dark data can be found in training more workers in data science and analytics and implementing data collection into software development wherever applicable. However, using an AI-powered solution enables even less-technical employees to analyse data and unlock deeper insights.

This is definitely the case when it comes to ‘self-service BI’ solutions. Sisense, for example, is known for simplifying the consolidation and mapping process, allowing for the creation of personalised dashboards with relevant analytics that don’t require previous experience. Advanced teams can use the platform to collaborate on data projects via the cloud or to create apps with embedded analytics functionality.

Not everyone will be able to digest a file of complex analytical insights and apply them to business practices. These tools make it easier to filter, explore, mine and visualise data for immediate answers. In other words, once your team starts working with previously dark data, they’ll still need to have the appropriate tools to transform it into actionable insights.

Enterprises can leverage AI to make their information discoverable through search and other means. Driven by machine learning algorithms, AI bots can crawl through all the files, then automatically classify and tag them so that organisations can begin understanding the information they own.
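The classify-and-tag idea can be sketched in a few lines of Python. Note the swap: this example uses hand-written keyword rules rather than a trained machine learning model, purely to keep it self-contained. The tag names, patterns and sample documents are all hypothetical.

```python
import re

# Hypothetical keyword rules standing in for a trained classifier.
TAG_RULES = {
    "personal_data": re.compile(r"\b(passport|date of birth|id number)\b", re.I),
    "financial": re.compile(r"\b(invoice|payment|salary)\b", re.I),
    "hr_record": re.compile(r"\b(employee|resignation|contract)\b", re.I),
}


def classify(text):
    """Return the sorted tags whose pattern matches, or ['unclassified']."""
    tags = sorted(tag for tag, pattern in TAG_RULES.items() if pattern.search(text))
    return tags or ["unclassified"]


def build_index(documents):
    """Map each tag to the names of the documents carrying it."""
    index = {}
    for name, text in documents.items():
        for tag in classify(text):
            index.setdefault(tag, []).append(name)
    return index


# Made-up file contents; a real crawler would read these from storage.
documents = {
    "old_payroll.txt": "Salary payment for employee 1042, March.",
    "scan_0031.txt": "Passport copy and date of birth on file.",
    "notes.txt": "Meeting moved to Thursday.",
}

index = build_index(documents)
```

Even a crude index like this makes the files searchable by tag, and the `unclassified` bucket shows where a better model or a human review is still needed.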

Classification as a prerequisite for governance

In a nutshell, data classification helps determine what baseline controls are suitable for the actionable and safe management of that data. Before dark data can churn out some form of revelation, businesses have to ask the right questions that are relevant to their undertaking, and they can’t do that without full visibility. Some data has huge potential, while other data is simply not needed.

By reorganising it so that relevant documents can be found faster and deleting unnecessary data, companies can minimise costs and potentially even monetise data they didn’t know existed in the first place.

All of the above, of course, requires a safe environment. It’s critical to be able to properly safeguard the system and data, especially as your business grows and security needs change. There are countless ways in which information can leak, get permanently lost or misplaced: from using software that isn’t IT-approved and using insecure servers and storage services, to acquiring other companies with their technologies and legacy apps and everything in between.

To conquer the problem of gaps in a company’s safety net, security can be split into three levels:
  • System-level - role-based security and integration
  • Object-level - access for specific dashboards and subsystems to specific users and groups
  • Data-level - user-based security according to a specific user’s needs
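To make the three levels concrete, here is a minimal Python sketch of how the checks might stack. The roles, groups, dashboard names and the region-based row filter are all assumptions made for illustration; they are not taken from any particular BI product.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    role: str                      # checked at system level
    groups: set = field(default_factory=set)  # checked at object level
    region: str = ""               # used for data-level filtering


# Hypothetical configuration.
SYSTEM_ROLES = {"admin", "analyst", "viewer"}
DASHBOARD_GROUPS = {"sales_dashboard": {"sales"}, "hr_dashboard": {"hr"}}
ROWS = [
    {"region": "EMEA", "revenue": 120},
    {"region": "APAC", "revenue": 95},
]


def can_log_in(user):
    """System level: only known roles may use the system at all."""
    return user.role in SYSTEM_ROLES


def can_open(user, dashboard):
    """Object level: admins, or members of a group granted this dashboard."""
    allowed = DASHBOARD_GROUPS.get(dashboard, set())
    return user.role == "admin" or bool(user.groups & allowed)


def visible_rows(user, rows):
    """Data level: non-admins see only rows for their own region."""
    if user.role == "admin":
        return list(rows)
    return [row for row in rows if row["region"] == user.region]


def query(user, dashboard):
    """Apply the three checks in order; any failure yields no data."""
    if not can_log_in(user) or not can_open(user, dashboard):
        return []
    return visible_rows(user, ROWS)


ana = User("ana", "analyst", {"sales"}, "EMEA")
root = User("root", "admin")
guest = User("guest", "contractor", {"sales"}, "EMEA")
```

Here `query(ana, "sales_dashboard")` returns only the EMEA row, `query(ana, "hr_dashboard")` returns nothing because she is not in the `hr` group, and `guest` is stopped at the system level because `contractor` is not a recognised role.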

In doing so, businesses can save money on potential compliance fines, as well as minimise the risks associated with hacks and data breaches.

Dark data demands a strategic approach

Most companies leave an enormous amount of dark data behind them. To effectively bring it to light, a holistic approach is needed. The aim of illuminating dark data is not to digitise information and tap into it just because it’s there. The pursuit of dark data must lead to real business outcomes and as such, has to be close to the top of organisational priorities.

There must be a strategy in place to seamlessly transform technologies, processes and operations in order to take maximum advantage of data through automation and analytics. If you unlock its relevance and usefulness, the benefits of dark data far outweigh the costs of maintaining it. Data counts – but only if you can count it.

18 Aug 2020 13:55


About Boris Dzhingarov

Boris Dzhingarov graduated from UNWE with a major in marketing. He is the CEO of ESBO ltd, a brand mentioning agency, and writes for several online sites.