Big Data and Analytics have been much-discussed topics in business and IT news for a while now. Mining and analyzing vast amounts of data to recognize patterns, correlations and trends was not practical until recently. Two concurrent streams have come together to make it possible: advancing hardware scalability, speed and affordability, and advancing software capabilities around natural language processing and artificial intelligence. Relatively cheap storage enables vast amounts of data to be collected and retained from a mind-numbing array of sources, including almost any smartphone or computer application. Imagining practical, valuable applications for this capability is one of the few limiting factors in the burgeoning deployment of this technology.
Most companies maintain transactional and customer-facing documents in unstructured content repositories. These documents typically contain historical snapshots and unique business context for much of what goes on in the business. Many of the relevant business data elements come together in only one place: the regulatory-compliant communications sent to your customers. That is why these documents are often called the “single source of the truth,” and why they can be such an important source of data for your Big Data and Analytics initiatives.
Until now, structured data has been the primary source for Big Data and for the analytics engines and use cases we’ve heard about. When you consider that perhaps 90% of prospective data is unstructured and often “locked away,” out of sight and out of mind in departmental and enterprise Content Repositories, it becomes clear that this is a vast untapped reservoir of potential data to explore.
The Content Repositories that house this unstructured data are notoriously proprietary, siloed and opaque. It’s hard to know what’s inside them, and difficult to extract and repurpose the content even when you do know. The retention of much of what is archived in these repositories is mandated by various regulations and business policies. Many companies view these repositories only as costly necessary evils that add little to no value or competitive advantage, and certainly not enough to offset their costs.
What if there were a way to find out what’s inside these content archives, extract it, reformat it and stage it for input to a Big Data Analytics application? What if there were compelling use cases for mining and analyzing this data to recognize patterns, correlations and trends, or to satisfy regulatory compliance requirements? This would turn the value proposition for these archives upside down, transforming what is considered a costly necessary evil into a rich source of business value and insight.
A real-life example of this occurred when a large bank needed to provide legal proof of its compliance with banking regulations by supplying a comprehensive analysis of its previous 10 years of customer credit card statements. The bank needed an automated solution to identify, extract and repurpose the necessary information to meet its regulatory obligations. CrawfordTech developed and deployed purpose-built software to find and retrieve statements archived in AFP document format, and then to identify and extract all the required data and output it as XML.
Once extracted, the statement data was pushed through an analytics audit system so that it could be analyzed for potential overcharges and overpayments. Deploying this solution has allowed the bank to proactively address any mistakes it may have made and to provide proof of compliance to regulators, leading to increased customer satisfaction and avoiding potentially costly fines.
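To make the audit step more concrete, here is a minimal sketch of how extracted statement XML might be scanned for potential overcharges. It is not the bank's or CrawfordTech's actual system: the XML layout, the function name find_overcharges and the fee caps are all assumptions invented for illustration, since the real schema and audit rules are not described here.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical XML layout for extracted statements. The real schema
# produced by the extraction step is not described in this article.
SAMPLE_XML = """
<statements>
  <statement account="1001" period="2014-03">
    <fee type="late">25.00</fee>
    <fee type="late">45.00</fee>
  </statement>
  <statement account="1002" period="2014-03">
    <fee type="annual">95.00</fee>
  </statement>
</statements>
"""

# Assumed fee caps, for illustration only (not actual regulatory limits).
FEE_CAPS = {"late": Decimal("35.00")}


def find_overcharges(xml_text, caps):
    """Return one record per fee that exceeds the cap for its type."""
    root = ET.fromstring(xml_text.strip())
    findings = []
    for statement in root.iter("statement"):
        for fee in statement.iter("fee"):
            cap = caps.get(fee.get("type"))
            if cap is None:
                continue  # no rule defined for this fee type
            amount = Decimal(fee.text.strip())
            if amount > cap:
                findings.append({
                    "account": statement.get("account"),
                    "period": statement.get("period"),
                    "fee_type": fee.get("type"),
                    "charged": amount,
                    "excess": amount - cap,
                })
    return findings


if __name__ == "__main__":
    for finding in find_overcharges(SAMPLE_XML, FEE_CAPS):
        print(finding)
```

Amounts are parsed as Decimal rather than float so that currency comparisons stay exact. A production audit would apply far richer rules across many years of statements, but the shape is the same: structured data out of the archive, rules applied, exceptions listed for review.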
As Big Data Analytics becomes more pervasive, and even ubiquitous, imagination is the main limit on how this unique historical archived data can be harnessed for innovative applications.
At Crawford Technologies, we understand content archives and document and data formats, and we offer many enabling and innovative tools to manage, extend and extract greater value from your Enterprise Content Management investment.