Records Management (RM) is supported by mature records management systems. However, the data explosion is raising new concerns about how RM should be handled.
A few ongoing issues include big data, master data management (MDM), and how to deal with unstructured data and records in unusual formats such as those contained in graph databases.
Records are kept for compliance purposes, for their business value, and sometimes simply because no process has been implemented for systematically removing them.
Organizations continue to struggle with the massive volume of big data. IT and legal have different priorities about what to keep, and getting rid of data makes IT nervous, but there are times when records should be dispositioned.
Data stored in data lakes is mainly uncontrolled and typically has not had data retention processes applied to it. Data quality for big data repositories is usually not applied until someone actually needs to use the data.
Quality assurance might include making sure that duplicate records are dealt with appropriately, that inaccurate information is excluded or annotated and that data from multiple sources is being mapped accurately to the destination database or record. In traditional data warehouses, data is typically extracted, transformed and loaded. With a data lake, data is extracted (or acquired), loaded and then not transformed until required for a specific need.
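The load-first, transform-later pattern described above can be illustrated with a minimal sketch. The row layout, the `RAW_LAKE` name, the `transform_on_read` function and the age range check are all assumptions made for illustration; they are not taken from any particular data lake product.

```python
# Hypothetical raw rows as they might land in a data lake: loaded as-is,
# with a duplicate and an implausible value left in place.
RAW_LAKE = [
    {"customer_id": "C1", "email": "ann@example.com", "age": 34},
    {"customer_id": "C1", "email": "ann@example.com", "age": 34},
    {"customer_id": "C2", "email": "bob@example.com", "age": -5},
]


def transform_on_read(rows):
    """Apply quality checks only when someone actually needs the data."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["customer_id"], row["email"])
        if key in seen:
            continue  # skip a duplicate of a row already kept
        seen.add(key)
        out = dict(row)  # copy so the raw lake data stays untouched
        # Annotate a suspicious value instead of silently dropping the row.
        in_range = 0 <= row["age"] <= 120
        out["quality_flags"] = [] if in_range else ["age_out_of_range"]
        cleaned.append(out)
    return cleaned


cleaned_rows = transform_on_read(RAW_LAKE)
```

Here the raw rows are never modified; the duplicate is skipped and the implausible age is flagged only at the point of use, which mirrors the extract, load, then transform order of a data lake.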
MDM is a method for improving data quality by reconciling inconsistencies across multiple data sources to create a single, consistent and comprehensive view of critical business data. The master file is recognized as the best that is available and ideally is used enterprise wide for analytics and decision making. But from an RM perspective, questions arise, such as what would happen if the original source data reached the end of its retention schedule.
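As a rough illustration of how inconsistent sources might be reconciled into a master record, the sketch below merges two hypothetical source records. The survivorship rule used here (the non-empty value from the most recently updated source wins) is only one possible assumption; real MDM tools support many other rules, and the source names and fields are invented for the example.

```python
from datetime import date


def build_golden_record(source_records):
    """Merge per-source records into a single master view.

    Assumed survivorship rule: for each field, the non-empty value from
    the most recently updated source wins.
    """
    golden = {}
    # Process oldest first so that newer values overwrite older ones.
    for rec in sorted(source_records, key=lambda r: r["updated"]):
        for field_name, value in rec["fields"].items():
            if value not in (None, ""):
                golden[field_name] = value
    return golden


# Hypothetical source systems holding conflicting data on one customer.
crm = {
    "source": "crm",
    "updated": date(2023, 1, 10),
    "fields": {"name": "Acme Corp.", "phone": "555-0100", "city": ""},
}
billing = {
    "source": "billing",
    "updated": date(2023, 6, 2),
    "fields": {"name": "ACME Corporation", "phone": None, "city": "Austin"},
}

golden = build_golden_record([crm, billing])
```

With these inputs, the name and city come from the newer billing record, while the phone number survives from the older CRM record because billing has no value for it.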
A record is information that is used to make a business decision, and it can be either an original set of data or a derivative record based on master data. The record is a snapshot that becomes an unalterable document and is stored in a system. Even if the original information is destroyed or transformed, the record lives on as a captured image or artifact. Therefore, the “golden record” that constitutes the best and most accurate information can become a persistent piece of data within an RM system.
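One way to picture a record as an unalterable snapshot is sketched below. The `RecordSnapshot` class and `capture` function are hypothetical names, and the approach (a deep copy wrapped in a read-only view, plus a content hash) is an assumption about how such a capture could work, not a description of any specific RM system.

```python
import copy
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType


@dataclass(frozen=True)
class RecordSnapshot:
    content: MappingProxyType  # read-only view of the captured data
    captured_at: str           # UTC timestamp of the capture
    sha256: str                # fingerprint for later integrity checks


def capture(source_data: dict) -> RecordSnapshot:
    """Freeze a copy of the data as it stood when the decision was made."""
    frozen_copy = copy.deepcopy(source_data)
    payload = json.dumps(frozen_copy, sort_keys=True).encode("utf-8")
    return RecordSnapshot(
        content=MappingProxyType(frozen_copy),
        captured_at=datetime.now(timezone.utc).isoformat(),
        sha256=hashlib.sha256(payload).hexdigest(),
    )


master = {"name": "ACME Corporation", "city": "Austin"}
snapshot = capture(master)

# The source is later transformed and then destroyed; the record is unaffected.
master["city"] = "Dallas"
master.clear()
```

After the source dictionary is changed and emptied, `snapshot.content` still holds the values as captured, and attempts to assign to the snapshot or its content raise an error.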
Unstructured data challenge
A large percentage of records management efforts are oriented toward being ready for e-discovery. This is much more of a problem in the case of unstructured data than for MDM. MDM has gone well beyond the narrow structure of relational databases and is entering the realm of big data, but its roots are still in the world of structured databases with well-defined metadata classifications, which makes RM for such records a more straightforward process.
The challenge with unstructured data is to build out the semantics so that the content management or RM and the data management components can work together. In the case of a contract, for example, the document might have many pieces of master data. It contains transactional data with certain values, such as product or customer information, and a specialist data steward or data librarian might be needed to tag and classify what data values are represented within that contract. With both the content and the data classified using a consistent semantic, it would be much simpler to bring intelligent parsing into the picture to bridge the gap between unstructured and structured data. Auto-classification of records can assist, although human intervention remains an essential element.
Redundant, obsolete and trivial information constitutes a large portion of stored information in many organizations. The information generated by organizations needs to be under control whether it consists of official records or non-record documents with business value. Otherwise, it will accumulate and become completely unmanageable. On the other hand, if organizations aggressively delete documents, they run the risk of employees creating underground archives of information they don’t want to relinquish, which can pose significant risks. Companies need to approach this with a well-defined strategy.
The system should follow a “five-second rule,” allowing employees to easily save documents using built-in classification instead of a lot of manual tagging. The key is to make the system intuitive enough for any employee to use with just a few seconds of time and a few clicks of the mouse. In addition, the value of good records management needs to be communicated and the value “sold,” so employees understand that it can actually help them with their work rather than being a burden. A well-designed system hides the complexity from users and puts it in the back end. It is more difficult to set up this type of system initially, but as more information is created, the importance of managing it also increases, in order to reduce costs and risk.
Graph databases—a different kind of record
Graph databases store information in a way that focuses on the relationships among data elements. Those representations could include networks and hierarchies as well as other relationships among nodes.
For example, a graph might include customers, orders, products and promotions. The network itself, as represented in the graph database, might be a useful record. A network could show relationships that indicate fraudulent activities, and those networks could be saved as records.
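A small sketch can show how relationships in a graph might surface a suspicious network. It uses a plain Python edge list rather than a real graph database, and everything in it is hypothetical: the node names, the `PLACED` and `PAID_WITH` relationship labels, the payment card nodes, and the assumption that several customers sharing one card is worth a closer look.

```python
from collections import defaultdict

# Hypothetical edges: (source node, relationship, target node).
EDGES = [
    ("customer:ann", "PLACED", "order:1001"),
    ("customer:bob", "PLACED", "order:1002"),
    ("customer:cy", "PLACED", "order:1003"),
    ("order:1001", "PAID_WITH", "card:4111"),
    ("order:1002", "PAID_WITH", "card:4111"),
    ("order:1003", "PAID_WITH", "card:9999"),
]


def customers_sharing_cards(edges):
    """Return cards linked, through orders, to more than one customer."""
    placed_by = {}
    for src, rel, dst in edges:
        if rel == "PLACED":
            placed_by[dst] = src  # order -> customer who placed it

    card_customers = defaultdict(set)
    for src, rel, dst in edges:
        if rel == "PAID_WITH" and src in placed_by:
            card_customers[dst].add(placed_by[src])

    # Keep only cards shared by several customers (the assumed warning sign).
    return {
        card: sorted(customers)
        for card, customers in card_customers.items()
        if len(customers) > 1
    }


suspicious = customers_sharing_cards(EDGES)
```

The resulting mapping of a shared card to the customers connected to it is the kind of small network that could itself be captured and retained as a record.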
Graph databases are used in several other ways to aid records management. Many organizations today are creating their own internal knowledge graphs that represent records as a connected data model to aid search and discovery. This knowledge graph speeds up risk analysis and compliance determination. Graph databases are also used within the legal industry to speed up legal research associated with a case. A graph of case files, opinions and other documents makes it easy for researchers to identify information that may be material to a case.
The RM struggle
Studies of records management consistently show that only a minority of organizations have a retention schedule in place that would be considered legally credible and that some have no schedule at all.
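At its simplest, a retention schedule maps a record category to a retention period. The sketch below is a hedged illustration of that idea only; the categories, the periods in years, the `disposition_due` function and the legal hold flag are assumptions, and a legally credible schedule would be defined with counsel rather than hard-coded.

```python
from datetime import date

# Hypothetical schedule: record category -> retention period in years.
RETENTION_YEARS = {"invoice": 7, "contract": 10, "marketing_email": 2}


def disposition_due(category, closed_on, today, legal_hold=False):
    """Return True when a record is past retention and not on legal hold."""
    if legal_hold:
        return False  # a hold suspends disposition regardless of age
    years = RETENTION_YEARS.get(category)
    if years is None:
        return False  # unscheduled categories are never auto-dispositioned
    try:
        expires = closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # closed_on was Feb 29 and the target year is not a leap year
        expires = closed_on.replace(year=closed_on.year + years, day=28)
    return today >= expires


today = date(2023, 1, 1)
invoice_due = disposition_due("invoice", date(2015, 3, 1), today)
contract_due = disposition_due("contract", date(2015, 3, 1), today)
held_due = disposition_due("invoice", date(2015, 3, 1), today, legal_hold=True)
```

In this example the 2015 invoice is past its assumed seven-year period, the contract is not yet past its ten-year period, and the same invoice under a legal hold is not eligible for disposition.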
Even if a retention schedule is in place, compliance with it is often poor. Some go so far as to say that RM is facing a crisis. There is a battle shaping up between those who essentially want to keep everything forever because they might be able to extract business value from it and those who want to use records and information management to effectively get rid of as much information as soon as possible. It is very important to reconcile these differences.
From a business perspective, the potential upside of retaining corporate records so they can be used to gain insights into customer behavior, for example, may outweigh the apparent risks that result from non-compliance. Storage costs have drastically decreased and are often bundled into other paid services such as messaging, collaboration and large-scale analytics services in the cloud. The cost of combing through and removing unnecessary information can be high. I have seen a number of scenarios in which companies have undertaken projects to get rid of data, and they have found it more expensive than just keeping it.
RM is often viewed as a legal compliance function whose highest measurable value comes from getting rid of outdated and useless records. However, its greatest value actually lies in its framework for understanding and classifying information so that the business value of that information can be exploited. RM professionals who realize this will not only survive, but also thrive in a world of increasing information complexity and volume.
If organizations view RM as a resource rather than a burden, it can contribute to enterprise success. In many respects, the management of enterprise information is already becoming more integrated and less siloed. For example, most enterprise content management (ECM) systems now have RM functionality. The same classification technology used for e-discovery is also used for classification of enterprise content. Seeing RM as part of that environment and recognizing its ability to enrich the understanding of business content as well as ensuring compliance can support that convergence.
Governance can be a unifying technique that provides a framework to encompass any type of information as it is created and managed. Governance is a set of multidisciplinary structures, policies and procedures to manage enterprise information in a way that supports an organization’s near-term and long-term operational and legal requirements. It is important to consider the impact of all forms of information, from big data to graph data, but within a comprehensive strategy of governance, the changing environment for RM becomes more tractable.
Galaxy Consulting has over 20 years of experience in records management. Please contact us today for a free consultation.