MDM and Big Data Make Each Other Better
This is the fourth blog post in a series exploring Enterprise Data Governance. In the first post, we briefly defined transaction data, metadata, master data, reference data, and dimensional data. In the second, we further explored reference data and its role in data governance solutions. In the third, we discussed data governance needs within Financial Services, a highly regulated industry, and how other industries can benefit from these capabilities.
In this installment, we bring Big Data into the discussion.
Big Data allows companies to process data sets that are too large to handle by traditional means. These data sets can originate from within the company; for example, a large airline may produce massive volumes of diagnostic data every hour, far more than is cost-effective to store long term. Many companies also focus on data originating outside the enterprise, such as social media, financial instrument performance, or weather monitoring. With so many varied sources of Big Data available, can Big Data be governed? If so, is it worth the effort?
Before answering those questions, it’s important to point out that Big Data vendors may promote the features of their software solutions rather than discuss Big Data governance. Product vendors tend to discuss Big Data use cases from the factory perspective; in other words, in terms of the types of data sought or the processes being built.
Vendors will typically cover topics such as:
- Social Media Exploration
- Internet of Things
- Data Warehouse Modernization
However, it is essential for a Big Data strategy to include the ability to drive value from Big Data insights across relevant business use cases, since those use cases drive the investment. That’s where Master Data Management (MDM) comes into play.
Business use cases to consider include:
- Customer Analytics
- Product Marketing Effectiveness
- Operational Efficiencies
- Merger and Acquisition Impacts
- Market Opportunity Analysis
The key is understanding what value propositions are sought when investing in Big Data solutions; this will allow companies to gain a competitive advantage. Rather than attempting to govern what may be “ungovernable,” MDM seeks to bring clarity to the key aspects of the business that drive performance. This, in turn, lends clarity to key business drivers that can be improved through Big Data analysis. In other words, MDM facilitates an increase in ROI from Big Data investment by focusing on driving analysis from well-governed enterprise data.
One of the fundamental Big Data principles is that greater insights can be attained from aggregations and statistics than can be gleaned from any individual record. For example, to analyze consumer sentiment regarding a product, a company may mine social media for data. However, this poses a challenge: brand sentiment is often easier to analyze than sentiment towards specific products. MDM is fundamental to solving this, because it lets the company mine Big Data against a cleansed and consolidated list of products.
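As a rough illustration of mining against a consolidated product list, the sketch below tags social media posts with canonical product IDs by matching known aliases. The product IDs, aliases, and sample posts are all made up for the example, and real matching would need far more than simple alias lookup.

```python
import re
from collections import defaultdict

# Hypothetical master product list: canonical product ID -> known aliases.
MASTER_PRODUCTS = {
    "P-100": ["Acme Blender 3000", "Blender 3000", "AB3000"],
    "P-200": ["Acme Toaster Pro", "Toaster Pro"],
}

def build_alias_index(master):
    """Map each lower-cased alias to its canonical product ID."""
    return {alias.lower(): pid for pid, aliases in master.items() for alias in aliases}

def tag_posts(posts, master):
    """Return {product_id: [posts mentioning any alias of that product]}."""
    index = build_alias_index(master)
    # Try longer aliases first so the most specific name is matched.
    aliases = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in aliases) + r")\b",
        re.IGNORECASE,
    )
    tagged = defaultdict(list)
    for post in posts:
        product_ids = {index[m.group(1).lower()] for m in pattern.finditer(post)}
        for pid in product_ids:
            tagged[pid].append(post)
    return dict(tagged)

posts = [
    "Loving my new blender 3000!",
    "The AB3000 broke after a week.",
    "Acme makes great stuff.",  # brand only, no specific product
    "Toaster Pro burns everything.",
]
by_product = tag_posts(posts, MASTER_PRODUCTS)
```

The brand-only post is left untagged, which mirrors the point above: without a governed list of products and their aliases, only brand-level analysis is possible.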
All companies need to address similar challenges just to obtain the right subset of Big Data to analyze. Once companies have assembled the proper datasets, what separates their effectiveness in the analysis stage is the ability to leverage master data to create meaningful aggregations. A company that can analyze customer sentiment by geography, business region, and operational cost will be able to make more rapid and meaningful business process adjustments than a competitor that only considers geography. Only enterprises with well-managed MDM programs can adjust business practices based on this analysis with confidence.
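To make the idea of master-data-driven aggregation concrete, here is a minimal sketch that averages sentiment scores by different attributes of a customer master. The customer IDs, attribute names, and scores are invented for illustration, and the records are assumed to be already linked to customer IDs.

```python
from collections import defaultdict

# Hypothetical customer master: customer ID -> governed attributes.
CUSTOMER_MASTER = {
    "C1": {"geography": "US", "business_region": "Retail East"},
    "C2": {"geography": "US", "business_region": "Retail West"},
    "C3": {"geography": "DE", "business_region": "Retail EMEA"},
}

# Hypothetical sentiment scores (-1.0 to 1.0), already linked to customer IDs.
sentiment_records = [("C1", 0.8), ("C1", 0.4), ("C2", -0.6), ("C3", 0.2)]

def average_sentiment(records, master, dimension):
    """Average sentiment grouped by one master data attribute."""
    totals = defaultdict(lambda: [0.0, 0])  # attribute value -> [sum, count]
    for customer_id, score in records:
        attrs = master.get(customer_id)
        if attrs is None:
            continue  # records with no master data match cannot be aggregated
        bucket = totals[attrs[dimension]]
        bucket[0] += score
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

by_geo = average_sentiment(sentiment_records, CUSTOMER_MASTER, "geography")
by_region = average_sentiment(sentiment_records, CUSTOMER_MASTER, "business_region")
```

In this toy data, the geography view shows a mildly positive US average, while the business region view reveals that one US region is strongly positive and the other strongly negative. The second view is only available because the region attribute is maintained in master data.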
After the initial implementation, an effective Big Data strategy will plan for growth along the capability-maturity learning curve. A useful analogy is how master reference data is used to manage acquisitions in a phased approach. When a business is acquired, its chart of accounts is mapped onto the parent company’s chart to produce consolidated financial results. Sometimes the parent company’s chart of accounts must be extended to accommodate the new business. These data sets and mappings then make their way into the data warehouse. For conglomerates, that may be as far as it goes, but in many cases the acquired business ultimately moves to the parent company’s chart of accounts and systems, where MDM then supports a full-blown financial transformation process within the acquired business.
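The chart-of-accounts mapping described above can be sketched in a few lines. The account codes, balances, and mapping table here are hypothetical; the point is only to show how mapped balances roll up to the parent chart and how unmapped codes surface as candidates for extending it.

```python
from collections import defaultdict

# Hypothetical mapping: acquired company's account code -> parent account code.
ACCOUNT_MAP = {
    "4000": "REV-100",  # product sales
    "4010": "REV-100",  # service sales, rolled into the same parent account
    "6000": "EXP-200",  # salaries
}

def consolidate(balances, account_map):
    """Roll acquired-company balances up to parent accounts.

    Returns (totals by parent account, sorted list of unmapped codes).
    Unmapped codes signal that the parent chart or the mapping needs
    to be extended.
    """
    totals = defaultdict(float)
    unmapped = []
    for code, amount in balances.items():
        parent = account_map.get(code)
        if parent is None:
            unmapped.append(code)
        else:
            totals[parent] += amount
    return dict(totals), sorted(unmapped)

acquired_balances = {"4000": 1200.0, "4010": 300.0, "6000": -900.0, "7500": -50.0}
totals, unmapped = consolidate(acquired_balances, ACCOUNT_MAP)
```

Here account `7500` has no counterpart in the parent chart, so it is reported rather than silently dropped.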
Big Data follows a similar progression, where master and reference data provide the mappings that align external, unstructured data sources with internal data sources for analytics. As the Big Data processes mature, they influence governance processes, which in turn extend the validated code sets and mappings to accommodate the high-value, unstructured data sources. This establishes an ongoing feedback loop between MDM and Big Data that increases the effectiveness of both.
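One way to picture this feedback loop is shown below: terms from an external source that have no validated mapping are counted, frequent ones are proposed to data stewards, and approved mappings are added to the code set. The function names, the frequency threshold, and the sample terms are assumptions made for this sketch, not part of any particular MDM product.

```python
from collections import Counter

def propose_candidates(external_terms, code_map, min_count=2):
    """Return unmapped terms seen at least min_count times, most frequent first."""
    unmapped = Counter(
        term.lower() for term in external_terms if term.lower() not in code_map
    )
    return [term for term, count in unmapped.most_common() if count >= min_count]

def approve(code_map, term, code):
    """Record a steward-approved mapping and return the extended code set."""
    updated = dict(code_map)
    updated[term.lower()] = code
    return updated

# Hypothetical validated code set and terms observed in an external source.
code_map = {"blender 3000": "P-100"}
terms = ["Blender 3000", "smoothie maker", "Smoothie Maker", "juicer"]

candidates = propose_candidates(terms, code_map)         # goes to governance review
code_map = approve(code_map, "smoothie maker", "P-100")  # steward decision
remaining = propose_candidates(terms, code_map)          # next pass through the loop
```

Each pass leaves fewer unmapped high-frequency terms, so more of the external data can be aligned with internal master data on the next run.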
Process alignment between MDM and Big Data is critical to maximizing these synergies. There are many valid technical options, but they are of secondary importance to the business and data governance use cases. For example, many data architects have a preconceived notion that MDM should push master data into the data lake to better support the Big Data best practice of “Transform in Place.” While this is certainly an option, solutions like Oracle’s Big Data Appliance include highly scalable technologies that allow Hadoop file storage to be accessed directly by SQL and integration technologies (bypassing batch MapReduce processing entirely), which makes mapping and transforming unstructured data in middleware an extensible approach.
In summary, Big Data analytics resemble traditional Data Warehouse analytics in that the better the data is governed, the better the insights from analysis will be. This will always be true, regardless of the technologies utilized.