Tuesday, November 24, 2009

Master Data Management: Building a Foundation for Success


Evan Levy wrote a nice white paper, provided by Informatica, called Master Data Management: Building a Foundation for Success (free registration required to download the white paper). In it, he discusses the challenges of implementing master data management. He notes that in his book Customer Data Integration: Reaching a Single Version of the Truth, co-written with Jill Dyché, his partner and co-founder of Baseline Consulting, MDM is defined as: “The set of disciplines and methods to ensure the currency, meaning, and quality of a company’s reference data that is shared across various systems and organizations.” Below is a summary of the white paper:

Depending on the maturity and size of an IT organization, there may be a sizable collection of infrastructure services associated with managing data and systems. In his experience working with firms on new MDM programs, five technical functions are typically core to MDM processing. Frequently, these capabilities are already part of the IT infrastructure. The five functions are:

1. Data cleansing and correction

Data cleansing is fairly common within most IT organizations, particularly when data is highly prone to data entry errors. Most data quality environments have been set up to focus on the cleansing and correction of well-understood data values. Customer name and address values are frequently the starting point for many companies seeking to clean up their “bad data.”
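To make this concrete, below is a minimal Python sketch of the kind of rule-based cleansing that might be applied to customer name and address values. The abbreviation table and function names are illustrative assumptions rather than anything prescribed in the white paper; a real MDM program would usually delegate this work to a dedicated data quality tool.

```python
# Hypothetical cleansing rules for customer name and address values.
# A dedicated data quality tool would apply far richer rule sets.
STREET_ABBREVIATIONS = {
    "st": "Street", "st.": "Street",
    "ave": "Avenue", "ave.": "Avenue",
    "rd": "Road", "rd.": "Road",
}

def cleanse_name(raw: str) -> str:
    """Trim stray whitespace and normalize the casing of a customer name."""
    return " ".join(raw.split()).title()

def cleanse_street(raw: str) -> str:
    """Expand common street-type abbreviations and normalize casing."""
    words = " ".join(raw.split()).title().split(" ")
    return " ".join(STREET_ABBREVIATIONS.get(w.lower(), w) for w in words)

print(cleanse_name("  sMITH,   joHn "))   # -> "Smith, John"
print(cleanse_street("123 main st."))     # -> "123 Main Street"
```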

2. Metadata

Metadata isn’t limited to identifying the individual data element names and their definitions. It also identifies the data values and the origin of that data (its lineage). But metadata isn’t MDM; metadata focuses on the descriptive detail about data.

The most visible metadata features that are useful in an MDM environment include:
- Terminology or data element names—for instance, the data element name is “ItemColor”
- Data values—for instance, the acceptable values are “red, green, or blue”
- Value representation—for instance, CC0000, 00CC00, and 0066CC
- Lineage details—for instance, “Created: 01/12/2009. System of origin: Order System”
- Data definitions—for instance, “the color of the item”

The use of existing metadata structures can dramatically simplify other systems’ abilities to adopt and use MDM because they may already recognize and work with the existing metadata.
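As a rough illustration, here is what a single metadata entry covering those features might look like in Python. The MetadataEntry structure and its field names are hypothetical; an actual metadata repository would expose a much richer model.

```python
from dataclasses import dataclass

# A hypothetical metadata record mirroring the features listed above:
# element name, definition, acceptable values, value representation, and lineage.
@dataclass
class MetadataEntry:
    element_name: str                # terminology / data element name
    definition: str                  # business definition
    acceptable_values: list[str]     # allowed data values
    representations: dict[str, str]  # how each value is physically stored
    created: str                     # lineage: when the element was created
    system_of_origin: str            # lineage: which system the data comes from

item_color = MetadataEntry(
    element_name="ItemColor",
    definition="the color of the item",
    acceptable_values=["red", "green", "blue"],
    representations={"red": "CC0000", "green": "00CC00", "blue": "0066CC"},
    created="01/12/2009",
    system_of_origin="Order System",
)
```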

3. Security and access services

One of the lesser-known functions within MDM is the need to manage access to individual reference data elements. MDM systems typically manipulate data using CRUD processing—to create, read, update, and delete—so centralizing the permissions to access and change key data is an important part of maintaining master data. MDM can support a very detailed or granular level of security access to the individual reference data elements.

Because of the granular level of data access that MDM affords—CRUD processing against individual reference values—an MDM system should avoid having its own siloed proprietary security solution. Instead it needs to interface to existing security services that an IT department relies on to manage application access.
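A minimal sketch of what granular CRUD authorization on a reference data element could look like is shown below. The role names and the in-memory permission table are stand-ins for illustration; as the white paper stresses, a real MDM system would look these entitlements up in the security services the IT department already runs rather than keeping its own silo.

```python
from enum import Enum

class Action(Enum):
    CREATE = "create"
    READ = "read"
    UPDATE = "update"
    DELETE = "delete"

# role -> data element -> allowed actions (a stand-in for an external entitlement store)
PERMISSIONS = {
    "product_steward": {"ItemColor": {Action.CREATE, Action.READ, Action.UPDATE}},
    "analyst":         {"ItemColor": {Action.READ}},
}

def is_allowed(role: str, element: str, action: Action) -> bool:
    """Check whether a role may perform a CRUD action on a reference data element."""
    return action in PERMISSIONS.get(role, {}).get(element, set())

print(is_allowed("analyst", "ItemColor", Action.READ))    # True
print(is_allowed("analyst", "ItemColor", Action.DELETE))  # False
```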

4. Data migration

Data migration technologies alleviate the need to develop specialized code to extract and transfer data between different systems. Companies have invested heavily in ETL tools and application messaging technologies such as enterprise service bus (ESB) because of the need to move large volumes of data between their various application systems. The challenge in moving data between systems exists because the data is rarely stored in a simple, intuitive structure.

Regardless of the specific tools and technologies used in transporting data into or out of an MDM system, there are two basic ways of moving data: large-volume bulk data (loads or extracts) and individual transactions. Loading the MDM server initially with data from an application system requires bulk data loading capabilities; ETL tools are very well equipped to handle the application data extraction and MDM system loading.
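The two movement patterns can be sketched roughly as follows; the MDMHub class and its methods are hypothetical placeholders, not a real product API, but they show the difference between an initial bulk load and the single-record transactions that follow.

```python
import csv

class MDMHub:
    """A toy stand-in for an MDM hub, keyed by a business identifier."""
    def __init__(self):
        self.records = {}

    def bulk_load(self, csv_path: str, key_column: str) -> int:
        """Initial population: ingest an application extract in one pass (ETL-style)."""
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                self.records[row[key_column]] = row
        return len(self.records)

    def apply_transaction(self, key: str, changes: dict) -> None:
        """Ongoing synchronization: apply one change, e.g. a message delivered over an ESB."""
        self.records.setdefault(key, {}).update(changes)

hub = MDMHub()
# hub.bulk_load("customer_extract.csv", key_column="customer_id")  # bulk path
hub.apply_transaction("C-1001", {"status": "active"})              # transactional path
```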

5. Identity resolution

Most MDM solutions also benefit from the ability to uniquely identify and manage individual reference details. With simple subject areas (like color or size), managing and tracking the various reference values is very straightforward. This simplicity comes from the fact that the values are easy to differentiate; there’s not much logic involved in determining whether a value matches “red,” “green,” or “blue.” Complexity occurs when the subject being mastered has more variables, and variables that can be vague, such as a person. With complex subject areas, mastering the reference value requires more sophisticated analysis of the numerous attributes associated with the individual reference value, such as a name or address.
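To illustrate why complex subjects are harder, here is a very small attribute-matching sketch in Python. The weights and threshold are arbitrary assumptions chosen for the example; production identity resolution engines apply far more sophisticated, configurable matching rules.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Crude string similarity between two attribute values."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def same_person(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Weight name and address similarity and compare against a threshold."""
    score = (0.6 * similarity(rec_a["name"], rec_b["name"])
             + 0.4 * similarity(rec_a["address"], rec_b["address"]))
    return score >= threshold

a = {"name": "Jon Smith",  "address": "123 Main Street"}
b = {"name": "John Smith", "address": "123 Main St."}
print(same_person(a, b))  # likely True: name and address are near matches
```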

The Inventory for an MDM Foundation

Here are some techniques you can use to assess how to move forward on your own MDM development effort.

1. Data Cleansing and Correction
- Identify data cleansing tools that are already implemented within your company on other projects; pay special attention to applications that are likely to link to the MDM system
- Determine the data cleansing rules for the master data subject area and whether they have been documented
- Establish how you will interface your MDM system to the data cleansing tool (e.g., Web services or APIs)
- Contact the product vendor to determine which MDM product(s) it may already support

2. Metadata
- Review existing data warehouse metadata to discover whether there is metadata content that may apply to the master data subject area in question
- Determine whether there are metadata standards in place that apply to the given master data
- Find out if any of the key application systems use existing metadata standards
- If your company has a data management team, see if there is already a sanctioned process for identifying and developing new metadata content

3. Security and Access Services
- Identify the level of security requirements that need to apply to core master data
- Talk to the application architecture group to determine if there are application-level security standards in place
- Investigate whether your company has already defined a set of security and access Web services that may be leveraged for MDM

4. Data Migration
- Identify the bulk data migration standards that are in place—for example, how is Informatica® technology used in implementations other than data warehouse or BI implementations?
- Determine the current mechanism for application-to-application connectivity: Is it EAI? ESB? Point to point?
- Clarify which application systems can accept reconciled and corrected data
- Determine which legacy systems can only support bulk data extract versus transaction access

5. Identity Resolution
- Understand if your company already has identity resolution software in place to identify the existence of multiple customers, products, or other subject area items; this type of capability is most likely to exist in a customer-facing financial system
- Investigate whether the existing identity resolution system is configurable with new or custom rules
- Determine how the identity resolution technology interfaces to other systems (e.g., embedded code? API? Web services?)

6. MDM Functional Requirements
- Identify a handful of high-profile systems that depend on the timely sharing and synchronization of reference data; if you have a BI environment, we recommend looking at the predominant source systems as well as the most-used data marts
- Decide which reference data is shared and synchronized between multiple systems; the need for operational integration typically reflects higher business value
- Determine the individual system’s functional requirements for its reference data (e.g., create/read/update or data quality/correction)
- Categorize the individual data elements by subject area to identify the specific subject areas requiring mastering. It’s only practical to implement MDM one subject area at a time.

His conclusion: "When it comes to the question of build versus buy, for MDM the answer is as complex as your company’s IT infrastructure, functional requirements, and business needs. The considerations are multifaceted and require an intimate knowledge of what you can do today versus what you’ll need to accomplish tomorrow. When it comes to integrating master data, one thing is clear: The first step to ROI is avoiding unnecessary expenses. Investigate and use the capabilities that already exist within your IT infrastructure, enabling you to make an informed decision on launching MDM the right way."

Evan Levy is an expert in MDM, and his white paper gives a good explanation of how to build an MDM foundation. For those interested in the subject, he and Jill Dyché also wrote a good book: Customer Data Integration: Reaching a Single Version of the Truth.
