A lot of what is typically discussed in regards to Master Data Services (MDS) revolves around either the Silverlight web-based interface or the Excel plug-in. What’s not often discussed are the back-end integrations scenarios for loading, extracting (querying) and the otherwise day-to-day management or reporting requirements a Master Data solution might involve. In this post, we will look at common integration scenarios and techniques for handling each within MDS.
Loading Data (Batch)
Out-of-the-box MDS has facilities to stage and easily load data in batch. For each entity (and hierarchy) you create, a staging table is also created. For leaf members the staging table takes the form of:
Data is then loaded directly to the table using a SSIS package or any other tool capable of writing to a SQL Server table. When all the data is loaded the batch can be started programmatically using calling a stored procedure and passing in the current version and batch name:
EXEC [stg].[udp_Customer_Leaf] @VersionName = 'VERSION_1', @LogFlag = 1, @BatchTag = N'Demo'
When preparing and loading the data to the staging table, the two most basic and critical fields are the BatchName and the ImportType. The BatchName will typically be dynamically created and I like to use [Entity Name]_[Source]_[YYYYMMDD] (i.e. Customer_CRM_20130707) format for batch names. The ImportType is the most critical as it determines the load behavior MDS will use. You will definitely want to familiarize yourself with the following ImportTypes:
|Import Type Code||Description|
|0||Inserts new records, updates existing records assuming staged data is not null.|
|1||Inserts new records, updates to existing records will fail.|
|2||Inserts new records, updates existing records. Will overwrite data with null if staged value is null.|
|3||Deactivates (Soft Delete) member. Will fail if the entity is referenced by another member.|
|4||Hard Deletes member. Will fail if the entity is referenced by another member.|
|5||Deactivates (Soft Delete) member. For members that reference the deactivated member, their value will be nulled out.|
|6||Hard Deletes member. For members that reference the deleted member, their value will be nulled out.|
Beyond selecting the import type, next consider the package for process design for bulk loading your batches. The typical SSIS package consists of:
- A data flow to move data from the source to the MDS staging table. Any necessary data type conversion or transformations should occur prior to writing the data to the staging table.
- An execute SQL task to kick off the batch stage process.
If you want to make your package more dynamic, instead of using a hardcoded variable containing the model version you can add an additional Execute SQL Task to the package to dynamically select the current version for the use in staging and batch processing.
After the batch has completed, the ImportStatus column found in the staging column provides the row-by-row results with a status of 1, indicating success and 2 indicating failure (a status of 0 indicates that the row has not been processed). When errors do occur during the load, you can quickly detail the cause of the error using the stg.viw_<Entity Name>_MemberErrorDetails view. Using this view, you can quickly determine the relevant error code and descriptions.
Extracting (Subscription View)
Subscription Views are the easiest and most straight-forward method to access leaf members, hierarchies, collections and consolidated member data. These views are created by clicking on the Integration Management link in the Administrative section of the MDS homepage and then choosing the Create Views option from the menu.
Adding a new view requires the selection of the model, entity, version and level (leaf, hierarchy, collection, etc.). The view can either be hard-coded to the version by explicitly selecting the version or can be set-up with added flexibility in mind by using a version flag.
Out-of-the-box MDS gives you four version flags: Archive, Current, Plan and Prior. If you need (or want) additional flags, they are created and deleted on the Version Management page by selecting Manage and then Flags. Mapping a version to either a pre-existing or new version flag is handled on the Manage Version page by double-clicking the version column in the grid and then selecting the specified version.
If you are feeling brave and wondering what lurks beneath, you can script out one of the Subscription Views to see the various gyrations that are required to get the data out of MDS. I’ll warn you that the query the system has generated for you is not for the faint of heart. Once you’ve distilled it done the foundation of your data is built on a series of tables that at first glance seem to have almost meaningless formats. Closer inspection however reveals that all tables created for your entity use the following format:
mdm.tbl_<Model ID>_<Entity_ID>_<Table Type Code>
The Model ID and Entity ID are keys created/assigned when the MDS creates the entity and the following table type codes are used:
To further make sense of the data structures in the MDS database we can dig further into the metadata tables to reveal the details of the inner workings by way of the Model ID, Entity ID and later the Attribute ID (for custom entity attributes).
The three tables that provide this insight are as follows:
- mdm.tblModel – Query by model name to retrieve the model id
- mdm.tblEntity – Query by model id and entity name
- mdm.tboAttribute – Query by entity id for list of attributes or include attribute name to get the attribute id
Taken together, a query could be easily built to extract the metadata information required to further decompose the MDS data structure. In the example below we are simply finding the model, entity and attribute ids for the Payment Terms attribute on the Customer entity in the Customers model.
SELECT m.ID AS ModelID, m.Name AS ModelName, e.ID AS EntityID, e.Name AS EntityName, a.ID, a.Name AS AttributeName FROM mdm.tblModel (NOLOCK) m --Models JOIN mdm.tblEntity (NOLOCK) e ON (m.ID = e.Model_ID) --Entity JOIN mdm.tblAttribute (NOLOCK) a ON (e.ID = a.[Entity_ID]) --Attribute WHERE m.Name = 'Customers' AND e.Name = 'Customer' AND a.Name = 'PaymentTerms'
So the natural question is probably, “Why on earth would I want to do this?”. That’s a fair question and the answer is pretty-straightforward. Outside of bulk loading data into staging tables, the Business Rule Workflows and the Web Service API are the primary points for either automation or integration. While both are highly-useful, the inherent complexity in working with either scares most people. Luckily, since all the MDS data is completely contained within the db instance and you have a solid grasp on writing t-SQL queries it is easy to use the metadata to do things such as build or automate batch integrations and build master data management reports.
To better illustrate this, lets look at a example which will allow us to simply find customer’s whose payment terms have changed. Note that before we start you could take a broad approach by using a subscription view and filtering by LastChgDateTime but that will returned all records that have been recently change. Instead, building on the query from above we start by finding the ids for each object type within MDS. Next, we could then join the mdm.tblTransaction table using the attribute id to find all entity members whose Payment Terms have changed as seen below.
SELECT t.* FROM mdm.tblModel (NOLOCK) m --Models JOIN mdm.tblEntity (NOLOCK) e ON (m.ID = e.Model_ID) --Entity JOIN mdm.tblAttribute (NOLOCK) a ON (e.ID = a.[Entity_ID]) --Attribute JOIN mdm.tblTransaction (NOLOCK) t ON (a.ID = t.Attribute_ID) WHERE m.Name = 'Customers' AND e.Name = 'Customer' AND a.Name = 'PaymentTerms' AND t.OldCode IS NOT NULL ORDER BY t.EnterDTM DESC
You could continue this example by next looping in a subscription view created over the Customer entity and the mdm.tblUser table to turn the result into an audit report for customer’s whose Payment Terms have been updated (seen below).
SELECT c.*, t.OldCode, t.OldValue, u.UserName FROM mdm.tblModel (NOLOCK) m --Models JOIN mdm.tblEntity (NOLOCK) e ON (m.ID = e.Model_ID) --Entity JOIN mdm.tblAttribute (NOLOCK) a ON (e.ID = a.[Entity_ID]) --Attribute JOIN mdm.tblTransaction (NOLOCK) t ON (a.ID = t.Attribute_ID) JOIN mdm.Customers (NOLOCK) c ON (t.Member_ID = c.ID) JOIN mdm.tblUser (NOLOCK) u ON (t.LastChgUserID = u.ID) WHERE m.Name = 'Customers' AND e.Name = 'Customer' AND a.Name = 'PaymentTerms' AND t.OldCode IS NOT NULL ORDER BY t.EnterDTM DESC
Now that you have been introduced to the super-secret inner workings of MDS we sidetrack the discussion into an architectural gray area. Purist would tell you that any access to the MDS database that isn’t batch-based should go through either the provided web interface, the Excel plug-in or the web service API. Rarely, however have I seen this as a workable solution for a company using MDS as an enterprise master data management platform. Instead, it occasionally necessary to provide facilities that break the proverbial mold. A high-performance Advanced Search is one of these scenarios.
Realizing that MDS is built on top of SQL Server 2012, a natural search option is the robust full-text capabilities. Using the metadata a full-text index can be applied to the underlying table which facilitates a flexible, high performance search over your data. Beware, however! Any changes to your model or deployments to alternate environments via the MDSModelDeployment utility will cause issues as you are not guaranteed to get the same underlying table structures so plan accordingly.
As you plan your master data management architecture and evaluate Master Data Services as the platform for building your solution, recall not only the built-in options for handling batch-based integration but also consider you use the MDS metadata to introduce flexible and powerful outside-the-box solutions. In my next MDS post we will look at using the Web Service API to manage both the MDS model and data.
Till Next Time!