We are wrapping up this series on DQS with an overview of the administrative tasks that you will inevitably encounter in your own adventures with DQS. Specifically, we will dial in on the activity monitoring capabilities, configuration and the security model. If you haven’t been following along, you can catch up using the links below.
DQS Blog Series Index
Part 2: Building Out a Knowledge Base
Part 3: Knowledge Discovery in DQS
Part 4: Data Cleansing in DQS
Part 5: Building a Matching Policy in DQS
Part 6: Matching Projects in DQS
One of the two options available under the Administration tab within the Data Quality Client is Activity Monitoring. Activity Monitoring allows you to view all of the domain management, data quality project and SSIS cleansing activity that is currently in process or has occurred previously in the DQS environment.
The basis of activity monitoring in DQS is a grid with the activity data and a filter dialog that helps you digest the data and find what you’re looking for. The grid contains the following information:
1. ID – A DQS generated id assigned to the activity
2. Name – The name has a different meaning based on the activity type. For knowledge management it’s the knowledge base name, for data quality projects it’s the project name and for SSIS cleansing it’s the SSIS package name.
3. Is Active – Either active, ended or terminated
4. Type – Activity type can be Knowledge Management, DQ Project or SSIS Cleansing
5. Sub-Type – Used when the activity type is either Knowledge Management or DQ Project. For Knowledge Management the valid values are Domain Management or Knowledge Discovery, and for DQ Project the sub-type will be either Cleansing or Matching.
6. Current Status – Graphical representation of either running, succeeded, failed or stopped.
7. DQKB – The knowledge base used for the activity
8. User – The user who created or last ran the activity
9. Activity Start Time
10. Elapsed Time
11. Activity End Time
Above the activity grid is the filter dialog and several buttons. This dialog allows you to filter the result set based on one of a number of the grid columns (Is Active, Type, Sub-Type, Current Status, DQKB or User). It’s interesting to note that the Value prompt populates with the range of possible values once you select an option in Filter By, which means you don’t have to worry about typos or about knowing the exact name or spelling of a knowledge base, for example. You can also filter based on a date range, which should be relatively self-explanatory. Finally, to apply the filters you have selected, you must click the “Refresh the activities list” button.
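To make the filter behavior concrete, here is a minimal Python sketch that models the grid as a list of records and applies one column filter plus an optional date range. This is not a DQS API: the `Activity` class, its field names, the sample rows, and the choice to apply the date range to the start time are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Activity:
    """One row of the activity grid (field names are illustrative only)."""
    activity_id: int
    name: str
    is_active: str          # "Active", "Ended" or "Terminated"
    activity_type: str      # "Knowledge Management", "DQ Project" or "SSIS Cleansing"
    sub_type: str           # e.g. "Cleansing" or "Matching"; empty for SSIS Cleansing
    current_status: str     # "Running", "Succeeded", "Failed" or "Stopped"
    dqkb: str
    user: str
    start_time: datetime
    end_time: Optional[datetime] = None


def possible_values(activities, filter_by):
    """Mimic the Value prompt: the distinct values present for one column."""
    return sorted({getattr(a, filter_by) for a in activities})


def filter_activities(activities, filter_by=None, value=None,
                      from_date=None, to_date=None):
    """Return rows matching one column filter and an optional date range.

    Assumption: the date range is compared against the activity start time.
    """
    result = []
    for activity in activities:
        if filter_by is not None and getattr(activity, filter_by) != value:
            continue
        if from_date is not None and activity.start_time < from_date:
            continue
        if to_date is not None and activity.start_time > to_date:
            continue
        result.append(activity)
    return result


# Made-up sample rows, not real DQS output.
activities = [
    Activity(1, "Customers KB", "Ended", "Knowledge Management",
             "Knowledge Discovery", "Succeeded", "Customers KB", "alice",
             datetime(2012, 5, 1, 9, 0), datetime(2012, 5, 1, 9, 12)),
    Activity(2, "Customer Cleanse", "Active", "DQ Project",
             "Cleansing", "Running", "Customers KB", "bob",
             datetime(2012, 5, 3, 14, 30)),
    Activity(3, "LoadCustomers.dtsx", "Ended", "SSIS Cleansing",
             "", "Failed", "Products KB", "alice",
             datetime(2012, 5, 7, 2, 0), datetime(2012, 5, 7, 2, 5)),
]

alice_rows = filter_activities(activities, "user", "alice")
```

Calling `possible_values(activities, "dqkb")` plays the part of the Value prompt: it offers only values that actually exist in the data, which is why you cannot mistype a knowledge base name.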
There are several other important parts to point out within the Activity Monitoring portion of DQS. The first is the ability to terminate an activity. With the proper permissions, you can terminate both Knowledge Management and DQ Project (but not SSIS Cleansing) activities by selecting the row in the grid and then clicking the “Terminate the selected activity” button above the grid. You will be prompted to confirm that you do in fact want to terminate the activity before it is terminated.
The next function you have available is exporting an activity. When you select an activity and click export, a workbook with the activity, processes, source profile and field profilers is created for your use.
Activity Monitor – Activity Details and Profiler
The last part to point out is the activity details and profiler information. The activity details break down the larger activity into what Microsoft refers to as computational steps or processes. For example, a DQ Matching Project could be broken down into one or more matching steps and an export step. Each step’s type, status, start time, end time and elapsed time are displayed. Finally, the Profiler tab is exactly as we have seen previously in that it displays the source statistics associated with the activity.
Configuring DQS is a rather trivial task once you understand the bigger picture of the DQS fundamentals. The configuration area available under the Administration section of the Data Quality Client is broken down into three sections.
The first section is the configuration area for hooking DQS up to a reference data source. If you are using the Azure Data Market, you will copy and paste your Data Market Account ID here. Additionally, you can configure direct online third-party reference data services by specifying some basic info such as Name, Schema, URI and an Account ID. We will dive much deeper into this functionality in a separate blog post, so we will just gloss over it for now.
Configuration – Reference Data
The second section is for General Settings, which allows you to set the minimum score for suggestions and auto corrections for cleansing activities and the minimum record match score when performing matching activities. Additionally, you can enable or disable the notifications that appear within the profiler by setting the “Enable notifications” option.
Configuration – General Settings
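One way to picture what the General Settings thresholds do is as cut-off points applied to a confidence score. The Python sketch below is a simplification under stated assumptions, not DQS code: the class and function names are hypothetical, the numeric values are illustrative rather than confirmed defaults, and the exact rules DQS applies internally may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneralSettings:
    # Illustrative values only; check your own General Settings tab.
    min_score_for_suggestions: float = 0.7
    min_score_for_auto_corrections: float = 0.8
    min_record_score: float = 0.8


def classify_cleansing_result(score, settings):
    """Decide what happens to a proposed value with the given confidence score.

    Assumed rule: at or above the auto-correction threshold the value is
    corrected, at or above the suggestion threshold it is only suggested,
    and anything lower is not proposed at all.
    """
    if score >= settings.min_score_for_auto_corrections:
        return "corrected"
    if score >= settings.min_score_for_suggestions:
        return "suggested"
    return "not proposed"


def is_match(record_score, settings):
    """Assumed rule: two records match when their score reaches the minimum."""
    return record_score >= settings.min_record_score


settings = GeneralSettings()
for score in (0.95, 0.75, 0.40):
    print(score, classify_cleansing_result(score, settings))
```

Raising the suggestion threshold in this model shrinks the "suggested" band, which mirrors the trade-off you make in the dialog between fewer manual reviews and fewer proposed fixes.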
Last is the Log Settings tab. Within this tab you can control the log level (Fatal, Error, Warn, Info and Debug) for each distinct area of DQS. Additionally, under the advanced section you can set the log level at the DQS component level if you happen to find yourself troubleshooting an issue that requires it. It should be noted that there are three distinct log files produced by DQS and that the log files are in the standard pipe-delimited format. The log files can be found in the following locations:
1. Server Log File – C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Log
2. Client Log File – %APPDATA%\SSDQS\Log
3. Cleansing Component Log File – %APPDATA%\SSDQS\Log
If you find that you need to configure logging in a more advanced fashion check out this MSDN article.
Configuration – Log Settings
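Because the DQS log files are pipe-delimited, they are easy to pick apart with a few lines of script. The Python sketch below splits each line on the pipe character and keeps the rows that carry a given log level. The column layout of the real logs is not described in this post, so the sample lines are made up and the code deliberately avoids assuming which column holds the level.

```python
import csv
import io

LOG_LEVELS = ("FATAL", "ERROR", "WARN", "INFO", "DEBUG")


def read_pipe_delimited(stream):
    """Yield each non-empty log line as a list of stripped fields."""
    reader = csv.reader(stream, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:
            yield [field.strip() for field in row]


def rows_at_level(rows, level):
    """Keep rows in which any field equals the given level name."""
    wanted = level.upper()
    if wanted not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {level}")
    return [row for row in rows if any(f.upper() == wanted for f in row)]


# Made-up sample lines; the real DQS log columns may differ.
sample = io.StringIO(
    "2012-05-01 09:00:01|INFO|Client|Session started\n"
    "\n"
    "2012-05-01 09:00:07|ERROR|Client|Could not reach server\n"
)
rows = list(read_pipe_delimited(sample))
errors = rows_at_level(rows, "error")
print(errors)
```

To run this against a real file, replace the `io.StringIO` sample with `open(path, newline="")` pointing at one of the locations listed above. Multi-line entries such as stack traces would need extra handling that this sketch does not attempt.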
Security within DQS is role-based and is driven by three different roles. You must be a database administrator to add users to the DQS roles. The best way to illustrate how security works with these roles is a chart.
|Task|DQS Administrator|DQS KB Editor|DQS KB Operator|
|Create/Edit Knowledge Base|X|X||
|Edit/Execute DQ Project|X|X|X|
|View Activity Monitor|X|X||
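The chart can also be read as a simple lookup from task to permitted roles, sketched below in Python. Two things here are assumptions rather than facts taken from the chart: the third role is named DQS Administrator, and the rows with only two marks are taken to apply to the Administrator and the KB Editor, on the reasoning that the roles are cumulative. The function names are hypothetical; DQS itself enforces this through database roles, not through code like this.

```python
# Task -> roles allowed to perform it, following the chart.
# Assumption: rows with two marks apply to DQS Administrator and DQS KB Editor.
PERMISSIONS = {
    "Create/Edit Knowledge Base": {"DQS Administrator", "DQS KB Editor"},
    "Edit/Execute DQ Project": {"DQS Administrator", "DQS KB Editor",
                                "DQS KB Operator"},
    "View Activity Monitor": {"DQS Administrator", "DQS KB Editor"},
}


def is_allowed(role, task):
    """Return True when the role is marked for the task in the chart."""
    return role in PERMISSIONS.get(task, set())


def tasks_for(role):
    """List every task the role is marked for, in alphabetical order."""
    return sorted(task for task, roles in PERMISSIONS.items() if role in roles)


print(tasks_for("DQS KB Operator"))
print(is_allowed("DQS KB Editor", "Create/Edit Knowledge Base"))
```

Keeping the mapping keyed by task rather than by role makes it easy to answer the question an administrator usually asks first: who is able to do this?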
With that, we can wrap up this introductory series on DQS. In this post we covered the activity monitoring capabilities that are available within DQS. We also looked at configuration and how security is handled. I hope this series has been educational and that you have come away with a sense of the power and potential available in the Data Quality Services set of tools. Keep an eye out for future blogs that will dig deeper into more advanced DQS topics.
Till next time!!