For German readers: a German version of this blog post is available.
In the past, I have outlined elements to take into account when starting a DQ initiative. This post describes how to deal with Data Quality issues.
Some Background
The Deming Cycle is an established iterative four-step problem-solving process typically used in business process improvement. It is also known as PDCA for its four phases:
- Plan: Establish the objectives and processes necessary to deliver expected results.
- Do: Implement the new processes, often on a small scale if possible.
- Check: Measure the new processes and compare the results against the expected results.
- Act: Analyze the differences to determine their cause. Each will be part of either one or more of the PDCA steps.
(Adapted from the Wikipedia article on PDCA.)
Overview
The DQ Cycle is an adaptation of this general framework to Data Quality. I’ve used this framework with a number of customers, with good success. The general idea is pretty simple and easy to follow, but it is an excellent reminder to make sure you’ve crossed all the T’s and dotted all the I’s.
The DQ cycle has the same phases as the Deming cycle. It describes how to deal with a single DQ problem that has been identified. This means that there may be a number of DQ cycles going on at the same time, each dealing with its own problem, and each possibly in a different phase.
As you may have noted, the description started with the term “DQ problem that has been identified”. In order to accommodate this “problem identification”, there is an additional “Init” phase that kicks off a new DQ cycle.
So we end up with these phases:
- Init: A new DQ problem is identified
- Plan: You analyze the problem and decide on a course of action
- Do: The bad data is corrected
- Check: You verify that the DQ problem is resolved
- Act: You identify and implement measures to prevent the problem from re-occurring
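To make the idea of several cycles running side by side more concrete, here is a minimal Python sketch of the phases as a small state machine. The names `Phase`, `DQCycle`, `advance` and `check_passed` are made up for this illustration, as are the two example problems. The rule that a failed Check sends the cycle back to Plan is my assumption for the sketch; the phase list itself does not say what happens in that case.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    INIT = "Init"    # a new DQ problem is identified
    PLAN = "Plan"    # analyze the problem, decide on a course of action
    DO = "Do"        # correct the bad data
    CHECK = "Check"  # verify that the DQ problem is resolved
    ACT = "Act"      # put measures in place to prevent recurrence


# Iterating over an Enum yields its members in definition order.
PHASE_ORDER = list(Phase)


@dataclass
class DQCycle:
    """One cycle per identified DQ problem (hypothetical helper class)."""
    problem: str
    phase: Phase = Phase.INIT

    def advance(self, check_passed: bool = True) -> Phase:
        """Move this cycle to its next phase and return the new phase."""
        if self.phase is Phase.ACT:
            raise ValueError("cycle is already in its last phase")
        if self.phase is Phase.CHECK and not check_passed:
            # Assumption: a failed verification goes back to planning.
            self.phase = Phase.PLAN
        else:
            self.phase = PHASE_ORDER[PHASE_ORDER.index(self.phase) + 1]
        return self.phase


# Two independent cycles, each at its own point in the process.
cycles = [
    DQCycle("duplicate customer records"),
    DQCycle("missing postal codes"),
]
cycles[0].advance()  # Init -> Plan
cycles[0].advance()  # Plan -> Do
cycles[1].advance()  # Init -> Plan

for cycle in cycles:
    print(f"{cycle.problem}: {cycle.phase.value}")
```

Keeping one object per problem mirrors the point made earlier: each DQ problem gets its own cycle, and the cycles progress independently of each other.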
These phases will be described in more detail in the following posts.