
Rarara


Dealing with Missing Information in a Data Warehouse

Today, businesses are investing heavily in data warehouses and data marts to obtain timely, actionable information that gives them better business insight. This enables them to achieve, among other things, sustainable competitive advantage, increased revenues and a better bottom line.
In the early '90s, data warehousing applications were either strategic or tactical in nature. Trend analysis and pattern detection were the typical focus of many solutions. Now, companies are implementing data warehouses or operational data stores that meet both strategic and operational needs. The business need for these solutions usually comes from the desire to make near …

In this scenario, the incoming fact records are stored for reprocessing and flagged as incomplete. These facts are then cycled back as input the next time the fact table is built. One can try reprocessing the fact record until there is no more missing information.
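The reprocessing cycle described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the `FactRecord` fields, the `UNKNOWN_KEY` value, the single customer dimension, and the `build_fact_table` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass

UNKNOWN_KEY = -1  # hypothetical placeholder for an unresolved dimension key


@dataclass
class FactRecord:
    customer_code: str               # natural key from the source system
    amount: float
    customer_key: int = UNKNOWN_KEY  # surrogate key, filled in when resolved
    incomplete: bool = False         # flag for records awaiting reprocessing


def build_fact_table(incoming, suspended, customer_dim):
    """Resolve dimension keys for new and previously suspended facts.

    Returns (loaded, still_suspended). Facts whose dimension value is
    still missing are flagged and returned for the next build.
    """
    loaded, still_suspended = [], []
    # Previously suspended facts are cycled back in as input.
    for fact in list(suspended) + list(incoming):
        key = customer_dim.get(fact.customer_code)
        if key is None:
            fact.incomplete = True
            still_suspended.append(fact)
        else:
            fact.customer_key = key
            fact.incomplete = False
            loaded.append(fact)
    return loaded, still_suspended
```

Each build passes the previous run's `still_suspended` list back in as `suspended`, so a fact keeps cycling until its dimension record arrives.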
When correcting the missing information, consider the following questions:
* Does the fact record only come from the source system once?
* Is there a possibility that a row will become associated with a dummy dimension record because of edits?
* Can changes or fixes to related dimensions result in the reassignment of dummy foreign keys to correct values?
If the answer to all three of these questions is yes, then reprocessing may be required. The final test is to balance the cost of development versus the benefits of implementation. Is this type of error likely to occur, and is ultimately obtaining the correct information of value to the organization? If yes, then reprocess the fact record.
If you have determined that reprocessing is required, you may either update the fact record or back out the original fact record with an offset entry followed by the insertion of the correct fact record.
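The two correction options can be contrasted with a short Python sketch. It assumes fact rows are plain dictionaries with hypothetical `customer_key` and `amount` fields; the function names are illustrative only.

```python
def correct_by_update(fact_rows, index, corrected_key):
    """Option 1: overwrite the dummy key on the existing fact row."""
    rows = [dict(r) for r in fact_rows]  # copy so the input is untouched
    rows[index]["customer_key"] = corrected_key
    return rows


def correct_by_offset(fact_rows, index, corrected_key):
    """Option 2: back out the original row with an offsetting entry,
    then insert a new row that carries the correct key."""
    original = fact_rows[index]
    offset = dict(original, amount=-original["amount"])
    corrected = dict(original, customer_key=corrected_key)
    return [dict(r) for r in fact_rows] + [offset, corrected]


def totals_by_key(fact_rows):
    """Sum the amount measure per customer key."""
    totals = {}
    for row in fact_rows:
        key = row["customer_key"]
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return totals
```

Both options give the same totals for the corrected key. The offset approach leaves the original row in place as an audit trail, at the cost of two extra rows per correction.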
-------------------------------------------------
Intelligent Assignment of Dummies
Occasionally, no amount of reprocessing will help. The information will never be correct and it will always be missing some dimension values. From a user-analysis perspective, it is
