Rather than use business intelligence industry terms loaded with political meaning, such as Data Warehousing and Data Marts, I am using "Reporting Repository," which I hope is free from bias.
While there are proponents of reporting directly against the operational systems' data (see another blog for some information), it is generally not a good idea. Their main argument is that operational data is live and real-time. But there is a long list of disadvantages.
Instead of using operational data for reporting purposes, you should provide your end users with a centralized, shared copy created especially for BI: a Reporting Repository.
By this term, I mean that the operational and external data feeds have been restructured and stored in a way that makes end-user reporting as simple as possible. In addition to changing the data structure, the organization has added an abstract layer (master data management, hierarchies, and so forth) as well as a metadata layer.
A team of technical specialists bears the burden of understanding the complex data once, while creating the reporting repository and the necessary integration processes. Otherwise, the end users are yoked every day with the burden of writing reports against complex operational data that was never meant for reporting.
Here are some reasons you should not use operational data for your end-user reporting:
- Because of its critical nature, the operational system must be isolated and protected
- Data is structured for ease of data maintenance, not reporting
- Since it is not designed for end-user reporting, there is rarely documentation for doing that
- Typically, the operational system does not physically store all data (some values exist only as business rules inside application code)
- Typically, the operational system does not store historical data (only a snapshot of the current situation)
- Typically, the operational system does not have an abstract layer for end-user reporting
- Typically, the operational system does not have a metadata layer for end-user reporting
- Typically, operational data is not accessible by common BI tools
Reporting Repositories provide a corresponding solution to each of those operational data problems:
- Intentionally designed for access with considerations for security, performance, etc.
- Data is structured for ease of reporting
- Intentionally designed and documented for end-user use
- Either physically stores calculated columns or exposes virtual columns
- Captures historical information for auditing, comparisons, trending
- Provides an abstract layer that gives context for understanding information
- Provides a metadata layer that explains how to access and use information
- Designed to be accessed by BI tools
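To make a few of these points concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption made up for illustration: the table names (op_orders, rpt_order_history), the view name (rpt_orders), and the columns. It shows an operational table that holds only the current state, a repository table that keeps one row per snapshot date with a physically stored calculated column (order_total), and a view that exposes a virtual column (is_shipped).

```python
import sqlite3


def build_repository() -> sqlite3.Connection:
    """Create a toy operational table plus a reporting history table and view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Operational side (hypothetical): only the current state of each order.
        CREATE TABLE op_orders (
            order_id   INTEGER PRIMARY KEY,
            quantity   INTEGER NOT NULL,
            unit_price REAL    NOT NULL,
            status     TEXT    NOT NULL
        );

        -- Repository side (hypothetical): one row per order per snapshot date.
        CREATE TABLE rpt_order_history (
            snapshot_date TEXT    NOT NULL,
            order_id      INTEGER NOT NULL,
            status        TEXT    NOT NULL,
            order_total   REAL    NOT NULL,  -- calculated column, physically stored
            PRIMARY KEY (snapshot_date, order_id)
        );

        -- Virtual column: is_shipped is computed when the view is queried.
        CREATE VIEW rpt_orders AS
        SELECT snapshot_date, order_id, order_total,
               (status = 'SHIPPED') AS is_shipped
        FROM rpt_order_history;
        """
    )
    return conn


def load_snapshot(conn: sqlite3.Connection, snapshot_date: str) -> None:
    """Copy the current operational state into the history table."""
    conn.execute(
        """
        INSERT INTO rpt_order_history (snapshot_date, order_id, status, order_total)
        SELECT ?, order_id, status, quantity * unit_price
        FROM op_orders
        """,
        (snapshot_date,),
    )


if __name__ == "__main__":
    conn = build_repository()
    conn.execute("INSERT INTO op_orders VALUES (1, 2, 10.0, 'OPEN')")
    load_snapshot(conn, "2024-01-01")
    # The operational row is overwritten; the repository keeps both versions.
    conn.execute(
        "UPDATE op_orders SET quantity = 3, status = 'SHIPPED' WHERE order_id = 1"
    )
    load_snapshot(conn, "2024-01-02")
    for row in conn.execute("SELECT * FROM rpt_orders ORDER BY snapshot_date"):
        print(row)
```

After the update, the operational table still has a single row for the order, while the repository holds one row per snapshot date, which is what makes comparisons and trending possible. A real repository would also need the abstract and metadata layers, which this sketch leaves out.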
But what are you going to put in your Reporting Repository? You will figure out the design and content based on the needs of your business decision makers. If you want to build an enterprise repository to serve any and all reporting purposes, you are looking at a Data Warehouse (the Bill Inmon approach). If you go with a very specific repository for a unique purpose, you are looking at a Data Mart (the Ralph Kimball approach).
Here are the steps for designing your Reporting Repository:
- Your decision maker needs to take an important business action
- Which requires specific information
- The decision maker also needs a specific method of interacting with the information (information delivery or interactive, on-demand access)
- Which determines the proper storage of information in the repository
- Which determines the requirements for integration with operational systems
- Which determines the requirements for capturing and storing data within the operational systems
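Because each step depends on the one before it, the list above can be treated as an ordered checklist. The following is only a sketch of that idea, not a formal method: the step keys and both helper functions are names I made up for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical keys for the design steps, in dependency order.
DESIGN_STEPS: List[str] = [
    "business_action",
    "information_needed",
    "interaction_method",
    "repository_storage",
    "integration_requirements",
    "operational_capture",
]


def next_open_step(answers: Dict[str, str]) -> Optional[str]:
    """Return the first step that has no answer yet, or None if all are answered."""
    for step in DESIGN_STEPS:
        if not answers.get(step, "").strip():
            return step
    return None


def answered_out_of_order(answers: Dict[str, str]) -> List[str]:
    """Return steps that were answered while an earlier step is still open."""
    first_open = next_open_step(answers)
    if first_open is None:
        return []
    later_steps = DESIGN_STEPS[DESIGN_STEPS.index(first_open) + 1:]
    return [step for step in later_steps if answers.get(step, "").strip()]


if __name__ == "__main__":
    answers = {
        "business_action": "Decide which products to reorder",
        "repository_storage": "Daily inventory snapshot table",
    }
    print("Next step to answer:", next_open_step(answers))
    print("Decided too early:", answered_out_of_order(answers))
```

In this example the storage design was chosen before anyone established what information the decision maker needs, which is exactly the kind of shortcut the ordering is meant to catch.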
If you came here through Google looking for answers to your BI questions, I hope this helped. If you have any questions, please contact me.