Unused data in the data warehouse is no use at all. It takes up valuable space and often negatively affects the performance. For that reason, we do not usually load all the data we have available in our source systems into a data warehouse. On-demand data integration will let you retrieve the information you want directly from the source systems without storing it into a database.
Generally, we select only the data that is relevant for decision-making, customer relationship management (CRM) and performance management. However, after analysis and reporting, one may want to have some detailed data available to be able to take appropriate actions.
Direct access to operational systems
For example, a large pharmaceutical wholesaler wants, due to the economic climate, to cut inventory without affecting the delivery service levels. After thorough analysis of the revenues, logistics management decides to change the policy regarding the product assortment. Products sold less than five times in a year will be removed from the list of products sold, with the exception of those that are bought by loyal or profitable customers. The vendors of these products need to be informed. To do this properly the purchasers should have a list of vendors, with their product codes, their full address data, the contact people, their phone numbers and other detailed product data. That type of information is not available in the data warehouse itself, so either this data needs to be transferred from the operational systems or you need Business Intelligence tools (list) which has direct access to the operational systems.
Access to SAP has always been difficult
Products that can do this have been available for years, but getting permission to access production systems like SAP has always been difficult. The reason for refusing to allow access has generally been performance, many production systems are finely tuned for the number of known users they support and adding an unknown number of users who want ad-hoc information from gigabytes of data is generally seen as a bridge to far. It is interesting to not that when we look at companies who have been very successful with their Business Intelligence-solutions that they almost always have this facility available. In general the more users who have access to more data the more successful your Business Intelligence system will become!
A new concept: on-demand data integration
All this complex moving data backwards and forwards is no longer necessary with a concept called on-demand data integration. First, all of an organization’s data is modeled in a meta data repository, which could be the repository of the Business Intelligence or ETL tool. Per data element, we define the transformations, straight forward for simple detail data, or more complex steps for derived data like performance indicators. In addition, we decide which elements we need to store in the data warehouse (for most organizations the focus here is on decision making) and which elements we do not need (mostly operational detailed data).
Webservice provides one interface to corporate data
After building and loading the data warehouse, we build a webservice that presents the meta data layer to the user; this will probably be either a reporting solution, a dashboard or a portal. All the data the organization keeps about customers, products, vendors, processes, etc. is presented to the user through one interface. To the user it looks like one system, in the background different separate systems are involved.
How does it work?
The webservice interacts with the ETL tool, so that when a user asks to see specific information, the webservice knows if the data is stored in the data warehouse, or should be extracted, integrated and transformed real-time from the source system(s). In the background, an ETL process is called to retrieve and transform the data. For performance reasons it is necessary to ensure that the operational system(s) are not overburdened with queries retrieving a lot of data. It is also important that for on-demand data integration that all the different components of the architecture are properly integrated; that means the reporting tool or portal with the webservice, the webservice with the ETL tool and meta data repository, and the ETL tool with the source systems. If it works properly, on-demand data integration fully supports the principle of closed-loop (operational) Business Intelligence. Employees can easily switch from the analysis phase to the action phase, being able to base their actions on reliable information and useful knowledge.
The operational data store – ODS
Some organizations have built an operational data store (ODS), for the kind of detailed information described above allowing analysis and reporting on both detailed and near real-time data. The data in an ODS is generally fairly new, not more than a few days old, and has same structure as the source systems. Although such a database can be valuable, it requires a lot of work to built and maintain it. Additionally it makes the data warehouse architecture much more complex. However, an operational data store may be necessary when you require (near) real-time insight into business processes performance, or when highly detailed information is needed. These goals are often difficult to combine with a central data warehouse, modeled with star- or snowflake schemas, and optimized for (high-level) decision-making. When an ODS is built, most organizations try to maximize its’ use, moving the operational reports built-in the source systems to the ODS, by doing so, a fully independent (operational) reporting environment has been created. All the information needed for operational, tactical and strategic decision-making can be found in one system. Organizations should be aware of the concept and advantages of on-demand data integration, and decide for themselves if it can be useful. If so, it can be seen as an alternative to building a relatively expensive ODS.