Why redundant data isn't a disaster | Data warehousing

Redundant data: rocking the boat?

Written by

Passionned Group is a leading analyst and consultancy firm specialized in Business Analytics and Business Intelligence. Our passionate advisors assist many organizations in selecting the best Business Analytics Software and applications. Every two years we organize the election of the smartest company.

Business Intelligence only works well when we regularly retrieve data from the source systems and copy it to a separate computer and database. This means that the data from the source system are stored redundantly: in the source system and in the data warehouse.
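The copy step described above can be sketched as follows. This is only a minimal illustration using Python's built-in `sqlite3` module, with two in-memory databases standing in for the source system and the data warehouse. The table names `orders` and `orders_copy`, the columns, and the function `copy_orders` are hypothetical and chosen purely for the example.

```python
import sqlite3

ORDER_COLUMNS = (
    "order_id INTEGER PRIMARY KEY, account_manager TEXT, "
    "order_date TEXT, amount REAL"
)

def copy_orders(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Copy every row of the hypothetical `orders` table into the warehouse."""
    rows = source.execute(
        "SELECT order_id, account_manager, order_date, amount FROM orders"
    ).fetchall()
    warehouse.execute(f"CREATE TABLE IF NOT EXISTS orders_copy ({ORDER_COLUMNS})")
    # INSERT OR REPLACE keeps the copy free of duplicates when the load is repeated.
    warehouse.executemany(
        "INSERT OR REPLACE INTO orders_copy VALUES (?, ?, ?, ?)", rows
    )
    warehouse.commit()
    return len(rows)

# In-memory stand-ins for the source system and the separate warehouse database.
source = sqlite3.connect(":memory:")
source.execute(f"CREATE TABLE orders ({ORDER_COLUMNS})")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", "2024-01-15", 100.0),
        (2, "Alice", "2024-04-03", 70.0),
        (3, "Bob", "2024-01-09", 30.0),
    ],
)
warehouse = sqlite3.connect(":memory:")

copied = copy_orders(source, warehouse)
```

After the load, the same rows exist in both databases, which is exactly the redundancy this post is about. In practice the copy would run on a schedule and usually transfer only new or changed rows.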

Data should never be stored more than once?

A traditionally minded IT specialist will find this unacceptable: within the company network, data should never be stored more than once, so that a change only has to be made in one place. That this principle benefits the maintainability of data is beyond dispute, especially when we need to analyze large volumes of (unstructured) data: Big Data.

Do they have a valid argument?

At first glance, the IT specialists do have a valid argument. However, there are many other reasons that justify redundancy of data within the corporate network. The main one is that we need a copy if we want to analyze data 'freely', which can be a heavy burden on the computer, without the operational system grinding to a halt.

Many analyses require considerable computing power. For example, to calculate the revenue of a pharmaceutical wholesaler per account manager, per quarter, no fewer than 25 million rows need to be ploughed through.

Operational processes are at risk

And that is not all: the data still need to be added up and grouped per account manager and per quarter before they can be presented in a report. When we perform such analyses on the source system, for example the ERP system itself, the organization's operational process is very much at risk: order processing slows down considerably or stops altogether.
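To make the grouping step concrete, here is a small sketch of the revenue calculation run against a copy of the data rather than the source system. It again uses Python's `sqlite3` module. The table `orders_copy`, its columns, and the sample rows are assumptions made up for this example, not the wholesaler's actual data model.

```python
import sqlite3

# A stand-in for the data warehouse, holding a copy of the order lines.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE orders_copy (account_manager TEXT, order_date TEXT, amount REAL)"
)
warehouse.executemany(
    "INSERT INTO orders_copy VALUES (?, ?, ?)",
    [
        ("Alice", "2024-01-15", 100.0),
        ("Alice", "2024-02-20", 50.0),
        ("Alice", "2024-04-03", 70.0),
        ("Bob", "2024-01-09", 30.0),
    ],
)

# Add up the amounts per account manager, per quarter.
# (month + 2) / 3 is integer division in SQLite: months 1-3 give 1, 4-6 give 2, etc.
REVENUE_QUERY = """
SELECT account_manager,
       strftime('%Y', order_date) AS order_year,
       (CAST(strftime('%m', order_date) AS INTEGER) + 2) / 3 AS order_quarter,
       SUM(amount) AS revenue
FROM orders_copy
GROUP BY account_manager, order_year, order_quarter
ORDER BY account_manager, order_year, order_quarter
"""

revenue = warehouse.execute(REVENUE_QUERY).fetchall()
for manager, year, quarter, total in revenue:
    print(f"{manager} {year} Q{quarter}: {total:.2f}")
```

With four rows this is instant. With 25 million rows the same query has to read and sum every order line, which is the kind of load we would rather not place on the system that processes orders.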

In one of the next posts, we will examine the other arguments showing that, in many cases, we do need a data warehouse.

Comment on this post by Daan van Beek


