The pros and cons of a Data Vault | Data Vault Design

Written by Daan van Beek

Passionned Group is an expert in the field of data vaults and data warehousing.

A modeling technique for the central data warehouse

A Data Vault is a modeling technique for the central data warehouse (CDW), designed by Dan Linstedt. It stores all incoming transactions, regardless of whether the details are in fact trustworthy and correct: “100% of the data 100% of the time”.

It’s all about transactions

For example: a sales transaction has already taken place, but the corresponding customer does not yet exist in the CRM system. The sales transaction can nonetheless be stored in the Data Vault. When the customer becomes known to the system, the transaction changes from a ‘meaningless’ fact into a useful ‘truth’ because now its context is known.
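The idea of storing a fact first and gaining its context later can be illustrated with a small Python sketch. It is only an illustration under assumptions: the names `SalesTransaction`, `load_transaction`, `known_customers` and the sample keys are hypothetical and do not come from the Data Vault standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesTransaction:
    transaction_id: str
    customer_key: str  # business key exactly as delivered by the source
    amount: float


# Hypothetical in-memory stand-ins for the vault and the CRM system.
vault_transactions = []
known_customers = {}  # customer_key -> customer name


def load_transaction(tx):
    """Store the transaction unconditionally, even if the customer is unknown."""
    vault_transactions.append(tx)


def customer_name(tx):
    """Return the customer's name, or None while the context is still missing."""
    return known_customers.get(tx.customer_key)


load_transaction(SalesTransaction("T-1", "C-42", 99.50))
before = customer_name(vault_transactions[0])  # context not yet known

known_customers["C-42"] = "Acme Ltd"  # the customer arrives later
after = customer_name(vault_transactions[0])  # same stored fact, now with context
```

The transaction itself is never rejected or changed; only the lookup result differs once the customer is known.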

Data Vault keeps track of history

The Data Vault keeps a history for each table field. An ingenious construction of hubs (business keys), links (relationships between those keys) and satellites (descriptive attributes and their history) ensures enormous flexibility in storing data. The CDW can be loaded much faster since these different parts can be processed simultaneously, in parallel. When we use a Data Vault, the CDW does not have a dimensional structure. That stage comes later, namely when we build the data marts or cubes from the Data Vault. Overall, the Data Vault concept provides a different outlook on both modeling and the architecture of Business Intelligence.
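To make the roles of hubs, links and satellites more concrete, here is a minimal in-memory sketch in Python. It assumes a simplified model (no hash keys, record sources or end dates), and every name in it (`hubs`, `links`, `satellite`, `record_customer`, `current_attributes`) is hypothetical rather than part of any Data Vault tool.

```python
from datetime import date

hubs = {"customer": set(), "product": set()}  # business keys only
links = set()    # relationships, here (customer_key, product_key) pairs
satellite = []   # history rows: (customer_key, load_date, attributes)


def current_attributes(key, as_of=None):
    """Return the attributes valid for a customer, optionally as of a given date."""
    rows = [r for r in satellite if r[0] == key and (as_of is None or r[1] <= as_of)]
    if not rows:
        return None
    return max(rows, key=lambda r: r[1])[2]


def record_customer(key, load_date, attributes):
    """Insert-only load: a new satellite row is added only when something changed."""
    hubs["customer"].add(key)
    if current_attributes(key) != attributes:
        satellite.append((key, load_date, dict(attributes)))


record_customer("C-42", date(2020, 1, 1), {"city": "Utrecht"})
record_customer("C-42", date(2021, 6, 1), {"city": "Amsterdam"})
record_customer("C-42", date(2022, 1, 1), {"city": "Amsterdam"})  # unchanged, no new row

hubs["product"].add("P-7")
links.add(("C-42", "P-7"))  # the customer bought the product

latest = current_attributes("C-42")
earlier = current_attributes("C-42", as_of=date(2020, 12, 31))
```

Because old satellite rows are never overwritten, the state of a customer at any earlier date can still be reconstructed.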

What are the real benefits of a Data Vault?

The question is: what are the real benefits? Moreover, does the Data Vault have any disadvantages? Most noticeable is that the Data Vault distinguishes between facts and the truth. This can be useful in order not to lose transactions, and it is often necessary from the perspective of compliance. However, does it actually make sense to include a transaction in a report (or analysis) if it cannot be trusted?

It requires more time

Creating a Data Vault seems to be complex and probably requires more time, particularly because it remains to be seen whether available ETL software solutions will in fact support the standard Data Vault (see the Data Vault discussion). The same applies to translating hubs and satellites into data marts and cubes. It is simply more difficult.
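The extra translation step from hubs and satellites to a data mart can be sketched as follows. This is a simplified illustration in Python, assuming one hub and one satellite held in plain lists; the function `build_customer_dimension` and the sample data are hypothetical.

```python
from datetime import date

# Hypothetical vault content: one hub and one satellite with history.
hub_customer = ["C-1", "C-2"]
sat_customer = [
    ("C-1", date(2020, 1, 1), {"name": "Acme", "city": "Utrecht"}),
    ("C-1", date(2021, 1, 1), {"name": "Acme", "city": "Amsterdam"}),
    ("C-2", date(2020, 5, 1), {"name": "Globex", "city": "Rotterdam"}),
]


def build_customer_dimension(hub, sat):
    """Flatten hub plus satellite into one current row per customer."""
    dimension = []
    for key in hub:
        versions = [row for row in sat if row[0] == key]
        if not versions:
            continue  # no descriptive data yet, so nothing to show in the mart
        _, _, attrs = max(versions, key=lambda row: row[1])  # latest version wins
        dimension.append({"customer_key": key, **attrs})
    return dimension


dim = build_customer_dimension(hub_customer, sat_customer)
```

Even in this toy form, the mart builder has to decide which version counts and what to do with keys that lack attributes, which hints at why the step takes additional effort.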

One version of the truth

Another question: how do we ensure that we do not develop more than one version of the truth, whilst creating the data marts and cubes? After all, it is at this stage that we establish the business definitions in the Data Vault Architecture and it is possible that we may need as many as ten different aggregations for one specific indicator.
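One way to reduce this risk, sketched here in Python purely as an assumption and not as a prescribed Data Vault practice, is to define the indicator once and reuse that definition in every aggregation. The `revenue` rule, the `aggregate` helper and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales rows as they might be read from the vault.
sales = [
    {"region": "North", "month": "2024-01", "amount": 100.0, "cancelled": False},
    {"region": "North", "month": "2024-02", "amount": 50.0, "cancelled": True},
    {"region": "South", "month": "2024-01", "amount": 70.0, "cancelled": False},
]


def revenue(row):
    """Single shared business definition: cancelled sales count as zero."""
    return 0.0 if row["cancelled"] else row["amount"]


def aggregate(rows, by):
    """Sum the shared revenue definition per value of the given column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[by]] += revenue(row)
    return dict(totals)


by_region = aggregate(sales, "region")
by_month = aggregate(sales, "month")
```

Because both aggregations call the same `revenue` function, their grand totals agree; if each data mart implemented its own rule for cancelled sales, they could silently diverge.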

Barely manageable data silos

Generating all of these from within the Data Vault may lead to a situation that could easily degenerate into an indistinct, barely manageable jumble of loose data silos, just like old times in the pre-data warehouse era. In short: it is true that a Data Vault offers a flexible repository for all corporate data, but its usefulness and advantages appear to be limited. Besides this, the fact that no enforced data integration takes place is quite a drawback.

  1. Joel Wittenmyer says:

    Daan,
    4 years later, I’m wondering if your outlook on Data Vault has changed.

Comment on this post by Daan van Beek
