SQLTeam.com | Weblogs | Forums

Data Vault or alternative

At my company, a team is setting up a new data vault 2.0 based data system.

When it was introduced, we were presented slides that showed it would retrieve data from different internal and external sources, consolidate that data, with Master Data Management that would re-inject the cleansed data into the source data system to increase data quality. Overall the system would become the basis for all our internal and external reporting needs, even with provisions for data scientists to experiment in. It would replace our Kimball data warehouse, as a Kimball based DWH is hard to maintain and adjust to new requirements. The new data system would be agile and adjust to any change in any data source or business requirement way faster than a Kimball DWH.
That's what the PowerPoint slides claimed.

At this moment the project is a year late. A team of seven senior programmers, all certified data vault 2.0, are working on it full time. And the project has only delivered a few minor things so far. Business is frustrated because all work on the Kimball data warehouses is set on hold. They were used to get the changes they asked for delivered in a few days.

As the person who build the Kimball data warehouses has left the company today, here's my question:
I witnessed one guy who created and maintained a few Kimball data warehouses, while also working on other projects, who was successful, fast and cheap.
On the other hand I see a team that has not been very successful so far, with slow progress and that's expensive. I hear they have recurring performance and missing/duplicate data issues.

What are your thoughts on setting up a general data system that can do the things the slides promised?
All architectures promise about the same things. What works in reality?

slides are one thing reality is another. Instead of a behemoth deliverable your company should have settle for pieces of the product every few weeks. By 4th week you would have known if they were legit. You might want to rethink your relationship with them. Cut your losses. Also You are telling us their side of the failure. It takes 2 to tango. Why did the dw engineer leave today?

We have a similar situation.
I love the simplicity of the core DV model with its hubs, satellites and links. Yet at the same time it's hard to get everything done right. We face similar issues with performance and missing or duplicate data and frequent disk space expansions (but that's normal as the system stores the complete history).
You need a skilled team member to tweak your RDBMS to get decent performance.
You need good tools for automated CDC and hub, satellite and link generation.

The initial investment in time and money is definitely far higher than promised, but I'm inclined to give it a chance. Maybe we'll see the benefits once the core of the system is functioning.

So far, the Kimball DWH wins thumbs down though. In overall maintainability, speed of delivery and cost.

The development is done in sprints, agile, with prioritisation, ...

I think it took at least 4 weeks for the initial setup of the servers, communication with the business about the new roles and all the other initial things. It's not that nothing is done. I have the imprerssion a lot of time is spend solving technical problems, and not solving business problems. The business just hasn't seen the great benefits that were promised so far.

He got a job somewhere else. I can't imagine he liked the current situation.