Enterprise Data Warehouse - Mixed Dimension

Hi there, normally when discussing a dimensional model (Enterprise Data Warehouse / Star Schemas), you hear about dealing with a mixed grain. I have a situation with what I'd call mixed dimensions. I've been able to handle it for a variety of dimensions, but our employee dimension is going to be problematic.

We have a situation where, we have employee information that comes in for different applications that are not integrated. We'd like to create a conformed dim_Employee dimension that has the same grain.

For a subset of our employees, we have way more robust information that accurately tracks changes in employee status, position changes etc. The rest of our employees are not (and will never be) in that data source. The employees in this more robust system are also in our HR/Payroll system with everyone else.

Would anyone have tips on how to handle these disparate sets? There is some common information (employee number from the HR system). But the data we have for the rest of the agency will not have nearly as much information or changes. Do you suggest having them all in one comformed dimension, but to leave the additional columns not present for the rest of the agency as unknowns, and allow for the more rapid change tracking for the people in the more robust system? Thoughts/ideas would be greatly appreciated!

Thanks

Have you heard of the Data Vault architecture? It sounds like it may help with your scenario.

This video discusses it:

FYI I've seen Leslie and Jeff present on Data Vault but I haven't watched this particular video yet.

1 Like

Thanks Rob! Much appreciated. I'll watch today. I'm sure this should be a single conformed dimension. I'm just getting hung up on the discrepancies between the tracking for employees depending on the source systems for each business area.

Thanks again!

IIRC you would conform that dimension as best you can with all of the columns that are common in every source. Any additional columns go into an attribute table. Definitely watch that video and check on Jeff and Leslie's other resources. I'm pretty sure they're both on Twitter, you can try reaching out to them on specific questions.

Very cool, thanks! I'm a bit nervous about the idea of going the data vault direction...it's just a complete different path than what we're on. We're on a good track with most of the design, but the employees dimension is problematic. We have a very robust system for our Bus Operators (transit agency) with the gnats-a## details including things like inactive status etc. Then there's the "ERP" that we use for all employees which is a complete wreck. The bus operators make up about 70% of the workforce, so I'd hate to lose the level of detail we have there. Mechanics, admin, facility workers etc all are relegated to the ERP. Bus Operators are also in the ERP which adds more complexity. I'll reach out to them...I'm about 1/2 way through the video and definitely getting some good ideas. I appreciate the feedback, Rob.

Thanks!
Craig