Blog 10 November, 2022

Denormalised data, disaggregated

If you have been following our content you would have read about:

The time has come to discuss the structure of your data warehouse. What is the best way to build it? Kimball? Immon? Facts? Dimensions? What’s it all about? Grab a coffee and let’s dive in.

As we discussed in the ELT versus ETL blog, the near removal of storage and compute barriers has changed the game for data modeling and design. No longer do we need to worry about minimizing data redundancy and repetition. No more careful Kimball modeling of fact and dimension tables with spidered joins between them.

Instead we can flatten, denormalise and disaggregate.

With any of the column store cloud data warehouse technology out there, the thing to focus on is minimizing the complexity of the data at rest.

In nearly all cases, the greatest cost to a business in delivering business intelligence, insight and analysis is in the human resource required to actually perform the analysis. The cost of having to wait a week for an answer to a question almost certainly outweighs any infrastructure cost in your data pipeline.

By denormalizing the data into the minimum number of large, flat, disaggregated tables (think: a customer table – single row per customer, an orders table – single row per order; each with potentially hundreds of columns with all of the relevant attributes), you minimize the cognitive load on the analysts.

Analysis is predominantly conducted intra-table rather than using joins to create inter-table queries.

If constructed carefully and designed well, you may even achieve the nirvana of Insight at the Speed of Thought.

Imagine the dream scenario:

You can sit down in a meeting room with your senior stakeholders, a decent BI tool connected to your curated, structured data warehouse, and you can confidently say to them: “ask me the burning questions – what do you want to know?”

Intelligent
How are our orders trending versus last week and last year?

A few clicks and there is a graph on screen for them to see;

Ok, split that out by region:

Click – there it is;

Throw in our customer segments and split out by value cohorts;

Click – the analysis is on screen.

Go as deep as you like – with the data structured properly, available at the fingertips of your analysts, you can become the data heroes your business needs.

Use the transform layer and the power of a cloud data warehouse to create the data structures. Disaggregate the data completely (remember – to aggregate data is to destroy data – don’t do it where you can’t undo it). Use a BI tool (Tableau, Looker, PowerBI – they’re all good) to aggregate, compare, slice and dice for analysis.

Want to know more? We’d love to chat. Please get in touch.

Matt Sawyer

ELT ETL

Denormalised data, disaggregated

Sign up to our newsletter

Follow Us:

Denormalised data, disaggregated

You might also like...

Data Transformation Explained

Extract and load – the top of your data funnel

Sign up to our newsletter

Follow Us: