Many years ago, my horrid manager refused me a wonderful opportunity to go on an all expenses paid training course all about data warehousing in some exotic location.
I was moaning about this to a colleague over lunch. She was an ex-teacher and happened to work in the prestigious data warehousing consultancy group. ‘There, there Norman. Don’t cry. Tell me exactly what you wanted to learn from this course ?’
‘Well Sue, I just feel so stupid. I don’t even know what a fact table is, or a slowly moving dimension let alone a star schema - all that fancy data warehousing terminology’.
‘Shut up and listen. You buy a sandwich in Tesco. The sandwich costs 2.55 GBP. You have a table called TRANSACTION with a column called PRICE. There are other tables called PRODUCT, REGION, STORE, DATE and CAMPAIGN. There are a load of foreign keys from the fact table to the dimensions and the data model is highly normalised.’
‘The TRANSACTION table is a fact table because it records a fact - an event that actually happened. Fact tables tend to be large. Just think of all those massive queues for all the the checkouts at all the Tesco stores.’
The other tables are called dimensions - these tables tend to be smaller and describe elements of the business and allow managers to report on sales by product/region/store/campaign/month/year/quarter.
‘Oh I see but what about a star schema ?’
‘Draw a picture with the fact table in the middle and the dimension tables around the edges. Connect the tables together. What do you see ?’
‘Oh I see. A pretty star. OK then. What about a snowflake ?’
‘Draw 7 stars and join them up. What do you see ?’
‘Oh I see. A lovely snowflake. Thanks a lot, Sue. That really has been very useful.’
‘No problem. Data warehousing isn’t actually that hard.’
‘Now what is the Pareto Principle ?’
Unfortunately, my helpful teacher suddenly remembered she had an urgent meeting to go to and the ‘30 second Guide to CRM’ was postponed.