The original data warehouse definition, going back to Bill Inmon, states that a data warehouse is (among other things) a “time-variant, non-volatile collection of data”. But what do these requirements mean in practice?
Time-variant means that the warehouse keeps track of how data changes over time. But along which timeline? Many data warehousing projects never make an explicit decision about the timeline(s) used for historization, which can lead to confusion and problems later on.
In addition, the inherent conflict between time-variance and non-volatility is often overlooked. What happens when data arrives later than expected? What if attribute values, or the timeline itself, have to be corrected after the fact?
In this presentation, we will look at the different kinds of time you might encounter in a data warehousing context and discuss which (and how many) timelines you should use.
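As a rough illustration of why the choice of timeline matters, here is a minimal Python sketch that keeps two timelines per row: one for when a value became true in the business, and one for when the warehouse recorded it. The two-timeline design, the names `Version`, `valid_from`, `recorded_on` and `as_of`, and the sample customer data are all assumptions made for this example, not something prescribed by the talk. Rows are only ever appended, so the collection stays non-volatile, while late arrivals and corrections are still visible to queries that ask for them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Version:
    key: str
    value: str
    valid_from: date   # business timeline: when the value became true
    recorded_on: date  # warehouse timeline: when the row was loaded


def as_of(rows: list[Version], key: str, valid_on: date, known_on: date) -> Optional[str]:
    """Return the value valid on `valid_on`, using only rows recorded by `known_on`."""
    candidates = [
        r for r in rows
        if r.key == key and r.valid_from <= valid_on and r.recorded_on <= known_on
    ]
    if not candidates:
        return None
    # The latest valid_from wins; for equal valid_from, the latest recording
    # (that is, a correction) wins.
    return max(candidates, key=lambda r: (r.valid_from, r.recorded_on)).value


# Hypothetical sample data. Nothing is updated or deleted, only appended.
rows = [
    Version("customer-1", "Munich", date(2023, 1, 1), date(2023, 1, 2)),
    # Late arrival: a move on 1 March is only loaded on 10 April.
    Version("customer-1", "Berlin", date(2023, 3, 1), date(2023, 4, 10)),
    # Correction: the 1 January value turns out to be wrong and is fixed on 15 April.
    Version("customer-1", "Nuremberg", date(2023, 1, 1), date(2023, 4, 15)),
]

# What did the warehouse report for mid-March at the end of March?
print(as_of(rows, "customer-1", date(2023, 3, 15), date(2023, 3, 31)))
# And what does it report for the same day once the late row has arrived?
print(as_of(rows, "customer-1", date(2023, 3, 15), date(2023, 4, 30)))
```

With a single timeline, the two queries at the end could not be told apart: either the late row would silently overwrite history, or it would be dated by its load time and land in the wrong period.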
Christian Kaul is a data modeler, writer and event organizer based in Munich, Germany, who focuses on designing, implementing and improving data warehouses.
He has several years’ business intelligence experience in various industries, including healthcare, insurance, media, tourism and telecommunications. His project roles have included data modeler, data warehouse developer, project manager and support team lead.
Keenly interested in new developments in data modeling, he organizes a data modeling meetup group and chairs the Knowledge Gap data modeling and data architecture conference.