Handling Wrong or Missing Dates in Tabular

In the traditional star schema design of a Data Mart, you replace a missing, unknown or wrong date in the fact table with a dummy value in the Date dimension table. In Tabular, a table marked as a Date table must contain a valid date in every row, so you cannot have a NULL date in it. This article describes how to apply the NULL replacement in the fact table using views, without altering the relational structure of the Data Mart.

Classical Star Schema Approach for Date Table

The traditional way to handle the Date dimension in a star schema is to create a table with one row for every day in the years covered by the fact tables. The key is usually an integer in the format YYYYMMDD, with a special value identifying unknown or missing days. Such a value can be 0, as in the following table.

ID_Date   Date        Year  Month Number  Month      Day of Month
0         NULL        0     0             <Unknown>  0
20130101  2013-01-01  2013  1             January    1
20130102  2013-01-02  2013  1             January    2
20130103  2013-01-03  2013  1             January    3

This table produces a fictitious year and month number (0) and a special month name (<Unknown>) that isolate the values associated with an unknown or wrong date. This works well in Analysis Services Multidimensional, but it does not work in Tabular and Power Pivot.
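For reference, the sample Date table above could be created in the relational Data Mart along the following lines. This is only a sketch: it assumes SQL Server syntax, the column types are assumptions, and the StagingSales table and OrderDate column used in the last query are hypothetical names introduced to show how a missing date can be mapped to the dummy key 0.

```sql
-- Sketch only: assumes SQL Server syntax; column types are assumptions.
CREATE TABLE DimDate (
    ID_Date        INT         NOT NULL PRIMARY KEY, -- YYYYMMDD, 0 = unknown date
    [Date]         DATE        NULL,                 -- NULL only in the dummy row
    [Year]         SMALLINT    NOT NULL,
    [Month Number] TINYINT     NOT NULL,
    [Month]        VARCHAR(20) NOT NULL,
    [Day of Month] TINYINT     NOT NULL
);

INSERT INTO DimDate (ID_Date, [Date], [Year], [Month Number], [Month], [Day of Month])
VALUES
    (0,        NULL,         0,    0, '<Unknown>', 0),
    (20130101, '2013-01-01', 2013, 1, 'January',   1),
    (20130102, '2013-01-02', 2013, 1, 'January',   2),
    (20130103, '2013-01-03', 2013, 1, 'January',   3);

-- Hypothetical fact load: StagingSales and OrderDate are illustrative names.
-- Style 112 formats a date as yyyymmdd; a NULL date falls back to key 0.
SELECT COALESCE(CONVERT(INT, CONVERT(CHAR(8), s.OrderDate, 112)), 0) AS ID_Date
FROM StagingSales AS s;
```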

Tabular and Power Pivot Do Not Like NULL Values in Date Table

The Date dimension of a star schema is imported as a table in Tabular and Power Pivot, and then it has to be marked as a Date Table. This operation activates all the Time Intelligence functions in DAX and defines which column contains the list of dates used for all time-related calculations. Unfortunately, a NULL value is not allowed in this column, so you have to change something in order to make it work.

You might be tempted to replace the NULL date with a real date, such as Jan 1, 1900. However, this is not a good idea, because it would be displayed to the end user as an available date, with data associated with it. A better solution relies on the fact that Tabular and Power Pivot accept values in the fact table that do not correspond to existing rows in the dimension. Thus, the simplest solution is to import data from the Date table excluding the row assigned to the dummy value (the one containing the NULL date), using a query (or, better, a view) such as:

SELECT * FROM DimDate WHERE Date IS NOT NULL

All the fact table rows that reference a non-existent row in the Date table are then grouped under a single "blank" value.
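The filter can be packaged as a view, so that the Data Mart table itself is left untouched. The following is a minimal sketch assuming SQL Server syntax: the view name vDimDate and the fact table FactSales are hypothetical, and the column names are taken from the sample table shown earlier. The second query is an optional check that counts the fact rows that will end up under the blank value after the import.

```sql
-- Hypothetical view name; exposes DimDate without the dummy row.
CREATE VIEW dbo.vDimDate AS
SELECT ID_Date, [Date], [Year], [Month Number], [Month], [Day of Month]
FROM DimDate
WHERE [Date] IS NOT NULL;
GO  -- batch separator used by SSMS and sqlcmd; CREATE VIEW needs its own batch

-- FactSales is a hypothetical fact table with an ID_Date column.
-- Rows counted here have no matching date and will be shown as (blank).
SELECT COUNT(*) AS UnmatchedRows
FROM FactSales AS f
LEFT JOIN dbo.vDimDate AS d
    ON d.ID_Date = f.ID_Date
WHERE d.ID_Date IS NULL;
```

Importing from the view rather than from an ad hoc query keeps the filter logic in the database, where it can be reused by other models.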

[Figure: Missing Year]

The main drawback of this approach is that you cannot control either the name displayed, (blank), or its position. One reason for using <Unknown> as the special name was that, in alphabetical order, it appears at the beginning of most lists.

Having a way to customize the (blank) values added automatically to handle unrelated data would be a nice feature in a future version of Power Pivot and Analysis Services Tabular.

Alternative Approach for Date Table

As an alternative approach, you can keep the dummy row in the Date table and avoid using Time Intelligence functions in DAX, writing the calculations with CALCULATE and other standard DAX functions instead. An example of this approach is described in the article Week Based Time Intelligence in DAX, where you can see how to write the same calculations based on columns in the Date table. Remember to add special conditions to your formulas in order to handle the dummy row present in the Date table.

