A key capability of Kusto Query Language and Azure Data Explorer is the ability to make time series. This is the start of a 3-series set of posts starting with the make-series operator. We'll continue working with the sales dataset and define the overall scope of the query we're trying to build here.
let series = sales
| where getyear(Order_Date) == 2010 and getmonth(Order_Date) == 1
| summarize UnitsSold = sum(Units_Sold) by Region, Timestamp = Order_Date
| order by Timestamp asc;
let min_t = toscalar(series | summarize min(Timestamp));
let max_t = toscalar(series | summarize max(Timestamp));
series
| make-series NumSold=sum(UnitsSold) default=0 on Timestamp in range(min_t, max_t, 1d) by Region
| render timechart
Firstly we create a variable called series which looks at all of the Units sold in January 2010 only segmented by Region.
We also set 2 more variables min_t which contains the first Timestamp value (1st January 2010) and max_t which contains the last Timestamp value (31st January 2010).
Two more functions that we've introduced are the toscalar function which converts the variables into a single value and the range function. Similar to the Python keyword this produces a range which we'll use between the max and min Order_Date. It's also interpolated so will fill in the gaps if there are zero units sold on that day.
To do this we'll look at the time-series function. This function will create a variable called NumSold which will aggregate the number sold by Region by Day and if there are no sales then it will default to zero. In our case we want to know if there are no sales for that particular day but in other cases we might want to fill in the blanks in different ways.
In KQL we can use the following functions to do this:
series_fill_forward() - uses the previous value to fill the current one
series_fill_backwards() - uses the next value to fill the previous missing value
series_fill_constant() - uses a constant value for missing values - as per example is default in shorthand
series_fill_linear() - uses a linear interpolation function to fill in the missing values
The end result looks like this.
It's worth trying the following query which we'll use in the next couple of Lightning Fact pieces which is shown below.
let series = sales
| where getyear(Order_Date) == 2010 and (getmonth(Order_Date) == 1
or getmonth(Order_Date) == 2 or getmonth(Order_Date) == 3) and (Country == "United Kingdom" or Country == "Canada" or Country == "France")
| summarize UnitsSold = sum(Units_Sold) by Country, Timestamp = Order_Date
| order by Timestamp asc;
let min_t = toscalar(series | summarize min(Timestamp));
let max_t = toscalar(series | summarize max(Timestamp));
series
| make-series NumSold=sum(UnitsSold) default=0 on Timestamp in range(min_t, max_t, 1d) by Country
| render timechart
The graph that this renders show the interpolation of the values at 0 where they don't exist which is what we want.
Comments