Kusto Lightning Fact 6: series-fir() - Generating Moving Averages for Time Series data

A key initiative and a first step in introducing time series analytics into a data model is to generate moving averages. Indeed, the most methodology ARIMA, AutoRegressive IntegratedMovingAverages.

In this post we'll focus on the moving average part and in subsequent posts we'll focus on regression.

Moving averages are incredibly useful in that they allow us to compress and smooth out the volume of data effectively averaging a group of close data points into a single one so the period that the moving average is calculated is all important.

It gives for a much smoother understanding of data over time without the peaks, troughs and anomalies.

We'll consider our sales dataset again which will pick up from the previous post which described the use of make-series to create a series capability using KQL.

let series = sales
| where getyear(Order_Date) == 2010 and (getmonth(Order_Date) == 1
  or getmonth(Order_Date) == 2 or getmonth(Order_Date) == 3) and (Country == "United Kingdom")
| summarize UnitsSold = sum(Units_Sold) by Country, Timestamp = Order_Date
| order by Timestamp asc; 
let min_t = toscalar(series | summarize min(Timestamp));
let max_t = toscalar(series | summarize max(Timestamp));
series
| make-series NumSold=sum(UnitsSold) default=0 on Timestamp in range(min_t, max_t, 1d) by Country
| extend NumSoldMA=series_fir(NumSold, repeat(1, 8))
| render timechart

The query above looks at the the first 3 months of 2010 for United Kingdom "units sold" going through the motions of creating a series like we did in the previous post. However, you'll see the addition of the series_fir() (finite impulse response) function. This taks our newly created series NumSold a period to roll up data points repeat(1, 8) - or 8 data points and creates a a moving average. The original series is in blue and the new moving average series is in red as in the below graph. You should be able to see the different immediately with the overall up and down trends captured without the spikes. Lowering the repeat period will increase the size of the spikes and troughs in our moving average series.

One key aspect of generating this new moving average series is being able to capture the trend. If we then use that trend to generate the difference between the real numbers in the data we're left a bit of a random walk showing how the series varies from the moving average.

let series = sales
| where getyear(Order_Date) == 2010 and (Country == "United Kingdom")
| summarize UnitsSold = sum(Units_Sold) by Country, Timestamp = Order_Date
| order by Timestamp asc; 
let min_t = toscalar(series | summarize min(Timestamp));
let max_t = toscalar(series | summarize max(Timestamp));
series
| make-series NumSold=sum(UnitsSold) default=0 on Timestamp in range(min_t, max_t, 1d) by Country
| extend NumSoldMA=series_fir(NumSold, repeat(1, 8))
| extend Residuals=series_subtract(NumSold, NumSoldMA)
| render timechart

We've now extended our dataset with a new column called Residuals which is a simple subtraction of our 2 time series. The original one and the one we created from the moving average. The data in the above query is extended to over a year which is looking at at a 10-day moving average so the peaks and troughs will be closer to the actual peaks and troughs as you can see below. The original series is in blue and moving average series in red. The new series which looks like a random walk is in green and shows how the actual changes in data vary from our calculated moving average.

In the next couple of posts we'll look at using regression in time series and continue on with the concept of seasonality.

Keep reading!

3 comentarios

UUdolfiJelenai

19 abr

A typical high-grade English pocket watch from the end of the 19th century and the beginning link of the link 20th, would have had a ¾ plate layout, with all steelwork either black polished or heat-blued, jewels set in gold chatons, a large diameter, free-sprung, bimetallic cut compensating balance set with timing and compensating link screws, and often, a chronometer detent escapement rather than a lever. The balance spring would have been an overcoil type, needless to say.

Me gusta

18 abr

Sure, the compromise here is that we have shorter days in the winter – but everything about winter is hellish anyway. Take away an hour of daylight and I barely notice. The only way link to get through link the winter is to embrace the suck and carry on; find your winter-friendly hobbies and wait it out. I'll gladly link sacrifice an hour in the winter for more useful time in the summer, anyway.

KaleoxKendax

07 abr

As Isaacson said, as a part of the Polaris Dawn mission, the team is taking part in the first-ever commercial astronaut spacewalk link using the new SpaceX Extravehicular Activity (EVA) suit. The design on the dial is most likely based on the Intravehicular Activity (IVA) suit that the EVA is developed on (that's a guess because I saw this watch link last year, and the EVA suit was only unveiled earlier this link year).