Kusto Lightning Fact 1: summarize
I recently ran an intro using Kusto in Azure Data Explorer with our team. Most of them are pretty adept at Kusto anyway but it gave me a chance to think through some key features which I wanted to explore.
I downloaded a sales dataset from the internet. A 500K row dataset you can find here (http://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/)
In order to one-click ingest the file into your Azure Data Explorer instance follow these steps.
You should also get started with Kusto Explorer here:
So for our first lesson, once you get the data loaded refer to the tablename you created on ingestion. For me that was sales.
sales | take 200;
Run this query and you should see 200 rows of data. It's worth noting that you can replace take with limit if you fancy. In order to get the row count you can run the following:
sales | count;
Assuming you used the 500K row dataset you should get the number 500000 returned. Now run this query and I'll explain it.
project Region, Total_Revenue, Order_Date | summarize Earnings = round(sum(Total_Revenue) / 1000000000, 1) by Region, Year = getyear(Order_Date) | order by Region, Year asc;
Okay what happening here:
project Region, Total_Revenue, Order_Date
This is our equivalent of a SQL SELECT which give us the Region, Total_Revenue and Order_Date columns.
summarize Earnings = round(sum(Total_Revenue) / 1000000000, 1) by Region, Year = getyear(Order_Date)
We use the pipe operator to pipe the output to the summarize command and create a new column called Earnings from Total_Revenue. As Total_Revenue is represented in billions we round it to something easier to read and to 1 decimal place and summarize this by Region and the year in Order_Date using the KQL getyear function. This is the same as doing a SQL GROUP BY.
We then pipe the output again so that we can order by region and year ascending and voila we have a simple query which shows us how much each region made per year ordered by Region name.