How to use KQL for Azure monitoring (A basic recipe guide)

A guide on how to use common KQL conventions and techniques to query data, explained with handy cooking analogies.

By Amy Coughlin

Apr 15, 2024 • 5 Minute Read

Please set an alt value for this image...

Kusto Query Language, or KQL, is causing quite a stir in Azure Synapse Analytics and other Azure services that handle big data analytics. But most developers and IT administrators stumble upon this language for the first time while dealing with an urgent platform issue in an Azure Log Analytics workspace. And boy, do they need answers pronto!

It's like when your partner casually drops the bomb that they're bringing a colleague over for some "apps and drinks" tonight. Translated, that means they're secretly expecting you to whip up a gourmet meal that's a step above a sad bagged salad and a store-bought pre-roasted chicken. No pressure, right?

Now, I can't magically transform you into a KQL master chef with just five easy lessons. But what I can do is share a few handy kitchen tips that will help you bluff your way through until you find a spare moment to peruse the instruction manual for your fancy, new KQL cookware.

In fact, you can impress your colleague by serving this casserole at the table—or at your next neighborhood potluck if you end up ordering pizza for the "apps and drinks" tonight:

          // Show anonymous requests
StorageBlobLogs
| where TimeGenerated > ago(3d)
	and AuthenticationType contains "Anonymous"
	and OperationName startswith "PUT"
	and StatusCode == toscalar (
        	StorageBlobLogs
           	| top 1 by TimeGenerated desc
           	| project StatusCode)
| project TimeGenerated, OperationName, AuthenticationType
| order by TimeGenerated desc
    

In this article, we’ll break down this basic dish line by line, and explain the few common KQL conventions and techniques used to make it. For those familiar with SQL, I've included some comparisons to help you apply your existing knowledge of querying data.

The KQL Basics: As easy as boiling water

Enthusiasts of SQL well know the basic way to return all data from all columns in a table with SELECT * FROM StorageBlobLogs. However, KQL is a read-only language, so you don’t need to think about INSERT, DELETE, UPDATE and so on. This frees you to dispense with the all of the SELECT * FROM syntax!

The equivalent to return all columns from the StorageBlobLogs table is simply:

          // Show blob storage requests
StorageBlobLogs
    

Yes, that is an executable snippet of code! That’s because KQL was made to intentionally be easier than SQL. You can see that commenting your code is just two forward-slashes to privately express your thoughts.

Your cooking pipeline: Add the ingredients in order

If I want to include only a couple of specific columns, I use the project keyword like this:

          // Show blob storage requests
StorageBlobLogs
| project TimeGenerated, OperationName, AuthenticationType
    

Notice the | (pipe) element. In KQL queries, each operation is delineated by a pipe, and they are executed in the order they are written. The data set is manipulated at each step, and then the resulting set is “piped” to the next step.

With both KQL and kitchen recipes, the order you do things matters. After all, you wouldn’t put your peanut butter on your toast before toasting it, or try to add the sugar to your cake recipe after you’ve already baked it.

Speaking of order, note that sorting uses the familiar order by syntax of SQL:

          // Show blob storage requests
StorageBlobLogs
| project TimeGenerated, OperationName, AuthenticationType
| order by TimeGenerated desc
    

There’s nothing fancy there as you can see, but there are some additional options available in KQL beyond just ascending (asc) or descending (desc) order. For example, the nulls last and nulls first options allow you to deliberately manage the handling of empty data elements.

Portion Control: Taking only what you can eat

Filtering with the where clause should also be familiar SQL practitioners:

          // Show anonymous PUT requests
StorageBlobLogs
| where TimeGenerated > ago(3d)
	and AuthenticationType contains "Anonymous"
	and OperationName startswith "PUT"
| project TimeGenerated, OperationName, AuthenticationType
| order by TimeGenerated desc
    

But there are some notable differences in KQL. The ago function to get the last three days is handy in comparison to using slightly more complex syntax in SQL. And the contains and startswith operators are more user friendly as compared to using LIKE in SQL – but with the same high compute cost, so you may want to use other operators described in the KQL documentation.

You might have noticed that, unlike SQL syntax, the project clause with the list of columns comes after the where filter because that’s the more efficient way to process the data.

Fun Fact: That’s actually exactly what happens under-the-covers in modern SQL-based databases, too — even though the SELECT clause is correctly located at the beginning of the SQL query. The query optimizer in most relational database platforms do not apply your SELECT column choices until after most of the other operations are performed.

Knife Skills: Further Slicing and dicing your data

Big data often calls for multiple dimensions built from the same source or multiple sources. KQL use of union over data sets, and nested or correlated subqueries, are very similar to SQL syntax. Check out the use of a nested subquery that returns a scalar value in the where clause:

          // Show anonymous PUT requests, filter by recent status
StorageBlobLogs
| where TimeGenerated > ago(3d)
	and AuthenticationType contains "Anonymous"
	and OperationName startswith "PUT"
	and StatusCode ==  toscalar (
        			StorageBlobLogs
           		| top 1 by TimeGenerated desc
           		| project StatusCode)
| project TimeGenerated, OperationName, AuthenticationType
| order by TimeGenerated desc
    

As a bonus tip, in the subquery, you can see how the top operator works with a by ordering expression that is more succinct than the method used in SQL. Joins are a little bit trickier, so you’ll want to check out the cheat sheet at SQL to Kusto query translation | Microsoft Learn to study the equivalent syntax to a JOIN in SQL.

And that’s it on making that basic casserole! But if you want to add a kicky side dish of data, keep reading.

Garnishing with graphs and data charts

There are dozens of functions and techniques with KQL for producing big data charts and graphs. Here’s an example of a function that decomposes time series data and outputs it in a series of line charts:

          let min_t = datetime(2025-01-05);
let max_t = datetime(2025-02-03 22:00);
let dt = 2h;
StorageBlobLogs
| make-series num=avg(OperationCount) on TimeGenerated
from min_t to max_t step dt
| extend (baseline, seasonal, trend, residual) =
series_decompose(num, -1, 'linefit') 
| render timechart with
(title='My Time Series Demo', ysplit=panels)
    

More on preparing your Azure data like a pro

If you are interested in learning about features in Azure that can produce data worthy of inquiry with KQL, check out my Azure certification course titled, “AZ-204: Developing Solutions for Microsoft Azure.” And if you want to learn more about KQL, Microsoft has gone all-in on the “Kusto” brand, so there is a lot of documentation, but I suggest you start here: Kusto Query Language (KQL) overview.

Amy C.

Amy Coughlin is a Pluralsight Author and Senior Azure Training Architect with over 30 years of experience in the tech industry, mainly focused on Microsoft stack services and databases. She's living the dream of combining her love of technology with her passion for teaching others.

More about this author