---
title: ES|QL query patterns for Kibana alerting v2 rules
description: "Advanced {{esql}} query patterns for {{alerting-v2}} rules: SLO burn rate, no-data detection, persistent breach, and unsupported operations."
url: https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/5528/explore-analyze/alerting/kibana-alerting-v2/rules/esql-query-patterns-v2
products:
  - Kibana
applies_to:
  - Elastic Cloud Serverless: Preview
  - Elastic Stack: Unavailable
---

# ES|QL query patterns for Kibana alerting v2 rules

Some detection problems can't be expressed as a single metric compared to a fixed threshold. You might need to know whether an SLO is burning through its error budget across multiple time windows at once. Or whether a specific host has gone silent, rather than whether the query returned nothing. Or whether a condition has persisted across consecutive time buckets rather than appearing once. These are structurally different problems that require different query shapes.

Use this page when a basic `STATS ... WHERE` pattern isn't enough, or when the detection logic requires multi-window calculation, last-seen reasoning, or bucket-level persistence checks. If you're still learning how Kibana alerting v2 rules work, start with [Author rules](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/5528/explore-analyze/alerting/kibana-alerting-v2/rules/author-rules-v2).

## Basic threshold query

A threshold query evaluates one metric over one lookback window and fires if a value crosses a limit. It is the simplest rule shape: a `STATS` aggregation followed by a `WHERE` condition.
```esql
FROM logs-*
| STATS
    // Count only error responses; count all requests for the denominator
    error_count = COUNT_IF(http.response.status_code >= 500),
    total_count = COUNT(*)
  BY service.name
| EVAL error_rate = error_count / total_count  // fraction of requests that failed
| WHERE error_rate > 0.10                      // alert when more than 10% of requests fail
| KEEP service.name, error_rate, error_count, total_count
```

One window, one aggregate, one threshold check. The result is either a breach or no breach for each group.

## SLO burn rate query

An SLO burn rate query asks a different question than a basic threshold: are you consuming your error budget faster than you can afford to? Rather than checking a single metric at a fixed limit, it calculates error rates across multiple time windows simultaneously and returns a severity level.

### Why multiple windows

Checking both a short window (for example, 5 minutes) and a long window (for example, 1 hour) together filters out brief spikes that do not represent a real budget threat. CRITICAL fires only when *both* the short and long burn rates exceed the threshold. The two-window requirement is what separates a genuine budget emergency from a momentary blip.

### Query structure

A single ES|QL query handles all window pairs at once using conditional aggregation:
```esql
FROM metrics-*
| WHERE @timestamp >= NOW() - 3 days  // keep this value in sync with the rule's lookback setting
| STATS
    // CRITICAL window pair: 5 min catches the fast signal, 1 hour confirms it's sustained
    errors_5m   = COUNT_IF(outcome == "failure" AND @timestamp >= NOW() - 5  minutes),
    total_5m    = COUNT_IF(@timestamp >= NOW() - 5  minutes),
    errors_1h   = COUNT_IF(outcome == "failure" AND @timestamp >= NOW() - 1  hour),
    total_1h    = COUNT_IF(@timestamp >= NOW() - 1  hour),
    // HIGH window pair: 30 min fast signal, 6 hours sustained confirmation
    errors_30m  = COUNT_IF(outcome == "failure" AND @timestamp >= NOW() - 30 minutes),
    total_30m   = COUNT_IF(@timestamp >= NOW() - 30 minutes),
    errors_6h   = COUNT_IF(outcome == "failure" AND @timestamp >= NOW() - 6  hours),
    total_6h    = COUNT_IF(@timestamp >= NOW() - 6  hours)
  BY slo.id                           
| EVAL
    // Compute error rates (errors as a fraction of total requests) for each window
    burn_5m  = errors_5m  / total_5m,
    burn_1h  = errors_1h  / total_1h,
    burn_30m = errors_30m / total_30m,
    burn_6h  = errors_6h  / total_6h
| EVAL severity = CASE(
    // CRITICAL: both the fast and sustained windows exceed 14.4x the baseline error rate.
    // Requiring both prevents a single brief spike from triggering a critical alert.
    burn_5m  > 14.4 AND burn_1h  > 14.4, "CRITICAL",
    // HIGH: same two-window logic at a lower threshold
    burn_30m > 6.0  AND burn_6h  > 6.0,  "HIGH",
    "none"
  )
| WHERE severity != "none"            
| KEEP slo.id, severity, burn_5m, burn_1h, burn_30m, burn_6h 
```

The burn rate multipliers (14.4× and 6×) are standard SLO error-budget consumption rates: for a 30-day SLO, a sustained 14.4× burn consumes 2% of the monthly error budget in a single hour. Adjust them to match your SLO targets.

Because the query computes several window pairs in one pass, the rule's lookback window must cover the widest time filter in the query (3 days in the example above).

## No-data detection

No-data detection inverts the normal pattern. Instead of filtering for data that meets a condition, you query for when data was *last seen* and flag sources that have gone silent.

The technique uses a broad lookback to find all known hosts, then surfaces only those that have not reported recently:
```esql
FROM metrics-*
| WHERE @timestamp >= NOW() - 12 hours  // broad lookback: every known host should have at least
                                        // one event in this window under normal conditions
| STATS last_seen = MAX(@timestamp) BY host.name
| WHERE last_seen < NOW() - 15 minutes  // silent for longer than the expected reporting interval
| KEEP host.name, last_seen
```

Every row returned is a host that has gone silent, so the base query itself drives the alert. No separate alert condition is needed.

### Variants


| Variant       | What it detects                                                                                                                                                                                         |
|---------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Host-specific | Each host that stops reporting generates its own alert series (use `BY host.name` for grouping).                                                                                                        |
| Global        | No data from any source. Omit the `BY` clause and check whether the query returns any rows at all.                                                                                                      |
| Combined      | Flags both a high-metric condition *and* silent hosts in one query using a `CASE` expression to assign a `status` field (`"alert"`, `"no data"`, or `"ok"`), then filters to only the problematic rows. |
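
The combined variant can be sketched in a single query. The metric field (`system.cpu.total.pct`), the 15-minute silence threshold, and the 0.90 CPU threshold are illustrative assumptions; adapt them to your data stream:
```esql
FROM metrics-*
| WHERE @timestamp >= NOW() - 12 hours  // broad lookback so silent hosts still appear
| STATS
    last_seen  = MAX(@timestamp),
    // Average CPU over the recent window only; rows outside it contribute null, which AVG ignores
    recent_cpu = AVG(CASE(@timestamp >= NOW() - 15 minutes, system.cpu.total.pct, null))
  BY host.name
| EVAL status = CASE(
    last_seen < NOW() - 15 minutes, "no data",  // host has gone silent
    recent_cpu > 0.90,              "alert",    // host is reporting and breaching
    "ok"
  )
| WHERE status != "ok"
| KEEP host.name, status, last_seen, recent_cpu
```

Because the `status` field distinguishes the two failure modes, a single rule can route silent hosts and breaching hosts differently in its alert output.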


### Lookback window sizing

The lookback must be wide enough that every known host appears in the result set. If the lookback is too short, a silent host falls outside the window and is never checked. However, large lookback windows on high-frequency data streams significantly increase query cost. Start with a lookback that comfortably covers the longest expected reporting gap for your hosts, not the full history of the index.

For no-data behavior when the entire base query returns zero rows (as opposed to detecting specific silent sources), refer to [No-data handling](/elastic/docs-content/pull/5528/explore-analyze/alerting/kibana-alerting-v2/rules/configure-a-rule-v2#no-data-handling-v2).

## Limitations and workarounds

Some patterns from the classic alerting aggregation API are not directly available in ES|QL, and some require workarounds.

### Persistent breach detection

A persistent breach condition detects a metric that stays above a threshold across several consecutive time buckets (for example, "CPU above 90% in all 10 of the last 10 five-minute windows"). ES|QL can express this with bucket counting:
```esql
FROM metrics-*
| WHERE @timestamp >= NOW() - 50 minutes      // lookback must cover all 10 five-minute buckets
| EVAL bucket = BUCKET(@timestamp, 5 minutes)
| STATS
    total_buckets     = COUNT_DISTINCT(bucket),
    // Count only buckets containing at least one breaching sample; CASE returns
    // null for non-breaching rows, and COUNT_DISTINCT ignores nulls
    exceeding_buckets = COUNT_DISTINCT(
      CASE(system.cpu.total.pct > 0.90, bucket, null)
    )
  BY host.name
| WHERE total_buckets >= 10                   // require full bucket coverage
    AND exceeding_buckets == total_buckets    // every bucket in the window must have breached
| KEEP host.name, total_buckets, exceeding_buckets
```

The rule's lookback window must cover all the buckets you want to check (50 minutes for 10 five-minute buckets in this example). If any bucket is missing from the data because the host stopped reporting briefly mid-window, `total_buckets` drops below 10 and the condition does not fire. Design the query so that gaps in reporting produce the behavior you want: either treating partial coverage as a non-breach or adjusting the `WHERE` filter to allow it.
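
For example, to tolerate one missing or non-breaching bucket (an illustrative relaxation; tune the numbers to your hosts' reporting gaps), the final `WHERE` in the query above could be loosened:
```esql
| WHERE total_buckets >= 9
    AND exceeding_buckets >= total_buckets - 1
```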

### Derivative aggregation

ES|QL does not have a `DERIVATIVE` function. In the Elasticsearch aggregations API, a derivative pipeline aggregation calculates the rate of change between consecutive time buckets (for example, "how fast is this counter increasing per minute?"). There is no equivalent in ES|QL today.

Use cases that require true per-bucket deltas (such as detecting a sudden acceleration in error rate) cannot be expressed as an ES|QL rule at this time. Consider pre-computing deltas in an ingest pipeline or using a transform to write derived metrics to a separate index that your rule can then query with a standard threshold pattern.
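
If a coarse approximation is acceptable, one workaround within ES|QL is to compare two adjacent windows in a single pass and alert on the difference. This produces a single delta between the two halves of the lookback, not a true per-bucket derivative; the index pattern and threshold below are illustrative:
```esql
FROM logs-*
| WHERE @timestamp >= NOW() - 10 minutes
| STATS
    // Error counts in the two adjacent 5-minute halves of the lookback
    recent_errors   = COUNT_IF(http.response.status_code >= 500 AND @timestamp >= NOW() - 5 minutes),
    previous_errors = COUNT_IF(http.response.status_code >= 500 AND @timestamp <  NOW() - 5 minutes)
  BY service.name
| EVAL delta = recent_errors - previous_errors
| WHERE delta > 100   // illustrative threshold: errors grew by more than 100 between windows
| KEEP service.name, delta, recent_errors, previous_errors
```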