Quick start: Your first rule
This tutorial walks you through the core Kibana alerting v2 model in action. You'll load sample error logs, create a rule that groups results by service, watch episodes open as the rule finds breaches, and then trigger recovery by deleting the data. By the end, you'll have seen rules, grouping, episodes, and the alert lifecycle play out from start to finish.
Run this in Dev Tools to create a data stream called logs-tutorial and populate it with sample error logs. This is the data your rule will query.
POST logs-tutorial/_bulk
{ "create": {} }
{ "@timestamp": "2026-04-21T21:50:00.000Z", "log_level": "ERROR", "service_name": "checkout", "message": "Connection timeout" }
{ "create": {} }
{ "@timestamp": "2026-04-21T21:51:00.000Z", "log_level": "ERROR", "service_name": "checkout", "message": "Database query failed" }
{ "create": {} }
{ "@timestamp": "2026-04-21T21:52:00.000Z", "log_level": "ERROR", "service_name": "checkout", "message": "Null pointer exception" }
{ "create": {} }
{ "@timestamp": "2026-04-21T21:50:00.000Z", "log_level": "ERROR", "service_name": "payments", "message": "Payment gateway unreachable" }
{ "create": {} }
{ "@timestamp": "2026-04-21T21:51:00.000Z", "log_level": "ERROR", "service_name": "payments", "message": "Transaction rollback failed" }
{ "create": {} }
{ "@timestamp": "2026-04-21T21:50:00.000Z", "log_level": "INFO", "service_name": "checkout", "message": "Request received" }
{ "create": {} }
{ "@timestamp": "2026-04-21T21:51:00.000Z", "log_level": "INFO", "service_name": "payments", "message": "Health check passed" }
logs-tutorial is automatically created as a data stream because the logs-* naming pattern matches Elasticsearch's default index template. You don't need to create it manually beforehand.
Confirm the data was indexed. You should see errors: false and result: "created" for each document in the response.
Run the following ES|QL query in Discover to confirm the rule will find the data:
FROM logs-tutorial
| WHERE log_level == "ERROR"
| STATS error_count = COUNT() BY service_name
| WHERE error_count >= 2
You should see two rows: one for checkout (3 errors) and one for payments (2 errors). If you see this, the rule will trigger for both services.
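If it helps to see the aggregation logic in isolation, the following standalone Python sketch mirrors what the ES|QL pipeline does. It is a local stand-in with the tutorial's sample documents hard-coded; it does not talk to Elasticsearch.

```python
from collections import Counter

# Local stand-in for the tutorial documents (only the fields the query uses).
docs = [
    {"log_level": "ERROR", "service_name": "checkout"},
    {"log_level": "ERROR", "service_name": "checkout"},
    {"log_level": "ERROR", "service_name": "checkout"},
    {"log_level": "ERROR", "service_name": "payments"},
    {"log_level": "ERROR", "service_name": "payments"},
    {"log_level": "INFO", "service_name": "checkout"},
    {"log_level": "INFO", "service_name": "payments"},
]

# WHERE log_level == "ERROR" | STATS error_count = COUNT() BY service_name
error_counts = Counter(d["service_name"] for d in docs if d["log_level"] == "ERROR")

# WHERE error_count >= 2
breaches = {svc: n for svc, n in error_counts.items() if n >= 2}
print(breaches)  # {'checkout': 3, 'payments': 2}
```

The INFO documents drop out at the first filter, which is why only the error counts reach the threshold check.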
Go to Management > V2 Alerting Preview and create a new rule using the YAML editor with the following configuration:
kind: alert
metadata:
  name: tutorial-error-rate
  tags:
    - tutorial
time_field: '@timestamp'
schedule:
  every: 5s
  lookback: 24h
evaluation:
  query:
    base: |-
      FROM logs-tutorial
      | WHERE log_level == "ERROR"
      | STATS error_count = COUNT() BY service_name
      | WHERE error_count >= 2
  grouping:
    fields:
      - service_name
Save the rule. It will be enabled automatically.
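Conceptually, `every: 5s` and `lookback: 24h` mean that every 5 seconds the rule re-runs the base query over the last 24 hours of data. The Python sketch below illustrates one evaluation tick under that model; `evaluate_once` and the document shape are illustrative, not a Kibana API.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(hours=24)  # schedule.lookback: 24h
THRESHOLD = 2                   # WHERE error_count >= 2

def evaluate_once(docs, now):
    """One evaluation tick: window the data to the lookback period,
    count errors per service, and return the groups that breach."""
    window_start = now - LOOKBACK
    counts = {}
    for d in docs:
        if d["log_level"] != "ERROR":
            continue
        if not (window_start <= d["@timestamp"] <= now):
            continue
        counts[d["service_name"]] = counts.get(d["service_name"], 0) + 1
    return {svc: n for svc, n in counts.items() if n >= THRESHOLD}

now = datetime(2026, 4, 21, 22, 0, tzinfo=timezone.utc)
docs = [
    {"@timestamp": datetime(2026, 4, 21, 21, 50, tzinfo=timezone.utc),
     "log_level": "ERROR", "service_name": "checkout"},
    {"@timestamp": datetime(2026, 4, 21, 21, 51, tzinfo=timezone.utc),
     "log_level": "ERROR", "service_name": "checkout"},
    # An error outside the 24h lookback window is ignored.
    {"@timestamp": datetime(2026, 4, 19, 12, 0, tzinfo=timezone.utc),
     "log_level": "ERROR", "service_name": "payments"},
]
print(evaluate_once(docs, now))  # {'checkout': 2}
```

Note how the stale `payments` error falls outside the window, so only `checkout` breaches in this tick.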
Wait at least 5 seconds, then run the following in Discover in ES|QL mode. Replace <your-rule-id> with the tutorial-error-rate rule ID.
FROM .rule-events
| WHERE rule.id == "<your-rule-id>"
| SORT @timestamp DESC
| LIMIT 10
After saving the rule, open it in Management > V2 Alerting Preview. The rule ID appears at the end of the browser URL.
Check the following in the query results:
| Field | What you should see |
|---|---|
| `@timestamp` | Recent timestamps updating every 5 seconds |
| `status` | `breached` confirms the query is finding matches |
| `episode.status` | `active` confirms episodes have opened for both services |
| `data.service_name` | `checkout` and `payments` confirm grouping is working |
In Discover, run:
FROM .rule-events
| WHERE rule.id == "<your-rule-id>"
| STATS latest = MAX(@timestamp) BY episode.id, episode.status, data.service_name
| SORT latest DESC
You should see two rows: one episode for checkout and one for payments, both with episode.status: active.
Delete the test data to clear the breach condition. Run the following in Dev Tools:
POST logs-tutorial/_delete_by_query
{
"query": { "match_all": {} }
}
Wait up to 5 seconds, then re-run the query from Step 5 in Discover. Both episodes should now show episode.status: inactive.
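The recovery you just observed is a simple state transition: a group's episode is active while the query breaches for that group, and flips to inactive once the breach clears. This minimal Python sketch models that lifecycle; the `Episode` class and its statuses are illustrative, not the actual Kibana implementation.

```python
class Episode:
    """Tracks one group's lifecycle across rule evaluations."""

    def __init__(self, group):
        self.group = group
        self.status = "inactive"

    def update(self, breached):
        # A breach opens (or keeps open) the episode; no breach recovers it.
        self.status = "active" if breached else "inactive"

episodes = {svc: Episode(svc) for svc in ("checkout", "payments")}

# While the error logs exist, both groups breach on every evaluation.
for ep in episodes.values():
    ep.update(breached=True)
assert all(ep.status == "active" for ep in episodes.values())

# After the delete-by-query, the query returns no rows, so neither breaches.
for ep in episodes.values():
    ep.update(breached=False)
assert all(ep.status == "inactive" for ep in episodes.values())
print("both episodes recovered")
```

Each group key gets its own `Episode`, which is why `checkout` and `payments` open and recover independently.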
By completing this tutorial, you saw the core alerting v2 model in action end to end:
- Rules query your data on a schedule. Your rule ran every 5 seconds, checking for services with 2 or more errors in the last 24 hours.
- Grouping creates independent series. Because the rule grouped by `service_name`, `checkout` and `payments` were tracked separately. Each got its own episode.
- Episodes follow a lifecycle. When the error logs existed, both episodes moved to `active`. When you deleted the logs, both recovered and moved to `inactive` automatically, with no manual intervention required.
- Rule events are the underlying record. Every evaluation wrote documents to `.rule-events`, giving you a full, queryable history of what the rule found and when.
- Add notifications: Create a workflow and action policy to route alerts when an episode opens or recovers. Refer to Notifications.
- Use your own data: Swap `logs-tutorial` for a real data source and update the breach condition to match your use case. Refer to Author rules to learn more.
- Explore rule history in Discover: Query `.rule-events` to track trends, compare episode durations, and identify which services breach most frequently. Refer to Query alerts and signals in Discover to learn more.