ES|QL TS command
The TS source command is similar to the FROM
source command, with the following key differences:
- Targets only time series indices
- Enables the use of time series aggregation functions inside the STATS command
TS index_pattern [METADATA fields]
index_pattern- A list of indices, data streams or aliases. Supports wildcards and date math.
fields- A comma-separated list of metadata fields to retrieve.
The TS source command enables time series semantics and adds support for
time series aggregation functions to the STATS command, such as
AVG_OVER_TIME(),
or RATE.
These functions are implicitly evaluated per time series, then aggregated by group using a secondary aggregation
function. For an example, refer to Calculate the rate of search requests per host.
This paradigm (a pair of aggregation functions) is standard for time series querying. For supported inner (time series) functions per metric type, refer to ES|QL time series aggregation functions. These functions also apply to downsampled data, with the same semantics as for raw data.
To discover which metrics, metric types, and dimensions are available before writing a
TS query, use METRICS_INFO
or TS_INFO.
If a query is missing an inner (time series) aggregation function,
LAST_OVER_TIME()
is assumed and used implicitly. For example, two equivalent queries that return the average of the last memory usage values per time series are shown in Aggregate with implicit LAST_OVER_TIME. To calculate the average memory usage across per-time-series averages, refer to Calculate the average of per-time-series averages.
You can use time series aggregation functions
directly in the STATS command (
You can also combine time series aggregation functions with regular aggregation functions such as SUM(), as outer aggregation functions. For examples, refer to Combine SUM and RATE and Combine SUM and AVG_OVER_TIME.
However, using a time series aggregation function in combination with an inner time series function causes an error. For an example, refer to Invalid query: nested time series functions.
If there is no STATS command in the query, the output of the TS command gets sorted by @timestamp in descending order by default. This helps listing recent values across many time series, as opposed to listing the results based on index sort configuration that may just return data points for a single time series.
TS is the right choice for aggregating metrics. It's optimized for time series
data and avoids correctness issues that
FROM | STATS runs into on
raw metric data. The two most common examples where FROM silently produces
wrong results that TS handles correctly are counter handling and uneven
publish intervals for metrics.
Counter metrics (such as a request_count that monotonically increases) record a
running total, not per-interval events. Summing or averaging the raw counter column
has no meaningful interpretation: the result reflects how many samples each series
happened to publish, not the underlying rate. Counter resets on process restart make
the problem worse by mixing pre-reset and post-reset values:
FROM metrics | STATS SUM(request_count) BY host
TS with RATE()
or INCREASE()
first computes the per-bucket delta for each time series (handling resets correctly
along the way), then lets you aggregate those deltas across series:
TS metrics | STATS SUM(RATE(request_count)) BY host
Different hosts and agents publish at different cadences. A host emitting CPU samples
every second contributes 120x more rows than one emitting every two minutes, so AVG()
over the raw column weights the chatty host accordingly:
FROM metrics | STATS AVG(cpu_usage) BY cluster
TS first reduces each time series to a single value per time bucket (using
AVG_OVER_TIME
or another inner function), then aggregates across series. Each series contributes
equally to the outer average:
TS metrics
| STATS AVG(AVG_OVER_TIME(cpu_usage)) BY cluster, TBUCKET(1 minute)
When the first STATS after TS uses a bare
time series aggregation function
(that is, a time series function not wrapped in an outer aggregation such as AVG() or SUM()),
the rows are implicitly grouped by all dimensions
of each time series. The result set includes a _timeseries keyword
column that contains a JSON-encoded object with the dimension key/value pairs
identifying each group. Only the dimensions that are actually present for a given time
series are included — not every dimension declared in the index mappings — so different
rows in the result may carry different dimension keys. For an example, refer to
Group by all dimensions implicitly.
You can make this grouping explicit, or narrow it to a subset of dimensions, using the
WITHOUT
grouping function in the BY clause (
BY WITHOUT(dim1, dim2, ...)groups by all dimensions except those listed. See Exclude dimensions with WITHOUT.BY WITHOUT()(no arguments) explicitly groups by every dimension; it is equivalent to the implicit "group by all" behavior.
When combining a bare time series function with other groupings, only grouping functions
(such as TBUCKET
or WITHOUT) are allowed in the BY clause — bare dimension columns are not.
For example, TS k8s | STATS rate(network.total_bytes_in) BY host is rejected; use
BY TBUCKET(1 hour) or wrap the time series function with an outer aggregation instead.
WITHOUT can only be used inside the first STATS command under TS source. Using it in a FROM | STATS ... BY WITHOUT(...)
query leads to an error.
- Prefer
TSoverFROMfor time series aggregations.FROMis still appropriate for non-aggregation uses such as listing document contents. - Avoid aggregating multiple metrics in the same query when those metrics have different dimensional cardinalities.
For example, in
STATS max(rate(foo)) + rate(bar)), iffooandbardon't share the same dimension values, the rate for one metric will be null for some dimension combinations. Because the + operator returns null when either input is null, the entire result becomes null for those dimensions. Additionally, queries that aggregate a single metric can filter out null values more efficiently. - Add a time range filter on
@timestampto limit the data volume scanned and improve query performance. - Time series aggregations produce large result sets, especially if they involve many dimensions and small time buckets. The limits are updated accordingly, with the default result truncation size increased to 10,000 rows. For more information on the limits and how to adjust them, refer to Result set size limitation.
COUNT(*)underTS: After the firstSTATSunderTS, rows are grouped per time series rather than counted as raw metric documents, soCOUNT(*)does not have the same meaning it does underFROM. To count samples for a specific metric, useCOUNT(<field>)orCOUNT_OVER_TIME()against a named field.- Commands that reorder rows: Commands that change row order, such as
SORTorFORK, cannot appear betweenTSand the firstSTATS. Time series aggregations rely on the natural@timestampordering produced byTS. After the firstSTATS, results are tabular and these commands are available again — see Process tabular results after the first STATS. - Metric type compatibility: Time series aggregation functions enforce the field's
metric_typemapping. For example,RATE()requires a counter and rejects gauge fields. See ES|QL time series aggregation functions > Metric type compatibility for the per-function mapping.
The following examples demonstrate common time series query patterns using TS.
Calculate the total rate of search requests (tracked by the search_requests counter) per host and hour. The RATE()
function is applied per time series in hourly buckets. These rates are summed for each
host and hourly bucket (since each host can map to multiple time series):
TS metrics
| WHERE @timestamp >= now() - 1 hour
| STATS SUM(RATE(search_requests)) BY TBUCKET(1 hour), host
The following two queries are equivalent, returning the average of the last memory usage values per time series. If a query is missing an inner (time series) aggregation function, LAST_OVER_TIME() is assumed and used implicitly:
TS metrics | STATS AVG(memory_usage)
TS metrics | STATS AVG(LAST_OVER_TIME(memory_usage))
This query calculates the average memory usage across per-time-series averages, rather than the average of all raw values:
TS metrics | STATS AVG(AVG_OVER_TIME(memory_usage))
You can use a time series aggregation function directly in STATS (
TS metrics
| WHERE TRANGE(1 day)
| STATS RATE(search_requests) BY TBUCKET(1 hour)
Use SUM as the outer aggregation to sum counter rates across groups:
TS metrics | STATS SUM(RATE(search_requests)) BY host
Use AVG_OVER_TIME to compute per-time-series averages, then group the results by host and time bucket:
TS metrics
| WHERE @timestamp >= now() - 1 day
| STATS SUM(AVG_OVER_TIME(memory_usage)) BY host, TBUCKET(1 hour)
Once the first STATS under TS completes, the pipeline becomes a regular ES|QL table.
Chain additional STATS, EVAL, SORT, and similar commands to derive secondary
aggregations or computed columns. For example, take per-minute rates per cluster, then
compute the 95th-percentile and maximum rate per cluster and rank clusters by headroom:
TS k8s
| WHERE @timestamp >= now() - 1 hour
| STATS rate = MAX(RATE(network.total_bytes_in)) BY cluster, TBUCKET(1 minute)
| STATS p95 = PERCENTILE(rate, 95), peak = MAX(rate) BY cluster
| EVAL headroom = peak - p95
| SORT headroom DESC
When a bare time series aggregation function is used without a BY clause, results are
implicitly grouped by all dimensions of each time series and include a _timeseries
column with the dimension key/value pairs. Note how the qa cluster rows only carry
cluster and pod keys, while the prod and staging rows also include region —
only the dimensions that actually exist for a given time series appear in _timeseries:
TS k8s
| STATS avg = avg_over_time(network.cost)
| SORT avg DESC
| avg:double | _timeseries:keyword |
|---|---|
| 8.375 | {"cluster":"prod","pod":"three","region":["eu","us"]} |
| 7.262931034482759 | {"cluster":"qa","pod":"one"} |
| 6.870689655172414 | {"cluster":"qa","pod":"two"} |
| 6.635416666666667 | {"cluster":"qa","pod":"three"} |
| 6.46875 | {"cluster":"staging","pod":"two","region":"us"} |
| 5.869791666666667 | {"cluster":"staging","pod":"three","region":"us"} |
| 5.586956521739131 | {"cluster":"prod","pod":"one","region":["eu","us"]} |
| 5.1826923076923075 | {"cluster":"staging","pod":"one","region":"us"} |
| 4.0703125 | {"cluster":"prod","pod":"two","region":["eu","us"]} |
Use WITHOUT
(
BY clause to exclude specific dimensions from
the time series grouping. For example, group by every dimension except pod:
TS k8s
| STATS total_cost = sum(network.cost) BY WITHOUT(pod)
| SORT total_cost
| total_cost:double | _timeseries:keyword |
|---|---|
| 15.875 | {"cluster":"staging","region":"us"} |
| 18.625 | {"cluster":"prod","region":["eu","us"]} |
| 26.5 | {"cluster":"qa"} |
WITHOUT can be combined with
TBUCKET
to add a time bucket to the grouping — useful for producing per-interval aggregates
across the surviving dimensions:
TS k8s
| STATS total_cost = sum(network.cost) BY WITHOUT(pod), tbucket = TBUCKET(1 hour)
| SORT total_cost
| total_cost:double | _timeseries:keyword | tbucket:datetime |
|---|---|---|
| 15.875 | {"cluster":"staging","region":"us"} | 2024-05-10T00:00:00.000Z |
| 18.625 | {"cluster":"prod","region":["eu","us"]} | 2024-05-10T00:00:00.000Z |
| 26.5 | {"cluster":"qa"} | 2024-05-10T00:00:00.000Z |
Passing no arguments (WITHOUT()) is equivalent to grouping by all dimensions — the same
as omitting the BY clause entirely. Refer to the
WITHOUT
function reference for more examples.
Using a time series aggregation function in combination with an inner time series function causes an error:
TS metrics | STATS AVG_OVER_TIME(RATE(memory_usage))