ES|QL TBUCKET function
buckets- Target number of buckets, or desired bucket size. When a number is provided, the actual bucket size is derived from
from/to. When a duration or period is provided, it is used as the explicit bucket size. from- Start of the range. Required with a numeric
buckets. to-
End of the range. Required with a numeric
buckets.
Creates groups of values - buckets - out of a @timestamp attribute. The size of the buckets can either be provided directly as a duration or period, or chosen based on a recommended count and a range.
| buckets | from | to | result |
|---|---|---|---|
| date_period | date | ||
| date_period | date_nanos | ||
| integer | date | date | date |
| integer | date | date | date_nanos |
| integer | date | keyword | date |
| integer | date | keyword | date_nanos |
| integer | date | text | date |
| integer | date | text | date_nanos |
| integer | keyword | date | date |
| integer | keyword | date | date_nanos |
| integer | keyword | keyword | date |
| integer | keyword | keyword | date_nanos |
| integer | keyword | text | date |
| integer | keyword | text | date_nanos |
| integer | text | date | date |
| integer | text | date | date_nanos |
| integer | text | keyword | date |
| integer | text | keyword | date_nanos |
| integer | text | text | date |
| integer | text | text | date_nanos |
| time_duration | date | ||
| time_duration | date_nanos |
TBUCKET can work in two modes: one in which the size of the bucket is computed
based on a buckets count recommendation and a range, and
another in which the bucket size is provided directly as a duration or period.
Using a target number of buckets, a start of a range, and an end of a range,
TBUCKET picks an appropriate bucket size to generate the target number of buckets or fewer.
For example, asking for at most 3 buckets over a 2 hour range results in hourly buckets
FROM sample_data
| STATS count = COUNT(*) BY bucket = TBUCKET(3, "2023-10-23T12:00:00Z", "2023-10-23T14:00:00Z")
| SORT bucket
| count:long | bucket:datetime |
|---|---|
| 2 | 2023-10-23T12:00:00.000Z |
| 5 | 2023-10-23T13:00:00.000Z |
The goal isn't to provide exactly the target number of buckets, it's to pick a range that people are comfortable with that provides at most the target number of buckets.
Asking for more buckets can result in a finer granularity.
For example, asking for at most 20 buckets in the same range results in 10-minute buckets
FROM sample_data
| STATS count = COUNT(*) BY bucket = TBUCKET(20, "2023-10-23T12:00:00Z", "2023-10-23T14:00:00Z")
| SORT bucket
| count:long | bucket:datetime |
|---|---|
| 1 | 2023-10-23T12:10:00.000Z |
| 1 | 2023-10-23T12:20:00.000Z |
| 1 | 2023-10-23T13:30:00.000Z |
| 4 | 2023-10-23T13:50:00.000Z |
TBUCKET does not filter any rows. It only uses the provided range to pick a good bucket size.
For rows with a value outside of the range, it returns a bucket value that corresponds to a bucket
outside the range. Combine TBUCKET with WHERE
to filter rows.
If the desired bucket size is known in advance, simply provide it as the first argument, leaving the range out:
FROM sample_data
| STATS min = MIN(@timestamp), max = MAX(@timestamp) BY bucket = TBUCKET(1 hour)
| SORT min
| min:datetime | max:datetime | bucket:datetime |
|---|---|---|
| 2023-10-23T12:15:03.360Z | 2023-10-23T12:27:28.948Z | 2023-10-23T12:00:00.000Z |
| 2023-10-23T13:33:34.937Z | 2023-10-23T13:55:01.543Z | 2023-10-23T13:00:00.000Z |
When providing the bucket size, it must be a time duration or date period.
Also the reference is epoch, which starts at 0001-01-01T00:00:00Z.
The bucket size can also be provided as a string:
FROM sample_data
| STATS min = MIN(@timestamp), max = MAX(@timestamp) BY bucket = TBUCKET("1 hour")
| SORT min
| min:datetime | max:datetime | bucket:datetime |
|---|---|---|
| 2023-10-23T12:15:03.360Z | 2023-10-23T12:27:28.948Z | 2023-10-23T12:00:00.000Z |
| 2023-10-23T13:33:34.937Z | 2023-10-23T13:55:01.543Z | 2023-10-23T13:00:00.000Z |