ES|QL CATEGORIZE function
Note
The CATEGORIZE function requires a platinum license.
field- Expression to categorize.
options- (Optional) Additional categorization options, passed as function named parameters.
Groups text messages into categories of similarly formatted text values.
CATEGORIZE has the following limitations:
- can’t be used within other expressions
- can’t be used more than once in the groupings
- can’t be used or referenced within aggregate functions
- has to be the first grouping
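As a rough illustration of these limitations, the sketch below shows one accepted query followed by commented-out variants that each break one rule. The `sample_data` index and `message` field are borrowed from the example on this page; the `host` field is hypothetical, and the error text returned for the rejected forms is not shown because it may differ between versions.

```esql
// Accepted: CATEGORIZE appears once, as the first grouping, outside any other expression.
FROM sample_data
| STATS count = COUNT() BY category = CATEGORIZE(message)

// Expected to be rejected: nested inside another expression.
// | STATS count = COUNT() BY category = TO_UPPER(CATEGORIZE(message))

// Expected to be rejected: used more than once in the groupings.
// | STATS count = COUNT() BY a = CATEGORIZE(message), b = CATEGORIZE(message)

// Expected to be rejected: used inside an aggregate function.
// | STATS distinct = COUNT_DISTINCT(CATEGORIZE(message))

// Expected to be rejected: not the first grouping (`host` is a hypothetical field).
// | STATS count = COUNT() BY host, category = CATEGORIZE(message)
```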
| field | options | result |
|---|---|---|
| keyword | | keyword |
| text | | keyword |
output_format- (keyword) The output format of the categories. Defaults to regex.
similarity_threshold- (integer) The minimum percentage of token weight that must match for text to be added to the category bucket. Must be between 1 and 100. Larger values create narrower categories and increase memory usage. Defaults to 70.
analyzer- (keyword) Analyzer used to convert the field into tokens for text categorization.
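The options can be passed as a map of named parameters after the field. The following is a minimal sketch, assuming the ES|QL named-parameter map syntax (`{"name": value}`) and reusing the `sample_data` index and `message` field from the example on this page. The threshold of 85 is an arbitrary illustrative value, and `regex` is simply the documented default written out explicitly.

```esql
// Sketch only: the option values are illustrative, not recommendations.
FROM sample_data
| STATS count = COUNT() BY category = CATEGORIZE(message, {"similarity_threshold": 85, "output_format": "regex"})
```

Raising `similarity_threshold` above the default of 70 should produce narrower categories, at the cost of more memory.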
This example groups server log messages into categories and counts the messages in each category.
FROM sample_data
| STATS count=COUNT() BY category=CATEGORIZE(message)
| count:long | category:keyword |
|---|---|
| 3 | .*?Connected.+?to.*? |
| 3 | .*?Connection.+?error.*? |
| 1 | .*?Disconnected.*? |