ES|QL CATEGORIZE function

Note

The CATEGORIZE function requires a platinum license.

field
Expression to categorize.
options
(Optional) Additional CATEGORIZE options, passed as function named parameters.

Groups text messages into categories of similarly formatted text values.

CATEGORIZE has the following limitations:

  • can’t be used within other expressions
  • can’t be used more than once in the groupings
  • can’t be used or referenced within aggregate functions, and must be the first grouping
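
As a rough illustration of these limitations, the sketch below shows one grouping that satisfies them, with rejected forms left as comments. It assumes the sample_data index from the example on this page; the client_ip field is an assumption made for illustration and is not defined in this reference.

```esql
// Rejected: CATEGORIZE nested inside another expression.
//   | STATS count = COUNT() BY category = TO_UPPER(CATEGORIZE(message))
// Rejected: CATEGORIZE is not the first grouping.
//   | STATS count = COUNT() BY client_ip, category = CATEGORIZE(message)
// Rejected: CATEGORIZE used inside an aggregate function.
//   | STATS COUNT_DISTINCT(CATEGORIZE(message))

// Allowed: CATEGORIZE appears once, on its own, as the first grouping.
FROM sample_data
| STATS count = COUNT() BY category = CATEGORIZE(message), client_ip
```
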

Supported types:

field    options  result
keyword           keyword
text              keyword

output_format
(keyword) The output format of the categories. Defaults to regex.
similarity_threshold
(integer) The minimum percentage of token weight that must match for text to be added to the category bucket. Must be between 1 and 100. Larger values create narrower categories and increase memory usage. Defaults to 70.
analyzer
(keyword) Analyzer used to convert the field into tokens for text categorization.
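
A minimal sketch of passing options, assuming the map-style syntax that ES|QL uses for function named parameters. The values are illustrative only: similarity_threshold is raised from its default of 70 to 80, and output_format is set explicitly to its default, regex.

```esql
FROM sample_data
| STATS count = COUNT() BY category = CATEGORIZE(message, {"output_format": "regex", "similarity_threshold": 80})
```

With a higher threshold, messages must share more of their token weight to land in the same bucket, so expect narrower categories and somewhat higher memory usage.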

This example groups server log messages into categories and counts the messages in each category.

FROM sample_data
| STATS count=COUNT() BY category=CATEGORIZE(message)

count:long  category:keyword
3           .*?Connected.+?to.*?
3           .*?Connection.+?error.*?
1           .*?Disconnected.*?