
ES|QL TOP_SNIPPETS function

field
The field to extract snippets from. The input can be a single-valued or multi-valued field. In the case of a multi-valued argument, snippets are extracted from each value separately.
query
The input text containing only query terms for snippet extraction. Lucene query syntax, operators, and wildcards are not allowed.
options

(Optional) Additional TOP_SNIPPETS options, passed as function named parameters.

Use TOP_SNIPPETS to extract the best snippets for a given query string from a text field.

TOP_SNIPPETS can be used on fields from the text family, such as text and semantic_text.

field | query | options | result
--- | --- | --- | ---
keyword | keyword | named parameters | keyword
keyword | keyword | | keyword
text | keyword | named parameters | keyword
text | keyword | | keyword
num_snippets
(integer) The maximum number of matching snippets to return.
num_words

(integer) The maximum number of words to return in each snippet.

When set to 0, chunking is disabled entirely and the input field values are used as-is, which is useful when the text has already been chunked.

highlight
(boolean) When true, wraps matched query terms in the returned snippets with markup tags. Defaults to false.
pre_tag
(keyword) Opening tag for highlighted terms. Only applies when highlight is true. Defaults to <em>.
post_tag
(keyword) Closing tag for highlighted terms. Only applies when highlight is true. Defaults to </em>.
encoder

(keyword) Controls HTML encoding of snippet text before tagging: default (no encoding) or html. Only applies when highlight is true. Defaults to default.

FROM books
| EVAL snippets = TOP_SNIPPETS(description, "Tolkien")
		
book_no:keyword | title:text | snippets:keyword
--- | --- | ---
1211 | The brothers Karamazov | null
1463 | Realms of Tolkien: Images of Middle-earth | Twenty new and familiar Tolkien artists are represented in this fabulous volume, breathing an extraordinary variety of life into 58 different scenes, each of which is accompanied by appropriate passage from The Hobbit and The Lord of the Rings and The Silmarillion
1502 | Selected Passages from Correspondence with Friends | null
1937 | The Best Short Stories of Dostoevsky (Modern Library) | null
1985 | Brothers Karamazov | null

FROM books
| WHERE MATCH(title, "Return")
| EVAL snippets = TOP_SNIPPETS(description, "Tolkien", { "num_snippets": 3, "num_words": 25 })
		
book_no:keyword | title:text | snippets:keyword
--- | --- | ---
2714 | Return of the King Being the Third Part of The Lord of the Rings | [Concluding the story begun in The Hobbit, this is the final part of Tolkien s epic masterpiece, The Lord of the Rings, featuring an exclusive, Tolkien s epic masterpiece, The Lord of the Rings, featuring an exclusive cover image from the film, the definitive text, and a detailed map of, Tolkien s classic tale of magic and adventure, begun in The Fellowship of the Ring and The Two Towers, features the definitive edition of the]
7350 | Return of the Shadow | [Tolkien for long believed would be a far shorter book, 'a sequel to The Hobbit'., In The Return of the Shadow (an abandoned title for the first volume) Christopher Tolkien describes, with full citation of the earliest notes, outline plans, ) Christopher Tolkien describes, with full citation of the earliest notes, outline plans, and narrative drafts, the intricate evolution of The Fellowship of the Ring and]

FROM books
| WHERE MATCH(title, "return")
| RERANK "Tolkien" ON TOP_SNIPPETS(description, "Tolkien", { "num_snippets": 3, "num_words": 25 }) WITH { "inference_id" : "test_reranker" }
		
book_no:keyword | title:text | _score:double
--- | --- | ---
2714 | Return of the King Being the Third Part of The Lord of the Rings | 0.007092198356986046
7350 | Return of the Shadow | 0.012500000186264515

This example demonstrates how to use TOP_SNIPPETS with RERANK. Returning a fixed number of snippets of limited size gives more control over the number of tokens used for semantic reranking: here, each document contributes at most 3 snippets of up to 25 words each.

FROM books
| WHERE MATCH(title, "Return")
| EVAL snippets = TOP_SNIPPETS(description, "Tolkien", { "num_snippets": 1, "num_words": 25, "highlight": true })
		
book_no:keyword | title:text | snippets:keyword
--- | --- | ---
2714 | Return of the King Being the Third Part of The Lord of the Rings | Concluding the story begun in The Hobbit, this is the final part of <em>Tolkien</em> s epic masterpiece, The Lord of the Rings, featuring an exclusive
7350 | Return of the Shadow | <em>Tolkien</em> for long believed would be a far shorter book, 'a sequel to The Hobbit'.

Enable highlighting by setting highlight to true in the options. This wraps matched query terms in the returned snippets with <em> tags by default. To use different tags, set the pre_tag and post_tag options to the desired opening and closing tags respectively.
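
As a sketch of the custom-tag case described above, the following query reuses the books example and replaces the default tags with <mark> and </mark>. The choice of <mark> is only an illustration; any opening and closing strings could be supplied.

```esql
FROM books
| WHERE MATCH(title, "Return")
// pre_tag and post_tag only take effect because highlight is true
| EVAL snippets = TOP_SNIPPETS(description, "Tolkien", { "num_snippets": 1, "num_words": 25, "highlight": true, "pre_tag": "<mark>", "post_tag": "</mark>" })
| KEEP book_no, title, snippets
```

The snippets should match those of the previous example, with <mark> and </mark> wrapped around each matched term in place of <em> and </em>.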

ROW chunks = ["Alice was beginning to get very tired of sitting by her sister on the bank. The white rabbit disappeared down a hole and Alice quickly jumped in after it. The rabbit hole went straight on like a tunnel for some way and then dipped suddenly.", "The Queen of Hearts ordered her soldiers to paint the white roses red."]
| EVAL snippets = TOP_SNIPPETS(chunks, "alice rabbit hole", {"num_words": 0, "highlight": true})
		
snippets:keyword
<em>Alice</em> was beginning to get very tired of sitting by her sister on the bank. The white <em>rabbit</em> disappeared down a <em>hole</em> and <em>Alice</em> quickly jumped in after it. The <em>rabbit</em> <em>hole</em> went straight on like a tunnel for some way and then dipped suddenly.

Set num_words to 0 to disable chunking entirely. This keeps the input field values as-is, which is useful when the text has already been chunked. Combine this with highlight set to true to highlight matched terms within each full value.
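
When values are returned as-is, any markup already present in the text is returned too. The sketch below combines num_words set to 0 with the encoder option set to html. The row values are made up for illustration, and the query assumes, based on the option description, that html encoding is applied to the snippet text before the highlight tags are added.

```esql
// Hypothetical input: the first value already contains markup and an ampersand
ROW chunks = ["Alice saw the <b>white</b> rabbit & followed it.", "The Queen of Hearts ordered her soldiers to paint the white roses red."]
// encoder only applies because highlight is true
| EVAL snippets = TOP_SNIPPETS(chunks, "rabbit", {"num_words": 0, "highlight": true, "encoder": "html"})
```

With encoder left at default, no encoding is applied to the snippet text.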

ROW content = CONCAT(
    "# Climate Change Solutions\\n\\n",
    "## Renewable Energy\\n",
    "Solar and wind power are becoming cost-competitive with fossil fuels. Investment in clean energy infrastructure creates jobs.\\n\\n",
    "## Carbon Capture\\n",
    "Direct air capture technology removes CO2 from the atmosphere. These systems require significant energy input but show promise for large-scale deployment.\\n\\n",
    "## Policy Changes\\n",
    "Government regulations can accelerate the transition to clean energy through carbon pricing and renewable energy mandates.")
| EVAL chunked_content = CHUNK(content, {"strategy": "recursive", "max_chunk_size": 30, "separators": ["##", "\\n\\n"]})
| EVAL snippets = TOP_SNIPPETS(chunked_content, "clean energy", {"num_words": 0, "num_snippets": 1, "highlight":true})
| KEEP snippets
		
snippets:keyword
## Policy Changes\nGovernment regulations can accelerate the transition to <em>clean</em> <em>energy</em> through carbon pricing and renewable <em>energy</em> mandates.

This is another example of setting num_words to 0, this time applied to input that has already been chunked with the CHUNK function. The markdown text is first chunked by section; because the text is pre-chunked, no further splitting is needed. With num_words set to 0, TOP_SNIPPETS skips its own chunking, so each chunk is scored and highlighted individually.