
ES|QL dense vector functions

Serverless Preview Stack Preview

ES|QL supports dense vector functions for vector similarity calculations and k-nearest neighbor search. Dense vector functions work with dense_vector fields and require appropriate field mappings.

ES|QL supports these vector functions: KNN, TEXT_EMBEDDING, V_COSINE, V_DOT_PRODUCT, V_HAMMING, V_L1_NORM, and V_L2_NORM.

Serverless Preview Stack Preview 9.2.0

Syntax

KNN(field, query, options)

Parameters

field
Field that the query will target. The knn function can be used with dense_vector or semantic_text fields. Other text fields are not allowed.
query
Vector value for which to find the top nearest neighbors.
options

(Optional) Additional kNN options, passed as function named parameters. See knn query for more information.

Description

Finds the k nearest vectors to a query vector, as measured by a similarity metric. The knn function finds the nearest vectors through approximate search on indexed dense_vector or semantic_text fields.

Supported types

field query options result
dense_vector dense_vector named parameters boolean
text dense_vector named parameters boolean

Supported function named parameters

boost
(float) Floating point number used to decrease or increase the relevance scores of the query. Defaults to 1.0.
k
(integer) The number of nearest neighbors to return from each shard. Elasticsearch collects k results from each shard, then merges them to find the global top results. This value must be less than or equal to num_candidates. This value is automatically set with any LIMIT applied to the function.
visit_percentage
(float) The percentage of vectors to explore per shard while doing knn search with bbq_disk. Must be between 0 and 100. A value of 0 falls back to using num_candidates to calculate the percentage visited. Increasing visit_percentage tends to improve the accuracy of the final results. If visit_percentage is set for bbq_disk, num_candidates is ignored. Defaults to ~1% per shard for every 1 million vectors.
min_candidates
(integer) The minimum number of nearest neighbor candidates to consider per shard while doing knn search. KNN may use a higher number of candidates if the query can't use approximate results. Cannot exceed 10,000. Increasing min_candidates tends to improve the accuracy of the final results. Defaults to 1.5 * k (or LIMIT) used for the query.
rescore_oversample
(double) Applies the specified oversampling for rescoring quantized vectors. See oversampling and rescoring quantized vectors for details.
similarity

(double) The minimum similarity required for a document to be considered a match. The similarity value calculated relates to the raw similarity used, not the document score.
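The description of k says that k results are collected from each shard and then merged into the global top results. The following Python sketch illustrates only that merge step. The shard contents, scores, and the function names `top_k_per_shard` and `merge_shards` are hypothetical; the real search is approximate and its scoring is not modeled here.

```python
import heapq

def top_k_per_shard(shard_scores, k):
    """Return the k highest-scoring (score, doc_id) pairs from one shard."""
    return heapq.nlargest(k, shard_scores)

def merge_shards(shards, k):
    """Collect k candidates from each shard, then keep the global top k."""
    candidates = []
    for shard_scores in shards:
        candidates.extend(top_k_per_shard(shard_scores, k))
    return heapq.nlargest(k, candidates)

# Hypothetical (score, doc_id) pairs for two shards.
shard_a = [(0.91, "a1"), (0.40, "a2"), (0.75, "a3")]
shard_b = [(0.88, "b1"), (0.95, "b2"), (0.10, "b3")]

global_top = merge_shards([shard_a, shard_b], k=2)
print(global_top)
```

Each shard contributes at most k candidates, so the merge sees at most k times the number of shards before the final cut.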

Example

from colors metadata _score
| where knn(rgb_vector, [0, 120, 0])
| sort _score desc, color asc
		
color:text rgb_vector:dense_vector
green [0.0, 128.0, 0.0]
black [0.0, 0.0, 0.0]
olive [128.0, 128.0, 0.0]
teal [0.0, 128.0, 128.0]
lime [0.0, 255.0, 0.0]
sienna [160.0, 82.0, 45.0]
maroon [128.0, 0.0, 0.0]
navy [0.0, 0.0, 128.0]
gray [128.0, 128.0, 128.0]
chartreuse [127.0, 255.0, 0.0]
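The row order above is consistent with ranking by Euclidean distance to the query vector, with ties broken by color name. The source does not state which similarity metric rgb_vector is mapped with, so the Python sketch below assumes Euclidean (L2) distance and uses an exact brute-force scan instead of the approximate search that knn performs. The vectors are copied from the output above; `l2_distance` and `ranked` are illustrative names.

```python
import math

# Vectors copied from the example output above.
colors = {
    "green": [0.0, 128.0, 0.0],
    "black": [0.0, 0.0, 0.0],
    "olive": [128.0, 128.0, 0.0],
    "teal": [0.0, 128.0, 128.0],
    "lime": [0.0, 255.0, 0.0],
    "sienna": [160.0, 82.0, 45.0],
    "maroon": [128.0, 0.0, 0.0],
    "navy": [0.0, 0.0, 128.0],
    "gray": [128.0, 128.0, 128.0],
    "chartreuse": [127.0, 255.0, 0.0],
}

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [0, 120, 0]

# Closest first; ties fall back to the color name, mirroring
# "sort _score desc, color asc" under the assumed metric.
ranked = sorted(colors, key=lambda name: (l2_distance(colors[name], query), name))
print(ranked)
```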

Serverless Preview Stack Planned

Syntax

TEXT_EMBEDDING(text, inference_id)

Parameters

text
Text string to generate embeddings from. Must be a non-null literal string value.
inference_id

Identifier of an existing inference endpoint that will generate the embeddings. The inference endpoint must have the text_embedding task type and should use the same model that was used to embed your indexed data.

Description

Generates dense vector embeddings from text input using a specified inference endpoint. Use this function to generate query vectors for KNN searches against your vectorized data or for other dense vector based operations.

Supported types

text inference_id result
keyword keyword dense_vector

Examples

Basic text embedding generation from a text string using an inference endpoint.

ROW input="Who is Victor Hugo?"
| EVAL embedding = TEXT_EMBEDDING("Who is Victor Hugo?", "test_dense_inference")
		

Generate text embeddings and store them in a variable for reuse in KNN vector search queries.

FROM dense_vector_text METADATA _score
| EVAL query_embedding = TEXT_EMBEDDING("be excellent to each other", "test_dense_inference")
| WHERE KNN(text_embedding_field, query_embedding)
		

Directly embed text within a KNN query for streamlined vector search without intermediate variables.

FROM dense_vector_text METADATA _score
| WHERE KNN(text_embedding_field, TEXT_EMBEDDING("be excellent to each other", "test_dense_inference"))
		

Serverless Preview Stack Planned

Syntax

V_COSINE(left, right)

Parameters

left
First dense_vector to use to calculate the cosine similarity
right

Second dense_vector to use to calculate the cosine similarity

Description

Calculates the cosine similarity between two dense_vectors.

Supported types

left right result
dense_vector dense_vector double

Example

from colors
| where color != "black"
| eval similarity = v_cosine(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc
		
color:text similarity:double
cyan 1.0
teal 1.0
turquoise 0.9781067967414856
aqua marine 0.929924726486206
azure 0.8324936032295227
lavender 0.827340304851532
mint cream 0.8245516419410706
honeydew 0.8244848847389221
gainsboro 0.8164966106414795
gray 0.8164966106414795
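As a cross-check of the values above, cosine similarity can be computed by hand as the dot product divided by the product of the vector lengths. The Python sketch below is not the ES|QL implementation. It takes the teal and gray vectors from the knn example on this page, and `cosine_similarity` is an illustrative name.

```python
import math

def cosine_similarity(left, right):
    """Dot product of the vectors divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(left, right))
    norm_left = math.sqrt(sum(x * x for x in left))
    norm_right = math.sqrt(sum(y * y for y in right))
    return dot / (norm_left * norm_right)

query = [0, 255, 255]
teal = [0, 128, 128]      # same direction as the query, different length
gray = [128, 128, 128]

print(cosine_similarity(teal, query))
print(cosine_similarity(gray, query))
```

Because only direction matters, teal scores the same as cyan even though its components are half as large. An all-zero vector such as black has zero length, so this sketch would raise a ZeroDivisionError for it, which is a plausible reason the example filters black out.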

Serverless Preview Stack Planned

Syntax

V_DOT_PRODUCT(left, right)

Parameters

left
First dense_vector to use to calculate the dot product similarity
right

Second dense_vector to use to calculate the dot product similarity

Description

Calculates the dot product between two dense_vectors.

Supported types

left right result
dense_vector dense_vector double

Example

from colors
| eval similarity = v_dot_product(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc
		
color:text similarity:double
azure 130050.0
cyan 130050.0
white 130050.0
mint cream 128775.0
snow 127500.0
honeydew 126225.0
ivory 126225.0
sea shell 123165.0
lavender 122400.0
old lace 121125.0
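The three-way tie at the top of this output can be reproduced with a plain dot product. The Python sketch below assumes the standard RGB values for cyan, white, and azure, which are not listed on this page, and `dot_product` is an illustrative name.

```python
def dot_product(left, right):
    """Sum of the element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(left, right))

query = [0, 255, 255]

# Assumed standard RGB values; the page does not list these vectors.
cyan = [0, 255, 255]
white = [255, 255, 255]
azure = [240, 255, 255]

for name, vector in [("azure", azure), ("cyan", cyan), ("white", white)]:
    print(name, dot_product(vector, query))
```

The first component of the query is 0, so the red channel contributes nothing and any color with full green and blue reaches the same maximum.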

Serverless Preview Stack Planned

Syntax

V_HAMMING(left, right)

Parameters

left
First dense_vector to use to calculate the Hamming distance
right

Second dense_vector to use to calculate the Hamming distance

Description

Calculates the Hamming distance between two dense vectors.

Supported types

left right result
dense_vector dense_vector double

Example

from colors
| eval similarity = v_hamming(rgb_byte_vector, [0, 127, 127])
| sort similarity desc, color asc
		
color:text similarity:double
red 23.0
indigo 19.0
orange 19.0
black 17.0
gold 17.0
bisque 16.0
chartreuse 16.0
green 16.0
maroon 16.0
navy 16.0
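The values above are consistent with a bitwise Hamming distance, that is, the number of bit positions that differ between the two byte vectors. The Python sketch below rests on one assumption that the source does not confirm: that rgb_byte_vector stores each RGB channel shifted by 128 into the signed byte range. The helper names `to_signed_bytes` and `hamming_distance` are illustrative, and the RGB value used for red is the standard one rather than a value from this page.

```python
def to_signed_bytes(rgb):
    """Assumption: shift each 0..255 channel into the -128..127 byte range."""
    return [channel - 128 for channel in rgb]

def hamming_distance(left, right):
    """Count the differing bits between two equal-length byte vectors."""
    # Mask to 8 bits so negative values are compared as two's-complement bytes.
    return sum(bin((x ^ y) & 0xFF).count("1") for x, y in zip(left, right))

query = [0, 127, 127]

red = to_signed_bytes([255, 0, 0])      # assumed standard RGB for red
black = to_signed_bytes([0, 0, 0])
green = to_signed_bytes([0, 128, 0])

print(hamming_distance(red, query))
print(hamming_distance(black, query))
print(hamming_distance(green, query))
```

Under that assumption the sketch reproduces the listed distances for red, black, and green.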

Serverless Preview Stack Planned

Syntax

V_L1_NORM(left, right)

Parameters

left
First dense_vector to use to calculate the L1 norm similarity
right

Second dense_vector to use to calculate the L1 norm similarity

Description

Calculates the L1 distance (the L1 norm of the difference) between two dense_vectors.

Supported types

left right result
dense_vector dense_vector double

Example

from colors
| eval similarity = v_l1_norm(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc
		
color:text similarity:double
red 765.0
crimson 650.0
maroon 638.0
firebrick 620.0
orange 600.0
tomato 595.0
brown 591.0
chocolate 585.0
coral 558.0
gold 550.0
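The values above match the sum of absolute component differences between each color and the query vector. The Python sketch below checks this by hand. The maroon and black vectors come from the knn example on this page, the value for red is the assumed standard RGB value, and `l1_distance` is an illustrative name.

```python
def l1_distance(left, right):
    """Sum of the absolute element-wise differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(left, right))

query = [0, 255, 255]

maroon = [128, 0, 0]
black = [0, 0, 0]
red = [255, 0, 0]   # assumed standard RGB value, not listed on this page

print(l1_distance(red, query))
print(l1_distance(maroon, query))
print(l1_distance(black, query))
```

Red differs from the query by 255 in every channel, which gives the largest possible value for 8-bit RGB vectors.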


Serverless Preview Stack Planned

Syntax

V_L2_NORM(left, right)

Parameters

left
First dense_vector to use to calculate the L2 norm similarity
right

Second dense_vector to use to calculate the L2 norm similarity

Description

Calculates the L2 distance (the L2 norm of the difference, also called Euclidean distance) between two dense_vectors.

Supported types

left right result
dense_vector dense_vector double

Example

from colors
| eval similarity = v_l2_norm(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc
		
color:text similarity:double
red 441.6729431152344
maroon 382.6669616699219
crimson 376.36419677734375
orange 371.68536376953125
gold 362.8360595703125
black 360.62445068359375
magenta 360.62445068359375
yellow 360.62445068359375
firebrick 359.67486572265625
tomato 351.0227966308594
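The values above match the Euclidean distance between each color and the query vector. The Python sketch below recomputes two of them by hand, using the maroon and black vectors from the knn example on this page. `l2_distance` is an illustrative name, and the results can differ from the listed values in the last digits because Python uses 64-bit floats.

```python
import math

def l2_distance(left, right):
    """Square root of the summed squared element-wise differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(left, right)))

query = [0, 255, 255]

maroon = [128, 0, 0]
black = [0, 0, 0]

print(l2_distance(maroon, query))
print(l2_distance(black, query))
```

Black, magenta, and yellow share the same value in the output because each differs from the query by 255 in exactly two channels, assuming the standard RGB values for magenta and yellow.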