ES|QL dense vector functions

Serverless Preview Stack Preview

ES|QL supports dense vector functions for vector similarity calculations and k-nearest neighbor search. Dense vector functions work with dense_vector fields and require appropriate field mappings.

ES|QL supports these vector functions:

KNN Stack Preview 9.2.0 Serverless Preview
TEXT_EMBEDDING Stack Planned Serverless Preview
V_COSINE Stack Planned Serverless Preview
V_DOT_PRODUCT Stack Planned Serverless Preview
V_HAMMING Stack Planned Serverless Preview
V_L1_NORM Stack Planned Serverless Preview
V_L2_NORM Stack Planned Serverless Preview

`KNN`

Serverless Preview Stack Preview 9.2.0

Syntax

Parameters

field: Field that the query will target. The knn function can be used with dense_vector or semantic_text fields. Other text fields are not allowed.
query: Vector value to find the top nearest neighbors for.
options: (Optional) Additional kNN options, passed as function named parameters. See knn query for more information.

Description

Finds the k nearest vectors to a query vector, as measured by a similarity metric. The knn function finds the nearest vectors through approximate search on indexed dense_vector or semantic_text fields.

Supported types

field	query	options	result
dense_vector	dense_vector	named parameters	boolean
text	dense_vector	named parameters	boolean

Supported function named parameters

boost: (float) Floating point number used to decrease or increase the relevance scores of the query. Defaults to 1.0.
k: (integer) The number of nearest neighbors to return from each shard. Elasticsearch collects k results from each shard, then merges them to find the global top results. This value must be less than or equal to num_candidates. This value is automatically set with any LIMIT applied to the function.
visit_percentage: (float) The percentage of vectors to explore per shard while doing knn search with bbq_disk. Must be between 0 and 100. 0 will default to using num_candidates for calculating the percent visited. Increasing visit_percentage tends to improve the accuracy of the final results. If visit_percentage is set for bbq_disk, num_candidates is ignored. Defaults to ~1% per shard for every 1 million vectors.
min_candidates: (integer) The minimum number of nearest neighbor candidates to consider per shard while doing knn search. KNN may use a higher number of candidates if the query can't use approximate results. Cannot exceed 10,000. Increasing min_candidates tends to improve the accuracy of the final results. Defaults to 1.5 * k (or LIMIT) used for the query.
rescore_oversample: (double) Applies the specified oversampling for rescoring quantized vectors. See oversampling and rescoring quantized vectors for details.
similarity: (double) The minimum similarity required for a document to be considered a match. The similarity value calculated relates to the raw similarity used, not the document score.
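
The per-shard behavior of k described above can be pictured with a small Python sketch. It is only an illustration of the collect-then-merge idea, not Elasticsearch's implementation: the function names, the shard contents, and the choice to keep exactly k merged results are all assumptions made for the example.

```python
import heapq

def shard_top_k(scored_docs, k):
    """Return the k highest-scoring (score, doc_id) pairs from one shard."""
    return heapq.nlargest(k, scored_docs)

def global_top_k(shards, k):
    """Collect k hits from each shard, then merge them into a global top k."""
    candidates = []
    for scored_docs in shards:
        candidates.extend(shard_top_k(scored_docs, k))
    return heapq.nlargest(k, candidates)

# Hypothetical (score, doc_id) pairs held by two shards.
shards = [
    [(0.91, "a"), (0.40, "b"), (0.75, "c")],
    [(0.88, "d"), (0.95, "e"), (0.10, "f")],
]

top = global_top_k(shards, k=2)
print(top)
```

Each shard contributes at most k candidates, so the merge step only ever has to rank k times the number of shards entries.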

Example

from colors metadata _score
| where knn(rgb_vector, [0, 120, 0])
| sort _score desc, color asc

color:text	rgb_vector:dense_vector
green	[0.0, 128.0, 0.0]
black	[0.0, 0.0, 0.0]
olive	[128.0, 128.0, 0.0]
teal	[0.0, 128.0, 128.0]
lime	[0.0, 255.0, 0.0]
sienna	[160.0, 82.0, 45.0]
maroon	[128.0, 0.0, 0.0]
navy	[0.0, 0.0, 128.0]
gray	[128.0, 128.0, 128.0]
chartreuse	[127.0, 255.0, 0.0]
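
As a sanity check on the result order, the following Python sketch ranks the ten vectors listed above by exact (brute-force) distance to the query vector. It assumes the rgb_vector field uses Euclidean (l2_norm) similarity, which this page does not state, and it ignores how _score is actually computed. Under that assumption the order matches the table, with ties broken by color name as in the query.

```python
import math

# Vectors copied from the result table above.
colors = {
    "green": [0.0, 128.0, 0.0],
    "black": [0.0, 0.0, 0.0],
    "olive": [128.0, 128.0, 0.0],
    "teal": [0.0, 128.0, 128.0],
    "lime": [0.0, 255.0, 0.0],
    "sienna": [160.0, 82.0, 45.0],
    "maroon": [128.0, 0.0, 0.0],
    "navy": [0.0, 0.0, 128.0],
    "gray": [128.0, 128.0, 128.0],
    "chartreuse": [127.0, 255.0, 0.0],
}
query = [0.0, 120.0, 0.0]

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Nearest first; equal distances fall back to the color name (color asc).
ranked = sorted(colors, key=lambda name: (l2_distance(colors[name], query), name))
print(ranked)
```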

`TEXT_EMBEDDING`

Serverless Preview Stack Planned

Syntax

Parameters

text: Text string to generate embeddings from. Must be a non-null literal string value.
inference_id: Identifier of an existing inference endpoint that will generate the embeddings. The inference endpoint must have the text_embedding task type and should use the same model that was used to embed your indexed data.

Description

Generates dense vector embeddings from text input using a specified inference endpoint. Use this function to generate query vectors for KNN searches against your vectorized data or for other dense-vector-based operations.

Supported types

text	inference_id	result
keyword	keyword	dense_vector

Examples

Basic text embedding generation from a text string using an inference endpoint.

ROW input="Who is Victor Hugo?"
| EVAL embedding = TEXT_EMBEDDING("Who is Victor Hugo?", "test_dense_inference")

Generate text embeddings and store them in a variable for reuse in KNN vector search queries.

FROM dense_vector_text METADATA _score
| EVAL query_embedding = TEXT_EMBEDDING("be excellent to each other", "test_dense_inference")
| WHERE KNN(text_embedding_field, query_embedding)

Directly embed text within a KNN query for streamlined vector search without intermediate variables.

FROM dense_vector_text METADATA _score
| WHERE KNN(text_embedding_field, TEXT_EMBEDDING("be excellent to each other", "test_dense_inference"))

`V_COSINE`

Serverless Preview Stack Planned

Syntax

Parameters

left: First dense_vector to use to calculate the cosine similarity
right: Second dense_vector to use to calculate the cosine similarity

Description

Calculates the cosine similarity between two dense_vectors.

Supported types

left	right	result
dense_vector	dense_vector	double

Example

from colors
| where color != "black"
| eval similarity = v_cosine(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc

color:text	similarity:double
cyan	1.0
teal	1.0
turquoise	0.9781067967414856
aqua marine	0.929924726486206
azure	0.8324936032295227
lavender	0.827340304851532
mint cream	0.8245516419410706
honeydew	0.8244848847389221
gainsboro	0.8164966106414795
gray	0.8164966106414795
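
For readers who want to check the numbers, here is a minimal Python sketch of the standard cosine similarity formula (dot product divided by the product of the magnitudes). It is not the ES|QL implementation, and the function name is made up for the example. The teal and gray vectors are the ones shown in the KNN example on this page.

```python
import math

def cosine_similarity(left, right):
    """Dot product of the two vectors divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(left, right))
    norm_left = math.sqrt(sum(x * x for x in left))
    norm_right = math.sqrt(sum(y * y for y in right))
    # Undefined for a zero-magnitude vector such as black [0, 0, 0]:
    # the division below raises ZeroDivisionError in that case.
    return dot / (norm_left * norm_right)

query = [0, 255, 255]
print(cosine_similarity([0, 128, 128], query))    # teal
print(cosine_similarity([128, 128, 128], query))  # gray
```

Teal points in the same direction as the query vector, so its similarity is 1.0 up to floating-point rounding, even though the magnitudes differ.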

`V_DOT_PRODUCT`

Serverless Preview Stack Planned

Syntax

Parameters

left: First dense_vector to use to calculate the dot product
right: Second dense_vector to use to calculate the dot product

Description

Calculates the dot product between two dense_vectors.

Supported types

left	right	result
dense_vector	dense_vector	double

Example

from colors
| eval similarity = v_dot_product(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc

color:text	similarity:double
azure	130050.0
cyan	130050.0
white	130050.0
mint cream	128775.0
snow	127500.0
honeydew	126225.0
ivory	126225.0
sea shell	123165.0
lavender	122400.0
old lace	121125.0
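
The dot product is the sum of the component-wise products. The Python sketch below shows that arithmetic; the function name is hypothetical, and the example assumes white is stored as [255, 255, 255], which this page does not show.

```python
def dot_product(left, right):
    """Sum of the products of corresponding components."""
    if len(left) != len(right):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(left, right))

query = [0, 255, 255]
# Assumed RGB value for white; 0*255 + 255*255 + 255*255 = 130050.
print(dot_product([255, 255, 255], query))
```

Unlike cosine similarity, the dot product is not normalized, so vectors with larger components score higher.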

`V_HAMMING`

Serverless Preview Stack Planned

Syntax

Parameters

left: First dense_vector to use to calculate the Hamming distance
right: Second dense_vector to use to calculate the Hamming distance

Description

Calculates the Hamming distance between two dense vectors.

Supported types

left	right	result
dense_vector	dense_vector	double

Example

from colors
| eval similarity = v_hamming(rgb_byte_vector, [0, 127, 127])
| sort similarity desc, color asc

color:text	similarity:double
red	23.0
indigo	19.0
orange	19.0
black	17.0
gold	17.0
bisque	16.0
chartreuse	16.0
green	16.0
maroon	16.0
navy	16.0
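
As a rough illustration of what a Hamming distance measures, the Python sketch below counts the bits that differ between two byte vectors. It assumes the usual bitwise definition over byte-valued components. The function name and input vectors are made up and are not taken from the colors data set, so the output is not meant to reproduce the table above.

```python
def hamming_distance(left, right):
    """Count the differing bits between two equal-length byte vectors."""
    if len(left) != len(right):
        raise ValueError("vectors must have the same number of dimensions")
    # XOR leaves a 1 wherever the bits differ; the 0xFF mask keeps the low
    # eight bits so that negative (signed) byte values are handled too.
    return sum(bin((x ^ y) & 0xFF).count("1") for x, y in zip(left, right))

# 5 ^ 1 = 4 (one bit set) and 3 ^ 7 = 4 (one bit set), so the distance is 2.
print(hamming_distance([5, 3], [1, 7]))
```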

`V_L1_NORM`

Serverless Preview Stack Planned

Syntax

Parameters

left: First dense_vector to use to calculate the L1 norm
right: Second dense_vector to use to calculate the L1 norm

Description

Calculates the L1 norm (Manhattan distance) between two dense_vectors, that is, the sum of the absolute differences of their components.

Supported types

left	right	result
dense_vector	dense_vector	double

Example

from colors
| eval similarity = v_l1_norm(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc

color:text	similarity:double
red	765.0
crimson	650.0
maroon	638.0
firebrick	620.0
orange	600.0
tomato	595.0
brown	591.0
chocolate	585.0
coral	558.0
gold	550.0
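
The L1 norm of the difference between two vectors is the sum of the absolute component-wise differences. The Python sketch below shows that arithmetic with a hypothetical function name, using the maroon vector [128, 0, 0] shown in the KNN example on this page.

```python
def l1_norm(left, right):
    """Sum of the absolute differences between corresponding components."""
    if len(left) != len(right):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(abs(x - y) for x, y in zip(left, right))

query = [0, 255, 255]
# maroon: |128 - 0| + |0 - 255| + |0 - 255| = 638
print(l1_norm([128, 0, 0], query))
```

Because this is a distance, larger values mean the vectors are further apart, which is why sorting in descending order lists the least similar colors first.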

`V_L2_NORM`

Serverless Preview Stack Planned

Syntax

Parameters

left: First dense_vector to use to calculate the L2 norm
right: Second dense_vector to use to calculate the L2 norm

Description

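Calculates the L2 norm (Euclidean distance) between two dense_vectors, that is, the square root of the sum of the squared differences of their components.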

Supported types

left	right	result
dense_vector	dense_vector	double

Example

from colors
| eval similarity = v_l2_norm(rgb_vector, [0, 255, 255])
| sort similarity desc, color asc

color:text	similarity:double
red	441.6729431152344
maroon	382.6669616699219
crimson	376.36419677734375
orange	371.68536376953125
gold	362.8360595703125
black	360.62445068359375
magenta	360.62445068359375
yellow	360.62445068359375
firebrick	359.67486572265625
tomato	351.0227966308594
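
The L2 norm of the difference between two vectors is the Euclidean distance between them. The Python sketch below shows that arithmetic with a hypothetical function name, using the maroon and black vectors shown in the KNN example on this page. Python computes in double precision, so the last digits can differ slightly from the values in the table.

```python
import math

def l2_norm(left, right):
    """Square root of the sum of squared component-wise differences."""
    if len(left) != len(right):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(left, right)))

query = [0, 255, 255]
print(l2_norm([128, 0, 0], query))  # maroon, about 382.667
print(l2_norm([0, 0, 0], query))    # black, about 360.624
```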