Vector search in Elasticsearch
Sometimes full-text search alone isn't enough. Machine learning techniques help you find data based on intent and contextual meaning, not just keywords. Vector search is the foundation for these capabilities in Elasticsearch.
Vector search uses machine learning models to convert content into numerical representations called vector embeddings. These embeddings capture meaning and relationships, enabling Elasticsearch to retrieve results based on similarity rather than exact term matches.
New to vector search? Start with the `semantic_text` workflow, which provides an easy-to-use abstraction over vector search with sensible defaults and automatic model management. Learn more in this hands-on tutorial.
To understand the core concepts behind vector search, including vectors, embeddings, similarity, and the difference between dense and sparse approaches, refer to How vector search works.
Vector search enables a wide range of applications:
- Natural language search: Let users search in everyday language and get results based on meaning, not just keywords.
- Retrieval Augmented Generation (RAG): Retrieve relevant documents from Elasticsearch and feed them into a large language model (LLM) to generate grounded, context-aware answers.
- Question answering: Match natural language questions to the most relevant answers in your data.
- Content recommendations: Suggest related articles, products, or media based on vector similarity.
- Large-scale information retrieval: Search across millions or billions of documents efficiently.
- Product discovery: Help users find products that match their intent, even when they don't use exact product terms.
- Workplace document search: Search internal knowledge bases, wikis, and documents by meaning rather than exact keywords.
- Image and multimedia similarity: Find visually or semantically similar images, audio, or video by comparing their vector representations.
You can combine vector search with full-text search for hybrid search that leverages both meaning-based and keyword-based matching.
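As an illustration of that hybrid pattern, the request body below sketches a reciprocal rank fusion (RRF) retriever that merges a lexical `match` query with a `semantic` query. The index, field names (`title`, `content`), and query text are assumptions for the example; `content` is assumed to be a `semantic_text` field.

```python
# Hybrid search sketch: RRF fuses the ranked results of a keyword query
# and a meaning-based semantic query into one result list.
# Field names and query strings are illustrative.
hybrid_request = {
    "retriever": {
        "rrf": {  # reciprocal rank fusion over the child retrievers
            "retrievers": [
                # Lexical leg: classic full-text relevance on "title"
                {"standard": {"query": {"match": {"title": "vector search"}}}},
                # Semantic leg: embedding similarity on a semantic_text field
                {"standard": {"query": {"semantic": {"field": "content",
                                                     "query": "vector search"}}}},
            ]
        }
    }
}
```

With the Python client this body would be passed as the request to a search call against your index; RRF needs no score calibration between the two legs, which is why it is a common default for hybrid search.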
Elasticsearch offers several ways to implement vector search. Your choice depends on how much control you need and what type of content you are searching.
Semantic search workflows are managed and require minimal configuration. They handle embedding generation and model management for you. Choose semantic search when:
- You want to get started quickly with natural language search
- You prefer Elastic to manage models and indexing defaults
- Your use case is text-based and fits common patterns (document search, RAG, question answering)
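A minimal sketch of the managed workflow: mapping a field as `semantic_text` is essentially all the configuration required, since Elasticsearch handles chunking, embedding generation, and model deployment behind the field type. The index name is an assumption for the example.

```python
# Managed semantic search sketch: one semantic_text field in the mapping.
# With no explicit inference_id, Elasticsearch uses its default
# inference endpoint for this field.
semantic_mapping = {
    "mappings": {
        "properties": {
            "content": {"type": "semantic_text"}
        }
    }
}

# With the Python client (index name assumed):
# es.indices.create(index="my-semantic-index", body=semantic_mapping)
```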
Direct vector search uses the `dense_vector` and `sparse_vector` field types. Choose this when:
- You already have pre-computed embeddings or generate them outside Elasticsearch
- You need to search non-text content (images, audio) with embeddings from external models
- You require fine-grained control over indexing, quantization, or query parameters
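For the bring-your-own-embeddings path, the sketch below maps a `dense_vector` field and runs a kNN search with a pre-computed vector. The field name, dimension count (384, typical of small sentence-embedding models), and query vector are illustrative assumptions; the real `query_vector` must have exactly `dims` values.

```python
# Direct dense vector sketch: store externally generated embeddings.
byo_mapping = {
    "mappings": {
        "properties": {
            "image_embedding": {
                "type": "dense_vector",
                "dims": 384,             # must match your model's output size
                "index": True,           # build an HNSW index for approximate kNN
                "similarity": "cosine",  # how vector closeness is scored
            }
        }
    }
}

# Approximate kNN query with a pre-computed embedding.
# query_vector is truncated here for readability; in practice it must
# contain all 384 values.
knn_request = {
    "knn": {
        "field": "image_embedding",
        "query_vector": [0.12, -0.45, 0.07],
        "k": 10,                 # number of neighbors to return
        "num_candidates": 100,   # candidates examined per shard (recall knob)
    }
}
```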
Resources are grouped by implementation path. Try out our tutorials in Start here for a quick win, or jump to the workflow that matches how much control you need.
- Get started with semantic search: Set up hybrid search using `semantic_text` with dense vector embeddings. The recommended starting point.
- How vector search works: Core concepts: vectors, embeddings, dimensions, similarity, dense vs. sparse vectors, and quantization.
Use `semantic_text`, the inference APIs, or ELSER for semantic search with managed embedding generation and model deployment.
- Semantic search with `semantic_text`: Implement semantic search with automatic embedding generation and model management.
- Hybrid search with `semantic_text`: Combine vector search with full-text search using reciprocal rank fusion.
- Semantic search with the inference API: Configure inference endpoints for more control over embedding generation.
- Semantic search with ELSER: Deploy the ELSER sparse vector model and build a semantic search pipeline.
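As a sketch of the sparse side, the body below queries a field holding ELSER token-weight pairs with a `sparse_vector` query. It assumes documents were ingested through an inference pipeline that wrote ELSER output into a field named `content_embedding`; the field name and the inference endpoint ID are assumptions for the example.

```python
# Sparse vector query sketch against an ELSER-populated field.
# ELSER expands the query text into weighted terms, giving explainable,
# term-based semantic matching.
sparse_request = {
    "query": {
        "sparse_vector": {
            "field": "content_embedding",          # assumed field name
            "inference_id": "my-elser-endpoint",   # assumed endpoint ID
            "query": "how do I improve recall?",
        }
    }
}
```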
Work directly with the `dense_vector` and `sparse_vector` field types when you need more control over indexing, quantization, and query parameters.
- Bring your own dense vectors to Elasticsearch: Store and search pre-computed dense vectors using the `dense_vector` field type.
- Dense vector search in Elasticsearch: How dense vectors capture semantic meaning using neural embeddings, and how to use them in Elasticsearch.
- Sparse vector search in Elasticsearch: How ELSER generates sparse vectors for explainable, term-based semantic matching.
- Tutorial: Dense and sparse workflows using ingest pipelines: A side-by-side walkthrough of dense and sparse vector ingest pipelines.
Build multi-stage retrieval and improve result ranking.
- kNN search in Elasticsearch: Run approximate and exact k-nearest neighbor searches, with filtering, multi-kNN, and nested vector support.
- Retrievers: Compose multi-stage retrieval pipelines that combine different search strategies in a single request.
- Semantic reranking: Rerank search results using a cross-encoder model to improve relevance after initial retrieval.
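Putting retrievers and reranking together, this sketch composes a two-stage pipeline: a BM25 first pass, then a cross-encoder rerank of the top window via a `text_similarity_reranker` retriever. The field name, query text, window size, and the `inference_id` of the reranker endpoint are all assumptions for the example.

```python
# Multi-stage retrieval sketch: cheap lexical retrieval first, then an
# expensive but more accurate reranking model over only the top results.
rerank_request = {
    "retriever": {
        "text_similarity_reranker": {
            # First stage: standard BM25 retrieval
            "retriever": {
                "standard": {"query": {"match": {"content": "quarterly revenue"}}}
            },
            "field": "content",                       # text fed to the reranker
            "inference_id": "my-rerank-endpoint",     # assumed endpoint ID
            "inference_text": "quarterly revenue",    # reranking query text
            "rank_window_size": 50,                   # how many docs to rerank
        }
    }
}
```

Restricting reranking to a window (here 50 documents) keeps the cross-encoder's per-document cost from dominating query latency.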
Learn about the models and services that power vector search in Elasticsearch.
- ELSER: Elastic's built-in sparse vector model for semantic search with explainable, term-based matching.
- E5: A multilingual dense embedding model that can be deployed directly in Elasticsearch.
- Elastic Inference Service: A managed service for running machine learning models for embedding generation and other NLP tasks.
- Search and compare text: Use deployed NLP models to search and compare text at query time.
- Text embedding and semantic search: Deploy a text embedding model and use it for vector search, from model setup to query.
- Using Cohere with Elasticsearch: Generate embeddings and perform semantic search using Cohere's models.
Tune vector search for production performance.
- Tune approximate kNN search: Optimize vector search performance by tuning quantization, HNSW parameters, memory, and recall tradeoffs.
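The main tuning knobs live in the mapping, as this sketch shows: a quantized index type plus HNSW graph parameters. The field name and the specific values are illustrative starting points, not recommendations; `int8_hnsw` trades a small amount of recall for roughly 4x less vector memory versus float32.

```python
# Tuning sketch: scalar quantization and HNSW parameters in the mapping.
# Larger m / ef_construction improve recall at the cost of index size
# and build time; num_candidates at query time is the other recall knob.
tuned_mapping = {
    "mappings": {
        "properties": {
            "vec": {
                "type": "dense_vector",
                "dims": 768,                    # assumed model output size
                "index": True,
                "similarity": "dot_product",
                "index_options": {
                    "type": "int8_hnsw",        # 8-bit scalar quantization
                    "m": 16,                    # HNSW graph connectivity
                    "ef_construction": 100,     # build-time beam width
                },
            }
        }
    }
}
```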