---
title: Vector search use cases
description: Sometimes full-text search alone is not enough. Machine learning helps you find data by meaning, not only by matching keywords. Vector search is how Elasticsearch...
url: https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/vector/vector-search-use-cases
products:
  - Elastic Cloud Enterprise
  - Elastic Cloud Hosted
  - Elastic Cloud Serverless
  - Elastic Cloud on Kubernetes
  - Elastic Stack
  - Elasticsearch
applies_to:
  - Elastic Cloud Serverless: Generally available
  - Elastic Stack: Generally available
---

# Vector search use cases
Sometimes [full-text search](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/full-text) alone is not enough. Machine learning helps you find data by meaning, not only by matching keywords. Vector search is how Elasticsearch supports these workloads.
This page describes common vector search use cases and how to implement them.
<tip>
  New to vector search? You might want to start with the [managed `semantic_text` workflow](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/get-started/semantic-search).
</tip>


## How to implement retrieval

Choose a search strategy based on the following:
- **Embeddings**: For meaning-based text search, start with [managed workflows](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/semantic-search) such as [`semantic_text`](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/semantic-search/semantic-search-semantic-text). When you need full control over models, embeddings, or non-text vectors, configure [vector search](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/vector) directly. To understand the differences and choose the right approach, refer to [Semantic search and vector search](/elastic/docs-content/tree/main/solutions/search/vector#semantic-search-vs-vector-search).
- **Query interface**: Send requests with the [Search API and Query DSL](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/the-search-api) or [ES|QL for search](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/esql-for-search). Use the same approach at index time and at search time.
- **Combine strategies**: To rank keyword and vector results together, use [Hybrid search](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/hybrid-search) or [retrievers](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/retrievers-overview) in a single Search API request.
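
As a rough illustration of the "combine strategies" option, the sketch below builds a Search API request body that fuses a keyword retriever and a kNN retriever with reciprocal rank fusion (`rrf`). The field names `content` and `content_embedding`, the helper name, and the default values are assumptions made for this example, not names taken from this page.

```python
import json
from typing import Any


def build_hybrid_request(
    query_text: str,
    query_vector: list[float],
    text_field: str = "content",              # hypothetical text field
    vector_field: str = "content_embedding",  # hypothetical dense_vector field
    k: int = 10,
    num_candidates: int = 100,
) -> dict[str, Any]:
    """Build a request body that ranks keyword and vector hits together."""
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Keyword side: a standard retriever wrapping a match query.
                    {"standard": {"query": {"match": {text_field: query_text}}}},
                    # Vector side: a kNN retriever over the embedding field.
                    {
                        "knn": {
                            "field": vector_field,
                            "query_vector": query_vector,
                            "k": k,
                            "num_candidates": num_candidates,
                        }
                    },
                ]
            }
        },
        "size": k,
    }


if __name__ == "__main__":
    # A three-dimensional vector keeps the printed body short; real
    # embeddings have as many dimensions as the model produces.
    print(json.dumps(build_hybrid_request("trail running shoes", [0.1, 0.2, 0.3]), indent=2))
```

The resulting dictionary can be sent as the body of a single search request with whichever client you already use.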


## RAG and question answering on your own data

Use Elasticsearch to find relevant passages in your documents, wikis, tickets, or knowledge bases, then pass those passages to a language model. The model can answer using your data instead of only its training data. This fits internal assistants, support bots, and tools that must cite sources.
<stepper>
  <step title="Learn how RAG works in Elasticsearch">
    Read how retrieval, chunking, and orchestration fit together.
    - [RAG](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/rag)
  </step>

  <step title="Set up search for your documents">
    Split long documents into smaller chunks so each search hit is a useful passage. Refer to [How to implement retrieval](#how-to-implement-retrieval) to choose your embedding approach, query interface, and search strategy.
  </step>

  <step title="Generate answers with an LLM">
    Send the top search hits and their text fields to your model or orchestration layer.
    - [Core search options in RAG](/elastic/docs-content/tree/main/solutions/search/rag#core-search-options)
  </step>
</stepper>
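
The chunking and answer-generation steps above can be sketched in application code. This is a minimal example under stated assumptions: chunks are overlapping word windows (one of several possible chunking strategies), hits follow the usual Search API shape with `_id` and `_source`, and the field name `content`, the helper names, and the prompt wording are hypothetical.

```python
from typing import Any


def chunk_words(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows so each hit is a short passage."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def build_prompt(question: str, hits: list[dict[str, Any]], text_field: str = "content") -> str:
    """Number each retrieved passage so the model can cite its sources."""
    passages = [
        f"[{i}] (doc {hit['_id']}) {hit['_source'][text_field]}"
        for i, hit in enumerate(hits, start=1)
    ]
    context = "\n".join(passages)
    return (
        "Answer using only the passages below and cite them by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Numbering the passages and including the document ID is one simple way to let an assistant cite the sources it used.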


## Discovery and recommendations

Find related products, articles, videos, or other items when keywords alone do not match well. Examples include "similar products," "you may also like," and matching users or players in an app.
<stepper>
  <step title="Store embeddings for each item">
    Each item needs a vector from the same model so similarity scores are comparable. Refer to [How to implement retrieval](#how-to-implement-retrieval) for your embedding approach.
  </step>

  <step title="Search for similar items">
    Use the vector of the current item (or a user profile vector) as the search input. Run a k-nearest neighbor (kNN) query to get the closest matches. On large catalogs, tune `k` (the number of results to return) and `num_candidates` (the number of candidates considered on each shard) to balance speed and quality: a higher `num_candidates` usually improves accuracy but makes the search slower. Refer to [How to implement retrieval](#how-to-implement-retrieval) for your query interface and search strategy.
    - [kNN search](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/vector/knn)
  </step>

  <step title="Limit results with filters">
    A **filter** is a rule on structured fields in your index, such as "in stock," "region = EU," or "category = shoes." It narrows which documents kNN considers. Without filters, similarity search might return items the user cannot buy or see. Add a `filter` clause to your kNN request so only matching documents are returned. This is important for catalogs where most items are out of scope for a given user.
    - [Filtered kNN search](/elastic/docs-content/tree/main/solutions/search/vector/knn#knn-search-filter-example)
  </step>

  <step title="Improve the result order">
    The closest vectors are not always the best final ranking. You can boost by popularity or recency, or rescore the top results. Refer to [How to implement retrieval](#how-to-implement-retrieval) for how to combine search strategies.
    - [Semantic reranking](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/ranking/semantic-reranking)
  </step>
</stepper>
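
To make the similar-items and filter steps concrete, here is a minimal sketch that builds a filtered kNN request body. The field names (`item_embedding`, `in_stock`, `region`) and the helper name are hypothetical, and the filter is expressed as exact-match `term` clauses, which is only one way to write it.

```python
from typing import Any


def build_similar_items_request(
    item_vector: list[float],
    filters: dict[str, Any],
    vector_field: str = "item_embedding",  # hypothetical dense_vector field
    k: int = 10,
    num_candidates: int = 100,
) -> dict[str, Any]:
    """Build a kNN search body that only considers documents matching the filters."""
    if k > num_candidates:
        raise ValueError("k must not exceed num_candidates")
    # One exact-match clause per structured field, for example in_stock=True.
    term_clauses = [{"term": {field: value}} for field, value in filters.items()]
    return {
        "knn": {
            "field": vector_field,
            "query_vector": item_vector,
            "k": k,
            "num_candidates": num_candidates,
            "filter": {"bool": {"filter": term_clauses}},
        },
        "size": k,
    }
```

For example, `build_similar_items_request(vector, {"in_stock": True, "region": "EU"})` restricts the nearest-neighbor search to items that are in stock in the EU.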


## Multimodal search

Search across images, audio, video, and text when your content spans more than one media type. For example, search with text to find images, or search with an image to find similar images.
The steps below use the [Inference API](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/semantic-search/semantic-search-inference) to embed multimodal content. Refer to [How to implement retrieval](#how-to-implement-retrieval) for other embedding approaches.
<stepper>
  <step title="Create an inference endpoint">
    Create an endpoint with a model that supports your media types (text, images, audio, or video). Use the `embedding` task type for multimodal models. Use the same endpoint ID when you ingest documents and when you run a search.
    - [Create an inference endpoint](/elastic/docs-content/tree/main/solutions/search/semantic-search/semantic-search-inference#infer-text-embedding-task)
  </step>

  <step title="Add an index mapping and ingest pipeline">
    Define a `dense_vector` field for embeddings and any other fields you need for filters (category, license, date). The linked tutorial also shows how to add an ingest pipeline with an inference processor that calls your endpoint. Then load your documents so each one is embedded at index time.
    - [Create the index mapping](/elastic/docs-content/tree/main/solutions/search/semantic-search/semantic-search-inference#infer-service-mappings)
  </step>

  <step title="Run kNN search">
    Use kNN with a `query_vector_builder` so Elasticsearch embeds the user's query with the same model, then returns the closest vectors. Add a filter on structured fields when you need to limit results by category or other rules. Refer to [How to implement retrieval](#how-to-implement-retrieval) for your query interface and search strategy.
    - [Semantic search with the Inference API](/elastic/docs-content/tree/main/solutions/search/semantic-search/semantic-search-inference#infer-semantic-search)
  </step>
</stepper>
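
The three steps above can be sketched as request bodies. This example makes several assumptions: the endpoint ID `my-embedding-endpoint`, the field names, and the `cosine` similarity are hypothetical; `dims` must match whatever your model outputs; and the pipeline and query cover a text input only, using the `text_embedding` query vector builder. A multimodal endpoint may need different input fields or a different builder, so check the linked tutorial before relying on these shapes.

```python
from typing import Any

ENDPOINT_ID = "my-embedding-endpoint"  # hypothetical inference endpoint ID


def build_mapping(dims: int) -> dict[str, Any]:
    """Index mapping: one vector field plus structured fields for filtering."""
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                "content_embedding": {"type": "dense_vector", "dims": dims, "similarity": "cosine"},
                "category": {"type": "keyword"},
                "license": {"type": "keyword"},
                "published": {"type": "date"},
            }
        }
    }


def build_pipeline(endpoint_id: str = ENDPOINT_ID) -> dict[str, Any]:
    """Ingest pipeline: embed the content field with the endpoint at index time."""
    return {
        "processors": [
            {
                "inference": {
                    "model_id": endpoint_id,
                    "input_output": {"input_field": "content", "output_field": "content_embedding"},
                }
            }
        ]
    }


def build_query(text: str, category: str, endpoint_id: str = ENDPOINT_ID, k: int = 10) -> dict[str, Any]:
    """kNN search: Elasticsearch embeds the query text with the same endpoint."""
    return {
        "knn": {
            "field": "content_embedding",
            "query_vector_builder": {"text_embedding": {"model_id": endpoint_id, "model_text": text}},
            "k": k,
            "num_candidates": 10 * k,
            "filter": {"term": {"category": category}},
        }
    }
```

Passing the same endpoint ID to both `build_pipeline` and `build_query` mirrors the requirement that documents and queries are embedded by the same model.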


## Duplicate detection, fraud, and anomaly detection

Compare documents, accounts, or events to find near-duplicates, suspicious matches, or unusual patterns that exact matching would miss. Examples include duplicate articles, fraudulent transactions, and operational outliers.
<stepper>
  <step title="Clean records before embedding">
    Use an [ingest pipeline](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/manage-data/ingest/transform-enrich/ingest-pipelines) to normalize the fields you embed before they are indexed. The goal is to avoid separate vectors for content that only differs in formatting.
  </step>

  <step title="Store one vector per record you compare">
    Index one embedding per document, account snapshot, or time window you want to compare. Refer to [How to implement retrieval](#how-to-implement-retrieval) for your embedding approach.
  </step>

  <step title="Find neighbors and apply thresholds">
    Run kNN for each new or suspect record. In your application, use the similarity score to decide what to do: mark pairs above a threshold as duplicates, block submissions close to a known fraud example, or raise an alert when neighbors look unusual. Refer to [How to implement retrieval](#how-to-implement-retrieval) for your query interface and search strategy.
    - [kNN search](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/vector/knn)
  </step>

  <step title="Act on matches in your pipeline">
    Run scheduled checks on new data, write matches to a review index, or combine vector results with aggregations (for example, count duplicates per URL). For time-series patterns that are not vector-based, you can also use [anomaly detection in Elasticsearch](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/explore-analyze/machine-learning/anomaly-detection).
  </step>
</stepper>
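
The cleaning and threshold steps above might look like the following in application code. This is a sketch, not a recommended configuration: the page describes normalizing in an ingest pipeline, so `normalize_text` is only an application-side illustration of the same idea, and the threshold values `0.95` and `0.85` are hypothetical. What a given `_score` means depends on the similarity function of your vector field, so calibrate thresholds on your own data.

```python
import unicodedata
from typing import Any


def normalize_text(text: str) -> str:
    """Reduce formatting-only differences before the text is embedded."""
    text = unicodedata.normalize("NFKC", text)  # unify compatibility characters
    return " ".join(text.lower().split())       # lowercase, collapse whitespace


def triage_neighbors(
    hits: list[dict[str, Any]],
    duplicate_threshold: float = 0.95,  # hypothetical value
    review_threshold: float = 0.85,     # hypothetical value
) -> dict[str, list[str]]:
    """Sort kNN hits into actions based on their similarity score."""
    result: dict[str, list[str]] = {"duplicate": [], "review": [], "ignore": []}
    for hit in hits:
        score = hit["_score"]
        if score >= duplicate_threshold:
            result["duplicate"].append(hit["_id"])
        elif score >= review_threshold:
            result["review"].append(hit["_id"])
        else:
            result["ignore"].append(hit["_id"])
    return result
```

The `review` bucket corresponds to writing borderline matches to a review index instead of acting on them automatically.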


## Long-term memory for LLMs

Store facts, chat turns, or summaries so an assistant can load relevant past context without sending the full chat history every time.
<stepper>
  <step title="Design the memory index">
    Store a user or session ID, a timestamp, and optional expiry fields. Decide whether each stored item is a full message, a short fact, or a summary so search returns the right level of detail.
    - [Index basics](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/solutions/search/get-started/index-basics)
  </step>

  <step title="Index new memories with embeddings">
    Use the same embedding setup at index time and at search time. Refer to [How to implement retrieval](#how-to-implement-retrieval) for your embedding approach.
  </step>

  <step title="Retrieve memories for each new message">
    Restrict search to the current user or session, then run semantic or kNN search on the new message. Pass the top hits to your application with the user's latest input. Refer to [How to implement retrieval](#how-to-implement-retrieval) for your query interface and search strategy.
  </step>

  <step title="Remove or merge old memories">
    Delete or roll up outdated entries in your app, or use [index lifecycle management](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/manage-data/lifecycle/index-lifecycle-management) so the index stays accurate and does not grow without limit.
  </step>
</stepper>
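
As a sketch of the retrieval and cleanup steps above, the functions below build a kNN request scoped to one user and a query that matches expired memories. The field names `user_id`, `expires_at`, and `memory_embedding`, along with the helper names, are hypothetical. Because the expiry field is optional, the filter keeps memories that either have no `expires_at` value or have one in the future.

```python
from typing import Any


def build_memory_search(user_id: str, message_vector: list[float], k: int = 5) -> dict[str, Any]:
    """kNN search over one user's memories, skipping entries that have expired."""
    not_expired = {
        "bool": {
            "should": [
                {"range": {"expires_at": {"gt": "now"}}},
                {"bool": {"must_not": {"exists": {"field": "expires_at"}}}},
            ],
            "minimum_should_match": 1,
        }
    }
    return {
        "knn": {
            "field": "memory_embedding",
            "query_vector": message_vector,
            "k": k,
            "num_candidates": 10 * k,
            "filter": {"bool": {"filter": [{"term": {"user_id": user_id}}, not_expired]}},
        },
        "size": k,
    }


def build_expired_memories_query() -> dict[str, Any]:
    """Query matching memories whose expiry time has passed."""
    return {"query": {"range": {"expires_at": {"lt": "now"}}}}
```

The second body could be sent to the delete by query API on a schedule as one way to remove outdated entries from the application side.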