Semantic search with semantic_text

This tutorial walks you through setting up semantic search using the semantic_text field type. By the end, you will be able to:

  • Create an index mapping with a semantic_text field
  • Ingest documents that are automatically converted to vector embeddings
  • Query your data using semantic search with both Query DSL and ES|QL

The semantic_text field type simplifies the inference workflow by providing inference at ingestion time with sensible defaults. You don’t need to define model-related settings and parameters, or create inference ingest pipelines.

We recommend using the semantic_text workflow for semantic search in the Elastic Stack. When you need more control over indexing and query settings, you can use the complete inference workflow instead (refer to the Inference API documentation for details).

This tutorial uses the Elastic Inference Service (EIS), but you can use any service and model supported by the Inference API.

Note
  • To use the semantic_text field type with an inference service other than Elastic Inference Service, you must create an inference endpoint using the Create inference API.
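For example, here is a sketch of creating a dedicated ELSER endpoint with the Create inference API. The endpoint name (my-elser-endpoint) and the allocation settings are illustrative, not required values:

```console
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 1,
      "max_number_of_allocations": 4
    },
    "num_threads": 1,
    "model_id": ".elser_model_2"
  }
}
```

You could then reference my-elser-endpoint through the inference_id parameter of the semantic_text field mapping shown below.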

Create a destination index with a semantic_text field. This field stores the vector embeddings that the inference endpoint generates from your input text.

You can run inference either through the Elastic Inference Service or on your own ML nodes. The following examples show both scenarios.

PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text"
      }
    }
  }
}
  • content is the field that will contain the generated embeddings.
  • Because the field is mapped as semantic_text and no inference_id is provided, the default inference endpoint is used.
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elasticsearch"
      }
    }
  }
}
  • content is the field that will contain the generated embeddings, mapped as semantic_text.
  • The inference_id parameter selects the .elser-2-elasticsearch preconfigured inference endpoint for the elasticsearch service. To use a different inference service, first create an inference endpoint with the Create inference API, then specify it in the semantic_text field mapping through the inference_id parameter.
Note

For large-scale deployments using dense vector embeddings, you can significantly reduce memory usage by configuring quantization strategies like BBQ. For advanced configuration, refer to Optimizing vector storage.

Note

If you're using web crawlers or connectors to generate indices, you have to update the index mappings for these indices to include the semantic_text field. Once the mapping is updated, you'll need to run a full web crawl or a full connector sync. This ensures that all existing documents are reprocessed and updated with the new semantic embeddings, enabling semantic search on the updated data.
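One way to do this, sketched here with a hypothetical index name (my-crawler-index) and field name (body_semantic), is to add the semantic_text field with the update mapping API:

```console
PUT my-crawler-index/_mapping
{
  "properties": {
    "body_semantic": {
      "type": "semantic_text"
    }
  }
}
```

After the mapping update, run the full crawl or sync so existing documents are reprocessed.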

With your index mapping in place, add some data. Because you mapped the content field as semantic_text, Elasticsearch automatically intercepts the text during ingestion, sends it to the inference endpoint, and stores the resulting vector embeddings alongside your document.

Use the _bulk API to ingest a few sample documents:

POST _bulk
{ "index": { "_index": "semantic-embeddings", "_id": "1" } }
{ "content": "After running, cool down with light cardio for a few minutes to lower your heart rate and reduce muscle soreness." }
{ "index": { "_index": "semantic-embeddings", "_id": "2" } }
{ "content": "Marathon plans stress weekly mileage; carb loading before a race does not replace recovery between hard sessions." }
{ "index": { "_index": "semantic-embeddings", "_id": "3" } }
{ "content": "Tune cluster performance by monitoring thread pools and refresh interval." }

The response returns "errors": false and an items array with a "result": "created" entry for each document. If you see errors, check that your index mapping and inference endpoint are configured correctly.
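As an optional sanity check, you can confirm that all three sample documents were indexed by asking for a count:

```console
GET semantic-embeddings/_count
```

If ingestion succeeded, the response reports "count": 3.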

With your data ingested and automatically embedded, you can query it using semantic search. Choose between Query DSL or ES|QL syntax.

The Query DSL approach uses the match query type with the semantic_text field:

GET semantic-embeddings/_search
{
  "query": {
    "match": {
      "content": {
        "query": "What causes muscle soreness after running?"
      }
    }
  }
}
  • content is the semantic_text field on which you want to perform the search.
  • The query parameter holds the query text.
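Because semantic_text fields work with standard Query DSL, you can also combine the semantic match with ordinary filters in a bool query. A sketch, assuming a hypothetical keyword field named category exists on the documents:

```console
GET semantic-embeddings/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "content": {
              "query": "What causes muscle soreness after running?"
            }
          }
        }
      ],
      "filter": [
        { "term": { "category": "fitness" } }
      ]
    }
  }
}
```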

The ES|QL approach uses the match operator (:), which automatically detects that content is a semantic_text field and performs a semantic search on it. The query includes METADATA _score so that results can be sorted by relevance score in descending order.

POST /_query?format=txt
{
  "query": """
    FROM semantic-embeddings METADATA _score
    | WHERE content: "How to avoid muscle soreness while running?"
    | SORT _score DESC
    | LIMIT 1000
  """
}
  • The METADATA _score clause returns the relevance score of each document.
  • The match operator (:) detects that content is a semantic_text field and performs a semantic search on it.
  • SORT _score DESC displays the most relevant results first.
  • LIMIT 1000 caps the number of returned documents.
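ES|QL pipes compose like any others, so you can, for example, trim the returned columns with KEEP (the column choice here is illustrative):

```console
POST /_query?format=txt
{
  "query": """
    FROM semantic-embeddings METADATA _score
    | WHERE content: "How to avoid muscle soreness while running?"
    | SORT _score DESC
    | KEEP content, _score
    | LIMIT 10
  """
}
```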

Both queries return the documents ranked by semantic relevance. The documents about running and muscle soreness score highest because they are semantically closest to the query, while the document about cluster performance scores lower.

  • For an overview of all query types supported by semantic_text fields and guidance on when to use them, see Querying semantic_text fields.
  • If you want to use semantic_text in hybrid search, refer to this notebook for a step-by-step guide.
  • For more information on how to optimize your ELSER endpoints, refer to the ELSER recommendations section in the model documentation.
  • To learn more about model autoscaling, refer to the trained model autoscaling page.
  • To learn how to optimize storage and search performance when using dense vector embeddings, read about Optimizing vector storage.