Semantic text field type

Warning

You can add a semantic_text field mapping regardless of license state. However, the field typically calls the Inference API, which requires an appropriate license. In these cases, using semantic_text in a cluster without the appropriate license causes operations such as indexing and reindexing to fail.

The semantic_text field type simplifies semantic search by providing sensible defaults that automate most of the manual work typically required for vector search. Using semantic_text, you don't have to manually configure mappings, set up ingestion pipelines, or handle chunking. The field type automatically:

Configures index mappings: Chooses the correct field type (sparse_vector or dense_vector), dimensions, similarity functions, and storage optimizations based on the inference endpoint.
Generates embeddings during indexing: Automatically generates embeddings when you index documents, without requiring ingestion pipelines or inference processors.
Handles chunking: Automatically chunks long text documents during indexing.
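
As a rough illustration of the first point, the following Python sketch maps an inference endpoint's task type to the vector field type that backs the semantic_text field. The function name and lookup table are hypothetical, and the sketch assumes the sparse_embedding and text_embedding task types of the Inference API. It is a simplified mental model, not Elasticsearch's implementation, and it leaves out the dimensions, similarity function, and storage optimizations that are also derived from the endpoint.

```python
# Hypothetical lookup table: a simplified mental model, not Elasticsearch source code.
TASK_TYPE_TO_FIELD_TYPE = {
    "sparse_embedding": "sparse_vector",  # sparse models such as ELSER
    "text_embedding": "dense_vector",     # dense embedding models
}


def backing_field_type(task_type: str) -> str:
    """Return the vector field type assumed to back a semantic_text field."""
    try:
        return TASK_TYPE_TO_FIELD_TYPE[task_type]
    except KeyError:
        raise ValueError(
            f"task type not usable with semantic_text in this sketch: {task_type!r}"
        ) from None


print(backing_field_type("sparse_embedding"))
print(backing_field_type("text_embedding"))
```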

Basic semantic_text mapping example

The following example creates an index mapping with a semantic_text field, using default values:

PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text"
      }
    }
  }
}
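
Once the mapping exists, documents can be indexed and searched without an ingest pipeline. The Python sketch below only builds the two request bodies and prints them in Console form; it does not contact a cluster. The helper names and the sample text are hypothetical, and the search request assumes that the semantic query is among the query types supported by your Elasticsearch version.

```python
import json

INDEX = "semantic-embeddings"  # index name used in the mapping example


def index_request(text: str) -> tuple:
    """Build a request that indexes one document.

    Only the raw text is sent; embeddings are generated by Elasticsearch.
    """
    return ("POST", f"{INDEX}/_doc", {"content": text})


def semantic_search_request(query_text: str) -> tuple:
    """Build a search request that runs a semantic query on the content field."""
    body = {"query": {"semantic": {"field": "content", "query": query_text}}}
    return ("GET", f"{INDEX}/_search", body)


def to_console(request: tuple) -> str:
    """Render a (method, path, body) tuple in Console syntax."""
    method, path, body = request
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


# Hypothetical sample text and query, for illustration only.
print(to_console(index_request("Rainfall patterns shift during El Nino years.")))
print(to_console(semantic_search_request("How does El Nino affect rain?")))
```

Keeping the request construction separate from transport makes the bodies easy to inspect or to pass to whichever HTTP client you already use.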

Extended semantic_text mapping example

The following example creates an index mapping with a semantic_text field that uses dense vectors:

PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-inference-endpoint",
        "search_inference_id": "my-search-inference-endpoint",
        "index_options": {
          "dense_vector": {
            "type": "bbq_disk"
          }
        },
        "chunking_settings": {
          "strategy": "word",
          "max_chunk_size": 120,
          "overlap": 40
        }
      }
    }
  }
}

inference_id: (Optional) Specifies the inference endpoint used to generate embeddings at index time. If you don't specify an inference_id, the semantic_text field uses a default inference endpoint.
search_inference_id: (Optional) Specifies the inference endpoint used to generate embeddings at query time. If not specified, the endpoint defined by inference_id is used at both index and query time.
index_options: (Optional) Configures how the underlying vector representation is indexed. In this example, bbq_disk is selected for dense vectors. The available index options depend on whether the field uses dense or sparse vectors. Learn how to set index_options for sparse_vectors and how to set index_options for dense_vectors.
chunking_settings: (Optional) Overrides the chunking settings from the inference endpoint. In this example, the word strategy splits text on individual words, with a maximum of 120 words per chunk and an overlap of 40 words between consecutive chunks. The default chunking strategy is sentence.
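
To see how max_chunk_size and overlap interact under the word strategy, consider the following Python sketch. It is only an approximation for illustration: the function name is hypothetical, and words are split on whitespace, which is an assumption of this sketch and may differ from the word boundary rules that Elasticsearch applies.

```python
def chunk_words(text: str, max_chunk_size: int = 120, overlap: int = 40) -> list:
    """Split text into word-based chunks that share `overlap` words.

    Illustrative approximation only; not Elasticsearch's chunking code.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than max_chunk_size")

    words = text.split()  # assumption: whitespace-delimited words
    if not words:
        return []

    step = max_chunk_size - overlap  # how far each new chunk advances
    chunks = []
    start = 0
    while True:
        chunks.append(" ".join(words[start:start + max_chunk_size]))
        if start + max_chunk_size >= len(words):
            break  # this chunk reached the end of the text
        start += step
    return chunks


# A 200-word text with the settings from the example mapping.
sample = " ".join(f"w{i}" for i in range(200))
chunks = chunk_words(sample, max_chunk_size=120, overlap=40)
print(len(chunks))            # 2
print(chunks[1].split()[0])   # w80
```

With these settings each chunk starts 80 words after the previous one, so the last 40 words of one chunk reappear as the first 40 words of the next.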

Tip

For a complete example, refer to the Semantic search with semantic_text tutorial.

Overview

The semantic_text field type documentation is organized into reference content and how-to guides.

Reference

The Reference section covers the following topics:

Parameters: Parameter descriptions for semantic_text fields.
Inference endpoints: Overview of inference endpoints used with semantic_text fields.
Chunking: How semantic_text automatically processes long text passages by generating smaller chunks.
Pre-filtering for dense vector queries: Automatic pre-filtering behavior for dense vector queries on semantic_text fields.
Limitations: Current limitations of semantic_text fields.
Document count discrepancy: Understanding document counts in _cat/indices for indices with semantic_text fields.
Querying semantic_text fields: Supported query types for semantic_text fields.

How-to guides

The How-to guides section groups procedures and examples into the following guides:

Set up and configure semantic_text fields: Learn how to configure inference endpoints, including default and preconfigured options, ELSER on EIS, custom endpoints, and dedicated endpoints for ingestion and search operations.
Ingest data with semantic_text fields: Learn how to index pre-chunked content, use copy_to and multi-fields to collect values from multiple fields, and perform updates and partial updates to optimize ingestion costs.
Search and retrieve semantic_text fields: Learn how to query semantic_text fields, retrieve indexed chunks, return field embeddings, and highlight the most relevant fragments from search results.