
Semantic text field type

Warning

The semantic_text field mapping can be added regardless of license state. However, it typically calls the Inference API, which requires an appropriate license. In these cases, using semantic_text in a cluster without the appropriate license causes operations such as indexing and reindexing to fail.

The semantic_text field type simplifies semantic search by providing sensible defaults that automate most of the manual work typically required for vector search. Using semantic_text, you don't have to manually configure mappings, set up ingestion pipelines, or handle chunking. The field type automatically:

  • Configures index mappings: Chooses the correct field type (sparse_vector or dense_vector), dimensions, similarity functions, and storage optimizations based on the inference endpoint.
  • Generates embeddings during indexing: Automatically generates embeddings when you index documents, without requiring ingestion pipelines or inference processors.
  • Handles chunking: Automatically chunks long text documents during indexing.

The following example creates an index mapping with a semantic_text field, using default values:

PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text"
      }
    }
  }
}
		
Important

If you don't specify an inference_id, as in the example above, and later upgrade to a newer version, newly created indices might use a different embedding model than existing ones. Queries that target these indices together can produce unexpected ranking results. For details, refer to the potential issues when mixing embedding models across indices.

The following example creates an index mapping with a semantic_text field that uses dense vectors:

PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-inference-endpoint",
        "search_inference_id": "my-search-inference-endpoint",
        "index_options": {
          "dense_vector": {
            "type": "bbq_disk"
          }
        },
        "chunking_settings": {
          "strategy": "word",
          "max_chunk_size": 120,
          "overlap": 40
        }
      }
    }
  }
}
		
  1. inference_id (Optional): Specifies the inference endpoint used to generate embeddings at index time. If you don’t specify an inference_id, the semantic_text field uses a default inference endpoint.
  2. search_inference_id (Optional): Specifies the inference endpoint used to generate embeddings at query time. If not specified, the endpoint defined by inference_id is used at both index and query time.
  3. index_options (Optional): Configures how the underlying vector representation is indexed. In this example, bbq_disk is selected for dense vectors. You can configure different index options depending on whether the field uses dense or sparse vectors. Learn how to set index_options for sparse_vectors and how to set index_options for dense_vectors.
  4. chunking_settings (Optional): Overrides the chunking settings from the inference endpoint. In this example, the word strategy splits text on individual words, with a maximum of 120 words per chunk and an overlap of 40 words between consecutive chunks. The default chunking strategy is sentence.
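
To make the effect of max_chunk_size and overlap more concrete, the following Python sketch imitates word-based chunking with overlapping windows. It is only an illustration: the function name chunk_words is hypothetical, splitting on whitespace is an assumption, and the way Elasticsearch actually detects word boundaries and handles the final chunk may differ.

```python
def chunk_words(text, max_chunk_size=120, overlap=40):
    """Split text into overlapping windows of words (illustrative only)."""
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than max_chunk_size")

    words = text.split()  # assumption: whitespace-separated tokens count as words
    step = max_chunk_size - overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(words[start:start + max_chunk_size])
        if start + max_chunk_size >= len(words):
            break  # this window already reaches the end of the text
        start += step
    return chunks


# Example: a synthetic 300-word document with the settings from the mapping.
sample = " ".join(f"w{i}" for i in range(300))
for number, chunk in enumerate(chunk_words(sample, max_chunk_size=120, overlap=40)):
    print(number, chunk[0], chunk[-1], len(chunk))
```

With these settings each window advances by 80 words, so the last 40 words of one chunk are repeated as the first 40 words of the next. In this sketch, a 300-word input produces four chunks of 120, 120, 120, and 60 words.
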
Tip

For a complete example, refer to the Semantic search with semantic_text tutorial.

The semantic_text field type documentation is organized into reference content and how-to guides.

The Reference section provides technical reference content.

The How-to guides section organizes procedure descriptions and examples into the following guides:

  • Set up and configure semantic_text fields: Learn how to configure inference endpoints, including default and preconfigured options, ELSER on EIS, custom endpoints, and dedicated endpoints for ingestion and search operations.

  • Ingest data with semantic_text fields: Learn how to index pre-chunked content, use copy_to and multi-fields to collect values from multiple fields, and perform updates and partial updates to optimize ingestion costs.

  • Search and retrieve semantic_text fields: Learn how to query semantic_text fields, retrieve indexed chunks, return field embeddings, and highlight the most relevant fragments from search results.
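
As a rough orientation before reading the guides, the following Python sketch builds the request bodies for indexing one document into the semantic-embeddings index and for searching its content field with a semantic query. The helper names and the sample text are hypothetical, and the sketch only prints console-style requests rather than contacting a cluster. It assumes the mapping from the examples above already exists.

```python
import json

INDEX = "semantic-embeddings"  # index name used in the mapping examples
FIELD = "content"              # the semantic_text field defined in the mapping


def index_request(text):
    """Return (method, path, body) for indexing one document.

    The field value is plain text; embeddings are generated at index time.
    """
    return "POST", f"/{INDEX}/_doc", {FIELD: text}


def semantic_search_request(query_text):
    """Return (method, path, body) for a semantic query on the field."""
    body = {"query": {"semantic": {"field": FIELD, "query": query_text}}}
    return "GET", f"/{INDEX}/_search", body


def to_console(method, path, body):
    """Render a request in console style: request line, then JSON body."""
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


# Hypothetical sample text and query, for illustration only.
print(to_console(*index_request("Vector search finds passages by meaning.")))
print()
print(to_console(*semantic_search_request("how does meaning-based search work")))
```

The printed requests can be pasted into a console, or the bodies can be passed to an HTTP client of your choice. No ingest pipeline or inference processor appears in either request, which reflects the automation described at the top of this page.
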