Semantic text field type

Warning

The semantic_text field mapping can be added regardless of license state. However, it typically calls the Inference API, which requires an appropriate license. In these cases, using semantic_text in a cluster without the appropriate license causes operations such as indexing and reindexing to fail.

The semantic_text field type simplifies semantic search by providing sensible defaults that automate most of the manual work typically required for vector search. Using semantic_text, you don't have to manually configure mappings, set up ingestion pipelines, or handle chunking. The field type automatically:

  • Configures index mappings: Chooses the correct field type (sparse_vector or dense_vector), dimensions, similarity functions, and storage optimizations based on the inference endpoint.
  • Generates embeddings during indexing: Creates embeddings when you index documents, with no ingestion pipelines or inference processors required.
  • Handles chunking: Splits long text into chunks during indexing.

The following example creates an index mapping with a semantic_text field, using default values:

PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text"
      }
    }
  }
}
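
Once such a mapping exists, documents are indexed as plain text and the field can be searched with a natural-language question. The following Python sketch is an illustration, not part of the reference: it only builds the two request bodies as dictionaries and prints them in console form, so it runs without a cluster. The helper names, the sample text, and the use of the semantic query type are assumptions that this page does not state; verify the query against the version you run.

```python
import json

INDEX = "semantic-embeddings"  # index name taken from the mapping example
FIELD = "content"              # the semantic_text field in that mapping


def build_index_request(text):
    """Body for `POST semantic-embeddings/_doc`: plain text only, no vectors supplied."""
    return {FIELD: text}


def build_search_request(question):
    """Body for `GET semantic-embeddings/_search` using the `semantic` query (assumed)."""
    return {"query": {"semantic": {"field": FIELD, "query": question}}}


def as_console(method, path, body):
    """Render a request in the same shape as the console examples on this page."""
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


if __name__ == "__main__":
    # Hypothetical sample text and question, for illustration only.
    print(as_console("POST", f"{INDEX}/_doc",
                     build_index_request("Long text that is chunked and embedded at index time.")))
    print(as_console("GET", f"{INDEX}/_search",
                     build_search_request("How is text embedded?")))
```

The index request carries only the raw text. Under the behavior described above, the embeddings are produced by the inference endpoint during indexing, so the client never sends vectors.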
		

The following example creates an index mapping with a semantic_text field that uses dense vectors:

PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-inference-endpoint",
        "search_inference_id": "my-search-inference-endpoint",
        "index_options": {
          "dense_vector": {
            "type": "bbq_disk"
          }
        },
        "chunking_settings": {
          "strategy": "word",
          "max_chunk_size": 120,
          "overlap": 40
        }
      }
    }
  }
}
		
  1. inference_id: (Optional) Specifies the inference endpoint used to generate embeddings at index time. If you don’t specify an inference_id, the semantic_text field uses a default inference endpoint.
  2. search_inference_id: (Optional) Specifies the inference endpoint used to generate embeddings at query time. If not specified, the endpoint defined by inference_id is used at both index and query time.
  3. index_options: (Optional) Configures how the underlying vector representation is indexed. In this example, bbq_disk is selected for dense vectors. The available index options depend on whether the field uses dense or sparse vectors. For details, refer to the index_options documentation for sparse_vector and dense_vector fields.
  4. chunking_settings: (Optional) Overrides the chunking settings from the inference endpoint. In this example, the word strategy splits text on individual words, with a maximum of 120 words per chunk and an overlap of 40 words between consecutive chunks. The default chunking strategy is sentence.
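
To make the word strategy settings concrete, the following Python sketch shows how a maximum chunk size of 120 words and an overlap of 40 words translate into overlapping windows. It is a simplified illustration under stated assumptions, not the chunker Elasticsearch uses: words are split on whitespace only, and the function name chunk_words and the sample document are hypothetical.

```python
def chunk_words(text, max_chunk_size=120, overlap=40):
    """Split text into word windows that share `overlap` words with the previous window.

    Simplified illustration: words are split on whitespace. This is an
    assumption-based sketch, not Elasticsearch's actual chunking code.
    """
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be non-negative and smaller than max_chunk_size")
    words = text.split()
    step = max_chunk_size - overlap  # each new chunk starts this many words later
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_chunk_size]))
        if start + max_chunk_size >= len(words):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    # Hypothetical 300-word document: w0, w1, ..., w299.
    doc = " ".join(f"w{i}" for i in range(300))
    for chunk in chunk_words(doc):
        parts = chunk.split()
        print(parts[0], "...", parts[-1], f"({len(parts)} words)")
```

In this sketch, each chunk starts 80 words after the previous one (120 minus 40), so the last 40 words of one chunk reappear as the first 40 words of the next. A larger overlap preserves more context across chunk boundaries at the cost of producing more chunks, and therefore more embeddings, per document.
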
Tip

For a complete example, refer to the Semantic search with semantic_text tutorial.

The semantic_text field type documentation is organized into reference content and how-to guides.

The Reference section provides technical reference content:

The How-to guides section organizes procedure descriptions and examples into the following guides:

  • Set up and configure semantic_text fields: Learn how to configure inference endpoints, including default and preconfigured options, ELSER on EIS, custom endpoints, and dedicated endpoints for ingestion and search operations.

  • Ingest data with semantic_text fields: Learn how to index pre-chunked content, use copy_to and multi-fields to collect values from multiple fields, and perform updates and partial updates to optimize ingestion costs.

  • Search and retrieve semantic_text fields: Learn how to query semantic_text fields, retrieve indexed chunks, return field embeddings, and highlight the most relevant fragments from search results.