---
title: Optimize dense vector storage for semantic search
description: Reduce the memory footprint of dense vector embeddings in semantic search by configuring quantization strategies on semantic_text fields.
url: https://www.elastic.co/elastic/docs-builder/docs/3202/solutions/search/vector/vector-storage-for-semantic-search
products:
  - Elasticsearch
applies_to:
  - Elastic Cloud Serverless: Generally available
  - Elastic Stack: Generally available
---

# Optimize dense vector storage for semantic search
When scaling semantic search, the memory footprint of dense vector embeddings is a primary concern. You can reduce storage requirements by configuring a [quantization strategy](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3202/reference/elasticsearch/mapping-reference/dense-vector#dense-vector-quantization) on your `semantic_text` fields using the `index_options` parameter.
This guide walks you through choosing a strategy and applying it to a `semantic_text` field mapping. For full details on all available quantization options and their parameters, refer to the [`dense_vector` field type reference](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3202/reference/elasticsearch/mapping-reference/dense-vector#dense-vector-index-options).

## Requirements

- You need a `semantic_text` field that uses an inference endpoint producing **dense vector embeddings** (such as E5, OpenAI embeddings, or Cohere).
- If you use a custom model, create the inference endpoint first using the [Create inference API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put).

<note>
  These `index_options` do not apply to sparse vector models like ELSER, which use a different internal representation.
</note>

<tip>
  To run the `curl` examples on this page, set the following environment variables:
  ```bash
  export ELASTICSEARCH_URL="your-elasticsearch-url"
  export API_KEY="your-api-key"
  ```
  To generate API keys, search for `API keys` in the [global search bar](https://www.elastic.co/elastic/docs-builder/docs/3202/explore-analyze/find-and-organize/find-apps-and-objects). [Learn more about finding your endpoint and credentials](https://www.elastic.co/elastic/docs-builder/docs/3202/solutions/elasticsearch-solution-project/search-connection-details).
</tip>


## Choose a quantization strategy

Select a quantization strategy based on your dataset size and performance requirements:

| Strategy                                                                         | Memory reduction | Best for                                                | Trade-offs                          |
|----------------------------------------------------------------------------------|------------------|---------------------------------------------------------|-------------------------------------|
| `bbq_hnsw`                                                                       | Up to 32x        | Most production use cases (default for 384+ dimensions) | Minimal accuracy loss               |
| `bbq_flat`                                                                       | Up to 32x        | Smaller datasets needing maximum accuracy               | Slower queries (brute-force search) |
| `bbq_disk` <applies-to>Elastic Stack: Generally available since 9.2</applies-to> | Up to 32x        | Large datasets with constrained RAM                     | Slower queries (disk-based)         |
| `int8_hnsw`                                                                      | 4x               | High accuracy retention                                 | Lower compression than BBQ          |
| `int4_hnsw`                                                                      | 8x               | Balance between compression and accuracy                | Some accuracy loss                  |

For most use cases with dense vector embeddings from text models, we recommend [Better Binary Quantization (BBQ)](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3202/reference/elasticsearch/mapping-reference/bbq). BBQ requires a minimum of 64 dimensions and works best with text embeddings.
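To make the "Memory reduction" column concrete, the following sketch estimates the best-case size of the quantized vectors for each strategy. It assumes a `float32` baseline of 4 bytes per dimension and applies the table's ratios directly; the function and constant names are illustrative, and the estimate ignores HNSW graph structures, per-vector metadata, and any unquantized copies that may be retained.

```python
# Best-case reduction factors taken from the "Memory reduction" column above,
# relative to float32 vectors (4 bytes per dimension).
REDUCTION_FACTOR = {
    "bbq_hnsw": 32,
    "bbq_flat": 32,
    "bbq_disk": 32,
    "int8_hnsw": 4,
    "int4_hnsw": 8,
}

FLOAT32_BYTES = 4
BBQ_MIN_DIMS = 64  # minimum dimension count for BBQ stated above


def estimate_quantized_bytes(num_vectors: int, dims: int, strategy: str) -> float:
    """Estimate the size of the quantized vectors only (no graph or metadata)."""
    if strategy not in REDUCTION_FACTOR:
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy.startswith("bbq_") and dims < BBQ_MIN_DIMS:
        raise ValueError(f"BBQ needs at least {BBQ_MIN_DIMS} dimensions, got {dims}")
    raw_bytes = num_vectors * dims * FLOAT32_BYTES
    return raw_bytes / REDUCTION_FACTOR[strategy]


if __name__ == "__main__":
    # Compare all strategies for one million 384-dimensional vectors.
    for name in REDUCTION_FACTOR:
        size_mb = estimate_quantized_bytes(1_000_000, 384, name) / 1_000_000
        print(f"{name}: ~{size_mb:.0f} MB")
```

For one million 384-dimensional vectors, the unquantized baseline is about 1,536 MB, so this arithmetic gives roughly 384 MB for `int8_hnsw`, 192 MB for `int4_hnsw`, and 48 MB for the BBQ variants. Treat these numbers as lower bounds, because the table states the BBQ reduction as "up to" 32x.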

## Configure your index mapping

Create an index with a `semantic_text` field and set the `index_options` to your chosen quantization strategy.
<tab-set>
  <tab-item title="BBQ with HNSW">
    ```json
    PUT semantic-embeddings-optimized
    {
      "mappings": {
        "properties": {
          "content": {
            "type": "semantic_text",
            "inference_id": ".multilingual-e5-small-elasticsearch",
            "index_options": {
              "dense_vector": {
                "type": "bbq_hnsw"
              }
            }
          }
        }
      }
    }
    ```

    <dropdown title="Equivalent `curl` command">
      ```bash
      curl -X PUT "${ELASTICSEARCH_URL}/semantic-embeddings-optimized" \
           -H "Content-Type: application/json" \
           -H "Authorization: ApiKey ${API_KEY}" \
           -d '{
             "mappings": {
               "properties": {
                 "content": {
                   "type": "semantic_text",
                   "inference_id": ".multilingual-e5-small-elasticsearch",
                   "index_options": {
                     "dense_vector": {
                       "type": "bbq_hnsw"
                     }
                   }
                 }
               }
             }
           }'
      ```
    </dropdown>
  </tab-item>

  <tab-item title="BBQ flat">
    Use `bbq_flat` for smaller datasets where you need maximum accuracy at the expense of speed:
    ```json
    PUT semantic-embeddings-flat
    {
      "mappings": {
        "properties": {
          "content": {
            "type": "semantic_text",
            "inference_id": ".multilingual-e5-small-elasticsearch",
            "index_options": {
              "dense_vector": {
                "type": "bbq_flat"
              }
            }
          }
        }
      }
    }
    ```

    <dropdown title="Equivalent `curl` command">
      ```bash
      curl -X PUT "${ELASTICSEARCH_URL}/semantic-embeddings-flat" \
           -H "Content-Type: application/json" \
           -H "Authorization: ApiKey ${API_KEY}" \
           -d '{
             "mappings": {
               "properties": {
                 "content": {
                   "type": "semantic_text",
                   "inference_id": ".multilingual-e5-small-elasticsearch",
                   "index_options": {
                     "dense_vector": {
                       "type": "bbq_flat"
                     }
                   }
                 }
               }
             }
           }'
      ```
    </dropdown>
  </tab-item>

  <tab-item title="DiskBBQ">
    <applies-to>
      - Elastic Cloud Serverless: Unavailable
      - Elastic Stack: Generally available since 9.2
    </applies-to>
    For large datasets where RAM is constrained, use `bbq_disk` (DiskBBQ) to minimize memory usage:
    ```json
    PUT semantic-embeddings-disk
    {
      "mappings": {
        "properties": {
          "content": {
            "type": "semantic_text",
            "inference_id": ".multilingual-e5-small-elasticsearch",
            "index_options": {
              "dense_vector": {
                "type": "bbq_disk"
              }
            }
          }
        }
      }
    }
    ```

    <dropdown title="Equivalent `curl` command">
      ```bash
      curl -X PUT "${ELASTICSEARCH_URL}/semantic-embeddings-disk" \
           -H "Content-Type: application/json" \
           -H "Authorization: ApiKey ${API_KEY}" \
           -d '{
             "mappings": {
               "properties": {
                 "content": {
                   "type": "semantic_text",
                   "inference_id": ".multilingual-e5-small-elasticsearch",
                   "index_options": {
                     "dense_vector": {
                       "type": "bbq_disk"
                     }
                   }
                 }
               }
             }
           }'
      ```
    </dropdown>
  </tab-item>

  <tab-item title="Integer quantization">
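    Use `int8_hnsw` when you need high accuracy retention. For higher compression with some accuracy loss, set `type` to `int4_hnsw` instead:
    ```json
    PUT semantic-embeddings-int8
    {
      "mappings": {
        "properties": {
          "content": {
            "type": "semantic_text",
            "inference_id": ".multilingual-e5-small-elasticsearch",
            "index_options": {
              "dense_vector": {
                "type": "int8_hnsw"
              }
            }
          }
        }
      }
    }
    ```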

    <dropdown title="Equivalent `curl` command">
      ```bash
      curl -X PUT "${ELASTICSEARCH_URL}/semantic-embeddings-int8" \
           -H "Content-Type: application/json" \
           -H "Authorization: ApiKey ${API_KEY}" \
           -d '{
             "mappings": {
               "properties": {
                 "content": {
                   "type": "semantic_text",
                   "inference_id": ".multilingual-e5-small-elasticsearch",
                   "index_options": {
                     "dense_vector": {
                       "type": "int8_hnsw"
                     }
                   }
                 }
               }
             }
           }'
      ```
    </dropdown>
  </tab-item>
</tab-set>

<dropdown title="Example response">
  ```js
  {
    "acknowledged": true, 
    "shards_acknowledged": true,
    "index": "semantic-embeddings-optimized"
  }
  ```
</dropdown>
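If you prefer to script index creation, the mapping bodies above can be generated rather than copied by hand. The following is a minimal sketch that uses only the Python standard library. The helper names `build_mapping` and `build_create_index_request` are hypothetical, the strategy list covers only the options shown on this page, and the request reads the same `ELASTICSEARCH_URL` and `API_KEY` environment variables as the `curl` examples.

```python
import json
import os
import urllib.request

# Strategies listed in the table on this page (not an exhaustive list).
STRATEGIES = {"bbq_hnsw", "bbq_flat", "bbq_disk", "int8_hnsw", "int4_hnsw"}


def build_mapping(inference_id: str, strategy: str, field: str = "content") -> dict:
    """Build a create-index body with a quantized semantic_text field."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unsupported strategy: {strategy}")
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "semantic_text",
                    "inference_id": inference_id,
                    "index_options": {"dense_vector": {"type": strategy}},
                }
            }
        }
    }


def build_create_index_request(index: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) the PUT request that creates the index."""
    base_url = os.environ["ELASTICSEARCH_URL"].rstrip("/")
    api_key = os.environ["API_KEY"]
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_key}",
        },
    )
```

To send the request, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_create_index_request("semantic-embeddings-optimized", build_mapping(".multilingual-e5-small-elasticsearch", "bbq_hnsw")))`. Building the request separately from sending it makes the payload easy to inspect or log before it reaches the cluster.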


## Verify your configuration

Confirm that the `index_options` are applied to your index:
<tab-set>
  <tab-item title="Console">
    ```json
    GET semantic-embeddings-optimized/_mapping
    ```
  </tab-item>

  <tab-item title="curl">
    ```bash
    curl -X GET "${ELASTICSEARCH_URL}/semantic-embeddings-optimized/_mapping" \
         -H "Authorization: ApiKey ${API_KEY}"
    ```
  </tab-item>
</tab-set>

The response includes the `index_options` you configured under the `content` field's mapping. If the `index_options` block is missing, check that you specified it correctly in the `PUT` request.
<dropdown title="Example response">
  ```js
  {
    "semantic-embeddings-optimized": {
      "mappings": {
        "properties": {
          "content": {
            "type": "semantic_text",
            "inference_id": ".multilingual-e5-small-elasticsearch",
            "index_options": { 
              "dense_vector": {
                "type": "bbq_hnsw"
              }
            }
          }
        }
      }
    }
  }
  ```
</dropdown>
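The same check can be automated. The sketch below assumes the mapping response has already been parsed into a Python dictionary shaped like the example response above; `get_quantization_type` is a hypothetical helper name, and it returns `None` when no `index_options` block is present.

```python
from typing import Optional


def get_quantization_type(
    mapping_response: dict, index: str, field: str = "content"
) -> Optional[str]:
    """Return the dense_vector index_options type for a field, or None if unset."""
    properties = (
        mapping_response.get(index, {}).get("mappings", {}).get("properties", {})
    )
    index_options = properties.get(field, {}).get("index_options", {})
    return index_options.get("dense_vector", {}).get("type")


# Parsed form of the example response shown above.
example_response = {
    "semantic-embeddings-optimized": {
        "mappings": {
            "properties": {
                "content": {
                    "type": "semantic_text",
                    "inference_id": ".multilingual-e5-small-elasticsearch",
                    "index_options": {"dense_vector": {"type": "bbq_hnsw"}},
                }
            }
        }
    }
}

if __name__ == "__main__":
    print(get_quantization_type(example_response, "semantic-embeddings-optimized"))
```

Comparing the returned value against the strategy you requested gives a simple pass or fail signal for deployment scripts.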


## (Optional) Tune HNSW parameters

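For HNSW-based strategies, you can tune the graph parameters `m` and `ef_construction` in the `index_options`. `m` sets how many neighbors each node connects to in the graph, and `ef_construction` sets how many candidates are tracked while building each node's neighbor list. Higher values generally improve recall at the cost of slower indexing. Refer to the [`dense_vector` field type reference](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3202/reference/elasticsearch/mapping-reference/dense-vector#dense-vector-index-options) for the full list of tunable parameters.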
<tab-set>
  <tab-item title="Console">
    ```json
    PUT semantic-embeddings-custom
    {
      "mappings": {
        "properties": {
          "content": {
            "type": "semantic_text",
            "inference_id": ".multilingual-e5-small-elasticsearch",
            "index_options": {
              "dense_vector": {
                "type": "bbq_hnsw",
                "m": 32,
                "ef_construction": 200
              }
            }
          }
        }
      }
    }
    ```
  </tab-item>

  <tab-item title="curl">
    ```bash
    curl -X PUT "${ELASTICSEARCH_URL}/semantic-embeddings-custom" \
         -H "Content-Type: application/json" \
         -H "Authorization: ApiKey ${API_KEY}" \
         -d '{
           "mappings": {
             "properties": {
               "content": {
                 "type": "semantic_text",
                 "inference_id": ".multilingual-e5-small-elasticsearch",
                 "index_options": {
                   "dense_vector": {
                     "type": "bbq_hnsw",
                     "m": 32,
                     "ef_construction": 200
                   }
                 }
               }
             }
           }
         }'
    ```
  </tab-item>
</tab-set>
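Because `m` and `ef_construction` only make sense for HNSW-based strategies, a script that assembles `index_options` may want to guard against adding them to other types. The following sketch shows one way to do that. The helper name `with_hnsw_params` is hypothetical, and treating every type whose name ends in `_hnsw` as HNSW-based is a naming heuristic assumed here, not a rule taken from the reference.

```python
def with_hnsw_params(dense_vector_options: dict, m: int, ef_construction: int) -> dict:
    """Return a copy of the dense_vector options with HNSW graph parameters added."""
    strategy = dense_vector_options.get("type", "")
    # Assumption: HNSW-based strategies are identified by the "_hnsw" suffix.
    if not strategy.endswith("_hnsw"):
        raise ValueError(f"{strategy!r} is not an HNSW-based strategy")
    if m <= 0 or ef_construction <= 0:
        raise ValueError("m and ef_construction must be positive integers")
    return {**dense_vector_options, "m": m, "ef_construction": ef_construction}


if __name__ == "__main__":
    # Reproduces the dense_vector block from the example above.
    print(with_hnsw_params({"type": "bbq_hnsw"}, m=32, ef_construction=200))
```

The function returns a new dictionary instead of mutating its input, so a shared base configuration can be reused across several indices with different tuning values.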


## Next steps

- Follow the [Semantic search with `semantic_text`](https://www.elastic.co/elastic/docs-builder/docs/3202/solutions/search/semantic-search/semantic-search-semantic-text) tutorial to set up an end-to-end semantic search workflow.
- Combine semantic search with keyword search using [hybrid search](https://www.elastic.co/elastic/docs-builder/docs/3202/solutions/search/hybrid-semantic-text).


## Related pages

- [`dense_vector` `index_options` reference](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3202/reference/elasticsearch/mapping-reference/dense-vector#dense-vector-index-options)
- [Better Binary Quantization (BBQ)](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3202/reference/elasticsearch/mapping-reference/bbq)
- [Dense vector search](https://www.elastic.co/elastic/docs-builder/docs/3202/solutions/search/vector/dense-vector)
- [Trained model autoscaling](https://www.elastic.co/elastic/docs-builder/docs/3202/deploy-manage/autoscaling/trained-model-autoscaling)