---
title: Bring your own dense vectors to Elasticsearch
description: An introduction to vectors and knn search in Elasticsearch.
url: https://www.elastic.co/elastic/docs-builder/docs/3028/solutions/search/vector/bring-own-vectors
products:
  - Elasticsearch
applies_to:
  - Elastic Cloud Serverless: Generally available
  - Elastic Stack: Generally available
---

# Bring your own dense vectors to Elasticsearch
Elasticsearch enables you to store and search mathematical representations of your content, called _embeddings_ or _vectors_, which power AI-driven relevance. There are two types of vector representation, _dense_ and _sparse_, suited to different queries and use cases (for example, finding similar images and content, or storing expanded terms and weights).
In this introduction to [vector search](https://www.elastic.co/elastic/docs-builder/docs/3028/solutions/search/vector), you’ll store and search for dense vectors in Elasticsearch. You’ll also learn the syntax for querying these documents with a [k-nearest neighbour](https://www.elastic.co/elastic/docs-builder/docs/3028/solutions/search/vector/knn) (kNN) query.

## Prerequisites for vector search

- If you're using Elasticsearch Serverless, you must have a `developer` or `admin` predefined role or an equivalent custom role to add the sample data.
- If you're using Elastic Cloud Hosted or a self-managed cluster, start Elasticsearch and Kibana. The simplest method to complete the steps in this guide is to log in with a user that has the `superuser` built-in role.

To learn about role-based access control, check out [User roles](https://www.elastic.co/elastic/docs-builder/docs/3028/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles).
To learn about Elasticsearch Serverless project profiles, refer to [Dense vector search in Elasticsearch > General purpose and vector optimized projects](/elastic/docs-builder/docs/3028/solutions/search/vector/dense-vector#vector-profiles).

## Create a vector database

When you create vectors (or _vectorize_ your data), you convert complex content (text, images, audio, video) into multidimensional numeric representations. These vectors are stored in specialized data structures that enable efficient similarity search and fast kNN distance calculations.
In this guide, you’ll use documents that already include dense vector embeddings. To deploy a vector embedding model in Elasticsearch and generate vectors during ingest and search, refer to the links in [Learn more](#bring-your-own-vectors-learn-more).
<tip>
  This is an advanced use case that uses the `dense_vector` field type. Refer to [Semantic search](https://www.elastic.co/elastic/docs-builder/docs/3028/solutions/search/semantic-search) for an overview of your options for semantic search with Elasticsearch.
  To learn about the differences between semantic search and vector search, go to [AI-powered search](https://www.elastic.co/elastic/docs-builder/docs/3028/solutions/search/ai-search/ai-search).
</tip>

<stepper>
  <step title="Create an index with dense vector field mappings">
    Each document in our simple data set will have:
    - A review: stored in a `review_text` field
    - An embedding of that review: stored in a `review_vector` field, which is defined as a [`dense_vector`](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3028/reference/elasticsearch/mapping-reference/dense-vector) data type.

    <tip>
      The `dense_vector` type automatically uses quantization to reduce the memory footprint when searching float vectors.
      Learn more about the default quantization strategy and balancing performance and accuracy in [Dense vector field type](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3028/reference/elasticsearch/mapping-reference/dense-vector).
    </tip>
    The following API request defines the `review_text` and `review_vector` fields:
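    ```json
    PUT /amazon-reviews
    {
      "mappings": {
        "properties": {
          "review_vector": {
            "type": "dense_vector",
            "dims": 8,
            "index": true,
            "similarity": "cosine"
          },
          "review_text": {
            "type": "text"
          }
        }
      }
    }
    ```
    In this mapping:
    - `dims` is the number of vector dimensions. It must match the length of the embeddings you index.
    - `index` is set to `true` so that the field can be searched with kNN.
    - `similarity` sets the function used to compare query vectors with document vectors, here `cosine`.

    Here we’re using an 8-dimensional embedding for readability. The vectors that neural network models work with can have several hundred or even thousands of dimensions. Each vector represents a point in a multidimensional space, and each dimension represents a feature or characteristic of the unstructured data.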
  </step>

  <step title="Add documents with embeddings">
    First, index a single document to understand the document structure. You can provide vectors in two input formats:
    - Array of floats: A JSON array of numeric values representing each vector dimension.
    - Base64-encoded string: A Base64-encoded binary representation of the vector. More compact than float arrays, reducing payload size and improving efficiency for large vectors and bulk indexing. You can use the [Python client dense vector packing](https://elasticsearch-py.readthedocs.io/en/stable/api_helpers.html#dense-vector-packing) utility to convert float arrays into Base64-encoded binary representation.

    <tab-set>
      <tab-item title="Array of floats">
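        ```json
        PUT /amazon-reviews/_doc/1
        {
          "review_text": "This product is lifechanging! I'm telling all my friends about it.",
          "review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
        }
        ```
        The `review_vector` array contains 8 values, matching the `dims` value in the mapping.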
      </tab-item>

      <tab-item title="Base64">
        ```json
        PUT /amazon-reviews/_doc/1
        {
          "review_text": "This product is lifechanging! I'm telling all my friends about it.",
          "review_vector": "PczMzT5MzM0+mZmaPszMzT8AAAA/GZmaPzMzMz9MzM0="
        }
        ```
      </tab-item>
    </tab-set>
  </step>
  In a production scenario, you'll want to index many documents at once using the [`_bulk` endpoint](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk).
  Here's an example of indexing multiple documents in a single `_bulk` request:
  <tab-set>
    <tab-item title="Array of floats">
      ```json
      POST /_bulk
      { "index": { "_index": "amazon-reviews", "_id": "2" } }
      { "review_text": "This product is amazing! I love it.", "review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] }
      { "index": { "_index": "amazon-reviews", "_id": "3" } }
      { "review_text": "This product is terrible. I hate it.", "review_vector": [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1] }
      { "index": { "_index": "amazon-reviews", "_id": "4" } }
      { "review_text": "This product is great. I can do anything with it.", "review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] }
      { "index": { "_index": "amazon-reviews", "_id": "5" } }
      { "review_text": "This product has ruined my life and the lives of my family and friends.", "review_vector": [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1] }
      ```
    </tab-item>

    <tab-item title="Base64">
      ```json
      POST /_bulk
      { "index": { "_index": "amazon-reviews", "_id": "2" } }
      { "review_text": "This product is amazing! I love it.", "review_vector": "PczMzT5MzM0+mZmaPszMzT8AAAA/GZmaPzMzMz9MzM0=" }
      { "index": { "_index": "amazon-reviews", "_id": "3" } }
      { "review_text": "This product is terrible. I hate it.", "review_vector": "P0zMzT8zMzM/GZmaPwAAAD7MzM0+mZmaPkzMzT3MzM0=" }
      { "index": { "_index": "amazon-reviews", "_id": "4" } }
      { "review_text": "This product is great. I can do anything with it.", "review_vector": "PczMzT5MzM0+mZmaPszMzT8AAAA/GZmaPzMzMz9MzM0=" }
      { "index": { "_index": "amazon-reviews", "_id": "5" } }
      { "review_text": "This product has ruined my life and the lives of my family and friends.", "review_vector": "P0zMzT8zMzM/GZmaPwAAAD7MzM0+mZmaPkzMzT3MzM0=" }
      ```
    </tab-item>
  </tab-set>
</stepper>
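
The Base64 input format and the `_bulk` body described in the steps above can be sketched in plain Python. This is a minimal illustration, not the Python client's own packing utility: the helper names `encode_vector_base64` and `build_bulk_body` are hypothetical, and the encoding (big-endian 32-bit floats) is an assumption inferred from the sample strings in this guide, which it reproduces.

```python
import base64
import json
import struct


def encode_vector_base64(vector):
    """Encode floats as Base64 over big-endian float32 bytes (assumed format)."""
    raw = struct.pack(f">{len(vector)}f", *vector)
    return base64.b64encode(raw).decode("ascii")


def build_bulk_body(index_name, docs, first_id=1):
    """Build an NDJSON bulk body: one action line and one source line per document."""
    lines = []
    for offset, (text, vector) in enumerate(docs):
        action = {"index": {"_index": index_name, "_id": str(first_id + offset)}}
        source = {
            "review_text": text,
            "review_vector": encode_vector_base64(vector),
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline character.
    return "\n".join(lines) + "\n"


positive = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
negative = list(reversed(positive))

bulk_body = build_bulk_body(
    "amazon-reviews",
    [
        ("This product is amazing! I love it.", positive),
        ("This product is terrible. I hate it.", negative),
    ],
    first_id=2,
)
print(encode_vector_base64(positive))
print(bulk_body)
```

The resulting string can be sent as the body of a `_bulk` request with any HTTP client. For production use, prefer the packing utility from the Python client linked above.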


## Test vector search

Now you can query these document vectors using a [`knn` retriever](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3028/reference/elasticsearch/rest-apis/retrievers/knn-retriever). `knn` is a type of vector similarity search that finds the `k` most similar documents to a query vector. Here we're using a raw vector for the query text for demonstration purposes:
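```json
POST /amazon-reviews/_search
{
  "retriever": {
    "knn": {
      "field": "review_vector",
      "query_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
      "k": 2,
      "num_candidates": 5
    }
  }
}
```
In this request:
- `query_vector` is the raw vector that stands in for the query text.
- `k` is the number of nearest neighbors to return, here the top 2.
- `num_candidates` is the number of candidate documents to consider on each shard.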
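
To see which documents the query above is expected to favor, here is a minimal sketch of exact (brute-force) kNN in Python over the sample vectors from the bulk request. It assumes the scoring documented for `cosine` similarity on `dense_vector` fields, `(1 + cosine) / 2`. The function names are illustrative, and scores returned by Elasticsearch can differ slightly because it uses approximate search and quantization.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query_vector, docs, k):
    """Score every document and return the k highest-scoring (id, score) pairs."""
    scored = [
        (doc_id, (1 + cosine(query_vector, vector)) / 2)
        for doc_id, vector in docs.items()
    ]
    # sorted() is stable, so documents with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


positive = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
negative = list(reversed(positive))

# Document IDs and vectors mirror the sample bulk request in this guide.
sample_docs = {"2": positive, "3": negative, "4": positive, "5": negative}
query_vector = positive

top_hits = exact_knn(query_vector, sample_docs, k=2)
for doc_id, score in top_hits:
    print(doc_id, round(score, 4))
```

With these sample vectors, the two positive reviews share the query's direction and rank first, while the negative reviews score lower.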


## Next steps: implementing vector search

If you want to try a similar workflow from an Elasticsearch client, use the following guided index workflow in Elasticsearch Serverless, Elastic Cloud Hosted, or a self-managed cluster:
- Go to the **Index Management** page using the navigation menu or the [global search field](https://www.elastic.co/elastic/docs-builder/docs/3028/explore-analyze/find-and-organize/find-apps-and-objects).
- Select **Create index**. Specify a name and then select **Create my index**.
- Select the **Vector Search** option and follow the guided workflow.

When you finish your tests and no longer need the sample data set, delete the index:
```json
DELETE /amazon-reviews
```


## Learn more about vector search

In these simple examples, we send a raw vector for the query text. In a real-world scenario, you won’t know the query text ahead of time. You’ll generate query vectors on the fly using the same embedding model that produced the document vectors. For this, deploy a text embedding model in Elasticsearch and use the [`query_vector_builder` parameter](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3028/reference/query-languages/query-dsl/query-dsl-knn-query#knn-query-top-level-parameters). Alternatively, you can generate vectors client-side and send them directly with the search request.
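
As a rough sketch of the first option, the following Python function builds a search request body that replaces `query_vector` with a `query_vector_builder`. The function name and the model ID `my-text-embedding-model` are hypothetical placeholders. The sketch assumes a deployed text embedding model whose output has the same number of dimensions as the `review_vector` field.

```python
import json


def build_knn_search_body(query_text, model_id, k=2, num_candidates=5):
    """Build a kNN retriever body that asks Elasticsearch to embed the query text."""
    return {
        "retriever": {
            "knn": {
                "field": "review_vector",
                # Replaces the raw `query_vector` used earlier in this guide.
                "query_vector_builder": {
                    "text_embedding": {
                        "model_id": model_id,  # hypothetical deployed model
                        "model_text": query_text,
                    }
                },
                "k": k,
                "num_candidates": num_candidates,
            }
        }
    }


search_body = build_knn_search_body(
    "a product that changed my life",
    model_id="my-text-embedding-model",
)
print(json.dumps(search_body, indent=2))
```

The printed JSON can be sent as the body of a search request against the index.
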
For an example of using pipelines to generate text embeddings, check out [Tutorial: Dense and sparse workflows using ingest pipelines](https://www.elastic.co/elastic/docs-builder/docs/3028/solutions/search/vector/dense-versus-sparse-ingest-pipelines).
To learn more about the search options in Elasticsearch, such as semantic, full-text, and hybrid, refer to [Search approaches](https://www.elastic.co/elastic/docs-builder/docs/3028/solutions/search/search-approaches).