---
title: Diversify retriever
description: The diversify retriever reduces the result set from another retriever by applying a diversification strategy to the top-N results. This is useful when...
url: https://www.elastic.co/elastic/docs-builder/docs/3016/reference/elasticsearch/rest-apis/retrievers/diversify-retriever
products:
  - Elasticsearch
applies_to:
  - Elastic Cloud Serverless: Preview
  - Elastic Stack: Preview since 9.3
---

# Diversify retriever
The `diversify` retriever reduces the result set from another retriever by applying a diversification strategy to the top-N results.
This is useful when you want to maximize diversity by preventing similar documents from dominating the top results returned from a search.
Practical use cases include:
- **eCommerce applications**: Show users a wider variety of products rather than multiple similar items
- **Retrieval augmented generation (RAG) workflows**: Provide more diverse context to the LLM, reducing redundancy in the prompt

The retriever uses [MMR (Maximum Marginal Relevance)](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) diversification to discard results that are too similar to each other.
Similarity is determined based on the `field` parameter and any provided `query_vector` or `query_vector_builder`.
<note>
  The ordering of results returned from the inner retriever is preserved.
</note>
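
To make the selection behavior concrete, the following Python sketch implements the classic greedy MMR procedure from the paper linked above. It is an illustration only and not the Elasticsearch implementation. The function names (`cosine`, `mmr_diversify`), the use of cosine similarity, and the sample vectors are all assumptions made for this example. The sketch shows how documents that are too similar to already selected ones get discarded, and how the surviving documents keep the order they had in the inner retriever.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_diversify(docs, query_vector, lam, size):
    """Greedy MMR over `docs`, a list of (doc_id, vector) pairs given in the
    inner retriever's order. Returns at most `size` doc ids, in that same order.
    """
    remaining = list(range(len(docs)))
    selected = []

    def score(i):
        # Relevance: similarity between the candidate and the query vector.
        relevance = cosine(docs[i][1], query_vector)
        # Redundancy: highest similarity to any document already selected.
        redundancy = max(
            (cosine(docs[i][1], docs[j][1]) for j in selected), default=0.0
        )
        return lam * relevance - (1.0 - lam) * redundancy

    while remaining and len(selected) < size:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    # Report survivors in the inner retriever's original order.
    return [docs[i][0] for i in sorted(selected)]


# Hypothetical inner-retriever results: "a" and "b" are near-duplicates.
docs = [("a", [1.0, 0.0]), ("b", [1.0, 0.1]), ("c", [0.0, 1.0])]
query = [1.0, 0.0]

print(mmr_diversify(docs, query, lam=1.0, size=2))  # ['a', 'b']
print(mmr_diversify(docs, query, lam=0.3, size=2))  # ['a', 'c']
```

With `lam=1.0` only similarity to the query matters, so the near-duplicate `b` is kept. With `lam=0.3` the redundancy penalty dominates and `b` is dropped in favor of `c`.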


## Parameters

<definitions>
  <definition term="type">
    (Required, string)
    The type of diversification to use. Currently only `mmr` (maximum marginal relevance) is supported.
  </definition>
  <definition term="field">
    (Required, string)
    The name of the field whose values are used for the diversification process.
    The field type must be one of:
    - `dense_vector`
    - `semantic_text` (with dense vector embeddings). For `semantic_text` fields, you must also provide a `query_vector` or `query_vector_builder`. <applies-to>Elastic Stack: Planned</applies-to> <applies-to>Elastic Cloud Serverless: Preview</applies-to>
  </definition>
  <definition term="rank_window_size">
    (Optional, integer)
    The maximum number of top documents the `diversify` retriever receives from the inner retriever.
    Defaults to 10.
  </definition>
  <definition term="retriever">
    (Required, retriever object)
    A single child retriever that provides the initial result set for diversification.
    <note>
      Although some of the inner retriever's results may be removed, the rank and order of the remaining documents are preserved.
    </note>
  </definition>
  <definition term="query_vector">
    (Required if the `field` is a `semantic_text` type, otherwise optional, array of `float`, `byte` or string)
    Query vector. Must have the same number of dimensions as the vector field you are searching against.
    Must be one of:
    - An array of floats
    - A hex-encoded byte vector (one byte per dimension; for `bit`, one byte per 8 dimensions) <applies-to>Elastic Stack: Generally available from 9.0 to 9.3</applies-to>
    - A base64-encoded vector string. Base64 supports `float` and `bfloat16` (big-endian), `byte`, and `bit` encodings depending on the target field type. <applies-to>Elastic Stack: Planned</applies-to>

    If you provide a `query_vector`, you cannot also provide a `query_vector_builder`.
  </definition>
  <definition term="query_vector_builder">
    (Required if the `field` is a `semantic_text` type, otherwise optional, query vector builder object)
    Defines a [model](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3016/solutions/search/vector/knn#knn-semantic-search) to build a query vector.
    If you provide a `query_vector_builder`, you cannot also provide a `query_vector`.
  </definition>
  <definition term="lambda">
    (Required for `mmr`, float)
    A value between 0.0 and 1.0 that controls how relevance and diversity are balanced during diversification. Higher values weight similarity to the `query_vector` more heavily; lower values weight diversity more heavily.
  </definition>
  <definition term="size">
    (Optional, only if `mmr` is used, integer)
    The maximum number of results to return. Defaults to 10.
  </definition>
</definitions>
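
As a rough illustration of how these parameters fit together, the following Python sketch assembles a `diversify` retriever body and applies the two constraints stated above on the client side: `lambda` must lie between 0.0 and 1.0, and `query_vector` cannot be combined with `query_vector_builder`. The helper name `build_diversify_retriever` is hypothetical and not part of any Elasticsearch client. Elasticsearch performs the authoritative validation, including the `semantic_text` requirement, which this sketch does not check because it does not know the field's mapping.

```python
import json


def build_diversify_retriever(field, inner_retriever, lam, *, query_vector=None,
                              query_vector_builder=None, size=None,
                              rank_window_size=None):
    """Hypothetical helper: build a request body for an `mmr` diversify retriever."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must be between 0.0 and 1.0")
    if query_vector is not None and query_vector_builder is not None:
        raise ValueError(
            "provide either query_vector or query_vector_builder, not both"
        )

    body = {
        "type": "mmr",
        "field": field,
        "lambda": lam,
        "retriever": inner_retriever,
    }
    # Optional parameters are only sent when set, so server defaults apply otherwise.
    optional = {
        "query_vector": query_vector,
        "query_vector_builder": query_vector_builder,
        "size": size,
        "rank_window_size": rank_window_size,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return {"retriever": {"diversify": body}}


request_body = build_diversify_retriever(
    "my_dense_vector_field",
    {"standard": {"query": {"match": {"title": "elasticsearch"}}}},
    0.7,
    query_vector=[0.1, 0.2, 0.3],
    size=3,
)
print(json.dumps(request_body, indent=2))
```

Omitting unset optional parameters keeps the request minimal and leaves `rank_window_size` and `size` at their documented defaults of 10.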


## Example

The following example uses a `diversify` retriever of type `mmr` to diversify and
return the top three results from the inner standard retriever.
The `lambda` value of 0.7 weights similarity to the query vector more heavily than diversity among the documents' `my_dense_vector_field` values when deciding which results to keep.
```json
{
  "retriever": {
    "diversify": {
      "type": "mmr",
      "field": "my_dense_vector_field",
      "lambda": 0.7,
      "size": 3,
      "query_vector": [0.1, 0.2, 0.3],
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "title": "elasticsearch"
            }
          }
        }
      }
    }
  }
}
```