﻿---
title: Common grams token filter
description: Generates bigrams for a specified set of common words. For example, you can specify is and the as common words. This filter then converts the tokens [the,...
url: https://www.elastic.co/elastic/docs-builder/docs/3028/reference/text-analysis/analysis-common-grams-tokenfilter
products:
  - Elasticsearch
---

# Common grams token filter
Generates [bigrams](https://en.wikipedia.org/wiki/Bigram) for a specified set of common words.
For example, you can specify `is` and `the` as common words. This filter then converts the tokens `[the, quick, fox, is, brown]` to `[the, the_quick, quick, fox, fox_is, is, is_brown, brown]`.
You can use the `common_grams` filter in place of the [stop token filter](https://www.elastic.co/elastic/docs-builder/docs/3028/reference/text-analysis/analysis-stop-tokenfilter) when you don’t want to completely ignore common words.
This filter uses Lucene’s [CommonGramsFilter](https://lucene.apache.org/core/10_0_0/analysis/common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.md).

## Example

The following [analyze API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-analyze) request creates bigrams for `is` and `the`:
```json

{
  "tokenizer" : "whitespace",
  "filter" : [
    {
      "type": "common_grams",
      "common_words": ["is", "the"]
    }
  ],
  "text" : "the quick fox is brown"
}
```

The filter produces the following tokens:
```text
[ the, the_quick, quick, fox, fox_is, is, is_brown, brown ]
```


## Add to an analyzer

The following [create index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create) request uses the `common_grams` filter to configure a new [custom analyzer](https://docs-v3-preview.elastic.dev/elastic/docs-builder/docs/3028/manage-data/data-store/text-analysis/create-custom-analyzer):
```json

{
  "settings": {
    "analysis": {
      "analyzer": {
        "index_grams": {
          "tokenizer": "whitespace",
          "filter": [ "common_grams" ]
        }
      },
      "filter": {
        "common_grams": {
          "type": "common_grams",
          "common_words": [ "a", "is", "the" ]
        }
      }
    }
  }
}
```


## Configurable parameters

<definitions>
  <definition term="common_words">
    (Required*, array of strings) A list of tokens. The filter generates bigrams for these tokens.
  </definition>
</definitions>

Either this or the `common_words_path` parameter is required.
<definitions>
  <definition term="common_words_path">
    (Required*, string) Path to a file containing a list of tokens. The filter generates bigrams for these tokens.
  </definition>
</definitions>

This path must be absolute or relative to the `config` location. The file must be UTF-8 encoded. Each token in the file must be separated by a line break.
Either this or the `common_words` parameter is required.
<definitions>
  <definition term="ignore_case">
    (Optional, Boolean) If `true`, matches for common words matching are case-insensitive. Defaults to `false`.
  </definition>
  <definition term="query_mode">
    (Optional, Boolean) If `true`, the filter excludes the following tokens from the output:
  </definition>
</definitions>

- Unigrams for common words
- Unigrams for terms followed by common words

Defaults to `false`. We recommend enabling this parameter for [search analyzers](https://www.elastic.co/elastic/docs-builder/docs/3028/reference/elasticsearch/mapping-reference/search-analyzer).
For example, you can enable this parameter and specify `is` and `the` as common words. This filter converts the tokens `[the, quick, fox, is, brown]` to `[the_quick, quick, fox_is, is_brown,]`.

## Customize

To customize the `common_grams` filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.
For example, the following request creates a custom `common_grams` filter with `ignore_case` and `query_mode` set to `true`:
```json

{
  "settings": {
    "analysis": {
      "analyzer": {
        "index_grams": {
          "tokenizer": "whitespace",
          "filter": [ "common_grams_query" ]
        }
      },
      "filter": {
        "common_grams_query": {
          "type": "common_grams",
          "common_words": [ "a", "is", "the" ],
          "ignore_case": true,
          "query_mode": true
        }
      }
    }
  }
}
```