---
title: Using the annotated-text field
description: The annotated-text tokenizes text content as per the more common text field (see "limitations" below) but also injects any marked-up annotation tokens...
url: https://www.elastic.co/elastic/docs-builder/docs/3016/reference/elasticsearch/plugins/mapper-annotated-text-usage
products:
  - Elasticsearch
---

# Using the annotated-text field
The `annotated_text` field tokenizes text content in the same way as the more common [`text`](https://www.elastic.co/elastic/docs-builder/docs/3016/reference/elasticsearch/mapping-reference/text) field (see "limitations" below) but also injects any marked-up annotation tokens directly into the search index:
```js
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "annotated_text"
      }
    }
  }
}
```

Such a mapping allows marked-up text, for example Wikipedia articles, to be indexed as both text and structured tokens. Annotations use a Markdown-like syntax, `[text](value)`, where the part in parentheses holds one or more URL-encoded values separated by the `&` symbol.
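
As a rough illustration of the syntax, the following Python sketch builds annotation markup on the client side before indexing. The `annotate` helper is hypothetical and not part of the plugin or of any Elasticsearch client; it assumes that form-style URL encoding (spaces become `+`) is what the field expects, which matches the `Apple+Inc.` example on this page.

```python
from urllib.parse import quote_plus


def annotate(text: str, values: list[str]) -> str:
    """Wrap `text` in annotation markup: [text](value1&value2...).

    Hypothetical helper for illustration only.
    """
    if not values:
        raise ValueError("at least one annotation value is required")
    # URL-encode each value so that spaces, '&' and brackets inside a value
    # cannot be confused with the markup itself.
    encoded = "&".join(quote_plus(value) for value in values)
    return f"[{text}]({encoded})"


print(annotate("Apple", ["Apple Inc."]))
print(annotate("Jeff Beck", ["Jeff Beck", "Guitarist"]))
```

The two calls produce `[Apple](Apple+Inc.)` and `[Jeff Beck](Jeff+Beck&Guitarist)`, the same markup used in the examples on this page.
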
We can use the `_analyze` API to test how an example annotation would be stored as tokens in the search index:
```js
GET my-index-000001/_analyze
{
  "field": "my_field",
  "text":"Investors in [Apple](Apple+Inc.) rejoiced."
}
```

Response:
```js
{
  "tokens": [
    {
      "token": "investors",
      "start_offset": 0,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "in",
      "start_offset": 10,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "Apple Inc.", 
      "start_offset": 13,
      "end_offset": 18,
      "type": "annotation",
      "position": 2
    },
    {
      "token": "apple",
      "start_offset": 13,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "rejoiced",
      "start_offset": 19,
      "end_offset": 27,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
```
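
The offsets in the response refer to the text with the markup removed: the `Apple Inc.` annotation covers characters 13 to 18, the same span as the word `apple`. The Python sketch below reproduces that bookkeeping. It is only an approximation: the regular expression and the `strip_annotations` helper are assumptions made for illustration, and the plugin's own parser may behave differently in edge cases.

```python
import re
from urllib.parse import unquote_plus

# Assumed pattern for [text](value1&value2) markup.
ANNOTATION = re.compile(r"\[([^\]\[]*)\]\(([^)(]*)\)")


def strip_annotations(marked_up: str) -> tuple[str, list[tuple[int, int, str]]]:
    """Return the plain text and (start, end, value) annotations.

    Offsets point into the plain text, not into the marked-up input.
    """
    plain_parts = []
    annotations = []
    read_pos = 0    # position in the marked-up input
    plain_len = 0   # length of the plain text produced so far
    for match in ANNOTATION.finditer(marked_up):
        before = marked_up[read_pos:match.start()]
        plain_parts.append(before)
        plain_len += len(before)
        anchor = match.group(1)
        start, end = plain_len, plain_len + len(anchor)
        # Every '&'-separated value becomes its own annotation on the same span.
        for raw in match.group(2).split("&"):
            if raw:
                annotations.append((start, end, unquote_plus(raw)))
        plain_parts.append(anchor)
        plain_len = end
        read_pos = match.end()
    plain_parts.append(marked_up[read_pos:])
    return "".join(plain_parts), annotations


text, found = strip_annotations("Investors in [Apple](Apple+Inc.) rejoiced.")
print(text)   # Investors in Apple rejoiced.
print(found)  # [(13, 18, 'Apple Inc.')]
```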

We can now search for annotations using regular `term` queries, which don’t tokenize the provided search values. Annotations are a more precise way of matching, as this example shows: a `term` search for `Beck` does not match the document that mentions `Jeff Beck`:
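```js
# Example documents
PUT my-index-000001/_doc/1
{
  "my_field": "[Beck](Beck) announced a new tour" <1>
}

PUT my-index-000001/_doc/2
{
  "my_field": "[Jeff Beck](Jeff+Beck&Guitarist) plays a strat" <2>
}

# Example search
GET my-index-000001/_search
{
  "query": {
    "term": {
        "my_field": "Beck" <3>
    }
  }
}
```

1. As well as tokenizing the plain text into single words such as `beck`, the annotation injects the single token `Beck` at the same position as `beck` in the token stream.
2. An annotation can inject multiple tokens at the same position. Here both the specific value `Jeff Beck` and the broader term `Guitarist` are injected.
3. Because the `term` query value is not analyzed, `Beck` matches only the annotation token in the first document. The second document contains the tokens `jeff`, `beck`, `Jeff Beck` and `Guitarist`, none of which equals `Beck`.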
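
To make the matching behavior concrete, here is a toy Python model of the two example documents. It is a sketch under stated assumptions, not how Elasticsearch is implemented: `index_tokens` and `term_query` are hypothetical helpers, plain text is approximated as lowercased word tokens, and annotation values are kept verbatim.

```python
import re
from urllib.parse import unquote_plus

# Assumed pattern for [text](value1&value2) markup.
ANNOTATION = re.compile(r"\[([^\]\[]*)\]\(([^)(]*)\)")


def index_tokens(marked_up: str) -> set[str]:
    """Toy model of the indexed terms for one field value."""
    # Annotation values are decoded but otherwise stored as-is.
    annotations = {
        unquote_plus(value)
        for match in ANNOTATION.finditer(marked_up)
        for value in match.group(2).split("&")
        if value
    }
    # The remaining text is split into lowercased words.
    plain = ANNOTATION.sub(lambda match: match.group(1), marked_up)
    words = {word.lower() for word in re.findall(r"\w+", plain)}
    return words | annotations


docs = {
    1: "[Beck](Beck) announced a new tour",
    2: "[Jeff Beck](Jeff+Beck&Guitarist) plays a strat",
}


def term_query(value: str) -> list[int]:
    """Compare the value as-is against the indexed terms (no analysis)."""
    return [doc_id for doc_id, text in docs.items() if value in index_tokens(text)]


print(term_query("Beck"))       # [1]
print(term_query("beck"))       # [1, 2]
print(term_query("Guitarist"))  # [2]
```

In this model the lowercase word `beck` appears in both documents, while the annotation value `Beck` appears only in the first, which is why the annotation search is the more precise of the two.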

<warning>
  Any use of `=` signs in annotation values, for example `[Prince](person=Prince)`, will cause the document to be rejected with a parse failure. The equals sign is reserved for possible future use, so documents that contain it in an annotation are actively rejected today.
</warning>
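
Since such documents are rejected at index time, it can be convenient to check text before sending it. The following Python sketch is one possible pre-flight check. The `find_equals_in_annotations` helper and its regular expression are assumptions made for illustration and only approximate the plugin's own parsing.

```python
import re

# Assumed pattern for [text](value1&value2) markup.
ANNOTATION = re.compile(r"\[([^\]\[]*)\]\(([^)(]*)\)")


def find_equals_in_annotations(marked_up: str) -> list[str]:
    """Return the raw annotation value strings that contain an '=' sign."""
    return [
        match.group(2)
        for match in ANNOTATION.finditer(marked_up)
        if "=" in match.group(2)
    ]


print(find_equals_in_annotations("[Prince](person=Prince) played"))  # ['person=Prince']
print(find_equals_in_annotations("[Prince](Prince) played"))         # []
```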


## Synthetic `_source`

<important>
  Synthetic `_source` is Generally Available only for TSDB indices (indices that have `index.mode` set to `time_series`). For other indices synthetic `_source` is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
</important>

If the field has a `keyword` sub-field, the values are sorted in the same way as a `keyword` field’s values. By default, that means sorted with duplicates removed. For example:

```json
# Index settings and mapping
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "annotated_text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

# Example document
{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```

Will become:
```json
{
  "text": [
    "jumped over the lazy dog",
    "the quick brown fox"
  ]
}
```

<note>
  Reordering text fields can have an effect on [phrase](https://www.elastic.co/elastic/docs-builder/docs/3016/reference/query-languages/query-dsl/query-dsl-match-query-phrase) and [span](https://www.elastic.co/elastic/docs-builder/docs/3016/reference/query-languages/query-dsl/span-queries) queries. See the discussion about [`position_increment_gap`](https://www.elastic.co/elastic/docs-builder/docs/3016/reference/elasticsearch/mapping-reference/position-increment-gap) for more detail. You can avoid this by making sure the `slop` parameter on the phrase queries is lower than the `position_increment_gap`. This is the default.
</note>

If the `annotated_text` field sets `store` to `true`, then order and duplicates are preserved:

```json
# Index settings and mapping
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": { "type": "annotated_text", "store": true }
    }
  }
}

# Example document
{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```

Will become:
```json
{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```
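
The two behaviors described in this section can be summarized in a few lines of Python. This is only a model of the observable result, not of how synthetic `_source` is built internally. The `synthetic_source_values` helper is hypothetical, and Python's default string ordering is assumed to agree with `keyword` ordering for plain ASCII values like these.

```python
def synthetic_source_values(values: list[str], stored: bool = False) -> list[str]:
    """Model which values come back from synthetic _source.

    stored=False: values are rebuilt from the keyword sub-field,
                  so they come back sorted with duplicates removed.
    stored=True:  values are rebuilt from the stored field,
                  so order and duplicates are preserved.
    """
    if stored:
        return list(values)
    return sorted(set(values))


original = [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog",
]

print(synthetic_source_values(original))
# ['jumped over the lazy dog', 'the quick brown fox']
print(synthetic_source_values(original, stored=True))
# ['the quick brown fox', 'the quick brown fox', 'jumped over the lazy dog']
```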