﻿---
title: Using failure stores to address ingestion issues
description: When something goes wrong during ingestion it is often not an isolated event. Included for your convenience are some examples of how you can use the failure...
url: https://www.elastic.co/elastic/docs-builder/docs/3016/manage-data/data-store/data-streams/failure-store-recipes
products:
  - Elastic Cloud Serverless
  - Elastic Stack
  - Elasticsearch
applies_to:
  - Elastic Cloud Serverless: Generally available
  - Elastic Stack: Generally available since 9.1
---

# Using failure stores to address ingestion issues
When something goes wrong during ingestion, it is often not an isolated event. The following examples show how you can use the failure store to quickly respond to ingestion failures and get your indexing back on track.

## Troubleshooting nested ingest pipelines

When a document fails in an ingest pipeline, it can be difficult to figure out exactly what went wrong and where. When the failure store captures a failure during this part of the ingestion process, the failure document contains additional debugging information: the type of processor and the pipeline that were executing when the failure occurred, plus a pipeline trace that records any nested pipeline calls the document was in at the time of failure.
To demonstrate this, we will follow a failed document through an unfamiliar data stream and ingest pipeline:
```console
POST my-datastream-ingest/_doc
{
  "@timestamp": "2025-04-21T00:00:00Z",
  "important": {
    "info": "The rain in Spain falls mainly on the plain"
  }
}
```

```json
{
  "_index": ".fs-my-datastream-ingest-2025.05.09-000001",
  "_id": "F3S3s5YBwrYNjPmayMr9",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 2,
  "_primary_term": 1,
  "failure_store": "used" 
}
```

Now we search the failure store to see what went wrong.
```console
GET my-datastream-ingest::failures/_search
```

```json
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": ".fs-my-datastream-ingest-2025.05.09-000001",
        "_id": "F3S3s5YBwrYNjPmayMr9",
        "_score": 1,
        "_source": {
          "@timestamp": "2025-05-09T06:24:48.381Z",
          "document": {
            "index": "my-datastream-ingest",
            "source": { 
              "important": {
                "info": "The rain in Spain falls mainly on the plain" 
              },
              "@timestamp": "2025-04-21T00:00:00Z"
            }
          },
          "error": {
            "type": "illegal_argument_exception",
            "message": "field [info] not present as part of path [important.info]", 
            "stack_trace": """j.l.IllegalArgumentException: field [info] not present as part of path [important.info]
	at o.e.i.IngestDocument.getFieldValue(IngestDocument.java:202)
	at o.e.i.c.SetProcessor.execute(SetProcessor.java:86)
	... 19 more
""",
            "pipeline_trace": [ 
              "ingest-step-1",
              "ingest-step-2"
            ],
            "pipeline": "ingest-step-2", 
            "processor_type": "set" 
          }
        }
      }
    ]
  }
}
```

Despite not knowing the pipelines beforehand, we have some places to start looking. The `ingest-step-2` pipeline cannot find the `important.info` field, despite it being present in the document that was sent to the cluster. If we pull that pipeline definition, we find the following:
```console
GET _ingest/pipeline/ingest-step-2
```

```json
{
  "ingest-step-2": {
    "processors": [
      {
        "set": { 
          "field": "copy.info",
          "copy_from": "important.info"
        }
      }
    ]
  }
}
```

There is only a `set` processor in the `ingest-step-2` pipeline, so this is likely not where the root problem is. Remembering the `pipeline_trace` field on the failure, we find that `ingest-step-1` was the original pipeline called for this document; it is likely the data stream's default pipeline. Pulling its definition, we find the following:
```console
GET _ingest/pipeline/ingest-step-1
```

```json
{
  "ingest-step-1": {
    "processors": [
      {
        "remove": {
          "field": "important.info" 
        }
      },
      {
        "pipeline": {
          "name": "ingest-step-2" 
        }
      }
    ]
  }
}
```

The `remove` processor in the first pipeline is the root cause of the problem! Either update that pipeline so it no longer removes important data, or change the downstream pipeline so it does not expect the important data to always be present.
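
One possible fix, sketched against the `ingest-step-2` definition above, is to guard the `set` processor with a condition so it only runs when the field exists (the null-safe `?.` operator handles a missing `important` object):
```console
PUT _ingest/pipeline/ingest-step-2
{
  "processors": [
    {
      "set": {
        "field": "copy.info",
        "copy_from": "important.info",
        "if": "ctx.important?.info != null"
      }
    }
  ]
}
```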

## Troubleshooting complicated ingest pipelines

Ingest processors can be labeled with tags: user-provided names or descriptions of a processor's purpose in the pipeline. When documents are redirected to the failure store due to a processor issue, they capture the tag of the processor in which the failure occurred, if one exists. Because of this, it is good practice to tag the processors in your pipeline so that the location of a failure can be identified quickly.
Here we have a needlessly complicated pipeline made up of several `set` and `remove` processors. Helpfully, they are all tagged with descriptive names.
```console
PUT _ingest/pipeline/complicated-processor
{
  "processors": [
    {
      "set": {
        "tag": "initialize counter",
        "field": "counter",
        "value": "1"
      }
    },
    {
      "set": {
        "tag": "copy counter to new",
        "field": "new_counter",
        "copy_from": "counter"
      }
    },
    {
      "remove": {
        "tag": "remove old counter",
        "field": "counter"
      }
    },
    {
      "set": {
        "tag": "transfer counter back",
        "field": "counter",
        "copy_from": "new_counter"
      }
    },
    {
      "remove": {
        "tag": "remove counter again",
        "field": "counter"
      }
    },
    {
      "set": {
        "tag": "copy to new counter again",
        "field": "new_counter",
        "copy_from": "counter"
      }
    }
  ]
}
```

We ingest some data and find that it was sent to the failure store.
```console
POST my-datastream-ingest/_doc
{
    "@timestamp": "2025-04-21T00:00:00Z",
    "counter_name": "test"
}
```

```json
{
  "_index": ".fs-my-datastream-ingest-2025.05.09-000001",
  "_id": "HnTJs5YBwrYNjPmaFcri",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1,
  "failure_store": "used"
}
```

On checking the failure, we can quickly identify the tagged processor that caused the problem.
```console
GET my-datastream-ingest::failures/_search
```

```json
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": ".fs-my-datastream-ingest-2025.05.09-000001",
        "_id": "HnTJs5YBwrYNjPmaFcri",
        "_score": 1,
        "_source": {
          "@timestamp": "2025-05-09T06:41:24.775Z",
          "document": {
            "index": "my-datastream-ingest",
            "source": {
              "@timestamp": "2025-04-21T00:00:00Z",
              "counter_name": "test"
            }
          },
          "error": {
            "type": "illegal_argument_exception",
            "message": "field [counter] not present as part of path [counter]",
            "stack_trace": """j.l.IllegalArgumentException: field [counter] not present as part of path [counter]
	at o.e.i.IngestDocument.getFieldValue(IngestDocument.java:202)
	at o.e.i.c.SetProcessor.execute(SetProcessor.java:86)
	... 14 more
""",
            "pipeline_trace": [
              "complicated-processor"
            ],
            "pipeline": "complicated-processor",
            "processor_type": "set", 
            "processor_tag": "copy to new counter again" 
          }
        }
      }
    ]
  }
}
```

Without tags, it would be far less clear where in the pipeline the indexing problem occurred. Tags give each processor a unique identifier that can be referenced quickly when an ingest failure occurs.
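
Because the tag is stored with the failure, you can also query the failure store for failures from a specific processor. A sketch, using the tag from the example above:
```console
GET my-datastream-ingest::failures/_search
{
  "query": {
    "match": {
      "error.processor_tag": "copy to new counter again"
    }
  }
}
```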

## Alerting on failed ingestion

Since failure stores can be searched like a normal data stream, we can use them as inputs to [alerting rules](https://www.elastic.co/elastic/docs-builder/docs/3016/explore-analyze/alerting/alerts) in
Kibana. Here is a simple alerting example that is triggered when more than ten indexing failures have occurred in the last five minutes for a data stream:
<stepper>
  <step title="Create a failure store data view">
    If you want to use the KQL or Lucene query types, first create a data view for your failure store data.
    If you plan to use the ES|QL or Query DSL query types, this step is not required.

    Navigate to the data view page in Kibana and add a new data view. Set the index pattern to your failure store using the selector syntax.
    ![create a data view using the failure store syntax in the index name](https://www.elastic.co/elastic/docs-builder/docs/3016/manage-data/images/elasticsearch-reference-management_failure_store_alerting_create_data_view.png)
  </step>

  <step title="Create new rule">
    Navigate to Management / Alerts and Insights / Rules. Create a new rule. Choose the Elasticsearch query option.
    ![create a new alerting rule and select the elasticsearch query option](https://www.elastic.co/elastic/docs-builder/docs/3016/manage-data/images/elasticsearch-reference-management_failure_store_alerting_create_rule.png)
  </step>

  <step title="Pick your query type">
    Choose which query type you wish to use. For KQL/Lucene queries, reference the data view that contains your failure store.
    ![use the data view created in the previous step as the input to the kql query](https://www.elastic.co/elastic/docs-builder/docs/3016/manage-data/images/elasticsearch-reference-management_failure_store_alerting_kql.png)
    For Query DSL queries, use the `::failures` suffix on your data stream name.
    ![use the ::failures suffix in the data stream name in the query dsl](https://www.elastic.co/elastic/docs-builder/docs/3016/manage-data/images/elasticsearch-reference-management_failure_store_alerting_dsl.png)
    For ES|QL queries, use the `::failures` suffix on your data stream name in the `FROM` command.
    ![use the ::failures suffix in the data stream name in the from command](https://www.elastic.co/elastic/docs-builder/docs/3016/manage-data/images/elasticsearch-reference-management_failure_store_alerting_esql.png)
  </step>

  <step title="Finish">
    Configure the schedule, actions, and details of the alert, then save the rule.
    ![complete the rule configuration and save it](https://www.elastic.co/elastic/docs-builder/docs/3016/manage-data/images/elasticsearch-reference-management_failure_store_alerting_finish.png)
  </step>

</stepper>
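
For reference, an ES|QL rule query for the threshold described above might look like the following sketch (the data stream name is illustrative); the rule condition would then alert when `failure_count` exceeds ten:
```esql
FROM my-datastream-ingest::failures
| WHERE @timestamp > NOW() - 5 minutes
| STATS failure_count = COUNT(*)
```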


## Data remediation

If you've encountered a long span of ingestion failures, you may find that a sizeable gap of events has appeared in your data stream. If the failure store is enabled, the documents that should fill those gaps are tucked away in the data stream's failure store. Because failure stores are made up of regular indices, and each failure document contains the source of the document that failed, failure documents can often be replayed into your production data streams.
<warning>
  Care should be taken when replaying data into a data stream from a failure store. Any failures during the replay process may generate new failures in the failure store which can duplicate and obscure the original events.
</warning>

We recommend a few best practices for remediating failure data.
**Separate your failures beforehand.** As described in the [failure document source](/elastic/docs-builder/docs/3016/manage-data/data-store/data-streams/failure-store#use-failure-store-document-source) section, failure documents are structured differently depending on when the document failed during ingestion. At minimum, separate ingest pipeline failures from indexing failures: ingest pipeline failures often need the original pipeline re-run, while indexing failures should skip any pipelines. Further separating failures by index or by specific failure type may also be beneficial.
**Perform a failure store rollover.** Consider [rolling over the failure store](/elastic/docs-builder/docs/3016/manage-data/data-store/data-streams/failure-store#manage-failure-store-rollover) before attempting to remediate failures. This will create a new failure index that will collect any new failures during the remediation process.
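
A failure store rollover can be performed with the same `::failures` selector used elsewhere in this guide; a sketch, assuming the example data stream name:
```console
POST my-datastream-ingest-example::failures/_rollover
```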
**Use an ingest pipeline to convert failure documents back into their original document.** Failure documents store failure information along with the document that failed ingestion. The first step for remediating documents should be to use an ingest pipeline to extract the original source from the failure document and then discard any other information about the failure.
**Simulate first to avoid repeat failures.** If you must run a pipeline as part of your remediation process, it is best to simulate the pipeline against the failure first. This will catch any unforeseen issues that may fail the document a second time. Remember, ingest pipeline failures will capture the document before an ingest pipeline is applied to it, which can further complicate remediation when a failure document becomes nested inside a new failure. The easiest way to simulate these changes is using the [pipeline simulate API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-simulate) or the [simulate ingest API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-simulate-ingest).

### Remediating ingest node failures

Failures that occurred during ingest processing will be stored as they were before any pipelines were run. To replay the document into the data stream we will need to re-run any applicable pipelines for the document.
<stepper>
  <step title="Separate out which failures to replay">
    Start off by constructing a query that consistently identifies which failures to remediate. Here we select ingest pipeline failures (documents with an `error.pipeline` field) for a specific data stream, error type, and time range.
    ```console
    POST my-datastream-ingest-example::failures/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "exists": {
                "field": "error.pipeline"
              }
            },
            {
              "match": {
                "document.index": "my-datastream-ingest-example"
              }
            },
            {
              "match": {
                "error.type": "illegal_argument_exception"
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gt": "2025-05-01T00:00:00Z",
                  "lte": "2025-05-02T00:00:00Z"
                }
              }
            }
          ]
        }
      }
    }
    ```
    Take note of the documents that are returned. We can use them to verify that our remediation logic makes sense.
    ```json
    {
      "took": 14,
      "timed_out": false,
      "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 1,
          "relation": "eq"
        },
        "max_score": 2.575364,
        "hits": [
          {
            "_index": ".fs-my-datastream-ingest-example-2025.05.16-000001",
            "_id": "cOnR2ZYByIwDXH-g6GpR",
            "_score": 2.575364,
            "_source": {
              "@timestamp": "2025-05-01T15:58:53.522Z", 
              "document": {
                "index": "my-datastream-ingest-example",
                "source": {
                  "@timestamp": "2025-05-01T00:00:00Z",
                  "data": {
                    "counter": 42 
                  }
                }
              },
              "error": {
                "type": "illegal_argument_exception",
                "message": "field [id] not present as part of path [data.id]", 
                "stack_trace": """j.l.IllegalArgumentException: field [id] not present as part of path [data.id]
    	at o.e.i.IngestDocument.getFieldValue(IngestDocument.java:202)
    	at o.e.i.c.SetProcessor.execute(SetProcessor.java:86)
    	... 14 more
    """,
                "pipeline_trace": [
                  "my-datastream-default-pipeline"
                ],
                "pipeline": "my-datastream-default-pipeline", 
                "processor_type": "set"
              }
            }
          }
        ]
      }
    }
    ```
  </step>

  <step title="Fix the original problem">
    Because ingest pipeline failures need to be reprocessed by their original pipelines, fix any problems with those pipelines before remediating failures. Investigating the pipeline from the example above shows a processor that expects a field that is not always present.
    ```json
    {
      "my-datastream-default-pipeline": {
        "processors": [
          {
            "set": { 
              "field": "identifier",
              "copy_from": "data.id"
            }
          }
        ]
      }
    }
    ```
    Fixing a failure's root cause is often a bespoke process. In this example, instead of discarding the data, we make the identifier field optional by adding a condition to the processor:
    ```console
    PUT _ingest/pipeline/my-datastream-default-pipeline
    {
      "processors": [
        {
          "set": {
            "field": "identifier",
            "copy_from": "data.id",
            "if": "ctx.data?.id != null"
          }
        }
      ]
    }
    ```
  </step>

  <step title="Create a pipeline to convert failure documents">
    We must convert the failure documents back into their original form and send them off to be reprocessed. The following pipeline restores the original index and routing, drops the failure metadata, copies the original source fields to the top level, and reroutes the document to the data stream:
    ```console
    PUT _ingest/pipeline/my-datastream-remediation-pipeline
    {
      "processors": [
        {
          "script": {
            "lang": "painless",
            "source": """
              ctx._index = ctx.document.index;
              ctx._routing = ctx.document.routing;
              def s = ctx.document.source;
              ctx.remove("error");
              ctx.remove("document");
              for (e in s.entrySet()) {
                ctx[e.key] = e.value;
              }"""
          }
        },
        {
          "reroute": {
            "destination": "my-datastream-ingest-example"
          }
        }
      ]
    }
    ```
  </step>

  <step title="Test your pipelines">
    Before sending data off to be reindexed, be sure to test the pipelines in question with an example document to make sure they work. First, test to make sure the resulting document from the remediation pipeline is shaped how you expect. We can use the [simulate pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-simulate) for this.
    ```console
    POST _ingest/pipeline/_simulate
    {
      "pipeline": {
        "processors": [
          {
            "script": {
            "lang": "painless",
            "source": """
                ctx._index = ctx.document.index;
                ctx._routing = ctx.document.routing;
                def s = ctx.document.source;
                ctx.remove("error");
                ctx.remove("document");
                for (e in s.entrySet()) {
                  ctx[e.key] = e.value;
                }"""
            }
          },
          {
            "reroute": {
              "destination": "my-datastream-ingest-example"
            }
          }
        ]
      },
      "docs": [
        {
            "_index": ".fs-my-datastream-ingest-example-2025.05.16-000001",
            "_id": "cOnR2ZYByIwDXH-g6GpR",
            "_source": {
              "@timestamp": "2025-05-01T15:58:53.522Z",
              "document": {
                "index": "my-datastream-ingest-example",
                "source": {
                  "@timestamp": "2025-05-01T00:00:00Z",
                  "data": {
                    "counter": 42
                  }
                }
              },
              "error": {
                "type": "illegal_argument_exception",
                "message": "field [id] not present as part of path [data.id]",
                "stack_trace": """j.l.IllegalArgumentException: field [id] not present as part of path [data.id]
    	at o.e.i.IngestDocument.getFieldValue(IngestDocument.java:202)
    	at o.e.i.c.SetProcessor.execute(SetProcessor.java:86)
    	... 14 more
    """,
                "pipeline_trace": [
                  "my-datastream-default-pipeline"
                ],
                "pipeline": "my-datastream-default-pipeline",
                "processor_type": "set"
              }
            }
          }
      ]
    }
    ```

    ```json
    {
      "docs": [
        {
          "doc": {
            "_index": "my-datastream-ingest-example", 
            "_version": "-3",
            "_id": "cOnR2ZYByIwDXH-g6GpR", 
            "_source": { 
              "data": {
                "counter": 42
              },
              "@timestamp": "2025-05-01T00:00:00Z"
            },
            "_ingest": {
              "timestamp": "2025-05-01T20:58:03.566210529Z"
            }
          }
        }
      ]
    }
    ```
    Now that the remediation pipeline has been tested, test the end-to-end ingestion to verify that no further problems will arise. To do this, we will use the [simulate ingest API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-simulate-ingest) to test multiple pipeline executions.
    ```console
    POST _ingest/_simulate
    {
      "pipeline_substitutions": {
        "my-datastream-remediation-pipeline": {
          "processors": [
            {
              "script": {
                "lang": "painless",
                "source": """
                    ctx._index = ctx.document.index;
                    ctx._routing = ctx.document.routing;
                    def s = ctx.document.source;
                    ctx.remove("error");
                    ctx.remove("document");
                    for (e in s.entrySet()) {
                      ctx[e.key] = e.value;
                    }"""
              }
            },
            {
              "reroute": {
                "destination": "my-datastream-ingest-example"
              }
            }
          ]
        }
      },
      "docs": [
        {
            "_index": ".fs-my-datastream-ingest-example-2025.05.16-000001",
            "_id": "cOnR2ZYByIwDXH-g6GpR",
            "_source": {
              "@timestamp": "2025-05-01T15:58:53.522Z",
              "document": {
                "index": "my-datastream-ingest-example",
                "source": {
                  "@timestamp": "2025-05-01T00:00:00Z",
                  "data": {
                    "counter": 42
                  }
                }
              },
              "error": {
                "type": "illegal_argument_exception",
                "message": "field [id] not present as part of path [data.id]",
                "stack_trace": """j.l.IllegalArgumentException: field [id] not present as part of path [data.id]
    	at o.e.i.IngestDocument.getFieldValue(IngestDocument.java:202)
    	at o.e.i.c.SetProcessor.execute(SetProcessor.java:86)
    	... 14 more
    """,
                "pipeline_trace": [
                  "my-datastream-default-pipeline"
                ],
                "pipeline": "my-datastream-default-pipeline",
                "processor_type": "set"
              }
            }
          }
      ]
    }
    ```

    ```json
    {
      "docs": [
        {
          "doc": {
            "_id": "cOnR2ZYByIwDXH-g6GpR",
            "_index": "my-datastream-ingest-example", 
            "_version": -3,
            "_source": { 
              "@timestamp": "2025-05-01T00:00:00Z",
              "data": {
                "counter": 42
              }
            },
            "executed_pipelines": [ 
              "my-datastream-remediation-pipeline",
              "my-datastream-default-pipeline"
            ]
          }
        }
      ]
    }
    ```
  </step>

  <step title="Reindex the failure documents">
    Combine the remediation pipeline with the failure store query together in a [reindex operation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) to replay the failures.
    ```console
    POST _reindex
    {
      "source": {
        "index": "my-datastream-ingest-example::failures",
        "query": {
          "bool": {
            "must": [
              {
                "exists": {
                  "field": "error.pipeline"
                }
              },
              {
                "match": {
                  "document.index": "my-datastream-ingest-example"
                }
              },
              {
                "match": {
                  "error.type": "illegal_argument_exception"
                }
              },
              {
                "range": {
                  "@timestamp": {
                    "gt": "2025-05-01T00:00:00Z",
                    "lte": "2025-05-17T00:00:00Z"
                  }
                }
              }
            ]
          }
        }
      },
      "dest": {
        "index": "my-datastream-ingest-example",
        "op_type": "create",
        "pipeline": "my-datastream-remediation-pipeline"
      }
    }
    ```

    ```json
    {
      "took": 469,
      "timed_out": false,
      "total": 1,
      "updated": 0,
      "created": 1, 
      "deleted": 0,
      "batches": 1,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0,
      "failures": []
    }
    ```

    <tip>
      Since the failure store is enabled on this data stream, it is wise to check it for any further failures from the reindexing process. Failures at this point may end up as nested failures in the failure store, and remediating nested failures quickly becomes a hassle as the original document gets buried multiple levels deep inside the failure document. For this reason, remediate data during a quiet period when no other failures are likely to arise. Rolling over the failure store before the remediation also makes it easier to discard any new nested failures and to operate only on the original failure documents.
    </tip>

  </step>

  <step title="Done">
    Once any failures have been remediated, you may wish to purge them from the failure store to free up space and to avoid warnings about failed data that has already been replayed. Otherwise, the failures remain available until the end of the failure store retention period, should you need to reference them.
  </step>
</stepper>


### Remediating mapping and shard failures

As described in the previous [failure document source](/elastic/docs-builder/docs/3016/manage-data/data-store/data-streams/failure-store#use-failure-store-document-source) section, failures that occur due to a mapping or indexing issue will be stored as they were after any pipelines had executed. This means that to replay the document into the data stream we will need to make sure to skip any pipelines that have already run.
<tip>
  You can greatly simplify this remediation process by writing your ingest pipelines to be idempotent. That way, a document that has already been processed passes through the pipeline again unchanged.
</tip>
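
As a minimal sketch of this idea (the pipeline and field names are illustrative), setting `override` to `false` makes a `set` processor a no-op when the target field is already populated, so re-running the pipeline leaves previously processed documents unchanged:
```console
PUT _ingest/pipeline/my-idempotent-pipeline
{
  "processors": [
    {
      "set": {
        "field": "source_host",
        "copy_from": "host.name",
        "ignore_empty_value": true,
        "override": false
      }
    }
  ]
}
```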

<stepper>
  <step title="Separate out which failures to replay">
    Start off by constructing a query that consistently identifies which failures to remediate. Here we exclude ingest pipeline failures (documents with an `error.pipeline` field) and select indexing failures for a specific data stream, error type, and time range.
    ```console
    POST my-datastream-indexing-example::failures/_search
    {
      "query": {
        "bool": {
          "must_not": [
            {
              "exists": {
                "field": "error.pipeline"
              }
            }
          ],
          "must": [
            {
              "match": {
                "document.index": "my-datastream-indexing-example"
              }
            },
            {
              "match": {
                "error.type": "document_parsing_exception"
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gt": "2025-05-01T00:00:00Z",
                  "lte": "2025-05-02T00:00:00Z"
                }
              }
            }
          ]
        }
      }
    }
    ```
    Take note of the documents that are returned. We can use them to verify that our remediation logic makes sense.
    ```json
    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 1,
          "relation": "eq"
        },
        "max_score": 1.5753641,
        "hits": [
          {
            "_index": ".fs-my-datastream-indexing-example-2025.05.16-000002",
            "_id": "_lA-GJcBHLe506UUGL0I",
            "_score": 1.5753641,
            "_source": { 
              "@timestamp": "2025-05-02T18:53:31.153Z",
              "document": {
                "id": "_VA-GJcBHLe506UUFL2i",
                "index": "my-datastream-indexing-example",
                "source": {
                  "processed": true, 
                  "data": {
                    "counter": 37
                  }
                }
              },
              "error": {
                "type": "document_parsing_exception", 
                "message": "[1:40] failed to parse: data stream timestamp field [@timestamp] is missing",
                "stack_trace": """o.e.i.m.DocumentParsingException: [1:40] failed to parse: data stream timestamp field [@timestamp] is missing
    	at o.e.i.m.DocumentParser.wrapInDocumentParsingException(DocumentParser.java:265)
    	at o.e.i.m.DocumentParser.internalParseDocument(DocumentParser.java:162)
    	... 19 more
    Caused by: j.l.IllegalArgumentException: data stream timestamp field [@timestamp] is missing
    	at o.e.i.m.DataStreamTimestampFieldMapper.extractTimestampValue(DataStreamTimestampFieldMapper.java:210)
    	at o.e.i.m.DataStreamTimestampFieldMapper.postParse(DataStreamTimestampFieldMapper.java:223)
    	... 20 more
    """
              }
            }
          }
        ]
      }
    }
    ```
  </step>

  <step title="Fix the original problem">
    There are a broad set of possible indexing failures. Most stem from values that do not match a particular mapping. Sometimes a large number of new fields are dynamically mapped, the maximum number of mapped fields is reached, and no more can be added. In our example above, the document being indexed is missing a required timestamp.

    These problems can occur in a number of places: data sent from a client may be incomplete, ingest pipelines may not be producing the correct result, or the index mapping may need to be updated to account for changes in the data.

    Once all clients and pipelines are producing complete and correct documents, and your mappings are correctly configured for your incoming data, proceed with the remediation.
  </step>

  <step title="Create a pipeline to convert failure documents">
    We must convert the failure documents back into their original form and send them off to be reprocessed. We will create a pipeline to do this. Since the example failure was caused by a missing timestamp, we will use the timestamp recorded at the time of failure in place of the missing original. This assumes the documents being remediated were created very close to when the failure occurred; your remediation process may need adjustments if that does not hold.
    ```json

    {
      "processors": [
        {
          "script": {
          "lang": "painless",
          "source": """
              ctx._index = ctx.document.index; <1>
              ctx._routing = ctx.document.routing;
              def s = ctx.document.source; <2>
              ctx.remove("error"); <3>
              ctx.remove("document"); <4>
              for (e in s.entrySet()) { <5>
                ctx[e.key] = e.value;
              }"""
          }
        }
      ]
    }
    ```
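    The transformation this Painless script performs can also be sketched outside of Elasticsearch. The following is a minimal, hypothetical Python equivalent (the function name and sample data are illustrative, not part of any Elasticsearch API); it mirrors the script's logic of restoring the target index and routing, dropping the failure metadata, and flattening the captured source fields back to the top level:

    ```python
    def restore_original(failure_hit):
        """Rebuild the original document from a failure store hit's _source.

        Mirrors the Painless remediation script: restores the target index
        and routing, drops failure diagnostics, and flattens the captured
        original fields back to the top level. Hypothetical helper for
        illustration only.
        """
        doc = dict(failure_hit)
        doc["_index"] = doc["document"]["index"]          # <1> route back to the original index
        doc["_routing"] = doc["document"].get("routing")  # preserve custom routing, if any
        source = doc["document"]["source"]                # <2> the captured original fields
        doc.pop("error", None)                            # <3> drop the failure diagnostics
        doc.pop("document")                               # <4> drop the wrapper object
        for key, value in source.items():                 # <5> flatten the original fields
            doc[key] = value
        return doc
    ```

    Note that, just like the Painless script, this keeps the failure document's top-level `@timestamp` (the time of failure) because the original source never contained one.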

    <important>
      Remember that a document that failed during indexing has already been processed by its ingest pipeline! It shouldn't need to be processed again unless you made changes to that pipeline to fix the original problem. Make sure that any fixes applied to the ingest pipeline are reflected in the pipeline logic here.
    </important>
  </step>

  <step title="Test your pipeline">
    Before sending data off to be reindexed, test the remediation pipeline with an example document to make sure it works. Most importantly, confirm that the resulting document is shaped the way you expect. We can use the [simulate pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-simulate) for this.
    ```json

    {
      "pipeline": { <1>
        "processors": [
          {
            "script": {
            "lang": "painless",
            "source": """
                ctx._index = ctx.document.index;
                ctx._routing = ctx.document.routing;
                def s = ctx.document.source;
                ctx.remove("error");
                ctx.remove("document");
                for (e in s.entrySet()) {
                  ctx[e.key] = e.value;
                }"""
            }
          }
        ]
      },
      "docs": [ <2>
        {
            "_index": ".fs-my-datastream-indexing-example-2025.05.16-000002",
            "_id": "_lA-GJcBHLe506UUGL0I",
            "_score": 1.5753641,
            "_source": {
              "@timestamp": "2025-05-02T18:53:31.153Z",
              "document": {
                "id": "_VA-GJcBHLe506UUFL2i",
                "index": "my-datastream-indexing-example",
                "source": {
                  "processed": true,
                  "data": {
                    "counter": 37
                  }
                }
              },
              "error": {
                "type": "document_parsing_exception",
                "message": "[1:40] failed to parse: data stream timestamp field [@timestamp] is missing",
                "stack_trace": """o.e.i.m.DocumentParsingException: [1:40] failed to parse: data stream timestamp field [@timestamp] is missing
    	at o.e.i.m.DocumentParser.wrapInDocumentParsingException(DocumentParser.java:265)
    	at o.e.i.m.DocumentParser.internalParseDocument(DocumentParser.java:162)
    	... 19 more
    Caused by: j.l.IllegalArgumentException: data stream timestamp field [@timestamp] is missing
    	at o.e.i.m.DataStreamTimestampFieldMapper.extractTimestampValue(DataStreamTimestampFieldMapper.java:210)
    	at o.e.i.m.DataStreamTimestampFieldMapper.postParse(DataStreamTimestampFieldMapper.java:223)
    	... 20 more
    """
              }
            }
          }
      ]
    }
    ```

    ```json
    {
      "docs": [
        {
          "doc": {
            "_index": "my-datastream-indexing-example", 
            "_version": "-3",
            "_id": "_lA-GJcBHLe506UUGL0I",
            "_source": { 
              "processed": true,
              "@timestamp": "2025-05-28T18:53:31.153Z", 
              "data": {
                "counter": 37
              }
            },
            "_ingest": {
              "timestamp": "2025-05-28T19:14:50.457560845Z"
            }
          }
        }
      ]
    }
    ```
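    When remediating in bulk, it can help to sanity-check each simulated document programmatically before replaying anything. This hypothetical Python check (the helper name and field names are examples specific to this recipe, not an Elasticsearch API) verifies that a document from the simulate response has the required timestamp, no leftover failure metadata, and is no longer targeting a failure store index:

    ```python
    def check_remediated(simulated_doc):
        """Sanity-check one entry from a simulate pipeline API response.

        Returns a list of problems; an empty list means the document looks
        ready to be replayed. Hypothetical helper for illustration only.
        """
        problems = []
        source = simulated_doc["doc"]["_source"]
        if "@timestamp" not in source:
            problems.append("missing required @timestamp field")
        for leftover in ("error", "document"):  # failure metadata the pipeline should remove
            if leftover in source:
                problems.append(f"failure metadata field '{leftover}' was not removed")
        if simulated_doc["doc"]["_index"].startswith(".fs-"):  # failure store indices use the .fs- prefix
            problems.append("document is still targeting a failure store index")
        return problems
    ```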
  </step>

  <step title="Reindex the failure documents">
    Combine the remediation pipeline and the failure store query in a [reindex operation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) to replay the failures.
    ```json

    {
      "source": {
        "index": "my-datastream-indexing-example::failures", <1>
        "query": {
          "bool": { <2>
            "must_not": [
              {
                "exists": {
                  "field": "error.pipeline"
                }
              }
            ],
            "must": [
              {
                "match": {
                  "document.index": "my-datastream-indexing-example"
                }
              },
              {
                "match": {
                  "error.type": "document_parsing_exception"
                }
              },
              {
                "range": {
                  "@timestamp": {
                    "gt": "2025-05-01T00:00:00Z",
                    "lte": "2025-05-28T19:00:00Z"
                  }
                }
              }
            ]
          }
        }
      },
      "dest": {
        "index": "my-datastream-indexing-example", <3>
        "op_type": "create",
        "pipeline": "my-datastream-remediation-pipeline" <4>
      }
    }
    ```

    ```json
    {
      "took": 469,
      "timed_out": false,
      "total": 1,
      "updated": 0,
      "created": 1, 
      "deleted": 0,
      "batches": 1,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0,
      "failures": []
    }
    ```
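    To make the selection logic of the reindex source query concrete, here is a minimal Python sketch of the same filter (the function name, index name, error type, and time bounds are example values from this recipe, not a general-purpose API). It selects indexing failures only (no `error.pipeline`, which would indicate an ingest pipeline failure), from the right data stream, of the right error type, inside the remediation window:

    ```python
    from datetime import datetime, timezone

    def matches_replay_query(hit):
        """Mirror the reindex source query over one failure store hit.

        must_not exists error.pipeline, must match document.index and
        error.type, and range-filter @timestamp (gt start, lte end).
        Hypothetical helper for illustration only.
        """
        start = datetime(2025, 5, 1, tzinfo=timezone.utc)
        end = datetime(2025, 5, 28, 19, 0, tzinfo=timezone.utc)
        if "pipeline" in hit.get("error", {}):  # must_not: skip ingest pipeline failures
            return False
        if hit["document"]["index"] != "my-datastream-indexing-example":
            return False
        if hit["error"]["type"] != "document_parsing_exception":
            return False
        ts = datetime.fromisoformat(hit["@timestamp"].replace("Z", "+00:00"))
        return start < ts <= end  # range: gt start, lte end
    ```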

    <tip>
      Since the failure store is enabled on this data stream, it is wise to check it for any further failures from the reindexing process. Failures at this point may end up as nested failures in the failure store, and remediating those quickly becomes a hassle because the original document ends up nested multiple levels deep in the failure document. For this reason, remediate data during a quiet period when no other failures are likely to arise. Additionally, rolling over the failure store before executing the remediation makes it easier to discard any new nested failures and operate only on the original failure documents.
    </tip>
  </step>

  <step title="Done">
    Once any failures have been remediated, you may wish to purge them from the failure store to free up space and to avoid warnings about failed data that has already been replayed. Otherwise, your failures remain available until the failure store retention period elapses, should you need to reference them.
  </step>
</stepper>