Custom filters
Custom filters allow you to filter or redact APM data: ingest pipeline filters act on data at ingestion time, while APM agent filters act before data ever leaves the instrumented service.
Ingest pipeline filters
Ingest pipelines specify a series of processors that transform data in a specific way. Transformation happens prior to indexing, so it adds no performance overhead to the monitored application. Pipelines are a flexible and easy way to filter or obfuscate Elastic APM data.
Features of this approach:
- Filters are applied at ingestion time.
- All Elastic APM agents and fields are supported.
- Data leaves the instrumented service.
- There are no performance overhead implications on the instrumented service.
For a step-by-step example, refer to Tutorial: Use an ingest pipeline to redact sensitive information.
APM agent filters
Some APM agents offer a way to manipulate or drop APM events before they are sent to APM Server.
Features of this approach:
- Data is sanitized before leaving the instrumented service.
- Not supported by all Elastic APM agents.
- Potential overhead implications on the instrumented service.
Refer to the relevant agent’s documentation for more information and examples; a hedged sketch of the approach follows the list:
- .NET: Filter API.
- Node.js: addFilter().
- Python: custom processors.
- Ruby: add_filter().
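For example, the Python agent lets you register a custom processor function that rewrites events before they are queued for sending. The following is a minimal, illustrative sketch, not the agent’s canonical example: it assumes the elasticapm.processors.for_events decorator and an event layout with the captured request body under context.request.body, so check the current Python agent documentation before relying on it.
import json

from elasticapm.conf.constants import ERROR, TRANSACTION
from elasticapm.processors import for_events


@for_events(ERROR, TRANSACTION)
def redact_body_password(client, event):
    # Events are plain dicts. When body capture is enabled, the HTTP
    # request details live under event["context"]["request"], and the
    # body is typically captured as a raw string (assumed layout).
    request = event.get("context", {}).get("request", {})
    body = request.get("body")
    if isinstance(body, str):
        try:
            parsed = json.loads(body)
        except ValueError:
            return event  # leave non-JSON bodies untouched
        if isinstance(parsed, dict) and "password" in parsed:
            parsed["password"] = "redacted"
            request["body"] = json.dumps(parsed)
    return event  # returning None instead would drop the event entirely
Because a processor like this runs inside the instrumented service, it is also the source of the potential overhead noted above.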
Tutorial: Use an ingest pipeline to redact sensitive information
Say you decide to capture HTTP request bodies but quickly notice that sensitive information is being collected in the http.request.body.original field:
{
"email": "test@abc.com",
"password": "hunter2"
}
To obfuscate the passwords stored in the request body, you can use a series of ingest processors.
Create a pipeline
Tip
This tutorial uses the Ingest APIs, but it’s also possible to create a pipeline using the UI. In Kibana, go to Stack Management → Ingest Pipelines → Create pipeline → New pipeline or use the global search field.
To start, create a pipeline with a simple description and an empty array of processors:
{
"pipeline": {
"description": "redact http.request.body.original.password",
"processors": [] 1
}
}
- The processors defined below will go in this array
Add a JSON processor
Add your first processor to the processors array. Because the agent captures the request body as a string, use the JSON processor to convert the original field value into a structured JSON object. Save this JSON object in a new field:
{
"json": {
"field": "http.request.body.original",
"target_field": "http.request.body.original_json",
"ignore_failure": true
}
}
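After this processor runs, an ingested document would hold both the raw string and the parsed copy. Roughly, and purely as an illustration:
{
  "http": {
    "request": {
      "body": {
        "original": "{\"email\": \"test@abc.com\", \"password\": \"hunter2\"}",
        "original_json": {
          "email": "test@abc.com",
          "password": "hunter2"
        }
      }
    }
  }
}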
Add a set processor
If body.original_json is not null, i.e., it exists, we’ll redact the password with the set processor by setting the value of body.original_json.password to "redacted":
{
"set": {
"field": "http.request.body.original_json.password",
"value": "redacted",
"if": "ctx?.http?.request?.body?.original_json != null"
}
}
Add a convert processor
Use the convert processor to convert the JSON value of body.original_json to a string and set it as the body.original value:
{
"convert": {
"field": "http.request.body.original_json",
"target_field": "http.request.body.original",
"type": "string",
"if": "ctx?.http?.request?.body?.original_json != null",
"ignore_failure": true
}
}
Add a remove processor
Finally, use the remove processor to remove the body.original_json field:
{
"remove": {
"field": "http.request.body.original_json",
"if": "ctx?.http?.request?.body?.original_json != null",
"ignore_failure": true
}
}
Register the pipeline
Then put it all together, and use the create or update pipeline API to register the new pipeline in Elasticsearch. Name the pipeline apm_redacted_body_password:
PUT _ingest/pipeline/apm_redacted_body_password
{
"description": "redact http.request.body.original.password",
"processors": [
{
"json": {
"field": "http.request.body.original",
"target_field": "http.request.body.original_json",
"ignore_failure": true
}
},
{
"set": {
"field": "http.request.body.original_json.password",
"value": "redacted",
"if": "ctx?.http?.request?.body?.original_json != null"
}
},
{
"convert": {
"field": "http.request.body.original_json",
"target_field": "http.request.body.original",
"type": "string",
"if": "ctx?.http?.request?.body?.original_json != null",
"ignore_failure": true
}
},
{
"remove": {
"field": "http.request.body.original_json",
"if": "ctx?.http?.request?.body?.original_json != null",
"ignore_failure": true
}
}
]
}
Test the pipeline
Prior to enabling this new pipeline, you can test it with the simulate pipeline API. This API allows you to run multiple documents through a pipeline to ensure it is working correctly.
The request below simulates running three different documents through the pipeline:
POST _ingest/pipeline/apm_redacted_body_password/_simulate
{
"docs": [
{
"_source": { 1
"http": {
"request": {
"body": {
"original": """{"email": "test@abc.com", "password": "hunter2"}"""
}
}
}
}
},
{
"_source": { 2
"some-other-field": true
}
},
{
"_source": { 3
"http": {
"request": {
"body": {
"original": """["invalid json" """
}
}
}
}
}
]
}
- This document features the same sensitive data from the original example above
- This document only contains an unrelated field
- This document contains invalid JSON
The API response should be similar to this:
{
"docs" : [
{
"doc" : {
"_source" : {
"http" : {
"request" : {
"body" : {
"original" : {
"password" : "redacted",
"email" : "test@abc.com"
}
}
}
}
}
}
},
{
"doc" : {
"_source" : {
"nobody" : true
}
}
},
{
"doc" : {
"_source" : {
"http" : {
"request" : {
"body" : {
"original" : """["invalid json" """
}
}
}
}
}
}
]
}
As expected, only the first simulated document has a redacted password field. All other documents are unaffected.
Create a @custom pipeline
The final step in this process is to call the newly created apm_redacted_body_password pipeline from the @custom pipeline of the data stream you wish to edit.
@custom pipelines are specific to each data stream and follow a similar naming convention: <type>-<dataset>@custom. As a reminder, the default APM data streams are:
- Application traces: traces-apm-<namespace>
- RUM and iOS agent application traces: traces-apm.rum-<namespace>
- APM internal metrics: metrics-apm.internal-<namespace>
- APM transaction metrics: metrics-apm.transaction.<metricset.interval>-<namespace>
- APM service destination metrics: metrics-apm.service_destination.<metricset.interval>-<namespace>
- APM service transaction metrics: metrics-apm.service_transaction.<metricset.interval>-<namespace>
- APM service summary metrics: metrics-apm.service_summary.<metricset.interval>-<namespace>
- Application metrics: metrics-apm.app.<service.name>-<namespace>
- APM error/exception logging: logs-apm.error-<namespace>
- Applications UI logging: logs-apm.app.<service.name>-<namespace>
To match a custom ingest pipeline with a data stream, follow the <type>-<dataset>@custom template, or replace -<namespace> with @custom in the list above. For example, to target application traces, you’d create a pipeline named traces-apm@custom.
Use the create or update pipeline API to register the new pipeline in Elasticsearch. Name the pipeline traces-apm@custom:
PUT _ingest/pipeline/traces-apm@custom
{
"processors": [
{
"pipeline": {
"name": "apm_redacted_body_password" 1
}
}
]
}
- The name of the pipeline we previously created
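To confirm the registration took effect, you can fetch the stored definition with the get pipeline API:
GET _ingest/pipeline/traces-apm@custom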
That’s it! Passwords will now be redacted from your APM HTTP body data.
Next steps
To learn more about ingest pipelines, refer to View the Elasticsearch index template.