Managed OTLP Endpoint troubleshooting
The following sections provide troubleshooting information for the Elastic Cloud Managed OTLP Endpoint.
The following symptoms indicate that no data is being ingested:
- No telemetry data appears in Elastic.
- You haven't set up an OpenTelemetry Collector or SDK yet.
To resolve this, spin up an EDOT Collector in a few steps and point it at the Managed OTLP Endpoint, as shown below.
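This is a minimal sketch of a Collector configuration that receives OTLP data and forwards it to the Managed OTLP Endpoint; the endpoint URL and API key are placeholders you get from your Elastic Cloud project, and your actual setup may differ:

```yaml
# Receive OTLP data from SDKs over gRPC and HTTP
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

# Forward everything to the Elastic Cloud Managed OTLP Endpoint
exporters:
  otlp:
    endpoint: https://<your-motlp-endpoint>:443   # placeholder
    headers:
      Authorization: "ApiKey <your-api-key>"      # placeholder

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      exporters: [otlp]
```

This is only an illustration; follow the EDOT Collector setup documentation for the complete steps.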
The following error in your Collector or SDK logs indicates that the API key isn't formatted correctly:

    Exporting failed. Dropping data.
    {"kind": "exporter", "data_type": ...}
    "Unauthenticated desc = ApiKey prefix not found"
Format your API key correctly. The format depends on whether you're using a Collector or an SDK:
- Collector configuration: `"Authorization": "ApiKey <api-key-value-here>"`
- SDK environment variable: `"Authorization=ApiKey <api-key>"`
The following symptoms indicate rate limiting:
- HTTP `429 Too Many Requests` errors appear when sending data.
- Log messages indicate rate limiting from the Managed OTLP Endpoint.
Your project might be hitting ingest rate limits. Refer to the dedicated *429 errors when using the Elastic Cloud Managed OTLP Endpoint* troubleshooting guide for details on causes, rate limits, and solutions.
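While you address the underlying rate limit, you can configure the Collector to retry throttled requests with backoff and buffer them in a queue instead of dropping data. This is a minimal sketch using the exporter's standard `retry_on_failure` and `sending_queue` settings; the values are illustrative, not official recommendations:

```yaml
exporters:
  otlp:
    endpoint: https://<your-motlp-endpoint>:443
    headers:
      Authorization: "ApiKey <your-api-key>"
    retry_on_failure:
      enabled: true
      initial_interval: 5s      # wait before the first retry
      max_interval: 30s         # cap the backoff between retries
      max_elapsed_time: 300s    # give up on a request after this long
    sending_queue:
      enabled: true
      queue_size: 1000          # buffer requests while retries are in progress
```

Retries only smooth over short bursts; if you're consistently over the limit, the linked guide covers longer-term solutions.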
The following symptoms indicate that requests exceed the maximum payload size:
- HTTP `413 Payload Too Large` errors appear when sending data.
- gRPC errors indicate the request or response exceeded the maximum message size.
- Errors happen more often when traffic spikes or when individual telemetry items are large.
Reduce the payload size sent by your collector by lowering batching limits. In the EDOT Collector (and upstream or contrib collectors), you can reduce the maximum batch size (in uncompressed bytes) so each request stays smaller.
For configuration guidance and the recommended batching settings for sending data to the Elastic Cloud Managed OTLP Endpoint, refer to *Batching configuration for contrib OpenTelemetry Collector*.
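As an illustration of the kind of setting involved (assuming a recent Collector version where the exporter's sending queue supports byte-based batching; key names and values here may differ from the recommendations in the linked guide):

```yaml
exporters:
  otlp:
    endpoint: https://<your-motlp-endpoint>:443
    sending_queue:
      enabled: true
      sizer: bytes            # measure batches in uncompressed bytes
      batch:
        flush_timeout: 1s     # send a partial batch after this long
        min_size: 1000000     # hypothetical: ~1 MB minimum per request
        max_size: 4000000     # hypothetical: ~4 MB cap per request
```

Lowering `max_size` keeps each request under the endpoint's payload limit, at the cost of sending more, smaller requests.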
The following symptoms indicate server-side errors:
- HTTP `5xx` errors (such as `500`, `502`, or `503`) appear when sending data.
- Data ingestion is intermittent or fails completely.
- Errors might correlate with periods of high traffic.
Server errors can indicate that your Elasticsearch cluster is undersized for the current workload. Use AutoOps to check:
- CPU usage: High CPU utilization suggests the cluster needs more processing capacity.
- Memory usage: High memory pressure can cause instability and errors.
- Active alerts: Check for events indicating resource constraints.
If these metrics confirm the cluster is under-resourced, scale your deployment to add capacity.
AutoOps is a diagnostic tool available for Elastic Cloud Hosted deployments that analyzes cluster metrics, provides root-cause analysis, and suggests resolution paths.
AutoOps can help you identify and resolve issues that affect mOTLP data ingestion, including 429 rate limiting errors and 5XX server errors caused by undersized clusters.
When your deployment receives excessive traffic or lacks sufficient resources, AutoOps detects patterns that often precede or accompany ingestion errors:
- Index queue is high: Usually the first indicator that your deployment lacks sufficient resources for the current data volume.
- CPU utilization is high: Often follows high index queue, as processing incoming telemetry data is CPU-intensive.
- Unbalanced node load: Some data nodes are more loaded than others, indicating potential scaling or routing issues.
AutoOps is accessible from the Elastic Cloud console. From your deployment, select AutoOps in the navigation menu to view cluster status, active events, and resource metrics.
To check node-level metrics like CPU, memory, and write queue, go to AutoOps Monitoring > Nodes.
AutoOps availability depends on your cloud provider and region. Refer to AutoOps regions for details.
AutoOps can alert you when events occur in your cluster. To get started:
- Add at least one connector (Slack, PagerDuty, MS Teams, or webhook).
- Configure notification filters to include or exclude specific events.
AutoOps delivers all events occurring across your deployments through the configured connectors.
For more information, refer to AutoOps.
Help improve the Elastic Cloud Managed OTLP Endpoint by sending us feedback in our discussion forum or community Slack.
For EDOT Collector feedback, open an issue in the elastic-agent repository.