Logstash output
The Logstash output uses an internal protocol to send events directly to Logstash over TCP. Logstash provides additional parsing, transformation, and routing of data collected by Elastic Agent.
Compatibility: This output works with all compatible versions of Logstash. Refer to the Elastic Support Matrix.
This example configures a Logstash output called default in the elastic-agent.yml file:
outputs:
  default:
    type: logstash
    hosts: ["127.0.0.1:5044"]  # (1)

(1) The Logstash server and the port (5044) where Logstash is configured to listen for incoming Elastic Agent connections.
To receive the events in Logstash, you also need to create a Logstash configuration pipeline. The Logstash configuration pipeline listens for incoming Elastic Agent connections, processes received events, and then sends the events to Elasticsearch.
The following Logstash pipeline definition example configures a pipeline that listens on port 5044 for incoming Elastic Agent connections and routes received events to Elasticsearch.
input {
  elastic_agent {
    port => 5044
    enrich => none  # (1)
    ssl => true
    ssl_certificate_authorities => ["<ca_path>"]
    ssl_certificate => "<server_cert_path>"
    ssl_key => "<server_cert_key_in_pkcs8>"
    ssl_verify_mode => "force_peer"
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]  # (2)
    # cloud_id => "..."
    data_stream => true
    api_key => "<api_key>"  # (3)
    ssl => true
    # cacert => "<elasticsearch_ca_path>"
  }
}

(1) Do not modify the events' schema.
(2) The Elasticsearch server and the port (9200) where Elasticsearch is running. The host uses the https scheme because ssl is enabled.
(3) The API key used by Logstash to ship data to the destination data streams.
For more information about configuring Logstash, refer to Configuring Logstash and Elastic Agent input plugin.
The logstash output supports the following settings, grouped by category. Many of these settings have sensible defaults that allow you to run Elastic Agent with minimal configuration.
| Setting | Description |
| --- | --- |
| enabled | (boolean) Enables or disables the output. If set to false, the output is disabled. |
| escape_html | (boolean) Configures escaping of HTML in strings. Set to true to enable escaping.<br>Default: false |
| hosts | (list) The list of known Logstash servers to connect to. If load balancing is disabled but multiple hosts are configured, one host is selected at random (there is no precedence). If that host becomes unreachable, another one is selected at random.<br>Each entry in this list can contain a port number. If no port is specified, 5044 is used. |
| proxy_url | (string) The URL of the SOCKS5 proxy to use when connecting to the Logstash servers. The value must be a URL with a scheme of socks5://. The protocol used to communicate with Logstash is not based on HTTP, so you cannot use a web proxy.<br>If the SOCKS5 proxy server requires client authentication, embed a username and password in the URL, as shown in the example that follows this table.<br>When using a proxy, hostnames are resolved on the proxy server instead of on the client. To change this behavior, set proxy_use_local_resolver. |
| proxy_use_local_resolver | (boolean) Determines whether Logstash hostnames are resolved locally when using a proxy. If false and a proxy is used, name resolution occurs on the proxy server.<br>Default: false |

Example proxy_url configuration:

outputs:
  default:
    type: logstash
    hosts: ["remote-host:5044"]
    proxy_url: socks5://user:password@socks5-proxy:2233
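As a rough sketch of how the connection settings above can combine, the following fragment lists two hosts, one of them without a port so that it falls back to 5044, and resolves hostnames on the client while connecting through a SOCKS5 proxy. The hostnames and the proxy address are placeholders chosen for illustration, not values from a real deployment.

```yaml
outputs:
  default:
    type: logstash
    enabled: true
    # The second entry omits the port, so 5044 is used for it.
    hosts: ["logstash-a.example.internal:5044", "logstash-b.example.internal"]
    # Placeholder SOCKS5 proxy with credentials embedded in the URL.
    proxy_url: socks5://user:password@socks5-proxy:2233
    # Resolve the Logstash hostnames on the client instead of on the proxy.
    proxy_use_local_resolver: true
```

Because loadbalance is not set here, only one of the two hosts receives events at a time.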
When sending data to a secured cluster through the logstash output, Elastic Agent can use SSL/TLS. For a list of available settings, refer to SSL/TLS, specifically the settings under Table 7, Common configuration options and Table 8, Client configuration options.
For more information, refer to Configure SSL/TLS for the Logstash output.
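The fragment below is a hedged sketch of what a TLS-enabled logstash output might look like. The ssl.* setting names are assumed from the SSL/TLS settings reference rather than taken from this section, and every path is a placeholder. A client certificate and key are included on the assumption that Logstash requires them, as it does when ssl_verify_mode is set to force_peer in the pipeline example.

```yaml
outputs:
  default:
    type: logstash
    hosts: ["logstash.example.internal:5044"]
    # Assumed setting names; confirm them against the SSL/TLS reference.
    # CA that signed the Logstash server certificate (placeholder path).
    ssl.certificate_authorities: ["/path/to/ca.crt"]
    # Client certificate and key presented to Logstash (placeholder paths).
    ssl.certificate: "/path/to/client.crt"
    ssl.key: "/path/to/client.key"
```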
The memory queue keeps all events in memory.
The memory queue waits for the output to acknowledge or drop events. If the queue is full, no new events can be inserted into the memory queue. Only after the signal from the output will the queue free up space for more events to be accepted.
The memory queue is controlled by the parameters flush.min_events and flush.timeout. flush.min_events limits the number of events that can be included in a single batch, and flush.timeout specifies how long the queue should wait to completely fill an event request. If the output supports a bulk_max_size parameter, the maximum batch size is the smaller of bulk_max_size and flush.min_events.
flush.min_events is a legacy parameter, and new configurations should prefer to control batch size with bulk_max_size. As of 8.13, there is never a performance advantage to limiting batch size with flush.min_events instead of bulk_max_size.
In synchronous mode, an event request is always filled as soon as events are available, even if there are not enough events to fill the requested batch. This is useful when latency must be minimized. To use synchronous mode, set flush.timeout to 0.
For backwards compatibility, synchronous mode can also be activated by setting flush.min_events to 0 or 1. In this case, batch size is capped at half the queue capacity.
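A minimal sketch of synchronous mode, written with flat queue.mem.* keys. The queue size is an arbitrary illustrative value, not a recommendation.

```yaml
# Illustrative queue size.
queue.mem.events: 4096
# A zero timeout hands events to the output as soon as they are available,
# even if the requested batch is not full.
queue.mem.flush.timeout: 0s
```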
In asynchronous mode, an event request waits up to the specified timeout to try to fill the requested batch completely. If the timeout expires, the queue returns a partial batch with all available events. To use asynchronous mode, set flush.timeout to a positive duration, for example 5s.
This sample configuration forwards events to the output when there are enough events to fill the output’s request (usually controlled by bulk_max_size, and limited to at most 512 events by flush.min_events), or when events have been waiting for 5s without filling the requested size:
queue.mem.events: 4096
queue.mem.flush.min_events: 512
queue.mem.flush.timeout: 5s
| Setting | Description |
| --- | --- |
| queue.mem.events | The number of events the queue can store. This value should be evenly divisible by the smaller of queue.mem.flush.min_events or bulk_max_size to avoid sending partial batches to the output.<br>Default: 3200 events |
| queue.mem.flush.min_events | flush.min_events is a legacy parameter, and new configurations should prefer to control batch size with bulk_max_size. As of 8.13, there is never a performance advantage to limiting batch size with flush.min_events instead of bulk_max_size.<br>Default: 1600 events |
| queue.mem.flush.timeout | (int) The maximum wait time for queue.mem.flush.min_events to be fulfilled. If set to 0s, events are available to the output immediately.<br>Default: 10s |
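To illustrate the divisibility guidance for queue.mem.events, here is a sketch that sizes the queue as a whole multiple of the batch size. The numbers are illustrative, and nesting the queue keys under the output is an assumption about placement rather than something this section states.

```yaml
outputs:
  default:
    type: logstash
    hosts: ["127.0.0.1:5044"]
    # Batch size requested by the output. It is smaller than the default
    # flush.min_events (1600), so it determines the effective batch size.
    bulk_max_size: 1024
    # 4096 / 1024 = 4, so the queue holds exactly four full batches.
    queue.mem.events: 4096
    # Return a partial batch if events have waited this long.
    queue.mem.flush.timeout: 5s
```

flush.min_events is left unset here, in line with the advice to control batch size with bulk_max_size in new configurations.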
Settings that may affect performance.
| Setting | Description |
| --- | --- |
| backoff.init | (string) The number of seconds to wait before trying to reconnect to Logstash after a network error. After waiting backoff.init seconds, Elastic Agent tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to backoff.max. After a successful connection, the backoff timer is reset.<br>Default: 1s |
| backoff.max | (string) The maximum number of seconds to wait before attempting to connect to Logstash after a network error.<br>Default: 60s |
| bulk_max_size | (int) The maximum number of events to bulk in a single Logstash request.<br>Events can be collected into batches. Elastic Agent splits batches larger than bulk_max_size into multiple batches.<br>Specifying a larger batch size can improve performance by lowering the overhead of sending events. However, large batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.<br>Set this value to 0 to turn off the splitting of batches. When splitting is turned off, the queue determines the number of events contained in a batch.<br>Default: 2048 |
| compression_level | (int) The gzip compression level. Set this value to 0 to disable compression. The compression level must be in the range of 1 (best speed) to 9 (best compression).<br>Increasing the compression level reduces network usage but increases CPU usage.<br>Default: 3 |
| loadbalance | If true and multiple Logstash hosts are configured, the output plugin load balances published events across all Logstash hosts. If false, the output plugin sends all events to one host (determined at random) and switches to another host if the selected one becomes unresponsive.<br>With loadbalance enabled:<br>- Elastic Agent reads batches of events and sends each batch to one Logstash worker dynamically, based on a work queue shared between the outputs.<br>- If a connection drops, Elastic Agent takes the disconnected Logstash worker out of its pool.<br>- Elastic Agent tries to reconnect. If it succeeds, it re-adds the Logstash worker to the pool.<br>- If one of the Logstash nodes is slow but "healthy", it sends a keep-alive signal until the full batch of data is processed. This prevents Elastic Agent from sending further data until it receives an acknowledgement signal back from Logstash. Elastic Agent keeps all events in memory until after that acknowledgement occurs.<br>Without loadbalance enabled:<br>- Elastic Agent picks a random Logstash host and sends batches of events to it. Due to the random algorithm, the load on the Logstash nodes should be roughly equal.<br>- If any error occurs, Elastic Agent picks another Logstash node, also at random. If a connection to a host fails, the host is retried only if there are errors on the new connection.<br>Default: false<br>Example: see the loadbalance configuration that follows this table. |
| max_retries | (int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.<br>Set max_retries to a value less than 0 to retry until all events are published.<br>Default: 3 |
| pipelining | (int) The number of batches to send asynchronously to Logstash while waiting for an ACK from Logstash. The output becomes blocking after the specified number of batches are written. Specify 0 to turn off pipelining.<br>Default: 2 |
| slow_start | (boolean) If true, only a subset of events in a batch is transferred per transaction. The number of events to be sent increases up to bulk_max_size if no error is encountered. On error, the number of events per transaction is reduced again.<br>Default: false |
| timeout | (string) The number of seconds to wait for responses from the Logstash server before timing out.<br>Default: 30s |
| ttl | (string) Time to live for a connection to Logstash, after which the connection is reestablished. This setting is useful when Logstash hosts represent load balancers. Because connections to Logstash hosts are sticky, operating behind load balancers can lead to uneven load distribution across instances. Specify a TTL on the connection to achieve equal connection distribution across instances.<br>Default: 0 (turns off the feature)<br>Note: The ttl option is not yet supported on an asynchronous Logstash client (one with the pipelining option set). |
| worker | (int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, six workers are started in total (three for each host).<br>Default: 1 |

Example loadbalance configuration:

outputs:
  default:
    type: logstash
    hosts: ["localhost:5044", "localhost:5045"]
    loadbalance: true
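The following fragment is one possible way to combine several of the performance settings above, sketched for the case where the configured hosts sit behind load balancers. All values and hostnames are illustrative assumptions, not tuned recommendations. Pipelining is turned off because the ttl option is not supported on an asynchronous client.

```yaml
outputs:
  default:
    type: logstash
    # Placeholder hostnames.
    hosts: ["logstash-a.example.internal:5044", "logstash-b.example.internal:5044"]
    # Spread batches across both hosts.
    loadbalance: true
    # 2 hosts x 2 workers = 4 workers in total.
    worker: 2
    # Batches larger than this are split before sending.
    bulk_max_size: 1024
    # Turn off pipelining so that ttl can take effect.
    pipelining: 0
    # Re-establish each connection every 60s to even out distribution.
    ttl: 60s
    # Reconnect delay starts at 1s and grows exponentially up to 30s.
    backoff.init: 1s
    backoff.max: 30s
    # A negative value retries until all events are published.
    max_retries: -1
```

With max_retries set below 0, events are not dropped after repeated failures, so a long Logstash outage can fill the memory queue and stop new events from being accepted until the output acknowledges earlier ones.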