What is Fleet Server?
Fleet Server is a component that connects Elastic Agents to Fleet. It supports many Elastic Agent connections and serves as a control plane for updating agent policies, collecting status information, and coordinating actions across Elastic Agents. It also provides a scalable architecture. As the size of your agent deployment grows, you can deploy additional Fleet Servers to manage the increased workload.
Note: On-premises Fleet Server is not currently available for use in an Elastic Cloud Serverless environment. We recommend using the hosted Fleet Server that is included and configured automatically in Serverless Observability and Security projects.
The following diagram shows how Elastic Agents communicate with Fleet Server to retrieve agent policies:

- When a new agent policy is created, the Fleet UI saves the policy to a Fleet index in Elasticsearch.
- To enroll in the policy, Elastic Agents send a request to Fleet Server, using the enrollment key generated for authentication.
- Fleet Server monitors Fleet indices, picks up the new agent policy from Elasticsearch, then ships the policy to all Elastic Agents enrolled in that policy. Fleet Server may also write updated policies to the Fleet index to manage coordination between agents.
- Elastic Agent uses configuration information in the policy to collect and send data to Elasticsearch.
- Elastic Agent checks in with Fleet Server for updates, maintaining an open connection.
- When a policy is updated, Fleet Server retrieves the updated policy from Elasticsearch and sends it to the connected Elastic Agents.
- To communicate with Fleet about the status of Elastic Agents and the policy rollout, Fleet Server writes updates to Fleet indices.
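The flow in the steps above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not how Fleet Server is implemented: the class names (`FleetIndex`, `Agent`, `FleetServer`), the revision counter, and the `dispatch` method are all hypothetical, and enrollment-key authentication and network transport are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FleetIndex:
    """Toy stand-in for the Fleet indices in Elasticsearch (hypothetical)."""
    policies: dict = field(default_factory=dict)  # policy_id -> (revision, config)
    statuses: dict = field(default_factory=dict)  # agent_id -> last delivered revision

    def save_policy(self, policy_id, config):
        """Store a new revision of a policy, as the Fleet UI would."""
        revision = self.policies.get(policy_id, (0, None))[0] + 1
        self.policies[policy_id] = (revision, config)
        return revision


@dataclass
class Agent:
    agent_id: str
    policy_id: str
    revision: int = 0
    config: Optional[dict] = None


class FleetServer:
    def __init__(self, index):
        self.index = index
        self.agents = []

    def enroll(self, agent):
        # Real enrollment authenticates with an enrollment key; omitted here.
        self.agents.append(agent)

    def dispatch(self):
        """Ship newer policy revisions to enrolled agents and record status."""
        updated = []
        for agent in self.agents:
            revision, config = self.index.policies.get(agent.policy_id, (0, None))
            if revision > agent.revision:
                agent.revision, agent.config = revision, config
                self.index.statuses[agent.agent_id] = revision
                updated.append(agent.agent_id)
        return updated


index = FleetIndex()
server = FleetServer(index)
index.save_policy("policy-1", {"inputs": ["system"]})
agent = Agent("agent-a", "policy-1")
server.enroll(agent)
server.dispatch()  # agent-a receives revision 1
index.save_policy("policy-1", {"inputs": ["system", "nginx"]})
server.dispatch()  # agent-a receives revision 2
```

In this sketch, a second `dispatch()` call with no policy change delivers nothing, which mirrors the idea that agents only receive a policy when a newer one exists.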
Does Fleet Server run inside of Elastic Agent?
Fleet Server is a subprocess that runs inside a deployed Elastic Agent. This means the deployment steps are similar to any Elastic Agent, except that you enroll the agent in a special Fleet Server policy. Typically—especially in large-scale deployments—this agent is dedicated to running Fleet Server as an Elastic Agent communication host and is not configured for data collection.
Elasticsearch includes a fleet-server service account, and Fleet Server uses a service token for that account to communicate with Elasticsearch. Each Fleet Server can use its own service token, or you can share one token across multiple servers (not recommended). The advantage of using a separate token for each server is that you can invalidate each one separately.
You can create a service token by using either the Fleet UI or the Elasticsearch API. For more information, refer to Deploy Fleet Server on-premises and Elasticsearch on Cloud or Deploy on-premises and self-managed, depending on your deployment model.
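As a rough sketch of the API route, the snippet below builds (but does not send) requests against the Elasticsearch service account token endpoint, using one token name per Fleet Server so that each token can be invalidated on its own. It assumes the service account lives in the `elastic` namespace and that the caller authenticates with an API key; the helper names, the host `https://es.example:9200`, and the token names are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def token_endpoint(es_url, token_name):
    """Return the service token URL for the elastic/fleet-server account."""
    base = es_url.rstrip("/")
    name = quote(token_name, safe="")
    return f"{base}/_security/service/elastic/fleet-server/credential/token/{name}"


def _request(method, es_url, token_name, api_key):
    # api_key is assumed to be the encoded Elasticsearch API key string.
    return urllib.request.Request(
        token_endpoint(es_url, token_name),
        method=method,
        headers={"Authorization": f"ApiKey {api_key}"},
    )


def create_token_request(es_url, token_name, api_key):
    """Build the request that creates a named service token."""
    return _request("POST", es_url, token_name, api_key)


def delete_token_request(es_url, token_name, api_key):
    """Build the request that invalidates one named service token."""
    return _request("DELETE", es_url, token_name, api_key)


# One token per Fleet Server host (hypothetical names).
requests = [
    create_token_request("https://es.example:9200", name, "<encoded-api-key>")
    for name in ("fleet-server-01", "fleet-server-02")
]
# Sending is left to the caller, for example: urllib.request.urlopen(requests[0])
```

Naming each token after the server that uses it makes it straightforward to later build a `delete_token_request` for just that server.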
Fleet Server is stateless, so connections to it can be load balanced as long as the Fleet Server has capacity to accept more connections. Load balancing is done on a round-robin basis.
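To make the round-robin idea concrete, here is a minimal sketch of a balancer that hands each new connection to the next Fleet Server in turn and skips servers that are at capacity. It is an assumption-laden illustration: the `RoundRobinBalancer` class, the per-server connection limits, and the host names are hypothetical, and the text does not say which component performs the balancing in a real deployment.

```python
class RoundRobinBalancer:
    """Assign connections to servers in rotation, skipping full servers."""

    def __init__(self, capacities):
        self.capacities = dict(capacities)  # host -> max connections (assumed)
        self.active = {host: 0 for host in self.capacities}
        self._order = list(self.capacities)
        self._next = 0

    def assign(self):
        """Return the next host with spare capacity, or None if all are full."""
        for _ in range(len(self._order)):
            host = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if self.active[host] < self.capacities[host]:
                self.active[host] += 1
                return host
        return None


balancer = RoundRobinBalancer({"fleet-server-01": 2, "fleet-server-02": 1})
assignments = [balancer.assign() for _ in range(4)]
# The fourth call returns None because both servers are full.
```

Because Fleet Server is stateless, the balancer does not need to remember which server an agent used previously; any server with spare capacity can accept the connection.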
How you handle high availability, fault tolerance, and lifecycle management of Fleet Server depends on the deployment model you use.
To learn more about deploying and scaling Fleet Server, refer to:
- Deploy on Elastic Cloud
- Deploy Fleet Server on-premises and Elasticsearch on Cloud
- Deploy on-premises and self-managed
- Fleet Server scalability
- Monitor a self-managed Fleet Server
Secrets used to configure Fleet Server can be specified directly in the configuration or provided through secret files. See Fleet Server Secrets for more information.