The Elasticsearch data store

Elasticsearch is a distributed search and analytics engine, scalable document store, and vector database built on Apache Lucene. It stores data as JSON documents, organized into indices. You can interact with an index using its unique name or through a logical reference such as an alias. Each index holds a dataset with its own schema, defined by a mapping that specifies the fields and their types.

You can store many independent datasets side by side — each in its own index or data stream — and search them individually or together.

As your data grows, how you structure, size, and manage your indices directly affects query performance, storage costs, and operational complexity. This section covers the core storage concepts, how to configure data structure and behavior, and how to manage your indices and documents.

For production architecture guidance, including resilience, scaling, and performance optimization, refer to Run Elasticsearch in production.

Tip

You can index a document using the Index API. For production ingestion workflows and related concepts such as pipelines, agents, and Logstash, refer to Ingest: Bring your data to Elastic.
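As a quick illustration, a single document can be indexed with one request, and the index is created automatically if it does not already exist. This is a minimal sketch; the index name `my-index`, the document ID, and the field names are placeholders:

```
PUT /my-index/_doc/1
{
  "title": "A sample document",
  "created_at": "2024-05-01T12:00:00Z"
}
```

The response reports the index, the document ID, and whether the document was created or updated.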

Elasticsearch provides two main ways to organize your data:

  • Index: The general-purpose storage unit. Use an index when you need to update or delete individual documents, or when your data is not time-based.
  • Data stream: The recommended approach for timestamped, append-only data like logs, events, and metrics. A data stream manages rolling backing indices automatically and integrates with data lifecycle management out of the box.
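For example, creating a data stream starts with an index template that declares the `data_stream` object; the stream itself can then be created explicitly or on first write. This is a sketch with placeholder names (`logs-demo-template`, `logs-demo-default`); documents appended to a data stream must include a `@timestamp` field:

```
PUT /_index_template/logs-demo-template
{
  "index_patterns": ["logs-demo-*"],
  "data_stream": {}
}

PUT /_data_stream/logs-demo-default
```

From then on you write to and search `logs-demo-default` by name, while Elasticsearch manages the backing indices.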

Both use the same foundational concepts such as documents, mappings, templates, and aliases. The configuration topics in this section apply regardless of which you choose.

  • Index fundamentals: Learn the basics of indices, including index naming and aliases, document structure, metadata fields, and mappings.
  • Data streams: Learn when to use data streams for append-only time series data, like logs, events, or metrics. You work with one stream name while Elasticsearch manages multiple backing indices behind the scenes.
  • Near real-time search: Understand how Elasticsearch makes newly indexed data searchable within seconds of indexing.

How your indices are structured and managed affects query performance, storage efficiency, and how easily your cluster scales. Elasticsearch provides tools to control document field types, manage unstructured text, standardize index configurations, and simplify and automate access logic:

  • Mapping: Define how documents and their fields are stored and indexed. Choose between dynamic mapping for automatic field detection and explicit mapping for full control over field types and indexing behavior.
  • Text analysis: Configure how unstructured text is converted into a structured format optimized for full-text search, including tokenization, normalization, and custom analyzers.
  • Templates: Define reusable index configurations including settings, mappings, and aliases that are automatically applied when new indices or data streams are created.
  • Aliases: Create named references that point to one or more indices or data streams. Aliases are logical groupings that have no impact on disk layout or data structure. Instead, they provide an organizational layer for query targeting, zero-downtime reindexing, and abstracting away physical index names.
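To make the mapping and analysis concepts concrete, the sketch below creates an index with an explicit mapping and a custom analyzer. The index name, field names, and the analyzer name `lowercase_keyword` are illustrative placeholders, not a prescribed schema:

```
PUT /products
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name":  { "type": "text" },
      "sku":   { "type": "keyword", "normalizer": null },
      "price": { "type": "float" }
    }
  }
}
```

Here `name` is analyzed for full-text search, while `sku` is stored as an exact-match keyword, showing how explicit mapping gives you per-field control over indexing behavior.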
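Templates and aliases often work together: an index template can attach settings and an alias to every new index that matches a pattern. A minimal sketch, assuming placeholder names (`products-template`, `products-*`, `products-current`):

```
PUT /_index_template/products-template
{
  "index_patterns": ["products-*"],
  "template": {
    "settings": { "number_of_shards": 1 },
    "aliases":  { "products-current": {} }
  }
}
```

Any index created with a matching name, such as `products-2025`, automatically receives these settings, and queries against the `products-current` alias reach it without clients knowing the physical index name.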

Work with your indices and data using the Kibana UI or the Elasticsearch REST API.

  • Manage indices in Kibana: Use Kibana's Index Management page to view and manage your indices, data streams, templates, component templates, and enrich policies.
  • Manage data using APIs: Index, update, retrieve, search, and delete documents using curl and the Elasticsearch REST API.
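The curl-based workflow can be sketched as the four basic document operations below. This assumes a local cluster reachable at `localhost:9200` with security disabled; adjust the URL and add authentication for a real deployment, and note that `my-index` and the document fields are placeholders:

```shell
# Index a document (creates my-index if it does not exist)
curl -X PUT "localhost:9200/my-index/_doc/1" \
  -H "Content-Type: application/json" \
  -d '{"title": "hello"}'

# Retrieve the document by ID
curl -X GET "localhost:9200/my-index/_doc/1"

# Search the index with a match query
curl -X GET "localhost:9200/my-index/_search" \
  -H "Content-Type: application/json" \
  -d '{"query": {"match": {"title": "hello"}}}'

# Delete the document
curl -X DELETE "localhost:9200/my-index/_doc/1"
```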

After you understand how your data is stored, explore these topics to move to practical data workflows and growth planning:

  • Ingest: Explore ingestion options, from sample data and manual uploads to automated pipelines and API-driven workflows.
  • Query and filter your data: Search, filter, and aggregate data, and learn about search approaches and query languages.
  • Data lifecycle: Plan data retention, storage tiers, and rollover as your data grows.
  • Scaling considerations: Learn how to evaluate workload growth and scale your deployment effectively.