Data lifecycle

Data (e.g. logs, events, metrics) are stored in several availability stages, roughly in chronological order. Recent logs are kept on the fastest data storage; as they age, they are moved to slower and cheaper storage, and eventually they are either moved into the offline archive or deleted.

The lifecycle is controlled by the ElasticSearch feature called Index Lifecycle Management (ILM).

Index Lifecycle Management

Index Lifecycle Management (ILM) in ElasticSearch automatically closes or deletes old indices (e.g. those with data older than three months), so that search performance is maintained and the data storage has room for current data. These settings are defined in a so-called ILM policy.

The ILM policy should be set up before the data are pumped into ElasticSearch, so that each new index finds and associates itself with the proper ILM policy. For more information, please refer to the official documentation. Components such as Dispatcher then use the specified ILM alias (lm_), and ElasticSearch automatically puts the data into the proper index associated with the ILM policy.

Hot-Warm-Cold architecture (HWC)

HWC is an extension of the standard index rotation provided by ElasticSearch ILM and is a good tool for managing time series data. The HWC architecture lets us allocate specific nodes to each of the phases. When used correctly and matched to the cluster architecture, it allows the available hardware to be used to its full potential.

Hot stage

There is usually a period of time (a week, a month, etc.) during which we want to query the indices heavily, aiming for speed rather than conservation of memory and other resources. This is where the “Hot” phase comes in handy: it lets us keep the index with more replicas, spread out and accessible on more nodes, for an optimal user experience.

Hot nodes

Hot nodes should run on the fastest part of the available hardware, with the most CPUs and the fastest I/O.


Warm stage

Once this period is over and the indices are no longer queried as often, we benefit from moving them to the “Warm” phase. This allows us to reduce the number of nodes (or move to nodes with fewer resources) and the number of index replicas, lessening the hardware load while still being able to search the data reasonably fast.

Warm nodes

Warm nodes, as the name suggests, sit in between: they serve mainly for storage, while still retaining some CPU power to handle occasional queries.


Cold stage

Sometimes, there are reasons to store data for extended periods of time (dictated by law, or some internal rule). The data are not expected to be queried, but at the same time, they cannot be deleted just yet.

Cold nodes

This is where the Cold nodes come in. There may be only a few of them, with little CPU power; they do not need SSD drives and are perfectly fine with slower (and optionally larger) storage.


The setting should be done in the following way:
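As a minimal sketch of such a setting, the phases of an ILM policy for a hot-warm-cold cluster might be built as below. It assumes that the nodes are tagged with a custom attribute (here hypothetically `node.attr.data`, set to `hot`, `warm` or `cold` in each node's elasticsearch.yml) and that new indices are routed to hot nodes by the index template. The helper name, phase timings and replica counts are illustrative assumptions, not recommended values.

```python
import json


def hwc_phases(warm_after="7d", cold_after="30d", delete_after="120d"):
    """Build the "phases" part of an ILM policy for a hot-warm-cold cluster.

    Assumption: nodes carry a custom attribute "data" (hot/warm/cold).
    All ages and replica counts are illustrative only.
    """
    return {
        # Hot: the index is written to and rolled over by size or age.
        "hot": {
            "actions": {
                "rollover": {"max_size": "25gb", "max_age": "10d"}
            }
        },
        # Warm: relocate shards to warm nodes.
        "warm": {
            "min_age": warm_after,
            "actions": {
                "allocate": {
                    "number_of_replicas": 1,
                    "require": {"data": "warm"},
                }
            },
        },
        # Cold: relocate shards to cold nodes and drop the replicas.
        "cold": {
            "min_age": cold_after,
            "actions": {
                "allocate": {
                    "number_of_replicas": 0,
                    "require": {"data": "cold"},
                }
            },
        },
        # Delete: remove the index once the retention period is over.
        "delete": {"min_age": delete_after, "actions": {"delete": {}}},
    }


policy = {"policy": {"phases": hwc_phases()}}
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of body that can be sent to the ILM policy endpoint (PUT _ilm/policy/<policy name>). After rollover, the `min_age` of each later phase is counted from the rollover time.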

Archive stage

The archive stage is optional in the design. It is an offline long-term storage. Instead of being deleted, the oldest data from the cold stage can be moved periodically to the archive stage.

The standard archiving policies of the organization operating the SIEM apply. The archived data need to be encrypted.

It is also possible to forward certain logs directly from the warm stage into the archive stage.

Create the ILM policy


Kibana version 7.x can be used to create an ILM policy in ElasticSearch.

1.) Open Kibana

2.) Click Management in the left menu

3.) In the ElasticSearch section, click on Index Lifecycle Policies

4.) Click the blue Create policy button

5.) Enter the policy name, which should be the same as the index prefix, e.g. lm_

6.) Set the maximum index size to the desired rollover size, e.g. 25 GB (size rollover)

7.) Set the maximum age of the index, e.g. 10 days (time rollover)

8.) Toggle the switch at the Delete phase further down the screen and enter the time after which the index should be deleted, e.g. 120 days from rollover

9.) Click the green Save policy button
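The same policy can also be expressed as a request body for the ILM policy API instead of being clicked together in Kibana. The following is a minimal sketch, assuming the example values from the steps above (25 GB, 10 days, 120 days) and the policy name lm_; the helper name `rollover_delete_policy` is hypothetical.

```python
import json


def rollover_delete_policy(max_size="25gb", max_age="10d", delete_after="120d"):
    """Build an ILM policy body with size/time rollover and a delete phase."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when either limit is reached first.
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "delete": {
                    # Counted from rollover, as in step 8.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


body = rollover_delete_policy()
# Request line and body as they could be pasted into Kibana Dev Tools.
print("PUT _ilm/policy/lm_")
print(json.dumps(body, indent=2))
```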

Use the policy in index template

Modify index template(s)

Add the following lines to the JSON index template:

"settings": {
  "index": {
    "lifecycle": {
      "name": "lm_",
      "rollover_alias": "lm_"
    }
  }
}


Kibana version 7.x can be used to link the ILM policy with an ElasticSearch index template.

1.) Open Kibana

2.) Click Management in the left menu

3.) In the ElasticSearch section, click on Index Management

4.) At the top, select Index Templates

5.) Select your desired index template, e.g. lm_

6.) Click on Edit

7.) On the Settings screen, add:

  "index": {
    "lifecycle": {
      "name": "lm_",
      "rollover_alias": "lm_"
    }
  }

8.) Click on Save
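For orientation, a complete template body containing these lifecycle settings might look like the sketch below. It assumes a legacy index template (the `_template` endpoint available in ElasticSearch 7.x), an index pattern derived from the lm_ prefix, and a policy named the same as the prefix; the helper name `index_template` is hypothetical, and a real template would normally also carry mappings.

```python
import json


def index_template(prefix="lm_"):
    """Build a legacy index template body that links indices to an ILM policy.

    Assumption: the ILM policy and the rollover alias are both named
    after the index prefix, as in the steps above.
    """
    return {
        # Apply the template to every index that starts with the prefix.
        "index_patterns": [prefix + "*"],
        "settings": {
            "index": {
                "lifecycle": {
                    "name": prefix,            # ILM policy to attach
                    "rollover_alias": prefix,  # alias used for rollover
                }
            }
        },
    }


print("PUT _template/lm_")
print(json.dumps(index_template(), indent=2))
```

Only indices created after the template is saved pick up the setting, which is why a new index is created in the next step.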

Create a new index which will utilize the latest index template

Using Postman or Kibana, send the following HTTP request to the ElasticSearch instance you are using:

PUT lm_tenant-000001
{
  "aliases": {
    "lm_": {
      "is_write_index": true
    }
  }
}

The ILM policy then uses the alias to direct data to the proper ElasticSearch index, so pumps do not have to care about the index number.

//Note: The index prefix and the index number for ILM rollover must be separated by a hyphen (-000001), not by an underscore (_000001)!//
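The naming rule can be checked before the first index is created. The following is a minimal sketch, assuming the rollover naming convention described in the ElasticSearch documentation (the index name ends with a hyphen and a number, and the next index gets that number incremented and zero-padded to six digits); the function name `next_rollover_index` is hypothetical.

```python
import re

# A rollover-capable index name ends with a hyphen followed by digits.
ROLLOVER_NAME = re.compile(r"^(?P<prefix>.*)-(?P<number>\d+)$")


def next_rollover_index(index_name):
    """Return the name the next rolled-over index is expected to get.

    Raises ValueError if the name does not end with "-<number>".
    """
    match = ROLLOVER_NAME.match(index_name)
    if match is None:
        raise ValueError(
            "index name must end with '-<number>', got: " + index_name
        )
    number = int(match.group("number")) + 1
    return "{}-{:06d}".format(match.group("prefix"), number)


print(next_rollover_index("lm_tenant-000001"))  # lm_tenant-000002

try:
    next_rollover_index("lm_tenant_000001")  # underscore instead of hyphen
except ValueError as error:
    print("rejected:", error)
```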

//Note: Make sure there is no index prefix configured in the source code, for example in the ElasticSearchSink in the pipeline. A configuration in the code would override the file configuration.//