Data Lifecycle

Data such as logs, events, and metrics are stored in several availability stages, ordered chronologically. The most recent logs are kept on the fastest storage; as they age, they are moved to slower, cheaper storage and eventually either moved to an offline archive or deleted.

Schema: Data life cycle in the TeskaLabs LogMan.io.

The lifecycle is controlled by an Elasticsearch feature called Index Lifecycle Management (ILM).

Index Lifecycle Management

Index Lifecycle Management (ILM) in Elasticsearch automatically closes or deletes old indices (e.g. those with data older than three months), so that search performance is maintained and storage space remains available for current data. These settings are defined in a so-called ILM policy.

The ILM policy should be set up before data are pumped into Elasticsearch, so that each new index finds and associates itself with the proper policy. For more information, please refer to the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-index-lifecycle-management.html

LogMan.io components such as Dispatcher then write to a specified ILM alias (lm_), and Elasticsearch automatically puts the data into the proper index governed by the ILM policy.

Hot-Warm-Cold architecture (HWC)

HWC is an extension of the standard index rotation provided by Elasticsearch ILM, and it is a good tool for managing time series data. The HWC architecture allows specific nodes to be allocated to one of the phases. When used correctly, together with a suitable cluster architecture, it lets the available hardware be used to its full potential.

Hot stage

There is usually a period of time (a week, a month, etc.) during which we want to query the indices heavily, aiming for speed rather than conservation of memory and other resources. This is where the “Hot” phase comes in handy: the index can have more replicas, spread out and accessible on more nodes, for an optimal user experience.

Hot nodes

Hot nodes should use the fastest parts of the available hardware, with the most CPUs and the fastest I/O.

Warm stage

Once this period is over and the indices are no longer queried as often, it is beneficial to move them to the “Warm” phase. This allows us to reduce the number of nodes (or move to nodes with fewer resources) and the number of index replicas, lessening the hardware load while still retaining the option to search the data reasonably fast.

Warm nodes

Warm nodes, as the name suggests, sit between the two extremes: they serve mainly as storage, while still retaining some CPU power to handle occasional queries.

Cold stage

Sometimes, there are reasons to store data for extended periods of time (dictated by law, or some internal rule). The data are not expected to be queried, but at the same time, they cannot be deleted just yet.

Cold nodes

This is where the Cold nodes come in. There may be only a few of them, with few CPU resources, and they do not need SSD drives; slower (and optionally larger) storage is perfectly fine.

The settings should be configured in the following way:
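The exact values depend on the deployment. As a minimal sketch, the three stages could be expressed as one ILM policy body like the one built below. The node attribute name (data), the phase ages (7d, 30d) and the replica counts are assumptions made for illustration; the rollover and delete values are borrowed from the Kibana example later on this page. The function name build_hwc_policy is hypothetical.

```python
import json


def build_hwc_policy(warm_after="7d", cold_after="30d", delete_after="120d"):
    """Return an ILM policy body with hot, warm, cold and delete phases."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Start a new index when either limit is reached.
                        "rollover": {"max_size": "25gb", "max_age": "10d"},
                    }
                },
                "warm": {
                    "min_age": warm_after,
                    "actions": {
                        # Assumption: warm nodes are started with
                        # node.attr.data: warm in elasticsearch.yml.
                        "allocate": {
                            "number_of_replicas": 1,
                            "require": {"data": "warm"},
                        },
                    },
                },
                "cold": {
                    "min_age": cold_after,
                    "actions": {
                        # Assumption: cold nodes use node.attr.data: cold.
                        "allocate": {
                            "number_of_replicas": 0,
                            "require": {"data": "cold"},
                        },
                    },
                },
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


if __name__ == "__main__":
    # Body that could be sent as: PUT _ilm/policy/lm_
    print(json.dumps(build_hwc_policy(), indent=2))
```

With this approach, new indices would also need to start on the hot nodes, for example through an index.routing.allocation.require.data: hot setting in the index template (again assuming the data attribute).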

Archive stage

The archive stage is optional in the design. It is offline, long-term storage. The oldest data from the cold stage can be moved periodically to the archive stage instead of being deleted.

The standard archiving policy of the organization operating the SIEM is applied. The archived data must be encrypted.

It is also possible to forward certain logs directly from the warm stage into the archive stage.

Create the ILM policy

Kibana

Kibana version 7.x can be used to create an ILM policy in Elasticsearch.

1.) Open Kibana

2.) Click Management in the left menu

3.) In the ElasticSearch section, click on Index Lifecycle Policies

4.) Click the blue Create policy button

5.) Enter the policy name, which should be the same as the index prefix, e.g. lm_

6.) Set the maximum index size to the desired rollover size, e.g. 25 GB (size rollover)

7.) Set the maximum age of the index, e.g. 10 days (time rollover)

8.) Scroll down, turn on the switch at Delete phase, and enter the time after which the index should be deleted, e.g. 120 days from rollover

9.) Click the green Save policy button
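The same policy can also be created through the Elasticsearch ILM API instead of the Kibana form. The sketch below only builds the request body that mirrors the example values from the steps above (25 GB, 10 days, 120 days); the function name build_rollover_policy is hypothetical, and sending the request is left to the HTTP client of your choice.

```python
import json


def build_rollover_policy(max_size="25gb", max_age="10d", delete_after="120d"):
    """Return an ILM policy body with size/time rollover and a delete phase."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Steps 6 and 7: roll over on size or age.
                        "rollover": {"max_size": max_size, "max_age": max_age},
                    }
                },
                # Step 8: with rollover enabled, min_age counts from rollover.
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


if __name__ == "__main__":
    # Body that could be sent as: PUT _ilm/policy/lm_
    print(json.dumps(build_rollover_policy(), indent=2))
```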

Use the policy in index template

Modify index template(s)

Add the following lines to the JSON index template:

"settings": {
  "index": {
    "lifecycle": {
      "name": "lm_",
      "rollover_alias": "lm_"
    }
  }
},

Kibana

Kibana version 7.x can be used to link the ILM policy with an Elasticsearch index template.

1.) Open Kibana

2.) Click Management in the left menu

3.) In the ElasticSearch section, click on Index Management

4.) At the top, select Index Templates

5.) Select your desired index template, e.g. lm_

6.) Click on Edit

7.) On the Settings screen, add:

{
  "index": {
    "lifecycle": {
      "name": "lm_",
      "rollover_alias": "lm_"
    }
  }
}

8.) Click on Save
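When templates are maintained as JSON files rather than edited in Kibana, the same lifecycle settings can be merged in programmatically. The following is a minimal sketch, assuming the template keeps its settings in the nested "settings" -> "index" form shown above; the helper name attach_ilm and the sample template are hypothetical.

```python
import copy


def attach_ilm(template, policy_name="lm_", rollover_alias="lm_"):
    """Return a copy of the template with the ILM lifecycle settings added."""
    updated = copy.deepcopy(template)  # leave the caller's template untouched
    index_settings = updated.setdefault("settings", {}).setdefault("index", {})
    index_settings["lifecycle"] = {
        "name": policy_name,
        "rollover_alias": rollover_alias,
    }
    return updated


if __name__ == "__main__":
    # Hypothetical template used only for illustration.
    sample = {"index_patterns": ["lm_*"], "settings": {"index": {"number_of_shards": 1}}}
    print(attach_ilm(sample))
```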

Create a new index which will utilize the latest index template

Using Postman or Kibana, send the following HTTP request to the Elasticsearch instance you are using:

PUT lm_tenant-000001
{
  "aliases": {
    "lm_": {
      "is_write_index": true
    }
  }
}

The ILM policy then uses the alias to route data to the proper Elasticsearch index, so pumps do not have to care about the index number.

Warning

The index prefix and the index number for ILM rollover must be separated by a hyphen (-000001), not an underscore (_000001)!
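The naming rule can be checked before the first index is created. Elasticsearch rollover expects the index name to match the pattern ^.*-\d+$, which the sketch below verifies while building the bootstrap request shown above. The function names first_index_name and bootstrap_request are hypothetical.

```python
import re

# Rollover requires the index name to end with a hyphen followed by digits.
ROLLOVER_NAME = re.compile(r"^.*-\d+$")


def first_index_name(prefix):
    """Build the first index name, e.g. lm_tenant -> lm_tenant-000001."""
    return "{}-{:06d}".format(prefix, 1)


def bootstrap_request(prefix, alias="lm_"):
    """Return (method, path, body) for creating the first write index."""
    name = first_index_name(prefix)
    if not ROLLOVER_NAME.match(name):
        raise ValueError("index name is not usable for rollover: " + name)
    body = {"aliases": {alias: {"is_write_index": True}}}
    return "PUT", "/" + name, body


if __name__ == "__main__":
    print(bootstrap_request("lm_tenant"))
    # An underscore separator does not satisfy the rollover pattern.
    print(bool(ROLLOVER_NAME.match("lm_tenant_000001")))
```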

Note

Make sure there is no index prefix configured in the source code, for example in ElasticSearchSink in the pipeline. Configuration in the code would override the configuration in the file.

Elasticsearch backup and restore

Snapshots

Snapshots are located under Stack Management -> Snapshot and Restore and are stored in the repository location. The structure is as follows: the snapshot itself is just a pointer to the indices it contains. The indices themselves are stored in a separate directory, and they are stored incrementally. This means that if you create a snapshot every day, older indices are only referenced again in the new snapshot, while only the new indices are actually copied to the backup directory.

Repositories

First, the snapshot repository needs to be set up. Specify the location where the snapshot repository resides, /backup/elasticsearch for instance. This path needs to be accessible from all nodes in the cluster. When Elasticsearch runs in Docker, this includes mounting the location inside the Docker containers and restarting them.
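For a shared file system repository, the setup can be sketched as follows: the location is listed under path.repo in elasticsearch.yml on every node, and the repository is then registered through the snapshot API. The repository name lmio_backup and the helper name repository_setup are assumptions made for illustration.

```python
import json
import posixpath


def repository_setup(name, location="/backup/elasticsearch"):
    """Return the config line and the request for registering an fs repository."""
    if not posixpath.isabs(location):
        raise ValueError("repository location must be an absolute path")
    return {
        # Line to add to elasticsearch.yml on every node (restart required).
        "elasticsearch_yml": "path.repo: {}".format(json.dumps([location])),
        # Request that registers the repository.
        "request": ("PUT", "/_snapshot/" + name),
        "body": {"type": "fs", "settings": {"location": location}},
    }


if __name__ == "__main__":
    print(json.dumps(repository_setup("lmio_backup"), indent=2))
```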

Policies

To begin taking snapshots, a policy needs to be created. The policy determines the naming prefix of the snapshots it creates, the repository it uses, a schedule, and the indices to include (defined using patterns or specific index names, lmio-mpsv-events-* for instance).

Furthermore, the policy can specify whether to ignore unavailable indices, allow partial indices, and include the global state. Whether to use these options depends on the specific case in which the snapshot policy is used; they are not recommended by default.

There are also settings for automatically deleting snapshots and defining their expiration. These also depend on the specific policy. The snapshots themselves, however, are very small in size when they do not include the global state, which is to be expected, since they are just pointers to a different place where the actual index data are stored.
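The policy options described above map onto the body of a snapshot lifecycle management (SLM) request. The sketch below builds such a body; the daily 01:30 schedule, the snapshot prefix, the repository name and the 30-day expiration are assumptions chosen for illustration, and build_slm_policy is a hypothetical helper.

```python
import json


def build_slm_policy(prefix, repository, indices,
                     schedule="0 30 1 * * ?", expire_after=None):
    """Return an SLM policy body for scheduled snapshots."""
    body = {
        "schedule": schedule,  # cron expression, here daily at 01:30
        # Date math name, e.g. <lmio-{now/d}>, so each snapshot gets a date.
        "name": "<{}-{{now/d}}>".format(prefix),
        "repository": repository,
        "config": {
            "indices": list(indices),
            # The three optional switches, all left off by default.
            "ignore_unavailable": False,
            "partial": False,
            "include_global_state": False,
        },
    }
    if expire_after:
        body["retention"] = {"expire_after": expire_after}
    return body


if __name__ == "__main__":
    # Body that could be sent as: PUT _slm/policy/<policy id>
    policy = build_slm_policy("lmio", "lmio_backup", ["lmio-mpsv-events-*"],
                              expire_after="30d")
    print(json.dumps(policy, indent=2))
```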

Restoring a snapshot

To restore a snapshot, select the snapshot containing the index or indices you wish to bring back and select "Restore". You then specify whether to restore all indices contained in the snapshot or only some of them. You can rename the restored indices, restore partially snapshotted indices, and modify index settings or reset them to their defaults while restoring. The indices are then restored into the cluster as specified.
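The same choices are available through the restore API. As a minimal sketch, the function below builds the body for a POST /_snapshot/<repository>/<snapshot>/_restore request; the helper name build_restore_body, the restored_ prefix and the sample index names are assumptions.

```python
import json


def build_restore_body(indices, rename_prefix=None,
                       index_settings=None, reset_settings=None):
    """Return a restore request body for the selected indices."""
    body = {
        "indices": ",".join(indices),   # restore only these indices
        "include_global_state": False,
    }
    if rename_prefix:
        # Restore under a new name, e.g. idx -> restored_idx.
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = rename_prefix + "$1"
    if index_settings:
        # Settings to override on the restored indices.
        body["index_settings"] = dict(index_settings)
    if reset_settings:
        # Settings to drop so that they fall back to defaults.
        body["ignore_index_settings"] = list(reset_settings)
    return body


if __name__ == "__main__":
    body = build_restore_body(
        ["lmio-mpsv-events-000001"],
        rename_prefix="restored_",
        index_settings={"index.number_of_replicas": 0},
        reset_settings=["index.refresh_interval"],
    )
    print(json.dumps(body, indent=2))
```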

Caveats

When deleting snapshots, bear in mind that backed-up indices must be covered by a snapshot in order to be restorable. For example, if you clear some indices from the cluster and then delete the snapshot that contained the reference to those indices, you will be unable to restore them.
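Before deleting a snapshot, it can help to check which indices would no longer be restorable. The sketch below works on plain Python data, assuming you have already collected the snapshot contents and the list of indices still present in the cluster; the function name and the sample index names are hypothetical.

```python
def unrestorable_after_delete(snapshots, live_indices, snapshot_to_delete):
    """Return indices that would be lost if the given snapshot were deleted.

    snapshots maps a snapshot name to the indices it references.
    live_indices lists the indices still present in the cluster.
    """
    still_covered = set()
    for name, indices in snapshots.items():
        if name != snapshot_to_delete:
            still_covered.update(indices)
    at_risk = set(snapshots[snapshot_to_delete]) - still_covered - set(live_indices)
    return sorted(at_risk)


if __name__ == "__main__":
    snapshots = {
        "snap-1": ["events-000001", "events-000002"],
        "snap-2": ["events-000002", "events-000003"],
    }
    # events-000001 was already cleared from the cluster.
    print(unrestorable_after_delete(snapshots, ["events-000003"], "snap-1"))
```

In this example only events-000001 is reported, because events-000002 is still referenced by snap-2.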