ElasticSearch Setting¶
Index Templates¶
An index template must be present in ElasticSearch before any data are loaded, so that the proper data type is assigned to every field.
This is especially important for time-based fields, which would not work without an index template and could not be used for sorting or for creating index patterns in Kibana.
The ElasticSearch index template should be present in the site- repository under the name es_index_template.json.
To insert the index template through PostMan or Kibana, send the following HTTP request to the ElasticSearch instance you are using:
PUT _template/lmio-
{
  //Deploy to <SPECIFY_WHERE_TO_DEPLOY_THE_TEMPLATE>
  "index_patterns" : ["lmio-*"],
  "version": 200721, // Increase this with every release
  "order" : 9999998, // Decrease this with every release
  "settings": {
    "index": {
      "lifecycle": {
        "name": "lmio-",
        "rollover_alias": "lmio-"
      }
    }
  },
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date", "format": "strict_date_optional_time||epoch_millis" },
      "rt": { "type": "date", "format": "strict_date_optional_time||epoch_second" },
      ...
    }
  }
}
The body of the request is the content of es_index_template.json.
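The template shown above contains `//` comments, which a strict JSON parser rejects. As a minimal sketch, assuming the file is otherwise valid JSON, the comments could be stripped and the PUT request prepared in Python as follows. The function names, the base URL and the way the file is read are illustrative assumptions, not part of LogMan.io.

```python
import json
import urllib.request


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip everything up to (not including) the end of the line.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def build_template_request(base_url: str, template_name: str, template_text: str):
    """Prepare (but do not send) the PUT _template/<name> request."""
    body = json.loads(strip_line_comments(template_text))
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_template/{template_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

The prepared request could then be sent with `urllib.request.urlopen(request)`, for example against a hypothetical local instance at `http://localhost:9200`, after reading the text of `es_index_template.json` from the repository.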
Index Lifecycle Management¶
Index Lifecycle Management (ILM) in ElasticSearch automatically closes or deletes old indices (e.g. those with data older than three months), which keeps search performance stable and leaves storage free for current data. These rules are defined in a so-called ILM policy.
ILM should be set up before data are pumped into ElasticSearch, so that each new index finds and associates itself with the proper ILM policy. For more information, please refer to the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-index-lifecycle-management.html
LogMan.io components such as Dispatcher then use a specified ILM alias (lm_), and ElasticSearch automatically puts the data into the proper index governed by the ILM policy.
The setup is done as follows:
Create the ILM policy¶
Kibana¶
Kibana version 7.x can be used to create an ILM policy in ElasticSearch.
1.) Open Kibana
2.) Click Management in the left menu
3.) In the ElasticSearch section, click on Index Lifecycle Policies
4.) Click the blue Create policy button
5.) Enter the policy name, which should be the same as the index prefix, e.g. lm_
6.) Set the maximum index size to the desired rollover size, e.g. 25 GB (size rollover)
7.) Set the maximum age of the index, e.g. 10 days (time rollover)
8.) Toggle the switch at the Delete phase near the bottom of the screen, and enter the time after which the index should be deleted, e.g. 120 days from rollover
9.) Click the green Save policy button
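The same policy can also be expressed as a request body for the ElasticSearch ILM API (`PUT _ilm/policy/<name>`) instead of clicking through Kibana. The following is a minimal sketch that uses the example values from the steps above (25 GB, 10 days, 120 days); the function name is an illustrative assumption.

```python
import json


def build_ilm_policy(max_size="25gb", max_age="10d", delete_after="120d"):
    """Build an ILM policy body with size/time rollover and a delete phase."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when either limit is reached first.
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "delete": {
                    # Measured from the rollover of the index.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    # The policy name lm_ follows the example in step 5.
    print("PUT _ilm/policy/lm_")
    print(json.dumps(build_ilm_policy(), indent=2))
```

The printed request can be pasted into the Kibana Dev Tools console or sent through PostMan.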
Use the policy in index template¶
Modify index template(s)¶
Add the following lines to the JSON index template:
"settings": {
  "index": {
    "lifecycle": {
      "name": "lmio-",
      "rollover_alias": "lmio-"
    }
  }
},
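If the template is maintained programmatically, the lifecycle block can be merged into an already parsed template without touching its other settings. This is a minimal sketch; the helper name is hypothetical, and the template is assumed to be a Python dictionary loaded from the JSON file.

```python
import copy


def with_lifecycle(template: dict, policy_name: str, rollover_alias: str) -> dict:
    """Return a copy of the template with index.lifecycle settings applied."""
    result = copy.deepcopy(template)
    # Create "settings" and "index" if missing, keep existing keys otherwise.
    index_settings = result.setdefault("settings", {}).setdefault("index", {})
    index_settings["lifecycle"] = {
        "name": policy_name,
        "rollover_alias": rollover_alias,
    }
    return result
```

Working on a copy keeps the original dictionary unchanged, which makes it easy to compare the template before and after the change.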
Kibana¶
Kibana version 7.x can be used to link the ILM policy with the ES index template.
1.) Open Kibana
2.) Click Management in the left menu
3.) In the ElasticSearch section, click on Index Management
4.) At the top, select Index Template
5.) Select the desired index template, e.g. lmio-
6.) Click on Edit
7.) On the Settings screen, add:
{
  "index": {
    "lifecycle": {
      "name": "lmio-",
      "rollover_alias": "lmio-"
    }
  }
}
8.) Click on Save
Create a new index which will utilize the latest index template¶
Through PostMan or Kibana, send the following HTTP request to the ElasticSearch instance you are using:
PUT lmio-tenant-events-000001
{
  "aliases": {
    "lmio-tenant-events": {
      "is_write_index": true
    }
  }
}
The alias always points to the current write index, which ILM advances at each rollover, so the pumps do not need to track the index number.
Note: The prefix and the number of the index for ILM rollover must be separated by a hyphen (-000001), not by an underscore (_000001)!
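The naming rule in the note can be checked before the first index is created. ElasticSearch rollover expects the index name to end with a hyphen followed by a number, and it names the next index by incrementing that number, zero-padded to six digits. The sketch below illustrates this under the assumption that the helper names are free to choose; they are not part of any library.

```python
import re

# Rollover requires names of the form <prefix>-<number>.
_ROLLOVER_NAME = re.compile(r"^(?P<prefix>.*)-(?P<number>\d+)$")


def is_rollover_ready(index_name: str) -> bool:
    """True if the index name ends with '-' followed by digits."""
    return _ROLLOVER_NAME.match(index_name) is not None


def next_rollover_name(index_name: str) -> str:
    """Name that a rollover would give to the following index."""
    match = _ROLLOVER_NAME.match(index_name)
    if match is None:
        raise ValueError(f"not a rollover-compatible index name: {index_name}")
    return f"{match['prefix']}-{int(match['number']) + 1:06d}"


def bootstrap_body(alias: str) -> dict:
    """Request body that creates the first index as the alias write index."""
    return {"aliases": {alias: {"is_write_index": True}}}
```

For example, `is_rollover_ready("lmio-tenant-events_000001")` is false, because the number is separated by an underscore.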
Configure other LogMan.io components¶
The pumps may now use the ILM policy through the created alias, which in the case above is lm_tenant. The configuration file should then look like this:
[pipeline:<PIPELINE>:ElasticSearchSink]
index_prefix=lm_tenant
doctype=_doc
The pump will always write data to the lm_tenant alias, and ILM will take care of assigning them to the proper index, e.g. lm_-000001.
Note: Make sure that no index prefix is configured in the source code, e.g. in the ElasticSearchSink of the pipeline. A value configured in code would override the one in the configuration file.
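Before starting a pump, it can be useful to verify which index prefix the configuration file actually sets. The sketch below assumes that the file uses the INI syntax understood by Python's standard `configparser` module; the pipeline name `ExamplePipeline` stands in for `<PIPELINE>` and is purely hypothetical.

```python
import configparser

# Hypothetical configuration text; ExamplePipeline replaces <PIPELINE>.
EXAMPLE_CONFIG = """
[pipeline:ExamplePipeline:ElasticSearchSink]
index_prefix=lm_tenant
doctype=_doc
"""


def sink_settings(config_text: str, pipeline_id: str) -> dict:
    """Return the ElasticSearchSink options configured for one pipeline."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = f"pipeline:{pipeline_id}:ElasticSearchSink"
    if not parser.has_section(section):
        raise KeyError(section)
    return dict(parser[section])


if __name__ == "__main__":
    print(sink_settings(EXAMPLE_CONFIG, "ExamplePipeline"))
```

This only inspects the file. It cannot detect a prefix that is hard-coded in the pipeline source, which is the case the note above warns about.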
Hot-Warm-Cold architecture (HWC)¶
HWC is an extension of the standard index rotation provided by ElasticSearch ILM, and it is a good tool for managing time series data. The HWC architecture lets us allocate specific nodes to each of the phases. When used correctly, together with a matching cluster architecture, it makes the most of the available hardware.
Hot¶
There is usually some period of time (a week, a month, etc.) during which we want to query the indices heavily, aiming for speed rather than for conservation of memory and other resources. This is where the “Hot” phase comes in handy: it lets us keep the index with more replicas, spread out and accessible on more nodes, for an optimal user experience.
Hot nodes¶
Hot nodes should use the fastest parts of the available hardware, with the most CPUs and the fastest I/O.
Warm¶
Once this period is over and the indices are no longer queried as often, it pays to move them to the “Warm” phase. This lets us reduce the number of nodes (or move to nodes with fewer resources available) and the number of index replicas, which lessens the hardware load while still keeping searches reasonably fast.
Warm nodes¶
Warm nodes, as the name suggests, sit at the crossroads: they serve mainly for storage, while still retaining some CPU power to handle occasional queries.
Cold¶
Sometimes, there are reasons to store data for extended periods of time (dictated by law, or some internal rule). The data are not expected to be queried, but at the same time, they cannot be deleted just yet.
Cold nodes¶
This is where the Cold nodes come in. There may be only a few of them, with little CPU resources; they have no need for SSD drives and are perfectly fine with slower (and optionally larger) storage.
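The three phases described above can be sketched as a single ILM policy that uses the `allocate` action to move indices between node groups. This is a minimal sketch under loud assumptions: the nodes are assumed to be tagged with a custom attribute (for example `node.attr.data: hot`, `warm` or `cold` in `elasticsearch.yml`), and all ages and replica counts are placeholder values, not recommendations.

```python
import json


def build_hwc_policy(warm_after="7d", cold_after="30d", delete_after="120d"):
    """Build a hot-warm-cold ILM policy body (all values are assumptions)."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {"max_size": "25gb", "max_age": "10d"}
                    }
                },
                "warm": {
                    "min_age": warm_after,
                    "actions": {
                        # Fewer replicas, moved to nodes tagged data=warm.
                        "allocate": {
                            "number_of_replicas": 1,
                            "require": {"data": "warm"},
                        }
                    },
                },
                "cold": {
                    "min_age": cold_after,
                    "actions": {
                        # Kept only for retention on nodes tagged data=cold.
                        "allocate": {
                            "number_of_replicas": 0,
                            "require": {"data": "cold"},
                        }
                    },
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_hwc_policy(), indent=2))
```

For new indices to start on the hot nodes, the index template would additionally need a matching allocation setting such as `index.routing.allocation.require.data: hot`, again assuming the same custom node attribute.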
Conclusion¶
Using the HWC ILM feature to its full effect requires some preparation, so it should be considered when building the production ElasticSearch cluster. The added value, however, can be very high, depending on the specific use cases.