Data Storage¶
TeskaLabs LogMan.io operates with several different storage tiers in order to deliver optimal data isolation, performance, and cost.
Data storage structure¶
Schema: Recommended structure of the data storage.
Fast data storage¶
Fast data storage (also known as the 'hot' tier) contains the freshest logs and other events received by TeskaLabs LogMan.io. We recommend using the fastest available storage class for the best throughput and search performance. The real-time component (Apache Kafka) also uses the fast data storage for stream persistence.
- Recommended time span: one day to one week
- Recommended size: 2TB - 4TB
- Recommended redundancy: RAID 1, additional redundancy is provided by the application layer
- Recommended hardware: NVMe SSD PCIe 4.0 and better
- Fast data storage physical devices MUST BE managed by mdadm
- Mount point: `/data/ssd`
- Filesystem: EXT4; the `noatime` mount option is recommended for optimum performance (see the check below)
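A quick way to verify that the `noatime` option is active on the mounted filesystem (an illustrative check; `findmnt` is part of util-linux):
findmnt -o TARGET,SOURCE,FSTYPE,OPTIONS /data/ssd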
Backup strategy¶
Incoming events (logs) are copied into the archive storage once they enter TeskaLabs LogMan.io. This means that there is always a way to "replay" events back into TeskaLabs LogMan.io if needed. Data are also replicated to other nodes of the cluster immediately after arrival. For this reason, traditional backup is not recommended but possible.
The restoration is handled by the cluster components by replicating the data from other nodes of the cluster.
Example
/data/ssd/kafka-1
/data/ssd/elasticsearch/es-master
/data/ssd/elasticsearch/es-hot1
/data/ssd/zookeeper-1
/data/ssd/influxdb-2
...
Slow data storage¶
The slow data storage contains data that do not have to be quickly accessed, typically older logs and events, such as warm and cold indices for ElasticSearch.
- Recommended redundancy: software RAID 6 or RAID 5; RAID 0 for virtualized/cloud instances with underlying storage redundancy
- Recommended hardware: Cost-effective hard drives, SATA 2/3+, SAS 1/2/3+
- Typical size: tens of TB, e.g. 18TB
- Controller card: SATA or HBA SAS (IT Mode)
- Slow data storage physical devices MUST BE managed by software RAID (mdadm)
- Mount point: `/data/hdd`
- Filesystem: EXT4; the `noatime` mount option is recommended for optimum performance
Calculation of the cluster capacity¶
This is the formula for calculating the total available cluster capacity on the slow data storage:
total = (disks-raid) * capacity * servers / replica
- `disks` is the number of slow data storage disks per server
- `raid` is the RAID overhead: 1 for RAID 5, 2 for RAID 6
- `capacity` is the capacity of a single slow data storage disk
- `servers` is the number of servers
- `replica` is the replication factor in ElasticSearch
Example
(6[disks]-2[raid6]) * 18TB[capacity] * 3[servers] / 2[replica] = 108TB
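The same calculation can be expressed as a small shell snippet, using the values from the example above (illustrative only):
# 6 disks per server, RAID 6 overhead of 2, 18 TB disks, 3 servers, replication factor 2
disks=6; raid=2; capacity=18; servers=3; replica=2
echo "$(( (disks - raid) * capacity * servers / replica )) TB"   # prints "108 TB"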
Backup strategy¶
The data stored on the slow data storage are ALWAYS replicated to other nodes of the cluster and also stored in the archive. For this reason, traditional backup is not recommended but possible (consider the huge size of the slow storage).
The restoration is handled by the cluster components by replicating the data from other nodes of the cluster.
Example
/data/hdd/elasticsearch/es-warm01
/data/hdd/elasticsearch/es-warm02
/data/hdd/elasticsearch/es-cold01
/data/hdd/mongo-2
/data/hdd/nginx-1
...
Large slow data storage strategy¶
If your slow data storage will be larger than 50 TB, we recommend employing HBA SAS controllers, SAS expanders, and JBOD as the optimal strategy for scaling slow data storage. SAS storage connectivity can be daisy-chained, enabling a large number of drives to be connected. External JBOD chassis can also be connected over SAS to provide housing for additional drives.
RAID 6 vs RAID 5¶
RAID 6 and RAID 5 are both types of RAID (redundant array of independent disks) that use data striping and parity to provide data redundancy and increased performance.
RAID 5 uses striping across multiple disks, with a single parity block calculated across all the disks. If one disk fails, the data can still be reconstructed using the parity information. However, the data is lost if a second disk fails before the first one has been replaced.
RAID 6, on the other hand, uses striping and two independent parity blocks, which are stored on separate disks. If two disks fail, the data can still be reconstructed using the parity information. RAID 6 provides an additional level of data protection compared to RAID 5. However, RAID 6 also increases the overhead and reduces the storage capacity because of the two parity blocks.
Regarding slow data storage, RAID 5 is generally considered less secure than RAID 6 because the log data is usually vital, and two disk failures could cause data loss. RAID 6 is best in this scenario as it can survive two disk failures and provide more data protection.
In RAID 5, the usable capacity is that of (N-1) disks, where N is the number of disks in the array, because the equivalent of one disk holds parity information, which is used to reconstruct the data in case of a single disk failure. For example, if you want to create a RAID 5 array with 54 TB of usable storage, you need at least four (4) disks with a capacity of at least 18 TB each.
In RAID 6, the usable capacity is that of (N-2) disks, because it uses two sets of parity information stored on separate disks. As a result, RAID 6 can survive the failure of up to two disks before data is lost. For example, if you want to create a RAID 6 array with 54 TB of usable storage, you need at least five (5) disks with a capacity of at least 18 TB each.
It is important to note that RAID 6 consumes more disk space because it uses two parity blocks, while RAID 5 uses only one; that is why RAID 6 requires an additional disk compared to RAID 5. In return, RAID 6 provides extra protection and can survive two disk failures.
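As a quick check of the examples above, the usable capacity works out as follows (illustrative shell arithmetic):
echo "RAID 5: $(( (4 - 1) * 18 )) TB"   # 4 disks of 18 TB -> 54 TB usable
echo "RAID 6: $(( (5 - 2) * 18 )) TB"   # 5 disks of 18 TB -> 54 TB usable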
It is worth mentioning that the data in slow data storage are replicated across the cluster (if applicable) to provide additional data redundancy.
Tip
Use the Online RAID Calculator to calculate storage requirements.
System storage¶
The system storage is dedicated to the operating system, software installations, and configurations. No operational data are stored on the system storage. Installations on virtualization platforms use commonly available, locally redundant disk space.
- Recommended size: 250 GB and more
- Recommended hardware: two (2) local SSD disks in software RAID 1 (mirror), SATA 2/3+, SAS 1/2/3+
If applicable, the following storage partitioning is recommended (an illustrative provisioning sketch follows the list):
- EFI partition, mounted at `/boot/efi`, size 1 GB
- Swap partition, 64 GB
- Software RAID1 (mdadm) over the rest of the space
- Boot partition on RAID1, mounted at `/boot`, size 512 MB, ext4 filesystem
- LVM partition on RAID1, the rest of the available space, with volume group `systemvg`
- LVM logical volume `rootlv`, mounted at `/`, size 50 GB, ext4 filesystem
- LVM logical volume `loglv`, mounted at `/var/log`, size 50 GB, ext4 filesystem
- LVM logical volume `dockerlv`, mounted at `/var/lib/docker`, size 100 GB, ext4 filesystem (if applicable)
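As an illustration only, the LVM part of this layout could be provisioned roughly as follows. The RAID device name `/dev/md0` is an assumption and must be adjusted to the actual installation:
# Assumes /dev/md0 is the software RAID1 partition reserved for LVM
pvcreate /dev/md0                      # register the RAID1 array as an LVM physical volume
vgcreate systemvg /dev/md0             # create the volume group from the layout above
lvcreate -L 50G -n rootlv systemvg     # logical volume for /
lvcreate -L 50G -n loglv systemvg      # logical volume for /var/log
lvcreate -L 100G -n dockerlv systemvg  # logical volume for /var/lib/docker (if applicable)
mkfs.ext4 /dev/systemvg/rootlv
mkfs.ext4 /dev/systemvg/loglv
mkfs.ext4 /dev/systemvg/dockerlv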
Backup strategy for the system storage¶
It is recommended to periodically back up all filesystems on the system storage so that they can be used to restore the installation when needed. The backup strategy is compatible with most common backup technologies on the market.
- Recovery Point Objective (RPO): full backup once per week or after major maintenance work, incremental backup once per day.
- Recovery Time Objective (RTO): 12 hours.
Note
The recommended RPO and RTO assume a highly available setup of the LogMan.io cluster, i.e. three or more nodes, so that the complete downtime of a single node doesn't impact service availability.
Archive data storage¶
Archive data storage is recommended but optional. It serves very long data retention periods and redundancy purposes, and it represents an economical way of storing data long-term. Archived data are not available online in the cluster; they have to be restored back when needed, which involves a certain "time-to-data" interval.
Data are compressed when copied into the archive; the typical compression ratio is in the range from 1:10 to 1:2, depending on the nature of the logs.
Data are replicated into the archive storage after initial consolidation on the fast data storage, practically immediately after ingestion into the cluster.
- Recommended technologies: SAN / NAS / Cloud cold storage (AWS S3, MS Azure Storage)
- Mount point: `/data/archive` (if applicable)
Note
Public clouds can be used as archive data storage. Data encryption has to be enabled in such a case to protect the data from unauthorised access.
Dedicated archive nodes¶
For large archives, dedicated archive nodes (servers) are recommended. These nodes should use HBA SAS drive connectivity and storage-oriented OS distributions such as Unraid or TrueNAS.
Data Storage DON'Ts¶
- We DON'T recommend the use of NAS / SAN storage for the data storage tiers
- We DON'T recommend the use of hardware RAID controllers and similar devices for the data storage tiers
The storage administration¶
This chapter provides a practical example of configuring the storage for TeskaLabs LogMan.io. You don't need to configure or manage the LogMan.io storage unless you have a specific reason to do so; LogMan.io is delivered in a fully configured state.
Assuming the following hardware configuration:
- SSD drives for the fast data storage: `/dev/nvme0n1`, `/dev/nvme1n1`
- HDD drives for the slow data storage: `/dev/sde`, `/dev/sdf`, `/dev/sdg`
Tip
Use the `lsblk` command to monitor the actual status of the storage devices.
Create a software RAID1 for a fast data storage¶
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mkfs.ext4 /dev/md2
mkdir -p /data/ssd
Add mount points into `/etc/fstab`:
/dev/md2 /data/ssd ext4 defaults,noatime 0 2
Mount data storage filesystems:
mount /data/ssd
Tip
Use `cat /proc/mdstat` to check the state of the software RAID.
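Depending on the distribution, you may also want to persist the array definition so that it is assembled automatically at boot. A sketch for Debian-based systems (the file path is an assumption, adjust as needed); the same applies to the slow data storage array created below:
mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf   # record the array definition
update-initramfs -u                                    # refresh the initramfs so the array assembles at boot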
Create a software RAID5 for a slow data storage¶
mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sde /dev/sdf /dev/sdg
mkfs.ext4 /dev/md1
mkdir -p /data/hdd
Note
For RAID6, use `--level=6`.
Add mount points into `/etc/fstab`:
/dev/md1 /data/hdd ext4 defaults,noatime 0 2
Mount data storage filesystems:
mount /data/hdd
Grow the size of a data storage¶
With ever-increasing data volumes, it is highly likely that you will need to grow (i.e. extend) the data storage, either the fast or the slow one. This is done by adding a new data volume (e.g. a physical disk or a virtual volume) to the machine, or, on some virtualization solutions, by growing an existing volume.
Note
The data storage can be extended without any downtime.
Slow data storage grow example¶
Assuming that you want to add a new disk `/dev/sdh` to the slow data storage array `/dev/md1`:
mdadm --add /dev/md1 /dev/sdh
The new disk is added as a spare device.
You can check the state of the RAID array by:
cat /proc/mdstat
The `(S)` behind a device name marks a spare device.
Then grow the RAID onto the spare device:
mdadm --grow --raid-devices=4 /dev/md1
The number `4` needs to be adjusted to reflect the actual RAID setup (the total number of devices in the array after growing).
Grow the filesystem:
resize2fs /dev/md1
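The reshape runs in the background and can take many hours on large arrays. An illustrative way to watch its progress and to confirm the new capacity afterwards:
watch cat /proc/mdstat     # shows the reshape progress
mdadm --detail /dev/md1    # reports the array size and the number of devices
df -h /data/hdd            # confirms the grown filesystem size after resize2fs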