Data Storage¶
TeskaLabs LogMan.io operates with several different storage tiers in order to deliver optimal data isolation, performance, and cost.
Data storage structure¶
Schema: Recommended structure of the data storage.
Fast data storage¶
Fast data storage (also known as the 'hot' tier) contains the freshest logs and other events received by TeskaLabs LogMan.io. We recommend using the fastest available storage class for the best throughput and search performance. The real-time component (Apache Kafka) also uses the fast data storage for stream persistence.
- Recommended time span: one day to one week
- Recommended size: 2TB - 4TB
- Recommended redundancy: RAID 1, additional redundancy is provided by the application layer
- Recommended hardware: NVMe SSD PCIe 4.0 and better
- Fast data storage physical devices MUST BE managed by mdadm
- Mount point: `/data/ssd`
- Filesystem: EXT4; the `noatime` mount option is recommended for optimum performance (see the check below)
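A quick way to verify that the `noatime` option is active on the mounted filesystem (an illustrative check; `findmnt` is part of util-linux):
findmnt -o TARGET,SOURCE,FSTYPE,OPTIONS /data/ssd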
Backup strategy¶
Incoming events (logs) are copied into the archive storage once they enter TeskaLabs LogMan.io. This means that there is always a way to "replay" events back into TeskaLabs LogMan.io if needed. Data are also replicated to other nodes of the cluster immediately after arrival. For this reason, traditional backup is not recommended but possible.
The restoration is handled by the cluster components by replicating the data from other nodes of the cluster.
Example
/data/ssd/kafka-1
/data/ssd/elasticsearch/es-master
/data/ssd/elasticsearch/es-hot1
/data/ssd/zookeeper-1
/data/ssd/influxdb-2
...
Slow data storage¶
The slow data storage contains data that do not have to be quickly accessed, typically older logs and events, such as warm and cold indices for ElasticSearch.
- Recommended redundancy: software RAID 6 or RAID 5; RAID 0 for virtualized/cloud instances with underlying storage redundancy
- Recommended hardware: Cost-effective hard drives, SATA 2/3+, SAS 1/2/3+
- Typical size: tens of TB, e.g. 18TB
- Controller card: SATA or HBA SAS (IT Mode)
- Slow data storage physical devices MUST BE managed by software RAID (mdadm)
- Mount point: `/data/hdd`
- Filesystem: EXT4; the `noatime` mount option is recommended for optimum performance
Calculation of the cluster capacity¶
This is the formula for calculating the total available cluster capacity on the slow data storage:
total = (disks-raid) * capacity * servers / replica
- `disks` is the number of slow data storage disks per server
- `raid` is the RAID overhead: 1 for RAID 5, 2 for RAID 6
- `capacity` is the capacity of a single slow data storage disk
- `servers` is the number of servers
- `replica` is the replication factor in ElasticSearch
Example
(6[disks]-2[raid6]) * 18TB[capacity] * 3[servers] / 2[replica] = 108TB
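The same calculation can be expressed as a small shell snippet, using the values from the example above (illustrative only):
# 6 disks per server, RAID 6 overhead of 2, 18 TB disks, 3 servers, replication factor 2
disks=6; raid=2; capacity=18; servers=3; replica=2
echo "$(( (disks - raid) * capacity * servers / replica )) TB"   # prints "108 TB"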
Backup strategy¶
The data stored on the slow data storage are ALWAYS replicated to other nodes of the cluster and also stored in the archive. For this reason, traditional backup is not recommended but possible (consider the huge size of the slow storage).
The restoration is handled by the cluster components by replicating the data from other nodes of the cluster.
Example
/data/hdd/elasticsearch/es-warm01
/data/hdd/elasticsearch/es-warm02
/data/hdd/elasticsearch/es-cold01
/data/hdd/mongo-2
/data/hdd/nginx-1
...
Large slow data storage strategy¶
If your slow data storage will be larger than 50 TB, we recommend employing HBA SAS controllers, SAS expanders, and JBOD as the optimal strategy for scaling slow data storage. SAS storage connectivity can be daisy-chained, enabling a large number of drives to be connected. External JBOD chassis can also be connected over SAS to provide housing for additional drives.
RAID 6 vs RAID 5¶
RAID 6 and RAID 5 are both types of RAID (redundant array of independent disks) that use data striping and parity to provide data redundancy and increased performance.
RAID 5 uses striping across multiple disks, with a single parity block calculated across all the disks. If one disk fails, the data can still be reconstructed using the parity information. However, the data is lost if a second disk fails before the first one has been replaced.
RAID 6, on the other hand, uses striping and two independent parity blocks, which are stored on separate disks. If two disks fail, the data can still be reconstructed using the parity information. RAID 6 provides an additional level of data protection compared to RAID 5. However, RAID 6 also increases the overhead and reduces the storage capacity because of the two parity blocks.
Regarding slow data storage, RAID 5 is generally considered less secure than RAID 6 because the log data is usually vital, and two disk failures could cause data loss. RAID 6 is best in this scenario as it can survive two disk failures and provide more data protection.
In RAID 5, the usable capacity is that of (N-1) disks, where N is the number of disks in the array, because the equivalent of one disk holds parity information, which is used to reconstruct the data in case of a single disk failure. For example, if you want to create a RAID 5 array with 54 TB of usable storage, you need at least four (4) disks with a capacity of at least 18 TB each.
In RAID 6, the usable capacity is that of (N-2) disks, because it uses two sets of parity information stored on separate disks. As a result, RAID 6 can survive the failure of up to two disks before data is lost. For example, if you want to create a RAID 6 array with 54 TB of usable storage, you need at least five (5) disks with a capacity of at least 18 TB each.
It is important to note that RAID 6 consumes more disk space because it uses two parity blocks, while RAID 5 uses only one; that is why RAID 6 requires an additional disk compared to RAID 5. In return, RAID 6 provides extra protection and can survive two disk failures.
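As a quick check of the examples above, the usable capacity works out as follows (illustrative shell arithmetic):
echo "RAID 5: $(( (4 - 1) * 18 )) TB"   # 4 disks of 18 TB -> 54 TB usable
echo "RAID 6: $(( (5 - 2) * 18 )) TB"   # 5 disks of 18 TB -> 54 TB usable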
It is worth mentioning that the data in slow data storage are replicated across the cluster (if applicable) to provide additional data redundancy.
Tip
Use the Online RAID Calculator to calculate storage requirements.
System storage¶
The system storage is dedicated to the operating system, software installations, and configurations. No operational data are stored on the system storage. Installations on virtualization platforms use commonly available, locally redundant disk space.
- Recommended size: 250 GB and more
- Recommended hardware: two (2) local SSD disks in software RAID 1 (mirror), SATA 2/3+, SAS 1/2/3+
If applicable, the following storage partitioning is recommended (an illustrative provisioning sketch follows the list):
- EFI partition, mounted at `/boot/efi`, size 1 GB
- Swap partition, 64 GB
- Software RAID1 (mdadm) over the rest of the space
- Boot partition on RAID1, mounted at `/boot`, size 512 MB, ext4 filesystem
- LVM partition on RAID1, the rest of the available space, with volume group `systemvg`
- LVM logical volume `rootlv`, mounted at `/`, size 50 GB, ext4 filesystem
- LVM logical volume `loglv`, mounted at `/var/log`, size 50 GB, ext4 filesystem
- LVM logical volume `dockerlv`, mounted at `/var/lib/docker`, size 100 GB, ext4 filesystem (if applicable)
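As an illustration only, the LVM part of this layout could be provisioned roughly as follows. The RAID device name `/dev/md0` is an assumption and must be adjusted to the actual installation:
# Assumes /dev/md0 is the software RAID1 partition reserved for LVM
pvcreate /dev/md0                      # register the RAID1 array as an LVM physical volume
vgcreate systemvg /dev/md0             # create the volume group from the layout above
lvcreate -L 50G -n rootlv systemvg     # logical volume for /
lvcreate -L 50G -n loglv systemvg      # logical volume for /var/log
lvcreate -L 100G -n dockerlv systemvg  # logical volume for /var/lib/docker (if applicable)
mkfs.ext4 /dev/systemvg/rootlv
mkfs.ext4 /dev/systemvg/loglv
mkfs.ext4 /dev/systemvg/dockerlv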
Backup strategy for the system storage¶
It is recommended to periodically back up all filesystems on the system storage so that they can be used to restore the installation when needed. The backup strategy is compatible with most common backup technologies on the market.
- Recovery Point Objective (RPO): full backup once per week or after major maintenance work, incremental backup once per day.
- Recovery Time Objective (RTO): 12 hours.
Note
The recommended RPO and RTO assume a highly available setup of the LogMan.io cluster, i.e. three or more nodes, so that the complete downtime of a single node doesn't impact service availability.
Archive data storage¶
Archive data storage is recommended but optional. It serves very long data retention periods and redundancy purposes, and it represents an economical way of storing data long-term. Archived data are not available online in the cluster; they have to be restored back when needed, which involves a certain "time-to-data" interval.
Data are compressed when copied into the archive; the typical compression ratio is in the range from 1:10 to 1:2, depending on the nature of the logs.
Data are replicated into the archive storage after initial consolidation on the fast data storage, practically immediately after ingestion into the cluster.
- Recommended technologies: SAN / NAS / Cloud cold storage (AWS S3, MS Azure Storage)
- Mount point: `/data/archive` (if applicable)
Note
Public clouds can be used as archive data storage. Data encryption has to be enabled in such a case to protect the data from unauthorised access.
Dedicated archive nodes¶
For large archives, dedicated archive nodes (servers) are recommended. These nodes should use HBA SAS drive connectivity and storage-oriented OS distributions such as Unraid or TrueNAS.
Data Storage DON'Ts¶
- We DON'T recommend the use of NAS / SAN storage for the data storage tiers
- We DON'T recommend the use of hardware RAID controllers and similar devices for the data storage tiers
The storage administration¶
This chapter provides a practical example of configuring the storage for TeskaLabs LogMan.io. You don't need to configure or manage the LogMan.io storage unless you have a specific reason to do so; LogMan.io is delivered in a fully configured state.
Assuming the following hardware configuration:
- SSD drives for the fast data storage: `/dev/nvme0n1`, `/dev/nvme1n1`
- HDD drives for the slow data storage: `/dev/sde`, `/dev/sdf`, `/dev/sdg`
Tip
Use the `lsblk` command to monitor the actual status of the storage devices.
Create a software RAID1 for a fast data storage¶
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mkfs.ext4 /dev/md2
mkdir -p /data/ssd
Add mount points into `/etc/fstab`:
/dev/md2 /data/ssd ext4 defaults,noatime 0 2
Mount data storage filesystems:
mount /data/ssd
Tip
Use `cat /proc/mdstat` to check the state of the software RAID.
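Depending on the distribution, you may also want to persist the array definition so that it is assembled automatically at boot. A sketch for Debian-based systems (the file path is an assumption, adjust as needed); the same applies to the slow data storage array created below:
mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf   # record the array definition
update-initramfs -u                                    # refresh the initramfs so the array assembles at boot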
Create a software RAID5 for a slow data storage¶
mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sde /dev/sdf /dev/sdg
mkfs.ext4 /dev/md1
mkdir -p /data/hdd
Note
For RAID6, use `--level=6`.
Add mount points into `/etc/fstab`:
/dev/md1 /data/hdd ext4 defaults,noatime 0 2
Mount data storage filesystems:
mount /data/hdd
Grow the size of a data storage¶
With ever-increasing data volumes, it is highly likely that you will need to grow (i.e. extend) the data storage, either the fast or the slow one. This is done by adding a new data volume (e.g. a physical disk or a virtual volume) to the machine, or, on some virtualization solutions, by growing an existing volume.
Note
The data storage can be extended without any downtime.
Slow data storage grow example¶
Assuming that you want to add a new disk `/dev/sdh` to the slow data storage array `/dev/md1`:
mdadm --add /dev/md1 /dev/sdh
The new disk is added as a spare device.
You can check the state of the RAID array by:
cat /proc/mdstat
The `(S)` behind a device name marks a spare device.
Then grow the RAID onto the spare device:
mdadm --grow --raid-devices=4 /dev/md1
The number `4` needs to be adjusted to reflect the actual RAID setup (the total number of devices in the array after growing).
Grow the filesystem:
resize2fs /dev/md1
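The reshape runs in the background and can take many hours on large arrays. An illustrative way to watch its progress and to confirm the new capacity afterwards:
watch cat /proc/mdstat     # shows the reshape progress
mdadm --detail /dev/md1    # reports the array size and the number of devices
df -h /data/hdd            # confirms the grown filesystem size after resize2fs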