Data Storage

TeskaLabs LogMan.io operates several storage tiers in order to deliver optimal data isolation, performance, and cost.

Data storage structure

Schema: Recommended structure of the TeskaLabs LogMan.io data storage.

Fast data storage

Fast data storage (also known as the ‘hot’ tier) contains the most recent logs and other events received by TeskaLabs LogMan.io. We recommend using the fastest available storage class for the best throughput and search performance. The real-time component (Apache Kafka) also uses the fast data storage for stream persistency.

  • Recommended time span: one day to one week
  • Recommended size: 2 TB - 4 TB
  • Recommended redundancy: software RAID1 (mirror using LVM2); additional redundancy is provided by the application layer
  • Recommended hardware: NVMe SSD, PCIe 3.0 or better
  • Fast data storage physical devices should be managed by LVM2 in a dedicated volume group ssdvg; this allows easy extension of the available space if needed
  • Mount point: /data/ssd
  • Filesystem: XFS; the noatime flag is recommended for optimum performance (see the check after this list)
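
To check that the filesystem is mounted with the recommended options, the mount can be inspected as follows (a quick check; the mount point follows the conventions above):

findmnt -o TARGET,SOURCE,FSTYPE,OPTIONS /data/ssd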

Backup strategy for the fast data storage

Incoming events (logs) are copied into the archive storage as soon as they enter LogMan.io, so there is always a way to “replay” events back into TeskaLabs LogMan.io if needed. Data are also replicated to other nodes of the cluster immediately after arrival. For this reason, traditional backup of the fast data storage is possible but not recommended.

Restoration is handled by the cluster components, which replicate the data back from other nodes of the cluster.

Example of the fast storage directory structure

/data/ssd/kafka
/data/ssd/zookeeper
/data/ssd/influxdb
/data/ssd/es-master
/data/ssd/es-hot1
...

Slow data storage

The slow data storage contains data that do not have to be accessed quickly, typically older logs and events such as warm and cold ElasticSearch indices.

  • Recommended redundancy: software RAID5 or RAID6; RAID0 for virtualized/cloud instances with underlying storage redundancy
  • Recommended hardware: cost-effective hard drives, SATA 2/3+, SAS 1/2/3+
  • Typical size: tens of TB, e.g. 18 TB
  • Controller card: SATA or HBA SAS (IT Mode)
  • Slow data storage physical devices MUST BE managed by LVM2 in a dedicated volume group hddvg; this allows easy extension of the available space when needed (a quick health check is sketched after this list)
  • Mount point: /data/hdd
  • Filesystem: XFS; the noatime flag is recommended for optimum performance
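
A quick health check of the slow data storage layers (a sketch; the array and volume group names follow the conventions in the list above):

cat /proc/mdstat
vgs hddvg
findmnt -o TARGET,SOURCE,FSTYPE,OPTIONS /data/hdd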

Backup strategy for the slow data storage

Data stored on the slow data storage are ALWAYS replicated to other nodes of the cluster and also stored in the archive. For this reason, traditional backup is possible but not recommended (consider the large size of the slow data storage).

Restoration is handled by the cluster components, which replicate the data back from other nodes of the cluster.

Example of the slow storage directory structure

/data/hdd/es-warm1
/data/hdd/es-cold1
/data/hdd/docker
...

Note: /data/hdd/docker is symlinked to /var/lib/docker.
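
A minimal sketch of how such a symlink is typically created, assuming the Docker service is stopped during the move and rsync is available (any other copy method works equally well):

systemctl stop docker
mkdir -p /data/hdd/docker
rsync -a /var/lib/docker/ /data/hdd/docker/
mv /var/lib/docker /var/lib/docker.bak
ln -s /data/hdd/docker /var/lib/docker
systemctl start docker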

Large slow data storage strategy

If your slow data storage grows beyond 50 TB, we recommend employing HBA SAS controllers, SAS expanders, and JBOD chassis as the strategy for scaling the slow data storage. SAS connectivity can be daisy-chained to connect a large number of drives, and external JBOD chassis can be attached over SAS to house additional drives.

System storage

The system storage is dedicated to the operating system, software installations, and configuration. No operational data are stored on the system storage. Installations on virtualization platforms commonly use locally redundant disk space.

  • Recommended size: 250 GB
  • Recommended hardware: two (2) local SSD disks in software RAID1 (mirror), SATA 2/3+, SAS 1/2/3+

If applicable, the following storage partitioning is recommended (a sketch of the corresponding LVM commands follows the list):

  • Boot partition, mounted at /boot, size 512 MB, ext3 filesystem
  • Swap partition, minimum size 64GB (see the chapter “Swap size”)
  • LVM partition, rest of the available space
  • Volume group systemvg with physical volume(s) from system disk(s)
  • LVM logical volume rootlv, mounted at /, size 30 GB, ext4 filesystem, mirrored on two physical volumes (if applicable)
  • LVM logical volume loglv, mounted at /var/log, size 40 GB, ext4 filesystem
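
A minimal sketch of the corresponding LVM commands, assuming the LVM partitions of the two system disks are /dev/sda3 and /dev/sdb3 (the partition numbers are illustrative):

pvcreate /dev/sda3 /dev/sdb3
vgcreate systemvg /dev/sda3 /dev/sdb3
lvcreate -L 30G -m1 -n rootlv systemvg
lvcreate -L 40G -n loglv systemvg
mkfs.ext4 /dev/systemvg/rootlv
mkfs.ext4 /dev/systemvg/loglv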

Backup strategy for the system storage

It is recommended to periodically back up all filesystems on the system storage so that they can be used to restore the installation when needed. The backup strategy is compatible with the most common backup technologies on the market.

  • Recovery Point Objective (RPO): full backup once per week or after major maintenance work, incremental backup once per day.
  • Recovery Time Objective (RTO): 12 hours.

Note: The RPO and RTO above are recommendations, assuming a highly available setup of the LogMan.io cluster, i.e. three or more nodes, so that the complete downtime of a single node does not impact service availability.

Swap size

The total swap size should match the amount of RAM installed in the server. When two (or more) drives are used, a proportional part of the swap space should be allocated on each drive.
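
For example, a server with 128 GB of RAM and two system drives would use a 64 GB swap partition on each drive. Assuming the swap partitions are /dev/sda2 and /dev/sdb2 (the partition numbers are illustrative):

mkswap /dev/sda2
mkswap /dev/sdb2
swapon /dev/sda2 /dev/sdb2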

Archive data storage

Data archive storage is recommended but optional. It serves very long data retention periods and redundancy purposes, and it represents an economical way of storing data long term. Archived data are not available online in the cluster; they have to be restored when needed, which introduces a certain “time-to-data” interval.

Data are compressed when copied into the archive; the typical compression ratio is in the range of 1:10 to 1:2, depending on the nature of the logs.
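
For example, 1 TB of ingested logs typically occupies roughly 100 GB to 500 GB in the archive, depending on where within this range the actual compression ratio falls.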

Data are replicated into the archive storage after initial consolidation on the fast data storage, practically immediately after ingestion into the cluster.

  • Recommended technologies: SAN / NAS / Cloud cold storage (AWS S3, MS Azure Storage)
  • Mount point: /data/archive (if applicable)

Note: Public clouds can be used as a data archive storage. Data encryption has to be enabled in such a case to protect data from unauthorised access.
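
A minimal sketch of an /etc/fstab entry for an archive exported from a NAS over NFS (the server name and export path are purely illustrative):

nas.example.com:/export/logman-archive	/data/archive	nfs	defaults	0 0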

Dedicated archive nodes

For large archives, dedicated archive nodes (servers) are recommended. These nodes should use HBA SAS drive connectivity and storage-oriented OS distributions such as Unraid or TrueNAS.

Data Storage DON’Ts

  • We DON’T recommend the use of NAS / SAN storage for the fast and slow data storage tiers
  • We DON’T recommend the use of hardware RAID controllers for the fast and slow data storage tiers

Managing the storage using LVM2 and MD

This chapter provides a practical example of the storage configuration for TeskaLabs LogMan.io. You don’t need to configure or manage the LogMan.io storage unless you have a specific reason to do so; LogMan.io is delivered in a fully configured state.

Assuming the following hardware configuration:

  • SSD drives for the system storage: /dev/sda, /dev/sdb
  • SSD drives for the fast data storage: /dev/sdc, /dev/sdd
  • HDD drives for the slow data storage: /dev/sde, /dev/sdf, /dev/sdg

Hint: Use the lsblk command to monitor the actual status of the storage devices.

Create a software RAID5 for the slow data storage

mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sde /dev/sdf /dev/sdg

Hint: Use cat /proc/mdstat to check the state of the software RAID.

Create LVM physical volumes

pvcreate /dev/sda
pvcreate /dev/sdb
pvcreate /dev/sdc
pvcreate /dev/sdd
pvcreate /dev/md0
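
Hint: Use the pvs command to list the created physical volumes and their volume group assignment.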

Create LVM volume groups

Each storage tier has its own LVM volume group. It is recommended that every volume group contains physical volumes of the same type.

Create volume groups:

vgcreate systemvg /dev/sda /dev/sdb
vgcreate ssdvg /dev/sdc /dev/sdd
vgcreate hddvg /dev/md0
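
Hint: Use the vgs command to check the volume groups and their available space.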

Create LVM logical volumes

Create a fast data storage logical volume with one mirror (-m1):

lvcreate -l 100%FREE -m1 -n ssdlv ssdvg

Create a slow data storage logical volume:

lvcreate -l 100%FREE -n hddlv hddvg
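
Hint: Use the lvs command to check the created logical volumes and their sizes.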

Note: For brevity, we skip the system volumes.

Prepare filesystems

Make the filesystems (XFS):

mkfs.xfs /dev/ssdvg/ssdlv
mkfs.xfs /dev/hddvg/hddlv

Create mount points:

mkdir -p /data/ssd /data/hdd

Add mount points into /etc/fstab:

/dev/hddvg/hddlv	/data/hdd	xfs	defaults,noatime,nodiratime	0 2
/dev/ssdvg/ssdlv	/data/ssd	xfs	defaults,noatime,nodiratime	0 2

Mount data storage filesystems:

mount /data/hdd
mount /data/ssd
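
Hint: Use df -h /data/ssd /data/hdd to verify that both filesystems are mounted and report the expected sizes.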

Extending data storage

With ever-increasing data volumes, it is highly likely that you will need to grow (i.e. extend) the data storage, either the fast or the slow tier. This is done by adding a new data volume (e.g. a physical disk or a virtual volume) to the machine, or, on some virtualized platforms, by growing an existing volume.

Note: The data storage can be extended without any downtime.

Slow data storage extension

Assuming that you want to add a new disk /dev/sdh to the slow data storage:

mdadm --add /dev/md0 /dev/sdh

The new disk is added as a spare device.

You can check the state of the RAID array with:

cat /proc/mdstat

The (S) behind a device name marks a spare device.
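
An abbreviated illustration of how such a line might look (device names follow the example above; sizes and further details omitted):

md0 : active raid5 sdh[3](S) sdg[2] sdf[1] sde[0]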

Then grow the RAID array onto the spare device:

mdadm --grow --raid-devices=4 /dev/md0

The number 4 needs to be adjusted to reflect the actual number of devices in the RAID array.

Resize the physical volume to take up the newly available space:

pvresize /dev/md0

Extend the size of the logical volume:

lvextend -l +100%FREE /dev/hddvg/hddlv

Grow the filesystem:

xfs_growfs /dev/hddvg/hddlv
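
Hint: df -h /data/hdd should now report the increased capacity.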

Fast data storage extension

Assuming that you want to add a new disk /dev/sdi to the fast data storage:

pvcreate /dev/sdi
vgextend ssdvg /dev/sdi

Extend the size of the logical volume:

lvextend -l +100%FREE /dev/ssdvg/ssdlv

Grow the filesystem:

xfs_growfs /dev/ssdvg/ssdlv
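
Hint: Use lvs ssdvg and df -h /data/ssd to verify that the extension succeeded.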