Kernel-specific metrics
kernel
Metrics to monitor the Linux kernel. Visit the InfluxDB Telegraf plugin documentation for more details.
Tags: node_id
- boot_time: The time when the system was last booted, measured in seconds since the Unix epoch (January 1, 1970, 00:00 UTC). This tells you the time of the last restart; subtracting it from the current time gives the system uptime. You can convert this number to a date using a Unix epoch time converter.
- context_switches: The number (count, integer) of context switches the kernel has performed. A context switch occurs when the CPU switches from one process or thread to another. A high number of context switches can indicate that many processes are competing for CPU time, which can be a sign of high system load.
- entropy_avail: The amount (integer) of entropy (randomness) currently available in the system, which is essential for secure random number generation. Low entropy can affect cryptographic functions and secure communications. Entropy is consumed by various operations and replenished over time, so monitoring this metric is important for maintaining security.
- interrupts: The total number (count, integer) of interrupts processed since boot. An interrupt is a signal to the processor emitted by hardware or software indicating an event that needs immediate attention. High numbers of interrupts can indicate a busy or possibly overloaded system.
- processes_forked: The total number (count, integer) of processes that have been forked (created) since the system was booted. Tracking the rate of process creation can help in diagnosing system performance issues, especially in environments where processes are frequently started and stopped.
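The values above are easier to read after a little post-processing. The following is a minimal sketch, not part of the plugin: the helper names `boot_time_to_datetime`, `uptime_seconds`, and `counter_rate` are hypothetical, and it assumes you already have two samples of a cumulative counter (such as context_switches, interrupts, or processes_forked) taken a known number of seconds apart.

```python
from datetime import datetime, timezone


def boot_time_to_datetime(boot_time: int) -> datetime:
    """Convert boot_time (seconds since the Unix epoch) to a UTC datetime."""
    return datetime.fromtimestamp(boot_time, tz=timezone.utc)


def uptime_seconds(boot_time: int, now: float) -> float:
    """Uptime is the current epoch time minus the boot time."""
    return now - boot_time


def counter_rate(prev_value: int, curr_value: int, interval_seconds: float):
    """Per-second rate of a cumulative counter between two samples.

    Returns None when the counter went backwards, which usually means
    the system rebooted between the two samples.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        return None
    return delta / interval_seconds


# Example: a boot_time sample and two context_switches samples 60 s apart.
booted_at = boot_time_to_datetime(1700000000)
switches_per_second = counter_rate(1000, 1600, 60)
```

Because these counters only grow between reboots, the rate of change is usually more informative than the raw total when judging system load.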
kernel_vmstat
Kernel virtual memory statistics gathered from /proc/vmstat. Visit the InfluxDB Telegraf plugin documentation for more details.
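To see what the raw data looks like, here is a minimal sketch of parsing the `name value` lines found in /proc/vmstat. The function name `parse_vmstat` is hypothetical, and this is only an illustration of the file format, not how the Telegraf plugin is implemented.

```python
from typing import Dict


def parse_vmstat(text: str) -> Dict[str, int]:
    """Parse '/proc/vmstat'-style text into a {name: value} dictionary.

    Each valid line holds a counter name and an integer separated by
    whitespace; lines that do not match are skipped.
    """
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            stats[name] = int(value)
        except ValueError:
            continue
    return stats


# Illustrative sample text; on a Linux host you could instead read the file:
#     with open("/proc/vmstat") as f:
#         stats = parse_vmstat(f.read())
sample = "nr_free_pages 12345\nnr_dirty 7\npgfault 987654\n"
stats = parse_vmstat(sample)
```

The set of counters in the file depends on the kernel version and configuration, so code that reads specific names should tolerate missing keys.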
Relevant terms
- Active pages: Pages currently in use or recently used.
- Inactive pages: Pages not recently used, and therefore more likely to be moved to swap space or reclaimed.
- Anonymous pages: Memory pages not backed by a file on disk; typically used for data that does not need to be persisted, such as program stacks.
- Bounce buffer: Temporary memory used to facilitate data transfers between devices that cannot directly address each other’s memory.
- Compaction: The process of rearranging pages in memory to create larger contiguous free spaces, often useful for allocating huge pages.
- Dirty pages: Pages that have been modified in memory but have not yet been written back to disk.
- Evict: The process of removing pages from physical memory, either by moving them to disk (swapping out) or discarding them if they are no longer needed.
- File-backed pages: Memory pages that are associated with files on the disk, such as executable files or data files.
- Free pages: Memory pages that are available for use and not currently allocated to any process or data.
- Huge pages: Large memory pages that can be used by processes, reducing the overhead of page tables.
- Interleave: The process of distributing memory pages across different memory nodes or zones, typically to optimize performance in systems with non-uniform memory access (NUMA).
- NUMA (non-uniform memory access): A memory design where a processor accesses its own local memory faster than non-local memory.
- Page allocation: The process of assigning free memory pages to fulfill a request by a process or the kernel.
- Page fault: An event that occurs when a program tries to access a page that is not in physical memory, requiring the OS to handle this by allocating a page or retrieving it from disk.
- Page table: Data structure used by the operating system to store the mapping between virtual addresses and physical memory addresses.
- Shared memory (shmem): Memory that can be accessed by multiple processes.
- Slab pages: Memory pages used by the kernel to store objects of fixed sizes, such as file structures or inode caches.
- Swap space: A space on the disk used to store memory pages that have been evicted from physical memory.
- THP (transparent huge pages): A feature that automatically manages the allocation of huge pages to improve performance without requiring changes to applications.
- Vmscan: The kernel's page reclaim code, which scans memory pages and decides which pages to evict or swap out based on their usage.
- Writeback: The process of writing dirty pages back to disk.
Tags: node_id
- nr_free_pages: Number of free pages in the system.
- nr_inactive_anon: Number of inactive anonymous pages.
- nr_active_anon: Number of active anonymous pages.
- nr_inactive_file: Number of inactive file-backed pages.
- nr_active_file: Number of active file-backed pages.
- nr_unevictable: Number of pages that cannot be evicted from memory.
- nr_mlock: Number of pages locked into memory (mlock).
- nr_anon_pages: Number of anonymous pages.
- nr_mapped: Number of pages mapped into userspace.
- nr_file_pages: Number of file-backed pages.
- nr_dirty: Number of pages currently dirty.
- nr_writeback: Number of pages under writeback.
- nr_slab_reclaimable: Number of reclaimable slab pages.
- nr_slab_unreclaimable: Number of unreclaimable slab pages.
- nr_page_table_pages: Number of pages used for page tables.
- nr_kernel_stack: Amount of memory used by kernel stacks.
- nr_unstable: Number of unstable pages.
- nr_bounce: Number of bounce buffer pages.
- nr_vmscan_write: Number of pages written by vmscan.
- nr_writeback_temp: Number of temporary writeback pages.
- nr_isolated_anon: Number of isolated anonymous pages.
- nr_isolated_file: Number of isolated file pages.
- nr_shmem: Number of shared memory pages.
- numa_hit: Number of pages allocated in the preferred node.
- numa_miss: Number of pages allocated in a non-preferred node.
- numa_foreign: Number of pages intended for this node but allocated on another node.
- numa_interleave: Number of pages successfully allocated on the intended node under the interleave policy.
- numa_local: Number of pages allocated on the local node.
- numa_other: Number of pages allocated on other nodes.
- nr_anon_transparent_hugepages: Number of anonymous transparent huge pages.
- pgpgin: Number of kilobytes read from disk.
- pgpgout: Number of kilobytes written to disk.
- pswpin: Number of pages swapped in.
- pswpout: Number of pages swapped out.
- pgalloc_dma: Number of DMA zone pages allocated.
- pgalloc_dma32: Number of DMA32 zone pages allocated.
- pgalloc_normal: Number of normal zone pages allocated.
- pgalloc_movable: Number of movable zone pages allocated.
- pgfree: Number of pages freed.
- pgactivate: Number of inactive pages activated.
- pgdeactivate: Number of active pages deactivated.
- pgfault: Number of page faults.
- pgmajfault: Number of major page faults.
- pgrefill_dma: Number of DMA zone pages refilled.
- pgrefill_dma32: Number of DMA32 zone pages refilled.
- pgrefill_normal: Number of normal zone pages refilled.
- pgrefill_movable: Number of movable zone pages refilled.
- pgsteal_dma: Number of DMA zone pages reclaimed.
- pgsteal_dma32: Number of DMA32 zone pages reclaimed.
- pgsteal_normal: Number of normal zone pages reclaimed.
- pgsteal_movable: Number of movable zone pages reclaimed.
- pgscan_kswapd_dma: Number of DMA zone pages scanned by kswapd.
- pgscan_kswapd_dma32: Number of DMA32 zone pages scanned by kswapd.
- pgscan_kswapd_normal: Number of normal zone pages scanned by kswapd.
- pgscan_kswapd_movable: Number of movable zone pages scanned by kswapd.
- pgscan_direct_dma: Number of DMA zone pages directly scanned.
- pgscan_direct_dma32: Number of DMA32 zone pages directly scanned.
- pgscan_direct_normal: Number of normal zone pages directly scanned.
- pgscan_direct_movable: Number of movable zone pages directly scanned.
- zone_reclaim_failed: Number of failed zone reclaim attempts.
- pginodesteal: Number of inode pages reclaimed.
- slabs_scanned: Number of slab pages scanned.
- kswapd_steal: Number of pages reclaimed by kswapd.
- kswapd_inodesteal: Number of inode pages reclaimed by kswapd.
- kswapd_low_wmark_hit_quickly: Number of times kswapd reached the low watermark quickly.
- kswapd_high_wmark_hit_quickly: Number of times kswapd reached the high watermark quickly.
- kswapd_skip_congestion_wait: Number of times kswapd skipped wait due to congestion.
- pageoutrun: Number of times kswapd ran a page reclaim (pageout) pass.
- allocstall: Number of times a page allocation stalled and had to reclaim memory directly.
- pgrotated: Number of pages rotated.
- compact_blocks_moved: Number of blocks moved during compaction.
- compact_pages_moved: Number of pages moved during compaction.
- compact_pagemigrate_failed: Number of page migrations failed during compaction.
- compact_stall: Number of stalls during compaction.
- compact_fail: Number of compaction failures.
- compact_success: Number of successful compactions.
- htlb_buddy_alloc_success: Number of successful huge page (HTLB) allocations from the buddy allocator.
- htlb_buddy_alloc_fail: Number of failed huge page (HTLB) allocations from the buddy allocator.
- unevictable_pgs_culled: Number of unevictable pages culled.
- unevictable_pgs_scanned: Number of unevictable pages scanned.
- unevictable_pgs_rescued: Number of unevictable pages rescued.
- unevictable_pgs_mlocked: Number of unevictable pages mlocked.
- unevictable_pgs_munlocked: Number of unevictable pages munlocked.
- unevictable_pgs_cleared: Number of unevictable pages cleared.
- unevictable_pgs_stranded: Number of unevictable pages stranded.
- unevictable_pgs_mlockfreed: Number of mlock-freed unevictable pages.
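- thp_fault_alloc: Number of times a huge page was successfully allocated to handle a page fault.
- thp_fault_fallback: Number of times a page fault fell back to regular pages because a huge page could not be allocated.
- thp_collapse_alloc: Number of huge pages allocated to collapse existing ranges of regular pages.
- thp_collapse_alloc_failed: Number of times a huge page could not be allocated for a collapse.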
- thp_split: Number of THP splits.
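Many of the fields above are page counts or paired counters, so a few derived values can make them easier to interpret. The sketch below is illustrative only: the helper names are hypothetical, the 4096-byte page size is an assumption (on a Unix host, `os.sysconf("SC_PAGE_SIZE")` reports the actual value), and `stats` is assumed to be a plain dictionary of field names to integers.

```python
def pages_to_bytes(pages: int, page_size: int = 4096) -> int:
    """Convert a page count (for example nr_free_pages) to bytes.

    page_size defaults to 4096, which is an assumption; pass the real
    page size of the monitored host if it differs.
    """
    return pages * page_size


def ratio(part: int, total: int):
    """Return part / total, or None when total is zero."""
    return part / total if total else None


def numa_hit_ratio(stats: dict):
    """Share of allocations that landed on the preferred NUMA node."""
    hits = stats.get("numa_hit", 0)
    misses = stats.get("numa_miss", 0)
    return ratio(hits, hits + misses)


def thp_fallback_ratio(stats: dict):
    """Share of THP-eligible faults that fell back to regular pages."""
    allocated = stats.get("thp_fault_alloc", 0)
    fallback = stats.get("thp_fault_fallback", 0)
    return ratio(fallback, allocated + fallback)


# Example with made-up values.
example = {"numa_hit": 90, "numa_miss": 10,
           "thp_fault_alloc": 3, "thp_fault_fallback": 1}
free_bytes = pages_to_bytes(256)
```

A falling NUMA hit ratio or a rising THP fallback ratio is a hint to look more closely at memory placement or fragmentation, not a diagnosis on its own.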