Metrics

TeskaLabs LogMan.io offers a palette of metrics from all of its microservices, including LogMan.io Parser, LogMan.io Dispatcher, LogMan.io Correlator, and others.

The metrics are stored in InfluxDB and updated by each running microservice approximately once per minute. They are visualized in Grafana, using either the provisioned dashboards or custom ones.

The following sections list the available metrics (measurements, in InfluxDB terms) together with their values and tags:

Memory metrics

Each microservice contains a gauge named os.stat that gathers information about memory usage. The following values are reported:

  • VmPeak: Peak virtual memory size
  • VmLck: Locked memory size
  • VmPin: Pinned memory size
  • VmHWM: Peak resident set size ("high water mark")
  • VmRSS: Resident set size
  • VmData, VmStk, VmExe: Size of data, stack, and text segments
  • VmLib: Shared library code size
  • VmPTE: Page table entries size
  • VmPMD: Size of second-level page tables
  • VmSwap: Swapped-out virtual memory size by anonymous private pages; shmem swap usage is not included
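The value names above match the Vm* fields that Linux exposes in /proc/<pid>/status. The following is a minimal sketch of reading such fields in Python, assuming that file format; the helper name parse_vm_fields is hypothetical, and whether os.stat stores the values in kB or in bytes is not stated here.

```python
def parse_vm_fields(status_text):
    """Extract Vm* fields (in kB) from text in /proc/<pid>/status format."""
    values = {}
    for line in status_text.splitlines():
        if not line.startswith("Vm"):
            continue
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Lines look like "VmRSS:     51200 kB"; keep only the number
        if parts and parts[0].isdigit():
            values[name] = int(parts[0])
    return values


# Made-up sample in the /proc status layout
sample = "Name:\tpython3\nVmPeak:\t  204800 kB\nVmRSS:\t   51200 kB\nThreads:\t4\n"
print(parse_vm_fields(sample))
```

On a Linux host, passing the contents of /proc/self/status to the same function would return the values for the current process.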

General pipeline metrics

The following metrics are produced by every pipeline, so they are available in every microservice that reads, transforms, and outputs data, such as LogMan.io Ingestor, LogMan.io Parser, LogMan.io Dispatcher, and LogMan.io Correlator.

The tags are pipeline (ID of the pipeline) and host (hostname of the microservice).

bspump.pipeline

A counter metric with the following values:

  • event.in: number of events entering the pipeline in the specified time interval (once per minute)
  • event.drop: number of events dropped during processing in the pipeline in the specified time interval (once per minute)
  • event.out: number of events successfully leaving the pipeline in the specified time interval (once per minute)
  • warning: number of warnings produced in the pipeline in the specified time interval (once per minute)
  • error: number of errors produced in the pipeline in the specified time interval (once per minute)
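Combined with the pipeline and host tags, one sample of this counter can be pictured as a single InfluxDB point. The sketch below renders such a point in InfluxDB line protocol. It is simplified (no escaping of special characters, every field written as an integer), the tag values and numbers are made up, and whether LogMan.io writes the fields as integers or floats is an assumption.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Render one point as InfluxDB line protocol (simplified: no escaping)."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    # The "i" suffix marks an integer field in line protocol
    field_part = ",".join(f"{k}={v}i" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "bspump.pipeline",
    {"pipeline": "ParsersPipeline", "host": "parser-1"},  # made-up tag values
    {"event.in": 1200, "event.out": 1190, "event.drop": 10},  # made-up counts
    1700000000000000000,
)
print(line)
```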

bspump.pipeline.eps

A counter metric with the following values:

  • eps.in: events per second entering the pipeline
  • eps.drop: events per second dropped in the pipeline
  • eps.out: events per second successfully leaving the pipeline
  • warning: number of warnings produced in the pipeline in the specified time interval (once per minute)
  • error: number of errors produced in the pipeline in the specified time interval (once per minute)
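The eps values relate to the counts in bspump.pipeline through the length of the reporting interval. A minimal sketch, assuming the rate is simply the count divided by the interval length (60 seconds by default); the function name is hypothetical.

```python
def events_per_second(count, interval_seconds=60.0):
    """Convert an event count over an interval into an average rate."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return count / interval_seconds


# 1200 events counted during a one-minute interval
print(events_per_second(1200))
```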

bspump.pipeline.gauge

A gauge metric (the value is calculated once) with the following values:

  • warning.ratio: ratio of warnings to successfully processed events
  • error.ratio: ratio of errors to successfully processed events
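As a rough illustration, both ratios can be derived from the counters by dividing by the number of events that left the pipeline. This is a sketch only: returning 0.0 when no event left the pipeline is an assumption, not documented behavior.

```python
def ratio(count, events_out):
    """Ratio of warnings (or errors) to successfully processed events."""
    # Returning 0.0 for an idle pipeline is an assumption of this sketch
    if events_out == 0:
        return 0.0
    return count / events_out


print(ratio(5, 1000))  # 5 warnings per 1000 successful events
```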

bspump.pipeline.dutycycle

A dutycycle metric that calculates the percentage of time the pipeline spent delayed (usually caused by a downstream service such as Elasticsearch) relative to the time it was not delayed.

  • ready: a true/false value indicating whether the pipeline was not delayed
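The exact formula is not given here. One way to picture a duty cycle is as the share of time during which ready was true, as in this sketch; the (ready, seconds) input layout and the function name are assumptions made for illustration.

```python
def ready_fraction(periods):
    """periods: list of (ready, seconds) tuples; returns share of time spent ready."""
    total = sum(seconds for _, seconds in periods)
    if total == 0:
        return None  # no observations in the interval
    ready_time = sum(seconds for ready, seconds in periods if ready)
    return ready_time / total


# Made-up minute: 45 s ready, 15 s delayed by a downstream service
periods = [(True, 45.0), (False, 15.0)]
print(ready_fraction(periods))
```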

timedrift

An optional pipeline metric that is enabled in every LogMan.io microservice.

It calculates the difference between the current time and the time the given event originated, which is usually indicated by the @timestamp attribute. The following values are calculated for the specified time interval (once per minute):

  • avg
  • median
  • stddev
  • min
  • max
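The five values can be reproduced for a batch of event times with Python's standard statistics module. This is a sketch under assumptions: drift is expressed in seconds, and the population standard deviation is used (the text does not say whether the population or sample variant applies).

```python
import statistics
from datetime import datetime, timezone


def timedrift_stats(event_times, now):
    """Summarize (now - event time) in seconds for a non-empty batch of events."""
    drifts = [(now - t).total_seconds() for t in event_times]
    return {
        "avg": statistics.mean(drifts),
        "median": statistics.median(drifts),
        "stddev": statistics.pstdev(drifts),  # population variant: an assumption
        "min": min(drifts),
        "max": max(drifts),
    }


# Made-up @timestamp values, 2 s, 4 s and 10 s older than "now"
now = datetime(2024, 1, 1, 12, 0, 10, tzinfo=timezone.utc)
events = [
    datetime(2024, 1, 1, 12, 0, 8, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 12, 0, 6, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
]
stats = timedrift_stats(events, now)
print(stats)
```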

Tenant metrics

Tenant metrics are specific to the LogMan.io Parser, LogMan.io Dispatcher, LogMan.io Correlator, and LogMan.io Watcher microservices.

The tags are pipeline (ID of the pipeline), host (hostname of the microservice) and tenant (the lowercase name of the tenant).

bspump.pipeline.tenant.eps

A counter metric with the following values:

  • eps.in: the tenant's events per second entering the pipeline
  • eps.aggr: the tenant's aggregated events per second entering the pipeline (each event is counted as many times as its cnt attribute states)
  • eps.drop: the tenant's events per second dropped in the pipeline
  • eps.out: the tenant's events per second successfully leaving the pipeline
  • warning: the tenant's number of warnings produced in the pipeline in the specified time interval (once per minute)
  • error: the tenant's number of errors produced in the pipeline in the specified time interval (once per minute)

In LogMan.io Parser, the most relevant metrics come from ParsersPipeline (where the data first enter the Parser and are parsed via preprocessors and parsers) and EnrichersPipelines, while in LogMan.io Dispatcher they come from EventsPipeline and OthersPipeline.
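The difference between eps.in and eps.aggr can be illustrated by counting a batch both ways. This is a sketch only: treating an event without a cnt attribute as a single event is an assumption, and the event dictionaries are made up.

```python
def count_events(events):
    """Return (plain count, aggregated count) for a batch of event dicts."""
    plain = len(events)
    # An event without "cnt" counts as one event (assumption of this sketch)
    aggregated = sum(event.get("cnt", 1) for event in events)
    return plain, aggregated


batch = [{"message": "a"}, {"message": "b", "cnt": 5}, {"message": "c", "cnt": 2}]
print(count_events(batch))
```

Dividing either count by the interval length in seconds would then give the corresponding per-second rate.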

bspump.pipeline.tenant.load

A counter metric with the following values:

  • load.in: the tenant's byte size of all events entering the pipeline in the specified time interval (once per minute)
  • load.out: the tenant's byte size of all events leaving the pipeline in the specified time interval (once per minute)
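How the byte size is measured is not specified here. As a rough illustration only, the sketch below sums the size of events serialized as UTF-8 JSON; the serialization choice and the function name are assumptions.

```python
import json


def byte_size(events):
    """Total size in bytes of events serialized as UTF-8 JSON."""
    return sum(
        len(json.dumps(event, ensure_ascii=False).encode("utf-8"))
        for event in events
    )


# Non-ASCII characters take more than one byte in UTF-8
print(byte_size([{"a": 1}, {"m": "é"}]))
```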

Correlator metrics

The following metrics are specific to LogMan.io Correlator.

The tags are correlator (name of the correlator) and host (hostname of the microservice).

correlator.predicate

A counter metric that counts how many events went through the predicate.

  • in: number of events entering the predicate in the time interval (once per minute)
  • hit: number of events successfully matching the predicate in the time interval (once per minute)
  • miss: number of events not matching the predicate in the time interval (once per minute) and thus leaving the correlator
  • error: number of errors in the predicate in the time interval (once per minute)
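The four values can be pictured with a small tally loop. This is a sketch, not the Correlator's implementation: the predicate is a plain Python callable, and counting a failing evaluation under error instead of miss is an assumption.

```python
from collections import Counter


def run_predicate(events, predicate):
    """Apply predicate to each event and tally in/hit/miss/error."""
    counts = Counter({"in": 0, "hit": 0, "miss": 0, "error": 0})
    for event in events:
        counts["in"] += 1
        try:
            matched = predicate(event)
        except Exception:
            # Evaluation failed, e.g. a required attribute is missing
            counts["error"] += 1
            continue
        counts["hit" if matched else "miss"] += 1
    return dict(counts)


# Made-up events; the last one lacks "status" and raises KeyError
events = [{"status": 500}, {"status": 200}, {}]
result = run_predicate(events, lambda e: e["status"] >= 500)
print(result)
```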

correlator.trigger

A counter metric that counts how many events went through the trigger section of the correlator.

  • in: number of events entering the trigger in the time interval (once per minute)
  • out: number of events leaving the trigger in the time interval (once per minute)
  • error: number of errors in the trigger in the time interval (once per minute); should be equal to in - out
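The expected relation between the three trigger values can be checked on a sample with a one-line function. The sample dictionaries below are made up, and the function name is hypothetical.

```python
def trigger_is_consistent(sample):
    """Check that error == in - out for one correlator.trigger sample."""
    return sample["error"] == sample["in"] - sample["out"]


ok_sample = {"in": 40, "out": 38, "error": 2}
bad_sample = {"in": 40, "out": 38, "error": 0}
print(trigger_is_consistent(ok_sample), trigger_is_consistent(bad_sample))
```

A sample that fails this check suggests that events were lost in the trigger without being counted as errors.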