LogMan.io Parsec Configuration

LogMan.io Parsec dependencies:

  • Apache Kafka: The source of unparsed input events and the destination of parsed events.
  • Apache Zookeeper: The storage of the library content, mainly parsing rules, and of other shared cluster information.

Minimal configuration with event lane

LogMan.io Parsec can be configured with or without an event lane. We recommend the first option.

When an event lane is used, LogMan.io Parsec reads the Kafka topics, the path to the parsing rules, and optionally the charset, schema, and timezone from it.

This is the minimal configuration for LogMan.io Parsec with an event lane:

[tenant]
name=<tenant>  # (1)

[eventlane]
name=/EventLanes/<tenant>/<eventlane>.yaml  # (2)

[library]
providers=
    zk:///library
    ...

[kafka]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092  # (3)

[zookeeper]
servers=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181  # (4)
  1. Name of the tenant under which the service is running.
  2. Path to the event lane declaration, which defines the Kafka topics, the path to the parsing rules, and optionally the charset, schema, and timezone.
  3. Addresses of Kafka servers in the cluster.
  4. Addresses of Zookeeper servers in the cluster.
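
As a sketch, a filled-in version of this configuration might look as follows (the tenant name mytenant and the event lane fortinet-fortigate are hypothetical placeholders):

[tenant]
name=mytenant

[eventlane]
name=/EventLanes/mytenant/fortinet-fortigate.yaml

[library]
providers=
    zk:///library

[kafka]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092

[zookeeper]
servers=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181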

Apache Zookeeper

Every LogMan.io microservice should advertise itself to Zookeeper.

[zookeeper]
servers=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181

Library

The library configuration specifies where the Parsec declarations (definitions) are loaded from.

The library can consist of one or multiple providers, typically Zookeeper or git repositories.

[library]
providers=
    zk:///library
    # other library layers can be included

Note

The order of layers is important. Higher layers overwrite the layers beneath them. If one file is present in multiple layers, only the one included in the highest layer is loaded.
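
As a sketch, a library with a custom layer on top of the shared Zookeeper layer might be configured as follows (the git URL is a hypothetical placeholder; this assumes the first listed provider is the highest layer):

[library]
providers=
    git+https://example.com/lmio-custom-library.git
    zk:///library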

Apache Kafka

The connection to Apache Kafka has to be configured so that events can be received from and sent to Apache Kafka:

[kafka]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092

Without this configuration, the connection to Apache Kafka cannot be established.

Minimal configuration without event lane

When an event lane is NOT used, the parsing rules, timezone, and schema must be included in the configuration directly.

This is the configuration required for LogMan.io Parsec when an event lane is not used:

[pipeline:ParsecPipeline:KafkaSource]
topic=received.<tenant>.<stream>  # (1)

[pipeline:ParsecPipeline:KafkaSink]
topic=events.<tenant>.<stream>  # (2)

[pipeline:ErrorPipeline:KafkaSink]
topic=others.<tenant>  # (3)

[tenant]
name=<tenant>  # (4)
schema=/Schemas/ECS.yaml  # (5)

[parser]
name=/Parsers/<parsing rule>  # (6)

[library]
providers=
    zk:///library
    ...

[kafka]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092  # (7)

[zookeeper]
servers=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181  # (8)
  1. Name of the received topic from which events are consumed.
  2. Name of the events topic to which successfully parsed events are committed.
  3. Name of the others topic to which unsuccessfully parsed events are committed.
  4. Name of the tenant under which this instance of Parsec is running.
  5. The schema, which must be stored in the /Schemas/ folder in the library.
  6. The parsing rule to apply.
  7. Addresses of Kafka servers in the cluster.
  8. Addresses of Zookeeper servers in the cluster.
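
As a sketch, a filled-in version might look as follows (the tenant mytenant, the stream iis, and the parsing rule /Parsers/Microsoft/IIS are hypothetical placeholders):

[pipeline:ParsecPipeline:KafkaSource]
topic=received.mytenant.iis

[pipeline:ParsecPipeline:KafkaSink]
topic=events.mytenant.iis

[pipeline:ErrorPipeline:KafkaSink]
topic=others.mytenant

[tenant]
name=mytenant
schema=/Schemas/ECS.yaml

[parser]
name=/Parsers/Microsoft/IIS

[library]
providers=
    zk:///library

[kafka]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092

[zookeeper]
servers=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181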

Parsing rule

Each Parsec instance must know which parsing rule to apply.

[parser]
name=/Parsers/<parsing rule>

The name of the parser specifies the path from which the parsing rule declarations are loaded. It MUST be stored in the /Parsers/ directory. Parsing rules are YAML files.

The standard path format is <vendor>/<type>, e.g. Microsoft/IIS or Oracle/Listener. When only one technology from a vendor is used, the vendor name alone can be used, e.g. Zabbix or Devolutions.
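
For example, a Parsec instance applying the Microsoft/IIS parsing rule from above would be configured as:

[parser]
name=/Parsers/Microsoft/IIS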

Event lane configuration

This section optionally specifies significant attributes of the parsed events.

[eventlane]
timezone=Europe/Prague
charset=iso8859_2

timezone: If the log source produces logs in a specific timezone different from the tenant's default timezone, it has to be specified here. The name of the timezone must be compliant with the IANA Time Zone Database. Internally, all timestamps are converted to UTC.

charset: If the log source produces logs in a charset (or encoding) different from UTF-8, the charset must be specified here. The list of supported charsets is here. Internally, all text is encoded in UTF-8.

Kafka topics

This section specifies the topic from which original logs are consumed and the topics to which successfully and unsuccessfully parsed logs are sent.

The recommended approach is to create one received topic and one events topic for each event lane, and one others topic for each tenant.

[pipeline:ParsecPipeline:KafkaSource]
topic=received.<tenant>.<stream>

[pipeline:ParsecPipeline:KafkaSink]
topic=events.<tenant>.<stream>

[pipeline:ErrorPipeline:KafkaSink]
topic=others.<tenant>

Warning

The pipeline name ParsecPipeline was introduced in Parsec version v23.37. The name KafkaParserPipeline used in previous versions is deprecated; its end of service life is 30 January 2024.
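
For illustration, a hypothetical tenant mytenant with the streams iis and fortigate would use the following topics:

received.mytenant.iis        → events.mytenant.iis
received.mytenant.fortigate  → events.mytenant.fortigate
others.mytenant              (shared by both event lanes)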

Kafka Consumer Group

LogMan.io Parsec often runs in multiple instances in a cluster. The set of instances that consume from the same received topic is called a consumer group. This group is identified by a unique group.id. Each event is consumed by one and only one member of the group.

LogMan.io Parsec creates group.id automatically as follows:

  1. When an event lane is used, group.id has the form lmio-parsec-<tenant>-<eventlane>.
  2. When an event lane is not used, group.id has the form lmio-parsec-<tenant>-<parser name>.
  3. group.id can be overwritten in the ParsecPipeline configuration as follows:
[pipeline:ParsecPipeline:KafkaSource]
group_id=lmio-parsec-<stream>

Warning

By changing group.id, a new consumer group is created and begins to read events from the start. (This depends on the auto.offset.reset parameter of the Kafka cluster, which is earliest by default.)

Metrics

Parsec produces its own telemetry for monitoring and also forwards the telemetry from collectors to the configured telemetry data storage, such as InfluxDB. Read more about metrics.

Include in the configuration:

[asab:metrics]
...
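
A sketch of a complete metrics configuration, assuming InfluxDB as the telemetry storage and following the ASAB metrics configuration format (the URL and database name are hypothetical placeholders):

[asab:metrics]
target=influxdb

[asab:metrics:influxdb]
url=http://influxdb:8086/
db=mydb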