LogMan.io Parsec Configuration

Parsec dependencies:

  • Apache Kafka: The source of input unparsed events and the destination of parsed events.
  • Apache Zookeeper: The library content, mainly parsing rules but also other shared cluster information.

Minimal configuration

This is the most basic configuration required for LogMan.io Parsec:

[pipeline:ParsecPipeline:KafkaSource]
topic=received.<tenant>.<stream>  # (1)

[pipeline:ParsecPipeline:KafkaSink]
topic=events.<tenant>.<stream>  # (2)

[pipeline:ErrorPipeline:KafkaSink]
topic=others.<tenant>  # (3)

[tenant]
name=<tenant>  # (5)
schema=/Schemas/ECS.yaml  # (6)

[parser]
name=/Parsers/<parsing rule>  # (4)

[library]
providers=
    zk:///library
    ...

[kafka]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092  # (7)

[zookeeper]
servers=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181  # (8)
  1. Name of the received topic from which events are consumed.
  2. Name of the events topic to which successfully parsed events are committed.
  3. Name of the others topic to which unsuccessfully parsed events are committed.
  4. Specify the parsing rule to apply.
  5. Name of the tenant under which this instance of Parsec is running.
  6. Schema should be stored in /Schemas/ folder in Library.
  7. Addresses of the Kafka servers in the cluster.
  8. Addresses of the Zookeeper servers in the cluster.
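The configuration above uses standard INI syntax, so it can be inspected with ordinary tooling. As a minimal sketch (tenant and stream names below are placeholders, not real deployment values), Python's standard `configparser` can read it:

```python
import configparser

# Sketch: parse a Parsec-style configuration fragment.
# Section and option names follow the example above; the tenant
# "acme" and stream "firewall" are illustrative placeholders.
conf = configparser.ConfigParser()
conf.read_string("""
[pipeline:ParsecPipeline:KafkaSource]
topic=received.acme.firewall

[pipeline:ParsecPipeline:KafkaSink]
topic=events.acme.firewall

[kafka]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092
""")

print(conf["pipeline:ParsecPipeline:KafkaSource"]["topic"])
# received.acme.firewall
```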

Parsing rule

Each Parsec instance must know which parsing rule to apply.

[parser]
name=/Parsers/<parsing rule>

The name of the parser specifies the path from which the parsing rule declarations are loaded. It MUST be stored in the /Parsers/ directory. Parsing rules are YAML files.

The standard path format is <vendor>/<type>, e.g. Microsoft/IIS or Oracle/Listener. When only one technology from a vendor is used, the vendor name alone may be used, e.g. Zabbix or Devolutions.

Event lane configuration

This optional section specifies significant attributes of the parsed events.

[eventlane]
timezone=Europe/Prague
charset=iso8859_2

timezone: If the log source produces logs in a timezone different from the tenant's default timezone, it must be specified here. The timezone name must comply with the IANA Time Zone Database. Internally, all timestamps are converted to UTC.

charset: If the log source produces logs in a charset (or encoding) different from UTF-8, the charset must be specified here. The list of supported charsets is available here. Internally, all text is encoded in UTF-8.
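The two normalizations described above can be illustrated with standard Python. This is a sketch of the general principle only, not Parsec's internal API: bytes arriving in iso8859_2 are decoded to text, and a local timestamp tagged with an IANA timezone name is converted to UTC.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # IANA Time Zone Database names

# Czech sample text encoded in ISO 8859-2 (illustrative input bytes).
raw = b"p\xf8\xedli\xb9 \xbelu\xbbou\xe8k\xfd"
text = raw.decode("iso8859_2")  # decoded; internally handled as UTF-8 text
print(text)  # příliš žluťoučký

# A local timestamp in Europe/Prague (CET, UTC+1 in winter),
# converted to UTC.
local = datetime(2024, 1, 30, 12, 0, tzinfo=ZoneInfo("Europe/Prague"))
utc = local.astimezone(ZoneInfo("UTC"))
print(utc.isoformat())  # 2024-01-30T11:00:00+00:00
```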

Library

The library configuration specifies where the Parsec declarations (definitions) are loaded from.

The library can consist of one or multiple providers, typically Zookeeper or git repositories.

[library]
providers=
    zk://zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181/library.lib
    # other library layers can be included

Note

The order of layers is important. Higher layers overwrite the layers beneath them. If one file is present in multiple layers, only the one included in the highest layer is loaded.
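The layering rule can be sketched as a simple first-match lookup (a hypothetical illustration; the paths and layer contents below are placeholders, not Parsec internals):

```python
# Layers ordered from highest to lowest; the first layer that
# contains a path wins, shadowing the same file in lower layers.
layers = [
    {"/Parsers/Microsoft/IIS.yaml": "version from highest layer"},
    {"/Parsers/Microsoft/IIS.yaml": "version from lower layer",
     "/Schemas/ECS.yaml": "schema declaration"},
]

def lookup(path):
    for layer in layers:
        if path in layer:
            return layer[path]
    raise FileNotFoundError(path)

print(lookup("/Parsers/Microsoft/IIS.yaml"))  # version from highest layer
print(lookup("/Schemas/ECS.yaml"))            # schema declaration
```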

Apache Kafka

The connection to Apache Kafka has to be configured so that events can be received from and sent to Apache Kafka:

[connection:KafkaInputConnection]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092

[connection:KafkaOutputConnection]
bootstrap_servers=kafka-1:9092,kafka-2:9092,kafka-3:9092

Without this configuration, the connection to Apache Kafka cannot be properly established.

Kafka topics

This section specifies the topic from which original logs are consumed and the topics to which successfully and unsuccessfully parsed logs are sent.

The recommended approach is to create one 'received' topic and one 'events' topic for each event lane, and one 'others' topic for each tenant.

[pipeline:ParsecPipeline:KafkaSource]
topic=received.<tenant>.<stream>

[pipeline:ParsecPipeline:KafkaSink]
topic=events.<tenant>.<stream>

[pipeline:ErrorPipeline:KafkaSink]
topic=others.<tenant>
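The recommended naming scheme can be expressed as a small helper (a hypothetical sketch; the function name and the sample tenant/stream values are illustrative):

```python
def topics_for(tenant: str, stream: str) -> dict:
    """One 'received' and one 'events' topic per event lane,
    plus one shared 'others' topic per tenant."""
    return {
        "received": f"received.{tenant}.{stream}",
        "events": f"events.{tenant}.{stream}",
        "others": f"others.{tenant}",
    }

print(topics_for("acme", "firewall"))
# {'received': 'received.acme.firewall',
#  'events': 'events.acme.firewall',
#  'others': 'others.acme'}
```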

Warning

The pipeline name ParsecPipeline was introduced in Parsec version v23.37. The name KafkaParserPipeline used in previous versions is deprecated; its end of service life is January 30, 2024.

Kafka Consumer Group

LogMan.io Parsec often runs as multiple instances in a cluster. The set of instances that consume from the same received topic is called a consumer group. The group is identified by a unique group.id. Each event is consumed by one and only one member of the group.

By default, group.id is generated automatically in the format lmio-parsec-<tenant>-<parser name>. It can be overridden in the ParsecPipeline configuration as follows:

[pipeline:ParsecPipeline:KafkaSource]
group_id=lmio-parsec-<stream>
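The default group.id derivation described above can be sketched as follows (a hypothetical illustration of the documented format; the function name and sample values are placeholders):

```python
def default_group_id(tenant: str, parser_name: str) -> str:
    # Documented default format: lmio-parsec-<tenant>-<parser name>
    return f"lmio-parsec-{tenant}-{parser_name}"

print(default_group_id("acme", "microsoft-iis"))
# lmio-parsec-acme-microsoft-iis
```

All Parsec instances configured with the same group.id share the consumption of the received topic's partitions, which is how horizontal scaling is achieved.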

Warning

Changing group.id creates a new consumer group, which begins reading events from the start of the topic. (This depends on the auto.offset.reset parameter of the Kafka cluster, which is earliest by default.)

Apache Zookeeper

Every LogMan.io microservice should advertise itself in Zookeeper.

[zookeeper]
path=/asab
servers=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181

Metrics

Parsec produces its own telemetry for monitoring and also forwards telemetry from collectors to the configured telemetry data storage, such as InfluxDB. Read more about metrics.

Include in configuration:

[asab:metrics]
...