Introduction

Syslog is a widely used standard for message logging in computer systems, network devices, and applications. It enables the collection, storage, and analysis of log messages from various sources, providing valuable insights for monitoring, troubleshooting, and security analysis.

There are two main variants of the syslog protocol:

  • Syslog RFC3164: The legacy variant of the syslog protocol, often referred to as the BSD syslog. Its message format is described in RFC3164, section 4.
  • Syslog RFC5424: The standardized and more structured variant, described in RFC5424, section 6. It introduces additional fields and improved message structure.

Understanding the differences between these protocols is essential for accurate parsing and normalization of syslog events.


Syslog RFC3164

An event in RFC3164 format typically consists of the following parts:

[Figure: Syslog headers]

  • priority (PRI part): A number enclosed in "<>" brackets indicating the facility and severity of the log.
  • timestamp: Local time of the device in Mmm dd hh:mm:ss format (e.g., Jan 17 12:10:03). The format carries neither a year nor a timezone, so the device's timezone should be set to UTC.
  • hostname: Hostname of the device that produced the event.
  • process: Identifier of the process that produced the event. It can contain a PID in square brackets.
  • message: The message part of the event.
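
Because the RFC3164 timestamp carries neither a year nor a timezone, whoever consumes it has to supply both. The following is a minimal Python sketch of that step, not part of the parser declarations. The function name parse_rfc3164_timestamp and the explicit year argument are assumptions made for illustration, and the device clock is assumed to run in UTC.

```python
from datetime import datetime, timezone


def parse_rfc3164_timestamp(text: str, year: int) -> datetime:
    """Turn 'Mmm dd hh:mm:ss' into a timezone-aware datetime.

    Hypothetical helper: the year is passed in by the caller because the
    RFC3164 timestamp does not contain one. UTC is an assumption.
    """
    # Prepending the year lets strptime parse a complete date.
    # Note: %b matches English month names only in an English/C locale.
    parsed = datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    print(parse_rfc3164_timestamp("Jan 17 12:10:03", 2024).isoformat())
```

In practice the year is usually taken from the time the event was received, which needs extra care around New Year.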

Parsing the Header

The following parser declaration splits the RFC3164 message into its components:

10_parser.yaml
---
define:
  type: parsec/parser

parse:
  !PARSE.KVLIST
  - "<"
  - PRI: !PARSE.DIGITS
  - ">"
  - TIMESTAMP: !PARSE.DATETIME RFC3164
  - !PARSE.SPACES
  - HOSTNAME: !PARSE.UNTIL " "
  - PROCESS: !PARSE.UNTIL " "
  - !PARSE.OPTIONAL { what: !PARSE.SPACES }
  - MESSAGE: !PARSE.CHARS
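
For readers who want to check the splitting logic outside the parser, the same field boundaries can be approximated with a regular expression. This is only a sketch in Python under stated assumptions: the names RFC3164_RE and split_rfc3164 are hypothetical, the field names are copied from the declaration above, and the timestamp is kept as a plain string.

```python
import re

# Hypothetical pattern that mirrors the field order of the declaration above.
RFC3164_RE = re.compile(
    r"<(?P<PRI>\d{1,3})>"                                              # "<" PRI ">"
    r"(?P<TIMESTAMP>[A-Z][a-z]{2} {1,2}\d{1,2} \d{2}:\d{2}:\d{2}) +"   # Mmm dd hh:mm:ss
    r"(?P<HOSTNAME>\S+) "                                              # until the next space
    r"(?P<PROCESS>\S+) ?"                                              # until the next space
    r"(?P<MESSAGE>.*)",                                                # rest of the line
    re.DOTALL,
)


def split_rfc3164(line: str) -> dict:
    """Split one RFC3164 line into PRI, TIMESTAMP, HOSTNAME, PROCESS, MESSAGE."""
    match = RFC3164_RE.match(line)
    if match is None:
        raise ValueError("not an RFC3164 message: %r" % line)
    fields = match.groupdict()
    fields["PRI"] = int(fields["PRI"])  # priority is numeric
    return fields


if __name__ == "__main__":
    sample = "<38>Sep 3 13:38:07 host01 systemd-logind[1078]: New session 49 of user harrypotter"
    for key, value in split_rfc3164(sample).items():
        print(key, "=", value)
```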

Parsing the Process

The process field consists of the process name, optionally followed by the PID in square brackets. With the parser above, the following input:

<38>Sep 3 13:38:07 host01 systemd-logind[1078]: New session 49 of user harrypotter

produces the output:

"PROCESS": "systemd-logind[1078]:"

To split the PROCESS field into PROCESS.NAME and PROCESS.PID, we need a second parser that handles every form the field can take:

20_parser_process.yaml
---
define:
  type: parsec/parser
  field: PROCESS

parse:
  !TRY  #(1)

  - !PARSE.KVLIST  #(2)
    - PROCESS.NAME: !PARSE.UNTIL "["
    - PROCESS.PID: !PARSE.UNTIL "]"
    - !PARSE.OPTIONAL { what: !PARSE.EXACTLY ":" }

  - !PARSE.KVLIST  #(3)
    - PROCESS.NAME: !PARSE.UNTIL ":"

  - !PARSE.KVLIST  #(4)
    - PROCESS.NAME: !PARSE.CHARS

  1. The !TRY expression takes a list of expressions as an argument and evaluates them in order: the parser tries the first expression and, if it fails, falls back to the second one, then the third, etc.
  2. This branch deals with inputs of the form PROCESS.NAME[PROCESS.PID] (e.g., cron[123]).
  3. This branch deals with inputs of the form PROCESS.NAME: (e.g., sudo:).
  4. This branch deals with inputs of the form PROCESS.NAME, without the : at the end (e.g., sudo).
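
The fallback order of the three branches can be sketched in Python as follows. This is an illustration of the idea rather than the parser's implementation; the function name split_process is hypothetical, and the PID is kept as a string here.

```python
import re

# Branch 1: NAME[PID], optionally followed by ":" (e.g. cron[123] or cron[123]:)
_NAME_WITH_PID = re.compile(r"(?P<name>[^\[]+)\[(?P<pid>[^\]]*)\]:?$")


def split_process(process: str) -> dict:
    """Try the most specific form first, then fall back to simpler ones."""
    match = _NAME_WITH_PID.match(process)
    if match:
        return {"PROCESS.NAME": match.group("name"), "PROCESS.PID": match.group("pid")}

    # Branch 2: NAME followed by ":" (e.g. sudo:)
    if ":" in process:
        return {"PROCESS.NAME": process.split(":", 1)[0]}

    # Branch 3: bare NAME (e.g. sudo)
    return {"PROCESS.NAME": process}


if __name__ == "__main__":
    for value in ("systemd-logind[1078]:", "cron[123]", "sudo:", "sudo"):
        print(value, "->", split_process(value))
```

The order matters: the bare-name branch accepts any input, so it has to come last.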

Mapping to ECS Schema

After both parsers have processed the sample event, the output looks like this:

PRI: 38
TIMESTAMP: Sep 3 13:38:07
HOSTNAME: host01
PROCESS.NAME: systemd-logind
PROCESS.PID: 1078
MESSAGE: New session 49 of user harrypotter

The next phase is mapping to the ECS schema:

30_mapping_ECS.yaml
---
define:
  type: parsec/mapping
  schema: /Schemas/ECS.yaml

mapping:
  PRI: log.syslog.priority
  TIMESTAMP: "@timestamp"
  HOSTNAME: host.hostname
  PROCESS.NAME: process.name
  PROCESS.PID: process.pid
  MESSAGE: message

After mapping, the output looks like this:

log.syslog.priority: 38
@timestamp: Sep 3 13:38:07
host.hostname: host01
process.name: systemd-logind
process.pid: 1078
message: New session 49 of user harrypotter
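
Conceptually, the mapping step renames each parsed field to its ECS counterpart. A minimal Python sketch of that renaming, assuming the event is held in a flat dictionary (the names ECS_MAPPING and map_to_ecs are hypothetical, and dropping unmapped fields is a choice of this sketch, not a statement about the mapping declaration):

```python
# Field names copied from the mapping declaration above.
ECS_MAPPING = {
    "PRI": "log.syslog.priority",
    "TIMESTAMP": "@timestamp",
    "HOSTNAME": "host.hostname",
    "PROCESS.NAME": "process.name",
    "PROCESS.PID": "process.pid",
    "MESSAGE": "message",
}


def map_to_ecs(parsed: dict) -> dict:
    """Rename parsed fields to ECS names; unmapped fields are dropped here."""
    return {ECS_MAPPING[key]: value for key, value in parsed.items() if key in ECS_MAPPING}


if __name__ == "__main__":
    parsed = {
        "PRI": 38,
        "TIMESTAMP": "Sep 3 13:38:07",
        "HOSTNAME": "host01",
        "PROCESS.NAME": "systemd-logind",
        "PROCESS.PID": 1078,
        "MESSAGE": "New session 49 of user harrypotter",
    }
    for key, value in map_to_ecs(parsed).items():
        print(f"{key}: {value}")
```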

Enriching the Syslog Severity and Facility

Syslog facility and severity can be computed from the priority as follows:

PRIORITY = FACILITY * 8 + SEVERITY

FACILITY = PRIORITY // 8
SEVERITY = PRIORITY (mod) 8

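Because 8 is a power of two, the same values can be computed with bitwise operations: shifting PRIORITY right by 3 bits gives FACILITY, and a bitwise AND with 7 gives SEVERITY.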
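
As a worked check of the formulas, the short Python sketch below computes both values with bitwise operations and compares them with integer division and modulo. The function name decode_priority is hypothetical.

```python
def decode_priority(priority: int):
    """Return (facility, severity) for a syslog priority value."""
    facility = priority >> 3   # same as priority // 8
    severity = priority & 7    # same as priority % 8
    return facility, severity


if __name__ == "__main__":
    # 38 = 4 * 8 + 6, so facility 4 and severity 6
    print(decode_priority(38))  # (4, 6)
    # Both forms agree for every valid priority (0..191)
    assert all(decode_priority(p) == divmod(p, 8) for p in range(192))
```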

To enrich the event with log.syslog.facility.code, log.syslog.facility.name, log.syslog.severity.code, and log.syslog.severity.name, we use an enricher declaration:

40_enricher_syslog_ECS.yaml
---
define:
  type: parsec/enricher
  schema: /Schemas/ECS.yaml

enrich:
  # SYSLOG FACILITY
  log.syslog.facility.code: !SHR { what: !GET { from: !ARG EVENT, what: log.syslog.priority }, by: 3 }
  log.syslog.facility.name: !MATCH
                            what: !GET { from: !ARG EVENT, what: log.syslog.facility.code }
                            with:
                              0: kern
                              1: user
                              2: mail
                              3: daemon
                              4: auth
                              5: syslog
                              6: lpr
                              7: news
                              8: uucp
                              9: cron
                              10: authpriv
                              11: ftp
                              16: local0
                              17: local1
                              18: local2
                              19: local3
                              20: local4
                              21: local5
                              22: local6
                              23: local7

  # SYSLOG SEVERITY
  log.syslog.severity.code: !AND [ !GET { from: !ARG EVENT, what: log.syslog.priority }, 7 ]
  log.syslog.severity.name: !MATCH
                            what: !GET {from: !ARG EVENT, what: log.syslog.severity.code}
                            with:
                              0: emergency
                              1: alert
                              2: critical
                              3: error
                              4: warning
                              5: notice
                              6: information
                              7: debug

This produces the final output, which includes the added fields:

log.syslog.priority: 38
log.syslog.severity.code: 6
log.syslog.severity.name: information
log.syslog.facility.code: 4
log.syslog.facility.name: auth
@timestamp: Sep 3 13:38:07
host.hostname: host01
process.name: systemd-logind
process.pid: 1078
message: New session 49 of user harrypotter

Syslog RFC5424

Syslog RFC5424 messages have a more structured format than RFC3164 messages. An event in RFC5424 format typically consists of the following parts:

  • priority (PRI part): A number enclosed in "<>" brackets indicating the facility and severity of the log.
  • version: The version of the syslog protocol (usually "1").
  • timestamp: Timestamp in the RFC3339 format YYYY-MM-DDThh:mm:ss.sTZD (e.g., 2003-10-11T22:14:15.003Z).
  • hostname: Hostname of the device that produced the event.
  • app-name: Name of the application that produced the event.
  • procid: Process ID of the application that produced the event.
  • msgid: Message ID.
  • structured-data: Optional structured data enclosed in square brackets.
  • message: The message part of the event.

Parsing the Header

The following parser declaration splits the RFC5424 message into its components:

10_parser.yaml
---
define:
  type: parsec/parser

parse:
  !PARSE.KVLIST
  - "<"
  - PRI: !PARSE.DIGITS
  - ">"
  - VERSION: !PARSE.DIGITS
  - !PARSE.SPACES
  - TIMESTAMP: !PARSE.DATETIME RFC3339  #(1)
  - !PARSE.SPACES
  - HOSTNAME: !PARSE.UNTIL " "
  - APPNAME: !PARSE.UNTIL " "
  - PROCID: !PARSE.UNTIL " "
  - MSGID: !PARSE.UNTIL " "
  - !PARSE.OPTIONAL { what: !PARSE.SPACES }
  - STRUCTURED_DATA: !PARSE.OPTIONAL { what: !PARSE.BETWEEN { start: "[", stop: "]" } }  #(2)
  - !PARSE.OPTIONAL { what: !PARSE.SPACES }
  - MESSAGE: !PARSE.CHARS
  1. The !PARSE.DATETIME RFC3339 expression parses the timestamp according to the RFC3339 format.
  2. Structured data is optional and enclosed in square brackets. The !PARSE.OPTIONAL expression ensures that the parser can handle messages without structured data.
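
The same header layout can be approximated in Python with a regular expression, which is handy for checking sample lines by hand. This is a sketch under stated assumptions: the names RFC5424_RE and split_rfc5424 are hypothetical, the sample line is assembled from the field values used in this section, and the timestamp is kept as a string. RFC5424 writes a lone "-" for an absent field, so the sketch accepts that for structured data as well.

```python
import re

# Hypothetical pattern mirroring the field order of the declaration above.
RFC5424_RE = re.compile(
    r"<(?P<PRI>\d{1,3})>(?P<VERSION>\d{1,2}) "
    r"(?P<TIMESTAMP>\S+) (?P<HOSTNAME>\S+) (?P<APPNAME>\S+) "
    r"(?P<PROCID>\S+) (?P<MSGID>\S+) "
    # Either "-" or one or more [ ... ] elements; "\]" inside a value is allowed.
    r"(?P<STRUCTURED_DATA>-|(?:\[(?:[^\]\\]|\\.)*\])+)"
    r"(?: (?P<MESSAGE>.*))?$",
    re.DOTALL,
)


def split_rfc5424(line: str) -> dict:
    """Split one RFC5424 line into its header fields and message."""
    match = RFC5424_RE.match(line)
    if match is None:
        raise ValueError("not an RFC5424 message: %r" % line)
    fields = match.groupdict()
    fields["PRI"] = int(fields["PRI"])
    fields["VERSION"] = int(fields["VERSION"])
    return fields


if __name__ == "__main__":
    sample = (
        '<38>1 2003-10-11T22:14:15.003Z host01 myapp 1234 ID47 '
        '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] '
        'An application event log entry...'
    )
    for key, value in split_rfc5424(sample).items():
        print(key, "=", value)
```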

Mapping to ECS Schema

After parsing a sample RFC5424 event, the output looks like this:

PRI: 38
VERSION: 1
TIMESTAMP: 2003-10-11T22:14:15.003Z
HOSTNAME: host01
APPNAME: myapp
PROCID: 1234
MSGID: ID47
STRUCTURED_DATA: [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]
MESSAGE: An application event log entry...

The next phase is mapping to the ECS schema:

20_mapping_ECS.yaml
---
define:
  type: parsec/mapping
  schema: /Schemas/ECS.yaml

mapping:
  PRI: log.syslog.priority
  VERSION: log.syslog.version
  TIMESTAMP: "@timestamp"
  HOSTNAME: host.hostname
  APPNAME: process.name
  PROCID: process.pid
  MSGID: log.syslog.message_id
  STRUCTURED_DATA: log.syslog.structured_data
  MESSAGE: message

After mapping, the output looks like this:

log.syslog.priority: 38
log.syslog.version: 1
@timestamp: 2003-10-11T22:14:15.003Z
host.hostname: host01
process.name: myapp
process.pid: 1234
log.syslog.message_id: ID47
log.syslog.structured_data: [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]
message: An application event log entry...

Enriching the Syslog Severity and Facility

Syslog facility and severity can be computed from the priority as follows:

PRIORITY = FACILITY * 8 + SEVERITY

FACILITY = PRIORITY // 8
SEVERITY = PRIORITY (mod) 8

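Because 8 is a power of two, the same values can be computed with bitwise operations: shifting PRIORITY right by 3 bits gives FACILITY, and a bitwise AND with 7 gives SEVERITY.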

To enrich the event with log.syslog.facility.code, log.syslog.facility.name, log.syslog.severity.code, and log.syslog.severity.name, we use an enricher declaration:

30_enricher_syslog_ECS.yaml
---
define:
  type: parsec/enricher
  schema: /Schemas/ECS.yaml

enrich:
  # SYSLOG FACILITY
  log.syslog.facility.code: !SHR { what: !GET { from: !ARG EVENT, what: log.syslog.priority }, by: 3 }
  log.syslog.facility.name: !MATCH
                            what: !GET { from: !ARG EVENT, what: log.syslog.facility.code }
                            with:
                              0: kern
                              1: user
                              2: mail
                              3: daemon
                              4: auth
                              5: syslog
                              6: lpr
                              7: news
                              8: uucp
                              9: cron
                              10: authpriv
                              11: ftp
                              16: local0
                              17: local1
                              18: local2
                              19: local3
                              20: local4
                              21: local5
                              22: local6
                              23: local7

  # SYSLOG SEVERITY
  log.syslog.severity.code: !AND [ !GET { from: !ARG EVENT, what: log.syslog.priority }, 7 ]
  log.syslog.severity.name: !MATCH
                            what: !GET {from: !ARG EVENT, what: log.syslog.severity.code}
                            with:
                              0: emergency
                              1: alert
                              2: critical
                              3: error
                              4: warning
                              5: notice
                              6: information
                              7: debug

This produces the final output, which includes the added fields:

log.syslog.priority: 38
log.syslog.severity.code: 6
log.syslog.severity.name: information
log.syslog.facility.code: 4
log.syslog.facility.name: auth
log.syslog.version: 1
@timestamp: 2003-10-11T22:14:15.003Z
host.hostname: host01
process.name: myapp
process.pid: 1234
log.syslog.message_id: ID47
log.syslog.structured_data: [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]
message: An application event log entry...
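
To verify the enriched values independently of the enricher, the lookup can be reproduced in a few lines of Python. This is a sketch, not the enricher's implementation: the names FACILITY_NAMES, SEVERITY_NAMES and enrich_syslog are hypothetical, the tables are copied from the declaration above, and leaving the name out when a code has no table entry is an assumption of this sketch.

```python
# Lookup tables copied from the enricher declaration above.
FACILITY_NAMES = {
    0: "kern", 1: "user", 2: "mail", 3: "daemon", 4: "auth", 5: "syslog",
    6: "lpr", 7: "news", 8: "uucp", 9: "cron", 10: "authpriv", 11: "ftp",
    16: "local0", 17: "local1", 18: "local2", 19: "local3",
    20: "local4", 21: "local5", 22: "local6", 23: "local7",
}
SEVERITY_NAMES = {
    0: "emergency", 1: "alert", 2: "critical", 3: "error",
    4: "warning", 5: "notice", 6: "information", 7: "debug",
}


def enrich_syslog(event: dict) -> dict:
    """Return a copy of the event with facility and severity fields added."""
    priority = int(event["log.syslog.priority"])
    facility = priority >> 3
    severity = priority & 7

    enriched = dict(event)
    enriched["log.syslog.facility.code"] = facility
    enriched["log.syslog.severity.code"] = severity
    # Assumption of this sketch: codes missing from the tables get no name.
    if facility in FACILITY_NAMES:
        enriched["log.syslog.facility.name"] = FACILITY_NAMES[facility]
    if severity in SEVERITY_NAMES:
        enriched["log.syslog.severity.name"] = SEVERITY_NAMES[severity]
    return enriched


if __name__ == "__main__":
    for key, value in enrich_syslog({"log.syslog.priority": 38}).items():
        print(f"{key}: {value}")
```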