Introduction
Syslog is a widely used standard for message logging in computer systems, network devices, and applications. It enables the collection, storage, and analysis of log messages from various sources, providing valuable insights for monitoring, troubleshooting, and security analysis.
There are two main variants of the syslog protocol:
- Syslog RFC3164: The legacy variant of the syslog protocol, often referred to as the BSD syslog. Its message format is described in RFC3164, section 4.
- Syslog RFC5424: The standardized and more structured variant, described in RFC5424, section 6. It introduces additional fields and improved message structure.
Understanding the differences between these protocols is essential for accurate parsing and normalization of syslog events.
Syslog RFC3164
An event in RFC3164 format typically consists of the following parts:
- priority (PRI part): A number enclosed in "<>" brackets indicating the facility and severity of the log.
- timestamp: Local time of the sender in Mmm dd hh:mm:ss format (e.g., Jan 17 12:10:03). The format carries neither a year nor a timezone, so the timezone of the sending device should be set to UTC.
- hostname: Hostname of the device that produced the event.
- process: Identifier of the process that produced the event. It can contain a PID in square brackets.
- message: The message part of the event.
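As a rough illustration of how these parts sit in a raw line, the following Python sketch splits an RFC3164-style message with a regular expression. It is not the parser declared later in this document: the names `RFC3164_RE` and `split_rfc3164` are hypothetical, and the pattern assumes a well-formed header with single spaces between fields.

```python
import re

# Hypothetical pattern for illustration only; real RFC3164 traffic varies widely.
RFC3164_RE = re.compile(
    r"<(?P<PRI>\d{1,3})>"                                       # <38>
    r"(?P<TIMESTAMP>[A-Z][a-z]{2} [ \d]?\d \d{2}:\d{2}:\d{2}) "  # Sep 3 13:38:07
    r"(?P<HOSTNAME>\S+) "                                       # host01
    r"(?P<PROCESS>\S+) ?"                                       # systemd-logind[1078]:
    r"(?P<MESSAGE>.*)",                                         # rest of the line
    re.DOTALL,
)


def split_rfc3164(line: str) -> dict:
    """Split one RFC3164-style line into its parts (all values stay strings)."""
    match = RFC3164_RE.fullmatch(line)
    if match is None:
        raise ValueError(f"not an RFC3164-style line: {line!r}")
    return match.groupdict()


event = split_rfc3164(
    "<38>Sep 3 13:38:07 host01 systemd-logind[1078]: New session 49 of user harrypotter"
)
print(event["PROCESS"])  # systemd-logind[1078]:
```

Note that the process part still contains the PID and the trailing colon at this stage.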
Parsing the Header
The following parser declaration splits the RFC3164 message into its components:
---
define:
  type: parsec/parser
parse:
  !PARSE.KVLIST
  - "<"
  - PRI: !PARSE.DIGITS
  - ">"
  - TIMESTAMP: !PARSE.DATETIME RFC3164
  - !PARSE.SPACES
  - HOSTNAME: !PARSE.UNTIL " "
  - PROCESS: !PARSE.UNTIL " "
  - !PARSE.OPTIONAL { what: !PARSE.SPACES }
  - MESSAGE: !PARSE.CHARS
Parsing the Process
The process field consists of the process name and PID. Currently, the following input:
<38>Sep 3 13:38:07 host01 systemd-logind[1078]: New session 49 of user harrypotter
produces the output:
"PROCESS": "systemd-logind[1078]:"
To correctly split the PROCESS field into PROCESS.NAME and PROCESS.PID, we need a second parser that handles all three forms the field can take:
---
define:
  type: parsec/parser
  field: PROCESS
parse:
  !TRY #(1)
  - !PARSE.KVLIST #(2)
    - PROCESS.NAME: !PARSE.UNTIL "["
    - PROCESS.PID: !PARSE.UNTIL "]"
    - !PARSE.OPTIONAL { what: !PARSE.EXACTLY ":" }
  - !PARSE.KVLIST #(3)
    - PROCESS.NAME: !PARSE.UNTIL ":"
  - !PARSE.KVLIST #(4)
    - PROCESS.NAME: !PARSE.CHARS
- (1) The !TRY expression takes a list of expressions as an argument. The parser starts with the first expression. If that expression fails, it continues with the second one, then the third, and so on.
- (2) This branch handles inputs of the form PROCESS.NAME[PROCESS.PID] (e.g., cron[123]).
- (3) This branch handles inputs of the form PROCESS.NAME: (e.g., sudo:).
- (4) This branch handles inputs of the form PROCESS.NAME, without the : at the end (e.g., sudo).
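The try-in-order behavior can be sketched outside the declaration language as well. The Python example below is only an analogy for the three branches described above, not the implementation of !TRY; the names `_BRANCHES` and `split_process` are hypothetical.

```python
import re

# Patterns are tried in order, one per branch of the declaration above.
_BRANCHES = (
    re.compile(r"(?P<name>[^\[\]]+)\[(?P<pid>[^\]]*)\]:?"),  # name[pid] or name[pid]:
    re.compile(r"(?P<name>[^:]+):"),                         # name:
    re.compile(r"(?P<name>.+)"),                             # bare name
)


def split_process(process: str) -> dict:
    """Return PROCESS.NAME and, when present, PROCESS.PID."""
    for pattern in _BRANCHES:
        match = pattern.fullmatch(process)
        if match is None:
            continue  # this branch failed, fall through to the next one
        fields = {"PROCESS.NAME": match.group("name")}
        pid = match.groupdict().get("pid")
        if pid is not None:
            fields["PROCESS.PID"] = pid
        return fields
    raise ValueError(f"cannot parse process field: {process!r}")


print(split_process("systemd-logind[1078]:"))
# {'PROCESS.NAME': 'systemd-logind', 'PROCESS.PID': '1078'}
```

The order matters: the most specific form is tried first, and the catch-all branch comes last so that it only applies when nothing else matched.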
Mapping to ECS Schema
After scanning the entire log, the output looks like this:
PRI: 38
TIMESTAMP: Sep 3 13:38:07
HOSTNAME: host01
PROCESS.NAME: systemd-logind
PROCESS.PID: 1078
MESSAGE: New session 49 of user harrypotter
The next phase is mapping to the ECS schema:
---
define:
  type: parsec/mapping
  schema: /Schemas/ECS.yaml
mapping:
  PRI: log.syslog.priority
  TIMESTAMP: "@timestamp"
  HOSTNAME: host.hostname
  PROCESS.NAME: process.name
  PROCESS.PID: process.pid
  MESSAGE: message
After mapping, the output looks like this:
log.syslog.priority: 38
@timestamp: Sep 3 13:38:07
host.hostname: host01
process.name: systemd-logind
process.pid: 1078
message: New session 49 of user harrypotter
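Conceptually, the mapping step renames parsed fields to schema fields. A minimal Python sketch of that idea follows. The `ECS_MAPPING` table mirrors the mapping declaration above, `map_to_ecs` is a hypothetical helper, and dropping fields that have no mapping is an assumption of this sketch rather than documented behavior.

```python
# Field names on the left come from the parser, names on the right from ECS.
ECS_MAPPING = {
    "PRI": "log.syslog.priority",
    "TIMESTAMP": "@timestamp",
    "HOSTNAME": "host.hostname",
    "PROCESS.NAME": "process.name",
    "PROCESS.PID": "process.pid",
    "MESSAGE": "message",
}


def map_to_ecs(parsed: dict) -> dict:
    """Rename parsed fields; fields without a mapping are dropped (assumption)."""
    return {
        ECS_MAPPING[key]: value
        for key, value in parsed.items()
        if key in ECS_MAPPING
    }


parsed = {"PRI": 38, "HOSTNAME": "host01", "PROCESS.NAME": "systemd-logind"}
print(map_to_ecs(parsed))
# {'log.syslog.priority': 38, 'host.hostname': 'host01', 'process.name': 'systemd-logind'}
```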
Enriching the Syslog Severity and Facility
Syslog facility and severity can be computed from the priority as follows:
PRIORITY = FACILITY * 8 + SEVERITY
FACILITY = PRIORITY // 8
SEVERITY = PRIORITY (mod) 8
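Because 8 is a power of two, the same values can be computed with bit operations: a right shift by 3 yields the facility, and a bitwise AND with 7 yields the severity.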
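The arithmetic can be checked with a few lines of Python. The function name `decode_priority` is hypothetical; the sketch only demonstrates that the shift-and-mask form agrees with integer division and modulo.

```python
def decode_priority(priority: int) -> tuple:
    """Split a syslog PRI value into (facility, severity)."""
    facility = priority >> 3     # same as priority // 8
    severity = priority & 0b111  # same as priority % 8
    return facility, severity


print(decode_priority(38))  # (4, 6)
```

For the example event with priority 38, this gives facility 4 and severity 6.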
To enrich the event with log.syslog.facility.code, log.syslog.facility.name, log.syslog.severity.code, and log.syslog.severity.name, we use an enricher declaration:
---
define:
  type: parsec/enricher
  schema: /Schemas/ECS.yaml
enrich:
  # SYSLOG FACILITY
  log.syslog.facility.code: !SHR { what: !GET { from: !ARG EVENT, what: log.syslog.priority }, by: 3 }
  log.syslog.facility.name: !MATCH
    what: !GET { from: !ARG EVENT, what: log.syslog.facility.code }
    with:
      0: kern
      1: user
      2: mail
      3: daemon
      4: auth
      5: syslog
      6: lpr
      7: news
      8: uucp
      9: cron
      10: authpriv
      11: ftp
      16: local0
      17: local1
      18: local2
      19: local3
      20: local4
      21: local5
      22: local6
      23: local7
  # SYSLOG SEVERITY
  log.syslog.severity.code: !AND [ !GET { from: !ARG EVENT, what: log.syslog.priority }, 7 ]
  log.syslog.severity.name: !MATCH
    what: !GET { from: !ARG EVENT, what: log.syslog.severity.code }
    with:
      0: emergency
      1: alert
      2: critical
      3: error
      4: warning
      5: notice
      6: information
      7: debug
This produces the following (and final) output, with the added fields:
log.syslog.priority: 38
log.syslog.severity.code: 6
log.syslog.severity.name: information
log.syslog.facility.code: 4
log.syslog.facility.name: auth
@timestamp: Sep 3 13:38:07
host.hostname: host01
process.name: systemd-logind
process.pid: 1078
message: New session 49 of user harrypotter
Syslog RFC5424
Syslog RFC5424 messages have a more structured format than RFC3164 messages. An event in RFC5424 format typically consists of the following parts:
- priority (PRI part): A number enclosed in "<>" brackets indicating the facility and severity of the log.
- version: The version of the syslog protocol (usually "1").
- timestamp: The timestamp in RFC3339 format, YYYY-MM-DDThh:mm:ss.sTZD (e.g., 2003-10-11T22:14:15.003Z).
- hostname: Hostname of the device that produced the event.
- app-name: Name of the application that produced the event.
- procid: Process ID of the application that produced the event.
- msgid: Message ID.
- structured-data: Optional structured data enclosed in square brackets.
- message: The message part of the event.
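To make the field order concrete, here is a hedged Python sketch that splits an RFC5424-style line with a regular expression. The names `RFC5424_RE` and `split_rfc5424` are hypothetical, and the example line is assembled from the sample values used in this section. RFC5424 uses a single `-` for absent values, which the pattern accepts in place of structured data. Escaped backslashes directly before `]` are not handled, so treat this as an illustration rather than a complete parser.

```python
import re

# Hypothetical pattern; header fields are separated by single spaces.
RFC5424_RE = re.compile(
    r"<(?P<PRI>\d{1,3})>(?P<VERSION>\d{1,2}) "
    r"(?P<TIMESTAMP>\S+) (?P<HOSTNAME>\S+) (?P<APPNAME>\S+) "
    r"(?P<PROCID>\S+) (?P<MSGID>\S+) "
    r"(?P<STRUCTURED_DATA>-|(?:\[.*?(?<!\\)\])+)"  # "-" or one or more [...] elements
    r"(?: (?P<MESSAGE>.*))?",                      # the message itself is optional
    re.DOTALL,
)


def split_rfc5424(line: str) -> dict:
    """Split one RFC5424-style line into its parts (all values stay strings)."""
    match = RFC5424_RE.fullmatch(line)
    if match is None:
        raise ValueError(f"not an RFC5424-style line: {line!r}")
    return match.groupdict()


line = (
    '<38>1 2003-10-11T22:14:15.003Z host01 myapp 1234 ID47 '
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] '
    'An application event log entry...'
)
print(split_rfc5424(line)["MSGID"])  # ID47
```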
Parsing the Header
The following parser declaration splits the RFC5424 message into its components:
---
define:
  type: parsec/parser
parse:
  !PARSE.KVLIST
  - "<"
  - PRI: !PARSE.DIGITS
  - ">"
  - VERSION: !PARSE.DIGITS
  - !PARSE.SPACES
  - TIMESTAMP: !PARSE.DATETIME RFC3339 #(1)
  - !PARSE.SPACES
  - HOSTNAME: !PARSE.UNTIL " "
  - APPNAME: !PARSE.UNTIL " "
  - PROCID: !PARSE.UNTIL " "
  - MSGID: !PARSE.UNTIL " "
  - !PARSE.OPTIONAL { what: !PARSE.SPACES }
  - STRUCTURED_DATA: !PARSE.OPTIONAL { what: !PARSE.BETWEEN { start: "[", stop: "]" } } #(2)
  - !PARSE.OPTIONAL { what: !PARSE.SPACES }
  - MESSAGE: !PARSE.CHARS
- (1) The !PARSE.DATETIME RFC3339 expression parses the timestamp according to the RFC3339 format.
- (2) Structured data is optional and enclosed in square brackets. The !PARSE.OPTIONAL expression ensures that the parser can handle messages without structured data.
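For readers who want to see what RFC3339 parsing amounts to, the following Python sketch converts such a timestamp into a timezone-aware `datetime` using only the standard library. The helper name `parse_rfc3339` is hypothetical, and this is not how !PARSE.DATETIME is implemented. Python versions before 3.11 do not accept a trailing `Z` in `datetime.fromisoformat()` and are stricter about the number of fractional digits, so the sketch normalizes the suffix first.

```python
from datetime import datetime, timezone


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC3339 timestamp into an aware datetime."""
    # Replace the UTC designator with an explicit offset for older Pythons.
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


parsed = parse_rfc3339("2003-10-11T22:14:15.003Z")
print(parsed.isoformat())  # 2003-10-11T22:14:15.003000+00:00
```

Unlike the RFC3164 timestamp, the value carries both the year and the UTC offset, so no assumptions about the sender's clock settings are needed.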
Mapping to ECS Schema
After scanning the entire log, the output looks like this:
PRI: 38
VERSION: 1
TIMESTAMP: 2003-10-11T22:14:15.003Z
HOSTNAME: host01
APPNAME: myapp
PROCID: 1234
MSGID: ID47
STRUCTURED_DATA: [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]
MESSAGE: An application event log entry...
The next phase is mapping to the ECS schema:
---
define:
  type: parsec/mapping
  schema: /Schemas/ECS.yaml
mapping:
  PRI: log.syslog.priority
  VERSION: log.syslog.version
  TIMESTAMP: "@timestamp"
  HOSTNAME: host.hostname
  APPNAME: process.name
  PROCID: process.pid
  MSGID: log.syslog.message_id
  STRUCTURED_DATA: log.syslog.structured_data
  MESSAGE: message
After mapping, the output looks like this:
log.syslog.priority: 38
log.syslog.version: 1
@timestamp: 2003-10-11T22:14:15.003Z
host.hostname: host01
process.name: myapp
process.pid: 1234
log.syslog.message_id: ID47
log.syslog.structured_data: [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]
message: An application event log entry...
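In the output above, the structured data is kept as a single string. If the individual parameters are needed, they can be extracted in a further step. The Python sketch below shows one possible way to do that. It is not part of the declarations in this document, and the names `parse_structured_data`, `_SD_ELEMENT`, and `_SD_PARAM` are hypothetical. It relies on the RFC5424 rule that `"`, `\`, and `]` inside a parameter value are escaped with a backslash.

```python
import re

_NAME = r'[^ =\]"]+'           # SD-ID or parameter name
_VALUE = r'(?:\\.|[^"\\\]])*'  # escaped character or any ordinary character

_SD_ELEMENT = re.compile(
    r'\[(?P<id>' + _NAME + r')(?P<params>(?: ' + _NAME + r'="' + _VALUE + r'")*)\]'
)
_SD_PARAM = re.compile(r' (?P<name>' + _NAME + r')="(?P<value>' + _VALUE + r')"')


def parse_structured_data(raw: str) -> dict:
    """Turn '[id k="v" ...][id2 ...]' into {id: {k: v, ...}, id2: {...}}."""
    result = {}
    for element in _SD_ELEMENT.finditer(raw):
        params = {}
        for param in _SD_PARAM.finditer(element.group("params")):
            # Undo the backslash escaping of '"', '\' and ']'.
            value = re.sub(r'\\(["\\\]])', r"\1", param.group("value"))
            params[param.group("name")] = value
        result[element.group("id")] = params
    return result


sd = '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]'
print(parse_structured_data(sd))
# {'exampleSDID@32473': {'iut': '3', 'eventSource': 'Application', 'eventID': '1011'}}
```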
Enriching the Syslog Severity and Facility
Syslog facility and severity can be computed from the priority as follows:
PRIORITY = FACILITY * 8 + SEVERITY
FACILITY = PRIORITY // 8
SEVERITY = PRIORITY (mod) 8
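Because 8 is a power of two, the same values can be computed with bit operations: a right shift by 3 yields the facility, and a bitwise AND with 7 yields the severity.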
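The first formula can also be read in the other direction, which is handy for generating test events. The Python sketch below builds a priority from a facility and a severity and rejects values outside the ranges used in this document (facility 0 to 23, severity 0 to 7). The function name `encode_priority` is hypothetical.

```python
def encode_priority(facility: int, severity: int) -> int:
    """Combine facility and severity into a syslog PRI value."""
    if not 0 <= facility <= 23:
        raise ValueError(f"facility out of range: {facility}")
    if not 0 <= severity <= 7:
        raise ValueError(f"severity out of range: {severity}")
    return (facility << 3) | severity  # same as facility * 8 + severity


print(encode_priority(4, 6))  # 38
```

With these ranges, the largest possible priority is 23 * 8 + 7 = 191.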
To enrich the event with log.syslog.facility.code, log.syslog.facility.name, log.syslog.severity.code, and log.syslog.severity.name, we use an enricher declaration:
---
define:
  type: parsec/enricher
  schema: /Schemas/ECS.yaml
enrich:
  # SYSLOG FACILITY
  log.syslog.facility.code: !SHR { what: !GET { from: !ARG EVENT, what: log.syslog.priority }, by: 3 }
  log.syslog.facility.name: !MATCH
    what: !GET { from: !ARG EVENT, what: log.syslog.facility.code }
    with:
      0: kern
      1: user
      2: mail
      3: daemon
      4: auth
      5: syslog
      6: lpr
      7: news
      8: uucp
      9: cron
      10: authpriv
      11: ftp
      16: local0
      17: local1
      18: local2
      19: local3
      20: local4
      21: local5
      22: local6
      23: local7
  # SYSLOG SEVERITY
  log.syslog.severity.code: !AND [ !GET { from: !ARG EVENT, what: log.syslog.priority }, 7 ]
  log.syslog.severity.name: !MATCH
    what: !GET { from: !ARG EVENT, what: log.syslog.severity.code }
    with:
      0: emergency
      1: alert
      2: critical
      3: error
      4: warning
      5: notice
      6: information
      7: debug
This produces the following (and final) output, with the added fields:
log.syslog.priority: 38
log.syslog.severity.code: 6
log.syslog.severity.name: information
log.syslog.facility.code: 4
log.syslog.facility.name: auth
log.syslog.version: 1
@timestamp: 2003-10-11T22:14:15.003Z
host.hostname: host01
process.name: myapp
process.pid: 1234
log.syslog.message_id: ID47
log.syslog.structured_data: [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]
message: An application event log entry...