Syslog in JSON

Syslog is a line protocol specified in RFC5424 and RFC3164. This specification defines a serialization of the syslog values in JSON format.

Example:

{
    "t": "2017-01-17T03:03:47.365Z", // Timestamp
    "T": "syslog", // Type
    "M": "Message",
    "m": "Short message", // (Optional)
    "H": "hostname",
    "P": "program",
    "C": "component", // (Optional)
    "s": "source_code.c:12", // Reference to source code (Optional)
    "p": 12345, // PID
    "Th": 67890, // Thread ID (Optional)
    "l": 4, // Level
    "f": 2, // Facility (Optional)
    "e": "prod", // Environment (Optional)
    "I": "1.2.3.4", // IP address (Optional)
    "E": "123", // Error code (Optional)

    // HTTP Access Logs (whole section is optional)
    "al.A": "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko", // Agent string
    "al.I": "-", // Ident (Optional)
    "al.a": "-", // Auth (Optional)
    "al.b": 12574, // Bytes
    "al.c": "200", // Response code
    "al.m": "GET", // Method
    "al.p": "/path", // Path
    "al.r": "/some/path", // Referrer (Optional)
    "al.v": "HTTP/1.1", // Version

    // Optional 1st structured data dictionary
    "sd.<app-id>": {
        "a1": "b",
        "a2": "c",
        "a3": 3
    },

    // Optional 2nd structured data dictionary
    "sd.<app-id2>": {
        "b1": "b",
        "c1": "c",
        "d3": 3
    }
}

Timestamp t

The date and time of the log record, in the UTC timezone, formatted according to RFC3339. Example of an RFC3339 log event timestamp: 2017-04-28T13:19:26.680Z.
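A minimal sketch of producing such a timestamp in Python. The helper name rfc3339_utc is an assumption made for illustration and is not part of this specification; millisecond precision is chosen to match the example above.

```python
from datetime import datetime, timezone


def rfc3339_utc(dt: datetime) -> str:
    """Format a timezone-aware datetime as RFC3339 in UTC with milliseconds."""
    if dt.tzinfo is None:
        # A naive datetime has no defined offset, so refuse to guess.
        raise ValueError("naive datetime; attach a timezone first")
    text = dt.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    # isoformat() renders UTC as "+00:00"; the examples here use "Z".
    return text.replace("+00:00", "Z")


print(rfc3339_utc(datetime.now(timezone.utc)))
```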

Type T

A keyword string that specifies the type of the log record.

E.g. syslog or al (for access log).

Message M

Text of the log record message. Can be multi-line, in UTF-8 encoding.

Short message m

Some systems also provide a short message. This field is optional.

Text in UTF-8 encoding.

Hostname H

Hostname identifies the machine that originally sent the syslog message.

The hostname field should contain the hostname and the domain name of the originator in the format specified in RFC1034. This format is called a Fully Qualified Domain Name (FQDN) in this document.

Program P

The Program field should identify the device or application that originated the message. It is a string without further semantics. It is intended for filtering messages.

Pid p

Pid is a value that identifies the process that created the syslog entry. It is an integer.

Thread ID Th

ThreadID is a value that identifies a thread inside the process. It is an integer. This value is optional.

Level l

The Level field indicates the degree of severity.

Numerical value  Severity
0                Emergency: system is unusable
1                Alert: action must be taken immediately
2                Critical: critical conditions
3                Error: error conditions
4                Warning: warning conditions
5                Notice: normal but significant condition
6                Informational: informational messages
7                Debug: debug-level messages
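Python's logging module uses its own level numbers, so a sender has to translate them to the numerical values above. The sketch below is one possible mapping, chosen to mirror the default priority map of the standard library's SysLogHandler; the names PY_TO_SYSLOG_LEVEL and syslog_level are assumptions made for illustration.

```python
import logging

# Assumed mapping from Python logging levels to syslog severity numbers.
PY_TO_SYSLOG_LEVEL = {
    logging.CRITICAL: 2,  # Critical
    logging.ERROR: 3,     # Error
    logging.WARNING: 4,   # Warning
    logging.INFO: 6,      # Informational
    logging.DEBUG: 7,     # Debug
}


def syslog_level(levelno: int) -> int:
    """Return the syslog severity for a Python logging level number.

    Custom levels that fall between two named levels are treated as the
    nearest named level below them; anything under DEBUG maps to Debug.
    """
    for py_level in sorted(PY_TO_SYSLOG_LEVEL, reverse=True):
        if levelno >= py_level:
            return PY_TO_SYSLOG_LEVEL[py_level]
    return 7


print(syslog_level(logging.WARNING))
```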

Facility f

An obsolete field, provided mainly for syslog compatibility. This field is optional.

Numerical value  Facility
0                kernel messages
1                user-level messages
2                mail system
3                system daemons
4                security/authorization messages
5                messages generated internally by syslogd
6                line printer subsystem
7                network news subsystem
8                UUCP subsystem
9                clock daemon
10               security/authorization messages
11               FTP daemon
12               NTP subsystem
13               log audit
14               log alert
15               clock daemon
16               local use 0 (local0)
17               local use 1 (local1)
18               local use 2 (local2)
19               local use 3 (local3)
20               local use 4 (local4)
21               local use 5 (local5)
22               local use 6 (local6)
23               local use 7 (local7)
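In RFC5424 the facility and the level are combined into a single priority value, PRI = facility * 8 + level, which appears in angle brackets at the start of a syslog line. The sketch below shows the arithmetic; the function names encode_pri and decode_pri are assumptions made for illustration.

```python
from typing import Tuple


def encode_pri(facility: int, level: int) -> int:
    """Combine a facility (0-23) and a level (0-7) into an RFC5424 PRI value."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be in the range 0-23")
    if not 0 <= level <= 7:
        raise ValueError("level must be in the range 0-7")
    return facility * 8 + level


def decode_pri(pri: int) -> Tuple[int, int]:
    """Split a PRI value back into (facility, level)."""
    if not 0 <= pri <= 191:
        raise ValueError("PRI must be in the range 0-191")
    return divmod(pri, 8)


print(encode_pri(1, 6))  # user-level facility, Informational level
```

For example, the <14> prefix used in the wire protocol example of this document decodes to facility 1 (user-level messages) and level 6 (Informational).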

Environment e

A one-word identifier (keyword) of the environment, such as prod for production, test for the test environment, staging for the staging environment, or devel for development.

IP Address I

An IP address (IPv4 or IPv6) that is related to the log entry. It could be a peer IP address, the IP address of the server, etc.

Error code E

An error code such as errno, given as an integer or as a keyword-like text. Optional. Additional error information goes into the M or m fields.

Structured data

Structured data provides a mechanism to express information in a well defined, easily parseable and interpretable data format. There are multiple usage scenarios. For example, it may express meta-information about the syslog message or application-specific information such as traffic counters or IP addresses.
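As a rough sketch, a structured data dictionary can be attached to a record under a key of the form sd.<app-id>, as in the example at the top of this document. The helper name add_structured_data, the app-id "traffic", and the counter names are hypothetical and only serve as illustration.

```python
import json


def add_structured_data(record: dict, app_id: str, data: dict) -> dict:
    """Return a copy of record with data stored under the key 'sd.<app-id>'."""
    out = dict(record)
    out["sd." + app_id] = dict(data)
    return out


record = {"T": "syslog", "M": "Message", "H": "hostname"}
# "traffic", "rx_bytes" and "peer" are made-up names for this example.
record = add_structured_data(record, "traffic", {"rx_bytes": 1024, "peer": "1.2.3.4"})
print(json.dumps(record))
```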

Syslog-ng configuration for LogMan.io

The following configuration is needed in the syslog-ng configuration file for Syslog in JSON to be processed correctly in LogMan.io:

source s_syslogjson {
    unix-dgram(
        "/tmp/syslogjson.sock"
        flags(syslog-protocol));
};
parser p_syslogjson {
    map-value-pairs(
        pair("lm.T" "sj")
    );
};
log {
    source(s_syslogjson);
    parser(p_syslogjson);
    destination(d_amqp);
};

Python Logging Setup

We will use SysLogHandler to send logs to a datagram socket in Python.

Python

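import logging
import logging.handlers
import socket
import time

# Handler: sends each record as a datagram to the syslog-ng Unix socket
h = logging.handlers.SysLogHandler('/tmp/syslogjson.sock')
h.setLevel(logging.INFO)

# Formatter: syslog header followed by the JSON payload.
# Note: %(message)s is inserted verbatim, so a message containing double
# quotes or backslashes produces invalid JSON, and %(levelname)s yields a
# name such as INFO rather than a numeric level.
f = logging.Formatter(
    '1 - - - - -  {"t": "%(asctime)s.%(msecs)03dZ", "M": "%(message)s", "H": "%(hostname)s", "P": "%(filename)s", "p": %(process)d, "l": "%(levelname)s"}',
    datefmt='%Y-%m-%dT%H:%M:%S'
)
# Render asctime in UTC so that the trailing "Z" is correct
f.converter = time.gmtime
# Attach formatter to the handler
h.setFormatter(f)

# Context filter (adds the hostname attribute used by the formatter)
class ContextFilter(logging.Filter):
    hostname = socket.gethostname()
    def filter(self, record):
        record.hostname = ContextFilter.hostname
        return True
cf = ContextFilter()
# Filter on the handler so that records from child loggers are covered too
h.addFilter(cf)

# Logger setup
logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.addHandler(h)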

Django

Logging using SysLogHandler needs to be set up in the Django settings file.

Django provides a placeholder SENDER_NAME that we can use instead of %(hostname)s.
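One way such a setup might look is sketched below as a LOGGING dictionary for the Django settings file, which Django passes to logging.config.dictConfig. This sketch does not use SENDER_NAME; it reuses the hostname filter approach from the plain Python setup instead. The class names HostnameFilter and UTCFormatter and the dictionary keys "hostname" and "syslogjson" are assumptions made for illustration.

```python
import logging
import socket
import time


class HostnameFilter(logging.Filter):
    """Adds the hostname attribute that the format string refers to."""
    hostname = socket.gethostname()

    def filter(self, record):
        record.hostname = HostnameFilter.hostname
        return True


class UTCFormatter(logging.Formatter):
    """Renders asctime in UTC so that the trailing "Z" is correct."""
    converter = time.gmtime


LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "filters": {"hostname": {"()": HostnameFilter}},
    "formatters": {
        "syslogjson": {
            "()": UTCFormatter,
            "format": (
                '1 - - - - -  {"t": "%(asctime)s.%(msecs)03dZ", "M": "%(message)s", '
                '"H": "%(hostname)s", "P": "%(filename)s", "p": %(process)d, '
                '"l": "%(levelname)s"}'
            ),
            "datefmt": "%Y-%m-%dT%H:%M:%S",
        },
    },
    "handlers": {
        "syslogjson": {
            "class": "logging.handlers.SysLogHandler",
            "address": "/tmp/syslogjson.sock",
            "level": "INFO",
            "filters": ["hostname"],
            "formatter": "syslogjson",
        },
    },
    "root": {"handlers": ["syslogjson"], "level": "INFO"},
}
```

The handler connects to the socket when the configuration is loaded, so syslog-ng has to be listening on /tmp/syslogjson.sock before Django starts.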

Wire protocol

In order to work with various syslog implementations, the wire protocol is designed to be compatible with RFC5424 when transmitted via a syslog server.
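On the receiving side, a line in this wire format can be split into its priority value and its JSON payload. The following is a minimal sketch that assumes the header always has the shape used in this document (version 1 followed by five "-" fields); the name parse_wire and the regular expression are assumptions made for illustration, not part of the specification.

```python
import json
import re
from typing import Tuple

# "<PRI>1", five "-" header fields, one or more spaces, then the JSON object.
_WIRE_RE = re.compile(r"<(\d{1,3})>1(?: -){5} +(\{.*\})", re.DOTALL)


def parse_wire(line: str) -> Tuple[int, int, dict]:
    """Parse one wire line into (facility, level, record)."""
    # Senders may terminate the datagram with a NUL byte or a newline.
    m = _WIRE_RE.fullmatch(line.rstrip("\x00\r\n"))
    if m is None:
        raise ValueError("not a Syslog in JSON line")
    facility, level = divmod(int(m.group(1)), 8)
    return facility, level, json.loads(m.group(2))


sample = '<14>1 - - - - - {"t": "2017-01-17T03:03:47.365Z", "M": "Message", "H": "hostname"}'
print(parse_wire(sample))
```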

Example RFC5424-compatible serialization:

<14>1 - - - - - {"t": "2017-01-17T03:03:47.365Z", "M": "Message", "H": "hostname"}