PARSEC expressions¤

PARSEC expressions implement the concept of parser combinators.

They provide a way to combine basic parsers into more complex parsers for specific rules. In this context, a parser is a function that takes a string as input and produces either a structured output, which indicates successful parsing, or an error message if parsing fails.

Parsec expressions are divided into two groups: parsers and combinators.

Parsers can be seen as the fundamental units or building blocks. They are responsible for recognizing and processing specific patterns or elements within the input string.

Combinators, on the other hand, are operators or functions that allow the combination and composition of parsers.
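
The split between parsers and combinators can be illustrated with a small Python sketch. It is not how SP-Lang is implemented: the function names and the convention of returning `(value, new_position)` or `None` on failure are assumptions made only for this illustration.

```python
from typing import Callable, Optional, Tuple

# Assumed shape of a parser for this sketch: it takes the input and a position
# and returns (value, new_position), or None when it fails.
Parser = Callable[[str, int], Optional[Tuple[object, int]]]


def digit(text: str, pos: int):
    """Basic parser: accept one ASCII digit."""
    if pos < len(text) and text[pos] in "0123456789":
        return text[pos], pos + 1
    return None


def letter(text: str, pos: int):
    """Basic parser: accept one Latin letter."""
    if pos < len(text) and text[pos].isascii() and text[pos].isalpha():
        return text[pos], pos + 1
    return None


def sequence(*parsers: Parser) -> Parser:
    """Combinator: run the parsers one after another, fail if any of them fails."""
    def run(text: str, pos: int):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return run


letter_then_digit = sequence(letter, digit)
print(letter_then_digit("A2", 0))  # (['A', '2'], 2)
print(letter_then_digit("22", 0))  # None
```

Here `digit` and `letter` play the role of parsers, while `sequence` is a combinator: it knows nothing about characters and only describes how other parsers are put together.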

Every expression starts with the !PARSE. prefix.


!PARSE.DIGIT: Parse a single digit¤

Type: Parser.

Synopsis:

!PARSE.DIGIT

Example

Input string: 2

!PARSE.DIGIT

!PARSE.DIGITS: Parse a sequence of digits¤

Type: Parser.

Synopsis:

!PARSE.DIGITS
min: <...>
max: <...>
exactly: <...>
Fields min, max and exactly are optional.

Warning

The exactly field can't be used together with the min or max fields, and the max value can't be less than the min value.

Example

Input string: 123

!PARSE.DIGITS
max: 4
More examples

Parse as many digits as possible:
!PARSE.DIGITS
Parse exactly 3 digits:
!PARSE.DIGITS
exactly: 3
Parse at least 2 digits, but not more than 4:
!PARSE.DIGITS
min: 2
max: 4
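
How min, max and exactly interact can be modelled in a few lines of Python. This is a sketch, not the SP-Lang implementation: the name `parse_digits`, the `min_count`/`max_count` parameter names, the rule that at least one digit is needed, and the integer return value are all assumptions.

```python
def parse_digits(text, pos=0, min_count=None, max_count=None, exactly=None):
    """Return (number, new_pos), or None when too few digits are found."""
    if exactly is not None and (min_count is not None or max_count is not None):
        raise ValueError("exactly cannot be combined with min or max")
    if min_count is not None and max_count is not None and max_count < min_count:
        raise ValueError("max cannot be less than min")
    if exactly is not None:
        min_count = max_count = exactly
    if min_count is None:
        min_count = 1  # assumption: at least one digit is needed to succeed

    end = pos
    # Consume digits greedily, but never past the upper bound.
    while (end < len(text) and text[end] in "0123456789"
           and (max_count is None or end - pos < max_count)):
        end += 1
    if end - pos < max(min_count, 1):
        return None
    return int(text[pos:end]), end


print(parse_digits("123", max_count=4))             # (123, 3)
print(parse_digits("12345", exactly=3))             # (123, 3)
print(parse_digits("1", min_count=2, max_count=4))  # None
```

The upper bound only limits how much is consumed, so any remaining digits are left for the next parser.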

!PARSE.LETTER: Parse a single letter¤

Latin letters from A to Z, both uppercase and lowercase.

Type: Parser.

Synopsis:

!PARSE.LETTER

Example

Input string: A

!PARSE.LETTER

!PARSE.CHAR: Parse a single character¤

Any type of character.

Type: Parser.

Synopsis:

!PARSE.CHAR

Example

Input string: @

!PARSE.CHAR

!PARSE.CHARS: Parse a sequence of characters¤

Type: Parser.

Synopsis:

!PARSE.CHARS
min: <...>
max: <...>
exactly: <...>
Fields min, max and exactly are optional.

Warning

The exactly field can't be used together with the min or max fields, and the max value can't be less than the min value.

Example

Input string: name@123_

!PARSE.CHARS
max: 8

Tip

Use !PARSE.CHARS without fields to parse till the end of the string.

More examples

Parse as many chars as possible:
!PARSE.CHARS
Parse exactly 3 chars:
!PARSE.CHARS
exactly: 3
Parse at least 2 chars, but not more than 4:
!PARSE.CHARS
min: 2
max: 4

!PARSE.SPACE: Parse a single space character¤

Type: Parser.

Synopsis:

!PARSE.SPACE

!PARSE.SPACES: Parse a sequence of space characters¤

Parses as many space characters as possible.

Type: Parser.

Synopsis:

!PARSE.SPACES

!PARSE.ONEOF: Parse a single character from a set of characters¤

Type: Parser.

Synopsis:

!PARSE.ONEOF
what: <...>
or shorter version:
!PARSE.ONEOF <...>

Example

Input string: Wow!

!PARSE.ONEOF
what: "!?."

!PARSE.NONEOF: Parse a single character that is not in a set of characters¤

Type: Parser.

Synopsis:

!PARSE.NONEOF
what: <...>
or shorter version:
!PARSE.NONEOF <...>

Example

Input string: Wow!

!PARSE.NONEOF
what: ",;:[]()"

!PARSE.UNTIL: Parse a sequence of characters until a specific character is found¤

Type: Parser.

Synopsis:

!PARSE.UNTIL
what: <...>
stop: <before/after>
eof: <true/false>
or shorter version:
!PARSE.UNTIL <...>

  • stop - indicates whether parsing stops before or after the stop character. Possible values: before or after (default).

  • eof - indicates whether to parse till the end of the string if the what character is not found. Possible values: true or false (default).

Info

The what field must be a single character. Some whitespace characters can also be given by name, such as tab.

Example

Input string: 60290:11

!PARSE.UNTIL
what: ":"
More examples

Parse until : symbol and stop before it:
!PARSE.UNTIL
what: ":"
stop: "before"
Parse until space symbol and stop after it:
!PARSE.UNTIL ' '
Parse until , symbol or parse till the end of the string if it's not found:
!PARSE.UNTIL
what: ","
eof: true
Parse until tab symbol:
!PARSE.UNTIL
what: 'tab'
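
The effect of stop and eof can be sketched in Python. The function `parse_until` below is hypothetical and only models the behaviour described above; it assumes that the stop character is never part of the returned value and that stop only decides whether the position moves past it.

```python
def parse_until(text, pos=0, what=" ", stop="after", eof=False):
    """Return (parsed_text, new_pos), or None if `what` is missing and eof is False."""
    index = text.find(what, pos)
    if index == -1:
        # Stop character not found: either consume the rest or fail.
        return (text[pos:], len(text)) if eof else None
    # Assumption: the value excludes the stop character in both modes.
    new_pos = index + len(what) if stop == "after" else index
    return text[pos:index], new_pos


print(parse_until("60290:11", what=":"))                 # ('60290', 6)
print(parse_until("60290:11", what=":", stop="before"))  # ('60290', 5)
print(parse_until("no comma here", what=","))            # None
print(parse_until("no comma here", what=",", eof=True))  # ('no comma here', 13)
```

With stop set to before, the next parser still sees the stop character, which is useful when that character is matched explicitly by a following element.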

!PARSE.EXACTLY: Parse precisely a defined sequence of characters¤

Type: Parser.

Synopsis:

!PARSE.EXACTLY
what: <...>
or shorter version:
!PARSE.EXACTLY <...>

Example

Input string: Hello world!

!PARSE.EXACTLY
what: "Hello"

!PARSE.BETWEEN: Parse a sequence of characters between two specific characters¤

Type: Parser.

Synopsis:

!PARSE.BETWEEN
what: <...>
start: <...>
stop: <...>
escape: <...>
or shorter version:
!PARSE.BETWEEN <...>

  • what - the delimiter character, used when the opening and closing characters are the same.

  • start, stop - the opening and closing characters, used when they differ.

  • escape - the escape character.

Example

Input string: [10/May/2023:08:15:54 +0000]

!PARSE.BETWEEN
start: '['
stop: ']'
More examples

Parse between double-quotes:
!PARSE.BETWEEN
what: '"'
Parse between double-quotes, short form:
!PARSE.BETWEEN '"'
Parse between double-quotes, escape internal double-quotes:
Input string: "one, \"two\", three"
!PARSE.BETWEEN
what: '"'
escape: '\'
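
The role of the escape character can be shown with a short Python sketch. `parse_between` is a hypothetical helper, not part of SP-Lang, and it assumes that the escape character itself is dropped from the result while the escaped character is kept.

```python
def parse_between(text, pos=0, what=None, start=None, stop=None, escape=None):
    """Return (inner_text, new_pos), or None when the delimiters do not match."""
    if what is not None:
        start = stop = what
    if start is None or stop is None:
        raise ValueError("give either what, or both start and stop")
    if not text.startswith(start, pos):
        return None
    i = pos + len(start)
    chars = []
    while i < len(text):
        if escape is not None and text[i] == escape and i + 1 < len(text):
            chars.append(text[i + 1])  # keep the escaped character, drop the escape
            i += 2
        elif text.startswith(stop, i):
            return "".join(chars), i + len(stop)
        else:
            chars.append(text[i])
            i += 1
    return None  # closing delimiter never found


print(parse_between("[10/May/2023:08:15:54 +0000]", start="[", stop="]"))
# ('10/May/2023:08:15:54 +0000', 28)
print(parse_between('"one, \\"two\\", three"', what='"', escape="\\"))
# ('one, "two", three', 21)
```

Without an escape character, the second call would stop at the first inner double-quote and return only `one, `.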

!PARSE.REGEX: Parse a sequence of characters that matches a regular expression¤

Type: Parser.

Synopsis:

!PARSE.REGEX
what: <...>

Example

Input string: FTVW23_L-C: Message...

Output: FTVW23_L-C

!PARSE.REGEX
what: '[a-zA-Z0-9_\-0]+'

!PARSE.MONTH: Parse a month name¤

Type: Parser.

Synopsis:

!PARSE.MONTH
what: <...>
or shorter version:
!PARSE.MONTH <...>

  • what - indicates a format of the month name. Possible values: number, short, full.

Tip

Use !PARSE.MONTH to parse the month name as part of !PARSE.DATETIME.

Example

Input string: 10/May/2023:08:15:54

!PARSE.MONTH
what: 'short'
More examples

Parse month in number format:
Input string: 2003-10-11
!PARSE.MONTH 'number'
Parse month in full format:
Input string: 2003-OCTOBER-11
!PARSE.MONTH
what: 'full'

!PARSE.FRAC: Parse a fraction¤

Type: Parser.

Synopsis:

!PARSE.FRAC
base: <...>
max: <...>
  • base - indicates a base of the fraction. Possible values: milli, micro, nano.
  • max - indicates a maximum number of digits depending on the base value. Possible values: 3, 6, 9 respectively.

Tip

Use !PARSE.FRAC to parse microseconds or nanoseconds as part of !PARSE.DATETIME.

Example

Input string: Aug 22 05:40:14.264

!PARSE.FRAC
base: "micro"
max: 6

!PARSE.DATETIME: Parse datetime in a given format¤

Type: Parser.

Synopsis:

!PARSE.DATETIME
- year: <...>
- month: <...>
- day: <...>
- hour: <...>
- minute: <...>
- second: <...>
- microsecond: <...>
- nanosecond: <...>
- timezone: <...>
  • Fields month and day are required.
  • Field year is optional. If not specified, the smart year function will be used.
  • Fields hour, minute, second, microsecond, nanosecond are optional. If not specified, the default value 0 will be used.
  • Writing the microsecond field as microsecond? makes it optional: microseconds are parsed only if they are present in the input string.
  • Field timezone is optional. If not specified, the default value UTC will be used. It can be specified in two different formats.
    1. Z, +08:00 - parsed from the input string.
    2. Europe/Prague - specified as a constant value.

Shortcuts¤

Shortcut forms are available (in both lowercase and uppercase variants):

!PARSE.DATETIME RFC3339
!PARSE.DATETIME iso8601

Example

Input string: 2022-10-13T12:34:56.987654

!PARSE.DATETIME
- year: !PARSE.DIGITS
- '-'
- month: !PARSE.MONTH 'number'
- '-'
- day: !PARSE.DIGITS
- 'T'
- hour: !PARSE.DIGITS
- ':'
- minute: !PARSE.DIGITS
- ':'
- second: !PARSE.DIGITS
- microsecond: !PARSE.FRAC
                base: "micro"
                max: 6
- timezone: "Europe/Prague"
More examples

Parse datetime without year, with short month form and optional microseconds:
Input string: Aug 17 06:57:05.189
!PARSE.DATETIME
- month: !PARSE.MONTH 'short' # Month
- !PARSE.SPACE
- day: !PARSE.DIGITS # Day
- !PARSE.SPACE
- hour: !PARSE.DIGITS # Hours
- !PARSE.EXACTLY { what: ':' }
- minute: !PARSE.DIGITS # Minutes
- !PARSE.EXACTLY { what: ':' }
- second: !PARSE.DIGITS # Seconds
- microsecond?: !PARSE.FRAC # Microseconds
                base: "micro"
                max: 6
Parse datetime with timezone:
Input string: 2021-06-29T16:51:43+08:00
!PARSE.DATETIME
- year: !PARSE.DIGITS
- '-'
- month: !PARSE.MONTH 'number'
- '-'
- day: !PARSE.DIGITS
- 'T'
- hour: !PARSE.DIGITS
- ':'
- minute: !PARSE.DIGITS
- ':'
- second: !PARSE.DIGITS
- timezone: !PARSE.CHARS
Parse datetime using shortcut:
Input string: 2021-06-29T16:51:43Z
!PARSE.DATETIME RFC3339
Parse datetime using shortcut:
Input string: 20201211T111721Z
!PARSE.DATETIME iso8601
Parse datetime with nanoseconds:
Input string: 2023-03-23T07:00:00.734323900
!PARSE.DATETIME
- year: !PARSE.DIGITS
- !PARSE.EXACTLY { what: '-' }
- month: !PARSE.DIGITS
- !PARSE.EXACTLY { what: '-' }
- day: !PARSE.DIGITS
- !PARSE.EXACTLY { what: 'T' }
- hour: !PARSE.DIGITS
- !PARSE.EXACTLY { what: ':' }
- minute: !PARSE.DIGITS
- !PARSE.EXACTLY { what: ':' }
- second: !PARSE.DIGITS
- nanosecond: !PARSE.FRAC
                base: "nano"
                max: 9

!PARSE.REPEAT: Parse a repeated pattern¤

Type: Combinator.

Synopsis:

!PARSE.REPEAT
what: <...>
min: <...>
max: <...>
exactly: <...>

Fields min, max and exactly are optional. If none of them is specified, what will be repeated as many times as possible.

Example¤

Input string: abc_abc

!PARSE.REPEAT
what: !PARSE.ONEOF "abc"
exactly: 3

Output: ['a', 'b', 'c']

More examples

Parse the what pattern as many times as possible:
!PARSE.REPEAT
what: !PARSE.EXACTLY 'hello'
Parse what pattern at least 2 times, but not more than 4:
!PARSE.REPEAT
what: !PARSE.EXACTLY 'hello'
min: 2
max: 4
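
The repetition logic can be modelled in Python as a combinator that wraps another parser. The sketch is illustrative only: `one_of` and `repeat` are hypothetical names, parsers are assumed to return `(value, new_pos)` or `None`, and the sketch assumes that zero repetitions are acceptable when no bound is given.

```python
def one_of(chars):
    """Parser factory: accept a single character from `chars`."""
    def run(text, pos):
        if pos < len(text) and text[pos] in chars:
            return text[pos], pos + 1
        return None
    return run


def repeat(what, min_count=0, max_count=None, exactly=None):
    """Combinator: apply `what` repeatedly and collect the values in a list."""
    if exactly is not None:
        min_count = max_count = exactly

    def run(text, pos):
        values = []
        while max_count is None or len(values) < max_count:
            result = what(text, pos)
            if result is None:
                break
            value, new_pos = result
            if new_pos == pos:  # guard against a parser that consumes nothing
                break
            values.append(value)
            pos = new_pos
        if len(values) < min_count:
            return None
        return values, pos
    return run


print(repeat(one_of("abc"), exactly=3)("abc_abc", 0))    # (['a', 'b', 'c'], 3)
print(repeat(one_of("abc"), min_count=4)("abc_abc", 0))  # None
```

The first call mirrors the example above: the repetition stops after three matches, and the `_` character is left unparsed.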

!PARSE.SEPARATED: Parse a sequence with a separator¤

Type: Combinator.

Synopsis:

!PARSE.SEPARATED
what: <...>
sep: <...>
min: <...>
max: <...>
end: <...>

Fields max and end are optional.

  • end - indicates if trailing separator is required. By default, it is optional.

Example¤

Input string: 0->1->2->3

Note: trailing separator is optional, so input string 0->1->2->3-> is also valid.

!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: "->"}
min: 3

Output: [0, 1, 2, 3]

More examples

Parse what values separated by sep, with the count in the [min;max] interval; the trailing separator is required:
Input string: 11,22,33,44,55,66,
!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ","}
end: True
min: 3
max: 7
Parse what values separated by sep, with the count in the [min;max] interval; no trailing separator is present:
Input string: 0..1..2..3
!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ".."}
end: False
min: 3
max: 5
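
The three modes of the trailing separator (optional, required, not consumed) can be compared in a Python sketch. It is specialised to digit values and a literal separator to stay short; the function name, the parameter names and the handling of end set to False (the trailing separator is left unconsumed) are assumptions, not documented SP-Lang behaviour.

```python
import re

DIGITS = re.compile(r"[0-9]+")


def parse_separated_digits(text, sep, min_count=0, max_count=None, end=None):
    """Return (numbers, new_pos) or None. end=None means the trailing sep is optional."""
    pos = 0
    values = []
    after_sep = False  # True when the last thing consumed was a separator
    sep_start = 0
    while max_count is None or len(values) < max_count:
        match = DIGITS.match(text, pos)
        if match is None:
            break
        values.append(int(match.group()))
        pos = match.end()
        after_sep = False
        if not text.startswith(sep, pos):
            break
        sep_start = pos
        pos += len(sep)
        after_sep = True
    if len(values) < min_count:
        return None
    if end is True and not after_sep:
        return None  # trailing separator required but missing
    if end is False and after_sep:
        pos = sep_start  # leave the trailing separator unconsumed
    return values, pos


print(parse_separated_digits("0->1->2->3", "->", min_count=3))    # ([0, 1, 2, 3], 10)
print(parse_separated_digits("0->1->2->3->", "->", min_count=3))  # ([0, 1, 2, 3], 12)
print(parse_separated_digits("11,22", ",", end=True))             # None
```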

!PARSE.TRIE: Parse using starting prefix¤

Type: Combinator.

The !PARSE.TRIE expression chooses one of the specified prefixes and parses the rest of the input string using the corresponding parser.

Synopsis:

!PARSE.TRIE
- <prefix1>: <...>
- <prefix2>: <...>
...

Tip

Use !PARSE.TRIE to parse log messages that come in several variants.

Example¤

Input string: Received disconnect from 10.17.248.1 port 60290:11: disconnected by user

!PARSE.TRIE
- 'Received disconnect from ': !PARSE.KVLIST
    - CLIENT_IP: !PARSE.UNTIL ' '
    - 'port '
    - CLIENT_PORT: !PARSE.DIGITS
    - ':'
    - !PARSE.CHARS
- 'Disconnected from user ': !PARSE.KVLIST
    - USERNAME: !PARSE.UNTIL ' '
    - CLIENT_IP: !PARSE.UNTIL ' '
    - 'port '
    - CLIENT_PORT: !PARSE.DIGITS
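
The prefix dispatch in this example can be approximated in Python. The sketch uses a linear scan over the prefixes instead of a real trie, and regular expressions instead of nested parsers; all function names are hypothetical, and the rule that the longest matching prefix wins is an assumption.

```python
import re


def received_disconnect(rest):
    """Branch for 'Received disconnect from ': client IP, then port."""
    match = re.match(r"(\S+) port (\d+):", rest)
    if match is None:
        return None
    return [("CLIENT_IP", match.group(1)), ("CLIENT_PORT", int(match.group(2)))]


def disconnected_from_user(rest):
    """Branch for 'Disconnected from user ': username, client IP, then port."""
    match = re.match(r"(\S+) (\S+) port (\d+)", rest)
    if match is None:
        return None
    return [("USERNAME", match.group(1)), ("CLIENT_IP", match.group(2)),
            ("CLIENT_PORT", int(match.group(3)))]


BRANCHES = {
    "Received disconnect from ": received_disconnect,
    "Disconnected from user ": disconnected_from_user,
}


def parse_by_prefix(text, branches=BRANCHES):
    # Try longer prefixes first so the most specific branch wins (assumption).
    for prefix in sorted(branches, key=len, reverse=True):
        if text.startswith(prefix):
            return branches[prefix](text[len(prefix):])
    return None  # no prefix matched


line = "Received disconnect from 10.17.248.1 port 60290:11: disconnected by user"
print(parse_by_prefix(line))  # [('CLIENT_IP', '10.17.248.1'), ('CLIENT_PORT', 60290)]
```

Each branch only sees the part of the message after its prefix, which keeps the per-variant rules short.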

!PARSE.OPTIONAL: Parse optional pattern¤

Type: Combinator.

The !PARSE.OPTIONAL expression tries to parse the input string using the specified parser. If the parser fails, the position rolls back to where it was before the attempt.

Synopsis:

!PARSE.OPTIONAL
what: <...>
or shorter version:
!PARSE.OPTIONAL <...>

Example¤

Input strings:

  • mymachine myproc[10]: DHCPACK to
  • mymachine myproc[10]DHCPACK to
!PARSE.KVLIST
- HOSTNAME: !PARSE.UNTIL {what: ' '} # mymachine
- TAG: !PARSE.UNTIL {what: '['} # myproc
- PID: !PARSE.DIGITS  # 10
- !PARSE.EXACTLY {what: ']'}
- !PARSE.OPTIONAL ':'
- !PARSE.OPTIONAL
    what: !PARSE.SPACE
- NAME: !PARSE.UNTIL {what: ' '}
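
The rollback behaviour can be sketched in Python. The helpers `exactly`, `optional` and `skip_separator` are hypothetical, and parsers are assumed to return `(value, new_pos)` or `None` on failure; the sketch only models the optional colon and space from the example above.

```python
def exactly(literal):
    """Parser factory: accept the given literal text."""
    def run(text, pos):
        if text.startswith(literal, pos):
            return literal, pos + len(literal)
        return None
    return run


def optional(parser):
    """Combinator: never fails; on a failed attempt the position is unchanged."""
    def run(text, pos):
        result = parser(text, pos)
        if result is None:
            return None, pos  # success with no value, position rolled back
        return result
    return run


def skip_separator(text, pos):
    """Consume an optional ':' followed by an optional space."""
    _, pos = optional(exactly(":"))(text, pos)
    _, pos = optional(exactly(" "))(text, pos)
    return pos


print(skip_separator("myproc[10]: DHCPACK", 10))  # 12
print(skip_separator("myproc[10]DHCPACK", 10))    # 10
```

In both cases the next parser starts at the `D` of `DHCPACK`, which is why one rule can handle both input strings.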

!PARSE.KV: Parse key-value pair¤

Type: Combinator.

Synopsis:

!PARSE.KV
- key: <...>
  prefix: <...>
- value: <...>
- <...> # optional elements

Tip

Use a combination of !PARSE.REPEAT and !PARSE.KV to parse repeated key-value pairs (see the examples).

Example¤

Input string: eventID= "1011"

!PARSE.KV
- key: !PARSE.UNTIL {what: '='}
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}

Output: (eventID, 1011)

More examples

Input string: eventID= "1011"
!PARSE.KV
- key: !PARSE.UNTIL {what: '='}
  prefix: SD.PARAM.
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}
Output: (SD.PARAM.eventID, 1011)

Input string: devid="FEVM020000191439" vd="root" itime=1665629867
!PARSE.REPEAT
what: !PARSE.KV
    - !PARSE.OPTIONAL
      what: !PARSE.SPACE
    - key: !PARSE.UNTIL '='
    - value: !TRY
            - !PARSE.BETWEEN '"'
            - !PARSE.UNTIL { what: ' ', eof: true}
Output: [(devid, FEVM020000191439), (vd, root), (itime, 1665629867)]
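
The last example combines repetition, an optional leading space and two alternative value forms. The same steps can be written out by hand in Python to show what happens at each position. This is only a sketch with a hypothetical function name, and it returns every value as a string.

```python
def parse_kv_pairs(text):
    """Collect (key, value) tuples from input like: a="x" b=1"""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos] == " ":
            pos += 1  # optional leading space
        eq = text.find("=", pos)
        if eq == -1:
            break  # no further key: stop repeating
        key = text[pos:eq]
        pos = eq + 1
        if pos < len(text) and text[pos] == '"':
            # First alternative: a value between double-quotes.
            close = text.find('"', pos + 1)
            if close == -1:
                break
            value = text[pos + 1:close]
            pos = close + 1
        else:
            # Second alternative: read until a space, or to the end of the string.
            space = text.find(" ", pos)
            if space == -1:
                value, pos = text[pos:], len(text)
            else:
                value, pos = text[pos:space], space + 1
        pairs.append((key, value))
    return pairs


line = 'devid="FEVM020000191439" vd="root" itime=1665629867'
print(parse_kv_pairs(line))
# [('devid', 'FEVM020000191439'), ('vd', 'root'), ('itime', '1665629867')]
```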

!PARSE.KVLIST: Parse list of key-value pairs¤

Iterating through a list of elements, the !PARSE.KVLIST expression collects key-value pairs into a list of tuples. Non-key elements are parsed but not collected. Nested !PARSE.KVLIST expressions are merged into the parent one.

Type: Combinator.

Synopsis:

!PARSE.KVLIST
- <...>
- key1: <...>
- key2: <...>
- <...> 
- !PARSE.KVLIST
  - key3: <...>
  - <...>
- key4: <...>

Example¤

Input string: <141>May 9 10:00:00 VUW-DC-F5-P2R1.source-net.com notice tmm1[22731]: 01490500:5: /Common/Citrix_Receiver..

  !PARSE.KVLIST
  # parse Syslog_RFC5424
  - '<'
  - log.syslog.priority: !PARSE.DIGITS
  - '>'
  - '@timestamp': !PARSE.DATETIME
                - month: !PARSE.MONTH 'short'
                - !PARSE.SPACES
                - day: !PARSE.DIGITS # Day
                - !PARSE.SPACES
                - hour: !PARSE.DIGITS # Hours
                - ':'
                - minute: !PARSE.DIGITS # Minutes
                - ':'
                - second: !PARSE.DIGITS # Seconds
                - timezone: "Europe/Prague"
  - !PARSE.SPACES
  - host.hostname: !PARSE.UNTIL ' '
  - log.level: !PARSE.UNTIL ' '
  - log.syslog.appname: !PARSE.UNTIL '['
  - process.pid: !PARSE.DIGITS
  - ']: '
  - message: !PARSE.CHARS

Output: [(log.syslog.priority, 141), (@timestamp, 140994182325993472), (host.hostname, VUW-DC-F5-P2R1.source-net.com), (log.level, notice), (log.syslog.appname, tmm1), (process.pid, 22731), (message, 01490500:5: /Common/Citrix_Receiver..)]
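
The collection rule (keep key-value pairs, drop non-key elements, merge nested lists into the parent) can be sketched in Python. The representation is an assumption made for the illustration: a tuple stands for a parsed key-value pair, a list for the result of a nested !PARSE.KVLIST, and anything else for a non-key element.

```python
def collect_kvlist(elements):
    """Flatten parsed elements into one list of (key, value) tuples."""
    pairs = []
    for element in elements:
        if isinstance(element, list):
            # Result of a nested key-value list: merge it into the parent.
            pairs.extend(collect_kvlist(element))
        elif isinstance(element, tuple):
            pairs.append(element)
        # Anything else (a matched literal, an unnamed parser result) is dropped.
    return pairs


parsed = ["<", ("key1", 141), ">", [("key3", "a"), " ", ("key4", "b")], ("key5", 1)]
print(collect_kvlist(parsed))
# [('key1', 141), ('key3', 'a'), ('key4', 'b'), ('key5', 1)]
```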


!PARSE.TUPLE: Parse list of values to tuple¤

Iterating through a list of elements, the !PARSE.TUPLE expression collects the values into a tuple.

Type: Combinator.

Synopsis:

!PARSE.TUPLE
- <...>
- <...>
- <...>

Example¤

Input string: Hello world!

!PARSE.TUPLE
- 'Hello'
- !PARSE.SPACE
- 'world'
- '!'

Output: ('Hello', ' ', 'world', '!')


!PARSE.RECORD: Parse list of values to record structure¤

Iterating through a list of elements, the !PARSE.RECORD expression collects the values into a record structure.

Type: Combinator.

Synopsis:

!PARSE.RECORD
- <...>
- element1: <...>
- element2: <...>
- <...>

Example¤

Input string: <165>1

!PARSE.RECORD
- !PARSE.EXACTLY {what: '<'}
- severity: !PARSE.DIGITS
- !PARSE.EXACTLY {what: '>'}
- version: !PARSE.DIGITS
- !PARSE.EXACTLY {what: ' '}

Output: {'output.severity': 165, 'output.version': 1}