Combinator expressions¤

Overview¤

Combinators are functions that compose parsec expressions (parsers or other combinators). They specify how parsing is applied and what the output type is. They can be used to control the flow of parsing (applying expressions conditionally or repeatedly) and to look ahead in the input string.

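Output selectors determine the type of output:

  • !PARSE.KVLIST: Parses a list of key-value pairs.
  • !PARSE.KV: Parses a key-value pair.
  • !PARSE.TUPLE: Parses a list of values to a tuple.
  • !PARSE.RECORD: Parses a list of values to a record structure.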

Flow control expressions apply a sequence of parser expressions based on certain conditions:

  • !PARSE.REPEAT: Performs the same sequence of expressions multiple times, similar to a "for" statement in other languages.
  • !PARSE.SEPARATED: Parses a sequence of values divided by a separator.
  • !PARSE.OPTIONAL: Adds an optional parser function, similar to an "if/else" statement in other languages.
  • !PARSE.TRIE: Performs a sequence of expressions based on the prefix of the input string.

Lookahead expressions:

  • !PARSE.CHARS.LOOKAHEAD: Parses characters until a specified lookahead group is found.


!PARSE.KVLIST: Parse list of key-value pairs¤

Type: Combinator

The !PARSE.KVLIST expression iterates through a list of elements and collects key-value pairs into a bag.

Synopsis:

!PARSE.KVLIST
- <...>
- key: <...>

Non-key elements are parsed, but not collected:

!PARSE.KVLIST
- <...>  # parsed, but not collected
- key1: <...>  # parsed and collected
- key2: <...>  # parsed and collected

Nested !PARSE.KVLIST expressions are joined to the parent one:

!PARSE.KVLIST
- <...>
- !PARSE.KVLIST  # expression is joined to the parent one
  - key3: <...>
  - <...>
- key4: <...>
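
The collection rules above (keyed elements are collected, unkeyed elements are only parsed, nested lists are joined to the parent) can be illustrated with a small Python sketch. This is not how the library is implemented. The parser convention, the helper names `literal`, `digits` and `kvlist`, and the choice to fail the whole list when any element fails are assumptions made for illustration.

```python
from typing import Any, Callable, List, Optional, Tuple

# Assumed convention for this sketch (not the library's API): a parser takes
# (text, pos) and returns (value, new_pos) on success or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def literal(s: str) -> Parser:
    """Match the exact string s."""
    def parse(text: str, pos: int):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse


def digits(text: str, pos: int):
    """Match one or more decimal digits and return them as an int."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None


def kvlist(*elements) -> Parser:
    """Elements are bare parsers (unkeyed), (key, parser) tuples, or nested kvlists."""
    def parse(text: str, pos: int):
        bag: List[Tuple[str, Any]] = []
        for element in elements:
            key, parser = element if isinstance(element, tuple) else (None, element)
            result = parser(text, pos)
            if result is None:
                return None                 # assumption: one failure fails the list
            value, pos = result
            if key is not None:
                bag.append((key, value))    # keyed: parsed and collected
            elif getattr(parser, "is_kvlist", False):
                bag.extend(value)           # nested list: joined to the parent
            # any other unkeyed element is parsed, but not collected
        return bag, pos
    parse.is_kvlist = True                  # marker used to detect nested lists
    return parse


header = kvlist(
    literal("<"),
    ("PRI", digits),
    literal(">"),
    kvlist(("VERSION", digits), literal(" ")),
)
print(header("<165>1 ", 0))  # ([('PRI', 165), ('VERSION', 1)], 7)
```

The nested list contributes `VERSION` directly to the parent bag instead of producing a list inside a list.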

Example

Input string:

<141>May  9 10:00:00 myhost.com notice tmm1[22731]: User 'user' was logged in.

!PARSE.KVLIST
- '<'
- PRI: !PARSE.DIGITS
- '>'
- TIMESTAMP: !PARSE.DATETIME
                - month: !PARSE.MONTH 'short'
                - !PARSE.SPACES
                - day: !PARSE.DIGITS # Day
                - !PARSE.SPACES
                - hour: !PARSE.DIGITS # Hours
                - ':'
                - minute: !PARSE.DIGITS # Minutes
                - ':'
                - second: !PARSE.DIGITS # Seconds

- !PARSE.SPACES
- HOSTNAME: !PARSE.UNTIL ' '
- LEVEL: !PARSE.UNTIL ' '
- PROCESS.NAME: !PARSE.UNTIL '['
- PROCESS.PID: !PARSE.DIGITS
- ']:'
- !PARSE.SPACES
- MESSAGE: !PARSE.CHARS

Output:

[
    (PRI, 141),
    (TIMESTAMP, 140994182325993472),
    (HOSTNAME, myhost.com),
    (LEVEL, notice),
    (PROCESS.NAME, tmm1),
    (PROCESS.PID, 22731),
    (MESSAGE, User 'user' was logged in.)
]

!PARSE.KV: Parse key-value pair¤

Type: Combinator

Parses a key and a value from a string into a key-value pair, optionally adding a prefix to the key.

Synopsis:

!PARSE.KV
- prefix: <...>
- key: <...>
- value: <...>
- <...> # optional elements

  • prefix is optional. If specified, the prefix will be added to the key.
  • key and value are required.
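
As a rough illustration of these rules, the following Python sketch builds a key-value pair from keyed `key` and `value` elements and prepends an optional prefix. It is a sketch under stated assumptions, not the library's implementation: the parser convention and the helpers `literal`, `until`, `between` and `kv` are hypothetical, and `until` is assumed to consume the stop character.

```python
from typing import Any, Callable, Optional, Tuple

# Assumed convention: parser(text, pos) -> (value, new_pos) or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse


def until(stop: str) -> Parser:
    """Read up to `stop` and consume it as well (assumption of this sketch)."""
    def parse(text: str, pos: int):
        end = text.find(stop, pos)
        return (text[pos:end], end + len(stop)) if end != -1 else None
    return parse


def between(quote: str) -> Parser:
    """Read the text enclosed in a pair of `quote` strings."""
    def parse(text: str, pos: int):
        if not text.startswith(quote, pos):
            return None
        end = text.find(quote, pos + len(quote))
        if end == -1:
            return None
        return text[pos + len(quote):end], end + len(quote)
    return parse


def kv(*elements, prefix: str = "") -> Parser:
    """Elements are ('key', parser), ('value', parser) or bare parsers."""
    roles = [e[0] for e in elements if isinstance(e, tuple)]
    if "key" not in roles or "value" not in roles:
        raise ValueError("key and value are required")

    def parse(text: str, pos: int):
        found = {}
        for element in elements:
            role, parser = element if isinstance(element, tuple) else (None, element)
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            if role is not None:
                found[role] = value         # bare elements are parsed and dropped
        return (prefix + found["key"], found["value"]), pos
    return parse


pair = kv(("key", until("=")), literal(" "), ("value", between('"')), prefix="SD.PARAM.")
print(pair('eventID= "1011"', 0))  # (('SD.PARAM.eventID', '1011'), 15)
```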

Tip

Use a combination of !PARSE.REPEAT and !PARSE.KV to parse repeated key-value pairs (see the examples below).

Example

Input string: eventID= "1011"

!PARSE.KV
- key: !PARSE.UNTIL '='
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}

Output: (eventID, 1011)

Parse key and value with a specified prefix

Input string: eventID= "1011"

!PARSE.KV
- key: !PARSE.UNTIL {what: '='}
- prefix: SD.PARAM.
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}

Output: (SD.PARAM.eventID, 1011)

Usage together with !PARSE.REPEAT

Input string: devid="FEVM020000191439" vd="root" itime=1665629867

!PARSE.REPEAT
what: !PARSE.KV
    - !PARSE.OPTIONAL
        what: !PARSE.SPACE
    - key: !PARSE.UNTIL '='
    - value: !TRY
            - !PARSE.BETWEEN '"'
            - !PARSE.UNTIL {what: ' ', eof: true}

Output:

[
    (devid, FEVM020000191439),
    (vd, root),
    (itime, 1665629867)
]


!PARSE.TUPLE: Parse list of values to tuple¤

Type: Combinator

The !PARSE.TUPLE expression iterates through a list of elements and collects the values into a tuple.

Synopsis:

!PARSE.TUPLE
- <...>
- <...>
- <...>

Example

Input string: Hello world!

!PARSE.TUPLE
- 'Hello'
- !PARSE.SPACE
- 'world'
- '!'

Output: ('Hello', ' ', 'world', '!')


!PARSE.RECORD: Parse list of values to record structure¤

Type: Combinator

The !PARSE.RECORD expression iterates through a list of elements and collects the values into a record structure.

Synopsis:

!PARSE.RECORD
- <...>
- element1: <...>
- element2: <...>
- <...>

Example

Input string: <165>1

!PARSE.RECORD
- '<'
- severity: !PARSE.DIGITS
- '>'
- version: !PARSE.DIGITS
- ' '

Output: {'output.severity': 165, 'output.version': 1}

!PARSE.REPEAT: Parse a repeated pattern¤

Type: Combinator.

Synopsis:

!PARSE.REPEAT
what: <expression>
min: <...>
max: <...>
exactly: <...>

  • If none of min, max, or exactly is specified, what will be repeated as many times as possible.
  • exactly determines the exact number of repetitions.
  • min and max set the minimum and maximum number of repetitions.
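
The interaction of min, max and exactly can be sketched in Python as follows. This is an illustrative model rather than the library's code: the parser convention and the names `literal` and `repeat` are assumptions, and the keyword arguments are renamed `min_count` and `max_count` to avoid shadowing Python built-ins.

```python
from typing import Any, Callable, List, Optional, Tuple

# Assumed convention: parser(text, pos) -> (value, new_pos) or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse


def repeat(what: Parser, min_count: int = 0, max_count: Optional[int] = None,
           exactly: Optional[int] = None) -> Parser:
    if exactly is not None:
        min_count = max_count = exactly     # exactly fixes both bounds

    def parse(text: str, pos: int):
        values: List[Any] = []
        # With no upper bound, repeat as many times as possible.
        while max_count is None or len(values) < max_count:
            result = what(text, pos)
            if result is None:
                break
            value, new_pos = result
            if new_pos == pos:              # guard against endless zero-width matches
                break
            values.append(value)
            pos = new_pos
        return (values, pos) if len(values) >= min_count else None
    return parse


three = repeat(literal("hello "), exactly=3)
print(three("hello hello hello Anna!", 0))  # (['hello ', 'hello ', 'hello '], 18)
print(three("hello hello Anna!", 0))        # None

some = repeat(literal("hello "), min_count=2, max_count=4)
print(some("hello hello Anna!", 0))         # (['hello ', 'hello '], 12)
```

In this sketch the repetition stops at the upper bound without failing, so any further matches are left for the next expression.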

Example

Input string: host:myhost;ip:192.0.0.1;user:root;

!PARSE.KVLIST
- !PARSE.REPEAT
    what: !PARSE.KV
        - key: !PARSE.UNTIL ':'
        - value: !PARSE.UNTIL ';'

This will repeat the !PARSE.KV expression as many times as possible.

Output:

[
    (host, myhost),
    (ip, 192.0.0.1),
    (user, root)
]

Parse an exact number of repetitions

Input string: hello hello hello Anna!

!PARSE.KVLIST
- !PARSE.REPEAT
    what: !PARSE.EXACTLY 'hello '
    exactly: 3
- NAME: !PARSE.UNTIL '!'

Output: [(NAME, Anna)]

Parse a number of repetitions between min and max

Input strings:

hello hello Anna!
hello hello hello Anna!
hello hello hello hello Anna!

!PARSE.KVLIST
- !PARSE.REPEAT
    what: !PARSE.EXACTLY 'hello '
    min: 2
    max: 4
- NAME: !PARSE.UNTIL '!'

Output: [(NAME, Anna)]


!PARSE.SEPARATED: Parse a sequence with a separator¤

Type: Combinator.

Synopsis:

!PARSE.SEPARATED
what: <...>
sep: <...>
min: <...>
max: <...>
end: <...>

  • min and max are optional.
  • end indicates whether a trailing separator is required. By default, it is optional.
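
One way to picture the trailing-separator rule is the Python sketch below. It is a model under assumptions, not the library's implementation: the parser convention and the names `literal`, `digits` and `separated` are hypothetical, `end=None` stands for the default (optional) behaviour, and `end=False` is assumed to leave a trailing separator unconsumed.

```python
from typing import Any, Callable, List, Optional, Tuple

# Assumed convention: parser(text, pos) -> (value, new_pos) or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse


def digits(text: str, pos: int):
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None


def separated(what: Parser, sep: Parser, min_count: int = 0,
              max_count: Optional[int] = None, end: Optional[bool] = None) -> Parser:
    def parse(text: str, pos: int):
        values: List[Any] = []
        has_trailing = False
        first = what(text, pos)
        if first is not None:
            values.append(first[0])
            pos = first[1]
            while True:
                s = sep(text, pos)
                if s is None:
                    break                       # no separator: the sequence ends
                nxt = None
                if max_count is None or len(values) < max_count:
                    nxt = what(text, s[1])
                if nxt is None:
                    has_trailing = True         # separator with no item after it
                    if end is not False:
                        pos = s[1]              # consume the trailing separator
                    break
                values.append(nxt[0])
                pos = nxt[1]
        if len(values) < min_count:
            return None
        if end is True and not has_trailing:
            return None                         # trailing separator was required
        return values, pos
    return parse


arrows = separated(digits, literal("->"), min_count=3)
print(arrows("0->1->2->3", 0))    # ([0, 1, 2, 3], 10)
print(arrows("0->1->2->3->", 0))  # ([0, 1, 2, 3], 12)
```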

Example

Input string: 0->1->2->3

!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: "->"}
min: 3

Output: [0, 1, 2, 3]

Note: the trailing separator is optional by default, so the input string 0->1->2->3-> is also valid.

More examples

Parse what values separated by sep within the [min;max] interval, where the trailing separator is required:

Input string: 11,22,33,44,55,66,

!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ","}
end: True
min: 3
max: 7

Parse what values separated by sep within the [min;max] interval, where the trailing separator is not present:

Input string: 0..1..2..3

!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ".."}
end: False
min: 3
max: 5

!PARSE.OPTIONAL: Parse optional pattern¤

Type: Combinator

The !PARSE.OPTIONAL expression tries to parse the input string using the specified parser. If the parser fails, the position in the input string rolls back to where the attempt started and parsing continues from there.
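
The rollback behaviour can be modelled in a few lines of Python. This is only a sketch: the parser convention and the names `literal`, `optional` and `sequence` are assumptions, not the library's API. Because a failed parser returns nothing in this convention, "rolling back" simply means continuing from the position that was passed in.

```python
from typing import Any, Callable, List, Optional, Tuple

# Assumed convention: parser(text, pos) -> (value, new_pos) or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse


def optional(what: Parser) -> Parser:
    def parse(text: str, pos: int):
        result = what(text, pos)
        # On failure, succeed with no value and keep the original position.
        return result if result is not None else (None, pos)
    return parse


def sequence(*parsers: Parser) -> Parser:
    """Apply parsers one after another; fail if any of them fails."""
    def parse(text: str, pos: int):
        values: List[Any] = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


tail = sequence(literal("]"), optional(literal(":")), optional(literal(" ")))
print(tail("]: DHCPACK to", 0))  # ([']', ':', ' '], 3)
print(tail("]DHCPACK to", 0))    # ([']', None, None], 1)
```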

Synopsis:

!PARSE.OPTIONAL
what: <...>

or shorter version:

!PARSE.OPTIONAL <...>

Example

Input strings:

mymachine myproc[10]: DHCPACK to
mymachine myproc[10]DHCPACK to

!PARSE.KVLIST
- HOSTNAME: !PARSE.UNTIL ' ' # mymachine
- TAG: !PARSE.UNTIL '[' # myproc
- PID: !PARSE.DIGITS  # 10
- !PARSE.EXACTLY ']'

# Parsing of optional characters
- !PARSE.OPTIONAL ':'
- !PARSE.OPTIONAL
    what: !PARSE.SPACE

- NAME: !PARSE.UNTIL ' '

!PARSE.TRIE: Parse using starting prefix¤

Type: Combinator.

The !PARSE.TRIE expression chooses one of the specified prefixes and parses the rest of the input string using the corresponding parser. If an empty prefix is specified, its parser is used when none of the other prefixes match.
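
Prefix dispatch with an empty-prefix fallback can be sketched in Python as shown here. The sketch scans a list of prefixes sorted by length instead of building a real trie, and it is not the library's implementation. The parser convention, the helper names `rest`, `constant` and `trie`, and the choice to try longer prefixes first are assumptions.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Assumed convention: parser(text, pos) -> (value, new_pos) or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def rest(text: str, pos: int):
    """Return everything up to the end of the input."""
    return text[pos:], len(text)


def constant(value: Any) -> Parser:
    """Succeed with a fixed value without consuming any input."""
    def parse(text: str, pos: int):
        return value, pos
    return parse


def trie(branches: Dict[str, Parser]) -> Parser:
    # Longest prefixes first, so the empty prefix is always tried last.
    ordered = sorted(branches, key=len, reverse=True)

    def parse(text: str, pos: int):
        for prefix in ordered:
            if text.startswith(prefix, pos):
                # Skip the prefix and hand the remainder to its parser.
                return branches[prefix](text, pos + len(prefix))
        return None                             # no prefix and no fallback matched
    return parse


ssh = trie({
    "Received disconnect from ": rest,
    "Disconnected from user ": rest,
    "": constant("trie-match-fail"),
})
print(ssh("Disconnected from user root", 0)[0])                 # root
print(ssh("Failed password for root from 218.92.0.190", 0)[0])  # trie-match-fail
```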

Synopsis:

!PARSE.TRIE
- <prefix1>: <...>
- <prefix2>: <...>
...

Tip

Use !PARSE.TRIE to parse log messages that come in several variants.

Example

Input strings:

Received disconnect from 10.17.248.1 port 60290:11: disconnected by user
Disconnected from user root 10.17.248.1 port 60290

!PARSE.TRIE
- 'Received disconnect from ': !PARSE.KVLIST
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS
                            - ':'
                            - !PARSE.CHARS
- 'Disconnected from user ': !PARSE.KVLIST
                            - USERNAME: !PARSE.UNTIL ' '
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS

Specify an empty prefix as a fallback

Input string: Failed password for root from 218.92.0.190

!PARSE.TRIE
- 'Received disconnect from ': !PARSE.KVLIST
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS
                            - ':'
                            - !PARSE.CHARS
- 'Disconnected from user ': !PARSE.KVLIST
                            - USERNAME: !PARSE.UNTIL ' '
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS
- '': !PARSE.KVLIST
    - tags: ["trie-match-fail"]

Output: [(tags, ["trie-match-fail"])]


!PARSE.CHARS.LOOKAHEAD: Parse chars applying lookahead group¤

Type: Combinator

Parses characters until the specified lookahead group is found and stops before it.

Synopsis:

!PARSE.CHARS.LOOKAHEAD
what:
- <...>
- <...>
- <...>
...
eof: <true/false>

  • eof indicates whether to parse until the end of the string if the what lookahead group is not found. Possible values: true (default) or false.
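
A lookahead scan of this kind can be sketched in Python as follows: at each position the whole group is tested without consuming input, and the characters before the first match are returned. This is an illustration under assumptions, not the library's code. The parser convention and the names `literal`, `letters` and `chars_lookahead` are hypothetical.

```python
from typing import Any, Callable, List, Optional, Tuple

# Assumed convention: parser(text, pos) -> (value, new_pos) or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse


def letters(text: str, pos: int):
    """Match one or more alphabetic characters."""
    end = pos
    while end < len(text) and text[end].isalpha():
        end += 1
    return (text[pos:end], end) if end > pos else None


def chars_lookahead(what: List[Parser], eof: bool = True) -> Parser:
    def group_matches(text: str, pos: int) -> bool:
        # Every parser of the group must match in order, starting at pos.
        for parser in what:
            result = parser(text, pos)
            if result is None:
                return False
            pos = result[1]
        return True

    def parse(text: str, pos: int):
        for stop in range(pos, len(text)):
            if group_matches(text, stop):
                return text[pos:stop], stop     # stop before the lookahead group
        # Group never found: take the rest of the string only if eof is true.
        return (text[pos:], len(text)) if eof else None
    return parse


rule = chars_lookahead([literal(" "), letters, literal("=")])
print(rule("Rule Name cs=Proxy", 0))  # ('Rule Name', 9)
```

The first space is skipped because "Name" is not followed by "=", so the scan continues to the space before "cs=".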

Example

Input string: Rule Name cs=Proxy

!PARSE.CHARS.LOOKAHEAD
what:
- " "
- !PARSE.LETTERS
- '='

Output: Rule Name