SP-Lang data types¤
In the SP-Lang, type system plays a critical role in ensuring the correctness and efficiency of expression execution. SP-Lang employs type inference. It means that the type system operates behind the scenes, delivering high performance without burdening the user with its complexities. This approach allows for a seamless and user-friendly experience, where advanced users can access the type system for more fine-grained control and optimization.
Info
A type system is a set of rules that define how data types are classified, combined, and manipulated in a language. It helps catch potential errors early on, improving code reliability, and ensures that operations are performed only on compatible data types.
Scalar types¤
Scalar types are the basic building blocks of a language, which represent single values. They are essential for working with different kinds of data and performing various operations.
Integers¤
Integers are whole numbers, like -5, 0, or 42, that can be used for counting or simple arithmetic operations. Integers could be signed or unsigned.
Type | Name | Type | Name | Bits | Bytes |
---|---|---|---|---|---|
si8 |
Signed 8bit integer | ui8 |
Unsigned 8bit integer | 8 | 1 |
si16 |
Signed 16bit integer | ui16 |
Unsigned 16bit integer | 16 | 2 |
si32 |
Signed 32bit integer | ui32 |
Unsigned 32bit integer | 32 | 4 |
si64 |
Signed 64bit integer | ui64 |
Unsigned 64bit integer | 64 | 16 |
si128 |
Signed 128bit integer | ui128 |
Unsigned 128bit integer | 128 | 32 |
si256 |
Signed 256bit integer | ui256 |
Unsigned 256bit integer | 256 | 64 |
A preferred (default) integer type is si64
(signed 64bit integer), followed by ui64
(unsigned 64bit integer).
This is because SP-Lang is designed primarily for 64bit CPUs.
int
is the alias for si64
.
Warning
256bit sizes are not fully supported yet.
Boolean¤
A Boolean (bool
) is a type that has one of two possible values denoted True
and False
.
Floating-Point¤
Floating-point numbers are decimal numbers, such as 3.14 or -0.5, that are useful for calculations involving fractions or more precise values.
Type | Name | Bytes |
---|---|---|
fp16 |
16bit float | 2 |
fp32 |
32bit float | 4 |
fp64 |
64bit float | 8 |
fp128 |
128bit float | 16 |
Warning
fp16
and fp128
are not fully supported.
Warning
Alias float
translates to fp64
which translates to LLVM double
(different from alias float
).
Complex scalar types¤
Complex scalar types are designed for values that provides some internal structure (so technically they are records or tuples) but they can fit into a scalar type (e.g. for performance or optimization purposes).
Date/Time¤
datetime
This is a value that represents a date and time in the UTC, using broken time structure.
Broken time means that year
, month
, day
, hour
, minute
, second
and microsecond
are stored in dedicated fields; different from the e.g. UNIX timestamp.
- Timezone: UTC
- Resolution: microseconds (six decimal digits)
- 64bit unsigned integer, aka
ui64
Broken time components
y
/year
m
/month
d
/day
H
/hour
M
/minute
S
/second
u
/microsecond
More detailed description of date/time is here.
IP Address¤
This data type contains IPv4 or IPv6 address.
ip
Underlying scalar type: ui128
RFC 4291
IPv4 are mapped into IPv6 space as prescribed in RFC 4291 "IPv4-Mapped IPv6 Address".
For example, the IPv4 address 12.23.45.67
will be mapped into IPv6 address ::ffff:c17:2d43
.
MAC Address¤
This data type contains MAC address, (EUI-48).
What is MAC Address?
A MAC address (short for medium access control address) is a unique identifier assigned to a network card etc.
mac
Underlying scalar type: ui64
, only 6 octecs are used in EUI-48.
Geographical coordinate¤
This type represents geographical coordinate, specifically longitude and latitude.
geopoint
Underlying scalar type: u64
More detailed description of geopoint is here.
Generic types¤
Generic types are used in the early stage of the SP-Lang parsing, optimization and compilation. The complementary type is Specific type. The SP-Lang resolves generic types into specific types by the mechanism called type inference. If generic type cannot be resolved into specific, the compilation will fail and you need to provide more information for a type inference.
The generic type starts with capital T
.
Also if the container type contains generic type, the container type or structural type itself is considered generic.
Container types¤
List¤
[Ti]
Ti
refers to a type of the item in the list
The list must contain a zero, one or many items of the same type.
The type constructor is !LIST
expression.
Set¤
{Ti}
Ti
refers to a type of the item in the set
The type constructor is !SET
expression.
Dictionary¤
{Tk:Tv}
Tk
refers to a type of the keyTv
refers to a type of the value
The type constructor is !DICT
expression.
Bag¤
[(Tk,Tv)]
Tk
refers to a type of the keyTv
refers to a type of the value
A bag (aka multimap) is a container that allows duplicate keys, unlike a dictionary, which only allows unique keys.
Tip
The bag is essentially a list of 2-tuples (couples).
Product types¤
A product type is a compounded type, formed by combining other types into a structure.
Tuple¤
Signature: (T1, T2, T3, ...)
The type constructor is !TUPLE
expression.
It is equivalent to a structure type in LLVM IR.
Tip
A tuple with no members respectively ()
is the unit.
Record¤
Signature: (name1: T1, name2: T2, name3: T3, ...)
The type constructor is !RECORD
expression.
It is is equivalent to a C struct
.
Sum type¤
A Sum type is a data structure used to hold a value that could take on several different types.
Any¤
any
The any
type is a special type that represents a value that can have any type.
Warning
The any
type shouldn't be used as a preferred type because it has an overhead.
Still, it is rather helpful for typing the dictionary that combines types (e.g. {str:any}
) and other situations where the type of the value is not known in the compile type.
The value contained in any
type is always located in the memory (e.g., memory pool); for this reason, this type is slower than others, which store value preferably in CPU registers.
The any
is a recursive type; it can contain itself because it contains all other types in the type universe.
For this reason, it is impossible to calculate the generic or even maximum size of the any
variable.
Object types¤
String¤
str
Must be in UTF-8 encoding.
Note
str
could be casted to [ui8]
(list of ui8
) in 'toll-free' manner; it is the binary equivalent.
Bytes¤
Work in progress
Planned
Enum¤
Work in progress
Planned
Regex¤
regex
Contains compiled pattern for a regular expression.
If the regex pattern is constant, then it is compiled during the respective expression compile time. In the case of dynamic regex pattern, the regex compilation happens during the expression evaluation.
JSON¤
json<SCHEMA>
JSON object, result of the JSON parsing. It is schema-based type.
Function Type¤
Function¤
(arg1:T1,arg2:T2,arg3:T3)->Tr
T1
,T2
,T3
are types of functions inputsarg1
,arg2
andarg3
respectively.Tr
specifies the output type of the function
Pythonic types¤
Pythonic types are object types that provides interfacing with the Python.
Python Dictionary¤
pydict<SCHEMA>
A Python dictionary. It is a schema-based type.
Python Object¤
pyobj
A generic Python object.
Python List¤
pylist
A Python list.
Python Tuple¤
pytuple
Casting¤
Use !CAST
expression for change of the type of a value.
!CAST
what: 1234
type: fp32
or an equivalent shortcut:
!!fp32 1234
Note
Cast is also a great helper for type inference, it means that it could be used to indicate the the type explicitly, if needed.
Schema-based types¤
Schema is the SP-Lang concept of how to bridge schema-less systems such us JSON or Python with strongly-typed SP-Lang. Schema is basically a directory that maps fields to their types and so on. For more information, continue to a chapter about SP-Lang schemas.
SP-Lang Schema-based type specifies the schema by a schema name: json<SCHEMANAME>
.
The schema name is used to locate the schema definition eg. in the library.
List of schema-based types:
* pydict<...>
* json<...>
Build-in schemas¤
ANY
: This schema declares any member to be of typeany
.VOID
: This schema has no member, use in-place type definition to specify types of fields.