- Operators
- Boolean operations & branching
- Comparison
- Arithmetics
- Formatting
- Strings
- Strings, lists and maps
- Lists
- Maps
- Dates & time
- Urls & web-related
- Fuzzy matching & information retrieval
- Utils
- IO & path wrangling
- Randomness & hashing

Operators

!x - boolean negation
-x - numerical negation

Warning: these operators will always consider operands as numbers or dates and will try to cast them around as such. For string/sequence comparison, use the operators in the next section.

x == y - numerical equality
x != y - numerical inequality
x < y - numerical less than
x <= y - numerical less than or equal
x > y - numerical greater than
x >= y - numerical greater than or equal

Warning: these operators will always consider operands as strings or sequences and will try to cast them around as such. For numerical comparison, use the operators in the previous section.

x eq y - string equality
x ne y - string inequality
x lt y - string less than
x le y - string less than or equal
x gt y - string greater than
x ge y - string greater than or equal

x + y - numerical addition
x - y - numerical subtraction
x * y - numerical multiplication
x / y - numerical division
x % y - numerical remainder
x // y - numerical integer division
x ** y - numerical exponentiation

x ++ y - string concatenation

x && y - logical and
x and y
x || y - logical or
x or y

x in y
x not in y

Negative indices are accepted and mean the same thing as with the Python language.

x[y] - get y from x (string or list index, map key)
x[start:end] - slice x from start index to end index
x[:end] - slice x from start to end index
x[start:] - slice x from start index to end

Pipelines with "|", using "_" for left-hand side substitution:

trim(name) | len(_) - Same as len(trim(name))
trim(name) | add(1, len(_)) - Can be nested
add(trim(name) | len, 2) - Can be used anywhere

Boolean operations & branching

- and(a, b, *n) -> T: Perform boolean AND operation on two or more values.
- if(cond, then, else?) -> T: Evaluate condition and switch to correct branch.
- unless(cond, then, else?) -> T: Shorthand for if(not(cond), then, else?).
- not(a) -> bool: Perform boolean NOT operation.
- or(a, b, *n) -> T: Perform boolean OR operation on two or more values.
- try(T) -> T: Attempt to evaluate given expression and return null if it raised an error.
Comparison

- eq(s1, s2) -> bool: Test string or list equality.
- ne(s1, s2) -> bool: Test string or list inequality.
- gt(s1, s2) -> bool: Test string or list s1 > s2.
- ge(s1, s2) -> bool: Test string or list s1 >= s2.
- lt(s1, s2) -> bool: Test string or list s1 < s2.
- le(s1, s2) -> bool: Test string or list s1 <= s2.
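The difference between the numerical operators (e.g. `<`) and the string/sequence functions listed above (e.g. `lt`) is easiest to see on values that look like numbers. The following Python sketch only mimics the idea; `numeric_lt` and `string_lt` are hypothetical helper names, and the actual casting rules of the expression language may differ.

```python
def numeric_lt(a, b):
    # Mimics the idea behind `x < y`: cast both operands to numbers first.
    return float(a) < float(b)


def string_lt(a, b):
    # Mimics the idea behind `x lt y` or lt(x, y): compare as strings.
    return str(a) < str(b)


print(numeric_lt("10", "9"))  # False: 10 is not less than 9
print(string_lt("10", "9"))   # True: "1" sorts before "9"
```

The same pair of cell values can therefore order differently depending on which family of comparison is used.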
Arithmetics

- abs(x) -> number: Return absolute value of number.
- add(x, y, *n) -> number: Add two or more numbers.
- argmax(numbers, labels?) -> any: Return the index or label of the largest number in the list.
- argmin(numbers, labels?) -> any: Return the index or label of the smallest number in the list.
- ceil(x, unit?) -> number: Return the smallest integer greater than or equal to x. Optionally ceil to nearest given unit.
- div(x, y, *n) -> number: Divide two or more numbers.
- idiv(x, y) -> number: Integer division of two numbers.
- int(any) -> int: Cast value as int and raise an error if impossible.
- float(any) -> float: Cast value as float and raise an error if impossible.
- floor(x, unit?) -> number: Return the largest integer lower than or equal to x. Optionally floor to nearest given unit.
- log(x, base?) -> number: Return the natural or custom base logarithm of x.
- log2(x) -> number: Return the base 2 logarithm of x.
- log10(x) -> number: Return the base 10 logarithm of x.
- max(x, y, *n), max(list_of_numbers) -> number: Return the maximum number.
- min(x, y, *n), min(list_of_numbers) -> number: Return the minimum number.
- mod(x, y) -> number: Return the remainder of x divided by y.
- mul(x, y, *n) -> number: Multiply two or more numbers.
- neg(x) -> number: Return -x.
- pow(x, y) -> number: Raise x to the power of y.
- round(x, unit?) -> number: Return x rounded to the nearest integer. Optionally round to nearest given unit.
- sqrt(x) -> number: Return the square root of x.
- sub(x, y, *n) -> number: Subtract two or more numbers.
- trunc(x, unit?) -> number: Truncate the number by removing its decimal part. Optionally trunc to nearest given unit.
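The optional unit argument of ceil and floor can be read as "snap to a multiple of unit". Below is a minimal Python sketch of that reading; `ceil_to` and `floor_to` are hypothetical names, and the scale-then-rescale approach is an assumption about the semantics rather than the actual implementation.

```python
import math


def ceil_to(x, unit=1):
    # Smallest multiple of `unit` that is greater than or equal to x.
    return math.ceil(x / unit) * unit


def floor_to(x, unit=1):
    # Largest multiple of `unit` that is lower than or equal to x.
    return math.floor(x / unit) * unit


print(ceil_to(1234, 100))   # 1300
print(floor_to(1234, 100))  # 1200
print(floor_to(-1.5))       # -2
```

Note that flooring a negative number moves away from zero, which is where floor and trunc differ.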
Formatting

- bytesize(string) -> string: Return a number of bytes in human-readable format (KB, MB, GB, etc.).
- escape_regex(string) -> string: Escape a string so it can be used safely in a regular expression.
- fmt(string, *arguments), fmt(string, map) -> string: Format a string by replacing "{}" occurrences by subsequent arguments.
  Example: fmt("Hello {} {}", name, surname) will replace the first "{}" by the value of the name column, then the second one by the value of the surname column.
  Can also be given a substitution map like so: fmt("Hello {name}", {name: "John"}).
- lower(string) -> string: Lowercase string.
- pad(string, width, char?) -> string: Pad given string with spaces or given character so that it is at least the given width.
- lpad(string, width, char?) -> string: Left pad given string with spaces or given character so that it is at least the given width.
- rpad(string, width, char?) -> string: Right pad given string with spaces or given character so that it is at least the given width.
- printf(format, *arguments) -> string: Apply printf formatting with given format and arguments. Arguments can also be provided as a list.
  For instance: split('John Landy') | printf('first: %s, last: %s', _)
- numfmt(number, thousands_sep=",", comma=false, significance=5) -> string: Format a number with thousands separator and proper significance.
- trim(string, chars?) -> string: Trim string of leading & trailing whitespace or provided characters.
- to_fixed(number, precision) -> string: Format given number using fixed point notation with specified number of decimal places.
- ltrim(string, chars?) -> string: Trim string of leading whitespace or provided characters.
- rtrim(string, chars?) -> string: Trim string of trailing whitespace or provided characters.
- upper(string) -> string: Uppercase string.
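The two calling conventions of fmt (positional "{}" placeholders versus a substitution map with "{name}" placeholders) can be sketched in Python as follows. This is only an illustration of the behavior described above: the regular expressions and the handling of edge cases (missing keys, too few arguments) are assumptions, not the library's actual logic.

```python
import re


def fmt(template, *args):
    # A single dict argument is treated as a substitution map for "{name}" placeholders.
    if len(args) == 1 and isinstance(args[0], dict):
        mapping = args[0]
        return re.sub(r"\{(\w+)\}", lambda m: str(mapping[m.group(1)]), template)
    # Otherwise, each "{}" is replaced by the next positional argument, in order.
    values = iter(args)
    return re.sub(r"\{\}", lambda m: str(next(values)), template)


print(fmt("Hello {} {}", "John", "Landy"))     # Hello John Landy
print(fmt("Hello {name}", {"name": "John"}))   # Hello John
```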
Strings

- count(string, substring), count(string, regex) -> int: Count the number of times substring appears in string, or the number of times a regex pattern matches the string. Note that only non-overlapping matches will be counted in both cases. Remember a regex pattern must be written with slashes, e.g. /france|french/i.
- endswith(string, substring) -> bool: Test if string ends with substring.
- match(string, regex, group) -> string: Return a regex pattern match on the string. Remember a regex pattern must be written with slashes, e.g. /france|french/i.
- replace(string, substring, replacement), replace(string, regex, replacement) -> string: Replace all non-overlapping occurrences of substring in given string with provided replacement. Can also replace regex pattern matches. Remember a regex pattern must be written with slashes, e.g. /france|french/i.
  See regex replacement string syntax documentation here: https://docs.rs/regex/latest/regex/struct.Regex.html#replacement-string-syntax
- split(string, substring, max?), split(string, regex, max?) -> list: Split a string by a given separator substring. Can also split using a regex pattern. Remember a regex pattern must be written with slashes, e.g. /france|french/i.
- startswith(string, substring) -> bool: Test if string starts with substring.
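"Non-overlapping" in the description of count means that once a match is consumed, scanning resumes after it. Python's own `str.count` and `re.findall` follow the same convention, so they make a convenient illustration; `count_substring` and `count_regex` are hypothetical wrappers, and this is not a claim about how the expression language implements counting.

```python
import re


def count_substring(text, needle):
    # str.count scans left to right and never reuses characters of a previous match.
    return text.count(needle)


def count_regex(text, pattern, ignore_case=False):
    # re.findall also returns non-overlapping matches only.
    flags = re.IGNORECASE if ignore_case else 0
    return len(re.findall(pattern, text, flags=flags))


print(count_substring("aaaa", "aa"))  # 2, not 3
print(count_regex("France and the French", r"france|french", ignore_case=True))  # 2
```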
Strings, lists and maps

- concat(string, *strings) -> string: Concatenate given strings into a single one.
- contains(string, substring), contains(string, regex), contains(list, item), contains(map, key) -> bool: If target is a string, return whether substring can be found in it or whether given regular expression matched. If target is a list, return whether given item was found in it. If target is a map, return whether given key was found in it.
- first(seq) -> T: Get first char of string or first item of list.
- last(seq) -> T: Get last char of string or last item of list.
- len(seq) -> int: Get number of chars in string or number of items in list.
- get(string, index, default?), get(list, index, default?), get(map, key, default?) -> any: If target is a string, return the nth unicode char. If target is a list, return the nth item. Indices are zero-based and can be negative to access items in reverse. If target is a map, return the value associated with given key. All variants can also take a default value when desired item is not found.
- slice(seq, start, end?) -> seq: Return slice of string or list.
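The three variants of get described above (string, list, map) share one shape: look something up, fall back to a default when it is missing. A rough Python equivalent, written as a sketch that assumes a missing default yields a null-like value:

```python
def get(target, key, default=None):
    # Map variant: look up a key.
    if isinstance(target, dict):
        return target.get(key, default)
    # String and list variants: zero-based index, negative values count from the end.
    try:
        return target[key]
    except IndexError:
        return default


print(get("h\u00e9llo", 1))         # é (indexing by character, not by byte)
print(get([1, 2, 3], -1))           # 3
print(get([1, 2, 3], 10, "n/a"))    # n/a
print(get({"a": 1}, "b", 0))        # 0
```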
Lists

- all(list, lambda) -> bool: Returns whether the given lambda returned true for all elements of the list.
  For instance: all(names, name.startswith('A'))
- any(list, lambda) -> bool: Returns whether the given lambda returned true for any element of the list.
  For instance: any(names, name.startswith('A'))
- compact(list) -> list: Drop all falsey values from given list.
- filter(list, lambda) -> list: Return a list containing only elements for which given lambda returned true.
  For instance: filter(names, name => name.startswith('A'))
- find(list, lambda) -> any?: Return the first item of a list for which given lambda returned true.
  For instance: find(names, name => name.startswith('A'))
- find_index(list, lambda) -> int?: Return the index of the first item of a list for which given lambda returned true.
  For instance: find_index(names, name => name.startswith('A'))
- index_by(list, key) -> map: Take a list of maps and a key name and return an indexed map from selected keys to the original maps.
- join(list, sep) -> string: Join list by separator.
- map(list, lambda) -> list: Return a list with elements transformed by given lambda.
  For instance: map(numbers, n => n + 3)
- mean(numbers) -> number?: Return the mean of the given numbers.
- range(stop), range(start, stop, step=1) -> list[number]: Return the specified range as a list of integers.
- repeat(string_or_list, times) -> string_or_list: Repeat target string or list n times.
- sum(numbers) -> number?: Return the sum of the given numbers, or nothing if the sum overflowed.
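Of the list functions above, index_by is the least self-explanatory. In Python terms it amounts to building a dictionary keyed by one field of each record. The sketch below assumes that later records overwrite earlier ones on duplicate keys; the source does not say how duplicates are handled.

```python
def index_by(rows, key):
    # Map each row's value for `key` to the row itself.
    # Assumption: on duplicate keys, the last row wins.
    return {row[key]: row for row in rows}


people = [
    {"id": "a1", "name": "John"},
    {"id": "b2", "name": "Lucy"},
]
by_id = index_by(people, "id")
print(by_id["b2"]["name"])  # Lucy
```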
Maps

- keys(map) -> [string]: Return a list of the map's keys.
- values(map) -> [T]: Return a list of the map's values.
Dates & time

- datetime(string, format=?) -> zoned?_datetime: Attempt to parse a datetime with or without timezone info from given string. If no format is provided, string is parsed using ISO 8601 date format.
  https://docs.rs/jiff/latest/jiff/fmt/strtime/index.html#conversion-specifications
- date(string_or_datetime, format=?) -> date: If given a datetime, will return its date component. Else, attempt to parse a date from given string. If no format is provided, string is parsed using ISO 8601 date format.
  https://docs.rs/jiff/latest/jiff/fmt/strtime/index.html#conversion-specifications
- time(string_or_datetime, format=?) -> time: If given a datetime, will return its time component. Else, attempt to parse a time from given string. If no format is provided, string is parsed using ISO 8601 time format.
  https://docs.rs/jiff/latest/jiff/fmt/strtime/index.html#conversion-specifications
- span(string) -> span: Parse given string as a time span that can be added to or subtracted from temporal elements.
  Format: https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing
- now() -> zoned_datetime: Return current datetime in local timezone.
- from_timestamp(int_or_float) -> zoned_datetime: Interpret given int as seconds timestamp, or given float as seconds timestamp with fractional subseconds component.
- from_timestamp_ms(int) -> zoned_datetime: Interpret given int as milliseconds timestamp.
- to_timestamp(zoned_datetime) -> int_or_float: Convert given datetime to seconds timestamp or seconds with fractional subseconds timestamp if datetime has enough precision. Will error if given datetime has no timezone info.
- to_timestamp_ms(zoned_datetime) -> int: Convert given datetime to milliseconds timestamp. Will error if given datetime has no timezone info.
- earliest(t1, t2, *tn), earliest(list_of_temporals) -> temporal: Return the earliest point in time. Expects homogeneous types (all dates, all datetimes etc.).
- latest(t1, t2, *tn), latest(list_of_temporals) -> temporal: Return the latest point in time. Expects homogeneous types (all dates, all datetimes etc.).
- fractional_days(t1, t2) -> float: Returns number of days between two points in time, as a signed float. Expects homogeneous types (2 dates, 2 datetimes etc.).
- strftime(target, format) -> string: Format temporal value according to format.
  https://docs.rs/jiff/latest/jiff/fmt/strtime/index.html#conversion-specifications
- to_timezone(zoned_datetime, timezone) -> zoned_datetime (aliases: to_tz): Convert given datetime to given timezone. Will error if given datetime has no timezone info.
- to_local_timezone(zoned_datetime) -> zoned_datetime (aliases: to_local_tz): Convert given datetime to local timezone. Will error if given datetime has no timezone info.
- with_timezone(datetime, timezone) -> zoned_datetime (aliases: with_tz): Arbitrarily indicate that given civil datetime should be understood as being in given timezone. Will error if given datetime already has timezone info.
- with_local_timezone(datetime) -> zoned_datetime (aliases: with_local_tz): Arbitrarily indicate that given civil datetime should be understood as being in local timezone. Will error if given datetime already has timezone info.
- without_timezone(zoned_datetime) -> datetime (aliases: without_tz): Return the civil datetime of a datetime with timezone info. Will error if given datetime has no timezone info.
- year_month_day(target) -> string (aliases: ymd): Extract the year, month and day of a datetime. If the input is a string, first parse it into datetime, and then extract the year, month and day.
  Equivalent to strftime(string, format="%Y-%m-%d").
- month_day(target) -> string: Extract the month and day of a datetime. If the input is a string, first parse it into datetime, and then extract the month and day.
  Equivalent to strftime(string, format="%m-%d").
- month(target) -> string: Extract the month of a datetime. If the input is a string, first parse it into datetime, and then extract the month.
  Equivalent to strftime(string, format="%m").
- year(target) -> string: Extract the year of a datetime. If the input is a string, first parse it into datetime, and then extract the year.
  Equivalent to strftime(string, format="%Y").
- year_month(target) -> string (aliases: ym): Extract the year and month of a datetime. If the input is a string, first parse it into datetime, and then extract the year and month.
  Equivalent to strftime(string, format="%Y-%m").
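The distinction between with_timezone (attach a timezone to a civil datetime) and to_timezone (convert an already zoned datetime) is a common source of confusion. Python's standard `datetime` module has the same two operations under the names `replace(tzinfo=...)` and `astimezone(...)`, which makes for a compact analogy. This is an analogy only; fixed UTC offsets are used here instead of named timezones to keep the sketch self-contained.

```python
from datetime import datetime, timedelta, timezone

utc = timezone.utc
plus_two = timezone(timedelta(hours=2))

# A civil datetime: no timezone info attached.
civil = datetime(2024, 1, 1, 12, 0, 0)

# Analogous to with_timezone: keep the wall-clock reading, attach a timezone.
tagged = civil.replace(tzinfo=utc)

# Analogous to to_timezone: keep the instant, change the wall-clock reading.
converted = tagged.astimezone(plus_two)

# Analogous to without_timezone: drop the timezone, keep the wall-clock reading.
stripped = converted.replace(tzinfo=None)

print(tagged.isoformat())              # 2024-01-01T12:00:00+00:00
print(converted.isoformat())           # 2024-01-01T14:00:00+02:00
print(tagged == converted)             # True: both denote the same instant
print(converted.strftime("%Y-%m-%d"))  # 2024-01-01
```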
Urls & web-related

- html_unescape(string) -> string: Unescape given HTML string by converting HTML entities back to normal text.
- lru(string) -> string: Convert the given URL to LRU format.
  For more info, read this: https://github.com/medialab/ural#about-lrus
- mime_ext(string) -> string: Return the extension related to given mime type.
- parse_dataurl(string) -> [string, bytes]: Parse the given data url and return its mime type and decoded binary data.
- urljoin(string, string) -> string: Join an url with the given addendum.
Fuzzy matching & information retrieval

- fingerprint(string) -> string: Fingerprint a string by normalizing characters, re-ordering and deduplicating its word tokens before re-joining them by spaces.
- soundex(name) -> string: Compute the SOUNDEX code (a phonetic encoding) of given name.
- refined_soundex(name) -> string: Compute the refined SOUNDEX code (a phonetic encoding) of given name.
- phonogram(name) -> string: Compute the "phonogram" code (yomguithereal's own phonetic encoding) of given name.
- carry_stemmer(string) -> string: Apply the "Carry" stemmer targeting the French language.
- s_stemmer(string) -> string: Apply a very simple stemmer removing common plural inflexions in some languages.
- unidecode(string) -> string: Convert string to ascii as well as possible.
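The fingerprint description above (normalize characters, then sort and deduplicate word tokens, then join by spaces) can be approximated in a few lines of Python. The exact normalization steps are not given in the text, so the ones below (accent stripping, lowercasing, punctuation removal) are assumptions chosen for illustration.

```python
import re
import unicodedata


def fingerprint(text):
    # Assumed normalization: strip accents, lowercase, drop punctuation.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^\w\s]", "", ascii_text.lower())
    # Deduplicate and re-order word tokens, then re-join them by spaces.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


print(fingerprint("Doe, John  john"))  # doe john
print(fingerprint("john DOE"))         # doe john
```

Two spellings of the same name collapse to the same key, which is what makes fingerprints useful for grouping near-duplicates.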
Utils

- col(), col(name_or_pos, nth?) -> bytes: Without argument, return current column's value, if relevant. Else, return value for given column, by name, by position or by name & nth, in case of duplicate header names.
- col?(name_or_pos, nth?) -> bytes: Return value of cell for given column, by name, by position or by name & nth, in case of duplicate header names. Allow selecting nonexistent columns, in which case it will return null.
- header(), header(name_or_pos, nth?) -> bytes: Without argument, return current column's name, if relevant. Else, return header name for given column, by name, by position or by name & nth, in case of duplicate header names.
- header?(name_or_pos, nth?) -> bytes: Return header name for given column, by name, by position or by name & nth, in case of duplicate header names. Allow selecting nonexistent columns, in which case it will return null.
- col_index(), col_index(name_or_pos, nth?) -> bytes: Without argument, return current column's zero-based index, if relevant. Else, return zero-based index of given column, by name, by position or by name & nth, in case of duplicate header names.
- col_index?(name_or_pos, nth?) -> bytes: Return zero-based index of given column, by name, by position or by name & nth, in case of duplicate header names. Allow selecting nonexistent columns, in which case it will return null.
- cols(from_name_or_pos?, to_name_or_pos?) -> list[bytes]: Return list of cell values from the given column by name or position to another given column by name or position, inclusive. Can also be called with a single argument to take a slice from the given column to the end, or no argument at all to take all columns.
- prev_col(offset=1) -> bytes: Return cell value of column just before current column. Take an optional offset if you want a larger stride.
- next_col(offset=1) -> bytes: Return cell value of column just after current column. Take an optional offset if you want a larger stride.
- err(msg) -> error: Make the expression return a custom error.
- headers(from_name_or_pos?, to_name_or_pos?) -> list[string]: Return list of header names from the given column by name or position to another given column by name or position, inclusive. Can also be called with a single argument to take a slice from the given column to the end, or no argument at all to return all headers.
- row_index() -> int?: Return current row's zero-based index, if relevant.
- regex(string) -> regex: Parse given string as regex. Useful when your patterns are dynamic, e.g. built from a CSV cell. Else prefer using regex literals e.g. "/test/".
- typeof(value) -> string: Return type of value.
IO & path wrangling

- abspath(string) -> string: Return absolute & canonicalized path.
- basename(path, suffix?) -> string: Return the final component of given path, usually the file name, all while stripping it of an optional suffix.
- cmd(string, list[string]) -> bytes: Run a command using the provided list of arguments as a subprocess and return the resulting bytes trimmed of trailing whitespace.
- copy(source_path, target_path) -> string: Copy a source to target path. Will create necessary directories on the way. Returns target path as a convenience.
- dirname(path) -> string: Return target path without final component if any.
- ext(path) -> string?: Return the path's extension, if any.
- filesize(string) -> int: Return the size of given file in bytes.
- isfile(string) -> bool: Return whether the given path is an existing file on disk.
- move(source_path, target_path) -> string: Move a source to target path. Will create necessary directories on the way. Returns target path as a convenience.
- parse_json(string) -> any: Parse the given string as JSON.
- parse_py_literal(string) -> any: Parse the given string as a python literal.
- pathjoin(string, *strings) -> string (aliases: pjoin): Join multiple paths correctly.
- read(path, encoding="utf-8", errors="strict") -> string: Read file at path. Default encoding is "utf-8". Default error handling policy is "replace", and can be one of "replace", "ignore" or "strict".
- read_csv(path) -> list[map]: Read and parse CSV file at path, returning its rows as a list of maps with headers as keys.
- read_json(path) -> any: Read and parse JSON file at path.
- shell(string) -> bytes: Convenience function running cmd("$SHELL -c <command>") on unix-like systems and cmd("cmd \C <command>") on Windows.
- shlex_split(string) -> list[string]: Split a string of command line arguments into a proper list that can be given to e.g. the cmd function.
- write(string, path) -> string: Write string to path as utf-8 text. Will create necessary directories recursively before actually writing the file. Return the path that was written.
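Two of the functions above have close counterparts in Python's standard library, which helps show what they are for: `shlex.split` does the kind of quoting-aware splitting that shlex_split describes, and basename with a suffix can be sketched on top of `os.path.basename`. The `basename` helper below is a hypothetical re-creation, assuming the suffix is only stripped when the file name actually ends with it.

```python
import os.path
import shlex

# Quoting-aware splitting: the quoted phrase stays a single argument.
args = shlex.split("grep -i 'hello world' notes.txt")
print(args)  # ['grep', '-i', 'hello world', 'notes.txt']


def basename(path, suffix=""):
    # Final component of the path, optionally stripped of a trailing suffix.
    name = os.path.basename(path)
    if suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    return name


print(basename("/data/report.csv"))          # report.csv
print(basename("/data/report.csv", ".csv"))  # report
```

A plain split on whitespace would have broken 'hello world' into two arguments, which is why a dedicated splitting function is worth having before handing arguments to a subprocess.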
Randomness & hashing

- md5(string) -> string: Return the md5 hash of string in hexadecimal representation.
- random() -> float: Return a random float between 0 and 1.
- uuid() -> string: Return a uuid v4.
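For readers who want to cross-check values produced by these functions, the Python standard library offers equivalents. The sketch assumes the md5 function hashes the UTF-8 bytes of the string, which the text above does not state explicitly.

```python
import hashlib
import random
import uuid

# Hexadecimal md5 digest of a string (assumption: UTF-8 encoded before hashing).
digest = hashlib.md5("hello".encode("utf-8")).hexdigest()
print(digest)  # 5d41402abc4b2a76b9719d911017c592

# Random float in the half-open interval [0, 1).
value = random.random()

# Random (version 4) UUID, rendered as a 36-character string.
identifier = uuid.uuid4()
print(identifier.version)  # 4
```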