langur

string literals

There are 3 forms of string literals (more if you count block quotes).

"string"
qs"string"
QS"string"

The qs and QS forms allow newlines within the quote.

quote marks

The qs and QS forms allow other quote marks to be used. Valid quote mark pairs are shown below.

qs"" qs'' qs//
qs() qs[] qs<>

string/regex modifiers

By default, string literals allow characters designated as "Graphic" by Unicode and ASCII space characters.

modifier   description
any        allow any code point in a string literal
lead       remove leading spaces on every line (useful with blockquotes)
ni         non-interpolated string or regex
block      part of the syntax for blockquotes, explained next

Modifiers are designated by a colon and the modifier name, as in the examples below. They may be chained.

qs:any"..."

re:any"..."

Regex literals also have their own modifiers.

block quotes

Using a block modifier after a qs or QS token, separated by a colon, allows you to use block quotes with a specific marker. This is similar to a "HEREDOC."

This may also be used with regex literals.

The ending marker must be on a line of its own, with no trailing spaces. Leading spaces and tabs are allowed.

val str = qs:block BLOCK_STRING
some multi-line string
...
BLOCK_STRING

The line return after the opening marker and the line return before the ending marker are not part of the string.

A block modifier is always the last modifier on a literal.
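
The marker rules above can be modelled outside langur. The following is a minimal Python sketch of those rules only, not langur's implementation; the function name extract_block and its parameters are hypothetical, and it assumes the input is the source text that follows the opening marker.

```python
def extract_block(source: str, marker: str) -> str:
    """Return the body of a block quote.

    `source` is the text following the opening marker, starting with the
    line return that ends the opening-marker line (an assumption of this model).
    """
    lines = source.split("\n")
    body = []
    # lines[0] is the remainder of the opening-marker line, so it is skipped;
    # this drops the line return after the opening marker.
    for line in lines[1:]:
        # Ending marker: alone on its line, leading spaces and tabs allowed,
        # trailing spaces not allowed.
        if line.lstrip(" \t") == marker:
            # Joining without a final "\n" drops the line return
            # before the ending marker.
            return "\n".join(body)
        body.append(line)
    raise ValueError("ending marker not found")
```

In this model, a line such as "BLOCK_STRING " (with a trailing space) is treated as ordinary content rather than as the ending marker.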

langur escape codes

The plain form and the qs form for string literals interpret langur escape codes; the QS form does not.

Bare octal escape codes are not allowed. Use \oXXX instead.

Quote marks are not escaped by doubling, as that is a confusing and inferior syntax.

\" \' \/ \) \] \>   when otherwise used as a closing quote mark
\{ \}               indicates not to be interpolation markers
\0                  null (1 to 9 not defined)
\e                  escape
\t                  tab
\n                  line feed
\r                  carriage return
\xXX                2 digit hexadecimal code unit (00 to 7F)
\oXXX               3 digit octal code unit (000 to 177)
\uXXXX              4 digit hexadecimal code point
\UXXXXXXXX          8 digit hexadecimal code point
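
To make the table concrete, here is a rough Python model of a decoder for the escape codes listed above. It is a sketch under assumptions, not langur's parser: decode_escapes is a hypothetical name, and rejecting every escape not listed in the table is a choice of this model.

```python
import re

# Single-character escapes from the table above.
SIMPLE = {"0": "\0", "e": "\x1b", "t": "\t", "n": "\n", "r": "\r"}
# Characters that are escaped only to keep them literal
# (closing quote marks and interpolation braces).
LITERAL = set("\"'/)]>{}")

_ESCAPE = re.compile(
    r"\\(?:x([0-9A-Fa-f]{2})|o([0-7]{3})|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))",
    re.DOTALL,
)

def decode_escapes(text: str) -> str:
    """Decode the escape codes listed in the table (model only)."""
    def replace(match):
        hex2, oct3, hex4, hex8, single = match.groups()
        if hex2 is not None:
            value = int(hex2, 16)
            if value > 0x7F:
                raise ValueError("\\xXX is limited to 00 to 7F")
            return chr(value)
        if oct3 is not None:
            value = int(oct3, 8)
            if value > 0o177:
                raise ValueError("\\oXXX is limited to 000 to 177")
            return chr(value)
        if hex4 is not None:
            return chr(int(hex4, 16))
        if hex8 is not None:
            return chr(int(hex8, 16))
        if single in SIMPLE:
            return SIMPLE[single]
        if single in LITERAL:
            return single
        # Assumption of this model: anything not in the table is an error,
        # which also covers bare digits 1 to 9.
        raise ValueError("unknown escape code: \\" + single)
    return _ESCAPE.sub(replace, text)
```

For example, decode_escapes(r"\x41\o101\u0041") gives "AAA" in this model, while r"\x80" and r"\1" raise ValueError.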

string things

You can use a fw// or FW// literal for a free word list, as a semantic convenience.

Besides using empty quote marks, you can also use the zls token to represent a zero-length string.

string operators

~           (tilde) append operator
* integer   string * integer (see below)

string multiplication

To repeat a string, use it with a multiplication operator and integer.

val result = "A" * 4 # result == "AAAA"

String multiplication treats negative integers as 0 rather than throwing an exception, so the result is a zero-length string.
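
The clamping rule can be written out as a small Python model. This is a sketch of the described behaviour only; the function name repeat is hypothetical.

```python
def repeat(text: str, count: int) -> str:
    """Model of string * integer: negative counts behave like 0."""
    # Clamp the count so a negative integer yields a zero-length string
    # instead of raising an error.
    return text * max(count, 0)
```

In this model, repeat("A", 4) gives "AAAA" and repeat("A", -3) gives "".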

string concatenation

The concatenation operator may be used between strings, or between a string and an integer (code point).

"A" ~ "B" == "AB"
"a" ~ 99 == "ac"

You can also append code points to each other to generate a string.

97 ~ 98 ~ 99 == "abc"
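
The append rules in this section (strings, integer code points, and ranges of code points) can be modelled in Python as follows. This is a sketch, not langur code: the names append, to_text, and inclusive are hypothetical, and the inclusive upper bound is inferred from the range example in this section.

```python
from typing import Union

Operand = Union[str, int, range]

def inclusive(first: int, last: int) -> range:
    """Model of first..last, assuming the upper bound is included."""
    return range(first, last + 1)

def to_text(operand: Operand) -> str:
    """Convert one operand of the append operator to a string."""
    if isinstance(operand, str):
        return operand
    if isinstance(operand, int):
        # An integer is treated as a single code point.
        return chr(operand)
    # A range is treated as a run of code points.
    return "".join(chr(code_point) for code_point in operand)

def append(*operands: Operand) -> str:
    """Model of chaining the ~ operator across all operands."""
    return "".join(to_text(operand) for operand in operands)
```

In this model, append("a", 99) gives "ac", append(97, 98, 99) gives "abc", and append("a", inclusive(98, 100)) gives "abcd".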

You can append a range of numbers directly to a string, instead of using the cp2s() function first.

"a" ~ 98..100 == "abcd"

code point literals

Code point literals are specified with straight single quotes.

They are integers. There is no code point type.

Escape codes may be used in code point literals, unless they might be more than a single code point.

'a' lowercase a
'\uFEFF' code point FEFF; same as 16xFEFF

The code points allowed in bare (unescaped) code point literals are the same as those allowed in string literals.