Greetings all. This is a draft of a proposed new programming language. The BDFL of this project is DannyNiu/NJF. The intention of this request for comments is to solicit ideas: advice, suggestions for improvement, as well as critique of perceived defects.
While all ideas are welcome, they're better received if they're accompanied by counter-arguments, usage illustrations, and/or a sketch of implementation. The decision of adoption is ultimately made by the BDFL of the project.
You may submit your ideas and/or queries by opening Issues at GitHub or Gitee; both English and Chinese are accepted.
Build Info: This build of the (draft) spec is based on git commit 8b2ba2204e06264302f3b766d189b5110bf34820
The 2025-12-26 revision of the draft spec is the 2nd feature-complete beta, and is ready to be implemented for testing.
The 'cxing' programming language (with or without caps) is a general-purpose programming language with a C-like syntax that is memory-safe, aims to be thread-safe, and has surprise-free semantics. It aims to fit into and interoperate with the existing ecosystem written in other languages, with C as its starting point.
It attempts to pioneer in the field of efficient, expressive, and robust error handling using language design toolsets.
The language is meant to be an open standard with multiple independent implementations that are widely interoperable. It can be implemented either as interpreted or as compiled. Programs written in cxing should be no less portable than when they're written in C.
Features are introduced on a strictly maintainable basis. The reference implementation will be an AST-based interpreter (or a transpiler to C?), which will serve as an instrument of verification for additional implementations. The version of the language (if it ever changes) will be independent of the versions of the implementations.
See section 2. Features for more information on how the goals are achieved.
Just as Java is a beautiful island in Indonesia, we wanted a name that expresses our pride as Earth-loving Chinese here in Shanghai, so we chose to name our language after the National Nature Reserve Park of Changxing Island. However, the name is too long to be used directly, and "changx" looked too much like 'clang', so we simplified it to "cxing", which we find both pleasing to look at and suggestive of an information technology product.
The language itself and the reference implementation are released into the public domain.
To best reflect the intent of the design, the specification shall be programmer-oriented. The purpose of features will be explained, with examples provided on how they're to be used. The syntax and semantic definitions follow.
The language does not expose pointers - to data or to function - only opaque
object handles. It uses reference counting with garbage collection to ensure
memory safety. It has separate type domain for sharable types catered to
multi-threaded access, and exclusive types for efficient access within a
single thread; only sharable types can be declared globally.
It's typical to desire that some result come out of a failing program; it is even more desirable that the failure of a single component doesn't deny service to users; it's very desirable that error recovery be easy to program; and it's undesirable that errors cannot be detected.
In cxing, errors occur in the form of nullish values. For the
special value null, accessing any member of it yields null, and calling
a null as a function returns null. Nullish values can be substituted with
alternative values so that programs can recover from errors.
// We do not know the schema of this object, but we know it can be
// one of the two alternatives. Here the "??" punctuation is the
// nullish coalescing operator:
timescale = mp4box.movie.timescale ??
mp4box.fragments[0].timescale ??
mp4file.timescale;
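The propagation rule above can be modelled outside the language. The following Python sketch is only an illustration under stated assumptions: the names Null, Obj, is_nullish and coalesce are hypothetical, and the rule that reading a missing member of an ordinary object yields null is an assumption made so that the example can run; the text above only states this for null itself.

```python
import math

class Null:
    """Stand-in for null: member access, indexing and calls all yield null again."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self

    def __call__(self, *args, **kwargs):
        return self

    def __repr__(self):
        return "null"

null = Null()

class Obj:
    """Hypothetical object whose missing members read as null (an assumption)."""
    def __init__(self, **members):
        self.__dict__.update(members)

    def __getattr__(self, name):
        # Only reached when the member does not exist.
        return null

def is_nullish(value):
    # NaN is treated as nullish, as the spec does for floating point errors.
    return value is null or (isinstance(value, float) and math.isnan(value))

def coalesce(*thunks):
    """Model of a ?? b ?? c: operands are evaluated lazily, left to right."""
    value = null
    for thunk in thunks:
        value = thunk()
        if not is_nullish(value):
            break
    return value

mp4box = Obj(movie=Obj())          # neither alternative schema is present
mp4file = Obj(timescale=600)
timescale = coalesce(
    lambda: mp4box.movie.timescale,
    lambda: mp4box.fragments[0].timescale,
    lambda: mp4file.timescale,
)
```

Each operand is wrapped in a lambda because Python evaluates arguments eagerly, whereas the right-hand operand of the coalescing operator is only evaluated when the left-hand one is nullish.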
A bit of background first.
The IEEE-754 standard for floating point arithmetic specifies handling of exceptional conditions for computations. These conditions can be handled in the default way (default exception handling) or in some alternative ways (alternative exception handling).
The 1985 edition of the standard described exceptions and their default handling in section 7, and handling using traps in section 8. These were revised as "exceptions and default exception handling" in section 7 as well as "alternate exception handling attributes" in section 8 in the 2008 edition of the standard - these "attributes" are associated with "blocks" which (as most would expect) are group(s) of statements. Alternate exception handling is used in many advanced numerical programs to improve robustness.
As a prescriptive standard, it was intended to have language standards to describe constructs for handling floating point errors in a generic way that abstracts away the underlying detail of system and hardware implementations. In doing so, the standard itself becomes non-generic, and described features specific to some languages that were not present in others.
The cxing language employs null coalescing operators as general-purpose error-handling syntax, and makes them cover NaNs by making NaNs nullish. As an unsolicited half-improvement, I (@dannyniu) propose the following alternative description for "alternate exception handling":
Languages ought to specify ways for programs to transfer the control of execution, or to evaluate certain expressions, when a subset (some or all) of exceptions occur.
As an example, the continued fraction function in code example A-16 from "Numerical Computing Guide" of Sun ONE Studio 8 (https://www5.in.tum.de/~huckle/numericalcomputationguide.pdf , accessed 2025-08-15) can be written in cxing as:
subr continued_fraction(N, a, b, x, out)
{
    decl f, f1, d, d1, pd1, q;
    decl j;
    f1 = 0.0;
    f = a[N];
    for(j=N-1; j>=0; j--)
    {
        d = x + f;
        d1 = 1.0 + f;
        q = b[j] / d;
        f1 = (-d1 / d) * q _Fallback f1 = b[j] * pd1 / b[j+1];
        pd1 = d1;
        f = a[j] + q;
    }
    out.f = f;
    out.f1 = f1;
}
Reproducibility issues treated in the standard are further discussed in 12.3. Reproducibility and Robustness.
For the purpose of this section, the POSIX Extended Regular Expressions (ERE) syntax is used to describe the production of lexical elements. The POSIX regular expression is chosen for being vendor neutral. There's a difference between the POSIX semantic of regular expression and the PCRE semantic, the latter of which is widely used in many programming languages even on POSIX platforms, most notably Perl, Python, PHP, and has been adopted by JavaScript. Care has been taken to ensure the expressions used in this chapter are interpreted identically under both semantics.
Comments in the language begin with 2 forward slashes: //, or 1 hash
sign: #, and span towards the end of the line. Another form of comments
exists, where it begins with /* and ends with */ - this form of comment
can span multiple lines.
Comments in the following explanatory code blocks use the same notation as in the actual language.
An identifier has the following production: [_[:alpha:]][_[:alnum:]]*.
A keyword is an identifier that matches one of the following:
// Special Values:
true false null
// Phrases:
return break continue and or _Fallback
// Statements and Declarations:
decl
// Control Flows:
if else elif while do for
// Functions:
subr method this
// Translation Unit Interface:
_Include extern const
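As a rough illustration of how a lexer might tell keywords from ordinary identifiers, here is a small Python sketch. It assumes the C/POSIX locale, so [:alpha:] and [:alnum:] are approximated by ASCII ranges; the function name classify_word is hypothetical.

```python
import re

# ASCII approximation of [_[:alpha:]][_[:alnum:]]* (assumes the C/POSIX locale).
IDENTIFIER = re.compile(r"[_A-Za-z][_A-Za-z0-9]*")

# The keyword list exactly as given above.
KEYWORDS = frozenset("""
    true false null
    return break continue and or _Fallback
    decl
    if else elif while do for
    subr method this
    _Include extern const
""".split())

def classify_word(text):
    """Return "keyword", "identifier", or "invalid" for a candidate token."""
    if IDENTIFIER.fullmatch(text) is None:
        return "invalid"
    return "keyword" if text in KEYWORDS else "identifier"
```

Note that _Then, which the phrase grammar later in this document uses, is not in the list quoted above, so this sketch reports it as an identifier.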
Decimal integer literals have the following production: [1-9][0-9]*[uU]?.
When the literal has the "U" suffix, the literal has type ulong, otherwise,
the literal has type long.
Octal integer literals have the following production: 0o?[0-7]*. An octal
literal always has type ulong.
Note: As it has been a common mistake among newcomers to zero-pad a decimal
number only to realize it's become an octal literal, it is recommended that
implementations issue warnings when a number is zero-padded and recommend that
users prefix the literal with 0o when they do intend to use octals. Likewise,
for some functions (e.g. chmod in POSIX), users may actually intend to
use octals but forget to zero-prefix them to make them octal literals - in
these cases, it is recommended that semantic analysis be performed using syntax
information (if possible) and appropriate warnings be given.
Hexadecimal integer literals have
the following production: 0[xX][0-9a-fA-F]+.
A hexadecimal literal always has type ulong.
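The three integer productions can be tried out with the following Python sketch. It is a minimal illustration, not the reference lexer: parse_integer_literal is a hypothetical name, and no 64-bit range check is performed.

```python
import re

# (kind, production) pairs copied from the text above.
INTEGER_LITERALS = [
    ("hexadecimal", re.compile(r"0[xX][0-9a-fA-F]+")),
    ("decimal", re.compile(r"[1-9][0-9]*[uU]?")),
    ("octal", re.compile(r"0o?[0-7]*")),
]

def parse_integer_literal(text):
    """Return (kind, type name, value) for an integer literal."""
    for kind, pattern in INTEGER_LITERALS:
        if pattern.fullmatch(text):
            break
    else:
        raise ValueError("not an integer literal: " + repr(text))

    if kind == "decimal":
        unsigned = text[-1] in "uU"
        return kind, ("ulong" if unsigned else "long"), int(text.rstrip("uU"), 10)
    if kind == "hexadecimal":
        return kind, "ulong", int(text[2:], 16)
    # Octal: strip the leading "0" or "0o"; a bare "0" has no digits left.
    digits = text[2:] if text.startswith("0o") else text[1:]
    return kind, "ulong", (int(digits, 8) if digits else 0)
```

Under these productions a plain 0 is an octal literal of type ulong, and a zero-padded number containing 8 or 9 (such as 09) matches none of them, which is the situation the note above recommends warning about.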
Radix-64 literals have the following production: 0\\[A-Za-z0-9._]+.
The primary use of radix-64 literals is as option flags to functions, as
bitwise compositions are obscure, and symbolic constants need verbose prefixes
to not pollute the global name space. A radix-64 literal always has type ulong.
The characters following the backslash have the same numerical values as those
in the Base 64 Encoding with URL and Filename Safe Alphabet,
except that the minus sign (-) is replaced with a period (.) due to possible
ambiguity with the subtraction expression operator, and that there are no
padding characters.
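A possible decoding of radix-64 literals is sketched below in Python. The digit values follow the URL-safe Base64 alphabet with the period in place of the minus sign, as described above. The digit order is an assumption: the text does not say which character is most significant, and this sketch treats the first character as the most significant digit. parse_radix64_literal is a hypothetical name, and overflow beyond 64 bits is not handled.

```python
import string

# Values 0..63: A-Z, a-z, 0-9, then "." (in place of "-") and "_".
RADIX64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "._"
)

def parse_radix64_literal(text):
    """Decode a literal of the form 0\\<digits> into an integer."""
    if len(text) < 3 or not text.startswith("0\\"):
        raise ValueError("not a radix-64 literal: " + repr(text))
    value = 0
    for ch in text[2:]:
        digit = RADIX64_ALPHABET.find(ch)
        if digit < 0:
            raise ValueError("invalid radix-64 digit: " + repr(ch))
        # Assumption: the first character is the most significant digit.
        value = value * 64 + digit
    return value
```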
Fraction literals have the following production: [0-9]+\.[0-9]*|\.[0-9]+.
The literal always has type double.
A decimal scientific literal is a fraction literal further suffixed by
a decimal exponent literal production: [eE][-+]?[0-9]+. The digits of the
production indicate the power of 10 by which the fraction part is multiplied.
A hexadecimal fraction literal has the following production:
0[xX]([0-9a-fA-F]+\.[0-9a-fA-F]*|\.[0-9a-fA-F]+) - this production is
NOT a valid lexical element in the language,
but a hexadecimal scientific literal is, which is defined as
a hex fraction literal followed by a hexadecimal exponent literal having the
production: [pP][-+]?[0-9]+. The digits of the production indicate the power
of 2 by which the fraction part is multiplied.
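The floating point productions can be exercised with the Python sketch below. It assumes the period in the hexadecimal fraction production is meant literally, as it is in the decimal one, and it leans on Python's own float() and float.fromhex() for conversion; parse_float_literal is a hypothetical name.

```python
import re

FRACTION = r"(?:[0-9]+\.[0-9]*|\.[0-9]+)"
DEC_EXPONENT = r"[eE][-+]?[0-9]+"
# Assumption: the "." in the hex fraction production is a literal period.
HEX_FRACTION = r"0[xX](?:[0-9a-fA-F]+\.[0-9a-fA-F]*|\.[0-9a-fA-F]+)"
HEX_EXPONENT = r"[pP][-+]?[0-9]+"

# A fraction literal, optionally followed by a decimal exponent.
DECIMAL_FLOAT = re.compile(FRACTION + "(?:" + DEC_EXPONENT + ")?")
# A hex fraction is only valid together with its exponent.
HEX_FLOAT = re.compile(HEX_FRACTION + HEX_EXPONENT)

def parse_float_literal(text):
    """Return the double value of a fraction or scientific literal."""
    if DECIMAL_FLOAT.fullmatch(text):
        return float(text)
    if HEX_FLOAT.fullmatch(text):
        return float.fromhex(text)
    raise ValueError("not a floating point literal: " + repr(text))
```

For instance, 0x1.8p1 denotes 1.5 times 2 to the power 1, while 0x1.8 on its own is rejected because a hexadecimal fraction without an exponent is not a lexical element.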
Character and string literals have the following production:
['"]([^\]|\\(["'abefnrtv]|x[0-9a-fA-F]{2,2}|[0-7]{1,3}))['"]
In the 2nd subexpression, the letter alternatives have the following meanings:
a: BEL, ASCII 'bell' control character,
b: BS, ASCII backspace character,
e: ESC, ASCII escape character,
f: FF, ASCII form-feed character,
n: LF, ASCII line-feed character,
r: CR, ASCII carriage return character,
t: HT, ASCII horizontal tab character,
v: VT, ASCII vertical tab character.
When single-quoted, the literal is a character literal having the value of the
first character as type long; the behavior is implementation-defined if there
are multiple characters.
When double-quoted, the literal is a string literal having type str.
Raw string literals have the following production:
\\("[^"]*"|'[^']*')
In a raw string literal, there is no escape sequence. Single quotes cannot appear in single-quoted raw string literals, and double quotes cannot appear in double-quoted raw string literals.
Raw string literals are primarily intended for writing regular expressions.
Any number of raw strings and double-quoted strings may be concatenated into one string object by virtue of being placed adjacent to one another with no characters in between other than whitespace. The set of whitespace characters is defined to be exactly the following: U+0020 (space), U+000D (carriage return), U+000B (vertical tab), U+000A (line-feed), U+0009 (horizontal tab).
A punctuation is one of the following:
( ) [ ] =? . ++ -- + - ~ ! * / %
<< >> >>> < > & ^ |
<= >= == != === !== && || ?? ? :
= *= /= %= += -= <<= >>= >>>= &= ^= |= ,
; { }
primary-expr % primary
: "(" expressions-list ")" % paren
| identifier % ident
| constant % const
;
paren: The value is that of the expressions-list.
ident: The value is whatever is stored in the identifier.
const: The value is that represented by the constant.

postfix-expr % postfix
: primary-expr % degenerate
| postfix-expr "=?" primary-expr % nullcoalesce
| postfix-expr "[" assign-expr "]" % indirect
| postfix-expr "." identifier % member
| postfix-expr "++" % inc
| postfix-expr "--" % dec
| function-call % funccall
| object-notation % objdef
;
function-call % funccall
: postfix-expr "(" ")" % noarg
| funccall-start-nocomma ")" % somearg
;
funccall-start-nocomma % funcinvokenocomma
: postfix-expr "(" assign-expr % base
| funccall-start-nocomma "," assign-expr % genrule
;
nullcoalesce: If the value of postfix-expr isn't nullish, then the value is that of postfix-expr and the second operand is not evaluated, otherwise the value is that of primary-expr.
indirect: Reads the key identified by expressions-list from the object identified by postfix-expr. The result is an lvalue.
funccall: Calls postfix-expr as a function, given expressions-list as parameters. If postfix-expr is a member, then its postfix-expr is provided as the this parameter to a potential method call. The result is the return value of the function. See 9.4. Subroutines and Methods for further discussion.
member: Reads the key identified by the spelling of identifier from the object identified by postfix-expr. The result is an lvalue.
inc: Increment postfix-expr by 1. The result is the pre-increment value of postfix-expr. postfix-expr MUST be an lvalue.
dec: Decrement postfix-expr by 1. The result is the pre-decrement value of postfix-expr. postfix-expr MUST be an lvalue.
objdef: See 11. Type Definition and Object Initialization Syntax.
Note: Previously, the close-binding null-coalescing operator was ->;
this was changed as it had been desired to reserve it for a 'trait' static call
syntax where the first argument of a subroutine (i.e. non-method function)
receives the value of or a reference to the left-hand of the operator. This is
tentative and no commitment over this has been made yet. All in all, the
close-binding null-coalescing operator is now =?. (Note dated 2025-09-26.)
unary-expr % unary
: postfix-expr % degenerate
| "++" unary-expr % inc
| "--" unary-expr % dec
| "+" unary-expr % positive
| "-" unary-expr % negative
| "~" unary-expr % bitcompl
| "!" unary-expr % logicnot
;
inc: Increment unary-expr by 1. The result is the post-increment value of unary-expr. unary-expr MUST be an lvalue.
dec: Decrement unary-expr by 1. The result is the post-decrement value of unary-expr. unary-expr MUST be an lvalue.
positive: The result is that of unary-expr implicitly converted to a number if necessary.
negative: The result is the negative of unary-expr, which is implicitly converted to a number if necessary.
bitcompl: The result is the bitwise complement of unary-expr under integer context.
logicnot: The result is 0 if unary-expr is non-zero, and 1 if unary-expr compares equal to 0 (both +0 and -0).
For inc and dec in unary and postfix, and positive and negative,
the operations are computed under arithmetic context. For bitcompl and logicnot,
the operations are computed under integer context.
mul-expr % mulexpr
: unary-expr % degenerate
| mul-expr "*" unary-expr % multiply
| mul-expr "/" unary-expr % divide
| mul-expr "%" unary-expr % remainder
;
multiply: The value is the product of mul-expr and unary-expr.
divide: The value is the quotient of mul-expr divided by unary-expr.
remainder: The value is the remainder of mul-expr modulo unary-expr.
The result of division on integers SHALL round towards 0.
The remainder computed SHALL be such that (a/b)*b + a%b == a is true.
If the divisor is 0, then the quotient of division becomes positive/negative
infinity of type double if the sign of both operands are same/different,
while the remainder becomes NaN, with the "invalid" floating point
exception signalled.
For the purpose of determining the sign of operands, the integer 0 in ulong
and two's complement signed long are considered to be positive.
Editorial Note: The first 3 of the above 4 paragraphs were together 1 paragraph in a previous version of the draft before 2025-08-25. This had the potential of causing the confusion that remainder is only applicable to integers. Because now remainder is also applicable to floating points, this is first separated into its own paragraph. The rule regarding type conversion on division by 0 is of separate interest, so it's also an individual paragraph now. The 4th paragraph is added on 2025-08-25.
Note: The condition for determining remainder is equivalent to:
the remainder x % y shall be the value x-ny for some integer n such that, if y is non-zero, the result has the same sign as x and magnitude less than that of y.
These are separate descriptions for the integer modulo operator and the floating point
fmod function in the C language; as such, an implementation may utilize these
facilities in C. Any inconsistency between these 2 definitions in C is
supposedly unintentional from the standard developer's perspective.
All of mulexpr are computed under arithmetic context.
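The integer rules above can be checked with a short Python sketch. This is only an illustration for integer operands: cx_divide and cx_remainder are hypothetical names, Python integers are unbounded so 64-bit wrap-around is not modelled, and the signalling of the "invalid" exception is not modelled either.

```python
import math

def cx_divide(a, b):
    """Integer quotient rounded towards 0; a zero divisor yields an infinity."""
    same_sign = (a >= 0) == (b >= 0)   # integer 0 is considered positive
    if b == 0:
        return math.inf if same_sign else -math.inf
    quotient = abs(a) // abs(b)
    return quotient if same_sign else -quotient

def cx_remainder(a, b):
    """Remainder chosen so that (a/b)*b + a%b == a; NaN for a zero divisor."""
    if b == 0:
        return math.nan
    return a - cx_divide(a, b) * b
```

With these definitions the remainder takes the sign of the dividend, which matches the behavior of both the integer % operator and fmod in C.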
add-expr % addexpr
: mul-expr % degenerate
| add-expr "+" mul-expr % add
| add-expr "-" mul-expr % subtract
;
add: The value is the additive sum of add-expr and mul-expr.
subtract: The value is the difference of subtracting mul-expr from add-expr.
All of addexpr are computed under arithmetic context.
bit-shift-expr % shiftexpr
: add-expr % degenerate
| bit-shift-expr "<<" add-expr % lshift
| bit-shift-expr ">>" add-expr % arshift
| bit-shift-expr ">>>" add-expr % rshift
;
lshift: The value is bit-shift-expr left-shifted by add-expr bits.
arshift: The value is bit-shift-expr arithmetically right-shifted by add-expr bits. This is done without regard to the actual signedness of the type of the bit-shift-expr operand.
rshift: The value is bit-shift-expr logically right-shifted by add-expr bits. This is done without regard to the actual signedness of the type of the bit-shift-expr operand.
All of shiftexpr are computed under integer context.
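The difference between the two right shifts can be illustrated in Python on 64-bit patterns. The sketch assumes 64-bit integers and shift counts in the range 0 to 63 (the text does not say what happens outside that range); the function names are hypothetical and results are returned as unsigned 64-bit patterns.

```python
MASK64 = (1 << 64) - 1

def to_signed(bits):
    """Interpret a 64-bit pattern as a two's complement signed value."""
    bits &= MASK64
    return bits - (1 << 64) if bits >> 63 else bits

def lshift(value, count):
    # Bits shifted past bit 63 are discarded.
    return (value << count) & MASK64

def arshift(value, count):
    # Arithmetic shift: the sign bit is replicated, whatever the operand type.
    return (to_signed(value) >> count) & MASK64

def rshift(value, count):
    # Logical shift: zeros are shifted in, whatever the operand type.
    return (value & MASK64) >> count
```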
Side Note: There were left and right rotate operators. Since there's only a single 64-bit width in native integer types, dedicated bit rotation operators become meaningless. Therefore those functionalities will be offered as standard library method functions.
rel-expr % relops
: bit-shift-expr % degenerate
| rel-expr "<" bit-shift-expr % lt
| rel-expr ">" bit-shift-expr % gt
| rel-expr "<=" bit-shift-expr % le
| rel-expr ">=" bit-shift-expr % ge
;
lt: True if and only if rel-expr is less than bit-shift-expr.
gt: True if and only if rel-expr is greater than bit-shift-expr.
le: True if and only if rel-expr is less than or equal to bit-shift-expr.
ge: True if and only if rel-expr is greater than or equal to bit-shift-expr.
All of the ordering relations of relops are evaluated under arithmetic context.
If either operand is NaN or null, then the value of the expression is false.
eq-expr % eqops
: rel-expr % degenerate
| eq-expr "==" rel-expr % eq
| eq-expr "!=" rel-expr % ne
| eq-expr "===" rel-expr % ideq
| eq-expr "!==" rel-expr % idne
;
eq: True if and only if both sides loosely compare equal. False otherwise.
ne: Equivalent to !(<eq-expr> == <rel-expr>).
ideq: True if and only if both sides strictly compare equal. False otherwise.
idne: Equivalent to !(<eq-expr> === <rel-expr>).
To evaluate whether two operands are equal:
[...] equals() method is first examined; if this is not present, or if comparing for ordering relation or loose equality, the cmpwith() method is then examined.
  If the equals() method of at least one operand returns true when called appropriately, then the two operands compare equal,
  If the cmpwith() method of at least one operand returns 0 when called appropriately, then the two operands compare equal.
[...] this object, and the method is called with it and the other operand as the only argument, and the return value is processed as follows:
  If the equals() method property was used, equality holds if and only if the method returns true.
  If the cmpwith() method property was used, loose equality holds if and only if a.cmpwith(b) returns exactly 0; strict equality is not affected by this method and doesn't hold in this subcase.
To evaluate the ordering relation of 2 operands:
[...] cmpwith():
  a > b and a >= b are satisfied if a.cmpwith(b) returns greater than 0 or if b.cmpwith(a) returns less than 0,
  a < b and a <= b are satisfied if b.cmpwith(a) returns greater than 0 or if a.cmpwith(b) returns less than 0,
  a <= b and a >= b are satisfied if a.cmpwith(b) or b.cmpwith(a) returns exactly 0,
[...] this object, and the method is called with it and the other operand as the only argument, and the return value is processed as follows:
  If the cmpwith() method property was used:
    a > b and a >= b are satisfied if a.cmpwith(b) returns greater than 0,
    a < b and a <= b are satisfied if a.cmpwith(b) returns less than 0,
    a <= b and a >= b are satisfied if a.cmpwith(b) returns exactly 0,
Note: The equals() method is never used for ordering relations including
the <= and the >= operators because an object that's missing cmpwith()
has no reasonable definition of ordering relations. Conversely, the cmpwith()
method is not used with the strict equality test, because the ordering of objects
doesn't necessarily reflect their identity.
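One reading of the comparison rules for the case where both operands are objects is sketched below in Python. It is an interpretation, not a definition: the helper names, the Version and Tag classes, and the treatment of a missing method as "no answer" are all assumptions made for illustration.

```python
def _call(obj, name, arg):
    """Return obj.<name>(arg), or None when obj has no such method."""
    method = getattr(obj, name, None)
    return method(arg) if callable(method) else None

def strictly_equal(a, b):
    # Only equals() counts for strict equality; cmpwith() never does.
    return _call(a, "equals", b) is True or _call(b, "equals", a) is True

def loosely_equal(a, b):
    # equals() is examined first, then cmpwith() returning exactly 0.
    if strictly_equal(a, b):
        return True
    return _call(a, "cmpwith", b) == 0 or _call(b, "cmpwith", a) == 0

def greater_than(a, b):
    ab, ba = _call(a, "cmpwith", b), _call(b, "cmpwith", a)
    return (ab is not None and ab > 0) or (ba is not None and ba < 0)

def less_than(a, b):
    return greater_than(b, a)

def greater_or_equal(a, b):
    # equals() is never consulted for ordering relations.
    return (greater_than(a, b)
            or _call(a, "cmpwith", b) == 0 or _call(b, "cmpwith", a) == 0)

class Version:
    """Hypothetical ordered type: provides cmpwith() only."""
    def __init__(self, major, minor):
        self.key = (major, minor)
    def cmpwith(self, other):
        return (self.key > other.key) - (self.key < other.key)

class Tag:
    """Hypothetical unordered type: provides equals() only."""
    def __init__(self, name):
        self.name = name
    def equals(self, other):
        return isinstance(other, Tag) and self.name == other.name
```

Two Version objects with the same key compare loosely equal but not strictly equal, while two Tag objects can be strictly equal yet have no ordering, which is the asymmetry the note above describes.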
bit-and % bitand
: eq-expr % degenerate
| bit-and "&" eq-expr % bitand
;
bit-xor % bitxor
: bit-and % degenerate
| bit-xor "^" bit-and % bitxor
;
bit-or % bitor
: bit-xor % degenerate
| bit-or "|" bit-xor % bitor
;
bitand: The value is the bitwise and of 2 operands.
bitxor: The value is the bitwise exclusive-or of 2 operands.
bitor: The value is the bitwise inclusive-or of 2 operands.
All of the bitwise operations are computed under integer context.
logic-and % logicand
: bit-or % degenerate
| logic-and "&&" bit-or % logicand
;
logic-or % logicor
: logic-and % degenerate
| logic-or "||" logic-and % logicor
| logic-or "??" logic-and % nullcoalesce
;
logicand: if the first operand is zero or null, then this is the result and the second operand is not evaluated, otherwise, it's the value of the second operand.
logicor: if the first operand is non-zero and non-null, then this is the result and the second operand is not evaluated, otherwise, it's the value of the second operand.
nullcoalesce: Refer to postfix-expr.

cond-expr % tenary
: logic-or % degenerate
| logic-or "?" expressions-list ":" cond-expr % tenary
;
tenary: The logic-or is first evaluated. If it's non-zero and non-null, then expressions-list is evaluated; otherwise, cond-expr is evaluated. The result is whichever of expressions-list or cond-expr was evaluated.

assign-expr % assignment
: cond-expr % degenerate
| unary-expr "=" assign-expr % directassign
| unary-expr "*=" assign-expr % mulassign
| unary-expr "/=" assign-expr % divassign
| unary-expr "%=" assign-expr % remassign
| unary-expr "+=" assign-expr % addassign
| unary-expr "-=" assign-expr % subassign
| unary-expr "<<=" assign-expr % lshiftassign
| unary-expr ">>=" assign-expr % arshiftassign
| unary-expr ">>>=" assign-expr % rshiftassign
| unary-expr "&=" assign-expr % andassign
| unary-expr "^=" assign-expr % xorassign
| unary-expr "|=" assign-expr % orassign
;
directassign: writes the value of assign-expr to unary-expr.
[...] unary-expr.
See 9.2. Object/Value Key Access for further discussion.
expressions-list % exprlist
: assign-expr % degenerate
| expressions-list "," assign-expr % exprlist
;
exprlist: A list of expressions. expressions-list is first evaluated, then assign-expr is evaluated next, and the value of the expression is that of assign-expr.
Between expressions and statements, there are phrases.
Phrases are like expressions, and have values, but due to grammatical constraints, they lack the usage flexibility of expressions. For example, phrases cannot be used as arguments to function calls, since phrases are not comma-delimited; nor can they be assigned to variables, since assignment operators bind more tightly than phrase delimiters. On the other hand, phrases provide flexibility in combining full expressions in ways that would otherwise be less expressive with plain expressions because of the parentheses required.
conj-ion % and_phrase_ion
: conj-atom "and" % and
| conj-atom "_Then" % then
;
conj-atom % and_phrase_atom
: expressions-list % degenerate
| conj-ion expressions-list % atomize
;
disj-ion % or_phrase_ion
: disj-atom "or" % or
| disj-atom "_Fallback" % nc
| conj-ion control-flow-ions % ctrl_flow
;
disj-atom % or_phrase_atom
: conj-atom % degenerate
| disj-ion conj-atom % atomize
;
phrase-stmt % phrase_stmt
: disj-atom ";" % base
| control-flow-molecule % ctrl_flow
| conj-ion control-flow-molecule % conj_ctrl_flow
| disj-ion control-flow-molecule % disj_ctrl_flow
;
control-flow-ions % ctrl_flow_ion
: control-flow-operator "or" % op_or
| control-flow-operator "_Fallback" % op_nc
| control-flow-operator identifier "or" % labelledop_or
| control-flow-operator identifier "_Fallback" % labelledop_nc
| "return" "or" % returnnull_or
| "return" "_Fallback" % returnnull_nc
| "return" expressions-list "or" % returnexpr_or
| "return" expressions-list "_Fallback" % returnexpr_nc
;
control-flow-molecule % ctrl_flow_molecule
: control-flow-operator ";" % op
| control-flow-operator identifier ";" % labelledop
| "return" ";" % returnnull
| "return" expressions-list ";" % returnexpr
;
control-flow-operator % flowctrlop
: "break" % break
| "continue" % continue
;
*_atom: first, the ion is evaluated, if the result is {STOP(value)},
then the value of this phrase atom is value, otherwise the 2nd term
(the expressions-list or the conj-atom) is evaluated, and its
value is the value of the phrase.
*_ion: if the 1st term does not alter the control flow, then its value is
evaluated, then, depending on the last token in the rewrite sequence:
  "and", then {CONTINUE} is the result if the value is not false,
  "or", then {CONTINUE} is the result if the value is false,
  "_Then", then {CONTINUE} is the result if the value is not nullish,
  "_Fallback", then {CONTINUE} is the result if and only if the value is nullish.
  Otherwise, {STOP(value)}, with value being the evaluated value of the 1st term, becomes the result.
op*: Apply the flow-control operation to the inner-most applicable scope.
labelledop*: Apply the flow-control operation to the labelled statement scope.
returnnull*: Terminates the executing function.
If the caller expected a return value, it'll be a Morgoth null.
returnexpr*: Terminates the executing function with the
return value being that of expressions-list.
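The atom and ion rules above can be modelled with a small Python sketch. It covers only the value-producing connectors; the control-flow ions (break, continue, return) are left out. The names phrase, proceeds, is_false and is_nullish are hypothetical, Python's None stands in for null, and treating zero and null as "false" is an assumption borrowed from the description of && and ||.

```python
import math

def is_nullish(value):
    return value is None or (isinstance(value, float) and math.isnan(value))

def is_false(value):
    # Assumption: zero and null count as false, as for && and ||.
    return value is None or value == 0

def proceeds(connector, value):
    """True when the ion yields {CONTINUE}, False when it yields {STOP(value)}."""
    if connector == "and":
        return not is_false(value)
    if connector == "or":
        return is_false(value)
    if connector == "_Then":
        return not is_nullish(value)
    if connector == "_Fallback":
        return is_nullish(value)
    raise ValueError("unknown connector: " + repr(connector))

def phrase(first, *rest):
    """Build a lazily evaluated phrase.

    first is a thunk; rest alternates connector strings and thunks.
    The grammar is left-recursive, so the value is folded from the left.
    """
    def run():
        value = first()
        for connector, term in zip(rest[0::2], rest[1::2]):
            if proceeds(connector, value):
                value = term()       # {CONTINUE}: evaluate the 2nd term
            # otherwise {STOP(value)}: the atom keeps the current value
        return value
    return run
```

Because "and" and "_Then" bind more tightly than "or" and "_Fallback" in the grammar, a mixed phrase is modelled by nesting, for example phrase(phrase(a, "and", b), "or", phrase(c, "and", d)).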
control-flow-operator % flowctrlop
: "break" % break
| "continue" % continue
;
break: Terminates the applicable loop.
continue: Skip the remainder of the applicable loop body and proceed to the next iteration.

statement % stmt
: ";" % emptystmt
| identifier ":" statement % labelled
| phrase-stmt % phrase
| conditionals % cond
| while-loop % while
| do-while-loop % dowhile
| for-loop % for
| "{" statements-list "}" % brace
| declaration ";" % decl
;
emptystmt: This does nothing in a function body.
labelled: Identifies the statement with a label.
brace: Executes statements-list.

conditionals % condstmt
: predicated-clause % base
| predicated-clause "else" statement % else
;
else: Executes predicated-clause; if none of its statement(s) were executed because no predicate evaluated to true, then statement is executed.

predicated-clause % predclause
: "if" "(" expressions-list ")" statement % base
| predicated-clause "elif" "(" expressions-list ")" statement % genrule
;
base: Evaluate expressions-list (in expression phrase context as mentioned in the 4.7. Compounds); if it's true, then statement is executed, otherwise it's not executed.
genrule: Executes predicated-clause; if none of its statement(s) were executed because no predicate evaluated to true, then evaluate expressions-list; if that is still not true, then statement is not executed, otherwise, statement is executed.

while-loop % while
: "while" "(" expressions-list ")" statement % rule
;
rule: To execute rule, evaluate expressions-list; if it's true, then execute statement and then execute rule.

do-while-loop % dowhile
: "do" "{" statements-list "}" "while" "(" expressions-list ")" ";" % rule
;
rule: To execute rule, execute statements-list, then evaluate expressions-list; if it's true, then execute rule.

for-loop % for
: "for" "(" ";" ";" ")" statement % forever
| "for" "(" ";" ";" expressions-list ")" statement % iterated
| "for" "(" ";" expressions-list ";" ")" statement % conditioned
| "for" "(" ";" expressions-list ";"
expressions-list ")" statement % controlled
| "for" "(" expressions-list ";" ";" ")" statement % initonly
| "for" "(" expressions-list ";" ";"
expressions-list ")" statement % nocond
| "for" "(" expressions-list ";"
expressions-list ";" ")" statement % noiter
| "for" "(" expressions-list ";"
expressions-list ";"
expressions-list ")" statement % classic
| "for" "(" declaration ";" ";" ")" statement % vardecl
| "for" "(" declaration ";" ";"
expressions-list ")" statement % vardecl_nocond
| "for" "(" declaration ";"
expressions-list ";" ")" statement % vardecl_noiter
| "for" "(" declaration ";"
expressions-list ";"
expressions-list ")" statement % vardecl_controlled
;
Evaluates expressions-list or declaration before the first semicolon, then
execute the for loop by invoking the "execute the for loop once" recursive
procedure described later.
To execute the for loop once, evaluate expressions-list after the first
semicolon, if it's true, then statement is evaluated, then the
expressions-list after the second semicolon is evaluated, and the for loop is
executed once again. For the purpose of "proceeding to the next iteration" as
mentioned in continue, the expressions-list after the second semicolon is
not considered part of the loop body, and is therefore always executed before
proceeding to the next iteration.
The description here used the word "once" to describe the semantic of the loop in terms of "functional recursion", where "functional" is in the sense of the "functional programming paradigm".
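The recursive "execute the for loop once" procedure can be written out as a Python sketch. This is a model of the description above, not an implementation strategy: run_for and the Continue and Break exceptions are hypothetical devices, the loop parts are passed as callables, and an omitted condition is assumed to count as true (as the "forever" form of the grammar suggests).

```python
class Continue(Exception):
    """Models the continue phrase inside the loop body."""

class Break(Exception):
    """Models the break phrase inside the loop body."""

def run_for(init, cond, step, body):
    """Each argument is a callable, or None when that part is omitted."""
    if init is not None:
        init()                      # evaluated once, before the first semicolon

    def once():
        # Assumption: an omitted condition counts as true.
        if cond is not None and not cond():
            return
        try:
            body()
        except Continue:
            pass                    # skip the rest of the body only
        except Break:
            return                  # terminate the loop
        if step is not None:
            step()                  # not part of the body: runs after continue
        once()                      # "the for loop is executed once again"

    once()

# Usage: iterate j from 0, skip 2, stop at 4.
trace = []
state = {"j": None}

def init():
    state["j"] = 0

def cond():
    return state["j"] < 5

def step():
    state["j"] += 1

def body():
    if state["j"] == 2:
        raise Continue()
    if state["j"] == 4:
        raise Break()
    trace.append(state["j"])

run_for(init, cond, step, body)
```

A real implementation would use iteration rather than recursion; the recursive form is kept here only because it mirrors the wording of the specification.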
statements-list % stmtlist
: statement % base
| statements-list statement % genrule
;
base: statement is executed; the semicolon is a delimiter.
genrule: statements-list is first executed, then statement is executed.
Because the value of a variable that held an integer value may transition to
null after being assigned the result of certain computations, the variable
needs to hold type information; as such, variables are represented conceptually
as "lvalue" native objects. (Actually, just value native objects, as their
scope and key can be deduced from context.)
declaration % decl
: "decl" identifier % singledecl
| "decl" identifier "=" assign-expr % singledeclinit
| declaration "," identifier % declarelist1
| declaration "," identifier "=" assign-expr % declarelist2
;
singledecl: Declares a variable with the spelling of the identifier as its name, and initializes its value to null.
singledeclinit: Declares a variable with the spelling of the identifier as its name, and initializes its value to that of assign-expr.
declarelist1: In addition to what's declared in declaration, declare another variable in a way similar to singledecl.
declarelist2: In addition to what's declared in declaration, declare another variable in a way similar to singledeclinit.

function-declaration % funcdecl
: "subr" identifier arguments-list statement % subr
| "method" identifier arguments-list statement % method
;
arguments-list % arglist
: "(" ")" % empty
| arguments-begin ")" % some
;
arguments-begin % args
: "(" identifier % base
| arguments-begin "," identifier % genrule
;
Note: As of 2025-12-26, all concepts of type keywords, operand attributes, and annotation in general had been eliminated as unnecessary.
When the function body is emptystmt, the function-declaration declares a
function; when it's brace, it defines a function. The this keyword MUST NOT
appear in the function body of a subroutine.
When the end of the function body is reached without an explicit return phrase,
a Morgoth null is implicitly returned.
The number of parameters MUST be consistent between all declarations and the definition of a function, and the order of the arguments in a function call MUST be consistent with what's expected by the parameters of the function. Furthermore, whether a function is a method or a subroutine MUST be consistent as well. The names of the parameters may be changed in the source code of a program. Depending on the context, this may provide the benefit of both explanatory argument naming in declarations, and avoidance of identifier collision in function definitions when the argument is appropriately renamed.
Note: Before 2025-10-27, there were FFI methods. These had been removed, because methods are attached to properties of objects, and their prototypes cannot be reliably determined unless all parameters are of uniform type; only then could the number of arguments be determined. As of 2025-11-03, all FFI are removed, because it is impossible to determine the prototype of the said FFI functions when they're called from object properties.
A translation unit consists of a series of function declarations and definitions. Because definition of objects occurs during run time, it's not possible to define data objects of static storage duration in cxing; this is recognized as unfortunate and accepted as a design decision.
A translation unit in cxing corresponds to a relocatable code object, or a file containing such information. We choose such a definition to emphasize binary runtime portability; the word "translate/translation" doesn't require translation to occur - it's allowed for an implementation to interpret the source code and execute it directly when that can be achieved. The terms "translation unit" and "relocatable object" take their usual commonly accepted meanings in building programs and applications.
The goal symbol of a source code text string is TU - the translation unit
production. It consists of a series of entity declarations.
TU % TU
: entity-declaration % base
| TU entity-declaration % genrule
;
entity-declaration % entdecl
: "_Include" string-literal ";" % srcinc
| "extern" function-declaration % extern
| function-declaration % implicit
| "const" identifier constant ";" % constdef
;
There MUST NOT be more than 1 definition of a function.
By default, all entity declarations are internal to the translation unit. For a
declaration to be visible in multiple translation units, it must be declared
"external" with the extern keyword.
As a best practice, external declarations should be kept in "header" files, and
included (explained shortly) in a source code file. The recommended filename
extension for cxing source code files is .cxing, and .hxing for headers
(named after the Hongxing Yu village on the Changxing Island).
Source code inclusion is a limited form of reference to external definitions. This is not preprocessing, not importation, and not a substitute for linking. Source code inclusion is exclusively for sharing the declarations in multiple source code files and translation units.
By default, header files are first searched in a set of pre-defined paths.
(These paths are typically hierarchically organized and implemented using a file
system.) If the header isn't found in the pre-defined paths, then it's searched
relative to the path of the source code file. However, if the string literal
naming the header file begins with ./ or ../, then it's first searched
relative to the path of the source code file, then the pre-defined set of paths.
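The search order described above can be sketched in C as follows. This is only an illustration of the rule, not part of any implementation: the names `search_root`, `is_explicitly_relative`, and `header_search_order` are hypothetical, and the actual file system lookup is left out.

```c
#include <string.h>

/* Hypothetical names: the text above only fixes the order of the search. */
enum search_root { SEARCH_PREDEFINED, SEARCH_RELATIVE };

/* Nonzero when the header name begins with "./" or "../". */
int is_explicitly_relative(const char *name)
{
    return strncmp(name, "./", 2) == 0 || strncmp(name, "../", 3) == 0;
}

/* Fills order[0] and order[1] with the roots in the order they are tried. */
void header_search_order(const char *name, enum search_root order[2])
{
    if (is_explicitly_relative(name)) {
        order[0] = SEARCH_RELATIVE;    /* next to the including file first */
        order[1] = SEARCH_PREDEFINED;
    } else {
        order[0] = SEARCH_PREDEFINED;  /* pre-defined paths first */
        order[1] = SEARCH_RELATIVE;
    }
}
```

A driver would then try each root in turn and stop at the first one where the file exists.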
The const keyword can be used to define symbolic constants. The type of the
constant MUST be one of long, ulong, or double. Once the constant is
defined, the identifier may be used later to substitute the defined value.
An object may have properties, properties may also be called members.
Note: The word "property" emphasizes the semantic value of the said component, while the word "member" emphasizes its identification. Both words may be used interchangeably consistent with the intended point of perspective.
The internals of an object are largely opaque to the language. The primary interface to objects is the functions that operate on them.
Note: Functions in compiled implementations follow platform ABI's calling convention. Because certain opaque object types (such as the string type) in the runtime may need to be used in functions compiled on different implementations, the consistency of their structure layout is essential.
A native object is a construct for describing the language. It has a fixed set of properties, and is copied by value; mutating a native object does not affect other copies of the object.
A value is a native object with the following properties:
sharable types, this can also be the "global" scope,
Other native objects may be introduced in the future.
All values have a (possibly empty) set of type-associated properties that're immutable. These type-associated properties take priority over other properties. The behavior is UNSPECIFIED when these properties are written to.
Note: The data structure for the value native objects is further defined to enable the interoperability of certain language features. Values are so described to enable discussion of "lvalue"s; alternative implementations may use other conceptual models for lvalues should they see fit.
As described in 9.1. Objects and Values, objects have properties. The key used to access a value on an object is typically a string or an integer.
When the key used to access a property is an integer, there may be a mapping from the integer to a string defined by the implementation of the runtime. Portable applications SHOULD NOT create objects with mixed string and integer keys. All implementations of the runtime SHALL guarantee there's no collision between any key that is the valid spelling of an identifier and any integer between 0 and 10^10 inclusive.
Note: The limit was chosen for efficiency reasons. While implementing a number to string conversion would immediately solve the issue of collision between numerical and identifier keys, it's slightly inefficient. A second option would be to pad the integer word with bytes that can never be valid in identifiers; this would be the best of both worlds. Yet considering most applications won't need such big arrays, and those that do would probably go for the string type in the standard library, a limit is set so that plausible real-world applications and implementations can enjoy the efficiency enabled by such latitude.
To read a key from an object:
0. if the object is null, it is returned as is, preserving uncasting information.
(TODO: 2026-01-24, check back for inconsistencies)
If __get__ is one of the type-associated properties, then
this method is used to retrieve the actual property:
the object is passed as the this parameter,
the key is passed as a val,
If __get__ is not defined as one of the type-associated
properties, then an lvalue being null augmented with 'scope' and 'key'
being the object and the key used to access this property is returned.
Note: The return value from 2.1.3. may be null. The null resulting from
step 2.2. shall be a Morgoth, because there exists no diagnostic information.
To write a key onto an object:
__set__ type-associated method property.
The object is passed as the this parameter, the key as the first parameter
as a val, and the value as the second parameter as
a value native object. See
13.2. Calling Conventions and Foreign Function Interface.
The value of assignment expressions where the method call took place is the
return value of the called method.
If the assignment is a compound assignment (i.e. not directassign), then the key is read from the object, the computation
part of the compound assignment is performed, and the result is
written to the key on the object.
Note: Compound assignment is different from loading the values from both sides of the assignment operator, performing the computation, then storing the result into the key, as the latter performs the read on the lvalue twice.
When a key is being deleted from an object:
__unset__ type-associated method
property. The object is passed as the this parameter, the key as the first
parameter as a val.
If a __final__ method property exists on the object, then it's
called, the key is then removed from the object, after which the member
identified by the key is considered not defined on the object from this point
onwards (until it's written to again).
Note: Destruction of values and finalization of resources are further discussed in 13.3. Finalization and Garbage Collection.
Note: This section is added 2025-12-28, and explicitly defines when resource management methods are invoked.
As mentioned before, lvalues have scopes. Precisely, for an lvalue:
Values that're not lvalues are known as rvalues.
The following rules govern the occurrence of automatic resource management:
__copy__'d.
__final__'d.
__final__
immediately frees any resources consumed by it.
__copy__'d.
__final__'d as late as by the end of the scope
in which the expression in which they're consumed.
Both subroutines and methods are code that can be executed in the
language, the distinction is that methods have an implicit this parameter
while subroutines don't - for compiled implementations, this is significant,
as it causes difference in parameter passing under a given calling convention.
Subroutines and methods are distinct types, as such there's no restriction that subroutines have to be called directly through identifiers or that methods have to be identified through a member access.
this parameter,
this.
this in a method is null.
Previously (before 2025-11-03), there had been FFI (foreign function interface) subroutines and functions. Because it's impossible to determine the prototype of functions called from properties of objects, it is unsafe to call FFI functions. On the same safety note, the calling convention of (non-FFI) subroutines and methods was changed to account for potentially missing parameters.
Note: In a previous revision, there was a note claiming that this is a
pointer handle. The idea back then was that when the cxing runtime is
implemented with SafeTypes2, certain APIs of the library can be used without
modification. However, a better runtime implementation strategy was discovered,
which resulted in the introduction of type-associated properties.
And so the this parameter is received as a val in all (currently one) type(s)
of methods. Still, to facilitate the correct passing of parameters, it
necessitates the distinction between methods and subroutines.
As of 2025-10-27, the ref argument type is removed entirely,
further as of 2025-12-26, operand types' annotations are eliminated altogether.
long and ulong types
The long type is a signed 64-bit integer type with negative values having
two's complement representation. The ulong type is an unsigned 64-bit
integer type. Both types have sizes and alignments of 8 bytes.
Note: 32-bit and narrower integer types don't exist natively, primarily
because of the year 2038 problem and issue with big files. However, respective
type objects for smaller integers, as well as those for float/binary32 and
other floating point types are defined in the standard library to interpret
data structures in byte strings.
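As an illustration of interpreting a narrower integer from a byte string and widening it to the native 64-bit long, here is a minimal C sketch. The helper name `read_i32_le` and the choice of little-endian byte order are assumptions made for the example; the standard library type objects are not described in this section.

```c
#include <stdint.h>

/* Hypothetical helper: decodes a little-endian 32-bit two's complement
   integer from 4 bytes and widens it to a signed 64-bit integer. */
int64_t read_i32_le(const unsigned char *bytes)
{
    uint32_t u = (uint32_t)bytes[0]
               | (uint32_t)bytes[1] << 8
               | (uint32_t)bytes[2] << 16
               | (uint32_t)bytes[3] << 24;

    /* Sign-extend arithmetically, without relying on how the compiler
       converts out-of-range unsigned values to signed ones. */
    if (u & UINT32_C(0x80000000))
        return (int64_t)u - INT64_C(0x100000000);
    return (int64_t)u;
}
```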
The keyword bool is used exclusively as an alias for the type long; there
is no restriction that a bool can store only 0 or 1, it exists primarily for
programmers to clarify their intentions.
double type
The double type is the floating point number type. It should correspond to
the IEEE-754 (a.k.a. ISO/IEC 60559) binary64 type - that is, it should have
1 sign bit, 11 exponent bits, and 52 mantissa bits. The type has a size and
alignment of 8 bytes.
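The binary64 layout can be inspected from C as sketched below, assuming the platform's double is binary64. The structure and function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

struct binary64_fields {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, stored with a bias of 1023 */
    uint64_t mantissa;  /* 52 bits, without the implicit leading 1 */
};

/* Splits a double into its three bit fields. */
struct binary64_fields split_binary64(double x)
{
    uint64_t bits;
    struct binary64_fields f;

    memcpy(&bits, &x, sizeof bits);  /* well-defined type punning */
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.mantissa = bits & ((UINT64_C(1) << 52) - 1);
    return f;
}
```

For 1.0 this yields a sign of 0, a stored exponent of 1023 and a mantissa of 0.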
str type
The string type str is not a built-in type; instead, it's an opaque object
type defined in the standard library. The string type has significance in the
indirect member access operator in a postfix-expr postfix expression.
Each occurrence of (concatenation of) string literal creates a new string object.
true and false special values
The special value true is equal to 1 in type long.
The special value false is likewise equal to 0.
null and NaN special values
The null special value is the result of certain error conditions. Accessing
any properties (unless otherwise stated) results in null; calling null
as if it's a function results in null.
There are 2 kinds of nulls:
null contains diagnostic information in the form of a signed
integer (i.e. long), that may be obtained by uncasting.
null. This kind of null is
to be used when no diagnosis is needed.
All nulls compare equal to each other barring uncasting.
The NaN special value represents an exceptional condition in mathematical
computation. NaN does not compare equal to any number, or to itself.
Uncasting a NaN results in its bit pattern being re-interpreted as a long.
Both null and NaN are considered nullish in coalescing operations.
See 12. Numerics and Maths for further discussion.
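The two NaN properties mentioned above (it never compares equal to itself, and uncasting re-interprets its bits as a long) can be sketched in C. The function names are hypothetical and stand in for whatever the runtime does internally.

```c
#include <stdint.h>
#include <string.h>

/* A value is NaN exactly when it does not compare equal to itself. */
int is_nan(double x)
{
    return x != x;
}

/* Sketch of uncasting: the same 64 bits, read back as a signed integer. */
int64_t uncast_double(double x)
{
    int64_t l;
    memcpy(&l, &x, sizeof l);
    return l;
}
```

Every NaN has all 11 exponent bits set, so the uncast result always carries the pattern 0x7FF in bits 52 to 62, whatever the remaining bits are.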
Values and/or their types may be converted when used under certain contexts:
long and ulong are collectively "integer context";
double is the "floating point context";
long, ulong, and double are collectively "arithmetic context".
The "implicit type and value conversion" applies to multiple operands in such a way that there's one common type (or special value) that is the same regardless of the order of the operands. This conversion is defined in terms of a binary operation that is associative and commutative, so that any binary expression operator that is associative and commutative preserves this property regardless of the types of the operands.
Under an integer context:
Under the floating point context:
+1.0.
Under arithmetic context:
long.
longs results in long operands;
ulong but not double results in ulong operands;
double results in double;
The special value null is treated specially:
null, which is neither less nor greater than any integer or
floating point number - this is known as the order evaluation conversion.
null is converted
to the integer 0, or if there're double, to +0.0 -
this is known as the value computation conversion.
Operators shall document whether they evaluate the order of, or compute a value from operands. In general, operators that return a true/false predicate from arithmetic operands evaluate the order, while ones that compute a value would evaluate to arithmetic types.
Note: The special value NaN always has type double.
Note: It was considered to have certain operations in integer context that involved floating points to have NaNs, but this was dropped for 2 simple reasons: 1st, the current conversion rule is much simpler to write, and 2nd, there exists prior art in JavaScript.
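One way to read the arithmetic-context rule is as a maximum over a ranking of the three arithmetic types, which is associative and commutative by construction. The C sketch below encodes that reading; the enum and function names are illustrative, and the handling of null is deliberately left out.

```c
/* Ranks follow the rule above: all long -> long, any ulong but no
   double -> ulong, any double -> double. */
enum arith_type { ARITH_LONG = 0, ARITH_ULONG = 1, ARITH_DOUBLE = 2 };

/* Common type of two operands: the higher-ranked type wins. */
enum arith_type common_type(enum arith_type a, enum arith_type b)
{
    return a > b ? a : b;
}
```

Because taking a maximum does not depend on grouping or order, folding `common_type` over any number of operands gives the same result however the operands are arranged.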
There's a simple syntax in cxing for creating compound objects and types:
decl Complex := namedtuple() { 're': double, 'im': double };
decl I := Complex() { 're': 0, 'im': 1 };
decl sockaddr := dict() { 'host': "example.net", 'port': 443 };
In the above scenario,
the namedtuple() factory function creates an object that is
a type object that creates another type object with 2 members
named "re" and "im", this type is assigned to Complex,
which is then used to create a "complex number"
with the value of the imaginary unit;
the dict() factory function creates a type object that creates a
dictionary, initializing sockaddr with 2 members - "host" with
the value of "example.net" and "port" with 443.
namedtuple, Complex, and dict are "type objects",
with namedtuple being sort of a meta.
A type object contains a method property named __initset__ declared as follows:
method __initset__(key, value);
Note: The parameters of the __initset__ method property were changed from
ref to val. For one, most usages would have keys and values as literals,
so it doesn't make sense to have references to them. The other issue is that there
hadn't been a way to signify the end of list. This is now changed to use the
setting of the existing __proto__ property to the type object for signifying
the end-of-list. As of 2025-10-27, the ref argument type is removed completely;
further, as of 2025-12-26, operand types' annotations are eliminated altogether.
objdef-start % objdefstart
: objdef-start-comma % comma
| objdef-start-nocomma % nocomma
;
objdef-start-comma % objdefstartcomma
: objdef-start-nocomma "," % genrule
;
objdef-start-nocomma % objdefstartnocomma
: postfix-expr "{" postfix-expr ":" assign-expr % base
| objdef-start-nocomma "," postfix-expr ":" assign-expr % genrule
;
object-notation % objdef
: postfix-expr "{" "}" % empty
| objdef-start "}" % some
| auto-index % array
;
The postfix-expr MUST NOT be inc or dec. Furthermore, if postfix-expr
is degenerate, then the primary expression MUST NOT be const.
On encountering a postfix-expr that is a type object, the key-value pairs
enclosed in the braces delimited by commas are taken and the __initset__
method is called on them in turn. The key is the value of the postfix
expression on the left side of the colon, while the value is that of the
assignment expression on the right side of the colon. After this completes,
the __initset__ method is invoked with __proto__ as key and
the value of postfix-expr to signify the end, and then, the now value
of postfix-expr becomes the value of the object-notation expression.
Note: As such, the property names __initset__ and __proto__ are
RESERVED for the "Type Definition and Object Initialization Syntax".
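The order of __initset__ calls made for an object-notation expression can be mimicked in C as below. This is a simplified sketch: keys and values are plain C strings rather than value native objects, and all names (`pair`, `initset_fn`, `run_object_notation`, `call_log`, `log_initset`) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for one key-value pair between the braces. */
struct pair { const char *key; const char *value; };

typedef void (*initset_fn)(void *self, const char *key, const char *value);

/* Calls initset once per pair, in order, then once more with __proto__
   as the key and the type object as the value to mark the end. */
void run_object_notation(void *self, initset_fn initset,
                         const struct pair *pairs, size_t n,
                         const char *type_object)
{
    size_t i;
    for (i = 0; i < n; i++)
        initset(self, pairs[i].key, pairs[i].value);
    initset(self, "__proto__", type_object);
}

/* Example receiver that only records what it saw. */
struct call_log { size_t calls; int ended; };

void log_initset(void *self, const char *key, const char *value)
{
    struct call_log *log = self;
    (void)value;
    log->calls++;
    log->ended = strcmp(key, "__proto__") == 0;
}
```

For the sockaddr example, the receiver would see "host", then "port", then the terminating "__proto__" call, for 3 calls in total.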
auto-index-start-comma % array_piece
: postfix-expr "[" assign-expr "," % base
| auto-index-start-comma assign-expr "," % genrule
;
auto-index % array
: auto-index-start-comma "]" % complete
| auto-index-start-comma assign-expr "]" % streamline
;
The array rule is syntactic sugar that invokes __initset__ with elements
in the expressions-list as value and successive integer indices as key,
starting with 0.
Note: Much of this section is motivated by a desire to have a self-contained description of numerics in commodity computer systems, as well as an interpretation, explanation, and rationale of the standard text that's at least more useful in terms of practical usage than the standard text itself.
IEEE-754 specifies the following rounding modes:
roundTiesToEven: This is MANDATORY and SHALL be the default within a thread when the thread starts. The floating point value closest to the infinitely precise result is returned. If there are two such values, the one whose least significant digit is even is returned.
roundTowardPositive: The least representable floating point value no less than the infinitely precise result is returned.
roundTowardNegative: The greatest representable floating point value no greater than the infinitely precise result is returned.
roundTowardZero: The representable floating point value with greatest magnitude no greater than that of the infinitely precise result is returned.
The standard library provides facilities for setting and querying the rounding mode in the current thread. The presence of other rounding modes (e.g. roundTiesToAway, roundToOdd, etc.) is implementation-defined.
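The tie-breaking rule of roundTiesToEven can be illustrated on integers, where dropping low bits plays the role of rounding a significand. The sketch below is not how the cxing runtime or IEEE-754 hardware is implemented; the function name is made up for the example.

```c
#include <stdint.h>

/* Drops the low k bits of x (0 < k < 64), rounding to the nearest
   remaining value and breaking exact ties towards the even one. */
uint64_t round_shift_ties_to_even(uint64_t x, unsigned k)
{
    uint64_t kept = x >> k;
    uint64_t half = UINT64_C(1) << (k - 1);
    uint64_t rest = x & ((UINT64_C(1) << k) - 1);

    if (rest > half)
        return kept + 1;        /* closer to the upper neighbour */
    if (rest < half)
        return kept;            /* closer to the lower neighbour */
    return kept + (kept & 1);   /* exact tie: pick the even neighbour */
}
```

With k = 2, the input 10 (2.5 in units of 4) rounds to 2, while 14 (3.5) rounds to 4; both ties land on the even neighbour.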
Infinity and NaNs are not numbers. It is the interpretation of @dannyniu that they exist in numerical computation strictly to serve as error recovery and reporting mechanism.
IEEE-754 specifies the following 5 exceptions:
invalid: known as "invalid operation" in standard's term. This is when:
pole: known as "division by zero" in standard's term. A pole results when operation by an operand results in an infinite limit. Particular cases of this include 1/0, tan(90°), log(0), etc.
overflow: this is when and only when the result exceeds the magnitude
of the largest representable finite number of the floating point data type
after rounding. The data type is double a.k.a. binary64 in our language.
underflow: this is when a tiny non-zero result has an absolute value below b^emin, where b is the radix of the floating point data type (2 in our case), and emin is, in our case, -1022.
Note: emin can be derived as: 2 - 2^(ebits-1), where ebits is the number of bits in the exponent, which is 11 in our case.
inexact: this is when the result after rounding differs from what would be the actual result if it were calculated to unbounded precision and range.
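The underflow threshold quoted above can be checked against the constants in C's `<float.h>`, assuming the platform's double is binary64. The function names below are illustrative only.

```c
#include <float.h>

/* emin = 2 - 2^(ebits-1); this sketch is valid for 1 < ebits < 31. */
int emin_from_ebits(int ebits)
{
    return 2 - (1 << (ebits - 1));
}

/* Nonzero when x is a non-zero value whose magnitude is below the
   smallest normal binary64 number, 2^-1022 (DBL_MIN in C). */
int is_tiny_binary64(double x)
{
    double mag = x < 0 ? -x : x;
    return mag != 0.0 && mag < DBL_MIN;
}
```

Note that C defines `DBL_MIN_EXP` with an offset of one relative to IEEE-754's emin, so for binary64 it is -1021 rather than -1022.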
The standard library provides facility for querying, clearing, and raising exceptions. Alternate exception handling attributes are implemented in the language as error-handling flow-control constructs, such as null-coalescing expression and phrases operators, as well as execution control functions.
Floating points have a fixed significand width as well as limited range(s) of exponents; as such, they're very similar to scientific notation, and they suffer from the same inaccuracy problems as any notation that truncates a large fraction of value digits. However, this does yield a favorable trade-off in terms of implementation (and to some extent, usage) efficiency.
IEEE-754 recommends that language standards provide a means to derive a sequence (a graph actually, if dependencies are taken into account) of computation in a way that is deterministic. Many C compilers provide options that make maths work faster using arithmetic associativity, commutativity, distributivity and other laws (e.g. fast-math options); cxing makes no provision that prevents this - people favoring efficiency and people favoring accuracy should both be the audience of this language.
The root cause of calculation errors stems from the fact that the significand of a floating point datum is limited. This error is amplified in calculations. A way to quantify this error is using the "unit(s) in the last place" - ULP. There are various definitions of ULP. Vendors of mathematical libraries may at their discretion document the error amplification behavior of their library routines for users to consult; framework and library standards may at their discretion specify requirements in terms of error amplification limits. Developers are reminded again to recognize, and evaluate at their discretion, the trade-off between accuracy and efficiency.
Because of the existence of calculation errors, floating point data are recommended as an instrument of data exchange. In fact, earlier versions of the IEEE-754 standard distinguished between interchange formats and arithmetic formats. Because arithmetic and the format where it's carried out are essentially black-box implementation details, the significance of arithmetic formats is no longer emphasized in IEEE-754.
The recommended methodology of arithmetic is to first derive a procedure of calculation that is a simplified version of the full algorithm, eliminating as much amplification of error as possible, then feed the input data elements into the algorithm to obtain the output data. The procedure so derived should take into account any exceptions that might occur.
For example, (a+b)(c+d) = ac+ad + bc+bd has
2 additions and 1 multiplication on the left-hand side and
3 additions and 4 multiplications on the right-hand side.
A program may first attempt to calculate the left-hand side, because it has
less chance of error amplification. However, if the addition of c and d
overflows but they're individually small enough such that their multiplication
with either a or b won't overflow, yet the sum of a and b underflows
in a certain way that's catastrophic, the whole expression may become NaN.
In this case, a fallback expression may then compute the right-hand side of the expression, possibly yielding a finite result, or at least one that arithmetically makes sense (i.e. infinity).
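The fallback described above could look like the following in C. This is a sketch under the assumption that a NaN from the factored form is the only trigger for the fallback; the function name is hypothetical, and in cxing itself the same idea would presumably be expressed with a null-coalescing construct.

```c
/* Computes (a+b)(c+d), falling back to the expanded form when the
   factored form yields NaN. */
double product_of_sums(double a, double b, double c, double d)
{
    double factored = (a + b) * (c + d);

    if (factored == factored)   /* not NaN: keep the cheaper result */
        return factored;
    return a * c + a * d + b * c + b * d;
}
```

For instance, with a = 1, b = -1 and c = d = 1.7e308, the factored form multiplies 0 by an overflowed sum, whereas the expanded form does not produce NaN. The fallback does not help when an input is itself NaN.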
The result of computation carried out using such "derived" procedure will certainly deviate from the result of a "complete" algorithm. Developers should recognize that robustness may be more important in some applications than they may expect. In the limited circumstances where an application is in reality less important, or is in fact a prototype, developers may, at their careful discretion, exercise less engineering effort when coding a numerical program.
Finally, it is recognized that a large existing body of sophisticated numerical programs is written using 3rd-party libraries, and/or using techniques that're under active research, not specified by, and beyond the scope of, many standards. Developers requiring high numerical sophistication and robustness are encouraged to consult this research, and evaluate (again) the accuracy and efficiency requirements at their careful discretion.
The recommended applications of floating points in computers are Computer Graphics, Signal Processing, Artificial Intelligence, etc.
Typical characteristics of these applications include:
With the exception of resources and garbage collection, everything else in the entirety of this chapter is concerned with the interoperability of compiled implementations. Non-compiled implementations are nonetheless recommended to consult this chapter to maintain modal conceptual consistency. Care has been taken to ensure that this chapter is decoupled to the maximal extent from the language proper, and any entanglement is not intentionally desired.
While the features and the specification of the language is supposed to be stable, as a guiding policy, in the unlikely event where certain interface in the runtime posing efficiency problem are to be replaced with alternatives, deprecation periods are given in the current major version of the runtime (and thus the language), before removal in a future major version should that happen; in the even more unlikely event where certain interface exposes a vulnerability so fundamental that necessitates its removal, the language along with its runtime is revised, a new version is released, and the vulnerable version is deprecated immediately. The versioning practice is in line with recommendation by Semantic Versioning.
Dynamic libraries and applications linking with dynamic libraries programmed in cxing should not statically link with the cxing runtime. Unless no opaque object is passed between translation units compiled by different implementations (which is unlikely), statically linking to different incompatible implementations of the runtime may result in undefined behavior when opaque objects and the functions that manipulate them are from different implementations.
The version of the runtime and the version of the language specification are coupled together to make it easy to determine which version of runtime should be used to obtain the features of relevant version of the language. If the standard library is to be provided, then the runtime should be provided as part of the standard library, the name of the linking library file should be the same for both the runtime and for when it's extended into/as standard library.
The recommended name for the library corresponding to version 0.5 of the
specification is libcxing0.so.5 for systems using the
UNIX System V ABI such as Linux, BSDs, and several commercial Unix distros.
For the Darwin family of operating systems such as macOS, iOS, etc., the
recommended name is libcxing0.5.dylib.
For some platforms such as Windows, vendors have greater control over the dynamic libraries bundled with the programs in an application. Therefore no particular recommendations are made for these platforms.
The types long and ulong are passed to functions as C types int64_t
and uint64_t respectively; the type double is passed as the
C type double.
The "value" and "lvalue" native object are defined as the following C structure types:
enum types_enum : uint64_t {
    valtyp_null = 0,
    valtyp_long,
    valtyp_ulong,
    valtyp_double,
    // the opaque object type.
    valtyp_obj,
    // `proper.p` points to a `struct value_nativeobj`.
    // currently unused.
    valtyp_ref,
    // subroutines and methods.
    valtyp_subr = 6,
    valtyp_method,
    valtyp_ffisubr, // reserved as of 2025-11-03.
    valtyp_ffimethod, // reserved as of 2025-11-03.
    // 10 types so far.
};
struct value_nativeobj;
struct type_nativeobj;
struct value_nativeobj {
    union { double f; int64_t l; uint64_t u; void *p; } proper;
    union {
        const struct type_nativeobj *type;
        uint64_t pad; // zero-extend the type pointer to 64-bit on ILP32 ABIs.
    };
};
struct lvalue_nativeobj {
    struct value_nativeobj value;
    // The following fields are for lvalues:
    // 2026-01-01:
    // because different kind of scopes needs different accessors,
    // a mere pointer to the scope is not enough - it needs
    // accessor properties, therefore this is changed to
    // a value native object.
    struct value_nativeobj scope;
    // 2026-01-01:
    // the reference implementation uses `s2data_t` from the SafeTypes2
    // library, other implementations may have a different choice,
    // barring binary compatibility and interoperability issues.
    void *key;
};
struct type_nativeobj {
    enum types_enum typeid;
    uint64_t n_entries;
    // There are `n_entries + 1` elements, last of which `type` being the only
    // `NULL` entry in the array.
    struct {
        const char *name;
        struct value_nativeobj *member;
    } static_members[];
};
As mentioned in language semantics, there are 2 types of nulls:
null, where typeid equals 0 - valtyp_null, and l member
of value proper contains the diagnostic information that may be obtained
through uncasting.
p of value proper contains NULL with typeid having
the enumeration value valtyp_obj.
A function in cxing receives its arguments as a pointer to an array
of value native objects, passed as the second argument in the respective
C calling convention, with the first argument containing the number of actual
arguments passed. Because cxing is a dynamically typed language,
the actual number of passed arguments may be less (or more in certain cases)
than the number of arguments expected as inferred from the declaration of the
function. Implementations must anticipate these and generate Morgoth nulls
as appropriate when these values are accessed.
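A possible way to handle missing arguments is sketched below in C. The structures are abridged copies of the ones above, the helper names (`morgoth_null`, `arg_or_null`, `is_morgoth_null`) are hypothetical, and the sketch assumes that the Morgoth null is the variant with a NULL pointer and the valtyp_obj type.

```c
#include <stddef.h>
#include <stdint.h>

/* Abridged copies of the structures defined above. A plain enum is used
   here so that the sketch also compiles with pre-C23 compilers. */
enum types_enum { valtyp_null = 0, valtyp_long, valtyp_ulong,
                  valtyp_double, valtyp_obj };

struct type_nativeobj { enum types_enum typeid; };

struct value_nativeobj {
    union { double f; int64_t l; uint64_t u; void *p; } proper;
    const struct type_nativeobj *type;
};

/* Illustrative type objects; the real ones belong to the runtime. */
const struct type_nativeobj type_long = { valtyp_long };
const struct type_nativeobj type_obj = { valtyp_obj };

/* Assumed encoding of a Morgoth null: NULL pointer, valtyp_obj type. */
struct value_nativeobj morgoth_null(void)
{
    struct value_nativeobj v;
    v.proper.p = NULL;
    v.type = &type_obj;
    return v;
}

/* Arguments that were not actually passed read as Morgoth nulls. */
struct value_nativeobj arg_or_null(int argn,
                                   const struct value_nativeobj args[],
                                   int i)
{
    if (i < 0 || i >= argn)
        return morgoth_null();
    return args[i];
}

int is_morgoth_null(struct value_nativeobj v)
{
    return v.type->typeid == valtyp_obj && v.proper.p == NULL;
}
```

A compiled function body would then read every parameter through such an accessor instead of indexing the array directly.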
As mentioned in 9.4. Subroutines and Methods, methods
carry an implicit this parameter; this is passed as the initial argument
(i.e. element with index 0 in the array of value native objects). Subroutines,
on the other hand, receive the first argument as the initial element in the
arguments array directly.
The C prototype of cxing functions is:
struct value_nativeobj <func-ident>(int argn, struct value_nativeobj args[]);
Where <func-ident> is the identifier naming the function.
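To make the difference in argument indexing concrete, here is a C sketch of one subroutine and one method following the prototype above. The structures are abridged, the names `cx_twice` and `cx_plus` are invented, type checks are omitted, and the sketch assumes that argn counts the this argument for methods and that a missing argument counts as 0.

```c
#include <stdint.h>

/* Abridged stand-ins for the structures defined earlier. */
enum types_enum { valtyp_null = 0, valtyp_long };
struct type_nativeobj { enum types_enum typeid; };
struct value_nativeobj {
    union { double f; int64_t l; uint64_t u; void *p; } proper;
    const struct type_nativeobj *type;
};

const struct type_nativeobj type_long = { valtyp_long };

struct value_nativeobj long_value(int64_t l)
{
    struct value_nativeobj v;
    v.proper.l = l;
    v.type = &type_long;
    return v;
}

/* subr cx_twice(x): the first declared parameter is args[0]. */
struct value_nativeobj cx_twice(int argn, struct value_nativeobj args[])
{
    int64_t x = argn > 0 ? args[0].proper.l : 0;
    return long_value(2 * x);
}

/* method cx_plus(n): `this` is args[0], the first declared parameter
   is args[1]. */
struct value_nativeobj cx_plus(int argn, struct value_nativeobj args[])
{
    int64_t self = argn > 0 ? args[0].proper.l : 0;
    int64_t n = argn > 1 ? args[1].proper.l : 0;
    return long_value(self + n);
}
```

Both functions share the same C signature, which is why the caller has to know whether it is invoking a method or a subroutine in order to place the arguments correctly.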
Note: Before 2025-10-03, it was mistakenly said that the this parameter
is received as a ref. This was in conflict with the spec developer's intent
that opaque objects be passed as pointer handles. Since a better runtime
implementation strategy was discovered, the passing of this and opaque
object arguments was revised. See the note in
9.4. Subroutines and Methods.
As of 2025-10-27, the ref argument type is removed completely.
The cxing language did away with foreign function interface as of Nov. 2025, and this aspect has been replaced entirely with reverse FFI - that is, instead of cxing invoking the foreign function, a foreign language exposes a cxing interface instead, and invokes cxing functions in accordance with the cxing calling conventions.
Resources are generically defined as what enables a program to run and function, and is associated with it. When a value is destroyed, the resources associated with it are finalized and released, which may lead to the resources being freed for reuse elsewhere.
Note: On a reference-counted implementation (which is conceptually prescribed), releasing an object "decreases" its reference count, and when the reference count reaches 0, the resources are "freed". Under implementation-defined circumstances, an object may be released by all, but still referenced somewhere (e.g. reference cycle), which requires garbage collection to fully "free" the object and its resources.
Editorial Note: Previously (before 2025-09-26), finalize and destroy were used interchangeably; now finalize refers to that of resources and destroy refers to that of values (i.e. the concept of value native objects).
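The reference-counting note above can be pictured with a small C sketch. It is only a conceptual model: the structure layout and the names `resource_retain`, `resource_release`, and `counting_finalize` are assumptions, and cycle collection is not shown.

```c
#include <stddef.h>

struct resource {
    unsigned refcount;
    void (*finalize)(struct resource *self);  /* frees what the value holds */
};

void resource_retain(struct resource *r)
{
    r->refcount++;
}

/* Returns 1 when this release freed the resource, 0 otherwise. */
int resource_release(struct resource *r)
{
    if (--r->refcount > 0)
        return 0;               /* still referenced elsewhere */
    if (r->finalize != NULL)
        r->finalize(r);         /* count reached 0: free the resources */
    return 1;
}

/* Example finalizer that merely counts how often it ran. */
int finalize_count = 0;

void counting_finalize(struct resource *self)
{
    (void)self;
    finalize_count++;
}
```

Two objects that retain each other never reach a count of 0 under this scheme, which is the case the garbage collection process exists for.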
subr cxing_gc();
The cxing_gc foreign function invokes the garbage collection process.
Note: In part because the runtime implementation needs to be informed of the destruction of values to finalize relevant resources, and more pressingly because of the benefit to the design of idiomatic standard library features, copying and destruction of values are now being defined. To define the concepts in terms of reference counts would mean depending on intrinsic implementation details, and also that there's a circular dependency in the definition. Seeking an alternative, it was discovered that copying and destroying are paired concepts that must be described together, and this is the approach taken here.
To copy a value means to preserve its existence in the event of its destruction, which causes the value to cease to exist; when a value is copied, the value and the copied value can both exist, and the destruction of either doesn't affect the existence of the other.
The __copy__ property is a method that copies its this argument and
returns "the copy" as a val. The __final__ property is a method that
releases the resources used by the value before the destruction of the value.
Although the __copy__ and __final__ properties are not required to be
type-associated, because they manipulate resources that're opaque to the
language, they're almost always implemented as type-associated.
Note: Primitive types such as long, ulong, and double may not need
a __copy__ method - a runtime recognizing these sorts of types may copy them
in any way that may be assumed reasonable according to common sense. For types
without a __final__ method, it is assumed that there is no resource consumed
by the value beyond what's already in the value native object structure.
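The pairing of copying and destruction can be modeled outside cxing. The following is a minimal Python sketch, assuming a reference-counted implementation; the names Resource, Value, copy and final are hypothetical stand-ins for an opaque resource, a value, and the __copy__ and __final__ methods.

```python
class Resource:
    """A hypothetical opaque resource with a conceptual reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1      # the creating value holds the first reference
        self.freed = False


class Value:
    """Models a value whose copy and final methods manage a shared resource."""

    def __init__(self, resource):
        self.resource = resource

    def copy(self):
        # Copying preserves existence: the copy survives destruction of the original.
        self.resource.refcount += 1
        return Value(self.resource)

    def final(self):
        # Releasing decreases the count; the resource is freed when it reaches 0.
        self.resource.refcount -= 1
        if self.resource.refcount == 0:
            self.resource.freed = True
```

In this model, finalizing the original leaves the copy usable, and the resource is only freed once both have been finalized. Reference cycles, which would need garbage collection, are not modeled.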
In the following sections, some special notations that're not part of the language are used for ease of presentation.
The meaning of such notation:
[Type(Base): Function1 | Function2 | ... | FunctionN] := { ... }
is as follows:
An object whose members are listed in the brace may be created by and/or returned from function(s) Function1 ... FunctionN.
The optional Type(Base): part specifies Type as name for the type of object
returned by the said functions, with Base representing the 'base class' that
Type inherits features and/or behaviors from.
The cxing language is composed of modules. Language syntax and semantics are specified in the preceding chapters; together with the following chapters on mandatory standard libraries, these form what's colloquially known as "Module-0". Additional modules are optional, and should they exist, they specify interfaces related to particular functionality. Certain interfaces of a particular module may be specified in separate chapters if they're topically sparse.
For all library chapters in module-0, the following statement exists towards the beginning of the relevant chapters:
This chapter forms an integral part of the language and its implementation is mandatory.
For library chapters pertaining to a particular module, the following statement exists towards the beginning of the chapters making up the module:
This chapter forms an integral part of module X - should module X be implemented, this chapter along with any chapter constituting part of module X must be implemented in their entirety.
Certain modules may have dependencies on others, and the following statement may appear:
This module depends on module Y; should this module be implemented, module Y must also be implemented.
This chapter forms an integral part of the language and its implementation is mandatory.
str(obj) := {
method len(),
method trunc(newlength),
method putc(c),
method puts(s),
method putfin(),
method cmpwith(s2), // efficient byte-wise collation.
method equals(s2), // constant-time, cryptography-safe.
[method map(structlayout)] := {
method __get__(k),
method __set__(k, v),
method unmap(),
},
};
The string type str is a sequence of bytes. Some APIs may expect nul-terminated
strings, and would ignore any byte after the first nul byte.
A string has a length that's reported by the len() function as a long,
and can be altered using the trunc() function.
The putc() function can be used to append a byte whose integer value is
specified by c, to the end of the string; the puts() function can be
used to append another string to the end; both putc() and puts() may
buffer the input on the working context of the string; such a buffer needs to be
flushed using the putfin() function before the string is used in other
places.
For trunc(), putc(), puts(), and putfin(), the object itself is
returned on success, and null is returned on failure.
The cmpwith() function returns a value less than, equal to, or greater than 0
if the string is less than, the same as, or greater than s2. A strict prefix of
a string is less than the string of which it's a prefix.
The equals() function returns true if the string equals s2 and false
otherwise. If the 2 strings are of the same length, it is guaranteed that
the comparison is done without a cryptographically exploitable timing side channel.
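The two comparisons can be illustrated with a short Python sketch operating on bytes. The function names mirror the methods, but this is only a model under assumptions: pure Python gives no real timing guarantee, so the loop in equals merely shows the usual technique of accumulating differences without an early exit.

```python
def cmpwith(s1: bytes, s2: bytes) -> int:
    """Byte-wise collation: negative, zero, or positive result."""
    for a, b in zip(s1, s2):
        if a != b:
            return a - b
    # All shared bytes are equal, so a strict prefix sorts before the longer string.
    return len(s1) - len(s2)


def equals(s1: bytes, s2: bytes) -> bool:
    """Equality that inspects every byte when the lengths match."""
    if len(s1) != len(s2):
        return False          # lengths are not treated as secret here
    diff = 0
    for a, b in zip(s1, s2):
        diff |= a ^ b         # accumulate differences without an early exit
    return diff == 0
```

In real Python code, hmac.compare_digest from the standard library is the vetted tool for this kind of comparison.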
The map() function creates an object that is a parsed representation of the
underlying data structure. This object can be used to modify the memory backing
of the data structure if the corresponding memory backing is writable. The
memory backing is writable by default, and the circumstances under which it's
not writable are implementation-defined.
The unmap() function unmaps the parsed representation, thus making it
no longer usable, and returns true. The variable can then only be finalized
(or overwritten, which would imply a finalization). The trunc() function
cannot be called on the string unless there's no active mapping of the string.
Note: Previously, the unmap() function returned null. Because nullish
values are reserved in cxing entirely as an error indicator, its
return type is now changed to bool.
Note: Although the canonical way to access data behind a str object, is
to first map it to a structure type, it is anticipated that a common extension
will exist in the wild allowing for "mutable" strings - where they implement
the __get__ and the __set__ methods. This is not yet considered for
standardization even though there's no compelling reason not to. For
implementations that do provide this extension, the following requirements apply:
- The __get__ method shall return long for byte range 0-255 inclusive, and -1 on out of bound access.
- The __set__ method shall accept a second argument of at least the long and the ulong type, and shall cast double to ulong by truncating fractions. The byte values shall be set by discarding all but the lowest 8 bits of the byte (non-octet bytes are not considered for cxing).
- The application shall ensure the key is a non-negative integer index of type long or ulong, and the implementation may have undefined behavior if this requirement on the application is not met.
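The requirements on this extension can be modeled in Python as follows. This is a sketch only: the class name MutableStr and the method names get and set are hypothetical stand-ins for a str object with __get__ and __set__, and out-of-range keys passed to set are left to raise, since the text leaves that case undefined.

```python
class MutableStr:
    """Hypothetical model of the optional mutable-string extension."""

    def __init__(self, data: bytes):
        self._buf = bytearray(data)

    def get(self, k: int) -> int:
        # Returns the byte value 0-255, or -1 on out-of-bound access.
        if 0 <= k < len(self._buf):
            return self._buf[k]
        return -1

    def set(self, k: int, v) -> None:
        # A double is truncated toward zero, then all but the lowest 8 bits are discarded.
        self._buf[k] = int(v) & 0xFF
```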
This chapter forms an integral part of the language and its implementation is mandatory.
decl char, byte; // signed and unsigned 8-bit,
decl short, ushort; // signed and unsigned 16-bit,
decl int, uint; // signed and unsigned 32-bit,
decl long, ulong; // signed and unsigned 64-bit,
decl half, float, double; // binary16, binary32, binary64.
// decl _Decimal32, _Decimal64; // not supported yet.
// decl huge, uhuge, quad, _Decimal128; // too large.
[subr struct()] := {
method __initset__(key, value),
};
[subr packed()] := {
method __initset__(key, value),
};
[subr union()] := {
method __initset__(key, value),
};
The representations for char, byte, short, ushort, int, uint,
long, ulong, half, float, and double are given in the comments
following their declarations; their alignments are the same as their sizes.
These are known as primitive types.
All of these type objects have a method member called from, which performs
explicit type and value conversion - unlike implicit type and value conversion,
the resulting type is determined by the type object. The method takes
one argument and converts it to a value representable in the destination type:
- long language type for signed types, and ulong for unsigned types.
- double.
A struct_inst object represents an instance of a structure that is suitable for
use in a call to the map() method of the str type, representing a structure
with members laid out sequentially and suitably aligned. A packed_inst is
similar, but with no alignment - all members are packed back-to-back.
A union_inst creates a structure layout object with all members having the
same start address at byte 0 and the alignment of the most strictly aligned member.
Each object of type struct_inst, packed_inst, or union_inst is a
type object. They're initialized with members using the syntax as described in
11. Type Definition and Object Initialization Syntax; and
are created using the struct(), packed(), and union() factory functions
respectively.
Primitive types and structure layout objects may be array-accessed to create array types of the respective types.
For example:
decl AesBlock = union() { 'b': byte[16], 'w': uint[4] };
decl Aes128Key = AesBlock[11];
The variable AesBlock holds a structure layout object of 128 bits,
and Aes128Key holds the 11 round keys for an AES-128 cipher.
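The three layout rules can be made concrete with a small Python model. This is a sketch under assumptions: members are given as hypothetical (primitive name, element count) pairs, alignment equals size as stated above, and the trailing padding added to struct and union sizes is an assumption borrowed from C rather than something this chapter states.

```python
# Sizes in bytes of the primitive types; alignment equals size.
PRIMITIVE_SIZE = {
    "char": 1, "byte": 1, "short": 2, "ushort": 2,
    "int": 4, "uint": 4, "long": 8, "ulong": 8,
    "half": 2, "float": 4, "double": 8,
}


def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(kind: str, members: dict) -> dict:
    """Compute member offsets and total size for 'struct', 'packed', or 'union'.

    members maps a member name to a (primitive type name, element count) pair.
    """
    offsets, end, strictest = {}, 0, 1
    for name, (prim, count) in members.items():
        size = PRIMITIVE_SIZE[prim]
        strictest = max(strictest, size)
        if kind == "union":
            start = 0                        # every member starts at byte 0
        elif kind == "packed":
            start = end                      # back-to-back, no padding
        else:
            start = align_up(end, size)      # sequential and suitably aligned
        offsets[name] = start
        span = start + size * count
        end = max(end, span) if kind == "union" else span
    if kind != "packed":
        end = align_up(end, strictest)       # assumed trailing padding
    return {"offsets": offsets, "size": end}
```

Applied to the AesBlock example, layout("union", {"b": ("byte", 16), "w": ("uint", 4)}) places both members at offset 0 with a total size of 16 bytes, which agrees with the 128 bits mentioned above.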
This chapter forms an integral part of the language and its implementation is mandatory.
[subr dict()] := {
method __get__(k),
method __set__(k, v),
method __copy__(),
method __final__(),
method __unset__(k),
method __initset__(k, v),
[method __keys__()] := {
method __get__(k),
method __copy__(),
method __final__(),
},
}
The function dict creates a dictionary, also known in the literature as an
associative array, or a hash table (from the implementation's perspective).
The semantics of __get__, __set__, __copy__, __final__, and __unset__
are as described in 9.2. Object/Value Key Access.
The member __initset__ SHALL NOT be a type-associated property.
The __keys__() method retrieves an immutable snapshot of the keys present
on the dictionary, at the time of the snapshot, and returns an object
consisting of the type-associated method properties __get__(), __copy__(),
and __final__().
The __get__() method may be used to retrieve length which indicates the
number of keys in the snapshot, as well as the keys themselves indexed 0
through length-1. The order of the keys is unspecified.
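The snapshot behavior of __keys__() can be modeled in Python as below. This is a sketch: the class name KeysSnapshot and the method name get are hypothetical, and returning None for any other key is an assumption standing in for null.

```python
class KeysSnapshot:
    """Immutable snapshot of the keys of a dictionary at one point in time."""

    def __init__(self, source: dict):
        self._keys = tuple(source)   # copied now, so later changes are not seen

    def get(self, k):
        if k == "length":
            return len(self._keys)
        if isinstance(k, int) and 0 <= k < len(self._keys):
            return self._keys[k]
        return None                  # stands in for null on any other key
```

Because the order of keys is unspecified, code that walks indices 0 through length-1 should not rely on any particular ordering.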
subr isnull(x);
subr islong(x);
subr isulong(x);
subr isdouble(x);
subr _Uncast(x);
The functions isnull, islong, isulong, and isdouble determine whether
the value is the special value null, of type long, type ulong, or
type double respectively.
The function _Uncast performs uncasting of nulls - an operation whose
semantics are described in 10. Types and Special Values.
TODO 2025-12-26: decide what to do with non-null arguments for uncasting.
To Be Changed: The exact form of the following functionality is being redesigned, and will change over time.
subr fpmode(mode);
Returns the currently active rounding mode. If mode is one of the supported modes, then the current rounding mode is set to the specified mode. The value -1 is guaranteed to not be any supported mode.
The following modes are supported:
The support for other modes is unspecified.
The encoding of modes is as follows:
The next bits are as follows:
Such encoding is chosen to cater to possible future extensions. Not all possible rounding modes offer numerical analysis merit, as such some of the combinations are not valid on some implementations.
To Be Changed: The exact form of the following functionality is being redesigned, and will change over time.
// Tests for exceptions
subr fptestinval(); // **invalid**
subr fptestpole(); // **division-by-zero**
subr fptestoverf(); // **overflow**
subr fptestunderf(); // **underflow**
subr fptestinexact(); // **inexact**
// Clears exceptions
subr fpclearinval(); // **invalid**
subr fpclearpole(); // **division-by-zero**
subr fpclearoverf(); // **overflow**
subr fpclearunderf(); // **underflow**
subr fpclearinexact(); // **inexact**
// Sets exceptions
subr fpsetinval(); // **invalid**
subr fpsetpole(); // **division-by-zero**
subr fpsetoverf(); // **overflow**
subr fpsetunderf(); // **underflow**
subr fpsetinexact(); // **inexact**
// Exceptions state.
subr fpexcepts(excepts);
The fptest*, fpclear*, and fpset* functions test, clear, and set the
corresponding floating point exceptions in the current thread.
The fpexcepts function returns the current exceptions flags. If excepts is
a valid flag, then the exceptions flag in the current thread will be set,
otherwise, it will not be set. The value 0 is guaranteed to be a valid flag
meaning all exceptions are clear; the value -1 is guaranteed to be an invalid
flag. The validity of other flag values is UNSPECIFIED. When the
implementation is being hosted by a C implementation, the encoding of excepts
is exactly that of FE_* macros, with the clear intention to minimize
unnecessary duplicate enumerations as much as possible.
This chapter forms an integral part of "The Regex Module" - should this module be implemented, this chapter along with any chapter constituting part of "The Regex Module" must be implemented in their entirety.
[RegExp: bre_comp(regex, cflags) | ere_comp(regex, cflags)] := {
// An opaque object representing a compiled regular expression.
method split(subject);
method match(subject);
method capture(subject);
method replace(subject, replacement, limit);
};
The bre_comp() and ere_comp() functions compile a regular expression based
on the "Basic Regular Expression" and "Extended Regular Expression" syntax
specified by POSIX. Under the C/POSIX locale, all regex features up to
POSIX-2017 are mandatory.
The cflags are expressed as radix-64 digits, whose correspondence with
POSIX compile flag constants is as follows:
- 0\i: REG_ICASE - regex is executed without regard to case.
- 0\n: REG_NEWLINE - the lines (delimited by the LINE-FEED character) in the subject string are considered individually.
The split() method splits the subject string into a 0-base-indexed array
of strings. The match() method determines whether the subject string can
be matched by the regular expression.
The capture() method matches the subject string, putting matched
subexpressions in an array (starting from the index 1), the (entire) matched
portion of the subject string in the 0th element of the said array,
then returns the array.
The replace() method replaces limit number of occurrences of the substring
matching the regex, with replacement. Each occurrence of $<n> where <n>
is a single decimal digit is replaced with the n-th subexpression in the
regex. If <n> is 0, then it's replaced with the whole matched portion of
the subject string. If limit is -1, then all occurrences shall be replaced.
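The capture() and replace() semantics can be sketched with Python's re module. This is only a model: Python's regex syntax differs from POSIX BRE and ERE, the functions take the pattern as a plain argument rather than being methods of a compiled object, and the handling of a limit of 0 and of a failed match is assumed rather than taken from the text.

```python
import re


def capture(pattern: str, subject: str):
    """Return [whole match, subexpression 1, ...], or None when nothing matches."""
    m = re.search(pattern, subject)
    if m is None:
        return None
    return [m.group(0), *m.groups()]


def replace(pattern: str, subject: str, replacement: str, limit: int) -> str:
    """Replace up to `limit` matches, expanding $0..$9; a limit of -1 means all."""
    def expand(m):
        # $0 is the whole match; $1..$9 are the numbered subexpressions.
        return re.sub(r"\$(\d)", lambda d: m.group(int(d.group(1))) or "", replacement)

    if limit == 0:
        return subject                    # assumed: a limit of 0 replaces nothing
    count = 0 if limit == -1 else limit   # re.sub treats count=0 as "no limit"
    return re.sub(pattern, expand, subject, count=count)
```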
This chapter forms an integral part of "The Multi-Threading Module" - should this module be implemented, this chapter along with any chapter constituting part of "The Multi-Threading Module" must be implemented in their entirety.
// - Sharable objects may be used across threads
// - Exclusive objects have more efficient implementations than
// sharable objects, but the behavior is undefined when used
// in multiple threads.
[subr mutex(v)] := {
method __copy__(),
method __final__(),
[method acquire()] := {
method __get__(),
method __set__(),
method __copy__(),
method __final__(),
},
}
The mutex() function creates a mutex which is a sharable object that can be
used across threads. The argument v will be an exclusive object protected by
the mutex.
The mutex protects its own internal state during __copy__ and __final__,
which makes it a sharable object.
Note: If implemented using reference counting, the __copy__ and __final__
methods of the mutex lock the underlying mutex before changing the count, and
unlock it afterwards.
The acquire() method of a mutex returns a "gift" object that can be used
for accessing v - when the function returns, it is guaranteed that the thread
in which it returns is the only thread holding the value protected by the mutex,
and that until the gift object goes out of scope, there should be no other
thread simultaneously using the value.
Note: The "gift" object is so named, that the exclusive gift is wrapped under a mutex, protected by it before being revealed to the acquiring thread.
The __get__() and the __set__() methods are used to access the object
protected by the mutex. When they're called with the string v as the key
argument, they respectively return and set the object protected by the mutex;
on all other values, they return null. Note that the object loses the
protection of the mutex if it does not go out of scope when the gift
object does.
The __copy__() and __final__() properties increment and decrement
respectively, a conceptual counter - this counter is initially set to 1 by
acquire() and any future functions that may be defined fulfilling a similar
role; when it reaches 0, the mutex is 'unlocked', allowing other threads to
acquire the value for use.
Note: A typical implementation of acquire() may lock a mutex, set the
conceptual counter to 1, then create and return a value native object. A typical
implementation of the __copy__() method may be as simple as just incrementing
the conceptual counter. A typical implementation of the __final__() method
may decrement the counter, and when it reaches 0, unlock the mutex.
Note: The conceptual counter is distinct from the reference count of any potential resources used by the value protected by the mutex and the mutex itself.
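The interplay between acquire(), the gift object, and the conceptual counter can be modeled in Python as below. This is a minimal sketch: the classes Mutex and Gift, the methods get, set, copy and final, and the attribute _holders are hypothetical names, and the value returned by set on success is an assumption.

```python
import threading


class Gift:
    """Models the gift object returned by acquire()."""

    def __init__(self, owner):
        self._owner = owner
        self._owner._holders = 1        # acquire() starts the conceptual counter at 1

    def get(self, key):
        return self._owner._value if key == "v" else None

    def set(self, key, value):
        if key != "v":
            return None
        self._owner._value = value
        return value

    def copy(self):
        self._owner._holders += 1       # one more holder within the owning thread
        return self

    def final(self):
        self._owner._holders -= 1
        if self._owner._holders == 0:
            self._owner._lock.release() # last holder gone: other threads may acquire


class Mutex:
    """Models a mutex protecting the exclusive value v."""

    def __init__(self, v):
        self._value = v
        self._lock = threading.Lock()
        self._holders = 0

    def acquire(self):
        self._lock.acquire()            # blocks until no other thread holds the value
        return Gift(self)
```

The counter here is kept separate from any reference count of the protected value, in line with the note above; it only tracks how many gift holders remain before the lock is released.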
[subr condvar(mtx)] := {
method __copy__(),
method __final__(),
method wait(),
method broadcast(),
method signal(),
}
The condvar() function creates a condition variable. It monitors a condition
associated with the states protected by the mutex identified by mtx.
Note: Condition variables are created associated with a mutex up front so that potential implementations using reference count can protect that counter with the mutex just like mutex instances. It is strongly advised that implementations use actual atomic reference counts where available if they were to use reference counting for resource management.
The wait() method of a condition variable instance does the following:
- mtx specified in the creation argument,
all in one single atomic step.
The broadcast() method of a condition variable signals a condition variable
and wakes up all threads that're waiting on it. The signal() method signals
the condition variable and wakes up an unspecified subset of threads blocked
on the condition variable - this subset shall not be empty if there are threads
waiting on the condition variable, and this method should typically be more
efficient than broadcast() when there's only 1 waiting thread.
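The usual way a condition variable is used with its mutex can be sketched with Python's threading.Condition, whose wait() atomically releases the underlying lock and blocks. The mapping to cxing is an assumption: notify() plays the role of signal(), notify_all() the role of broadcast(), and the names queue, consumer and producer are purely illustrative.

```python
import threading

queue = []
cond = threading.Condition(threading.Lock())   # bound to a mutex up front


def consumer(out):
    with cond:                      # hold the mutex while inspecting shared state
        while not queue:            # re-check the condition after every wake-up
            cond.wait()             # atomically releases the mutex and blocks
        out.append(queue.pop(0))


def producer(item):
    with cond:
        queue.append(item)
        cond.notify()               # like signal(); notify_all() is like broadcast()
```

Checking the condition in a loop keeps the consumer correct whether it starts waiting before or after the producer runs.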
[subr thrd_create(thrd_entry, thrd_param) | subr thrd_self()] := {
method join();
method detach();
method equals(thrd_hnd t2);
}
subr thrd_exit();
The thrd_create() function creates a thread with the thrd_entry as its
entry point, and thrd_param as its first and only argument. thrd_entry
MUST be a subroutine. Its return type is null. On success, a thrd_hnd
thread handle is returned, otherwise, null is returned.
The thrd_self() function returns the thread handle corresponding to the
current thread.
The thrd_exit() function causes the current thread to immediately terminate.
The join() method of a thrd_hnd blocks the calling thread until the thread
referred to by the thread handle terminates. The first such call on a
non-detached thread is supposed to succeed - implementations shall document the
underlying platform API behavior for it; subsequent calls may not necessarily
succeed. The detach() method of a thread handle detaches a thread, after
which, the thread may no longer be joinable, or be detached again. The return
values of these 2 functions are implementation-defined.
The equals() method returns true if the thread handle t2 refers to the
same thread as the thread handle on which the method is called, and false
otherwise.
The thrd_hnd shall be sharable across threads. The existence of a thread
handle does not imply that of the thread.
Note: The thread management facility is the bare minimum so that, first, it's
directly implementable using existing standard APIs; second, the thread
handle type thrd_hnd carries the least complexity, enabling it to be shared across
threads - although it's not explicitly specified as a sharable type, it shall
behave as such; and third, the usage flexibility makes higher level
constructions such as asynchronously completing subroutines, coroutines,
single-apartment proxy objects, etc. readily implementable in terms of
the minimal API.
Note: The thread handle type may be implemented as sharable by virtue of it being immutable. A technique of implementing it as sharable is documented at https://langdev.stackexchange.com/a/4633/1388.
This chapter forms an integral part of "The Input/Output Module" - should "The Input/Output Module" be implemented, this chapter along with any chapter constituting part of "The Input/Output Module" must be implemented in their entirety.
For the purpose of this chapter, the following definitions from the POSIX standard apply:
Additionally, a file handle is anything that can be used to operate on files. One file may have several file handles. This chapter defines several types of objects that're file handles.
When a file is operated on from separate handles, the behavior is undefined.
Note: For example, in C, when standard input is being read through a FILE *
handle, and buffering is enabled, the subsequent file position of the
file descriptor (if implemented on top of one) is undefined - this can cause
issues when one program subsequently loads another (e.g. using one of the exec
functions) and the loaded program proceeds from an unexpected file position.
This is among the few undefined behaviors in cxing, and we choose
to not define its behavior due to its usage being arcane and lacking practicality.
When a directory entry is created as a result of calling one of the functions that access the filesystem, barring security hardening by specific implementations of this module, and even though it's not a recommended practice, the called function should not place access restrictions beyond what's already placed by system defaults.
Note: As an example of what the previous paragraph means, function calls such
as mkdir, mkfifo, open, etc. should use the most liberal permission
on the created file - i.e. 0o777 for directories and 0o666 for non-executable
files according to POSIX, with the 'file mode creation mask' (i.e. umask)
clearing excess permissions as the said 'system default'. The previous
paragraph is normative only to the extent that it does not forbid current,
evolving security best practice.
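The effect of the file mode creation mask on the liberal request can be shown with a small Python calculation. The function name effective_mode and the two constants are illustrative only; the arithmetic follows the POSIX rule that bits set in the umask are cleared from the requested mode.

```python
def effective_mode(requested: int, umask: int) -> int:
    """Permission bits left after the file mode creation mask clears its bits."""
    return requested & ~umask & 0o777


DIR_MODE = 0o777    # most liberal request for directories
FILE_MODE = 0o666   # most liberal request for non-executable files
```

With a common umask of 0o022, a file requested with 0o666 ends up as 0o644 and a directory requested with 0o777 ends up as 0o755, so the restriction comes from the system default rather than from the called function.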
subr input();
subr print(s);
The input() function is a subroutine that reads a line from the standard input,
stripping a single trailing line-feed \n byte, then if there is one,
a trailing carriage-return \r byte, then returns the resulting string.
On EOF a blessed null that uncasts to 0 is returned; on error, a blessed null
that uncasts to an implementation-defined status code is returned.
TODO: This implementation-defined status code is expected to be that of
the errno number. Details of this part are being decided.
The print() function is a subroutine that writes the string argument s to
the standard output, followed by a single line-feed \n byte. On success,
the number of bytes successfully written is returned. A blessed null that
uncasts to an implementation-defined status code is returned on failure.
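The line-ending handling described for input() can be sketched in Python as follows. The function name strip_line_ending is hypothetical, and stripping a trailing carriage-return even when no line-feed was present is one reading of the text, labeled here as an assumption.

```python
def strip_line_ending(line: bytes) -> bytes:
    """Strip one trailing LF, then one trailing CR if present."""
    if line.endswith(b"\n"):
        line = line[:-1]
    if line.endswith(b"\r"):     # assumed to apply even without a preceding LF
        line = line[:-1]
    return line
```

Only a single line-feed is removed, so a line that ends with two line-feed bytes keeps the first one.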
GenericFile(obj) := {
method read(len),
method write(s),
method close(),
method flush(),
method setsync(b),
}
A GenericFile is the base type for file handle objects.
Its read method reads at most len bytes of data and returns it. On EOF, it
returns an empty string; on error, it returns a blessed null that uncasts to
an implementation-defined status code.
Its write method writes the string s to the file, and returns the number of
bytes actually written. On error, it returns a blessed null that uncasts to
an implementation-defined status code.
Its close method closes the file - any buffered content will be committed,
any resource consumed for operating the file will be released, and any further use
of the file handle is invalid and results in an error in an undefined way.
For any file, there may be several layers of buffering, two of which are defined here (the rest are given acknowledgement).
- flush method,
- setsync with true (or false).
The act of "committing" makes it more likely that future access to the data would succeed, such as writing data permanently to the disk. Further buffering, such as those done by routers and switches for network sockets, are out of the control of the program, and to some extent, the system.
subr open(path, mode);
RegularFile(GenericFile) := {
method lseek(offset, whence),
}
The open function is a subroutine that opens a file named by the path
argument, under the mode specified by the mode argument. The file to open
doesn't have to be a regular file, any type of file supported by the
implementation may be opened (e.g. FIFO, but not sockets).
The mode is made up of one of the following 4 major options:
- 0\r: open for reading only,
- 0\w: open for writing, truncate or create the file first,
- 0\R: open for both reading and writing,
- 0\W: open for both reading and writing, truncate or create the file first,
and modified by any combination of the following minor options:
- 0\a: open for appending - i.e. write to the end of the file,
- 0\e: the file handle won't be available to any program loaded by the current process (e.g. O_CLOEXEC - close-on-exec).
- 0\x: cause the open to fail if the file already exists - if the open was successful, other opens elsewhere shall not succeed.
The lseek method adds offset to the position indicated by whence, and
returns the resulting file position:
- 0\SET: from the beginning of the file - i.e. 0.
- 0\CUR: from the current position of the file.
- 0\END: from the end of the file.
The types of files in this section are required to support communicating in one direction; voluntary support for bidirectional communication is not required.
subr mkfifo(path);
subr pipe();
The mkfifo function creates a FIFO - i.e. a pipe with a filesystem name.
On success, it returns path; on failure, it returns a blessed null that
uncasts to an implementation-defined status code.
The pipe function creates an anonymous pipe, and returns an object
with 2 members:
- rd, the reading end of the pipe, and
- wr, the writing end of the pipe.
Both of which are file handles. On failure, it returns a blessed null that
uncasts to an implementation-defined status code.
subr rename(old, new);
subr remove(path);
The function rename renames the old directory entry to the new name.
On success, new is returned, otherwise, a blessed null that uncasts to
an implementation-defined status code is returned.
The function remove causes the directory entry path to be no longer
accessible. On success, it returns 0, otherwise, a blessed null that
uncasts to an implementation-defined status code is returned.
subr mkdir(path);
[subr opendir(path)] := {
method readdir(),
method rewinddir(),
method closedir(),
}
The mkdir function creates a directory reachable at path. On success, path
is returned, otherwise, a blessed null that uncasts to an
implementation-defined status code is returned.
The opendir function opens a directory to enumerate its entries. On success,
a directory handle is returned, otherwise, a blessed null that uncasts to
an implementation-defined status code is returned.
The readdir method returns a string naming the directory entry at the current
directory position, and advances the position. The directory position of a directory
handle is an opaque internal concept of the directory handle. The rewinddir
method resets the directory position to the state it was in when it was opened and
before any call to readdir was made.
The closedir function releases any resource used by the directory handle.
Any further use of the directory handle is invalid and results in an error
in an undefined way.
This chapter forms an integral part of "The Process Management Module" - should "The Process Management Module" be implemented, this chapter along with any chapter constituting part of "The Process Management Module" must be implemented in their entirety.
This module depends on "The Input/Output Module"; should this module be implemented, "The Input/Output Module" must also be implemented.
[subr CmdInterp()] := {
method Argv(v),
method Envp(v),
method ObtainPipeForStdin(),
method ObtainPipeForStdout(),
method ObtainPipeForStderr(),
method SetSourceForStdin(fp),
method SetDestForStdout(fp),
method SetDestForStderr(fp),
method SetCwd(path),
[method Exec()] := {
method __get__(k),
method Wait(),
method Terminate(),
method Kill(),
method Stop(),
method Continue(),
},
};
The CmdInterp function creates a preparation context used for executing a program.
The Argv method passes the argument v as an integer-keyed object consisting
of a set of strings as the "argument vector" (i.e. the argv parameter to the
C main function) to the context.
The Envp method passes the argument v as a string-keyed object consisting
of a set of strings as the "environment variables" (i.e. available through
the getenv function in C) to the context.
The ObtainPipeFor* functions create pipes and attach the appropriate reading
or writing end to the standard input/output/error of the child process, and
close the unused end in the respective process.
The SetSourceForStdin method sets the file handle fp as the reading source
for standard input of the new process. The SetDestFor* methods set fp
as the writing destination for standard output and standard error respectively.
The SetCwd method sets the initial value for the current working directory
for the new process.
On success, the Argv, Envp, ObtainPipeFor*, and Set{Source,Dest}For*
functions return the preparation context, allowing
successive operations to be chained. On error, a blessed null that uncasts
to an implementation-defined status code is returned.
The Exec method executes the program and returns a process handle on success;
otherwise, a blessed null that uncasts to an implementation-defined status code is returned.
The __get__ method of the process handle is used to retrieve a few
non-type-associated properties:
- infile: The writing end of the pipe for standard input - null if ObtainPipeForStdin wasn't called when creating the process.
- outfile: The reading end of the pipe for standard output - null if ObtainPipeForStdout wasn't called when creating the process.
- errfile: The reading end of the pipe for standard error - null if ObtainPipeForStderr wasn't called when creating the process.
The Wait method blocks the calling thread until the process referred to
by the process handle terminates, and returns its exit status.
The Terminate method terminates the process referred to by the process handle.
The Kill method serves a similar function, but does it more forcibly, without
giving a chance for the process to do any cleanup.
The Stop method and Continue method stop (i.e. pause) and continue the
execution of the process referred to by the process handle.
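The chained preparation context can be modeled in Python on top of the subprocess module. This is a sketch under assumptions: the class below is a hypothetical stand-in that implements only a few of the listed methods, error reporting through blessed nulls is not modeled, and the Popen object returned by Exec plays the role of the process handle, with its stdout attribute corresponding to outfile.

```python
import subprocess
import sys


class CmdInterp:
    """Hypothetical model of the preparation context as a chainable builder."""

    def __init__(self):
        self._argv, self._env = [], None
        self._stdin = self._stdout = self._stderr = None

    def Argv(self, v):
        self._argv = [v[i] for i in sorted(v)]   # integer-keyed object -> list
        return self                              # returning the context allows chaining

    def Envp(self, v):
        self._env = dict(v)
        return self

    def ObtainPipeForStdout(self):
        self._stdout = subprocess.PIPE
        return self

    def Exec(self):
        return subprocess.Popen(self._argv, env=self._env,
                                stdin=self._stdin, stdout=self._stdout,
                                stderr=self._stderr)
```

A call such as CmdInterp().Argv({0: sys.executable, 1: "-c", 2: "print('hi')"}).ObtainPipeForStdout().Exec() shows the chaining style; the child's output can then be read from the handle before waiting for its exit status.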
The goal of this section is to avoid ambiguity of identifiers in the global namespace - i.e. avoiding the same identifier with conflicting meanings.
To this end, "commonly-used" refers to the attribute of an entity where it's used so frequently that having a verbose spelling would hamper the readability of the code.
When an identifier consists of multiple words, the following terms are defined:
Identifiers in the global namespace that begin with an underscore followed by an uppercase letter are reserved for standardization by the language.
Identifiers which consist of fewer than 10 lowercase letters or digits are potentially reserved for standardization by the language, as keywords or as "commonly-used" library functions or objects. Although the use of the word "potentially" signifies that the reservation is not uncompromising, 3rd-party library vendors should nonetheless refrain from defining such terse identifiers in the global namespace.
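The two reservation rules can be checked mechanically. The following Python sketch is illustrative only: the function name classify and the three result strings are hypothetical, and whether an identifier is otherwise valid in cxing is not examined.

```python
import re

RESERVED = re.compile(r"_[A-Z]")                     # underscore, then an uppercase letter
POTENTIALLY_RESERVED = re.compile(r"[a-z0-9]{1,9}")  # fewer than 10 lowercase letters or digits


def classify(identifier: str) -> str:
    """Classify a global identifier against the two reservation rules."""
    if RESERVED.match(identifier):
        return "reserved"
    if POTENTIALLY_RESERVED.fullmatch(identifier):
        return "potentially reserved"
    return "free"
```

Under these rules a name such as thrd_create is not covered by either reservation, because the underscore takes it out of the terse lowercase category.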