=pod =encoding utf8 =head1 NAME Muldis::D::Dialect::PTMD_STD - How to format Plain Text Muldis D =head1 VERSION This document is Muldis::D::Dialect::PTMD_STD version 0.148.0. =head1 PREFACE This document is part of the Muldis D language specification, whose root document is L; you should read that root document before you read this one, which provides subservient details. =head1 DESCRIPTION This document outlines the grammar of the I standard dialect named C. The fully-qualified name of this Muldis D standard dialect is C. This dialect is designed to exactly match the Muldis D system catalog (the possible representation of Muldis D code that is visible to or updateable by Muldis D programs at runtime) as to what non-critical metadata it explicitly stores; so code in the C dialect should be round-trippable with the system catalog with the result maintaining all the details that were started with. Since it matches the system catalog, this dialect should be able to exactly represent all possible Muldis D base language code (and probably all extensions too), rather than a subset of it. That said, the C dialect does provide a choice of multiple syntax options for writing Muldis D value literals and DBMS entity (eg type and routine) declarations, so several very distinct C code artifacts may parse into the same system catalog entries. There is even a considerable level of abstraction in some cases, so that it is easier for programmers to write and understand typical C code, and so that this code isn't absurdly verbose. This dialect is designed to be as small as possible while meeting the above criteria, and is designed such that a parser that handles all of this dialect can be fairly small and simple. Likewise, a code generator for this dialect from the system catalog can be fairly small and simple. 
A significant quality of the C dialect is that it is designed to work easily for a single-pass parser, or at least a single-pass lexer; all the context that one needs to know for how to parse or lex any arbitrary substring of code is provided by prior code, or any required lookahead is just by a few characters in general. Therefore, a C parser can easily work on a streaming input like a file-handle where you can't go back earlier in the stream. Often this means a parser can work with little RAM. Also the dialect is designed that any amount of whitespace can be added or omitted next to most non-alphanumeric characters (which happen to be next to alphanumeric tokens) without that affecting the meaning of the code at all, except obviously for within character string literals. And long binary or character or numeric or identifier strings can be split into arbitrary-size substrings, without affecting the meaning. And many elements are identified by name rather than ordinal position, so to some degree the order they appear has no effect on the meaning. So programmers can easily format (separate, indent, linewrap, order) code how they like, and making an automated code reformatter shouldn't be difficult. Often, named elements can also be omitted entirely for brevity, in which case the parser would use context to supply default values for those elements. PTMD_STD has a I. Given that plain text is (more or less) universally unambiguously portable between all general purpose languages that could be used to implement a DBMS, it is expected that every single Muldis D implementation will natively accept input in the C dialect, which isn't dependent on any specific host language and should be easy enough to process, so it should be considered the safest official Muldis D dialect to write in by default, when you don't have a specific reason to use some other dialect. 
See also the dialects L and L, which are derived directly from C, and represent possible Perl 6 and 5 concrete syntax trees for it; in fact, most of the details in common with those other dialects are described just in the current file, for all 3 dialects. =head1 GENERAL STRUCTURE A C Muldis D code file consists just of a Muldis D depot definition, which begins with a language name declaration, and then has a C value literal defining the depot's catalog, and finally has, optionally, a C value literal defining the depot's data. This is conceptually what a C file is, and it can even be that literally, but C provides a canonical further abstraction for defining the depot's catalog, which should be used when doing data-definition. And so you typically use syntax resembling routine and type declarations in a general purpose programming language, where simply declaring such an entity will cause it to be part of the system catalog. Fundamentally every Muldis D depot is akin to a code library, and a Muldis D "main program" is nothing more than a depot having a procedure that is designated to execute automatically after a mount event of its host depot. As a special extension feature, a C Muldis D code file may alternately consist just of a (language-qualified) Muldis D value literal, which mainly is intended for use in mixed-language environments as an interchange format for data values between Muldis D and other languages. The grammar in this file is formatted as a hybrid between various BNF flavors and Perl 6 rules (see L for details on the latter) with further changes. It is only meant to be illustrative and human readable, and would need significant changes to actually be a functional parser, which are different for each parser toolkit. The grammar consists mainly of named I which define matching rules. 
Loosely speaking, each parser match of a token corresponds to a capture I or node element in the concrete syntax tree resulting from the parse; in practice, the parser may make various alterations to the match when generating a node, such as adding guide keywords corresponding to the token name, merging series of trivial tokens, or doing escaped character substitutions. No explicit capture syntax such as parentheses is used in the grammar.

To help understand the grammar in this file, here are a few guidelines:

1. The grammar is exactly the same as that of a Perl 6 rule except where these guidelines state otherwise; this includes that square brackets mean grouping not optionality, and that when multiple sub-pattern alternatives match, the one that is the longest wins.

2. The grammar portion that actually declares a token, that is what associates a token name with its definition body, is formatted like EBNF, as C<< ::= ... >> rather than the Perl 6 way like C or C.

3. Non-quoted whitespace is not significant and merely formats the grammar itself; whitespace rules in the grammar are instead spelled out explicitly.

4. Any tokens that have the same names as ones built in to Perl 6 but that are explicitly defined in this grammar may have different meanings from the Perl 6 ones.

The root grammar token for the entire dialect is C.

=head1 START

Grammar: ::= ^ ? [ | ] ? $

A C node has 2 ordered elements where the first element is a C node and the second element is either a C node or a C node.

See the pod sections in this file named L, L, and L, for more details about the aforementioned tokens/nodes.
When Muldis D is being compiled and invoked piecemeal, such as because the Muldis D implementing virtual machine (VM) is attached to an interactive user terminal, or the VM is embedded in a host language where code in the host language invokes Muldis D code at various times, many C may be fed to the VM directly for inter-language exchange, and not every one would then have its own C. Usually a C would be supplied to the Muldis D VM just once as a VM configuration step, which provides a context for further interaction with the VM that just involves Muldis D code that isn't itself qualified with a C. =head1 LANGUAGE NAME Grammar: ::= ':' ':' ':' ':' ::= Muldis_D ::= ::= ::= PTMD_STD ::= | ::= <[ a..z A..Z 0..9 _ - \. ]>+ ::= '"' [<[\ ..~]-["]> | '\\"']+ '"' ::= '{' ? catalog_abstraction_level ? '=>' ? ? ',' ? op_char_repertoire ? '=>' ? [? ',' ? standard_syntax_extensions ? => ? ]? [? ',']? ? '}' ::= the_floor | code_as_data | plain_rtn_inv | rtn_inv_alt_syn ::= basic | extended ::= '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' ::= '' I under C<< >> as a placeholder and that there are currently zero valid list items.> As per the VERSIONING pod section of L, code written in Muldis D must start by declaring the fully-qualified Muldis D language name it is written in. The C dialect formats this name as a C node having 5 ordered elements: =over =item C This is the Muldis D language base name; it is simply the bareword character string C. =item C This is the base authority; it is a character string formatted as per a specific-context C value literal, except that it must be nonempty and it is expressly limited to using non-control characters in the ASCII repertoire, and its nonquoted variant has fewer limitations than C's; it is typically the delimited character string C. =item C This is the base version number; it is a character string formatted as per C; it is typically a character string like C<0.148.0>. 
=item C This is the dialect name; it is simply the bareword character string C. =item C This is a set of chosen pragma/parser-config options, which is formatted similarly to a C SCVL. The only 2 mandatory pragmas are C (see the L pod section) and C (see L). The only optional pragma is C (see the L pod section). Other pragmas may be added later, which would likely be optional. =back Examples: Muldis_D:"http://muldis.com":0.148.0:PTMD_STD:{ catalog_abstraction_level => rtn_inv_alt_syn, op_char_repertoire => extended } Muldis_D:"http://muldis.com":0.148.0:PTMD_STD:{ catalog_abstraction_level => rtn_inv_alt_syn, op_char_repertoire => extended, standard_syntax_extensions => {} } =head1 CATALOG ABSTRACTION LEVELS The C pragma determines with a broad granularity how large the effective Muldis D grammar is that a programmer may employ with their Muldis D code. The catalog abstraction level of some Muldis D code is a measure of how much or how little that code would resemble the system catalog data that the code would parse into. The lower the abstraction level, the smaller and simpler the used Muldis D grammar is and the more like data structure literals it is; the higher the abstraction level, the larger and more complicated the Muldis D grammar is and the more like general-purpose-language typical code it is. There are currently 4 specified catalog abstraction levels, which when arranged from lowest to highest amount of abstraction, are: C, C, C, C. Every abstraction level has a proper superset of the grammar of every other abstraction level that is lower than itself, so for example any code that is valid C is also valid C, and so on. Choosing an abstraction level to write Muldis D code against is all a matter of trade-offs, perhaps mainly between advantages for Muldis D implementors and advantages for Muldis D users. 
Lower levels have benefits such as that it takes less programmer effort to create a Muldis D code parser or generator that just has to support that level, and such a parser/generator could be made more quickly and occupy a smaller resource footprint. On the other side, higher levels have benefits such as that any Muldis D code itself can be immensely more terse and readable (and writable), as well as have a much stronger resemblance to typical general-purpose programming languages, which also carries the benefit that a lot more of a programmer's preconceptions about what they should be able to write in a language are more likely to just work in Muldis D, and users can adopt it with less re-training. Essentially, lower abstraction levels are more like machine code while higher levels are more like human language. It may not need to be said that while a lower level may be for a Muldis D implementer an easier thing to make run, it would conversely tend to be more difficult for them to write a test suite for, being more verbose.

B Specifying the C pragma in a C node is mandatory, since there is no obvious abstraction level to use implicitly when one isn't specified.

=head2 the_floor

When the C<catalog_abstraction_level> pragma is C<the_floor>, then the following grammar definitions are in effect: ::= ::= ::=

This abstraction level exists more as an academic exercise and is not intended to actually be used. It is meant to be analogous to those academic programming languages whose main design goal, in addition to still being programmatically complete, is to have the absolute smallest grammar at all costs, also analogous to an extreme-RISC machine. This level is like C except that it has the absolute minimum of value literal syntaxes rather than all of them, essentially just having a single node kind apiece to cover all scalars, tuples, relations.
This level is also so minimal that many representation alternatives of the system catalog itself are being ignored, such as the more concise alternatives the system catalog itself provides to represent selectors of set/array/bag values or any system-defined scalar types not in terms of possreps.

Examples:

    Muldis_D:"http://muldis.com":0.148.0:PTMD_STD:{
        catalog_abstraction_level => the_floor,
        op_char_repertoire => basic
    }
    List:[3,
        List:[
            List:[1,List:[102,111,111,100]],
            List:[1,List:[113,116,121]],
        ],
        List:[
            List:[
                List:[4,
                    List:[
                        List:[1,List:[115,121,115]],
                        List:[1,List:[115,116,100]],
                        List:[1,List:[67,111,114,101]],
                        List:[1,List:[84,121,112,101]],
                        List:[1,List:[84,101,120,116]],
                    ],
                    List:[1,List:[110,102,100,95,99,111,100,101,115]],
                    List:[2,
                        List:[List:[1,List:[]]],
                        List:[List:[1,List:[67,97,114,114,111,116,115]]]
                    ]
                ],
                100
            ],
            List:[
                List:[4,
                    List:[
                        List:[1,List:[115,121,115]],
                        List:[1,List:[115,116,100]],
                        List:[1,List:[67,111,114,101]],
                        List:[1,List:[84,121,112,101]],
                        List:[1,List:[84,101,120,116]],
                    ],
                    List:[1,List:[110,102,100,95,99,111,100,101,115]],
                    List:[2,
                        List:[List:[1,List:[]]],
                        List:[List:[1,List:[75,105,119,105,115]]]
                    ]
                ],
                30
            ]
        ]
    ]

=head2 code_as_data

When the C<catalog_abstraction_level> pragma is C<code_as_data>, then the following grammar definitions are in effect: ::= ::= ::=

This abstraction level is the best one for when you want to write code in exactly the same form as it would take in the system catalog, and at the same time use all the relatively concise alternatives the system catalog itself provides for value literals and selectors. With this abstraction level, a depot consists simply of a language name plus one or two database value literals. The format for specifying a system catalog is exactly the same as the format for specifying the user data of a database. All a Muldis D parser/generator has to know is how to parse static Muldis D value literals and it's done.
That said, C includes all of the special grammar dealing with value literals, including those for many specific scalar or nonscalar types. This level is analogous to a high-level assembly language in a way; what you say in code is exactly what you get in the system catalog, but your code would be too verbose for the tastes of someone preferring normal high-level language code. Code written to the C level can employ all of the language grammar constructs described in these main pod sections: L, L, L. Examples: Muldis_D:"http://muldis.com":0.148.0:PTMD_STD:{ catalog_abstraction_level => code_as_data, op_char_repertoire => basic } @:{ { food => 'Carrots', qty => 100 }, { food => 'Kiwis', qty => 30 } } Muldis_D:"http://muldis.com":0.148.0:PTMD_STD:{ catalog_abstraction_level => code_as_data, op_char_repertoire => basic } depot-catalog Database:Depot:{ functions => @:{ { name => Name:cube, material => %:Function:{ result_type => PNSQNameChain:Int, params => @:NameTypeMap:{ { name => Name:topic, type => PNSQNameChain:Int } }, expr => Database:ExprNodeSet:{ sca_val_exprs => @:{ { name => Name:INT_3, value => 3 } }, func_invo_exprs => @:{ { name => Name:"", function => PNSQNameChain:Integer.power, args => @:NameExprMap:{ { name => Name:radix, expr => Name:topic }, { name => Name:exponent, expr => Name:INT_3 } } } } } } } } } =head2 plain_rtn_inv When the C pragma is C, then the following grammar definitions are in effect: ::= ::= ::= ::= ::= This abstraction level is the lowest one that can be recommended for general use, and every Muldis D implementation that is expected to be directly used by programmers (in contrast to its main use just being by way of wrapper APIs or code generators) should support at least this level, even if that implementation is being touted as "minimal". This abstraction level has the simplest grammar that could reasonably be considered as like that of a general purpose programming language. 
Unlike the C level, the C level makes everything that isn't conceptually a value literal or selector look like typical routine or type declarations or value expressions or statements, just as programmers typically expect. One of Muldis D's primary features is that, as much as possible, the system-defined language features are defined in terms of ordinary types and routines. This means for one thing that users are empowered to create their own types and routines with all of the capabilities, flexibility, and syntax as the language's built-in features have. This also means that it should be relatively simple to parse Muldis D code because the vast majority of language features don't have their own special syntax to account for, and the L syntax covers most of them, in terms of the common prefix/polish notation that in practice most invocations of user-defined routines are formatted as anyway. The C abstraction level is all about having code that looks like general purpose programming language code but that everything looks like user-defined routines and types. The code is mostly just nested invocations of functions or procedures in basic polish notation, and both that code and material declarations have a C-language-like syntax. It is expected that every Muldis D implementation which supports at least the C level will, as much as is reasonably possible, preserve all non-behaviour-affecting metadata that is directly supported for storage by the system catalog itself, as described in L. Primarily this means preserving non-value code comments, and preserving the declared relative ordinal position of code elements. Code written to the C level can employ all of the language grammar constructs that C can, plus all of those described in these main pod sections: L, L, L. 
Examples: Muldis_D:"http://muldis.com":0.148.0:PTMD_STD:{ catalog_abstraction_level => plain_rtn_inv, op_char_repertoire => basic } depot-catalog { function cube (Int <-- topic : Int) { Integer.power( radix => topic, exponent => 3 ) } } =head2 DEPRECATED - rtn_inv_alt_syn B catalog abstraction level as it currently exists is deprecated and will disappear in the near future. Other pending enhancements to the language in both the system catalog itself and in the C level will make the latter more capable and suitable by itself for normal use. A new highest level or 3 will probably appear in place of C later for their still-unique useful features.> When the C pragma is C, then the following grammar definitions are in effect: ::= ::= ::= ::= ::= This abstraction level is the highest one and is the most recommended one for general use, assuming that all the Muldis D implementations you want to use support it. The expectation is that, in general, minimal Muldis D implementations won't support it but non-minimal ones would, so code written to it may not be the most universally portable as-is but should be portable in most common environments. In practice a huge payoff of improved user code brevity and readability (and writability) is gained by the C abstraction level over the C level by adding special syntax for a lot of commonly used built-in routines, such as infix syntax for common math operators or postcircumfix syntax for attribute accessors. The tradeoff for this user code brevity is a significant amount of extra complexity in parsers, due to all the extra special cases, though this complexity can be mitigated somewhat by standardizing these additions in format where possible. These 2 highest levels both look like a general purpose programming language, but C is a lot more concise. 
In particular, C is probably the I Muldis D dialect that conceivably can match or beat the conciseness of a majority of general purpose programming languages, and would probably be the most preferred abstraction level for developers. This fact would also help to drive a majority of implementations to support this greatest complexity level. And even then, this most complex of standard Muldis D grammars still generally has simpler grammar rules than a lot of general languages, even if this difference is more subtle. It certainly is a simpler and easier grammar to parse than SQL in its general case.

Code written to the C level can employ all of the language grammar constructs that C can, plus all of those described in these main pod sections: L, L.

Examples:

    Muldis_D:"http://muldis.com":0.148.0:PTMD_STD:{
        catalog_abstraction_level => rtn_inv_alt_syn,
        op_char_repertoire => basic
    }
    depot-catalog {
        function cube (Int <-- topic : Int) {
            topic exp 3
        }
    }

=head1 OPERATOR CHARACTER REPERTOIRE

The C<op_char_repertoire> pragma determines primarily whether or not the various routine invocation alternate syntaxes, herein called I, may be composed of only ASCII characters or also other Unicode characters, and this pragma determines secondarily whether or not a few special value literals (effectively nullary operators) composed of non-ASCII Unicode characters may exist. The pragma also determines whether or not any nonquoted DBMS entity names in the general case may contain non-ASCII Unicode alphanumeric characters. I

There are currently 2 specified operator character repertoires: C<basic>, C<extended>. The latter is a proper superset of the former.

The C pragma is generally orthogonal to the C pragma, so you can combine any value of the latter with any value of the former. However, in practice the operator character repertoire setting will have no effect at all when the catalog abstraction level is C, and it will otherwise have very little effect except when the catalog abstraction level is C.
To be specific, what the C pragma primarily affects is special operator call syntaxes provided only by C, and what the former secondarily affects is special value literals provided by C plus greater catalog abstraction levels. Specifying the C pragma in a C node is mandatory, since there is no obviously best setting to use implicitly when one isn't specified. =head2 basic The C operator character repertoire is the smallest one, and it only supports writing the proper subset of defined operator invocations and special value literals that are composed of just 7-bit ASCII characters. This repertoire can be recommended for general use, especially since code written to it should be the most universally portable as-is (with respect to operator character repertoires), including full support even by minimal Muldis D implementations and older text editors. When the C pragma is C, then the following grammar definitions are in effect: ::= ::= ::= ::= ::= ::= ::= ::= ::= =head2 extended The C operator character repertoire is the largest one, and it supports the entire set of defined operator invocations and special value literals, many of which are composed of Unicode characters outside the 7-bit ASCII repertoire. This is the most recommended repertoire for general use, assuming that all the Muldis D implementations and source code text editors you want to use support it. The expectation is that, in general, minimal Muldis D implementations and older text editors won't support it but non-minimal ones would, so code written to it may not be the most universally portable as-is but should be portable in most common and modern environments. 
In practice the main payoff of C is that user code can exploit the wide range of symbols that Unicode provides which are the canonical means of writing various math or logic or relational et al operators in the wider world, and which programmers would likely have written with all along if it weren't for the large limitations of legacy computer systems which practically forced them to use various approximations instead. While you can always write with ASCII approximations, using C means you often don't have to, and your code can be a lot more readable as a result, at least to the practitioners of the domains that the symbols come from, and the code is otherwise more terse and arguably appears more attractive. When the C pragma is C, then the following grammar definitions are in effect: ::= ::= ::= ::= ::= ::= ::= ::= ::= =head1 STANDARD SYNTAX EXTENSIONS The C pragma declares which optional portions of the Muldis D grammar a programmer may employ with their Muldis D code. There are currently no specified standard syntax extensions. These are all mutually independent and any or all may be used at once. While each I is closely related to a I, you can use the latter's types and routines without declaring the former; you only declare you are using a I if you want the Muldis D parser to recognize special syntax specific to those types and routines, and otherwise you just use them using the generic syntax provided for all types and routines. The C pragma is generally orthogonal to the C pragma, so you can combine any value of the latter with any value-list of the former. However, in practice all standard syntax extensions will have no effect when the catalog abstraction level is C, and some of their features may only take effect when the catalog abstraction level is C, as is appropriate. Specifying the C pragma in a C node is optional, and when omitted it defaults to the empty set, meaning no extensions may be used. 
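The pragma rules described in the preceding sections (two mandatory pragmas, one optional pragma that defaults to the empty set and currently has no valid members) can be checked mechanically. The following is a minimal Python sketch rather than anything defined by this specification: the function name C<parse_language_name>, the regular expressions, and the shape of the returned dictionary are all illustrative assumptions, and the sketch assumes no whitespace around the C<:> separators, as in the examples in this document.

```python
import re

# Values taken from the grammar alternatives listed in this document.
LEVELS = ("the_floor", "code_as_data", "plain_rtn_inv", "rtn_inv_alt_syn")
REPERTOIRES = ("basic", "extended")
KNOWN_PRAGMAS = {"catalog_abstraction_level", "op_char_repertoire",
                 "standard_syntax_extensions"}

# One name element: nonquoted, or double-quoted printable ASCII with \" escapes.
_ELEM = r'(?:[A-Za-z0-9_.\-]+|"(?:\\"|[ !#-~])+")'
_LN = re.compile(
    r'Muldis_D:(?P<authority>' + _ELEM + r'):(?P<version>' + _ELEM + r')'
    r':PTMD_STD:\{(?P<pragmas>.*)\}', re.S)
# A pragma is "name => bareword" or "name => { ... }" (no nested braces).
_PRAGMA = re.compile(r'(\w+)\s*=>\s*(\{[^{}]*\}|\w+)')


def parse_language_name(text):
    """Hypothetical helper: split and validate a PTMD_STD language name."""
    m = _LN.fullmatch(text.strip())
    if m is None:
        raise ValueError("not a PTMD_STD language name")
    body = m.group("pragmas")
    pragmas = dict(_PRAGMA.findall(body))
    # Anything besides pragma pairs, commas and whitespace is an error.
    if _PRAGMA.sub("", body).replace(",", "").strip():
        raise ValueError("malformed pragma list")
    if set(pragmas) - KNOWN_PRAGMAS:
        raise ValueError("unknown pragma")
    level = pragmas.get("catalog_abstraction_level")
    if level not in LEVELS:
        raise ValueError("catalog_abstraction_level is mandatory")
    repertoire = pragmas.get("op_char_repertoire")
    if repertoire not in REPERTOIRES:
        raise ValueError("op_char_repertoire is mandatory")
    # Optional pragma; defaults to the empty set, and no extensions exist yet.
    extensions = pragmas.get("standard_syntax_extensions", "{}")
    if extensions.strip("{} \t\n"):
        raise ValueError("no standard syntax extensions are currently defined")
    return {
        "authority": m.group("authority"),
        "version": m.group("version"),
        "catalog_abstraction_level": level,
        "op_char_repertoire": repertoire,
        "standard_syntax_extensions": set(),
    }
```

Passing either of the language name examples from the LANGUAGE NAME section returns a dictionary of the five elements, while a declaration that omits one of the two mandatory pragmas raises C<ValueError>.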
=head1 VALUE LITERALS AND SELECTORS Grammar: ::= | ::= | ::= | | | | | | | | | | | ::= | | | | | | | | | | A C node is a Muldis D value literal, which is a common special case of a Muldis D value selector. Unlike value selectors in general, which must be composed beneath a C because they actually represent a Muldis D value expression tree of a routine or type definition, a C node does I represent an expression tree, but rather a value constant; by definition, a C can be completely evaluated at compile time. A C node with a C second element is hence just a serialized Muldis D value. The PTMD_STD grammar subsection for value literals (having the root grammar token C) is completely self-defined and can be used in isolation from the wider grammar as a Muldis D sub-language; for example, a hosted-data Muldis D implementation may have an object representing a Muldis D value, which is initialized using code written in that sub-language. Every grammar token, and corresponding capture node, representing a Muldis D value literal is similarly formatted and has 1-3 elements; the following pod section L describes the similarities once for all of them, in terms of an alternate C token definition which is called C. And then the other pod sections specific to each kind of value literal then just focus on describing their unique aspects, namely their I. An C node represents a conceptually opaque Muldis D value, such that every one of these values is defined with its own literal syntax that is compact and doesn't look like a collection of other nodes; this includes the basic numeric and string literals. A C node represents a conceptually transparent Muldis D value, such that every one of these values is defined visibly in terms of a collection of other nodes; this includes the basic tuple and relation selectors. 
=head2 Value Literal Common Elements A I (or I) is a value literal that can be properly interpreted in a context that is expecting I value but has no expectation that said value belongs to a specific data type; in the general case, a GCVL includes explicit I metadata (such as, "this is an C" or "this is a C"); but with a few specific data types (see the C node description for details) that metadata may be omitted for brevity because the main literal has mutually uniquely identifying characteristics. For example, each element of a generic Muldis D collection value, such as a member of an array or tuple, could potentially have any type at all. In contrast, a I (or I) is a value literal that does not include explicit value kind metadata, even when the main literal doesn't have uniquely identifying characteristics, because the context of its use supplies said metadata. For example, in a tuple value literal it is assumed that a value literal in an attribute name position must denote a C. The grammar token C|C denotes a GCVL, as do most short-named grammar tokens, like C or C; in contrast, a grammar token containing C denotes a SCVL, like C or C. Every GCVL has 1-3 elements, illustrated by this grammar: ::= [ ':' [ ':' ]? ]? ::= Singleton | Bool | Order | RoundMeth | Int | NNInt | PInt | Rat | NNRat | PRat | Blob | OctetBlob | Text | Name | NameChain | PNSQNameChain | RatRoundRule | DH? Scalar | '$' | DH? Tuple | '%' | Database | DH? Relation | '@' | DH? Set | DH? [Maybe | Just] | DH? Array | DH? Bag | DH? SPInterval | DH? MPInterval | List ::= ::= | | | | | | | | | | | | | | | | | | | | | | So a C|C node has 1-3 elements in general: =over =item C This is a character string of the format C<< <[A..Z]> <[ a..z A..Z ]>+ >>; it identifies the data type of the value literal in broad terms and is the only external metadata of C generally necessary to interpret the latter; what grammars are valid for C depend just on C. 
For all values of just the 10 data types [C, C, C, C, C, C, C, C, C, C], the C portion of a GCVL may be omitted for brevity, but the code parser should still be able to infer it easily by examining the first few characters of the C, which for each of said data types has a mutually uniquely identifying format, which is also distinct from all possible C. Note that, for the purposes of this discussion, the C type is subsumed into the C type.

For many values of the 3 data types [C, C<[S|M]PInterval>], the C portion of a GCVL may be omitted for brevity; specifically, this may be done just for the [C, C<[S|M]PInterval>] GCVL whose C are not valid C of a C GCVL. For a C GCVL, all of those formatted as C<< => >> separated pairs may have their C omitted, while all of those that are not formatted as pairs may not. For a C GCVL, all of those that are formatted as a range pair with C<..>/etc may have their C omitted, while all of those formatted using the single value shorthand with no C<..>/etc may not. For a C GCVL, all of those that are formatted as a comma-delimited list with at least 2 list elements, where at least one of those elements is formatted as a range pair with C<..>/etc, may have their C omitted, while all of those either having 0..1 list elements or having just single value shorthand elements with no C<..>/etc may not.

Note that omission of C is only allowed when the GCVL doesn't include a C element.

For just these certain special values of other data types, the same option of omitting the C (and C) applies: C, C, C.

=item C

This is a Muldis D data type name, for example C; it identifies a specific subtype of the generic type denoted by C, and serves as an assertion that the Muldis D value denoted by C is a member of the named subtype.
Iff C is C<[|DH]Scalar> then C is mandatory; otherwise, C is optional for all C, except that C must be omitted when C is one of the 3 [C, C, C]; this isn't because those 3 types can't be subtyped, but because in practice doing so isn't useful. How a Muldis D parser treats a C node with a C element depends on the wider context. In the general case where the C is an C beneath the context of a C node, the C is treated as if it had an extra parent C node that invokes the C function and whose 2 argument nodes are as follows: C gets the C without the C element, and C gets the C element. This means that in general the C assertion is done at runtime. In the common special case where both C is an C and C refers to a system-defined type, then the C assertion is done at compile time, and then the C element is simply eliminated, so the C ends up simply as itself with no new C parent. In another common special case, iff a C node with a C element is a C and its C names a system-defined tuple type or relation type with a specified set of attributes, then the parser will automatically generate any missing attribute values of the C node, where each has the default value of its declared type as per C. This will be done prior to the other use of C which applies a constraint, so the latter acts as if the original code had specified the missing attributes. In the case of the type being a relation type, the relation value literal doesn't even need to be well-formed (have the attributes per tuple) in the code, as the attribute generation is done per tuple. Since it only works for system-defined types, this special case is primarily useful for code involving values that represent code. =item C This is mandatory for all C. =back For GCVL and SCVL examples, see the subsequent documentation sections. 
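The rules above about which of the 1-3 GCVL elements may or must be present can be summarized as a small validity check. The following Python sketch is illustrative only: the function name C<check_gcvl_elements> is hypothetical, the sigil aliases such as C<$> are not handled, and the 3 value kinds that must omit a type name are assumed here to be C<Singleton>, C<Bool> and C<Order>, which is an inference from the grammars of those literals lacking a type name slot rather than something this section states by name.

```python
# Value kinds for which a type name is mandatory ("[|DH]Scalar").
REQUIRES_TYPE_NAME = {"Scalar", "DHScalar"}
# ASSUMPTION: inferred from the per-literal grammars, not named in the text.
FORBIDS_TYPE_NAME = {"Singleton", "Bool", "Order"}


def check_gcvl_elements(value_kind, type_name, payload):
    """Return how many of the 1-3 GCVL elements are present, or raise.

    Each argument is a string, or None when that element is omitted.
    """
    if payload is None:
        raise ValueError("the payload is mandatory for every value kind")
    if value_kind is None:
        # The value kind may only be left out when no type name is given.
        if type_name is not None:
            raise ValueError("a type name requires an explicit value kind")
        return 1
    if value_kind in REQUIRES_TYPE_NAME and type_name is None:
        raise ValueError("a type name is mandatory for " + value_kind)
    if value_kind in FORBIDS_TYPE_NAME and type_name is not None:
        raise ValueError("a type name must be omitted for " + value_kind)
    return 2 if type_name is None else 3
```

Whether a payload with an omitted value kind actually has one of the self-identifying formats is a separate question that this sketch does not attempt to answer.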
=head1 OPAQUE VALUE LITERALS See also the definition of the catalog data type C, a tuple of which is what every kind of C node distills to when it is beneath the context of a C node, as it describes some semantics. =head2 Singleton Literals Grammar: ::= [Singleton ':' ]? ::= '-Inf' | Inf ::= | '-∞' | '∞' A C node represents a value of any of the singleton scalar types that C is a union over. Some of the keywords are aliases for each other: keyword | aliases --------+-------- -Inf | -∞ Inf | ∞ These are the singleton types corresponding to the keywords: -Inf -> sys.std.Core.Type.Cat."-Inf" Inf -> sys.std.Core.Type.Cat.Inf Examples: Singleton:-Inf ∞ =head2 Boolean Literals Grammar: ::= [Bool ':' ]? ::= False | True ::= | ⊥ | ⊤ A C node represents a logical boolean value. It is interpreted as a Muldis D C value as follows: The C is a bareword character string formatted as per a C SCVL, and it maps directly to the matching unqualified declared name of one of the C singleton types that the C type is defined as a union over. Some of the keywords are aliases for each other: keyword | aliases --------+-------- False | ⊥ True | ⊤ Examples: Bool:True False ⊤ ⊥ =head2 Order-Determination Literals Grammar: ::= [Order ':' ]? ::= Increase | Same | Decrease An C node represents an order-determination. It is interpreted as a Muldis D C value as follows: The C is a bareword character string formatted as per a C SCVL, and it maps directly to the matching unqualified declared name of one of the C singleton types that the C type is defined as a union over. Examples: Order:Same Decrease =head2 Rounding Method Literals Grammar: ::= [ RoundMeth ':' [ ':' ]? ]? ::= Down | Up | ToZero | ToInf | HalfDown | HalfUp | HalfToZero | HalfToInf | HalfEven A C node represents a rounding method. 
It is interpreted as a Muldis D C value as follows: The C is a bareword character string formatted as per a C SCVL, and it maps directly to the matching unqualified declared name of one of the C singleton types that the C type is defined as a union over. Examples: RoundMeth:HalfUp ToZero =head2 General Purpose Integer Numeric Literals Grammar: ::= [ [Int | NNInt | PInt] ':' [ ':' ]? ]? ::= '#' | | ::= ::= 0<[bodx]> ::= 0 | '-'? ::= 0 | ::= ? ::= <[ 1..9 A..Z a..z ]> ::= [[_?<[ 0..9 A..Z a..z ]>+]+] ** ::= 0 | '-'? ::= 0 | ::= ? ::= <[ 1..9 ]> ::= [[_?<[ 0..9 ]>+]+] ** An C node represents an integer numeric value. It is interpreted as a Muldis D C value as follows: If the C is composed of a C plus C, then the C is interpreted as a base-I integer where I might be between 2 and 36, and the C says which possible value of I to use. Assuming all C column values are between zero and I-minus-one, the C contains that I-minus-one. So to specify, eg, bases [2,8,10,16], use C of [1,7,9,F]. Using a C is a recommended alternative for using a C when the former can be used, which is when the C would be one of [1,7,9,F]; in those cases, [0b,0o,0d,0x] correspond respectively, and the rules for the C are the same. If the C is a C, then it is interpreted as a base 10 integer. Fundamentally the I part of an C node consists of a string of digits and plain uppercased or lowercased letters, where each digit (C<0..9>) represents its own number and each letter (C) represents a number in [10..35]. A I may optionally contain underscore characters (C<_>), which exist just to help with visual formatting, such as for C<10_000_000>, and these are ignored/stripped by the parser. 
A I may optionally be split into 1..N segments where each pair of consecutive segments is separated by a I token, which is a pair of backslashes (C<\>) surrounding a run of whitespace; this segmenting ability is provided to support code that contains very long numeric literals while still being well formatted (no extra long lines); the I tokens are also ignored/stripped by the parser, and the I is interpreted as if all its alphanumeric characters were contiguous. If the C of a C node is C or C rather than C, then the C node is interpreted simply as an C node whose C is C or C, and the allowed I is appropriately further restricted. Examples: Int:0b11001001 #`binary`# 0o0 #`octal`# 0o644 #`octal`# -34 #`decimal`# 42 #`decimal`# 0xDEADBEEF #`hexadecimal`# Z#-HELLOWORLD #`base-36`# 3#301 #`base-4`# B#A09B #`base-12`# =head2 General Purpose Rational Numeric Literals Grammar: ::= [ [Rat | NNRat | PRat] ':' [ ':' ]? ]? ::= '#' | | ::= '.' | '/' | '*' '^' ::= '.' | '/' | '*' '^' A C node represents a rational numeric value. It is interpreted as a Muldis D C value as follows: Fundamentally a C node is formatted and interpreted like an C node, and any similarities won't be repeated here. The differences of interpreting a C being composed of a C or C plus C versus the C being a C are as per the corresponding differences of interpreting an C. Also interpreting a C or C is as per a C or C. If the I part of a C node contains a radix point (C<.>), then it is interpreted as is usual for a programming language with such a literal. If the I part of a C node contains a solidus (C), then the rational's value is interpreted as the leading integer (a numerator) divided by the trailing positive integer (a denominator); that is, the two integers collectively map to the C possrep of the C type. 
If the I part of a C node contains an asterisk (C<*>) plus a circumflex accent (C<^>), then the rational's value is interpreted as the leading integer (a mantissa) multiplied by the result of the middle positive integer (a radix) taken to the power of the trailing integer (an exponent); that is, the three integers collectively map to the C possrep of the C type. Examples: Rat:0b-1.1 -1.5 #`same val as prev`# 3.14159 A#0.0 0xDEADBEEF.FACE Z#0.000AZE Rat:6#500001/1000 B#A09B/A Rat:0b1011101101*10^-11011 45207196*10^37 1/43 314159*10^-5 =head2 General Purpose Binary String Literals Grammar: ::= [ [Blob | OctetBlob] ':' [ ':' ]? ]? ::= '#' | ::= <[137F]> ::= 0<[box]> ::= '\'' <[ 0..9 A..F a..f _ \s ]>* '\'' A C node represents a general purpose bit string. It is interpreted as a Muldis D C value as follows: Fundamentally the I part of a C node consists of a delimited string of digits and plain uppercased or lowercased letters, where each digit (C<0..9>) represents its own number and each letter (C) represents a number in [10..15]; this string is qualified with a C character (C<[137F]>) or a C (C<[0b,0o,0x]>), similarly to how an C is qualified by a C or C. Each character of the delimited string specifies a sequence of one of [1,2,3,4] bits, depending on whether C is [1|0b,3,7|0o,F|0x]. The I may also contain underscore or whitespace characters between the delimiters, to aid formatting; these are ignored/stripped by the parser, and the I is interpreted as if it just consisted of the rest of the delimited string contiguously. If the C of a C node is C rather than C, then the C node is interpreted simply as a C node whose C is C, and the delimited string is appropriately further restricted. Examples: Blob:0b'00101110100010' #`binary`# 3#'' 0x'A705E' #`hexadecimal`# 0o'523504376' =head2 General Purpose Character String Literals Grammar: ::= [ Text ':' [ ':' ]? ]?
::= '\'' [<-[\']> | ]* '\'' ::= '\\\\' | '\\\'' | '\\"' | '\\`' | '\\t' | '\\n' | '\\f' | '\\r' | '\\c<' [ [<[ A..Z ]>+] ** ' ' | [0 | <[ 1..9 ]> <[ 0..9 ]>*] | <[ 1..9 A..Z a..z ]> '#' [0 | <[ 1..9 A..Z a..z ]> <[ 0..9 A..Z a..z ]>*] | 0<[ bodx ]> [0 | <[ 1..9 A..F a..f ]> <[ 0..9 A..F a..f ]>*] ] '>' ::= '\\' ? '\\' ::= '\\' \s* '\\' ::= \s+ [[ | ] \s+]* ::= '#' \s* '`' \s* [<-[\`]> | ]* \s* '`' \s* '#' ::= '#' ** 2..* A C node represents a general purpose character string. It is interpreted as a Muldis D C value as follows: The C is interpreted generally as is usual for a programming language with such a delimited character string literal. A C may contain any literal characters at all, except that any literal occurrences of a backslash (C<\>) or single-quote (C<'>) must have a leading backslash. Every run of 1+ literal whitespace or control characters, that is not composed just of the C char (C<0x20>), is substituted for a single C character by the parser, and the C is interpreted as if the post-substitution string had been the original string. However, if said run of whitespace is immediately preceded by an escape sequence denoting a whitespace or control character, then the run is simply stripped rather than a C taking its place. The main reason for this substitution/stripping feature is to ensure that the actual values being selected by string literals are not variable per the kind of linebreaks or indenting used to format the Muldis D source code itself. The feature is provided to support code that contains long value literals while still being well formatted (no extra-long lines). If you want to have actual non-C whitespace or control characters in your strings, then they must be formatted as escape sequences such as C<\n>. If you want to end up with multiple C characters at the point where a line is broken, you have to format some as escape sequences. 
If you want to end up with no C at all where a line is broken, then you'll have to employ some other workaround, such as catenating several quoted strings.

All Muldis D delimited character string literals (generally the 3 C, C, code comments) may contain some characters denoted with escape sequences rather than literally. The Muldis D parser substitutes the escape sequences with the characters they represent, so the resulting character string values don't contain those escape sequences.

Currently there are 2 classes of escape sequences, called I<simple> and I<complex>. The meanings of the simple escape sequences are:

    Esc | Unicode   | Unicode         | Chr | Literal character used
    Seq | Codepoint | Character Name  | Lit | for when not escaped
    ----+-----------+-----------------+-----+------------------------------
    \\  | 0x5C      | REVERSE SOLIDUS | \   | esc seq lead (aka backslash)
    \'  | 0x27      | APOSTROPHE      | '   | delim Text literals
    \"  | 0x22      | QUOTATION MARK  | "   | delim quoted Name literals
    \`  | 0x60      | GRAVE ACCENT    | `   | delim for code comments
    \t  | 0x9       | CHAR... TAB...  |     | control char horizontal tab
    \n  | 0xA       | LINE FEED (LF)  |     | ctrl char line feed / newline
    \f  | 0xC       | FORM FEED (FF)  |     | control char form feed
    \r  | 0xD       | CARR. RET. (CR) |     | control char carriage return

There is currently just one complex escape sequence, of the format C<< \c<...> >>, that supports specifying characters in terms of their Unicode abstract codepoint name or number. If the C<...> consists of just uppercased (not lowercased) letters and the space character, then the C<...> is interpreted as a Unicode character name. If the C<...> looks like an C, except that underscores and unspace aren't allowed here, then the C<...> is interpreted as a Unicode abstract codepoint number. One reason for this feature is to empower more elegant passing of Unicode-savvy PTMD_STD source code through a communications channel that is more limited, such as to 7-bit ASCII.
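The escape-sequence rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: the names C<decode_escapes> and C<codepoint_number> are hypothetical, Unicode names are resolved with Python's C<unicodedata> module, and the whitespace substitution/stripping rules for string literals are deliberately left out.

```python
import re
import unicodedata

# Simple escape sequences, keyed by the character after the backslash.
SIMPLE = {"\\": "\\", "'": "'", '"': '"', "`": "`",
          "t": "\t", "n": "\n", "f": "\f", "r": "\r"}

# Radix prefixes and the numeric base each one selects.
PREFIX_BASE = {"0b": 2, "0o": 8, "0d": 10, "0x": 16}

def codepoint_number(body):
    """Interpret the inside of \\c<...> as a codepoint number (hypothetical helper)."""
    if body[:2] in PREFIX_BASE and len(body) > 2:
        return int(body[2:], PREFIX_BASE[body[:2]])
    if "#" in body:
        # The part before '#' is the largest digit of the base, so base = digit + 1.
        max_digit, digits = body.split("#", 1)
        return int(digits, int(max_digit, 36) + 1)
    return int(body, 10)

ESCAPE = re.compile(r"\\(?:c<([^>]*)>|(.))", re.S)

def decode_escapes(raw):
    """Replace simple and \\c<...> escapes in the text between the delimiters."""
    def repl(match):
        body, simple = match.group(1), match.group(2)
        if body is None:
            if simple not in SIMPLE:
                raise ValueError("unknown escape sequence: \\" + simple)
            return SIMPLE[simple]
        if re.fullmatch(r"[A-Z ]+", body):
            # Only uppercase letters and spaces: treat as a Unicode character name.
            return unicodedata.lookup(body)
        return chr(codepoint_number(body))
    return ESCAPE.sub(repl, raw)

if __name__ == "__main__":
    print(decode_escapes(r"\c<LATIN SMALL LETTER A>\c<0x263A>\c<65>"))
```

A real parser would apply this step together with the whitespace handling described earlier, and would reject underscores inside the numeric form; the sketch simply defers to Python's C<int> for digit validation.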
Examples: Text:'Ceres' 'サンプル' '' 'Perl' '\c\c<0x263A>\c<65>' A C node is strictly not part of the code proper; Muldis D code can contain these almost anywhere as metadata for the code, and in large part it is treated as if it were part of the insignificant whitespace; that all being said, generally speaking any C is retained in the parse tree adjusted to live in the contextually nearest place where a resulting system catalog node has a C attribute.
Syntactically, a C node differs from C only in that it is delimited by number-signs/hash-marks in addition to backticks/grave-accents. A C is a run of 2+ C<#> that may be used in all of the same places as a C but it does I denote a comment and will be stripped out by the parser as if it was insignificant whitespace. This feature exists to empower things like making visual dividing lines in the code just out of hash-marks. Examples: #`This does something.`# =head2 DBMS Entity Name Literals Grammar: ::= Name ':' [ ':' ]? ::= | ::= [<[ a..z A..Z _ ]> <[ a..z A..Z 0..9 _ ]>*] ** '-' ::= [ \w*] ** '-' ::= '"' [<-[\"]> | ]* '"' ::= NameChain ':' [ ':' ]? ::= | ::= ** [ '.'] ::= '[]' ::= PNSQNameChain ':' [ ':' ]? ::= A C node represents a canonical short name for any kind of DBMS entity when declaring it; it is a character string type, that is disjoint from C. It is interpreted as a Muldis D C value as follows: Fundamentally a C node is formatted and interpreted like a C node, and any similarities won't be repeated here. Unlike a C literal which must always be delimited, a C has 2 variants, one delimited (C) and one not (C). The delimited C form differs from C only in that the string is delimited by double-quotes rather than apostrophes/single-quotes, meaning also that literal double-quotes instead of apostrophes must be escaped. A C is composed of an alphabetic character followed by any sequence of alphanumeric characters. It can not be segmented, so you will have to use the C equivalent if you want a segmented string. The definitions of alphabetic and alphanumeric in this context include appropriate Unicode characters, iff the C is C; for C, they are expressly limited to the ASCII repertoire. An underscore is always considered alphabetic. A C may also contain isolated hyphens provided the next character is alphabetic. A C node represents a canonical long name for invoking a DBMS entity in some contexts; it is conceptually a sequence of entity short names. 
This node is interpreted as a Muldis D C value as follows: A C has 2 variants, one that defines a nonempty chain (C) and one that defines an empty chain (C). A C consists of a sequence of 1 or more C where the elements of the sequence are separated by period (C<.>) tokens; each element of the sequence, in order, defines an element of the C possrep's attribute of the result C value. A C consists simply of the special syntax of C<[]>. Fundamentally a C node is exactly the same as a C node in format and interpretation, with the primary difference being that it may only define C values that are also values of the proper subtype C, all of which are nonempty chains. Now that distinction alone wouldn't be enough rationale to have these 2 distinct node kinds, and so the secondary difference between the 2 provides that rationale; the C node supports a number of chain value shorthands while the C node supports none. Strictly speaking, a Muldis D C value is supposed to have at least 1 element in its sequence, and the first element of any sequence must be one of these 5 C values, which is a top-level namespace: C, C, C, C, C. (Actually, C is a 6th option, but that will be treated separately in this discussion.) In the general case, a C must be written out in full, so it is completely unambiguous (and is clearly self-documenting), and it is always the case that a C value in the system catalog is written out in full. But the PTMD_STD grammar also has a few commonly used special cases where a C may be a much shorter substring of its complete version, such that a simple parser, with no knowledge of any user-defined entities besides said shorter C in isolation, can still unambiguously resolve it to its complete version; exploiting these typically makes for code that is a lot less verbose, and much easier to write or read. The first special case involves any context where a type or routine is being referenced by name. 
In such a context, when the referenced entity is a standard system-defined type or routine, programmers may omit any number of consecutive leading chain elements from such a C, so long as the remaining unqualified chain is distinct among all standard system-defined (C-prefix) DBMS entities (but that as an exception, a non-distinct abbreviation is allowed iff exactly 1 of the candidate entities is in the language core, C-prefix, in which case that 1 is unambiguously the entity that is resolved to; or, when more than 1 of the candidate entities is in the language core, and iff exactly 1 of those in-core candidates is a virtual routine and all of the other in-core candidates are routines that implement said virtual routine either directly or indirectly, then a non-distinct abbreviation is allowed and that 1 virtual is unambiguously the entity that is resolved to). For any system-defined entities whose names have trailing empty-string chain elements, those elements are ignored when determining a match for a C, similarly to how specifying those elements is not required in a fully-qualified C to resolve it. This feature has no effect on the namespace prefixes like C or C or C; one still writes those as normal prepended to the otherwise shortened chains. When a C, whose context indicates it is a type or routine invocation, is encountered by the parser, and its existing first chain element isn't one of the other 6 top-level namespaces, then the parser will assume it is an unqualified chain in the C namespace and lookup the best / only match from the known C DBMS entities, to resolve to. So for example, one can just write C rather than C, C rather than C, C rather than C, C rather than C, C rather than C, C rather than C, and so on. In fact, the Muldis D spec itself uses such abbreviations frequently. The second special case involves any context where a type is being referenced using the C namespace prefix feature described in L. 
In such a context, when the namespace prefix contains either of the optional chain elements C<[|dh_]tuple_from> or C<[|dh_][set|maybe|just|array|bag|[s|m]p_interval]_of>, programmers may omit the single prefix-leading C chain element. So for example, one can just write C rather than C, or C rather than C. This second special case is completely orthogonal to which of the 5 normal top-level namespaces is in use (implicitly or explicitly) by the chain being prefixed, and works for all 5 of them. Examples: Name:login_pass Name:"First Name" NameChain:gene.sorted_person_name NameChain:stats."samples by order" NameChain:[] PNSQNameChain:fed.data.the_db.gene.sorted_person_names PNSQNameChain:fed.data.the_db.stats."samples by order" =head2 Rational Rounding Rule Literals Grammar: ::= RatRoundRule ':' [ ':' ]? ::= '[' ? ? ',' ? ? ',' ? ? ']' ::= ::= ::= A C node represents a rational rounding rule. It is interpreted as a Muldis D C value whose attributes are defined by the C. A C consists mainly of a bracket-delimited sequence of 3 comma-separated elements, which correspond in order to the 3 attributes: C (a C), C (an C), and C (a C). Each of C and C must qualify as a valid C, and C must qualify as a valid C. Examples: RatRoundRule:[10,-2,HalfEven] RatRoundRule:[2,-7,ToZero] =head1 COLLECTION VALUE SELECTORS Note that, with each of the main value selector nodes documented in this main POD section (members of C etc), any occurrences of child C nodes should be read as being C nodes instead in contexts where instances of the main nodes are being composed beneath C nodes. That is, any C node options beyond what C options exist are only valid within a C node. =head2 Scalar Selectors Grammar: ::= [DH? Scalar | '$'] ':' ':' ::= ':' | ::= ::= A C node represents a literal or selector invocation for a not-C scalar subtype value. It is interpreted as a Muldis D C subtype value whose declared type is specified by the node's (mandatory for C) C and whose attributes are defined by the C. 
If the C is just a C, then it is interpreted as if it also had an explicit C that is the empty string. The C is interpreted specifically as attributes of the declared type's possrep which is specified by the C. Each name+expr pair of the C defines a named possrep attribute of the new scalar; the pair's name and expr specify, respectively, the possrep attribute name, and the possrep attribute value. If the C of a C node is C rather than C, then the C node is interpreted simply as a C node that is appropriately further restricted; the C must name a C subtype, and the C must specify only deeply homogeneous typed attribute values. If the C is C<$> then this is just an alias for C. See also the definition of the catalog data type C, a tuple of which is what a C node distills to when it is beneath the context of a C node, as it describes some semantics. Examples: Scalar:Name:{ "" => 'the_thing' } $:Rat:float:{ mantissa => 45207196, radix => 10, exponent => 37, } $:fed.lib.the_db.UTCDateTime:datetime:{ year => 2003, month => 10, day => 26, hour => 1, minute => 30, second => 0.0, } $:fed.lib.the_db.WeekDay:name:{ "" => "monday", } $:fed.lib.the_db.WeekDay:number:{ "" => 5, } =head2 Tuple Selectors Grammar: ::= [DH? Tuple | '%'] ':' [ ':' ]? ::= | ::= '{' ? [[ | ] ** [? ',' ?] [? ',']?]? ? '}' ::= ? '=>' ? ::= ::= '=>' ::= D0 A C node represents a literal or selector invocation for a tuple value. It is interpreted as a Muldis D C value whose attributes are defined by the C. Iff the C is a C then each name+expr pair (C) of the C defines a named attribute of the new tuple; the pair's name and expr specify, respectively, the attribute name, and the attribute value. If the C of a C node is C rather than C, then the C node is interpreted simply as a C node that is appropriately further restricted; the C must specify only deeply homogeneous typed attribute values. If the C is C<%> then this is just an alias for C. 
Iff the C is a C then the C node is interpreted as the special value C aka C, which is the only C value with exactly zero attributes. Note that this is just an alternative syntax, as C can select that value too. A special shorthand for C also exists, C, which may be used only if the C of the otherwise-C is an C and that C is identical to the C. In this situation, the identical name can be specified just once, which is the shorthand; for example, the attribute C<< foo => foo >> may alternately be written out as C<< =>foo >>. This shorthand is to help with the possibly common situation where attributes of a tuple (or relation or scalar) selection are being valued from same-named expression nodes / etc. (This shorthand is like Perl 6's C<:$a> being short for C<< a => $a >>.) See also the definition of the catalog data type C, a tuple of which is what a C node distills to when it is beneath the context of a C node, as it describes some semantics. Examples: %:{} Tuple:D0 #`same as previous`# D0 #`same as previous`# %:type.tuple_from.var.fed.data.the_db.account.users:{ login_name => 'hartmark', login_pass => 'letmein', is_special => True, } %:{ name => 'Michelle', age => 17, } %:{ w => 'foo', =>x, y => 4, =>z } =head2 Database Selectors Grammar: ::= Database ':' [ ':' ]? ::= A C node represents a literal or selector invocation for a 'database' value. It is interpreted as a Muldis D C value whose attributes are defined by the C. Each name+relation pair of the C defines a named attribute of the new 'database'; the pair's name and relation specify, respectively, the attribute name, and the attribute value. While this grammar mentions that C is a C, it is in fact significantly further restricted, such that every attribute value of the C can only be a C. See also the definition of the catalog data type C, a tuple of which is what a C node distills to same as when C does. =head2 Relation Selectors Grammar: ::= [DH? Relation | '@'] ':' [ ':' ]? ::= | | | ::= '{' ? [ ** [? 
',' ?] [? ',']?]? ? '}' ::= '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' ::= '[' ? [ ** [? ',' ?] [? ',']?]? ? ']' ':' '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' ::= '[' ? [ ** [? ',' ?] [? ',']?]? ? ']' ::= D0C0 | D0C1 A C node represents a literal or selector invocation for a relation value. It is interpreted as a Muldis D C value whose attributes and tuples are defined by the C, which is interpreted as follows: Iff the C is composed of just a C pair with zero elements between them, then it defines the only relation value having zero attributes and zero tuples. Iff the C is a C with at least one C element, then it defines the attribute names of a relation having zero tuples. Iff the C is a C with at least one C element, then each element defines a tuple of the new relation; every C must define a tuple of the same degree and have the same attribute names as its sibling C; these are the degree and attribute names of the relation as a whole, which is its heading for the current purposes. Iff the C is a C, then: The new relation value's attribute names are defined by the C elements, and the relation body's tuples' attribute values are defined by the C elements. This format is meant to be the most compact of the generic relation selector formats, as the attribute names only appear once for the relation rather than repeating for each tuple. As a trade-off, the attribute values per tuple from all of the C elements must appear in the same order as their corresponding attribute names appear in the collection of C elements, as the names and values in the relation literal are matched up by ordinal position here. Iff the C is a C then the C node is interpreted as one of the 2 special values C aka C, which are the only C values with exactly zero attributes. Note that this is just an alternative syntax, as other C formats can select those values too. 
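The ordered format described above pairs attribute names with per-tuple values purely by ordinal position. A minimal Python sketch of that pairing follows; the function name C<relation_from_ordered> and the choice of frozensets to model a heading and body are assumptions made for illustration, not anything the specification defines.

```python
def relation_from_ordered(attr_names, rows):
    """Pair each row's values with the attribute names by position.

    Returns (heading, body): the heading is a frozenset of attribute names,
    and the body is a set of tuples, each modelled as a frozenset of
    (name, value) pairs so that duplicate tuples collapse as in a relation.
    """
    if len(set(attr_names)) != len(attr_names):
        raise ValueError("attribute names must be distinct")
    body = set()
    for row in rows:
        if len(row) != len(attr_names):
            raise ValueError("each tuple needs exactly one value per attribute name")
        body.add(frozenset(zip(attr_names, row)))
    return frozenset(attr_names), body

if __name__ == "__main__":
    heading, body = relation_from_ordered(["name", "age"], [["Michelle", 17]])
    print(sorted(heading), [dict(t) for t in body])
```

Because the values are matched by position, reordering the names without reordering every row changes the meaning, which is the trade-off the text notes for this compact format.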
If the C of a C node is C rather than C, then the C node is interpreted simply as a C node that is appropriately further restricted; the C must specify only deeply homogeneous typed attribute values. If the C is C<@> then this is just an alias for C. See also the definition of the catalog data type C, a tuple of which is what a C node distills to when it is beneath the context of a C node, as it describes some semantics. Examples: @:{} #`zero attrs + zero tuples`# Relation:D0C0 #`same as previous`# @:{ x, y, z } #`3 attrs + zero tuples`# @:{ {} } #`zero attrs + 1 tuple`# D0C1 #`same as previous`# @:{ { login_name => 'hartmark', login_pass => 'letmein', is_special => True, }, } #`3 attrs + 1 tuple`# @:fed.lib.the_db.gene.Person:[ name, age ]:{ [ 'Michelle', 17 ], } #`2 attrs + 1 tuple`# =head2 Set Selectors Grammar: ::= [ DH? Set ':' [ ':' ]? ]? ::= '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' A C node represents a literal or selector invocation for a set value. It is interpreted as a Muldis D C value whose elements are defined by the C. Each C of the C defines a unary tuple of the new set; each C defines the C attribute of the tuple. If the C of a C node is C rather than C, then the C node is further restricted. See also the definition of the catalog data type C, a tuple of which is what a C node distills to when it is beneath the context of a C node, as it describes some semantics. Examples: Set:fed.lib.the_db.account.Country_Names:{ 'Canada', 'Spain', 'Jordan', 'Thailand', } { 3, 16, 85, } =head2 Maybe Selectors Grammar: ::= [ DH? [Maybe | Just] ':' [ ':' ]? ]? ::= | ::= '{' ? ? '}' ::= Nothing ::= | '∅' A C node represents a literal or selector invocation for a maybe value. It is interpreted as a Muldis D C value whose elements are defined by the C. Iff the C is a C then it defines either zero or one C; in the case of one, the C defines the unary tuple of the new maybe, which is a 'single'; the C defines the C attribute of the tuple.
If the C of a C node is C or C<[|DH]Just> rather than C, then the C node is further restricted, either to having only deeply homogeneous resulting C or to having exactly one C, as appropriate. Iff the C is a C then the C node is interpreted as the special value C, aka C, aka I, aka C<∅>, which is the only C value with zero elements. Note that this is just an alternative syntax, as C can select that value too. As a further restriction, the C must be just one of C<[|DH]Maybe> when the C is a C. See also the definition of the catalog data type C, a tuple of which is what a C node distills to same as when C does. Examples: Maybe:{ 'I know this one!' } Maybe:Nothing Maybe:∅ Nothing ∅ =head2 Array Selectors Grammar: ::= [ DH? Array ':' [ ':' ]? ]? ::= '[' ? [ ** [? ',' ?] [? ',']?]? ? ']' An C node represents a literal or selector invocation for an array value. It is interpreted as a Muldis D C value whose elements are defined by the C. Each C of the C defines a binary tuple of the new sequence; the C defines the C attribute of the tuple, and the C attribute of the tuple is generated such that the first C gets an C of zero and subsequent ones get consecutive higher integer values. If the C of a C node is C rather than C, then the C node is further restricted. See also the definition of the catalog data type C, a tuple of which is what an C node distills to when it is beneath the context of a C node, as it describes some semantics. Examples: [ 'Alphonse', 'Edward', 'Winry', ] Array:fed.lib.the_db.stats.Samples_By_Order:[ 57, 45, 63, 61, ] =head2 Bag Selectors Grammar: ::= [ DH? Bag ':' [ ':' ]? ]? ::= | ::= '{' ? [[ ? '=>' ? ] ** [? ',' ?] [? ',']?]? ? '}' ::= '#' | | ::= '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' A C node represents a literal or selector invocation for a bag value. 
It is interpreted as a Muldis D C value whose elements are defined by the C, which is interpreted as follows: Iff the C is composed of just a C pair with zero elements between them, then it defines the only bag value having zero elements. Iff the C is a C with at least one C/C-pair element, then each pair defines a binary tuple of the new bag; the C defines the C attribute of the tuple, and the C defines the C attribute. Iff the C is a C with at least one C element, then each C contributes to a binary tuple of the new bag; the C defines the C attribute of the tuple. The bag has 1 tuple for every distinct (after normalization or evaluation) C and C-derived value in the C, and the C attribute of that tuple says how many instances of said C there were. See also the definition of the catalog data type C, a tuple of which is what a C node distills to when it is beneath the context of a C node, as it describes some semantics. Further concerning C, because of how C is defined, a C has to be a compile time constant, since an integer is stored in the system catalog rather than the name of an expression node like with C; if you actually want the bag value being selected at runtime to have runtime-determined C values, then you must use a C node rather than a C node. Examples: { 'Apple' => 500, 'Orange' => 300, 'Banana' => 400, } Bag:{ 'Foo', 'Quux', 'Foo', 'Bar', 'Baz', 'Baz', } =head2 Interval Selectors Grammar: ::= [ DH? SPInterval ':' [ ':' ]? ]? ::= '{' ? ? '}' ::= | ::= ? ? ::= ::= ::= '..' | '..^' | '^..' | '^..^' ::= ::= [ DH? MPInterval ':' [ ':' ]? ]? ::= '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' An C node represents a literal or selector invocation for a single-piece interval value. It is interpreted as a Muldis D C value whose attributes are defined by the C. Each of C and C is an C node that defines the C and C attribute value, respectively, of the new single-piece interval. 
Each of the 4 C values C<..>, C<..^>, C<^..>, C<^..^> corresponds to one of the 4 possible combinations of C and C values that the new single-piece interval can have, which in order are: C<[False,False]>, C<[False,True]>, C<[True,False]>, C<[True,True]>. A special shorthand for C also exists, C, which is to help with the possibly common situation where an interval is a singleton, meaning the interval has exactly 1 value; the shorthand empowers that value to be specified just once rather than twice. Iff the C is an C, then the C is treated as if it was instead an C whose C and C are both identical to the C and whose C is C<..>. For example, the interval C<6> is shorthand for C<6..6>. An C node represents a literal or selector invocation for a multi-piece interval value. It is interpreted as a Muldis D C value whose elements are defined by the C. Each C of the C defines a 4-ary tuple, representing a single-piece interval, of the new multi-piece interval. See also the definition of the 2 catalog data types C, a tuple of which is what an C<[S|M]PInterval> node distills to, respectively, when it is beneath the context of a C node, as it describes some semantics. Examples: {1..10} {2.7..^9.3} {'a'^..'z'} {UTCInstant:[2002,12,6,,,] ^..^ UTCInstant:[2002,12,20,,,]} SPInterval:{'abc'} #`1 element`# MPInterval:{} #`zero elements`# MPInterval:{1..10} #`10 elements`# {1..3,6,8..9} #`6 elements`# {-Inf..3,14..21,29..Inf} #`all Int besides {4..13,22..28}`# =head2 Low Level List Selectors Grammar: ::= List ':' [ ':' ]? ::= '[' ? [ ** [? ',' ?] [? ',']?]? ? ']' A C node represents a literal or selector invocation for a low-level list value. It is interpreted as a Muldis D C value whose elements are defined by the C. Each C of the C defines an element of the new list, where the elements keep the same order. See also the definition of the catalog data type C, a tuple of which is what a C node distills to when it is beneath the context of a C node, as it describes some semantics. 
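Low-level list literals that spell out character data as Unicode abstract codepoints, as in the examples that follow, are tedious to write by hand. The Python sketch below shows one way to generate them; the helper names C<codepoint_list> and C<render_list> are hypothetical, and the rendering covers only integers and nested lists.

```python
def codepoint_list(text):
    """Return the Unicode codepoints of a string, in order."""
    return [ord(ch) for ch in text]

def render_list(items):
    """Render nested Python lists of integers in the List:[...] literal syntax."""
    rendered = []
    for item in items:
        if isinstance(item, list):
            rendered.append(render_list(item))
        else:
            rendered.append(str(item))
    return "List:[" + ",".join(rendered) + "]"

if __name__ == "__main__":
    # The attribute name 'value', wrapped with a leading 1 as in the
    # codepoint-string examples of this section.
    print(render_list([1, codepoint_list("value")]))
```

Element order is preserved at every nesting level, matching the rule that the elements of the new list keep the order in which they are written.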
Examples:

    #`Nonstructure : Unicode abstract codepoints = 'Perl'`#
    List:[80,101,114,108]

    #`UCPString : Unicode abstract codepoints = 'Perl'`#
    List:[1,List:[80,101,114,108]]

    #`%:{}`#
    List:[2,List:[],List:[]]

    #`@:{}`#
    List:[3,List:[],List:[]]

    #`Set : {17,42,5}`#
    List:[3,
        List:[List:[1,List:[118,97,108,117,101]]],
        List:[
            List:[17],
            List:[42],
            List:[5]
        ]
    ]

    #`Nothing`#
    List:[3,
        List:[List:[1,List:[118,97,108,117,101]]],
        List:[]
    ]

    #`Text : 'Perl'`#
    List:[4,
        #`type name : 'sys.std.Core.Type.Text'`#
        List:[
            List:[1,List:[115,121,115]],
            List:[1,List:[115,116,100]],
            List:[1,List:[67,111,114,101]],
            List:[1,List:[84,121,112,101]],
            List:[1,List:[84,101,120,116]],
        ],
        #`possrep name : 'nfd_codes'`#
        List:[1,List:[110,102,100,95,99,111,100,101,115]],
        #`possrep attributes : %:{""=>"Perl"}`#
        List:[2,
            List:[List:[1,List:[]]],
            List:[List:[1,List:[80,101,114,108]]]
        ]
    ]

=head1 DEPOT SPECIFICATION

Grammar:

    ::= [ ]? ::= 'depot-catalog' ::= 'depot-data' ::= ::= | ::= '{' ? [[ | | ] ** ]? ? '}' ::= subdepot ::= ::= 'self-local-dbvar-type'

A C node specifies a single complete depot, which is the widest scope user-defined DBMS entity that is completely self-defined, and doesn't rely on any user-defined entities external to itself to be unambiguously understood. A C node defines a (possibly empty) system catalog database, holding user material (routine and type) definitions, plus optionally a normal-user-data database. A C node in the PTMD_STD grammar is interpreted as a Muldis D C value (which is also a C value) whose attributes are defined by its child elements.

A C node specifies a single public entity namespace under a depot and all of the C nodes under a C comprise a hierarchy of such namespaces.
But a C node doesn't have a corresponding data type for its entire content like with a C; rather, a C node hierarchy is stored flattened in the system catalog, such that each tuple of the C attribute from the parent C names one subdepot that exists, and all the subdepot's materials are flattened into tuples of the materials-defining attributes of the C. A C node specifies what the normal-user-data database has as its declared data type. The value of the C attribute of the parent C is determined from this node. Iff C is not specified then C must be omitted; iff C is specified then C must be present. The most liberal value of C is simply C, meaning C may define any database value at all. A C may have at most 1 C. Examples: #`A completely empty depot that doesn't have a self-local dbvar.`# depot-catalog {} #`Empty depot with self-local dbvar with unrestricted allowed values.`# depot-catalog { self-local-dbvar-type Database } depot-data Database:{} #`A depot having just one function and no dbvar.`# depot-catalog { function cube (Int <-- topic : Int) { topic exp 3 } } =head1 MATERIAL SPECIFICATION Grammar: ::= | | | | | | | | | | | | A C node specifies a new material (routine or type) that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of a (routine or type defining) attribute of a value of the catalog data type C, which is how a material specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. Or more specifically, an entire tree of PTMD_STD C nodes corresponds to a set of said attribute tuples, one attribute tuple per C node. In the nonsugared form, every C node has an explicitly designated name, and all child nodes are not declared inline with their parent nodes but rather are declared in parallel with them, and the parents refer to their children by their names. 
A feature of the PTMD_STD grammar is that material nodes may be declared without explicit names, such that the parser would generate names for them when deriving system catalog entries, and that is why PTMD_STD supports, and encourages the use of for code brevity/readability, the use of inline-declared material nodes, especially so when the C in question is a simple function or type that is only being used in one place, such as a typical C function or a typical subset type. When a C node is contained within another C node, the first material is conceptually part of the implementation of the second material; the first material is hereafter referred to as an I material for this inter-material relationship. When a C node is I contained within any other C node, but rather is directly contained within a C node, then this material is hereafter referred to as an I material. Both inner and outer C nodes may contain 0..N other (inner) C nodes. When a C node defines an outer material C directly within a subdepot (or depot) C, and C has no child inner materials, then the material definition will be stored in the system catalog exactly as conceived, as a new material named C directly in the subdepot C. For example, the outer material will have the name C. In contrast, when said C node has at least one child inner material C, then what happens in the system catalog instead is that a new subdepot named C is created directly in the subdepot C and every one of the whole hierarchy of said C nodes is stored directly in the subdepot C; the outer material is stored under the name that is the empty string, and its inner materials are stored under their own names. For example, the outer material will have the name C and the inner will be named C. 
Such a material hierarchy is stored in a flat namespace so it is required for all inner materials having a common outer material to have distinct declaration names, none of which are the empty string, regardless of whether any of them was declared inside another inner material node or directly inside the common outer node. It is mandatory for outer C nodes to have explicitly specified declaration names, because they are expected to be invoked by name in the general case, like any public routine or type. An inner C may optionally have an explicitly specified declaration name, for either self-documentation purposes or in case it might be invoked by name; however an inner C may also be anonymous, in which case it may only be used inline with its declaration, or by way of an C value which is defined inline with the material's declaration. When an inner material is declared as anonymous, it still actually has a name in the system catalog (I materials in the system catalog are named), but that name is generated by the PTMD_STD parser; strictly speaking this material could still be invoked by that name like an explicitly named one, but that would not be a good practice; use explicit names if you want to invoke by name. Strictly speaking, the algorithm to generate material names should be fully deterministic, but the names would be non-descriptive so akin to random. 
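The storage rules above can be sketched as follows. This is an illustrative Python model, not part of Muldis D: the dictionary shape used for a material, the function name C<catalog_entries>, and the representation of a catalog name as a tuple of name elements are all assumptions made for the example. It also assumes that every inner material already has a name, whether user-supplied or parser-generated.

```python
def catalog_entries(subdepot, material):
    """Map one outer material (and any inner materials) to flat catalog names.

    subdepot: tuple of name elements of the containing subdepot (assumed shape).
    material: dict with "name", "kind" and an optional "inner" list (assumed shape).
    Returns a dict from name-element tuples to material kinds.
    """
    def all_inner(m):
        # Yield inner materials at every nesting depth.
        for child in m.get("inner", []):
            yield child
            yield from all_inner(child)

    inner = list(all_inner(material))
    if not inner:
        # No inner materials: stored exactly as declared.
        return {subdepot + (material["name"],): material["kind"]}

    # Otherwise a subdepot named after the outer material holds everything;
    # the outer material itself is stored under the empty-string name.
    home = subdepot + (material["name"],)
    entries = {home + ("",): material["kind"]}
    for child in inner:
        path = home + (child["name"],)
        if child["name"] == "" or path in entries:
            raise ValueError("inner material names must be distinct and non-empty")
        entries[path] = child["kind"]
    return entries


if __name__ == "__main__":
    # Hypothetical materials, named only for this illustration.
    outer = {"name": "outer_fn", "kind": "function", "inner": [
        {"name": "helper_a", "kind": "function", "inner": [
            {"name": "helper_b", "kind": "subset-type"}]}]}
    for path, kind in catalog_entries(("lib",), outer).items():
        print(path, kind)
```

Note that C<helper_b> lands directly beside C<helper_a> even though it was declared inside it, which is why all inner names under one outer material must be distinct.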
=head2 Material Specification Common Elements Every material has 2-3 elements, illustrated by this grammar: ::= | ::= ::= ::= function | 'named-value' | 'value-map' | 'value-map-unary' | 'value-filter' | 'value-constraint' | 'value-reduction' | 'order-determination' | procedure | 'system-service' | transaction | recipe | updater | 'scalar-type' | 'tuple-type' | 'database-type' | 'relation-type' | 'domain-type' | 'subset-type' | 'mixin-type' | 'key-constraint' | 'primary-key' | 'distrib-key-constraint' | 'distrib-primary-key' | 'subset-constraint' | 'distrib-subset-constraint' | 'stimulus-response-rule' ::= ::= | | | | | | | | | | | | So a C|C node has 2-3 elements in general: =over =item C This is a character string of the format C<< [<[ a..z ]>+] ** '-' >>; it identifies the kind of the material and is the only external metadata of C generally necessary to interpret the latter; what grammars are valid for C depend just on C. =item C This is the declared name of the material within the namespace defined by its subdepot (or depot). It is explicitly specified iff the C is a C =item C This is mandatory for all C. It specifies the entire material sans its name. Format varies with C. =back For material examples, see the subsequent documentation sections. Note that, for simplicity, the subsequent sections assume for now that C is the only valid option, and so the C isn't optional, and the only way to embed a material in another is using a C. =head2 Function Specification Grammar: ::= ::= function | 'named-value' | 'value-map' | 'value-map-unary' | 'value-filter' | 'value-constraint' | 'value-reduction' | 'order-determination' ::= ::= [ ]* ::= '(' ? ? '<--' [? ** [? ',' ?] [? ',']?]? ? ')' ::= ::= ::= | ::= '{' ? [[ | ] ]* ? '}' ::= ::= '{' ? '...' ? '}' A C node specifies a new function that lives in a depot or subdepot. 
A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a function specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively.

A C specifies an entire function besides its name. It is interpreted as a Muldis D C value. The C element specifies the function's public interface, which is these 5 attributes of the new C: C, C, C, C, C. The C element specifies the function's implementation, which is the 1 attribute C of the new C.

The C has no impact at all on the interpretation of a C. However, it can serve to apply additional constraints on the allowed values of the resulting C, in the manner of simple subset-type constraints, and similarly it can serve to add self-documentation to the intended purpose or use of the function. Iff C is C then there are no such subset-type constraints applied, as the node is simply denoting a generic function; any other value of C means that the node is denoting a value of a proper subtype of C, and so that subtype's respective constraints are applied to the new C. The various C map to C subtypes as follows:

    function kind         | catalog data type
    ----------------------+------------------
    function              | Function
    named-value           | NamedValFunc
    value-map             | ValMapFunc
    value-map-unary       | ValMapUFunc
    value-filter          | ValFiltFunc
    value-constraint      | ValConstrFunc
    value-reduction       | ValRedFunc
    order-determination   | OrdDetFunc

The C's C is interpreted as the C's C attribute.

Any of these kinds of components of a C node are interpreted in exactly the same manner as for a C node, as a C is to a C: C (but that the C attribute is named C rather than C), C, C, C, C.

A C must have at least one C, because a function must by definition result in a value, and that C says what this result value is.
Said result-determining C must either not be a C or it must be a C whose direct C is the empty string; the latter option is saying explicitly what the parser would otherwise name the C implicitly. A C may have at most one C that isn't a C, because it can only have one result-determining C. Examples: function cube (Int <-- topic : Int) { topic exp 3 } =head2 Procedure Specification Grammar: ::= ::= procedure | 'system-service' | transaction | ::= recipe | updater ::= ::= [ ]* ::= '(' ? [ ** [? ',' ?] [? ',']?]? ? ')' ::= | | | ::= ::= '&' ::= ? ? ':' ? ::= ::= ::= | ::= '?' ::= '@' ::= ::= ? ? ::= '::=' ::= ::= implements ::= ::= | | | ::= ::= ::= '[' ? [[ | | | ] ** ]* ? ']' ::= '{' ? [[ | | ] ** ]* ? '}' ::= with ::= var ? ':' ? ::= ::= '[' ? '...' ? ']' ::= '{' ? '...' ? '}' A C node specifies a new procedure that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a procedure specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire procedure besides its name. It is interpreted as a Muldis D C value. The C element specifies the procedure's public interface, which is these 9 attributes of the new C: C, C, C, C, C, C, C, C, C. The C element specifies the procedure's implementation, which is these 3 attributes of the new C: C, C, C. The C often has no impact at all on the interpretation of a C. However, it can serve to apply additional constraints on the allowed values of the resulting C, in the manner of simple subset-type constraints, and similarly it can serve to add self-documentation to the intended purpose or use of the procedure. 
Iff C is C then there are no such subset-type constraints applied, as the node is simply denoting a generic procedure; any other value of C means that the node is denoting a value of a proper subtype of C, and so that subtype's respective constraints are applied to the new C. Iff C is a C, then C is also constrained to be one of C<[|non]empty_recipe_body>.

The C is the sole determinant of the values of the C and C attributes of the resulting C; for each valid combination there also exists a C subtype. The various C map to attribute values and C subtypes as follows:

    procedure kind | is_system_service | is_transaction | catalog data type
    ---------------+-------------------+----------------+------------------
    procedure      | Bool:False        | Bool:False     | Procedure
    system-service | Bool:True         | Bool:True      | SystemService
    transaction    | Bool:False        | Bool:True      | Transaction
    recipe         | Bool:False        | Bool:True      | Recipe
    updater        | Bool:False        | Bool:True      | Updater

Iff the C has at least one C or C, then the procedure has one or more regular parameters, which are what another routine can explicitly supply arguments for in an invocation of the procedure; each regular parameter is either subject-to-update or read-only. Each C is primarily interpreted as a tuple of the C's C attribute, and each C is primarily interpreted as a tuple of the C's C attribute; for each tuple, the C and C, respectively, of the C or C provide the tuple's C and C attribute.

Iff any of the parameters have an C, then those parameters are optional to supply arguments for; for each parameter with an C, the C's C attribute has a tuple with the parameter's C.

Iff any of the parameters have a C, then the procedure is being explicitly declared to be a virtual procedure, and so the C must be C; for each parameter with a C, the C's C attribute has a tuple with the parameter's C.
Iff the C has at least one C or C, then the procedure has one or more global parameters, which are lexical aliases for global variables; each global parameter is either subject-to-update or read-only. Each C is primarily interpreted as a tuple of the C's C attribute, and each C is primarily interpreted as a tuple of the C's C attribute; for each tuple, the C and C, respectively, of the C or C provide the tuple's C and C attribute.

Iff the C has at least one C, then the procedure is explicitly declaring that it implements one or more virtual procedures, one being named by each C. Each C is interpreted as a tuple of the C's C attribute.

Iff the C is an C, then the C's C, C and C attributes are all empty.

Iff the C has at least one C, then the procedure is explicitly declaring that it has one or more inner materials, such that the other materials are conceptually part of the implementation of the procedure; each C specifies one inner material in its C element. A C is not interpreted as any part of the C but rather results in other additions to its parent C, much as if the C were specified outside the C node; but see the L main description for details on the complete effects of specifying an inner material.

Iff the C has at least one C, then the procedure has one or more regular lexical variables. Each C is interpreted as a tuple of the C's C attribute; for each tuple, the C and C, respectively, of the C provide the tuple's C and C attribute.

Iff the C directly has at least one C, then each such C is interpreted as a tuple of an attribute of the C's C attribute such that said tuple's C is explicitly user-defined rather than generated by the parser. Any C contained in a C by way of one of its direct C or C will similarly be interpreted as a tuple of an attribute of the C's C attribute, where said tuple's C is either user-defined or generated as appropriate for the kind of C.
Each C of a C is interpreted as a tuple of an attribute of the C's C attribute. A C may also, and typically does, have nested C, thereby forming a tree, and that tree is flattened with each nested C becoming its own tuple under C, as with the first. In fact, all of a procedure's statements form a single statement tree, and the root node of this tree is an implicit compound statement node (whose name is the empty string) whose direct child statements are all of the direct child C elements of the C, in order. Iff a C has no C member elements, then the procedure has a defined body that is an unconditional no-op.

A C must have at least one C, because a recipe must by definition update at least one of its (regular or global) parameters, though possibly to the same value it already has, lest it otherwise be an unconditional no-op. Each C is interpreted as a tuple of the C's C attribute.

Examples:

    procedure print_curr_time () [
        var now : Instant
        fetch_trans_instant( &now )
        write_Text_line( 'The current time is: '
            ~ nlx.par.lib.utils.time_as_text( time => now ) )
    ]

    recipe count_heads (&count : NNInt, search : Text,
            people ::= fed.data.db1.people) {
        with value-filter filt (Bool <-- topic : Tuple, search : Text) {
            .name like ('%' ~ search ~ '%')
        }
        count := #(people where ( =>search ))
    }

    updater make_coprime (&a : NNInt, &b : NNInt) {
        with function gcd (NNInt <-- a : NNInt, b : NNInt) {
            b = 0 ?? a !! rtn( a => b, b => a mod b round Down )
        }
        let gcd ::= nlx.lib.gcd( =>a, =>b )
        a := a div gcd round Down
        b := b div gcd round Down
    }

=head2 Scalar Type Specification

Grammar:

    ::= 'scalar-type' ::= '{' ? [ | | | | | | ] ** ? '}' ::= 'subtype-constraint' ::= possrep '{' ? [ ]? ? '}' ::= 'is-base' ::= 'possrep-map' '{' ? from using 'reverse-using' ? '}' ::= ::=

A C node specifies a new scalar type that lives in a depot or subdepot.
A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a scalar type specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire scalar type besides its name. It is interpreted as a Muldis D C value. I I =head2 Tuple Type Specification Grammar: ::= ::= 'tuple-type' | 'database-type' ::= '{' ? [ | | | | | | ] ** ? '}' ::= attr ? ':' ? ::= ::= 'virtual-attr-map' '{' ? 'determinant-attrs' 'dependent-attrs' 'map-function' [ ]? ? '}' ::= '{' ? [[ | ] ** [? ',' ?] [? ',']?]? ? '}' ::= ? '=>' ? ::= 'is-updateable' A C node specifies a new tuple type that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a tuple type specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire tuple type besides its name. It is interpreted as a Muldis D C value. The C has no impact at all on the interpretation of a C. However, it can serve to apply additional constraints on the allowed values of the resulting C, in the manner of simple subset-type constraints, and similarly it can serve to add self-documentation to the intended purpose or use of the tuple type. Iff C is C then there are no such subset-type constraints applied, as the node is simply denoting a generic tuple type; iff C is C then there is a constraint applied such that the node is denoting a database type. 
I Examples: #`db schema with 3 relvars, 2 subset constrs, the 5 def separately`# database-type CD_DB { attr artists : nlx.lib.Artists attr cds : nlx.lib.CDs attr tracks : nlx.lib.Tracks constraint nlx.lib.sc_artist_has_cds constraint nlx.lib.sc_cd_has_tracks } #`relation type using tuple virtual-attr-map for case-insen key attr where primary text data is case-sensitive, case-preserving`# relation-type Locations { tuple-type nlx.lib.Location with tuple-type Location { attr loc_name : Text attr loc_name_uc : Text virtual-attr-map { determinant-attrs { =>loc_name } dependent-attrs { =>loc_name_uc } map-function nlx.lib.uc_loc_name } with value-map-unary uc_loc_name (Tuple <-- topic : Tuple) { %:{ loc_name_uc => upper( .loc_name ) } } } constraint nlx.lib.sk_loc_name_uc with key-constraint sk_loc_name_uc { loc_name_uc } } #`db schema with 2 real relvars, 1 virtual relvar; all are updateable real products has attrs { product_id, name } real sales has attrs { product_id, qty } virtual combines has attrs { product_id, name, qty }`# database-type DB { attr products : nlx.lib.Products attr sales : nlx.lib.Sales attr combines : nlx.lib.Combines virtual-attr-map { determinant-attrs { =>products, =>sales } dependent-attrs { =>combines } map-function nlx.lib.combine_p_s is-updateable } with value-map-unary combine_p_s (Database <-- topic : Database) { Database:{ combines => .products join .sales } } } =head2 Relation Type Specification Grammar: ::= 'relation-type' ::= '{' ? [ | | | | | ] ** ? '}' ::= tuple-type A C node specifies a new relation type that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a relation type specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. 
A C specifies an entire relation type besides its name. It is interpreted as a Muldis D C value. I Examples: relation-type Artists { with tuple-type Artist { attr artist_id : Int attr artist_name : Text } with primary-key pk_artist_id { artist_id } with key-constraint sk_artist_name { artist_name } tuple-type nlx.lib.Artist constraint nlx.lib.pk_artist_id constraint nlx.lib.sk_artist_name } relation-type CDs { with tuple-type CD { attr cd_id : Int attr artist_id : Int attr cd_title : Text } with primary-key pk_cd_id { cd_id } with key-constraint sk_cd_title { cd_title } tuple-type nlx.lib.CD constraint nlx.lib.pk_cd_id constraint nlx.lib.sk_cd_title } =head2 Domain Type Specification Grammar: ::= 'domain-type' ::= '{' ? [ | | | | | ] ** ? '}' ::= ['source-union' | 'source-intersection'] '{' ? [ | ** [? ',' ?] [? ',']?] ? '}' ::= ['filter-union' | 'filter-intersection'] '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' A C node specifies a new domain type that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a domain type specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire domain type besides its name. It is interpreted as a Muldis D C value. I I =head2 Subset Type Specification Grammar: ::= 'subset-type' ::= | ::= '{' ? [ | | | | ] ** ? '}' ::= ['base-type' | of] ::= [constraint | where] ::= ::= default ::= [ ]? [ ]? A C node specifies a new subset type that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a subset type specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. 
The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire subset type besides its name. It is interpreted as a Muldis D C value. I I =head2 Mixin Type Specification Grammar: ::= 'mixin-type' ::= '{' ? [[ | ] ** ]? ? '}' ::= composes [ ]? ::= 'and-provides-its-default' A C node specifies a new mixin type that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a mixin type specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire mixin type besides its name. It is interpreted as a Muldis D C value. I I =head2 Key Constraint Specification Grammar: ::= ::= 'key-constraint' | 'primary-key' ::= '{' ? [ ** [? ',' ?] [? ',']?]? ? '}' A C node specifies a new unique key constraint or candidate key, for a relation type, that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a unique key constraint specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire unique key constraint or candidate key, for a relation type, besides its name. It is interpreted as a Muldis D C value. Each C element of a C is interpreted as a tuple of the C's C attribute. Iff there are no C, then we have a nullary key which restricts the relation to have a maximum of 1 tuple. The C element of a C node is the sole determinant of the value of the C attribute of the resulting C; C means C, while C means C. 
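To illustrate what a key constraint demands of a relation, including the nullary case, here is a small Python sketch. It is not Muldis D code: it models a relation as a list of distinct dicts with hashable attribute values and a key as a collection of attribute names, and the function name C<satisfies_key> is hypothetical.

```python
def satisfies_key(relation, key_attrs):
    """Return True iff no two tuples of `relation` agree on all `key_attrs`.

    relation: list of dicts, one per tuple (assumed to be distinct tuples).
    key_attrs: iterable of attribute names; empty means a nullary key.
    """
    key_attrs = sorted(key_attrs)
    seen = set()
    for tup in relation:
        # With a nullary key this projection is () for every tuple,
        # so a second tuple always collides with the first.
        key_value = tuple(tup[attr] for attr in key_attrs)
        if key_value in seen:
            return False
        seen.add(key_value)
    return True


if __name__ == "__main__":
    artists = [
        {"artist_id": 1, "artist_name": "Ann"},
        {"artist_id": 2, "artist_name": "Ann"},
    ]
    print(satisfies_key(artists, ["artist_id"]))    # True
    print(satisfies_key(artists, ["artist_name"]))  # False
    print(satisfies_key(artists, []))               # False
    print(satisfies_key(artists[:1], []))           # True
```

The last two calls show why a key over zero attributes limits the relation to at most one tuple.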
Examples: #`at most one tuple allowed`# key-constraint maybe_one {} #`relation type's artist_id attr is its primary key`# primary-key pk_artist_id { artist_id } #`relation type has surrogate key over both name attrs`# key-constraint sk_name { last_name, first_name } =head2 Distributed Key Constraint Specification I =head2 Subset Constraint Specification Grammar: ::= 'subset-constraint' ::= '{' ? parent 'using-key' child 'using-attrs' ? '}' ::= ::= ::= ::= ::= ::= '{' ? [[ | ] ** [? ',' ?] [? ',']?]? ? '}' ::= ? '=>' ? ::= ::= A C node specifies a (non-distributed) subset constraint (foreign key constraint) over relation-valued attributes, for a tuple type, that lives in a depot or subdepot. A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a (non-distributed) subset constraint specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire (non-distributed) subset constraint, for a relation type, besides its name. It is interpreted as a Muldis D C value. I Examples: #`relation foo must have exactly 1 tuple when bar has at least 1`# subset-constraint sc_mutual_inclusion { parent foo using-key nlx.lib.maybe_one child bar using-attrs {} } subset-constraint sc_artist_has_cds { parent artists using-key nlx.lib.Artists.pk_artist_id child cds using-attrs { =>artist_id } } =head2 Distributed Subset Constraint Specification I =head2 Stimulus-Response Rule Specification Grammar: ::= 'stimulus-response-rule' ::= when invoke ::= 'after-mount' ::= A C node specifies a new stimulus-response rule that lives in a depot or subdepot. 
A C node in the PTMD_STD grammar corresponds directly to a tuple of the C attribute of a value of the catalog data type C, which is how a stimulus-response rule specification is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. The C tuple has 2 primary attributes, C and C, which are valued from the C node's C and C elements, respectively. A C specifies an entire stimulus-response rule besides its name. It is interpreted as a Muldis D C value. The C and C elements specify the C and C attributes, respectively, of the new C, which is the kind of stimulus and the name of the procedure being invoked in response. Currently, C is the only kind of stimulus supported; other kinds will be defined in the future. Examples: stimulus-response-rule bootstrap { when after-mount invoke nlx.lib.main } =head1 GENERIC VALUE EXPRESSIONS Grammar: ::= | | | | | | | | ::= | ::= '(' ? ? ')' ::= ::= [let ]? An C node is the general case of a Muldis D value expression tree (which normally denotes a Muldis D value selector), which must be composed beneath a C, or specifically into a routine or type or constraint (etc) definition, because in the general case an C can I be completely evaluated at compile time. An C node is a proper superset of a C node, and any occurrences of C nodes in this document may optionally be substituted with C nodes on a per-instance basis. An C node in the PTMD_STD grammar corresponds directly to a tuple of an attribute of a value of the catalog data type C, which is how a value expression node is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. Or more specifically, an entire tree of PTMD_STD C nodes corresponds to a set of said attribute tuples, one attribute tuple per C node. 
In the nonsugared form, every C node has an explicitly designated name, as per a PTMD_STD C node, and all child nodes are not declared inline with their parent nodes but rather are declared in parallel with them, and the parents refer to their children by their names. A feature of the PTMD_STD grammar is that expression nodes may be declared without explicit names, such that the parser would generate names for them when deriving system catalog entries, and that is why PTMD_STD supports, and encourages the use of for code brevity/readability, the use of inline-declared expression nodes, especially so when the C in question is an C. Iff an C is a C, then it is interpreted simply as if it were its child C element; the I reason that the C grammar element exists is to assist the parser in determining the boundaries of an C where code otherwise might be ambiguous or be interpreted differently than desired due to nesting precedence rules (see L for more about those). There is never a distinct node in a parser's output for a C itself. Iff an C is an C, then this typically means that the parent C is having at least one of its children declared with an explicit name rather than inline, same as the corresponding system catalog entry would do, and then the C is the invocation name of that child. Alternately, the C may be the invocation name of one of the expression-containing routine's parameters, in which case the C in question represents the current argument to that parameter; this also is exactly the same as a corresponding catalog entry for using an argument. Iff an C is a C, then the C element of the C is being declared with an explicit name, and the C element of the C is that name. But if the C element of the C is an C (or a C I >), then the C is in fact declaring a new node itself (rather than simply naming its child node), which is a tuple of a Muldis D C value; the new node is simply declaring an alias for another node, namely the C element. 
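The relationship between inline-declared expression nodes and the flat, fully named form can be sketched as below. This is a Python illustration under stated assumptions, not the actual parser algorithm: the dict shape for an expression, the C<anon_N> naming scheme, and the function name C<flatten_expr> are all hypothetical. The only property carried over from the text is that every node ends up with a name and that parents refer to children by name.

```python
import itertools


def flatten_expr(expr, nodes, counter):
    """Flatten an inline expression tree into a dict of named nodes.

    expr: either a str (a reference to an already named node or parameter)
          or a dict {"func": str, "args": {param: expr}, "name": optional str}.
    nodes: dict that receives one entry per expression node.
    counter: iterator yielding integers for generated names (assumed scheme).
    Returns the name by which the parent should refer to this node.
    """
    if isinstance(expr, str):
        return expr
    # Use the explicit name if one was declared, else generate one.
    name = expr.get("name") or f"anon_{next(counter)}"
    args = {param: flatten_expr(child, nodes, counter)
            for param, child in expr["args"].items()}
    if name in nodes:
        raise ValueError(f"duplicate expression node name {name!r}")
    nodes[name] = {"func": expr["func"], "args": args}
    return name


if __name__ == "__main__":
    # Hypothetical tree: a node named bar_expr applies factorial to an
    # anonymous inline node, which in turn refers to foo_expr by name.
    tree = {"name": "bar_expr", "func": "factorial", "args": {
        "topic": {"func": "abs", "args": {"topic": "foo_expr"}}}}
    nodes = {}
    root = flatten_expr(tree, nodes, itertools.count(1))
    print(root)
    for name, node in nodes.items():
        print(name, node)
```

A counter keeps the generated names deterministic for a given input, though they remain non-descriptive.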
Examples: #`an expr_name node`# foo_expr #`a named_expr node`# let bar_expr ::= factorial( foo_expr ) =head2 Generic Expression Attribute Accessors Grammar: ::= | | ::= ::= '.' ::= '.' An C node represents an accessor or alias for an attribute of another, tuple-valued expression node. It is interpreted as a tuple of a Muldis D C value. If an C is an C, then the C element specifies the C attribute of the new C. If an C is an C, then it is interpreted in exactly the same manner as for an C except that the C element is interpreted with a C element prepended to it; so for example a C<.foo> is treated as being C. If an C is an C, then the C is derived from a catenation of the node name that C has (explicitly or that will be generated for it by the parser) with the C in that order. Note that an C whose C is an C is also an C, and vice-versa. Examples: #`an accessor node of a named tuple-valued node`# foo_t.bar_attr #`an accessor node of a tuple-valued node named "topic"`# .attr #`same as topic.attr`# #`an accessor node of an anonymous tuple-valued node`# nlx.lib.tuple_res_func( arg ).quux_attr =head2 Generic Function Invocation Expressions Grammar: ::= ::= '(' ? [ ** [? ',' ?] [? ',']?]? ? ')' ::= | | A C node represents the result of invoking a named function with specific arguments. It is interpreted as a tuple of a Muldis D C value. The C element specifies the C attribute of the new C, which is the name of the function being invoked, and the C element specifies the C attribute. In the general case of a function invocation, all of the arguments are named, as per C, and formatting a C node that way is always allowed. In some (common) special cases, some (which might be all) arguments may be anonymous, as per C. With just functions in the top-level namespaces C, these 4 special cases apply: If a function has exactly one parameter, then it may be invoked with a single anonymous argument and the latter will bind to that parameter. 
Or, if a function has multiple parameters but exactly one of those is mandatory, then it may be invoked with just one anonymous argument, which is assumed to bind to the single mandatory parameter, and all optional arguments must be named. Or, if a function has multiple mandatory parameters and one of them is named C, then it may be invoked with a single anonymous argument and the latter will bind to that parameter. Or, if a function has multiple mandatory parameters and two of them are named C and C, then it may be invoked with two anonymous arguments and the latter will bind to those parameters in sequential order, the first one to C and the second one to C. With just functions in all top-level namespaces I C, these 2 special cases apply (similar to the prior-mentioned latter 2): If a function invocation has either 1 or 2 anonymous arguments, then they will be treated as if they were named arguments for the C and C parameters; the only or sequentially first argument will bind to C, and any sequentially second argument will bind to C. One reason for this difference between treatment of top-level namespaces is it allows the Muldis D parser to convert all the anonymous arguments to named ones (all arguments in the system catalog are named) when parsing the expression-containing routine/etc in isolation from any other user-defined entities. The other reason for this limitation is that it helps with self-documentation; programmers wanting to know an anonymous argument's parameter name won't have to look outside the language spec to find the answer. I A special shorthand for C also exists, C, which may be used only if the C of the otherwise-C is an C and that C is identical to the C. In this situation, the identical name can be specified just once, which is the shorthand; for example, the named argument C<< foo => foo >> may alternately be written out as C<< =>foo >>. 
This shorthand is to help with the possibly common situation where two successive routines in a call-chain have any same-named parameters and arguments are simply being passed through. (This shorthand is like Perl 6's C<:$a> being short for C<< a => $a >>.) Examples: #`zero params`# Nothing() #`single mandatory param`# median( Bag:{ 22, 20, 21, 20, 21, 21, 23 } ) #`single mandatory param`# factorial( topic => 5 ) #`two mandatory params`# frac_quotient( dividend => 43.7, divisor => 16.9 ) #`same as previous`# frac_quotient( divisor => 16.9, dividend => 43.7 ) #`one mandatory 'topic' param, two optional`# nlx.lib.barfunc( mand_arg, oa1 => opt_arg1, oa2 => opt_arg2 ) #`same as previous`# nlx.lib.barfunc( oa2 => opt_arg2, mand_arg, oa1 => opt_arg1 ) #`a user-defined function`# nlx.lib.foodb.bazfunc( a1 => 52, a2 => 'hello world' ) #`two params named 'topic' and 'other'`# is_same( foo, bar ) #`invoke the lexically innermost routine with 2 args`# rtn( x, y ) #`three named params taking 2 same-named args, 1 diff-named arg`# nlx.lib.passed_thru( =>a, b => 5, =>c ) =head2 Generic If-Else Expressions Grammar: ::= if then else | '??' '!!' ::= ::= ::= An C node represents a ternary if-then-else control flow expression. It is interpreted as a tuple of a Muldis D C value. The C, C, and C elements specify the C, C, and C attributes, respectively, of the new C; C is the condition to evaluate at runtime and must result in a C; iff the result of that condition is C then the C is evaluated and its result is the result of the whole if-then-else expression at runtime; otherwise, the C is evaluated and its result is the whole if-then-else's result. Examples: if foo > 5 then bar else baz if is_empty(ary) then empty_result else ary.[0] if x = ∅ or y = ∅ then ∅ else Just:{x.{*} + (y.{*} exp 3)} if val isa then val exp 3 else if val isa then val ~# 5 else True 'My answer is: ' ~ (maybe ?? 'yes' !! 
'no') =head2 Generic Given-When-Default Expressions Grammar: ::= given [when then ]* default ::= ::= ::= ::= A C node represents an N-way given-when-default switch control flow expression that dispatches based on matching a single value with several options. It is interpreted as a tuple of a Muldis D C value. The C element specifies the C attribute of the new C, which is the control value for the expression. The whole collection of nonordered 0..N C + C elements specifies the C attribute, which is a set of I comparands; if any of these I values matches the value of C, its associated I result value is the result of the C. The C element specifies the C attribute, which determines the result value of the C at runtime if either C is an empty set or none of its comparands match C. Examples: given digit when 'T' then 10 when 'E' then 11 default digit =head2 Material Reference Selector Expressions Grammar: ::= | ::= '<' '>' ::= ::= A C node represents a selector invocation for a value of the C type, which is selected in terms of a value of the C type. It is interpreted as a tuple of a Muldis D C value. The C element specifies the C attribute of the new C, which is the name, from the point of view of the routine embedding this expression node, of the routine or type that the new C value is supposed to facilitate portable invoking of, from any other routine besides the embedding routine. A C node also serves as a less-verbose alternate syntax for a C node, but only for C values where you actually don't want a relative-path name-chain value. For any C node whose C is already an C payload, a Muldis D parser will silently replace the C node with a C node whose payload is its C. In other words, you can safely use any primary namespace qualified name chain in a C node and get the result that you would reasonably expect. This is primarily useful for system-defined types and routines. A C node represents a value of the C type. 
It is a special shorthand syntax for a C node that defines a tuple with 2 attributes, C and C, where the first's value is a C node and the second's value is a C node as per a C node's argument list. Examples: #`a higher-order function curried with 1 argument`# ( =>search_term ) #`a reference to an updater`# #`a reference to a data type`# =head1 GENERIC PROCEDURE STATEMENTS Grammar: ::= | | | | | | | | ::= | ::= | | ::= | ::= ::= [let ]? A C node is the general case of a Muldis D statement tree, which must be composed beneath a C, or specifically into a procedure definition, because in the general case a C cannot be completely evaluated at compile time. A C node in the PTMD_STD grammar corresponds directly to a tuple of an attribute of a value of the catalog data type C, which is how a statement node is actually represented in Muldis D's nonsugared form, which is as a component of the system catalog. Or more specifically, an entire tree of PTMD_STD C nodes corresponds to a set of said attribute tuples, one attribute tuple per C node. In the nonsugared form, every C node has an explicitly designated name, as per a PTMD_STD C node, and child nodes are declared not inline with their parent nodes but in parallel with them, and the parents refer to their children by their names. A feature of the PTMD_STD grammar is that statement nodes may be declared without explicit names, such that the parser would generate names for them when deriving system catalog entries, and that is why PTMD_STD supports, and for the sake of code brevity and readability encourages, the use of inline-declared statement nodes. Iff a C is an C, then this typically means that the parent C is having at least one of its children declared with an explicit name rather than inline, same as the corresponding system catalog entry would do, and then the C is the invocation name of that child.
Note that, regarding Muldis D's feature of a statement node having an explicit name that can be referenced by "leave" and "iterate" control flow statements to leave or re-iterate the corresponding block, both SQL and Perl have native counterpart features in the form of block labels. Examples: #`a stmt_name node`# foo_stmt #`a named_stmt node`# let bar_stmt ::= nlx.lib.swap( &=>first, &=>second ) =head2 Generic Compound Statements Grammar: ::= A C node specifies a procedure compound statement composed of a sequence of 0..N other statements such that those other statements execute in this given sequence; each statement of the sequence conceptually executes at a different time. It is interpreted as a tuple of a Muldis D C value. Each C element of a C is a nested statement that is interpreted as its own tuple of an attribute of the C attribute of the host C; for each said tuple, there exists an element of the C's C attribute which matches the C attribute of the tuple. Any C or C direct elements of a C are interpreted as if they were directly in the C that the C is under. Examples: [ var message : Text read_Text_line( &message ) write_Text_line( message ) ] =head2 Multi-Update Statements Grammar: ::= A C node specifies a multi-update statement, which is a procedure compound statement composed of a set of 0..N other statements such that those other statements execute all as one and collectively at a single point in time, as if the collection were a single statement that did all the work of the component statements itself. It is interpreted as a tuple of a Muldis D C value. Each C element of a C is a nested statement that is interpreted as its own tuple of an attribute of the C attribute of the host C; for each said tuple, there exists an element of the C's C attribute which matches the C attribute of the tuple. Any C direct elements of a C are interpreted as if they were directly in the C that the C is under. Examples: { let order_id ::= is_empty(orders) ?? 1 !! 
max( Set_from_attr( orders, name => Name:order_id ) ) ++ assign_insertion( &orders, %:{ =>order_id, date => '2011-03-04' } ) assign_union( &order_details, @:[ order_id, prod_code, qty ]:{ [ order_id, 'COG' , 20, ], [ order_id, 'CAM' , 10, ], [ order_id, 'BOLT', 70, ], } ) } =head2 Generic Procedure Invocation Statements Grammar: ::= ::= '(' ? [ ** [? ',' ?] [? ',']?]? ? ')' ::= | | | | | ::= ::= ? '=>' ? ::= ::= ::= ::= ::= '=>' A C node represents the invocation of a named procedure, with specific subject-to-update or read-only arguments, as a statement of a procedure. It is interpreted as a tuple of a Muldis D C value. The C element specifies the C attribute of the new C, which is the name of the procedure being invoked, and the C element specifies the C plus C attributes, one tuple thereof per C; each C having an C yields an C tuple, and each C without one yields a C tuple. In the general case of a procedure invocation, all of the arguments are named, as per C, and formatting a C node that way is always allowed. In some (common) special cases, some (which might be all) arguments may be anonymous, as per C. For further details on this, see the C node kind, under L, because the rules regarding when arguments may be anonymous or must be named are the same for both main routine kinds. The sole exception to said rules is that the rules are evaluated independently for subject-to-update arguments and read-only arguments, because those 2 argument groups and their corresponding parameters effectively have independent namespaces with respect to that the presence or absence of an C can always be counted on to distinguish the groups. This means, for example, that you can have an anonymous subject-to-update argument plus an anonymous read-only argument to a system-defined procedure where none of the corresponding parameters are named C. 
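As a rough illustration of the rule just described, the following Python sketch resolves anonymous arguments for a system-defined procedure by applying the function-invocation special cases separately to the subject-to-update group and to the read-only group. The helper names C<bind_anonymous> and C<bind_procedure_args> and the dict-based parameter descriptions are hypothetical and not part of Muldis D. The parameter names C<topic> and C<other> are taken from the comments in the function invocation examples. How a lone anonymous argument is resolved when more than one special case could apply is an assumption of this sketch.

```python
def bind_anonymous(params, anon_args):
    """Bind the anonymous arguments of one group to parameter names.

    params maps each parameter name to True (mandatory) or False (optional).
    Hypothetical helper: the special cases are tried in the order the text
    lists them, which is an assumption rather than a stated rule.
    """
    if not anon_args:
        return {}
    mandatory = [name for name, is_mandatory in params.items() if is_mandatory]
    if len(anon_args) == 1:
        if len(params) == 1:
            # Exactly one parameter: the argument binds to it.
            return {next(iter(params)): anon_args[0]}
        if len(mandatory) == 1:
            # Several parameters, exactly one of them mandatory.
            return {mandatory[0]: anon_args[0]}
        if "topic" in mandatory:
            # Several mandatory parameters, one of them named 'topic'.
            return {"topic": anon_args[0]}
    elif len(anon_args) == 2 and "topic" in mandatory and "other" in mandatory:
        # Two anonymous arguments bind to 'topic' then 'other', in order.
        return {"topic": anon_args[0], "other": anon_args[1]}
    raise ValueError("anonymous arguments cannot be bound; name them instead")


def bind_procedure_args(upd_params, ro_params, anon_upd, anon_ro):
    """Resolve the subject-to-update and read-only groups independently."""
    return bind_anonymous(upd_params, anon_upd), bind_anonymous(ro_params, anon_ro)


# assign( &foo, 3 ): one subject-to-update parameter, one read-only parameter.
upd, ro = bind_procedure_args({"target": True}, {"v": True}, ["foo"], [3])
print(upd, ro)
```

Because each group is resolved on its own, the relative order of the two anonymous arguments in the source text does not affect the result in this sketch.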
The C node kind also has the same special shorthand for named arguments, in the form of C, as the C node kind does with its C, but that C's version also works with subject-to-update arguments. Examples: #`two mandatory params, one s-t-u, one r-o`# assign( &foo, 3 ) #`same as previous`# assign( 3, &foo ) #`still same as previous but with all-named syntax`# assign( &target => foo, v => 3 ) #`three mandatory params`# nlx.lib.lookup( &=>addr, =>people, =>name ) fetch_trans_instant( &now ) prompt_Text_line( &name, 'Enter a person\'s name: ' ) Integer.fetch_random( &rand, interval ) =head2 Generic Try-Catch Statements Grammar: ::= try [ catch ]? ::= ::= A C node represents a try-catch control flow statement. It is interpreted as a tuple of a Muldis D C value. The C and C elements specify the C and C attributes, respectively, of the new C, which are the names or definitions of statements that represent the invocation of named procedures. The C routine is unconditionally invoked first and then iff C throws an exception then it will be caught and the C routine, if any, will be invoked immediately after to handle it; if C also throws an exception then it will not be caught. It is invalid for C or C to name or define a procedure statement that isn't just a routine invocation, though the grammar itself doesn't say so; mainly the valid options are: C, C, C, and C or C for the first 3. Examples: try nlx.lib.attempt_the_work() catch nlx.lib.deal_with_failure() =head2 Generic If-Else Statements Grammar: ::= if then [ else ]? ::= ::= An C node represents a ternary if-then-else control flow statement. It is interpreted as a tuple of a Muldis D C value. The C, C, and C elements specify the C, C, and C attributes, respectively, of the new C; C is the condition to evaluate at runtime and must result in a C; iff the result of that condition is C then C is invoked; otherwise, C is invoked. 
Examples: if out_of_options then nlx.lib.give_up() else nlx.lib.keep_going() =head2 Generic Given-When-Default Statements Grammar: ::= given [when then ]* [default ]? ::= ::= A C node represents an N-way given-when-default switch control flow statement that dispatches based on matching a single value with several options. It is interpreted as a tuple of a Muldis D C value. The C element specifies the C attribute of the new C, which is the control value for the statement. The whole collection of nonordered 0..N C + C elements specifies the C attribute, which is a set of I comparands; if any of these I values matches the value of C, its associated I statement is executed as if it were the whole C. The C element specifies the C attribute, which determines the statement that is executed at runtime as if it were the whole C if either C is an empty set or none of its comparands match C. Examples: given picked_menu_item when 'v' then nlx.lib.screen_view_record() when 'a' then nlx.lib.screen_add_record() when 'd' then nlx.lib.screen_delete_record() default nlx.lib.display_bad_choice_error() =head2 Procedure Leave, Iterate, and Loop Statements Grammar: ::= | | ::= leave [ ]? ::= iterate [ ]? ::= loop The 3 node kinds C, C, C are all very useable independently and are also commonly used together. A C node represents an instruction to abnormally exit the block defined by a parent statement node (a normal exit is to simply execute to the end of the block). If the parent node in question is the root (compound) statement node for the host procedure, that is, if the parent node has the empty string as its name, then the latter will be exited; this is how a "return" statement is represented. If the parent node in question is an iterating or looping statement, then any remaining iterations it might have had are skipped, especially useful if it was an infinite loop. A C node is interpreted as a tuple of a Muldis D C value. 
The optional C element specifies the name of the parent statement node to completely abort; that name becomes the C attribute of the new C tuple. Iff the C has no C element then the parser will automatically generate said element with a value of the empty string, meaning it is a "return" statement. An C node represents an instruction to abnormally end the current iteration of a looping block defined by a parent statement node, and then start at the beginning of the next iteration of that loop if there are any left; or, it can also be used to "redo" any non-looping parent statement. It is interpreted as a tuple of a Muldis D C value. The optional C element specifies the name of the parent statement node to continue execution at the beginning of; that name becomes the C attribute of the new C tuple. Iff the C has no C element then the parser will automatically generate said element with a value of the empty string. Having the C value of the empty string means that the root (compound) statement of the host procedure is being referenced, in which case the C is saying to redo the whole procedure. A C node represents a generic looping statement block which iterates until a child "leave" statement executes. It is interpreted as a tuple of a Muldis D C value. The C element specifies the name or definition of the child statement node to be repeatedly executed; the name of that statement becomes the C attribute of the new C tuple. A C node in combination with C or C nodes is useful for a more ad-hoc means of performing procedural iteration as well as for effectively simulating the syntax of common "while" or "for i" loops, so Muldis D doesn't include special "while" or "for i" syntax. 
A C is not an effective "for each item in list" replacement, however; Muldis D currently doesn't provide a procedural "foreach", but typically any such tasks can effectively be performed in functional code using various list-processing relational routines; if a case can be made for procedural "foreach" then Muldis D may gain this feature in the future. Examples: let lookup_person ::= loop [ prompt_Text_line( &name, 'Enter a name to search for: ' ) given name when '' then leave lookup_person nlx.lib.do_search( =>name, &=>not_found, &=>report_text ) if not_found then [ write_Text_line( 'No person matched' ) iterate lookup_person ] write_Text_line( report_text ) ] =head1 DEPRECATED - FUNCTION INVOCATION ALTERNATE SYNTAX EXPRESSIONS Grammar: ::= | | | | | | | | | ... A C node represents the result of invoking a named system-defined function with specific arguments. It is interpreted as a tuple of a Muldis D C value. A C node is a lot like a C node in purpose and interpretation but it differs in several significant ways. While a C node can be used to invoke any function at all, a C node can only invoke a fraction of them, and only standard system-defined functions. While a C node uses a simple common format with all functions, written in prefix notation with generally named arguments, a C node uses potentially unique syntax for each function, often written in infix notation, although inter-function format consistency is still applied as much as is reasonably possible. Broadly speaking, a C node has 2-3 kinds of payload elements: The first is the determinant of what function to invoke, hereafter referred to as an I or I. The second is an ordered list of 1-N mandatory function inputs, hereafter referred to as I
, whose elements typically have generic names like C or C or C. The (optional) third is a named list of optional function inputs, hereafter referred to as I, whose elements tend to have more purpose-specific names such as C, though note that things like C can be either mandatory or optional depending on the op they are being used with. The decision of which system-defined functions get the special alternate syntax treatment partly comes down to respecting common good practices in programming languages, letting people write code in the way they are comfortable with. Most programming languages only have special syntax for a handful of their operators, such as common comparison and boolean and mathematical and string and element extraction operators, and so Muldis D mainly does likewise. Functions get special alternate syntax if they would be frequently used and the syntax would significantly aid programmers in quickly writing understandable code. =head2 Simple Commutative N-adic Infix Reduction Operators Grammar: ::= ** [ ] ::= min | max | and | or | xnor | iff | xor | '+' | '*' | union | intersect | exclude | symdiff | join | times | 'cross-join' | 'union+' | 'union++' | 'intersect+' ::= | '∧' | '∨' | '↔' | '⊻' | '↮' | '∪' | '∩' | '∆' | '⋈' | '×' | '∪+' | '∪++' | '∩+' A C node is for using infix notation to invoke a (homogeneous) commutative N-adic reduction operator function. Such a function takes exactly 1 actual argument, which is unordered-collection typed (set or bag), and the elements of that collection are the inputs of the operation; the inputs are all of the same type as each other and of the result. A single C node is equivalent to a single C node whose C element defines a single argument, whose value is a C or C node, which has a payload C element for each C element of the C, and the relative sequence of the C elements isn't significant.
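The equivalence described above can be sketched in Python as a small desugaring step: an alternating list of operand and operator tokens is turned into a function name plus a single unordered collection. This is only an illustration under stated assumptions. The function C<desugar_reduction>, the token-list input, and the use of C<frozenset> and C<Counter> to stand in for set and bag values are hypothetical, and only a handful of the keyword-to-function pairings given in this section are included. Treating an alias and its keyword as the same operator is also an assumption.

```python
from collections import Counter

# A few alias spellings, normalised to their keyword (subset only).
ALIASES = {"∧": "and", "∨": "or", "∪": "union", "∩": "intersect"}

# keyword -> (function name, kind of collection the inputs are packed into)
FUNCTIONS = {
    "and": ("Core.Boolean.and", "Set"),
    "or": ("Core.Boolean.or", "Set"),
    "xor": ("Core.Boolean.xor", "Bag"),
    "+": ("Core.Numeric.sum", "Bag"),
    "*": ("Core.Numeric.product", "Bag"),
    "union": ("Core.Relation.union", "Set"),
    "intersect": ("Core.Relation.intersection", "Set"),
}


def desugar_reduction(tokens):
    """Turn [operand, op, operand, op, operand, ...] into (function, collection)."""
    operands = tokens[0::2]
    ops = {ALIASES.get(op, op) for op in tokens[1::2]}
    if len(tokens) % 2 == 0 or len(operands) < 2:
        raise ValueError("need at least 2 operands separated by operators")
    if len(ops) != 1:
        raise ValueError("all operators in one reduction must be the same")
    func, kind = FUNCTIONS[ops.pop()]
    # A bag keeps duplicate inputs, a set does not; neither keeps their order.
    collection = Counter(operands) if kind == "Bag" else frozenset(operands)
    return func, collection


print(desugar_reduction(["14", "+", "3", "+", "-5"]))
print(desugar_reduction(["True", "xor", "False", "xor", "True"]))
```

Because the inputs end up in an unordered collection, reordering the operands yields an equal result, which mirrors the statement that their relative sequence is not significant.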
A C node requires at least 2 input value providing child nodes (C must match at least twice), which are its 2-N main op args; if you already have your inputs in a single collection-valued node then use C to invoke the function instead. If C matches more than once in the same C, then all of the C matches must be identical / the same operator. Some of the keywords are aliases for each other: keyword | aliases -----------+-------- and | ∧ or | ∨ xnor | ↔ iff xor | ⊻ ↮ union | ∪ intersect | ∩ exclude | ∆ symdiff join | ⋈ times | × cross-join union+ | ∪+ union++ | ∪++ intersect+ | ∩+ This table indicates which function is invoked by each keyword: min -> Core.Ordered.min( { expr.[0], ..., expr.[n] } ) max -> Core.Ordered.max( { expr.[0], ..., expr.[n] } ) and -> Core.Boolean.and( { expr.[0], ..., expr.[n] } ) or -> Core.Boolean.or( { expr.[0], ..., expr.[n] } ) xnor -> Core.Boolean.xnor( Bag:{ expr.[0], ..., expr.[n] } ) xor -> Core.Boolean.xor( Bag:{ expr.[0], ..., expr.[n] } ) + -> Core.Numeric.sum( Bag:{ expr.[0], ..., expr.[n] } ) * -> Core.Numeric.product( Bag:{ expr.[0], ..., expr.[n] } ) union -> Core.Relation.union( { expr.[0], ..., expr.[n] } ) intersect -> Core.Relation.intersection( { expr.[0], ..., expr.[n] } ) exclude -> Core.Relation.exclusion( Bag:{ expr.[0], ..., expr.[n] } ) join -> Core.Relation.join( { expr.[0], ..., expr.[n] } ) times -> Core.Relation.product( { expr.[0], ..., expr.[n] } ) union+ -> Core.Bag.union( { expr.[0], ..., expr.[n] } ) union++ -> Core.Bag.union_sum( Bag:{ expr.[0], ..., expr.[n] } ) intersect+ -> Core.Bag.intersection( { expr.[0], ..., expr.[n] } ) Examples: a min b min c a max b max c True and False and True True or False or True True xor False xor True 14 + 3 + -5 -6 * 2 * 25 4.25 + -0.002 + 1.0 69.3 * 15*2^6 * 49/23 { 1, 3, 5 } ∪ { 4, 5, 6 } ∪ { 0, 9 } { 1, 3, 5, 7, 9 } ∩ { 3, 4, 5, 6, 7, 8 } ∩ { 2, 5, 9 } =head2 Simple Non-commutative N-adic Infix Reduction Operators Grammar: ::= ** [ ] ::= '[<=>]' | '~' | '//' A C node 
is for using infix notation to invoke a (homogeneous) non-commutative N-adic reduction operator function. Such a function takes exactly 1 actual argument, which is ordered-collection typed (array), and the elements of that collection are the inputs of the operation; the inputs are all of the same type as each other and of the result. A single C node is equivalent to a single C node whose C element defines a single argument, whose value is an C node, which has a payload C element for each C element of the C, and the C elements have the same relative sequence. A C node requires at least 2 input value providing child nodes (C must match at least twice), which are its 2-N main op args; if you already have your inputs in a single collection-valued node then use C to invoke the function instead. If C matches more than once in the same C, then all of the C matches must be identical / the same operator. Exception: with some of these, the actual C derived from this has 2 actual arguments, the first a collection and the second taking a different type of value, from the last op input list element. This table indicates which function is invoked by each keyword: [<=>] -> Core.Cat.Order.reduction( [ expr.[0], ..., expr.[n] ] ) ~ -> Core.Stringy.catenation( [ expr.[0], ..., expr.[n] ] ) // -> Core.Set.Maybe.attr_or_value( [ expr.[0], ..., expr.[n-1] ], value => expr.[n] ) Examples: Same [<=>] Increase [<=>] Decrease 0x'DEAD' ~ 0b'10001101' ~ 0x'BEEF' 'hello' ~ ' ' ~ 'world' [ 24, 52 ] ~ [ -9 ] ~ [ 0, 11, 24, 7 ] a // b // 42 =head2 Simple Symmetric Dyadic Infix Operators Grammar: ::= ::= '=' | '!=' | nand | nor | '|-|' | compose ::= | '≠' | '⊼' | '↑' | '⊽' | '↓' A C node is for using infix notation to invoke a symmetric dyadic operator function. Such a function takes exactly 2 arguments, which are the inputs of the operation; the inputs are all of the same type as each other but the result might be of either that type or a different type. 
A single C node is equivalent to a single C node whose C element defines 2 arguments, and the 2 C elements of the C supply the values of those arguments, and which arguments get which C isn't significant. Some of the keywords are aliases for each other: keyword | aliases --------+-------- != | ≠ nand | ⊼ ↑ nor | ⊽ ↓ This table indicates which function is invoked by each keyword: = -> Core.Universal.is_same( expr.[0], expr.[1] ) != -> Core.Universal.is_not_same( expr.[0], expr.[1] ) nand -> Core.Boolean.nand( expr.[0], expr.[1] ) nor -> Core.Boolean.nor( expr.[0], expr.[1] ) |-| -> Core.Numeric.abs_diff( expr.[0], expr.[1] ) compose -> Core.Relation.composition( expr.[0], expr.[1] ) Examples: foo = bar foo ≠ bar False nand True 15 |-| 17 7.5 |-| 9.0 =head2 Simple Non-symmetric Dyadic Infix Operators Grammar: ::= ::= ::= ::= isa | '!isa' | 'not-isa' | as | asserting | assuming | '<' | '<=' | '>' | '>=' | imp | implies | nimp | if | nif | '-' | '/' | '^' | exp | '~#' | where | '!where' | 'not-where' | inside | '!inside'|'not-inside' | holds | '!holds'|'not-holds' | in | '!in' | 'not-in' | has | '!has' | 'not-has' | '{<=}' | '{!<=}' | '{>=}' | '{!>=}' | '{<}' | '{!<}' | '{>}' | '{!>}' | '{<=}+' | '{!<=}+' | '{>=}+' | '{!>=}+' | '{<}+' | '{!<}+' | '{>}+' | '{!>}+' | minus | except | '!matching' | 'not-matching' | antijoin | semiminus | matching | semijoin | divideby | 'minus+' | 'except+' | like | '!like' | 'not-like' ::= | '≤' | '≥' | '→' | '↛' | '←' | '↚' | '∈@' | '∉@' | '@∋' | '@∌' | '∈' | '∉' | '∋' | '∌' | '⊆' | '⊈' | '⊇' | '⊉' | '⊂' | '⊄' | '⊃' | '⊅' | '⊆+' | '⊈+' | '⊇+' | '⊉+' | '⊂+' | '⊄+' | '⊃+' | '⊅+' | '∖' | '⊿' | '⋉' | '÷' | '∖+' A C node is for using infix notation to invoke a non-symmetric dyadic operator function. Such a function takes exactly 2 arguments, which are the inputs of the operation; the inputs and the result may possibly be all of the same type, or they might all be of different types. 
A single C node is equivalent to a single C node whose C element defines 2 arguments, and the 2 C elements of the C supply the values of those arguments, which are associated in the appropriate sequence. Some of the keywords are aliases for each other: keyword | aliases ----------+-------- !isa | not-isa <= | ≤ >= | ≥ imp | → implies nimp | ↛ if | ← nif | ↚ !where | not-where inside | ∈@ !inside | ∉@ not-inside holds | @∋ !holds | @∌ not-holds in | ∈ !in | ∉ not-in has | ∋ !has | ∌ not-has {<=} | ⊆ {!<=} | ⊈ {>=} | ⊇ {!>=} | ⊉ {<} | ⊂ {!<} | ⊄ {>} | ⊃ {!>} | ⊅ {<=}+ | ⊆+ {!<=}+ | ⊈+ {>=}+ | ⊇+ {!>=}+ | ⊉+ {<}+ | ⊂+ {!<}+ | ⊄+ {>}+ | ⊃+ {!>}+ | ⊅+ minus | ∖ except !matching | ⊿ not-matching antijoin semiminus matching | ⋉ semijoin divideby | ÷ minus+ | ∖+ except+ !like | not-like This table indicates which function is invoked by each keyword: isa -> Core.Universal.is_value_of_type( lhs, type => rhs ) !isa -> Core.Universal.is_not_value_of_type( lhs, type => rhs ) as -> Core.Universal.treated( lhs, as => rhs ) asserting -> Core.Universal.assertion( lhs, is_true => rhs ) assuming -> sys.std.Core.Cat.curried_func_static_exten( function => lhs, args => rhs ) < -> Core.Ordered.is_before( lhs, rhs ) > -> Core.Ordered.is_after( lhs, rhs ) <= -> Core.Ordered.is_before_or_same( lhs, rhs ) >= -> Core.Ordered.is_after_or_same( lhs, rhs ) imp -> Core.Boolean.imp( lhs, rhs ) nimp -> Core.Boolean.nimp( lhs, rhs ) if -> Core.Boolean.if( lhs, rhs ) nif -> Core.Boolean.nif( lhs, rhs ) - -> Core.Numeric.diff( minuend => lhs, subtrahend => rhs ) / -> Core.Numeric.frac_quotient( dividend => lhs, divisor => rhs ) ^ -> Core.Numeric.power_with_whole_exp( radix => lhs, exponent => rhs ) exp -> Core.Integer.power( radix => lhs, exponent => rhs ) ~# -> Core.Stringy.replication( lhs, count => rhs ) where -> Core.Relation.restriction( lhs, func => rhs ) !where -> Core.Relation.cmpl_restr( lhs, func => rhs ) inside -> Core.Relation.tuple_is_member( t => lhs, r => rhs ) !inside -> 
Core.Relation.tuple_is_not_member( t => lhs, r => rhs ) holds -> Core.Relation.has_member( r => lhs, t => rhs ) !holds -> Core.Relation.has_not_member( r => lhs, t => rhs ) in -> Core.Collective.value_is_member( value => lhs, coll => rhs ) !in -> Core.Collective.value_is_not_member( value=>lhs, coll=>rhs ) has -> Core.Collective.has_member( coll => lhs, value => rhs ) !has -> Core.Collective.has_not_member( coll => lhs, value => rhs ) {<=} -> Core.Relation.is_subset( lhs, rhs ) {!<=} -> Core.Relation.is_not_subset( lhs, rhs ) {>=} -> Core.Relation.is_superset( lhs, rhs ) {!>=} -> Core.Relation.is_not_superset( lhs, rhs ) {<} -> Core.Relation.is_proper_subset( lhs, rhs ) {!<} -> Core.Relation.is_not_proper_subset( lhs, rhs ) {>} -> Core.Relation.is_proper_superset( lhs, rhs ) {!>} -> Core.Relation.is_not_proper_superset( lhs, rhs ) {<=}+ -> Core.Bag.is_subset( lhs, rhs ) {!<=}+ -> Core.Bag.is_not_subset( lhs, rhs ) {>=}+ -> Core.Bag.is_superset( lhs, rhs ) {!>=}+ -> Core.Bag.is_not_superset( lhs, rhs ) {<}+ -> Core.Bag.is_proper_subset( lhs, rhs ) {!<}+ -> Core.Bag.is_not_proper_subset( lhs, rhs ) {>}+ -> Core.Bag.is_proper_superset( lhs, rhs ) {!>}+ -> Core.Bag.is_not_proper_superset( lhs, rhs ) minus -> Core.Relation.diff( source => lhs, filter => rhs ) !matching -> Core.Relation.semidiff( source => lhs, filter => rhs ) matching -> Core.Relation.semijoin( source => lhs, filter => rhs ) divideby -> Core.Relation.quotient( dividend => lhs, divisor => rhs ) minus+ -> Core.Bag.diff( source => lhs, filter => rhs ) like -> Core.Text.is_like( look_in => lhs, look_for => rhs ) !like -> Core.Text.is_not_like( look_in => lhs, look_for => rhs ) Note that while the C functions also have an optional third parameter C, you will have to use a C node to exploit it; for simplicity, the infix C and C don't support that customization; but most actual uses of like/etc don't use C anyway. 
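To make the keyword table above more concrete, here is a minimal Python sketch of how an infix use of a non-symmetric dyadic operator might be rewritten as a generic function invocation, with the left and right operands bound to the argument positions or names the table gives. The function C<desugar_dyadic>, its string-based output, and the convention of using C<None> for an anonymous argument are hypothetical. Only a few table rows are reproduced.

```python
# Alias spellings normalised to their keyword (subset of the alias table).
ALIASES = {"≤": "<=", "≥": ">=", "→": "imp", "implies": "imp",
           "∖": "minus", "except": "minus"}

# keyword -> (function, lhs parameter name, rhs parameter name)
# None means the argument is passed anonymously.
FUNCTIONS = {
    "<": ("Core.Ordered.is_before", None, None),
    "<=": ("Core.Ordered.is_before_or_same", None, None),
    ">=": ("Core.Ordered.is_after_or_same", None, None),
    "imp": ("Core.Boolean.imp", None, None),
    "-": ("Core.Numeric.diff", "minuend", "subtrahend"),
    "/": ("Core.Numeric.frac_quotient", "dividend", "divisor"),
    "isa": ("Core.Universal.is_value_of_type", None, "type"),
    "minus": ("Core.Relation.diff", "source", "filter"),
}


def desugar_dyadic(lhs, op, rhs):
    """Render 'lhs op rhs' as the text of the equivalent function invocation."""
    func, lhs_name, rhs_name = FUNCTIONS[ALIASES.get(op, op)]
    args = [
        f"{name} => {value}" if name else value
        for name, value in ((lhs_name, lhs), (rhs_name, rhs))
    ]
    return f"{func}( {', '.join(args)} )"


print(desugar_dyadic("34", "-", "21"))
print(desugar_dyadic("foo", "≤", "bar"))
```

Unlike the symmetric and commutative operators, swapping the two operands here produces a different invocation, since each side is tied to a specific parameter.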
Examples: bar isa bar !isa scalar as int asserting (int ≠ 0) True implies False foo < bar foo > bar foo ≤ bar foo ≥ bar 34 - 21 2 exp 63 9.2 - 0.1 0b101.01 / 0b11.0 '-' ~# 80 a ∈ {1..5} foo ∉ {"min"..^"max"} { 8, 4, 6, 7 } ∖ { 9, 0, 7 } @:[ x, y ]:{ [ 5, 6 ], [ 3, 6 ] } ÷ @:{ { y => 6 } } =head2 Simple Monadic Prefix Operators Grammar: ::= | ::= ::= not abs ::= ? ::= '!' | '#' | '#+' | '%' | '@' ::= | '¬' A C node is for using prefix notation to invoke a monadic operator function. Such a function takes exactly 1 argument, which is the input of the operation. A single C node is equivalent to a single C node whose C element defines 1 argument, and the 1 C element of the C supplies the value of that argument. Some of the keywords are aliases for each other: keyword | aliases --------+-------- not | ¬ ! This table indicates which function is invoked by each keyword: not -> Core.Boolean.not( expr ) abs -> Core.Numeric.abs( expr ) # -> Core.Relation.cardinality( expr ) #+ -> Core.Bag.cardinality( expr ) % -> Core.Cast.Tuple_from_Relation( expr ) @ -> Core.Cast.Relation_from_Tuple( expr ) Examples: not True abs -23 abs -4.59 #{ 5, -1, 2 } %relvar @tupvar =head2 Simple Monadic Postfix Operators Grammar: ::= ? ::= '++' | '--' | '!' A C node is for using postfix notation to invoke a monadic operator function. Such a function takes exactly 1 argument, which is the input of the operation. A single C node is equivalent to a single C node whose C element defines 1 argument, and the 1 C element of the C supplies the value of that argument. This table indicates which function is invoked by each keyword: ++ -> Core.Ordered.Ordinal.succ( expr ) -- -> Core.Ordered.Ordinal.pred( expr ) ! -> Core.Integer.factorial( expr ) Examples: 13++ 4-- 5! =head2 Simple Postcircumfix Operators Grammar: ::= | | | | | ::= | ::= '.{' [? ]? ':' ? ? '}' ::= '.{' ? ? '}' ::= '{' [? ]? ':' ? [ | ] ? '}' ::= '{' ? [ | | | | | ] ? '}' ::= '{' ? [ | | | ] ? '}' ::= ::= [ | ** [? ',' ?] [? ',']?] ::= ? '<-' ?
::= ::= ::= ? ::= '!' ? ::= [ | ** [? ',' ?] [? ',']?] ::= '%' ? '<-' ? ::= '%' ? '<-' ? '!' ? ::= ? '<-' ? '%' ::= '@' ? '<-' ? ::= '@' ? '<-' ? '!' ? ::= ? '<-' ? '@' ::= '#@' ? '<-' ? '!' ? ::= ::= ::= ::= ::= '.{*}' ::= | ::= '.[' ? ? ']' ::= '#' | | ::= '[' ? ? ? ? ']' ::= ::= A C node is for using postcircumfix notation to invoke a relational operator function whose operation involves deriving a single tuple|relation from another single tuple|relation customized only by further inputs that are attribute names. Such a function takes exactly 2 (C and C|C) or 3 (C and C and C|C) or 3 (C and C and C) primary arguments, which are the inputs of the operation. A single C node is equivalent to a single C node whose C element defines 2-3 arguments, and the 2-3 C elements of the C supply the values of those arguments, which are associated in the appropriate sequence. This table indicates which function is invoked by each format-keyword: .{:} -> Core.Scalar.attr( expr, possrep => possrep_name, name => attr_name ) .{} -> Core.Tuple.attr( expr, name => attr_name ) {<-} -> Core.Attributive.rename( expr, map => @:{ { after => atnm_after.[0], before => atnm_before.[0] }, ..., { after => atnm_after.[n], before => atnm_before.[n] }, } ) {:} -> Core.Scalar.projection( expr, possrep => possrep_name, attr_names => { pcf_atnms.[0], ..., pcf_atnms.[n] } ) {} -> Core.Attributive.projection( expr, attr_names => { pcf_atnms.[0], ..., pcf_atnms.[n] } ) {:!} -> Core.Scalar.cmpl_proj( expr, possrep => possrep_name, attr_names => { pcf_atnms.[0], ..., pcf_atnms.[n] } ) {!} -> Core.Attributive.cmpl_proj( expr, attr_names => { pcf_atnms.[0], ..., pcf_atnms.[n] } ) {%<-} -> Core.Attributive.wrap( expr, outer => outer_atnm, inner => { inner_atnms.[0], ..., inner_atnms.[n] } ) {%<-!} -> Core.Attributive.cmpl_wrap( expr, outer => outer_atnm, cmpl_inner => { cmpl_inner_atnms.[0], ... 
} ) {<-%} -> Core.Attributive.unwrap( expr, inner => { inner_atnms.[0], ..., inner_atnms.[n] }, outer => outer_atnm ) {@<-} -> Core.Relation.group( expr, outer => outer_atnm, inner => { inner_atnms.[0], ..., inner_atnms.[n] } ) {@<-!} -> Core.Relation.cmpl_group( expr, outer => outer_atnm, group_per => { cmpl_inner_atnms.[0], ... } ) {<-@} -> Core.Relation.ungroup( expr, inner => { inner_atnms.[0], ..., inner_atnms.[n] }, outer => outer_atnm ) {#@<-!} -> Core.Relation.cardinality_per_group( expr, count_attr_name => count_atnm, group_per => { cmpl_inner_atnms.[0], ... } ) .{*} -> Core.Set.Maybe.attr( expr ) .[] -> Core.Array.value( expr, =>index ) [] -> Core.Array.slice( expr, index_interval => { min_index interval_boundary_kind max_index } ) Examples: birthday.{date:day} pt.{city} pt{pnum<-pno, locale<-city} pr{pnum<-pno, locale<-city} birthday{date:year,month} pt{color,city} pr{color,city} pt{} #`null projection`# pr{} #`null projection`# rnd_rule{:!round_meth} #`radix,min_exp`# pt{!pno,pname,weight} pr{!pno,pname,weight} person{%name <- fname,lname} people{%name <- fname,lname} person{%all_but_name <- !fname,lname} people{%all_but_name <- !fname,lname} person{fname,lname <- %name} people{fname,lname <- %name} orders{@vendors <- vendor} orders{@all_but_vendors <- !vendor} orders{vendor <- @vendors} people{#@count_per_age_ctry <- !age,ctry} maybe_foo.{*} ary.[3] ary[10..14] =head2 Numeric Operators That Do Rounding Grammar: ::= ::= | | | ::= ::= div | mod | '**' | log ::= 'e**' ::= 'log-e' ::= round [ | ] ::= ::= A C node is for using infix or prefix or postfix notation to invoke a rational numeric operator function whose operation involves rounding a number to one with less precision. Such a function takes exactly 1 (C) or 2 (C and C) primary arguments, which are the inputs of the operation, plus a special C argument which specifies explicitly the semantics of the numeric rounding in a declarative way (all 2 or 3 of these are I
). A single C node is equivalent to a single C node whose C element defines 2-3 arguments, and the C elements of the C supply the values of those arguments, which are associated in the appropriate sequence. This table indicates which function is invoked by each keyword: div -> Core.Numeric.whole_quotient( dividend => lhs, divisor => rhs, =>round_meth ) mod -> Core.Numeric.remainder( dividend => lhs, divisor => rhs, =>round_meth ) -> Core.Rational.round( expr, =>round_rule ) ** -> Core.Rational.power( radix => lhs, exponent => rhs, =>round_rule ) log -> Core.Rational.log( lhs, radix => rhs, =>round_rule ) e** -> Core.Rational.natural_power( expr, =>round_rule ) log-e -> Core.Rational.natural_log( expr, =>round_rule ) Examples: 5 div 3 round ToZero 5 mod 3 round ToZero foo round RatRoundRule:[10,-2,HalfEven] 2.0 ** 0.5 round RatRoundRule:[2,-7,ToZero] 309.1 log 5.4 round RatRoundRule:[10,-4,HalfUp] e** 6.3 round RatRoundRule:[10,-6,Up] 17.0 log-e round RatRoundRule:[3,-5,Down] =head2 Order Comparison Operators Grammar: ::= '<=>' [ ]? [ ]? An C node is for using infix notation to invoke an order comparison operator function. I
This table indicates which function is invoked by each keyword: <=> -> Core.Ordered.order( lhs, rhs ) Examples: foo <=> bar =head1 DEPRECATED - PROCEDURE INVOCATION ALTERNATE SYNTAX STATEMENTS Grammar: ::= | | ... A C node represents the invocation of a named system-defined procedure with specific arguments. It is interpreted as a tuple of a Muldis D C value. A C node is a lot like a C node in purpose and interpretation but it differs in several significant ways. While a C node can be used to invoke any procedure at all, a C node can only invoke a fraction of them, and only standard system-defined procedures. While a C node uses a simple common format with all procedures, written in prefix notation with generally named arguments, a C node uses potentially unique syntax for each procedure, often written in infix notation, although inter-procedure format consistency is still applied as much as is reasonably possible. Broadly speaking, a C node has 2-3 kinds of payload elements: The first is the determinant of what procedure to invoke, hereafter referred to as an I or I. The second is an ordered list of 1-N mandatory procedure inputs, hereafter referred to as I
, whose elements typically have generic names like C or C or C. The (optional) third is a named list of optional procedure inputs, hereafter referred to as I, whose elements tend to have more purpose-specific names such as C, though note that things like C can be either mandatory or optional depending on the op they are being used with.

Note that the procedures with alternate syntax include recipes, and they are all shown in one list for simplicity. But only alternate syntax for recipes is valid in a recipe; all of these alternate syntaxes are valid in a procedure.

=head2 Procedure Simple Monadic Postfix Operators

Grammar: ::= ::= ':=++' | ':=--'

A C node is for using postfix notation to invoke a monadic operator procedure. Such a procedure takes exactly 1 argument, which is the input of the operation. A single C node is equivalent to a single C node whose C element defines 1 argument, and the 1 C element of the C supplies the value of that argument and takes its result.

This table indicates which procedure is invoked by each keyword:

    :=++ -> Core.Ordered.Ordinal.assign_succ( expr )
    :=-- -> Core.Ordered.Ordinal.assign_pred( expr )

Examples:

    counter :=++

    countdown :=--

=head2 Procedure Simple Non-symmetric Dyadic Infix Operators

Grammar: ::= ::= ':=' | ':=union' | ':=where' | ':=!where' | ':=not-where' | ':=intersect' | ':=minus' | ':=except' | ':=!matching' | ':=not-matching' | ':=antijoin' | ':=semiminus' | ':=matching' | ':=semijoin' | ':=exclude' | ':=symdiff' ::= | ':=∪' | ':=∩' | ':=∖' | ':=⊿' | ':=⋉' | ':=∆'

A C node is for using infix notation to invoke a non-symmetric dyadic operator procedure. Such a procedure takes exactly 2 arguments. A single C node is equivalent to a single C node whose C element defines 2 arguments, and the 2 C elements of the C supply the values of those arguments, which are associated in the appropriate sequence. When using this infix syntax, the C<&> sigil isn't used to mark the subject-to-update argument(s).
Some of the keywords are aliases for each other:

    keyword     | aliases
    ------------+--------
    :=union     | :=∪
    :=!where    | :=not-where
    :=intersect | :=∩
    :=minus     | :=∖ :=except
    :=!matching | :=⊿ :=not-matching :=antijoin :=semiminus
    :=matching  | :=⋉ :=semijoin
    :=exclude   | :=∆ :=symdiff

This table indicates which procedure is invoked by each keyword:

    :=          -> Core.Universal.assign( &lhs, rhs )
    :=union     -> Core.Relation.assign_union( &lhs, rhs )
    :=where     -> Core.Relation.assign_restriction( &lhs, rhs )
    :=!where    -> Core.Relation.assign_cmpl_restr( &lhs, rhs )
    :=intersect -> Core.Relation.assign_intersection( &lhs, rhs )
    :=minus     -> Core.Relation.assign_diff( &lhs, rhs )
    :=!matching -> Core.Relation.assign_semidiff( &lhs, rhs )
    :=matching  -> Core.Relation.assign_semijoin( &lhs, rhs )
    :=exclude   -> Core.Relation.assign_exclusion( &lhs, rhs )

Examples:

    #`assign 3 to foo`#
    foo := 3

    #`swap x and y using pseudo-variables`#
    %:{"0"=>x,"1"=>y} := %:{"0"=>y,"1"=>x}

    #`TODO : A SUBSEQUENT SPEC UPDATE WILL MAKE THIS SHORT FORM VALID`#
    %:{x,y} := %:{y,x}

    #`delete every person in people whose age is either 10 or 20`#
    people :=!matching @:{ { age => 10 }, { age => 20 } }

=head1 LANGUAGE MNEMONICS

PTMD_STD is designed to respect a variety of mnemonics that bring it some self-similarity and an association between syntax and semantics so that it is easier to read and write Muldis D code. Some of these mnemonics are more about self-similarity and others are more about shared traits with other languages. I

=head2 Bareword Strings

All barewords, meaning runs of non-quoted non-whitespace alphanumeric characters plus {C<_>,C<->,C<.>}, are generally one of these 3 things: language keywords, entity names (including declarations or invocations of routines/operators, types, variables), numeric literals (which may also contain {C<#>,C,C<*>,C<^>}).
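As a rough illustration of the bareword rule above, the following Python sketch sorts a bareword into one of the three kinds. It is only a sketch under stated assumptions, not part of the specification: "alphanumeric" is narrowed to ASCII letters and digits, the keyword set is a small hypothetical sample of keywords mentioned in this document, and the leading-digit test for numeric literals is a simplification that ignores the extra characters numeric literals may contain.

```python
import re

# Assumption: "alphanumeric" is narrowed to ASCII letters and digits here.
BAREWORD = re.compile(r"[A-Za-z0-9_.\-]+")

# Hypothetical sample only; a real parser knows every keyword valid in context.
SAMPLE_KEYWORDS = frozenset(
    {"not", "abs", "div", "mod", "log", "round", "isa", "as", "asserting"}
)


def classify_bareword(word, keywords=SAMPLE_KEYWORDS):
    """Classify a bareword as 'keyword', 'numeric literal' or 'entity name'."""
    if BAREWORD.fullmatch(word) is None:
        raise ValueError(f"not a bareword: {word!r}")
    # Keywords are checked first, so an ambiguous bareword becomes a keyword.
    if word in keywords:
        return "keyword"
    # Simplifying assumption: a numeric literal starts with a digit,
    # optionally preceded by a minus sign.
    unsigned = word[1:] if word.startswith("-") else word
    if unsigned[:1].isdigit():
        return "numeric literal"
    return "entity name"
```

For instance, classify_bareword("abs") yields "keyword", classify_bareword("-23") yields "numeric literal", and classify_bareword("nlx.lib.MyType") yields "entity name".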
=head2 Quoted Strings

All quoted strings, meaning runs of characters delimited with any of {C<'>,C<">,C<`>}, are generally one of these 3 things: entity names (iff C<">-quoted), value literals for general purpose string-like types (iff C<'>-quoted), code comments (iff C<`>-quoted, and typically also C<#>-delimited forming double-character delimiters).

=head2 C<#>

The C<#> character is mainly associated with numbers in some way but is also associated with code comments, though in the latter case it always appears together with C<`>. Iff C<#> appears as part of a numeric value literal, it is in a manner like and inspired by the "based" literals of the Ada language, and separates the radix specifier from the main part of the literal that it describes. If C<#> appears other than for a code comment or numeric literal then it generally means "count" or "cardinality".

=head2 {C<$>,C<%>,C<@>}

The {C<$>,C<%>,C<@>} characters are mainly associated with scalars, tuples, and relations, respectively. They are used both to distinguish value literals of those types as well as operators for those types. In addition, the C<@> character is used to indicate routine "dispatch" parameters.

=head2 Bracketing Characters

Pairs of corresponding bracketing characters, meaning {C<()>,C<[]>,C<{}>}, are generally associated with groupings or lists of various kinds and serve to delimit such.

The C<()> round-parenthesis pair is associated with routine signatures (parameter list or result declarations), routine invocations (argument list consisting of ordered or named arguments), and disambiguating any functional code or value expressions (any parenthesis-delimited routine body or code block therein is a function or a value expression).

The C<[]> square-bracket pair is associated with ordered lists and is used for: delimiting sequences of procedure statements, delimiting array value literals, array-subscripting operators, and the reduction meta-operator.
The C<{}> pair is associated with unordered lists and is used for: delimiting multi-update statements, delimiting value literals of {tuples, relations, (component-wise) scalars, sets, bags}, and in postcircumfix operators for those same types.

=head2 List-Separating Characters

The C<,> and C<;> characters are mainly used to separate (or trail) each of the 0..N members of groupings or lists. The C<,> comma is considered tighter and is used for most groupings or lists, including: routine signatures, routine invocations, collection value literals, postcircumfix operators. The C<;> semicolon is considered looser and is used for such things as separating off-side-defined things like named value expressions or variables or statements or inner materials.

=head2 C<::=>

The C<::=> infix token is used for name binding; it declares that whatever is on the right-hand side is associated with the entity name given on the left-hand side. The C<::=> is thus used to explicitly name value expressions, procedure statements, library materials, and to associate global variables with lexical aliases.

=head2 Pairs

The infix tokens {C<:>,C<< => >>,C<< <- >>,C<< <-- >>} are mainly used between two items to designate that they form a pair of some kind. The C<:> is the most common and is used for any context where the left-hand side of the pair is always an entity name or heading, including: a variable/parameter/attribute-like typed entity declaration (var/param/attr-name : type-name), a named argument or tuple literal attribute (arg/attr-name : value-expr), and a routine heading from a routine body; a C<:> also separates the main parts of some value literals, such as the literal kind keyword from the main literal. The C<< => >> is used for any context where two arbitrary value expressions are paired such as in a C literal or a C literal. The C<< <- >> is used just between 2 entity names in the postcircumfix renaming operator.
The C<< <-- >> is used just in a function signature between the result type and parameter list.

=head1 RESOLVING AMBIGUITY

=head2 Entity Names vs Keywords

A user-defined entity name may be any character string at all. In the general case, one must appear formatted as a C, but if the entity name only uses a limited set of characters, then it may appear formatted as a C instead, which is essentially the same bareword format as the PTMD_STD language keywords.

When any PTMD_STD code contains a bareword whose meaning is ambiguous, in that it could be interpreted as either a reference to a user-defined entity or as a specific context-appropriate language keyword (including a routine invocation alternate syntax), then the parser must always resolve it to the keyword. In these contexts, you must format a user-defined entity name as a C in order for it to be interpreted correctly. Similarly, a user-defined entity name in C format is guaranteed to never be confused with a language keyword.

=head2 Statements vs Expressions

Within a procedure, arbitrary value expressions may be used as the left-hand-side of infix procedure calls, and some of those expressions may normally have the same leading syntax as some kinds of statements. For example, an C expression can look like a C, and a C expression can look like a C, an C can look like an C, and so on.

When any PTMD_STD code exists whose meaning is ambiguous from the context as to whether it is a statement or an expression, then the parser must always resolve it to the statement. In these contexts, you must surround the expression with parentheses, a C, in order for it to be interpreted correctly. Similarly, any routine code within parentheses is always one or more expressions.
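The keyword-first rule from "Entity Names vs Keywords" above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name, its parameters, and the idea of passing the context-appropriate keywords as a set are all assumptions made for the example.

```python
def resolve_name(text, quoted, context_keywords):
    """Decide whether a name token denotes a keyword or a user-defined entity.

    text             -- the token's characters, without any quote delimiters
    quoted           -- True if the token was written in quoted-name format
    context_keywords -- hypothetical set of keywords valid at this position
    """
    # A quoted name is never confused with a language keyword.
    if quoted:
        return ("entity", text)
    # An ambiguous bareword must always resolve to the keyword.
    if text in context_keywords:
        return ("keyword", text)
    # Any other bareword is taken as a user-defined entity name.
    return ("entity", text)
```

Under these assumptions, resolve_name("union", False, {"union", "join"}) gives ("keyword", "union"), whereas the same text passed with quoted set to True gives ("entity", "union"), which is why a user-defined entity that shares its name with a keyword has to be written in quoted form.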
=head1 NESTING PRECEDENCE RULES

This documentation section outlines Muldis D's PTMD_STD dialect's nesting precedence rules, meaning how it accepts Muldis D code lacking explicit expression delimiters and implicitly delimits the expressions therein, in a fully deterministic manner.

PTMD_STD has 10 precedence levels when the C pragma is C; if it is C instead, then 6.5 of the levels can be eliminated, so then PTMD_STD has just 3.5; if it is C instead, then 2.5 more can be eliminated, leaving just 1.

Here we list the levels from "tightest" to "loosest", along with a few examples of each level:

    Level            | Assoc | Examples
    -----------------+-------+---------------------------------------------
    Terms            | N/A   | Inf True Order:Same Down 42 3.14 -5/7 3*2^8
                     |       | F#'27E04' 'eek' foo "x" #`comment!`#
                     |       | {43,9,5} [1,2,3] {'Carrots'=>42} {11..20}
                     |       | $:{...} %:{...} @:{...} nlx.lib.MyType
                     |       | (1+2) myfunc(...) nlx.data.quux .age
    -----------------+-------+---------------------------------------------
    Postfix          | N/A   | func().attr p.{...} r{...} x.[...] y[...]
                     |       | ++ -- ! log-e
    -----------------+-------+---------------------------------------------
    Generic Prefix   | N/A   | abs # #+ % @ e**
    Generic Infix    | left  | assuming
                     |       | ^ exp ** log
                     |       | * / div mod intersect join times divideby
                     |       | where !where matching !matching compose
                     |       | intersect+
                     |       | + - |-| ~ ~# union exclude minus
                     |       | union+ union++ minus+ round
                     |       | as asserting min max //
    -----------------+-------+---------------------------------------------
    Comparison       | left  | <=> = != < > <= >= isa !isa like !like
                     |       | inside !inside holds !holds in !in has !has
                     |       | {<=} {!<=} {>=} {!>=} {<} {!<} {>} {!>}
                     |       | {<=}+ {!<=}+ {>=}+ {!>=}+ {<}+ {!<}+ {>}+ {!>}+
    -----------------+-------+---------------------------------------------
    Logical Prefix   | N/A   | not ! ¬
    Logical Infix    | left  | and ∧ nand ⊼ ↑ [<=>]
                     |       | or ∨ nor ⊽ ↓ xor ⊻ ↮
                     |       | imp → nimp ↛ if ← nif ↚
                     |       | xnor ↔
    -----------------+-------+---------------------------------------------
    Shorting Infix   | right | ??!! if-else-expr given-when-def-expr
    Binding Infix    | right | ::=
    -----------------+-------+---------------------------------------------
    Assignment       | non   | := :=++ :=-- :=foo

Any imperative code that embeds a value expression has looser precedence than all value expressions.

Using two C symbols below generically to represent any pair of operators that have the same precedence, the associativities specified above for binary, ternary, or N-ary operators are interpreted as follows:

    Assoc | Meaning of $a ! $b ! $c
    ------+------------------------
    left  | ($a ! $b) ! $c
    right | $a ! ($b ! $c)
    non   | ILLEGAL

=head1 SEE ALSO

Go to L for the majority of distribution-internal references, and L for the majority of distribution-external references.

=head1 AUTHOR

Darren Duncan (C)

=head1 LICENSE AND COPYRIGHT

This file is part of the formal specification of the Muldis D language.

Muldis D is Copyright © 2002-2011, Muldis Data Systems, Inc.

See the LICENSE AND COPYRIGHT of L for details.

=head1 TRADEMARK POLICY

The TRADEMARK POLICY in L applies to this file too.

=head1 ACKNOWLEDGEMENTS

The ACKNOWLEDGEMENTS in L apply to this file too.

=cut