Text:'hartmark', 'login_pass' => Text:'letmein', 'is_special' => Bool:true, } Tuple:{ 'name' => Text:'Michelle', 'age' => Int:9:17, } ---- Same in PTMD_Perl_Tiny dialect: [ 'Tuple', {} ] [ 'Tuple', { 'login_name' => [ 'Text', 'hartmark' ], 'login_pass' => [ 'Text', 'letmein' ], 'is_special' => [ 'Bool', 'md_enum', 'true' ], } ] [ 'Tuple', { 'name' => [ 'Text', 'Michelle' ], 'age' => [ 'Int', 'perl_int', 17 ], } ] ---- Relations in PTMD_Tiny dialect: Relation:{} Relation:{ {}, } Relation:{ 'x', 'y', 'z', } Relation:{ { 'login_name' => Text:'hartmark', 'login_pass' => Text:'letmein', 'is_special' => Bool:true, }, } Relation:[ 'name', 'age', ]:{ [ Text:'Michelle', Int:9:17, ], } ---- Same in PTMD_Perl_Tiny dialect: [ 'Relation', [] ] [ 'Relation', [ {}, ] ] [ 'Relation', [ 'x', 'y', 'z', ] ] [ 'Relation', [ { 'login_name' => [ 'Text', 'hartmark' ], 'login_pass' => [ 'Text', 'letmein' ], 'is_special' => [ 'Bool', 'md_enum', 'true' ], }, ] ] [ 'Relation', [ 'name', 'age', ], [ [ [ 'Text', 'Michelle' ], [ 'Int', 'perl_int', 17 ], ], ] ] ---- More: * Canonical Set, Array, Bag * Just relations with pre-def attrs * Tiny dialects have special syntax ---- Scalar types: * Encapsulated * Like OO languages' object classes * All their attrs private * Only accessible indirectly ---- Scalar type comprises "possreps": * "possible representations" * Each looks like a tuple * Possrep like OO "role" or "interface" ---- PTMD_Tiny dialect: Scalar:'fed.lib.the_db.WeekDay':'name':{ '' => Text:'monday', } Scalar:'fed.lib.the_db.WeekDay':'number':{ '' => Int:9:5, } HDMD_Perl_Tiny dialect: [ 'Scalar', 'fed.lib.the_db.WeekDay', 'name', { '' => [ 'Text', 'monday' ], } ] [ 'Scalar', 'fed.lib.the_db.WeekDay', 'number', { '' => [ 'Int', 'perl_int', 5 ], } ] ---- Some scalar types special: * Bool, Int, Blob, Text * No possreps * Others not for general purpose use * All are "big" as applicable ---- Boolean types ---- Muldis D: Bool SQL: BOOLEAN, TINYINT(1), INTEGER, NUMBER(1), ... ---- Numeric types ---- Muldis D: Int, Rat SQL: TINYINT, SMALLINT, MEDIUMINT, INTEGER, BIGINT, NUMERIC, NUMBER, DECIMAL, REAL, FLOAT, DOUBLE, BIT, PLS_INTEGER, BINARY_INTEGER, ... ---- Numeric types: * All bigints / bignums * Radix agnostic * Exact precision * For all rationals * Fixed and floating-point * Int has no possrep * Rat has 2 possreps ---- Binary string types ---- Muldis D: Blob SQL: BIT, BINARY, VARBINARY, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB, BYTEA, RAW, LONG RAW, ... ---- Character string types ---- Muldis D: Text SQL: CHAR, VARCHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, CLOB, NATIONAL variants, VARCHAR2, LONG, ... ---- All Muldis D character string types are Unicode ---- Also: * Temporals - dates, times, durations * Spatials - 2D geometrics, GIS * They're incomplete ---- Routines: * 3 main kinds * functions * updaters * procedures ---- Functions ---- create in fed.myapp function cube { result sys.std.Core.Type.Int params { topic(sys.std.Core.Type.Int) } exprs { exp(sca_lit Int:9:3), ''(func_invo sys.std.Integer.power args { radix(lex.topic), exponent(lex.exp) } ), } } ; ; ; ---- Functions: * Pure * Only invoke in value expressions * Just read-only params * Just has value expression * Only invokes functions ---- Updaters ---- create in fed.myapp updater chop_sign { upd_params { topic(sys.std.Core.Type.Int) } exprs { abs_val(func_invo sys.std.Integer.abs args { topic(lex.topic) } ), } stmt { sys.std.Core.Universal.assign upd_args { target(lex.topic) } ro_args { v(lex.abs_val) }, } } ---- Updaters: * Imperative * All statements simultaneous * May update multiple subject-to-update params ---- Functions and updaters: * Fully deterministic * No lexical vars * Can't see globals * Implicitly atomic * Safe to implicitly refactor * Have value exprs of named nodes * Helps docs, common sub-expr refactor ---- Procedures ---- create in fed.myapp procedure fetch_all_people { upd_params { people(type.var.fed.mydb.people) } stmts [ proc_invo sys.std.Core.Universal.assign upd_args { target(lex.people) } ro_args { v(fed.mydb.people) }, ] } ; ; ; ; ; ---- Procedures: * Imperative * Can see/update globals * Have lexicals * Seq of ordered statements * May be non-deterministic * May have side-effects in non-args * Pure imperative * Updaters bridge gap ---- Inner routines: * For all routine kinds * Don't have to externalize parts * Like SQL WITH in general * Support internal recursion ---- For example, to represent a SQL procedure like this: CREATE PROCEDURE count_people (INOUT p_count INT) SELECT COUNT(*) INTO p_count FROM mydb.people; ... a Muldis DB procedure would use an inner updater to hold the "COUNT(*) FROM people" portion, which is a value expression ---- create in fed.myapp procedure count_people { upd_args { p_count(sys.std.Core.Type.UInt) } stmts [ proc_invo inn.icount upd_args { p_count(lex.p_count) } ro_args { the_db(fed.mydb) }, ] updater icount { upd_params { p_count(sys.std.Core.Type.UInt) } ro_params { the_db(type.var.fed.mydb) } exprs { pc(func_invo sys.std.Core.Relation.cardinality args { topic(lex.the_db.people) } ), } stmt { sys.std.Core.Universal.assign upd_args { target(lex.p_count) } ro_args { v(lex.pc) }, } } } ---- Routines: * All params named, not positional * All args marked updateable or not * Helps self-documentation ---- Merger: * Generic N-ary operator * SQL aggregate operator * Any built-ins possible are N-ary * Can be used in summary ---- Examples: * Boolean and, or, xor * Ordinal min, max * Numeric sum, product * String catenate * Relational join, product, union, intersect * All these are associative * All but catenate are commutative ---- Nonscalar attributes ---- Data types and relational model orthogonal * Attr may be tuple or relation * Not just scalar * All relational ops work with them * No distinction of SQL MULTISET vs table * Consistency is good * Support for TVAs, RVAs mandatory in "D" * Extremely useful ---- RVA means: * Parent+child in single query * No redundancy * Child tuples in RVA of parents * Very natural for multi-level report * No many-to-many assoc tables needed * No need to split relvar to normalize ---- Relation:[ 'order_id', 'cust_id', 'order_item' ]:{ [ Int:9:5073, Int:9:209, Relation:{ { 'prod_id' => Text:'paint', 'qty' => Int:9:3 }, } ], [ Int:9:5074, Int:9:332, Relation:{ { 'prod_id' => Text:'axle', 'qty' => Int:9:20 }, { 'prod_id' => Text:'lever', 'qty' => Int:9:50 }, { 'prod_id' => Text:'paint', 'qty' => Int:9:8 }, } ], [ Int:9:5075, Int:9:17, Relation:{ { 'prod_id' => Text:'banana', 'qty' => Int:9:100 }, { 'prod_id' => Text:'orange', 'qty' => Int:9:160 }, } ], } ---- Values in logic ---- Native 2-valued logic: * No SQL 3+VL NULL where NULL<>itself * All Muldis D values equal themselves * Alternate canon for "N/A"/"unknown" with 2VL ---- To make attr conceptually nullable: * Make it an RVA * Or a Maybe (Set with 0..1 tuples) * IS (NOT) NULL becomes is (not) Maybe/RVA empty * Empty means IS NULL * Non-empty means IS NOT NULL * Tuple's attribute is the known value * Brings consistency for outer-joins ---- Identity matters ---- "is_identical", "assign" operators: * Work for all types, including nonscalars * Compare/assign whole relations, no contriving * Text case/accents/ws diffs significant * Derived ops (joins etc) act the same * Floating point nums treated as exact * Unicode graphemes significant, codepoints not * Tuples in relation always ident by value * No implicit extra attrs like SQL "row id" ---- Type safety * Strong typing * No implicit sideways type conversion/casting ---- Virtual reality ---- Views: * A var attr is a var too * A whole database is a tuple var * Database relvars are virtual * SQL schemas are databases in databases * Hierarchy with relation leaves ---- Merger: * SQL viewed tables * SQL generated columns * Special functional dependencies between attrs * Virtual attrs gen from other attrs * Mapping functions say how * Virtuals updateable if possible * Case-insensitive key example ---- System Catalog ---- System catalog: * Represents SQL "information schema" * Is special dbvar * Separate from user data dbvar * Is updateable ---- Updating system catalog: * Fundamental means of data definition * DD-specific routines just abstraction * Put desc of relvar into it to create * Put code-as-data in to make routines, types * Compiles, becomes runnable as side-effect * Even connect/disconnect a DBMS * Do all DBMS ops with just queries, DML ---- Catalog contents: * Abstract AST-like data structures * Actually tuples and relations * Not code strings * Easy for DB wrapper to introspect a DB * Even its stored routines * System-defined entities too ---- Depots ---- A depot: * Local abstract of typically external storage * Holds a user data dbvar, related catalog dbvar * Maybe details to map reality * Represents SQL concept of "the database" * Packages user data, user-def types, routines * May just have code - a dynamic loaded lib * Persistent or transient, read-only or not ---- Multiple depots: * Mountable by DBMS at once * Have own namespaces like filesystem * Perform cross-database queries or updates * They're like one database then * Cloning a database just a variable assignment ---- A depot is a completely independent unit: * All types, constraints for dbvar kept together * Likewise for routines etc * No ext dependencies to understand its contents * No inter-depot constraints ---- DB authentication details: * Such as DB user/pass * Apply per depot mount, not on whole DBMS ---- Entity Names ---- All DBMS entities in one namespace: * Hierarchical * System and user defined * Depots, variables, types, routines * Entities always invoked fully-qualified * No name search ambiguity * No subtle code breaks due to new entities ---- Organization: sys - system-defined entities sys.std - official Muldis D features sys.imp - extra implementation-specifics mnt - depot mount controls fed - user-def entities in all depots dep - in just current depot sdp - current subdepot/schema pkg - current Oracle-concept package inn - local inner types/routines lex - lexical param args, vars, expr nodes ---- Special features: * Special suffixes for drilling into attrs * Special prefixes for taking type of something * or avoiding type def explosions ---- Entity names: * As per delimited SQL identifiers * Case/etc sensitive * No reserved words * Unicode * May be the empty string * Users have complete freedom ---- Code comparisons ---- Muldis D: 'x' SQL (as a scalar query): (SELECT 'x') or: (SELECT 'x' FROM dual) ---- Muldis D: tab1 SQL: SELECT * FROM tab1 ---- Muldis D: sys.std.Core.Relation.cardinality( topic(tab1) ) SQL (as a scalar query): (SELECT COUNT(*) FROM tab1) ---- Muldis D: sys.std.Core.Relation.projection( topic(tab1), attrs(Set:{ 'col1', 'col2' }), ) SQL: SELECT DISTINCT col1, col2 FROM tab1 ---- Muldis D: sys.std.Core.Relation.semijoin( source(tab1), filter(Relation:{ { col1(Text:'hello'), col2(Int:9:5) }, { col1(Text:'world'), col2(Int:9:7) }, }), ) ; ; ; ; ; ; ---- SQL: SELECT * FROM tab1 WHERE (col1, col2) IN ( SELECT 'hello' AS col1, 5 AS col2 UNION SELECT 'world' AS col1, 7 AS col2 ) ; ; ; ; ; ; ---- or: SELECT * FROM tab1 WHERE col1 = 'hello' AND col2 = 5 OR col1 = 'world' AND col2 = 7 ; ; ; ; ; ; ; ; ; ---- The next example is a natural join of 2 relations whose attrs are as follows: tab1(col1,col2,col3), tab2(col2,col3,col4), such that the result is tab(col1,col2,col3,col4) ---- Muldis D: sys.std.Core.Relation.join( topic(QuasiSet:{ tab1, tab2 }), ) SQL: SELECT DISTINCT tab1.*, tab2.col4 FROM tab1 NATURAL INNER JOIN tab2 or: SELECT DISTINCT tab1.*, tab2.col4 FROM tab1 INNER JOIN tab2 USING (col2, col3) or: SELECT DISTINCT tab1.*, tab2.col4 FROM tab1 INNER JOIN tab2 ON tab2.col2 = tab1.col2 AND tab2.col3 = tab1.col3 ; ; ---- Muldis D: sys.std.Core.Relation.rename( topic(tab1), map(Relation:['before','after']:{ ['col1', 'foo'], ['col3', 'bar'], }), ) SQL: SELECT col1 AS foo, col2, col3 AS bar FROM tab1 ---- Muldis D: with { function myext { result sys.std.Core.Type.Tuple params { topic(sys.std.Core.Type.Tuple) } body { Tuple:{ a => Text:'x', b => sys.std.Integer.difference( minuend(lex.topic.m), subtrahend(lex.topic.n) ), } } }, } sys.std.Core.Relation.extension( topic(tab1), func(inn.myext), ) SQL: SELECT tab1.*, 'x' AS a, (tab1.m - tab1.n) AS b FROM tab1 ; ; ---- Database constraints ---- Merger: * SQL's CHECK constraint * Some kinds of SQL triggers * Generic constraint function * Takes value as input, results in boolean * Apply per whole DB, or relvar, or per-tuple * Commonly expressed as type constraints * Whole database has a type ---- Database constraints: * Automatically apply at end of each statement * Throws exception if any result in FALSE * Can't be deferred * No "domain check override" * Database always in a valid state * But this isn't a problem * You can update multiple vars simultaneously * Bank account credit/debit example ---- Canonical abstractions: * For SQL's unique, foreign key constraints * Unique key on a relation * Foreign key between db attrs * Muldis D adds distributed unique/foreign keys * Library books and DVDs example * Multi-attribute keys supported ---- Transition constraints: * Apply to vars, not types * Check how a var value may directly change ---- Transactions ---- All "D" languages are ACID compliant ---- Nested transactions: * All "D" languages support * Arbitrary depth * Start a transaction within another one * Let code be atomic, no worry about callers * Only parent-most does actual commit * Easy to rollback just part of overall trans ---- Transactions: * All depots commit or rollback as a unit * Explicitly tied to lexical scopes * And to the exception mechanism * "try" block entry begins a child * "try" success / normal end commits * "try" catches exception, rollback * Easy to get right even with multiple exits * No worry about mis-matched start/end * DBMS must auto-rollback on disconnect ---- Muldis Rosetta ---- Introducing "Muldis Rosetta": * Reference implementation of Muldis D * Is a DBMS * Mostly incomplete * Should be working soon ---- Concepts: * DBMS is a virtual machine * DBMS embedded in your dev environment * DBMS needn't be external program from your app * DBMS is like module in your app * Provides relational ops for munging app vars * State managing solution for relvars/dbvars * Actual embed or external, you code same way * Like local vs remote procedure calls * Write queries as named rtns to invoke later ---- Implemented in Perl: * Lots of fundamental features built in * Can invoke compiler at runtime * Just gen Perl code from MD, let Perl do work * Automatic garbage collection * Text is Unicode, unlimited length * Bigint/bignum built-in * Ad-hoc multi-dimensional data structures * Extensive meta-model * Closures, higher-order functions, map, grep ---- Structure: * A Perl framework like DBI; sep intf, engines * Develop/test on one DBMS, deploy on another * Standalone Example Engine bundled with intf * Develop/test a DBMS app without "real" DBMS * Example just demos semantics, doesn't scale * Most Engines likely bridge to existing DBMSs * Keeps all features, speed of underlying DBMS * Develop/test on one DBMS, deploy on another * Common test suite for all Engines ---- Thank you! ---- Copyright © 2008, Darren Duncan **** http://muldis.com for email, see above ---- This slideshow text is free documentation for software; you can redistribute it and/or modify it under the terms of the GNU General Public License (GPL) as published by the Free Software Foundation (http://www.fsf.org/); either version 3 of the License, or (at your option) any later version. ]]>