This file contains assorted design notes from #perl6 or docs - fglock

from PGE docs:
Available syntactic categories include "infix:", "prefix:", "postfix:", "term:", "circumfix:", and "postcircumfix:"
--> and: statement_control ?

random notes...

audreyt: what is the relationship between AST and Match? (I'm compiling the match captures)
fglock: no relationship whatsoever :)
fglock: the Match object may carry a captured object in $/<> aka $()
and if you are writing a Perl 6 parser, then that capture object may be an AST object
you can set the capture object by
rule { $() := ... }
or
rule { ... { return $capture_object } }
or
rule { ... { $() := ... } }
if the capture object is not set explicitly then it's set to the entire match as matched by the rule.

so in q:code:{ say $x; {{{$a}}} } the $x is literal but the $a is unquoted (interpolated)? therefore the {{{ }}}'s?

+$/ and ~$/ resolve to +$() and ~$() respectively.

----
re: '&'

22:16 < putter> [&.*?^^From: (\N+).*] $:=(.*)
22:20 < putter> PerlJam: what's an increased specificity example? both of mine were assert-spec&unpack pairs.
22:22 < fglock> putter: .* would make it fail, wouldn't it?
22:23 < putter> you mean the .* after the (\N+) ?
22:23 < fglock> yes
22:23 < PerlJam> putter: yes, mine are similar, just that the "unpack" part isn't necessary: [&^^<'From:'>] && do_something_only_with_from_lines;
22:23 < putter> I don't believe so. I would expect the & to force the * to backtrack.
22:24 < PerlJam> putter: btw, beware the cut-colon! :-) (unless I'm mistaken that you meant for the : to be matched)

-----
re: statement_control

22:50 < putter> one uses multi statement_control: (...){...} multi statement_control: (...){...} etc to fill it in.
22:51 < fglock> putter: you mean, statement_control is represented by an array? (or namespace thing)
22:52 -!- pdcawley [n=pdcawley@adsl-217.146.110.1.merula.net] has joined #perl6
22:52 < putter> fglock: yeah, but there are some issues...
like how do (pause)
22:52 -!- clkao [n=clkao@220-132-58-30.HINET-IP.hinet.net] has quit [Read error: 104 (Connection reset by peer)]
22:52 < PerlJam> must not have been enough svk talk here ;-)
22:52 < fglock> putter: please note that it is an array of rules - it is very flexible
22:53 < putter> statement_control is a grammatical category. it defines one of the subrules.
22:53 < putter> (my yeah, was directed at <@subrule>, not represented by an array ;)
22:55 -!- r0n|mess [n=r0nny@p54B893E3.dip0.t-ipconnect.de] has quit [Connection timed out]
22:57 < putter> fglock: yes but. when writing a grammar, in a first-match-wins engine (like | is), you carefully craft the order of the subrule list. when subrules get added by statement_control defs, someone other than the human has to do the crafting. either the statement_control infrastructure assures the @array has a nice order, or you can't use <@array>.
22:58 < fglock> putter: you can opt to use longest-match instead of ordered-match
22:58 < putter> yes
22:58 < fglock> that would be <%statement_control>
22:59 < putter> and the real parser can play games like trying to massage the @array into a trie, so it doesn't have to repeatedly reparse the same stuff the same way.
22:59 < fglock> putter: it is cached
23:02 < putter> re statement_control, http://dev.perl.org/perl6/doc/design/syn/S04.html has a little bit in Statement parsing.

----
re: Smart::Comments
re: defining 'if' with macros

23:08 < putter> re hash, "An interpolated hash matches the longest possible key of the hash as a literal, or fails if no key matches.", which doesn't help you distinguish between /if [else ]?/ and /if [else ]? [wrap_around_both_branches ]?/. comparing "if" and "if" isn't going to help.
23:09 < pmurias> Does anyone think that using Smart::Comments in iterator_engine.pl would be a good idea?
23:09 < fglock> pmurias: I like Smart::Comments, but I'd like to keep it simple (that is, no unnecessary dependencies)
23:09 < putter> (there shouldn't have been a ? on the clause)
23:10 < pmurias> It would be a dependency only for debugging :)
23:10 < pmurias> And casual users don't do that often :)
23:11 < pmurias> at least they shouldn't have to :)
23:11 < fglock> pmurias: I'll check that (you mean, disable 'use Smart::Comments' when not in use?)
23:11 < pmurias> the use line should be commented out by default
23:12 < fglock> putter: re /if.../ - I don't understand the mumble part, what would it be?
23:12 < pmurias> and if the debugging messages are needed you just delete the #
23:12 < putter> re if macro, well, you need to add a regex to so you can parse it. and we can currently hang regexes off of, well, rule, and macros macro statement_control: (...) is parsed(/heres the regex/) {...}
23:12 < pmurias> i'll commit it tomorrow if you don't mind
23:12 < fglock> pmurias: sounds good - I'll check the pod again
23:36 < fglock> it will look like: %statement_control: = rule { ... }

-------
re: bootstrapping vs. correctness

personally, I'd aim for the bootstrap, and if we have to write "funny" code to use it, that's okayish.
putter: :)
TimToady: ok - this helps!
fglock: what are some possible milestones one could imagine reaching for?
for example: I could implement 'if' very easily in perl5, but implementing with a Perl 6 macro is more correct-ish
putter: the compiler is not OO, so we don't need that in the bootstrap
so implementing closures, hash, array should be ok
operator precedence isn't important right now
* putter's next question will be "what is the _simplest_ approach you can imagine?"
"_simplest_" - keep writing nodes in perl5; "right" - start writing Perl 6 from this point however, nodes can be migrated to Perl 6 later, now that we know how to do it * putter is a great fan of simple maybe next step: rewrite iterator_engine.pl in simple-Perl6, and struggle to compile it to Perl5 hmm, another thing one might do is step back, and try to come up with a list of _other_ things you might be working on/towards. because it's not only is this useful in itself, but is it the most useful thing you can think to do at the moment ----------- re: pads is there some CPAN module that implements a hash that behaves something like a pad? we could use it for implementing lexical subs for example Yes, especially if scoped lexically via %^H modifying pragms. pmurias: each node is actually a state machine what ever happened to MJD's pragma patch? s/each node/each node in a rule/ *** Daveman left #perl6 ["Leaving"] gaal: dunno--I only ever heard about it second-hand--never looked at it. fglock:I sort of understand now :) it's really elegant. (in taking advantage of the parser<->evaluator chumminess of p5) .win 3 oops * gaal wanders off to entertain a guest hf fglock: http://search.cpan.org/~mneylon/Tie-Hash-Stack-0.09/Stack.pm ? ------ re: modifiers fglock: rule xxx { :perl5 .* } fglock: in yours, .* would be a parameter to the :perl5 attribute. *** FurnaceBoy [n=qu1j0t3@67.68.33.193] has quit [brown.freenode.net irc.freenode.net] *** chris2 [n=chris@p549D1C58.dip0.t-ipconnect.de] has quit [brown.freenode.net irc.freenode.net] *** meppl [i=mephisto@meppl.net] has quit [brown.freenode.net irc.freenode.net] :perl5 , like :i :mumble, is a modifier. block scoped [:i(1) This CoDe Is CaSe insensiteive ] ---- re: building AST re OO AST nodes - does it make any sense to use hashes, and autobox when necessary? 
(just because building objects during the parsing may be expensive - backtracking throws many nodes away)
*** Nouk [n=Nouk@219-87-211-192.static.tfn.net.tw] has joined #perl6
*** Nouk [n=Nouk@219-87-211-192.static.tfn.net.tw] has quit [Client Quit]
fglock: for perf I'll even just use arrays with constant lookup keys
fglock: but I'd suggest to _not_ care about performance
just optimize for clarity/efficiency for coding
it may be nice to adopt the haskellish
Val(VInt(3))
i.e. instead of calling ->new, use functions as constructors
audreyt: but are resulting nodes objects or functions?
audreyt: you mean like: rule xxx {... { return Val(VInt(3)) } }
audreyt: I'm ok with that - Val and VInt are "node constructor" functions, which we can modify later
rule xxx {... { return Val(VInt( $1 )) } }
yes.
this is because in p6
VInt(3)
is basically
3 as VInt
and you can install multis in the VInt class to anticipate the infix "as" call
*** drbean [n=greg@momotaro.chinmin.edu.tw] has joined #perl6
audreyt: and if Val returns a function, we can delay "building" the AST by not evaluating it, right?
yes but I'm not sure how much you gain in p5land from that
audreyt: I don't know either - just thinking aloud :)
without core support for thunks, and with a typical compiler run visiting all nodes
delaying ast buildup doesn't gain you much.
audreyt: you save time by not building things that would be destroyed by backtracking - I think that's all
yup, so it's CV creation vs HVMG creation
well, with the encapsulation offered by function constructors
we can switch style at all times
but my gut feeling is that plain HVMG is good enough
audreyt: VInt(3) - you don't need to specify the argument name? (sorry, I don't really know p6)
HVMG?
no - it's a coercion
blessed hashref
some help in p5land for this kind of thing:
http://search.cpan.org/dist/Symbol-Opaque/
http://search.cpan.org/dist/Data-Variant/

-------------
re: Parser optimization

fglock: the P6 grammar is designed to not require much backtracking at all (unless you want to go back and give better diags), so I wouldn't sweat it.
TimToady: ok - it's mostly because I haven't implemented ':' yet :)
*** avar [n=Huld@dsl-228-236.hive.is] has quit [Connection timed out]
fglock: in fact, if there's *any* place in the grammar that requires backtracking for a correct parse, I'd like to know about it...
TimToady: I'll maybe add a switch to disable backtracking or give a warning - so we'll know :)
*** avar [n=Huld@dsl-228-236.hive.is] has joined #perl6
cool
we even try to avoid alternation in favor of <%tokens>.
and <%tokens> is potentially the merge of several syntactic categories, so we don't have to say
is there a target grammar category? (in the parsing sense, not the p6 sense;)
<%statements_control> | <%unaries> | <%terms>
TimToady: is there an existing Grammar file you would recommend as a starting point?
putter: don't understand what you're asking...
* putter wonders what %unaries is
LALR(1), etc
s/unaries/prefix/ + circumfix left side
ah, ok
Actually, circumfix could be considered a subset of terms.
fglock: not that I'm aware of, unless Patrick has one. The approach we've been taking...
*** marmic [n=chatzill@c-93b6e055.1258-1-64736c22.cust.bredbandsbolaget.se] has joined #perl6
is to have a 3-layer parser. (well, where the 3rd layer is really fractal...)
the top layer is top-down, down to expression level.
the middle layer is a bottom-up operator precedence parser to avoid excessive recursion through 22 or so prec levels.
each term can then be its own little top-down parser inside the token grabber for the op prec parser.
I also worry about efficiency... :)
TimToady: :)
the token grabber selects which term to use how?
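(editor's sketch) The <%tokens> idea above relies on the rule quoted earlier in these notes: an interpolated hash matches its longest key as a literal, or fails if no key matches. A minimal Perl 5 sketch of that lookup, assuming a hypothetical helper match_longest_key and made-up category tables (neither is from PGE or Pugs), might look like this:

```perl
use strict;
use warnings;

# Hypothetical helper (not from PGE or Pugs): find the longest key of
# %$table that appears literally at position $pos of $text.
# Returns (key, value) on success, or an empty list if no key matches.
sub match_longest_key {
    my ($table, $text, $pos) = @_;
    return if $pos > length $text;
    # Try longer keys first so that '-' does not shadow '--'.
    for my $key (sort { length($b) <=> length($a) || $a cmp $b } keys %$table) {
        if (substr($text, $pos, length $key) eq $key) {
            return ($key, $table->{$key});
        }
    }
    return;
}

# Merging several syntactic categories into one token table (illustrative data).
my %statement_control = ('if' => 'statement_control', 'unless' => 'statement_control');
my %prefix            = ('-' => 'prefix', '--' => 'prefix', '!' => 'prefix');
my %tokens            = (%statement_control, %prefix);

my ($key, $category) = match_longest_key(\%tokens, '--$x', 0);
print "$key is a $category\n";    # the two-character key wins over '-'
```

A linear scan over sorted keys is the simplest thing that works; the trie mentioned earlier would avoid rescanning the same prefix, at the cost of more code.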
One of the additional reasons for installing the op-prec parser in the middle is to allow for added prec levels.
but we don't necessarily need that for a bootstrap.
Unless we actually add all of Perl 6's opcodes that way.
TimToady: the idea (so far) is to add all prims using sub or macro declarations
<%mush_of_all_syntactic_categories_where_term_is_expected>
which can have prec declarations, so we could add most prec levels that way, if the underlying prec architecture can support it.
terms can have prec decls?
terms by definition have a "squinting" prec that is infinitely tight looking out and infinitely loose looking in.
That's why circumfixes are so popular to represent visually the "surreal" precedence change.
the args to macros should always be ASTs, right?
*** justatheory [n=justathe@70.103.133.226] has joined #perl6
What is the default precedence when you create a sub? (tighter than anything else?)
*** vodka-goo [n=dbaelde@129.104.11.1] has quit ["Leaving"]
bsb: depends on what the "is parsed" rule returns, and the signature declares, and whether they match.
by default subs are list operators.
except ($) is still unary op, last I checked. but that might be worth breaking...
TimToady: '$'?
single scalar argument in P5-proto-speak.
I had an idea about using unary macros to do haskell's comma-less, tight, currying function calls...
could be made to work.
is parsed gives the macro's grammar category, right?
I believe at one time we made infix: default to the same prec as infix:<+>, but lately I think we just require "is equiv" or some such.
bsb: macro infix: ... - the grammar category comes before
eh? the grammatical category is part of the macro's name, or I'm misunderstanding your question.
inside a circumfix, are the trailing tokens of surrounding circumfixi still tokens?
I might just slink away from that question and get to the thing that's confused me...
The inner rule has to be parameterized to know what its terminators are, if they would be confused with inner tokens.
After talking to audreyt, I thought macro's arguments were always ASTs, but the macros in Prelude seem to expect normal values
but usually any unbalanced bracket could be assumed to terminate.
ASTs are certainly encouraged as the default, much as opaque objects are now the default in P6. But we want to leave the hooks there to do something else.
That means the Prelude should probably revert to subs instead of macros for now
terminator params might actually be one of the places we choose to backtrack rather than merging categories:
<%terminators> | <%expect_term>
since the merge would have to be redone on every subrule call.
TimToady: you mean rules inside 'is parsed' may need to take the terminator as a parameter in order to know where to finish matching?
putter: ping
circumfix:<+ +>
+(3+2)+ ?
bsb: pong
fglock: yep, if the rule is sufficiently antisocial to want conflicting things.
though I see it's actually
gaal said you know about the macros in Prelude, can they be reverted to subs?
<%terminators> | <%expect_operator>
from the +...+ example.
TimToady: the problem is there can be any rule inside the 'is parsed' - and then all rules would need to check if this parameter is set
I mean, the macro parameter can be any rule
until we have macro params that can be ASTs or normal values
yes, perhaps it's really something environmentally magical that all <%foo> just automatically track.
TimToady: ok
*** p2502 [n=p2501@84.12.179.136] has quit [Read error: 104 (Connection reset by peer)]
Anyway, for well behaved subrules, we can just assume trailing brackets for now, as long as we keep the generalization in mind.
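(editor's sketch) The middle layer discussed above, a bottom-up operator-precedence parser whose terms are their own little top-down parsers, can be illustrated in Perl 5 roughly as follows. This is only a sketch under assumptions: the precedence table, the token set, and the sub names (tokenize, parse_term, parse_expr, to_sexpr) are invented for illustration and are not taken from Pugs or PGE. It also shows the "squinting" precedence of a circumfix term (parsing restarts at the loosest level inside the brackets) and the "unbalanced bracket terminates" assumption.

```perl
use strict;
use warnings;

# Hypothetical precedence table; illustrative only, not Perl 6's real levels.
my %infix = (
    '+'  => { prec => 1, assoc => 'left'  },
    '-'  => { prec => 1, assoc => 'left'  },
    '*'  => { prec => 2, assoc => 'left'  },
    '/'  => { prec => 2, assoc => 'left'  },
    '**' => { prec => 3, assoc => 'right' },
);

# Token grabber: integers, parens, and infix operators ('**' before '*').
# Characters it does not recognize are silently skipped in this sketch.
sub tokenize {
    my ($src) = @_;
    my @tokens = $src =~ /\s*(\d+|\*\*|[-+*\/()])/g;
    return \@tokens;
}

# A term is its own little top-down parser. A circumfix "( ... )" is
# infinitely tight looking out and infinitely loose looking in: inside the
# brackets, parsing restarts at the loosest precedence (0).
sub parse_term {
    my ($tokens) = @_;
    my $tok = shift @$tokens;
    die "unexpected end of input\n" unless defined $tok;
    if ($tok eq '(') {
        my $inner = parse_expr($tokens, 0);
        my $close = shift @$tokens;
        die "expected ')'\n" unless defined $close && $close eq ')';
        return $inner;
    }
    die "unexpected token '$tok'\n" unless $tok =~ /^\d+$/;
    return $tok;
}

# Operator-precedence loop (precedence climbing): one loop covers every
# level instead of one recursive rule per level.
sub parse_expr {
    my ($tokens, $min_prec) = @_;
    my $left = parse_term($tokens);
    while (@$tokens) {
        my $op = $infix{ $tokens->[0] };
        # Anything that is not an infix operator, such as an unbalanced ')',
        # terminates the expression.
        last unless $op && $op->{prec} >= $min_prec;
        my $name  = shift @$tokens;
        my $next  = $op->{assoc} eq 'left' ? $op->{prec} + 1 : $op->{prec};
        my $right = parse_expr($tokens, $next);
        $left = [ $name, $left, $right ];
    }
    return $left;
}

# Render the nested-array AST as an s-expression for inspection.
sub to_sexpr {
    my ($node) = @_;
    return $node unless ref $node;
    my ($op, @args) = @$node;
    return "($op " . join(' ', map { to_sexpr($_) } @args) . ")";
}

print to_sexpr(parse_expr(tokenize('1 + 2 * (3 - 4) ** 2'), 0)), "\n";
```

Adding a precedence level in this style means adding a row to the table rather than a new grammar rule, which is the flexibility the op-prec layer is meant to provide.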
TimToady: I don't see '^' in the example Grammars - is it automatic (to match on the start of the string)?
Maybe we can semi-generalize that to "stop on syntax error", but that bugs me a little, since there are real syntax errors as well as fake ones...
subrules are always anchored on the left to where their parent rule called them. But some of them may also be marked :p to start at the current pos.
but it's possible this will turn into a token vs rule distinction on the declaration itself.
TimToady: is 'token' a new keyword?
conjectured.
* putter fails to parse "subrules are always anchored on the left to where their parent rule called them. BUT some of them may also be marked :p to start at the current pos." not equivalent sentences?
Interestingly, I see a def for "token" in S06 but nothing discussing it...
putter: :p is optional (that's not a tongue)
not equivalent when the subrule is called directly rather than as a subrule.
if we had a "token" keyword, it might autoanchor on both ends when called directly on a string rather than as a subrule, unlike :p which is currently specced as just the front anchor. But it still needs some thought...
is there any conjectured nomenclature for token,definition-of vs token,instance-of? (ie, class names:)
well, if a rule is a Rule, then a token would be a Token, I expect.
I've got to head off to $job now.
thanks for the puzzle answers :)
I don't know if they're answers, but maybe they're at least better questions.

---------
re: CPAN

or, hell, there's no Pugs::* anything yet
so let's just use that
Pugs::Grammar, Pugs::Optimizer
audreyt: ++
audreyt: And maybe, Pugs::Prelude?
aye
audreyt: Which would be nice on CPAN, I think :)
yeah.
audreyt: how about the plug-in thing?
fglock: follow the Module::Pluggable tradition
Pugs::Optimizer::Plugins::ThisPluginName
what about freepan?
isn't that singular Plugin?
right.
so, what to do with lib/pugs/run.pod ?
and hack.pod
oh, yeah, before I turn into a pumpkin, potentially netless for a few days, that's my main reaction to Deal,
so, no english volunteers?
can we merge both back to Pugs.pod (for now)?
audreyt: Pugs::Doc(s)
Pugs::Doc::Run
Perl6 alpha will be mostly written in p6. pugs+pil2js+pX+whatever. That p6 mostly doesn't exist yet. If it existed, it would actually not be tremendously difficult to do new backends, even from scratch.
putter: not only backends, but compilers, optimizers
Now, when doing a backend, you have to simultaneously write that p6, which also means figuring out/disambiguating/making up a p6 spec. Which all take far more time than actually doing the backend. ;)
you're all mean
you don't love me at all
nothingmuch: we just don't think we speak English
* nothingmuch bops kolibrie
* kolibrie ducks
* nothingmuch kicks kolibrie while he's on the floor
nothingmuch: url?
nothingmuch: you mean - about the CV?
fglock: yes
fglock: right, not just backend. parser, compiler, emitter, runtime, object space. the whole thing.
* kolibrie catches url offline
r9366 | audreyt++ | * rename pugs::run and pugs::hack to Pugs::Doc::Run
r9366 | audreyt++ | and Pugs::Doc::Hack, for the Great Pugs Hierarchy.
fglock: okay. so move pX things over -- for non-perl6-pugs-specific things, like the iterator engine
nothingmuch: I can help you with portuguese
audreyt: which /dir?
you can still call it Pugs::Grammar and put into lib/Pugs/Grammar.pm
or if you want to distinguish components more
fglock: my sister might like help with her portuguese
lib/Pugs/Runtime/Grammar.pm
or if you are not unhappy with the Perl6 acmeness
then lib/Perl6/Grammar.pm
The drive spec schrodinger collapse with pugs and pil2js and pilrun has been great. but i suggest at this point, it's no longer the optimal path. if we could get some folks simply nailing down the spec, that would free up implementation efforts to simply implement.
audreyt: ok!
oops.
nailing down spec == reading spec, making up all the questions you can think of, writing them down, getting them answered or approximated, and writing the answers. (aka, ghost-writing)
Entirely coincidental, but I'm outta here. (To see the dentist, again.)
rofl
TimToady: see ya :)
later, let's all have the appropriate amount of fun :/ &

-----------------
re: perl5 interop

{ use Perl-6.0; ... write some perl code ... }
is the API I'd like
audreyt: how about eval( '...', {lang=> 'Pugs' } )
(though it exists, and EngineReplace wouldn't exist without it, which forgives many sins)
it's even easy
just have a Perl.pm with import that detects "-6.0"
or rather -6
yes, it is:) # easy
* putter loves %^H. which he knew about it years ago.
s/which/wishes/
it's extremely gonzo though, having Perl.pm -- it's like the ultimate violation of CPAN namespace guidelines :)
lol
audreyt: isn't it 'use v6' ? --> 'v6.pm'
yes
no, that gets caught in the lexer
S11 has the use Perl-6.0 form
really...
oh, right
"use v6" is short for "use Perl-6" according to S11.
(i think the question was in perl5 space)
(I think we should have a p6 file that's both valid p5 and p6)
makes it easier to switch to pure-perl6 boot
http://www.vendian.org/mncharity/dir3/multilang/file/
so, in order to not upset the CPAN police
agreed
use Pugs::Perl-6.0; # maybe something like this.
audreyt: rafl and I had a half idea about maybe trying to organize an entire Perl 6 track on froscon.
use Perl6 6-0; ?
err, 6.0
audreyt: As we all know that there's lots and lots to tell about Perl 6, pugs, parrot, etcetera.
Juerd: nod
putter: use Perl6.0;
I think that works.
or, if we don't care about the CPAN police, we can go with "use Perl-6.0" -- which already exists on CPAN in version 0.03 ( http://search.cpan.org/~bmorrow/PerlInterp-0.03/Perl.pm )-- and will be usurped ;)
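
(editor's sketch) The "Perl.pm with import that detects -6" idea above could be prototyped along these lines. Everything here is an assumption made for illustration: the package name Perl6Shim, the $language_level variable, and the decision to merely record the request. A real module would have to switch parsers or install a source filter at that point.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Perl.pm idea; not an existing CPAN module.
package Perl6Shim;

our $language_level = 5;    # default: plain Perl 5

sub import {
    my ($class, @args) = @_;
    for my $arg (@args) {
        # Per the discussion, "use Perl-6.0" would show up here as the
        # number -6 (unary minus applied to 6.0).
        if (defined $arg && $arg =~ /^-(\d+(?:\.\d+)?)$/ && $1 >= 6) {
            $language_level = $1;
            # This sketch only records the request; hooking in a Perl 6
            # parser is deliberately left out.
        }
    }
    return;
}

package main;

# Calling import directly keeps the sketch in one file.
Perl6Shim->import(-6.0);
print "language level: $Perl6Shim::language_level\n";
```

Calling import by hand sidesteps the question of how a given perl tokenizes "use Perl-6.0"; that part should be checked against the actual lexer before relying on it.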