=head1 NAME Perl6::Overview::Rule - Grammar Rules =head1 DESCRIPTION =head2 Match object $/ -- can be accessed as an array of sub-matches, e.g. $/[0], $/[1] ... -- can be accessed as a hash of named subrules, e.g. $/ $0, $1, $2, ... -- aliases for $/[0], $/[1], $/[2], ... Context Behaviour ------------------------------------------------------------------------ String Stringifies to entire match Number The numeric value of the matched string (i.e. given "foo123bar"~~/\d+/, then $/ in numeric context will be 123) Boolean True if success, false otherwise =head2 Backtracking : Prevents backtracking over previous atom :: Fails entire group if previous atom is backtracked over ::: Fails entire rule if previous atom is backtracked over Also and (see below). =head2 Modifiers Used as adverbial modifiers: "foO bar" ~~ m:s:ignorecase:g/[o | r] ./ and can appear within rules sub rule { [:i ] } # only ignore 's case :b :basechar Match base char ignoring accents and such :bytes Match individual bytes :c, :continue Start scanning from string's current .pos :codes Match individual codepoints :ex, :exhaustive Match every possible way (including overlapping) :g, :global Find all non-overlapping matches :graphs Match individual graphemes :i, :ignorecase Ignore letter case :keepall Force rule and all subrules to remember everything :chars Match maximally abstract characters allowed by pragma :nth(N) Find Nth occurrence. N is an integer; you can also use 1st, 2nd, 3rd, 4th, as well as 1th, 2th. (junctions allowed, e.g. :nth(1|2|3|5).) :once Only match first time :p, :pos Only try to match at string's current .pos :perl5 Use Perl 5 syntax for regex :ov, :overlap Match at all possible character positions, including overlapping; return all matches in list context, disjunction in item context :rw Claim string for modification, instead of copy-on-write :s, :sigspace Replaces every sequence of literal whitespace in pattern with \s+ or \s* according to rule :x(N) Repetition -- find N times (N is an integer) [:ex vs :ov could use clarification. see http://www.nntp.perl.org/group/perl.perl6.language/20985 ] [re :s, except where there is already an explicit space rule] =head2 Built-in assertions '...' Matches ... as a literal string "..." Matches ... as a literal string (after interpolation) Matches a literal space Matches any sequence of whitespace, like the :s modifier Matches any word boundary (Perl 5's \b semantics) Matches a literal . (same as '.') Matches a literal < (same as '<') Matches a literal > (same as '>') Match whatever the most recent successful match did Matches only after pattern (zero-width) Matches only before pattern (zero-width) Fails the entire match if backtracked over Fails the entire match if backtracked over, and removes the portion of the string matched until then Fails the entire match if reached Matches null string Matches an "identifier", same as ([ [ | _] \w* | " [|_] \w* " ]) Matches the same pattern as the current rule (useful for recursion) Zero width assertion that XXX *doesn't* match at this location Alphanumeric character Alphabetic character ASCII character Horizontal whitespace ([ \t]) Control character Numeric character An alphanumeric character or punctuation Lowercase character Printable character -- alphanumeric, punctuation or whitespace Whitespace character ([\s\ck]) Uppercase character Word character (alphanumeric + _) Hexadecimal digit Named rules are stored in the match object unless the rule name is prefixed with a ? / / would store values in $/{'quantity'} and $/{'price'} but not $/{'item'} =head2 Character classes <[abcd]> Matches one of the characters a,b,c, or d. Ranges may be used as <[a..z]>. Can be combined with + and - like so: <[a..z]-[m..p]> (which is the same as <[a..l]+[q..z]>) <-XXX> Matches XXX as a negated character class. For instance, <-alpha> would match one non-alpha character. May also be combined as above. (<-alpha+[qrst]> will match any non-alpha character and the characters q,r,s, and t) <+XXX> Matches XXX as a character class. <+alpha> matches one alpha character. [This does not seem to match S05. Which one is wrong?] =head2 Hypothetical variables Assign value to variable only if entire pattern succeeds. Syntax: let $foo = value my $x; / (\S*) { let $x = .pos } \s* foo / =head2 Aliasing Syntax: let $foo = $1 Shorthand: $foo := $1 my $x; / $x := (\S*) \s* foo / Can use arrays: / @x := [ (\S+) \s* ]* / # returns anonymous list in item context for *, +, **{n,m}: / $x := [ (\S+) \s* ]* / # ? does not pluralize -- result or undef Hashes: / %x := [ (\S+)\: \s* (.*) ]* / # key/value pairs # $0 = list of keys # $1 = list of values / %x := [ (\S+) \s* ]* / # capture only keys, values = undef Can capture return values of closures: / $x := { "Item context" } / / @x := { "List", "context" } / / %x := { "Hash" => "context" } / # note: no parens around closure Reorder paren groups: / $1 := (.*?), \h* $0 := (.*) / # renumbering occurs / $2 := (.*?), \h* (.*) / # $3 = (.*) Relative to current location: $-1, $-2, $-3... Named subrules: / \: { let $hash{$} = $ } /