Generalization: Tree::Pattern, Tree::Transformer, and Tree::Walker

Earlier this summer, when I was considering what sort of interface would be best for the PAST pattern matching and transformation I've been building, Bacek suggested an XPath-like syntax. XPath is, nominally, a language for pattern matching on XML, but it is applicable to any tree-like data structure where each node may have both attributes and children. It's a general-purpose tree-matching language.

Initially, what I developed was not as general as this: it worked only on PASTs, and only on those PAST subclasses for which I explicitly wrote subclasses of PAST::Pattern for(all of the ones included in Parrot). But HLL compilers will sometimes use create their own PAST::Node subclasses. NQP-rx, for example, uses PAST::Regex nodes to represent regexes.

A few days ago, Bacek asked me about using PAST::Pattern in PIRATE, the PIR compiler written in NQP-rx with PCT he and Cotto have been working on. Specifically, he wanted to optimize away lines like "if 1 < 2 goto some_label", which would mean that the compiler need not support opcodes like "le_ic_ic_ic". It should, depending on whether the condition is true or not, be turned into either a no-op or an unconditional goto. However, this could not be done with PAST::Pattern as it was then. I've spent the time since refactoring and cleaning up PAST::Pattern and generalizing it to Tree::Pattern. In the process, I've made it a lot easier to extend Tree::Pattern to new kinds of trees and new sub-types of known types of trees. For matching PCT::Node subclasses(including PAST::Node and POST::Node), declaring only two simple methods in your PCT::Pattern subclass is necessary: "attributes" and "target_class".

For an example of how this works, let's look at the matching-related code of the PAST::Pattern::Op class:

my @attributes := pir::clone__PP(PAST::Pattern.attributes());
for (<pasttype pirop inline>) {
    pir::push(@attributes, $_);
}
method attributes () { @attributes; }

method target_class () { PAST::Op; }

The "target_class" method should return the node class that a matching node must be. The "attributes" method returns an array of the names of the attributes that should be checked, if provided. Previously, PAST::Op's matching functionality was implemented with this code:

method ACCEPTSEXACTLY ($node) {
    return PAST::Pattern::Match.new(0) unless $node ~~ PAST::Op;
    my $/ := PAST::Pattern::Match.new(1);
    (PAST::Pattern::Node::check_attribute(self, $node,
                                                               "pasttype", $/)
     && PAST::Pattern::Node::check_attribute(self, $node,
                                                                    "pirop", $/)
     && PAST::Pattern::Node::check_attribute(self, $node,
                                                                    "inline", $/)
     && PAST::Pattern::Node::check_children(self, $node, $/)
     && PAST::Pattern::Node::check_node_attributes(self, $node, $/));
    $/.from($node) if $/;
    $/;
} 

Each PAST::Node subclass needed another class with another method very similar to this one.

Now, it's a simple matter of:

class POST::Pattern::Value is POST::Pattern {
    my @attributes := pir::clone__PP(POST::Pattern.attributes());
    for (<name type flags declared>) {
        pir::push(@attributes, $_);
    }
    method attributes () { @attributes; }

    method target_class () { POST::Value; }
}

PAST::Pattern::Any, PAST::Pattern::Closure, and PAST::Pattern::Constant are not Tree::Pattern::Any, Tree::Pattern::Closure, and Tree::Pattern::Constant, respectively.

PAST::Walker and PAST::Transformer have similarly been generalized to Tree::Walker and Tree::Transformer. They should do the right thing for any tree type derived from Capture(including any PCT::Node type). Due to lack of support for inheriting only some specializations of a multi-method, the "walk" and "walkChildren" functions are still subroutines, although they are now located in the Tree::Walker namespace.

Now, I'm working on optimizations for PIRATE and for NQP-rx. I hope to also implement some optimizations for some other HLLs, but PIRATE(because Bacek asked me to) and NQP(because it is both convenient to use and fairly simply maps to PAST, making it easier to write simple optimizations for than some other HLLs) will be my main focuses in the next few weeks. I'll also be working on improving and expanding the documentation for Tree::Walker/Transformer/Pattern and their subclasses.

If any HLL devs (or anyone, really) are interested in trying out PAST::Pattern(or Tree::Pattern) for use in their own projects, please contact me. I'll be glad to help with any bugs you discover, any inconveniences in the API, or any other problems that might come up in doing so, as well as answering any questions you might have.