Cleaning up and speeding up optimizations with Tree::Optimizer

After a false start earlier this week, I've begun implementing something like LLVM's PassManager. It can be found in the pass-manager branch of my project's git repository.

Today, I'm going to talk about adding optimizations to a compiler built with my GSoC project, first without Tree::Optimizer and then with it. Note that most of the Tree::Optimizer features I describe here are not yet implemented.

Let's pretend that you're a compiler writer who has been following my project and has implemented some optimizations for your compiler using it already.

Your optimizations are in a Not Quite Perl 6 file called Optimizer.pm that looks something like this:

INIT {
    pir::load_bytecode('PAST/Pattern.pbc');
}

module Lang::Compiler {

    method tail-call-elimination ($past) {
        my $pattern := ...;
        my &transform := sub ($/) { ... };
        $pattern.transform($past, &transform);
    }

    method temp-variable-elimination ($past) {
        my $pattern := ...;
        my &transform := sub ($/) { ... };
        $pattern.transform($past, &transform);
    }
}

Here, tail-call-elimination turns "return someFunctionCall()" into a tail call using Parrot's tail-call support, and temp-variable-elimination eliminates unneeded stores to temporary lexical variables in order to simplify later optimizations. Instead of having to look for situations like "aVar = functionCall(); return aVar;", you can assume that most tail calls will simply look like "return functionCall();". For this reason, you want to run temp-variable-elimination first.
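To make that concrete, here is the source-level shape each pass targets, written in your language's surface syntax (the names and sigils are purely illustrative; the actual PAST-level patterns are the elided parts above):

    # Before temp-variable-elimination: the call's result is bound to a
    # temporary lexical and then returned.
    my $aVar := someFunctionCall();
    return $aVar;

    # After temp-variable-elimination, the temporary store is gone:
    return someFunctionCall();

    # tail-call-elimination now only has to recognize this simple
    # "return a call" shape and emit it using Parrot's tail-call support,
    # which is why it should run after temp-variable-elimination.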

In your main compiler file, you add the stages like so:

Lang::Compiler.addstage('temp-variable-elimination', 'after' => 'past');
Lang::Compiler.addstage('tail-call-elimination', 'after' => 'temp-variable-elimination');

I have left out the details of the &transform subs and the patterns. This setup isn't too inconvenient, but you might feel a little discomfort about having to modify Compiler.pm and add a method to your compiler class for every optimization. You decide to factor the optimizations into their own module.

You change Optimizer.pm to this:

INIT {
    pir::load_bytecode('PAST/Pattern.pbc');
}

module Lang::Compiler {
    method optimize-past ($past) {
        my $result := Lang::Optimizer::temp-variable-elimination($past);
        $result := Lang::Optimizer::tail-call-elimination($result);
        $result;
    }
}

module Lang::Optimizer {
    sub tail-call-elimination ($past) {
        my $pattern := ...;
        my &transform := sub ($/) { ... };
        $pattern.transform($past, &transform);
    }

    sub temp-variable-elimination ($past) {
        my $pattern := ...;
        my &transform := sub ($/) { ... };
        $pattern.transform($past, &transform);
    }
}

And you replace the code that adds the stages in Compiler.pm with a single Lang::Compiler.addstage('optimize-past', 'after' => 'past').

That's a little nicer. Other than adding the single optimize-past stage in Compiler.pm, all of your optimization-related code is in Optimizer.pm. In addition, the optimizations live in their own namespace instead of being methods on the compiler. As time goes on, you write inline-immediate-blocks and eliminate-local-control-exceptions optimizations. The first inlines immediate blocks (for example, the block of an if statement); the second eliminates the control exceptions thrown for return statements when they would be caught immediately within the same PAST::Block. You want eliminate-local-control-exceptions to run after inline-immediate-blocks so that return statements within the inlined blocks get optimized too. You rewrite the optimize-past method like this:

    method optimize-past ($past) {
        my $result := Lang::Optimizer::temp-variable-elimination($past);
        $result := Lang::Optimizer::tail-call-elimination($result);
        $result := Lang::Optimizer::inline-immediate-blocks($result);
        $result := Lang::Optimizer::eliminate-local-control-exceptions($result);
        $result;
    }

This works, but it's rather repetitive, and if you accidentally type $past where you mean $result (as I initially did when writing this post), you'll end up with only some of your optimizations being applied at best, or with invalid or incorrect generated code at worst; PAST::Pattern.transform makes no guarantees about the state of the original tree after transforming it. This style also makes it hard to tell which optimizations actually depend on each other. You don't actually care whether temp-variable-elimination runs before or after inline-immediate-blocks or eliminate-local-control-exceptions, as long as it runs before tail-call-elimination.
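Concretely, the slip looks like this (a made-up example of the kind of mistake the manual threading invites):

    # Oops: this pass now runs on the unoptimized tree, and the work done by
    # the first two passes is silently thrown away.
    $result := Lang::Optimizer::inline-immediate-blocks($past);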

Because of these concerns, you decide to use Tree::Optimizer.

You add pir::load_bytecode('Tree/Optimizer.pbc') to the INIT block at the top of Optimizer.pm and modify optimize-past like this:

    method optimize-past ($past) {
        my $opt := Tree::Optimizer.new;
        $opt.register(Lang::Optimizer::temp-variable-elimination,
                             :name<tempvarelim>);
        $opt.register(Lang::Optimizer::tail-call-elimination,
                             :depends-on<tempvarelim>);
        $opt.register(Lang::Optimizer::inline-immediate-blocks,
                             :name<inlineimmediate>);
        $opt.register(Lang::Optimizer::eliminate-local-control-exceptions,
                             :depends-on<inlineimmediate>);
        $opt.run($past);
    }

You create a Tree::Optimizer, register each optimization, specifying that tail-call-elimination depends on temp-variable-elimination and that eliminate-local-control-exceptions depends on inline-immediate-blocks, and finally run the optimizations on $past with the .run method. Now the dependency graph of your optimizations is clearly specified, and you needn't worry about accidentally typing $past instead of $result.
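Tree::Optimizer's scheduling isn't written yet, but conceptually .run has to do the same result-threading that optimize-past did by hand, in an order consistent with the :depends-on edges. A minimal sketch of that idea, with a hypothetical sorted-passes helper standing in for the real topological sort:

    # Illustrative only -- not the actual Tree::Optimizer implementation.
    method run ($tree) {
        my $result := $tree;
        for self.sorted-passes() -> $pass {
            # Each pass receives the previous pass's result, exactly the
            # threading optimize-past used to do manually.
            $result := $pass($result);
        }
        $result;
    }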

Unfortunately, you notice that all these optimizations really slow your compiler down on large programs. After all, you're traversing the whole PAST four times! Fortunately, Tree::Optimizer has a solution: it can run multiple Tree::Pattern.transform-based optimizations in a single traversal. But first, you need to refactor the optimizations so that $opt knows it's dealing with Tree::Pattern.transform-based passes.

    method optimize-past ($past) {
        my $opt := Tree::Optimizer.new;
        $opt.register(Lang::Optimizer::temp-variable-elimination-transform,
                             :when(Lang::Optimizer::temp-variable-elimination-pattern),
                             :recursive(1),
                             :name<tempvarelim>);
        $opt.register(Lang::Optimizer::tail-call-elimination-transform,
                             :when(Lang::Optimizer::tail-call-elimination-pattern),
                             :recursive(1),
                             :depends-on<tempvarelim>);
        $opt.register(Lang::Optimizer::inline-immediate-blocks-transform,
                             :when(Lang::Optimizer::inline-immediate-block-pattern),
                             :recursive(1),
                             :name<inlineimmediate>);
        $opt.register(Lang::Optimizer::eliminate-local-control-exceptions-transform,
                             :when(Lang::Optimizer::eliminate-local-control-exceptions-pattern),
                             :recursive(1),
                             :depends-on<inlineimmediate>);
        $opt.run($past, :combine(1));
    }

In the Lang::Optimizer module, you also split each optimization in two: the &transform sub from each optimization becomes optimization-name-transform, and its $pattern becomes optimization-name-pattern.
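The refactored Lang::Optimizer then looks roughly like this, still with the bodies elided. (I'm assuming here that the -pattern names are subs that build and return the pattern; package variables would serve just as well.)

module Lang::Optimizer {
    # Each optimization is split in two so that Tree::Optimizer can use the
    # pattern as a :when condition and the transform as the pass itself.
    sub temp-variable-elimination-pattern () { ... }
    sub temp-variable-elimination-transform ($/) { ... }

    sub tail-call-elimination-pattern () { ... }
    sub tail-call-elimination-transform ($/) { ... }

    sub inline-immediate-blocks-pattern () { ... }
    sub inline-immediate-blocks-transform ($/) { ... }

    sub eliminate-local-control-exceptions-pattern () { ... }
    sub eliminate-local-control-exceptions-transform ($/) { ... }
}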

The :when option specifies a pattern that controls when the optimization will be performed. The optimization will only be performed on trees that match the pattern. The :recursive option means that the optimization will be performed on the children of the PAST, as well. In combination with :when, it becomes equivalent to PAST::Pattern.transform.
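To make the equivalence concrete (the &my-transform and $my-pattern names below are just placeholders):

    # Registering a transform with :when and :recursive ...
    $opt.register(&my-transform, :when($my-pattern), :recursive(1));

    # ... is intended to have the same effect as calling transform directly:
    $my-pattern.transform($past, &my-transform);

    # except that Tree::Optimizer now has the chance to combine this pass
    # with other :when/:recursive passes into a single traversal.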

You also supply the :combine option to the run method. When this option is supplied, the optimizer will attempt to combine any optimizations that use :when and :recursive so that they require only a single traversal of the tree. Note that if a :when/:recursive optimization depends on an optimization that does not use :when/:recursive, which in turn depends on another :when/:recursive optimization, there must be at least two traversals of the tree, since only :when/:recursive optimizations can be combined.
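For example, with a plain pass sandwiched between two :when/:recursive passes (every name here is hypothetical), the outer two passes cannot share a traversal even with :combine:

    # A is :when/:recursive, B is a plain pass that depends on A, and C is
    # :when/:recursive and depends on B.  A must finish before B runs, and C
    # can only run after B, so A and C land in separate traversals.
    $opt.register(&a-transform, :when($a-pattern), :recursive(1), :name<A>);
    $opt.register(&b-pass, :name<B>, :depends-on<A>);
    $opt.register(&c-transform, :when($c-pattern), :recursive(1), :depends-on<B>);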

In optimize-past above, however, all four optimizations use :when and :recursive, so Tree::Optimizer.run should be able to perform all of them in a single traversal.

These are some of the features that Tree::Optimizer will have. I haven't discussed analysis or convenience classes analogous to LLVM's FunctionPass and LoopPass; I expect to flesh out their design as the implementation of Tree::Optimizer progresses.

That's all for tonight. Happy hacking!