One of the advantages of a common virtual machine for various languages is the ability to apply the same optimizations to all of those languages. For example, LLVM includes optimization passes to propagate constants, eliminate dead arguments, code, and globals, inline functions, and eliminate recursive tail calls, among others. Any language with a compiler to LLVM can easily take advantage of these optimizations without any additional work by the compiler writer.
LLVM provides a convenient framework for performing transformations or analyses on LLVM code. Parrot has no analogous framework. If you want to be optimize compiled Parrot code, you can either implement your optimizations in individual HLL compilers or parse PIR or PBC and transform it. There's no way to write optimizations at a high level that apply to all high-level languages on Parrot.
Because of this, few HLL compilers(if any) implement such simple optimizations as eliminating tail calls. Compare this to LLVM, where the "tailcallelim" optimization not only eliminates tail recursive calls, but even transforms some non-tail-recursive calls to tail recursive calls and then eliminates them. With compilers for Parrot, you can't write a tail-recursive factorial without needing to worry about stack overflows. With LLVM, you can write the naive linear-space factorial and not worry about a stack overflow.
This is the problem that my GSoC project will correct. This summer, I will first develop a low-level framework for traversing and transforming PASTs. On top of this framework(tentatively called PAST::Walker), I will implement a library for regex-like pattern matching on PASTs(tentatively called PAST::Pattern). I'll also add some helper functions for adding optimization passes to the HLL::Compiler cycle. Using these libraries, I'll spend the rest of the summer implementing a few useful optimizations.
I haven't yet nailed down the details of the design of any of the libraries I'll be implementing, but I thought the Community Bonding Period(now until May 24) would be a good opportunity to set out some of my initial thoughts on the design and get feedback from chromatic and the rest of the Parrot community. I'm particularly interested in feedback from HLL compiler writers and anyone who is interested in compiler optimization.
In this post, I'm going to discuss the lowest level of my project: the PAST traversal framework. It will be similar to ast.NodeVisitor and ast.NodeTransformer classes Python provides( see here). PAST::Walker will be a class that can be subclassed for individual PAST walkers. It will support two methods that should not generally be overridden in subclasses: 'walk' and 'walkChildren', both of which would accept a PAST as an argument. In addition to these methods, there will be 'walkBlock', 'walkStmts', etc. methods for each PAST::Node subclass. These methods will be called when 'walk' is called on a PAST of the appropriate type. By default, they call 'walkChildren' on the node. The 'walkChildren' method calls 'walk' on each child of the supplied node. In addition, to simplify generating a new traverser at runtime, a subclass of PAST::Walker, tentatively named PAST::Walker::Dynamic, will contain a hash from node type names to subs that will allow it to replace the implementation of individual walkFoo methods dynamically.
I welcome any feedback on IRC, parrot-dev, or by email to email@example.com .
In the next couple of weeks before the official start-coding date, I plan to write a few more blog posts about the higher-level aspects of my projects.
Update: bacek++ reminded me of the existence of multi-methods. A single 'walk' multi-method would make much more sense than walkFoo methods and might have improved performance compared to manually multiple-dispatching as the walkFoo approach would require in 'walk'. I'm a little concerned that that may make NQP use of PAST::Walker less convenient, but given that PAST::Walker is intended to be a somewhat lower-level library that few people will hopefully need to use, that shouldn't be too much of a problem. If someone really wants to use PAST::Walker from NQP, PAST::Walker::Dynamic will usually be more than sufficient.
Is there any other reason why multi-methods might not be the best options for PAST::Walker?