FAQ
What does 'PIR' stand for?
PIR stands for Parrot Intermediate Representation. In the early days, Parrot could only be programmed in Parrot Assembly (PASM), but it soon gained an intermediate code compiler (IMCC) that allowed a somewhat more human-readable syntax. For instance, instead of saying:
set I0, 42
one could say:
I0 = 42
Since then, IMCC has been the recommended way of targeting Parrot. At some point, people started using the name PIR instead of IMC, and it was decided that it should be standardized. So, PIR was chosen as the name for the language that was formerly known as IMC. PIR code is still parsed by a compiler called IMCC (see compilers/imcc in the parrot root directory).
What's the difference between PIR and PASM?
PASM stands for Parrot Assembly and is Parrot's native language. PIR, on the other hand, is a bit more high-level, and should be considered a layer of syntactic sugar on top of PASM. Like PASM, PIR can be read by Parrot directly. As long as there is no complete high-level language (HLL) for Parrot, PIR is the recommended language for programming Parrot.
PIR Tutorial
Introduction
Welcome to the beginners' tutorial for Parrot Intermediate Representation! So, you want to program the most exciting and new virtual machine for dynamic languages, eh? This is the place to start! Although Parrot is still under development, it can already solve a lot of your programming problems. As time progresses, it will get even better. Moreover, because Parrot is currently not compiled with optimizations, it will get faster too! It should be noted at this point that the syntax of Parrot's internal language is not set in stone, and may change at some point. However, if you stick to the syntax as described in this tutorial, you should be quite safe.
If you're comfortable with reading grammars in BNF format, you might take a look at the grammar of PIR as defined in languages/PIR. Note that this is not the official implementation; it is an attempt to be as close to it as possible. If you don't know what BNF is, you may forget about it :-).
Now, let's get started!
PIR Basics
Your first Parrot program
As always, we start with the simplest program imaginable:
.sub main
print "Hello Parrot!\n"
.end
Save this program to a file called hello.pir. Then, to run this program, type this on the command line (assuming you successfully compiled Parrot):
parrot hello.pir
And the output will be:
Hello Parrot!
That was not too hard, now was it? Before we continue to more complex examples, let's first analyze what happened. The first line in the file is .sub main. This indicates that we're defining a subroutine that goes by the name main. Note that it is not necessary to name your subroutine like this, even if it's the only subroutine. The name main does not indicate that execution will start at that subroutine, as it does in C. In PIR, execution will start at the top-most defined subroutine in the file, no matter what its name is. (There are ways to change this, but we will forget that for now. More on that later.) As you can see on the third line, the subroutine is closed with the .end directive.
In between these subroutine directives, you can define the subroutine body, which consists of PIR or Parrot assembly (PASM) instructions. In this simple program we just stuck to the print instruction. It takes one parameter that can be of any type, as long as it is something (i.e. it is not undefined or null). Please note that all instructions should be in between a .sub/.end pair.
More instructions
Parrot has a lot of instructions. I mean, a lot. This tutorial will not discuss all of them, but instead we will discuss them as the need arises for them. Now, we will first see how to do some calculations so you can do some useful stuff. We'll do it step by step and explain things as they pass by. (Do note however, this is *not* a tutorial on assembly programming, so some knowledge of registers etc. is expected).
Storing things
Before we continue, we need to explain some details on how Parrot stores numbers, strings and objects. As Parrot is a register-based virtual machine (as opposed to stack-based VMs like the Java VM), you store things in registers. There are four types: registers for storing integers (I registers), floating-point numbers (N registers), strings (S registers) and objects (P registers). So, if we need to store some things, we could do it like this:
I0 = 42 # store 42 in integer register 0
N10 = 3.14 # store 3.14 in numeric register 10
S20 = "Hello world!" # store this string in string register 20
P30 = new .String # create a new String object in PMC register 30.
# See the section "Where to read further?" for links to more tutorials.
Above we used Parrot registers, and there's only a limited number of them. Instead, it's better to use temporary registers; they look almost the same as registers, but have a $ prefix. They can be considered as variables that don't need any declaration (and you can use as many of them as you need). Some examples:
$I0 = 42
$S9999999 = "Hi" # use *any* register number
However, if you prefer to give things meaningful names, you might consider using named temporary variables. These, however, do need a declaration, which looks like this:
.local int answer
.local num PI, e
answer = 42
PI = 3.14
e = 2.7
This declares some temporary variables. Although this declares an integer and some numeric variables, you could use any of the following types:
- int - declare an integer variable
- num - declare a floating-point number variable
- string - declare a string variable
- pmc - declare a Parrot Magic Cookie (PMC) variable
You might wonder what the heck a Parrot Magic Cookie is. This is where Parrot's magic comes in. In fact, it's so magical that there's a separate document written about it. Have a look at the section "Where to read further?".
Now that we know how to store numbers and strings, let's do some operations on those values.
Calculating things
Calculating things is as trivial as you might expect. We'll give some full examples below, so you can copy+paste the code and run it yourself:
ABC formula
.sub foo
.local num a, b, c, det
# give a, b and c some value for now; later specify them as parameters
a = 2
b = -3
c = -2
# calculate -b and b squared.
$N0 = -b
$N1 = b * b
# calculate 4ac
$N2 = 4 * a
$N2 = $N2 * c
# calculate 2a, the denominator
$N3 = 2 * a
# calculate b squared minus 4ac and take its square root
det = $N1 - $N2
$N4 = sqrt det
.local num x1, x2
x1 = $N0 + $N4
x1 = x1 / $N3
x2 = $N0 - $N4
x2 /= $N3 # fancy way of saying x2 = x2 / $N3, but more efficient
print "Answers to ABC formula are:\n"
print "x1 = "
print x1
print "\nx2 = "
print x2
print "\n"
.end
Of course, as Parrot offers operations at a more abstract level than hardware processors, you can also do fancier things, such as manipulating strings, as in the example below:
.sub joe
.local string name
name = " Joe!"
$S0 = "Hi"
$S1 = $S0 . name
$S1 .= "\n" # extend $S1 with "\n"
print $S1
.end
In PIR, the dot is shorthand for the concat operation. It takes two strings and concatenates them. Just like the compound assignment in the ABC formula example (x2 /= $N3), concatenation has a compound form as well: the .= operator.
As mentioned, Parrot has many instructions. This tutorial will not list all of them, but instead you could take a look at parrotcode.org/docs/ops that lists all ops by category.
More on subroutines
This section will discuss a little bit more on subroutines so you can do some useful stuff. Although there's much more to subroutines, we'll postpone that to a later section.
Passing parameters
In order to pass parameters to a sub, you'll need to define these parameters. This is easy:
.sub foo
.param int n
.param string message
# do something useful
.end
Sometimes you want a subroutine that might take a parameter, depending on the situation. In that case you'd want to use optional parameters.
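As a minimal sketch of what an optional parameter can look like, the example below assumes the :optional and :opt_flag parameter flags described in the calling conventions design document (see "Where to read further?"). The sub name greet and the variable has_name are made up for this illustration. The example also uses an if statement and a label, both of which are explained later in this tutorial.

```pir
.sub main
    # call 'greet' once with and once without its optional argument
    greet("Joe")
    greet()
.end

.sub greet
    # 'name' may be omitted by the caller;
    # 'has_name' is set to 1 if it was passed, 0 otherwise
    .param string name     :optional
    .param int    has_name :opt_flag
    if has_name goto say_hi
    # no argument was passed, so fall back to a default
    name = "stranger"
  say_hi:
    print "Hi, "
    print name
    print "!\n"
.end
```

The first call should print "Hi, Joe!" and the second "Hi, stranger!", because the flag variable tells the sub whether the caller supplied a value.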
Returning values
If you remember the example in which we implemented the ABC formula, you will have seen that we calculated the answers, but only printed them. Usually, you'd like to have a subroutine calculate something and then return the answers. Instead of printing them, you could return the answers, as shown in this code snippet:
.sub abc
.local num x1, x2
# do some calculations
.return (x1, x2)
.end
Invoking subroutines
Now that you know how to define parameters and return values, it's time to explore how to invoke your subroutine. Some examples:
.sub main
# invoke 'foo' without parameters, no return values
foo()
# invoke 'bar' with parameters $N0, 42, and "hi", no return values
$N0 = 3.14
bar($N0, 42, "hi")
# invoke 'baz' with parameters $N2, "hello yourself", and return values
.local int a
.local num b
.local string c
$N2 = 2.7
(a, b, c) = baz($N2, "hello yourself")
.end
.sub foo
print "Foo!\n"
.end
.sub bar
.param num i
.param int answer
.param string message
print "Bar!\n"
print i
print "\n"
print answer
print "\n"
print message
.end
.sub baz
.param num e
.param string msg
print "Baz!\n"
print e
print "\n"
print msg
.return (1000, 1.23, "hi from baz")
.end
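To tie parameters, return values and invocation together, here is a sketch of the ABC formula from the "Calculating things" section rewritten as a subroutine that takes a, b and c as parameters and returns both answers. The local names root and denom are introduced just for this illustration, and the arguments are written as floating-point constants to match the num parameters.

```pir
.sub main
    .local num x1, x2
    # pass a, b and c as arguments and receive both answers
    (x1, x2) = abc(2.0, -3.0, -2.0)
    print "x1 = "
    print x1
    print "\nx2 = "
    print x2
    print "\n"
.end

.sub abc
    .param num a
    .param num b
    .param num c
    .local num det, root, denom, x1, x2
    # det = b*b - 4ac
    det = b * b
    $N0 = 4 * a
    $N0 = $N0 * c
    det = det - $N0
    root = sqrt det
    # denom = 2a
    denom = 2 * a
    # x1 = (-b + root) / 2a, x2 = (-b - root) / 2a
    $N1 = -b
    x1 = $N1 + root
    x1 /= denom
    x2 = $N1 - root
    x2 /= denom
    .return (x1, x2)
.end
```

For a = 2, b = -3 and c = -2, the two answers are 2 and -0.5, the same values the earlier stand-alone example computes.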
Controlling the flow of your program
PIR has a few instructions to control the flow of your program. This section describes them.
Goto statements
The most basic one is of course the goto instruction. It's very simple:
...
goto L1
L1:
print "hi!\n"
...
Although the use of goto is not advised in high-level languages (HLLs), in PIR you are hardly able to write useful programs without it. Besides, PIR is an assembly language after all, so that's OK.
If statements
PIR has a built-in if statement. It can take three forms:
Evaluate an atomic expression:
...
.local int x
x = 1
if x goto L1
...
L1:
Evaluate a binary expression:
...
.local int x
x = 1
if x == 2 goto L2
...
Evaluate an object expression:
...
.local pmc obj
if null obj goto L3
...
If obj is null, then the execution engine will continue at label L3. More on PMCs soon.
Unless statements
While an if statement is useful, it is sometimes more efficient to use the unless instruction. It's the opposite of the if statement, and jumps to the specified label unless its argument is true. Its format is exactly the same as that of the if statement, except that the word if is replaced by unless. So you should be able to figure out how to write the unless instruction.
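For completeness, here is a small sketch of the atomic and the binary form of unless, written by swapping the keyword in the if forms shown earlier. The label names are arbitrary.

```pir
.sub main
    .local int x
    x = 0
    # atomic form: jump because x is false (zero)
    unless x goto x_is_false
    print "x is true\n"
    goto check_value
  x_is_false:
    print "x is false\n"
  check_value:
    # binary form: jump because the comparison x == 2 does not hold
    unless x == 2 goto not_two
    print "x is 2\n"
    goto done
  not_two:
    print "x is not 2\n"
  done:
    print "done\n"
.end
```

With x set to 0, both unless instructions jump, so the program prints "x is false", "x is not 2" and "done", each on its own line.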
Loop constructions
PIR has no built-in while or for statement, but a loop can easily be implemented using the if statement, some gotos and a couple of labels, like this:
...
.local int i
i = 0
loop_begin:
if i >= 10 goto loop_end
print i
print "\n"
i += 1
goto loop_begin
loop_end:
...
This loop will print the numbers 0 to (but not including) 10 to the screen.
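The same pattern works for any loop that accumulates a result. As a sketch, the complete program below adds up the numbers 1 to 10. The variable name total is chosen for this illustration.

```pir
.sub main
    .local int i, total
    i = 1
    total = 0
  loop_begin:
    # leave the loop once i has passed 10
    if i > 10 goto loop_end
    total += i
    i += 1
    goto loop_begin
  loop_end:
    print total    # prints 55
    print "\n"
.end
```

The test sits at the top of the loop, so the body is skipped entirely if the condition already holds on entry, just as with a while loop in a high-level language.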
Splitting your program into several files
Sometimes it's easier to split up your program into multiple files. There are two ways to do this:
- use the .include directive
- compile the PIR files separately and load them using load_bytecode
The first way is the simplest. Use the .include directive at the point in your main file where you'd like to pull in the contents of another file. The contents of the specified file are read and replace the .include directive, just as the #include directive does in the C preprocessor.
An example will show what happens:
Contents of the main file:
.sub main
foo()
.end
.include "foolib.pir"
And the file foolib.pir contains:
.sub foo
print "foo!\n"
.end
After processing the .include directive, the input to the PIR compiler looks like this:
.sub main
foo()
.end
.sub foo
print "foo!\n"
.end
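The second way, load_bytecode, could look roughly like the sketch below. It assumes that foolib.pir has first been compiled to a bytecode file named foolib.pbc, for example with a command such as parrot -o foolib.pbc foolib.pir. The output file name is an assumption made for this example.

```pir
# main file: load the separately compiled library at run time,
# then call the sub it defines
.sub main
    load_bytecode "foolib.pbc"
    foo()
.end
```

Unlike .include, which pastes the source text in before compilation, load_bytecode pulls in the already compiled code while the program is running, so the library does not have to be recompiled each time the main file changes.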
Where to read further?
Take a look at these documents:
- docs/glossary.pod - contains explanations of some often used terms
- docs/art/pp001-intro.pod - a general introduction
- docs/art/pp002-pmc.pod - a good introduction to PMCs
- docs/art/pp003-oop.pod - an introduction to Object Oriented Programming in Parrot
- docs/imcc/ - all files in this directory
- docs/compiler_faq.pod - a document describing how to implement various language constructs in PIR
- docs/pdds/pdd03_calling_conventions.pod - the Parrot Design Document on Parrot's calling conventions
- docs/pdds/pdd20_lexical_vars.pod - the Parrot Design Document on lexical variables
- languages/PIR/docs/pirgrammar.pod - the grammar of PIR as implemented using PGE (matches about 90% of PIR)
- compilers/pirc - a top-down recursive descent parser for PIR, with embedded specification
- parrotcode.org/docs - all kinds of files on particular subjects