Channels ▼

Jocelyn Paine

Dr. Dobb's Bloggers

Stack Machines, Expression Evaluation, and the Magic of Reverse Polish

February 21, 2010

To teach a friend how compilers work, I've written a compiler for a small subset of Pascal. It lexically analyses and parses a program, code-generates the syntax tree into code for a stack machine not unlike the Ocode virtual machine once used by C's ancestral language BCPL, and interprets the code. I'm going to write a few essays about the compiler and its target machine; in this first one, I'll demonstrate how a stack machine evaluates expressions.

A good place to start is reverse Polish. As many readers will know, this is a notation for expressions whereby all operators and functions follow their arguments. It's important because once we have the reverse Polish form of an expression, we can trivially transcribe it into a program that evaluates the expression.

Here's a very simple example. The reverse Polish form of the expression

11 * 22<br />
is
11 22 *<br />

This works with any number of operators. So, for example, the reverse Polish form of the expression

11 * 22 + 33<br />
is
11 22 * 33 +<br />

Notice that this is the reverse Polish for "E + 33", which is "E 33 +", with "E" replaced by the reverse Polish for "11 * 22". Reverse Polish has this lovely recursive structure.

What's magic about this is that we can transcribe any reverse-Polish expression into a machine-code program that evaluates the expression. Providing, that is, that our machine has a stack.

So, assume a stack. Let there be a "loadconst" instruction, which pushes a constant onto the stack. And let there be an "add" instruction, which pops the top two elements off the stack, adds them, and pushes the result back. Let "mult" operate similarly. And let us transcribe the reverse Polish above, replacing constants by "loadconst", and renaming operators to the corresponding instruction.

Doing this,

11 22 * 33 +<br />
becomes
loadconst 11 loadconst 22 mult loadconst 33 add<br />

I fed the instruction sequence above to my stack-machine interpreter, and this is what it displayed. The stack, which starts off empty, is shown as a list, with the topmost element on the left:

Stack: []<br /><br />About to do OP(loadconst, 11)<br />Stack: [11]<br /><br />About to do OP(loadconst, 22)<br />Stack: [22, 11]<br /><br />About to do OP(mult)<br />Stack: [242]<br /><br />About to do OP(loadconst, 33)<br />Stack: [33, 242]<br /><br />About to do OP(add)<br />Stack: [275]<br />

Here's a bigger example. I'm going to evaluate this expression:

((11 * 22 + 33 * 44) / 121 = 14)<br />and<br />(10 = 20 / 2)<br />

And here is the same expression rewritten in reverse Polish. The line breaks have no significance, they just stop me overflowing the blog's right-hand margin:

11 22 * 33 44 * +<br />121 /<br />14 =<br />10 20 2 /<br />=<br />and<br />

This is the reverse Polish transcribed into machine instructions:

loadconst 11<br />loadconst 22<br />mult<br />loadconst 33<br />loadconst 44<br />mult<br />add<br />loadconst 121<br />div<br />loadconst 14<br />eq<br />loadconst 10<br />loadconst 20<br />loadconst 2<br />div<br />eq<br />logand<br />

I've used a greater variety of instructions here. The "div" instruction divides the element next to the top of the stack by the element on the top, popping them and pushing back the integer quotient. The "eq" instruction compares the top two elements, pops them, and pushes back 1 (signifying TRUE) if they are equal, 0 (signifying FALSE) otherwise. And the "logand" instruction does a logical "and", popping the top two elements and pushing back 1 if they are both 1, otherwise 0.

Now here is my interpreter interpreting those instructions:

Stack: []<br /><br />About to do OP(loadconst, 11)<br />Stack: [11]<br /><br />About to do OP(loadconst, 22)<br />Stack: [22, 11]<br /><br />About to do OP(mult)<br />Stack: [242]<br /><br />About to do OP(loadconst, 33)<br />Stack: [33, 242]<br /><br />About to do OP(loadconst, 44)<br />Stack: [44, 33, 242]<br /><br />About to do OP(mult)<br />Stack: [1452, 242]<br /><br />About to do OP(add)<br />Stack: [1694]<br /><br />About to do OP(loadconst, 121)<br />Stack: [121, 1694]<br /><br />About to do OP(div)<br />Stack: [14]<br /><br />About to do OP(loadconst, 14)<br />Stack: [14, 14]<br /><br />About to do OP(eq)<br />Stack: [1]<br /><br />About to do OP(loadconst, 10)<br />Stack: [10, 1]<br /><br />About to do OP(loadconst, 20)<br />Stack: [20, 10, 1]<br /><br />About to do OP(loadconst, 2)<br />Stack: [2, 20, 10, 1]<br /><br />About to do OP(div)<br />Stack: [10, 10, 1]<br /><br />About to do OP(eq)<br />Stack: [1, 1]<br /><br />About to do OP(logand)<br />Stack: [1]<br />

The two equality comparisons both returned TRUE, that is 1; and the logical "and" combined these, pushing a final 1 onto the stack.

In my next essay on this topic, I'll extend the machine to implement variables, jumps, and input and output. By the way, it has not been unknown for people to code directly in reverse Polish — as The Evolution of RPN & Numeric Entry in Hewlett-Packard's The Museum of HP Calculators tells us.

 


Jocelyn Paine
popx@j-paine.org

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video