jump to navigation

Writing a Virtual Machine in Python April 17, 2008

Posted by PythonGuy in Advanced Python, Lisp, Python.
Tags: , ,

Python is the ultimate prototyping language, right?

Yes! I am glad I know it.

In a few hours last night, I was able to write a virtual machine that could power a Scheme interpreter.

The virtual machine is simply an object. The instructions and operations are just methods. And there is the execute loop, which simply pulls off an instruction, advances the code pointer, and then executes it.

Writing in this way is certainly easier than writing in C. It’s tempting to use the advanced features of Python, but it is easy to see how I would write the program in C.

Here is a very simple register-based virtual machine that will calculate the greatest common denominator of two integers.

class Machine(object):

    def __init__(self, program):
        # The program--a tuple of tuples which represent instructions.
        self.program = program

        # Registers
        self.a = self.b = self.t = None

        # Whether to branch
        self.flag = False

        # Code pointer
        self.pc = 0

    def execute(self):
        while self.pc is not None:
            i = self.program[self.pc]
            print self.pc, self.a, self.b, self.t, self.flag, i
            instr, rest = i[0], i[1:]
            self.pc += 1 # Don't forget to increment the counter
            getattr(self, 'i_'+instr)(*rest)

    def i_copy(self, a, b):
        """Duplicates register b in register a"""
        setattr(self, a, getattr(self, b))

    def i_set(self, a, b):
        """Sets register a to the value b"""
        setattr(self, a, b)

    def i_exec(self, reg, op, *args):
        """Calls op and stores the result in reg."""
        setattr(self, reg, getattr(self, 'o_'+op)(*args))

    def i_test(self, op, *rest):
        if getattr(self, 'o_'+op)(*rest):
            self.flag = True
            self.flag = False

    def i_branch(self, line):
        """Jump to line if flag is set"""
        if self.flag: self.pc = line

    def i_jump(self, line):
        """Jump to line"""
        self.pc = line

    def o_zero(self, reg):
        """Is reg zero?"""
        return getattr(self, reg) == 0

    def o_lt(self, a, b):
        return getattr(self, a) < getattr(self, b)

    def o_sub(self, a, b):
        """reg a - reg b"""
        return getattr(self, a) - getattr(self, b)

m = Machine((
    # If b is zero, then we are done.
    ('test', 'zero', 'b'),      # if b == 0
    ('branch', None),           # We're done
    ('copy', 't', 'a'),         # t <- a

    # If a < b, then we swap out b and a.
    ('test', 'lt', 't', 'b'),   # t < b?
    ('branch', 7),

    # Subtract out b from a.              
    ('exec', 't', 'sub', 't', 'b'), # t <- t-b
    ('jump', 3),

    ('copy', 'a', 'b'),         # a <- b
    ('copy', 'b', 't'),         # b <- t
    ('jump', 0),
m.a = 56
m.b = 12
print m.a

(Code was borrowed from SICP.)

Optimization? Rewrite it in C. But at least in Python, it’s a throw-away project since it only took a few minutes to implement.



1. prasad iyer - August 29, 2009

Great article.

2. Int-0 - June 28, 2010

Why do you use getattr(self, a) – getattr(self, b) instead of a – b?

PythonGuy - June 29, 2010

a and b hold the names of the attributes, not the values those attributes represent.

Think of pointers in C.

If you’re not familiar with C, then imagine a and b hold the phone numbers of two different people. getattr(self, a) will actually give you the person you want to talk to. So getattr(self, a) means “give me the person who has the phone number that is the same as what’s currently stored in a.”

3. Int-0 - June 29, 2010

Ahh… ok! thank you

I write some virtual machines in python, but I’m use to decode operation a list of methods or dict of methods, for example:
decode = { ‘copy’ : self.__copy, ‘jmp’: self.__jump }

…and then…

But your methods it’s more interesting and, I think, faster…

4. gary - November 20, 2010

I wish you could have written it in C. i could have understood

5. James Mills (@therealprologic) - October 24, 2012

Great little article. Some of the code doesn’t render nicely on the page. –James

6. Kalpak Take - April 22, 2017

We can use mapping or dictionary datatype for memory and stack data structure for instructions

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: