Writing a Virtual Machine in Python April 17, 2008
Posted by PythonGuy in Advanced Python, Lisp, Python.Tags: Python, sicp, virtual machine
trackback
Python is the ultimate prototyping language, right?
Yes! I am glad I know it.
In a few hours last night, I was able to write a virtual machine that could power a Scheme interpreter.
The virtual machine is simply an object. The instructions and operations are just methods. And there is the execute loop, which simply pulls off an instruction, advances the code pointer, and then executes it.
Writing in this way is certainly easier than writing in C. It’s tempting to use the advanced features of Python, but it is easy to see how I would write the program in C.
Here is a very simple register-based virtual machine that will calculate the greatest common denominator of two integers.
class Machine(object): def __init__(self, program): # The program--a tuple of tuples which represent instructions. self.program = program # Registers self.a = self.b = self.t = None # Whether to branch self.flag = False # Code pointer self.pc = 0 def execute(self): while self.pc is not None: i = self.program[self.pc] print self.pc, self.a, self.b, self.t, self.flag, i instr, rest = i[0], i[1:] self.pc += 1 # Don't forget to increment the counter getattr(self, 'i_'+instr)(*rest) def i_copy(self, a, b): """Duplicates register b in register a""" setattr(self, a, getattr(self, b)) def i_set(self, a, b): """Sets register a to the value b""" setattr(self, a, b) def i_exec(self, reg, op, *args): """Calls op and stores the result in reg.""" setattr(self, reg, getattr(self, 'o_'+op)(*args)) def i_test(self, op, *rest): if getattr(self, 'o_'+op)(*rest): self.flag = True else: self.flag = False def i_branch(self, line): """Jump to line if flag is set""" if self.flag: self.pc = line def i_jump(self, line): """Jump to line""" self.pc = line def o_zero(self, reg): """Is reg zero?""" return getattr(self, reg) == 0 def o_lt(self, a, b): return getattr(self, a) < getattr(self, b) def o_sub(self, a, b): """reg a - reg b""" return getattr(self, a) - getattr(self, b) m = Machine(( # If b is zero, then we are done. ('test', 'zero', 'b'), # if b == 0 ('branch', None), # We're done ('copy', 't', 'a'), # t <- a # If a < b, then we swap out b and a. ('test', 'lt', 't', 'b'), # t < b? ('branch', 7), # Subtract out b from a. ('exec', 't', 'sub', 't', 'b'), # t <- t-b ('jump', 3), ('copy', 'a', 'b'), # a <- b ('copy', 'b', 't'), # b <- t ('jump', 0), )) m.a = 56 m.b = 12 m.execute() print m.a
(Code was borrowed from SICP.)
Optimization? Rewrite it in C. But at least in Python, it’s a throw-away project since it only took a few minutes to implement.
Great article.
thanks
Why do you use getattr(self, a) – getattr(self, b) instead of a – b?
a and b hold the names of the attributes, not the values those attributes represent.
Think of pointers in C.
If you’re not familiar with C, then imagine a and b hold the phone numbers of two different people. getattr(self, a) will actually give you the person you want to talk to. So getattr(self, a) means “give me the person who has the phone number that is the same as what’s currently stored in a.”
Ahh… ok! thank you
I write some virtual machines in python, but I’m use to decode operation a list of methods or dict of methods, for example:
decode = { ‘copy’ : self.__copy, ‘jmp’: self.__jump }
…and then…
decode[instruction]();
But your methods it’s more interesting and, I think, faster…
I wish you could have written it in C. i could have understood
Great little article. Some of the code doesn’t render nicely on the page. –James
We can use mapping or dictionary datatype for memory and stack data structure for instructions