Python 3 interpreter in Go

Brett Langdon 0fd448d5ea add start to interpreter interface		10 years ago

Gython

This project is currently a for-fun work in progress.

The main goal of this project is to learn about programming languages by trying to rewrite CPython 3.5.0 in Go.

Progress

Scanner

So far I have a mostly working scanner/tokenizer. The main goal was to generate output similar to that of running python3 -m tokenize --exact <script.py>. Currently there are a few small differences in the output format, but the tokens produced are the same.
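As a reference point for comparing scanner output, the token stream can also be produced programmatically with CPython's own tokenize module. This is a minimal sketch, not part of Gython; the helper name token_stream is hypothetical. It uses exact_type so that operators are reported by their specific names (LPAR, RPAR) rather than the generic OP, which is what the --exact flag does on the command line.

```python
import io
import token
import tokenize


def token_stream(source):
    """Yield (token name, token text) pairs for a source string."""
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # exact_type reports LPAR/RPAR etc. instead of the generic OP type.
        yield token.tok_name[tok.exact_type], tok.string


if __name__ == '__main__':
    for name, text in token_stream("print('hello world')\n"):
        print(name, repr(text))
```

A Go scanner could be checked against this by printing the same name/text pairs and diffing the two outputs.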

Grammar Parser

Next up is writing the parser, which validates the source code against the grammar. Its output should match the form produced by the following script:

import parser
import pprint
import symbol
import token


def resolve_symbol_names(part):
    if not isinstance(part, list):
        return part

    if not len(part):
        return part

    symbol_id = part[0]
    if symbol_id in symbol.sym_name:
        symbol_name = symbol.sym_name[symbol_id]
        return [symbol_name] + [resolve_symbol_names(p) for p in part[1:]]
    elif symbol_id in token.tok_name:
        token_name = token.tok_name[symbol_id]
        return [token_name] + part[1:]
    return part


def main(filename):
    with open(filename, 'r') as fp:
        contents = fp.read()
    st = parser.suite(contents)
    ast = resolve_symbol_names(st.tolist())
    pprint.pprint(ast)

if __name__ == '__main__':
    import sys
    main(sys.argv[1])

python3 grammar.py <script.py>

$ echo "print('hello world')" > test.py
$ python3 grammar.py test.py
['file_input',
 ['stmt',
  ['simple_stmt',
   ['small_stmt',
    ['expr_stmt',
     ['testlist_star_expr',
      ['test',
       ['or_test',
        ['and_test',
         ['not_test',
          ['comparison',
           ['expr',
            ['xor_expr',
             ['and_expr',
              ['shift_expr',
               ['arith_expr',
                ['term',
                 ['factor',
                  ['power',
                   ['atom_expr',
                    ['atom', ['NAME', 'print']],
                    ['trailer',
                     ['LPAR', '('],
                     ['arglist',
                      ['argument',
                       ['test',
                        ['or_test',
                         ['and_test',
                          ['not_test',
                           ['comparison',
                            ['expr',
                             ['xor_expr',
                              ['and_expr',
                               ['shift_expr',
                                ['arith_expr',
                                 ['term',
                                  ['factor',
                                   ['power',
                                    ['atom_expr',
                                     ['atom',
                                      ['STRING',
                                       "'hello world'"]]]]]]]]]]]]]]]]]],
                     ['RPAR', ')']]]]]]]]]]]]]]]]]]],
   ['NEWLINE', '']]],
 ['NEWLINE', ''],
 ['ENDMARKER', '']]

AST Parsing

AST parsing takes the validated parse tree and converts it into an abstract syntax tree (AST).

The goal is to produce AST output similar to that of the following script:

import ast


def main(filename):
    with open(filename, 'r') as fp:
        contents = fp.read()
    module = ast.parse(contents)
    print(ast.dump(module))

if __name__ == '__main__':
    import sys
    main(sys.argv[1])

$ echo "print('hello world')" > test.py
$ python3 parser.py test.py
Module(body=[Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Str(s='hello world')], keywords=[]))])
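To see which node kinds an AST builder has to support for a given script, the tree can be walked with ast.walk. The sketch below is illustrative only, and the helper name node_type_names is hypothetical. Note that the exact node classes depend on the Python version: the output above comes from Python 3.5, while newer versions represent the string literal as a Constant node rather than Str.

```python
import ast


def node_type_names(source):
    """Return the class name of every node in the AST, in ast.walk order."""
    module = ast.parse(source)
    # ast.walk yields the root first, then descendants breadth-first.
    return [type(node).__name__ for node in ast.walk(module)]


if __name__ == '__main__':
    print(node_type_names("print('hello world')"))
```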

Compiler

The purpose of the compiler is to convert an AST into the appropriate Python bytecode.

The goal is to produce output similar to that of running:

$ echo "num = 5" > test.py
$ python3 -m dis test.py
  1           0 LOAD_CONST               0 (5)
              3 STORE_NAME               0 (num)
              6 LOAD_CONST               1 (None)
              9 RETURN_VALUE
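The same information can be read programmatically, which may be handy when comparing a generated CodeObject against CPython's. This is a rough sketch using the built-in compile() and dis.get_instructions(); the helper name opcode_names is hypothetical. The listing above (LOAD_CONST, STORE_NAME, LOAD_CONST, RETURN_VALUE) corresponds to the statement num = 5 under Python 3.5; opcode sequences and offsets differ between Python versions, so the code prints what it finds rather than assuming a fixed result.

```python
import dis


def opcode_names(source, filename='<example>'):
    """Compile a source string and return its opcode names in order."""
    code = compile(source, filename, 'exec')
    return [instr.opname for instr in dis.get_instructions(code)]


if __name__ == '__main__':
    code = compile('num = 5\n', '<example>', 'exec')
    print(code.co_consts)  # constants table referenced by LOAD_CONST
    print(code.co_names)   # names table referenced by STORE_NAME
    print(opcode_names('num = 5\n'))
```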

Interpreter

The interpreter comes after the compiler and will execute the Python bytecode that the compiler produces.
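To illustrate what executing bytecode means, here is a toy stack machine that handles only the three opcodes from the dis listing in the Compiler section. It is a sketch under simplifying assumptions, not Gython's design: instructions are plain (opname, arg) pairs instead of encoded bytes, and the function name run is hypothetical.

```python
def run(instructions, consts, names):
    """Execute a list of (opname, arg) pairs on a simple value stack."""
    stack = []
    namespace = {}
    for opname, arg in instructions:
        if opname == 'LOAD_CONST':
            # Push the constant at index arg onto the stack.
            stack.append(consts[arg])
        elif opname == 'STORE_NAME':
            # Pop the top of the stack and bind it to the name at index arg.
            namespace[names[arg]] = stack.pop()
        elif opname == 'RETURN_VALUE':
            # Pop the top of the stack and hand it back to the caller.
            return stack.pop(), namespace
        else:
            raise ValueError('unsupported opcode: ' + opname)
    raise ValueError('code ended without RETURN_VALUE')


if __name__ == '__main__':
    program = [
        ('LOAD_CONST', 0),
        ('STORE_NAME', 0),
        ('LOAD_CONST', 1),
        ('RETURN_VALUE', None),
    ]
    result, namespace = run(program, consts=(5, None), names=('num',))
    print(result, namespace)
```

A real interpreter also needs frames, a block stack, and the full opcode set, but the fetch/dispatch loop has the same shape.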