This project is currently a for-fun work in progress.
The main goal of this project is to learn about programming languages by rewriting CPython 3.5.0 in Go.
So far I have a mostly working scanner/tokenizer. Its goal was to produce output similar to that of `python3 -m tokenize --exact <script.py>`.
There are currently a few small differences in the output format, but the tokens produced are the same.
Next up is the parser, which should generate an AST matching the form produced by the following script:
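For reference, the token stream that the scanner is compared against can also be produced from within Python using the standard `tokenize` module. The sketch below is not part of this project, and the helper name `exact_tokens` is made up for illustration. It reports each token's exact type (for example `LPAR` rather than a generic `OP`), which is what the `--exact` flag does on the command line.

```python
import io
import token
import tokenize


def exact_tokens(source):
    """Return (exact type name, token string) pairs for Python source text."""
    readline = io.StringIO(source).readline
    return [
        # exact_type distinguishes LPAR, RPAR, etc. instead of a generic OP.
        (token.tok_name[tok.exact_type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


if __name__ == '__main__':
    for name, text in exact_tokens("print('hello world')\n"):
        print(name, repr(text))
```

Comparing these pairs against the Go scanner's output is one way to check that the tokens match even where the output formatting differs.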

    # Note: the parser and symbol modules were removed in Python 3.10,
    # so run this script with an older Python 3 (for example 3.5).
    import parser
    import pprint
    import symbol
    import token


    def resolve_symbol_names(part):
        # Leaves (token strings) and empty lists are returned unchanged.
        if not isinstance(part, list):
            return part
        if not part:
            return part

        symbol_id = part[0]
        if symbol_id in symbol.sym_name:
            # Grammar symbol: replace the numeric id and recurse into children.
            symbol_name = symbol.sym_name[symbol_id]
            return [symbol_name] + [resolve_symbol_names(p) for p in part[1:]]
        elif symbol_id in token.tok_name:
            # Token: replace the numeric id and keep the token text as-is.
            token_name = token.tok_name[symbol_id]
            return [token_name] + part[1:]
        return part


    def main(filename):
        with open(filename, 'r') as fp:
            contents = fp.read()

        st = parser.suite(contents)
        ast = resolve_symbol_names(st.tolist())
        pprint.pprint(ast)


    if __name__ == '__main__':
        import sys
        main(sys.argv[1])

Save the script as parse.py and run it with `python3 parse.py <script.py>`. For example:

    $ echo "print('hello world')" > test.py
    $ python3 parse.py test.py
    ['file_input',
     ['stmt',
      ['simple_stmt',
       ['small_stmt',
        ['expr_stmt',
         ['testlist_star_expr',
          ['test',
           ['or_test',
            ['and_test',
             ['not_test',
              ['comparison',
               ['expr',
                ['xor_expr',
                 ['and_expr',
                  ['shift_expr',
                   ['arith_expr',
                    ['term',
                     ['factor',
                      ['power',
                       ['atom_expr',
                        ['atom', ['NAME', 'print']],
                        ['trailer',
                         ['LPAR', '('],
                         ['arglist',
                          ['argument',
                           ['test',
                            ['or_test',
                             ['and_test',
                              ['not_test',
                               ['comparison',
                                ['expr',
                                 ['xor_expr',
                                  ['and_expr',
                                   ['shift_expr',
                                    ['arith_expr',
                                     ['term',
                                      ['factor',
                                       ['power',
                                        ['atom_expr',
                                         ['atom',
                                          ['STRING',
                                           "'hello world'"]]]]]]]]]]]]]]]]]],
                         ['RPAR', ')']]]]]]]]]]]]]]]]]]],
       ['NEWLINE', '']]],
     ['NEWLINE', ''],
     ['ENDMARKER', '']]

The compiler comes after the parser. It will be responsible for converting the parsed AST into Python bytecode.
The interpreter comes after the compiler and will execute Python bytecode.
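To get a feel for the compiler's target, CPython itself can show the bytecode it generates. The snippet below is only an illustration using the built-in `compile` function and the standard `dis` module, not code from this project. The exact instructions printed vary between CPython versions.

```python
import dis

source = "x = 1 + 2\n"

# compile() runs CPython's own parser and compiler and returns a code object.
code = compile(source, '<example>', 'exec')

# co_code holds the raw bytecode; dis renders it as readable instructions.
print(code.co_code)
dis.dis(code)

# Executing the code object binds x in the given namespace.
namespace = {}
exec(code, namespace)
print(namespace['x'])
```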
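CPython's bytecode targets a stack-based virtual machine, so the core of an interpreter is a loop that pushes and pops values. The toy sketch below is not this project's design. The opcode names are borrowed from CPython 3.5, but the `(opcode, argument)` tuple encoding and the `run` helper are simplifications made up for illustration.

```python
def run(instructions):
    """Execute a list of (opcode, argument) pairs on a value stack."""
    stack = []
    names = {}
    for opcode, arg in instructions:
        if opcode == 'LOAD_CONST':
            stack.append(arg)
        elif opcode == 'LOAD_NAME':
            stack.append(names[arg])
        elif opcode == 'STORE_NAME':
            names[arg] = stack.pop()
        elif opcode == 'BINARY_ADD':
            # The right operand was pushed last, so it is popped first.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError('unknown opcode: %r' % (opcode,))
    return names


# Hand-written program roughly equivalent to: x = 1 + 2
program = [
    ('LOAD_CONST', 1),
    ('LOAD_CONST', 2),
    ('BINARY_ADD', None),
    ('STORE_NAME', 'x'),
]

if __name__ == '__main__':
    print(run(program))
```

A real interpreter would also need frames, a block stack, exceptions and many more opcodes; this only shows the basic evaluation loop.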