Tooling to generate interpreters
Documentation for the instruction definitions in Python/bytecodes.c
("the DSL") is in interpreter_definition.md.
What's currently here:
- lexer.py: lexer for C, originally written by Mark Shannon
- plexer.py: OO interface on top of lexer.py; main class: PLexer
- parsing.py: parser for the instruction definition DSL; main class: Parser
- generate_cases.py: driver script to read Python/bytecodes.c and write Python/generated_cases.c.h (and several other files)
- analysis.py: Analyzer class used to read the input files
- flags.py: abstractions related to metadata flags for instructions
- formatting.py: Formatter class used to write the output files
- instructions.py: classes to analyze and write instructions
- stacking.py: code to handle generalized stack effects
Note that there is some dummy C code at the top and bottom of
Python/bytecodes.c
to fool text editors like VS Code into believing this is valid C code.
A bit about the parser
The parser class uses a pretty standard recursive descent scheme,
but with unlimited backtracking.
The PLexer class tokenizes the entire input before parsing starts.
We do not run the C preprocessor.
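
To illustrate why tokenizing everything up front is convenient, here is a minimal sketch. It assumes a toy token pattern, and the `ToyLexer` class and its method names are hypothetical; this is not the real PLexer interface. Once all tokens sit in a list, the parser's position is just an index that can be saved and restored cheaply.

```python
import re
from typing import Optional

# Hypothetical toy token pattern: identifiers, integers, or any single
# non-space character. The real lexer.py handles far more of C.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


class ToyLexer:
    """Illustrative only; not the PLexer interface."""

    def __init__(self, src: str) -> None:
        # Tokenize the whole input before any parsing happens.
        self.tokens = TOKEN_RE.findall(src)
        self.pos = 0

    def getpos(self) -> int:
        return self.pos

    def setpos(self, pos: int) -> None:
        # Rewinding is just resetting an index into the token list.
        self.pos = pos

    def next(self) -> Optional[str]:
        # Return the next token and advance, or None at end of input.
        if self.pos < len(self.tokens):
            tok = self.tokens[self.pos]
            self.pos += 1
            return tok
        return None


# Sample input shaped like a DSL definition; used here only as a string.
lexer = ToyLexer("inst(NOP, (--)) { }")
first = lexer.next()
mark = lexer.getpos()
second = lexer.next()
lexer.setpos(mark)            # rewind by one token
second_again = lexer.next()   # the same token is read again
```

Note that nothing in this sketch expands macros or handles `#include`: like the real tooling, it works on the raw source text.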
Each parsing method returns either an AST node (a Node instance)
or None, or raises SyntaxError (showing the error in the C source).
Most parsing methods are decorated with @contextual, which automatically
resets the tokenizer input position when None is returned.
A SyntaxError, in contrast, is irrecoverable.
When a parsing method returns None, it is possible that after backtracking
a different parsing method returns a valid AST.
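
The backtracking convention described above can be sketched roughly as follows. This is a simplified illustration, not the code in parsing.py or plexer.py: the `ToyParser` class, its `expect` helper, and the grammar are all hypothetical, and the decorator here assumes methods that take no arguments besides `self`.

```python
import functools
from typing import Callable, List, Optional, Tuple

Node = Tuple[str, ...]  # stand-in for a real AST node class


def contextual(func: Callable[["ToyParser"], Optional[Node]]):
    """Reset the input position when the wrapped method returns None."""

    @functools.wraps(func)
    def wrapper(self: "ToyParser") -> Optional[Node]:
        begin = self.pos
        result = func(self)
        if result is None:
            self.pos = begin  # backtrack: undo any tokens consumed
        return result

    return wrapper


class ToyParser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = list(tokens)
        self.pos = 0

    def expect(self, text: str) -> Optional[str]:
        # Consume one token if it matches; otherwise consume nothing.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == text:
            self.pos += 1
            return text
        return None

    @contextual
    def pair_ab(self) -> Optional[Node]:
        if self.expect("a") and self.expect("b"):
            return ("pair", "a", "b")
        return None  # "a" may already be consumed; the decorator rewinds

    @contextual
    def pair_ac(self) -> Optional[Node]:
        if self.expect("a") and self.expect("c"):
            return ("pair", "a", "c")
        return None

    def item(self) -> Optional[Node]:
        # Try alternatives in order; each failure leaves pos untouched.
        return self.pair_ab() or self.pair_ac()

    def statement(self) -> Node:
        node = self.item()
        if node is None:
            # No alternative matched: report an irrecoverable error.
            raise SyntaxError(f"unexpected input at token {self.pos}")
        return node


parser = ToyParser(["a", "c"])
node = parser.statement()  # pair_ab fails and rewinds, pair_ac succeeds
```

Here `pair_ab` consumes `a` before discovering that `b` is missing. Because the decorator restores the position, `pair_ac` starts again from the first token and succeeds.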
Neither the lexer nor the parsers are complete or fully correct.
Most known issues are tersely indicated by # TODO: comments.
We plan to fix issues as they become relevant.