# The JIT
					
						
The [adaptive interpreter](interpreter.md) consists of a main loop that
executes the bytecode instructions generated by the
[bytecode compiler](compiler.md) and their
[specializations](interpreter.md#Specialization). Runtime optimization in
this interpreter can only be done for one instruction at a time. The JIT
is based on a mechanism to replace an entire sequence of bytecode instructions,
and this enables optimizations that span multiple instructions.
					
						
Historically, the adaptive interpreter was referred to as `tier 1` and
the JIT as `tier 2`. You will see remnants of this in the code.
					
						
## The Optimizer and Executors
					
						
The program begins running on the adaptive interpreter, until a `JUMP_BACKWARD`
instruction determines that it is "hot" because the counter in its
[inline cache](interpreter.md#inline-cache-entries) indicates that it
executed more than some threshold number of times (see
[`backoff_counter_triggers`](../Include/internal/pycore_backoff.h)).
It then calls the function `_PyOptimizer_Optimize()` in
[`Python/optimizer.c`](../Python/optimizer.c), passing it the current
[frame](frames.md) and instruction pointer. `_PyOptimizer_Optimize()`
constructs an object of type
[`_PyExecutorObject`](../Include/internal/pycore_optimizer.h) which implements
an optimized version of the instruction trace beginning at this jump.
					
						
The optimizer determines where the trace ends, and the executor is set up
to either return to the adaptive interpreter and resume execution, or
transfer control to another executor (see `_PyExitData` in
[`Include/internal/pycore_optimizer.h`](../Include/internal/pycore_optimizer.h)).
					
						
The executor is stored on the [`code object`](code_objects.md) of the frame,
in the `co_executors` field which is an array of executors. The start
instruction of the trace (the `JUMP_BACKWARD`) is replaced by an
`ENTER_EXECUTOR` instruction whose `oparg` is equal to the index of the
executor in `co_executors`.
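
The hand-off described in this section can be modelled in a few lines of
Python. This is only an illustrative sketch, not CPython's implementation:
`HOT_THRESHOLD`, `CodeObject`, `Executor` and `on_jump_backward` are
hypothetical names, the counter is a plain integer rather than a backoff
counter, and the "executor" merely records where its trace starts.

```python
from dataclasses import dataclass, field

HOT_THRESHOLD = 3  # hypothetical value; CPython uses a backoff counter


@dataclass
class Executor:
    start: int  # index of the instruction the trace begins at


@dataclass
class CodeObject:
    instructions: list                                 # [opname, oparg] pairs
    co_executors: list = field(default_factory=list)   # installed executors
    counters: dict = field(default_factory=dict)       # stand-in for inline caches


def on_jump_backward(code, index):
    """Count one execution of the jump at `index`.

    Returns the newly installed executor once the jump becomes hot,
    otherwise None.
    """
    code.counters[index] = code.counters.get(index, 0) + 1
    if code.counters[index] < HOT_THRESHOLD:
        return None
    executor = Executor(start=index)
    code.co_executors.append(executor)
    # Patch the jump so later executions enter the executor directly.
    code.instructions[index] = ["ENTER_EXECUTOR", len(code.co_executors) - 1]
    return executor
```

Calling `on_jump_backward` on the same jump three times in this toy model
installs one executor and rewrites the instruction to `ENTER_EXECUTOR` with
the executor's index as its `oparg`.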
					
						
## The micro-op optimizer
					
						
The micro-op (abbreviated `uop` to approximate `μop`) optimizer is defined in
[`Python/optimizer.c`](../Python/optimizer.c) as `_PyOptimizer_Optimize`.
It translates an instruction trace into a sequence of micro-ops by replacing
each bytecode by an equivalent sequence of micro-ops (see
`_PyOpcode_macro_expansion` in
[pycore_opcode_metadata.h](../Include/internal/pycore_opcode_metadata.h)
which is generated from [`Python/bytecodes.c`](../Python/bytecodes.c)).
The micro-op sequence is then optimized by
`_Py_uop_analyze_and_optimize` in
[`Python/optimizer_analysis.c`](../Python/optimizer_analysis.c)
and an instance of `_PyUOpExecutor_Type` is created to contain it.
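
As a rough illustration of the translation step only, the sketch below expands
each bytecode of a trace through a lookup table. The table contents, the uop
names and the `translate` helper are assumptions made for this example; the
real table is the generated `_PyOpcode_macro_expansion`, and the subsequent
optimization pass is not modelled here.

```python
# Hypothetical expansion table: bytecode name -> sequence of uop names.
MACRO_EXPANSION = {
    "LOAD_FAST": ["_LOAD_FAST"],
    "BINARY_OP_ADD_INT": ["_GUARD_BOTH_INT", "_BINARY_OP_ADD_INT"],
    "STORE_FAST": ["_STORE_FAST"],
}


def translate(trace):
    """Replace every (opname, oparg) pair by its equivalent uops."""
    uops = []
    for opname, oparg in trace:
        for uop in MACRO_EXPANSION[opname]:
            uops.append((uop, oparg))
    return uops


trace = [
    ("LOAD_FAST", 0),
    ("LOAD_FAST", 1),
    ("BINARY_OP_ADD_INT", 0),
    ("STORE_FAST", 2),
]
uops = translate(trace)
```

In this toy table a single specialized bytecode expands into a guard followed
by the operation itself, which is the kind of structure a later pass over the
uop sequence can work with.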
					
						
## The JIT interpreter
					
						
After a `JUMP_BACKWARD` instruction invokes the uop optimizer to create a uop
executor, it transfers control to this executor via the `GOTO_TIER_TWO` macro.
					
						
CPython implements two executors. Here we describe the JIT interpreter,
which is the simpler of them and is therefore useful for debugging and analyzing
the uop generation and optimization stages. To run it, we configure the
JIT to run on its interpreter (i.e., Python is configured with
[`--enable-experimental-jit=interpreter`](https://docs.python.org/dev/using/configure.html#cmdoption-enable-experimental-jit)).
					
						
When invoked, the executor jumps to the `tier2_dispatch:` label in
[`Python/ceval.c`](../Python/ceval.c), where there is a loop that
executes the micro-ops. The body of this loop is a switch statement over
the uop IDs, resembling the one used in the adaptive interpreter.
					
						
The switch implementing the uops is in [`Python/executor_cases.c.h`](../Python/executor_cases.c.h),
which is generated by the build script
[`Tools/cases_generator/tier2_generator.py`](../Tools/cases_generator/tier2_generator.py)
from the bytecode definitions in
[`Python/bytecodes.c`](../Python/bytecodes.c).
					
						
When an `_EXIT_TRACE` or `_DEOPT` uop is reached, the uop interpreter exits
and execution returns to the adaptive interpreter.
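
The shape of such a dispatch loop can be sketched in Python as follows. This
is a simplified model under stated assumptions, not the generated C code:
`run_uops` and the `_BINARY_ADD` uop are hypothetical, the `if`/`elif` chain
stands in for the C switch, and the operand of the exit uop is treated as the
position at which the adaptive interpreter resumes.

```python
def run_uops(uops, stack, local_vars):
    """Run uops until an exit uop is reached; return the resume target."""
    pc = 0
    while True:
        name, operand = uops[pc]
        pc += 1
        if name == "_LOAD_FAST":
            stack.append(local_vars[operand])
        elif name == "_BINARY_ADD":  # hypothetical uop for this sketch
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif name == "_STORE_FAST":
            local_vars[operand] = stack.pop()
        elif name in ("_EXIT_TRACE", "_DEOPT"):
            # Leave the uop loop and hand control back to the caller.
            return operand
        else:
            raise ValueError(f"unknown uop: {name}")


local_vars = [2, 3, None]
stack = []
resume_at = run_uops(
    [
        ("_LOAD_FAST", 0),
        ("_LOAD_FAST", 1),
        ("_BINARY_ADD", 0),
        ("_STORE_FAST", 2),
        ("_EXIT_TRACE", 7),
    ],
    stack,
    local_vars,
)
```

After the call, `local_vars[2]` holds the sum of the first two locals and
`resume_at` is the operand that was attached to `_EXIT_TRACE`.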
					
						
## Invalidating Executors
					
						
In addition to being stored on the code object, each executor is also
inserted into a list of all executors, which is stored in the interpreter
state's `executor_list_head` field. This list is used when it is necessary
to invalidate executors because values they used in their construction may
have changed.
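
A minimal sketch of this bookkeeping, assuming each executor simply records
the objects it depended on when it was built, might look like the following.
The class and method names are hypothetical, and a Python list stands in for
the linked list rooted at `executor_list_head`; how CPython actually tracks
dependencies is not modelled.

```python
class Executor:
    def __init__(self, name, dependencies):
        self.name = name
        self.dependencies = set(dependencies)  # values assumed at build time
        self.valid = True


class InterpreterState:
    def __init__(self):
        self.executors = []  # stand-in for the list at executor_list_head

    def register(self, executor):
        self.executors.append(executor)

    def invalidate_dependency(self, obj):
        """Invalidate and unlink every executor that depends on `obj`."""
        kept, invalidated = [], []
        for executor in self.executors:
            if obj in executor.dependencies:
                executor.valid = False
                invalidated.append(executor)
            else:
                kept.append(executor)
        self.executors = kept
        return invalidated


interp = InterpreterState()
uses_len = Executor("loop_a", {"builtins.len"})
uses_abs = Executor("loop_b", {"builtins.abs"})
interp.register(uses_len)
interp.register(uses_abs)
dropped = interp.invalidate_dependency("builtins.len")
```

Only the executor that depended on the changed value is invalidated; the other
one stays registered and remains usable.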
					
						
## The JIT
					
						
When the full JIT is enabled (Python was configured with
[`--enable-experimental-jit`](https://docs.python.org/dev/using/configure.html#cmdoption-enable-experimental-jit)),
the uop executor's `jit_code` field is populated with a pointer to a compiled
C function that implements the executor logic. This function's signature is
defined by `jit_func` in [`pycore_jit.h`](../Include/internal/pycore_jit.h).
When the executor is invoked by `ENTER_EXECUTOR`, instead of jumping to
the uop interpreter at `tier2_dispatch`, the executor runs the function
that `jit_code` points to. This function returns the instruction pointer
of the next Tier 1 instruction that needs to execute.
					
						
The generation of the jitted functions uses the copy-and-patch technique,
which is described in
[Haoran Xu's article](https://sillycross.github.io/2023/05/12/2023-05-12/).
At its core are statically generated `stencils` for the implementation
of the micro-ops, which are completed with runtime information while
the jitted code is constructed for an executor by
[`_PyJIT_Compile`](../Python/jit.c).
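
The idea of copying a stencil and patching its holes can be illustrated with a
toy model. The byte values, the `Stencil` type and `compile_trace` are invented
for this sketch and do not correspond to real machine code or to the layout of
the generated stencils; each stencil here is a short template with 4-byte
little-endian holes that are filled with the uop's operand.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Stencil:
    body: bytes   # precompiled template bytes (made up for this sketch)
    holes: tuple  # byte offsets of 4-byte little-endian holes in `body`


# Hypothetical stencils: one opcode byte followed by a 4-byte hole.
STENCILS = {
    "_LOAD_FAST": Stencil(body=b"\x01\x00\x00\x00\x00", holes=(1,)),
    "_EXIT_TRACE": Stencil(body=b"\xff\x00\x00\x00\x00", holes=(1,)),
}


def compile_trace(uops):
    """Concatenate the stencil of each uop and patch in its operand."""
    code = bytearray()
    for name, operand in uops:
        stencil = STENCILS[name]
        start = len(code)
        code += stencil.body                  # copy
        for offset in stencil.holes:          # patch
            struct.pack_into("<I", code, start + offset, operand)
    return bytes(code)


jitted = compile_trace([("_LOAD_FAST", 2), ("_EXIT_TRACE", 7)])
```

The expensive work (producing the template bytes) happens ahead of time, so
building code for a trace reduces to copying bytes and writing a few values
into known offsets.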
					
						
The stencils are generated at build time under the Makefile target `regen-jit`
by the scripts in [`/Tools/jit`](../Tools/jit). These scripts read
[`Python/executor_cases.c.h`](../Python/executor_cases.c.h) (which is
generated from [`Python/bytecodes.c`](../Python/bytecodes.c)). For
each opcode, they construct a `.c` file that contains a function for
implementing this opcode, with some runtime information injected.
This is done by replacing `CASE` by the bytecode definition in the
template file [`Tools/jit/template.c`](../Tools/jit/template.c).
					
						
Each of the `.c` files is compiled by LLVM to produce an object file
that contains a function that executes the opcode. These compiled
functions are used to generate the file
[`jit_stencils.h`](../jit_stencils.h), which contains the functions
that the JIT can use to emit code for each of the bytecodes.
					
						
For Python maintainers, this means that changes to the bytecodes and
their implementations do not require changes related to the stencils,
because everything is automatically generated from
[`Python/bytecodes.c`](../Python/bytecodes.c) at build time.
					
						
See Also:
					
						
* [Copy-and-Patch Compilation: A fast compilation algorithm for high-level languages and bytecode](https://arxiv.org/abs/2011.13127)

* [PyCon 2024: Building a JIT compiler for CPython](https://www.youtube.com/watch?v=kMO3Ju0QCDo)