| 
									
										
										
										
											2024-02-20 09:39:55 +00:00
										 |  |  | # The tier 2 execution engine
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ## General idea
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | When execution in tier 1 becomes "hot", that is the counter for that point in | 
					
						
							|  |  |  | the code reaches some threshold, we create an executor and execute that | 
					
						
							|  |  |  | instead of the tier 1 bytecode. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Since each executor must exit, we also track the "hotness" of those | 
					
						
							|  |  |  | exits and attach new executors to those exits. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | As the program executes, and the hot parts of the program get optimized, | 
					
						
							|  |  |  | a graph of executors forms. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ## Superblocks and Executors
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Once a point in the code has become hot enough, we want to optimize it. | 
					
						
							|  |  |  | Starting from that point we project the likely path of execution, | 
					
						
							|  |  |  | using information gathered by tier 1 to guide that projection to | 
					
						
							|  |  |  | form a "superblock", a mostly linear sequence of micro-ops. | 
					
						
							|  |  |  | Although mostly linear, it may include a single loop. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | We then optimize this superblock to form an optimized superblock, | 
					
						
							|  |  |  | which is equivalent but more efficient. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | A superblock is a representation of the code we want to execute, | 
					
						
							|  |  |  | but it is not in executable form. | 
					
						
							|  |  |  | The executable form is known as an executor. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Executors are semantically equivalent to the superblock they are | 
					
						
							|  |  |  | created from, but are in a form that can be efficiently executable. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | There are two execution engines for executors, and two types of executors: | 
					
						
							|  |  |  | * The hardware which runs machine code executors created by the JIT compiler. | 
					
						
							|  |  |  | * The tier 2 interpreter runs bytecode executors. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | It would be very wasteful to support both a tier 2 interpreter and | 
					
						
							|  |  |  | JIT compiler in the same process. | 
					
						
							|  |  |  | For now, we will make the choice of engine a configuration option, | 
					
						
							|  |  |  | but we could make it a command line option in the future if that would prove useful. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### Tier 2 Interpreter
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | For platforms without a JIT and for testing, we need an interpreter | 
					
						
							|  |  |  | for executors. It is similar in design to the tier 1 interpreter, but has a | 
					
						
							|  |  |  | different instruction set, and does not adapt. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### JIT compiler
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The JIT compiler converts superblocks into machine code executors. | 
					
						
							|  |  |  | These have identical behavior to interpreted executors, except that | 
					
						
							|  |  |  | they consume more memory for the generated machine code and are a lot faster. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-03-14 06:53:32 +08:00
										 |  |  | ## Transferring control
 | 
					
						
							| 
									
										
										
										
											2024-02-20 09:39:55 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | There are three types of control transfer that we need to consider: | 
					
						
							|  |  |  | * Tier 1 to tier 2 | 
					
						
							|  |  |  | * Tier 2 to tier 1 | 
					
						
							|  |  |  | * One executor to another within tier 2 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Since we expect the graph of executors to span most of the hot | 
					
						
							|  |  |  | part of the program, transfers from one executor to another should | 
					
						
							|  |  |  | be the most common. | 
					
						
							|  |  |  | Therefore, we want to make those transfers fast. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### Tier 2 to tier 2
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | #### Cold exits
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | All side exits start cold and most stay cold, but a few become | 
					
						
							|  |  |  | hot. We want to keep the memory consumption small for the many | 
					
						
							|  |  |  | cold exits, but those that become hot need to be fast. | 
					
						
							|  |  |  | However we cannot know in advance, which will be which. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | So that tier 2 to tier 2 transfers are fast for hot exits, | 
					
						
							|  |  |  | exits must be implemented as executors. In order to patch | 
					
						
							|  |  |  | executor exits when they get hot, a pointer to the current | 
					
						
							|  |  |  | executor must be passed to the exit executor. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | #### Handling reference counts
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | There must be an implicit reference to the currently executing | 
					
						
							|  |  |  | executor, otherwise it might be freed. | 
					
						
							|  |  |  | Consequently, we must increment the reference count of an | 
					
						
							|  |  |  | executor just before executing it, and decrement it just after | 
					
						
							|  |  |  | executing it. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | We want to minimize the amount of data that is passed from | 
					
						
							|  |  |  | one executor to the next. In the JIT, this reduces the number | 
					
						
							|  |  |  | of arguments in the tailcall, freeing up registers for other uses. | 
					
						
							|  |  |  | It is less important in the interpreter, but following the same | 
					
						
							|  |  |  | design as the JIT simplifies debugging and is good for performance. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Provided that we incref the new executor before executing it, we | 
					
						
							|  |  |  | can jump directly to the code of the executor, without needing | 
					
						
							|  |  |  | to pass a reference to that executor object. | 
					
						
							|  |  |  | However, we do need a reference to the previous executor, | 
					
						
							|  |  |  | so that it can be decref'd and for handling of cold exits. | 
					
						
							|  |  |  | To avoid messing up the JIT's register allocation, we pass a | 
					
						
							|  |  |  | reference to the previous executor in the thread state's | 
					
						
							|  |  |  | `previous_executor` field. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | #### The interpreter
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The tier 2 interpreter has a variable `current_executor` which | 
					
						
							| 
									
										
										
										
											2024-03-14 06:53:32 +08:00
										 |  |  | points to the currently live executor. When transferring from executor | 
					
						
							| 
									
										
										
										
											2024-02-20 09:39:55 +00:00
										 |  |  | `A` to executor `B` we do the following: | 
					
						
							|  |  |  | (Initially `current_executor` points to `A`, and the refcount of | 
					
						
							|  |  |  | `A` is elevated by one) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 1. Set the instruction pointer to start at the beginning of `B` | 
					
						
							|  |  |  | 2. Increment the reference count of `B` | 
					
						
							|  |  |  | 3. Start executing `B` | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | We also make the first instruction in `B` do the following: | 
					
						
							|  |  |  | 1. Set `current_executor` to point to `B` | 
					
						
							|  |  |  | 2. Decrement the reference count of `A` (`A` is referenced by `tstate->previous_executor`) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The net effect of the above is to safely decrement the refcount of `A`, | 
					
						
							|  |  |  | increment the refcount of `B` and set `current_executor` to point to `B`. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | #### In the JIT
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-03-14 06:53:32 +08:00
										 |  |  | Transferring control from one executor to another is done via tailcalls. | 
					
						
							| 
									
										
										
										
											2024-02-20 09:39:55 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | The compiled executor should do the same, except that there is no local | 
					
						
							|  |  |  | variable `current_executor`. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### Tier 1 to tier 2
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Since the executor doesn't know if the previous code was tier 1 or tier 2, | 
					
						
							|  |  |  | we need to make a transfer from tier 1 to tier 2 look like a tier 2 to tier 2 | 
					
						
							|  |  |  | transfer to the executor. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | We can then perform a tier 1 to tier 2 transfer by setting `current_executor` | 
					
						
							|  |  |  | to `None`, and then performing a tier 2 to tier 2 transfer as above. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### Tier 2 to tier 1
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Each micro-op that might exit to tier 1 contains a `target` value, | 
					
						
							|  |  |  | which is the offset of the tier 1 instruction to exit to in the | 
					
						
							|  |  |  | current code object. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ## Counters
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | TO DO. | 
					
						
							|  |  |  | The implementation will change soon, so there is no point in | 
					
						
							|  |  |  | documenting it until then. | 
					
						
							|  |  |  | 
 |