cpython/Include/internal/pycore_uops.h
Guido van Rossum 8deb8bc2e5
gh-112287: Speed up Tier 2 (uop) interpreter a little (#112286)
This makes the Tier 2 interpreter a little faster:
by about 3% according to my measurements,
though I hesitate to claim an exact number.

This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.

The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator knows when these are used.

For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instruction's code block, so it must be in a variable.)

For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
2023-11-20 11:25:32 -08:00
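
The approach described in the commit message can be sketched with a toy dispatch loop. This is a minimal illustration, not the generated CPython code: the `ToyUOp` type, the `TOY_*` opcodes, and `toy_run` are hypothetical, and the two macro bodies are an assumption about how `CURRENT_OPARG()` and `CURRENT_OPERAND()` might read the instruction that was just fetched.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for _PyUOpInstruction, with the same four fields. */
typedef struct {
    uint16_t opcode;
    uint16_t oparg;
    uint32_t target;
    uint64_t operand;
} ToyUOp;

/* Hypothetical opcodes, for illustration only. */
enum { TOY_EXIT, TOY_ADD_OPARG, TOY_ADD_OPERAND, TOY_NEGATE };

/* Assumed macro shapes: read fields of the instruction just fetched. */
#define CURRENT_OPARG()   (next_uop[-1].oparg)
#define CURRENT_OPERAND() (next_uop[-1].operand)

int64_t toy_run(const ToyUOp *trace, int64_t acc)
{
    const ToyUOp *next_uop = trace;
    int oparg;
    for (;;) {
        uint16_t opcode = next_uop->opcode;
        next_uop++;
        switch (opcode) {
        case TOY_ADD_OPARG:
            /* This case uses oparg, so the load is emitted here. */
            oparg = CURRENT_OPARG();
            acc += oparg;
            break;
        case TOY_ADD_OPERAND:
            /* operand is read directly; there is no operand variable. */
            acc += (int64_t)CURRENT_OPERAND();
            break;
        case TOY_NEGATE:
            /* Uses neither field, so nothing is loaded. */
            acc = -acc;
            break;
        case TOY_EXIT:
            return acc;
        default:
            assert(0 && "unknown toy opcode");
            return acc;
        }
    }
}
```

Running `toy_run` over a trace of `TOY_ADD_OPARG`, `TOY_ADD_OPERAND`, `TOY_NEGATE`, `TOY_EXIT` touches `oparg` and `operand` only in the cases that need them, which is the saving the commit is after.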


#ifndef Py_INTERNAL_UOPS_H
#define Py_INTERNAL_UOPS_H
#ifdef __cplusplus
extern "C" {
#endif

#ifndef Py_BUILD_CORE
# error "this header requires Py_BUILD_CORE define"
#endif

#include "pycore_frame.h" // _PyInterpreterFrame

#define _Py_UOP_MAX_TRACE_LENGTH 512

typedef struct {
    uint16_t opcode;
    uint16_t oparg;
    uint32_t target;
    uint64_t operand; // A cache entry
} _PyUOpInstruction;

typedef struct {
    _PyExecutorObject base;
    _PyUOpInstruction trace[1];
} _PyUOpExecutorObject;

_PyInterpreterFrame *_PyUopExecute(
    _PyExecutorObject *executor,
    _PyInterpreterFrame *frame,
    PyObject **stack_pointer);

#ifdef __cplusplus
}
#endif
#endif /* !Py_INTERNAL_UOPS_H */
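
To get a feel for what doubling the trace limit to 512 costs in memory, the sketch below mirrors the instruction layout from the header in a standalone type and computes the size of a full-length trace. `ToyUOp`, `TOY_MAX_TRACE_LENGTH`, and `toy_max_trace_bytes` are hypothetical names used only for this illustration. It assumes the compiler adds no padding to the struct, which holds on common ABIs because the fields are already naturally aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order and widths as _PyUOpInstruction in the header. */
typedef struct {
    uint16_t opcode;   /* 2 bytes */
    uint16_t oparg;    /* 2 bytes */
    uint32_t target;   /* 4 bytes */
    uint64_t operand;  /* 8 bytes */
} ToyUOp;

/* Mirrors _Py_UOP_MAX_TRACE_LENGTH after the commit. */
#define TOY_MAX_TRACE_LENGTH 512

/* Bytes occupied by the instructions of a maximum-length trace. */
size_t toy_max_trace_bytes(void)
{
    return (size_t)TOY_MAX_TRACE_LENGTH * sizeof(ToyUOp);
}
```

With 16-byte instructions, a maximum-length trace takes 512 × 16 = 8192 bytes of instruction storage, up from 4096 bytes at the previous limit of 256.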