This adds a "macro" to the optimizer DSL called "REPLACE_OPCODE_IF_EVALUATES_PURE", which allows automatically constant evaluating a bytecode body if certain inputs have no side effects upon evaluations (such as ints, strings, and floats).
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
The free threading build uses QSBR to delay the freeing of dictionary
keys and list arrays when the objects are accessed by multiple threads
in order to allow concurrent reads to proceed with holding the object
lock. The requests are processed in batches to reduce execution
overhead, but for large memory blocks this can lead to excess memory
usage.
Take into account the size of the memory block when deciding when to
process QSBR requests.
Also track the amount of memory being held by QSBR for mimalloc pages. Advance the write sequence if this memory exceeds a limit. Advancing the sequence will allow it to be freed more quickly.
Process the held QSBR items from the "eval breaker", rather than from `_PyMem_FreeDelayed()`. This gives a higher chance that the global read sequence has advanced enough so that items can be freed.
Co-authored-by: Sam Gross <colesbury@gmail.com>
Most importantly, this resolves the issues with functions and types defined in __main__.
It also expands the number of supported objects and simplifies the implementation.
As noted in the new tests, there are a few situations we must carefully accommodate
for functions that get pickled during interp.call(). We do so by running the script
from the main interpreter's __main__ module in a hidden module in the other
interpreter. That hidden module is used as the function __globals__.
This PR adds a PyJitRef API to the JIT's optimizer that mimics the _PyStackRef API. This allows it to track references and their stack lifetimes properly. Thus opening up the doorway to refcount elimination in the JIT.
For several builtin functions, we now fall back to __main__.__dict__ for the globals
when there is no current frame and _PyInterpreterState_IsRunningMain() returns
true. This allows those functions to be run with Interpreter.call().
The affected builtins:
* exec()
* eval()
* globals()
* locals()
* vars()
* dir()
We take a similar approach with "stateless" functions, which don't use any
global variables.
In this refactor we:
* move some code around
* make a couple of typedefs opaque
* decouple errors from session state
* improve tracebacks for propagated exceptions
This change helps simplify several upcoming changes.
PEP-734 has been accepted (for 3.14).
(FTR, I'm opposed to putting this under the concurrent package, but
doing so is the SC condition under which the module can land in 3.14.)
We were incorrectly handling a few opcodes that leave their operands on the stack. Treat all of these conservatively; assume that they always leave operands on the stack.
On Windows, the `_PyOS_SigintEvent()` event handle is used to interrupt
the main thread when Ctrl-C is pressed. Previously, we also waited on
the event from other threads, but ignored the result. However, this can
race with interpreter shutdown because the main thread closes the handle
in `_PySignal_Fini` and threads may still be running and using mutexes
during interpreter shtudown.
Only use `_PyOS_SigintEvent()` in the main thread in parking_lot.c, like
we do in other places in the CPython codebase.
Apply Intel Control-flow Technology for x86-64 on asm_trampoline.S.
Required for mitigation against return-oriented programming (ROP)
and Call or Jump Oriented Programming (COP/JOP) attacks.
Manual application is required for the assembly files.
See also: https://sourceware.org/annobin/annobin.html/Test-cf-protection.html
Replace most PyUnicodeWriter_WriteUTF8() calls with
PyUnicodeWriter_WriteASCII().
Unrelated change to please the linter: remove an unused
import in test_ctypes.
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Add _Py_PACK_VERSION for CPython's own definitions
Py_PACK_VERSION was added to limited API in 3.14, so if
Py_LIMITED_API is lower, the macro can't be used.
Add a private version that can be used in CPython headers
for checks like `Py_LIMITED_API+0 >= _Py_PACK_VERSION(3, 14)`.
In the free-threaded build, avoid data races caused by updating type
slots or type flags after the type was initially created. For those
(typically rare) cases, use the stop-the-world mechanism. Remove the
use of atomics when reading or writing type flags.