Implement a statistical sampling profiler that can profile external
Python processes by PID. Uses the _remote_debugging module and converts
the results to pstats-compatible format for analysis.
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
Weakrefs to unreachable garbage that are created during running of
finalizers need to be cleared. This avoids exposing objects that
have `tp_clear` called on them to Python-level code.
Move PYOS_LOG2_STACK_MARGIN, PYOS_STACK_MARGIN,
PYOS_STACK_MARGIN_BYTES and PYOS_STACK_MARGIN_SHIFT macros to
pycore_pythonrun.h internal header. Add underscore (_) prefix to the
names to make them private. Rename _PYOS to _PyOS.
After Python finalization gets to the point where no other thread
can attach thread state, attempting to acquire a Python lock must hang.
Raise PythonFinalizationError instead of hanging.
This adds a "macro" to the optimizer DSL called "REPLACE_OPCODE_IF_EVALUATES_PURE", which allows automatically constant evaluating a bytecode body if certain inputs have no side effects upon evaluations (such as ints, strings, and floats).
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
The free threading build uses QSBR to delay the freeing of dictionary
keys and list arrays when the objects are accessed by multiple threads
in order to allow concurrent reads to proceed with holding the object
lock. The requests are processed in batches to reduce execution
overhead, but for large memory blocks this can lead to excess memory
usage.
Take into account the size of the memory block when deciding when to
process QSBR requests.
Also track the amount of memory being held by QSBR for mimalloc pages. Advance the write sequence if this memory exceeds a limit. Advancing the sequence will allow it to be freed more quickly.
Process the held QSBR items from the "eval breaker", rather than from `_PyMem_FreeDelayed()`. This gives a higher chance that the global read sequence has advanced enough so that items can be freed.
Co-authored-by: Sam Gross <colesbury@gmail.com>
Most importantly, this resolves the issues with functions and types defined in __main__.
It also expands the number of supported objects and simplifies the implementation.
As noted in the new tests, there are a few situations we must carefully accommodate
for functions that get pickled during interp.call(). We do so by running the script
from the main interpreter's __main__ module in a hidden module in the other
interpreter. That hidden module is used as the function __globals__.
This PR adds a PyJitRef API to the JIT's optimizer that mimics the _PyStackRef API. This allows it to track references and their stack lifetimes properly. Thus opening up the doorway to refcount elimination in the JIT.
For several builtin functions, we now fall back to __main__.__dict__ for the globals
when there is no current frame and _PyInterpreterState_IsRunningMain() returns
true. This allows those functions to be run with Interpreter.call().
The affected builtins:
* exec()
* eval()
* globals()
* locals()
* vars()
* dir()
We take a similar approach with "stateless" functions, which don't use any
global variables.
In this refactor we:
* move some code around
* make a couple of typedefs opaque
* decouple errors from session state
* improve tracebacks for propagated exceptions
This change helps simplify several upcoming changes.
PEP-734 has been accepted (for 3.14).
(FTR, I'm opposed to putting this under the concurrent package, but
doing so is the SC condition under which the module can land in 3.14.)