Deprecate PEP 456 [1] support for providing an external definition
of the string hashing scheme. Removal is scheduled for Python 3.19.
Previously, embedders could define the ``Py_HASH_ALGORITHM`` macro to be
``Py_HASH_EXTERNAL`` [2] to indicate that the hashing scheme was provided
externally, but this feature was undocumented, untested, and most likely
unused.
[1]: https://peps.python.org/pep-0456/
[2]: https://peps.python.org/pep-0456/#hash-function-selection
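For context, the deprecated hook looked roughly like this (a sketch based
on PEP 456; ``my_string_hash`` and the ``{64, 128}`` parameters are
illustrative):

    #include "Python.h"

    /* CPython itself must be built with Py_HASH_ALGORITHM defined to
       Py_HASH_EXTERNAL; pyhash.c then declares
       `extern PyHash_FuncDef PyHash_Func;` and expects the embedder
       to supply the definition. */
    static Py_hash_t
    my_string_hash(const void *src, Py_ssize_t len)
    {
        /* ... custom hashing scheme ... */
        return 0;
    }

    /* Fields per PyHash_FuncDef: function, name, hash bits, seed bits. */
    PyHash_FuncDef PyHash_Func = {my_string_hash, "custom", 64, 128};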
Align the QSBR thread state array to a 64-byte cache line boundary
and add padding at the end of _PyThreadStateImpl. Depending on heap
layout, the QSBR array could end up sharing a cache line with a
thread's tlbc_index, causing QSBR quiescent state updates to contend
with reads of tlbc_index in RESUME_CHECK. This is sensitive to
earlier allocations during interpreter init and can appear or
disappear with seemingly unrelated changes.
Either change alone is sufficient to fix the specific issue, but both
are worthwhile to avoid similar problems in the future.
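A minimal sketch of the two mitigations (field names and sizes are
illustrative; the real layout lives in _PyThreadStateImpl and the QSBR
array allocation):

    #include <stdalign.h>
    #include <stdint.h>

    #define CACHE_LINE 64

    /* (1) Give each QSBR slot its own cache line so quiescent-state
       writes cannot false-share with neighboring heap data. */
    struct qsbr_slot {
        alignas(CACHE_LINE) uint64_t seq;
    };

    /* (2) Pad the end of the thread state so a later heap allocation
       cannot land on the same cache line as its hot tail fields. */
    struct thread_state_sketch {
        /* ... existing fields ... */
        int32_t tlbc_index;        /* read on every RESUME_CHECK */
        char _padding[CACHE_LINE];
    };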
Add TYPE_FROZENDICT to the marshal module.
Add C API functions:
* PyAnyDict_Check()
* PyAnyDict_CheckExact()
* PyFrozenDict_Check()
* PyFrozenDict_CheckExact()
* PyFrozenDict_New()
Add PyFrozenDict_Type C type.
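A hedged usage sketch (PyFrozenDict_New taking a mapping and returning a
new reference is an assumed signature, by analogy with the existing
PyDict_* API):

    #include <assert.h>
    #include "Python.h"

    static PyObject *
    freeze(PyObject *mapping)
    {
        PyObject *fd = PyFrozenDict_New(mapping);  /* assumed signature */
        if (fd == NULL) {
            return NULL;
        }
        assert(PyFrozenDict_CheckExact(fd));  /* exactly a frozendict */
        assert(PyAnyDict_Check(fd));          /* dict or frozendict */
        return fd;
    }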
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Adam Johnson <me@adamj.eu>
Co-authored-by: Benedikt Johannes <benedikt.johannes.hofer@gmail.com>
Remove spurious Py_DECREF on borrowed ref in LOAD_GLOBAL specialization
_PyDict_LookupIndexAndValue() returns a borrowed reference via
_Py_dict_lookup(), but specialize_load_global_lock_held() called
Py_DECREF(value) on it when bailing out for lazy imports. Each time
the adaptive counter fired while a lazy import was still in globals,
this stole one reference from the object owned by the dict. With 8+ threads
racing through LOAD_GLOBAL during concurrent lazy import resolution,
enough triggers accumulated to drive the refcount to zero while the
dict and other threads still referenced the object, causing
use-after-free.
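The shape of the bug, heavily simplified (is_lazy_import is a hypothetical
stand-in for the bail-out condition; the real code is in
Python/specialize.c):

    static int
    check_global(PyDictObject *globals, PyObject *name, Py_hash_t hash)
    {
        PyObject *value;
        _Py_dict_lookup(globals, name, hash, &value);  /* BORROWED */
        if (value != NULL && is_lazy_import(value)) {
            Py_DECREF(value);  /* BUG: steals the dict's own reference */
            return -1;         /* fix: bail out without the Py_DECREF */
        }
        return 0;
    }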
Fix a race condition where a thread could receive a partially-initialized
module when another thread's import fails. The race occurs when:
1. Thread 1 starts importing, adds module to sys.modules
2. Thread 2 sees the module in sys.modules via the fast path
3. Thread 1's import fails, removes module from sys.modules
4. Thread 2 returns a stale module reference not in sys.modules
The fix adds verification after the "skip lock" optimization in both the
Python and C code paths to check whether the module is still in sys.modules. If the
module was removed (due to import failure), we retry the import so the
caller receives the actual exception from the import failure rather than
a stale module reference.
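A simplified sketch of the added verification on the C path (error
handling elided; `mod` is the module obtained through the lock-skipping
fast path):

    PyObject *current = PyImport_GetModule(name);  /* new ref or NULL */
    if (current != mod) {
        /* The importing thread failed and removed the module from
           sys.modules; retry so this thread raises the real import
           error instead of returning a stale module. */
        Py_XDECREF(current);
        Py_CLEAR(mod);
        mod = PyImport_Import(name);
    }
    else {
        Py_DECREF(current);
    }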
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When _ctypes is imported, it may call dlopen on the libpython shared
library, causing the dynamic linker to load a second mapping of the
library into the process address space. The remote debugging code
iterates memory regions from low addresses upward and returns the first
mapping whose filename matches libpython. After _ctypes is imported, it
finds the dlopen'd copy first, but that copy's PyRuntime section was
never initialized, so reading debug offsets from it fails.
Fix this by validating each candidate PyRuntime address before accepting
it. The validation reads the first 8 bytes and checks for the "xdebugpy"
cookie that is only present in an initialized PyRuntime. Uninitialized
duplicate mappings will fail this check and be skipped, allowing the
search to continue to the real, initialized PyRuntime.
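A sketch of the validation (proc_handle_t and read_process_memory are
hypothetical stand-ins for the remote-read machinery; the 8-byte
"xdebugpy" cookie is the one described above):

    #include <stdint.h>
    #include <string.h>

    static int
    pyruntime_looks_initialized(proc_handle_t *handle, uintptr_t addr)
    {
        char cookie[8];
        if (read_process_memory(handle, addr, sizeof(cookie), cookie) < 0) {
            return 0;  /* unreadable: skip this candidate */
        }
        return memcmp(cookie, "xdebugpy", sizeof(cookie)) == 0;
    }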
Changing the values requires forking and patching, which is intentional.
Simply rebuilding from source does not change the implementation enough
to justify changing these values; they would still be `cpython` and
compatible with existing `.pyc` files. But people who maintain forks are
better served by being able to easily override these values in a place
that can be forward-ported reliably.
When the interpreter is in a stop-the-world pause, critical sections
don't need to acquire locks since no other threads can be running.
This avoids a potential deadlock where lock fairness hands off ownership
to a thread that has already suspended for stop-the-world.
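Illustrative shape of the fast path (world_is_stopped is a hypothetical
predicate; the real change lives in the critical section implementation):

    static void
    critical_section_begin(PyThreadState *tstate, PyMutex *m)
    {
        if (world_is_stopped(tstate)) {
            /* No other thread can run, so skipping the lock is safe;
               acquiring it could deadlock if fairness already handed
               ownership to a thread suspended for the pause. */
            return;
        }
        PyMutex_Lock(m);
    }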
Fix thread-safety issues when accessing frame attributes while another
thread is executing the frame:
- Add a critical section to frame_repr() (sketched below) to prevent races
when accessing the frame's code object and line number.
- Add _Py_NO_SANITIZE_THREAD to PyUnstable_InterpreterFrame_GetLasti()
to allow intentionally racy reads of instr_ptr.
- Fix take_ownership() to not write to the original frame's f_executable.
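A rough sketch of the frame_repr() change (simplified; it holds the
frame's critical section while reading fields another thread may be
mutating):

    static PyObject *
    frame_repr(PyObject *op)
    {
        PyFrameObject *f = (PyFrameObject *)op;
        PyObject *repr;
        Py_BEGIN_CRITICAL_SECTION(f);
        int lineno = PyFrame_GetLineNumber(f);
        PyCodeObject *code = _PyFrame_GetCode(f->f_frame);  /* borrowed */
        repr = PyUnicode_FromFormat(
            "<frame at %p, file %R, line %d, code %S>",
            f, code->co_filename, lineno, code->co_name);
        Py_END_CRITICAL_SECTION();
        return repr;
    }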
Add `_Py_type_getattro_stackref`, a variant of type attribute lookup
that returns `_PyStackRef` instead of `PyObject*`. This allows returning
deferred references in the free-threaded build, reducing reference count
contention when accessing type attributes.
This significantly improves scaling of namedtuple instantiation across
multiple threads.
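A hedged sketch of a caller (the exact signature of
`_Py_type_getattro_stackref` is an assumption here, shown by analogy with
`_PyObject_GetAttrStackRef`):

    static PyObject *
    lookup_type_attr(PyTypeObject *type, PyObject *name)
    {
        _PyStackRef attr;
        if (_Py_type_getattro_stackref(type, name, &attr) < 0) {
            return NULL;  /* assumed: < 0 signals an error */
        }
        /* In the free-threaded build `attr` may be a deferred reference,
           so producing it touched no shared refcount. */
        return PyStackRef_AsPyObjectBorrow(attr);
    }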
Co-authored-by: Victor Stinner <vstinner@python.org>
The positional arguments passed to _PyStack_UnpackDict are already
kept alive by the caller, so we can avoid the extra reference count
operations by using borrowed references instead of creating new ones.
This reduces reference count contention in the free-threaded build
when calling functions with keyword arguments. In particular, this
avoids contention on the type argument to `__new__` when instantiating
namedtuples with keyword arguments.
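The change, in miniature (simplified from _PyStack_UnpackDict):

    static void
    copy_positional(PyObject **stack, PyObject *const *args,
                    Py_ssize_t nargs)
    {
        for (Py_ssize_t i = 0; i < nargs; i++) {
            /* Previously: stack[i] = Py_NewRef(args[i]); -- two refcount
               operations per argument, contended under free threading. */
            stack[i] = args[i];  /* borrowed; caller keeps args alive */
        }
    }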
The pymalloc huge page support had two problems. First, on
architectures where the default huge page size exceeds the arena
size (e.g. 32 MiB on PPC, 512 MiB on ARM64 with 64 KB base
pages), mmap with MAP_HUGETLB silently allocates a full huge page
even when the requested size is smaller. The subsequent munmap
with the original arena size then fails with EINVAL, permanently
leaking the entire huge page. Second, huge pages were always
attempted when compiled in, with no way to disable them at
runtime. On Linux, if the huge page pool is exhausted, page
faults, including copy-on-write faults after fork, deliver SIGBUS
and kill the process.
The arena allocator now queries the system huge page size from
/proc/meminfo and skips MAP_HUGETLB when the arena size is not a
multiple of it. Huge pages also now require explicit opt-in at
runtime via the PYTHON_PYMALLOC_HUGEPAGES environment variable,
which is read through PyConfig and respects -E and -I flags.
The config field pymalloc_hugepages is propagated to the runtime
allocators struct so the low-level arena allocator can check it
without calling getenv directly.
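A sketch of the resulting guard (read_default_hugepage_size is a
hypothetical helper for the /proc/meminfo parsing; flag handling
simplified):

    #include <stddef.h>
    #include <sys/mman.h>

    static void *
    arena_mmap(size_t arena_size, int pymalloc_hugepages)
    {
        int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    #ifdef MAP_HUGETLB
        size_t huge = read_default_hugepage_size();
        if (pymalloc_hugepages        /* PYTHON_PYMALLOC_HUGEPAGES opt-in */
            && huge != 0
            && arena_size % huge == 0)  /* else munmap fails, leaking */
        {
            flags |= MAP_HUGETLB;
        }
    #endif
        void *p = mmap(NULL, arena_size, PROT_READ | PROT_WRITE,
                       flags, -1, 0);
        return p == MAP_FAILED ? NULL : p;
    }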
Add a FRAME_SUSPENDED_YIELD_FROM_LOCKED state that acts as a brief
lock, preventing other threads from transitioning the frame state
while gen_getyieldfrom reads the yield-from object off the stack.
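Illustrative shape of the protocol (helper names are hypothetical):

    /* Atomically claim the frame state; fails if another thread got
       there first or the generator is in a different state. */
    if (transition_frame_state(gen, FRAME_SUSPENDED_YIELD_FROM,
                               FRAME_SUSPENDED_YIELD_FROM_LOCKED)) {
        yf = Py_XNewRef(frame_stack_top(gen));  /* yield-from object */
        set_frame_state(gen, FRAME_SUSPENDED_YIELD_FROM);  /* unlock */
    }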
mmap() returns MAP_FAILED ((void *)-1) on error, not NULL, so the
existing NULL check can never detect a failed mapping and jitdump
initialization proceeds even when the memory mapping fails.
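The corrected check, in sketch form (mapping flags illustrative):

    void *mapped = mmap(NULL, size, PROT_READ | PROT_EXEC,
                        MAP_PRIVATE, fd, 0);
    if (mapped == MAP_FAILED) {  /* (void *)-1, not NULL */
        return -1;  /* abort jitdump initialization */
    }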
Now that the specializing interpreter works with free threading,
replace ENABLE_SPECIALIZATION_FT with ENABLE_SPECIALIZATION and
replace requires_specialization_ft with requires_specialization.
Also limit the uniquely referenced check to FOR_ITER_RANGE. It's not
necessary for FOR_ITER_GEN and would cause test_for_iter_gen to fail.
We didn't catch this because of a combination of two things:
1) falling through to the if-statement below happens to work, and
2) we only specialized FOR_ITER_GEN for uniquely referenced generators,
so the non-thread-safe behavior was never triggered.