The positional arguments passed to _PyStack_UnpackDict are already
kept alive by the caller, so we can avoid the extra reference count
operations by using borrowed references instead of creating new ones.
This reduces reference count contention in the free-threaded build
when calling functions with keyword arguments. In particular, this
avoids contention on the type argument to `__new__` when instantiating
namedtuples with keyword arguments.
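A minimal sketch of the change, with simplified names rather than the real _PyStack_UnpackDict internals:

```c
/* Sketch only: a simplified stand-in for the positional-argument copy
 * inside _PyStack_UnpackDict. */
#include <Python.h>

static void
copy_positional_args(PyObject **newstack, PyObject *const *args,
                     Py_ssize_t nargs)
{
    for (Py_ssize_t i = 0; i < nargs; i++) {
        /* Before: newstack[i] = Py_NewRef(args[i]);
         * After: store a borrowed reference. The caller keeps args[i]
         * alive for the duration of the call, so no incref/decref is
         * needed and shared arguments see no refcount contention. */
        newstack[i] = args[i];
    }
}
```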
The pymalloc huge page support had two problems. First, on
architectures where the default huge page size exceeds the arena
size (e.g. 32 MiB on PPC, 512 MiB on ARM64 with 64 KB base
pages), mmap with MAP_HUGETLB silently allocates a full huge page
even when the requested size is smaller. The subsequent munmap
with the original arena size then fails with EINVAL, permanently
leaking the entire huge page. Second, huge pages were always
attempted when compiled in, with no way to disable them at
runtime. On Linux, if the huge page pool is exhausted, page faults,
including copy-on-write faults after fork, deliver SIGBUS and kill
the process.
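The first problem can be demonstrated outside of CPython with a small program like this sketch, assuming the system huge page size is larger than the request:

```c
/* Demonstration of the leak described above (not CPython code). Assumes
 * the system huge page size is larger than the 1 MiB request. */
#include <stdio.h>
#include <sys/mman.h>

#define ARENA_SIZE (1024 * 1024)    /* 1 MiB request, like a pymalloc arena */

int main(void)
{
    void *arena = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (arena == MAP_FAILED) {
        perror("mmap");             /* no huge pages available at all */
        return 1;
    }
    /* The kernel backed the mapping with one full huge page. Unmapping with
     * the original, smaller length fails with EINVAL because hugetlb
     * mappings must be unmapped in huge-page-sized multiples, so the whole
     * huge page stays mapped for the life of the process. */
    if (munmap(arena, ARENA_SIZE) != 0) {
        perror("munmap");
    }
    return 0;
}
```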
The arena allocator now queries the system huge page size from
/proc/meminfo and skips MAP_HUGETLB when the arena size is not a
multiple of it. Huge pages also now require explicit opt-in at
runtime via the PYTHON_PYMALLOC_HUGEPAGES environment variable,
which is read through PyConfig and respects -E and -I flags.
The config field pymalloc_hugepages is propagated to the runtime
allocators struct so the low-level arena allocator can check it
without calling getenv directly.
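A sketch of the resulting allocation policy; the names are illustrative, not the actual pymalloc code, and `hugepages_enabled` stands in for the PyConfig field driven by PYTHON_PYMALLOC_HUGEPAGES:

```c
#include <stdio.h>
#include <sys/mman.h>

static size_t
system_huge_page_size(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) {
        return 0;
    }
    char line[128];
    size_t kb = 0;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "Hugepagesize: %zu kB", &kb) == 1) {
            break;
        }
    }
    fclose(f);
    return kb * 1024;
}

static void *
alloc_arena(size_t arena_size, int hugepages_enabled)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    size_t hp = system_huge_page_size();
    /* Only request huge pages when explicitly enabled and when the arena
     * size is a multiple of the huge page size, so that the later
     * munmap(arena, arena_size) cannot fail with EINVAL. */
    if (hugepages_enabled && hp != 0 && arena_size % hp == 0) {
        flags |= MAP_HUGETLB;
    }
    void *p = mmap(NULL, arena_size, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```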
Add a FRAME_SUSPENDED_YIELD_FROM_LOCKED state that acts as a brief
lock, preventing other threads from transitioning the frame state
while gen_getyieldfrom reads the yield-from object off the stack.
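Roughly, the reader side works like the following sketch; it uses C11 atomics and illustrative state values, whereas the real code uses CPython's internal atomics and frame layout:

```c
/* Sketch only: hypothetical reader side of gen_getyieldfrom under the
 * locking protocol described above. */
#include <Python.h>
#include <stdatomic.h>

enum {
    FRAME_SUSPENDED_YIELD_FROM = 2,          /* illustrative values */
    FRAME_SUSPENDED_YIELD_FROM_LOCKED = 3,
};

static PyObject *
read_yield_from(_Atomic int *frame_state, PyObject **stack_pointer)
{
    int expected = FRAME_SUSPENDED_YIELD_FROM;
    /* Swing the state to the LOCKED value; if another thread owns the frame
     * or it is not suspended in a yield-from, give up. */
    if (!atomic_compare_exchange_strong(frame_state, &expected,
                                        FRAME_SUSPENDED_YIELD_FROM_LOCKED)) {
        return NULL;
    }
    /* No other thread can transition the frame state while it is LOCKED,
     * so reading the yield-from object off the stack is safe. */
    PyObject *yf = Py_XNewRef(stack_pointer[-1]);
    atomic_store(frame_state, FRAME_SUSPENDED_YIELD_FROM);
    return yf;
}
```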
mmap() returns MAP_FAILED ((void*)-1) on error, not NULL. The current
check never detects mmap failures, so jitdump initialization proceeds
even when the memory mapping fails.
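The corrected check looks like this sketch; the helper name is illustrative:

```c
#include <stddef.h>
#include <sys/mman.h>

static void *
map_jitdump(size_t size, int fd)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {      /* the old code compared against NULL */
        return NULL;            /* signal failure to the caller */
    }
    return p;
}
```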
Now that the specializing interpreter works with free threading,
replace ENABLE_SPECIALIZATION_FT with ENABLE_SPECIALIZATION and
replace requires_specialization_ft with requires_specialization.
Also limit the uniquely referenced check to FOR_ITER_RANGE. It isn't
needed for FOR_ITER_GEN, and applying it there would cause
test_for_iter_gen to fail. We didn't catch this earlier because of a
combination of:
1) falling through to the if-statement below happens to work, and
2) we only specialized FOR_ITER_GEN for uniquely referenced generators,
   so the non-thread-safe behavior was never triggered.
* Halve the size of the buffers by reusing combined trace + optimizer buffers for TOS caching
* Add a simple buffer struct for more maintainable handling of buffers (see the sketch after this list)
* Decouple the JIT structs from the thread state struct
* Ensure a terminator is added to the trace when the optimizer gives up
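A rough illustration of the buffer struct idea; the type, fields, and helper are made up for this sketch and are not the actual JIT code:

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t *data;        /* start of the buffer */
    size_t   size;        /* bytes written so far */
    size_t   capacity;    /* total bytes available */
} jit_buffer;

/* Append n bytes; return 0 on success, -1 when the buffer is full. */
static int
jit_buffer_append(jit_buffer *buf, const uint8_t *bytes, size_t n)
{
    if (buf->capacity - buf->size < n) {
        return -1;
    }
    memcpy(buf->data + buf->size, bytes, n);
    buf->size += n;
    return 0;
}
```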
This reverts gh-144055 and fixes the bug in a different way. Deferred
reference counting relies on the object being tracked by the GC;
otherwise the object will live until interpreter shutdown. So take
care that we do not enable deferred reference counting for objects that
are untracked. Also, if a tuple has deferred reference counting
enabled, don't untrack it.
When shutting down, disable deferred refcounting for all GC objects. It
is important to do this also for untracked objects, which before this
change were getting missed.
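A sketch of the resulting invariant using the public PyObject_GC_IsTracked() API; enable_deferred_refcount() is a hypothetical stand-in for the internal helper:

```c
#include <Python.h>

static void enable_deferred_refcount(PyObject *op) { (void)op; }  /* stub */

static void
maybe_enable_deferred(PyObject *op)
{
    /* Deferred reference counting relies on the GC visiting the object, so
     * never enable it for an untracked object: such an object would only be
     * reclaimed at interpreter shutdown. */
    if (!PyObject_GC_IsTracked(op)) {
        return;
    }
    enable_deferred_refcount(op);
}
```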
Small code cleanup:
We can remove the shutdown-case disable_deferred_refcounting() call
inside scan_heap_visitor() if we are careful about it. The key point is
that frame_disable_deferred_refcounting() might fail if the object
is untracked.
If we are specializing to `LOAD_GLOBAL_MODULE` or `LOAD_ATTR_MODULE`, try
to enable deferred reference counting for the value if the object is owned by
a different thread. This applies only to the free-threaded build and should
improve the scaling of multi-threaded programs.
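A sketch of that policy; both helpers below are hypothetical stand-ins for the internal free-threading machinery:

```c
#include <Python.h>

static int owned_by_current_thread(PyObject *op) { (void)op; return 1; }
static void enable_deferred_refcount(PyObject *op) { (void)op; }

static void
maybe_defer_cached_value(PyObject *value)
{
#ifdef Py_GIL_DISABLED
    /* Values cached by LOAD_GLOBAL_MODULE / LOAD_ATTR_MODULE may be read by
     * many threads; deferring reference counting for a value that lives in
     * another thread's heap avoids contention on its refcount field. */
    if (!owned_by_current_thread(value)) {
        enable_deferred_refcount(value);
    }
#else
    (void)value;
#endif
}
```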
* Moves the `GET_ITER` instruction into the generator function preamble.
This means the iterable is converted into an iterator during generator
creation, as documented, but keeps the conversion in the same code object,
allowing optimization.
Previously, negative timestamps (representing dates before 1970-01-01) were
not supported on Windows due to platform limitations. The changes introduce a
fallback implementation using the Windows FILETIME API, allowing negative
timestamps to be correctly handled in both UTC and local time conversions.
Additionally, related test code is updated to remove Windows-specific skips
and error handling, ensuring consistent behavior across platforms.
Co-authored-by: Victor Stinner <vstinner@python.org>
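A minimal sketch of the FILETIME-based conversion for a (possibly negative) Unix timestamp, UTC path only; the helper name is illustrative and the code is Windows-only:

```c
#ifdef _WIN32
#include <windows.h>

/* 100 ns intervals between 1601-01-01 (FILETIME epoch) and 1970-01-01. */
#define EPOCH_DELTA_100NS 116444736000000000LL

static int
unix_time_to_system_time(long long unix_seconds, SYSTEMTIME *out)
{
    ULARGE_INTEGER ticks;
    ticks.QuadPart = (ULONGLONG)(unix_seconds * 10000000LL + EPOCH_DELTA_100NS);

    FILETIME ft;
    ft.dwLowDateTime = ticks.LowPart;
    ft.dwHighDateTime = ticks.HighPart;

    /* FileTimeToSystemTime() accepts any time from 1601 onward, so negative
     * Unix timestamps (dates before 1970) convert without error. */
    return FileTimeToSystemTime(&ft, out) ? 0 : -1;
}
#endif
```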
Writing out an object may involve a slot lookup, which is not safe to do while
an exception is set. In debug mode, an assertion failure will occur if this
happens.
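One general way to keep that invariant, sketched with the public exception-state API added in Python 3.12; this illustrates the constraint rather than the exact change made here:

```c
#include <Python.h>

static int
write_object_safely(PyObject *obj, int (*write)(PyObject *))
{
    /* Take ownership of any pending exception, leaving the thread state
     * clear, so the slot lookups triggered by write() are legal. */
    PyObject *exc = PyErr_GetRaisedException();
    int rc = write(obj);
    if (exc != NULL) {
        PyErr_SetRaisedException(exc);   /* restore the saved exception */
    }
    return rc;
}
```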