Change old space bit of young objects from 0 to gcstate->visited_space.
This ensures that any object created *and* collected during cycle GC has the bit set correctly.
Instead of calling `exec()` once for each function added to a dataclass, only call `exec()` once per dataclass. This can lead to speed improvements of up to 20%.
Starting in Python 3.12, we prevented calling fork() and starting new threads
during interpreter finalization (shutdown). This has led to a number of
regressions and flaky tests. We should not prevent starting new threads
(or `fork()`) until all non-daemon threads exit and finalization starts in
earnest.
This changes the checks to use `_PyInterpreterState_GetFinalizing(interp)`,
which is set immediately before terminating non-daemon threads.
* GH-116554: Relax list.sort()'s notion of "descending" run
Rewrote `count_run()` so that sub-runs of equal elements no longer end a descending run. Both ascending and descending runs can have arbitrarily many sub-runs of arbitrarily many equal elements now. This is tricky, because we only use ``<`` comparisons, so checking for equality doesn't come "for free". Surprisingly, it turned out there's a very cheap (one comparison) way to determine whether an ascending run consisted of all-equal elements. That sealed the deal.
In addition, after a descending run is reversed in-place, we now go on to see whether it can be extended by an ascending run that just happens to be adjacent. This succeeds in finding at least one additional element to append about half the time, and so appears to more than repay its cost (the savings come from getting to skip a binary search, when a short run is artificially forced to length MIINRUN later, for each new element `count_run()` can add to the initial run).
While these have been in the back of my mind for years, a question on StackOverflow pushed it to action:
https://stackoverflow.com/questions/78108792/
They were wondering why it took about 4x longer to sort a list like:
[999_999, 999_999, ..., 2, 2, 1, 1, 0, 0]
than "similar" lists. Of course that runs very much faster after this patch.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
This builds on https://github.com/python/cpython/pull/106807, which adds
a return code to ResourceTracker, to make future debugging easier.
Testing this “in situ” proved difficult, since the global ResourceTracker is
involved in test infrastructure. So, the tests here create a new instance and
feed it fake data.
---------
Co-authored-by: Yonatan Bitton <yonatan.bitton@perception-point.io>
Co-authored-by: Yonatan Bitton <bityob@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.
The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
Use critical sections to make deque methods that operate on mutable
state thread-safe when the GIL is disabled. This is mostly accomplished
by using the @critical_section Argument Clinic directive, though there
are a few places where this was not possible and critical sections had
to be manually acquired/released.
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.
The following functions now raise PythonFinalizationError, instead of
RuntimeError:
* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()
Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
Setters for members with an unsigned integer type now support
the same range of valid values for objects that has a __index__()
method as for int.
Previously, Py_T_UINT, Py_T_ULONG and Py_T_ULLONG did not support
objects that has a __index__() method larger than LONG_MAX.
Py_T_ULLONG did not support negative ints. Now it supports them and
emits a RuntimeWarning.