[3.14] GH-148726: Forward-port generational GC (#148720)

Co-authored-by: Neil Schemenauer <nas@arctrix.com>
Co-authored-by: Sergey Miryanov <sergey.miryanov@gmail.com>
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Neil Schemenauer <nas-github@arctrix.com>
Sergey Miryanov 2026-04-26 23:12:52 +05:00 committed by GitHub
parent 78c5e54b4f
commit 9a7e205e46
GPG key ID: B5690EEEBB952194
12 changed files with 6023 additions and 6537 deletions

File diff suppressed because it is too large


@ -40,18 +40,11 @@ The :mod:`!gc` module provides the following functions:
.. function:: collect(generation=2)
Perform a collection. The optional argument *generation*
With no arguments, run a full collection. The optional argument *generation*
may be an integer specifying which generation to collect (from 0 to 2). A
:exc:`ValueError` is raised if the generation number is invalid. The sum of
collected objects and uncollectable objects is returned.
Calling ``gc.collect(0)`` will perform a GC collection on the young generation.
Calling ``gc.collect(1)`` will perform a GC collection on the young generation
and an increment of the old generation.
Calling ``gc.collect(2)`` or ``gc.collect()`` performs a full collection
The free lists maintained for a number of built-in types are cleared
whenever a full collection or collection of the highest generation (2)
is run. Not all items in some free lists may be freed due to the
@ -63,6 +56,9 @@ The :mod:`!gc` module provides the following functions:
.. versionchanged:: 3.14
``generation=1`` performs an increment of collection.
.. versionchanged:: 3.14.5
``generation=1`` performs collection of the middle generation.
.. function:: set_debug(flags)
@ -78,13 +74,9 @@ The :mod:`!gc` module provides the following functions:
.. function:: get_objects(generation=None)
Returns a list of all objects tracked by the collector, excluding the list
returned. If *generation* is not ``None``, return only the objects as follows:
* 0: All objects in the young generation
* 1: No objects, as there is no generation 1 (as of Python 3.14)
* 2: All objects in the old generation
returned. If *generation* is not ``None``, return only the objects tracked by
the collector that are in that generation.
.. versionchanged:: 3.8
New *generation* parameter.
@ -92,6 +84,9 @@ The :mod:`!gc` module provides the following functions:
.. versionchanged:: 3.14
Generation 1 is removed
.. versionchanged:: 3.14.5
Generation 1 is reintroduced to maintain GC behavior from 3.13.
.. audit-event:: gc.get_objects generation gc.get_objects
.. function:: get_stats()
@ -118,33 +113,33 @@ The :mod:`!gc` module provides the following functions:
Set the garbage collection thresholds (the collection frequency). Setting
*threshold0* to zero disables collection.
The GC classifies objects into two generations depending on whether they have
survived a collection. New objects are placed in the young generation. If an
object survives a collection it is moved into the old generation.
In order to decide when to run, the collector keeps track of the number of object
The GC classifies objects into three generations depending on how many
collection sweeps they have survived. New objects are placed in the youngest
generation (generation ``0``). If an object survives a collection it is moved
into the next older generation. Since generation ``2`` is the oldest
generation, objects in that generation remain there after a collection. In
order to decide when to run, the collector keeps track of the number of object
allocations and deallocations since the last collection. When the number of
allocations minus the number of deallocations exceeds *threshold0*, collection
starts. For each collection, all the objects in the young generation and some
fraction of the old generation is collected.
starts. Initially only generation ``0`` is examined. If generation ``0`` has
been examined more than *threshold1* times since generation ``1`` has been
examined, then generation ``1`` is examined as well.
With the third generation, things are a bit more complicated;
see `Collecting the oldest generation <https://github.com/python/cpython/blob/ff0ef0a54bef26fc507fbf9b7a6009eb7d3f17f5/InternalDocs/garbage_collector.md#collecting-the-oldest-generation>`_ for more information.
In the free-threaded build, the increase in process memory usage is also
checked before running the collector. If the memory usage has not increased
by 10% since the last collection and the net number of object allocations
has not exceeded 40 times *threshold0*, the collection is not run.
The fraction of the old generation that is collected is **inversely** proportional
to *threshold1*. The larger *threshold1* is, the slower objects in the old generation
are collected.
For the default value of 10, 1% of the old generation is scanned during each collection.
*threshold2* is ignored.
See `Garbage collector design <https://github.com/python/cpython/blob/3.14/InternalDocs/garbage_collector.md>`_ for more information.
.. versionchanged:: 3.14
*threshold2* is ignored
.. versionchanged:: 3.14.5
*threshold2* is restored to match Python 3.13 behavior.
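The threshold rule described above for ``set_threshold`` can be pictured with a small toy model. This is only a sketch under stated assumptions: the class ``ToyGenerations`` and its methods are hypothetical names invented for illustration, it counts one net allocation at a time, and it leaves out the extra check that applies to the oldest generation. It is not CPython's implementation.

```python
class ToyGenerations:
    """Toy model of the three-generation trigger rule (not CPython's code)."""

    def __init__(self, threshold0=2000, threshold1=10, threshold2=10):
        self.thresholds = [threshold0, threshold1, threshold2]
        # counts[0]: net allocations since the last collection.
        # counts[1], counts[2]: collections of the next younger generation
        # since this generation was last collected.
        self.counts = [0, 0, 0]
        self.collected = []  # generation numbers, in the order they ran

    def allocate(self):
        """Record one net allocation and run a collection if one is due."""
        self.counts[0] += 1
        if self.counts[0] > self.thresholds[0]:
            self._collect_oldest_due()

    def _collect_oldest_due(self):
        # Pick the oldest generation whose counter exceeds its threshold.
        for generation in (2, 1, 0):
            if self.counts[generation] > self.thresholds[generation]:
                self._collect(generation)
                return

    def _collect(self, generation):
        self.collected.append(generation)
        # Survivors move up, so the next older generation's counter ticks.
        if generation < 2:
            self.counts[generation + 1] += 1
        # Collecting a generation also covers every younger generation.
        for younger in range(generation + 1):
            self.counts[younger] = 0


# Tiny thresholds so the pattern shows up after a handful of allocations.
model = ToyGenerations(threshold0=3, threshold1=2, threshold2=2)
for _ in range(16):
    model.allocate()
print(model.collected)
```

With these small thresholds the model runs three generation 0 collections and then one generation 1 collection, which matches the rule that generation 1 is examined once generation 0 has been examined more than *threshold1* times.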
.. function:: get_count()


@ -118,21 +118,6 @@ static inline void _PyObject_GC_SET_SHARED(PyObject *op) {
/* Bit 1 is set when the object is in generation which is GCed currently. */
#define _PyGC_PREV_MASK_COLLECTING ((uintptr_t)2)
/* Bit 0 in _gc_next is the old space bit.
* It is set as follows:
* Young: gcstate->visited_space
* old[0]: 0
* old[1]: 1
* permanent: 0
*
* During a collection all objects handled should have the bit set to
* gcstate->visited_space, as objects are moved from the young gen
* and the increment into old[gcstate->visited_space].
* When object are moved from the pending space, old[gcstate->visited_space^1]
* into the increment, the old space bit is flipped.
*/
#define _PyGC_NEXT_MASK_OLD_SPACE_1 1
#define _PyGC_PREV_SHIFT 2
#define _PyGC_PREV_MASK (((uintptr_t) -1) << _PyGC_PREV_SHIFT)
@ -159,13 +144,11 @@ typedef enum {
// Lowest bit of _gc_next is used for flags only in GC.
// But it is always 0 for normal code.
static inline PyGC_Head* _PyGCHead_NEXT(PyGC_Head *gc) {
uintptr_t next = gc->_gc_next & _PyGC_PREV_MASK;
uintptr_t next = gc->_gc_next;
return (PyGC_Head*)next;
}
static inline void _PyGCHead_SET_NEXT(PyGC_Head *gc, PyGC_Head *next) {
uintptr_t unext = (uintptr_t)next;
assert((unext & ~_PyGC_PREV_MASK) == 0);
gc->_gc_next = (gc->_gc_next & ~_PyGC_PREV_MASK) | unext;
gc->_gc_next = (uintptr_t)next;
}
// Lowest two bits of _gc_prev is used for _PyGC_PREV_MASK_* flags.
@ -207,10 +190,6 @@ static inline void _PyGC_CLEAR_FINALIZED(PyObject *op) {
extern void _Py_ScheduleGC(PyThreadState *tstate);
#ifndef Py_GIL_DISABLED
extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
#endif
/* Tell the GC to track this object.
*
@ -220,7 +199,7 @@ extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
* ob_traverse method.
*
* Internal note: interp->gc.generation0->_gc_prev doesn't have any bit flags
* because it's not object header. So we don't use _PyGCHead_PREV() and
* because it's not an object header. So we don't use _PyGCHead_PREV() and
* _PyGCHead_SET_PREV() for it to avoid unnecessary bitwise operations.
*
* See also the public PyObject_GC_Track() function.
@ -244,19 +223,12 @@ static inline void _PyObject_GC_TRACK(
"object is in generation which is garbage collected",
filename, lineno, __func__);
struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
PyGC_Head *generation0 = &gcstate->young.head;
PyGC_Head *generation0 = _PyInterpreterState_GET()->gc.generation0;
PyGC_Head *last = (PyGC_Head*)(generation0->_gc_prev);
_PyGCHead_SET_NEXT(last, gc);
_PyGCHead_SET_PREV(gc, last);
uintptr_t not_visited = 1 ^ gcstate->visited_space;
gc->_gc_next = ((uintptr_t)generation0) | not_visited;
_PyGCHead_SET_NEXT(gc, generation0);
generation0->_gc_prev = (uintptr_t)gc;
gcstate->young.count++; /* number of tracked GC objects */
gcstate->heap_size++;
if (gcstate->young.count > gcstate->young.threshold) {
_Py_TriggerGC(gcstate);
}
#endif
}
@ -291,11 +263,6 @@ static inline void _PyObject_GC_UNTRACK(
_PyGCHead_SET_PREV(next, prev);
gc->_gc_next = 0;
gc->_gc_prev &= _PyGC_PREV_MASK_FINALIZED;
struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
if (gcstate->young.count > 0) {
gcstate->young.count--;
}
gcstate->heap_size--;
#endif
}


@ -195,11 +195,6 @@ struct gc_generation_stats {
Py_ssize_t uncollectable;
};
enum _GCPhase {
GC_PHASE_MARK = 0,
GC_PHASE_COLLECT = 1
};
/* If we change this, we need to change the default value in the
signature of gc.collect. */
#define NUM_GENERATIONS 3
@ -215,8 +210,13 @@ struct _gc_runtime_state {
int enabled;
int debug;
/* linked lists of container objects */
#ifndef Py_GIL_DISABLED
struct gc_generation generations[NUM_GENERATIONS];
PyGC_Head *generation0;
#else
struct gc_generation young;
struct gc_generation old[2];
#endif
/* a permanent generation which won't be collected */
struct gc_generation permanent_generation;
struct gc_generation_stats generation_stats[NUM_GENERATIONS];
@ -227,13 +227,6 @@ struct _gc_runtime_state {
/* a list of callbacks to be invoked when collection is performed */
PyObject *callbacks;
Py_ssize_t heap_size;
Py_ssize_t work_to_do;
/* Which of the old spaces is the visited space */
int visited_space;
int phase;
#ifdef Py_GIL_DISABLED
/* This is the number of objects that survived the last full
collection. It approximates the number of long lived objects
tracked by the GC.
@ -246,6 +239,7 @@ struct _gc_runtime_state {
the first time. */
Py_ssize_t long_lived_pending;
#ifdef Py_GIL_DISABLED
/* True if gc.freeze() has been used. */
int freeze_active;
@ -261,6 +255,22 @@ struct _gc_runtime_state {
#endif
};
#ifndef Py_GIL_DISABLED
#define GC_GENERATION_INIT \
.generations = { \
{ .threshold = 2000, }, \
{ .threshold = 10, }, \
{ .threshold = 10, }, \
},
#else
#define GC_GENERATION_INIT \
.young = { .threshold = 2000, }, \
.old = { \
{ .threshold = 10, }, \
{ .threshold = 10, }, \
},
#endif
#include "pycore_gil.h" // struct _gil_runtime_state
/**** Import ********/


@ -137,13 +137,7 @@ extern PyTypeObject _PyExc_MemoryError;
}, \
.gc = { \
.enabled = 1, \
.young = { .threshold = 2000, }, \
.old = { \
{ .threshold = 10, }, \
{ .threshold = 0, }, \
}, \
.work_to_do = -5000, \
.phase = GC_PHASE_MARK, \
GC_GENERATION_INIT \
}, \
.qsbr = { \
.wr_seq = QSBR_INITIAL, \


@ -107,7 +107,7 @@
[Optimization: reusing fields to save memory](#optimization-reusing-fields-to-save-memory)
section, these two extra fields are normally used to keep doubly linked lists of all the
objects tracked by the garbage collector (these lists are the GC generations, more on
that in the [Optimization: incremental collection](#Optimization-incremental-collection) section), but
that in the [Optimization: generations](#Optimization-generations) section), but
they are also reused to fulfill other purposes when the full doubly linked list
structure is not needed as a memory optimization.
@ -199,22 +199,22 @@
```pycon
>>> import gc
>>>
>>>
>>> class Link:
... def __init__(self, next_link=None):
... self.next_link = next_link
...
...
>>> link_3 = Link()
>>> link_2 = Link(link_3)
>>> link_1 = Link(link_2)
>>> link_3.next_link = link_1
>>> A = link_1
>>> del link_1, link_2, link_3
>>>
>>>
>>> link_4 = Link()
>>> link_4.next_link = link_4
>>> del link_4
>>>
>>>
>>> # Collect the unreachable Link object (and its .__dict__ dict).
>>> gc.collect()
2
@ -350,11 +350,12 @@
the reference counts fall to 0, triggering the destruction of all unreachable
objects.
Optimization: incremental collection
====================================
Optimization: generations
=========================
In order to bound the length of each garbage collection pause, the GC implementation
for the default build uses incremental collection with two generations.
In order to limit the time each garbage collection takes, the GC
implementation for the default build uses a popular optimization:
generations.
Generational garbage collection takes advantage of what is known as the weak
generational hypothesis: Most objects die young.
@ -362,76 +363,29 @@
programs as many temporary objects are created and destroyed very quickly.
To take advantage of this fact, all container objects are segregated into
two generations: young and old. Every new object starts in the young generation.
Each garbage collection scans the entire young generation and part of the old generation.
The time taken to scan the young generation can be controlled by controlling its
size, but the size of the old generation cannot be controlled.
In order to keep pause times down, scanning of the old generation of the heap
occurs in increments.
To keep track of what has been scanned, the old generation contains two lists:
* Those objects that have not yet been scanned, referred to as the `pending` list.
* Those objects that have been scanned, referred to as the `visited` list.
To detect and collect all unreachable objects in the heap, the garbage collector
must scan the whole heap. This whole heap scan is called a full scavenge.
Increments
----------
Each full scavenge is performed in a series of increments.
For each full scavenge, the combined increments will cover the whole heap.
Each increment is made up of:
* The young generation
* The old generation's least recently scanned objects
* All objects reachable from those objects that have not yet been scanned this full scavenge
The surviving objects (those that are not collected) are moved to the back of the
`visited` list in the old generation.
When a full scavenge starts, no objects in the heap are considered to have been scanned,
so all objects in the old generation must be in the `pending` space.
When all objects in the heap have been scanned a cycle ends, and all objects are moved
to the `pending` list again. To avoid having to traverse the entire list, which list is
`pending` and which is `visited` is determined by a field in the `GCState` struct.
The `visited` and `pending` lists can be swapped by toggling this bit.
Correctness
-----------
The [algorithm for identifying cycles](#Identifying-reference-cycles) will find all
unreachable cycles in a list of objects, but will not find any cycles that are
even partly outside of that list.
Therefore, to be guaranteed that a full scavenge will find all unreachable cycles,
each cycle must be fully contained within a single increment.
To make sure that no partial cycles are included in the increment we perform a
[transitive closure](https://en.wikipedia.org/wiki/Transitive_closure)
over reachable, unscanned objects from the initial increment.
Since the transitive closure of objects reachable from an object must be a (non-strict)
superset of any unreachable cycle including that object, we are guaranteed that a
transitive closure cannot contain any partial cycles.
We can exclude scanned objects, as they must have been reachable when scanned.
If a scanned object becomes part of an unreachable cycle after being scanned, it will
not be collected at this time, but it will be collected in the next full scavenge.
three spaces/generations. Every new
object starts in the first generation (generation 0). The previous algorithm is
executed only over the objects of a particular generation and if an object
survives a collection of its generation it will be moved to the next one
(generation 1), where it will be surveyed for collection less often. If
the same object survives another GC round in this new generation (generation 1)
it will be moved to the last generation (generation 2) where it will be
surveyed the least often.
> [!NOTE]
> The GC implementation for the free-threaded build does not use incremental collection.
> Every collection operates on the entire heap.
> The GC implementation for the free-threaded build does not use generational
> collection. Every collection operates on the entire heap.
In order to decide when to run, the collector keeps track of the number of object
allocations and deallocations since the last collection. When the number of
allocations minus the number of deallocations exceeds `threshold0`,
collection starts. `threshold1` determines the fraction of the old
collection that is included in the increment.
The fraction is inversely proportional to `threshold1`,
as historically a larger `threshold1` meant that old generation
collections were performed less frequently.
`threshold2` is ignored.
collection starts. Initially only generation 0 is examined. If generation 0 has
been examined more than `threshold1` times since generation 1 has been
examined, then generation 1 is examined as well. With generation 2,
things are a bit more complicated; see
[Collecting the oldest generation](#Collecting-the-oldest-generation) for
more information.
These thresholds can be examined using the
[`gc.get_threshold()`](https://docs.python.org/3/library/gc.html#gc.get_threshold)
@ -440,7 +394,7 @@
```pycon
>>> import gc
>>> gc.get_threshold()
(700, 10, 10)
(2000, 10, 10)
```
The content of these generations can be examined using the
@ -453,84 +407,61 @@
... pass
...
>>> # Move everything to the old generation so it's easier to inspect
>>> # the young generation.
>>> # the younger generation.
>>> gc.collect()
0
>>> # Create a reference cycle.
>>> x = MyObj()
>>> x.self = x
>>>
>>> # Initially the object is in the young generation.
>>>
>>> # Initially the object is in the youngest generation.
>>> gc.get_objects(generation=0)
[..., <__main__.MyObj object at 0x7fbcc12a3400>, ...]
>>>
>>>
>>> # After a collection of the youngest generation the object
>>> # moves to the old generation.
>>> # moves to the next generation.
>>> gc.collect(generation=0)
0
>>> gc.get_objects(generation=0)
[]
>>> gc.get_objects(generation=1)
[]
>>> gc.get_objects(generation=2)
[..., <__main__.MyObj object at 0x7fbcc12a3400>, ...]
```
Collecting the oldest generation
--------------------------------
In addition to the various configurable thresholds, the GC only triggers a full
collection of the oldest generation if the ratio `long_lived_pending / long_lived_total`
is above a given value (hardwired to 25%). The reason is that, while "non-full"
collections (that is, collections of the young and middle generations) will always
examine roughly the same number of objects (determined by the aforementioned
thresholds) the cost of a full collection is proportional to the total
number of long-lived objects, which is virtually unbounded. Indeed, it has
been remarked that doing a full collection every <constant number> of object
creations entails a dramatic performance degradation in workloads which consist
of creating and storing lots of long-lived objects (for example, building a large list
of GC-tracked objects would show quadratic performance, instead of linear as
expected). Using the above ratio, instead, yields amortized linear performance
in the total number of objects (the effect of which can be summarized thusly:
"each full garbage collection is more and more costly as the number of objects
grows, but we do fewer and fewer of them").
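The quadratic versus amortized linear behaviour described above can be checked with a rough cost model. The sketch below is an illustration only: `full_collection_work` is a hypothetical helper, every created object is assumed to be long-lived, and the cost of a full collection is assumed to equal the number of objects alive when it runs. The names `long_lived_pending` and `long_lived_total` mirror the ratio in the text, not the actual C fields.

```python
def full_collection_work(n_objects, *, interval=None, ratio=None):
    """Count objects examined by full collections while creating n_objects.

    Pass `interval` to collect every `interval` creations, or `ratio` to
    collect when long_lived_pending / long_lived_total exceeds `ratio`.
    """
    if (interval is None) == (ratio is None):
        raise ValueError("pass exactly one of interval or ratio")
    long_lived_total = 0    # objects that survived the last full collection
    long_lived_pending = 0  # objects created since that collection
    work = 0
    for _ in range(n_objects):
        long_lived_pending += 1
        if interval is not None:
            due = long_lived_pending >= interval
        else:
            due = long_lived_pending > long_lived_total * ratio
        if due:
            # A full collection examines every live object once.
            long_lived_total += long_lived_pending
            long_lived_pending = 0
            work += long_lived_total
    return work


for n in (10_000, 20_000):
    fixed = full_collection_work(n, interval=100)
    by_ratio = full_collection_work(n, ratio=0.25)
    print(f"n={n}: fixed interval={fixed}, 25% ratio={by_ratio}")
```

In this model, doubling the number of objects roughly quadruples the work for the fixed interval, while the ratio rule stays below five examinations per object, because each full collection happens only after the live set has grown by more than a quarter.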
Optimization: excluding reachable objects
=========================================
An object cannot be garbage if it can be reached. To avoid having to identify
reference cycles across the whole heap, we can reduce the amount of work done
considerably by first identifying objects reachable from objects known to be
alive. These objects are excluded from the normal cyclic detection process.
The default and free-threaded build both implement this optimization but in
slightly different ways.
Finding reachable objects for the default build GC
--------------------------------------------------
This works by first moving most reachable objects to the `visited` space.
Empirically, most reachable objects can be reached from a small set of global
objects and local variables. This step does much less work per object, so
reduces the time spent performing garbage collection by at least half.
> [!NOTE]
> Objects that are not determined to be reachable by this pass are not necessarily
> unreachable. We still need to perform the main algorithm to determine which objects
> are actually unreachable.
We use the same technique of forming a transitive closure as the incremental
collector does to find reachable objects, seeding the list with some global
objects and the currently executing frames.
This phase moves objects to the `visited` space, as follows:
1. All objects directly referred to by any builtin class, the `sys` module, the `builtins`
module and all objects directly referred to from stack frames are added to a working
set of reachable objects.
2. Until this working set is empty:
1. Pop an object from the set and move it to the `visited` space
2. For each object directly reachable from that object:
* If it is not already in `visited` space and it is a GC object,
add it to the working set
Before each increment of collection is performed, the stacks are scanned
to check for any new stack frames that have been created since the last
increment. All objects directly referred to from those stack frames are
added to the working set.
Then the above algorithm is repeated, starting from step 2.
reference cycles across the whole heap, the free-threaded build first identifies
objects reachable from objects known to be alive. These objects are excluded
from the normal cyclic detection process.
Finding reachable objects for the free-threaded GC
--------------------------------------------------
Within the `gc_free_threading.c` implementation, this is known as the "mark
alive" pass or phase. It is similar in concept to what is done for the default
build GC. Rather than moving objects between double-linked lists, the
free-threaded GC uses a flag in `ob_gc_bits` to track if an object is
found to be definitely alive (not garbage).
alive" pass or phase. The free-threaded GC uses a flag in `ob_gc_bits` to track
if an object is found to be definitely alive (not garbage).
To find objects reachable from known alive objects, known as the "roots", the
`gc_mark_alive_from_roots()` function is used. Root objects include
@ -761,6 +692,14 @@
already not tracked. Tuples are examined for untracking in all garbage collection
cycles.
Dictionaries are always tracked from creation and are not untracked by the
garbage collector. Earlier versions (up to 3.13) used lazy tracking: empty or
atomic-only dicts were untracked on creation and re-tracked when a trackable
value was inserted (via `MAINTAIN_TRACKING`), and full collections called
`_PyDict_MaybeUntrack` to prune dicts whose values had become atomic. That
machinery was removed in 3.14 (GH-127010) because the per-set-item cost of
checking the tracking invariant outweighed the savings on full collections.
The garbage collector module provides the Python function `is_tracked(obj)`, which returns
the current tracking status of the object. Subsequent garbage collections may change the
tracking status of the object.


@ -1,48 +0,0 @@
# Run by test_gc.
from test import support
import _testinternalcapi
import gc
import unittest
class IncrementalGCTests(unittest.TestCase):
# Use small increments to emulate longer running process in a shorter time
@support.gc_threshold(200, 10)
def test_incremental_gc_handles_fast_cycle_creation(self):
class LinkedList:
#Use slots to reduce number of implicit objects
__slots__ = "next", "prev", "surprise"
def __init__(self, next=None, prev=None):
self.next = next
if next is not None:
next.prev = self
self.prev = prev
if prev is not None:
prev.next = self
def make_ll(depth):
head = LinkedList()
for i in range(depth):
head = LinkedList(head, head.prev)
return head
head = make_ll(1000)
assert(gc.isenabled())
olds = []
initial_heap_size = _testinternalcapi.get_tracked_heap_size()
for i in range(20_000):
newhead = make_ll(20)
newhead.surprise = head
olds.append(newhead)
if len(olds) == 20:
new_objects = _testinternalcapi.get_tracked_heap_size() - initial_heap_size
self.assertLess(new_objects, 27_000, f"Heap growing. Reached limit after {i} iterations")
del olds[:]
if __name__ == "__main__":
unittest.main()


@ -7,7 +7,7 @@
Py_GIL_DISABLED)
from test.support.import_helper import import_module
from test.support.os_helper import temp_dir, TESTFN, unlink
from test.support.script_helper import assert_python_ok, make_script, run_test_script
from test.support.script_helper import assert_python_ok, make_script
from test.support import threading_helper, gc_threshold
import gc
@ -399,11 +399,19 @@ def test_collect_generations(self):
# each call to collect(N)
x = []
gc.collect(0)
# x is now in the old gen
# x is now in gen 1
a, b, c = gc.get_count()
# We don't check a since its exact values depends on
gc.collect(1)
# x is now in gen 2
d, e, f = gc.get_count()
gc.collect(2)
# x is now in gen 3
g, h, i = gc.get_count()
# We don't check a, d, g since their exact values depend on
# internal implementation details of the interpreter.
self.assertEqual((b, c), (1, 0))
self.assertEqual((e, f), (0, 1))
self.assertEqual((h, i), (0, 0))
def test_trashcan(self):
class Ouch:
@ -870,10 +878,42 @@ def test_get_objects_generations(self):
self.assertTrue(
any(l is element for element in gc.get_objects(generation=0))
)
gc.collect()
self.assertFalse(
any(l is element for element in gc.get_objects(generation=1))
)
self.assertFalse(
any(l is element for element in gc.get_objects(generation=2))
)
gc.collect(generation=0)
self.assertFalse(
any(l is element for element in gc.get_objects(generation=0))
)
self.assertTrue(
any(l is element for element in gc.get_objects(generation=1))
)
self.assertFalse(
any(l is element for element in gc.get_objects(generation=2))
)
gc.collect(generation=1)
self.assertFalse(
any(l is element for element in gc.get_objects(generation=0))
)
self.assertFalse(
any(l is element for element in gc.get_objects(generation=1))
)
self.assertTrue(
any(l is element for element in gc.get_objects(generation=2))
)
gc.collect(generation=2)
self.assertFalse(
any(l is element for element in gc.get_objects(generation=0))
)
self.assertFalse(
any(l is element for element in gc.get_objects(generation=1))
)
self.assertTrue(
any(l is element for element in gc.get_objects(generation=2))
)
del l
gc.collect()
@ -1181,17 +1221,6 @@ def test_tuple_untrack_counts(self):
self.assertTrue(new_count - count > (n // 2))
class IncrementalGCTests(unittest.TestCase):
@unittest.skipIf(_testinternalcapi is None, "requires _testinternalcapi")
@requires_gil_enabled("Free threading does not support incremental GC")
def test_incremental_gc_handles_fast_cycle_creation(self):
# Run this test in a fresh process. The number of alive objects (which can
# be from unit tests run before this one) can influence how quickly cyclic
# garbage is found.
script = support.findfile("_test_gc_fast_cycles.py")
run_test_script(script)
class GCCallbackTests(unittest.TestCase):
def setUp(self):
# Save gc state and disable it.
@ -1444,8 +1473,8 @@ def callback(ignored):
assert not detector.gc_happened
while not detector.gc_happened:
i += 1
if i > 100000:
self.fail("gc didn't happen after 100000 iterations")
if i > 10000:
self.fail("gc didn't happen after 10000 iterations")
self.assertEqual(len(ouch), 0)
junk.append([]) # this will eventually trigger gc
@ -1516,8 +1545,8 @@ def __del__(self):
gc.collect()
while not detector.gc_happened:
i += 1
if i > 50000:
self.fail("gc didn't happen after 50000 iterations")
if i > 10000:
self.fail("gc didn't happen after 10000 iterations")
self.assertEqual(len(ouch), 0)
junk.append([]) # this will eventually trigger gc
@ -1534,8 +1563,8 @@ def test_indirect_calls_with_gc_disabled(self):
detector = GC_Detector()
while not detector.gc_happened:
i += 1
if i > 100000:
self.fail("gc didn't happen after 100000 iterations")
if i > 10000:
self.fail("gc didn't happen after 10000 iterations")
junk.append([]) # this will eventually trigger gc
try:
@ -1545,11 +1574,11 @@ def test_indirect_calls_with_gc_disabled(self):
detector = GC_Detector()
while not detector.gc_happened:
i += 1
if i > 100000:
if i > 10000:
break
junk.append([]) # this may eventually trigger gc (if it is enabled)
self.assertEqual(i, 100001)
self.assertEqual(i, 10001)
finally:
gc.enable()


@ -0,0 +1 @@
Forward-port the generational cycle garbage collector to the default 3.14 build, replacing the incremental collector while leaving the free-threaded collector unchanged.


@ -2353,7 +2353,8 @@ has_deferred_refcount(PyObject *self, PyObject *op)
static PyObject *
get_tracked_heap_size(PyObject *self, PyObject *Py_UNUSED(ignored))
{
return PyLong_FromInt64(PyInterpreterState_Get()->gc.heap_size);
// Generational GC doesn't track heap_size, return -1.
return PyLong_FromInt64(-1);
}
static PyObject *


@ -159,6 +159,15 @@ gc_set_threshold_impl(PyObject *module, int threshold0, int group_right_1,
{
GCState *gcstate = get_gc_state();
#ifndef Py_GIL_DISABLED
gcstate->generations[0].threshold = threshold0;
if (group_right_1) {
gcstate->generations[1].threshold = threshold1;
}
if (group_right_2) {
gcstate->generations[2].threshold = threshold2;
}
#else
gcstate->young.threshold = threshold0;
if (group_right_1) {
gcstate->old[0].threshold = threshold1;
@ -166,6 +175,7 @@ gc_set_threshold_impl(PyObject *module, int threshold0, int group_right_1,
if (group_right_2) {
gcstate->old[1].threshold = threshold2;
}
#endif
Py_RETURN_NONE;
}
@ -180,10 +190,17 @@ gc_get_threshold_impl(PyObject *module)
/*[clinic end generated code: output=7902bc9f41ecbbd8 input=286d79918034d6e6]*/
{
GCState *gcstate = get_gc_state();
#ifndef Py_GIL_DISABLED
return Py_BuildValue("(iii)",
gcstate->generations[0].threshold,
gcstate->generations[1].threshold,
gcstate->generations[2].threshold);
#else
return Py_BuildValue("(iii)",
gcstate->young.threshold,
gcstate->old[0].threshold,
0);
gcstate->old[1].threshold);
#endif
}
/*[clinic input]
@ -207,10 +224,17 @@ gc_get_count_impl(PyObject *module)
gc->alloc_count = 0;
#endif
#ifndef Py_GIL_DISABLED
return Py_BuildValue("(iii)",
gcstate->generations[0].count,
gcstate->generations[1].count,
gcstate->generations[2].count);
#else
return Py_BuildValue("(iii)",
gcstate->young.count,
gcstate->old[gcstate->visited_space].count,
gcstate->old[gcstate->visited_space^1].count);
gcstate->old[0].count,
gcstate->old[1].count);
#endif
}
/*[clinic input]

File diff suppressed because it is too large