mirror of
https://github.com/python/cpython.git
synced 2026-05-04 09:31:02 +00:00
Merge 63772f368f into 68fe899feb
commit
dd3b3e3970
1 changed files with 129 additions and 0 deletions
@@ -165,3 +165,132 @@ to false. If the flag is true then the :class:`warnings.catch_warnings`
context manager uses a context variable for warning filters. If the flag is
false then :class:`~warnings.catch_warnings` modifies the global filters list,
which is not thread-safe. See the :mod:`warnings` module for more details.


Increased memory usage
----------------------

The free-threaded build typically uses more memory than the default build.
There are several reasons for this, most of them consequences of design
decisions.


All interned strings are immortal
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the default build of modern Python versions (since version 2.3), interning
a string (e.g. with :func:`sys.intern`) does not cause it to become immortal.
Instead, when the last reference to that string disappears, the string is
removed from the interned string table. This is not the case in the
free-threaded build: any interned string becomes immortal and survives until
interpreter shutdown.
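
The following is a minimal sketch of what interning does, not a measurement of
memory use. The helper name ``intern_pair`` is hypothetical. The check for
``sys._is_immortal`` assumes a private helper that only recent CPython versions
provide, so it is looked up defensively.

```python
import sys


def intern_pair(parts):
    """Build two equal strings at run time and intern both of them."""
    first = "".join(parts)
    second = "".join(parts)
    return sys.intern(first), sys.intern(second)


interned_first, interned_second = intern_pair(["free", "_", "threaded"])

# Interning maps equal strings to a single shared object.
print("same object after interning:", interned_first is interned_second)

# Assumption: sys._is_immortal() is private and may be missing, so it is
# fetched with getattr() instead of being called directly.
is_immortal = getattr(sys, "_is_immortal", None)
if is_immortal is not None:
    print("interned string is immortal:", is_immortal(interned_first))
```

Programs that intern many distinct, short-lived strings are the ones most
likely to notice the difference, since those strings are never released in the
free-threaded build.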


Non-GC objects have a larger object header
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The free-threaded build uses a different :c:type:`PyObject` structure. Instead
of allocating the GC-related information before the :c:type:`PyObject`
structure, as the default build does, it stores that information in the normal
object header. For example, on the AMD64 platform, ``None`` uses 32 bytes in
the free-threaded build versus 16 bytes in the default build. GC objects (such
as dicts and lists) are the same size in both builds, since the free-threaded
build does not use additional space for the GC info.
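
One way to compare the sizes on a given interpreter is sketched below. The
function name ``object_size_report`` is hypothetical, and the sketch assumes
that the ``Py_GIL_DISABLED`` configuration variable identifies a free-threaded
build. The numbers depend on platform and version, so none are shown here.

```python
import sys
import sysconfig


def object_size_report():
    """Return the sizes that sys.getsizeof() reports for a few objects."""
    # Assumption: Py_GIL_DISABLED is set to 1 only on free-threaded builds.
    free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    return {
        "free_threaded": free_threaded,
        "None": sys.getsizeof(None),   # a non-GC object
        "list": sys.getsizeof([]),     # a GC object
        "dict": sys.getsizeof({}),     # a GC object
    }


report = object_size_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

Running the same script under both builds and comparing the output shows which
objects pay for the larger header.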


QSBR can delay freeing of memory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To implement lock-free data structures safely, the free-threaded build uses a
safe memory reclamation (SMR) scheme known as quiescent state-based
reclamation (QSBR). Memory that backs data structures allowing lock-free
access is freed through QSBR, which defers the free operation rather than
freeing the memory immediately. Two examples of such data structures are the
list object and the dictionary keys object. See ``InternalDocs/qsbr.md`` in
the CPython source tree for more details on how QSBR is implemented. Running
:func:`gc.collect` should cause all memory held by QSBR to actually be freed.
Note that even when QSBR frees the memory, the underlying memory allocator may
not immediately return that memory to the OS, so the resident set size (RSS)
of the process might not decrease.
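
As a rough sketch, a program that has just dropped a large number of lists and
dicts can request a full collection explicitly. The function name
``churn_and_collect`` and its ``rounds`` parameter are hypothetical. The sketch
does not measure RSS, which may stay unchanged for the reasons given above.

```python
import gc


def churn_and_collect(rounds=1000):
    """Create and drop many lists and dicts, then run a full collection."""
    for _ in range(rounds):
        items = list(range(100))        # allocates list storage
        table = {n: n for n in items}   # allocates a dictionary keys object
        del items, table
    # gc.collect() returns the number of unreachable objects it found.
    # According to the text above, it should also release memory that QSBR
    # is still holding in the free-threaded build.
    return gc.collect()


unreachable = churn_and_collect()
print("unreachable objects found:", unreachable)
```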


mimalloc allocator vs pymalloc
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The default build normally uses the "pymalloc" memory allocator for small
allocations (512 bytes or smaller). The free-threaded build does not use
pymalloc and allocates all Python objects using the "mimalloc" allocator. The
pymalloc allocator has the following properties that help keep memory usage
low: small per-block overhead, effective prevention of memory fragmentation,
and quick return of free memory to the operating system. The mimalloc
allocator also does quite well in these respects but can have somewhat more
overhead.

In the free-threaded build, mimalloc manages memory in a number of separate
heaps (currently five). For example, all objects that support GC are allocated
from their own heap. Using separate heaps means that free memory in one heap
cannot be used for an allocation from another heap. Also, some heaps are
configured to use QSBR (quiescent state-based reclamation) when freeing the
memory that backs the heap (known as "pages" in mimalloc terminology). The
use of QSBR creates a delay between the point where all memory blocks of a
page have been freed and the point where the page is released, either for new
allocations or back to the OS.

The mimalloc allocator also defers returning freed memory to the OS. You
can reduce that delay by setting the environment variable
:envvar:`!MIMALLOC_PURGE_DELAY` to ``0``. Note that this will likely reduce
the performance of the allocator.
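
The variable has to be in the environment before the interpreter starts. One
way to arrange that from Python is to launch a child interpreter, as in the
sketch below. The helper name ``run_with_purge_delay`` is hypothetical. The
child here only echoes the variable to confirm it was passed through and does
not measure any effect on memory.

```python
import os
import subprocess
import sys


def run_with_purge_delay(code, delay="0"):
    """Run ``code`` in a child interpreter with MIMALLOC_PURGE_DELAY set."""
    env = dict(os.environ, MIMALLOC_PURGE_DELAY=delay)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )


result = run_with_purge_delay(
    "import os; print(os.environ['MIMALLOC_PURGE_DELAY'])"
)
child_value = result.stdout.strip()
print("child saw MIMALLOC_PURGE_DELAY =", child_value)
```

Setting the variable in the shell or in a service definition achieves the same
thing without the extra process.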


Free-threaded reference counting can cause objects to live longer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the default build, when an object's reference count reaches zero, it is
normally deallocated immediately. The free-threaded build uses "biased
reference counting", with a fast path for objects "owned" by the current
thread and a slow path for other objects. See :pep:`703` for additional
details. Any time an object's reference count ends up in a "queued" state,
deallocation can be deferred. The queued state is cleared from the "eval
breaker" section of the bytecode evaluator.

The free-threaded build also allows a different mode of reference counting,
known as "deferred reference counting". This mode is enabled by setting a flag
on a per-object basis. Deferred reference counting is enabled for the
following kinds of objects:

* module objects
* module top-level functions
* methods defined in the class scope
* descriptor objects
* thread-local objects, created by :class:`threading.local`

When deferred reference counting is enabled, references from Python function
stacks are not added to the reference count. This scheme reduces the overhead
of reference counting, especially for objects used from multiple threads.
Because the stack references are not counted, objects with deferred reference
counting are not immediately freed when their internal reference count goes to
zero. Instead, they are examined by the next GC run and, if no stack
references to them are found, they are freed. This means these objects are
freed by the GC and not when their reference count goes to zero, as is typical.
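
The difference in lifetime can be observed with a weak reference, as in the
following sketch. The names ``MODULE_SOURCE`` and ``function_lifetime`` are
hypothetical. The sketch assumes that a function defined by executing
module-level source is treated as a module top-level function. The value seen
before the collection depends on the build, so no output is shown.

```python
import gc
import weakref

# Source for a function defined at module top level.
MODULE_SOURCE = "def top_level():\n    return 42\n"


def function_lifetime():
    """Report whether a top-level function is alive before and after GC."""
    namespace = {}
    exec(MODULE_SOURCE, namespace)
    ref = weakref.ref(namespace["top_level"])

    # Drop the only strong reference to the function.
    del namespace["top_level"]
    alive_before_gc = ref() is not None

    # A full collection frees unreachable objects that use deferred
    # reference counting.
    gc.collect()
    alive_after_gc = ref() is not None
    return alive_before_gc, alive_after_gc


alive_before_gc, alive_after_gc = function_lifetime()
print("alive before gc.collect():", alive_before_gc)
print("alive after gc.collect():", alive_after_gc)
```

Code that relies on such objects disappearing as soon as the last reference is
dropped, for example through weak reference callbacks, may therefore behave
differently between the two builds.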


Per-thread reference counting can delay freeing objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To avoid contention on the reference count fields of frequently shared
objects, the free-threaded build also uses "per-thread reference counting"
for a few selected object types. Rather than updating a single shared
reference count, each thread maintains its own local reference count array,
indexed by a unique id assigned to the object. The true reference count is
computed, by summing the per-thread counts, only when the object's local
count drops to zero. Per-thread reference counting is currently used for:

* heap type objects (classes created in Python)
* code objects
* the ``__dict__`` of module objects

Because the per-thread counts must be merged back into the object before it
can be deallocated, objects using per-thread reference counting are
typically freed later than they would be in the default build. In
particular, such an object is usually not freed until the thread that
referenced it reaches a safe point (for example, in the "eval breaker"
section of the bytecode evaluator) or exits. Running :func:`gc.collect`
will merge the per-thread counts and allow these objects to be freed.
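
The sketch below shows a pattern for releasing a class that several threads
have used: join the threads, drop the remaining references, then run a
collection. The names ``class_lifetime``, ``Payload`` and ``worker`` are
hypothetical. Classes are normally part of reference cycles, so the collection
is needed in the default build as well, and the sketch does not show a
difference between builds.

```python
import gc
import threading
import weakref


def class_lifetime(num_threads=4):
    """Use a run-time class from several threads, then release it."""

    class Payload:  # a heap type created at run time
        pass

    def worker(cls):
        # Each thread instantiates the shared class repeatedly.
        for _ in range(1000):
            cls()

    threads = [
        threading.Thread(target=worker, args=(Payload,))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    ref = weakref.ref(Payload)
    # Drop the remaining strong references held by this function.
    del Payload, threads
    # A full collection merges any per-thread counts and frees the class.
    gc.collect()
    return ref() is None


class_freed = class_lifetime()
print("class freed after gc.collect():", class_freed)
```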