Setting the size to 0 turns the list contents into overallocated memory that the deallocator will free.
Ownership is transferred to the new tuple so no refcount adjustment is needed.
* Drop DOUBLE_IS_ARM_MIXED_ENDIAN_IEEE754 macro.
* Use DOUBLE_IS_BIG/LITTLE_ENDIAN_IEEE754 to detect the endianness of
floats/doubles.
* Drop "unknown_format" code path in PyFloat_Pack/Unpack*().
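
For illustration, the two remaining byte orders can be observed from Python. This is a minimal sketch that uses only the stdlib struct module and float.__getformat__(). It is not part of the change itself, and the value 1.5 is an arbitrary choice.

```python
import struct
import sys

# Report the double format detected when CPython was built, e.g.
# "IEEE, little-endian" or "IEEE, big-endian".
double_format = float.__getformat__("double")

# Pack the same value with explicit little- and big-endian byte order.
little = struct.pack("<d", 1.5)
big = struct.pack(">d", 1.5)

# The two encodings hold the same 8 bytes in opposite order.
assert little == big[::-1]

# Native packing matches whichever order the platform uses.
native = struct.pack("=d", 1.5)
assert native == (little if sys.byteorder == "little" else big)

print(double_format, big.hex())
```

With only these two layouts left, every pack/unpack reduces to a straight copy or a byte reversal.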
Co-authored-by: Victor Stinner <vstinner@python.org>
We already have a stop-the-world pause elsewhere in this code path
(type_set_bases) and this will make it easier to avoid contention
on the TYPE_LOCK when looking up names in the MRO hierarchy.
Also use deferred reference counting for non-immortal MROs.
Fix three issues that caused mimalloc pages to be leaked until the
owning thread exited:
1. In _PyMem_mi_page_maybe_free(), move pages out of the full queue
when relying on QSBR to defer freeing the page. Pages in the full
queue are never searched by mi_page_queue_find_free_ex(), so a page
left there is unusable for allocations.
2. Move _PyMem_mi_page_clear_qsbr() from _mi_page_free_collect() to
_mi_page_thread_free_collect() where it only fires when all blocks
on the page are free (used == 0). The previous placement was too
broad: it cleared QSBR state whenever local_free was non-NULL, but
_mi_page_free_collect() is called from non-allocation paths (e.g.,
page visiting in mi_heap_visit_blocks) where the page is not being
reused.
3. In _PyMem_mi_page_maybe_free(), use the page's heap tld to find the
correct thread state for QSBR list insertion instead of
PyThreadState_GET(). During stop-the-world pauses, the function may
process pages belonging to other threads, so the current thread
state is not necessarily the owner of the page.
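
Issue 1 above can be pictured with a toy model. The sketch below is only an illustration: the Page and Heap classes and the find_free_page() method are hypothetical names invented here and do not mirror mimalloc's real data structures. It assumes nothing more than what the text states, namely that the search for a free page never looks at the full queue.

```python
# Toy model only: the class and method names below are hypothetical
# and do not mirror mimalloc's real data structures.
from dataclasses import dataclass, field


@dataclass
class Page:
    name: str
    capacity: int
    used: int


@dataclass
class Heap:
    normal_queue: list = field(default_factory=list)  # searched on allocation
    full_queue: list = field(default_factory=list)    # never searched

    def find_free_page(self):
        # Only the normal queue is scanned, as in the search described above.
        for page in self.normal_queue:
            if page.used < page.capacity:
                return page
        return None


heap = Heap()
page = Page("p0", capacity=4, used=4)
heap.full_queue.append(page)

# All blocks are freed, but the page is left in the full queue.
page.used = 0
stuck = heap.find_free_page()   # None: the page is invisible to the search

# Moving it out of the full queue makes it reachable again.
heap.full_queue.remove(page)
heap.normal_queue.append(page)
found = heap.find_free_page()   # the page itself
```

In this model the page has free space in both cases. Only its queue membership decides whether the allocator can find it.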
Optimize memoryview comparison: a memoryview is equal to itself, so there is
no need to compare values, unless it uses a float format (a NaN item makes
the view unequal to itself).
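
The float exception can be seen with a short sketch. It uses only the stdlib array module, and the variable names are illustrative.

```python
from array import array

# A bytes-backed view: comparing it with itself is always True.
byte_view = memoryview(b"abc")
same_bytes = byte_view == byte_view

# A view over doubles containing NaN: NaN != NaN, so the view is not
# equal to itself even though it is the same object.
nan_view = memoryview(array("d", [float("nan")]))
same_nan = nan_view == nan_view

print(same_bytes, same_nan)  # prints: True False
```

This is why an identity shortcut is only safe for non-float formats.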
Benchmark comparing 1 MiB:
from timeit import timeit
with open("/dev/random", 'br') as fp:
    data = fp.read(2**20)
view = memoryview(data)
LOOPS = 1_000
b = timeit('x == x', number=LOOPS, globals={'x': data})
m = timeit('x == x', number=LOOPS, globals={'x': view})
print("bytes %f seconds" % b)
print("mview %f seconds" % m)
print("=> %f time slower" % (m / b))
Result before the change:
bytes 0.000026 seconds
mview 1.445791 seconds
=> 55660.873940 time slower
Result after the change:
bytes 0.000026 seconds
mview 0.000028 seconds
=> 1.104382 time slower
This missed optimization was discovered by Pierre-Yves David
while working on Mercurial.
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
PyDict_Contains() and PyDict_ContainsString() now fail with
SystemError if the first argument is not a dict, frozendict, dict
subclass or frozendict subclass.
PyDict_MergeFromSeq2() now fails with SystemError if the first
argument is not a dict or a dict subclass.
PyDict_Update(), PyDict_Merge() and _PyDict_MergeEx() no longer
accept frozendict.
can_modify_dict() is stricter than ASSERT_DICT_LOCKED() for
frozendict. It uses PyUnstable_Object_IsUniquelyReferenced() which
matters for free-threaded builds.
Replace anydict_setitem_take2() with setitem_take2_lock_held(). It's
no longer useful to have two functions.