Commit graph

6423 commits

Author SHA1 Message Date
Sam Gross
19f96f99fe
gh-148393: Use acquire load for _ma_watcher_tag in dict notify event (gh-148509)
The watcher-bits read in _PyDict_NotifyEvent needs to use acquire to
synchronize with the release from PyDict_Watch so that the callback
publication is visible before the callback is invoked.
2026-04-13 14:11:28 -04:00
Kumar Aditya
88e378cc1c
gh-131798: optimize through keyword and bound method calls in the JIT (GH-148466)
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2026-04-13 21:14:48 +08:00
Sacul
18d7d90ef9
gh-131798: Split _CHECK_AND_ALLOCATE_OBJECT into smaller uops (GH-148433)
Co-authored-by: Hai Zhu <haiizhu@outlook.com>
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2026-04-13 02:31:24 +08:00
Ken Jin
6f7bb297db
gh-146261: JIT: protect against function version changes (#146300)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
2026-04-13 02:23:47 +08:00
Sam Gross
3ab94d6842
gh-148393: Use atomic ops on _ma_watcher_tag in free threading build (gh-148397)
Fixes data races between dict mutation and watch/unwatch on the same dict.
2026-04-12 10:40:41 -04:00
Gregory P. Smith
64afa947f4
gh-146302: make Py_IsInitialized() thread-safe and reflect true init completion (GH-146303)
## Summary

- Move the `runtime->initialized = 1` store from before `site.py` import to the end of `init_interp_main()`, so `Py_IsInitialized()` only returns true after initialization has fully completed
- Access `initialized` and `core_initialized` through new inline accessors using acquire/release atomics, to also protect from data race undefined behavior
- `PySys_AddAuditHook()` now uses the accessor, so with the flag move it correctly skips audit hook invocation during all init phases (matching the documented "after runtime initialization" behavior) ... We could argue that running these earlier would be good even if the intent was never explicitly expressed, but that'd be its own issue.

## Motivation

`Py_IsInitialized()` returned 1 while `Py_InitializeEx()` was still running — specifically, before `site.py` had been imported. See https://github.com/PyO3/pyo3/issues/5900 where a second thread could acquire the GIL and start executing Python with an incomplete `sys.path` because `site.py` hadn't finished.

The flag was also a plain `int` with no atomic operations, making concurrent reads a C-standard data race, though unlikely to manifest.

## Regression test:

The added test properly fails on `main` with `ERROR: Py_IsInitialized() was true during site import`.

---

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:54:23 +00:00
Pieter Eendebak
1c89817f51
gh-148276: Optimize object creation and method calls in the JIT by resolving __init__ at trace optimization time (GH-148277)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2026-04-11 22:22:42 +08:00
Donghee Na
a71b043356
gh-148171: Convert CALL_BUILTIN_CLASS to leave arguments on the stack (gh-148381) 2026-04-11 23:01:25 +09:00
Neko Asakura
9831dea3bf
gh-148211: decompose _POP_TWO/_POP_CALL(_ONE/_TWO) in JIT (GH-148377) 2026-04-11 20:46:56 +08:00
Neko Asakura
72006a71b2
gh-148211: decompose [_POP_TWO/_INSERT_2]_LOAD_CONST_INLINE_BORROW in JIT (GH-148357) 2026-04-11 18:27:51 +08:00
Sacul
e872c19922
gh-148171: Convert variadic argument opcodes to leave their arguments on the stack (CALL_BUILTIN_FAST_WITH_KEYWORDS) (#148366) 2026-04-11 13:27:24 +08:00
Kumar Aditya
2b439da972
gh-148171: convert more variadic uops to leave input on stack in JIT (#148361) 2026-04-11 10:29:38 +05:30
Kumar Aditya
8f17140fc1
gh-131798: split _CALL_BUILTIN_CLASS to smaller uops (#148094) 2026-04-10 17:28:20 +00:00
Ken Jin
266247c9a6
gh-148171: Convert CALL_BUILTIN_FAST to leave inputs on the stack for refcount elimination in JIT (GH-148172) 2026-04-10 23:11:18 +08:00
Neko Asakura
aea0b91d65
gh-148211: decompose [_POP_CALL_X/_SHUFFLE_2]_LOAD_CONST_INLINE_BORROW in JIT (GH-148313) 2026-04-10 21:57:01 +08:00
Kumar Aditya
d3b7b93cbb
gh-148037: remove critical section from PyCode_Addr2Line (#148103) 2026-04-10 17:58:17 +05:30
Neko Asakura
0f49232664
gh-148211: decompose _INSERT_1_LOAD_CONST_INLINE(_BORROW) in JIT (GH-148283) 2026-04-10 00:45:39 +08:00
Sacul
38d3aef375
gh-134584 : Optimize and eliminate redundant ref-counting for MAKE_FUNCTION in the JIT (GH-144963) 2026-04-09 22:22:53 +08:00
Kumar Aditya
458aca9237
gh-131798: fold super method lookups in JIT (#148231) 2026-04-09 13:25:01 +05:30
Sacul
bb03c8bd02
gh-145866: Convert _CALL_METHOD_DESCRIPTOR_NOARGS to leave its inputs on the stack to be cleaned up by _POP_TOP (GH-148227) 2026-04-08 23:21:37 +08:00
Neko Asakura
756358524e
gh-148235: remove duplicate uops _LOAD_CONST_UNDER_INLINE(_BORROW) in JIT (GH-148236) 2026-04-08 16:22:59 +08:00
Petr Viktorin
8923ca418c
gh-145921: Add "_DuringGC" functions for tp_traverse (GH-145925)
There are newly documented restrictions on tp_traverse:

    The traversal function must not have any side effects.
    It must not modify the reference counts of any Python
    objects nor create or destroy any Python objects.

* Add several functions that are guaranteed side-effect-free,
  with a _DuringGC suffix.
* Use these in ctypes
* Consolidate tp_traverse docs in gcsupport.rst, moving unique
  content from typeobj.rst there

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2026-04-08 09:15:11 +02:00
Hugo van Kemenade
8166492c29 Post 3.15.0a8 2026-04-07 20:32:33 +03:00
Hugo van Kemenade
55ea59e7dc Python 3.15.0a8 2026-04-07 14:24:03 +03:00
Petr Viktorin
fbc1a5b076
gh-146636: abi3t: Define Py_GIL_DISABLED but do not use it (GH-148142)
When compiling for abi3t, define Py_GIL_DISABLED, so that users who
check it to enable additional locking aren't broken.

But also avoid using Py_GIL_DISABLED in Python headers themselves
-- abi3 and abi3t ought to be the same except
the _Py_OPAQUE_PYOBJECT differences.

A check for this is coming in a later PR.
It will require rewriting some preprocessor conditions, some of these
changes are included in this PR.
For _Py_IsOwnedByCurrentThread & supporting functions
I opted to move them to a cpython/ header, as they're rather self-contained.
2026-04-07 09:06:17 +02:00
Tim Peters
3df0c4da4c
Note out-of-date obmalloc comments (#148149)
Vladimir's original overviews, from 1998, are still good, but going
on 30 years later details have changed. Note that, but rather try
to keep up with moving targets in a different file, point to
sys._debugmallocstats() as the sure way to discover precise current
details.

No code changes, just added a block comment.

Co-authored-by: Bartosz Sławecki <bartosz@ilikepython.com>
2026-04-06 23:08:47 -05:00
Junya Fukuda
3d724dd914
gh-148072: Cache pickle.dumps/loads per interpreter in XIData (GH-148125)
Store references to pickle.dumps and pickle.loads in _PyXI_state_t
so they are looked up only once per interpreter lifetime, avoiding
repeated PyImport_ImportModuleAttrString calls on every cross-interpreter
data transfer via pickle fallback.

Benchmarks show 1.7x-3.3x speedup for InterpreterPoolExecutor
when transferring mutable types (list, dict) through XIData.
2026-04-06 11:37:02 -04:00
Pieter Eendebak
efda60e2ec
gh-100239: Propagate type info through _BINARY_OP_EXTEND in tier 2 (GH-148146) 2026-04-06 20:52:42 +08:00
Pablo Galindo Salgado
fbfc6ccb0a
gh-148144: Initialize visited on copied interpreter frames (#148143)
_PyFrame_Copy() copied interpreter frames into generator and
frame-object storage without initializing the visited byte. Incremental
GC later reads frame->visited in mark_stacks() on non-start passes, so
copied frames could expose an uninitialized value once they became live
on a thread stack again.

Reset visited when copying a frame so copied frames start with defined
GC bookkeeping state. Preserve lltrace in Py_DEBUG builds.
2026-04-06 00:23:07 +01:00
Pablo Galindo Salgado
77fc2f5a5e
gh-144319: Fix huge page leak in datastack chunk allocator (#147963)
Fix huge page leak in datastack chunk allocator

The original fix rounded datastack chunk allocations in pystate.c so that
_PyObject_VirtualFree() would receive the full huge page mapping size.

Change direction and move that logic into _PyObject_VirtualAlloc() and
_PyObject_VirtualFree() instead. The key invariant is that munmap() must see
the full mapped size, so alloc and free now apply the same platform-specific
rounding in the allocator layer.

This keeps _PyStackChunk bookkeeping in requested-size units, avoids a
hardcoded 2 MB assumption, and also covers other small virtual-memory users
such as the JIT tracer state allocation in optimizer.c.
2026-04-05 16:29:38 +01:00
Serhiy Storchaka
8bf8bf9292
gh-73613: Support Base32 and Base64 without padding (GH-147974)
Add the padded parameter in functions related to Base32 and Base64 codecs
in the binascii and base64 modules.  In the encoding functions it controls
whether the pad character can be added in the output, in the decoding
functions it controls whether padding is required in input.

Padding of input no longer required in base64.urlsafe_b64decode() by default.
2026-04-04 21:26:16 +03:00
Pablo Galindo Salgado
21fb9dc71d
gh-146527: Heap-allocate gc_stats to avoid bloating PyInterpreterState (#148057)
The gc_stats struct contains ring buffers of gc_generation_stats
entries (11 young + 3×2 old on default builds). Embedding it inline
in _gc_runtime_state, which is itself inline in PyInterpreterState,
pushed fields like _gil.locked and threads.head to offsets beyond
what out-of-process profilers and debuggers can reasonably read in
a single buffer (e.g. offset 9384 for _gil.locked vs an 8 KiB read
buffer).

Heap-allocate generation_stats via PyMem_RawCalloc in _PyGC_Init and
free it in _PyGC_Fini. This shrinks PyInterpreterState by ~1.6 KiB
and keeps the GIL, thread-list, and other frequently-inspected fields
at stable, low offsets.
2026-04-04 18:42:30 +01:00
Donghee Na
853dafe23a
gh-148083: Prevent constant folding when lhs is container types (gh-148090) 2026-04-05 00:40:12 +09:00
Ken Jin
328da67b2c
gh-146073: Revert "gh-146073: Add fitness/exit quality mechanism for JIT trace frontend (GH-147966)" (#148082)
This reverts commit 198b04b75f.
2026-04-04 11:56:40 +00:00
Kumar Aditya
e7bf8eac0f
gh-131798: split recursion check to _CHECK_RECURSION_LIMIT and combine checks (GH-148070) 2026-04-04 17:23:03 +08:00
Kumar Aditya
7e275d4965
gh-131798: JIT inline function addresses of builtin methods (#146906) 2026-04-04 09:12:13 +05:30
Hai Zhu
198b04b75f
gh-146073: Add fitness/exit quality mechanism for JIT trace frontend (GH-147966) 2026-04-03 23:54:30 +08:00
Pieter Eendebak
48317feec8
gh-146640: Optimize int operations by mutating uniquely-referenced operands in place (JIT only) (GH-146641) 2026-04-03 23:23:04 +08:00
Petr Viktorin
9b08f8c56f
GH-126910: Revert "Make _Py_get_machine_stack_pointer return the stack pointer (#147945)" (GH-147994)
Revert "GH-126910: Make `_Py_get_machine_stack_pointer` return the stack pointer (#147945)"

This reverts commit 255026d9ee,
which broke a tier-1 buildbot.
2026-04-02 16:53:09 +02:00
Victor Stinner
c1a4112c22
gh-147988: Initialize digits in long_alloc() in debug mode (#147989)
When Python is built in debug mode:

* long_alloc() now initializes digits with a pattern to detect usage of
  uninitialized digits.
* _PyLong_CompactValue() now makes sure that the digit is zero when the
  sign is zero.
* PyLongWriter_Finish() now raises SystemError if it detects uninitialized
  digits

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2026-04-02 11:55:34 +00:00
Sergey B Kirpichev
b456cb25a9
gh-143050: Add helper _PyLong_InitTag() (#147956)
With this we can assume, that _PyLong_SetSignAndDigitCount() and
_PyLong_SetDigitCount() operate on non-immortal integers.

Co-authored-by: Victor Stinner <vstinner@python.org>
2026-04-01 21:42:10 +00:00
Mark Shannon
255026d9ee
GH-126910: Make _Py_get_machine_stack_pointer return the stack pointer (#147945)
* Make _Py_get_machine_stack_pointer return the stack pointer (or close to it), not the frame pointer

* Make ``_Py_ReachedRecursionLimit`` inline again
* Remove ``_Py_MakeRecCheck`` relacing its use with ``_Py_ReachedRecursionLimit``
* Move stack swtiching check into ``_Py_CheckRecursiveCall``
2026-04-01 17:15:13 +01:00
Petr Viktorin
2452324001
gh-146636: PEP 803: add Py_TARGET_ABI3T and .abi3t.so extension (GH-146637)
- Add Py_TARGET_ABI3T macro.
- Add ".abi3t.so" to importlib EXTENSION_SUFFIXES.
- Remove ".abi3.so" from importlib EXTENSION_SUFFIXES on Free Threading.
- Adjust tests

This is part of the implementation for PEP-803.
Detailed documentation to come later.

Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>
2026-04-01 16:14:59 +02:00
Serhiy Storchaka
473d2a35ce
gh-147944: Increase range of bytes_per_sep (GH-147946)
Accepted range for the bytes_per_sep argument of bytes.hex(),
bytearray.hex(), memoryview.hex(), and binascii.b2a_hex()
is now increased, so passing sys.maxsize and -sys.maxsize is now
valid.
2026-04-01 08:33:30 +00:00
Sergey B Kirpichev
db5936c5b8
gh-143050: Correct PyLong_FromString() to use _PyLong_Negate() (#145901)
The long_from_string_base() might return a small integer, when the
_pylong.py is used to do conversion.  Hence, we must be careful here to
not smash it "small int" bit by using the _PyLong_FlipSign().

Co-authored-by: Victor Stinner <vstinner@python.org>
2026-03-31 13:17:49 +00:00
Kumar Aditya
8e9d21c64b
gh-146558: JIT optimize dict access for objects with known hash (#146559) 2026-03-30 14:23:29 +00:00
Victor Stinner
adf2c47911
gh-126835: Fix _PY_IS_SMALL_INT() macro (#146631) 2026-03-30 12:48:18 +00:00
Serhiy Storchaka
6932c3ee6a
gh-145876: Do not mask KeyErrors raised during dictionary unpacking in call (GH-146472)
KeyErrors raised in keys() or __getitem__() during dictionary unpacking
in call (func(**mymapping)) are no longer masked by TypeError.
2026-03-29 11:58:52 +03:00
Sergey Miryanov
5bf3a31bc2
GH-146527: Add more data to GC statistics and add it to PyDebugOffsets (#146532) 2026-03-28 18:52:10 +00:00
Ken Jin
1384f025f5
gh-126910: Verify that JIT stencils preserve frame pointer (GH-146524) 2026-03-28 03:38:54 +08:00