Commit graph

6435 commits

Author SHA1 Message Date
Dino Viehland
2a07ff980b
gh-148659: Export some internal functions for the JIT (PEP-523) (#148634)
Export (as internal functions, not public ones) C API functions necessary to implement a JIT as a separate extension module.
2026-04-17 01:55:03 +02:00
Dino Viehland
c0af5c024b
gh-146031: Allow keeping specialization enabled when specifying eval frame function (#146032)
Allow keeping specialization enabled when specifying eval frame function
2026-04-16 09:44:26 -07:00
Mark Shannon
600f4dbd54
GH-145668: Add FOR_ITER specialization for virtual iterators. Specialize GET_ITER. (GH-147967)
* Add FOR_ITER_VIRTUAL to specialize FOR_ITER for virtual iterators
* Add GET_ITER_SELF to specialize GET_ITER for iterators (including generators)
* Add GET_ITER_VIRTUAL to specialize GET_ITER for iterables as virtual iterators
* Add new (internal) _tp_iteritem function slot to PyTypeObject
* Put limited RESUME at start of genexpr for free-threading. Fix up exception handling in genexpr
2026-04-16 15:22:22 +01:00
Pieter Eendebak
1f6a09fb36
gh-100239: Specialize more binary operations using BINARY_OP_EXTEND (GH-128956) 2026-04-16 09:22:41 +01:00
Jelle Zijlstra
5b8cd314e2
gh-137814: Fix __qualname__ of __annotate__ (#137842) 2026-04-15 21:52:30 -07:00
Pieter Eendebak
5c3decad66
gh-146393: Use recorded type instead of instance in BINARY_OP (#148569) 2026-04-14 20:11:42 +00:00
Pieter Eendebak
95cbd4a232
gh-146393: Optimize float division operations by mutating uniquely-referenced operands in place (JIT only) (GH-146397)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:08:04 +08:00
Bénédikt Tran
356a031de5
gh-146563: add exception note for invalid Expat handler return values (#146565) 2026-04-14 19:12:47 +02:00
Kumar Aditya
3cb7eaec85
gh-131798: constant fold special method lookups in JIT (#148432) 2026-04-14 16:02:23 +00:00
Kumar Aditya
1aa7e7ee6d
gh-gh-131798: optimize LOAD_ATTR_GETATTRIBUTE_OVERRIDDEN in the JIT (#148555) 2026-04-14 21:00:32 +05:30
Neko Asakura
52a7f1b7f8
gh-148510: restore func_version check in _LOAD_ATTR_PROPERTY_FRAME (GH-148528) 2026-04-14 22:44:39 +08:00
Hai Zhu
5ce0fe8b6c
gh-148378: Allow multiple consecutive recording ops per macro op (GH-148496) 2026-04-14 19:26:53 +08:00
Sam Gross
19f96f99fe
gh-148393: Use acquire load for _ma_watcher_tag in dict notify event (gh-148509)
The watcher-bits read in _PyDict_NotifyEvent needs to use acquire to
synchronize with the release from PyDict_Watch so that the callback
publication is visible before the callback is invoked.
2026-04-13 14:11:28 -04:00
Kumar Aditya
88e378cc1c
gh-131798: optimize through keyword and bound method calls in the JIT (GH-148466)
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2026-04-13 21:14:48 +08:00
Sacul
18d7d90ef9
gh-131798: Split _CHECK_AND_ALLOCATE_OBJECT into smaller uops (GH-148433)
Co-authored-by: Hai Zhu <haiizhu@outlook.com>
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2026-04-13 02:31:24 +08:00
Ken Jin
6f7bb297db
gh-146261: JIT: protect against function version changes (#146300)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
2026-04-13 02:23:47 +08:00
Sam Gross
3ab94d6842
gh-148393: Use atomic ops on _ma_watcher_tag in free threading build (gh-148397)
Fixes data races between dict mutation and watch/unwatch on the same dict.
2026-04-12 10:40:41 -04:00
Gregory P. Smith
64afa947f4
gh-146302: make Py_IsInitialized() thread-safe and reflect true init completion (GH-146303)
## Summary

- Move the `runtime->initialized = 1` store from before `site.py` import to the end of `init_interp_main()`, so `Py_IsInitialized()` only returns true after initialization has fully completed
- Access `initialized` and `core_initialized` through new inline accessors using acquire/release atomics, to also protect from data race undefined behavior
- `PySys_AddAuditHook()` now uses the accessor, so with the flag move it correctly skips audit hook invocation during all init phases (matching the documented "after runtime initialization" behavior) ... We could argue that running these earlier would be good even if the intent was never explicitly expressed, but that'd be its own issue.

## Motivation

`Py_IsInitialized()` returned 1 while `Py_InitializeEx()` was still running — specifically, before `site.py` had been imported. See https://github.com/PyO3/pyo3/issues/5900 where a second thread could acquire the GIL and start executing Python with an incomplete `sys.path` because `site.py` hadn't finished.

The flag was also a plain `int` with no atomic operations, making concurrent reads a C-standard data race, though unlikely to manifest.

## Regression test:

The added test properly fails on `main` with `ERROR: Py_IsInitialized() was true during site import`.

---

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:54:23 +00:00
Pieter Eendebak
1c89817f51
gh-148276: Optimize object creation and method calls in the JIT by resolving __init__ at trace optimization time (GH-148277)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2026-04-11 22:22:42 +08:00
Donghee Na
a71b043356
gh-148171: Convert CALL_BUILTIN_CLASS to leave arguments on the stack (gh-148381) 2026-04-11 23:01:25 +09:00
Neko Asakura
9831dea3bf
gh-148211: decompose _POP_TWO/_POP_CALL(_ONE/_TWO) in JIT (GH-148377) 2026-04-11 20:46:56 +08:00
Neko Asakura
72006a71b2
gh-148211: decompose [_POP_TWO/_INSERT_2]_LOAD_CONST_INLINE_BORROW in JIT (GH-148357) 2026-04-11 18:27:51 +08:00
Sacul
e872c19922
gh-148171: Convert variadic argument opcodes to leave their arguments on the stack (CALL_BUILTIN_FAST_WITH_KEYWORDS) (#148366) 2026-04-11 13:27:24 +08:00
Kumar Aditya
2b439da972
gh-148171: convert more variadic uops to leave input on stack in JIT (#148361) 2026-04-11 10:29:38 +05:30
Kumar Aditya
8f17140fc1
gh-131798: split _CALL_BUILTIN_CLASS to smaller uops (#148094) 2026-04-10 17:28:20 +00:00
Ken Jin
266247c9a6
gh-148171: Convert CALL_BUILTIN_FAST to leave inputs on the stack for refcount elimination in JIT (GH-148172) 2026-04-10 23:11:18 +08:00
Neko Asakura
aea0b91d65
gh-148211: decompose [_POP_CALL_X/_SHUFFLE_2]_LOAD_CONST_INLINE_BORROW in JIT (GH-148313) 2026-04-10 21:57:01 +08:00
Kumar Aditya
d3b7b93cbb
gh-148037: remove critical section from PyCode_Addr2Line (#148103) 2026-04-10 17:58:17 +05:30
Neko Asakura
0f49232664
gh-148211: decompose _INSERT_1_LOAD_CONST_INLINE(_BORROW) in JIT (GH-148283) 2026-04-10 00:45:39 +08:00
Sacul
38d3aef375
gh-134584 : Optimize and eliminate redundant ref-counting for MAKE_FUNCTION in the JIT (GH-144963) 2026-04-09 22:22:53 +08:00
Kumar Aditya
458aca9237
gh-131798: fold super method lookups in JIT (#148231) 2026-04-09 13:25:01 +05:30
Sacul
bb03c8bd02
gh-145866: Convert _CALL_METHOD_DESCRIPTOR_NOARGS to leave its inputs on the stack to be cleaned up by _POP_TOP (GH-148227) 2026-04-08 23:21:37 +08:00
Neko Asakura
756358524e
gh-148235: remove duplicate uops _LOAD_CONST_UNDER_INLINE(_BORROW) in JIT (GH-148236) 2026-04-08 16:22:59 +08:00
Petr Viktorin
8923ca418c
gh-145921: Add "_DuringGC" functions for tp_traverse (GH-145925)
There are newly documented restrictions on tp_traverse:

    The traversal function must not have any side effects.
    It must not modify the reference counts of any Python
    objects nor create or destroy any Python objects.

* Add several functions that are guaranteed side-effect-free,
  with a _DuringGC suffix.
* Use these in ctypes
* Consolidate tp_traverse docs in gcsupport.rst, moving unique
  content from typeobj.rst there

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2026-04-08 09:15:11 +02:00
Hugo van Kemenade
8166492c29 Post 3.15.0a8 2026-04-07 20:32:33 +03:00
Hugo van Kemenade
55ea59e7dc Python 3.15.0a8 2026-04-07 14:24:03 +03:00
Petr Viktorin
fbc1a5b076
gh-146636: abi3t: Define Py_GIL_DISABLED but do not use it (GH-148142)
When compiling for abi3t, define Py_GIL_DISABLED, so that users who
check it to enable additional locking aren't broken.

But also avoid using Py_GIL_DISABLED in Python headers themselves
-- abi3 and abi3t ought to be the same except
the _Py_OPAQUE_PYOBJECT differences.

A check for this is coming in a later PR.
It will require rewriting some preprocessor conditions, some of these
changes are included in this PR.
For _Py_IsOwnedByCurrentThread & supporting functions
I opted to move them to a cpython/ header, as they're rather self-contained.
2026-04-07 09:06:17 +02:00
Tim Peters
3df0c4da4c
Note out-of-date obmalloc comments (#148149)
Vladimir's original overviews, from 1998, are still good, but going
on 30 years later details have changed. Note that, but rather try
to keep up with moving targets in a different file, point to
sys._debugmallocstats() as the sure way to discover precise current
details.

No code changes, just added a block comment.

Co-authored-by: Bartosz Sławecki <bartosz@ilikepython.com>
2026-04-06 23:08:47 -05:00
Junya Fukuda
3d724dd914
gh-148072: Cache pickle.dumps/loads per interpreter in XIData (GH-148125)
Store references to pickle.dumps and pickle.loads in _PyXI_state_t
so they are looked up only once per interpreter lifetime, avoiding
repeated PyImport_ImportModuleAttrString calls on every cross-interpreter
data transfer via pickle fallback.

Benchmarks show 1.7x-3.3x speedup for InterpreterPoolExecutor
when transferring mutable types (list, dict) through XIData.
2026-04-06 11:37:02 -04:00
Pieter Eendebak
efda60e2ec
gh-100239: Propagate type info through _BINARY_OP_EXTEND in tier 2 (GH-148146) 2026-04-06 20:52:42 +08:00
Pablo Galindo Salgado
fbfc6ccb0a
gh-148144: Initialize visited on copied interpreter frames (#148143)
_PyFrame_Copy() copied interpreter frames into generator and
frame-object storage without initializing the visited byte. Incremental
GC later reads frame->visited in mark_stacks() on non-start passes, so
copied frames could expose an uninitialized value once they became live
on a thread stack again.

Reset visited when copying a frame so copied frames start with defined
GC bookkeeping state. Preserve lltrace in Py_DEBUG builds.
2026-04-06 00:23:07 +01:00
Pablo Galindo Salgado
77fc2f5a5e
gh-144319: Fix huge page leak in datastack chunk allocator (#147963)
Fix huge page leak in datastack chunk allocator

The original fix rounded datastack chunk allocations in pystate.c so that
_PyObject_VirtualFree() would receive the full huge page mapping size.

Change direction and move that logic into _PyObject_VirtualAlloc() and
_PyObject_VirtualFree() instead. The key invariant is that munmap() must see
the full mapped size, so alloc and free now apply the same platform-specific
rounding in the allocator layer.

This keeps _PyStackChunk bookkeeping in requested-size units, avoids a
hardcoded 2 MB assumption, and also covers other small virtual-memory users
such as the JIT tracer state allocation in optimizer.c.
2026-04-05 16:29:38 +01:00
Serhiy Storchaka
8bf8bf9292
gh-73613: Support Base32 and Base64 without padding (GH-147974)
Add the padded parameter in functions related to Base32 and Base64 codecs
in the binascii and base64 modules.  In the encoding functions it controls
whether the pad character can be added in the output, in the decoding
functions it controls whether padding is required in input.

Padding of input no longer required in base64.urlsafe_b64decode() by default.
2026-04-04 21:26:16 +03:00
Pablo Galindo Salgado
21fb9dc71d
gh-146527: Heap-allocate gc_stats to avoid bloating PyInterpreterState (#148057)
The gc_stats struct contains ring buffers of gc_generation_stats
entries (11 young + 3×2 old on default builds). Embedding it inline
in _gc_runtime_state, which is itself inline in PyInterpreterState,
pushed fields like _gil.locked and threads.head to offsets beyond
what out-of-process profilers and debuggers can reasonably read in
a single buffer (e.g. offset 9384 for _gil.locked vs an 8 KiB read
buffer).

Heap-allocate generation_stats via PyMem_RawCalloc in _PyGC_Init and
free it in _PyGC_Fini. This shrinks PyInterpreterState by ~1.6 KiB
and keeps the GIL, thread-list, and other frequently-inspected fields
at stable, low offsets.
2026-04-04 18:42:30 +01:00
Donghee Na
853dafe23a
gh-148083: Prevent constant folding when lhs is container types (gh-148090) 2026-04-05 00:40:12 +09:00
Ken Jin
328da67b2c
gh-146073: Revert "gh-146073: Add fitness/exit quality mechanism for JIT trace frontend (GH-147966)" (#148082)
This reverts commit 198b04b75f.
2026-04-04 11:56:40 +00:00
Kumar Aditya
e7bf8eac0f
gh-131798: split recursion check to _CHECK_RECURSION_LIMIT and combine checks (GH-148070) 2026-04-04 17:23:03 +08:00
Kumar Aditya
7e275d4965
gh-131798: JIT inline function addresses of builtin methods (#146906) 2026-04-04 09:12:13 +05:30
Hai Zhu
198b04b75f
gh-146073: Add fitness/exit quality mechanism for JIT trace frontend (GH-147966) 2026-04-03 23:54:30 +08:00
Pieter Eendebak
48317feec8
gh-146640: Optimize int operations by mutating uniquely-referenced operands in place (JIT only) (GH-146641) 2026-04-03 23:23:04 +08:00