10027 commits

Author SHA1 Message Date
Nadeshiko Manju
a154c9ed4e
gh-134584: Eliminate redundant refcounting from _CALL_STR_1 (GH-136070)
Signed-off-by: Manjusaka <me@manjusaka.me>
2025-12-14 09:33:05 +00:00
Ken Jin
e02a35c365
gh-134584: Cleanups for GH-135860 (GH-142604) 2025-12-13 14:38:10 +00:00
Petr Viktorin
15313dd3d7
gh-140550: Correct error message for PyModExport (PEP 793) hook (GH-142583) 2025-12-12 17:48:43 +01:00
Ken Jin
a3a611b042
gh-134584: Revert partially GH-135860 (GH-142620) 2025-12-12 14:04:11 +00:00
Victor Stinner
e0bca091a4
gh-142627: Ignore anonymous mappings in Linux remote debugging (#142628) 2025-12-12 13:12:11 +00:00
wangjingcun
2a820e2b9c
fix typos in crossinterp.c and qsbr.c (#142612) 2025-12-12 11:48:20 +05:30
AZero13
9fe6e3ed36
gh-142571: Check for errors before calling each syscall in PyUnstable_CopyPerfMapFile() (#142460)
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2025-12-11 21:18:52 +00:00
Neil Schemenauer
e38967ed60
gh-142531: Fix free-threaded GC performance regression (gh-142562)
If there are many untracked tuples, the GC will run too often, resulting
in poor performance.  The fix is to include untracked tuples in the
"long lived" object count. The number of frozen objects is also now
included since the free-threaded GC must scan those too.
2025-12-11 12:30:56 -08:00
Brett Cannon
af185727b2
GH-65961: Stop setting __cached__ on modules (GH-142165) 2025-12-11 11:44:46 -08:00
Donghee Na
a27538540e
gh-134584: Eliminate redundant refcounting from `_CALL_LEN` (gh-136104) 2025-12-11 15:24:34 +00:00
Noam Cohen
a78f43b001
gh-134584: Eliminate redundant refcounting from _CALL_TUPLE_1 (GH-135860) 2025-12-11 14:31:28 +00:00
Mark Shannon
4eab90f4f3
GH-140683: JIT: Improve machine code for loading smaller constants on AArch64. (GH-142511)
* Use movz and movk instructions for loading 16- and 32-bit operands and oparg.
* Loading of 64-bit operands is unchanged.
2025-12-11 12:33:39 +00:00
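The movz/movk approach named in the commit above (GH-142511) can be sketched as follows. This is only an illustration of the instruction pattern, not code from the CPython JIT: the helper name `load_constant` and the register `x8` are hypothetical, and the sketch covers unsigned values of up to 32 bits.

```python
def load_constant(value: int, reg: str = "x8") -> list[str]:
    """Return AArch64 assembly that loads an unsigned constant of up to 32 bits.

    Hypothetical helper for illustration only; not taken from the CPython JIT.
    """
    if not 0 <= value < 1 << 32:
        raise ValueError("this sketch only handles 0 <= value < 2**32")
    low = value & 0xFFFF
    high = (value >> 16) & 0xFFFF
    # movz writes one 16-bit chunk and zeroes the rest of the register.
    instructions = [f"movz {reg}, #{low:#x}"]
    if high:
        # movk overwrites a single 16-bit chunk and keeps the other bits.
        instructions.append(f"movk {reg}, #{high:#x}, lsl #16")
    return instructions
```

A 16-bit value needs a single movz, while a 32-bit value needs a movz followed by one movk.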
Mark Shannon
469f191a85
GH-135379: Top of stack caching for the JIT. (GH-135465)
Uses three registers to cache values at the top of the evaluation stack.
This significantly reduces memory traffic for smaller, more common uops.
2025-12-11 10:32:52 +00:00
Ken Jin
97e19014dd
gh-137007: Track executor before any possible deallocations (GH-137016) 2025-12-11 05:09:56 +08:00
Ken Jin
ebf3427615
gh-141976: Protect against non-progressing specializations in tracing JIT (GH-141989) 2025-12-10 19:39:11 +00:00
Diego Russo
46295677a1
GH-142305: JIT: Deduplicating GOT symbols in the trace (#142316) 2025-12-10 16:04:04 +00:00
Kevin Wang
49b1fb43f6
gh-142048: Fix lost gc allocations count on thread cleanup (#142233) 2025-12-10 07:29:40 +00:00
dr-carlos
70671267c1
gh-142029: Raise ValueError instead of crashing on empty name given to create_builtin() (#142033)
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-12-10 12:01:57 +05:30
Ken Jin
97f0a1f203
gh-142276: Watch attribute loads when promoting JIT constants (GH-142303)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Savannah Ostrowski <savannah@python.org>
2025-12-08 18:03:15 +00:00
Mark Shannon
e0451ceef8
GH-139757: JIT: Remove redundant branches to jumps in the assembly optimizer (GH-140800)
JIT: Remove redundant branches to jump in the assembly optimizer

* Refactor JIT assembly optimizer, making instructions instances, not just strings
* Remove redundant jumps and branches where legal to do so
* Modify _BINARY_OP_SUBSCR_STR_INT to avoid excessive inlining depth
2025-12-08 17:57:11 +00:00
Donghee Na
c4ccaf4b10
gh-141770: Annotate anonymous mmap usage if "-X dev" is used (gh-142079) 2025-12-08 14:47:19 +00:00
Pablo Galindo Salgado
d6d850df89
gh-138122: Don't sample partial frame chains (#141912) 2025-12-07 15:53:48 +00:00
Pablo Galindo Salgado
572c780aa8
gh-138122: Implement frame caching in RemoteUnwinder to reduce memory reads (#142137)
This PR implements frame caching in the RemoteUnwinder class to significantly reduce memory reads when profiling remote processes with deep call stacks.

When cache_frames=True, the unwinder stores the frame chain from each sample and reuses unchanged portions in subsequent samples. Since most profiling samples capture similar call stacks (especially the parent frames), this optimization avoids repeatedly reading the same frame data from the target process.

The implementation adds a last_profiled_frame field to the thread state that tracks where the previous sample stopped. On the next sample, if the current frame chain reaches this marker, the cached frames from that point onward are reused instead of being re-read from remote memory.

The sampling profiler now enables frame caching by default.
2025-12-06 22:37:34 +00:00
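The frame-caching idea in the commit above (#142137) can be modeled with a toy unwinder. This is a simplified sketch, not the RemoteUnwinder implementation: `CachingUnwinder` and `read_frame` are hypothetical names, and the sketch assumes that a frame address seen in the previous sample still refers to the same, unchanged frame. The real implementation tracks this with the last_profiled_frame marker in the thread state.

```python
from typing import Callable, Optional

# (function name, address of the parent frame or None for the outermost frame)
Frame = tuple[str, Optional[int]]


class CachingUnwinder:
    """Toy model of frame caching; every name here is hypothetical."""

    def __init__(self, read_frame: Callable[[int], Frame]) -> None:
        # read_frame stands in for an expensive read from the target process.
        self.read_frame = read_frame
        # Chain from the previous sample, innermost frame first.
        self.cache: list[tuple[int, str]] = []

    def sample(self, top: Optional[int]) -> list[str]:
        cached_index = {addr: i for i, (addr, _) in enumerate(self.cache)}
        chain: list[tuple[int, str]] = []
        addr = top
        while addr is not None:
            if addr in cached_index:
                # Reached a frame from the previous sample: reuse it and all
                # of its parents without reading remote memory again.
                chain.extend(self.cache[cached_index[addr]:])
                break
            name, parent = self.read_frame(addr)
            chain.append((addr, name))
            addr = parent
        self.cache = chain
        return [name for _, name in chain]
```

When two consecutive samples share their parent frames, the second sample reads only the frames above the shared part.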
Kir Chou
35142b18ae
gh-142168: explicitly initialize stack_array in _PyEval_Vector and _PyEvalFramePushAndInit_Ex (#142192)
Co-authored-by: Kir Chou <note351@hotmail.com>
2025-12-06 19:59:52 +01:00
Serhiy Storchaka
706fdda8b3
gh-141370: Fix undefined behavior when using Py_ABS() (GH-141548)
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
2025-12-05 16:24:35 +02:00
Ken Jin
b3bf212898
gh-141976: Check stack bounds in JIT optimizer (GH-142201) 2025-12-04 20:28:08 +00:00
Mark Shannon
6825d5c11d
GH-139757: Fix reference leaks introduced in GH-140800 (GH-142257) 2025-12-04 12:27:15 +00:00
Mark Shannon
62423c9c36
GH-141794: Limit size of generated machine code. (GH-142228)
* Factor out bodies of the largest uops, to reduce jit code size.
* Factor out common assert, also reducing jit code size.
* Limit size of jitted code for a single executor to 1MB.
2025-12-03 17:43:35 +00:00
Victor Stinner
7e5fcae09b
gh-142217: Remove internal _Py_Identifier functions (#142219)
Remove internal functions:

* _PyDict_ContainsId()
* _PyDict_DelItemId()
* _PyDict_GetItemIdWithError()
* _PyDict_SetItemId()
* _PyEval_GetBuiltinId()
* _PyObject_CallMethodIdNoArgs()
* _PyObject_CallMethodIdObjArgs()
* _PyObject_CallMethodIdOneArg()
* _PyObject_VectorcallMethodId()
* _PyUnicode_EqualToASCIIId()

These functions were not exported and so were not usable outside CPython.
2025-12-03 14:33:32 +01:00
Kevin Wang
eb892868b3
gh-142048: Fix quadratically increasing GC delays (gh-142051)
The GC for the free threaded build would get slower with each collection due
to effectively double counting objects freed by the GC.
2025-12-01 19:04:47 -05:00
Victor Stinner
d5d9e89dde
gh-116008: Detect freed thread state in faulthandler (#141988)
Add _PyMem_IsULongFreed() function.
2025-11-27 12:35:00 +01:00
Victor Stinner
83d8134c5b
gh-127635: Use flexible array in tracemalloc (#141991)
Replace frames[1] with frames[] in tracemalloc_traceback structure.
2025-11-27 12:32:31 +01:00
Sergey Miryanov
2ea67caf31
GH-141861: Fix TRACE_RECORD if full (GH-141959) 2025-11-26 14:32:30 +00:00
Itamar Oren
27f62eb711
gh-140011: Delete importdl assertion that prevents importing embedded modules from packages (GH-141605) 2025-11-26 14:12:49 +01:00
Pablo Galindo Salgado
d07d3a3c57
gh-138122: Split Modules/_remote_debugging_module.c into multiple files (#141934)
gh-138122: Split Modules/_remote_debugging_module.c into multiple files
2025-11-25 12:51:24 +00:00
Sergey Miryanov
dc62b62252
GH-141861: Fix invalid memory read in the ENTER_EXECUTOR (GH-141921) 2025-11-24 22:07:45 +00:00
Petr Viktorin
bf66bce4ee
gh-141780: Make PyModule_FromSlotsAndSpec enable GIL if needed (GH-141785) 2025-11-24 13:26:35 +01:00
Sam Gross
e457d60daa
gh-120158: Fix inconsistent monitoring state when setting events too frequently (gh-141845)
If we overflowed the global version counter (i.e., after 2**24 calls to
`_PyMonitoring_SetEvents`), we bailed out after setting global monitoring
events but before instrumenting code objects, which led to assertion errors
later on.

Also add a `time.sleep()` to `test_free_threading.test_monitoring` to avoid
overflowing the global version counter.
2025-11-23 10:07:17 -05:00
Brandt Bucher
227b9d326e
GH-140638: Add a GC "candidates" stat (GH-141814) 2025-11-22 21:59:14 +00:00
Sam Gross
2d50dd242e
gh-137422: Fix race condition in PyImport_AddModuleRef (gh-141822) 2025-11-21 13:30:33 -05:00
Kumar Aditya
49ff8b6cc0
gh-140795: fetch thread state once on fast path for critical sections (#141406) 2025-11-21 19:49:53 +05:30
Brandt Bucher
598d4c64de
GH-140638: Add a GC "duration" stat (GH-141720) 2025-11-19 08:51:39 -08:00
Mark Shannon
c25a070759
GH-139653: Only raise an exception (or fatal error) when the stack pointer is about to overflow the stack. (GH-141711)
Only raises if the stack pointer is both below the limit *and* above the stack base.
This prevents false positives for user-space threads, as the stack pointer will be outside those bounds
if the stack has been swapped.
2025-11-19 10:16:24 +00:00
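The condition described in the commit above (GH-141711) amounts to a two-sided bounds test. The sketch below is an assumption-laden illustration, not CPython's code: the function and parameter names are hypothetical, and it assumes a downward-growing stack where `stack_base` is the lowest address of the stack and `limit` lies above it.

```python
def about_to_overflow(sp: int, stack_base: int, limit: int) -> bool:
    """Return True only if sp is below the limit *and* above the stack base.

    Hypothetical helper; assumes a downward-growing stack with
    stack_base < limit.
    """
    return stack_base < sp < limit
```

A stack pointer that lies entirely outside this range, as it would after a user-space thread swaps stacks, does not trigger the error.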
Shamil
daafacf005
gh-42400: Fix buffer overflow in _Py_wrealpath() for very long paths (#141529)
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-11-18 17:34:58 +01:00
Victor Stinner
600f3feb23
gh-141070: Add PyUnstable_Object_Dump() function (#141072)
* Promote _PyObject_Dump() as a public function.
* Keep _PyObject_Dump() alias to PyUnstable_Object_Dump()
  for backward compatibility.
* Replace _PyObject_Dump() with PyUnstable_Object_Dump().

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2025-11-18 16:13:13 +00:00
Mark Shannon
b420f6be53
GH-139109: Support switch/case dispatch with the tracing interpreter. (GH-141703) 2025-11-18 13:31:48 +00:00
Stefano Rivera
f6dd9c12a8
GH-139914: Handle stack growth direction on HPPA (GH-140028)
Adapted from a patch for Python 3.14 submitted to the Debian BTS by John David Anglin:
https://bugs.debian.org/1105111#20

Co-authored-by: John David Anglin <dave.anglin@bell.net>
2025-11-17 14:41:22 +01:00
Brandt Bucher
336366fd7c
GH-140643: Add <native> and <GC> frames to the sampling profiler (#141108)
- Introduce a new field in the GC state to store the frame that initiated garbage collection.
- Update RemoteUnwinder with options to include "<native>" and "<GC>" frames in the stack trace.
- Modify the sampling profiler to accept parameters for controlling the inclusion of native and GC frames.
- Enhance the stack collector to properly format and append these frames during profiling.
- Add tests to verify the correct behavior of the profiler with respect to native and GC frames, including options to exclude them.

Co-authored-by: Pablo Galindo Salgado <pablogsal@gmail.com>
2025-11-17 13:39:00 +00:00
Pablo Galindo Salgado
89a914c58d
gh-135953: Add GIL contention markers to sampling profiler Gecko format (#139485)
This commit enhances the Gecko format reporter in the sampling profiler
to include markers for GIL acquisition events.
2025-11-17 12:46:26 +00:00
Ken Jin
ed73c909f2
gh-139109: JIT _EXIT_TRACE to ENTER_EXECUTOR rather than _DEOPT (GH-141573) 2025-11-15 20:19:41 +00:00