cpython

mirror of https://github.com/python/cpython.git synced 2025-12-08 06:10:17 +00:00

Author	SHA1	Message	Date
Ken Jin	4fa80ce74c	gh-139109: A new tracing JIT compiler frontend for CPython (GH-140310) This PR changes the current JIT model from trace projection to trace recording. Benchmarking: better pyperformance (about 1.7% overall) geomean versus current https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251108-3.15.0a1%2B-7e2bc1d-JIT/bm-20251108-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-7e2bc1d-vs-base.svg, 100% faster Richards on the most improved benchmark versus the current JIT. Slowdown of about 10-15% on the worst benchmark versus the current JIT. Note: the fastest version isn't the one merged, as it relies on fixing bugs in the specializing interpreter, which is left to another PR. The speedup in the merged version is about 1.1%. https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251112-3.15.0a1%2B-f8a764a-JIT/bm-20251112-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-f8a764a-vs-base.svg Stats: 50% more uops executed, 30% more traces entered the last time we ran them. It also suggests our trace lengths for a real trace recording JIT are too short, as a lot of trace too long aborts https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20251023-3.15.0a1%2B-eb73378-CLANG%2CJIT/bm-20251023-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-eb73378-pystats-vs-base.md . This new JIT frontend is already able to record/execute significantly more instructions than the previous JIT frontend. In this PR, we are now able to record through custom dunders, simple object creation, generators, etc. None of these were done by the old JIT frontend. Some custom dunders uops were discovered to be broken as part of this work gh-140277 The optimizer stack space check is disabled, as it's no longer valid to deal with underflow. 
Pros: * Ignoring the generated tracer code as it's automatically created, this is only an additional 1k lines of code. The maintenance burden is handled by the DSL and code generator. * `optimizer.c` is now significantly simpler, as we don't have to do strange things to recover the bytecode from a trace. * The new JIT frontend is able to handle a lot more control-flow than the old one. * Tracing is very low overhead. We use the tail calling interpreter/computed goto interpreter to switch between tracing mode and non-tracing mode. I call this mechanism dual dispatch, as we have two dispatch tables dispatching to each other. Specialization is still enabled while tracing. * Better handling of polymorphism. We leverage the specializing interpreter for this. Cons: * (For now) requires tail calling interpreter or computed gotos. This means no Windows JIT for now :(. Not to fret, tail calling is coming soon to Windows though https://github.com/python/cpython/pull/139962 Design: * After each instruction, the `record_previous_inst` function/label is executed. This does as the name suggests. * The tracing interpreter lowers bytecode to uops directly so that it can obtain "fresh" values at the point of lowering. * The tracing version behaves nearly identically to the normal interpreter; in fact, it even has specialization! This allows it to run without much of a slowdown when tracing. The actual cost of tracing is only a function call and writes to memory. * The tracing interpreter uses the specializing interpreter's deopt to naturally form the side exit chains. This allows it to side exit chain effectively, without repeating much code. We force a re-specialization when tracing a deopt. * The tracing interpreter can even handle goto errors/exceptions, but I chose to disable them for now as it's not tested. * Because we do not share interpreter dispatch, there should be no significant slowdown to the original specializing interpreter on tailcall and computed goto with JIT disabled. 
With JIT enabled, there might be a slowdown in the form of the JIT trying to trace. * Things that could have dynamic instruction pointer effects are guarded on. The guard deopts to a new instruction --- `_DYNAMIC_EXIT`.	2025-11-13 18:08:32 +00:00
Cody Maloney	732224e113	gh-139871: Add `bytearray.take_bytes([n])` to efficiently extract `bytes` (GH-140128) Update `bytearray` to contain a `bytes` and provide a zero-copy path to "extract" the `bytes`. This allows making several code paths more efficient. This does not move any codepaths to make use of this new API. The documentation changes include common code patterns which can be made more efficient with this API. --- When just changing `bytearray` to contain `bytes` I ran pyperformance on a `--with-lto --enable-optimizations --with-static-libpython` build and don't see any major speedups or slowdowns with this; all seems to be in the noise of my machine (Generally changes under 5% or benchmarks that don't touch bytes/bytearray). Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Maurycy Pawłowski-Wieroński <5383+maurycy@users.noreply.github.com>	2025-11-13 13:19:44 +00:00
Petr Viktorin	589a03a8ce	gh-140550: Initial implementation of PEP 793 – PyModExport (GH-140556) Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Kumar Aditya <kumaraditya@python.org>	2025-11-05 12:31:42 +01:00
Adam Turner	5443b9e52f	gh-133143: Condense the implementation for ``sys.abi_info`` (#138672 )	2025-09-08 19:21:28 +00:00
Klaus Zimmermann	1acb718ea2	gh-133143: Add sys.abi_info (GH-137476) This makes information about the interpreter ABI more accessible. Co-authored-by: Petr Viktorin <encukou@gmail.com> Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>	2025-09-08 14:35:44 +00:00
Peter Bierma	e8251dc0ae	gh-134170: Add colorization to unraisable exceptions (#134183 ) Default implementation of sys.unraisablehook() now uses traceback._print_exception_bltin() to print exceptions with colorized text. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org>	2025-08-04 14:35:00 +00:00
Serhiy Storchaka	c45f4f3ebe	gh-78465: Fix error message for cls.__new__(cls, ...) where cls is not instantiable (GH-135981) Previous error message suggested to use cls.__new__(), which obviously does not work. Now the error message is the same as for cls(...).	2025-06-27 14:35:55 +03:00
sobolevn	8ca1e4d846	gh-135645: Added `supports_isolated_interpreters` to `sys.implementation` (#135667 ) Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>	2025-06-21 10:56:14 +03:00
Nadeshiko Manju	1ddfe59320	gh-135543: Emit sys.remote_exec audit event when sys.remote_exec is called (GH-135544)	2025-06-19 21:23:38 +01:00
Eric Snow	62143736b6	gh-134939: Add the concurrent.interpreters Module (gh-133958) PEP-734 has been accepted (for 3.14). (FTR, I'm opposed to putting this under the concurrent package, but doing so is the SC condition under which the module can land in 3.14.)	2025-06-11 17:35:48 -06:00
tpburns	54ca55978e	gh-134248 test_getallocatedblocks pre-check to ignore immortalized strings (#134871 ) When sanity checking against gettotalrefcount(), we exclude the blocks for immortalized strings since their references are not tracked/reported. This now matches refleak.py's book-keeping using the same functions.	2025-06-03 18:00:25 +02:00
CF Bolz-Tereick	895119ec24	skip test for sys._stdlib_dir if that is not present (#134973 )	2025-05-31 13:46:22 +02:00
Victor Stinner	ebf6d13567	gh-134745: Change PyThread_allocate_lock() implementation to PyMutex (#134747 ) Co-authored-by: Sam Gross <colesbury@gmail.com>	2025-05-30 10:15:47 +00:00
Serhiy Storchaka	2602d8ae98	gh-71339: Use new assertion methods in tests (GH-129046)	2025-05-22 13:17:22 +03:00
Victor Stinner	009e7b3698	gh-134064: Fix sys.remote_exec() error checking (#134067 )	2025-05-18 00:24:40 +02:00
Serhiy Storchaka	c09cec5d69	gh-133886: Fix sys.remote_exec() for non-UTF-8 paths (GH-133887) It now supports non-ASCII paths in non-UTF-8 locales and non-UTF-8 paths in UTF-8 locales.	2025-05-13 11:55:24 +03:00
Irit Katriel	296cd128bf	Revert "gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396 )" (#133498 )	2025-05-06 13:12:26 +03:00
Brandt Bucher	b1aa515bd6	GH-133231: Add JIT utilities in sys._jit (GH-133233)	2025-05-05 15:25:22 -07:00
Irit Katriel	082dbf7788	gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396 )	2025-05-05 17:46:56 +01:00
littlebutt's workshop	d6078ed6d0	gh-132143: Fix the `AssertionError` in the test case `test.test_sys.TestRemoteExec` (#132248 )	2025-05-05 17:08:49 +01:00
Adam Turner	3f80165a26	GH-91048: Minor fixes for ``_remotedebugging`` & rename to ``_remote_debugging`` (#133398 )	2025-05-05 02:30:14 +02:00
Pablo Galindo Salgado	2bc8365231	GH-91048: Add utils for printing the call stack for asyncio tasks (#133284 )	2025-05-04 00:51:57 +00:00
Srinivas Reddy Thatiparthy (తాటిపర్తి శ్రీనివాస్ రెడ్డి)	8783cec9b6	gh-129027: Raise DeprecationWarning for sys._clear_type_cache (#129043 ) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>	2025-04-25 15:01:48 +03:00
Matt Wozniski	a94c7528b5	gh-132859: Run debugger scripts in their own namespaces (#132860 ) Run debugger scripts in their own namespaces Previously scripts injected by `sys.remote_exec` were run with the globals of the `__main__` module. Instead, run each injected script with an empty set of globals. If someone really wants to use the `__main__` module's namespace, they can always `import __main__`.	2025-04-23 23:40:24 +00:00
Xuehai Pan	26ae05e95c	gh-127405: Add ABIFLAGS to sysconfig variables on Windows (GH-131799)	2025-04-11 16:19:03 +01:00
Neil Schemenauer	d687900f98	gh-128384: Use a context variable for warnings.catch_warnings (gh-130010) Make `warnings.catch_warnings()` use a context variable for holding the warning filtering state if the `sys.flags.context_aware_warnings` flag is set to true. This makes using the context manager thread-safe in multi-threaded programs. Add the `sys.flags.thread_inherit_context` flag. If true, starting a new thread with `threading.Thread` will use a copy of the context from the caller of `Thread.start()`. Both these flags are set to true by default for the free-threaded build and false for the default build. Move the Python implementation of warnings.py into _py_warnings.py. Make _contextvars a builtin module. Co-authored-by: Kumar Aditya <kumaraditya@python.org>	2025-04-09 16:18:54 -07:00
Pablo Galindo Salgado	943cc1431e	gh-131591: Implement PEP 768 (#131937 ) Co-authored-by: Ivona Stojanovic <stojanovic.i@hotmail.com> Co-authored-by: Matt Wozniski <godlygeek@gmail.com>	2025-04-03 16:20:01 +01:00
mpage	053c285f6b	gh-130704: Strength reduce `LOAD_FAST{_LOAD_FAST}` (#130708 ) Optimize `LOAD_FAST` opcodes into faster versions that load borrowed references onto the operand stack when we can prove that the lifetime of the local outlives the lifetime of the temporary that is loaded onto the stack.	2025-04-01 10:18:42 -07:00
Michael Droettboom	8614f86b71	gh-131525: Cache the result of tuple_hash (#131529 ) * gh-131525: Cache the result of tuple_hash * Fix debug builds * Add blurb * Fix formatting * Pre-compute empty tuple singleton * Mostly set the cache within tuple_alloc * Fixes for TSAN * Pre-compute empty tuple singleton * Fix for 32-bit platforms * Assert that op != NULL in _PyTuple_RESET_HASH_CACHE * Use FT_ATOMIC_STORE_SSIZE_RELAXED macro * Update Include/internal/pycore_tuple.h Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> * Fix alignment * atomic load * Update Objects/tupleobject.c Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com> --------- Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>	2025-03-27 09:57:06 -04:00
Peter Bierma	90b82f2b61	gh-129900: Fix `SystemExit` return codes when the REPL is started from the command line (#129901 )	2025-03-25 19:48:46 +00:00
Serhiy Storchaka	0ef4ffeefd	gh-130163: Fix crashes related to PySys_GetObject() (GH-130503) The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed reference, has been replaced by using one of the following functions, which return a strong reference and distinguish a missing attribute from an error: _PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(), _PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().	2025-02-25 23:04:27 +02:00
Mark Shannon	014223649c	GH-130396: Use computed stack limits on linux (GH-130398) * Implement C recursion protection with limit pointers for Linux, MacOS and Windows * Remove calls to PyOS_CheckStack * Add stack protection to parser * Make tests more robust to low stacks * Improve error messages for stack overflow	2025-02-25 09:24:48 +00:00
Russell Keith-Magee	8a76eb8469	gh-130384: Skip a test_getallocatedblocks test pre-condition on iOS. (GH-130385)	2025-02-24 15:34:38 +00:00
Sam Gross	a6a8c6f86e	gh-128954: Reorder _PyInterpreterFrame fields for reduced memory usage (#128958 ) This reduces the size of _PyInterpreterFrame by 8 bytes on 64-bit platforms using the free threading build due to alignment requirements. This allows for slightly more recursive calls into the interpreter (from C), but `test_call.test_super_deep` still crashes.	2025-01-27 17:14:51 +00:00
mpage	2e95c5ba3b	gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926 ) Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Each thread reserves a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and releases the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads. Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization. Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.	2024-11-04 11:13:32 -08:00
Sam Gross	ad6110a93f	gh-125842: Fix `sys.exit(0xffff_ffff)` on Windows (#125896 ) On Windows, `long` is a signed 32-bit integer so it can't represent `0xffff_ffff` without overflow. Windows exit codes are unsigned 32-bit integers, so if a child process exits with `-1`, it will be represented as `0xffff_ffff`. Also fix a number of other possible cases where `_Py_HandleSystemExit` could return with an exception set, leading to a `SystemError` (or fatal error in debug builds) later on during shutdown.	2024-10-24 12:03:50 -04:00
Donghee Na	ad7c778546	gh-123990: Good bye WITH_FREELISTS macro (gh-124358)	2024-09-24 01:28:59 +00:00
neonene	646f16bdee	gh-124153: Implement `PyType_GetBaseByToken()` and `Py_tp_token` slot (GH-124163)	2024-09-18 09:18:19 +02:00
Sam Gross	dc09301067	gh-122417: Implement per-thread heap type refcounts (#122418 ) The free-threaded build partially stores heap type reference counts in distributed manner in per-thread arrays. This avoids reference count contention when creating or destroying instances. Co-authored-by: Ken Jin <kenjin@python.org>	2024-08-06 14:36:57 -04:00
Sam Gross	4b63cd170e	gh-122527: Fix a crash on deallocation of `PyStructSequence` (GH-122577) The `PyStructSequence` destructor would crash if it was deallocated after its type's dictionary was cleared by the GC, because it couldn't compute the "real size" of the instance. This could occur with relatively straightforward code in the free-threaded build or with a reference cycle involving the type in the default build, due to differing orders in which `tp_clear()` was called. Account for the non-sequence fields in `tp_basicsize` and use that, along with `Py_SIZE()`, to compute the "real" size of a `PyStructSequence` in the dealloc function. This avoids the accesses to the type's dictionary during dealloc, which were unsafe.	2024-08-02 18:11:44 +02:00
Mark Shannon	169324c27a	GH-120024: Use pointer for stack pointer (GH-121923)	2024-07-18 12:47:21 +01:00
Tian Gao	e65cb4c6f0	gh-118934: Make PyEval_GetLocals return borrowed reference (#119769 ) Co-authored-by: Alyssa Coghlan <ncoghlan@gmail.com>	2024-07-16 12:17:47 -07:00
Petr Viktorin	6f1d448bc1	gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520) * Add an InternalDocs file describing how interning should work and how to use it. * Add internal functions to explicitly request what kind of interning is done: - `_PyUnicode_InternMortal` - `_PyUnicode_InternImmortal` - `_PyUnicode_InternStatic` * Switch uses of `PyUnicode_InternInPlace` to those. * Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead: - Strings should be interned before immortalization, otherwise you're possibly interning a immortalizing copy. - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI. * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery. * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint: - `_Py_ID` - `_Py_STR` (including the empty string) - one-character latin-1 singletons Now, when you intern a singleton, that exact singleton will be interned. * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic). * Intern `_Py_STR` singletons at startup. * For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup. * Beef up the tests. Cover internal details (marked with `@cpython_only`). * Add lots of assertions Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>	2024-06-21 17:19:31 +02:00
Alyssa Coghlan	3859e09e3d	gh-74929: PEP 667 C API documentation (gh-119379) * Add docs for new APIs * Add soft-deprecation notices * Add What's New porting entries * Update comments referencing `PyFrame_LocalsToFast()` to mention the proxy instead * Other related cleanups found when looking for refs to the deprecated APIs	2024-06-01 13:59:35 +10:00
Jelle Zijlstra	e9875ecb5d	gh-119180: PEP 649: Add __annotate__ attributes (#119209 )	2024-05-22 04:38:12 +02:00
Jeong, YunWon	8d8275b0cf	gh-118473: Fix set_asyncgen_hooks not to be partially set when arguments are invalid (#118474 )	2024-05-06 17:02:52 -07:00
Tian Gao	b034f14a4b	gh-74929: Implement PEP 667 (GH-115153)	2024-05-04 12:12:10 +01:00
Brett Simmers	c2627d6eea	gh-116322: Add Py_mod_gil module slot (#116882 ) This PR adds the ability to enable the GIL if it was disabled at interpreter startup, and modifies the multi-phase module initialization path to enable the GIL when loading a module, unless that module's spec includes a slot indicating it can run safely without the GIL. PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148. A warning will be issued up to once per interpreter for the first GIL-using module that is loaded. If `-v` is given, a shorter message will be printed to stderr every time a GIL-using module is loaded (including the first one that issues a warning).	2024-05-03 11:30:55 -04:00
Sam Gross	2dae505e87	gh-117514: Add `sys._is_gil_enabled()` function (#118514 ) The function returns `True` or `False` depending on whether the GIL is currently enabled. In the default build, it always returns `True` because the GIL is always enabled.	2024-05-03 11:09:57 -04:00
Pablo Galindo Salgado	345e1e04ec	gh-112730: Make the test suite resilient to color-activation environment variables (#117672 )	2024-04-24 21:25:22 +01:00
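
Several commits in this listing add introspection hooks to `sys` (`sys._is_gil_enabled()`, `sys.flags.context_aware_warnings`, `sys.flags.thread_inherit_context`, `sys.implementation.supports_isolated_interpreters`, `sys.abi_info`, `sys._jit`). The sketch below is one way to probe them defensively. It is an illustration, not code from the repository: the helper name `probe_runtime_features` and the dictionary keys are hypothetical, and every attribute is looked up with a fallback because an older interpreter may not provide it.

```python
import sys


def probe_runtime_features():
    """Collect runtime features named in the log, tolerating older interpreters."""
    # sys._is_gil_enabled (gh-117514) may be missing on older versions.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    # sys._jit (GH-133231) may be missing as well.
    jit = getattr(sys, "_jit", None)
    return {
        # Assumption: when the function is absent, the GIL is always enabled.
        "gil_enabled": gil_check() if gil_check is not None else True,
        # Flags from gh-128384; None means "not available here".
        "context_aware_warnings": getattr(sys.flags, "context_aware_warnings", None),
        "thread_inherit_context": getattr(sys.flags, "thread_inherit_context", None),
        # Attribute from gh-135645.
        "supports_isolated_interpreters": getattr(
            sys.implementation, "supports_isolated_interpreters", None
        ),
        # Object from gh-133143; its fields are not inspected here.
        "abi_info": getattr(sys, "abi_info", None),
        # None means the sys._jit namespace does not exist on this build.
        "jit_enabled": jit.is_enabled() if jit is not None else None,
    }


if __name__ == "__main__":
    for name, value in probe_runtime_features().items():
        print(f"{name}: {value!r}")
```

Using `getattr` with a default keeps the probe usable across versions, at the cost of not distinguishing "feature absent" from "feature present but set to `None`".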


455 commits