cpython

mirror of https://github.com/python/cpython.git synced 2025-12-08 06:10:17 +00:00

Author	SHA1	Message	Date
Ken Jin	4fa80ce74c	gh-139109: A new tracing JIT compiler frontend for CPython (GH-140310) This PR changes the current JIT model from trace projection to trace recording. Benchmarking: better pyperformance (about 1.7% overall) geomean versus current https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251108-3.15.0a1%2B-7e2bc1d-JIT/bm-20251108-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-7e2bc1d-vs-base.svg, 100% faster Richards on the most improved benchmark versus the current JIT. Slowdown of about 10-15% on the worst benchmark versus the current JIT. Note: the fastest version isn't the one merged, as it relies on fixing bugs in the specializing interpreter, which is left to another PR. The speedup in the merged version is about 1.1%. https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251112-3.15.0a1%2B-f8a764a-JIT/bm-20251112-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-f8a764a-vs-base.svg Stats: 50% more uops executed, 30% more traces entered the last time we ran them. It also suggests our trace lengths for a real trace recording JIT are too short, as a lot of trace too long aborts https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20251023-3.15.0a1%2B-eb73378-CLANG%2CJIT/bm-20251023-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-eb73378-pystats-vs-base.md . This new JIT frontend is already able to record/execute significantly more instructions than the previous JIT frontend. In this PR, we are now able to record through custom dunders, simple object creation, generators, etc. None of these were done by the old JIT frontend. Some custom dunders uops were discovered to be broken as part of this work gh-140277 The optimizer stack space check is disabled, as it's no longer valid to deal with underflow. 
Pros: * Ignoring the generated tracer code as it's automatically created, this is only an additional 1k lines of code. The maintenance burden is handled by the DSL and code generator. * `optimizer.c` is now significantly simpler, as we don't have to do strange things to recover the bytecode from a trace. * The new JIT frontend is able to handle a lot more control-flow than the old one. * Tracing is very low overhead. We use the tail calling interpreter/computed goto interpreter to switch between tracing mode and non-tracing mode. I call this mechanism dual dispatch, as we have two dispatch tables dispatching to each other. Specialization is still enabled while tracing. * Better handling of polymorphism. We leverage the specializing interpreter for this. Cons: * (For now) requires tail calling interpreter or computed gotos. This means no Windows JIT for now :(. Not to fret, tail calling is coming soon to Windows though https://github.com/python/cpython/pull/139962 Design: * After each instruction, the `record_previous_inst` function/label is executed. This does as the name suggests. * The tracing interpreter lowers bytecode to uops directly so that it can obtain "fresh" values at the point of lowering. * The tracing version behaves nearly identically to the normal interpreter; in fact, it even has specialization! This allows it to run without much of a slowdown when tracing. The actual cost of tracing is only a function call and writes to memory. * The tracing interpreter uses the specializing interpreter's deopt to naturally form the side exit chains. This allows it to side exit chain effectively, without repeating much code. We force re-specialization when tracing a deopt. * The tracing interpreter can even handle goto errors/exceptions, but I chose to disable them for now as it's not tested. * Because we do not share interpreter dispatch, there should be no significant slowdown to the original specializing interpreter on tailcall and computed gotos with JIT disabled.
With JIT enabled, there might be a slowdown in the form of the JIT trying to trace. * Things that could have dynamic instruction pointer effects are guarded on. The guard deopts to a new instruction --- `_DYNAMIC_EXIT`.	2025-11-13 18:08:32 +00:00
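The "dual dispatch" idea in the commit above (two dispatch tables, with the tracing one recording each instruction after it runs) can be illustrated with a toy stack machine. This is only a sketch under loud assumptions: the opcodes, `NORMAL`, `make_tracing_table`, and `run` are all hypothetical names invented for this example, and nothing here reflects the actual CPython tracer, which works on real bytecode and lowers it to uops.

```python
# Toy model of "dual dispatch": one table of plain handlers and one table of
# tracing handlers. All names here are hypothetical, not CPython internals.

def op_push(stack, arg):
    stack.append(arg)

def op_add(stack, arg):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)

NORMAL = {"PUSH": op_push, "ADD": op_add}

def make_tracing_table(normal, trace):
    """Build a second table whose handlers run the normal handler, then record."""
    def wrap(name, handler):
        def traced(stack, arg):
            handler(stack, arg)
            trace.append((name, arg))  # record the instruction that just ran
        return traced
    return {name: wrap(name, handler) for name, handler in normal.items()}

def run(program, trace_from=None):
    """Execute program; switch to the tracing table at index trace_from."""
    stack, trace = [], []
    tracing = make_tracing_table(NORMAL, trace)
    table = NORMAL
    for pc, (name, arg) in enumerate(program):
        if trace_from is not None and pc == trace_from:
            table = tracing  # enter tracing mode by swapping dispatch tables
        table[name](stack, arg)
    return stack, trace

program = [("PUSH", 1), ("PUSH", 2), ("ADD", None), ("PUSH", 10), ("ADD", None)]
result_plain = run(program)
result_traced = run(program, trace_from=2)
```

In this toy, tracing costs one extra function call and one list append per instruction, which loosely mirrors the commit's claim that the cost of tracing is "a function call and writes to memory".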
Dino Viehland	ff7bb565d8	gh-139924: Add PyFunction_PYFUNC_EVENT_MODIFY_QUALNAME event for function watchers (#139925)	2025-10-10 15:25:38 -07:00
Semyon Moroz	968f6e523a	gh-130821: Add type information to error messages for invalid return type (GH-130835)	2025-08-14 11:04:41 +03:00
Xuanteng Huang	b1056c2a44	gh-135607: remove null checking of weakref list in dealloc of extension modules and objects (#135614 ) Co-authored-by: Kumar Aditya <kumaraditya@python.org> Co-authored-by: Victor Stinner <vstinner@python.org>	2025-06-30 11:14:31 +00:00
Peter Bierma	10a3d43188	gh-135755: Move `PyFunction_GET_BUILTINS` to the private API (GH-135938)	2025-06-26 11:43:08 +02:00
Eric Snow	dafd14146f	gh-132775: Fix _PyFunction_VerifyStateless() (#134900) The problem we're fixing here is that we were using PyDict_Size() on "defaults", which is actually a tuple. We're also adding some explicit type checks. This is a follow-up to gh-133221/gh-133528.	2025-05-29 20:13:12 +00:00
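The tuple-versus-dict distinction behind the fix above is visible from Python: positional defaults live in a tuple, keyword-only defaults in a dict. A minimal illustration, where `sample` is a hypothetical function written for this example:

```python
def sample(a, b=1, *, c=2):
    return a + b + c

# Positional defaults are stored as a tuple...
positional_defaults = sample.__defaults__
# ...while keyword-only defaults are stored as a dict.
keyword_defaults = sample.__kwdefaults__
```

A dict-only API such as `PyDict_Size()` is therefore only appropriate for the second of these.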
Eric Snow	27128e4fa8	gh-132775: Unrevert "Add _PyCode_VerifyStateless()" (gh-133528) This reverts commit `3c73cf5` (gh-133497), which itself reverted the original commit `d270bb5` (gh-133221). We reverted the original change due to failing android tests. The checks in _PyCode_CheckNoInternalState() were too strict, so we've relaxed them.	2025-05-08 00:00:33 +00:00
Petr Viktorin	3c73cf51df	gh-132775: Revert "gh-132775: Add _PyCode_VerifyStateless() (gh-133221)" (#133497 )	2025-05-06 13:09:41 +03:00
Eric Snow	d270bb5792	gh-132775: Add _PyCode_VerifyStateless() (gh-133221) "Stateless" code is a function or code object which does not rely on external state or internal state. It may rely on arguments and builtins, but not globals or a closure. I've left a comment in pycore_code.h that provides more detail. We also add _PyFunction_VerifyStateless(). The new functions will be used in several later changes that facilitate "sharing" functions and code objects between interpreters.	2025-05-05 21:48:58 +00:00
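As a rough Python-level approximation of the "stateless" definition above (arguments and builtins are allowed, globals and closures are not), one could inspect a function's bytecode. This is a sketch only: `looks_stateless` is a hypothetical helper, it is not the check performed by `_PyCode_VerifyStateless()`, it ignores nested code objects, and it treats any global whose name matches a builtin as a builtin.

```python
import builtins
import dis

def looks_stateless(func):
    """Heuristic: no closure, and every global access names a builtin."""
    if func.__closure__:
        return False  # relies on captured variables
    for instr in dis.get_instructions(func):
        if instr.opname in ("STORE_GLOBAL", "DELETE_GLOBAL"):
            return False  # mutates module state
        if instr.opname == "LOAD_GLOBAL" and not hasattr(builtins, instr.argval):
            return False  # reads a non-builtin global
    return True

def uses_only_builtins(items):
    return len(items)

COUNTER = 0

def uses_global():
    return COUNTER

def make_closure():
    captured = 1
    def inner():
        return captured
    return inner
```

Here `looks_stateless(uses_only_builtins)` is true, while `uses_global` and the function returned by `make_closure()` are rejected.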
Ivan Kirpichnikov	a36367520e	gh-132457: make staticmethod and classmethod generic (#132460 ) Co-authored-by: sobolevn <mail@sobolevn.me>	2025-05-04 19:26:38 +03:00
Victor Stinner	1a082085ae	gh-131238: Remove pycore_object_deferred.h from pycore_object.h (#131549 ) Remove also pycore_function.h from pycore_typeobject.h.	2025-03-21 16:44:10 +00:00
Mark Shannon	a45f25361d	GH-131238: More refactoring of core header files (GH-131351) Adds new pycore_stats.h header file to help break dependencies involving the pycore_code.h header.	2025-03-17 14:41:05 +00:00
Xuanteng Huang	55f17b77c3	gh-128714: Fix function object races in `__annotate__`, `__annotations__` and `__type_params__` in free-threading build (#129016 )	2025-02-06 20:10:50 +05:30
mpage	255762c09f	gh-127274: Defer nested methods (#128012 ) Methods (functions defined in class scope) are likely to be cleaned up by the GC anyway. Add a new code flag, `CO_METHOD`, that is set for functions defined in a class scope. Use that when deciding to defer functions.	2024-12-19 13:03:14 -08:00
Sam Gross	f4f530804b	gh-127582: Make object resurrection thread-safe for free threading. (GH-127612) Objects may be temporarily "resurrected" in destructors when calling finalizers or watcher callbacks. We previously undid the resurrection by decrementing the reference count using `Py_SET_REFCNT`. This was not thread-safe because other threads might be accessing the object (modifying its reference count) if it was exposed by the finalizer, watcher callback, or temporarily accessed by a racy dictionary or list access. This adds internal-only thread-safe functions for temporary object resurrection during destructors.	2024-12-05 16:07:31 -05:00
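For readers unfamiliar with the term, "resurrection" can be observed from pure Python: a finalizer that stores a new reference to the dying object keeps it alive. The sketch below only illustrates the concept, not the C-level thread-safety fix in the commit above; `Phoenix` and `survivors` are hypothetical names, and the timing of `__del__` is CPython-specific.

```python
import gc

survivors = []

class Phoenix:
    def __del__(self):
        # Storing a reference to self inside the finalizer resurrects the object.
        survivors.append(self)

p = Phoenix()
del p         # drops the last reference, so the finalizer runs
gc.collect()  # in case finalization is not immediate on this implementation
```

After this runs, `survivors` holds the object whose last reference was just dropped.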
mpage	09c240f20c	gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607 ) Enable specialization of LOAD_GLOBAL in free-threaded builds. Thread-safety of specialization in free-threaded builds is provided by the following: A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization. Generation of new keys versions is made atomic in free-threaded builds. Existing helpers are used to atomically modify the opcode. Thread-safety of specialized instructions in free-threaded builds is provided by the following: Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards. Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset. The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.	2024-11-21 11:22:21 -08:00
Xuanteng Huang	35df4eb959	gh-126072: do not add `None` to `co_consts` if there is no docstring (GH-126101)	2024-10-30 09:01:09 +00:00
Sam Gross	3c4a7fa617	gh-124218: Avoid refcount contention on builtins module (GH-125847) This replaces `_PyEval_BuiltinsFromGlobals` with `_PyDict_LoadBuiltinsFromGlobals`, which returns a new reference instead of a borrowed reference. Internally, the new function uses per-thread reference counting when possible to avoid contention on the refcount fields on the builtins module.	2024-10-24 12:44:38 -04:00
Sam Gross	9b0bfba2a2	gh-124218: Use per-thread reference counting for globals and builtins (#125713 ) Use per-thread refcounting for the reference from function objects to the globals and builtins dictionaries.	2024-10-21 12:51:29 -04:00
Zachary Ware	c3164ae3cf	gh-125017: Fix refleak from GH-125636 (GH-125664)	2024-10-17 17:21:32 -05:00
Jelle Zijlstra	f203d1cb52	gh-125017: Fix crash on premature access to classmethod/staticmethod annotations (#125636 )	2024-10-17 09:45:25 -07:00
Sam Gross	3ea488aac4	gh-124218: Use per-thread refcounts for code objects (#125216 ) Use per-thread refcounting for the reference from function objects to their corresponding code object. This can be a source of contention when frequently creating nested functions. Deferred refcounting alone isn't a great fit here because these references are on the heap and may be modified by other libraries.	2024-10-15 15:06:41 -04:00
mpage	e99f159be4	gh-115999: Stop the world when invalidating function versions (#124997) The tier1 interpreter specializes `CALL` instructions based on the values of certain function attributes (e.g. `__code__`, `__defaults__`). The tier1 interpreter uses function versions to verify that the attributes of a function during execution of a specialization match those seen during specialization. A function's version is initialized in `MAKE_FUNCTION` and is invalidated when any of the critical function attributes are changed. The tier1 interpreter stores the function version in the inline cache during specialization. A guard is used by the specialized instruction to verify that the version of the function on the operand stack matches the cached version (and therefore has all of the expected attributes). It is assumed that once the guard passes, all attributes will remain unchanged while executing the rest of the specialized instruction. Stopping the world when invalidating function versions ensures that all critical function attributes will remain unchanged after the function version guard passes in free-threaded builds. It's important to note that this is only true if the remainder of the specialized instruction does not enter and exit a stop-the-world point. We will stop the world the first time any of the following function attributes are mutated: - defaults - vectorcall - kwdefaults - closure - code This should happen rarely and only happens once per function, so the performance impact on the majority of code should be minimal. Additionally, refactor the API for manipulating function versions to more clearly match the stated semantics.	2024-10-08 10:04:35 -04:00
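The version-guard scheme described in the commit above can be modelled in a few lines. This is a toy under stated assumptions: `ToyFunction` and `ToyCallSite` are hypothetical classes, the "inline cache" is a plain attribute, and the stop-the-world step for free-threaded builds is not modelled at all.

```python
class ToyFunction:
    """Hypothetical stand-in for a function object with a version number."""
    _next_version = 1

    def __init__(self, code):
        self.code = code
        self.version = ToyFunction._next_version  # assigned at creation
        ToyFunction._next_version += 1

    def set_code(self, code):
        # Changing a critical attribute invalidates the version.
        self.code = code
        self.version = 0


class ToyCallSite:
    """Hypothetical call site with a one-entry cache of the function version."""

    def __init__(self):
        self.cached_version = 0

    def specialize(self, func):
        self.cached_version = func.version

    def call(self, func, *args):
        # Guard: take the fast path only if the cached version still matches.
        if self.cached_version != 0 and func.version == self.cached_version:
            return ("fast", func.code(*args))
        return ("slow", func.code(*args))


double = ToyFunction(lambda x: x * 2)
site = ToyCallSite()
site.specialize(double)
before = site.call(double, 2)
double.set_code(lambda x: x + 1)  # invalidates the version
after = site.call(double, 2)
```

In this toy the guard fails after `set_code`, so the call site falls back to the slow path instead of trusting stale assumptions about the function.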
Victor Stinner	7a178b7605	gh-111178: Fix function signatures in funcobject.c (#124908 )	2024-10-02 19:29:56 +02:00
sobolevn	e9681211b9	gh-122229: Add missing `Py_DECREF` in `func_get_annotation_dict` (#122230 )	2024-07-24 05:47:52 -07:00
Jelle Zijlstra	d28afd3fa0	gh-119180: Lazily wrap annotations on classmethod and staticmethod (#119864 )	2024-05-31 14:05:51 -07:00
Jelle Zijlstra	e9875ecb5d	gh-119180: PEP 649: Add __annotate__ attributes (#119209 )	2024-05-22 04:38:12 +02:00
mpage	37d0950022	gh-117657: Disable the function/code cache in free-threaded builds (#118301 ) This is only used by the specializing interpreter and the tier 2 optimizer, both of which are disabled in free-threaded builds.	2024-05-03 16:21:04 -04:00
Sam Gross	4ad8f090cc	gh-117376: Partial implementation of deferred reference counting (#117696) This marks objects as using deferred reference counting using the `ob_gc_bits` field in the free-threaded build and collects those objects during GC.	2024-04-12 17:36:20 +00:00
Guido van Rossum	570a82d46a	gh-117045: Add code object to function version cache (#117028 ) Changes to the function version cache: - In addition to the function object, also store the code object, and allow the latter to be retrieved even if the function has been evicted. - Stop assigning new function versions after a critical attribute (e.g. `__code__`) has been modified; the version is permanently reset to zero in this case. - Changes to `__annotations__` are no longer considered critical. (This fixes gh-109998.) Changes to the Tier 2 optimization machinery: - If we cannot map a function version to a function, but it is still mapped to a code object, we continue projecting the trace. The operand of the `_PUSH_FRAME` and `_POP_FRAME` opcodes can be either NULL, a function object, or a code object with the lowest bit set. This allows us to trace through code that calls an ephemeral function, i.e., a function that may not be alive when we are constructing the executor, e.g. a generator expression or certain nested functions. We will lose globals removal inside such functions, but we can still do other peephole operations (and even possibly [call inlining](https://github.com/python/cpython/pull/116290), if we decide to do it), which only need the code object. As before, if we cannot retrieve the code object from the cache, we stop projecting.	2024-03-21 12:37:41 -07:00
Guido van Rossum	7e1f38f2de	gh-116916: Remove separate next_func_version counter (#116918 ) Somehow we ended up with two separate counter variables tracking "the next function version". Most likely this was a historical accident where an old branch was updated incorrectly. This PR merges the two counters into a single one: `interp->func_state.next_version`.	2024-03-18 11:11:10 -07:00
Michael Droettboom	ea3cd0498c	gh-114312: Collect stats for unlikely events (GH-114493)	2024-01-25 11:10:51 +00:00
Nikita Sobolev	2ac4cf4743	gh-112640: Add `kwdefaults` parameter to `types.FunctionType.__new__` (#112641 )	2024-01-11 00:42:30 -08:00
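A hedged usage sketch for the `kwdefaults` parameter added above. The helper `clone_with_kwdefaults` and the function `template` are hypothetical names for this example; the fallback branch is an assumption about how one might get the same effect on interpreters that predate this commit, where the keyword is rejected with `TypeError`.

```python
import types

def template(a, *, sep="-"):
    return f"{a}{sep}{a}"

def clone_with_kwdefaults(func, kwdefaults):
    """Copy func, replacing its keyword-only defaults."""
    args = (func.__code__, func.__globals__, func.__name__,
            func.__defaults__, func.__closure__)
    try:
        # Interpreters that include this commit accept kwdefaults directly.
        return types.FunctionType(*args, kwdefaults=kwdefaults)
    except TypeError:
        # Fallback for older interpreters: set the attribute afterwards.
        clone = types.FunctionType(*args)
        clone.__kwdefaults__ = kwdefaults
        return clone

plus = clone_with_kwdefaults(template, {"sep": "+"})
```

The clone shares the original's code object but carries its own keyword-only defaults, so `template` itself is left unchanged.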
Serhiy Storchaka	18203a6bc9	gh-111789: Use PyDict_GetItemRef() in Objects/ (GH-111827)	2023-11-14 11:25:39 +02:00
Serhiy Storchaka	1d75ef6b61	gh-111999: Add signatures and improve docstrings for builtins (GH-112000)	2023-11-13 09:13:49 +02:00
Irit Katriel	2f9cb7e095	gh-81137: deprecate assignment of code object to a function of a mismatched type (#111823 )	2023-11-07 18:54:36 +00:00
Serhiy Storchaka	970e719a7a	gh-108082: Use PyErr_FormatUnraisable() (GH-111580) Replace most of calls of _PyErr_WriteUnraisableMsg() and some calls of PyErr_WriteUnraisable(NULL) with PyErr_FormatUnraisable(). Co-authored-by: Victor Stinner <vstinner@python.org>	2023-11-02 09:16:34 +00:00
Raymond Hettinger	7f9a99e854	gh-89519: Remove classmethod descriptor chaining, deprecated since 3.11 (gh-110163)	2023-10-27 00:24:56 -05:00
Victor Stinner	be5e8a0103	gh-110964: Remove private _PyArg functions (#110966 ) Move the following private functions and structures to pycore_modsupport.h internal C API: * _PyArg_BadArgument() * _PyArg_CheckPositional() * _PyArg_NoKeywords() * _PyArg_NoPositional() * _PyArg_ParseStack() * _PyArg_ParseStackAndKeywords() * _PyArg_Parser structure * _PyArg_UnpackKeywords() * _PyArg_UnpackKeywordsWithVararg() * _PyArg_UnpackStack() * _Py_ANY_VARARGS() Changes: * Python/getargs.h now includes pycore_modsupport.h to export functions. * clinic.py now adds pycore_modsupport.h when one of these functions is used. * Add pycore_modsupport.h includes when a C extension uses one of these functions. * Define Py_BUILD_CORE_MODULE in C extensions which now include directly or indirectly (via code generated by Argument Clinic) pycore_modsupport.h: * _csv * _curses_panel * _dbm * _gdbm * _multiprocessing.posixshmem * _sqlite.row * _statistics * grp * resource * syslog * _testcapi: bad_get() no longer uses METH_FASTCALL calling convention but METH_VARARGS. Replace _PyArg_UnpackStack() with PyArg_ParseTuple(). * _testcapi: add PYTESTCAPI_NEED_INTERNAL_API macro which is defined by _testcapi sub-modules which need the internal C API (pycore_modsupport.h): exceptions.c, float.c, vectorcall.c, watchers.c. * Remove Include/cpython/modsupport.h header file. Include/modsupport.h no longer includes the removed header file. * Fix mypy clinic.py	2023-10-17 14:30:31 +02:00
Brandt Bucher	13380da91e	GH-104584: Fix refleak when tracing through calls (GH-110593)	2023-10-10 08:29:48 +00:00
Mark Shannon	15d4c9fabc	GH-108716: Turn off deep-freezing of code objects. (GH-108722)	2023-09-08 10:34:40 +01:00
Guido van Rossum	3107b453bc	gh-108253: Fix reads of uninitialized memory in funcobject.c (#108383 )	2023-08-23 22:36:19 +00:00
Guido van Rossum	b8f96b5eda	gh-108253: Fix bug in func version cache (#108296 ) When a function object changed its version, a stale pointer might remain in the cache. Zap these whenever `func_version` changes (even when set to 0).	2023-08-22 08:29:49 -07:00
Guido van Rossum	61c7249759	gh-106581: Project through calls (#108067 ) This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.	2023-08-17 11:29:58 -07:00
Brandt Bucher	05a824f294	GH-84436: Skip refcounting for known immortals (GH-107605)	2023-08-04 16:24:50 -07:00
Victor Stinner	1a3faba9f1	gh-106869: Use new PyMemberDef constant names (#106871) * Remove '#include "structmember.h"'. * If needed, add <stddef.h> to get offsetof() function. * Update Parser/asdl_c.py to regenerate Python/Python-ast.c. * Replace: * T_SHORT => Py_T_SHORT * T_INT => Py_T_INT * T_LONG => Py_T_LONG * T_FLOAT => Py_T_FLOAT * T_DOUBLE => Py_T_DOUBLE * T_STRING => Py_T_STRING * T_OBJECT => _Py_T_OBJECT * T_CHAR => Py_T_CHAR * T_BYTE => Py_T_BYTE * T_UBYTE => Py_T_UBYTE * T_USHORT => Py_T_USHORT * T_UINT => Py_T_UINT * T_ULONG => Py_T_ULONG * T_STRING_INPLACE => Py_T_STRING_INPLACE * T_BOOL => Py_T_BOOL * T_OBJECT_EX => Py_T_OBJECT_EX * T_LONGLONG => Py_T_LONGLONG * T_ULONGLONG => Py_T_ULONGLONG * T_PYSSIZET => Py_T_PYSSIZET * T_NONE => _Py_T_NONE * READONLY => Py_READONLY * PY_AUDIT_READ => Py_AUDIT_READ * READ_RESTRICTED => Py_AUDIT_READ * PY_WRITE_RESTRICTED => _Py_WRITE_RESTRICTED * RESTRICTED => (READ_RESTRICTED | _Py_WRITE_RESTRICTED)	2023-07-25 15:28:30 +02:00
Serhiy Storchaka	be1b968dc1	gh-106521: Remove _PyObject_LookupAttr() function (GH-106642)	2023-07-12 08:57:10 +03:00
Serhiy Storchaka	93d292c2b3	gh-106303: Use _PyObject_LookupAttr() instead of PyObject_GetAttr() (GH-106304) It simplifies and speeds up the code.	2023-07-09 15:27:03 +03:00
Serhiy Storchaka	08c08d21b0	gh-106033: Get rid of PyDict_GetItem in _PyFunction_FromConstructor (GH-106044)	2023-06-29 12:31:08 +03:00
Jelle Zijlstra	3fadd7d585	gh-104600: Make function.__type_params__ writable (#104601 )	2023-05-18 16:45:37 -07:00

238 commits