Commit graph

1541 commits

Author SHA1 Message Date
Hugo van Kemenade
899fdb213d
Revert "GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)" (#126983) 2024-11-19 11:25:09 +02:00
Mark Shannon
b0fcc2c47a
GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Remove lazy dict tracking

* Update docs

* Clearer calculation of work to do.
2024-11-18 14:31:26 +00:00
mpage
2e95c5ba3b
gh-115999: Implement thread-local bytecode and enable specialization for BINARY_OP (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
2024-11-04 11:13:32 -08:00
Shantanu
500f5338a8
gh-123930: Better error for "from imports" when script shadows module (#123929) 2024-10-24 12:11:12 -07:00
Sam Gross
3c4a7fa617
gh-124218: Avoid refcount contention on builtins module (GH-125847)
This replaces `_PyEval_BuiltinsFromGlobals` with
`_PyDict_LoadBuiltinsFromGlobals`, which returns a new reference
instead of a borrowed reference. Internally, the new function uses
per-thread reference counting when possible to avoid contention on the
refcount fields on the builtins module.
2024-10-24 12:44:38 -04:00
Pablo Galindo Salgado
3d1df3d84e
gh-125703: Correctly honour tracemalloc hooks on more PyDECREF specialized paths (#125712) 2024-10-21 15:39:05 +01:00
Eric Snow
6d93690954
gh-125604: Move _Py_AuditHookEntry, etc. Out of pycore_runtime.h (gh-125605)
This is essentially a cleanup, moving a handful of API declarations to the header files where they fit best, creating new ones when needed.

We do the following:

* add pycore_debug_offsets.h and move _Py_DebugOffsets, etc. there
* inline struct _getargs_runtime_state and struct _gilstate_runtime_state in _PyRuntimeState
* move struct _reftracer_runtime_state to the existing pycore_object_state.h
* add pycore_audit.h and move to it _Py_AuditHookEntry , _PySys_Audit(), and _PySys_ClearAuditHooks
* add audit.h and cpython/audit.h and move the existing audit-related API there
*move the perfmap/trampoline API from cpython/sysmodule.h to cpython/ceval.h, and remove the now-empty cpython/sysmodule.h
2024-10-18 09:26:08 -06:00
Michael Droettboom
37986e830b
gh-123153: Fix PGO builds with free-threading on Windows (#125607)
* gh-123153: Fix PGO builds with free-threading

* Redo how the #define works
2024-10-17 08:20:30 -04:00
Michael Droettboom
51410d8bdc
gh-125217: Turn off optimization around_PyEval_EvalFrameDefault to avoid MSVC crash (#125477) 2024-10-16 12:51:15 +00:00
Victor Stinner
b9a8ca0a6a
gh-115754: Use Py_GetConstant(Py_CONSTANT_EMPTY_STR) (#125194)
Replace PyUnicode_New(0, 0), PyUnicode_FromString("")
and PyUnicode_FromStringAndSize("", 0)
with Py_GetConstant(Py_CONSTANT_EMPTY_STR).
2024-10-09 17:15:23 +02:00
Mark Shannon
da071fa3e8
GH-119866: Spill the stack around escaping calls. (GH-124392)
* Spill the evaluation around escaping calls in the generated interpreter and JIT. 

* The code generator tracks live, cached values so they can be saved to memory when needed.

* Spills the stack pointer around escaping calls, so that the exact stack is visible to the cycle GC.
2024-10-07 14:56:39 +01:00
Sam Gross
f4997bb3ac
gh-123923: Defer refcounting for f_funcobj in _PyInterpreterFrame (#124026)
Use a `_PyStackRef` and defer the reference to `f_funcobj` when
possible. This avoids some reference count contention in the common case
of executing the same code object from multiple threads concurrently in
the free-threaded build.
2024-09-24 20:08:18 +00:00
Mark Shannon
c87b0e4a46
GH-124284: Add stats for refcount operations on immortal objects (GH-124288) 2024-09-23 19:10:55 +01:00
Ken Jin
8810e286fa
gh-121459: Deferred LOAD_GLOBAL (GH-123128)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Sam Gross <655866+colesbury@users.noreply.github.com>
2024-09-14 00:23:51 +08:00
Sam Gross
b2afe2aae4
gh-123923: Defer refcounting for f_executable in _PyInterpreterFrame (#123924)
Use a `_PyStackRef` and defer the reference to `f_executable` when
possible. This avoids some reference count contention in the common case
of executing the same code object from multiple threads concurrently in
the free-threaded build.
2024-09-12 12:37:06 -04:00
Tushar Sadhwani
3597642ed5
gh-122239: Add actual count in unbalanced unpacking error message when possible (#122244) 2024-09-10 16:07:30 +01:00
Sam Gross
556e855684
gh-117376: Make Py_DECREF a macro in ceval.c in free-threaded build (#122975)
`Py_DECREF` and `PyStackRef_CLOSE` are now implemented as macros in the
free-threaded build in ceval.c. There are two motivations;

 * MSVC has problems inlining functions in ceval.c in the PGO build.

 * We will want to mark escaping calls in order to spill the stack
   pointer in ceval.c and we will want to do this around `_Py_Dealloc`
   (or `_Py_MergeZeroLocalRefcount` or `_Py_DecRefShared`), not around
   the entire `Py_DECREF` or `PyStackRef_CLOSE` call.
2024-08-23 15:36:14 -04:00
Mark Shannon
bb1d30336e
GH-118093: Make CALL_ALLOC_AND_ENTER_INIT suitable for tier 2. (GH-123140)
* Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it

* Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
2024-08-20 16:52:58 +01:00
Victor Stinner
4767a6e31c
gh-122728: Fix SystemError in PyEval_GetLocals() (#122735)
Fix PyEval_GetLocals() to avoid SystemError ("bad argument to
internal function"). Don't redefine the 'ret' variable in the if
block.

Add an unit test on PyEval_GetLocals().
2024-08-06 23:01:44 +02:00
Mark Shannon
7aca84e557
GH-117224: Move the body of a few large-ish micro-ops into helper functions (GH-122601) 2024-08-02 16:31:17 +01:00
Brandt Bucher
15d4cd0967
GH-116090: Fire RAISE events from _FOR_ITER_TIER_TWO (GH-122413) 2024-07-29 12:17:47 -07:00
Mark Shannon
afb0aa6ed2
GH-121131: Clean up and fix some instrumented instructions. (GH-121132)
* Add support for 'prev_instr' to code generator and refactor some INSTRUMENTED instructions
2024-07-26 12:24:12 +01:00
Brandt Bucher
7b36b67b1e
GH-118093: Add tier two support to several instructions (GH-121884) 2024-07-18 14:24:58 -07:00
Mark Shannon
169324c27a
GH-120024: Use pointer for stack pointer (GH-121923) 2024-07-18 12:47:21 +01:00
Tian Gao
e65cb4c6f0
gh-118934: Make PyEval_GetLocals return borrowed reference (#119769)
Co-authored-by: Alyssa Coghlan <ncoghlan@gmail.com>
2024-07-16 12:17:47 -07:00
Michael Droettboom
d69529d31c
gh-121338: Remove #pragma optimize (#121340) 2024-07-08 08:48:42 -04:00
Sam Gross
8e8d202f55
gh-117139: Add _PyTuple_FromStackRefSteal and use it (#121244)
Avoids the extra conversion from stack refs to PyObjects.
2024-07-02 12:30:14 -04:00
Brandt Bucher
33903c53db
GH-116017: Get rid of _COLD_EXITs (GH-120960) 2024-07-01 13:17:40 -07:00
Ken Jin
e6543daf12
gh-117139: Fix a few wrong steals in bytecodes.c (GH-121127)
Fix a few wrong steals in bytecodes.c
2024-06-29 02:14:48 +08:00
Ken Jin
22b0de2755
gh-117139: Convert the evaluation stack to stack refs (#118450)
This PR sets up tagged pointers for CPython.

The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly.

Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pans out.

This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it.

The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs.

Please read Include/internal/pycore_stackref.h for more information!

---------

Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>
2024-06-27 03:10:43 +08:00
Irit Katriel
65a12c559c
gh-120834: fix type of *_iframe field in _PyGenObject_HEAD declaration (#120835) 2024-06-24 10:23:38 +01:00
Mark Shannon
9cefcc0ee7
GH-120507: Lower the BEFORE_WITH and BEFORE_ASYNC_WITH instructions. (#120640)
* Remove BEFORE_WITH and BEFORE_ASYNC_WITH instructions.

* Add LOAD_SPECIAL instruction

* Reimplement `with` and `async with` statements using LOAD_SPECIAL
2024-06-18 12:17:46 +01:00
Xie Yanbo
9e052619a6
Fix typos in documentation and comments (#119763) 2024-06-04 10:22:22 +00:00
Irit Katriel
6e9863d7a3
gh-118692: Avoid creating unnecessary StopIteration instances for monitoring (#119216) 2024-05-21 20:42:51 +00:00
Nikita Sobolev
a8e5fed100
gh-118613: Fix error handling of _PyEval_GetFrameLocals in ceval.c (#118614) 2024-05-06 10:34:56 +03:00
Tian Gao
b034f14a4b
gh-74929: Implement PEP 667 (GH-115153) 2024-05-04 12:12:10 +01:00
Mark Shannon
1ab6356ebe
GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322)
* Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and call CALL_NON_PY_GENERAL specializations.

* Remove CALL_PY_WITH_DEFAULTS specialization

* Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
2024-05-04 12:11:11 +01:00
Tian Gao
9c14ed0618
gh-107674: Improve performance of sys.settrace (GH-117133)
* Check tracing in RESUME_CHECK

* Only change to RESUME_CHECK if not tracing
2024-05-03 19:49:24 +01:00
Guido van Rossum
7d83f7bcc4
gh-118335: Configure Tier 2 interpreter at build time (#118339)
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.

We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.

On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.

In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
2024-04-30 18:26:34 -07:00
Dino Viehland
4a1cf66c5c
gh-117657: Fix small issues with instrumentation and TSAN (#118064)
Small TSAN fixups for instrumentation
2024-04-30 11:38:05 -07:00
Mark Shannon
3e06c7f719
GH-118095: Add dynamic exit support and FOR_ITER_GEN support to tier 2 (GH-118279) 2024-04-26 18:08:50 +01:00
Dino Viehland
07525c9a85
gh-116818: Make sys.settrace, sys.setprofile, and monitoring thread-safe (#116775)
Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe.

Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version.  There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.
2024-04-19 14:47:42 -07:00
Guido van Rossum
40f4d641a9
GH-118036: Fix a bug with CALL_STAT_INC (#117933)
We were under-counting calls in `_PyEvalFramePushAndInit`
because the `CALL_STAT_INC` macro was redefined to a no-op
for the Tier 2 interpreter. The fix is not to `#undef` it at all.
This results in ~37% more "Frames pushed" reported
under "Call stats".
2024-04-18 07:59:02 -07:00
Jeff Glass
acf69e09c6
gh-115178: Add Counts of UOp Pairs to pystats (GH-115181) 2024-04-16 14:27:18 +01:00
Michael Droettboom
0edde64a41
GH-117457: Correct pystats uop "miss" counts (GH-117477) 2024-04-04 15:49:18 -07:00
Guido van Rossum
060a96f1a9
gh-116968: Reimplement Tier 2 counters (#117144)
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.

The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
2024-04-04 15:03:27 +00:00
Guido van Rossum
8eda146e87
Fix successor opcode name printing in Tier 2 DEOPT debug message (#117471) 2024-04-02 18:25:48 +00:00
Sam Gross
19c1dd60c5
gh-117323: Make cell thread-safe in free-threaded builds (#117330)
Use critical sections to lock around accesses to cell contents. The critical sections are no-ops in the default (with GIL) build.
2024-03-29 13:35:43 -04:00
Michael Droettboom
26d328b2ba
GH-117121: Add pystats to JIT builds (GH-117346) 2024-03-28 15:23:08 -07:00
Mark Shannon
bf82f77957
GH-116422: Tier2 hot/cold splitting (GH-116813)
Splits the "cold" path, deopts and exits, from the "hot" path, reducing the size of most jitted instructions, at the cost of slower exits.
2024-03-26 09:35:11 +00:00