Commit graph

1540 commits

Author SHA1 Message Date
Guido van Rossum
76d0868907
Cleanup tier2 debug output (#116920)
Various tweaks, including a slight refactor of the special cases for `_PUSH_FRAME`/`_POP_FRAME` to show the actual operand emitted.
2024-03-18 11:08:43 -07:00
Tian Gao
7895a61168
gh-116098: Revert "gh-107674: Improve performance of sys.settrace (GH-114986)" (GH-116178)
Revert "gh-107674: Improve performance of `sys.settrace` (GH-114986)"

This reverts commit 0a61e23700.
2024-03-01 07:46:33 +01:00
Brandt Bucher
f0df35eeca
GH-115802: JIT "small" code for Windows (GH-115964) 2024-02-29 08:11:28 -08:00
Tian Gao
0a61e23700
gh-107674: Improve performance of sys.settrace (GH-114986) 2024-02-28 15:21:42 +00:00
Guido van Rossum
142502ea8d
Tier 2 cleanups and tweaks (#115534)
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of` _PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
2024-02-20 20:24:35 +00:00
Ken Jin
7a8c3ed43a
gh-115735: Fix current executor NULL before _START_EXECUTOR (#115736)
This fixes level 3 or higher lltrace debug output `--with-pydebug` runs.
2024-02-20 18:47:05 +00:00
Brett Simmers
0749244d13
gh-112175: Add eval_breaker to PyThreadState (#115194)
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.

The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
2024-02-20 09:57:48 -05:00
Mark Shannon
626c414995
GH-115457: Support splitting and replication of micro ops. (GH-115558) 2024-02-20 10:50:59 +00:00
Mark Shannon
7b21403ccd
GH-112354: Initial implementation of warm up on exits and trace-stitching (GH-114142) 2024-02-20 09:39:55 +00:00
Brandt Bucher
f6d9e5926b
GH-113464: Add a JIT backend for tier 2 (GH-113465)
Add an option (--enable-experimental-jit for configure-based builds
or --experimental-jit for PCbuild-based ones) to build an
*experimental* just-in-time compiler, based on copy-and-patch (https://fredrikbk.com/publications/copy-and-patch.pdf).

See Tools/jit/README.md for more information on how to install the required build-time tooling.
2024-01-28 18:48:48 -08:00
Brandt Bucher
30e6cbdba2
GH-113860: Get rid of _PyUOpExecutorObject (GH-113954) 2024-01-12 11:58:23 +00:00
Mark Shannon
0ae60b66de
GH-113486: Do not emit spurious PY_UNWIND events for optimized calls to classes. (GH-113680) 2024-01-05 09:45:22 +00:00
Mark Shannon
e96f26083b
GH-111485: Generate instruction and uop metadata (GH-113287) 2023-12-20 14:27:25 +00:00
Mark Shannon
6873555955
GH-112354: Treat _EXIT_TRACE like an unconditional side exit (GH-113104) 2023-12-14 14:26:44 +00:00
Serhiy Storchaka
1161c14e8c
gh-112716: Fix SystemError when __builtins__ is not a dict (GH-112770)
It was raised in two cases:
* in the import statement when looking up __import__
* in pickling some builtin type when looking up built-ins iter, getattr, etc.
2023-12-14 14:24:24 +02:00
Guido van Rossum
5b86644338
A smattering of cleanups in uop debug output and lltrace (#112980)
* Include destination T1 opcode in Error debug message
* Include destination T1 opcode in DEOPT debug message
* Remove obsolete comment from remove_unneeded_uops
* Change lltrace_instruction() to print caller's opcode/oparg
2023-12-11 16:42:30 -08:00
Serhiy Storchaka
8660fb7fd7
gh-112660: Do not clear arbitrary errors on import (GH-112661)
Previously arbitrary errors could be cleared during formatting error
messages for ImportError or AttributeError for modules. Now all
unexpected errors are reported.
2023-12-07 12:19:43 +02:00
Guido van Rossum
e723700190
Rename ...Uop... to ...UOp... (uppercase O) for consistency (#112327)
* Rename _PyUopExecute to _PyUOpExecute (uppercase O) for consistency
* Also rename _PyUopName and _PyUOp_Replacements, and some output strings
2023-11-28 17:10:11 -08:00
apaz
8f71b349de
gh-112217: Add check to call result for do_raise() where cause is a type. (#112216) 2023-11-27 21:13:27 +00:00
Michael Droettboom
6a00a58f60
gh-111786: Use separate opcode vars for Tier 1 and Tier 2 (#112289)
This makes Windows about 3% faster on pyperformance benchmarks.
2023-11-20 15:13:44 -08:00
Guido van Rossum
8deb8bc2e5
gh-112287: Speed up Tier 2 (uop) interpreter a little (#112286)
This makes the Tier 2 interpreter a little faster.
I calculated by about 3%,
though I hesitate to claim an exact number.

This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.

The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator know when these are used.

For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instructions code block, so it must be in a variable.)

For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
2023-11-20 11:25:32 -08:00
Guido van Rossum
1995955173
gh-106529: Make FOR_ITER a viable uop (#112134)
This uses the new mechanism whereby certain uops
are replaced by others during translation,
using the `_PyUop_Replacements` table.
We further special-case the `_FOR_ITER_TIER_TWO` uop
to update the deoptimization target to point
just past the corresponding `END_FOR` opcode.

Two tiny code cleanups are also part of this PR.
2023-11-20 10:08:53 -08:00
Hugo van Kemenade
3b3ec0d77f
gh-111863: Rename Py_NOGIL to Py_GIL_DISABLED (#111864)
Rename Py_NOGIL to Py_GIL_DISABLED
2023-11-20 15:52:00 +02:00
Guido van Rossum
7405745817
Various small improvements to uop debug output (#112218)
- Show uop name in Error/DEOPT messages
- Add target to some messages
- Expose uop_name() as _PyUopName()
2023-11-17 22:25:57 +00:00
Mark Shannon
4bbb367ba6
GH-111848: Set the IP when de-optimizing (GH-112065)
* Replace jumps with deopts in tier 2

* Fewer special cases of uop names

* Add target field to uop IR

* Remove more redundant SET_IP and _CHECK_VALIDITY micro-ops

* Extend whitelist of non-escaping API functions.
2023-11-15 15:48:58 +00:00
Serhiy Storchaka
16055c1604
gh-111789: Simplify ceval.c by using PyDict_GetItemRef() (GH-111980) 2023-11-14 11:29:49 +02:00
Brandt Bucher
31ad7e061e
GH-111520: Add back the operand local (GH-111813) 2023-11-13 17:27:19 -08:00
Michael Droettboom
bc12f79112
gh-111786: Optimize for space for _PyEval_EvalFrameDefault on MSVC for PGO (#111794)
In PGO mode, this function caused a compiler error in MSVC.
It turns out that optimizing for space only save the day, and is even faster.
However, without PGO, this is neither necessary nor slower.
2023-11-09 18:41:40 +00:00
Mark Shannon
25c4956488
GH-109369: Exit tier 2 if executor is invalid (GH-111657) 2023-11-09 11:19:51 +00:00
Michael Droettboom
25937e3188
gh-111663: Restore the Tier 2 uop count pystats (#111664) 2023-11-02 15:24:52 -07:00
Irit Katriel
52cc4af6ae
gh-111354: simplify detection of RESUME after YIELD_VALUE at except-depth 1 (#111459) 2023-11-02 10:18:43 +00:00
Serhiy Storchaka
970e719a7a
gh-108082: Use PyErr_FormatUnraisable() (GH-111580)
Replace most of calls of _PyErr_WriteUnraisableMsg() and some
calls of PyErr_WriteUnraisable(NULL) with PyErr_FormatUnraisable().

Co-authored-by: Victor Stinner <vstinner@python.org>
2023-11-02 09:16:34 +00:00
Guido van Rossum
7e135a48d6
gh-111520: Integrate the Tier 2 interpreter in the Tier 1 interpreter (#111428)
- There is no longer a separate Python/executor.c file.
- Conventions in Python/bytecodes.c are slightly different -- don't use `goto error`,
  you must use `GOTO_ERROR(error)` (same for others like `unused_local_error`).
- The `TIER_ONE` and `TIER_TWO` symbols are only valid in the generated (.c.h) files.
- In Lib/test/support/__init__.py, `Py_C_RECURSION_LIMIT` is imported from `_testcapi`.
- On Windows, in debug mode, stack allocation grows from 8MiB to 12MiB.
- **Beware!** This changes the env vars to enable uops and their debugging
  to `PYTHON_UOPS` and `PYTHON_LLTRACE`.
2023-11-01 13:13:02 -07:00
Mark Shannon
d27acd4461
GH-111485: Increment next_instr consistently at the start of the instruction. (GH-111486) 2023-10-31 10:09:54 +00:00
Sam Gross
6dfb8fe023
gh-110481: Implement biased reference counting (gh-110764) 2023-10-30 16:06:09 +00:00
Irit Katriel
a0c414c35d
gh-111354: define names for RESUME oparg values (#111365) 2023-10-26 16:30:18 +01:00
Irit Katriel
67a91f78e4
gh-109094: replace frame->prev_instr by frame->instr_ptr (#109095) 2023-10-26 13:43:10 +00:00
denballakh
dd9d781da3
gh-110237: Check PyList_Append for errors in _PyEval_MatchClass (#110238) 2023-10-07 17:04:51 -07:00
Brandt Bucher
22e65eecaa
GH-105848: Replace KW_NAMES + CALL with LOAD_CONST + CALL_KW (GH-109300) 2023-09-13 10:25:45 -07:00
Mark Shannon
5a2a046151
GH-108390: Prevent non-local events being set with sys.monitoring.set_local_events() (GH-108420) 2023-09-05 08:03:53 +01:00
Victor Stinner
03c4080c71
gh-108765: Python.h no longer includes <ctype.h> (#108831)
Remove <ctype.h> in C files which don't use it; only sre.c and
_decimal.c still use it.

Remove _PY_PORT_CTYPE_UTF8_ISSUE code from pyport.h:

* Code added by commit b5047fd019
  in 2004 for MacOSX and FreeBSD.
* Test removed by commit 52ddaefb6b
  in 2007, since Python str type now uses locale independent
  functions like Py_ISALPHA() and Py_TOLOWER() and the Unicode
  database.

Modules/_sre/sre.c replaces _PY_PORT_CTYPE_UTF8_ISSUE with new
functions: sre_isalnum(), sre_tolower(), sre_toupper().

Remove unused includes:

* _localemodule.c: remove <stdio.h>.
* getargs.c: remove <float.h>.
* dynload_win.c: remove <direct.h>, it no longer calls _getcwd()
  since commit fb1f68ed7c (in 2001).
2023-09-03 18:54:27 +02:00
Mark Shannon
059bd4d299
GH-108614: Remove non-debug uses of #if TIER_ONE and #if TIER_TWO from _POP_FRAME op. (GH-108685) 2023-08-31 11:34:52 +01:00
Victor Stinner
b32d4cad15
gh-108444: Replace _PyLong_AsInt() with PyLong_AsInt() (#108459)
Change generated by the command:

sed -i -e 's!_PyLong_AsInt!PyLong_AsInt!g' \
    $(find -name "*.c" -o -name "*.h")
2023-08-25 01:01:30 +02:00
Irit Katriel
72119d16a5
gh-105481: remove regen-opcode. Generated _PyOpcode_Caches in regen-cases. (#108367) 2023-08-23 18:39:00 +01:00
Pablo Galindo Salgado
75b3db8445
gh-107944: Improve error message for function calls with bad keyword arguments (#107969) 2023-08-17 19:39:42 +01:00
Guido van Rossum
61c7249759
gh-106581: Project through calls (#108067)
This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.
2023-08-17 11:29:58 -07:00
Mark Shannon
006e44f950
GH-108035: Remove the _PyCFrame struct as it is no longer needed for performance. (GH-108036) 2023-08-17 11:16:03 +01:00
Guido van Rossum
dc8fdf5fd5
gh-106581: Split CALL_PY_EXACT_ARGS into uops (#107760)
* Split `CALL_PY_EXACT_ARGS` into uops

This is only the first step for doing `CALL` in Tier 2.
The next step involves tracing into the called code object and back.
After that we'll have to do the remaining `CALL` specialization.
Finally we'll have to deal with `KW_NAMES`.

Note: this moves setting `frame->return_offset` directly in front of
`DISPATCH_INLINED()`, to make it easier to move it into `_PUSH_FRAME`.
2023-08-16 16:26:43 -07:00
Mark Shannon
52fbcf61b5
GH-107724: Fix the signature of PY_THROW callback functions. (GH-107725) 2023-08-09 09:30:50 +01:00
Guido van Rossum
328d925244
gh-107758: Improvements to lltrace feature (#107757)
- The `dump_stack()` method could call a `__repr__` method implemented in Python,
  causing (infinite) recursion.
  I rewrote it to only print out the values for some fundamental types (`int`, `str`, etc.);
  for everything else it just prints `<type_name @ 0xdeadbeef>`.

- The lltrace-like feature for uops wrote to `stderr`, while the one in `ceval.c` writes to `stdout`;
  I changed the uops to write to stdout as well.
2023-08-07 21:36:25 -07:00