Commit graph

1549 commits

Author SHA1 Message Date
Pablo Galindo Salgado
543acbce5a
bpo-45256: Small cleanups for the code that inlines Python-to-Python calls in ceval.c (GH-28836) 2021-10-09 17:52:05 +01:00
Pablo Galindo Salgado
b4903afd4d
bpo-45256: Remove the usage of the C stack in Python to Python calls (GH-28488)
Ths commit inlines calls to Python functions in the eval loop and steals all the arguments in the call from the caller for
performance.
2021-10-09 16:51:30 +01:00
Mark Shannon
a7252f88d3
bpo-40116: Add insertion order bit-vector to dict values to allow dicts to share keys more freely. (GH-28520) 2021-10-06 13:19:53 +01:00
Mark Shannon
f6eafe18c0
Normalize jumps in compiler. All forward jumps to use JUMP_FORWARD. (GH-28755) 2021-10-06 13:05:45 +01:00
Mark Shannon
bd627eb7ed
bpo-43760: Check for tracing using 'bitwise or' instead of branch in dispatch. (GH-28723) 2021-10-05 11:01:11 +01:00
Pablo Galindo Salgado
07cf10bafc
Fix compiler warning in ceval.c regarding signed comparison (GH-28716) 2021-10-04 12:13:46 +01:00
Serhiy Storchaka
60b9e040c9
bpo-45355: Use sizeof(_Py_CODEUNIT) instead of literal 2 for the size of the code unit (GH-28711) 2021-10-03 21:22:42 +03:00
Mark Shannon
cd760ceb67
Fix a couple of compiler warnings. (GH-28677) 2021-10-01 15:44:19 +01:00
Ken Jin
70bed6f993
bpo-45107: Make LOAD_METHOD_CLASS safer and faster, clean up comments (GH-28177)
* Improve comments

* Check cls is a type, remove dict calculation
2021-09-17 18:47:36 +08:00
Dong-hee Na
e6497fe698
bpo-45045: Optimize mapping patterns of structural pattern matching (GH-28043) 2021-08-30 19:02:32 +09:00
Mark Shannon
d3eaf0cc5b
bpo-44945: Specialize BINARY_ADD (GH-27967) 2021-08-27 09:21:01 +01:00
Mark Shannon
f9242d50b1
bpo-44990: Change layout of evaluation frames. "Layout B" (GH-27933)
Places the locals between the specials and stack. This is the more "natural" layout for a C struct, makes the code simpler and gives a slight speedup (~1%)
2021-08-25 13:44:20 +01:00
Ken Jin
96346cb6d0
bpo-44889: Specialize LOAD_METHOD with PEP 659 adaptive interpreter (GH-27722)
Adds four new instructions:

* LOAD_METHOD_ADAPTIVE
* LOAD_METHOD_CACHED
* LOAD_METHOD_MODULE
* LOAD_METHOD_CLASS
2021-08-17 15:55:55 +01:00
Mark Shannon
4f51fa9e2d
bpo-44900: Add five superinstructions. (GH-27741)
* LOAD_FAST LOAD_FAST
* STORE_FAST LOAD_FAST
* LOAD_FAST LOAD_CONST
* LOAD_CONST LOAD_FAST
* STORE_FAST STORE_FAST
2021-08-16 12:23:13 +01:00
Irit Katriel
8ac0886091
bpo-44890: collect specialization stats if Py_DEBUG (GH-27731) 2021-08-12 12:15:06 +01:00
Mark Shannon
a530a9538f
bpo-44878: Remove loop from interpreter. All dispatching is done by gotos. (GH-27727) 2021-08-12 11:47:38 +01:00
Mark Shannon
f66d00fdd7
bpo-44878: Remove the switch from the main interpreter loop when using computed gotos. (GH-27726)
* Refactor dispatch logic to make flow of control clearer. Moves lltrace and dxprofile instrumentation into DISPATCH macro.

* Remove switch in interpreter loop when using computed gotos. There is no need for two nearly-duplicate dispatch tables.
2021-08-11 14:02:11 +01:00
Mark Shannon
3f3d5dcac3
bpo-44878: _PyEval_EvalFrameDefault readability improvements (GH-27725)
* Move a few variable declarations to point of definition.

* Factor out tracing of function entry into helper function.
2021-08-11 11:47:52 +01:00
Mark Shannon
c174eafc33
Add missing DISPATCH() (GH-27715) 2021-08-11 09:25:26 +01:00
Mark Shannon
c7ea1e3dce
Fix stats for STORE_ATTR specialization. (GH-27708) 2021-08-10 11:40:37 +01:00
Mark Shannon
ac75f6bdd4
bpo-44826: Specialize STORE_ATTR (GH-27590)
* Generalize cache names for LOAD_ATTR to allow store and delete specializations.

* Factor out specialization of attribute dictionary access.

* Specialize STORE_ATTR.
2021-08-09 10:40:21 +01:00
Mark Shannon
c83919bd63
Add option to write specialization stats to files and script to summarize. (GH-27575)
* Add option to write stats to random file in a directory.

* Add script to summarize stats.
2021-08-04 11:39:52 +01:00
Mark Shannon
2116909b3e
Minor fixes to specialization stats. (GH-27457)
* Use class, not value for fail stats for BINARY_SUBSCR.

* Fix counts for unquickened instructions.
2021-07-29 20:50:03 +01:00
Mark Shannon
ae0a2b7562
bpo-44590: Lazily allocate frame objects (GH-27077)
* Convert "specials" array to InterpreterFrame struct, adding f_lasti, f_state and other non-debug FrameObject fields to it.

* Refactor, calls pushing the call to the interpreter upward toward _PyEval_Vector.

* Compute f_back when on thread stack, only filling in value when frame object outlives stack invocation.

* Move ownership of InterpreterFrame in generator from frame object to generator object.

* Do not create frame objects for Python calls.

* Do not create frame objects for generators.
2021-07-26 11:22:16 +01:00
Mark Shannon
d09c134178
bpo-44645: Check for interrupts on any potentially backwards edge (GH-27216) 2021-07-19 11:10:21 +01:00
Pablo Galindo Salgado
c90c591e51
Revert "bpo-44645: Check for interrupts on any potentially backwards edge. (GH-27167)" (#27194)
This reverts commit 000e70ad52.
2021-07-16 19:05:47 +02:00
Mark Shannon
000e70ad52
bpo-44645: Check for interrupts on any potentially backwards edge. (GH-27167) 2021-07-16 10:59:31 +01:00
Pablo Galindo Salgado
4cb7263f0c
Remove sys._deactivate_opcache() now that is not needed (GH-27154) 2021-07-15 14:43:59 +01:00
Irit Katriel
641345d636
bpo-26280: Port BINARY_SUBSCR to PEP 659 adaptive interpreter (GH-27043) 2021-07-15 13:13:12 +01:00
Mark Shannon
da6414f0ac
bpo-44570: Fix line tracing for forwards jumps to duplicated tails (GH-27068) 2021-07-08 19:21:09 +01:00
Mark Shannon
514f76bbac
bpo-44581: Don't execute quickened instructions if tracing is on (GH-27064) 2021-07-08 13:33:13 +01:00
Gabriele N. Tornetta
2f180ce2cb
bpo-44530: Add co_qualname field to PyCodeObject (GH-26941) 2021-07-07 12:21:51 +01:00
Serhiy Storchaka
20a88004ba
bpo-12022: Change error type for bad objects in "with" and "async with" (GH-26809)
A TypeError is now raised instead of an AttributeError in
"with" and "async with" statements for objects which do not
support the context manager or asynchronous context manager
protocols correspondingly.
2021-06-29 11:27:04 +03:00
Mark Shannon
c3f52b4d70
bpo-44486: Make sure that modules always have a dictionary. (GH-26847)
* Make sure that modules always have a dictionary.
2021-06-23 10:00:43 +01:00
Pablo Galindo
06cda808f1
bpo-44472: Fix ltrace functionality when exceptions are raised (GH-26822) 2021-06-21 16:23:53 +01:00
Mark Shannon
fb68791a26
bpo-44337: Improve LOAD_ATTR specialization (GH-26759)
* Specialize obj.__class__ with LOAD_ATTR_SLOT

* Specialize instance attribute lookup with attribute on class, provided attribute on class is not an overriding descriptor.

* Add stat for how many times the unquickened instruction has executed.
2021-06-21 11:49:21 +01:00
Mark Shannon
0982ded179
bpo-44032: Move pointer to code object from frame-object to frame specials array. (GH-26771) 2021-06-18 11:00:29 +01:00
Eric Snow
ac38a9f2df
bpo-43693: Eliminate unused "fast locals". (gh-26587)
Currently, if an arg value escapes (into the closure for an inner function) we end up allocating two indices in the fast locals even though only one gets used.  Additionally, using the lower index would be better in some cases, such as with no-arg `super()`.  To address this, we update the compiler to fix the offsets so each variable only gets one "fast local".  As a consequence, now some cell offsets are interspersed with the locals (only when an arg escapes to an inner function).

https://bugs.python.org/issue43693
2021-06-15 16:35:25 -06:00
Mark Shannon
358aa6197c
Remove accidentally duplicated STAT_INC (GH-26718) 2021-06-14 13:38:16 +01:00
Mark Shannon
eecbc7c390
bpo-44338: Port LOAD_GLOBAL to PEP 659 adaptive interpreter (GH-26638)
* Add specializations of LOAD_GLOBAL.

* Add more stats.

* Remove old opcache; it is no longer used.

* Add NEWS
2021-06-14 11:04:09 +01:00
Mark Shannon
54cb63863f
bpo-44348: Move trace-info to thread-state (GH-26623)
* Move trace-info to thread state.

* Correct output for pdb when turning on tracing in middle of line
2021-06-10 08:46:59 +01:00
Mark Shannon
e117c02837
bpo-44337: Port LOAD_ATTR to PEP 659 adaptive interpreter (GH-26595)
* Specialize LOAD_ATTR with  LOAD_ATTR_SLOT and LOAD_ATTR_SPLIT_KEYS

* Move dict-common.h to internal/pycore_dict.h

* Add LOAD_ATTR_WITH_HINT specialized opcode.

* Quicken in function if loopy

* Specialize LOAD_ATTR for module attributes.

* Add specialization stats
2021-06-10 08:46:01 +01:00
Eric Snow
3e1c7167d8
bpo-43693: Un-revert commit f3fa63e. (#26609)
This was reverted in GH-26596 (commit 6d518bb) due to some bad memory accesses.

* Add the MAKE_CELL opcode. (gh-26396)

The memory accesses have been fixed.

https://bugs.python.org/issue43693
2021-06-08 16:01:34 -06:00
Pablo Galindo
3fe921cd49
Revert "bpo-43693: Add the MAKE_CELL opcode and interleave fast locals offsets. (gh-26396)" (GH-26597)
This reverts commit 631f9938b1.
2021-06-08 13:17:55 +01:00
Eric Snow
631f9938b1
bpo-43693: Add the MAKE_CELL opcode and interleave fast locals offsets. (gh-26396)
This moves logic out of the frame initialization code and into the compiler and eval loop.  Doing so simplifies the runtime code and allows us to optimize it better.

https://bugs.python.org/issue43693
2021-06-07 16:52:00 -06:00
Eric Snow
2ab27c4af4
bpo-43693: Un-revert commits 2c1e258 and b2bf2bc. (gh-26577)
These were reverted in gh-26530 (commit 17c4edc) due to refleaks.

* 2c1e258 - Compute deref offsets in compiler (gh-25152)
* b2bf2bc - Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)

This change fixes the refleaks.

https://bugs.python.org/issue43693
2021-06-07 12:22:26 -06:00
Mark Shannon
001eb520b5
bpo-44187: Quickening infrastructure (GH-26264)
* Add co_firstinstr field to code object.

* Implement barebones quickening.

* Use non-quickened bytecode when tracing.

* Add NEWS item

* Add new file to Windows build.

* Don't specialize instructions with EXTENDED_ARG.
2021-06-07 18:38:06 +01:00
Pablo Galindo
17c4edc4e0
bpo-43693: Revert commits 2c1e2583fd and b2bf2bc1ec (GH-26530)
* Revert "bpo-43693: Compute deref offsets in compiler (gh-25152)"

This reverts commit b2bf2bc1ec.

* Revert "bpo-43693: Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)"

This reverts commit 2c1e2583fd.

These two commits are breaking the refleak buildbots.
2021-06-04 17:51:05 +01:00
Mark Shannon
b2bf2bc1ec
bpo-43693: Compute deref offsets in compiler (gh-25152)
Merges locals and cells into a single array.
Saves a pointer in the interpreter and means that we don't need the LOAD_CLOSURE opcode any more

https://bugs.python.org/issue43693
2021-06-03 18:03:54 -06:00
Eric Snow
2c1e2583fd
bpo-43693: Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)
A number of places in the code base (notably ceval.c and frameobject.c) rely on mapping variable names to indices in the frame "locals plus" array (AKA fast locals), and thus opargs.  Currently the compiler indirectly encodes that information on the code object as the tuples co_varnames, co_cellvars, and co_freevars.  At runtime the dependent code must calculate the proper mapping from those, which isn't ideal and impacts performance-sensitive sections.  This is something we can easily address in the compiler instead.

This change addresses the situation by replacing internal use of co_varnames, etc. with a single combined tuple of names in locals-plus order, along with a minimal array mapping each to its kind (local vs. cell vs. free).  These two new PyCodeObject fields, co_fastlocalnames and co_fastllocalkinds, are not exposed to Python code for now, but co_varnames, etc. are still available with the same values as before (though computed lazily).

Aside from the (mild) performance impact, there are a number of other benefits:

* there's now a clear, direct relationship between locals-plus and variables
* code that relies on the locals-plus-to-name mapping is simpler
* marshaled code objects are smaller and serialize/de-serialize faster

Also note that we can take this approach further by expanding the possible values in co_fastlocalkinds to include specific argument types (e.g. positional-only, kwargs).  Doing so would allow further speed-ups in _PyEval_MakeFrameVector(), which is where args get unpacked into the locals-plus array.  It would also allow us to shrink marshaled code objects even further.

https://bugs.python.org/issue43693
2021-06-03 10:28:27 -06:00