Fix also error handling in _BINARY_OP_ADD_FLOAT,
_BINARY_OP_SUBTRACT_FLOAT and _BINARY_OP_MULTIPLY_FLOAT opcodes.
PyStackRef_FromPyObjectSteal() must not be called with a NULL
pointer.
Add special cases for classmethod and staticmethod descriptors in
_PyObject_GetMethodStackRef() to avoid calling tp_descr_get, which
avoids reference count contention on the bound method and underlying
callable. This improves scaling when calling classmethods and
staticmethods from multiple threads.
Also refactor method_vectorcall in classobject.c into a new _PyObject_VectorcallPrepend() helper so that it can be used by
PyObject_VectorcallMethod as well.
the lazy imports PEP initial implementation (3.15 alpha) inadvertently incremented the length of the sys.flags tuple. In a way that did not do anything useful or related to the lazy imports setting (it exposed sys.flags.gil in the tuple). This fixes that to hard code the length to the 3.13 & 3.14 released length of 18 and have our tests and code comments make it clear that we've since stopped making new sys.flags attributes available via sequence index.
* Convert DEOPT_IFs to EXIT_IFs for guards. Keep DEOPT_IF for intentional drops to the interpreter.
* Modify BINARY_OP_SUBSCR_LIST_INT and STORE_SUBSCR_LIST_INT to handle negative indices, to keep EXIT_IFs and DEOPT_IFs in different uops
Cache one datachunk per tstate to prevent alloc/dealloc thrashing when repeatedly hitting the same call depth at exactly the wrong boundary.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
* Drop DOUBLE_IS_ARM_MIXED_ENDIAN_IEEE754 macro.
* Use DOUBLE_IS_BIG/LITTLE_ENDIAN_IEEE754 to detect endianness of
float/doubles.
* Drop "unknown_format" code path in PyFloat_Pack/Unpack*().
Co-authored-by: Victor Stinner <vstinner@python.org>
Add PyArg_ParseArray() and PyArg_ParseArrayAndKeywords()
functions to parse arguments of functions using the METH_FASTCALL
calling convention.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
This undoes a change made as a part of PR 137470, for compatibility with EMSDK
4.0.19. It adds `emscripten_trampoline` field in `pycore_runtime_structs.h`
and initializes it from JS initialization code with the wasm-gc based trampoline
if possible. Otherwise we fall back to the JS trampoline.
Add SIMD optimization for `bytes.hex()`, `bytearray.hex()`, and `binascii.hexlify()` as well as `hashlib` `.hexdigest()` methods using platform-agnostic GCC/Clang vector extensions that compile to native SIMD instructions on our [PEP-11 Tier 1 Linux and macOS](https://peps.python.org/pep-0011/#tier-1) platforms.
- 1.1-3x faster for common small data (16-64 bytes, covering md5 through sha512 digest sizes)
- Up to 11x faster for large data (1KB+)
- Retains the existing scalar code for short inputs (<16 bytes) or platforms lacking SIMD instructions, no observable performance regressions there.
## Supported platforms:
- x86-64: the compiler generates SSE2 - always available, no flags or CPU feature checks needed
- ARM64: NEON is always available, always available, no flags or CPU feature checks needed
- ARM32: Requires NEON support and that appropriate compiler flags enable that (e.g., `-march=native` on a Raspberry Pi 3+) - while we _could_ use runtime detection to allow neon when compiled without a recent enough `-march=` flag (`cortex-a53` and later IIRC), there are diminishing returns in doing so. Anyone using 32-bit ARM in a situation where performance matters will already be compiling with such flags. (as opposed to 32-bit Raspbian compilation that defaults to aiming primarily for compatibility with rpi1&0 armv6 arch=armhf which lacks neon)
- Windows/MSVC: Not supported. MSVC lacks `__builtin_shufflevector`, so the existing scalar path is used. Leaving it as an opportunity for the future for someone to figure out how to express the intent to that compiler.
This is compile time detection of features that are always available on the target architectures. No need for runtime feature inspection.
Deprecate PEP-456 [1] support for providing an external definition
of the string hashing scheme. Removal is scheduled for Python 3.19.
Previously, embedders could define the ``Py_HASH_ALGORITHM`` macro to be
``Py_HASH_EXTERNAL`` [2] to indicate that the hashing scheme was provided
externally but this feature was undocumented, untested and most likely
unused.
[1]: https://peps.python.org/pep-0456/
[2]: https://peps.python.org/pep-0456/#hash-function-selection