When Python build is optimized with GCC using PGO, use
-fprofile-update=atomic option to use atomic operations when updating
profile information. This option reduces the risk of gcov Data Files
(.gcda) corruption which can cause random GCC crashes.
* Drop DOUBLE_IS_ARM_MIXED_ENDIAN_IEEE754 macro.
* Use DOUBLE_IS_BIG/LITTLE_ENDIAN_IEEE754 to detect endianness of
float/doubles.
* Drop "unknown_format" code path in PyFloat_Pack/Unpack*().
Co-authored-by: Victor Stinner <vstinner@python.org>
This undoes a change made as a part of PR 137470, for compatibility with EMSDK
4.0.19. It adds `emscripten_trampoline` field in `pycore_runtime_structs.h`
and initializes it from JS initialization code with the wasm-gc based trampoline
if possible. Otherwise we fall back to the JS trampoline.
* introduce executable specific linker flags
Add PY_CORE_EXE_LDFLAGS and EXE_LDFLAGS which stores executable specific
LDFLAGS, replacing PY_CORE_LDFLAGS for building
executable targets.
If PY_CORE_EXE_LDFLAGS / EXE_LDFLAGS is not provided, then it defaults
to the value of PY_CORE_LDFLAGS which is the existing behaviour.
If both flags are supplied, and there is a need
to distinguish between executable and shared specific LDFLAGS,
in particular, PY_CORE_LDFLAGS should contain the shared specific LDFLAGS.
* documentation for new linker flags
* update Misc folder documentation
* Update Makefile.pre.in
Co-authored-by: Victor Stinner <vstinner@python.org>
---------
Co-authored-by: Filipe Laíns <filipe.lains@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Filipe Laíns <lains@riseup.net>
Add SIMD optimization for `bytes.hex()`, `bytearray.hex()`, and `binascii.hexlify()` as well as `hashlib` `.hexdigest()` methods using platform-agnostic GCC/Clang vector extensions that compile to native SIMD instructions on our [PEP-11 Tier 1 Linux and macOS](https://peps.python.org/pep-0011/#tier-1) platforms.
- 1.1-3x faster for common small data (16-64 bytes, covering md5 through sha512 digest sizes)
- Up to 11x faster for large data (1KB+)
- Retains the existing scalar code for short inputs (<16 bytes) or platforms lacking SIMD instructions, no observable performance regressions there.
## Supported platforms:
- x86-64: the compiler generates SSE2 - always available, no flags or CPU feature checks needed
- ARM64: NEON is always available, always available, no flags or CPU feature checks needed
- ARM32: Requires NEON support and that appropriate compiler flags enable that (e.g., `-march=native` on a Raspberry Pi 3+) - while we _could_ use runtime detection to allow neon when compiled without a recent enough `-march=` flag (`cortex-a53` and later IIRC), there are diminishing returns in doing so. Anyone using 32-bit ARM in a situation where performance matters will already be compiling with such flags. (as opposed to 32-bit Raspbian compilation that defaults to aiming primarily for compatibility with rpi1&0 armv6 arch=armhf which lacks neon)
- Windows/MSVC: Not supported. MSVC lacks `__builtin_shufflevector`, so the existing scalar path is used. Leaving it as an opportunity for the future for someone to figure out how to express the intent to that compiler.
This is compile time detection of features that are always available on the target architectures. No need for runtime feature inspection.
If you are building with `--with-thread-sanitizer` and don't use the
suppression file, then running configure will report a thread leak.
Call `pthread_join()` to avoid the thread leak.
* GH-108819: fix LIBDEST not honoring --with-platlibdir
We look for the pure-Python part of the standard library in
PLATSTDLIBDIR, which may not match the default LIBDIR subdir.
From ``getpath.py``:
```python
...
STDLIB_SUBDIR = f'{platlibdir}/python{VERSION_MAJOR}.{VERSION_MINOR}{ABI_THREAD}'
STDLIB_LANDMARKS = [f'{STDLIB_SUBDIR}/os.py', f'{STDLIB_SUBDIR}/os.pyc']
PLATSTDLIB_LANDMARK = f'{platlibdir}/python{VERSION_MAJOR}.{VERSION_MINOR}{ABI_THREAD}/lib-dynload'
...
```
Signed-off-by: Filipe Laíns <lains@riseup.net>
* Add news
Signed-off-by: Filipe Laíns <lains@riseup.net>
* Always set LIBDEST and BINLIBDEST based on PLATLIBDIR
Signed-off-by: Filipe Laíns <lains@riseup.net>
* Add XXX comment on PLATLIBDIR default value
Signed-off-by: Filipe Laíns <lains@riseup.net>
* Regen configure
Signed-off-by: Filipe Laíns <lains@riseup.net>
---------
Signed-off-by: Filipe Laíns <lains@riseup.net>
On m68k, an fmove instruction accessing %fpcr may only move from
or to a data register or a memory operand. The constraint "g" also
permits the use of address registers, which is invalid. The correct
constraint is "dm". Beginning with GCC 15, the register allocator
picks an address register in the code which causes SIGILL during
runtime.
Co-authored-by: Michael Karcher <github@mkarcher.dialup.fu-berlin.de>
While CPython doesn't support `--enable-wasm-dynamic-linking`, external tools like componentize-py do and they have to patch around it. Since the flag is off by default, allowing the flag so external users can add/inject dynamic linking support seems acceptable.
Adapted from a patch for Python 3.14 submitted to the Debian BTS by John
https://bugs.debian.org/1105111#20
Co-authored-by: John David Anglin <dave.anglin@bell.net>
Some systems have the definitions of the mask bits without having the
corresponding members in struct statx. Add configure checks for members
added after Linux 4.11 (when statx itself was added).
Android has Linux's statx, but MACHDEP is "android" on Android, so
configure doesn't check for statx on Android. Base the check for statx
on ac_sys_system instead, which is "Linux-android" on Android, "Linux"
on other Linux distributions, and "AIX" on AIX (which has an
incompatible function named statx).
stx_atomic_write_unit_max_opt was added in Linux 6.16, but is controlled
by the STATX_WRITE_ATOMIC mask bit added in Linux 6.11. That's safe at
runtime because all kernels clear the reserved space in struct statx and
zero is a valid value for stx_atomic_write_unit_max_opt, and it avoids
allocating another mask bit, which are a limited resource. But it also
means the kernel headers don't provide a way to check whether
stx_atomic_write_unit_max_opt exists, so add a configure check.
Adds tooling to generate and test an iOS XCframework, in a way that will also facilitate
adding other XCframework targets for other Apple platforms (tvOS, watchOS, visionOS and
even macOS, potentially).
---------
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
With https://github.com/llvm/llvm-project/pull/150201 being merged, there is
now a better way to generate the Emscripten trampoline, instead of including
hand-generated binary WASM content. Requires Emscripten 4.0.12.
A runtime check is needed to support cross-compiling.
Remove the _Py_NORMALIZE_CENTURY macro.
Remove _pydatetime.py's _can_support_c99.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
The `Modules/hashlib.h` helper file is now removed and split into multiple files:
* `Modules/_hashlib/hashlib_buffer.[ch]` -- Utilities for getting a buffer view and handling buffer inputs.
* `Modules/_hashlib/hashlib_fetch.h` -- Utilities used when fetching a message digest from a digest-like identifier.
Currently, this file only contains common error messages as the fetching API is not yet implemented.
* `Modules/_hashlib/hashlib_mutex.h` -- Utilities for managing the lock on cryptographic hash objects.
Basic support for pyrepl in Emscripten. Limitations:
* requires JSPI
* no signal handling implemented
As followup work, it would be nice to implement a webworker variant
for when JSPI is not available and proper signal handling.
Because it requires JSPI, it doesn't work in Safari. Firefox requires
setting an experimental flag. All the Chromiums have full support since
May. Until we make it work without JSPI, let's keep the original web_example
around.
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Éric <merwok@netwok.org>