cpython

mirror of https://github.com/python/cpython.git synced 2026-04-14 07:41:00 +00:00

Author	SHA1	Message	Date
Gregory P. Smith	ad4ee7cb0f	gh-144015: Add portable SIMD optimization for bytes.hex() et al. (GH-143991) Add SIMD optimization for `bytes.hex()`, `bytearray.hex()`, and `binascii.hexlify()` as well as `hashlib` `.hexdigest()` methods using platform-agnostic GCC/Clang vector extensions that compile to native SIMD instructions on our [PEP-11 Tier 1 Linux and macOS](https://peps.python.org/pep-0011/#tier-1) platforms. - 1.1-3x faster for common small data (16-64 bytes, covering md5 through sha512 digest sizes) - Up to 11x faster for large data (1KB+) - Retains the existing scalar code for short inputs (<16 bytes) or platforms lacking SIMD instructions, no observable performance regressions there. ## Supported platforms: - x86-64: the compiler generates SSE2 - always available, no flags or CPU feature checks needed - ARM64: NEON is always available, no flags or CPU feature checks needed - ARM32: Requires NEON support and that appropriate compiler flags enable that (e.g., `-march=native` on a Raspberry Pi 3+) - while we _could_ use runtime detection to allow NEON when compiled without a recent enough `-march=` flag (`cortex-a53` and later IIRC), there are diminishing returns in doing so. Anyone using 32-bit ARM in a situation where performance matters will already be compiling with such flags. (as opposed to 32-bit Raspbian compilation that defaults to aiming primarily for compatibility with rpi1&0 armv6 arch=armhf which lacks NEON) - Windows/MSVC: Not supported. MSVC lacks `__builtin_shufflevector`, so the existing scalar path is used. Leaving it as an opportunity for the future for someone to figure out how to express the intent to that compiler. This is compile time detection of features that are always available on the target architectures. No need for runtime feature inspection.	2026-02-22 19:19:03 -08:00
Serhiy Storchaka	012c498035	gh-142037: Improve error messages for printf-style formatting (GH-142081) This affects string formatting as well as bytes and bytearray formatting. * For errors in the format string, always include the position of the start of the format unit. * For errors related to the formatted arguments, always include the number or the name of the formatted argument. * Suggest more probable causes of errors in the format string (stray %, unsupported format, unexpected character). * Provide more information when the number of arguments does not match the number of format units. * Raise more specific errors when access of arguments by name is mixed with sequential access and when * is used with a mapping. * Add tests for some uncovered cases.	2026-01-24 11:13:50 +00:00
Serhiy Storchaka	522563549a	gh-143003: Fix possible shared buffer overflow in bytearray.extend() (GH-143086) When __length_hint__() returns 0 for non-empty iterator, the data can be written past the shared 0-terminated buffer, corrupting it.	2025-12-28 12:30:36 +00:00
Bénédikt Tran	61ee04834b	gh-142557: fix UAF in `bytearray.__mod__` when object is mutated while formatting `%`-style arguments (#143213)	2025-12-27 14:57:13 +00:00
Bénédikt Tran	9976c2b634	gh-143195: fix UAF in `{bytearray,memoryview}.hex(sep)` via re-entrant `sep.__len__` (#143209)	2025-12-27 13:32:52 +01:00
wangxiaolei	220f0b1077	gh-142560: prevent use-after-free in search-like methods by exporting buffer in bytearray (#142938)	2025-12-19 08:02:23 +00:00
Serhiy Storchaka	706fdda8b3	gh-141370: Fix undefined behavior when using Py_ABS() (GH-141548) Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>	2025-12-05 16:24:35 +02:00
Cody Maloney	019c315a8e	gh-129559: add `bytearray.resize` thread safety test for free-threading (#141739)	2025-11-21 23:42:22 +05:30
Cody Maloney	e265ce8a56	gh-139871: Optimize small takes in bytearray.take_bytes (GH-141741) When less than half the buffer is taken just copy that small part out rather than doing a big alloc + memmove + big shrink.	2025-11-20 08:49:05 +01:00
Cody Maloney	732224e113	gh-139871: Add `bytearray.take_bytes([n])` to efficiently extract `bytes` (GH-140128) Update `bytearray` to contain a `bytes` and provide a zero-copy path to "extract" the `bytes`. This allows making several code paths more efficient. This does not move any codepaths to make use of this new API. The documentation changes include common code patterns which can be made more efficient with this API. --- When just changing `bytearray` to contain `bytes` I ran pyperformance on a `--with-lto --enable-optimizations --with-static-libpython` build and don't see any major speedups or slowdowns with this; all seems to be in the noise of my machine (Generally changes under 5% or benchmarks that don't touch bytes/bytearray). Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Maurycy Pawłowski-Wieroński <5383+maurycy@users.noreply.github.com>	2025-11-13 13:19:44 +00:00
Stan Ulbrych	d6c89a2df2	gh-140939: Fix memory leak in `_PyBytes_FormatEx` error path (#140957)	2025-11-06 11:20:57 +05:30
Serhiy Storchaka	a1cf6e92b6	gh-71679: Share the repr implementation between bytes and bytearray (GH-138181) This allows using the smart quotes algorithm in the bytearray's repr.	2025-09-17 11:10:29 +03:00
Serhiy Storchaka	0dbbf61cc2	gh-71679: Improve tests for repr() of bytes and bytearray (GH-138180) * Merge existing tests test_repr_str and test_to_str. * Add more tests for non-printable and non-ASCII bytes. * Add tests for special escape sequences ('\t\n\r'). * Add tests for slashes. * Add more tests for quotes. * Add tests for subclasses. * Add test for non-ASCII class name. * Only apply @check_bytes_warnings for str() tests.	2025-08-27 13:24:28 +03:00
Bast	5e1e21dee3	gh-91153: prevent a crash in `bytearray.__setitem__(ind, ...)` when `ind.__index__` has side-effects (#132379) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2025-07-12 13:37:52 +00:00
Serhiy Storchaka	2602d8ae98	gh-71339: Use new assertion methods in tests (GH-129046)	2025-05-22 13:17:22 +03:00
Ageev Maxim	05557788f3	gh-131015: Add test for bytes formatting errors (#131881) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2025-04-05 10:30:16 +02:00
Ageev Maxim	7c3692fe27	gh-130928: Fix error message during bytes formatting for the `'i'` flag (#130967)	2025-03-24 22:07:03 +03:00
Victor Stinner	73ab9e2ede	gh-131152: Remove unused imports from tests (#131153)	2025-03-13 10:55:23 +01:00
Daniel Pope	e0637cebe5	gh-129349: Accept bytes in bytes.fromhex()/bytearray.fromhex() (#129844) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org>	2025-03-12 11:40:11 +01:00
Tomasz Pytel	e85f81f430	gh-129107: fix thread safety of `bytearray` where two critical sections are needed (#130227)	2025-02-27 20:29:58 +05:30
Tomasz Pytel	1b6bef8086	gh-129107: make `bytearray` iterator thread safe (#130096) Co-authored-by: Kumar Aditya <kumaraditya@python.org>	2025-02-19 15:42:45 +05:30
AN Long	798f8d3ea9	Replace non-breaking spaces with normal spaces (#130116) Using normal spaces in place of non-breaking spaces.	2025-02-16 09:33:14 +08:00
Tomasz Pytel	a05433f24a	gh-129107: make `bytearray` thread safe (#129108) Co-authored-by: Kumar Aditya <kumaraditya@python.org>	2025-02-15 07:19:42 +00:00
Cody Maloney	5fb019fc29	gh-129559: Add `bytearray.resize()` (GH-129560) Add bytearray.resize() which wraps PyByteArray_Resize. Make negative size passed to resize exception/error rather than crash in optimized builds.	2025-02-05 11:33:17 -08:00
Srinivas Reddy Thatiparthy (తాటిపర్తి శ్రీనివాస్ రెడ్డి)	c33b6fbf35	gh-127740: Add some more tests for earlier PR #127756 (#127818)	2024-12-12 02:18:12 +00:00
Srinivas Reddy Thatiparthy (తాటిపర్తి శ్రీనివాస్ రెడ్డి)	db9bea0386	gh-127740: For odd-length input to bytes.fromhex(...) change the error message to ValueError: fromhex() arg must be of even length (#127756)	2024-12-11 08:35:17 +01:00
Victor Stinner	039d20ae54	gh-116417: Move limited C API abstract.c tests to _testlimitedcapi (#116986) Split abstract.c and float.c tests of _testcapi into two parts: limited C API tests in _testlimitedcapi and non-limited C API tests in _testcapi. Update test_bytes and test_class.	2024-03-19 10:44:13 +00:00
Jay Ting	948acd6ed8	gh-115323: Add meaningful error message for using bytearray.extend with str (#115332) Perform str check after TypeError is raised. Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>	2024-02-24 18:34:45 -05:00
Serhiy Storchaka	2223899adc	gh-104231: Add more tests for str(), repr(), ascii(), and bytes() (GH-112551)	2023-11-30 17:22:04 +02:00
Serhiy Storchaka	16c9415fba	gh-107178: Add the C API tests for the Abstract Objects Layer (GH-107179) Cover all the Mapping Protocol, almost all the Sequence Protocol (except PySequence_Fast) and a part of the Object Protocol. Move existing tests to Lib/test/test_capi/test_abstract.py and Modules/_testcapi/abstract.c. Add also tests for PyDict C API.	2023-08-07 18:51:43 +03:00
Michael Droettboom	dfcdee4a18	gh-94808: Add coverage for bytesarray_setitem (#95802)	2022-10-10 08:28:41 -07:00
Michael Droettboom	dde15f5879	gh-94808: Improve coverage of _PyBytes_FormatEx (GH-95895) There were two specific areas not covered: - %(name) syntax - %*s syntax Automerge-Triggered-By: GH:iritkatriel	2022-09-07 04:51:50 -07:00
Brandt Bucher	f36589510b	GH-91153: Handle mutating __index__ methods in bytearray item assignment (GH-94891)	2022-07-19 09:42:40 -07:00
Serhiy Storchaka	884eba3c76	bpo-26579: Add object.__getstate__(). (GH-2821) Copying and pickling instances of subclasses of builtin types bytearray, set, frozenset, collections.OrderedDict, collections.deque, weakref.WeakSet, and datetime.tzinfo now copies and pickles instance attributes implemented as slots.	2022-04-06 20:00:14 +03:00
Christian Heimes	e73283a20f	bpo-45668: Fix PGO tests without test extensions (GH-29315)	2021-11-01 11:14:53 +01:00
Mark Dickinson	ae5259171b	Fix bytes.__bytes__ to not truncate at a zero byte (GH-27902)	2021-08-23 15:24:12 +01:00
Dong-hee Na	24b63c695a	bpo-24234: Implement bytes.__bytes__ (GH-27901)	2021-08-23 19:01:51 +09:00
Nikita Sobolev	a2ce538e16	bpo-44891: Tests `id` preserving on `* 1` for `str` and `bytes` (GH-27745) Co-authored-by: Łukasz Langa <lukasz@langa.pl>	2021-08-13 12:36:22 +02:00
Tobias Holl	61d8c54f43	bpo-42924: Fix incorrect copy in bytearray_repeat (GH-24208) Before, using the * operator to repeat a bytearray would copy data from the start of the internal buffer (ob_bytes) and not from the start of the actual data (ob_start).	2021-01-13 18:16:40 +02:00
Ronald Oussoren	41761933c1	bpo-41100: Support macOS 11 and Apple Silicon (GH-22855) Co-authored-by: Lawrence D’Anna <lawrence_danna@apple.com> * Add support for macOS 11 and Apple Silicon (aka arm64) As a side effect of this work use the system copy of libffi on macOS, and remove the vendored copy * Support building on recent versions of macOS while deploying to older versions This allows building installers on macOS 11 while still supporting macOS 10.9.	2020-11-08 10:05:27 +01:00
Hai Shi	fcce8c649a	bpo-40275: Use new test.support helper submodules in tests (GH-21772)	2020-08-07 23:55:35 +02:00
Hai Shi	a089d21df1	bpo-40275: Use new test.support helper submodules in tests (GH-21315)	2020-07-06 11:15:08 +02:00
Bruce Merry	d07d9f4c43	bpo-36051: Drop GIL during large bytes.join() (GH-17757) Improve multi-threaded performance by dropping the GIL in the fast path of bytes.join. To avoid increasing overhead for small joins, it is only done if the output size exceeds a threshold.	2020-01-29 16:09:24 +09:00
Sergey Fedoseev	92709a263e	bpo-37840: Fix handling of negative indices in bytearray_getitem() (GH-15250)	2019-09-09 09:28:34 -07:00
Victor Stinner	22eb689cf3	bpo-37388: Development mode check encoding and errors (GH-14341) In development mode and in debug build, encoding and errors arguments are now checked on string encoding and decoding operations. Examples: open(), str.encode() and bytes.decode(). By default, for best performances, the errors argument is only checked at the first encoding/decoding error, and the encoding argument is sometimes ignored for empty strings.	2019-06-26 00:51:05 +02:00
Gregory P. Smith	0c2f930564	bpo-22385: Support output separators in hex methods. (#13578) Also in binascii.hexlify aka b2a_hex. The underlying implementation behind all hex generation in CPython uses the same pystrhex.c implementation. This adds support to bytes, bytearray, and memoryview objects. The binascii module functions exist rather than being slated for deprecation because they return bytes rather than requiring an intermediate step through a str object. This change was inspired by MicroPython which supports sep in its binascii implementation (and does not yet support the .hex methods). https://bugs.python.org/issue22385	2019-05-29 11:46:58 -07:00
Zackery Spytz	14514d9084	bpo-36946: Fix possible signed integer overflow when handling slices. (GH-13375) The final addition (cur += step) may overflow, so use size_t for "cur". "cur" is always positive (even for negative steps), so it is safe to use size_t here. Co-Authored-By: Martin Panter <vadmium+py@gmail.com>	2019-05-17 10:13:03 +03:00
Serhiy Storchaka	8e79e6e56f	Fix syntax warnings in tests introduced in bpo-15248. (GH-11932)	2019-02-19 13:49:09 +02:00
Serhiy Storchaka	44cc4822bb	bpo-33817: Fix _PyBytes_Resize() for empty bytes object. (GH-11516) Add also tests for PyUnicode_FromFormat() and PyBytes_FromFormat() with empty result.	2019-01-12 09:22:29 +02:00
Serhiy Storchaka	2c2044e789	bpo-34984: Improve error messages for bytes and bytearray constructors. (GH-9874)	2018-10-21 15:29:12 +03:00


248 commits
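
Several of the commits listed above concern hex conversion (bpo-22385, gh-127740, gh-144015). The following is a minimal sketch of the user-visible behavior they touch, assuming CPython 3.8 or later. The sample bytes and variable names are illustrative assumptions and are not taken from the test suite itself.

```python
import binascii
import hashlib

data = b"\xde\xad\xbe\xef"

# bpo-22385: hex() accepts an optional separator and a bytes-per-group count.
plain = data.hex()
colon = data.hex(":")
grouped = data.hex("_", 2)

# bytes, bytearray, and memoryview all support the separator argument.
same = bytearray(data).hex(":") == memoryview(data).hex(":") == colon

# binascii.hexlify returns bytes rather than str.
hexlified = binascii.hexlify(data, "-")

# hexdigest() is the hex form of digest().
sha = hashlib.sha256(data)
digest_matches = sha.hexdigest() == sha.digest().hex()

# gh-127740: odd-length input to fromhex() raises ValueError.
# The exact message text depends on the Python version, so only
# the exception type is checked here.
try:
    bytes.fromhex("abc")
    odd_length_rejected = False
except ValueError:
    odd_length_rejected = True

print(plain, colon, grouped, hexlified)
```

The SIMD change in gh-144015 is described as a performance optimization only, so code like this should produce the same results before and after it.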