gh-139871: Add bytearray.take_bytes([n]) to efficiently extract bytes (GH-140128)

Update `bytearray` to contain a `bytes` and provide a zero-copy path to
"extract" the `bytes`. This allows making several code paths more efficient.

This does not move any existing code paths over to the new API. The
documentation changes include common code patterns that can be made more
efficient with it.
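
As a rough illustration of the kind of pattern meant here, the sketch below
wraps the new method in two helpers. It assumes that `take_bytes()` with no
argument returns the whole contents as `bytes` and leaves the `bytearray`
empty, and that `take_bytes(n)` removes and returns the first `n` bytes. The
helper names `take_all` and `take_first` are hypothetical, and the `hasattr`
fallback exists only so the sketch also runs on interpreters that predate this
change.

```python
def take_all(buf: bytearray) -> bytes:
    """Return the contents of buf as bytes and leave buf empty."""
    if hasattr(buf, "take_bytes"):
        # New API from this commit (assumed semantics: take everything).
        return buf.take_bytes()
    # Fallback for older interpreters: copy the data, then clear the buffer.
    data = bytes(buf)
    buf.clear()
    return data


def take_first(buf: bytearray, n: int) -> bytes:
    """Remove and return the first n bytes of buf."""
    # Validate up front so both code paths behave the same way.
    if not 0 <= n <= len(buf):
        raise ValueError("n must be between 0 and len(buf)")
    if hasattr(buf, "take_bytes"):
        return buf.take_bytes(n)
    # Fallback for older interpreters: copy the prefix, then delete it.
    data = bytes(buf[:n])
    del buf[:n]
    return data


# Usage: split a 4-byte header off a buffer, then take the rest.
buf = bytearray(b"HEADpayload")
header = take_first(buf, 4)
body = take_all(buf)
```

The fallback branches show the copy-then-clear idiom that the new method is
meant to replace; on a build that includes this commit only the `take_bytes`
branches run.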

---

When just changing `bytearray` to contain `bytes`, I ran pyperformance on a
`--with-lto --enable-optimizations --with-static-libpython` build and didn't
see any major speedups or slowdowns; everything appears to be within the noise
of my machine (generally changes under 5%, or in benchmarks that don't touch
bytes/bytearray).


Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Maurycy Pawłowski-Wieroński <5383+maurycy@users.noreply.github.com>
Cody Maloney 2025-11-13 05:19:44 -08:00 committed by GitHub
parent 2fbd396666
commit 732224e113
11 changed files with 407 additions and 96 deletions

@@ -60,6 +60,14 @@ PyAPI_FUNC(void)
_PyBytes_Repeat(char* dest, Py_ssize_t len_dest,
const char* src, Py_ssize_t len_src);
/* _PyBytesObject_SIZE gives the basic size of a bytes object; any memory allocation
for a bytes object of length n should request _PyBytesObject_SIZE + n bytes.
Using _PyBytesObject_SIZE instead of sizeof(PyBytesObject) saves
3 or 7 bytes per bytes object allocation on a typical system.
*/
#define _PyBytesObject_SIZE (offsetof(PyBytesObject, ob_sval) + 1)
/* --- PyBytesWriter ------------------------------------------------------ */
struct PyBytesWriter {