Commit graph

461 commits

Author SHA1 Message Date
Sergey Miryanov
32c264982e
gh-140061: Use _PyObject_IsUniquelyReferenced() to check if objects are uniquely referenced (gh-140062)
The previous `Py_REFCNT(x) == 1` checks can have data races in the free
threaded build. `_PyObject_IsUniquelyReferenced(x)` is a more conservative
check that is safe in the free threaded build and is identical to
`Py_REFCNT(x) == 1` in the default GIL-enabled build.
2025-10-15 09:48:21 -04:00
Victor Stinner
c9a79a02a8
gh-139156: Use PyBytesWriter in _PyUnicode_EncodeCharmap() (#139251)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.

Add _PyBytesWriter_GetSize() and _PyBytesWriter_GetData() static
inline functions.
2025-09-24 16:15:34 +02:00
Victor Stinner
dd45179fa0
gh-129813, PEP 782: Remove the private _PyBytesWriter API (#139264)
It is now replaced with the new public PyBytesWriter API (PEP 782).
2025-09-23 15:29:55 +00:00
Victor Stinner
9f7bbafffe
gh-129813, PEP 782: Optimize byteswriter_resize() (#139101)
There is no need to copy the small_buffer in PyBytesWriter_Create().
2025-09-18 09:52:16 +00:00
Victor Stinner
c72ffe71f1
gh-129813, PEP 782: Set invalid bytes in PyBytesWriter (#139054)
Initialize the buffer with 0xFF byte pattern when creating a writer
object, but also when resizing/growing the writer.
2025-09-17 17:58:32 +02:00
Serhiy Storchaka
a1cf6e92b6
gh-71679: Share the repr implementation between bytes and bytearray (GH-138181)
This allows to use the smart quotes algorithm in the bytearray's repr.
2025-09-17 11:10:29 +03:00
Victor Stinner
4e00e2504f
gh-129813, PEP 782: Add PyBytesWriter.overallocate (#138941)
Disable overallocation in _PyBytes_FormatEx() at the last write.
2025-09-16 00:15:32 +02:00
Victor Stinner
7c6efc3a4f
gh-129813, PEP 782: Init small_buffer in PyBytesWriter_Create() (#138924)
Fill small_buffer with 0xFF byte pattern to detect the usage of
uninitialized bytes in debug build.
2025-09-15 16:23:11 +02:00
Victor Stinner
2d724939d7
gh-129813, PEP 782: Use PyBytesWriter in _PyBytes_FormatEx() (#138839)
Replace the private _PyBytesWriter API with the new public
PyBytesWriter API.
2025-09-15 12:23:36 +02:00
Victor Stinner
430900d15b
gh-129813, PEP 782: Use PyBytesWriter in _PyBytes_FromList() (#138837)
Use the new public PyBytesWriter API in:

* _PyBytes_FromHex()
* _PyBytes_FromBuffer()
* _PyBytes_FromList()
* _PyBytes_FromTuple()
* _PyBytes_FromIterator()

Add _PyBytesWriter_ResizeAndUpdatePointer() and
_PyBytesWriter_GetAllocated() helper functions.
2025-09-13 19:23:57 +02:00
Victor Stinner
06b7891f12
gh-129813, PEP 782: Use Py_GetConstant(Py_CONSTANT_EMPTY_BYTES) (#138830)
Replace PyBytes_FromStringAndSize(NULL, 0) with
Py_GetConstant(Py_CONSTANT_EMPTY_BYTES). Py_GetConstant() cannot
fail.
2025-09-13 18:30:25 +02:00
Victor Stinner
8fc6ae68b0
gh-129813, PEP 782: Use PyBytesWriter in _PyBytes_DecodeEscape2() (#138838)
Replace the private _PyBytesWriter API with the new public
PyBytesWriter API.
2025-09-13 18:28:08 +02:00
Victor Stinner
c3fca5d478
gh-129813, PEP 782: Add PyBytesWriter_Format() (#138824)
Modify PyBytes_FromFormatV() to use the public PyBytesWriter API
rather than the _PyBytesWriter private API.
2025-09-12 14:21:57 +02:00
Victor Stinner
adb414044f
gh-129813, PEP 782: Add PyBytesWriter C API (#138822) 2025-09-12 13:41:59 +02:00
Adam Turner
918e3ba6c0
GH-137623: Use an AC decorator for docstring line length enforcement (#137690) 2025-08-18 18:29:00 +01:00
Semyon Moroz
968f6e523a
gh-130821: Add type information to error messages for invalid return type (GH-130835) 2025-08-14 11:04:41 +03:00
Serhiy Storchaka
9f69a58623
gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648)
If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
2025-05-12 20:42:23 +03:00
Ageev Maxim
7c3692fe27
gh-130928: Fix error message during bytes formatting for the 'i' flag (#130967) 2025-03-24 22:07:03 +03:00
Bénédikt Tran
a1205ef524
gh-111178: fix UBSan failures for PyBytesObject (#131603) 2025-03-24 11:02:09 +01:00
Victor Stinner
20c5f969dd
gh-131238: Remove more includes from pycore_interp.h (#131480) 2025-03-19 23:01:32 +01:00
Victor Stinner
061da44bac
gh-111178: Change Argument Clinic signature for @classmethod (#131157)
Use "PyObject*", instead of "PyTypeObject*", for `@classmethod`
functions to fix an undefined behavior.
2025-03-12 17:42:07 +01:00
Daniel Pope
e0637cebe5
gh-129349: Accept bytes in bytes.fromhex()/bytearray.fromhex() (#129844)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-03-12 11:40:11 +01:00
Victor Stinner
9d759b63d8
gh-111178: Change Argument Clinic signature for METH_O (#130682)
Use "PyObject*" for METH_O functions to fix an undefined behavior.
2025-03-11 16:33:36 +01:00
Umar Butler
8d8b854824
gh-128016: Improved invalid escape sequence warning message (#128020) 2025-01-15 18:00:54 +01:00
Bénédikt Tran
295776c7f3
gh-111178: fix UBSan failures in Objects/bytesobject.c (GH-128237)
* remove redundant casts for `bytesobject`
* fix UBSan failures for `striterobject`
2025-01-10 11:48:06 +01:00
Abhijeet
0706bab1c0
gh-128133: use relaxed atomics for hash of bytes (#128412) 2025-01-03 13:50:56 +05:30
Srinivas Reddy Thatiparthy (తాటిపర్తి శ్రీనివాస్ రెడ్డి)
db9bea0386
gh-127740: For odd-length input to bytes.fromhex(...) change the error message to ValueError: fromhex() arg must be of even length (#127756) 2024-12-11 08:35:17 +01:00
Pablo Galindo Salgado
30aeb00d36
gh-126076: Account for relocated objects in tracemalloc (#126077) 2024-11-19 10:35:17 +00:00
Mark Shannon
fa40922597
GH-126547: Pre-assign version numbers for a few common classes (GH-126551) 2024-11-08 16:44:44 +00:00
Mark Shannon
c9014374c5
GH-125174: Make immortal objects more robust, following design from PEP 683 (GH-125251) 2024-10-10 18:19:08 +01:00
Victor Stinner
6a39e96ab8
gh-115754: Use Py_GetConstant(Py_CONSTANT_EMPTY_BYTES) (#125195)
Replace PyBytes_FromString("") and PyBytes_FromStringAndSize("", 0)
with Py_GetConstant(Py_CONSTANT_EMPTY_BYTES).
2024-10-09 17:12:11 +02:00
Victor Stinner
1d3700f943
gh-111178: Fix function signatures in bytesobject.c (#124806) 2024-10-02 13:35:51 +02:00
Victor Stinner
f1a0d96f41
gh-123091: Use _Py_IsImmortalLoose() (#123511)
Use _Py_IsImmortalLoose() in bytesobject.c, typeobject.c
and ceval.c.
2024-09-02 14:25:19 +02:00
Victor Stinner
d8e69b2c1b
gh-122854: Add Py_HashBuffer() function (#122855) 2024-08-30 15:42:27 +00:00
Victor Stinner
3d60dfbe17
gh-121645: Add PyBytes_Join() function (#121646)
* Replace _PyBytes_Join() with PyBytes_Join().
* Keep _PyBytes_Join() as an alias to PyBytes_Join().
2024-08-30 12:57:33 +00:00
Victor Stinner
fda6bd842a
Replace PyObject_Del with PyObject_Free (#122453)
PyObject_Del() is just a alias to PyObject_Free() kept for backward
compatibility. Use directly PyObject_Free() instead.
2024-08-01 14:12:33 +02:00
Serhiy Storchaka
b313cc68d5
gh-117557: Improve error messages when a string, bytes or bytearray of length 1 are expected (GH-117631) 2024-05-28 12:01:37 +03:00
Geoffrey Thomas
ef172521a9
Remove almost all unpaired backticks in docstrings (#119231)
As reported in #117847 and #115366, an unpaired backtick in a docstring
tends to confuse e.g. Sphinx running on subclasses of standard library
objects, and the typographic style of using a backtick as an opening
quote is no longer in favor. Convert almost all uses of the form

    The variable `foo' should do xyz

to

    The variable 'foo' should do xyz

and also fix up miscellaneous other unpaired backticks (extraneous /
missing characters).

No functional change is intended here other than in human-readable
docstrings.
2024-05-22 12:35:18 -04:00
Erlend E. Aasland
deb921f851
gh-117431: Adapt bytes and bytearray .find() and friends to Argument Clinic (#117502)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following bytes and bytearray methods are adapted:

- count()
- find()
- index()
- rfind()
- rindex()

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
2024-04-12 07:40:55 +00:00
Sam Gross
1a6594f661
gh-117439: Make refleak checking thread-safe without the GIL (#117469)
This keeps track of the per-thread total reference count operations in
PyThreadState in the free-threaded builds. The count is merged into the
interpreter's total when the thread exits.
2024-04-08 12:11:36 -04:00
Erlend E. Aasland
595bb496b0
gh-117431: Adapt bytes and bytearray .startswith() and .endswith() to Argument Clinic (#117495)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used.
2024-04-03 13:11:14 +02:00
Serhiy Storchaka
0c1a42cf9c
gh-87193: Support bytes objects with refcount > 1 in _PyBytes_Resize() (GH-117160)
Create a new bytes object and destroy the old one if it has refcount > 1.
2024-03-25 16:32:11 +01:00
Victor Stinner
578ebc5d5f
gh-108767: Replace ctype.h functions with pyctype.h functions (#108772)
Replace <ctype.h> locale dependent functions with Python "pyctype.h"
locale independent functions:

* Replace isalpha() with Py_ISALPHA().
* Replace isdigit() with Py_ISDIGIT().
* Replace isxdigit() with Py_ISXDIGIT().
* Replace tolower() with Py_TOLOWER().

Leave Modules/_sre/sre.c unchanged, it uses locale dependent
functions on purpose.

Include explicitly <ctype.h> in _decimal.c to get isascii().
2023-09-01 18:36:53 +02:00
Victor Stinner
b32d4cad15
gh-108444: Replace _PyLong_AsInt() with PyLong_AsInt() (#108459)
Change generated by the command:

sed -i -e 's!_PyLong_AsInt!PyLong_AsInt!g' \
    $(find -name "*.c" -o -name "*.h")
2023-08-25 01:01:30 +02:00
Victor Stinner
c494fb333b
gh-106320: Remove private _PyEval function (#108433)
Move private _PyEval functions to the internal C API
(pycore_ceval.h):

* _PyEval_GetBuiltin()
* _PyEval_GetBuiltinId()
* _PyEval_GetSwitchInterval()
* _PyEval_MakePendingCalls()
* _PyEval_SetProfile()
* _PyEval_SetSwitchInterval()
* _PyEval_SetTrace()

No longer export most of these functions.
2023-08-24 20:25:22 +02:00
Brandt Bucher
05a824f294
GH-84436: Skip refcounting for known immortals (GH-107605) 2023-08-04 16:24:50 -07:00
Dennis Sweeney
ab86426a34
gh-105235: Prevent reading outside buffer during mmap.find() (#105252)
* Add a special case for s[-m:] == p in _PyBytes_Find

* Add tests for _PyBytes_Find

* Make sure that start <= end in mmap.find
2023-07-12 22:50:45 -04:00
Inada Naoki
d5bd32fb48
gh-104922: remove PY_SSIZE_T_CLEAN (#106315) 2023-07-02 15:07:46 +09:00
Inada Naoki
77ddc9a7b1
fix typos (#106247)
Most typos are in comments, but two typos are in docstring.
2023-06-30 13:00:22 +09:00
Victor Stinner
ef300937c2
gh-92536: Remove PyUnicode_READY() calls (#105210)
Since Python 3.12, PyUnicode_READY() does nothing and always
returns 0.
2023-06-02 01:33:17 +02:00