Commit graph

1772 commits

Author SHA1 Message Date
Serhiy Storchaka
becd7a967f
gh-146143: Fix the PyUnicodeWriter_WriteUCS4() signature (GH-146144)
It now accepts a pointer to a constant buffer of Py_UCS4.
2026-03-19 08:23:01 +00:00
Serhiy Storchaka
99e2c5eccd
gh-144545: Improve handling of default values in Argument Clinic (GH-146016)
* Add the c_init_default attribute which is used to initialize the C variable
  if the default is not explicitly provided.
* Add the c_default_init() method which is used to derive c_default from
  default if c_default is not explicitly provided.
* Explicit c_default and py_default now almost always take precedence
  over the generated value.
* Add support for bytes literals as default values.
* Improve support for str literals as default values (support non-ASCII
  and non-printable characters and special characters like backslash or quotes).
* Fix support for str and bytes literals containing trigraphs, "/*" and "*/".
* Improve support for default values in converters "char" and "int(accept={str})".
* Converter "int(accept={str})" now requires a 1-character string instead of
  an integer as the default value.
* Add support for non-None default values in converter "Py_buffer": NULL,
  str and bytes literals.
* Improve error handling for invalid default values.
* Rename Null to NullType for consistency.
2026-03-17 12:16:35 +02:00
Stan Ulbrych
bdf0105291
gh-103997: Remove incorrect statements about -c dedenting (gh-138624)
2026-03-10 09:56:00 +01:00
Pieter Eendebak
8060aa5d7d
gh-145376: Fix various refleaks in Objects/ (#145609)
2026-03-09 14:17:27 +01:00
Hai Zhu
107863ee17
gh-144569: Avoid creating temporary objects in BINARY_SLICE for list, tuple, and unicode (GH-144590)
* Scalar replacement of BINARY_SLICE for list, tuple, and unicode
2026-03-02 17:02:38 +00:00
VanshAgarwal24036
a249795538
gh-145142: Make str.maketrans safe under free-threading (gh-145157)
2026-02-27 16:08:15 +00:00
Stan Ulbrych
1dfbde9299
gh-145118: Add frozendict support to str.maketrans() (gh-145129)
Add frozendict support to `str.maketrans`.
2026-02-23 16:04:16 -06:00
Peter Bierma
19e64afddf
gh-141070: Rename PyUnstable_Object_Dump to PyObject_Dump (GH-142848)
2026-01-16 09:19:43 -05:00
Victor Stinner
7e5fcae09b
gh-142217: Remove internal _Py_Identifier functions (#142219)
Remove internal functions:

* _PyDict_ContainsId()
* _PyDict_DelItemId()
* _PyDict_GetItemIdWithError()
* _PyDict_SetItemId()
* _PyEval_GetBuiltinId()
* _PyObject_CallMethodIdNoArgs()
* _PyObject_CallMethodIdObjArgs()
* _PyObject_CallMethodIdOneArg()
* _PyObject_VectorcallMethodId()
* _PyUnicode_EqualToASCIIId()

These functions were not exported and so were not usable outside CPython.
2025-12-03 14:33:32 +01:00
Victor Stinner
600f3feb23
gh-141070: Add PyUnstable_Object_Dump() function (#141072)
* Promote _PyObject_Dump() as a public function.
* Keep _PyObject_Dump() alias to PyUnstable_Object_Dump()
  for backward compatibility.
* Replace _PyObject_Dump() with PyUnstable_Object_Dump().

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2025-11-18 16:13:13 +00:00
Stan Ulbrych
a3ce2f77f0
gh-55531: Implement normalize_encoding in C (#136643)
Closes gh-55531
2025-10-30 15:31:47 +01:00
Victor Stinner
efc37ba49e
gh-139353: Add Objects/unicode_writer.c file (#139911)
Move the public PyUnicodeWriter API and the private _PyUnicodeWriter
API to a new Objects/unicode_writer.c file.

Rename a few helper functions to share them between unicodeobject.c
and unicode_writer.c, such as resize_compact() or unicode_result().
2025-10-30 14:36:15 +01:00
Stan Ulbrych
dbe3950a76
gh-129117: Add unicodedata.isxidstart() function (#140269)
Expose `_PyUnicode_IsXidContinue/Start` in `unicodedata`:
add isxidstart() and isxidcontinue() functions.

Co-authored-by: Victor Stinner <vstinner@python.org>
2025-10-30 10:18:12 +00:00
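A minimal sketch of how the functions added in gh-129117 might be used from Python. The helper names `is_xid_start` and `is_xid_continue` are hypothetical and not part of the commit. The fallback branch is an assumption for interpreters that predate the change: it relies on `str.isidentifier()`, which accepts a first character that is either `_` or XID_Start and later characters that are XID_Continue.

```python
import unicodedata


def is_xid_start(ch: str) -> bool:
    """Report whether a single character has the XID_Start property."""
    native = getattr(unicodedata, "isxidstart", None)
    if native is not None:
        # Available only on builds that include gh-129117.
        return native(ch)
    # Fallback (assumption): a one-character string is an identifier
    # if it is "_" or XID_Start, so the underscore is excluded here.
    return ch != "_" and ch.isidentifier()


def is_xid_continue(ch: str) -> bool:
    """Report whether a single character has the XID_Continue property."""
    native = getattr(unicodedata, "isxidcontinue", None)
    if native is not None:
        return native(ch)
    # Fallback: prefix a known start character and test the pair.
    return ("a" + ch).isidentifier()


print(is_xid_start("a"), is_xid_start("1"), is_xid_continue("1"))
```

The `getattr` lookup keeps the sketch usable on older interpreters, at the cost of going through the identifier scanner instead of the Unicode database lookup.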
Shamil
7d70a147f5
Remove dead stores to 'size' in UTF-8 decoder (unicodeobject.c) (#140637)
2025-10-27 11:55:57 +03:00
Victor Stinner
166cdaa6fb
gh-111489: Remove _PyTuple_FromArray() alias (#139973)
Replace _PyTuple_FromArray() with PyTuple_FromArray().
Remove pycore_tuple.h includes.
2025-10-11 22:58:14 +02:00
Victor Stinner
4c119714d5
gh-139353: Add Objects/unicode_format.c file (#139491)
* Move PyUnicode_Format() implementation from unicodeobject.c
  to unicode_format.c.
* Replace unicode_modifiable() with _PyUnicode_IsModifiable()
* Add empty lines to have two empty lines between functions.
2025-10-10 12:52:59 +02:00
Victor Stinner
3d3f126e86
gh-139353: Rename formatter_unicode.c to unicode_formatter.c (#139723)
* Move Python/formatter_unicode.c to Objects/unicode_formatter.c.
* Move Objects/stringlib/localeutil.h content into
  unicode_formatter.c. Remove localeutil.h.
* Move _PyUnicode_InsertThousandsGrouping() to unicode_formatter.c
  and mark the function as static.
* Rename unicode_fill() to _PyUnicode_Fill() and export it in
  pycore_unicodeobject.h.
* Move MAX_UNICODE to pycore_unicodeobject.h as _Py_MAX_UNICODE.
2025-10-08 14:56:00 +02:00
Victor Stinner
e9c538dd54
gh-139156: Optimize _PyUnicode_EncodeCharmap() (#139306)
Specialize _PyUnicode_EncodeCharmap() for EncodingMapType which is
used by Python codecs such as iso8859_15.
2025-09-25 11:42:16 +02:00
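For context on gh-139306, the sketch below shows from the Python side the kind of call that presumably reaches the specialized path: encoding through the mapping object that the `iso8859_15` codec builds with `codecs.charmap_build()`. That this exact call hits the optimized branch is an assumption; the commit only states that the specialization targets EncodingMapType.

```python
import codecs
from encodings import iso8859_15

# The codec module builds this mapping once, at import time, from its
# decoding table via codecs.charmap_build().
table = iso8859_15.encoding_table

# charmap_encode() returns the encoded bytes and the number of
# characters consumed from the input string.
encoded, consumed = codecs.charmap_encode("€uro", "strict", table)

# The high-level API gives the same bytes for the same codec.
same = "€uro".encode("iso8859_15")

print(encoded, consumed, same == encoded)
```

ISO 8859-15 places the euro sign at byte 0xA4, which makes it a convenient character for checking that the charmap table, and not a Latin-1 shortcut, was used.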
Victor Stinner
8d83b7df3f
gh-139156: Optimize the UTF-7 encoder (#139253)
Remove base64SetO and base64WhiteSpace parameters.
2025-09-24 17:57:29 +02:00
Victor Stinner
c7b11b7546
gh-139156: Use PyBytesWriter in PyUnicode_EncodeCodePage() (#139259)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.
2025-09-24 16:39:40 +02:00
Victor Stinner
c9a79a02a8
gh-139156: Use PyBytesWriter in _PyUnicode_EncodeCharmap() (#139251)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.

Add _PyBytesWriter_GetSize() and _PyBytesWriter_GetData() static
inline functions.
2025-09-24 16:15:34 +02:00
Victor Stinner
8cfd7b4ecf
gh-129813, PEP 782: Use PyBytesWriter in utf8_encoder() (#138874)
Replace the private _PyBytesWriter API with the new public
PyBytesWriter API in utf8_encoder() and unicode_encode_ucs1().
2025-09-23 11:47:09 +02:00
Victor Stinner
49e83e31bd
gh-139156: Use PyBytesWriter in PyUnicode_AsRawUnicodeEscapeString() (#139250)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.
2025-09-22 23:46:19 +02:00
Victor Stinner
c497694f77
gh-139156: Use PyBytesWriter in UTF-16 encoder (#139233)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.
2025-09-22 23:36:05 +02:00
Victor Stinner
e578a9e6a5
gh-139156: Use PyBytesWriter in PyUnicode_AsUnicodeEscapeString() (#139249)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.
2025-09-22 23:22:27 +02:00
Victor Stinner
c863349f98
gh-139156: Use PyBytesWriter in the UTF-7 encoder (#139248)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.
2025-09-22 22:49:25 +02:00
Victor Stinner
92ba2c92c4
gh-139156: Use PyBytesWriter in UTF-32 encoder (#139157)
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.
2025-09-22 20:05:35 +00:00
Victor Stinner
06b7891f12
gh-129813, PEP 782: Use Py_GetConstant(Py_CONSTANT_EMPTY_BYTES) (#138830)
Replace PyBytes_FromStringAndSize(NULL, 0) with
Py_GetConstant(Py_CONSTANT_EMPTY_BYTES). Py_GetConstant() cannot
fail.
2025-09-13 18:30:25 +02:00
Petr Viktorin
0c74fc8af0
gh-137210: Add a struct, slot & function for checking an extension's ABI (GH-137212)
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2025-09-05 16:23:18 +02:00
Adam Turner
918e3ba6c0
GH-137623: Use an AC decorator for docstring line length enforcement (#137690)
2025-08-18 18:29:00 +01:00
Peter Bierma
082f370cdd
gh-137514: Add a free-threading wrapper for mutexes (GH-137515)
Add `FT_MUTEX_LOCK`/`FT_MUTEX_UNLOCK`, which call `PyMutex_Lock` and `PyMutex_Unlock` on the free-threaded build, and no-op otherwise.
2025-08-07 11:24:50 -04:00
Victor Stinner
ce1b747ff6
gh-58124: Avoid CP_UTF8 in UnicodeDecodeError (#137415)
Fix the name of the Python encoding in Unicode errors of the code page
codec: use "cp65000" and "cp65001" instead of "CP_UTF7" and "CP_UTF8",
which are not valid Python codec names.
2025-08-06 14:35:27 +02:00
Dave Peck
c5e77af131
gh-132661: Disallow Template/str concatenation after PEP 750 spec update (#135996)
Co-authored-by: sobolevn <mail@sobolevn.me>
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2025-07-21 08:44:26 +02:00
Petr Viktorin
e413e26719
gh-134891: Add PyUnstable_Unicode_GET_CACHED_HASH (GH-134892)
2025-06-06 15:51:00 +02:00
Victor Stinner
f49a07b531
gh-133968: Add PyUnicodeWriter_WriteASCII() function (#133973)
Replace most PyUnicodeWriter_WriteUTF8() calls with
PyUnicodeWriter_WriteASCII().

Unrelated change to please the linter: remove an unused
import in test_ctypes.

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-05-29 14:54:30 +00:00
Victor Stinner
fe9f6e829a
gh-133968: Add fast path to PyUnicodeWriter_WriteStr() (#133969)
Don't call PyObject_Str() if the input type is str.
2025-05-13 15:31:41 +02:00
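The fast path in gh-133969 can be illustrated with a Python analogy. This is not the C implementation: `to_text` is a hypothetical helper, and the exact-type check is an assumption based on the wording "if the input type is str".

```python
def to_text(obj: object) -> str:
    """Return obj as text, skipping the str() call for exact str inputs."""
    if type(obj) is str:
        # Fast path: the object is already a plain str, so use it as is.
        return obj
    # General path: fall back to the str() protocol (__str__ / __repr__).
    return str(obj)


original = "abc"
print(to_text(original) is original, to_text(42))
```

The point of such a check is to avoid a generic conversion call when the input is already in the target representation.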
Serhiy Storchaka
9f69a58623
gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648)
If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will become invalid
after that temporary bytes object is destroyed, so we need another way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have this issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
2025-05-12 20:42:23 +03:00
Stan Ulbrych
4fd1095280
gh-133610: Remove PyUnicode_AsDecoded/Encoded functions (#133612)
2025-05-09 17:31:24 +02:00
Petr Viktorin
987e45e632
gh-128972: Add _Py_ALIGN_AS and revert PyASCIIObject memory layout. (GH-133085)
Add `_Py_ALIGN_AS` as per C API WG vote: https://github.com/capi-workgroup/decisions/issues/61
This patch only adds it to free-threaded builds; the `#ifdef Py_GIL_DISABLED`
can be removed in the future.

Use this to revert the `PyASCIIObject` memory layout for non-free-threaded builds.
The long-term plan is to deprecate the entire struct; until that happens,
it's better to keep it unchanged, as a courtesy to people who rely on it despite
it not being part of the stable ABI.
2025-05-02 18:30:40 +02:00
Lysandros Nikolaou
60202609a2
gh-132661: Implement PEP 750 (#132662)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Wingy <git@wingysam.xyz>
Co-authored-by: Koudai Aono <koxudaxi@gmail.com>
Co-authored-by: Dave Peck <davepeck@gmail.com>
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Paul Everitt <pauleveritt@me.com>
Co-authored-by: sobolevn <mail@sobolevn.me>
2025-04-30 11:46:41 +02:00
Donghee Na
75cbb8d89e
gh-132070: Use _PyObject_IsUniquelyReferenced in unicodeobject (gh-133039)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2025-04-29 09:48:53 +09:00
Stan Ulbrych
f6fb498c97
gh-132798: Schedule removal of PyUnicode_AsDecoded/Encoded functions for 3.15 (#132799)
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-04-25 15:07:41 +02:00
Jon Crall
fc0ec29889
gh-103997: Automatically dedent the argument to "-c" (#103998)
Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
Co-authored-by: Kirill Podoprigora <80244920+Eclips4@users.noreply.github.com>
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-04-18 17:39:30 +09:00
Bénédikt Tran
edbf7fb129
gh-111178: remove redundant casts for functions with correct signatures (#131673)
2025-04-01 17:18:11 +02:00
Victor Stinner
22706843e0
gh-131238: Remove many includes from pycore_interp.h (#131472)
2025-03-19 17:46:24 +00:00
Mark Shannon
a1aeec61c4
GH-131238: Core header refactor (GH-131250)
* Moves most structs in pycore_ header files into pycore_structs.h and pycore_runtime_structs.h

* Removes many cross-header dependencies
2025-03-17 09:19:04 +00:00
Mark Shannon
f30376c650
GH-127705: Fix _Py_RefcntAdd to handle objects becoming immortal (GH-131140)
2025-03-12 16:54:10 +00:00
Victor Stinner
ed8675c571
gh-111178: Fix function signatures of unicodeiter (#130684)
2025-03-04 10:33:09 +01:00
Sergey Miryanov
3a7f17c7e2
gh-130790: Remove references about unicode's readiness from comments (#130801)
2025-03-03 19:18:09 +00:00
Sergey B Kirpichev
f39a07be47
gh-87790: support thousands separators for formatting fractional part of floats (#125304)
```pycon
>>> f"{123_456.123_456:_._f}"  # Whole and fractional
'123_456.123_456'
>>> f"{123_456.123_456:_f}"    # Integer component only
'123_456.123456'
>>> f"{123_456.123_456:._f}"   # Fractional component only
'123456.123_456'
>>> f"{123_456.123_456:.4_f}"  # with precision
'123456.1_235'
```
2025-02-25 16:27:07 +01:00