15901 commits

Author SHA1 Message Date
sobolevn
6b217ea90b
gh-151126: Fix crash on unset memory error in ctypes.get_errno (#151382) 2026-06-12 14:03:21 +03:00
Matt Wozniski
c37599200f
gh-151297: Fix undefined behavior in _PyObject_MiRealloc (GH-151358)
The standard says that a call to `memcpy` must pass a valid source and
destination pointer even if the size is 0, so we must avoid calling
`memcpy` when our source pointer is NULL. If we don't, an optimizing
compiler can decide that the pointer must be non-NULL based on the
presence of UB, and optimize out checks for null pointers.

Specifically, note that the standard says:

    Where an argument declared as size_t n specifies the length of the
    array for a function, n can have the value zero on a call to that
    function. Unless explicitly stated otherwise in the description of
    a particular function in this subclause, pointer arguments on such
    a call shall still have valid values, as described in 7.1.4.

And section 7.1.4 says:

    If an argument to a function has an invalid value (such as a value
    outside the domain of the function, or a pointer outside the address
    space of the program, or a null pointer, or a pointer to
    non-modifiable storage when the corresponding parameter is not
    const-qualified) or a type (after default argument promotion) not
    expected by a function with a variable number of arguments, the
    behavior is undefined.

The specification for `memcpy` doesn't state that it's allowed to be
called with null pointers, and Linux's `/usr/include/string.h` declares
`memcpy` as `__nonnull ((1, 2))`.
2026-06-11 21:21:04 -04:00
Ivy Xu
71805db429
gh-151337: Avoid possible memory leak in _tkinter.c on Windows. (GH-151340) 2026-06-11 22:55:11 +03:00
Pieter Eendebak
36fe7784b0
gh-150942: Speed up json.loads array and object decoding (GH-150945)
Append parsed values to the result list with _PyList_AppendTakeRef and
insert key/value pairs with _PyDict_SetItem_Take2, which take ownership of
the references instead of incref-ing on insert and then decref-ing the
local.  This removes a reference-count round-trip per element (and, on the
free-threaded build, a per-append lock).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 17:38:49 +01:00
Dino Viehland
efb2fffae1
gh-150490: Raise PyType_Modified for insertion into split dictionary (#150489)
Raise PyType_Modified for insertion into split dictionary
2026-06-11 09:38:31 -07:00
sobolevn
10595b1cb7
gh-151126: Fix missing memory error in os._path_splitroot (#151339) 2026-06-11 15:48:08 +00:00
sobolevn
9fd1a125bc
gh-151126: Fix missing memory errors in _interpchannelsmodule.c (#151239) 2026-06-10 18:59:11 +03:00
tonghuaroot (童话)
896f7fdc7d
gh-143988: Fix re-entrant mutation crashes in socket sendmsg/recvmsg_into (#143987)
Fix crashes in socket.sendmsg() and socket.recvmsg_into() that could
occur if buffer sequences are mutated re-entrantly during argument
parsing via __buffer__ protocol callbacks.

The bug occurs because:

1. PySequence_Fast() returns the original list object when the input
   is already a list (not a copy).
2. During iteration, PyObject_GetBuffer() triggers __buffer__
   callbacks which may clear the list.
3. Subsequent iterations access invalid memory (heap OOB read).

The fix replaces PySequence_Fast() with PySequence_Tuple() which
always creates a new tuple, ensuring the sequence cannot be mutated
during iteration.

Co-authored-by: tonghuaroot <23011166+tonghuaroot@users.noreply.github.com>
2026-06-10 13:03:49 +00:00
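
The snapshot idea behind the gh-143988 fix above can be illustrated in pure Python. This is only a sketch of the pattern, not the C code in the socket module: `ClearingBuffer`, `export` and `gather` are hypothetical names, and `export` stands in for a `__buffer__` callback with a side effect.

```python
class ClearingBuffer:
    """Hypothetical buffer-like object whose callback empties its owner list."""

    def __init__(self, owner, payload):
        self.owner = owner
        self.payload = payload

    def export(self):
        # Stand-in for a __buffer__ callback that mutates the input sequence.
        self.owner.clear()
        return self.payload


def gather(buffers):
    # Copy the sequence first, in the spirit of PySequence_Tuple():
    # later mutation of `buffers` cannot affect the items being iterated.
    snapshot = tuple(buffers)
    return [item.export() for item in snapshot]


buffers = []
buffers.extend(ClearingBuffer(buffers, data) for data in (b"a", b"b", b"c"))
result = gather(buffers)
print(result, buffers)
```

Every item is still visited even though the first callback clears the original list. In Python, iterating the list directly would merely stop early; in C, the same mutation leaves the loop reading freed memory.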
sobolevn
8d94fa7b86
gh-151126: Add missing PyErr_NoMemory in _winapi module (#151154) 2026-06-09 19:42:08 +03:00
sobolevn
9fdbade99e
gh-151039: Fix a crash when _datetime types outlive _datetime module (#151044) 2026-06-09 11:44:37 +00:00
Serhiy Storchaka
c3cd75afdf
gh-151130: Add more tests for PyWeakref_* C API (GH-151131) 2026-06-09 11:11:17 +00:00
Cody Maloney
db4b1948bc
gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access (#145957)
TextIOWrapper keeps its underlying stream in a member called
`self->buffer`. That stream can be detached by user code, such as custom
`.flush` implementations resulting in `self->buffer` being set to NULL.
The implementation often checked at the start of functions if
`self->buffer` is in a good state, but did not always recheck after
other Python code was called which could modify `self->buffer`.

The cases that need to be rechecked are hard to spot, so rather than
rely on reviewer effort, improve safety by routing every `self->buffer`
access through helper functions.

Thank you yihong0618 for the test, NEWS and initial implementation in
gh-143041.

Co-authored-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2026-06-09 12:31:44 +02:00
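
The helper-function approach described in gh-143008 above can be sketched in Python. This is an illustration of the pattern only, not the C implementation: `Wrapper`, `_get_buffer` and `DetachingStream` are hypothetical names.

```python
class Wrapper:
    """Hypothetical text wrapper that never touches its stream directly."""

    def __init__(self, stream):
        self._buffer = stream

    def _get_buffer(self):
        # Single choke point: every access rechecks for a detached stream.
        if self._buffer is None:
            raise ValueError("underlying buffer has been detached")
        return self._buffer

    def detach(self):
        stream = self._get_buffer()
        self._buffer = None
        return stream

    def write(self, text):
        self._get_buffer().flush()              # may run arbitrary user code
        return self._get_buffer().write(text)   # rechecked after that code ran


class DetachingStream:
    """Hypothetical stream whose flush() detaches itself from its wrapper."""

    def __init__(self):
        self.owner = None
        self.data = []

    def flush(self):
        if self.owner is not None:
            self.owner.detach()

    def write(self, text):
        self.data.append(text)
        return len(text)


stream = DetachingStream()
wrapper = Wrapper(stream)
stream.owner = wrapper
try:
    wrapper.write("x")
    outcome = "written"
except ValueError:
    outcome = "detached"
print(outcome)
```

Because the second access goes through the helper again, the detach triggered by `flush()` surfaces as an exception instead of a use of a missing stream.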
Pieter Eendebak
537702d570
gh-151059: [perf] Use PyObject_CallMethodOneArg in datetime's call_tzinfo_method (#151062) 2026-06-08 14:11:36 +03:00
Stan Ulbrych
5755d0f083
gh-150599: Prevent bz2 decompressor reuse after errors (#150600) 2026-06-07 08:19:05 -07:00
esadomer
f2cab7b0cf
gh-151021: Fix mmap empty searches past the end (GH-151023) 2026-06-07 16:01:24 +03:00
Pieter Eendebak
0f7dc2fefa
gh-150942: Speed up re.findall and re.sub/subn result building (gh-150943) 2026-06-07 21:06:36 +09:00
Edward Xu
12af26d17e
gh-150411: fix gc_generation.count race in free-threading (#150413) 2026-06-06 17:03:04 +00:00
Jeff Epler
b6e66136cc
gh-150534: Add C23 half-turn trigonometric *pi functions (GH-150555)
Add the following functions to the math module:
acospi, asinpi, atanpi, atan2pi, cospi, sinpi, tanpi.
2026-06-06 10:19:45 +00:00
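
To show why half-turn functions such as those added in gh-150534 are useful, here is a minimal pure-Python sketch of a `sinpi`-style function. It is an assumption-laden illustration, not the implementation added to the math module: `sinpi_sketch` is a hypothetical name, and the handling of signed zeros and infinities is a simplification.

```python
import math


def sinpi_sketch(x):
    """Approximate sin(pi * x), exact at integers and half-integers."""
    if math.isnan(x):
        return x
    if math.isinf(x):
        raise ValueError("math domain error")
    # fmod is exact, so the argument is reduced before multiplying by pi.
    r = math.fmod(x, 2.0)          # r lies in (-2, 2)
    if r == 0.0 or abs(r) == 1.0:
        return math.copysign(0.0, x)
    if r in (0.5, -1.5):
        return 1.0
    if r in (-0.5, 1.5):
        return -1.0
    return math.sin(math.pi * r)


print(math.sin(math.pi * 1.0))   # tiny nonzero value due to rounding of pi
print(sinpi_sketch(1.0))
```

Multiplying by a rounded `math.pi` before calling `math.sin` leaves a small residue at integer arguments, whereas reducing the argument first lets the function return exact zeros and ones where they are expected.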
Pieter Eendebak
97dea30914
gh-150889: Improve performance of unicodedata.normalize() (GH-150890)
Scan the nfc_first/nfc_last reindex tables comparing only .start, range-check
the candidate once, and terminate on a sentinel above every codepoint, so each
entry costs a single comparison. ~2x faster on non-Latin and combining-heavy
NFC/NFKC input; no new data tables.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 11:34:33 +03:00
Jiseok CHOI
fc9c4db130
gh-150913: Fix sqlite3.Blob validation for empty slice assignment (GH-150915)
ass_subscript_slice() returned early when the computed slice length
was zero, bypassing validation performed for non-empty slices.
2026-06-04 16:41:47 +03:00
sobolevn
d83d50b5b7
gh-150750: Fix a race condition in deque.index with free-threading (#150779) 2026-06-04 13:31:31 +00:00
Edward Xu
41eb8ee2bb
gh-148613: Fix race in gc_set_threshold and gc_get_threshold (#150356) 2026-06-03 16:58:26 +05:30
Stephen Rosen
50fe49c879
gh-150319: Replace all documentation which says "See PEP 585" (#150325)
* Replace all documentation which says "See PEP 585"

The following classes in the stdlib get simple updates:

- array.array
- asyncio.Future
- asyncio.Task
- collections.defaultdict
- collections.deque
- contextvars.ContextVar
- contextvars.Token
- ctypes.Array
- os.DirEntry
- re.Match
- re.Pattern
- string.templatelib.Interpolation
- string.templatelib.Template
- types.MappingProxyType
- queue.SimpleQueue
- weakref.ref

The following classes are documented publicly as functions, and are
therefore updated internally (`__class_getitem__.__doc__`) but not in the
public docs:

- functools.partial
- itertools.chain

The following builtin types have updates to `__class_getitem__.__doc__`
but not to any documentation pages:

- BaseExceptionGroup
- coroutines (from generators)
- dict
- enumerate
- frozendict
- frozenset
- generators (and async generators)
- list
- memoryview
- set
- slice
- tuple

Special cases:

- union objects are now documented as "supporting class-level []",
  rather than anything to do with generics.

- Templates might be generic over a single type (union, in theory) or
  over a TypeVarTuple. As this is not currently fully settled, it is
  marked with a comment and a mild hint that a single type is used
  (namely, "type" is singular rather than "types", plural)

* Apply suggestions from code review

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>

* Correct several class getitem docs

And expand the text for tuples.

Co-authored-by: Jelle Zijlstra <906600+JelleZijlstra@users.noreply.github.com>

* Add notes on generic typing of builtins

* Fix typo in tuple.__class_getitem__ docstring

* Typo fix: malformed refs

Fix `generic` links which weren't marked as `:ref:`.

* Strike unnecessary docs on generic-ness

Co-authored-by: Jelle Zijlstra <906600+JelleZijlstra@users.noreply.github.com>

* Apply suggestions from code review

These are applied at both the originally indicated locations and in the
corresponding docstring definitions.

Co-authored-by: Alex Waygood <66076021+AlexWaygood@users.noreply.github.com>

* Update Doc/library/re.rst

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

* Update Objects/enumobject.c

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

* Remove tuple generic doc in 'stdtypes' page

This is covered in more detail in the cross-linked typing documentation.
The other copy of this documentation -- in the docstring for
`tuple.__class_getitem__` -- is left in place.

* Fix whitespace around new doc of generics

Per review, do not introduce or remove whitespace such that section
breaks are altered by the introduction of doc on various generic types.

In most cases, this is a removal of an extra line.

In one case (Arrays), it is the reintroduction of a line.

Additionally, two other minor fixes are included:
- incorrect indent on 'defaultdicts'
- make `mappingproxy.__class_getitem__.__doc__` consistent with other
  mapping type generic docs

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

* Move placement of memoryview generic note

The previous placement was at the end of the main docstring, which is
consistent with other types but puts it after a section on various
methods, making it read somewhat inconsistently. Moving it up resolves
this.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

* Ensure sphinxdoc does not start sentences lowercase

Lowercase class names at the start of sentences are marked out with the
`class` role. In the case of `deque`, documentation already refers to
these as `Deques`, so this form is preferred.

* Apply suggestions from code review

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

* Fix line endings and wrap more tightly

Line endings fixed by pre-commit ; also re-wrapped the MappingProxyType
text which was too long.

* Use 'ContextVars' style in sphinx doc

---------

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Jelle Zijlstra <906600+JelleZijlstra@users.noreply.github.com>
Co-authored-by: Alex Waygood <66076021+AlexWaygood@users.noreply.github.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2026-06-02 21:13:34 +01:00
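
The classes listed in gh-150319 above all support subscription at class level. A short sketch of what that looks like at runtime, using only long-standing behavior and not asserting anything about the new docstring text:

```python
import collections
import re
import types

# Subscripting the class goes through __class_getitem__ and yields an alias.
deque_alias = collections.deque[int]
pattern_alias = re.Pattern[str]

print(type(deque_alias) is types.GenericAlias)
print(deque_alias.__origin__, deque_alias.__args__)
print(pattern_alias.__origin__ is re.Pattern)
```

The alias only records the origin class and its arguments; no type checking happens at runtime.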
Seth Larson
991224b1e8
gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080)
Replace the insertion sort used for canonical ordering of combining
characters with a hybrid approach: insertion sort for short runs (< 20)
and counting sort for longer runs, reducing worst-case complexity from
O(n^2) to O(n). This prevents denial of service via crafted Unicode
strings with many combining characters in alternating CCC order.

Co-authored-by: ch4n3-yoon <ch4n3.yoon@gmail.com>
Co-authored-by: Seokchan Yoon <13852925+ch4n3-yoon@users.noreply.github.com>
Co-authored-by: Stan Ulbrych <stan@python.org>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Maurycy Pawłowski-Wieroński <maurycy@maurycy.com>
2026-06-02 11:39:50 +02:00
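
The hybrid ordering described in gh-149079 above can be sketched in Python. This is a simplified illustration rather than the C code in unicodedata: the function names are hypothetical, the threshold of 20 is taken from the commit message, and the long-run path uses a bucket-based counting sort keyed on the canonical combining class (CCC).

```python
import unicodedata

SHORT_RUN = 20  # threshold quoted in the commit message


def _insertion_sort(run):
    # Stable insertion sort by CCC; quadratic, but cheap for short runs.
    out = list(run)
    for i in range(1, len(out)):
        ch = out[i]
        key = unicodedata.combining(ch)
        j = i - 1
        while j >= 0 and unicodedata.combining(out[j]) > key:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = ch
    return out


def _counting_sort(run):
    # CCC values fit in 0..255, so one pass into buckets is O(n).
    buckets = [[] for _ in range(256)]
    for ch in run:
        buckets[unicodedata.combining(ch)].append(ch)
    return [ch for bucket in buckets for ch in bucket]


def order_run(run):
    """Stable-sort one run of combining characters by CCC."""
    if len(run) < SHORT_RUN:
        return _insertion_sort(run)
    return _counting_sort(run)


# U+0301 has CCC 230 and U+0323 has CCC 220: an alternating worst case.
long_run = "\u0301\u0323" * 50
print(order_run(long_run) == sorted(long_run, key=unicodedata.combining))
```

Both paths are stable, which canonical ordering requires: characters with equal CCC keep their original relative order.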
Sepehr Rasouli
60fdb3192b
gh-149738: Fix segmentation fault bug in sqlite3 (#149754)
Deleting the `row_factory` or `text_factory` attribute is no longer allowed.
2026-06-02 11:07:08 +02:00
Bernát Gábor
c79e18a8e5
gh-150717: Avoid mark-array allocation for groupless regex patterns (GH-150719)
state_init() always did PyMem_New(state->mark, groups*2), which for a
pattern with no capturing groups is PyMem_Malloc(0) -- a real allocation
(plus matching free) on every match/search/fullmatch call, for an array
that is never read: groupless patterns emit no MARK opcodes and group 0's
span is taken from state->start/ptr.

Guard the allocation with `if (pattern->groups)`. state->mark stays NULL
(set by the preceding memset), and both the error path and state_fini
already PyMem_Free(NULL) safely.
2026-06-02 10:45:30 +03:00
Sergey B Kirpichev
59abdf8207
gh-115119: Remove superfluous TEST_COVERAGE private macro from _decimal module (GH-149756)
It was previously shared with `libmpdec`, which is no longer vendored.
2026-06-01 13:41:21 -05:00
Thomas Kowalski
c5516e7e37
gh-150157: Fix critical section for PyDict_Next() in _pickle.c (GH-150158) 2026-06-01 17:32:13 +03:00
sobolevn
cc0269334f
gh-149534: Fix unification of defaultdict and frozendict with | (#149539) 2026-06-01 16:26:49 +03:00
Thomas Kowalski
c98773633c
gh-149046: fix: correctly handle str subclasses in io.StringIO (#149047) 2026-06-01 13:01:57 +00:00
Sergey B Kirpichev
46b5e3e941
gh-80480: Remove deprecated 'u' array type code (#149535)
Reuse array.typecodes in tests.
2026-06-01 11:57:55 +00:00
Sergey B Kirpichev
5b5ffce05c
Correct frexp() docs for zero and non-finite numbers (GH-149753)
0.5 <= abs(m) < 1 is only true for finite nonzero numbers
2026-05-31 07:29:44 +00:00
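
A quick check of the `frexp()` behavior that the documentation fix above refers to. The values below follow CPython's `math.frexp`, where zero and non-finite inputs are returned unchanged with an exponent of 0:

```python
import math

finite = math.frexp(8.0)           # mantissa in [0.5, 1), 8.0 == 0.5 * 2**4
zero = math.frexp(0.0)             # mantissa 0.0 falls outside [0.5, 1)
infinite = math.frexp(math.inf)    # mantissa is inf itself

print(finite, zero, infinite)
```

Only the finite nonzero case satisfies `0.5 <= abs(m) < 1`.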
Thomas Kowalski
56bd9ea676
gh-150372: Add missing null check on completer_word_break_characters in readline.c (GH-150251) 2026-05-30 19:26:05 +00:00
Thomas Kowalski
1e18c45495
gh-150406: Check result of PyThread_allocate_lock() for netdb_lock (GH-150407) 2026-05-30 16:25:40 +00:00
Serhiy Storchaka
1c7011d8fe
gh-150560: Fix crash in XML parser on invalid XML with multi-byte encoding (GH-150568) 2026-05-30 00:23:32 +03:00
Chien Wong
cf2cd0be82
gh-115988: Add ARM64 and RISCV BCJ filters constants in lzma module (GH-115989)
---------

Signed-off-by: Chien Wong <m@xv97.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
2026-05-28 08:05:03 -07:00
Neko Asakura
39bd44fc70
gh-148871: make LOAD_COMMON_CONSTANT use immortal stackref borrows (GH-149625) 2026-05-28 12:27:37 +01:00
Serhiy Storchaka
7de4fcd445
gh-149571: Fix the C implementation of Element.itertext() (GH-149929)
It no longer emits text for comments and processing instructions.
2026-05-27 13:23:28 +03:00
Serhiy Storchaka
8ab7b43a14
gh-62259: Add support of multi-byte encodings in the XML parser (GH-149860)
Supported encodings: "cp932", "cp949", "cp950", "Big5", "EUC-JP",
"GB2312", "GBK", "johab", and "Shift_JIS".

Partially supported encodings (only BMP characters): "Big5-HKSCS",
"EUC_JIS-2004", "EUC_JISX0213", "Shift_JIS-2004", "Shift_JISX0213",
"utf-8-sig" and non-standard aliases like "UTF8" (without hyphen).

The parser now raises ValueError for known unsupported
multi-byte encodings such as "ISO-2022-JP" or "raw-unicode-escape"
instead of failing later, when it encounters non-ASCII data.
2026-05-26 19:40:25 +00:00
AN Long
ec23ec6870
gh-149931: Fix memory leaks on failed realloc (#149932) 2026-05-26 01:37:14 +01:00
Pablo Galindo Salgado
a5be25d3bd
gh-149619: Harden _remote_debugging error paths (#150349) 2026-05-25 23:22:46 +01:00
Donghee Na
c714b56798
gh-150114: Log the memory usage in regrtest on macOS (gh-150396) 2026-05-26 00:03:06 +09:00
Victor Stinner
dfe7ef6292
gh-150114: Log the memory usage in regrtest on FreeBSD (#150280)
Add _testcapi.get_process_memory_usage().
On FreeBSD, _testcapi is now linked to libkvm.
2026-05-25 13:45:55 +00:00
Pieter Eendebak
43c60ec2fd
gh-149449: Fix use-after-free in _PyUnicode_GetNameCAPI (#150323)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
2026-05-24 16:17:38 +00:00
Serhiy Storchaka
287c98f4cb
gh-150285: Fix too long docstrings in Argument Clinic code (GH-150338) 2026-05-24 16:16:12 +03:00
Serhiy Storchaka
a5cb7c34dd
gh-150285: Fix too long docstrings in the os module (GH-150296) 2026-05-24 15:04:01 +03:00
Serhiy Storchaka
9da7923835
gh-150285: Fix too long docstrings in the pyexpat module (GH-150294) 2026-05-24 15:03:45 +03:00
Serhiy Storchaka
9fceb1c0c5
gh-150285: Fix too long docstrings in the zstd module (GH-150291) 2026-05-24 15:03:22 +03:00
Serhiy Storchaka
0466560b31
gh-150285: Fix too long docstrings in the sqlite3 module (GH-150290) 2026-05-24 15:02:58 +03:00
Serhiy Storchaka
cdc499ae77
gh-150285: Fix too long docstrings in the _remote_debugging module (GH-150289) 2026-05-24 15:02:43 +03:00