Commit graph

129340 commits

Author SHA1 Message Date
Gregory P. Smith using claude.ai/code
d420f29e2b
Fix _communicate_streams_windows to avoid blocking with large input
Move stdin writing to a background thread in _communicate_streams_windows
to avoid blocking indefinitely when writing large input to a pipeline
where the subprocess doesn't consume stdin quickly.

This mirrors the fix made to Popen._communicate() for Windows in
commit 5b1862b (gh-87512).

Add test_pipeline_timeout_large_input to verify that TimeoutExpired
is raised promptly when run_pipeline() is called with large input
and a timeout, even when the first process is slow to consume stdin.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:41:25 +00:00
Gregory P. Smith using claude.ai/code
9f53a8e883
Refactor POSIX communicate I/O into shared _communicate_io_posix()
Extract the core selector-based I/O loop into a new _communicate_io_posix()
function that is shared by both _communicate_streams_posix() (used by
run_pipeline) and Popen._communicate() (used by Popen.communicate).

The new function:
- Takes a pre-configured selector and output buffers
- Supports resume via input_offset parameter (for Popen timeout retry)
- Returns (new_offset, completed) instead of raising TimeoutExpired
- Does not close streams (caller decides based on use case)

This reduces code duplication and ensures both APIs use the same
well-tested I/O multiplexing logic.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:12:12 +00:00
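
The `(new_offset, completed)` contract listed in 9f53a8e883 can be illustrated with a short POSIX-only sketch. The helper name `pump_io`, its parameters, and the decision to close stdin once the input is exhausted are assumptions made for illustration; this is not the code from the commit.

```python
import os
import select
import selectors
import subprocess
import sys
import time

# Once select() reports a pipe writable, a write of at most PIPE_BUF bytes
# will not block. 512 is the POSIX minimum, used as a fallback.
PIPE_BUF = getattr(select, "PIPE_BUF", 512)


def pump_io(selector, stdin, data, offset, out_chunks, deadline):
    """Hypothetical helper: move bytes until all streams finish or the deadline passes.

    Returns (new_offset, completed) instead of raising on timeout, so the
    caller can resume later by passing new_offset back in.
    """
    view = memoryview(data)
    while selector.get_map():
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return offset, False
        for key, _events in selector.select(remaining):
            if key.fileobj is stdin:
                try:
                    offset += os.write(key.fd, view[offset:offset + PIPE_BUF])
                except BrokenPipeError:
                    offset = len(view)  # child stopped reading; drop the rest
                if offset >= len(view):
                    # Simplification: close stdin here so the child sees EOF.
                    selector.unregister(stdin)
                    stdin.close()
            else:
                chunk = os.read(key.fd, 32768)
                if chunk:
                    out_chunks.append(chunk)
                else:
                    selector.unregister(key.fileobj)  # EOF on this stream
    return offset, True


# Usage: echo 300,000 bytes through a child that reads stdin, then writes it back.
child = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"
proc = subprocess.Popen([sys.executable, "-c", child],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
payload = b"x" * 300_000
chunks = []
with selectors.DefaultSelector() as sel:
    sel.register(proc.stdin, selectors.EVENT_WRITE)
    sel.register(proc.stdout, selectors.EVENT_READ)
    offset, completed = pump_io(sel, proc.stdin, payload, 0, chunks,
                                time.monotonic() + 30)
proc.stdout.close()
proc.wait()
```

Returning a flag rather than raising leaves the timeout policy to the caller, which is the property the commit message highlights for the `Popen` timeout-retry case.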
Gregory P. Smith using claude.ai/code
3c28ed6e93
Remove obsolete XXX comment about non-blocking I/O
The comment suggested rewriting Popen._communicate() to use
non-blocking I/O on file objects now that Python 3's io module
is used instead of C stdio.

This is unnecessary - the current approach using select() to
detect ready fds followed by os.read()/os.write() is correct
and efficient. The selector already solves "when is data ready?"
so non-blocking mode would add complexity with no benefit.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:04:35 +00:00
Gregory P. Smith using claude.ai/code
a3e98a73be
Improve test_pipeline_large_data_with_stderr to use large stderr
Update the test to write 64KB to stderr from each process (128KB total)
instead of just small status messages. This better tests that the
multiplexed I/O handles concurrent large data on both stdout and stderr
without deadlocking.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:04:35 +00:00
Gregory P. Smith using claude.ai/code
e22d1da9bc
Simplify _communicate_streams() to only accept file objects
Remove support for raw file descriptors in _communicate_streams(),
requiring all streams to be file objects. This simplifies both the
Windows and POSIX implementations by removing isinstance() checks
and fd-wrapping logic.

The run_pipeline() function now wraps the stderr pipe's read end
with os.fdopen() immediately after creation.

This change makes _communicate_streams() more compatible with
Popen.communicate() which already uses file objects, enabling
potential future refactoring to share the multiplexed I/O logic.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:04:34 +00:00
Gregory P. Smith using claude.ai/code
2470e14a70
Add deadlock prevention tests for run_pipeline()
Add three tests that verify the multiplexed I/O implementation
properly handles large data volumes that would otherwise cause
pipe buffer deadlocks:

- test_pipeline_large_data_no_deadlock: 256KB through 2-stage pipeline
- test_pipeline_large_data_three_stages: 128KB through 3-stage pipeline
- test_pipeline_large_data_with_stderr: 64KB with concurrent stderr

These tests would time out or deadlock without proper multiplexing.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:04:34 +00:00
Gregory P. Smith using claude.ai/code
2a11d4bf53
Refactor run_pipeline() to use multiplexed I/O
Add _communicate_streams() helper function that properly multiplexes
read/write operations to prevent pipe buffer deadlocks. The helper
uses selectors on POSIX and threads on Windows, similar to
Popen.communicate().

This fixes potential deadlocks when large amounts of data flow through
the pipeline and significantly improves performance.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:04:34 +00:00
Gregory P. Smith using claude.ai/code
4feb2a80e5
Add documentation for subprocess.run_pipeline()
Document the new run_pipeline() function, PipelineResult class, and
PipelineError exception in the subprocess module documentation.

Includes:
- Function signature with stdin, stdout, stderr, capture_output, etc.
- Note about shared stderr pipe and text mode caveat for interleaved
  multi-byte character sequences
- Note that universal_newlines is not supported (use text=True)
- Explanation that stdin connects to first process, stdout to last
- Usage examples showing basic pipelines, multi-command pipelines,
  input handling, and error handling with check=True
- PipelineResult attributes: commands, returncodes, returncode,
  stdout, stderr, and check_returncodes() method
- PipelineError attributes: commands, returncodes, stdout, stderr,
  and failed list

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:04:34 +00:00
Gregory P. Smith using claude.ai/code
e3a2fbe6da
Add subprocess.run_pipeline() for command pipe chaining
Add a new run_pipeline() function to the subprocess module that enables
running multiple commands connected via pipes, similar to shell pipelines.

New API:
- run_pipeline(*commands, ...) - Run a pipeline of commands
- PipelineResult - Return type with commands, returncodes, stdout, stderr
- PipelineError - Raised when check=True and any command fails

Features:
- Supports arbitrary number of commands (minimum 2)
- capture_output, input, timeout, and check parameters like run()
- stdin= connects to first process, stdout= connects to last process
- Text mode support via text=True, encoding, errors
- All processes share a single stderr pipe for simplicity
- "pipefail" semantics: check=True fails if any command fails

Unlike run(), this function does not accept universal_newlines.
Use text=True instead.

Example:
    result = subprocess.run_pipeline(
        ['cat', 'file.txt'],
        ['grep', 'pattern'],
        ['wc', '-l'],
        capture_output=True, text=True
    )

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-29 08:04:33 +00:00
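
For comparison, the wiring that `run_pipeline()` is described as automating can be sketched with the existing `subprocess.Popen` API. This is a minimal two-stage illustration only: the child commands and the `failed` list are assumptions chosen to mimic the "pipefail" description above, and the sketch does not reproduce the proposed function.

```python
import subprocess
import sys

# Two portable stages: the first prints three lines, the second counts
# the lines that contain the letter "a".
stage1 = [sys.executable, "-c",
          "print('apple'); print('berry'); print('avocado')"]
stage2 = [sys.executable, "-c",
          "import sys; print(sum('a' in line for line in sys.stdin))"]

p1 = subprocess.Popen(stage1, stdout=subprocess.PIPE)
p2 = subprocess.Popen(stage2, stdin=p1.stdout,
                      stdout=subprocess.PIPE, text=True)
p1.stdout.close()  # parent drops its copy so p1 sees a broken pipe if p2 exits early
out, _ = p2.communicate()
p1.wait()

# "pipefail"-style check: the pipeline fails if any stage failed.
returncodes = [p1.returncode, p2.returncode]
failed = [i for i, rc in enumerate(returncodes) if rc != 0]
print(out.strip())  # prints "2"
```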
Moshe Kaplan
cfcd52490d
GH-141963: Clarify argparse documentation (GH-141964)
Clarify argparse documentation

Tightens the phrasing for several argparse actions.
2025-11-28 23:23:34 -08:00
Hugo van Kemenade
890fe5aad5
Docs: multi-disk ZIP files -> multipart ZIP files (GH-141962)
* Remove some old currentlies
* multi-disk -> multipart
* Sentence case headings
2025-11-28 23:11:59 -08:00
Sebastian Pipping
440bcb9456
gh-141994: Warn of XXE vulnerability in documentation of SAX feature xml.sax.handler.feature_external_ges (GH-141996)
Doc/library/xml.sax.handler.rst: Warn of XXE with feature_external_ges

Related to commit baa9f33897
2025-11-28 23:08:17 -08:00
Victor Stinner
5e749d3743
Fix multiprocessing queue test_get() (GH-142024)
* Replace sleep() with support.sleeping_retry().
* Test get_nowait() first.
* Restore previously disabled test.

Fix the failure:

FAIL: test_get (test.test_multiprocessing_spawn.test_processes.WithProcessesTestQueue.test_get)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/_test_multiprocessing.py", line 1208, in test_get
    self.assertEqual(queue_empty(queue), False)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: True != False
2025-11-28 23:00:14 -08:00
Gregory P. Smith
5b1862bdd8
gh-87512: Fix subprocess using timeout= on Windows blocking with a large input= (GH-142058)
On Windows, Popen._communicate() previously wrote to stdin synchronously, which could block indefinitely if the subprocess didn't consume input= quickly and the pipe buffer filled up. The timeout= parameter was only checked when joining the reader threads, not during the stdin write.

This change moves the Windows stdin writing to a background thread (similar to how stdout/stderr are read in threads), allowing the timeout to be properly enforced. If timeout expires, TimeoutExpired is raised promptly and the writer thread continues in the background. Subsequent calls to communicate() will join the existing writer thread.

Adds test_communicate_timeout_large_input to verify that TimeoutExpired is raised promptly when communicate() is called with large input and a timeout, even when the subprocess doesn't consume stdin quickly.

This test already passed on POSIX (where select() is used) but failed on Windows where the stdin write blocks without checking the timeout.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-11-28 22:07:03 -08:00
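
The background-writer idea in 5b1862bdd8 can be shown in isolation with a plain `os.pipe()`. The names `write_all` and `write_with_timeout` are hypothetical, and the sketch stands in for the technique only; it is not the `Popen._communicate()` implementation.

```python
import os
import threading


def write_all(fd, data):
    """Write all of data to fd, then close it. Blocks while the pipe is full."""
    view = memoryview(data)
    try:
        while view:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)


def write_with_timeout(fd, data, timeout):
    """Start a background writer and wait at most `timeout` seconds for it.

    Returns (thread, finished). The thread keeps running after a timeout,
    so a later join() can pick it up again.
    """
    thread = threading.Thread(target=write_all, args=(fd, data), daemon=True)
    thread.start()
    thread.join(timeout)
    return thread, not thread.is_alive()


read_fd, write_fd = os.pipe()
payload = b"x" * (1 << 20)  # 1 MiB, well beyond typical pipe buffer sizes

# Nobody is reading yet, so the writer stalls and the timeout is honoured.
writer, finished = write_with_timeout(write_fd, payload, timeout=0.2)

# Draining the pipe lets the still-running writer complete.
received = bytearray()
while chunk := os.read(read_fd, 65536):
    received += chunk
writer.join()
os.close(read_fd)
```

Because the blocking write happens off the calling thread, the caller regains control when the timeout expires, which is the behaviour the commit message describes for `timeout=` on Windows.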
Gregory P. Smith
923056b2d4
gh-74389: gh-70560: subprocess.Popen.communicate() now ignores stdin.flush error when closed (GH-142061)
gh-70560: gh-74389: subprocess.Popen.communicate() now ignores stdin.flush error when closed

with a unittest and news entry.
2025-11-29 05:03:06 +00:00
Gregory P. Smith
cc6bc4c97f
GH-134453: Fix subprocess memoryview input handling on POSIX (GH-134949)
Fix inconsistent subprocess.Popen.communicate() behavior between Windows
and POSIX when using memoryview objects with non-byte elements as input.

On POSIX systems, the code was incorrectly comparing bytes written against
element count instead of byte count, causing data truncation for large
inputs with non-byte element types.

Changes:
- Cast memoryview inputs to byte view when input is already a memoryview
- Fix progress tracking to use len(input_view) instead of len(self._input)
- Add comprehensive test coverage for memoryview inputs

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* old-man-yells-at-ReST
* Update 2025-05-30-18-37-44.gh-issue-134453.kxkA-o.rst
* assertIsNone review feedback
* fix memoryview_nonbytes test to fail without our fix on main, and have a nicer error.

Thanks to Peter Bierma @ZeroIntensity for the code review.
2025-11-29 04:25:06 +00:00
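
The element-count versus byte-count mismatch described in cc6bc4c97f is easy to reproduce outside `subprocess`. The sketch below uses an `array` of C ints as an example of a non-byte buffer; the variable names are illustrative only.

```python
import array

values = array.array("i", range(1000))  # 1000 C ints, several bytes each
view = memoryview(values)

# len() of a memoryview counts elements, not bytes.
element_count = len(view)
byte_count = view.nbytes

# Casting to unsigned bytes gives a view whose len() is the byte count,
# which is the right quantity to compare against bytes written.
byte_view = view.cast("B")
```

Tracking write progress against `len(view)` would stop after `element_count` bytes, which is how large non-byte inputs end up truncated; `len(byte_view)` avoids that.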
Artur Jamro
526d7a8bb4
gh-141473: Fix subprocess.Popen.communicate to send input to stdin upon a subsequent post-timeout call (GH-141477)
* gh-141473: Fix subprocess.Popen.communicate to send input to stdin
* Docs: Clarify that `input` is one time only on `communicate()`
* NEWS entry
* Add a regression test.

---------

Co-authored-by: Gregory P. Smith <greg@krypto.org>
2025-11-28 18:04:52 -08:00
Cody Maloney
d2d2e92110
Docs: Move to method references for bytearray.take_bytes (#142053)
2025-11-28 22:07:34 +01:00
dgpb
fa9519f8b2
gh-142025: Add c-analyzer include for pyexpat.c (GH-142026)
Co-authored-by: Gregory P. Smith <68491+gpshead@users.noreply.github.com>
2025-11-28 09:51:48 -08:00
Cody Maloney
5a7c9c6861
gh-141968: Use take_bytes in encodings.punycode (#141974)
Removes a copy going from bytearray to bytes.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-11-28 17:47:14 +00:00
Cody Maloney
3001464248
gh-141968: Use take_bytes in re._compiler (#141995)
Removes a copy going from bytearray to bytes.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-11-28 17:46:10 +00:00
dgpb
5ec03cf3b0
gh-133228: c-analyzer clang preprocessor (GH-133229)
* impl
* included 2 failures to tsvs next to similar entries
* added fix/hack for curses.h fails
* fix leftover from debug
2025-11-27 22:22:21 +00:00
Stefano Rivera
656a64b37f
gh-141930: Use the regular IO stack to write .pyc files for a better error message on failure (GH-141931)
* Use open() to write the bytecode
* Convert to unittest style asserts
* Tweak news, thanks @vstinner
* Tidy
* reword NEWS, avoid word "retried"
2025-11-27 19:17:59 +00:00
Miro Hrončok
69f54ce452
gh-140210: Make test_sysconfig.test_parse_makefile_renamed_vars ignore environment variables (#140213)
The test did not expect it could be run with e.g. CFLAGS set to a custom value.
2025-11-27 10:00:02 -08:00
SIVALANAGASHANKARNIVAS
e02801dc37
gh-140505: Fix 'parameters' to 'arguments' in xmlrpc.client.MultiCall docs (GH-141942)
Fix terminology: change 'parameters' to 'arguments' in MultiCall docs

Fixes #140505
2025-11-27 18:01:15 +01:00
Victor Stinner
9c4ff8a615
gh-130396: Export _Py_ReachedRecursionLimitWithMargin() (#142012)
test_peg_generator needs the function.
2025-11-27 12:22:15 +00:00
Victor Stinner
d5d9e89dde
gh-116008: Detect freed thread state in faulthandler (#141988)
Add _PyMem_IsULongFreed() function.
2025-11-27 12:35:00 +01:00
Victor Stinner
83d8134c5b
gh-127635: Use flexible array in tracemalloc (#141991)
Replace frames[1] with frames[] in tracemalloc_traceback structure.
2025-11-27 12:32:31 +01:00
Victor Stinner
7fe1a18b77
gh-130396: Remove _Py_ReachedRecursionLimitWithMargin() function (#141951)
Move the private function to the internal C API (pycore_ceval.h).
2025-11-27 12:32:00 +01:00
Alper
bc9e63dd9d
gh-116738: Fix thread-safety issue in re module for free threading (gh-141923)
Added atomic operations to `scanner_begin()` and `scanner_end()` to prevent
race conditions on the `executing` flag in free-threaded builds. Also added
tests for concurrent usage of the `re` module.

Without the atomic operations, `test_scanner_concurrent_access()` triggers
`assert(self->executing)` failures, or a thread sanitizer run emits errors.
2025-11-26 15:40:45 -05:00
Cody Maloney
9ac14288d7
gh-141968: use bytearray.take_bytes in encodings.idna (#141975)
2025-11-26 21:16:25 +05:30
Cody Maloney
9dbf77beb6
gh-141968: use bytearray.take_bytes in wave._byteswap (#141973)
2025-11-26 21:15:12 +05:30
Cody Maloney
2c1fdf3592
gh-141968: Use bytearray.take_bytes in base64 _b32encode and _b32decode (#141971)
2025-11-26 21:14:25 +05:30
Petr Viktorin
2ff8608b4d
gh-135676: Simplify docs on lexing names (GH-140464)
This simplifies the Lexical Analysis section on Names (but keeps it technically correct) by putting all the info about non-ASCII characters in a separate (and very technical) section.

It uses a mental model where the parser doesn't handle Unicode complexity “immediately”, but:

- parses any non-ASCII character (outside strings/comments) as part of a name, since these can't (yet) be e.g. operators
- normalizes the name
- validates the name, using the xid_start/xid_continue sets


Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Blaise Pabon <blaise@gmail.com>
Co-authored-by: Micha Albert <info@micha.zone>
Co-authored-by: KeithTheEE <kmurrayis@gmail.com>
2025-11-26 16:10:44 +01:00
Petr Viktorin
c359ea4c71
gh-141909: Correct version where Py_mod_gil was added (GH-141979)
2025-11-26 14:45:06 +00:00
Sergey Miryanov
2ea67caf31
GH-141861: Fix TRACE_RECORD if full (GH-141959)
2025-11-26 14:32:30 +00:00
Itamar Oren
27f62eb711
gh-140011: Delete importdl assertion that prevents importing embedded modules from packages (GH-141605)
2025-11-26 14:12:49 +01:00
Petr Viktorin
d7f0214f13
gh-140550: PEP 793 reference documentation (GH-141197)
* gh-140550: PEP 793 reference documentation

Since the PEP calls for soft-deprecation of the existing initialization
function, this reorganizes the relevant docs to put the new way of
doing things first, and de-emphasize the old.

Some bits, like the tutorial, are left out of this patch. (See the
issue for a list.)
2025-11-26 12:50:03 +00:00
Guo Ci
8c33c6143e
Correct indentation in stdtypes.rst (#141957)
2025-11-26 11:55:52 +05:30
Stan Ulbrych
33efd7178e
Remove `Misc/ACKS` check from patchcheck, documentation (#141960)
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2025-11-26 00:00:00 +00:00
Stan Ulbrych
9f2a34af74
Remove references to `Misc/ACKS` from CONTRIBUTING.md (#141952)
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2025-11-25 18:59:28 +00:00
Peter Bierma
a89ee4b9c2
gh-141004: Document missing PyThread* APIs (GH-141810)
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-11-25 12:53:18 -05:00
Petr Viktorin
202fce0dbd
gh-141909: Add PyModuleDef_Slot and earlier Py_mod_* constants to stable ABI manifest (#141910)
These were added to the limited API in 3.5.
Not including them in `Misc/stable_abi.toml` was a bug.
2025-11-25 15:16:49 +01:00
Stan Ulbrych
f445c452ea
gh-141004: Document PyOS_mystr(n)icmp (#141760)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2025-11-25 14:44:52 +01:00
Petr Viktorin
226011ba12
gh-139165: Make Py_SIZE, Py_IS_TYPE, Py_SET_SIZE regular functions in stable ABI (GH-139166)
* Make Py_{SIZE,IS_TYPE,SET_SIZE} regular functions in stable ABI

Group them together with Py_TYPE & Py_SET_TYPE to cut down
on repetitive preprocessor macros.
Format repetitive definitions in object.c more concisely.

Py_SET_TYPE is still left out of the Limited API.
2025-11-25 14:30:33 +01:00
Krishna Chaitanya
e6174ee981
gh-140911: Ensure that UserString.index() and UserString.rindex() accept UserString as argument (GH-140945)
2025-11-25 15:25:46 +02:00
Pablo Galindo Salgado
d07d3a3c57
gh-138122: Split Modules/_remote_debugging_module.c into multiple files (#141934)
gh-138122: Split Modules/_remote_debugging_module.c into multiple files
2025-11-25 12:51:24 +00:00
Paresh Joshi
da1d468bea
gh-141781: Fix pdb.line_prefix binding (#141779)
2025-11-24 18:45:16 -08:00
Sergey Miryanov
dc62b62252
GH-141861: Fix invalid memory read in the ENTER_EXECUTOR (GH-141921)
2025-11-24 22:07:45 +00:00
SubbaraoGarlapati
369ce2b139
Fix implicit import in test_monitoring.py (gh-141795)
2025-11-24 14:48:28 -05:00