Fix error in assertion which causes failure if pos is equal to PY_SSIZE_T_MAX.
Fix undefined behavior in read() and readinto() if pos is larger that the size
of the underlying buffer.
(cherry picked from commit 7d54374f9c)
* [3.13] gh-140607: Validate returned byte count in RawIOBase.read (GH-140611)
While `RawIOBase.readinto` should return a count of bytes between 0 and
the length of the given buffer, it is not required to. Add validation
inside RawIOBase.read() that the returned byte count is valid.
(cherry picked from commit 0f0a362768)
Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com>
Co-authored-by: Shamil <ashm.tech@proton.me>
Co-authored-by: Victor Stinner <vstinner@python.org>
* fixup: Use older attribute name
---------
Co-authored-by: Shamil <ashm.tech@proton.me>
Co-authored-by: Victor Stinner <vstinner@python.org>
gh-135607: remove null checking of weakref list in dealloc of extension modules and objects (#135614)
(cherry picked from commit b1056c2a44)
Co-authored-by: Xuanteng Huang <44627253+xuantengh@users.noreply.github.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c361c)
Co-authored-by: Duane Griffin <duaneg@dghda.com>
In the C implementation, remove __reduce__ and __reduce_ex__ methods
that always raise TypeError and restore __getstate__ methods that always
raise TypeErrori.
This restores fine details of the pre-3.12 behavior and unifies
both implementations.
(cherry picked from commit e9253ebf74)
Since MultiByteToWideChar()/WideCharToMultiByte() is not reversible if
the data contains invalid UTF-8 sequences, use binary search to
calculate the number of written bytes from the number of written
characters.
Also fix writing incomplete UTF-8 sequences.
Also fix handling of memory allocation failures.
(cherry picked from commit 3cf83d91a5)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
gh-127182: Fix `io.StringIO.__setstate__` crash when `None` is the first value (GH-127219)
(cherry picked from commit a2ee899682)
Co-authored-by: sobolevn <mail@sobolevn.me>
Co-authored-by: Victor Stinner <vstinner@python.org>
gh-121489: Export private _PyBytes_Join() again (GH-122267)
(cherry picked from commit aef95eb107)
Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>
* Add an InternalDocs file describing how interning should work and how to use it.
* Add internal functions to *explicitly* request what kind of interning is done:
- `_PyUnicode_InternMortal`
- `_PyUnicode_InternImmortal`
- `_PyUnicode_InternStatic`
* Switch uses of `PyUnicode_InternInPlace` to those.
* Disallow using `_Py_SetImmortal` on strings directly.
You should use `_PyUnicode_InternImmortal` instead:
- Strings should be interned before immortalization, otherwise you're possibly
interning a immortalizing copy.
- `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
`SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
backports, as they are now part of public API and version-specific ABI.
* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
- `_Py_ID`
- `_Py_STR` (including the empty string)
- one-character latin-1 singletons
Now, when you intern a singleton, that exact singleton will be interned.
* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
* Intern `_Py_STR` singletons at startup.
* For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup.
* Beef up the tests. Cover internal details (marked with `@cpython_only`).
* Add lots of assertions
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
gh-119661: Add _Py_SINGLETON() include in Argumenet Clinic (#119712)
When the _Py_SINGLETON() is used, Argument Clinic now adds an
explicit "pycore_runtime.h" include to get the macro. Previously, the
macro may or may not be included indirectly by another include.
(cherry picked from commit 7ca74a760a)
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
BufferedWriter() was buffering calls that are the exact same size as the buffer. it's a very common case to read/write in blocks of the exact buffer size.
it's pointless to copy a full buffer, it's costing extra memory copy and the full buffer will have to be written in the next call anyway.
Co-authored-by: rmorotti <romain.morotti@man.com>
lseek() always returns 0 for character pseudo-devices like
`/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it
always returns -1, to which CPython reacts by raising appropriate
exceptions). They are thus technically seekable despite not having seek
semantics.
When calling read() on e.g. an instance of `io.BufferedReader` that
wraps such a file, `BufferedReader` reads ahead, filling its buffer,
creating a discrepancy between the number of bytes read and the internal
`tell()` always returning 0, which previously resulted in e.g.
`BufferedReader.tell()` or `BufferedReader.seek()` being able to return
positions < 0 even though these are supposed to be always >= 0.
Invariably keep the return value non-negative by returning
max(former_return_value, 0) instead, and add some corresponding tests.
This comment appears to have been mistakenly copied from what is now
called iobase_check_closed() in commit 4d9aec0220.
Also unite the iobase_check_closed() code with the relevant comment.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
On Windows, set _O_NOINHERIT flag on file descriptors
created by os.pipe() and io.WindowsConsoleIO.
Add test_pipe_spawnl() to test_os.
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
In non-debug more the check for the "errors" argument is skipped,
and then PyUnicode_AsUTF8() can fail, but its result was not checked.
Co-authored-by: Victor Stinner <vstinner@python.org>
* Fix crash when encoding is not string or None.
* Fix crash when both line_buffering and write_through raise exception
when converted ti int.
* Add a number of tests for constructor and reconfigure() method
with invalid arguments.
The `@critical_section` directive instructs Argument Clinic to generate calls
to `Py_BEGIN_CRITICAL_SECTION()` and `Py_END_CRITICAL_SECTION()` around the
bound function. In `--disable-gil` builds, these calls will lock and unlock
the `self` object. They are no-ops in the default build.
This is used in one place (`_io._Buffered.close`) as a demonstration.
Subsequent PRs will use it more widely in the `_io.Buffered` bindings.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in Argument Clinic (#111585)"
This reverts commit d9b606b3d0.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in getargs.c (#111620)"
This reverts commit cde1071b2a.
* Revert "gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091)"
This reverts commit d731579bfb.
* Revert "gh-111089: Add PyUnicode_AsUTF8() to the limited C API (#111121)"
This reverts commit d8f32be5b6.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in sqlite3 (#111122)"
This reverts commit 37e4e20eaa.
Replace PyUnicode_AsUTF8AndSize() with PyUnicode_AsUTF8() to remove
the explicit check for embedded null characters.
The change avoids to have to include explicitly <string.h> to get the
strlen() function when using a recent version of the limited C API.