PyObject_GetBuffer() can execute user code (e.g. via __buffer__), which may
close or otherwise mutate a BytesIO object while write() or writelines()
is in progress. This could invalidate the internal buffer and lead to a
use-after-free.
Ensure that PyObject_GetBuffer() is called before validation checks.
When using blocking mode in the remote debugging profiler, ptrace calls
to seize threads can fail with EPERM if the thread has exited between
listing and attaching, is in a special kernel state, or is already being
traced. Previously this raised a RuntimeError that was caught by the
Python sampling loop,and retried indefinitely since EPERM is
a persistent condition that will not resolve on its own.
Treat EPERM the same as ESRCH by returning 1 (skip this thread) instead
of -1 (fatal error). This allows profiling to continue with the threads
that can be traced rather than entering an endless retry loop printing
the same error message repeatedly.
Optimize base64 encoding/decoding by eliminating loop-carried dependencies. Key changes:
- Add `base64_encode_trio()` and `base64_decode_quad()` helper functions that process complete groups independently
- Add `base64_encode_fast()` and `base64_decode_fast()` wrappers
- Update `b2a_base64` and `a2b_base64` to use fast path for complete groups
Performance gains (encode/decode speedup vs main, PGO builds):
```
64 bytes 64K 1M
Zen2: 1.2x/1.8x 1.7x/2.8x 1.5x/2.8x
Zen4: 1.2x/1.7x 1.6x/3.0x 1.5x/3.0x [old data, likely faster]
M4: 1.3x/1.9x 2.3x/2.8x 2.4x/2.9x [old data, likely faster]
RPi5-32: 1.2x/1.2x 2.4x/2.4x 2.0x/2.1x
```
Based on my exploratory work done in https://github.com/python/cpython/compare/main...gpshead:cpython:claude/vectorize-base64-c-S7Hku
See PR and issue for further thoughts on sometimes MUCH faster SIMD vectorized versions of this.
mmapmodule: remove unreachable code in Windows error path
Remove an unreachable `return NULL` after `PyErr_SetFromWindowsErr()` in
the Windows mmap resize error path.
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
Make the attributes in _bz2 module thread-safe on the free-threading build.
Attributes (eof, needs_input, unused_data) are now stored atomically or
accessed via mutex-protected getters.
Makes the zlib module thread-safe free-threading build. Even though operations
are protected by locks, attributes exposed via PyMemberDef (eof, needs_input,
unused_data, unconsumed_tail) should still be stored atomically within locked
sections, since they can be read without acquiring the lock.
If there are many untracked tuples, the GC will run too often, resulting
in poor performance. The fix is to include untracked tuples in the
"long lived" object count. The number of frozen objects is also now
included since the free-threaded GC must scan those too.