faulthandler now detects if a frame or a code object is invalid or
freed.
Add helper functions:
* _PyCode_SafeAddr2Line()
* _PyFrame_SafeGetCode()
* _PyFrame_SafeGetLasti()
_PyMem_IsPtrFreed() now detects pointers in [-0xff, 0xff] range
as freed.
Allow the --enable-pystats build option to be used with free-threading. The
stats are now stored on a per-interpreter basis, rather than process global.
For free-threaded builds, the stats structure is allocated per-thread and
then periodically merged into the per-interpreter stats structure (on thread
exit or when the reporting function is called). Most of the pystats related
code has be moved into the file Python/pystats.c.
Move the public PyUnicodeWriter API and the private _PyUnicodeWriter
API to a new Objects/unicode_writer.c file.
Rename a few helper functions to share them between unicodeobject.c
and unicode_writer.c, such as resize_compact() or unicode_result().
Expose `_PyUnicode_IsXidContinue/Start` in `unicodedata`:
add isxidstart() and isxidcontinue() functions.
Co-authored-by: Victor Stinner <vstinner@python.org>
* Count number of actually tracked objects, instead of trackable objects. This ensures that untracking tuples has the desired effect of reducing GC overhead
* Do not track most untrackable tuples during creation. This prevents large numbers of small tuples causing execessive GCs.
Fix memory leak in sub-interpreter creation caused by overwriting of the previously used `_malloced` field. Now the pointer is stored in the first word of the memory block to avoid it being overwritten accidentally.
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Previously, the _BlocksOutputBuffer code creates a list of bytes objects to handle the output data from compression libraries. This ends up being slow due to the output buffer code needing to copy each bytes element of the list into the final bytes object buffer at the end of compression.
The new PyBytesWriter API introduced in PEP 782 is an ergonomic and fast method of writing data into a buffer that will later turn into a bytes object. Benchmarks show that using the PyBytesWriter API is 10-30% faster for decompression across a variety of settings. The performance gains are greatest when the decompressor is very performant, such as for Zstandard (and likely zlib-ng). Otherwise the decompressor can bottleneck decompression and the gains are more modest, but still sizable (e.g. 10% faster for zlib)!
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Fix memory leak in sub-interpreter creation caused by overwriting of the previously used `_malloced` field. Now the pointer is stored in the first word of the memory block to avoid it being overwritten accidentally.
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
* Move PyUnicode_Format() implementation from unicodeobject.c
to unicode_format.c.
* Replace unicode_modifiable() with _PyUnicode_IsModifiable()
* Add empty lines to have two empty lines between functions.
* Move Python/formatter_unicode.c to Objects/unicode_formatter.c.
* Move Objects/stringlib/localeutil.h content into
unicode_formatter.c. Remove localeutil.h.
* Move _PyUnicode_InsertThousandsGrouping() to unicode_formatter.c
and mark the function as static.
* Rename unicode_fill() to _PyUnicode_Fill() and export it in
pycore_unicodeobject.h.
* Move MAX_UNICODE to pycore_unicodeobject.h as _Py_MAX_UNICODE.
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the
PyBytesWriter API.
Add _PyBytesWriter_GetSize() and _PyBytesWriter_GetData() static
inline functions.
With https://github.com/llvm/llvm-project/pull/150201 being merged, there is
now a better way to generate the Emscripten trampoline, instead of including
hand-generated binary WASM content. Requires Emscripten 4.0.12.
There was a deadlock originally seen by Memray when a daemon thread
enabled or disabled profiling while the interpreter was shutting down.
I think this could also happen with garbage collection, but I haven't
seen that in practice.
The daemon thread could be hung while trying acquire the global rwmutex
that prevents overlapping global and per-interpreter stop-the-world events.
Since it already held the main interpreter's stop-the-world lock, it
also deadlocked the main thread, which is trying to perform interpreter
finalization.
Swap the order of lock acquisition to prevent this deadlock.
Additionally, refactor `_PyParkingLot_Park` so that the global buckets
hashtable is left in a clean state if the thread is hung in
`PyEval_AcquireThread`.