Treat the debug offset tables read from a target process as untrusted input
and validate them before the unwinder uses any reported sizes or offsets.
Add a shared validator in debug_offsets_validation.h and run it once when
_Py_DebugOffsets is loaded and once when AsyncioDebug is loaded. The checks
cover section sizes used for fixed local buffers and every offset that is
later dereferenced against a local buffer or local object view. This keeps
the bounds checks out of the sampling hot path while rejecting malformed
tables up front.
Rename from _Py_INTERNAL_ABI_SLOT to _Py_ABI_SLOT
and define the macro using _PyABIInfo_DEFAULT.
Use the ABI slot in stdlib extension modules to enable running
a check of ABI version compatibility.
_tkinter, _tracemalloc and readline don't use the slots, hence they need
explicit handling.
Co-authored-by: Victor Stinner <vstinner@python.org>
This PR implements frame caching in the RemoteUnwinder class to significantly reduce memory reads when profiling remote processes with deep call stacks.
When cache_frames=True, the unwinder stores the frame chain from each sample and reuses unchanged portions in subsequent samples. Since most profiling samples capture similar call stacks (especially the parent frames), this optimization avoids repeatedly reading the same frame data from the target process.
The implementation adds a last_profiled_frame field to the thread state that tracks where the previous sample stopped. On the next sample, if the current frame chain reaches this marker, the cached frames from that point onward are reused instead of being re-read from remote memory.
The sampling profiler now enables frame caching by default.