cpython/Tools
Serhiy Storchaka 59f247e43b
gh-115952: Fix a potential virtual memory allocation denial of service in pickle (GH-119204)
Loading a small amount of data, without even involving arbitrary code
execution, could consume an arbitrarily large amount of memory. There were
three issues:

* PUT and LONG_BINPUT with a large argument (the C implementation only).
  Since the memo is implemented in C as a contiguous dynamic array, a single
  opcode can cause it to be resized to an arbitrary size. Now the sparsity
  of memo indices is limited.
* BINBYTES, BINBYTES8 and BYTEARRAY8 with a large argument.  They allocated
  the bytes or bytearray object of the specified size before reading into
  it.  Now they read very large data in chunks.
* BINSTRING, BINUNICODE, LONG4, BINUNICODE8 and FRAME with a large
  argument.  They read the whole data by calling the read() method of
  the underlying file object, which usually allocates a bytes object of
  the specified size before reading into it.  Now they read very large data
  in chunks.
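
The chunked reading described above can be illustrated with a minimal
pure-Python sketch. It is not the code from the fix: the helper name
read_exact_chunked and the 1 MiB chunk size are assumptions chosen for
illustration, and the actual threshold in CPython may differ. The point is
that memory grows only as data actually arrives, so a tiny input that
claims a huge length fails early instead of triggering a huge allocation.

```python
import io

# Hypothetical chunk size for this sketch; not taken from the CPython source.
CHUNK_SIZE = 1 << 20


def read_exact_chunked(file, size, chunk_size=CHUNK_SIZE):
    """Read exactly `size` bytes from `file`, at most `chunk_size` at a time.

    The buffer grows only with data that was really received, so a claimed
    size far larger than the available input raises EOFError early.
    """
    buf = bytearray()
    remaining = size
    while remaining > 0:
        chunk = file.read(min(remaining, chunk_size))
        if not chunk:
            raise EOFError("data was truncated")
        buf += chunk
        remaining -= len(chunk)
    return bytes(buf)


# Usage: a well-formed read, using a small chunk size to force several reads.
data = read_exact_chunked(io.BytesIO(b"abcdef"), 4, chunk_size=3)
```

With a three-byte input and a claimed size of 2**40, the helper raises
EOFError after the first short read rather than reserving a terabyte.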

Also add a comprehensive benchmark suite to measure the performance and
memory impact of the chunked reading optimization in PR #119204.

Features:
- Normal mode: benchmarks legitimate pickles (time/memory metrics)
- Antagonistic mode: tests malicious pickles (DoS protection)
- Baseline comparison: side-by-side comparison of two Python builds
- Support for truncated data and sparse memo attack vectors
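
As a rough illustration of the two attack vectors named above, the sketch
below builds a truncated BINBYTES8 payload and a sparse-memo payload from
the opcode constants exported by the pickle module. The function names are
hypothetical and this is not how the benchmark suite necessarily builds its
inputs; it only shows how few bytes such a pickle needs.

```python
import pickle
import struct


def truncated_binbytes8(claimed_size):
    """Return a pickle that claims `claimed_size` bytes but carries one.

    BINBYTES8 is followed by an 8-byte little-endian length and the data.
    """
    return pickle.BINBYTES8 + struct.pack("<Q", claimed_size) + b"x"


def sparse_memo(index):
    """Return a pickle that stores None in the memo at `index`.

    NONE pushes None, LONG_BINPUT memoizes the stack top under a 4-byte
    little-endian index, and STOP ends the pickle.
    """
    return pickle.NONE + pickle.LONG_BINPUT + struct.pack("<I", index) + pickle.STOP


# Ten bytes that claim one tebibyte of payload.
payload = truncated_binbytes8(2**40)

# A benign memo index loads normally.
value = pickle.loads(sparse_memo(0))
```

Payloads with large arguments should only be loaded in an isolated process
with a memory limit, since an unpatched interpreter may try to allocate the
claimed size.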

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
2025-12-05 19:17:01 +02:00
build gh-139707: Add mechanism for distributors to supply error messages for missing stdlib modules (GH-140783) 2025-12-01 14:36:17 +01:00
buildbot gh-115556: Remove quotes from command-line arguments in test.bat and rt.bat (#115557) 2024-02-16 21:24:56 +01:00
c-analyzer gh-142025: Add c-analyzer include for pyexpat.c (GH-142026) 2025-11-28 09:51:48 -08:00
cases_generator gh-141976: Check stack bounds in JIT optimizer (GH-142201) 2025-12-04 20:28:08 +00:00
check-c-api-docs gh-141004: Add a CI job ensuring that new C APIs include documentation (GH-142102) 2025-12-04 03:14:25 +00:00
clinic gh-81313: Add the math.integer module (PEP-791) (GH-133909) 2025-10-31 16:13:43 +02:00
freeze gh-65701: document that freeze doesn't work with framework builds on macOS (#113352) 2023-12-21 16:28:00 +01:00
ftscalingbench gh-139103: fix free-threading dataclass.__init__ perf issue (gh-141596) 2025-11-19 00:57:59 +00:00
gdb gh-127119: Faster check for small ints in long_dealloc (GH-127620) 2025-01-29 15:22:18 +00:00
i18n gh-138286: Run `ruff on Tools/i18n` (#138287) 2025-08-31 20:29:02 +00:00
importbench gh-58032: Do not use argparse.FileType in module CLIs and scripts (GH-113649) 2024-01-10 15:07:19 +02:00
inspection gh-91048: Fix external inspection multi-threaded performance (#136005) 2025-06-28 14:11:31 +01:00
jit GH-142050: Jit stencils on Windows contain debug data (#142052) 2025-12-03 22:08:51 +00:00
lockbench gh-108724: Add PyMutex and _PyParkingLot APIs (gh-109344) 2023-09-19 09:54:29 -06:00
msi gh-138896: Fix error installing C runtime on non-updated Windows machines (GH-138932) 2025-09-17 14:32:52 +01:00
nuget bpo-41744: Package python.props with correct name in NuGet package (GH-22154) 2020-09-14 20:30:15 +01:00
patchcheck Remove `Misc/ACKS` check from patchcheck, documentation (#141960) 2025-11-26 00:00:00 +00:00
peg_generator gh-130396: Remove _Py_ReachedRecursionLimitWithMargin() function (#141951) 2025-11-27 12:32:00 +01:00
picklebench gh-115952: Fix a potential virtual memory allocation denial of service in pickle (GH-119204) 2025-12-05 19:17:01 +02:00
scripts gh-139198: Remove Tools/scripts/checkpip.py script (GH-139199) 2025-10-30 11:50:16 +01:00
ssl gh-139573: Update OpenSSL in CI (GH-139577) 2025-10-04 19:43:17 -05:00
tsan gh-133467: Fix typeobject tp_base race in free threading (gh-140549) 2025-11-05 16:20:40 -05:00
unicode closes gh-138706: update Unicode to 17.0.0 (#138719) 2025-09-11 09:58:39 -07:00
unittestgui Remove a redundant assignment in Tools/unittestgui/unittestgui.py (GH-21438) 2021-05-16 16:55:06 +01:00
wasm Being more flexible in when not to explicitly set the sysroot when compiling for WASI (GH-142242) 2025-12-03 15:42:10 -08:00
README gh-139188: Remove Tools/tz/zdump.py script (GH-139189) 2025-10-30 12:12:45 +01:00
requirements-dev.txt Bump mypy to 1.17.1 (#137542) 2025-08-08 10:14:51 +03:00
requirements-hypothesis.txt gh-136297: Fix hypothesis and subTest usage in test_zoneinfo_property.py (#136384) 2025-07-08 07:51:36 +00:00

This directory contains a number of Python programs that are useful
while building or extending Python.

build           Directory generated automatically by the build system;
                contains build artifacts and intermediate files.

buildbot        Batchfiles for running on Windows buildbot workers.

c-analyzer      Tools to check no new global variables have been added.

cases_generator Tooling to generate interpreters.

clinic          A preprocessor for CPython C files in order to automate
                the boilerplate involved with writing argument parsing
                code for "builtins".

freeze          Create a stand-alone executable from a Python program.

ftscalingbench  Benchmarks for free-threading and finding bottlenecks.

gdb             Python code to be run inside gdb, to make it easier to
                debug Python itself (by David Malcolm).

i18n            Tools for internationalization. pygettext.py
                parses Python source code and generates .pot files,
                and msgfmt.py generates a binary message catalog
                from a catalog in text format.

importbench     A set of micro-benchmarks for various import scenarios.

inspection      Tooling for PEP 768 "Safe external debugger interface for CPython".

jit             Tooling for building the JIT.

lockbench       Benchmarks for PyMutex and critical sections.

msi             Support for packaging Python as an MSI package on Windows.

nuget           Files for the NuGet package manager for .NET.

patchcheck      Tools for checking and applying patches to the Python source code
                and verifying the integrity of patch files.

peg_generator   PEG-based parser generator (pegen) used for the new parser.

scripts         A number of useful single-file programs, e.g. run_tests.py
                which runs the Python test suite.

ssl             Scripts to generate ssl_data.h from OpenSSL sources, and run
                tests against multiple installations of OpenSSL and LibreSSL.

tsan            Utilities for building CPython with thread-sanitizer.

unicode         Tools for generating unicodedata and codecs from unicode.org
                and other mapping files (by Fredrik Lundh, Marc-Andre Lemburg
                and Martin von Loewis).

unittestgui     A Tkinter-based GUI test runner for unittest, with test
                discovery.

wasm            Config and helpers to facilitate cross compilation of CPython
                to WebAssembly (WASM).

Note: The pynche color editor has moved to https://gitlab.com/warsaw/pynche