cpython

mirror of https://github.com/python/cpython.git synced 2025-12-08 06:10:17 +00:00

Author	SHA1	Message	Date
Serhiy Storchaka	59f247e43b	gh-115952: Fix a potential virtual memory allocation denial of service in pickle (GH-119204) Loading a small data which does not even involve arbitrary code execution could consume arbitrary large amount of memory. There were three issues: * PUT and LONG_BINPUT with large argument (the C implementation only). Since the memo is implemented in C as a continuous dynamic array, a single opcode can cause its resizing to arbitrary size. Now the sparsity of memo indices is limited. * BINBYTES, BINBYTES8 and BYTEARRAY8 with large argument. They allocated the bytes or bytearray object of the specified size before reading into it. Now they read very large data by chunks. * BINSTRING, BINUNICODE, LONG4, BINUNICODE8 and FRAME with large argument. They read the whole data by calling the read() method of the underlying file object, which usually allocates the bytes object of the specified size before reading into it. Now they read very large data by chunks. Also add comprehensive benchmark suite to measure performance and memory impact of chunked reading optimization in PR #119204. Features: - Normal mode: benchmarks legitimate pickles (time/memory metrics) - Antagonistic mode: tests malicious pickles (DoS protection) - Baseline comparison: side-by-side comparison of two Python builds - Support for truncated data and sparse memo attack vectors Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: Gregory P. Smith <greg@krypto.org>	2025-12-05 19:17:01 +02:00
Victor Stinner	a65236bb39	gh-129813, PEP 782: Use PyBytesWriter in pickle and struct (#138833) Replace the private _PyBytesWriter API with the new public PyBytesWriter API.	2025-09-13 18:26:49 +02:00
Peter Bierma	4f6ecd10c2	gh-138342: Use a common utility for visiting an object's type (GH-138343) Add `_PyObject_VisitType` in place of `tp_traverse` functions that only visit the object's type.	2025-09-01 16:20:33 +00:00
Justin Applegate	781294019d	gh-135241: Make unpickling of booleans in protocol 0 more strict (GH-135242) The Python pickle module looks for "00" and "01" but _pickle only looked for 2 characters that parsed to 0 or 1, meaning some payloads like "+0" or " 0" would lead to different results in different implementations.	2025-08-14 22:22:37 +03:00
Justin Applegate	2b8b4774d2	gh-135321: Always raise a correct exception for BINSTRING argument > 0x7fffffff in pickle (GH-135322) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2025-06-11 10:15:12 +00:00
Serhiy Storchaka	bac3fcba5b	gh-108512: Add and use new replacements for PySys_GetObject() (GH-111035) Add functions PySys_GetAttr(), PySys_GetAttrString(), PySys_GetOptionalAttr() and PySys_GetOptionalAttrString().	2025-05-28 20:11:09 +03:00
Daniel Li	05a19b5e56	gh-120170: Exclude __mp_main__ in C version of whichmodule() (#120171) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>	2025-05-23 21:45:45 +03:00
Victor Stinner	20c5f969dd	gh-131238: Remove more includes from pycore_interp.h (#131480)	2025-03-19 23:01:32 +01:00
Victor Stinner	9a63138e09	gh-111178: Fix function signatures in misc files (#131180)	2025-03-13 16:55:08 +01:00
Serhiy Storchaka	0ef4ffeefd	gh-130163: Fix crashes related to PySys_GetObject() (GH-130503) The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed reference, has been replaced by using one of the following functions, which return a strong reference and distinguish a missing attribute from an error: _PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(), _PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().	2025-02-25 23:04:27 +02:00
Bénédikt Tran	5a13faa1b7	gh-111178: fix UBSan failures in `Modules/_pickle.c` (#129787) Fix UBSan failures for `Pdata`, `PicklerObject`, `UnpicklerObject`, `PicklerMemoProxyObject`, `UnpicklerMemoProxyObject`. Indicate safe fast cast to avoid redundant future checks. Use semantically correct parameter names.	2025-02-20 14:27:35 +01:00
Sergey Miryanov	e7f00cd14f	gh-130179: Fix `persistent_{id,load}_attr` reference leaks in `_pickle` (#130180) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2025-02-16 17:00:03 +03:00
Victor Stinner	3bebe46d34	gh-128911: Add PyImport_ImportModuleAttr() function (#128912) Add PyImport_ImportModuleAttr() and PyImport_ImportModuleAttrString() functions. * Add unit tests. * Replace _PyImport_GetModuleAttr() with PyImport_ImportModuleAttr(). * Replace _PyImport_GetModuleAttrString() with PyImport_ImportModuleAttrString(). * Remove "pycore_import.h" includes, no longer needed.	2025-01-30 11:17:29 +00:00
Victor Stinner	1d485db953	gh-128863: Deprecate _PyLong_Sign() function (#129176) Replace _PyLong_Sign() with PyLong_GetSign().	2025-01-23 03:11:53 +01:00
Justin Applegate	ce76b547f9	gh-126992: Change pickle code to base 10 for load_long and load_int (GH-127042) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2024-12-11 12:37:59 +00:00
Justin Applegate	29cbcbd73b	gh-126991: Fix reference leak in loading pickle's opcode BUILD (GH-126990) If PyObject_SetItem() fails in the `load_build()` function of _pickle.c, no DECREF for the `dict` variable.	2024-11-19 18:00:35 +02:00
Serhiy Storchaka	223d3dc554	gh-125631: Enable setting persistent_id and persistent_load of pickler and unpickler (GH-125752) pickle.Pickler.persistent_id and pickle.Unpickler.persistent_load can again be overridden as instance attributes.	2024-11-07 08:53:02 +02:00
Victor Stinner	a1c57bcfd2	gh-126461: Fix _Unpickler_ReadFromFile() error handling (#126485) Handle _Unpickler_SetStringInput() failure.	2024-11-06 14:24:46 +01:00
Serhiy Storchaka	d08c788822	gh-123497: New limit for Python integers on 64-bit platforms (GH-123724) Instead of being limited just by the size of addressable memory (2**63 bytes), Python integers are now also limited by the number of bits, so the number of bits now always fits in a 64-bit integer. Both limits are much larger than what might be available in practice, so it doesn't affect users. _PyLong_NumBits() and _PyLong_Frexp() are now always successful.	2024-09-29 10:40:20 +03:00
Serhiy Storchaka	c0c2aa7644	gh-122213: Add notes for pickle serialization errors (GH-122214) This allows to identify the source of the error.	2024-09-09 21:28:55 +03:00
Serhiy Storchaka	b2a8c38bb2	gh-122311: Improve and unify pickle errors (GH-122771) * Raise PicklingError instead of UnicodeEncodeError, ValueError and AttributeError in both implementations. * Chain the original exception to the pickle-specific one as __context__. * Include the error message of ImportError and some AttributeError in the PicklingError error message. * Unify error messages between Python and C implementations. * Refer to documented __reduce__ and __newobj__ callables instead of internal methods (e.g. save_reduce()) or pickle opcodes (e.g. NEWOBJ). * Include more details in error messages (what expected, what got). * Avoid including a potentially long repr of an arbitrary object in error messages.	2024-09-09 15:04:51 +03:00
Serhiy Storchaka	32c7dbb2bc	gh-121485: Always use 64-bit integers for integers bits count (GH-121486) Use 64-bit integers instead of platform specific size_t or Py_ssize_t to represent the number of bits in Python integer.	2024-08-30 08:13:24 +03:00
Serhiy Storchaka	0c3ea30238	gh-123431: Harmonize extension code checks in pickle (GH-123434) These checks are redundant in normal circumstances and can only work if the extension registry was intentionally broken. * The Python implementation now raises an exception for an extension code with a false boolean value. * Simplify the C code. RuntimeError is now raised in explicit checks. * Add many tests.	2024-08-29 08:26:16 +03:00
Kirill Podoprigora	94a4bd79a7	gh-122704: Fix reference leak in Modules/_pickle.c (GH-122705)	2024-08-06 08:57:36 +03:00
Serhiy Storchaka	1bb955a2fe	gh-122459: Optimize pickling by name objects without __module__ (GH-122460)	2024-08-05 16:21:32 +03:00
Serhiy Storchaka	68840e91ac	gh-122311: Fix a refleak in pickle (GH-122411)	2024-07-29 21:52:48 +03:00
Serhiy Storchaka	3b034d26eb	gh-122311: Fix some error messages in pickle (GH-122386)	2024-07-29 11:49:13 +03:00
Serhiy Storchaka	dc07f65a53	gh-82951: Fix serializing by name in pickle protocols < 4 (GH-122149) Serializing objects with complex __qualname__ (such as unbound methods and nested classes) by name no longer involves serializing parent objects by value in pickle protocols < 4.	2024-07-25 08:45:19 +00:00
Rodrigo Oliveira	d66b06107b	gh-118830: Bump pickle.DEFAULT_PROTOCOL to 5 (GH-119340)	2024-07-19 16:47:10 +02:00
Justin Applegate	92893fd8dc	gh-121137: Add missing Py_DECREF calls for ADDITEMS opcode of _pickle.c (#121136) PyObject_GetAttr returns a new reference, but this reference was never released with Py_DECREF, so the missing Py_DECREF calls on this reference are added.	2024-06-28 14:43:45 -07:00
Petr Viktorin	6f1d448bc1	gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520) * Add an InternalDocs file describing how interning should work and how to use it. * Add internal functions to explicitly request what kind of interning is done: - `_PyUnicode_InternMortal` - `_PyUnicode_InternImmortal` - `_PyUnicode_InternStatic` * Switch uses of `PyUnicode_InternInPlace` to those. * Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead: - Strings should be interned before immortalization, otherwise you're possibly interning a immortalizing copy. - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI. * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery. * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint: - `_Py_ID` - `_Py_STR` (including the empty string) - one-character latin-1 singletons Now, when you intern a singleton, that exact singleton will be interned. * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic). * Intern `_Py_STR` singletons at startup. * For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup. * Beef up the tests. Cover internal details (marked with `@cpython_only`). * Add lots of assertions Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>	2024-06-21 17:19:31 +02:00
Brett Simmers	c2627d6eea	gh-116322: Add Py_mod_gil module slot (#116882) This PR adds the ability to enable the GIL if it was disabled at interpreter startup, and modifies the multi-phase module initialization path to enable the GIL when loading a module, unless that module's spec includes a slot indicating it can run safely without the GIL. PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148. A warning will be issued up to once per interpreter for the first GIL-using module that is loaded. If `-v` is given, a shorter message will be printed to stderr every time a GIL-using module is loaded (including the first one that issues a warning).	2024-05-03 11:30:55 -04:00
Donghee Na	94444ea45a	gh-112069: Add _PySet_NextEntryRef to be thread-safe. (gh-117990)	2024-04-19 00:18:22 +09:00
Steve Dower	7861dfd26a	gh-111140: Adds PyLong_AsNativeBytes and PyLong_FromNative[Unsigned]Bytes functions (GH-114886)	2024-02-12 20:13:13 +00:00
Serhiy Storchaka	89cee94b31	gh-89850: Add default C implementations of persistent_id() and persistent_load() (GH-113579) Previously the C implementation of pickle.Pickler and pickle.Unpickler classes did not have such methods and they could only be used if they were overloaded in subclasses or set as instance attributes. Fixed calling super().persistent_id() and super().persistent_load() in subclasses of the C implementation of pickle.Pickler and pickle.Unpickler classes. It no longer causes an infinite recursion.	2024-01-10 15:30:37 +02:00
kale-smoothie	967f2a3052	bpo-41422: Visit the Pickler's and Unpickler's memo in tp_traverse (GH-21664) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2023-11-27 18:09:41 +00:00
Serhiy Storchaka	add16f1a5e	gh-108511: Add C API functions which do not silently ignore errors (GH-109025) Add the following functions: * PyObject_HasAttrWithError() * PyObject_HasAttrStringWithError() * PyMapping_HasKeyWithError() * PyMapping_HasKeyStringWithError()	2023-09-17 14:23:31 +03:00
Victor Stinner	a071ecb4d1	gh-106320: Remove private _PySys functions (#108452) Move private functions to the internal C API (pycore_sysmodule.h): * _PySys_GetAttr() * _PySys_GetSizeOf() No longer export most of these functions. Fix also a typo in Include/cpython/optimizer.h: add a missing space.	2023-08-24 20:02:09 +00:00
Victor Stinner	c55e73112c	gh-106320: Remove private PyLong C API functions (#108429) Remove private PyLong C API functions: * _PyLong_AsByteArray() * _PyLong_DivmodNear() * _PyLong_Format() * _PyLong_Frexp() * _PyLong_FromByteArray() * _PyLong_FromBytes() * _PyLong_GCD() * _PyLong_Lshift() * _PyLong_Rshift() Move these functions to the internal C API. No longer export _PyLong_FromBytes() function.	2023-08-24 18:53:50 +02:00
Brandt Bucher	05a824f294	GH-84436: Skip refcounting for known immortals (GH-107605)	2023-08-04 16:24:50 -07:00
Victor Stinner	1a3faba9f1	gh-106869: Use new PyMemberDef constant names (#106871) * Remove '#include "structmember.h"'. * If needed, add <stddef.h> to get offsetof() function. * Update Parser/asdl_c.py to regenerate Python/Python-ast.c. * Replace: * T_SHORT => Py_T_SHORT * T_INT => Py_T_INT * T_LONG => Py_T_LONG * T_FLOAT => Py_T_FLOAT * T_DOUBLE => Py_T_DOUBLE * T_STRING => Py_T_STRING * T_OBJECT => _Py_T_OBJECT * T_CHAR => Py_T_CHAR * T_BYTE => Py_T_BYTE * T_UBYTE => Py_T_UBYTE * T_USHORT => Py_T_USHORT * T_UINT => Py_T_UINT * T_ULONG => Py_T_ULONG * T_STRING_INPLACE => Py_T_STRING_INPLACE * T_BOOL => Py_T_BOOL * T_OBJECT_EX => Py_T_OBJECT_EX * T_LONGLONG => Py_T_LONGLONG * T_ULONGLONG => Py_T_ULONGLONG * T_PYSSIZET => Py_T_PYSSIZET * T_NONE => _Py_T_NONE * READONLY => Py_READONLY * PY_AUDIT_READ => Py_AUDIT_READ * READ_RESTRICTED => Py_AUDIT_READ * PY_WRITE_RESTRICTED => _Py_WRITE_RESTRICTED * RESTRICTED => (READ_RESTRICTED | _Py_WRITE_RESTRICTED)	2023-07-25 15:28:30 +02:00
Victor Stinner	5e4af2a3e9	gh-106320: Move private _PySet API to the internal API (#107041) * Add pycore_setobject.h header file. * Move the following API to the internal C API: * _PySet_Dummy * _PySet_NextEntry() * _PySet_Update()	2023-07-22 17:04:34 +02:00
Victor Stinner	eda9ce1487	gh-106320: Move _PyNone_Type to the internal C API (#107030) Move private types _PyNone_Type and _PyNotImplemented_Type to internal C API.	2023-07-22 14:12:17 +00:00
Serhiy Storchaka	be1b968dc1	gh-106521: Remove _PyObject_LookupAttr() function (GH-106642)	2023-07-12 08:57:10 +03:00
Serhiy Storchaka	4bf43710d1	gh-106307: C API: Add PyMapping_GetOptionalItem() function (GH-106308) Also add PyMapping_GetOptionalItemString() function.	2023-07-11 23:04:12 +03:00
Victor Stinner	ec931fc394	gh-106320: Remove _PyBytesWriter C API (#106399) Remove the _PyBytesWriter C API: move it to the internal C API (pycore_bytesobject.h).	2023-07-04 08:27:23 +00:00
Erlend E. Aasland	217589d4f3	gh-105375: Improve error handling in _Unpickler_SetInputStream() (#105667) Prevent exceptions from possibly being overwritten in case of multiple failures.	2023-06-13 10:38:01 +02:00
Erlend E. Aasland	ca3cc4b95d	gh-105375: Explicitly initialise all {Pickler,Unpickler}Object fields (#105686) All fields must be explicitly initialised to prevent manipulation of uninitialised fields in dealloc. Align initialisation order with the layout of the object structs.	2023-06-12 23:35:07 +02:00
Erlend E. Aasland	89aac6f6b7	gh-105375: Improve _pickle error handling (#105475) Error handling was deferred in some cases, which could potentially lead to exceptions being overwritten.	2023-06-09 19:09:53 +02:00
Victor Stinner	ef300937c2	gh-92536: Remove PyUnicode_READY() calls (#105210) Since Python 3.12, PyUnicode_READY() does nothing and always returns 0.	2023-06-02 01:33:17 +02:00


390 commits