Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.
This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785 (i.e. backporting gh-125709 too).
(cherry picked from commit f2cb399470, AKA gh-124865)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
gh-127208: Reject null character in _imp.create_dynamic() (#127400)
_imp.create_dynamic() now rejects embedded null characters in the
path and in the module name.
Backport also the _PyUnicode_AsUTF8NoNUL() function.
(cherry picked from commit b14fdadc6c)
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
* Allow interned strings to be mortal, and fix related issues (GH-120520)
* Add an InternalDocs file describing how interning should work and how to use it.
* Add internal functions to *explicitly* request what kind of interning is done:
- `_PyUnicode_InternMortal`
- `_PyUnicode_InternImmortal`
- `_PyUnicode_InternStatic`
* Switch uses of `PyUnicode_InternInPlace` to those.
* Disallow using `_Py_SetImmortal` on strings directly.
You should use `_PyUnicode_InternImmortal` instead:
- Strings should be interned before immortalization, otherwise you're possibly
interning a immortalizing copy.
- `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
`SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
backports, as they are now part of public API and version-specific ABI.
* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
- `_Py_ID`
- `_Py_STR` (including the empty string)
- one-character latin-1 singletons
Now, when you intern a singleton, that exact singleton will be interned.
* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
* Intern `_Py_STR` singletons at startup.
* Beef up the tests. Cover internal details (marked with `@cpython_only`).
* Add lots of assertions
* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)
* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs
* Document immortality in some functions that take `const char *`
This is PyUnicode_InternFromString;
PyDict_SetItemString, PyObject_SetAttrString;
PyObject_DelAttrString; PyUnicode_InternFromString;
and the PyModule_Add convenience functions.
Always point out a non-immortalizing alternative.
* Don't immortalize user-provided attr names in _ctypes
* Immortalize names in code objects to avoid crash (GH-121903)
* Intern latin-1 one-byte strings at startup (GH-122303)
There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
_PyArg_Parser holds static global data generated for modules by Argument Clinic. The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global. In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters. However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.
This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes. It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime. The solution here is to temporarily switch to the main interpreter. The alternative would be to always statically allocate the tuple.
This change also fixes a bug where only the most recent parser was added to the global linked list.
(cherry picked from commit 81865002ae)
Fix _Py_ClearImmortal() assertion: use _Py_IsImmortal() to tolerate
reference count lower than _Py_IMMORTAL_REFCNT. Fix the assertion for
the stable ABI, when a C extension is built with Python 3.11 or
lower.
(cherry picked from commit 48c49739f5)
Co-authored-by: Yilei Yang <yileiyang@google.com>
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
gh-112867: fix for WITH_PYMALLOC_RADIX_TREE=0 (GH-112885)
The _obmalloc_usage structure is only defined if the obmalloc radix tree
is enabled.
(cherry picked from commit 890ce430d9)
Co-authored-by: Neil Schemenauer <nas-github@arctrix.com>
gh-106550: Fix sign conversion in pycore_code.h (GH-112613)
Fix sign conversion in pycore_code.h: use unsigned integers and cast
explicitly when needed.
(cherry picked from commit a74902a14c)
Co-authored-by: Victor Stinner <vstinner@python.org>
Fixes GH-109894
* set `interp.static_objects.last_resort_memory_error.args` to empty tuple to avoid crash on `PyErr_Display()` call
* allow `_PyExc_InitGlobalObjects()` to be called on subinterpreter init
---------
(cherry picked from commit 47d3e2ed93)
Co-authored-by: Radislav Chugunov <52372310+chgnrdv@users.noreply.github.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
We do the following:
* add a per-interpreter XID registry (PyInterpreterState.xidregistry)
* put heap types there (keep static types in _PyRuntimeState.xidregistry)
* clear the registries during interpreter/runtime finalization
* avoid duplicate entries in the registry (when _PyCrossInterpreterData_RegisterClass() is called more than once for a type)
* use Py_TYPE() instead of PyObject_Type() in _PyCrossInterpreterData_Lookup()
The per-interpreter registry helps preserve isolation between interpreters. This is important when heap types are registered, which is something we haven't been doing yet but I will likely do soon.
(cherry-picked from commit 80dc39e1dc)
The existence of background threads running on a subinterpreter was preventing interpreters from getting properly destroyed, as well as impacting the ability to run the interpreter again. It also affected how we wait for non-daemon threads to finish.
We add PyInterpreterState.threads.main, with some internal C-API functions.
(cherry-picked from commit 1dd9dee45d)
We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.
Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
(cherry-picked from commit b72947a8d2)
This change makes sure sys.path[0] is set properly for subinterpreters. Before, it wasn't getting set at all.
This change does not address the broader concerns from gh-109853.
(cherry-picked from commit a040a32ea2)
* gh-108987: Fix _thread.start_new_thread() race condition (#109135)
Fix _thread.start_new_thread() race condition. If a thread is created
during Python finalization, the newly spawned thread now exits
immediately instead of trying to access freed memory and lead to a
crash.
thread_run() calls PyEval_AcquireThread() which checks if the thread
must exit. The problem was that tstate was dereferenced earlier in
_PyThreadState_Bind() which leads to a crash most of the time.
Move _PyThreadState_CheckConsistency() from thread_run() to
_PyThreadState_Bind().
(cherry picked from commit 517cd82ea7)
* gh-109795: `_thread.start_new_thread`: allocate thread bootstate using raw memory allocator (#109808)
(cherry picked from commit 1b8f2366b3)
---------
Co-authored-by: Radislav Chugunov <52372310+chgnrdv@users.noreply.github.com>
gh-104690: thread_run() checks for tstate dangling pointer (#109056)
thread_run() of _threadmodule.c now calls
_PyThreadState_CheckConsistency() to check if tstate is a dangling
pointer when Python is built in debug mode.
Rename ceval_gil.c is_tstate_valid() to
_PyThreadState_CheckConsistency() to reuse it in _threadmodule.c.
(cherry picked from commit f63d37877a)
* GH-108390: Prevent non-local events being set with `sys.monitoring.set_local_events()` (GH-108420)
* Restore generated objects
* Restore size of monitoring arrays in code object for 3.12 ABI compatibility.
* Update ABI file
GH-107724: Fix the signature of `PY_THROW` callback functions. (GH-107725)
(cherry picked from commit 52fbcf61b5)
Co-authored-by: Mark Shannon <mark@hotpy.org>
gh-107080: Fix Py_TRACE_REFS Crashes Under Isolated Subinterpreters (gh-107567)
The linked list of objects was a global variable, which broke isolation between interpreters, causing crashes. To solve this, we've moved the linked list to each interpreter.
(cherry picked from commit 58ef741867)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
gh-105699: Use a _Py_hashtable_t for the PyModuleDef Cache (gh-106974)
This fixes a crasher due to a race condition, triggered infrequently when two isolated (own GIL) subinterpreters simultaneously initialize their sys or builtins modules. The crash happened due the combination of the "detached" thread state we were using and the "last holder" logic we use for the GIL. It turns out it's tricky to use the same thread state for different threads. Who could have guessed?
We solve the problem by eliminating the one object we were still sharing between interpreters. We replace it with a low-level hashtable, using the "raw" allocator to avoid tying it to the main interpreter.
We also remove the accommodations for "detached" thread states, which were a dubious idea to start with.
(cherry picked from commit 8ba4df91ae)
The _xxsubinterpreters module was meant to only use public API. Some internal C-API usage snuck in over the last few years (e.g. gh-28969). This fixes that.
(cherry picked from commit e6373c0d8b)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
GH-103082: Rename PY_MONITORING_EVENTS to _PY_MONITORING_EVENTS (#107069)
Rename private C API constants:
* Rename PY_MONITORING_UNGROUPED_EVENTS to _PY_MONITORING_UNGROUPED_EVENTS
* Rename PY_MONITORING_EVENTS to _PY_MONITORING_EVENTS
(cherry picked from commit 0927a2b25c)
gh-105340: include hidden fast-locals in locals() (GH-105715)
* gh-105340: include hidden fast-locals in locals()
(cherry picked from commit 104d7b760f)
Co-authored-by: Carl Meyer <carl@oddbird.net>
gh-106140: Reorder some fields to facilitate out-of-process inspection (GH-106143)
(cherry picked from commit 2d5a1c2811)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
gh-105587: Remove assertion from `_PyStaticObject_CheckRefcnt` (GH-105638)
(cherry picked from commit 6199fe3b32)
Co-authored-by: Eddie Elizondo <eduardo.elizondorueda@gmail.com>
For a while now, pending calls only run in the main thread (in the main interpreter). This PR changes things to allow any thread run a pending call, unless the pending call was explicitly added for the main thread to run.
(cherry picked from commit 757b402)
The risk of a race with this state is relatively low, but we play it safe anyway. We do avoid using the lock in performance-sensitive cases where the risk of a race is very, very low.
(cherry picked from commit 68dfa49627)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
The risk of a race with this state is relatively low, but we play it safe anyway.
(cherry picked from commit 7799c8e678)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
This avoids the problematic race in drop_gil() by skipping the FORCE_SWITCHING code there for finalizing threads.
(The idea for this approach came out of discussions with @markshannon.)
(cherry picked from commit 3698fda)
Co-authored-by: Eric Snow ericsnowcurrently@gmail.com
In gh-103912 we added tp_bases and tp_mro to each PyInterpreterState.types.builtins entry. However, doing so ignored the fact that both PyTypeObject fields are public API, and not documented as internal (as opposed to tp_subclasses). We address that here by reverting back to shared objects, making them immortal in the process.
(cherry picked from commit 7be667d)
Co-authored-by: Eric Snow ericsnowcurrently@gmail.com