In `_PyDict_GetMethodStackRef`, only use the fast-path unicode lookup
when the dict is owned by the current thread or already marked as shared.
This prevents a race between the lookup and concurrent dict resizes,
which may free the PyDictKeysObject (i.e., it ensures that the resize
uses QSBR).
Address a similar issue in `_Py_dict_lookup_threadsafe_stackref` by
calling `ensure_shared_on_read()`.
Concurrent accesses from multiple threads to the same `cell` object did not
scale well in the free-threaded build. Use `_PyStackRef` and optimistically
avoid locking to improve scaling.
With the locks around cell reads gone, some of the free threading tests were
prone to starvation: the readers were able to run in a tight loop and the
writer threads weren't scheduled frequently enough to make timely progress.
Adjust the tests to avoid this.
Co-authored-by: Donghee Na <donghee.na@python.org>
The `dict.get` implementation uses `_Py_dict_lookup_threadsafe`, which is
thread-safe, so we remove the critical section from the argument clinic.
Add a test for concurrent dict get and set operations.