The test could deadlock trying join on the worker processes.
Apply the same technique as gh-130933.
Join the process before the test ends in `test_notify` as well.
The test could deadlock trying join on the worker processes due to a
combination of behaviors:
* The use of `assertReachesEventually` did not ensure that workers
actually woken.release() because the SyncManager's Semaphore does not
implement get_value.
* This mean that the test could finish and the variable "sleeping" would
got out of scope and be collected. This unregisters the proxy leading
to failures in the worker or possibly the manager.
* The subsequent call to `p.join()` during cleanUp therefore never
finished.
This takes two approaches to fix this:
1) Use woken.acquire() to ensure that the workers actually finish
calling woken.release()
2) At the end of the test, wait until the workers are finished, while `cond`,
`sleeping`, and `woken` are still valid.
Replace hardcoded delay (100 ms) with a loop awaiting until a
condition is true: replace assertReturnsIfImplemented() with
assertReachesEventually().
Use sleeping_retry() in assertReachesEventually() to tolerate slow
buildbots and raise an exception on timeout (30 seconds).
The SyncManager provided support for various data structures such as dict, list, and queue, but oddly, not set.
This introduces support for set by defining SetProxy and registering it with SyncManager.
---
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Correct pthread_sigmask in resource_tracker to restore old signals
Using SIG_UNBLOCK to remove blocked "ignored signals" may accidentally
cause side effects if the calling parent already had said signals
blocked to begin with and did not intend to unblock them when
creating a pool. Use SIG_SETMASK instead with the previous mask of
blocked signals to restore the original blocked set.
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
* Correct pthread_sigmask in resource_tracker to restore old signals
Using SIG_UNBLOCK to remove blocked "ignored signals" may accidentally
cause side effects if the calling parent already had said signals
blocked to begin with and did not intend to unblock them when
creating a pool. Use SIG_SETMASK instead with the previous mask of
blocked signals to restore the original blocked set.
* Adding resource_tracker blocked signals test
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
This adds authentication to the forkserver control socket. In the past only filesystem permissions protected this socket from code injection into the forkserver process by limiting access to the same UID, which didn't exist when Linux abstract namespace sockets were used (see issue) meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type is a good idea.
This reuses the HMAC based shared key auth already used by `multiprocessing.connection` sockets for other purposes.
Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID with access to the unix socket.
### pyperformance benchmarks
No significant changes. Including `concurrent_imap` which exercises `multiprocessing.Pool.imap` in that suite.
### Microbenchmarks
This does _slightly_ slow down forkserver use. How much so appears to depend on the platform. Modern platforms and simple platforms are less impacted. This PR adds additional IPC round trips to the control socket to tell forkserver to spawn a new process. Systems with potentially high latency IPC are naturally impacted more.
Typically a 1-4% slowdown on a very targeted process creation microbenchmark, with a worst case overloaded system slowdown of 20%. No evidence that these slowdowns appear in practical sense. See the PR for details.
The first version had it running two forkserver and one spawn tests underneath each of the _fork, _forkserver, and _spawn test suites that build off the generic one.
This adds to the existing complexity of the multiprocessing test suite by offering BaseTestCase classes another attribute to control which suites they are invoked under. Practicality vs purity here. :/
Net result: we don't over-run the new test and their internal logic is simplified.
gh-117378: Fix multiprocessing forkserver preload sys.path inheritance.
`sys.path` was not properly being sent from the parent process when launching
the multiprocessing forkserver process to preload imports. This bug has been
there since the forkserver start method was introduced in Python 3.4. It was
always _supposed_ to inherit `sys.path` the same way the spawn method does.
Observable behavior change: A `''` value in `sys.path` will now be replaced in
the forkserver's `sys.path` with an absolute pathname
`os.path.abspath(os.getcwd())` saved at the time that `multiprocessing` was
imported in the parent process as it already was when using the spawn start
method. **This will only be observable during forkserver preload imports**.
The code invoked before calling things in another process already correctly sets `sys.path`.
Which is likely why this went unnoticed for so long as a mere performance issue in
some configurations.
A workaround for the bug on impacted Pythons is to set PYTHONPATH in the
environment before multiprocessing's forkserver process was started. Not perfect
as that is then inherited by other children, etc, but likely good enough for many
people's purposes.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Change the default multiprocessing start method away from fork to forkserver or spawn on the remaining platforms where it was fork. See the issue for context. This makes the default far more thread safe (other than for people spawning threads at import time... - don't do that!).
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
The `time.sleep()` call should happen before the GC to give the worker
threads time to clean-up their remaining references to objs.
Additionally, use `support.gc_collect()` instead of `gc.collect()`
just in case the extra GC calls matter.
- re-enable test_fcntl_64_bit on Linux aarch64, but disable it on all
Android ABIs
- use support.setswitchinterval in all relevant tests
- skip test_fma_zero_result on Android x86_64
- accept EACCES when calling os.get_terminal_size on Android
Fix some test_multiprocessing flakiness.
Potentially introduced by https://github.com/python/cpython/pull/25845
not joining that thread likely leads to recently observed "environment
changed" logically passing but overall failing tests seen on some
buildbots similar to:
```
1 test altered the execution environment (env changed):
test.test_multiprocessing_fork.test_processes
2 re-run tests:
test.test_multiprocessing_fork.test_processes
test.test_multiprocessing_forkserver.test_processes
```
This builds on https://github.com/python/cpython/pull/106807, which adds
a return code to ResourceTracker, to make future debugging easier.
Testing this “in situ” proved difficult, since the global ResourceTracker is
involved in test infrastructure. So, the tests here create a new instance and
feed it fake data.
---------
Co-authored-by: Yonatan Bitton <yonatan.bitton@perception-point.io>
Co-authored-by: Yonatan Bitton <bityob@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
We add _winapi.BatchedWaitForMultipleObjects to wait for larger numbers of handles.
This is an internal module, hence undocumented, and should be used with caution.
Check the docstring for info before using BatchedWaitForMultipleObjects.
gh-113205: test_multiprocessing.test_terminate: Test the API works on threadpools
Threads can't be forced to terminate (without potentially corrupting too much
state), so the expected behaviour of `ThreadPool.terminate` is to wait for
the currently executing tasks to finish.
The entire test was skipped in GH-110848 (0e9c364f4a).
Instead of skipping it entirely, we should ensure the API eventually succeeds:
use a shorter timeout.
For the record: on my machine, when the test is un-skipped, the task manages to
start in about 1.5% cases.
Add a track parameter to shared memory to allow resource tracking via the side-launched resource tracker process to be disabled on platforms that use it (POSIX).
This allows people who do not want automated cleanup at process exit because they are using the shared memory with processes not participating in Python's resource tracking to use the shared_memory API.
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Joining a thread now ensures the underlying OS thread has exited. This is required for safer fork() in multi-threaded processes.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
multiprocessing test_terminate() and test_wait_socket_slow() no
longer test the CI performance: no longer check maximum elapsed time.
Add CLOCK_RES constant: tolerate a difference of 100 ms.
On Windows, multiprocessing Popen.terminate() now catchs
PermissionError and get the process exit code. If the process is
still running, raise again the PermissionError. Otherwise, the
process terminated as expected: store its exit code.
gh-107275 introduced a regression where a SemLock would fail being passed along nested child processes, as the `is_fork_ctx` attribute would be left missing after the first deserialization.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>