cpython/Modules
Alexey Izbyshev 976da903a7
bpo-35823: subprocess: Use vfork() instead of fork() on Linux when safe (GH-11671)

When used to run a new executable image, fork() is not a good choice
for process creation, especially if the parent has a large working set:
fork() needs to copy page tables, which is slow, and may fail on systems
where overcommit is disabled, even though the child is not going to
touch most of its address space.

Currently, subprocess is capable of using posix_spawn() instead, which
normally provides much better performance. However, posix_spawn() does not
support many of the child setup operations exposed by subprocess.Popen().
Most notably, it's not possible to express `close_fds=True`, which
happens to be the default, via posix_spawn(). As a result, most users
can't benefit from faster process creation, at least not without
changing their code.
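
As an illustration of the `close_fds` point (a minimal sketch, not part of
this patch): the helper `run_child` below is hypothetical and relies only on
the documented subprocess.run() API. It runs the same child once with the
default `close_fds=True` and once with `close_fds=False`, which is the kind
of code change a caller would have to make before posix_spawn() could even be
considered. Which primitive is actually used remains an implementation detail
that depends on the platform and the Python version.

```python
import subprocess
import sys


def run_child(close_fds):
    """Run a trivial child process and return its stripped stdout.

    Hypothetical helper for illustration only. sys.executable is an
    absolute path to the running interpreter, so no PATH lookup is needed.
    """
    result = subprocess.run(
        [sys.executable, "-c", "print('hello')"],
        close_fds=close_fds,      # True is the subprocess default
        stdout=subprocess.PIPE,   # capture the child's output
        check=True,               # raise if the child exits non-zero
    )
    return result.stdout.decode().strip()


if __name__ == "__main__":
    # Both calls behave the same from the caller's point of view; only the
    # set of process-creation primitives available to subprocess differs.
    print(run_child(close_fds=True))
    print(run_child(close_fds=False))
```

The observable result is the same in both cases; the difference is only in
how the child may be created internally.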

However, Linux provides the vfork() system call, which creates a new process
without copying the address space of the parent, and which is actually
used by C libraries to efficiently implement posix_spawn(). Because the
child shares the address space, and even the stack, with the parent, extreme
care is required to use vfork(). At least the following restrictions must hold:

* No signal handlers must execute in the child process. Otherwise, they
  might clobber memory shared with the parent, potentially confusing it.

* Any library function called after vfork() in the child must be
  async-signal-safe (as for fork()), but it must also not interact with any
  library state in a way that might break due to address space sharing
  and/or lack of any preparations performed by libraries on normal fork().
  POSIX.1 permits calling only execve() and _exit(), and later revisions
  remove the vfork() specification entirely. In practice, however, almost all
  operations needed by subprocess.Popen() can be safely implemented on
  Linux.

* Because the child shares the stack with the parent, it must be careful
  not to clobber local variables that are live across the vfork() call.
  Compilers are normally aware of this and take extra care with vfork()
  (and setjmp(), which has a similar problem).

* In case the parent is privileged, special attention must be paid to vfork()
  use, because sharing an address space across different privilege domains
  is insecure[1].

This patch adds support for using vfork() instead of fork() on Linux
when it's possible to do safely given the above. In particular:

* vfork() is not used if a credential switch is requested. The reverse case
  (a simple subprocess.Popen() call while another application thread switches
  credentials concurrently) is not possible for pure-Python apps because
  subprocess.Popen() and functions like os.setuid() are mutually excluded
  via the GIL. We might also consider adding a way to opt out of vfork() (and
  posix_spawn() on platforms where it might be implemented via vfork()) in
  a future PR.

* vfork() is not used if `preexec_fn` is not None.

With this change, subprocess will still use posix_spawn() if possible, but
will fall back to vfork() on Linux in most cases, and, failing that,
to fork().
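
The selection order described above can be sketched in Python as follows.
This is only a model of the rules stated in this message, not the actual
code in _posixsubprocess.c: the function `choose_creation_primitive` and its
`posix_spawn_ok` and `on_linux` flags are hypothetical, and a credential
switch is assumed to mean that any of `user`, `group` or `extra_groups` was
passed.

```python
def choose_creation_primitive(*, posix_spawn_ok, on_linux,
                              preexec_fn=None, user=None, group=None,
                              extra_groups=None):
    """Return which primitive the rules in this message would select.

    Hypothetical model for illustration; the names and the string return
    values are assumptions, not subprocess internals.
    """
    # posix_spawn() is still preferred whenever the request can be
    # expressed through it.
    if posix_spawn_ok:
        return "posix_spawn"

    # Assumption: any of these arguments requests a credential switch.
    credential_switch = any(
        value is not None for value in (user, group, extra_groups)
    )

    # vfork() is only used on Linux, without preexec_fn and without a
    # credential switch.
    if on_linux and preexec_fn is None and not credential_switch:
        return "vfork"

    # Everything else keeps the previous behaviour.
    return "fork"


if __name__ == "__main__":
    print(choose_creation_primitive(posix_spawn_ok=False, on_linux=True))
    print(choose_creation_primitive(posix_spawn_ok=False, on_linux=True,
                                    preexec_fn=lambda: None))
```

Under this model, a default subprocess.Popen() call on Linux (which cannot
use posix_spawn() because of `close_fds=True`) would take the vfork() path,
while passing `preexec_fn` or a credential argument would keep it on fork().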

[1] https://ewontfix.com/7

Co-authored-by: Gregory P. Smith [Google LLC] <gps@google.com>
2020-10-23 17:47:01 -07:00
..
_blake2 bpo-1635741: Port _blake2 module to multi-phase init (GH-21856) 2020-09-02 11:45:13 +02:00
_ctypes bpo-16396: Allow wintypes to be imported on non-Windows systems. (GH-21394) 2020-10-19 23:06:05 +01:00
_decimal Revert "bpo-26680: Incorporate is_integer in all built-in and standard library numeric types (GH-6121)" (GH-22584) 2020-10-07 16:43:44 -07:00
_io bpo-40170: Use inline _PyType_HasFeature() function (GH-22375) 2020-09-23 14:08:38 +02:00
_multiprocessing bpo-1635741: Fix NULL ptr deref in multiprocessing (GH-22880) 2020-10-22 03:20:36 -07:00
_sha3 bpo-1635741: Port _sha3 module to multi-phase init (GH-21855) 2020-09-02 11:55:19 +02:00
_sqlite bpo-42021: Fix possible ref leaks during _sqlite3 module init (GH-22673) 2020-10-15 21:20:15 +09:00
_ssl bpo-41056: Fix a NULL pointer dereference on MemoryError within the ssl module. (GH-21009) 2020-06-20 12:15:03 -07:00
_xxtestfuzz Fuzz struct.unpack and catch RecursionError in re.compile (GH-18679) 2020-02-27 23:05:02 -08:00
cjkcodecs bpo-37999: No longer use __int__ in implicit integer conversions. (GH-15636) 2020-05-26 18:43:38 +03:00
clinic bpo-4356: Add key function support to the bisect module (GH-20556) 2020-10-19 22:04:01 -07:00
expat bpo-37731: Reorder includes in xmltok.c to avoid redefinition of _POSIX_C_SOURCE (GH-16733) 2019-10-12 20:14:11 +01:00
_abc.c bpo-40217: Ensure Py_VISIT(Py_TYPE(self)) is always called for PyType_FromSpec types (reverts GH-19414) (GH-20264) 2020-05-27 02:03:38 -07:00
_asynciomodule.c Delete TaskWakeupMethWrapper_Type and use PyCFunction instead (#22875) 2020-10-21 17:49:10 -07:00
_bisectmodule.c bpo-4356: Add key function support to the bisect module (GH-20556) 2020-10-19 22:04:01 -07:00
_bz2module.c bpo-40077: Convert _bz2 module to use PyType_FromSpec (GH-20960) 2020-06-20 00:56:13 +09:00
_codecsmodule.c bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513) 2020-10-16 10:34:15 +02:00
_collectionsmodule.c bpo-40521: Remove freelist from collections.deque() (GH-21073) 2020-06-23 06:50:15 -07:00
_contextvarsmodule.c bpo-1635741: Port _contextvars module to multiphase initialization (PEP 489) (GH-18374) 2020-02-17 14:49:26 +01:00
_cryptmodule.c bpo-1635741: Port _crypt extension module to multiphase initialization (PEP 489) (GH-18404) 2020-02-17 10:11:34 +01:00
_csv.c bpo-12178: Fix escaping of escapechar in csv.writer() (GH-13710) 2020-09-20 09:38:07 +03:00
_curses_panel.c bpo-1635741 port _curses_panel to multi-phase init (PEP 489) (GH-21986) 2020-09-07 17:14:25 +02:00
_cursesmodule.c bpo-36982: Add support for extended color functions in ncurses 6.1 (GH-17536) 2020-08-03 23:51:33 -04:00
_datetimemodule.c bpo-41867: List options for timespec in docstrings of isoformat methods (GH-22418) 2020-10-03 13:43:47 +03:00
_dbmmodule.c bpo-1635741: Port _dbm module to multiphase initialization (GH-20848) 2020-06-16 01:20:54 +09:00
_elementtree.c bpo-39573: Use the Py_TYPE() macro (GH-21433) 2020-07-10 12:40:38 +02:00
_functoolsmodule.c bpo-31082: Use "iterable" in the docstring for functools.reduce() (GH-20796) 2020-06-28 15:40:54 +09:00
_gdbmmodule.c bpo-1635741: Port _gdbm module to multiphase initialization (GH-20920) 2020-06-17 01:41:23 +09:00
_hashopenssl.c bpo-40791: Use CRYPTO_memcmp() for compare_digest (#20456) 2020-05-27 21:50:06 +02:00
_heapqmodule.c bpo-41078: Add pycore_list.h internal header file (GH-21057) 2020-06-22 17:39:32 +02:00
_json.c bpo-40217: Ensure Py_VISIT(Py_TYPE(self)) is always called for PyType_FromSpec types (reverts GH-19414) (GH-20264) 2020-05-27 02:03:38 -07:00
_localemodule.c bpo-38324: Fix test__locale.py Windows failures (GH-20529) 2020-10-20 12:39:52 +01:00
_lsprof.c bpo-1635741: Port _lsprof extension to multi-phase init (PEP 489) (GH-22220) 2020-09-23 12:33:21 +02:00
_lzmamodule.c bpo-1635741: Port _lzma module to multiphase initialization (GH-19382) 2020-06-23 00:53:07 +09:00
_math.c Issue #28256: Cleanup _math.c 2016-10-18 16:29:27 +02:00
_math.h Issue #28256: Cleanup _math.c 2016-10-18 16:29:27 +02:00
_opcode.c bpo-1635741: Port _opcode module to multi-phase init (PEP 489) (GH-22050) 2020-09-07 10:48:44 +02:00
_operator.c bpo-40077: Convert _operator to use PyType_FromSpec (GH-21954) 2020-08-27 02:22:27 +09:00
_pickle.c bpo-41288: Refactor of unpickling NEWOBJ and NEWOBJ_EX opcodes. (GH-21472) 2020-07-18 11:11:21 +03:00
_posixsubprocess.c bpo-35823: subprocess: Use vfork() instead of fork() on Linux when safe (GH-11671) 2020-10-23 17:47:01 -07:00
_queuemodule.c bpo-40268: Remove unused structmember.h includes (GH-19530) 2020-04-15 02:35:41 +02:00
_randommodule.c bpo-41052: Opt out serialization/deserialization for _random.Random (GH-21002) 2020-06-21 18:44:58 +09:00
_scproxy.c bpo-1635741: port scproxy to multi-phase init (GH-22164) 2020-09-09 12:28:48 +09:00
_sre.c bpo-40943: Replace PY_FORMAT_SIZE_T with "z" (GH-20781) 2020-06-10 18:38:05 +02:00
_ssl.c bpo-31122: ssl.wrap_socket() now raises ssl.SSLEOFError rather than OSError when peer closes connection during TLS negotiation (GH-18772) 2020-08-15 10:01:19 -07:00
_ssl_data.h closes bpo-40266, closes bpo-39953: Use numeric lib code if compiling against old OpenSSL. (GH-19506) 2020-04-13 22:11:40 -05:00
_stat.c bpo-40677: Define IO_REPARSE_TAG_APPEXECLINK explicitly (GH-20206) 2020-05-19 13:22:16 +01:00
_statisticsmodule.c bpo-40268: Remove unused structmember.h includes (GH-19530) 2020-04-15 02:35:41 +02:00
_struct.c bpo-40792: Make the result of PyNumber_Index() always having exact type int. (GH-20443) 2020-05-28 10:33:45 +03:00
_testbuffer.c closes bpo-39736: const strings in Modules/_datetimemodule.c and Modules/_testbuffer.c (GH-18637) 2020-02-23 22:40:43 -08:00
_testcapimodule.c bpo-41984: GC track all user classes (GH-22701) 2020-10-14 18:44:07 -07:00
_testimportmultiple.c Remove compile warnings for _testimportmodule 2012-12-15 18:16:47 +02:00
_testinternalcapi.c bpo-29778: test_embed tests the path configuration (GH-21306) 2020-07-08 00:20:37 +02:00
_testmultiphase.c _testmultiphase: Fix possible ref leak (GH-22881) 2020-10-22 02:44:18 -07:00
_threadmodule.c bpo-40453: Add PyConfig._isolated_subinterpreter (GH-19820) 2020-05-01 11:33:44 +02:00
_tkinter.c Trivial typo fix in _tkinter.c (GH-19622) 2020-05-15 03:43:58 -07:00
_tracemalloc.c bpo-41995: Fix null ptr deref in tracemalloc_copy_trace() (GH-22660) 2020-10-13 08:46:31 +02:00
_uuidmodule.c bpo-40501: Replace ctypes code in uuid with native module (GH-19948) 2020-05-12 23:32:32 +01:00
_weakref.c bpo-40170: PyObject_GET_WEAKREFS_LISTPTR() becomes a function (GH-19377) 2020-04-06 14:07:02 +02:00
_winapi.c bpo-1635741: Port _winapi ext to multi-stage init (GH-21371) 2020-08-13 16:22:48 +02:00
_xxsubinterpretersmodule.c bpo-40941: Unify implicit and explicit state in the frame and generator objects into a single value. (GH-20803) 2020-07-17 11:44:23 +01:00
_zoneinfo.c bpo-30155: Add macros to get tzinfo from datetime instances (GH-21633) 2020-09-23 14:43:45 -04:00
addrinfo.h replace PY_LONG_LONG with long long 2016-09-06 10:46:49 -07:00
arraymodule.c bpo-29727: Register array.array as a MutableSequence (GH-21338) 2020-07-05 22:43:14 +01:00
atexitmodule.c Fix atexitmodule doc (GH-21456) 2020-07-26 20:33:00 -03:00
audioop.c bpo-39824: Convert PyModule_GetState() to get_module_state() (GH-19076) 2020-03-19 10:11:33 -07:00
binascii.c Use calloc-based functions, not malloc. (GH-19152) 2020-03-24 23:26:44 -05:00
cmathmodule.c bpo-1635741: Port cmath to multi-phase init (PEP 489) (GH-22165) 2020-09-10 16:09:04 +02:00
config.c.in rename _imp initialization function to follow conventions (#5432) 2018-01-29 11:33:57 -08:00
errnomodule.c bpo-1635741: Port errno module to multiphase initialization (GH-19923) 2020-05-07 10:17:16 +09:00
faulthandler.c bpo-1635741: Port faulthandler module to multiphase initialization (GH-21294) 2020-07-04 01:36:47 +09:00
fcntlmodule.c bpo-41586: Add pipesize parameter to subprocess & F_GETPIPE_SZ and F_SETPIPE_SZ to fcntl. (GH-21921) 2020-10-19 16:30:02 -07:00
gc_weakref.txt Issue #13575: there is only one class type. 2011-12-12 18:54:29 +01:00
gcmodule.c bpo-40521: Make dict free lists per-interpreter (GH-20645) 2020-06-23 11:33:18 +02:00
getaddrinfo.c bpo-32241: Add the const qualifire to declarations of umodifiable strings. (#4748) 2017-12-12 13:55:04 +02:00
getbuildinfo.c bpo-27593: Get SCM build info from git instead of hg. (#446) 2017-03-04 00:19:55 -05:00
getnameinfo.c Issue #15538: Fix compilation of the getnameinfo() / getaddrinfo() emulation code. 2012-08-02 20:37:12 +02:00
getpath.c bpo-40947: getpath.c uses PyConfig.platlibdir (GH-20807) 2020-06-11 17:28:52 +02:00
grpmodule.c bpo-37999: No longer use __int__ in implicit integer conversions. (GH-15636) 2020-05-26 18:43:38 +03:00
hashlib.h bpo-31370: Remove support for threads-less builds (#3385) 2017-09-07 18:56:24 +02:00
itertoolsmodule.c bpo-41078: Rename pycore_tupleobject.h to pycore_tuple.h (GH-21056) 2020-06-22 17:27:35 +02:00
ld_so_aix.in Issue #10656: Fix out-of-tree building on AIX 2016-11-20 07:56:37 +00:00
main.c bpo-41602: raise SIGINT exit code on KeyboardInterrupt from pymain_run_module (#21956) 2020-09-22 08:53:03 -07:00
makesetup closes bpo-34212: Build core extension modules with Py_BUILD_CORE_BUILTIN. (GH-8712) 2018-11-26 20:21:31 -06:00
makexp_aix
mathmodule.c Update link to supporting references (GH-22488) 2020-10-01 19:30:54 -07:00
md5module.c md5module: Fix doc strings variable names (GH-22722) 2020-10-20 18:10:43 +09:00
mmapmodule.c bpo-1635741: Port mmap module to multiphase initialization (GH-19459) 2020-06-06 00:01:02 +09:00
nismodule.c bpo-40950: Port nis module to multiphase initialization (GH-20811) 2020-06-12 11:26:00 +09:00
ossaudiodev.c bpo-40268: Remove unused structmember.h includes (GH-19530) 2020-04-15 02:35:41 +02:00
overlapped.c bpo-1635741: Port _overlapped module to multi-phase init (GH-22051) 2020-09-07 15:12:40 +02:00
posixmodule.c bpo-40422: Move _Py_closerange to fileutils.c (GH-22680) 2020-10-13 22:04:44 +02:00
posixmodule.h bpo-40422: Move _Py_closerange to fileutils.c (GH-22680) 2020-10-13 22:04:44 +02:00
pwdmodule.c bpo-39968: Convert extension modules' macros of get_module_state() to inline functions (GH-19017) 2020-03-16 14:15:01 +01:00
pyexpat.c bpo-40268: Remove unused structmember.h includes (GH-19530) 2020-04-15 02:35:41 +02:00
readline.c bpo-20181: Convert the readline module to the Argument Clinic (#14326) 2020-07-12 19:01:03 +03:00
README Issue #18093: Factor out the programs that embed the runtime 2014-07-25 21:52:14 +10:00
resource.c bpo-1635741: Port resource extension module to multiphase initialization (PEP 489) (GH-19252) 2020-04-02 14:35:08 +02:00
rotatingtree.c
rotatingtree.h bpo-32150: Expand tabs to spaces in C files. (#4583) 2017-11-28 17:56:10 +02:00
selectmodule.c bpo-41985: Add _PyLong_FileDescriptor_Converter and AC converter for "fildes". (GH-22620) 2020-10-09 23:00:45 +03:00
Setup bpo-40422: Move _Py_closerange to fileutils.c (GH-22680) 2020-10-13 22:04:44 +02:00
sha1module.c bpo-1635741: Port _sha1, _sha512, _md5 to multiphase init (GH-21818) 2020-09-06 12:09:51 +02:00
sha256module.c bpo-1635741: Convert _sha256 types to heap types (GH-22134) 2020-09-08 11:16:14 +02:00
sha512module.c bpo-1635741: Port _sha1, _sha512, _md5 to multiphase init (GH-21818) 2020-09-06 12:09:51 +02:00
signalmodule.c bpo-41713: _signal doesn't use multi-phase init (GH-22087) 2020-09-04 14:51:05 +02:00
socketmodule.c bpo-36020: Remove snprintf macro in pyerrors.h (GH-20889) 2020-06-15 21:59:47 +02:00
socketmodule.h bpo-40291: Add support for CAN_J1939 sockets (GH-19538) 2020-04-29 15:31:19 -07:00
spwdmodule.c [security] bpo-13617: Reject embedded null characters in wchar* strings. (#2302) 2017-06-28 08:30:06 +03:00
sre.h bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345) 2020-04-11 10:48:40 +03:00
sre_constants.h bpo-31690: Allow the inline flags "a", "L", and "u" to be used as group flags for RE. (#3885) 2017-10-24 23:31:42 +03:00
sre_lib.h bpo-40943: Replace PY_FORMAT_SIZE_T with "z" (GH-20781) 2020-06-10 18:38:05 +02:00
symtablemodule.c bpo-37253: Add _PyCompilerFlags_INIT macro (GH-14018) 2019-06-13 02:16:41 +02:00
syslogmodule.c bpo-1635741: Port syslog module to multiphase initialization (GH-19907) 2020-05-05 10:49:46 +09:00
termios.c bpo-20184: Convert termios to Argument Clinic. (GH-22693) 2020-10-18 17:54:06 +03:00
testcapi_long.h Issue #9530: Fix undefined behaviour due to signed overflow in testcapi_long.h. 2011-11-19 17:58:15 +00:00
timemodule.c bpo-40192: Use thread_cputime for time.thread_time to improve resolution (GH-19381) 2020-05-16 11:39:09 +02:00
tkappinit.c Issue #4350: Removed a number of out-of-dated and non-working for a long time 2014-07-23 22:33:50 +03:00
tkinter.h Issue #16840. Turn off bignum support in tkinter with with Tcl earlier than 8.5.8 2015-04-22 10:59:32 +03:00
unicodedata.c bpo-1635741: Add a global module state to unicodedata (GH-22712) 2020-10-15 16:22:19 +02:00
unicodedata_db.h closes bpo-39926: Update Unicode to 13.0.0. (GH-18910) 2020-03-10 20:41:34 -07:00
unicodename_db.h closes bpo-39926: Update Unicode to 13.0.0. (GH-18910) 2020-03-10 20:41:34 -07:00
winreparse.h bpo-31512: Add non-elevated symlink support for Windows (GH-3652) 2019-04-09 11:19:46 -07:00
xxlimited.c bpo-40217: Ensure Py_VISIT(Py_TYPE(self)) is always called for PyType_FromSpec types (reverts GH-19414) (GH-20264) 2020-05-27 02:03:38 -07:00
xxmodule.c bpo-39573: Clean up modules and headers to use Py_IS_TYPE() function (GH-18521) 2020-02-17 11:09:15 +01:00
xxsubtype.c bpo-40268: Remove unused structmember.h includes (GH-19530) 2020-04-15 02:35:41 +02:00
zlibmodule.c bpo-1635741 port zlib module to multi-phase init (GH-21995) 2020-09-07 10:27:55 +02:00

Source files for standard library extension modules,
and former extension modules that are now builtin modules.