cpython/Lib
Barry Warsaw 27ebd9abce
gh-150228: Improve the PEP 829 batch processing APIs (#150542)
* gh-150228: Improve the PEP 829 batch processing APIs

As previously discussed with @ncoghlan and approved for 3.15b2 by @hugovk,
this implements the batch processing APIs for addsitedir() and friends.  We
remove the `defer_processing_start_files` flag which required some implicit
module global state, and promote StartupState to the public documented API.

This also moves the bulk of the module global functions into methods of the
`StartupState` class, which removes the awkward APIs from 3.15b1.  An instance
of this class now acts as an accumulator for startup state, and
`StartupState.process()` processes whatever has been accumulated.  Callers can
batch up startup state themselves by using the methods on this class.  The
module global functions are shims which preserve the legacy APIs and semantics
using the new state class.

This PR also fixes the interleaving regression identified by @ncoghlan in the
same issue.  Now, .pth file sys.path extensions are added to sys.path after
the sitedir that the .pth file is found in, restoring the legacy behavior.
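
To illustrate the accumulate-then-process pattern and the restored ordering,
here is a minimal toy sketch.  It is not the real `site.StartupState`: the
class name `ToyStartupState`, the `pth_extensions` parameter, and the explicit
`path` list are assumptions made for illustration only, and real concerns such
as reading `.pth` files from disk or case normalization are left out.

```python
class ToyStartupState:
    """Hypothetical stand-in for illustration; NOT the real site.StartupState."""

    def __init__(self, known_paths=None):
        # As the commit message notes for the real class, a caller-provided
        # known_paths set is mutated in place rather than copied.
        self.known_paths = set() if known_paths is None else known_paths
        # Ordered (sitedir, [pth extensions]) pairs waiting to be processed.
        self._pending = []

    def addsitedir(self, sitedir, pth_extensions=()):
        # Only record the request; nothing touches the path list yet.
        self._pending.append((sitedir, list(pth_extensions)))

    def process(self, path):
        # Each sitedir is followed directly by the entries its .pth files
        # contributed, which is the interleaved ordering described above.
        for sitedir, extensions in self._pending:
            for entry in (sitedir, *extensions):
                if entry not in self.known_paths:
                    self.known_paths.add(entry)
                    path.append(entry)
        return path


state = ToyStartupState()
state.addsitedir("/site-a", ["/site-a/extra"])
state.addsitedir("/site-b", ["/site-b/extra"])
toy_path = state.process([])
print(toy_path)
```

In this toy, `/site-a/extra` lands right after `/site-a` and before `/site-b`,
rather than all sitedirs first and all extensions afterwards.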

Along the way, I've made a lot of improvements to function docstrings,
site.rst documentation, and comments in the code explaining what's going on.

* Add a note that if known_paths is provided to StartupState.__init__(), it
  will get mutated in place.
* Improve some conditional flows.
* Improve some comments.
* Improve the what's new entry.

* Make test_impl_exec_imports_suppressed_by_matching_start() more robust

Based on a PR comment, we need to read both the .pth and .start files, and prove
that the .pth file's import line (which passes a bigger increment) is not
called, but the .start file's entry point (which uses the default increment)
is called.

* As per review, move some methods to the private API

_read_pth_file() and _read_start_file() are not intended to be part of the
public API surface outside of the site module, so even though they are used by
methods outside of the StartupState class, make them privately named.

* Address several review comments

* Move a `versionadded`
* Better list comprehension formatting (use the output from
  `ruff format --line-length 78`)

* Add docs for site.makepath() and point the case-normalization requirement to
  this utility function.
* Note that StartupState.process() is not idempotent.

* Address another feedback comment

This time, we get rid of the legacy implementation's `reset` local, which was
always difficult to understand, and just implement a return value based on the
processing mode selected.

* Changes based on gh-150228 review

The comment by @encukou that started this change:

```
I still see two red flags here though: an argument that doesn't combine with
other arguments, and (another instance of) changing the return type based on
an argument.

Did you consider adding a StartupState.addsitedir(sitedir) method, instead of
the startup_state argument?
```

As it turns out, this is an even cleaner design.  By moving the bulk of the
previous module global functions into `StartupState` methods, we can get rid
of all the awkward `startup_state` keyword-only arguments which conflict
with `known_paths` (Petr's first point).  We can also get rid of the
return value dichotomy (Petr's second point) because now we can preserve
exactly the Python 3.14 API in the module global functions, and implement
the better APIs in the class methods.  We also generally don't have to
pass around `process_known_sitedirs`.

Now the following module global functions are essentially shims around
class methods:

* site.addsitedir() -> StartupState.addsitedir()
* site.addusersitepackages() -> StartupState.addusersitepackages()
* site.addsitepackages() -> StartupState.addsitepackages()

* Additional minor changes
* Remove a now unused parameter
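
The shim relationship between the module global functions and the class
methods, described above, can be pictured with a small toy sketch.  It does
not reproduce the real `site` module: `ToyState`, `toy_addsitedir()`, and the
explicit `path` argument are hypothetical names chosen for illustration, and
the real return values and signatures are not modelled here.

```python
class ToyState:
    """Hypothetical accumulator; NOT the real site.StartupState."""

    def __init__(self, known_paths=None):
        self.known_paths = set() if known_paths is None else known_paths
        self.sitedirs = []

    def addsitedir(self, sitedir):
        # Accumulate each directory once; duplicates are skipped.
        if sitedir not in self.known_paths:
            self.known_paths.add(sitedir)
            self.sitedirs.append(sitedir)

    def process(self, path):
        # Apply everything accumulated so far in a single step.
        path.extend(self.sitedirs)


def toy_addsitedir(sitedir, path, known_paths=None):
    """Legacy-style function: a thin shim that builds a state, adds one
    directory, and processes it immediately."""
    state = ToyState(known_paths)
    state.addsitedir(sitedir)
    state.process(path)


# Legacy style: every call is processed on its own.
legacy_path = []
toy_addsitedir("/a", legacy_path)
toy_addsitedir("/b", legacy_path)

# Batch style: the caller accumulates first, then processes once.
batch = ToyState()
batch.addsitedir("/a")
batch.addsitedir("/b")
batch_path = []
batch.process(batch_path)
print(legacy_path, batch_path)
```

Keeping the module-level function as a wrapper lets existing callers keep
their one-call-at-a-time usage, while new callers can hold on to a state
object and decide when processing happens.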


Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2026-06-01 18:43:18 -07:00
..
__phello__
_pyrepl gh-130472: Use fancycompleter in import completions (#148188) 2026-05-05 01:36:43 +00:00
asyncio gh-150345: Fix incorrect reference in BaseSelectorEventLoop docstring (#150538) 2026-05-29 15:24:58 +05:30
collections gh-147957: pop items from UserDict in LIFO order (gh-147958) 2026-04-14 23:29:41 -05:00
compression gh-150285: Fix too long docstrings in the zstd module (GH-150291) 2026-05-24 15:03:22 +03:00
concurrent Fix pyflakes warnings: variable is assigned to but never used (#142294) 2025-12-08 14:00:31 +01:00
ctypes gh-149831: Fix ctypes DLL library name on Cygwin (#149832) 2026-05-15 14:29:26 +00:00
curses gh-150285: Fix too long docstrings in the curses module (GH-150286) 2026-05-24 15:02:12 +03:00
dbm Fix typo: 'exept' -> 'except' in Lib/dbm/dumb.py (GH-144060) 2026-01-20 08:50:34 +02:00
email gh-88726: Stop using non-standard charset names eucgb2312_cn and big5_tw in email (GH-149959) 2026-05-26 21:52:47 +03:00
encodings gh-62259: Add support of multi-byte encodings in the XML parser (GH-149860) 2026-05-26 19:40:25 +00:00
ensurepip gh-150685: update bundled pip to 26.1.2 (gh-150686) 2026-05-31 20:28:02 +01:00
html gh-140875: Fix handling of unclosed charrefs before EOF in HTMLParser (GH-140904) 2025-11-19 13:55:10 +02:00
http gh-149144: Use decodeURIComponent() for UTF-8 support in js_output() (GH-149157) 2026-05-14 23:10:39 +02:00
idlelib gh-139551: add support for BaseExceptionGroup in IDLE (GH-139563) 2026-04-12 10:06:41 -07:00
importlib GH-83065: Fix import deadlock by implementing hierarchical module locking (GH-137196) 2026-04-28 01:06:23 -07:00
json gh-149056: Properly pass array_hook in json.load() to json.loads() (GH-149057) 2026-05-29 22:53:21 +03:00
logging gh-132372: Speed up logging.config existing logger handling (GH-150242) 2026-05-29 16:50:05 +01:00
multiprocessing gh-149879: Fix multiprocessing tests on Cygwin (#150031) 2026-05-19 00:45:35 +02:00
pathlib gh-86533: Restore os.makedirs() ability to apply *mode* recursively (GH-150011) 2026-05-18 23:00:27 +00:00
profiling gh-150258: Show relative percentage on Tachyon flamegraph (#150266) 2026-05-23 08:31:26 -04:00
pydoc_data Python 3.15.0b1 2026-05-07 16:26:31 +03:00
re gh-86519: Add prefixmatch APIs to the re module (GH-31137) 2026-02-15 17:43:39 -08:00
site-packages
sqlite3 gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
string GH-132661: Add `string.templatelib.convert()` (#135217) 2025-07-15 11:56:42 +02:00
sysconfig gh-150208: Avoid double-quoting string values in sysconfigdata (#150209) 2026-05-25 13:30:07 +01:00
test gh-150228: Improve the PEP 829 batch processing APIs (#150542) 2026-06-01 18:43:18 -07:00
tkinter gh-47655: Add support for user data and detail of Tk events to tkinter (GH-7142) 2026-02-25 10:34:00 +02:00
tomllib Update mypy to 2.1.0 (#149709) 2026-05-12 08:40:51 +00:00
turtledemo gh-137586: Open external osascript program with absolute path (GH-137584) 2026-04-06 09:42:10 -07:00
unittest gh-150175: Fix ThreadingMock call_count race condition (#150176) 2026-05-21 08:38:07 +01:00
urllib gh-106693: Revert "Explicitly mark ob_sval as unsigned char to avoid UB (#106826)" (#149514) 2026-05-07 23:39:08 +03:00
venv gh-149879: Fix test_venv on Cygwin (#150483) 2026-05-26 18:32:13 +02:00
wsgiref gh-144370: Disallow usage of control characters in status in wsgiref.handlers for security (#144371) 2026-03-06 13:22:21 +01:00
xml gh-149489: Fix ElementTree serialization to HTML (GH-149490) 2026-05-30 00:04:50 +03:00
xmlrpc gh-136839: Refactor simple dict.update calls (#136811) 2025-07-19 10:12:10 -07:00
zipfile gh-84353: Preserve non-UTF-8 filenames when appending to ZipFile (GH-150091) 2026-05-27 17:56:38 +00:00
zoneinfo gh-145883: Fix two heap-buffer-overflows in _zoneinfo (#145885) 2026-04-04 13:29:17 +01:00
__future__.py
__hello__.py
_aix_support.py
_android_support.py gh-144415: Android testbed fixes (#142912) 2026-02-03 16:37:34 +08:00
_apple_support.py
_ast_unparse.py gh-143055: Fix crash in AST unparser when unparsing dict comprehension unpacking (#145556) 2026-03-09 10:37:23 -07:00
_collections_abc.py gh-150285: Fix too long docstrings in some Python modules (GH-150366) 2026-05-25 07:33:54 +00:00
_colorize.py gh-148352: Add more colour to calendar CLI output (#148354) 2026-05-04 15:14:57 +03:00
_compat_pickle.py gh-142349: Implement PEP 810 - Explicit lazy imports (#142351) 2026-02-12 00:15:33 +00:00
_ios_support.py
_markupbase.py
_opcode_metadata.py GH-143732: SEND specialization (GH-148963) 2026-05-05 15:19:16 +01:00
_osx_support.py gh-136677: Introduce executable specific linker flags to configure (#137296) 2026-02-24 22:52:02 +00:00
_py_abc.py
_py_warnings.py gh-143231: Add the module attribute to warnings.WarningMessage (GH-149298) 2026-05-03 09:35:47 +00:00
_pydatetime.py gh-97517: Add documentation links to datetime strftime/strptime docstrings (#138559) 2025-09-15 19:50:46 +01:00
_pydecimal.py gh-150285: Fix too long docstrings in the decimal module (GH-150288) 2026-05-24 15:02:32 +03:00
_pyio.py gh-150285: Fix too long docstrings in the io module (GH-150287) 2026-05-24 15:02:21 +03:00
_pylong.py
_sitebuiltins.py gh-150427: Remove unused __linecnt attribute from _sitebuiltins (#150428) 2026-05-25 22:56:27 +02:00
_strptime.py GH-70647: Remove support for %d (and deprecate for %e) without year in strptime() (GH-144570) 2026-04-14 17:15:27 -07:00
_threading_local.py Fix pyflakes warnings: variable is assigned to but never used (#142294) 2025-12-08 14:00:31 +01:00
_weakrefset.py
abc.py gh-149609: Raise deprecation warnings for abc.{abstractclassmethod,abstractstaticmethod,abstractproperty} (#149636) 2026-05-31 07:26:52 +00:00
annotationlib.py gh-149528: Remove annotationlib.ForwardRef._evaluate for 3.16 (#149529) 2026-05-08 07:48:15 +03:00
antigravity.py
argparse.py gh-149614 - Restore deepcopiability of argparse.ArgumentParser instances (#149617) 2026-05-11 15:28:23 +00:00
ast.py gh-140344: ast: Add deprecation warnings (#140345) 2026-05-18 11:20:49 -07:00
base64.py gh-134837: Correct and improve base85 documentation for base64 and binascii modules (GH-145843) 2026-05-12 22:46:46 +03:00
bdb.py gh-136057: Allow step and next to step over for loops (#136160) 2025-11-16 13:57:07 -08:00
bisect.py
bz2.py gh-132983: Introduce compression package and move _compression module (GH-133018) 2025-04-27 14:41:30 -07:00
calendar.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
cmd.py gh-133363: Fix Cmd completion for lines beginning with ! (#133364) 2025-05-03 22:50:37 -04:00
code.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
codecs.py gh-62259: Add support of multi-byte encodings in the XML parser (GH-149860) 2026-05-26 19:40:25 +00:00
codeop.py Fix pyflakes warnings: variable is assigned to but never used (#142294) 2025-12-08 14:00:31 +01:00
colorsys.py
compileall.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
configparser.py gh-148370: prevent quadratic behavior in configparser.ParsingError.combine (#148452) 2026-04-14 00:32:54 +02:00
contextlib.py gh-125862: Keep ContextDecorator open across generator/coroutine execution (GH-136212) 2026-04-28 05:26:38 +00:00
contextvars.py
copy.py gh-141510: Fix copy.deepcopy() for recursive frozendict (#145027) 2026-02-21 15:30:40 +00:00
copyreg.py gh-132882: Fix copying of unions with members that do not support __or__ (#132883) 2025-04-24 16:49:09 +00:00
cProfile.py Remove unused imports (#142320) 2025-12-06 11:27:31 +00:00
csv.py gh-137627: Make csv.Sniffer.sniff() delimiter detection 1.6x faster (#137628) 2025-10-23 15:28:29 +03:00
dataclasses.py gh-79413: Add qualname parameter to dataclass.make_dataclass. (GH-150026) 2026-05-18 19:55:47 -04:00
datetime.py
decimal.py gh-76007: Deprecate __version__ attribute in decimal (#140302) 2025-10-26 12:01:04 +01:00
difflib.py gh-149189: Revert "Modern defaults for pprint (#149190)" (#150249) 2026-05-22 23:22:03 +03:00
dis.py GH-150478: Add "show_jit" option to dis.dis to show jit entry points (GH-150554) 2026-06-01 17:52:40 +01:00
doctest.py gh-144384: Lazily import _colorize (#149318) 2026-05-06 16:07:43 +00:00
enum.py gh-139398: [Enum] Add supported sunder names to __dir__ for REPL completions (GH-139985) 2026-05-28 12:55:38 -07:00
filecmp.py
fileinput.py
fnmatch.py gh-133306: Use \z instead of \Z in fnmatch.translate() and glob.translate() (GH-133338) 2025-05-03 17:58:21 +03:00
fractions.py gh-72902: Speedup Fraction.from_decimal/float in typical cases (GH-133251) 2026-05-25 10:04:56 +03:00
ftplib.py gh-87451: Apply CVE-2021-4189 PASV fix to ftplib.ftpcp() (GH-149648) 2026-05-13 17:33:43 +00:00
functools.py gh-150285: Fix too long docstrings in some Python modules (GH-150366) 2026-05-25 07:33:54 +00:00
genericpath.py gh-74453: Deprecate os.path.commonprefix (#144436) 2026-02-05 22:37:05 +02:00
getopt.py
getpass.py gh-138577: Fix keyboard shortcuts in getpass with echo_char (#141597) 2026-03-30 11:11:13 +02:00
gettext.py gh-141510: Use frozendict in the stdlib (#144909) 2026-03-06 10:25:09 +01:00
glob.py gh-150285: Fix too long docstrings in some Python modules (GH-150366) 2026-05-25 07:33:54 +00:00
graphlib.py GH-143948: Explain graphlib's cycle-finding code (#143950) 2026-01-20 19:28:48 -06:00
gzip.py gh-150285: Fix too long docstrings in some Python modules (GH-150366) 2026-05-25 07:33:54 +00:00
hashlib.py gh-136565: use SHA-256 for hashlib.__doc__ example instead of MD5 (#138157) 2025-08-26 10:38:53 +00:00
heapq.py Indexing is more straight-forward (and faster) than unpacking (gh-145154) 2026-02-23 12:31:35 -06:00
hmac.py gh-142451: correctly copy HMAC attributes in HMAC.copy() (#142510) 2025-12-14 09:45:36 +01:00
imaplib.py gh-142307: deprecate legacy support for altering IMAP4.file (#142335) 2026-05-06 17:41:26 +03:00
inspect.py gh-149083: use sentinel to fix _functools.reduce() signature (#149591) 2026-05-10 15:22:16 -07:00
io.py gh-132952: Speed up startup by importing _io instead of io (#132957) 2025-04-28 08:38:56 -07:00
ipaddress.py gh-141497: Make ipaddress.IP{v4,v6}Network.hosts() always returning an iterator (GH-141547) 2025-11-17 19:29:06 +02:00
keyword.py gh-142349: Implement PEP 810 - Explicit lazy imports (#142351) 2026-02-12 00:15:33 +00:00
linecache.py gh-122255: Synchronize warnings in C and Python implementations of the warnings module (GH-122824) 2025-11-14 16:49:28 +02:00
locale.py gh-140924: In locale module, add missing names to __all__ (GH-140925) 2026-05-11 17:21:03 +03:00
lzma.py gh-115988: Add ARM64 and RISCV BCJ filters constants in lzma module (GH-115989) 2026-05-28 08:05:03 -07:00
mailbox.py bpo-32234: Allow mailbox instances as context managers (GH-4770) 2026-02-16 14:14:26 +01:00
mimetypes.py gh-149720: Remove support for undotted ext in mimetypes.MimeType.add_type (#149721) 2026-05-12 13:40:21 +00:00
modulefinder.py gh-84530: fix namespace package support in modulefinder (#29196) 2025-12-09 15:50:50 +00:00
netrc.py gh-139633: Run netrc file permission check only once per parse (GH-139634) 2026-03-30 22:05:18 +03:00
ntpath.py gh-150285: Fix too long docstrings in some Python modules (GH-150366) 2026-05-25 07:33:54 +00:00
nturl2path.py GH-125866: Deprecate nturl2path module (#131432) 2025-03-19 19:33:01 +00:00
numbers.py gh-122450: Expand documentation for `Rational and Fraction` (#136800) 2025-08-04 02:15:59 +00:00
opcode.py gh-148871: Add CONSTANT_EMPTY_TUPLE to LOAD_COMMON_CONSTANT (GH-149688) 2026-05-21 15:54:46 +01:00
operator.py
optparse.py gh-141510: Use frozendict in the stdlib (#144909) 2026-03-06 10:25:09 +01:00
os.py gh-150285: Fix too long docstrings in the os module (GH-150296) 2026-05-24 15:04:01 +03:00
pdb.py gh-148615: Handle -- separator in pdb argument parsing (#148624) 2026-05-05 21:22:58 -07:00
pickle.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
pickletools.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
pkgutil.py gh-148641: Implement PEP 829 - startup configuration files (#149109) 2026-05-03 17:17:29 +00:00
platform.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
plistlib.py gh-141510: Support frozendict in plistlib (#145590) 2026-03-31 15:45:23 +03:00
poplib.py gh-143923: Reject control characters in POP3 commands 2026-01-20 20:46:32 +00:00
posixpath.py gh-74453: Deprecate os.path.commonprefix (#144436) 2026-02-05 22:37:05 +02:00
pprint.py gh-149189: Revert "Modern defaults for pprint (#149190)" (#150249) 2026-05-22 23:22:03 +03:00
profile.py GH-65961: Stop setting __cached__ on modules (GH-142165) 2025-12-11 11:44:46 -08:00
pstats.py gh-140137: Handle empty collections in profiling.sampling (#140154) 2025-10-15 14:59:12 +01:00
pty.py
py_compile.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
pyclbr.py Fix pyflakes warnings: variable is assigned to but never used (#142294) 2025-12-08 14:00:31 +01:00
pydoc.py gh-142349: Add help("lazy") support (#149886) 2026-05-15 16:30:40 +00:00
queue.py Fix Queue.shutdown docs for condition to unblock a join (gh-137088) 2025-07-25 07:56:28 -06:00
quopri.py
random.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
reprlib.py gh-135487: fix reprlib.Repr.repr_int when given very large integers (#135506) 2025-06-24 11:09:46 +00:00
rlcompleter.py gh-112821: Fix rlcompleter failures on objects with descriptors (#149577) 2026-05-10 21:44:59 -04:00
runpy.py gh-149117: Set ImportError.name on errors from runpy.run_module/run_path (gh-149159) 2026-05-02 12:27:23 +10:00
sched.py
secrets.py
selectors.py
shelve.py Drop three unused imports (#141875) 2025-11-23 16:33:05 +00:00
shlex.py gh-138804: Check type in shlex.quote (GH-138809) 2025-09-12 14:26:21 -04:00
shutil.py gh-109503: Fix document for shutil.move() on usage of os.rename() since it's inaccurate (GH-109507) 2026-05-30 14:26:03 +00:00
signal.py gh-149879: Fix test_signal on Cygwin (#149896) 2026-05-15 21:32:10 +02:00
site.py gh-150228: Improve the PEP 829 batch processing APIs (#150542) 2026-06-01 18:43:18 -07:00
smtplib.py gh-70039: smtplib: store the server name in ._host in .connect() (#115259) 2026-04-08 17:46:25 -04:00
socket.py gh-148599: Update WSA socket error codes (#148033) 2026-05-06 19:52:23 +02:00
socketserver.py gh-76007: Deprecate __version__ attribute (#138675) 2025-09-29 12:03:23 +03:00
ssl.py gh-149879: Fix test_ssl on Cygwin (#150419) 2026-05-25 22:32:37 +02:00
stat.py gh-144050: Fix stat.filemode pure Python file type detection (GH-144059) 2026-01-20 14:05:42 +02:00
statistics.py statistics: Fix geometric_mean() error message for negative inputs (#149246) 2026-05-01 22:54:24 -05:00
stringprep.py
struct.py
subprocess.py gh-47798: Refactor the POSIX subprocess.Popen._communicate selector loop into helpers (GH-149032) 2026-04-27 00:40:20 +00:00
symtable.py gh-149530: Remove symtable.Class.get_methods deprecated method (#149531) 2026-05-09 08:33:09 +00:00
tabnanny.py gh-76007: Deprecate __version__ attribute (#138675) 2025-09-29 12:03:23 +03:00
tarfile.py gh-121109: Fix performance of tarfile reading with "r|*" (GH-121296) 2026-05-30 09:23:50 +00:00
tempfile.py gh-66305: Fix a hang on Windows in the tempfile module (GH-144672) 2026-02-24 13:05:06 +02:00
textwrap.py gh-139065: Fix trailing space before long word in textwrap (GH-139070) 2025-10-10 16:29:18 +03:00
this.py
threading.py gh-124397: Add free-threading support for iterators. (gh-148894) 2026-05-01 16:31:00 -05:00
timeit.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
token.py gh-131507: Add support for syntax highlighting in PyREPL (GH-133247) 2025-05-02 20:22:31 +02:00
tokenize.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
trace.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
traceback.py gh-145896: Fix typos and stale docstrings in the traceback module (GH-145897) 2026-05-25 12:45:02 +03:00
tracemalloc.py
tty.py
turtle.py Fix pyflakes warnings: variable is assigned to but never used (#142294) 2025-12-08 14:00:31 +01:00
types.py gh-150285: Fix too long docstrings in some Python modules (GH-150366) 2026-05-25 07:33:54 +00:00
typing.py gh-149995: Update typing.py docstrings and documentation (#149996) 2026-05-21 21:06:42 -07:00
uuid.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
warnings.py gh-128384: Use a context variable for warnings.catch_warnings (gh-130010) 2025-04-09 16:18:54 -07:00
wave.py gh-117716: Fix wave RIFF padding for data chunks (GH-145237) 2026-04-15 14:21:43 +02:00
weakref.py gh-124748: Fix handling kwargs in WeakKeyDictionary.update() (#124783) 2026-02-18 13:17:08 +00:00
webbrowser.py gh-137586: Replace 'osascript' with 'open' on macOS in webbrowser (#146439) 2026-05-06 16:56:17 +03:00
zipapp.py gh-142389: Add backticks to stdlib argparse help to display in colour (#149384) 2026-05-04 22:23:18 +00:00
zipimport.py gh-135801: Add the module parameter to compile() etc (GH-139652) 2025-11-13 13:21:32 +02:00