Adds `python -m profiling.sampling dump <pid>`, which prints a single
traceback-style snapshot of a running process's Python stack via the
existing `_remote_debugging` unwinder. Supports per-thread status,
source line highlighting, optional bytecode opcodes, and async-aware
task reconstruction (`--async-aware`, default `--async-mode=all`).
Support custom headers in `python -m http.server` and `http.server.SimpleHTTPRequestHandler`.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
unittest.TestCase methods assertWarns() and assertWarnsRegex() no longer
swallow warnings that do not match the specified category or regex.
Nested context managers are now supported.
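A minimal sketch of the context-manager form this change affects (the method names are the real `unittest` API; the warning category and message are illustrative). The matching warning is consumed by `assertWarns()`; after this change, warnings that do not match are no longer silently swallowed:

```python
import unittest
import warnings

class WarnTests(unittest.TestCase):
    def test_matching_warning_is_caught(self):
        # The matching DeprecationWarning is consumed by assertWarns();
        # a non-matching warning raised in the same block would now
        # propagate instead of being discarded.
        with self.assertWarns(DeprecationWarning):
            warnings.warn("old API", DeprecationWarning)

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(WarnTests))
assert result.wasSuccessful()
```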
Previously, identical PickleBuffers did not preserve identity.
Also, an empty writable PickleBuffer memoized an empty bytearray
object in place of b'', which is a singleton in CPython, so subsequent
references to b'' were unpickled as an empty bytearray object.
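For context, a sketch of the in-band round-tripping path the memoization fix touches (protocol 5, no buffer_callback): a writable buffer unpickles as a bytearray and a read-only one as bytes.

```python
import pickle

# Pickled in-band (no buffer_callback), a PickleBuffer serializes as
# if it were bytes (read-only) or bytearray (writable).
writable = pickle.PickleBuffer(bytearray(b"data"))
readonly = pickle.PickleBuffer(b"data")

assert pickle.loads(pickle.dumps(writable, protocol=5)) == bytearray(b"data")
assert pickle.loads(pickle.dumps(readonly, protocol=5)) == b"data"
```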
Set ImportError.name on errors from runpy.run_module/run_path
`runpy.run_module()` and `runpy.run_path()` now set the `name` attribute
of the `ImportError` they raise to the requested module name, matching
the behaviour of a regular import statement (previously `name` was
always `None`, which broke introspection).
The `name=` kwarg is gated on `issubclass(error, ImportError)` because
`_get_module_details()` is also used by `_run_module_as_main()` with
a private `_Error` sentinel class. `_Error` does not subclass
ImportError, and `BaseException.__init__` rejects unknown kwargs at
the C level, so passing `name=` unconditionally would break the
`python -m foo` codepath.
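A sketch of the observable change (the module name is a deliberately missing one):

```python
import runpy

try:
    runpy.run_module("package_that_does_not_exist")
except ImportError as exc:
    # With this change, exc.name is the requested module name,
    # matching a plain `import` statement; previously it was None.
    print(exc.name)
```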
This makes it possible to set the gzip header mtime field without
overriding time.time(), making it useful when creating reproducible
archives.
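One way to pin the header timestamp, shown here with `GzipFile`'s long-standing `mtime` parameter (the field this change concerns; byte offsets per RFC 1952):

```python
import gzip
import io

buf = io.BytesIO()
# Pinning mtime (bytes 4-7 of the gzip header, per RFC 1952) makes
# repeated compressions of the same data byte-identical.
with gzip.GzipFile(fileobj=buf, mode="wb", mtime=0) as f:
    f.write(b"payload")

assert buf.getvalue()[4:8] == b"\x00\x00\x00\x00"
```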
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
Co-authored-by: Savannah Ostrowski <savannah@python.org>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Emma Smith <emma@emmatyping.dev>
The email.headerregistry.Address constructor raised an error if
addr_spec contained a non-ASCII character. (But it fully supports
non-ASCII in the separate username and domain args.) This change
removes the error for a non-ASCII addr_spec, as well as the
Defect that triggered it. In the Unicode era, non-ASCII is not a
defect, though it is an error when an attempt is made to serialize
it to ASCII. The serialization issue was handled in #122540.
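For reference, the separate-argument form that already accepted non-ASCII (the single `addr_spec` argument is what this change relaxes):

```python
from email.headerregistry import Address

# username/domain have always accepted non-ASCII; this change extends
# the same tolerance to the addr_spec argument.
addr = Address(display_name="Zoë", username="zoë", domain="example.com")
assert addr.addr_spec == "zoë@example.com"
```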
The email generators had been incorrectly flattening non-ASCII email
addresses to RFC 2047 encoded-word format, leaving them undeliverable.
(RFC 2047 prohibits use of encoded-word in an addr-spec.)
This change raises a HeaderWriteError when attempting to flatten an
EmailMessage with a non-ASCII addr-spec and a policy with utf8=False.
(Exception: If the non-ASCII address originated from parsing a message,
it will be flattened as originally parsed, without error.) This also applies
to other contexts in which RFC 2047 encoded-words are not allowed by the RFCs.
Non-ASCII email addresses are supported when using a policy with
utf8=True (such as email.policy.SMTPUTF8) under RFCs 6531 and 6532.
Non-ASCII email address domains (but not localparts) can also be used
with non-SMTPUTF8 policies by encoding the domain as an IDNA A-label.
(The email package does not perform this encoding, because it cannot
know whether the caller wants IDNA 2003, IDNA 2008, or some other
variant such as UTS #46.)
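A sketch of the supported utf8=True path (the SMTPUTF8 policy is the real `email.policy` object; the address itself is illustrative):

```python
from email.headerregistry import Address
from email.message import EmailMessage
from email.policy import SMTPUTF8

msg = EmailMessage(policy=SMTPUTF8)
msg["To"] = Address(username="zoë", domain="example.com")
msg["Subject"] = "hello"
msg.set_content("body\n")

wire = msg.as_bytes()
# With utf8=True the address survives on the wire as raw UTF-8 rather
# than an (undeliverable) RFC 2047 encoded-word.
assert "zoë@example.com".encode("utf-8") in wire
```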
Co-authored-by: R. David Murray <rdmurray@bitdance.com>
As part of fixing bpo-27931, code was introduced in get_bare_quoted_string
that added an empty Terminal if the quoted string was empty. This isn't
the best answer in terms of the parse tree; we really want the token
list to be empty in that case. But having it be empty resulted in
local_part raising an IndexError. We hit the same problem if we
try to parse an address consisting of a single dquote. By fixing
local_part to not raise on an empty token list, we can have the
bare_quoted_string code correctly return an empty token list for
the empty string cases (two dquotes or a single dquote as the
entire addr_spec, at the end of a line).
This replaces the incremental GC with a forward port (from 3.13) of the generational GC.
Co-Authored-By: Neil Schemenauer <nas@arctrix.com>
Co-Authored-By: Zanie Blue <contact@zanie.dev>
Co-Authored-By: Sergey Miryanov <sergey.miryanov@gmail.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
When an address in an address-list has garbage at the end, the code will
currently:
1. change the mailbox in the last parsed address into invalid-mailbox by
overriding its token_type;
2. wrap the trailing garbage into another invalid-mailbox and append it
to the last parsed address.
However, that does not take into account that an address may
also contain a Group instead of a single mailbox. In that case,
overwriting token_type leads to undesirable results, e.g. parsing an
email with the following 'To' header:
unlisted-recipients:; (no To-header on input)
raises an AttributeError from trying to treat the Group as a Mailbox.
Moreover, it is questionable whether the previously parsed mailbox should
be treated as invalid in addition to the trailing garbage.
Address both of the above by wrapping the trailing garbage in a new
Address with a single invalid-mailbox, and append it to the AddressList
directly.
Changes the results of the
test_get_address_list_mailboxes_invalid_addresses test, where the
address list is now parsed into 4 mailboxes instead of 3 (all but the
first one are invalid).
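The Group shape involved can be illustrated with a well-formed group address (no trailing garbage, so it parses the same before and after this fix):

```python
from email.message import EmailMessage
from email.policy import default

msg = EmailMessage(policy=default)
msg["To"] = "undisclosed-recipients:;"

to = msg["To"]
# The single address-list entry is a Group with no mailboxes, so code
# assuming "last entry == a Mailbox" (and rewriting its token_type)
# has nothing valid to operate on.
assert to.groups[0].display_name == "undisclosed-recipients"
assert to.addresses == ()
```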
- Use lazy import for regular expressions.
- Use frozendict for string escapes.
Co-authored-by: Taneli Hukkinen <hukkinen@eurecom.fr>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Add a keyword-only `max_threads` argument to `dump_traceback()` and
`dump_traceback_later()`, defaulting to 100 to preserve existing
behavior. Allows server processes with many worker threads to dump
beyond the historical 100-thread cap (previously a hardcoded
`MAX_NTHREADS = 100` in `Python/traceback.c`).
The cap matters in practice: tstates are prepended to the
PyInterpreterState linked list, so the dump walks newest-first. With
more than 100 threads alive, the main thread (oldest, at the tail) is
silently elided from watchdog dumps -- exactly the thread that's
usually wanted.
The hardcoded value is moved to a new internal macro
`_Py_TRACEBACK_MAX_NTHREADS` in `pycore_traceback.h` so the in-tree
fatal-signal callers all reference one source of truth.
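A sketch of the call site. The `max_threads` keyword is the new addition; the example below passes only the long-standing arguments so it runs on any version:

```python
import faulthandler
import tempfile

# dump_traceback() writes straight to a file descriptor, so use a
# real temp file rather than an in-memory buffer.
with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    dump = f.read()

# On patched versions, faulthandler.dump_traceback(file=f, max_threads=500)
# would raise the per-dump cap described above.
assert "Current thread" in dump
```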
There were comments claiming these were implemented as custom classes to give a nicer
repr(), but the repr() wasn't all that nice:
>>> repr(dataclasses.MISSING)
'<dataclasses._MISSING_TYPE object at 0x1005e7e00>'
>>> repr(dataclasses.KW_ONLY)
'<dataclasses._KW_ONLY_TYPE object at 0x100884050>'
Sentinels are conceptually the right tool for these, so let's use them.
This does change the repr() of these two objects.
Fix inverted flamegraph width
The inverted view used thread presence as a proxy for self time.
This missed self samples on C-level wrapper frames like _run_code,
where the node's thread always appears in its children too. Those
samples were silently dropped, causing the chart to render narrower
than full width. Now uses the explicit self field on each node
instead of the thread heuristic.
* Records the same objects for each member of the family before execution
* Records derived values when recording the trace
* This makes sure that specialization, or deoptimization, does not cause invalid values to be recorded
Make _find_and_load() acquire the module locks for the full
dotted-name chain (parent before child) when loading a nested module, so
both threads contend on the same first lock and serialise instead of
deadlocking.
When acquiring a parent's lock would itself deadlock with another thread
that is loading that parent (cross-package circular imports), the parent's
lock is skipped and the partially-initialised parent is accepted -- the
same policy _lock_unlock_module() already applies on the existing code
path -- so concurrent circular imports that worked before continue to work.
ContextDecorator and AsyncContextDecorator (and therefore @contextmanager
and @asynccontextmanager used as decorators) now detect generator,
coroutine, and asynchronous generator functions and emit a wrapper of the
matching kind, so the context manager spans iteration or await rather than
just the call that constructs the lazy object. Wrapped generators are
explicitly closed when iteration ends.
For asynchronous generator wrappers, values passed via asend() and
exceptions via athrow() are not forwarded to the wrapped generator.
AsyncContextDecorator now also accepts synchronous functions and
generators, returning an asynchronous wrapper; ContextDecorator remains
the recommended choice for those.
inspect.isgeneratorfunction(), iscoroutinefunction(), and
isasyncgenfunction() now return True for the decorated result when the
input is of that kind.
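A sketch of the decorator shape involved (the `track` class is illustrative; the enter/exit pairing shown holds both before and after this change, but what changes is *when* the pair fires relative to iteration):

```python
from contextlib import ContextDecorator

class track(ContextDecorator):
    """Records __enter__/__exit__ calls so the wrapping is visible."""
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append("enter")
        return self
    def __exit__(self, *exc):
        self.events.append("exit")
        return False

t = track()

@t
def count():
    yield 1
    yield 2

# Previously the context spanned only the call that *created* the
# generator; with this change it spans the iteration itself. Either
# way, both events have fired once the generator is exhausted.
assert list(count()) == [1, 2]
assert t.events == ["enter", "exit"]
```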
---------
Co-authored-by: Gregory P. Smith <greg@krypto.org>