When an address in an address-list has garbage at the end, the code will
currently:
1. change the mailbox in the last parsed address into invalid-mailbox by
overriding its token_type;
2. wrap the trailing garbage into another invalid-mailbox and append it
to the last parsed address.
However, that does not take into account that an address may
also contain a Group instead of a single mailbox. In that case,
overwriting token_type leads to undesirable results, e.g. parsing an
email with the following 'To' header:
unlisted-recipients:; (no To-header on input)
raises an AttributeError from trying to treat the Group as a Mailbox.
Moreover it is questionable whether the previously parsed mailbox should
be treated as invalid in addition to the trailing garbage.
Address both of the above by wrapping the trailing garbage in a new
Address with a single invalid-mailbox, and append it to the AddressList
directly.
Changes the results of the
test_get_address_list_mailboxes_invalid_addresses test, where the
address list is now parsed into 4 mailboxes instead of 3 (all but the
first one are invalid).
- Use lazy import for regular expressions.
- Use frozendict for string escapes
Co-authored-by: Taneli Hukkinen <hukkinen@eurecom.fr>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Add a keyword-only `max_threads` argument to `dump_traceback()` and
`dump_traceback_later()`, defaulting to 100 to preserve existing
behavior. Allows server processes with many worker threads to dump
beyond the historical 100-thread cap (previously a hardcoded
`MAX_NTHREADS = 100` in `Python/traceback.c`).
The cap matters in practice: tstates are prepended to the
PyInterpreterState linked list, so the dump walks newest-first. With
more than 100 threads alive, the main thread (oldest, at the tail) is
silently elided from watchdog dumps -- exactly the thread that's
usually wanted.
The hardcoded value is moved to a new internal macro
`_Py_TRACEBACK_MAX_NTHREADS` in `pycore_traceback.h` so the in-tree
fatal-signal callers all reference one source of truth.
There were comments claiming these were implemented as custom classes to give a nicer
repr(), but the repr() wasn't all that nice:
>>> repr(dataclasses.MISSING)
'<dataclasses._MISSING_TYPE object at 0x1005e7e00>'
>>> repr(dataclasses.KW_ONLY)
'<dataclasses._KW_ONLY_TYPE object at 0x100884050>'
Sentinels are conceptually the right tool for these, so let's use them.
This does change the repr() of these two objects.
Fix inverted flamegraph width
The inverted view used thread presence as a proxy for self time.
This missed self samples on C-level wrapper frames like _run_code,
where the node's thread always appears in its children too. Those
samples were silently dropped, causing the chart to render narrower
than full width. Now uses the explicit self field on each node
instead of the thread heuristic.
* Records the same objects for each member of family before execution
* Records derived values when recording the trace
* This makes sure that specialization, or deoptimization, does not cause invalid values to be recorded
Make _find_and_load() acquire the module locks for the full
dotted-name chain (parent before child) when loading a nested module, so
both threads contend on the same first lock and serialise instead of
deadlocking.
When acquiring a parent's lock would itself deadlock with another thread
that is loading that parent (cross-package circular imports), the parent's
lock is skipped and the partially-initialised parent is accepted -- the
same policy _lock_unlock_module() already applies on the existing code
path -- so concurrent circular imports that worked before continue to work.
ContextDecorator and AsyncContextDecorator (and therefore @contextmanager
and @asynccontextmanager used as decorators) now detect generator,
coroutine, and asynchronous generator functions and emit a wrapper of the
matching kind, so the context manager spans iteration or await rather than
just the call that constructs the lazy object. Wrapped generators are
explicitly closed when iteration ends.
For asynchronous generator wrappers, values passed via asend() and
exceptions via athrow() are not forwarded to the wrapped generator.
AsyncContextDecorator now also accepts synchronous functions and
generators, returning an asynchronous wrapper; ContextDecorator remains
the recommended choice for those.
inspect.isgeneratorfunction(), iscoroutinefunction(), and
isasyncgenfunction() now return True for the decorated result when the
input is of that kind.
---------
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Use ZipFile.extractall() to sanitize file names and extract files.
Files with invalid names (e.g. absolute paths) are now skipped.
Files containing ".." in the name are no longer skipped.
No public API change. Lift the per-iteration select/read/write loop out of
Popen._communicate (POSIX) into a module-level _communicate_io_posix(), with
small _flush_stdin / _make_input_view / _translate_newlines helpers alongside
it. Popen._communicate calls the helper and persists the returned input
offset for resume-after-timeout.
Retire the private Popen._remaining_time method in favor of module-level
_deadline_remaining; all call sites (POSIX and Windows) updated.
Defensive behavioural deltas: the stdin and stdout/stderr .close() calls in
the I/O loop now swallow BrokenPipeError / OSError, matching __exit__ and the
no-input path; previously these were bare.
Adds test_communicate_timeout_resume_partial_write to cover _input_offset
bookkeeping across TimeoutExpired/resume.
gh-141473: Speed up test_communicate_timeout_large_input
Replace the slow reader's 30s sleep with a parent-driven wake over a
loopback socket so post-timeout communicate() doesn't block waiting
for the child to wake on its own. Worst-case runtime: ~30s -> <1s.
Add `canonical=False` keyword argument to `a2b_base64`, `a2b_base32`, `a2b_base85`, and `a2b_ascii85` (and their `base64` module wrappers). When `canonical=True`, non-canonical encodings are rejected per [RFC 4648 section 3.5](https://datatracker.ietf.org/doc/html/rfc4648.html#section-3.5).
This is independent of `strict_mode`.
For base85/ascii85, the check also rejects single-character final groups (never produced by a conforming encoder) and verifies partial group padding matches what the encoder would produce.
Co-authored-by: Serhiy Storchaka via lots of great code review!
* Replaces ad-hoc logic for ending traces with a simple inequality: `fitness < exit_quality`
* Fitness starts high and is reduced for branches, backward edges, calls and trace length
* Exit quality reflect how good a spot that instruction is to end a trace. Closing a loop is very, specializable instructions are very low and the others in between.
Add '+' alternatives to signed_number and signed_real_number grammar
rules, mirroring how unary minus is already handled for pattern matching.
Unary plus is a no-op on numbers so the value is returned directly without
wrapping in a UnaryOp node.
Avoid racing with the owning thread's refcount operations when
immortalizing an interned string: if we don't own it and its refcount
isn't merged, intern a copy we own instead. Use atomic stores in
_Py_SetImmortalUntracked so concurrent atomic reads are race-free.
Improve ABI/feature selection, add new header for it.
Add a test that Python headers themselves don't use
Py_GIL_DISABLED in abi3t: abi3 and abi3t ought to be the
same except the _Py_OPAQUE_PYOBJECT differences.
This is done using the GCC-only poison pragma.
Co-authored-by: Victor Stinner <vstinner@python.org>
* Make the `PY_UNWIND` monitoring event available as a code-local
event to allow trapping on function exit events when an exception
bubbles up. This complements the PY_RETURN event by allowing to
catch any function exit event.
* Allow `PY_UNWIND` to be `DISABLE`d; disabling it disables the event for the whole code object.
* Do the above for `PY_THROW`, `RAISE`, `EXCEPTION_HANDLED`, and `RERAISE` events.