diff --git a/.github/workflows/posix-deps-apt.sh b/.github/workflows/posix-deps-apt.sh index 7994a01ee46..6201e719ca8 100755 --- a/.github/workflows/posix-deps-apt.sh +++ b/.github/workflows/posix-deps-apt.sh @@ -26,9 +26,16 @@ apt-get -yq --no-install-recommends install \ xvfb \ zlib1g-dev -# Workaround missing libmpdec-dev on ubuntu 24.04: -# https://launchpad.net/~ondrej/+archive/ubuntu/php -# https://deb.sury.org/ -sudo add-apt-repository ppa:ondrej/php -apt-get update -apt-get -yq --no-install-recommends install libmpdec-dev +# Workaround missing libmpdec-dev on ubuntu 24.04 by building mpdecimal +# from source. ppa:ondrej/php (launchpad.net) is unreliable +# (https://status.canonical.com) so fetch the tarball directly +# from the upstream host. +# https://www.bytereef.org/mpdecimal/ +MPDECIMAL_VERSION=4.0.1 +curl -fsSL "https://www.bytereef.org/software/mpdecimal/releases/mpdecimal-${MPDECIMAL_VERSION}.tar.gz" \ + | tar -xz -C /tmp +(cd "/tmp/mpdecimal-${MPDECIMAL_VERSION}" \ + && ./configure --prefix=/usr/local \ + && make -j"$(nproc)" \ + && make install) +ldconfig diff --git a/Doc/library/email.policy.rst b/Doc/library/email.policy.rst index 8f6e4218c97..816d02d86f4 100644 --- a/Doc/library/email.policy.rst +++ b/Doc/library/email.policy.rst @@ -403,11 +403,26 @@ added matters. To illustrate:: .. attribute:: utf8 If ``False``, follow :rfc:`5322`, supporting non-ASCII characters in - headers by encoding them as "encoded words". If ``True``, follow - :rfc:`6532` and use ``utf-8`` encoding for headers. Messages + headers by encoding them as :rfc:`2047` "encoded words". If ``True``, + follow :rfc:`6532` and use ``utf-8`` encoding for headers. Messages formatted in this way may be passed to SMTP servers that support the ``SMTPUTF8`` extension (:rfc:`6531`). + When ``False``, the generator will raise + :exc:`~email.errors.HeaderWriteError` if any header includes non-ASCII + characters in a context where :rfc:`2047` does not permit encoded words.
+ This particularly applies to mailboxes ("addr-spec") with non-ASCII + characters, which can be created via + :class:`~email.headerregistry.Address`. To use a mailbox with a non-ASCII + domain name with ``utf8=False``, first encode the domain using the + third-party :pypi:`idna` or :pypi:`uts46` module or with + :mod:`encodings.idna`. It is not possible to use a non-ASCII username + ("local-part") in a mailbox when ``utf8=False``. + + .. versionchanged:: 3.15 + Can trigger the raising of :exc:`~email.errors.HeaderWriteError`. + (Earlier versions incorrectly applied :rfc:`2047` in certain contexts, + most notably in addr-specs.) .. attribute:: refold_source diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst index dfdfe66be7e..0bb8858aea1 100644 --- a/Doc/whatsnew/3.14.rst +++ b/Doc/whatsnew/3.14.rst @@ -953,10 +953,24 @@ when a module is imported) will still emit the syntax warning. (Contributed by Irit Katriel in :gh:`130080`.) +.. _incremental-garbage-collection: .. _whatsnew314-incremental-gc: -Incremental garbage collection ------------------------------- +Garbage collection +------------------ + +**From Python 3.14.5 onwards:** + +The garbage collector (GC) has changed in Python 3.14.5. + +Python 3.14.0-3.14.4 shipped with a new incremental GC. +However, due to a number of `reports +`__ +of significant memory pressure in production environments, +it has been reverted back to the generational GC from 3.13. +This is the GC now used in Python 3.14.5 and later. + +**Previously in Python 3.14.0-3.14.4:** The cycle garbage collector is now incremental. This means that maximum pause times are reduced @@ -2203,7 +2217,18 @@ difflib gc -- -* The new :ref:`incremental garbage collector ` +* **From Python 3.14.5 onwards:** + + Python 3.14.0-3.14.4 shipped with a new incremental garbage collector. + However, due to a number of `reports + `__ + of significant memory pressure in production environments, + it has been reverted back to the generational GC from 3.13.
+ This is the GC now used in Python 3.14.5 and later. + +* **Previously in Python 3.14.0-3.14.4:** + + The new :ref:`incremental garbage collector ` means that maximum pause times are reduced by an order of magnitude or more for larger heaps. @@ -3447,3 +3472,17 @@ Changes in the C API functions on Python 3.13 and older. .. _pythoncapi-compat project: https://github.com/python/pythoncapi-compat/ + + +Notable changes in 3.14.5 +========================= + +gc +-- + +* The incremental garbage collector shipped in Python 3.14.0-3.14.4 has been + reverted back to the generational garbage collector from 3.13, + due to a number of `reports + `__ + of significant memory pressure in production environments. + See :ref:`whatsnew314-incremental-gc` for details. diff --git a/Doc/whatsnew/3.15.rst b/Doc/whatsnew/3.15.rst index a687ee5115b..83d3cb82195 100644 --- a/Doc/whatsnew/3.15.rst +++ b/Doc/whatsnew/3.15.rst @@ -914,6 +914,16 @@ faulthandler (Contributed by Eric Froemling in :gh:`149085`.) +email +----- + +* Email generators now raise an error when an :class:`.EmailMessage` cannot be + accurately flattened due to a non-ASCII email address (mailbox) in an address + header. Options for supporting Email Address Internationalization (EAI) are + discussed in :attr:`.EmailPolicy.utf8`. + (Contributed by R David Murray and Mike Edmunds in :gh:`122540`.) + + functools --------- @@ -1579,11 +1589,11 @@ Upgraded JIT compiler Results from the `pyperformance `__ benchmark suite report -`6-7% `__ +`8-9% `__ geometric mean performance improvement for the JIT over the standard CPython interpreter built with all optimizations enabled on x86-64 Linux. On AArch64 macOS, the JIT has a -`12-13% `__ +`12-13% `__ speedup over the :ref:`tail calling interpreter ` with all optimizations enabled. 
The speedups for JIT builds versus no JIT builds range from roughly 15% slowdown to over diff --git a/Include/pyport.h b/Include/pyport.h index c975921beaf..73a3e6cdaf0 100644 --- a/Include/pyport.h +++ b/Include/pyport.h @@ -553,6 +553,7 @@ extern "C" { # if !defined(_Py_MEMORY_SANITIZER) # define _Py_MEMORY_SANITIZER # define _Py_NO_SANITIZE_MEMORY __attribute__((no_sanitize_memory)) +# define _Py_MSAN_UNPOISON(PTR, SIZE) (__msan_unpoison(PTR, SIZE)) # endif # endif # if __has_feature(address_sanitizer) @@ -591,6 +592,9 @@ extern "C" { #ifndef _Py_NO_SANITIZE_MEMORY # define _Py_NO_SANITIZE_MEMORY #endif +#ifndef _Py_MSAN_UNPOISON +# define _Py_MSAN_UNPOISON(PTR, SIZE) +#endif /* AIX has __bool__ redefined in it's system header file. */ #if defined(_AIX) && defined(__bool__) diff --git a/Lib/email/_header_value_parser.py b/Lib/email/_header_value_parser.py index a53903a197f..9873958f5c2 100644 --- a/Lib/email/_header_value_parser.py +++ b/Lib/email/_header_value_parser.py @@ -157,10 +157,7 @@ def all_defects(self): def startswith_fws(self): return self[0].startswith_fws() - @property - def as_ew_allowed(self): - """True if all top level tokens of this part may be RFC2047 encoded.""" - return all(part.as_ew_allowed for part in self) + as_ew_allowed = True @property def comments(self): @@ -429,6 +426,7 @@ def addr_spec(self): class AngleAddr(TokenList): token_type = 'angle-addr' + as_ew_allowed = False @property def local_part(self): @@ -847,26 +845,22 @@ def params(self): class ContentType(ParameterizedHeaderValue): token_type = 'content-type' - as_ew_allowed = False maintype = 'text' subtype = 'plain' class ContentDisposition(ParameterizedHeaderValue): token_type = 'content-disposition' - as_ew_allowed = False content_disposition = None class ContentTransferEncoding(TokenList): token_type = 'content-transfer-encoding' - as_ew_allowed = False cte = '7bit' class HeaderLabel(TokenList): token_type = 'header-label' - as_ew_allowed = False class MsgID(TokenList): @@ 
-1509,11 +1503,6 @@ def get_local_part(value): local_part.defects.append(errors.ObsoleteHeaderDefect( "local-part is not a dot-atom (contains CFWS)")) local_part[0] = obs_local_part - try: - local_part.value.encode('ascii') - except UnicodeEncodeError: - local_part.defects.append(errors.NonASCIILocalPartDefect( - "local-part contains non-ASCII characters)")) return local_part, value def get_obs_local_part(value): @@ -2835,13 +2824,68 @@ def _steal_trailing_WSP_if_exists(lines): def _refold_parse_tree(parse_tree, *, policy): - """Return string of contents of parse_tree folded according to RFC rules. - - """ # max_line_length 0/None means no limit, ie: infinitely long. maxlen = policy.max_line_length or sys.maxsize encoding = 'utf-8' if policy.utf8 else 'us-ascii' lines = [''] # Folded lines to be output + if parse_tree.as_ew_allowed: + _refold_with_ew(parse_tree, lines, maxlen, encoding, policy=policy) + else: + _refold_without_ew(parse_tree, lines, maxlen, encoding, policy=policy) + return policy.linesep.join(lines) + policy.linesep + +def _refold_without_ew(parse_tree, lines, maxlen, encoding, *, policy): + parts = list(parse_tree) + while parts: + part = parts.pop(0) + tstr = str(part) + try: + tstr.encode(encoding) + except UnicodeEncodeError: + if any(isinstance(x, errors.UndecodableBytesDefect) + for x in part.all_defects): + # There is garbage data from parsing a message in binary mode, + # just pass it through. Not good, but the best we can do. + pass + elif policy.utf8: + # If this happens, it's a programmer error. + raise + else: + raise errors.HeaderWriteError( + f"Non-ASCII {part.token_type} '{part}' is invalid" + " under current policy setting (utf8=False)" + ) + if len(tstr) <= maxlen - len(lines[-1]): + lines[-1] += tstr + continue + # This part is too long to fit. The RFC wants us to break at + # "major syntactic breaks", so unless we don't consider this + # to be one, check if it will fit on the next line by itself. 
+ if (part.syntactic_break and + len(tstr) + 1 <= maxlen): + newline = _steal_trailing_WSP_if_exists(lines) + if newline or part.startswith_fws(): + lines.append(newline + tstr) + continue + if not hasattr(part, 'encode'): + # It's not a terminal, try folding the subparts. + newparts = list(part) + parts = newparts + parts + continue + # We can't figure out how to wrap it, so give up. + newline = _steal_trailing_WSP_if_exists(lines) + if newline or part.startswith_fws(): + lines.append(newline + tstr) + else: + # We can't fold it onto the next line either... + lines[-1] += tstr + return + + +def _refold_with_ew(parse_tree, lines, maxlen, encoding, *, policy): + """Return string of contents of parse_tree folded according to RFC rules. + + """ last_word_is_ew = False last_ew = None # if there is an encoded word in the last line of lines, # points to the encoded word's first character @@ -2855,6 +2899,11 @@ def _refold_parse_tree(parse_tree, *, policy): if part is end_ew_not_allowed: wrap_as_ew_blocked -= 1 continue + if part.token_type == 'mime-parameters': + # Mime parameter folding (using RFC2231) is extra special. + _fold_mime_parameters(part, lines, maxlen, encoding) + last_word_is_ew = False + continue tstr = str(part) if not want_encoding: if part.token_type in ('ptext', 'vtext'): @@ -2876,14 +2925,11 @@ def _refold_parse_tree(parse_tree, *, policy): charset = 'utf-8' want_encoding = True - if part.token_type == 'mime-parameters': - # Mime parameter folding (using RFC2231) is extra special.
- _fold_mime_parameters(part, lines, maxlen, encoding) - last_word_is_ew = False - continue - if want_encoding and not wrap_as_ew_blocked: - if not part.as_ew_allowed: + if any( + not x.as_ew_allowed for x in part + if hasattr(x, 'as_ew_allowed') + ): want_encoding = False last_ew = None if part.syntactic_break: @@ -2964,6 +3010,8 @@ def _refold_parse_tree(parse_tree, *, policy): [ValueTerminal(make_quoted_pairs(p), 'ptext') for p in newparts] + [ValueTerminal('"', 'ptext')]) + _refold_without_ew(newparts, lines, maxlen, encoding, policy=policy) + continue if part.token_type == 'comment': newparts = ( [ValueTerminal('(', 'ptext')] + @@ -2991,7 +3039,7 @@ def _refold_parse_tree(parse_tree, *, policy): lines[-1] += tstr last_word_is_ew = last_word_is_ew and not bool(tstr.strip(_WSP)) - return policy.linesep.join(lines) + policy.linesep + return def _fold_as_ew(to_encode, lines, maxlen, last_ew, ew_combine_allowed, charset, last_word_is_ew): """Fold string to_encode into lines as encoded word, combining if allowed. diff --git a/Lib/email/errors.py b/Lib/email/errors.py index 6bc744bd59c..859307dd85b 100644 --- a/Lib/email/errors.py +++ b/Lib/email/errors.py @@ -109,9 +109,9 @@ class ObsoleteHeaderDefect(HeaderDefect): """Header uses syntax declared obsolete by RFC 5322""" class NonASCIILocalPartDefect(HeaderDefect): - """local_part contains non-ASCII characters""" - # This defect only occurs during unicode parsing, not when - # parsing messages decoded from binary. + """Unused. Note: this error is deprecated and may be removed in the future.""" + # RFC 6532 permits a non-ASCII local-part. _header_value_parser previously + # treated this as a parse-time defect (when parsing Unicode, but not bytes). 
class InvalidDateDefect(HeaderDefect): """Header has unparsable or invalid date""" diff --git a/Lib/test/test_email/test__header_value_parser.py b/Lib/test/test_email/test__header_value_parser.py index f3c03062572..aded44e85ee 100644 --- a/Lib/test/test_email/test__header_value_parser.py +++ b/Lib/test/test_email/test__header_value_parser.py @@ -1235,17 +1235,6 @@ def test_get_local_part_valid_and_invalid_qp_in_atom_list(self): '@example.com') self.assertEqual(local_part.local_part, r'\example\\ example') - def test_get_local_part_unicode_defect(self): - # Currently this only happens when parsing unicode, not when parsing - # stuff that was originally binary. - local_part = self._test_get_x(parser.get_local_part, - 'exámple@example.com', - 'exámple', - 'exámple', - [errors.NonASCIILocalPartDefect], - '@example.com') - self.assertEqual(local_part.local_part, 'exámple') - # get_dtext def test_get_dtext_only(self): @@ -3374,10 +3363,12 @@ def test_fold_unfoldable_element_stealing_whitespace(self): self._test(token, expected, policy=policy) def test_encoded_word_with_undecodable_bytes(self): - self._test(parser.get_address_list( - ' =?utf-8?Q?=E5=AE=A2=E6=88=B6=E6=AD=A3=E8=A6=8F=E4=BA=A4=E7?=' + self._test( + parser.get_address_list( + ' =?utf-8?Q?=E5=AE=A2=E6=88=B6=E6=AD=A3=E8=A6=8F=E4=BA=A4=E7?=' + ' ' )[0], - ' =?unknown-8bit?b?5a6i5oi25q2j6KaP5Lqk5w==?=\n', + ' =?unknown-8bit?b?5a6i5oi25q2j6KaP5Lqk5w==?= \n', ) diff --git a/Lib/test/test_email/test_generator.py b/Lib/test/test_email/test_generator.py index 3c9a86f3e8c..8d912738029 100644 --- a/Lib/test/test_email/test_generator.py +++ b/Lib/test/test_email/test_generator.py @@ -1,4 +1,5 @@ import io +import re import textwrap import unittest import random @@ -295,6 +296,69 @@ def test_keep_long_encoded_newlines(self): g.flatten(msg) self.assertEqual(s.getvalue(), self.typ(expected)) + def test_non_ascii_addr_spec_raises(self): + # non-ascii is not permitted in any part of an addr-spec. 
If the + # programmer generated it, it's an error. (See also + # test_non_ascii_addr_spec_preserved below.) + p = self.policy.clone(utf8=False, max_line_length=20) + g = self.genclass(self.ioclass(), policy=p) + # XXX The particular part detected here isn't part of a behavioral + # spec and may change in the future. + cases = [ + ('wők@example.com', 'wők', 'local-part'), + ('wok@exàmple.com', 'exàmple.com', 'domain'), + ('wők@exàmple.com', 'wők', 'local-part'), + ( + '"Name, for display" ', + 'wők@example.com', + 'addr-spec', + ), + ( + 'Näyttönimi ', + 'wők@example.com', + 'addr-spec', + ), + ( + '"a lőng quoted string as the local part"@example.com', + 'a lőng quoted string as the local part', + 'local-part', + ), + + ] + for address, badtoken, partname in cases: + with self.subTest(address=address): + msg = EmailMessage() + msg['To'] = address + expected_error = ( + fr"(?i)(?=.*non-ascii)" + fr"(?=.*{re.escape(badtoken)})" + fr"(?=.*{partname})" + fr"(?=.*policy.*utf8)" + ) + with self.assertRaisesRegex( + email.errors.HeaderWriteError, expected_error + ): + g.flatten(msg) + + def test_local_part_quoted_string_wrapped_correctly(self): + msg = self.msgmaker(self.typ(textwrap.dedent("""\ + To: <"a long local part in a quoted string"@example.com> + Subject: test + + None + """)), policy=self.policy.clone(max_line_length=20)) + expected = textwrap.dedent("""\ + To: <"a long local part in a + quoted string"@example.com> + Subject: test + + None + """) + s = self.ioclass() + g = self.genclass(s, policy=self.policy.clone(max_line_length=30)) + g.flatten(msg) + self.assertEqual(s.getvalue(), self.typ(expected)) + def _test_boundary_detection(self, linesep): # Generate a boundary token in the same way as _make_boundary token = random.randrange(sys.maxsize) @@ -515,12 +579,12 @@ def test_cte_type_7bit_transforms_8bit_cte(self): def test_smtputf8_policy(self): msg = EmailMessage() - msg['From'] = "Páolo " + msg['From'] = "Páolo " msg['To'] = 'Dinsdale' msg['Subject'] = 
'Nudge nudge, wink, wink \u1F609' msg.set_content("oh là là, know what I mean, know what I mean?") expected = textwrap.dedent("""\ - From: Páolo + From: Páolo To: Dinsdale Subject: Nudge nudge, wink, wink \u1F609 Content-Type: text/plain; charset="utf-8" @@ -555,6 +619,37 @@ def test_smtp_policy(self): g.flatten(msg) self.assertEqual(s.getvalue(), expected) + def test_non_ascii_addr_spec_preserved(self): + # A defective non-ASCII addr-spec parsed from the original + # message is left unchanged when flattening. + # (See also test_non_ascii_addr_spec_raises above.) + source = ( + 'To: jörg@example.com, "But a long name still works with refold_source" ' + ).encode() + expected = ( + b'To: j\xc3\xb6rg@example.com,\n' + b' "But a long name still works with refold_source" \n' + b'\n' + ) + msg = message_from_bytes(source, policy=policy.default) + s = io.BytesIO() + g = BytesGenerator(s, policy=policy.default) + g.flatten(msg) + self.assertEqual(s.getvalue(), expected) + + def test_idna_encoding_preserved(self): + # Nothing tries to decode a pre-encoded IDNA domain. + msg = EmailMessage() + msg["To"] = Address( + username='jörg', + domain='☕.example'.encode('idna').decode() # IDNA 2003 + ) + expected = 'To: jörg@xn--53h.example\n\n'.encode() + s = io.BytesIO() + g = BytesGenerator(s, policy=policy.default.clone(utf8=True)) + g.flatten(msg) + self.assertEqual(s.getvalue(), expected) + if __name__ == '__main__': unittest.main() diff --git a/Lib/test/test_email/test_headerregistry.py b/Lib/test/test_email/test_headerregistry.py index 2aaa7d68ca3..aa918255d15 100644 --- a/Lib/test/test_email/test_headerregistry.py +++ b/Lib/test/test_email/test_headerregistry.py @@ -1543,17 +1543,19 @@ def test_quoting(self): self.assertEqual(str(a), '"Sara J." 
<"bad name"@example.com>') def test_il8n(self): - a = Address('Éric', 'wok', 'exàmple.com') + a = Address('Éric', 'wők', 'exàmple.com') self.assertEqual(a.display_name, 'Éric') - self.assertEqual(a.username, 'wok') + self.assertEqual(a.username, 'wők') self.assertEqual(a.domain, 'exàmple.com') - self.assertEqual(a.addr_spec, 'wok@exàmple.com') - self.assertEqual(str(a), 'Éric ') + self.assertEqual(a.addr_spec, 'wők@exàmple.com') + self.assertEqual(str(a), 'Éric ') - # XXX: there is an API design issue that needs to be solved here. - #def test_non_ascii_username_raises(self): - # with self.assertRaises(ValueError): - # Address('foo', 'wők', 'example.com') + def test_i18n_in_addr_spec(self): + a = Address(addr_spec='wők@exàmple.com') + self.assertEqual(a.username, 'wők') + self.assertEqual(a.domain, 'exàmple.com') + self.assertEqual(a.addr_spec, 'wők@exàmple.com') + self.assertEqual(str(a), 'wők@exàmple.com') def test_crlf_in_constructor_args_raises(self): cases = ( @@ -1574,10 +1576,6 @@ def test_crlf_in_constructor_args_raises(self): with self.subTest(kwargs=kwargs), self.assertRaisesRegex(ValueError, "invalid arguments"): Address(**kwargs) - def test_non_ascii_username_in_addr_spec_raises(self): - with self.assertRaises(ValueError): - Address('foo', addr_spec='wők@example.com') - def test_address_addr_spec_and_username_raises(self): with self.assertRaises(TypeError): Address('foo', username='bing', addr_spec='bar@baz') diff --git a/Misc/NEWS.d/next/Core_and_Builtins/2026-04-21-19-29-29.gh-issue-148850.MSH0J_.rst b/Misc/NEWS.d/next/Core_and_Builtins/2026-04-21-19-29-29.gh-issue-148850.MSH0J_.rst new file mode 100644 index 00000000000..324d1610310 --- /dev/null +++ b/Misc/NEWS.d/next/Core_and_Builtins/2026-04-21-19-29-29.gh-issue-148850.MSH0J_.rst @@ -0,0 +1 @@ +Fix the memory sanitizer false positive in :func:`os.getrandom`. 
diff --git a/Misc/NEWS.d/next/Library/2024-07-30-19-19-33.gh-issue-81074.YAeWNf.rst b/Misc/NEWS.d/next/Library/2024-07-30-19-19-33.gh-issue-81074.YAeWNf.rst new file mode 100644 index 00000000000..87de4fade14 --- /dev/null +++ b/Misc/NEWS.d/next/Library/2024-07-30-19-19-33.gh-issue-81074.YAeWNf.rst @@ -0,0 +1,8 @@ +The :mod:`email` module no longer treats email addresses with non-ASCII +characters as defects when parsing a Unicode string or in the ``addr_spec`` +parameter to :class:`email.headerregistry.Address`. :rfc:`6532` permits such +addresses, and they were already supported when parsing bytes and in the Address +``username`` parameter. + +The (undocumented) :exc:`!email.errors.NonASCIILocalPartDefect` is no longer +used and should be considered deprecated. diff --git a/Misc/NEWS.d/next/Library/2024-07-31-17-22-10.gh-issue-83938.TtUa-c.rst b/Misc/NEWS.d/next/Library/2024-07-31-17-22-10.gh-issue-83938.TtUa-c.rst new file mode 100644 index 00000000000..7082c72f685 --- /dev/null +++ b/Misc/NEWS.d/next/Library/2024-07-31-17-22-10.gh-issue-83938.TtUa-c.rst @@ -0,0 +1,8 @@ +The :mod:`email` module no longer incorrectly uses :rfc:`2047` encoding for +a mailbox with non-ASCII characters in its domain. Under a policy with +:attr:`~email.policy.EmailPolicy.utf8` set ``False``, attempting to serialize +such a message will now raise an :exc:`~email.errors.HeaderWriteError`. +Either apply an appropriate IDNA encoding to convert the domain to ASCII before +serialization, or use :data:`email.policy.SMTPUTF8` (or another policy with +``utf8=True``) to correctly pass through the internationalized domain name +as Unicode characters.
diff --git a/Misc/NEWS.d/next/Library/2024-07-31-17-23-06.gh-issue-122476.TtUa-c.rst b/Misc/NEWS.d/next/Library/2024-07-31-17-23-06.gh-issue-122476.TtUa-c.rst new file mode 100644 index 00000000000..29c076d3a74 --- /dev/null +++ b/Misc/NEWS.d/next/Library/2024-07-31-17-23-06.gh-issue-122476.TtUa-c.rst @@ -0,0 +1,7 @@ +The :mod:`email` module no longer incorrectly uses :rfc:`2047` encoding for +a mailbox with non-ASCII characters in its local-part. Under a policy with +:attr:`~email.policy.EmailPolicy.utf8` set ``False``, attempting to serialize +such a message will now raise an :exc:`~email.errors.HeaderWriteError`. +There is no valid 7-bit encoding for an internationalized local-part. Use +:data:`email.policy.SMTPUTF8` (or another policy with ``utf8=True``) to +correctly pass through the local-part as Unicode characters. diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c index e5ce487723b..3bdbf2ef816 100644 --- a/Modules/posixmodule.c +++ b/Modules/posixmodule.c @@ -17195,6 +17195,8 @@ os_getrandom_impl(PyObject *module, Py_ssize_t size, int flags) goto error; } + _Py_MSAN_UNPOISON(data, size); + return PyBytesWriter_FinishWithSize(writer, n); error:
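Reviewer note: the IDNA workaround that the email.policy documentation and the gh-83938 NEWS entry in this patch describe for ``utf8=False`` might look roughly like the sketch below. It is only an illustration, not part of the patch. It assumes the stdlib ``idna`` codec (IDNA 2003) is acceptable for the domain, and the address and variable names are made up for the example.

```python
from email import policy
from email.headerregistry import Address
from email.message import EmailMessage

# Convert the non-ASCII domain to its ASCII (punycode) form up front,
# using the stdlib IDNA 2003 codec (an assumption; the third-party
# idna or uts46 packages are the alternatives the docs mention).
ascii_domain = "exàmple.com".encode("idna").decode("ascii")

# The display name may stay non-ASCII: RFC 2047 encoded words are
# permitted there. The username is kept ASCII, because no 7-bit
# encoding exists for a non-ASCII local-part.
recipient = Address(display_name="Éric", username="wok", domain=ascii_domain)

msg = EmailMessage(policy=policy.default)  # policy.default has utf8=False
msg["To"] = recipient
msg.set_content("hello")

# Serializing works under utf8=False because the addr-spec is pure ASCII.
flattened = msg.as_string()
```

A mailbox with a non-ASCII local-part has no such workaround. Per the gh-122476 entry, it needs ``email.policy.SMTPUTF8`` or another policy with ``utf8=True``.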