Commit graph

812 commits

Author SHA1 Message Date
Miss Islington (bot)
fa3af11739
[3.13] gh-101913: changed wording of docstring for _parsedate_tz (GH-134446) (#150798)
Fixed incorrect word.
(cherry picked from commit f7e0fb60cf)

Co-authored-by: Gustaf <79180496+gostak-dd@users.noreply.github.com>
Co-authored-by: Gustaf <79180496+GGyll@users.noreply.github.com>
2026-06-02 13:52:09 -04:00
Miss Islington (bot)
a58c24d9fd
[3.13] gh-88726: Stop using non-standard charset names eucgb2312_cn and big5_tw in email (GH-149959) (GH-150493)
(cherry picked from commit 5e467f4331)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2026-05-27 10:53:08 +00:00
Miss Islington (bot)
82db654e13
[3.13] gh-128110: Fix rfc2047 whitespace handling in email parser address headers (GH-130749) (#149789)
RFC 2047 Section 6.2 requires that "any 'linear-white-space' that
separates a pair of adjacent 'encoded-word's is ignored." The modern
header value parser correctly implements that for unstructured headers,
but had missed a case in structured headers. This could cause a parsed
address header to include extraneous spaces in a display-name.

Switch to @bitdancer's fix from review feedback. Recharacterize space
between ews as fws after parsing in get_phrase.

RDM: This fix is dependent on the fact that "subsequent" atoms will never have
leading whitespace because that's been consumed already. I don't think
it's worth adding extra code for the possibility of leading whitespace
because the parser won't produce it. It's a bit of parser fragility in the
face of code changes, but I think that's a minor concern given the
parser design (which is that it consumes whitespace greedily)
(cherry picked from commit 7a4c6dfb88)

Co-authored-by: Mike Edmunds <medmunds@gmail.com>
Co-authored-by: R David Murray <rdmurray@bitdance.com>
2026-05-13 16:29:07 -04:00
Miss Islington (bot)
104a38c495
[3.13] gh-148518 fix index error in local part attribute (GH-148522) (#149199)
As part of fixing bpo-27931 code was introduced to get_bare_quoted_string
that added an empty Terminal if the quoted string was empty.  This isn't
the best answer in terms of the parse tree; we really want the token
list to be empty in that case.  But having it be empty resulted in
local_part raising the index error.  We find that same problem if we
try to parse an address consisting of a single dquote.  By fixing
local_part to not raise on an empty token list, we can have the
bare_quoted_string code correctly return an empty token list for
the empty string cases (two dquotes or a single dquote as the
entire addrespec, at the end of a line).
(cherry picked from commit bdbb55c403)

Co-authored-by: R. David Murray <rdmurray@bitdance.com>
2026-04-30 18:13:27 -04:00
Miss Islington (bot)
b3e0c72fa4
[3.13] bpo-39100: _header_value_parser: do not treat a Group as invalid-mailbox (GH-24872) (#149192)
When an address in an address-list has garbage at the end, the code will
currently:

1. change the mailbox in the last parsed address into invalid-mailbox by
   overriding its token_type;
2. wrap the trailing garbage into another invalid-mailbox and append it
   to the last parsed address.

However, that does not take into account that an address may
also contain a Group instead of a single mailbox. In that case,
overwriting token_type leads to undesirable results, e.g. parsing an
email with the following 'To' header:

unlisted-recipients:; (no To-header on input)

raises an AttributeError from trying to treat the Group as a Mailbox.

Moreover it is questionable whether the previously parsed mailbox should
be treated as invalid in addition to the trailing garbage.

Address both of the above by wrapping the trailing garbage in a new
Address with a single invalid-mailbox, and append it to the AddressList
directly.

Changes the results of the
test_get_address_list_mailboxes_invalid_addresses test, where the
address list is now parsed into 4 mailboxes instead of 3 (all but the
first one are invalid).
(cherry picked from commit b413bc7a1f)

Co-authored-by: elenril <anton@khirnov.net>
2026-04-30 14:15:43 -04:00
Miss Islington (bot)
5a4143a392
[3.13] gh-148192: Fix Generator._make_boundary behavior with CRLF line endings. (GH-148193) (#148549)
The Generator._make_boundary regex did not match on boundary phrases correctly when using CRLF line endings due to re.MULTILINE not considering \r\n as a line ending.
(cherry picked from commit 4af46b4ab5)

Co-authored-by: Henry Jones <44321887+henryivesjones@users.noreply.github.com>
2026-04-14 12:21:55 -04:00
Miss Islington (bot)
0530f105ce
[3.13] gh-145831: email.quoprimime: decode() leaves stray \r when eol='\r\n' (GH-145832) (#148311)
decoded[:-1] only strips one character, leaving a stray \r when eol
is two characters. Fix: decoded[:-len(eol)].
(cherry picked from commit 1a0edb1fa8)

Co-authored-by: Stefan Zetzsche <120379523+stefanzetzsche@users.noreply.github.com>
2026-04-10 08:51:34 -04:00
R. David Murray
a3c0a809e1
[3.13] gh-144156: Fix email header folding concatenating encoded words (GH-144692) (#145195)
The fix for gh-92081 (gh-92281) was unfortunately flawed, and broke whitespace handling for encoded word patterns that had previously been working correctly but had no corresponding tests, unfortunately in a way that made the resulting headers not RFC compliant, in such a way that Yahoo started rejecting the resulting emails.  This fix was released in 3.14 alpha 1, 3.13 beta 2 and 3.12.5.   This PR fixes the original problem in a way that does not break anything, and in fact fixes a small pre-existing bug (a spurious whitespace after the ':' of the header label if the header value is immediately wrapped on to the next line).  (RDM)
(cherry picked from commit 0f7cd5544a)

Co-authored-by: Robsdedude <dev@rouvenbauer.de>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2026-02-24 15:55:54 -05:00
Miss Islington (bot)
f738386838
[3.13] gh-143935: Email preserve parens when folding comments (GH-143936) (#144035)
gh-143935: Email preserve parens when folding comments (GH-143936)

Fix a bug in the folding of comments when flattening an email message
using a modern email policy. Comments consisting of a very long sequence of
non-foldable characters could trigger a forced line wrap that omitted the
required leading space on the continuation line, causing the remainder of
the comment to be interpreted as a new header field. This enabled header
injection with carefully crafted inputs.
(cherry picked from commit 17d1490aa9)

Co-authored-by: Seth Michael Larson <seth@python.org>
Co-authored-by: Denis Ledoux <dle@odoo.com>
2026-01-25 17:09:53 +00:00
Miss Islington (bot)
0a925ab591
[3.13] gh-144125: email: verify headers are sound in BytesGenerator (#144181)
gh-144125: email: verify headers are sound in BytesGenerator
(cherry picked from commit 052e55e7d4)

Co-authored-by: Seth Michael Larson <seth@python.org>
Co-authored-by: Denis Ledoux <dle@odoo.com>
Co-authored-by: Denis Ledoux <5822488+beledouxdenis@users.noreply.github.com>
Co-authored-by: Petr Viktorin <302922+encukou@users.noreply.github.com>
Co-authored-by: Bas Bloemsaat <1586868+basbloemsaat@users.noreply.github.com>
2026-01-25 17:09:26 +00:00
Miss Islington (bot)
88025560aa
[3.13] Correctly fold unknown-8bit originating from encoded words. (GH-142517) (#143147)
The unknown-8bit trick was designed to deal with unknown bytes in an
ASCII message, and it works fine for that.  However, I also tried to
extend it to handle bytes that can't be decoded using the charset
specified in an encoded word, and there it fails because there can be
other non-ASCII characters that were *successfully* decoded.  The fix is
simple: do the unknown-8bit encoding using the utf-8 codec.  This is
especially appropriate since anyone trying to do recovery on an unknown
byte string will probably attempt utf-8 first.
(cherry picked from commit 1e17ccd030)

Co-authored-by: R. David Murray <rdmurray@bitdance.com>
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
2025-12-24 13:19:28 -05:00
Miss Islington (bot)
c084a66568
[3.13] gh-79986: Add parsing for References/In-Reply-To email headers (GH-137201) (#142574)
gh-79986: Add parsing for References/In-Reply-To email headers (GH-137201)

This is a followup to 46d88a1131 (GH-13397),
which added parsing for Message-ID. Similar handling is needed for the
other two identification headers.
(cherry picked from commit 79aa43a979)

Co-authored-by: elenril <anton@khirnov.net>
2025-12-21 14:36:23 -05:00
Miss Islington (bot)
b57d69588c
[3.13] gh-68552: fix defects policy (GH-138579) (#142367)
Co-authored-by: Ivo Bellin Salarin <nilleb@users.noreply.github.com>
Co-authored-by: Martin Panter <vadmium@users.noreply.github.com>
Co-authored-by: Ivo Bellin Salarin <ivo@nilleb.com>
2025-12-09 07:39:03 +00:00
Miss Islington (bot)
90ca216cdc
[3.13] gh-142006: Fix HeaderWriteError in email.policy.default caused by extra newline (GH-142008) (#142362)
gh-142006: Fix HeaderWriteError in email.policy.default caused by extra newline (GH-142008)

RDM: This fixes a subtle folding error that showed up when a token exactly filled a line and was followed by whitespace and a token with no folding whitespace that was longer than a line.  In this particular circumstance the whitespace after the first token got pushed on to the next line, and then stolen to go in front of the next unfoldable token...leaving a completely empty line in the line buffer.  That line got turned in to a newline, which is RFC illegal, and the newish security check caught it.  The fix is to just delete that empty line from the buffer.
(cherry picked from commit 07eff899d8)

Co-authored-by: Paresh Joshi <rahulj9223@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2025-12-06 16:40:07 -05:00
Miss Islington (bot)
b34ecec682
[3.13] gh-136063: fix quadratic-complexity parsing in email.message._parseparam (GH-136072) (#140828)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-11-30 14:34:22 +02:00
Filip Łajszczak
d76e411891
[3.13] gh-139434: Update selected RFC 2822 references to RFC 5322 (GH-139435) (#141024)
Update selected RFC 2822 references to RFC 5322

RFC 2822 was obsoleted by RFC 5322 in 2008. This updates references
to use the current standard in documentation, docstrings, and comments.

It preserves RFC 2822 references in legacy API components to maintain their
historical context.

RFC 822 → RFC 2822 → RFC 5322 progression is explained where relevant.

In some places specific sections of RFC are referenced where it seems helpful.

Scout rule was applied in some places and RFC mentions format was
normalized in doc strings and comments.
(cherry picked from commit ce1bb85d28)
2025-11-04 16:22:31 -05:00
Jiucheng(Oliver)
10af8404b4
[3.13] gh-135307: Fix email error when policy max_line_length is set to 0 or None (GH-135367) (#140917)
[3.13] gh-135307: Fix email error when policy max_line_length is set to 0 or None (GH-135367)
(cherry picked from commit 6d45cd8dbb)

Co-authored-by: Jiucheng(Oliver) <git.jiucheng@gmail.com>
RDM: Like the change made in a earlier PR to the folder, we can/must use 'maxlen' as a stand in for 'unlimited' when computing line lengths when max_line_length is 0 or None; otherwise the computation results in a traceback.
2025-11-02 15:20:29 -05:00
Miss Islington (bot)
6176101b4a
[3.13] gh-134759: fix UnboundLocalError in email.message.Message.get_payload (GH-136071) (#136580)
gh-134759: fix `UnboundLocalError` in `email.message.Message.get_payload` (GH-136071)
(cherry picked from commit 25335d297b)

Co-authored-by: Kliment Lamonov <klimentlamonov@yandex.ru>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-07-12 13:52:54 +00:00
Miss Islington (bot)
a09199e3cc
[3.13] Docs: fix docstring of email.message.Message.add_header (GH-134355) (#135340)
Docs: fix docstring of `email.message.Message.add_header` (GH-134355)
(cherry picked from commit c23eec2960)

Co-authored-by: Alexander Shadchin <shadchin@yandex-team.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
2025-07-03 10:22:39 +00:00
Miss Islington (bot)
3bd2818664
[3.13] gh-67022: Document bytes/str inconsistency in email.header.decode_header() and suggest email.headerregistry.HeaderRegistry as a sane alternative (GH-92900) (#135549)
gh-67022: Document bytes/str inconsistency in email.header.decode_header() and suggest email.headerregistry.HeaderRegistry as a sane alternative (GH-92900)

* gh-67022: Document bytes/str inconsistency in email.header.decode_header()

This function's possible return types have been surprising and error-prone
for the entirety of its Python 3.x history. It can return either:

1. `typing.List[typing.Tuple[bytes, typing.Optional[str]]]` of length >1
2. or `typing.List[typing.Tuple[str, None]]`, of length exactly 1

This means that any user of this function must be prepared to accept either
`bytes` or `str` for the first member of the 2-tuples it returns, which is a
very surprising behavior in Python 3.x, particularly given that the second
member of the tuple is supposed to represent the charset/encoding of the
first member.

This patch documents the behavior of this function, and adds test cases
to demonstrate it.

As discussed in bpo-22833, this cannot be changed in a backwards-compatible
way, and some users of this function depend precisely on the existing
behavior.

Add warnings about obsolescence of 'email.header.decode_header' and 'email.header.make_header' functions.

Recommend use of `email.headerregistry.HeaderRegistry` instead, as suggested
in https://github.com/python/cpython/pull/92900#discussion_r1112472177
(cherry picked from commit 60181f4ed0)

Co-authored-by: Dan Lenski <dlenski@gmail.com>
2025-06-15 16:02:43 -04:00
Miss Islington (bot)
e5d1771c6b
[3.13] gh-134151 Fix TypeError in email.utils.decode_params when sorting RFC 2231 continuations (GH-134687) (#135248)
gh-134151 Fix `TypeError` in `email.utils.decode_params` when sorting RFC 2231 continuations (GH-134687)

- Fix sorting logic in `email.utils.decode_params` to handle None values.
- Update tests for RFC 2231 continuation sorting.
(cherry picked from commit bcb6b45cb8)

Co-authored-by: Jiucheng(Oliver) <git.jiucheng@gmail.com>
2025-06-08 07:38:00 +00:00
Miss Islington (bot)
58b9581ff5
[3.13] gh-134155: fix AttributeError in email._header_value_parser.get_address (GH-134194) (#135192)
gh-134155: fix AttributeError in email._header_value_parser.get_address (GH-134194)

Append the defect to defects instead of to the parse tree.
(cherry picked from commit d9cad074d5)

Co-authored-by: Sergey Miryanov <sergey.miryanov@gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2025-06-06 14:11:15 -04:00
Miss Islington (bot)
08e1ba8813
[3.13] gh-134152: Fix UnboundLocalError in email._header_value_parser _get_ptext_to_endchars (GH-134233) (#134677)
Co-authored-by: R. David Murray <rdmurray@bitdance.com>
2025-05-26 11:02:58 +03:00
Miss Islington (bot)
31767e6100
[3.13] gh-121284: Fix email address header folding with parsed encoded-word (GH-122754) (#131403)
gh-121284: Fix email address header folding with parsed encoded-word (GH-122754)

Email generators using email.policy.default may convert an RFC 2047
encoded-word to unencoded form during header refolding. In a structured
header, this could allow 'specials' chars outside a quoted-string,
leading to invalid address headers and enabling spoofing. This change
ensures a parsed encoded-word that contains specials is kept as an
encoded-word while the header is refolded.

[Better fix from @bitdancer.]

---------
(cherry picked from commit 295b53df2a)

Co-authored-by: Mike Edmunds <medmunds@gmail.com>
Co-authored-by: R David Murray <rdmurray@bitdance.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2025-03-18 15:34:00 -04:00
Miss Islington (bot)
2120089547
[3.13] gh-80222: Fix email address header folding with long quoted-string (GH-122753) (#129007)
gh-80222: Fix email address header folding with long quoted-string (GH-122753)

Email generators using email.policy.default could incorrectly omit the
quote ('"') characters from a quoted-string during header refolding,
leading to invalid address headers and enabling header spoofing. This
change restores the quote characters on a bare-quoted-string as the
header is refolded, and escapes backslash and quote chars in the string.
(cherry picked from commit 5aaf416858)

Co-authored-by: Mike Edmunds <medmunds@gmail.com>
2025-01-19 16:06:28 -05:00
Miss Islington (bot)
ad3bbb6b0d
[3.13] gh-98188: Fix EmailMessage.get_payload to decode data when CTE value has extra text (GH-127547) (#128528)
gh-98188: Fix EmailMessage.get_payload to decode data when CTE value has extra text (GH-127547)

Up to this point message handling has been very strict with regards to content encoding values: mixed case was accepted, but trailing blanks or other text would cause decoding failure, even if the first token was a valid encoding.  By Postel's Rule we should go ahead and decode as long as we can recognize that first token.  We have not thought of any security or backward compatibility concerns with this fix.

This fix does introduce a new technique/pattern to the Message code: we look to see if the header has a 'cte' attribute, and if so we use that.  This effectively promotes the header API exposed by HeaderRegistry to an API that any header parser "should" support.  This seems like a reasonable thing to do.  It is not, however, a requirement, as the string value of the header is still used if there is no cte attribute.

The full fix (ignore any trailing blanks or blank-separated trailing text) applies only to the non-compat32 API.  compat32 is only fixed to the extent that it now ignores trailing spaces.  Note that the HeaderRegistry parsing still records a HeaderDefect if there is extra text.

(cherry picked from commit a62ba52f14)

Co-authored-by: RanKKI <hliu86.me@gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-01-07 12:43:04 -05:00
Miss Islington (bot)
af35aa2880
[3.13] gh-124452: Fix header mismatches when folding/unfolding with email message (GH-125919) (#126917)
gh-124452: Fix header mismatches when folding/unfolding with email message (GH-125919)

The header-folder of the new email API has a long standing known buglet where
if the first token is longer than max_line_length, it puts that token on the next
line.  It turns out there is also a *parsing* bug when parsing such a header:
the space prefixing that first, non-empty line gets preserved and tacked on to
the start of the header value, which is not the expected behavior per the RFCs.
The bug arises from the fact that the parser assumed that there would be at
least one token on the line with the header, which is going to be true for
probably every email producer other than the python email library with its
folding buglet.  Clearly, though, this is a case that needs to be handled
correctly.  The fix is simple: strip the blanks off the start of the whole
value, not just the first physical line of the value.

(cherry picked from commit ed81971e6b)

Co-authored-by: RanKKI <hliu86.me@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-11-17 15:06:18 -05:00
Miss Islington (bot)
4aaa4259b5
[3.13] gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233) (#122484)
gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233)

GH-GH- Encode header parts that contain newlines

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

GH-GH- Verify that email headers are well-formed

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339818)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-08-06 19:06:41 +02:00
Miss Islington (bot)
9a332f260d
[3.13] gh-120930: Remove extra blank occuring in wrapped encoded words in email headers (GH-121747) (GH-121963)
gh-120930: Remove extra blank occuring in wrapped encoded words in email headers (GH-121747)
(cherry picked from commit cecaceea31)

Co-authored-by: Matthieu Caneill <matthieucan@users.noreply.github.com>
2024-07-19 19:21:53 +02:00
Serhiy Storchaka
a45d9051ed
[3.13] gh-121905: Consistently use "floating-point" instead of "floating point" (GH-121907) (GH-122012)
(cherry picked from commit 1a0c7b9ba4)
2024-07-19 09:13:08 +00:00
Miss Islington (bot)
6892b400dc
[3.13] gh-118643: Fix AttributeError in the email module (GH-119099) (GH-119389)
Fix regression introduced in gh-100884: AttributeError when re-fold a long
address list.

Also fix more cases of incorrect encoding of the address separator in the
address list missed in gh-100884.
(cherry picked from commit 858b9e85fc)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-05-22 14:07:38 +03:00
Miss Islington (bot)
054f1af811
[3.13] gh-92081: Fix for email.generator.Generator with whitespace between encoded words. (GH-92281) (#119245)
* Fix for email.generator.Generator with whitespace between encoded words.

email.generator.Generator currently does not handle whitespace between
encoded words correctly when the encoded words span multiple lines.  The
current generator will create an encoded word for each line.  If the end
of the line happens to correspond with the end real word in the
plaintext, the generator will place an unencoded space at the start of
the subsequent lines to represent the whitespace between the plaintext
words.

A compliant decoder will strip all the whitespace from between two
encoded words which leads to missing spaces in the round-tripped
output.

The fix for this is to make sure that whitespace between two encoded
words ends up inside of one or the other of the encoded words.  This
fix places the space inside of the second encoded word.

A second problem happens with continuation lines.  A continuation line that
starts with whitespace and is followed by a non-encoded word is fine because
the newline between such continuation lines is defined as condensing to
a single space character.  When the continuation line starts with whitespace
followed by an encoded word, however, the RFCs specify that the word is run
together with the encoded word on the previous line.  This is because normal
words are filded on syntactic breaks by encoded words are not.

The solution to this is to add the whitespace to the start of the encoded word
on the continuation line.

Test cases are from GH-92081

* Rename a variable so it's not confused with the final variable.
(cherry picked from commit a6fdb31b67)

Co-authored-by: Toshio Kuratomi <a.badger@gmail.com>
2024-05-20 20:01:56 +00:00
wim glenn
fed8d73fde
gh-118455: Fix mangle_from_ default value in email.policy.Policy.__doc__ (#118456)
* Fix mangle_from_ default value in email.policy.Policy.__doc__

The docstring says it defaults to True, but it actually defaults
to False. Only the Compat32 subclass overrides that.

---------

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
2024-05-05 09:18:04 +03:00
Serhiy Storchaka
deaecb88fa
gh-80361: Fix TypeError in email.Message.get_payload() (GH-117994)
It was raised when the charset is rfc2231 encoded, e.g.:

   Content-Type: text/plain; charset*=ansi-x3.4-1968''utf-8
2024-04-17 19:31:26 +03:00
Ivan Savin
1aa8bbe62f
bpo-40944: Fix IndexError when parse emails with truncated Message-ID, address, routes, etc (GH-20790)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-04-17 10:14:22 +00:00
Serhiy Storchaka
aec1dac4ef
gh-117313: Fix re-folding email messages containing non-standard line separators (GH-117369)
Only treat '\n', '\r' and '\r\n' as line separators in re-folding the email
messages.  Preserve control characters '\v', '\f', '\x1c', '\x1d' and '\x1e'
and Unicode line separators '\x85', '\u2028' and '\u2029' as is.
2024-04-17 13:00:25 +03:00
Serhiy Storchaka
f74e51229c
gh-86650: Fix IndexError when parse emails with invalid Message-ID (GH-117934)
In particularly, one-off addresses generated by Microsoft Outlook:
https://learn.microsoft.com/en-us/office/client-developer/outlook/mapi/one-off-addresses

Co-authored-by: fsc-eriker <72394365+fsc-eriker@users.noreply.github.com>
2024-04-17 10:44:41 +03:00
tsufeki
8cc9adbfdd
gh-75171: Fix parsing invalid email address headers starting or ending with a dot (GH-15600)
Co-authored-by: Tim Bell <timothybell@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-04-17 10:39:15 +03:00
Serhiy Storchaka
f97f25ef5d
gh-76511: Fix email.Message.as_string() for non-ASCII message with ASCII charset (GH-116125) 2024-03-05 17:49:01 +02:00
Thomas Weißschuh
09fab93c3d
gh-100884: email/_header_value_parser: don't encode list separators (GH-100885)
ListSeparator should not be encoded. This could happen when a long line
pushes its separator to the next line, which would have been encoded.
2024-02-17 10:13:46 +00:00
Shantanu
2124a3ddcc
gh-109653: Improve import time of importlib.metadata / email.utils (#114664)
My criterion for delayed imports is that they're only worth it if the
majority of users of the module would benefit from it, otherwise you're
just moving latency around unpredictably.

mktime_tz is not used anywhere in the standard library and grep.app
indicates it's not got much use in the ecosystem either.

Distribution.files is not nearly as widely used as other
importlib.metadata APIs, so we defer the csv import.

Before:
```
λ hyperfine -w 8 './python -c "import importlib.metadata"'
Benchmark 1: ./python -c "import importlib.metadata"
  Time (mean ± σ):      65.1 ms ±   0.5 ms    [User: 55.3 ms, System: 9.8 ms]
  Range (min … max):    64.4 ms …  66.4 ms    44 runs
```

After:
```
λ hyperfine -w 8 './python -c "import importlib.metadata"'
Benchmark 1: ./python -c "import importlib.metadata"
  Time (mean ± σ):      62.0 ms ±   0.3 ms    [User: 52.5 ms, System: 9.6 ms]
  Range (min … max):    61.3 ms …  62.8 ms    46 runs
```

for about a 3ms saving with warm disk cache, maybe 7-11ms with cold disk
cache.
2024-01-29 01:30:22 -08:00
Rito Takeuchi
504334c7be
gh-77749: Fix inconsistent behavior of non-ASCII handling in EmailPolicy.fold() (GH-6986)
It now always encodes non-ASCII characters in headers if utf8 is false.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-01-26 15:19:41 +00:00
Serhiy Storchaka
e9d5b6ea2d
gh-113594: Fix UnicodeEncodeError in TokenList.fold() (GH-113730)
It occurred when try to re-encode an unknown-8bit part combined with non-unknown-8bit part.
2024-01-10 14:54:36 +02:00
Victor Stinner
4a153a1d3b
[CVE-2023-27043] gh-102988: Reject malformed addresses in email.parseaddr() (#111116)
Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

Co-Authored-By: Thomas Dwyer <github@tomd.tel>
2023-12-15 16:10:40 +01:00
Sidney Markowitz
27a5fd8cb8
gh-94606: Fix error when message with Unicode surrogate not surrogateescaped string (GH-94641)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-12-11 18:21:18 +02:00
Alex Waygood
aa3f419acb
gh-109653: Improve the import time of email.utils (#109824) 2023-10-12 15:03:20 -07:00
htsedebenham
c65592c4d6
gh-106186: Don't report MultipartInvariantViolationDefect for valid multipart emails when parsing header only (#107016) 2023-07-23 12:25:18 +02:00
Gregory P. Smith
a31dea1feb
gh-106669: Revert "gh-102988: Detect email address parsing errors ... (#105127)" (#106733)
This reverts commit 18dfbd0357.
Adds a regression test from the issue.

See https://github.com/python/cpython/issues/106669.
2023-07-20 20:30:52 -07:00
CF Bolz-Tereick
7e6ce48872
gh-106628: email parsing speedup (gh-106629) 2023-07-13 15:12:56 +09:00
Thomas Dwyer
18dfbd0357
gh-102988: Detect email address parsing errors and return empty tuple to indicate the parsing error (old API) (#105127)
Detect email address parsing errors and return empty tuple to indicate the parsing error (old API). This fixes or at least ameliorates CVE-2023-27043.

---------

Co-authored-by: Gregory P. Smith <greg@krypto.org>
2023-07-10 23:00:55 +00:00