114 commits

Author SHA1 Message Date
R David Murray
905c8c3d8d #19772: Do not mutate message when downcoding to 7bit.
This is a bit of an ugly hack because of the way generator pieces together the
output message.  The deepcopys aren't too expensive, though, because we know it
is only called on messages that are not multiparts, and the payload (the thing
that could be large) is an immutable object.

Test and preliminary work on patch by Vajrasky Kok.
2014-02-08 11:48:20 -05:00
R David Murray
c489e83432 Merge: #17369: Improve handling of broken RFC2231 values in get_filename. 2014-02-07 15:04:26 -05:00
R David Murray
1e949890f6 #17369: Improve handling of broken RFC2231 values in get_filename.
This fixes a regression relative to python2.
2014-02-07 15:02:19 -05:00
R David Murray
15a693a6f8 #20531: Apply the 3.3 version of the #19063 fix.
So passing unicode to set_payload works again (but still doesn't
do what you want when the message is serialized).
2014-02-07 12:46:17 -05:00
R David Murray
27e9de669b #20531: Revert e20f98a8ed71, the 3.4 version of the #19063 fix. 2014-02-07 12:40:37 -05:00
R David Murray
44fcaae90d Merge #20206, #5803: more efficient algorithm that doesn't truncate output.
(No idea why test_tarfile is listed as changed...it isn't.)
2014-01-13 13:30:13 -05:00
R David Murray
2313e15578 #20206, #5803: more efficient algorithm that doesn't truncate output.
This fixes an edge case (20206) where if the input ended in a character
needing encoding but there was no newline on the string, the last byte
of the encoded character would be dropped.  The fix is to use a more
efficient algorithm, provided by Serhiy Storchaka (5803), that does not
have the bug.
2014-01-13 13:19:21 -05:00
R David Murray
775632ba10 #19957: Simplify encode_7or8bit now that _payload is always str.
Patch by Vajrasky Kok, test enhancement by me.
2013-12-12 21:40:20 -05:00
R David Murray
50bfbb9903 #19063: fix set_payload handling of non-ASCII string input.
This version of the fix raises an error instead of accepting the invalid
input (ie: if a non-ASCII string is used but no charset is specified).
2013-12-11 16:52:11 -05:00
R David Murray
d5c4c7411a #19063: partially fix set_payload handling of non-ASCII string input.
This is a backward compatible partial fix, the complete fix requires raising
an error instead of accepting the invalid input, so the real fix is only
suitable for 3.4.
2013-12-11 16:34:34 -05:00
Serhiy Storchaka
a7a34a83f3 Issue #19590: Use specific asserts in email tests. 2013-11-16 12:56:54 +02:00
Serhiy Storchaka
328cf3cbdf Issue #19590: Use specific asserts in email tests. 2013-11-16 12:56:23 +02:00
Terry Jan Reedy
7e7cf8bc51 Issue #12037: Fix test_email for desktop Windows. 2013-08-31 17:16:45 -04:00
Terry Jan Reedy
740d6b6f39 Issue #12037: Fix test_email for desktop Windows. 2013-08-31 17:12:21 -04:00
R David Murray
b8c537094d Merge #18324: set_payload now correctly handles binary input. 2013-08-21 21:13:51 -04:00
R David Murray
00ae435dee #18324: set_payload now correctly handles binary input.
This also backs out the previous fixes for #14360, #1717, and #16564.
Those bugs were actually caused by the fact that set_payload didn't decode to
str, thus rendering the model inconsistent.  This fix does mean the data
processed by the encoder functions goes through an extra encode/decode cycle,
but it means the model is always consistent.  Future API updates will provide
a better way to encode payloads, which will bypass this minor de-optimization.

Tests by Vajrasky Kok.
2013-08-21 21:10:31 -04:00
Ezio Melotti
a1e639a0f4 #18505: merge with 3.3. 2013-08-10 18:57:52 +03:00
Ezio Melotti
1c4810b57b #18505: fix duplicate name and remove duplicate test. Patch by Vajrasky Kok. 2013-08-10 18:57:12 +03:00
R David Murray
bb17d2b857 #18600: add policy to add_string, and as_bytes and __bytes__ methods.
This was triggered by wanting to make the doctest in email.policy.rst pass;
as_bytes and __bytes__ are clearly useful now that we have BytesGenerator.
Also updated the Message docs to document the policy keyword that was
added in 3.3.
2013-08-09 16:15:28 -04:00
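A minimal sketch of the as_bytes and __bytes__ methods mentioned in the commit above, assuming a current Python 3 where both exist on email.message.Message. The header and body values are illustrative only and do not come from the commit.

```python
from email.message import Message

# Build a small message; the subject and body are made-up sample values.
msg = Message()
msg["Subject"] = "hello"
msg.set_payload("body text\n")

# as_bytes() serializes the message to bytes; bytes(msg) goes through
# __bytes__ and should give the same result.
raw = msg.as_bytes()
same = bytes(msg)
print(raw)
```

Both calls should yield the same serialized form, which makes bytes(msg) a convenient shorthand when no extra arguments are needed.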
R David Murray
271ade87ac #18503: small cleanups in test_email.
Patch by Vajrasky Kok.
2013-07-25 12:11:55 -04:00
Ezio Melotti
2a99d5df63 #18380: pass regex flags to the right argument. Patch by Valentina Mukhamedzhanova. 2013-07-06 17:16:04 +02:00
R David Murray
f6069f9f22 #14360: make encoders.encode_quopri work.
There were no tests for the encoders module.  encode_base64 worked
because it is the default and so got tested implicitly elsewhere, and
we use encode_7or8bit internally, so that worked, too.  I previously
fixed encode_noop, so this fix means that everything in the encoders
module now works, hopefully correctly.  Also added an explicit test
for encode_base64.
2013-06-27 18:37:00 -04:00
R David Murray
8093d6f822 Merge: #17431: Fix missing import of BytesFeedParser in email.parser. 2013-03-15 20:42:29 -04:00
R David Murray
965794ed58 Merge: PEP8 fixup on previous patch, remove unused imports in test_email. 2013-03-07 18:16:47 -05:00
R David Murray
addb0be63e Merge: #14645: Generator now emits correct linesep for all parts.
Previously the parts of the message retained whatever linesep they had on
read, which means if the messages weren't read in universal newline mode, the
line endings could well be inconsistent.  In general sending it via smtplib
would result in them getting fixed, but it is better to generate them
correctly to begin with.  Also, the new send_message method of smtplib does
not do the fixup, so that method is producing rfc-invalid output without this
fix.
2013-03-07 16:43:58 -05:00
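The linesep behavior described above can be sketched as follows, assuming a current Python 3 where the generator takes its line separator from a policy object (email.policy.SMTP uses CRLF). The sample message is hypothetical.

```python
from io import BytesIO
from email import message_from_string, policy
from email.generator import BytesGenerator

# Parse a message whose source uses bare "\n" line endings.
msg = message_from_string("Subject: hi\n\nline one\nline two\n")

# Flatten it with the SMTP policy, which asks for "\r\n" everywhere,
# including inside the body.
buf = BytesIO()
BytesGenerator(buf, policy=policy.SMTP).flatten(msg)
smtp_bytes = buf.getvalue()

# Any "\n" left over after removing the CRLF pairs would be inconsistent.
stray_newlines = smtp_bytes.replace(b"\r\n", b"").count(b"\n")
print(smtp_bytes, stray_newlines)
```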
R David Murray
66383b2e0a Merge: #17171: fix email.encoders.encode_7or8bit when applied to binary data. 2013-02-11 10:53:35 -05:00
R David Murray
6cb1d67eb3 Merge: #16564: Fix regression in use of encoders.encode_noop with binary data. 2013-02-09 13:10:54 -05:00
R David Murray
e201e9d584 Merge: #16948: Fix quopri encoding of non-latin1 character sets. 2013-02-05 10:55:27 -05:00
Georg Brandl
1aca31e8f3 Closes #15925: fix regression in parsedate() and parsedate_tz() that should return None if unable to parse the argument. 2012-09-22 09:03:56 +02:00
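A short illustration of the behavior this fix restores, assuming a current Python 3. The input strings are made-up examples.

```python
from email.utils import parsedate, parsedate_tz

# Input that cannot be parsed as a date should give None, not an exception.
bad = parsedate("not a date")
bad_tz = parsedate_tz("not a date")

# A well-formed RFC 2822 date still parses to a time tuple.
good = parsedate("Sat, 22 Sep 2012 09:03:56 +0200")
print(bad, bad_tz, good)
```

Callers can therefore test the result against None instead of wrapping the call in a try block.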
R David Murray
ad2a7d528a Merge #15249: Mangle From lines correctly when body contains invalid bytes.
Fix by Colin Su.  Test by me, based on a test written by Petri Lehtinen.
2012-08-24 11:23:50 -04:00
R David Murray
970bef295d Merge #15232: correctly mangle From lines in MIME preamble and epilogue 2012-07-22 21:53:54 -04:00
R David Murray
97f43c019f #15160: Extend the new email parser to handle MIME headers.
This code passes all the same tests that the existing RFC mime header
parser passes, plus a bunch of additional ones.

There are a couple of commented out tests where there are issues with the
folding.  The folding doesn't normally get invoked for headers parsed from
source, and the cases are marginal anyway (headers with invalid binary data)
so I'm not worried about them, but will fix them after the beta.

There are things that can be done to make this API even more convenient, but I
think this is a solid foundation worth having.  And the parser is a full RFC
parser, so it handles cases that the current parser doesn't.  (There are also
probably cases where it fails when the current parser doesn't, but I haven't
found them yet ;)

Oh, yeah, and there are some really ugly bits in the parser for handling some
'postel' cases that are unfortunately common.

I hope/plan to eventually refactor a lot of the code in the parser which
should reduce the line count...but there is no escaping the fact that the
error recovery is a welter of special cases.
2012-06-24 05:03:27 -04:00
Alexander Belopolsky
76935b9c8c Issue #14653: email.utils.mktime_tz() no longer relies on system
mktime() when timezone offset is supplied.
2012-06-21 20:48:23 -04:00
R David Murray
82ffabdfa4 #2658: Add test for issue fixed by fix for #1079. 2012-06-03 12:27:07 -04:00
R David Murray
07ea53cb21 #1079: Fix parsing of encoded words.
This is a behavior change: before this leading and trailing spaces were
stripped from ASCII parts, now they are preserved.  Without this fix we didn't
parse the examples in the RFC correctly, so I think breaking backward
compatibility here is justified.

Patch by Ralf Schlatterbeck.
2012-06-02 17:56:49 -04:00
R David Murray
d41595b920 Refactor test_email/test_defect_handling. 2012-05-28 20:14:10 -04:00
R David Murray
80e0aee95b #1672568: email now registers defects for base64 payload format errors.
Which also means that it is now producing *something* for any base64
payload, which is what leads to the couple of older test changes in
test_email.  This is a slightly backward incompatible behavior change,
but the new behavior is so much more useful than the old (you can now
*reliably* detect errors, and any program that was detecting errors by
sniffing for a base64 return from get_payload(decode=True) and then doing
its own error-recovery decode will just get the error-recovery decode
right away).  So this seems to me to be worth the small risk inherent
in this behavior change.

This patch also refactors the defect tests into a separate test file,
since they are no longer just parser tests.
2012-05-27 21:23:34 -04:00
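A rough sketch of detecting a base64 format error through the defects list, assuming a current Python 3 and the default compat32 policy. The deliberately malformed payload "abc" (wrong padding) is a hypothetical example, and the exact defect class reported may differ between versions.

```python
from email import message_from_string

# A base64 body with bad padding: "abc" is not a multiple of 4 characters.
source = (
    "Content-Type: text/plain\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "abc\n"
)
msg = message_from_string(source)

# Decoding still returns *something* (an error-recovery decode), and the
# problem is recorded on the message instead of being silently ignored.
decoded = msg.get_payload(decode=True)
defect_names = [type(d).__name__ for d in msg.defects]
print(decoded, defect_names)
```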
R David Murray
adbdcdbd95 #14925: email now registers a defect for missing header/body separator.
This patch also deprecates the MalformedHeaderDefect.  My best guess is that
this defect was rendered obsolete by a refactoring of the parser, and the
corresponding defect for the new parser (which this patch introduces) was
overlooked.
2012-05-27 20:45:01 -04:00
R David Murray
c27e52265b #14731: refactor email policy framework.
This patch primarily does two things: (1) it adds some internal-interface
methods to Policy that allow for Policy to control the parsing and folding of
headers in such a way that we can construct a backward compatibility policy
that is 100% compatible with the 3.2 API, while allowing a new policy to
implement the email6 API.  (2) it adds that backward compatibility policy and
refactors the test suite so that the only differences between the 3.2
test_email.py file and the 3.3 test_email.py file is some small changes in
test framework and the addition of tests for bugs fixed that apply to the 3.2
API.

There are some additional tweaks, such as moving just the code needed for the
compatibility policy into _policybase, so that the library code can import
only _policybase.  That way the new code that will be added for email6
will only get imported when a non-compatibility policy is imported.
2012-05-25 15:01:48 -04:00
R David Murray
42243c4dca #14380: Make actual default match docs, fix __init__ order.
Éric pointed out that given that the default was documented as None, someone
would reasonably pass that to get the default behavior.  In fixing the code to
use None, I noticed that the change to _charset was being done after it had
already been passed to MIMENonMultipart.  The change to the test verifies that
the order is now correct.
2012-03-22 22:40:44 -04:00
R David Murray
8680bcc5db #14380: Have MIMEText default to utf-8 when passed non-ASCII unicode
Previously it would just accept the unicode, which would wind up as unicode in
the transfer-encoded message object, which is just wrong.

Patch by Jeff Knupp.
2012-03-22 22:17:51 -04:00
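A minimal sketch of the MIMEText default described in the #14380 commits, assuming a current Python 3. The text values are illustrative.

```python
from email.mime.text import MIMEText

# No charset given: ASCII-only text keeps the us-ascii default...
ascii_part = MIMEText("plain text")
ascii_charset = ascii_part.get_content_charset()

# ...while text containing non-ASCII characters falls back to utf-8.
unicode_part = MIMEText("caf\u00e9")
unicode_charset = unicode_part.get_content_charset()

print(ascii_charset, unicode_charset)
```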
R David Murray
80e22b56d3 Merge #11686: add missing entries to email __all__ lists.
Original patch by Steffen Daode Nurpmeso
2012-03-16 22:46:14 -04:00
R David Murray
b53319f509 #12818: remove escaping of () in quoted strings in formataddr
The quoting of ()s inside quoted strings is allowed by the RFC, but is not
needed.  There seems to be no reason to add needless escapes.
2012-03-14 15:31:47 -04:00
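The formataddr change can be illustrated like this, assuming a current Python 3. The name and address are sample values.

```python
from email.utils import formataddr

# Parentheses force the display name to be quoted, but inside the quoted
# string they are left as-is and not backslash-escaped.
formatted = formataddr(("A (silly) Person", "person@dom.ain"))
print(formatted)
```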
R David Murray
8d8f110492 #14062: fix BytesParser handling of Header objects
This is a different fix than the 3.2 fix, but the new tests are the same.

This also affected smtplib.SMTP.send_message, which calls BytesParser.
2012-03-14 14:24:22 -04:00
R David Murray
e2922835b0 Merge #14291: if a header has non-ascii unicode, default to CTE using utf-8
In Python2, if a unicode string was assigned as the value of a header,
email would automatically CTE encode it using the UTF8 charset.
This capability was lost in the Python3 translation, and this patch
restores it.

Patch by Ali Ikinci, assisted by R. David Murray.

I also added a fix for the mailbox test that was depending (with a comment
that it was a bad idea to so depend) on non-ASCII causing message_from_string
to raise an error.  It now uses support.patch to induce an error during
message serialization.
2012-03-14 03:03:27 -04:00
R David Murray
749073af13 #1874: detect invalid multipart CTE and report it as a defect. 2011-06-22 13:47:53 -04:00
R David Murray
e76ff4081a merge #11584: make Header and make_header handle binary unknown-8bit input 2011-06-18 13:02:42 -04:00
R David Murray
7df08379c6 merge #11584: make decode_header handle Header objects correctly
This updates 12e39cd7a0e4 (merge of b21fdfa0019c), which fixed this bug
incorrectly.
2011-06-18 12:32:27 -04:00
R David Murray
3edd22ac95 #11731: simplify/enhance parser/generator API by introducing policy objects.
This new interface will also allow for future planned enhancements
in control over the parser/generator without requiring any additional
complexity in the parser/generator API.

Patch reviewed by Éric Araujo and Barry Warsaw.
2011-04-18 13:59:37 -04:00
R David Murray
f3299989a2 Merge: #11492: rewrite header folding algorithm. Less code, more passing tests. 2011-04-18 10:11:06 -04:00