Merged revisions 82011 via svnmerge from

svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r82011 | r.david.murray | 2010-06-15 22:19:40 -0400 (Tue, 15 Jun 2010) | 17 lines

  Merged revisions 81675 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r81675 | r.david.murray | 2010-06-03 11:43:20 -0400 (Thu, 03 Jun 2010) | 10 lines

    #5610: use \Z not $ so we don't eat extra chars when body part ends with \r\n.

    If a body part ended with \r\n, feedparser, using '$' to terminate its
    search for the newline, would match on the \r\n, and think that it needed
    to strip two characters in order to account for the line end before the
    boundary.  That made it chop one too many characters off the end of
    the body part.  Using \Z makes the match correct.

    Patch and test by Tony Nelson.
  ........
................
R. David Murray 2010-06-16 02:22:56 +00:00
parent 24d83873eb
commit 71df9d9216
3 changed files with 22 additions and 1 deletions
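
The effect described in the commit message can be reproduced with the `re` module alone. The sketch below is illustrative and is not the feedparser code itself: `strip_line_end` is a hypothetical helper that mimics removing the matched line end from a body part, and the names `EOL_DOLLAR` and `EOL_Z` are made up for the comparison. The payload is the body text from the new test followed by the "\n" that precedes the boundary.

```python
import re

# The two patterns from the diff. The \Z pattern is written as a raw string
# here (an assumption for this sketch) to avoid invalid-escape warnings on
# newer Python versions; it matches the same text as the non-raw original.
EOL_DOLLAR = re.compile('(\r\n|\r|\n)$')
EOL_Z = re.compile(r'(\r\n|\r|\n)\Z')


def strip_line_end(payload, pattern):
    """Hypothetical helper: drop as many trailing characters as the
    pattern matched, the way the commit message describes."""
    mo = pattern.search(payload)
    if mo:
        return payload[:-len(mo.group(0))]
    return payload


# Body part ending in CRLF, followed by the LF that belongs to the boundary.
payload = "body ending with CRLF newline\r\n\n"

# '$' also matches just before a trailing newline, so the search stops at
# "\r\n" and two characters are removed instead of one.
with_dollar = strip_line_end(payload, EOL_DOLLAR)

# '\Z' matches only at the very end, so just the final "\n" is removed.
with_z = strip_line_end(payload, EOL_Z)

print(repr(with_dollar))  # 'body ending with CRLF newline\r'
print(repr(with_z))       # 'body ending with CRLF newline\r\n'
```

With `$` the body part loses the "\n" of its own CRLF as well as the boundary's line end; with `\Z` only the boundary's line end is removed, which is what the new test checks.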

@@ -28,7 +28,7 @@
 NLCRE = re.compile('\r\n|\r|\n')
 NLCRE_bol = re.compile('(\r\n|\r|\n)')
-NLCRE_eol = re.compile('(\r\n|\r|\n)$')
+NLCRE_eol = re.compile('(\r\n|\r|\n)\Z')
 NLCRE_crack = re.compile('(\r\n|\r|\n)')
 # RFC 2822 $3.6.8 Optional fields. ftext is %d33-57 / %d59-126, Any character
 # except controls, SP, and ":".

@@ -2584,6 +2584,24 @@ def test_rfc2822_one_character_header(self):
         eq(headers, ['A', 'B', 'CC'])
         eq(msg.get_payload(), 'body')

+    def test_CRLFLF_at_end_of_part(self):
+        # issue 5610: feedparser should not eat two chars from body part ending
+        # with "\r\n\n".
+        m = (
+            "From: foo@bar.com\n"
+            "To: baz\n"
+            "Mime-Version: 1.0\n"
+            "Content-Type: multipart/mixed; boundary=BOUNDARY\n"
+            "\n"
+            "--BOUNDARY\n"
+            "Content-Type: text/plain\n"
+            "\n"
+            "body ending with CRLF newline\r\n"
+            "\n"
+            "--BOUNDARY--\n"
+          )
+        msg = email.message_from_string(m)
+        self.assertTrue(msg.get_payload(0).get_payload().endswith('\r\n'))

 class TestBase64(unittest.TestCase):

@@ -61,6 +61,9 @@ C-API
 Library
 -------

+- Issue #5610: feedparser no longer eats extra characters at the end of
+  a body part if the body part ends with a \r\n.
+
 - Fix codecs.escape_encode to return the correct consumed size.

 - Issue #8897: Fix sunau module, use bytes to write the header. Patch written