cpython/Lib/html
Miss Islington (bot) 03ce4b2617
[3.14] gh-135661: Fix parsing start and end tags in HTMLParser according to the HTML5 standard (GH-135930) (GH-136255)
* Whitespaces no longer accepted between `</` and the tag name.
  E.g. `</ script>` does not end the script section.

* Vertical tabulation (`\v`) and non-ASCII whitespaces no longer recognized
  as whitespaces. The only whitespaces are `\t\n\r\f `.

* Null character (U+0000) no longer ends the tag name.

* Attributes and slashes after the tag name in end tags are now ignored,
  instead of terminating after the first `>` in quoted attribute value.
  E.g. `</script/foo=">"/>`.

* Multiple slashes and whitespaces between the last attribute and closing `>`
  are now ignored in both start and end tags. E.g. `<a foo=bar/ //>`.

* Multiple `=` between attribute name and value are no longer collapsed.
  E.g. `<a foo==bar>` produces attribute "foo" with value "=bar".

* Whitespaces between the `=` separator and attribute name or value are no
  longer ignored. E.g. `<a foo =bar>` produces two attributes "foo" and
  "=bar", both with value None; `<a foo= bar>` produces two attributes:
  "foo" with value "" and "bar" with value None.

---------
(cherry picked from commit 0243f97cba)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
2025-07-03 20:57:53 +00:00
..
__init__.py gh-100210: Correct the comment link for unescaping HTML (#100212) 2023-02-19 11:18:12 +01:00
entities.py gh-97669: Create Tools/build/ directory (#97963) 2022-10-17 12:01:00 +02:00
parser.py [3.14] gh-135661: Fix parsing start and end tags in HTMLParser according to the HTML5 standard (GH-135930) (GH-136255) 2025-07-03 20:57:53 +00:00