Commit graph

1112 commits

Author SHA1 Message Date
Serhiy Storchaka
73b3040f59
[3.11] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) (GH-133944) (GH-134341)
If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will become invalid
after destroying that temporary bytes object. So we need another way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have this issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58623)
(cherry picked from commit 6279eb8c07)
(cherry picked from commit a75953b347)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2025-06-02 17:52:52 +02:00
Grigoriev Semyon
3bc0d2b851
[3.11] gh-109120: Fix syntax error in handling of incorrect star expressions… (#117464)
gh-109120: Fix syntax error in handling of incorrect star expressions (#117444)

(cherry picked from commit c97d3af239)
2024-04-03 11:37:39 +01:00
Alex Waygood
a30a1e7a49
[3.11] gh-115881: Ensure ast.parse() parses conditional context managers even with low feature_version passed (#115920) (#115960)
2024-02-26 16:27:51 +00:00
Miss Islington (bot)
35a43d4394
[3.11] gh-115823: Correctly calculate error locations when dealing with implicit encodings (GH-115824) (#115950)
gh-115823: Correctly calculate error locations when dealing with implicit encodings (GH-115824)
(cherry picked from commit 015b97d19a)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2024-02-26 16:08:37 +00:00
Miss Islington (bot)
1c381ec4ed
[3.11] gh-113602: Bail out when the parser tries to override existing errors (GH-113607) (#113653)
gh-113602: Bail out when the parser tries to override existing errors (GH-113607)
(cherry picked from commit 9ed36d533a)

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2024-01-02 13:22:39 +00:00
Serhiy Storchaka
4b358d754c
[3.11] gh-106905: Use separate structs to track recursion depth in each PyAST_mod2obj call. (GH-113035) (GH-113472) (GH-113476)
(cherry picked from commit 48c49739f5)
(cherry picked from commit d58a5f453f)

Co-authored-by: Yilei Yang <yileiyang@google.com>
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
2023-12-25 20:40:33 +00:00
Miss Islington (bot)
390a5b81a9
[3.11] gh-112387: Fix error positions for decoded strings with backwards tokenize errors (GH-112409) (#112469)
gh-112387: Fix error positions for decoded strings with backwards tokenize errors (GH-112409)
(cherry picked from commit 45d648597b)

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-11-27 19:05:20 +00:00
Miss Islington (bot)
43b081bfc4
[3.11] gh-112388: Fix an error that was causing the parser to try to overwrite tokenizer errors (GH-112410) (#112467)
gh-112388: Fix an error that was causing the parser to try to overwrite tokenizer errors (GH-112410)
(cherry picked from commit 2c8b191742)

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-11-27 18:56:27 +00:00
Miss Islington (bot)
08e4e11b75
[3.11] gh-111380: Show SyntaxWarnings only once when parsing if invalid syntax is encountered (GH-111381) (#111383)
gh-111380: Show SyntaxWarnings only once when parsing if invalid syntax is encountered (GH-111381)
(cherry picked from commit 3d2f1f0b83)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-10-31 13:29:42 +00:00
Pablo Galindo Salgado
22cde39fbf
[3.11] bpo-43950: handle wide unicode characters in tracebacks (GH-28150) (#111373)
2023-10-27 09:46:20 +09:00
Pablo Galindo Salgado
4e4a3e161f
[3.11] gh-110696: Fix incorrect syntax error message for incorrect argument unpacking (GH-110706) (#110766)
2023-10-18 13:59:17 +01:00
Lysandros Nikolaou
1af7b7db0d
[3.11] gh-107450: Check for overflow in the tokenizer and fix overflow test (GH-110832) (#110939)
(cherry picked from commit a1ac5590e0)

Co-authored-by: Filipe Laíns <lains@riseup.net>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-10-18 00:34:56 +02:00
Miss Islington (bot)
c9214b90f4
[3.11] gh-107450: Raise OverflowError when parser column offset overflows (GH-110754) (#110763)
(cherry picked from commit fb7843ee89)

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2023-10-12 09:57:36 +00:00
Serhiy Storchaka
dae62d456e
[3.11] gh-88943: Improve syntax error for non-ASCII character that follows a numerical literal (GH-109081) (GH-109091)
The error now points to the invalid non-ASCII character, not to the valid numerical literal.
(cherry picked from commit b2729e93e9)
2023-09-07 14:54:07 +00:00
Miss Islington (bot)
c0c4186858
[3.11] GH-105588: Add missing error checks to some obj2ast_* converters (GH-105839)
GH-105588: Add missing error checks to some obj2ast_* converters (GH-105589)
(cherry picked from commit a4056c8f9c)

Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
2023-06-15 23:13:51 +00:00
Miss Islington (bot)
b764347572
[3.11] Fix typo in the tokenizer (GH-104950) (#104952)
(cherry picked from commit 705e387dd8)

Co-authored-by: Stepfen Shawn <m18824909883@163.com>
2023-05-25 23:32:04 -07:00
Lysandros Nikolaou
a09d3901a5
[3.11] gh-96670: Raise SyntaxError when parsing NULL bytes (GH-97594) (#104195)
2023-05-07 11:12:04 +01:00
Miss Islington (bot)
7b2ac6cf3d
[3.11] gh-102310: Change error range for invalid bytes literals (GH-103663) (#103703)
2023-04-23 17:21:27 -06:00
Miss Islington (bot)
abd6e97020
[3.11] GH-102711: Fix warnings found by clang (GH-102712) (#103075)
Building Python with clang produces some warnings:

Parser/pegen.c:812:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
_PyPegen_clear_memo_statistics()
                              ^
                               void

Parser/pegen.c:820:29: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
_PyPegen_get_memo_statistics()
                            ^
                             void

Fix them to make clang happy.

(cherry picked from commit 7703def37e)

Signed-off-by: Chenxi Mao <chenxi.mao@suse.com>
Co-authored-by: Chenxi Mao <chenxi.mao@suse.com>
2023-03-28 11:27:30 +02:00
Pablo Galindo Salgado
58de2eb26b
[3.11] gh-102416: Do not incorrectly memoize loop rules in the parser (GH-102467). (#102473)
2023-03-06 17:13:28 +00:00
Pablo Galindo Salgado
31b82abb5c
[3.11] gh-101046: Fix a potential memory leak in the parser when raising MemoryError (GH-101051) (#101085)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-01-16 23:48:51 +00:00
Miss Islington (bot)
2b97ddd512
gh-100050: Fix an assertion error when raising unclosed parenthesis errors in the tokenizer (GH-100065)
(cherry picked from commit 97e7004cfe)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Automerge-Triggered-By: GH:pablogsal
2022-12-07 01:18:00 -08:00
Pablo Galindo Salgado
6282ef6c3f
[3.11] gh-99891: Fix infinite recursion in the tokenizer when showing warnings (GH-99893) (GH-99896)
Automerge-Triggered-By: GH:pablogsal.
(cherry picked from commit 417206a05c)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-12-01 00:57:04 -08:00
Miss Islington (bot)
f381644819
gh-99581: Fix a buffer overflow in the tokenizer when copying lines that fill the available buffer (GH-99605)
(cherry picked from commit e13d1d9dda)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-11-20 12:53:02 -08:00
Lysandros Nikolaou
152a437b8d
[3.11] gh-99211: Point to except/except* on syntax errors when mixing them (GH-99215) (GH-99622)
gh-99211: Point to except/except* on syntax errors when mixing them (GH-99215)

(cherry picked from commit 9c4232ae89)
2022-11-20 19:29:05 +01:00
Irit Katriel
d8a42bcaf0
[3.11] gh-99153: set location on SyntaxError for try with both except and except* (GH-99160) (#99168)
2022-11-07 09:41:20 +00:00
Nikita Sobolev
8c6ced36ab
[3.11] gh-96587: Raise SyntaxError for PEP654 on older feature_version (GH-96588) (#96591)
(cherry picked from commit 2c7d2e8d46)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
2022-10-05 15:00:13 -07:00
Miss Islington (bot)
f2d7fa8839
gh-96678: Fix UB of null pointer arithmetic (GH-96782)
Automerge-Triggered-By: GH:pablogsal
(cherry picked from commit 81e36f350b)

Co-authored-by: Matthias Görgens <matthias.goergens@gmail.com>
2022-09-13 08:03:40 -07:00
Miss Islington (bot)
ffafa9b91d
gh-96268: Fix loading invalid UTF-8 (GH-96270)
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8.

It also fixes an off-by-one error introduced in 3.10 for the line number when the tokenizer reports bad UTF8.
(cherry picked from commit 8bc356a7dd)

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
2022-09-07 14:49:17 -07:00
Miss Islington (bot)
bb0dab5c48
gh-96611: Fix error message for invalid UTF-8 in mid-multiline string (GH-96623)
(cherry picked from commit 05692c67c5)

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
2022-09-06 16:40:17 -07:00
Gregory P. Smith
f8b71da9aa
[3.11] gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96500)
Integer to and from text conversions via CPython's bignum `int` type are not safe against denial of service attacks due to malicious input. Very large input strings with hundreds of thousands of digits can consume several CPU seconds.

This PR comes fresh from a pile of work done in our private PSRT security response team repo.

This backports https://github.com/python/cpython/pull/96499 aka 511ca94520

Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviewed in the private PSRT repo by many others (see the NEWS entry in the PR).

I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#).
2022-09-02 09:48:57 -07:00
Shantanu
7fc8221794
[3.11] gh-94996: Disallow lambda pos only params with feature_version < (3, 8) (GH-95934) (GH-95936)
(cherry picked from commit a965db37f2)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>

Automerge-Triggered-By: GH:lysnikolaou
2022-08-12 12:41:09 -07:00
Miss Islington (bot)
4abf84602f
gh-94996: Disallow parsing pos only params with feature_version < (3, 8) (GH-94997)
(cherry picked from commit b5e3ea2862)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-08-12 10:53:09 -07:00
Miss Islington (bot)
1221e8c400
gh-95876: Fix format string in pegen error location code (GH-95877)
(cherry picked from commit b4c857d0fd)

Co-authored-by: Christian Heimes <christian@python.org>
2022-08-11 02:19:20 -07:00
Miss Islington (bot)
d3cc99bdce
gh-95355: Check tokens[0] after allocating memory (GH-95356)
GH-95355

Automerge-Triggered-By: GH:pablogsal
(cherry picked from commit b946f529ef)

Co-authored-by: Honglin Zhu <zhuhonglin.zhl@alibaba-inc.com>
2022-07-28 03:29:50 -07:00
Miss Islington (bot)
86eb500068
[3.11] gh-95185: Check recursion depth in the AST constructor (GH-95186) (GH-95208)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
(cherry picked from commit 0047447294)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-07-26 12:19:22 +02:00
Miss Islington (bot)
7733aa048e
gh-94949: Disallow parsing parenthesised ctx mgr with old feature_version (GH-94950)
* gh-94949: Disallow parsing parenthesised ctx manager with old feature_version

* 📜🤖 Added by blurb_it.

* Allow it with feature_version=(3, 9) as well

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
(cherry picked from commit 0daba82221)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-07-18 14:57:45 -07:00
Miss Islington (bot)
7dc236d116
gh-94947: Disallow parsing walrus with feature_version < (3, 8) (GH-94948)
* gh-94947: Disallow parsing walrus with feature_version < (3, 8)

* oops, commit the parser

* 📜🤖 Added by blurb_it.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
(cherry picked from commit ae0be5a53b)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-07-18 02:46:21 -07:00
Miss Islington (bot)
e121cb5814
gh-94869: Fix the location in some expressions for multi-line f-string ast nodes (GH-94895)
(cherry picked from commit 2e9da8e352)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-07-16 12:16:51 -07:00
Miss Islington (bot)
d49c99f10d
gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin (GH-94386)
* gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>

* nitty nit

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
(cherry picked from commit 36fcde61ba)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-07-05 10:09:51 -07:00
Miss Islington (bot)
442dd8ffa5
gh-94192: Fix error for dictionary literals with invalid expression as value. (GH-94304)
* Fix error for dictionary literals with invalid expression as value.

* Remove trailing whitespace
(cherry picked from commit 8c237a7a71)

Co-authored-by: wookie184 <wookie1840@gmail.com>
2022-06-26 12:07:02 -07:00
Pablo Galindo Salgado
65ed8b47ee
[3.11] gh-92858: Improve error message for some suites with syntax error before ':' (GH-92894) (#94180)
(cherry picked from commit 2fc83ac3af)

Co-authored-by: wookie184 <wookie1840@gmail.com>
2022-06-23 18:38:06 +01:00
Miss Islington (bot)
f9d0240db8
gh-93671: Avoid exponential backtracking in deeply nested sequence patterns in match statements (GH-93680)
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
(cherry picked from commit 53a8b17895)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-06-10 09:21:04 -07:00
Miss Islington (bot)
376d53771d
gh-93418: Fix an assert when an f-string expression is followed by an '=', but no closing brace. (gh-93419) (gh-93422)
(cherry picked from commit ee70c70aa9)

Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>
2022-06-01 21:04:27 -04:00
Miss Islington (bot)
b425d887aa
gh-92597: Ensure that AST nodes without explicit end positions can be compiled (GH-93359)
(cherry picked from commit 705eaec28f)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-05-31 16:26:16 -07:00
Miss Islington (bot)
7afccd34a6
gh-90473: Decrease recursion limit and skip tests on WASI (GH-92803)
(cherry picked from commit 137fd3d88a)

Co-authored-by: Christian Heimes <christian@python.org>
2022-05-19 08:05:52 -07:00
Victor Stinner
d716a0dfe2
Use static inline function Py_EnterRecursiveCall() (#91988)
Currently, calling Py_EnterRecursiveCall() and
Py_LeaveRecursiveCall() may use a function call or a static inline
function call, depending on whether the internal pycore_ceval.h header
file is included. Use a different name for the static inline
function to ensure that the static inline function is always used in
Python internals for best performance. This is the same approach as
PyThreadState_GET() (function call) and _PyThreadState_GET() (static
inline function).

* Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate()
* Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate()
* pycore_ceval.h: Rename Py_EnterRecursiveCall() to
  _Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() to
  _Py_LeaveRecursiveCall()
2022-05-04 13:30:23 +02:00
Serhiy Storchaka
3483299a24
gh-81548: Deprecate octal escape sequences with value larger than 0o377 (GH-91668)
2022-04-30 13:16:27 +03:00
Serhiy Storchaka
43a8bf1ea4
gh-87999: Change warning type for numeric literal followed by keyword (GH-91980)
The warning emitted by the Python parser for a numeric literal
immediately followed by a keyword has been changed from a deprecation
warning to a syntax warning.
2022-04-27 20:15:14 +03:00
Matthieu Dartiailh
aa0f056a00
bpo-47212: Improve error messages for un-parenthesized generator expressions (GH-32302)
2022-04-05 14:47:13 +01:00