cpython/Modules/_sre
Serhiy Storchaka 794b42ff8a
gh-95555: Support Unicode property escapes \p{...} in regular expressions (GH-151969)
Add support for \p{property} and \P{property} escapes in Unicode (str)
regular expressions, for the properties the engine can resolve without
the unicodedata database.  They are matched as CATEGORY opcodes or as
fixed sets of character ranges.
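
To illustrate what "fixed sets of character ranges" means, here is a minimal
Python sketch of a range-table membership test. It is not the code in sre.c:
the helper `in_ranges` and the table names are hypothetical, and the tables
only list code points taken from the Unicode definitions of ASCII_Hex_Digit,
Regional_Indicator and Join_Control.

```python
from bisect import bisect_right

# Hypothetical tables: sorted, non-overlapping, inclusive (low, high) ranges.
ASCII_HEX_DIGIT = [(0x30, 0x39), (0x41, 0x46), (0x61, 0x66)]  # 0-9, A-F, a-f
REGIONAL_INDICATOR = [(0x1F1E6, 0x1F1FF)]
JOIN_CONTROL = [(0x200C, 0x200D)]  # ZWNJ, ZWJ


def in_ranges(ch, ranges):
    """Return True if the code point of ch falls inside one of the ranges."""
    cp = ord(ch)
    starts = [low for low, _ in ranges]
    # Index of the last range whose start is <= cp, or -1 if there is none.
    i = bisect_right(starts, cp) - 1
    return i >= 0 and cp <= ranges[i][1]
```

A property backed by such a table needs no database lookup at match time,
only a binary search over a handful of integer pairs.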

Supported in this change: many General_Category values (the groups L, N,
Z, C and the values Lu, Lt, Lm, Nd, Nl, No, Zs, Zl, Zp, Cc, Cf, Cs, Co
and Cn); the binary properties Alphabetic, Lowercase, Uppercase, Numeric,
Printable, XID_Start, XID_Continue, Cased and Case_Ignorable; the POSIX
compatibility classes; the code-point classes ASCII, Any, Assigned,
Noncharacter_Code_Point, Join_Control, Pattern_Syntax and
Pattern_White_Space; and Regional_Indicator, ASCII_Hex_Digit and
Hex_Digit.

Property and value names use loose matching (UAX #44, rule UAX44-LM3):
case, whitespace, underscores and hyphens are ignored.  A property may be
spelled \p{Lu}, \p{gc=Lu} or, for a binary property, \p{name=yes}.
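
As an illustration of the UAX44-LM3 rule, the following Python sketch
normalizes a property or value name to a comparison key. UAX44-LM3 says to
ignore case, whitespace, underscores, hyphens and any initial "is" prefix.
The helper `loose_key` is hypothetical and is not claimed to mirror how
sre.c performs the lookup.

```python
def loose_key(name):
    """Reduce a property or value name to its UAX44-LM3 comparison key."""
    # Drop whitespace, underscores and hyphens, and fold case.
    key = "".join(
        ch for ch in name.lower() if not ch.isspace() and ch not in "_-"
    )
    # Drop an initial "is" prefix, as the rule allows.
    if key.startswith("is"):
        key = key[2:]
    return key
```

Under this sketch, two spellings name the same property when their keys are
equal, for example `loose_key("XID_Start") == loose_key("xid start")`.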

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 07:33:33 +03:00
clinic gh-86519: Add prefixmatch APIs to the re module (GH-31137) 2026-02-15 17:43:39 -08:00
sre.c gh-95555: Support Unicode property escapes \p{...} in regular expressions (GH-151969) 2026-06-26 07:33:33 +03:00
sre.h gh-67877: Fix memory leaks in terminated RE matching (GH-126840) 2024-11-18 11:53:45 +02:00
sre_constants.h gh-95555: Support Unicode property escapes \p{...} in regular expressions (GH-151969) 2026-06-26 07:33:33 +03:00
sre_lib.h gh-152033: Optimize category escapes outside character sets (GH-152035) 2026-06-24 08:49:14 +03:00
sre_targets.h gh-97669: Create Tools/build/ directory (#97963) 2022-10-17 12:01:00 +02:00