cpython/Modules/_sre
Serhiy Storchaka fde4cf862c
gh-152033: Optimize category escapes outside character sets (GH-152035)
Character class escapes (``\d``, ``\D``, ``\s``, ``\S``, ``\w`` and
``\W``) that occur outside a character set are now compiled directly to a
single CATEGORY opcode instead of being wrapped in an IN block.  This
removes the IN wrapper (three code words) and an indirect charset() call,
and makes such an escape a simple repeatable unit so that, for example,
``\d+`` uses the REPEAT_ONE fast path; a CATEGORY case is added to
SRE(count).

The transformation preserves behaviour exactly.  For category-heavy
patterns the compiled byte code is about 20% smaller and matching is up
to ~2x faster, with no effect on patterns that do not use bare category
escapes.
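
The claim that behaviour is preserved can be spot-checked from Python. The
following is a minimal sketch, not part of the change itself: it assumes that
wrapping a category escape in a single-member set (``[\d]``) is a fair
stand-in for the old IN-wrapped form, and the names ``PAIRS``, ``SAMPLE``,
``spans`` and ``equivalent`` are hypothetical helpers introduced only for
illustration. It compares match spans and says nothing about compiled size
or speed.

```python
import re

# Each bare category escape paired with an equivalent single-member set.
# Assumption: the set form approximates the IN-wrapped compilation.
PAIRS = [
    (r"\d", r"[\d]"),
    (r"\D", r"[\D]"),
    (r"\s", r"[\s]"),
    (r"\S", r"[\S]"),
    (r"\w", r"[\w]"),
    (r"\W", r"[\W]"),
]

# Hypothetical sample text mixing digits, letters, whitespace and punctuation.
SAMPLE = "abc 123_x\t9\n-Z"


def spans(pattern, text):
    """Return the (start, end) span of every non-overlapping match."""
    return [m.span() for m in re.finditer(pattern, text)]


def equivalent(bare, wrapped, text=SAMPLE):
    """Check that the repeated bare escape and the repeated set agree."""
    return spans(bare + "+", text) == spans(wrapped + "+", text)


results = {bare: equivalent(bare, wrapped) for bare, wrapped in PAIRS}

if __name__ == "__main__":
    for bare, same in results.items():
        print(f"{bare}+ matches like the set form: {same}")
```

Every entry in ``results`` should be true on any interpreter, with or without
the optimization, because only the compiled form is expected to differ.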

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 08:49:14 +03:00
clinic | gh-86519: Add prefixmatch APIs to the re module (GH-31137) | 2026-02-15 17:43:39 -08:00
sre.c | gh-152033: Optimize category escapes outside character sets (GH-152035) | 2026-06-24 08:49:14 +03:00
sre.h | gh-67877: Fix memory leaks in terminated RE matching (GH-126840) | 2024-11-18 11:53:45 +02:00
sre_constants.h | gh-105687: Remove deprecated objects from re module (#105688) | 2023-06-14 12:26:20 +02:00
sre_lib.h | gh-152033: Optimize category escapes outside character sets (GH-152035) | 2026-06-24 08:49:14 +03:00
sre_targets.h | gh-97669: Create Tools/build/ directory (#97963) | 2022-10-17 12:01:00 +02:00