cpython/Tools/unicode
Pieter Eendebak 97dea30914
gh-150889: Improve performance of unicodedata.normalize() (GH-150890)
Scan the nfc_first/nfc_last reindex tables comparing only .start, range-check
the candidate once, and terminate on a sentinel above every codepoint, so each
entry costs a single comparison. ~2x faster on non-Latin and combining-heavy
NFC/NFKC input; no new data tables.
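
The scan described above can be sketched in Python as follows. This is an illustration only, not the C code from the commit: the `Reindex` field layout (`start`, `count`, `index`), the names `SENTINEL`, `DEMO_TABLE` and `find_index`, and the demo values are all assumptions made for the example. It assumes the table is sorted by `start` and ends with an entry whose `start` is above every code point.

```python
from typing import NamedTuple, Sequence


class Reindex(NamedTuple):
    """Hypothetical table entry: a run of code points start..start+count."""
    start: int   # first code point of the run
    count: int   # number of further code points in the run
    index: int   # index assigned to `start`; later ones follow consecutively


# One past the largest Unicode code point, so the scan always stops here.
SENTINEL = 0x110000

# Made-up demo data, sorted by `start` and terminated by the sentinel entry.
DEMO_TABLE = [
    Reindex(0x3C, 2, 0),
    Reindex(0x41, 0, 3),
    Reindex(0x300, 4, 4),
    Reindex(SENTINEL, 0, 0),
]


def find_index(table: Sequence[Reindex], code: int) -> int:
    """Return the reindexed value for `code`, or -1 if no run contains it."""
    i = 0
    # One comparison per entry: skip every run that starts at or before `code`.
    # The sentinel guarantees termination without a separate bounds check.
    while table[i].start <= code:
        i += 1
    if i == 0:
        return -1  # `code` lies below the first run
    # Range-check only the single candidate run, once.
    entry = table[i - 1]
    if code <= entry.start + entry.count:
        return entry.index + (code - entry.start)
    return -1
```

With the demo table, `find_index(DEMO_TABLE, 0x3D)` falls in the first run and `find_index(DEMO_TABLE, 0x40)` falls in the gap between runs, so the latter returns -1.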

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 11:34:33 +03:00
..
python-mappings
comparecodecs.py
dawg.py gh-96954: use a directed acyclic word graph for storing the unicodedata codepoint names (#97906) 2023-11-04 15:56:58 +01:00
gencjkcodecs.py
gencodec.py
genmap_japanese.py
genmap_korean.py
genmap_schinese.py
genmap_support.py
genmap_tchinese.py
genwincodec.py
genwincodecs.bat
listcodecs.py
Makefile
makeunicodedata.py gh-150889: Improve performance of unicodedata.normalize() (GH-150890) 2026-06-06 11:34:33 +03:00
mkstringprep.py