cpython/Tools/unicode
Pieter Eendebak 97dea30914
gh-150889: Improve performance of unicodedata.normalize() (GH-150890)
Scan the nfc_first/nfc_last reindex tables comparing only .start, range-check
the candidate once, and terminate on a sentinel above every codepoint, so each
entry costs a single comparison. ~2x faster on non-Latin and combining-heavy
NFC/NFKC input; no new data tables.
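
A minimal Python sketch of the scan described above, not the actual C code in unicodedata. It assumes each reindex entry covers the inclusive range start..start+count, that the table is sorted by start, and that it ends with a sentinel entry whose start is 0x110000 (one past the last codepoint). The names Reindex, SENTINEL_START and find_index, and the table values, are hypothetical.

```python
from typing import NamedTuple, Sequence


class Reindex(NamedTuple):
    start: int  # first codepoint of the run
    count: int  # run covers start..start+count inclusive (assumption)
    index: int  # index assigned to the first codepoint of the run


# Hypothetical sentinel: one past the largest codepoint (0x10FFFF), so the
# scan below always stops without a separate end-of-table check.
SENTINEL_START = 0x110000


def find_index(table: Sequence[Reindex], code: int) -> int:
    """Return the reindexed value for code, or -1 if no run covers it."""
    i = 0
    # One comparison per entry: only .start is looked at while scanning.
    while table[i].start <= code:
        i += 1
    if i == 0:
        return -1  # code lies below the first run
    cand = table[i - 1]  # the last run that starts at or before code
    delta = code - cand.start
    # Range-check the single candidate once, after the scan.
    return cand.index + delta if delta <= cand.count else -1


# Illustrative table only; these are not the real nfc_first/nfc_last values.
TABLE = [
    Reindex(0x3C, 2, 0),
    Reindex(0x41, 0, 3),
    Reindex(0x1100, 18, 4),
    Reindex(SENTINEL_START, 0, 0),
]

if __name__ == "__main__":
    for cp in (0x3D, 0x40, 0x41, 0x1105, 0x10FFFF):
        print(hex(cp), find_index(TABLE, cp))
```

Because the sentinel start exceeds every valid codepoint, the loop needs no bounds test, and the upper end of a run is checked only for the one candidate entry.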

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 11:34:33 +03:00
python-mappings Revert "gh-84508: Add mapping files for Korean and Japanese. (gh-93309)" (#93320) 2022-05-29 09:49:19 +09:00
comparecodecs.py
dawg.py gh-96954: use a directed acyclic word graph for storing the unicodedata codepoint names (#97906) 2023-11-04 15:56:58 +01:00
gencjkcodecs.py
gencodec.py
genmap_japanese.py Code: Update Donghee Na's name (#109744) 2023-09-25 18:17:34 +03:00
genmap_korean.py Code: Update Donghee Na's name (#109744) 2023-09-25 18:17:34 +03:00
genmap_schinese.py Code: Update Donghee Na's name (#109744) 2023-09-25 18:17:34 +03:00
genmap_support.py Code: Update Donghee Na's name (#109744) 2023-09-25 18:17:34 +03:00
genmap_tchinese.py gh-84508: tool to generate cjk traditional chinese mappings (gh-93272) 2022-06-11 23:19:41 +09:00
genwincodec.py
genwincodecs.bat
listcodecs.py
Makefile
makeunicodedata.py gh-150889: Improve performance of unicodedata.normalize() (GH-150890) 2026-06-06 11:34:33 +03:00
mkstringprep.py