sobolevn 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f223efb2a2 
								
							 
						 
						
							
							
								
								gh-126525: Fix makeunicodedata.py output on macOS and Windows (#126526)
							
							
							
						 
						
							2024-11-12 13:23:57 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bb904e063d 
								
							 
						 
						
							
							
								
								closes gh-124016: update Unicode to 16.0.0 (#124017)
							
							
							
						 
						
							2024-09-13 07:47:04 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									CF Bolz-Tereick 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9573d14215 
								
							 
						 
						
							
							
								
								gh-96954: use a directed acyclic word graph for storing the unicodedata codepoint names (#97906)
							
							
							
							
							Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com> 
							
						 
						
							2023-11-04 15:56:58 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									James Gerity 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								def828995a 
								
							 
						 
						
							
							
								
								fixes gh-109559: Update unicodedata for Unicode 15.1.0 (GH-109560)  
							
							
							
							
							---------
Co-authored-by: Benjamin Peterson <benjamin@python.org> 
							
						 
						
							2023-09-19 22:07:47 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									LiarPrincess 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0c1d7a06ed 
								
							 
						 
						
							
							
								
								bpo-47243: Duplicate entry in 'Objects/unicodetype_db.h' (GH-32376)  
							
							
							
							
							Fix for duplicate 1st entry in 'Objects/unicodetype_db.h':
```c
/* a list of unique character type descriptors */
const _PyUnicode_TypeRecord _PyUnicode_TypeRecords[] = {
    {0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 0}, <--- HERE
    {0, 0, 0, 0, 0, 32},
    {0, 0, 0, 0, 0, 48},
    …
```
https://bugs.python.org/issue47243 
Automerge-Triggered-By: GH:isidentical 
							
						 
						
							2022-09-28 06:57:14 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fd1e477f53 
								
							 
						 
						
							
							
								
								closes gh-96734: Update to Unicode 15.0.0. (GH-96809)  
							
							
							
						 
						
							2022-09-13 15:45:12 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Carl Friedrich Bolz-Tereick 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9c197bc8bf 
								
							 
						 
						
							
							
								
								GH-96172 fix unicodedata.east_asian_width being wrong on unassigned code points (#96207)
							
							
							
						 
						
							2022-08-26 19:29:39 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Carl Friedrich Bolz-Tereick 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2d9f252c0c 
								
							 
						 
						
							
							
								
								gh-96019: Fix caching of decompositions in makeunicodedata (GH-96020)  
							
							
							
						 
						
							2022-08-19 12:20:44 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								024fda47d4 
								
							 
						 
						
							
							
								
								closes bpo-45190: Update Unicode data to version 14.0.0. (GH-28336)  
							
							
							
						 
						
							2021-09-14 11:00:38 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								51796e5d26 
								
							 
						 
						
							
							
								
								Update some www.unicode.org URLs to use HTTPS. (GH-18912)  
							
							
							
						 
						
							2020-03-10 21:10:59 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								051b9d08d1 
								
							 
						 
						
							
							
								
								closes bpo-39926: Update Unicode to 13.0.0. (GH-18910)  
							
							
							
						 
						
							2020-03-10 20:41:34 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Greg Price 
								
							 
						 
						
							
							
							
							
								
							
							
								a65678c5c9 
								
							 
						 
						
							
							
								
								bpo-37760: Convert from length-18 lists to a dataclass, in makeunicodedata. (GH-15265)  
							
							
							
							
							Now the fields have names!  Much easier to keep straight as a
reader than the elements of an 18-tuple.
Runs about 10-15% slower: from 10.8s to 12.3s, on my laptop.
Fortunately that's perfectly fine for this maintenance script. 
							
						 
						
							2019-09-12 10:23:43 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Greg Price 
								
							 
						 
						
							
							
							
							
								
							
							
								3e4498d35c 
								
							 
						 
						
							
							
								
								bpo-37760: Avoid cluttering work tree with downloaded Unicode files. (GH-15128)  
							
							
							
						 
						
							2019-08-14 18:18:53 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Greg Price 
								
							 
						 
						
							
							
							
							
								
							
							
								c03e698c34 
								
							 
						 
						
							
							
								
								bpo-37760: Factor out standard range-expanding logic in makeunicodedata. (GH-15248)  
							
							
							
							
							Much like the lower-level logic in commit ef2af1ad4 
							
						 
						
							2019-08-13 19:28:38 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Greg Price 
								
							 
						 
						
							
							
							
							
								
							
							
								99d208efed 
								
							 
						 
						
							
							
								
								bpo-37760: Constant-fold some old options in makeunicodedata. (GH-15129)  
							
							
							
							
							The `expand` option was introduced in 2000 in commit fad27aee1 
							
						 
						
							2019-08-12 22:59:30 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Greg Price 
								
							 
						 
						
							
							
							
							
								
							
							
								ef2af1ad44 
								
							 
						 
						
							
							
								
								bpo-37760: Factor out the basic UCD parsing logic of makeunicodedata. (GH-15130)  
							
							
							
							
							There were 10 copies of this, and almost as many distinct versions of
exactly how it was written.  They're all implementing the same
standard.  Pull them out to the top, so the more interesting logic
that remains becomes easier to read. 
							
						 
						
							2019-08-12 22:20:56 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Stefan Behnel 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								faa2948654 
								
							 
						 
						
							
							
								
								Clean up and reduce visual clutter in the makeunicode.py script. (GH-7558)  
							
							
							
						 
						
							2019-06-01 21:49:03 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3aca40d3cb 
								
							 
						 
						
							
							
								
								closes bpo-36861: Update Unicode database to 12.1.0. (GH-13214)  
							
							
							
							
							Adds ㋿. 
							
						 
						
							2019-05-08 20:59:35 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Inada Naoki 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6fec905de5 
								
							 
						 
						
							
							
								
								bpo-36642: make unicodedata const (GH-12855)  
							
							
							
						 
						
							2019-04-17 08:40:34 +09:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								738c19f4c5 
								
							 
						 
						
							
							
								
								closes bpo-33376: Update to Unicode 12.0.0. (GH-12256)  
							
							
							
						 
						
							2019-03-09 16:25:55 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7c69c1c0fb 
								
							 
						 
						
							
							
								
								update to Unicode 11.0.0 (closes bpo-33778) (GH-7439)  
							
							
							
							
							Also, standardize indentation of generated tables. 
							
						 
						
							2018-06-06 20:14:28 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								279a96206f 
								
							 
						 
						
							
							
								
								bpo-30736: upgrade to Unicode 10.0 (#2344)
							
							
							
							
							Straightforward. While we're at it, though, strip trailing whitespace from generated tables. 
							
						 
						
							2017-06-22 22:31:08 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Jon Dufresne 
								
							 
						 
						
							
							
							
							
								
							
							
								3972628de3 
								
							 
						 
						
							
							
								
								bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489)
							
							
							
							
							* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator
  expression>) when supported (e.g. any(), all(), tuple(), min(), &
  max()) 
							
						 
						
							2017-05-18 07:35:54 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								6775231597 
								
							 
						 
						
							
							
								
								Unicode 9.0.0  
							
							
							
							
							Not completely mechanical since support for East Asian Width changes—emoji
codepoints became Wide—had to be added to unicodedata. 
							
						 
						
							2016-09-14 23:53:47 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								4801383c29 
								
							 
						 
						
							
							
								
								upgrade to Unicode 8.0.0  
							
							
							
						 
						
							2015-06-27 15:45:56 -05:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									R David Murray 
								
							 
						 
						
							
							
							
							
								
							
							
								2623a5db6f 
								
							 
						 
						
							
							
								
								Merge: #18176: Change generic UCD PropList link to version specific link.
							
							
							
						 
						
							2014-10-09 20:47:31 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									R David Murray 
								
							 
						 
						
							
							
							
							
								
							
							
								5f16f90d1b 
								
							 
						 
						
							
							
								
								#18176: Change generic UCD PropList link to version specific link.
							
							
							
						 
						
							2014-10-09 20:45:59 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									R David Murray 
								
							 
						 
						
							
							
							
							
								
							
							
								532783bd5e 
								
							 
						 
						
							
							
								
								Merge: #18176: fix another reference and add it to the makeunicodedata comment.
							
							
							
						 
						
							2014-10-09 17:41:55 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									R David Murray 
								
							 
						 
						
							
							
							
							
								
							
							
								5bd62420f4 
								
							 
						 
						
							
							
								
								#18176: fix another reference and add it to the makeunicodedata comment.
							
							
							
						 
						
							2014-10-09 17:39:48 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									R David Murray 
								
							 
						 
						
							
							
							
							
								
							
							
								5ac125cde3 
								
							 
						 
						
							
							
								
								Merge: #18176: updated stdtypes UCD link, added reminder to makeunicodedata.
							
							
							
						 
						
							2014-10-09 17:33:15 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									R David Murray 
								
							 
						 
						
							
							
							
							
								
							
							
								7445a383a6 
								
							 
						 
						
							
							
								
								#18176: updated stdtypes UCD link, added reminder to makeunicodedata.
							
							
							
							
							Patch by Alexander Belopolsky. 
							
						 
						
							2014-10-09 17:30:33 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								3032ed7cb1 
								
							 
						 
						
							
							
								
								upgrade to unicode 7.0.0  
							
							
							
						 
						
							2014-07-06 13:04:20 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								94d08d908b 
								
							 
						 
						
							
							
								
								upgrade unicode db to 6.3.0 (closes #19221)
							
							
							
						 
						
							2013-10-10 17:24:45 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								d640fe2af5 
								
							 
						 
						
							
							
								
								#18803: merge with 3.3.
							
							
							
						 
						
							2013-08-26 01:33:30 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								7c4a7e6f3c 
								
							 
						 
						
							
							
								
								#18803: fix more typos.  Patch by Févry Thibault.
							
							
							
						 
						
							2013-08-26 01:32:56 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Antoine Pitrou 
								
							 
						 
						
							
							
							
							
								
							
							
								9ed5f27266 
								
							 
						 
						
							
							
								
								Issue #18722: Remove uses of the "register" keyword in C code.
							
							
							
						 
						
							2013-08-13 20:18:52 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								b8350f1c7d 
								
							 
						 
						
							
							
								
								upgrade to UCD 6.2  
							
							
							
						 
						
							2012-09-29 13:47:39 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Florent Xicluna 
								
							 
						 
						
							
							
							
							
								
							
							
								c20740109d 
								
							 
						 
						
							
							
								
								Some cleanup in the Tools directory.  
							
							
							
						 
						
							2012-07-07 17:03:54 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								71f660e00f 
								
							 
						 
						
							
							
								
								update to Unicode 6.1  
							
							
							
						 
						
							2012-02-20 22:24:29 -05:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								ad9c569825 
								
							 
						 
						
							
							
								
								delta encoding of upper/lower/title makes a glorious return (#12736)
							
							
							
						 
						
							2012-01-15 21:19:20 -05:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								d5890c8db5 
								
							 
						 
						
							
							
								
								add str.casefold() (closes #13752)
							
							
							
						 
						
							2012-01-14 13:23:30 -05:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Peterson 
								
							 
						 
						
							
							
							
							
								
							
							
								b2bf01d824 
								
							 
						 
						
							
							
								
								use full unicode mappings for upper/lower/title case (#12736)
							
							
							
							
							Also broaden the category of characters that count as lowercase/uppercase. 
							
						 
						
							2012-01-11 18:17:06 -05:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								931b8aac80 
								
							 
						 
						
							
							
								
								#12753: Add support for Unicode name aliases and named sequences.
							
							
							
						 
						
							2011-10-21 21:57:36 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								2a1e926d63 
								
							 
						 
						
							
							
								
								Fix ResourceWarnings in makeunicodedata.py.  
							
							
							
						 
						
							2011-09-30 08:46:25 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								3b3499ba69 
								
							 
						 
						
							
							
								
								#11565: Merge with 3.1.
							
							
							
						 
						
							2011-03-16 11:35:38 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								13925008dc 
								
							 
						 
						
							
							
								
								#11565: Fix several typos. Patch by Piotr Kasprzyk.
							
							
							
						 
						
							2011-03-16 11:05:33 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Martin v. Löwis 
								
							 
						 
						
							
							
							
							
								
							
							
								5cbc71e50a 
								
							 
						 
						
							
							
								
								Issue #10459: Update CJK character names to Unicode 6.0.
							
							
							
						 
						
							2010-11-22 09:00:02 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Martin v. Löwis 
								
							 
						 
						
							
							
							
							
								
							
							
								baecd7243a 
								
							 
						 
						
							
							
								
								Upgrade to Unicode 6.0.0.  
							
							
							
							
							makeunicodedata.py: download all data files from unicode.org,
  switch to extracting Unihan data from zip file.
  Read linebreakprops and derivednormalizationprops even for
  old versions, even though they are not used in delta records.
test_unicode.py: U+11000 is now assigned, use U+14000 instead. 
							
						 
						
							2010-10-11 22:42:28 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Amaury Forgeot d'Arc 
								
							 
						 
						
							
							
							
							
								
							
							
								feb7307db4 
								
							 
						 
						
							
							
								
								#9210: remove --with-wctype-functions configure option.
							
							
							
							
							The internal unicode database is now always used.
(after 5 years: see
  http://mail.python.org/pipermail/python-dev/2004-December/050193.html)
							
						 
						
							2010-09-12 22:42:57 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Amaury Forgeot d'Arc 
								
							 
						 
						
							
							
							
							
								
							
							
								324ac65ceb 
								
							 
						 
						
							
							
								
								#5127: Even on narrow unicode builds, the C functions that access the Unicode
							
							
							
							
							Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).
The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
  now return the correct value for large code points
- repr() may consider more characters as printable. 
							
						 
						
							2010-08-18 20:44:58 +00:00