Stefan Krah 
								
							 
						 
						
							
							
							
							
								
							
							
								f432a3234f 
								
							 
						 
						
							
							
								
								bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157)
							
							 
							
							
							
						 
						
							2017-08-21 13:09:59 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								998c9cdd42 
								
							 
						 
						
							
							
								
								Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc.
							Patch by Xiang Zhang.
							
						 
						
							2016-10-30 18:25:27 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								1a05d6c04d 
								
							 
						 
						
							
							
								
								PEP 7 style for if/else in C
							Also add a newline for readability in normalize_encoding().
							
						 
						
							2016-09-02 12:12:23 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Raymond Hettinger 
								
							 
						 
						
							
							
							
							
								
							
							
								15f44ab043 
								
							 
						 
						
							
							
								
								Issue #27895: Spelling fixes (Contributed by Ville Skyttä).
							
							 
							
							
							
						 
						
							2016-08-30 10:47:49 -07:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								bcde10aa7e 
								
							 
						 
						
							
							
								
								Issue #26765: Ensure that bytes- and unicode-specific stringlib files are used with the correct type.
							
						 
						
							2016-05-16 09:42:29 +03:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								6bd525b656 
								
							 
						 
						
							
							
								
								Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual characters.
							Cleanup unicode_encode_ucs1():
							* Rename repunicode to rep
							* Clear rep object on error
							* Factorize code between bytes and unicode path
							
						 
						
							2015-10-09 13:10:05 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								ce179bf6ba 
								
							 
						 
						
							
							
								
								Add _PyBytesWriter_WriteBytes() to factorize the code  
							
							 
							
							
							
						 
						
							2015-10-09 12:57:22 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								ad7715891e 
								
							 
						 
						
							
							
								
								_PyBytesWriter: simplify code to avoid "prealloc" parameters
							Subtract preallocated bytes from min_size before calling _PyBytesWriter_Prepare().
							
						 
						
							2015-10-09 12:38:53 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								e7bf86cd7d 
								
							 
						 
						
							
							
								
								Optimize backslashreplace error handler
							Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in the UTF-8 encoder. Also optimize the backslashreplace error handler for the ASCII and Latin1 encoders.
							Use the new _PyBytesWriter API to optimize these error handlers for the encoders. This avoids creating an exception and calling the slow implementation of the error handler.
							
						 
						
							2015-10-09 01:39:28 +02:00  
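The user-visible behaviour of the two handlers named in this commit can be checked from Python. This is only a sketch of what the handlers produce, not of the C fast path itself; the `samples` table and `encode_with` helper are hypothetical names introduced for illustration, and a CPython 3 interpreter is assumed.

```python
# Sketch: what backslashreplace and xmlcharrefreplace emit for characters
# the target codec cannot encode. Sample strings are illustrative only.
samples = {
    "utf-8": "a\udc80",           # lone surrogate: not valid in strict UTF-8
    "ascii": "caf\xe9 \u20ac",    # e-acute and euro sign are not ASCII
    "latin-1": "caf\xe9 \u20ac",  # only the euro sign is outside Latin-1
}


def encode_with(handler):
    """Encode every sample with its codec using the given error handler."""
    return {codec: text.encode(codec, handler) for codec, text in samples.items()}


backslash = encode_with("backslashreplace")
xmlref = encode_with("xmlcharrefreplace")

for codec in samples:
    print(codec, backslash[codec], xmlref[codec])
```

Both handlers always produce pure-ASCII replacement text, which is the property the optimized path relies on according to the commit messages above.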
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								fdfbf78114 
								
							 
						 
						
							
							
								
								Issue #25318: Add _PyBytesWriter API
							Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation.
							Use the _PyBytesWriter API for the UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers.
							unicode_encode_ucs1(): initialize collend to collstart+1 to avoid checking the current character twice, since we already know that it is not ASCII.
							
						 
						
							2015-10-09 00:33:49 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								01ada3996b 
								
							 
						 
						
							
							
								
								Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
							Patch co-written with Serhiy Storchaka.
							
						 
						
							2015-10-01 21:54:51 +02:00  
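For reference, the four error handlers listed in issue #25267 differ only in what they emit for an unencodable character. The snippet below is a small illustration on a CPython 3 interpreter, using a lone surrogate as the unencodable input; it shows results only and says nothing about the speed-up itself. The variable names are hypothetical.

```python
# Sketch: output of the UTF-8 encoder for each handler named in the commit.
text = "a\udc80b"  # U+DC80 is a lone low surrogate, rejected by strict UTF-8

results = {
    handler: text.encode("utf-8", handler)
    for handler in ("ignore", "replace", "surrogateescape", "surrogatepass")
}

for handler, encoded in results.items():
    print(f"{handler:>16}: {encoded!r}")
```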
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								9ce71a6475 
								
							 
						 
						
							
							
								
								Fixed typos in comments.  
							
							 
							
							
							
						 
						
							2015-05-18 22:20:18 +03:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								7e29eea926 
								
							 
						 
						
							
							
								
								Fixed typos in comments.  
							
							 
							
							
							
						 
						
							2015-05-18 22:19:42 +03:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								0d4df752ac 
								
							 
						 
						
							
							
								
								Issue #15027: The UTF-32 encoder is now 3x to 7x faster.
							
							 
							
							
							
						 
						
							2015-05-12 23:12:45 +03:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								3079328d29 
								
							 
						 
						
							
							
								
								Reverted changeset b72c5573c5e7 (issue #15027).
							
							 
							
							
							
						 
						
							2014-01-04 22:44:01 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								583a93943c 
								
							 
						 
						
							
							
								
								Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster.
							
							 
							
							
							
						 
						
							2014-01-04 19:25:37 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								dc2fd5101a 
								
							 
						 
						
							
							
								
								Remove dead code committed in issue #12892.
							
							 
							
							
							
						 
						
							2013-11-19 15:56:05 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								58cf607d13 
								
							 
						 
						
							
							
								
								Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates.
							The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded.
							The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points.
							The surrogatepass error handler now works with the utf-16* and utf-32* codecs.
							Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
							
						 
						
							2013-11-19 11:32:41 +02:00  
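The behaviour described in issue #12892 can be observed from Python on an interpreter that includes this change. The following is a minimal sketch; the helper names (`encodes_strictly`, `decodes_strictly`) and the choice of U+D800 as the test code point are assumptions made for illustration.

```python
# Sketch: strict utf-16*/utf-32* codecs reject lone surrogates, while the
# surrogatepass error handler lets them through.
LONE_SURROGATE = "\ud800"
CODECS = ("utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")


def encodes_strictly(text, codec):
    """Return True if text encodes with the default 'strict' handler."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def decodes_strictly(data, codec):
    """Return True if data decodes with the default 'strict' handler."""
    try:
        data.decode(codec)
    except UnicodeDecodeError:
        return False
    return True


strict_ok = {codec: encodes_strictly(LONE_SURROGATE, codec) for codec in CODECS}
passed = {codec: LONE_SURROGATE.encode(codec, "surrogatepass") for codec in CODECS}

# Byte sequence that corresponds to U+D800 in UTF-32 little-endian.
utf32_surrogate_decodes = decodes_strictly(b"\x00\xd8\x00\x00", "utf-32-le")
round_trip = b"\x00\xd8\x00\x00".decode("utf-32-le", "surrogatepass")

print(strict_ok, passed, utf32_surrogate_decodes)
```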
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Antoine Pitrou 
								
							 
						 
						
							
							
							
							
								
							
							
								9ed5f27266 
								
							 
						 
						
							
							
								
								Issue #18722: Remove uses of the "register" keyword in C code.
							
							 
							
							
							
						 
						
							2013-08-13 20:18:52 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								6caa6fb535 
								
							 
						 
						
							
							
								
								(Merge 3.3) Issue #8271: Fix compilation on Windows
							
							 
							
							
							
						 
						
							2012-11-05 00:00:50 +01:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								ab60de478d 
								
							 
						 
						
							
							
								
								Issue #8271: Fix compilation on Windows
							
							 
							
							
							
						 
						
							2012-11-04 23:59:15 +01:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								cfa9636404 
								
							 
						 
						
							
							
								
								#8271: merge with 3.3.
							
							 
							
							
							
						 
						
							2012-11-04 23:23:09 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ezio Melotti 
								
							 
						 
						
							
							
							
							
								
							
							
								f7ed5d111b 
								
							 
						 
						
							
							
								
								#8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. Patch by Serhiy Storchaka, tests by Ezio Melotti.
							
							 
							
							
							
						 
						
							2012-11-04 23:21:38 +02:00  
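To make the #8271 change concrete, the sketch below decodes a few malformed byte strings with the "replace" handler and counts the U+FFFD characters produced. The inputs are hand-picked examples, not taken from the commit's test suite, and an interpreter that includes this fix is assumed.

```python
# Sketch: number of U+FFFD replacement characters per invalid UTF-8 input.
REPLACEMENT = "\ufffd"

malformed = [
    b"\xc0\xaf",      # 0xC0 is never a valid start byte; 0xAF is a stray continuation
    b"\xe2\x28\xa1",  # 3-byte lead followed by '(' instead of a continuation byte
    b"\xf0\x90A",     # truncated 4-byte sequence followed by ASCII 'A'
]

decoded = {raw: raw.decode("utf-8", "replace") for raw in malformed}
counts = {raw: text.count(REPLACEMENT) for raw, text in decoded.items()}

for raw in malformed:
    print(raw, ascii(decoded[raw]), counts[raw])
```

In the last case the two bytes that form a valid prefix of a longer sequence are replaced by a single U+FFFD, and decoding resumes at the ASCII byte that follows.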
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Christian Heimes 
								
							 
						 
						
							
							
							
							
								
							
							
								743e0cd6b5 
								
							 
						 
						
							
							
								
								Issue #16166: Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified endianness detection and handling.
							
						 
						
							2012-10-17 23:52:17 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Antoine Pitrou 
								
							 
						 
						
							
							
							
							
								
							
							
								ca8aa4acf6 
								
							 
						 
						
							
							
								
								Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t.
							Patch by Serhiy Storchaka.
							
						 
						
							2012-09-20 20:56:47 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Mark Dickinson 
								
							 
						 
						
							
							
							
							
								
							
							
								01ac8b6ab1 
								
							 
						 
						
							
							
								
								Use correct types for ASCII_CHAR_MASK integer constants.  
							
							 
							
							
							
						 
						
							2012-07-07 14:08:48 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Mark Dickinson 
								
							 
						 
						
							
							
							
							
								
							
							
								106c4145ff 
								
							 
						 
						
							
							
								
								Issue #14923: Optimize continuation-byte check in UTF-8 decoding. Patch by Serhiy Storchaka.
							
							 
							
							
							
						 
						
							2012-06-23 21:45:14 +01:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Antoine Pitrou 
								
							 
						 
						
							
							
							
							
								
							
							
								27f6a3b0bf 
								
							 
						 
						
							
							
								
								Issue #15026: utf-16 encoding is now significantly faster (up to 10x).
							Patch by Serhiy Storchaka.
							
						 
						
							2012-06-15 22:15:23 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Antoine Pitrou 
								
							 
						 
						
							
							
							
							
								
							
							
								63065d761e 
								
							 
						 
						
							
							
								
								Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs.
							Patch by Serhiy Storchaka.
							
						 
						
							2012-05-15 23:48:04 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Antoine Pitrou 
								
							 
						 
						
							
							
							
							
								
							
							
								ca5f91b888 
								
							 
						 
						
							
							
								
								Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka.
							
							 
							
							
							
						 
						
							2012-05-10 16:36:02 +02:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								6099a03202 
								
							 
						 
						
							
							
								
								Issue #13624: Write a specialized UTF-8 encoder to allow more optimization
							The main bottleneck was the PyUnicode_READ() macro.
							
						 
						
							2011-12-18 14:22:26 +01:00  
						
						
							 
							
							
							
								 
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Antoine Pitrou 
								
							 
						 
						
							
							
							
							
								
							
							
								0a3229de6b 
								
							 
						 
						
							
							
								
								Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case.
							This almost catches up with pre-PEP 393 performance, when decoding needed only one pass.
							
						 
						
							2011-11-21 20:39:13 +01:00