Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								170ca6f84b 
								
							 
						 
						
							
							
								
								Fix bug in Unicode decoders related to _PyUnicodeWriter

							Bug introduced by changesets 7ed9993d53b4 and edf029fc9591.
							
						 
						
							2013-04-18 00:25:28 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								376cfa122d 
								
							 
						 
						
							
							
								
								Fix typo in unicode_decode_call_errorhandler_writer()

							Bug introduced by changeset 7ed9993d53b4.
							
						 
						
							2013-04-17 23:58:16 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								8f674ccd64 
								
							 
						 
						
							
							
								
								Close #17694: Add minimum length to _PyUnicodeWriter

							* Also add a min_char attribute to the _PyUnicodeWriter structure (currently unused)
							* _PyUnicodeWriter_Init() no longer takes any argument except the writer itself:
							  min_length and overallocate must be set explicitly
							* In error handlers, only enable overallocation if the replacement string
							  is longer than 1 character
							* CJK decoders don't use overallocation anymore
							* Set min_length, instead of preallocating memory using
							  _PyUnicodeWriter_Prepare(), in many decoders
							* _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
							
						 
						
							2013-04-17 23:02:17 +02:00 
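The interplay between min_length and overallocate described in this commit can be pictured with a small model. This is only a sketch in Python, not CPython's C implementation: the class name `SketchWriter`, its methods, and the 25% growth factor are all assumptions made for illustration. The one thing taken from the commit message is that initialization takes no sizing arguments and the caller sets the two fields explicitly.

```python
class SketchWriter:
    """Toy model of a growable text buffer; not CPython's _PyUnicodeWriter."""

    def __init__(self):
        # As in the commit message: no sizing arguments at init time,
        # the caller sets these two fields explicitly afterwards.
        self.min_length = 0
        self.overallocate = False
        self.buffer = []    # characters written so far
        self.capacity = 0   # number of character slots reserved

    def prepare(self, extra):
        """Make sure `extra` more characters fit, growing if needed."""
        needed = len(self.buffer) + extra
        if needed <= self.capacity:
            return
        new_capacity = max(needed, self.min_length)
        if self.overallocate:
            # Hypothetical growth factor: reserve 25% more than requested.
            new_capacity += new_capacity // 4
        self.capacity = new_capacity

    def write(self, text):
        self.prepare(len(text))
        self.buffer.extend(text)

    def finish(self):
        return "".join(self.buffer)


# A decoder that knows roughly how long its output will be sets min_length
# once, so the first write already reserves the whole buffer.
writer = SketchWriter()
writer.min_length = 16
writer.write("abc")
result = writer.finish()

# A caller that cannot predict the size turns on overallocation instead.
growing = SketchWriter()
growing.overallocate = True
growing.write("abcdefgh")
```

In this model, setting min_length replaces an explicit up-front call to a prepare function, which is the shift the commit message describes for many decoders.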
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								77282cb4f8 
								
							 
						 
						
							
							
								
								Cleanup PyUnicode_Contains()

							* No need to double-check that strings are ready: test already done by
							  PyUnicode_FromObject()
							* Remove useless kind variable (use kind1 instead)
							
						 
						
							2013-04-14 19:22:47 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								d92e078c8d 
								
							 
						 
						
							
							
								
								Minor change: fix character in do_strip() for the ASCII case  
							
							
							
						 
						
							2013-04-14 19:17:42 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								f033510fee 
								
							 
						 
						
							
							
								
								Cleanup PyUnicode_Append()

							* Also check that right is a Unicode object
							* Call resize_compact() directly instead of unicode_resize(), for more
							  explicit error handling and to avoid testing some properties twice
							  (e.g. unicode_modifiable())
							
						 
						
							2013-04-14 19:13:03 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								4560f9c63f 
								
							 
						 
						
							
							
								
								PyUnicode_Join(): move use_memcpy test out of the loop to cleanup and optimize the code  
							
							
							
						 
						
							2013-04-14 18:56:46 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								55c08781e8 
								
							 
						 
						
							
							
								
								Optimize repr(str): use _PyUnicode_FastCopyCharacters() when no character is escaped  
							
							
							
						 
						
							2013-04-14 18:45:39 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								af03757d20 
								
							 
						 
						
							
							
								
								Optimize ascii(str): don't encode/decode repr if repr is already ASCII  
							
							
							
						 
						
							2013-04-14 18:44:10 +02:00 
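The behaviour behind this optimization can be modelled at the Python level. The sketch below is an assumption-labelled illustration, not the C change itself: `ascii_sketch` is a hypothetical helper that returns the repr unchanged when it is already ASCII, and only otherwise takes the encode/decode round trip with the `backslashreplace` error handler.

```python
def ascii_sketch(obj):
    """Behavioural model of ascii(); the real optimization lives in C."""
    text = repr(obj)
    if text.isascii():
        # Fast path: nothing needs escaping, so return the repr as-is.
        return text
    # Slow path: escape non-ASCII characters, then convert back to str.
    return text.encode("ascii", "backslashreplace").decode("ascii")


plain = ascii_sketch("abc")        # takes the fast path
escaped = ascii_sketch("\u20ac")   # takes the slow path
```

`str.isascii()` requires Python 3.7 or later; on older versions the check would have to be written differently.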
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								8a1a6cffd6 
								
							 
						 
						
							
							
								
								Add _PyUnicodeWriter_WriteCharInline()  
							
							
							
						 
						
							2013-04-14 02:35:33 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								e2cef885a2 
								
							 
						 
						
							
							
								
								Issue #16061: Speed up str.replace() for replacing 1-character strings.
							
							
							
						 
						
							2013-04-13 22:45:04 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								a0dd0213cc 
								
							 
						 
						
							
							
								
								Close #17693: Rewrite CJK decoders to use the _PyUnicodeWriter API instead of the legacy Py_UNICODE API.

							Also add a new _PyUnicodeWriter_WriteChar() function.
							
						 
						
							2013-04-11 22:09:04 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								247109e74d 
								
							 
						 
						
							
							
								
								Issue #17615: On Windows (VS2010), the performance of wmemcmp() for comparing Unicode strings is not convincing.

							For UCS2 (16-bit wchar_t type), use a dummy loop instead of wmemcmp().
							The dummy loop is as fast, or a little bit faster. wchar_t is only
							16 bits wide on Windows. wmemcmp() is still used for 32-bit wchar_t.
							
						 
						
							2013-04-09 23:53:26 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								0cff4b16d9 
								
							 
						 
						
							
							
								
								replace(): only call PyUnicode_DATA(u) once  
							
							
							
						 
						
							2013-04-09 22:52:48 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								cc7af72192 
								
							 
						 
						
							
							
								
								Write super-fast version of str.strip(), str.lstrip() and str.rstrip() for pure ASCII  
							
							
							
						 
						
							2013-04-09 22:39:24 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								f50a4e9bc9 
								
							 
						 
						
							
							
								
								Don't call macros in PyUnicode_WRITE() parameters

							PyUnicode_WRITE() expands some parameters twice or more.
							
						 
						
							2013-04-09 22:38:52 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								9c79e41fc5 
								
							 
						 
						
							
							
								
								Fix do_strip(): don't call PyUnicode_READ() inside Py_UNICODE_ISSPACE(), to avoid calling it twice
							
						 
						
							2013-04-09 22:21:08 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								b3a6014504 
								
							 
						 
						
							
							
								
								Fix _PyUnicode_XStrip()

							Inline BLOOM_MEMBER() to call PyUnicode_READ() only once per loop
							iteration. Also store the length of the separator in a variable to avoid
							calls to PyUnicode_GET_LENGTH().
							
						 
						
							2013-04-09 22:19:21 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								63d5c1a14a 
								
							 
						 
						
							
							
								
								Optimize PyUnicode_DecodeCharmap()

							Avoid expensive PyUnicode_READ() and PyUnicode_WRITE(), manipulate pointers
							instead.
							
						 
						
							2013-04-09 22:13:33 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								a85af502a4 
								
							 
						 
						
							
							
								
								Optimize make_bloom_mask(), used by str.strip(), str.lstrip() and str.rstrip()

							Write specialized functions per Unicode kind to avoid the expensive
							PyUnicode_READ() macro.
							
						 
						
							2013-04-09 21:53:54 +02:00 
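For readers unfamiliar with the bloom mask used by the strip functions, the idea can be sketched in Python. This is a simplified model under stated assumptions, not the C code touched by the commit: the 64-bit width and the helper names `maybe_member` and `strip_sketch` are chosen here for illustration. Each character of the strip set sets one bit in an integer, and a cheap bit test rules out most non-members before the exact membership check runs.

```python
BLOOM_WIDTH = 64  # assumed mask width in bits for this sketch


def make_bloom_mask(chars):
    """Set one bit per character, indexed by its code point modulo the width."""
    mask = 0
    for ch in chars:
        mask |= 1 << (ord(ch) % BLOOM_WIDTH)
    return mask


def maybe_member(mask, ch):
    """False means 'definitely absent'; True means 'possibly present'."""
    return bool(mask & (1 << (ord(ch) % BLOOM_WIDTH)))


def strip_sketch(text, chars):
    """Model of str.strip(chars) using the mask as a pre-filter."""
    mask = make_bloom_mask(chars)

    def in_set(ch):
        # Cheap bit test first; the exact check only runs on a possible hit.
        return maybe_member(mask, ch) and ch in chars

    start, end = 0, len(text)
    while start < end and in_set(text[start]):
        start += 1
    while end > start and in_set(text[end - 1]):
        end -= 1
    return text[start:end]


stripped = strip_sketch("xxhixx", "x")
```

The mask can report false positives (two code points 64 apart share a bit), which is why the exact check is still needed after a hit.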
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								69ed0f4c86 
								
							 
						 
						
							
							
								
								Use PyUnicode_READ() instead of PyUnicode_READ_CHAR()

							"PyUnicode_READ_CHAR() is less efficient than PyUnicode_READ() because it calls
							PyUnicode_KIND() and might call it twice." according to its documentation.
							
						 
						
							2013-04-09 21:48:24 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								03c3e35d42 
								
							 
						 
						
							
							
								
								Add a fast path in PyUnicode_DecodeCharmap() for pure 8-bit encodings: the cp037, cp500 and iso8859_1 codecs
							
						 
						
							2013-04-09 21:53:09 +02:00 
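Charmap decoding itself can be tried from Python. The sketch below uses the standard `codecs.charmap_decode()` function with the decoding table shipped in the `encodings.cp037` module. The helper `is_pure_8bit` is hypothetical, and its definition (every decoded character lies below U+0100) is an assumed reading of "pure 8 bit" in the commit message, not something the message states.

```python
import codecs
import encodings.cp037
import encodings.cp1252
import encodings.iso8859_1


def is_pure_8bit(table):
    """Assumed meaning of 'pure 8 bit': every decoded character is < U+0100."""
    return all(ord(ch) < 0x100 for ch in table)


# The decoding table is a str with one entry per possible byte value.
table = encodings.cp037.decoding_table
data = "Hello".encode("cp037")

# charmap_decode() looks each input byte up in the table.
decoded, consumed = codecs.charmap_decode(data, "strict", table)

latin1_is_8bit = is_pure_8bit(encodings.iso8859_1.decoding_table)
cp1252_is_8bit = is_pure_8bit(encodings.cp1252.decoding_table)
```

Under that reading, a table whose output never leaves the one-byte range lets the decoder write one output byte per input byte, which is presumably what makes a dedicated fast path worthwhile.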
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								cd777eaf53 
								
							 
						 
						
							
							
								
								Issue #17615: Comparing two Unicode strings now uses wmemcmp() when possible

							wmemcmp() is twice as fast as a dummy loop (342 usec vs 744 usec) on Fedora
							18/x86_64, GCC 4.7.2.
							
						 
						
							2013-04-08 22:43:44 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								c1302bba4c 
								
							 
						 
						
							
							
								
								Issue #17615: Expand expensive PyUnicode_READ() macro in unicode_compare(): write specialized functions for each combination of Unicode kinds.
							
						 
						
							2013-04-08 21:50:54 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								207dd38726 
								
							 
						 
						
							
							
								
								fix unused variable  
							
							
							
						 
						
							2013-04-03 03:14:58 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								eb4b5ac8af 
								
							 
						 
						
							
							
								
								Close #16757: Avoid calling the expensive _PyUnicode_FindMaxChar() function when possible
							
						 
						
							2013-04-03 02:02:33 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								cfc4c13b04 
								
							 
						 
						
							
							
								
								Add _PyUnicodeWriter_WriteSubstring() function

							Write a function to enable more optimizations:
							* If the substring is the whole string and overallocation is disabled, just
							  keep a reference to the string, don't copy characters
							* Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
							  possible
							
						 
						
							2013-04-03 01:48:39 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Raymond Hettinger 
								
							 
						 
						
							
							
							
							
								
							
							
								51612fd803 
								
							 
						 
						
							
							
								
								merge  
							
							
							
						 
						
							2013-03-23 08:21:52 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Raymond Hettinger 
								
							 
						 
						
							
							
							
							
								
							
							
								378170d5d9 
								
							 
						 
						
							
							
								
								Issue 17447: Clarify that str.isidentifier doesn't check for reserved keywords.
							
							
							
						 
						
							2013-03-23 08:21:12 -07:00 
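The distinction this documentation change points out is easy to see in practice. The sketch below combines `str.isidentifier()` with `keyword.iskeyword()` from the standard library; the helper name `is_usable_name` is hypothetical and exists only for this example.

```python
import keyword


def is_usable_name(name):
    """True if `name` is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


# "class" passes the identifier syntax check even though it is reserved.
class_is_identifier = "class".isidentifier()
class_is_usable = is_usable_name("class")
spam_is_usable = is_usable_name("spam")
```

Code that validates user-supplied names, for example for generated attributes, generally needs both checks.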
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								fb84b5d48d 
								
							 
						 
						
							
							
								
								(Merge 3.3) _PyUnicode_Writer() now also reuses Unicode singletons: empty string and latin1 single character
							
						 
						
							2013-03-06 19:29:09 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								2cb16aa3cb 
								
							 
						 
						
							
							
								
								_PyUnicode_Writer() now also reuses Unicode singletons: empty string and latin1 single character
							
						 
						
							2013-03-06 19:28:37 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								cf77da9fb5 
								
							 
						 
						
							
							
								
								Backed out changeset b9f7b1bf36aa  
							
							
							
						 
						
							2013-03-06 01:09:24 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								313cac88c5 
								
							 
						 
						
							
							
								
								Issue #17223: Fix PyUnicode_FromUnicode() on Windows (16-bit wchar_t type) to reject invalid UTF-16 surrogates.
							
						 
						
							2013-03-06 00:41:50 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								36025478bf 
								
							 
						 
						
							
							
								
								(Merge 3.3) Issue #17223: Fix PyUnicode_FromUnicode() for a string of 1 character outside the range U+0000-U+10ffff.
							
						 
						
							2013-02-26 00:16:57 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								d21b58c05d 
								
							 
						 
						
							
							
								
								Issue #17223: Fix PyUnicode_FromUnicode() for a string of 1 character outside the range U+0000-U+10ffff.
							
						 
						
							2013-02-26 00:15:54 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								cfd2c1b4cc 
								
							 
						 
						
							
							
								
								(Merge 3.3) Issue #17137: When a Unicode string is resized, the internal wide character string (wstr) format is now cleared.
							
						 
						
							2013-02-07 23:17:34 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Victor Stinner 
								
							 
						 
						
							
							
							
							
								
							
							
								bbbac2ec34 
								
							 
						 
						
							
							
								
								Issue #17137: When a Unicode string is resized, the internal wide character string (wstr) format is now cleared.
							
						 
						
							2013-02-07 23:12:46 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								d0c79dcda5 
								
							 
						 
						
							
							
								
								Issue #17043: The unicode-internal decoder no longer reads past the end of the input buffer.
							
						 
						
							2013-02-07 16:26:55 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								03ee12ed72 
								
							 
						 
						
							
							
								
								Issue #17043: The unicode-internal decoder no longer reads past the end of the input buffer.
							
						 
						
							2013-02-07 16:25:25 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								3fd4ab356d 
								
							 
						 
						
							
							
								
								Issue #17043: The unicode-internal decoder no longer reads past the end of the input buffer.
							
						 
						
							2013-02-07 16:23:21 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								2aee6a6460 
								
							 
						 
						
							
							
								
								Issue #16971: Fix a refleak in the charmap decoder.
							
							
							
						 
						
							2013-01-29 12:16:57 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								afb1cb5579 
								
							 
						 
						
							
							
								
								Issue #16971: Fix a refleak in the charmap decoder.
							
							
							
						 
						
							2013-01-29 12:13:22 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								8fe5a9f9c3 
								
							 
						 
						
							
							
								
								Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder.
							
							
							
						 
						
							2013-01-29 10:37:39 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								24193debd4 
								
							 
						 
						
							
							
								
								Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder.
							
							
							
						 
						
							2013-01-29 10:28:07 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								d679377be7 
								
							 
						 
						
							
							
								
								Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder.
							
							
							
						 
						
							2013-01-29 10:20:44 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								ed3c4128c0 
								
							 
						 
						
							
							
								
								Issue #10156: In the interpreter's initialization phase, unicode globals are now initialized dynamically as needed.
							
						 
						
							2013-01-26 12:18:17 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								678db84b37 
								
							 
						 
						
							
							
								
								Issue #10156: In the interpreter's initialization phase, unicode globals are now initialized dynamically as needed.
							
						 
						
							2013-01-26 12:16:36 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								059972535f 
								
							 
						 
						
							
							
								
								Issue #10156: In the interpreter's initialization phase, unicode globals are now initialized dynamically as needed.
							
						 
						
							2013-01-26 12:14:02 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								570c5b2354 
								
							 
						 
						
							
							
								
								Issue #16980: Fix processing of escaped non-ASCII bytes in the unicode-escape-decode decoder.
							
						 
						
							2013-01-25 23:53:29 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Serhiy Storchaka 
								
							 
						 
						
							
							
							
							
								
							
							
								73e38809e0 
								
							 
						 
						
							
							
								
								Issue #16980: Fix processing of escaped non-ASCII bytes in the unicode-escape-decode decoder.
							
						 
						
							2013-01-25 23:52:21 +02:00