Commit graph

915 commits

Author SHA1 Message Date
Victor Stinner
f59c28c930 unicode_writer_finish() checks string consistency 2012-05-09 03:24:14 +02:00
Victor Stinner
106802547c Backout ab500b297900: the check for integer overflow is wrong
Issue #14716: Change integer overflow check in unicode_writer_prepare()
to compute the limit at compile time instead of runtime. Patch writen by Serhiy
Storchaka.
2012-05-07 23:50:05 +02:00
Victor Stinner
0576f9b4cf Issue #14716: Change integer overflow check in unicode_writer_prepare()
to compute the limit at compile time instead of runtime. Patch writen by Serhiy
Storchaka.
2012-05-07 13:02:44 +02:00
Victor Stinner
202fdca133 Close #14716: str.format() now uses the new "unicode writer" API instead of the
PyAccu API. For example, it makes str.format() from 25% to 30% faster on Linux.
2012-05-07 12:47:02 +02:00
Mark Dickinson
99e2e5552a Issue #14700: Fix two broken and undefined-behaviour-inducing overflow checks in old-style string formatting. Thanks Serhiy Storchaka for report and original patch. 2012-05-07 11:20:50 +01:00
Victor Stinner
d0dba6eee8 unicode_writer: don't force inline when it is not necessary
Keep inline for performance critical functions (functions used in loops)
2012-05-04 01:19:15 +02:00
Benjamin Peterson
b63f49f2b4 if the kind of the string to count is larger than the string to search, shortcut to 0 2012-05-03 18:31:07 -04:00
Victor Stinner
a7b654be30 unicode_writer: add finish() method and assertions to write_str() method
* The write_str() method does nothing if the length is zero.
 * Replace "struct unicode_writer_t" with "unicode_writer_t"
2012-05-03 23:58:55 +02:00
Victor Stinner
bf4e266397 Issue #14687: Remove redundant length attribute of unicode_write_t
The length can be read directly from the buffer
2012-05-03 19:27:14 +02:00
Victor Stinner
7989157e49 Issue #14687: Cleanup unicode_writer_prepare()
"Inline" PyUnicode_Resize(): call directly resize_compact()
2012-05-03 13:43:07 +02:00
Victor Stinner
f2c76aa6cb Issue #14687: str%tuple now uses an optimistic "unicode writer" instead of an
accumulator. Directly write characters into the output (don't use a temporary
list): resize and widen the string on demand.
2012-05-03 13:10:40 +02:00
Victor Stinner
1b487b467b Issue #14624, #14687: Optimize unicode_widen()
Don't convert uninitialized characters. Patch written by Serhiy Storchaka.
2012-05-03 12:29:04 +02:00
Victor Stinner
3a7f7977f1 Remove buggy assertion in PyUnicode_Substring()
Use also directly unicode_empty, instead of PyUnicode_New(0,0).
2012-05-03 03:36:40 +02:00
Victor Stinner
684d5fd420 Fix PyUnicode_Substring() for start >= length and start > end
Remove the fast-path for 1-character string: unicode_fromascii() and
_PyUnicode_FromUCS*() now have their own fast-path for 1-character strings.
2012-05-03 02:32:34 +02:00
Victor Stinner
b6cd014d75 Unicode: optimize creating of 1-character strings 2012-05-03 02:17:04 +02:00
Victor Stinner
bff7c96834 Issue #14687: Optimize str%tuple for the "%(name)s" syntax
Avoid an useless and expensive call to PyUnicode_READ().
2012-05-03 01:44:59 +02:00
Victor Stinner
e6abb488c9 unicodeobject.c: Add MAX_MAXCHAR() macro to (micro-)optimize the computation
of the second argument of PyUnicode_New().

 * Create also align_maxchar() function
 * Optimize fix_decimal_and_space_to_ascii(): don't compute the maximum
   character when ch <= 127 (it is ASCII)
2012-05-02 01:15:40 +02:00
Victor Stinner
438106b66e Issue #14687: Cleanup PyUnicode_Format() 2012-05-02 00:41:57 +02:00
Victor Stinner
b5c3ea3af3 Issue #14687: Optimize str%args
* formatfloat() uses unicode_fromascii() instead of PyUnicode_DecodeASCII()
   to not have to check characters, we know that it is really ASCII
 * Use PyUnicode_FromOrdinal() instead of _PyUnicode_FromUCS4() to format
   a character: if avoids a call to ucs4lib_find_max_char() to compute
   the maximum character (whereas we already know it, it is just the character
   itself)
2012-05-02 00:29:36 +02:00
Victor Stinner
b80e46eca4 Issue #14687: Avoid an useless duplicated string in PyUnicode_Format() 2012-04-30 05:21:52 +02:00
Victor Stinner
aff3cc659b Issue #14687: Cleanup PyUnicode_Format() 2012-04-30 05:19:21 +02:00
Victor Stinner
b11d91d969 Fix my previous commit: bool is a long, restore the specical case for bool 2012-04-28 00:25:34 +02:00
Victor Stinner
d0880d57b0 Simplify and optimize formatlong()
* Remove _PyBytes_FormatLong(): inline it into formatlong()
 * the input type is always a long, so remove the code for bool
 * don't duplicate the string if the length does not change
 * Use PyUnicode_DATA() instead of _PyUnicode_AsString()
2012-04-27 23:40:13 +02:00
Victor Stinner
94d558b063 Optimize _PyUnicode_FindMaxChar() find pure ASCII strings 2012-04-27 22:26:58 +02:00
Victor Stinner
8f825060f1 Check newly created consistency using _PyUnicode_CheckConsistency(str, 1)
* In debug mode, fill the string data with invalid characters
 * Simplify also reference counting in PyCodec_BackslashReplaceErrors()
   and PyCodec_XMLCharRefReplaceError()
2012-04-27 13:55:39 +02:00
Victor Stinner
718fbf078c _PyUnicode_CheckConsistency() ensures that the unicode string ends with a
null character
2012-04-26 00:39:37 +02:00
Benjamin Peterson
b9f4c9daad make pointer arith c89 2012-04-23 21:45:40 -04:00
Benjamin Peterson
f3b7d86e25 use correct base ptr 2012-04-23 18:07:01 -04:00
Benjamin Peterson
2844a7a6d3 simplify and reformat 2012-04-23 18:00:25 -04:00
Victor Stinner
ece58deb9f Close #14648: Compute correctly maxchar in str.format() for substrin 2012-04-23 23:36:38 +02:00
Benjamin Peterson
64ed576de8 merge 3.2 (#14509) 2012-04-09 15:04:39 -04:00
Benjamin Peterson
ca819c3c9d merge 3.1 (#14509) 2012-04-09 15:01:02 -04:00
Benjamin Peterson
f6622c8a3e fix build without Py_DEBUG and DNDEBUG (closes #14509) 2012-04-09 14:53:07 -04:00
Victor Stinner
afb5205c48 Close #14249: Use bit shifts instead of an union, it's more efficient.
Patch written by Serhiy Storchaka
2012-04-05 22:54:49 +02:00
Victor Stinner
e7eee01f36 Close #14249: Use an union instead of a long to short pointer to avoid aliasing
issue. Speed up UTF-16 by 20%.
2012-04-05 13:44:34 +02:00
Antoine Pitrou
a701388de1 Rename _PyIter_GetBuiltin to _PyObject_GetBuiltin, and do not include it in the stable ABI. 2012-04-05 00:04:20 +02:00
Kristján Valur Jónsson
31668b8f7a Issue #14288: Serialization support for builtin iterators. 2012-04-03 10:49:41 +00:00
Benjamin Peterson
0df542985a grammar 2012-03-26 14:50:32 -04:00
Benjamin Peterson
c067d6661f merge 3.2 2012-03-25 22:41:16 -04:00
Benjamin Peterson
a8755c586e kill this terribly outdated comment 2012-03-25 22:40:54 -04:00
Victor Stinner
0d03478b88 Remove an unused variable 2012-03-06 02:06:01 +01:00
Victor Stinner
c9590ad745 Close #14085: remove assertions from PyUnicode_WRITE macro
Add checks in PyUnicode_WriteChar() and convert PyUnicode_New() assertion to a
test raising a Python exception.
2012-03-04 01:34:37 +01:00
Ezio Melotti
cda6b6d60d #14081: The sep and maxsplit parameter to str.split, bytes.split, and bytearray.split may now be passed as keyword arguments. 2012-02-26 09:39:55 +02:00
Victor Stinner
b0800dc53b Oops, revert unwanted changes 2012-02-25 00:47:08 +01:00
Victor Stinner
abc649ddbe Issue #14107: fix bigmem tests on str.capitalize(), str.swapcase() and
str.title(). Compute correctly how much memory is required for the test
(memuse).
2012-02-25 00:43:27 +01:00
Antoine Pitrou
842c0f17eb Fix compilation error under Windows (and warnings too). 2012-02-24 13:30:46 +01:00
Victor Stinner
90f50d4df9 Issue #13706: Fix format(float, "n") for locale with non-ASCII decimal point (e.g. ps_aF) 2012-02-24 01:44:47 +01:00
Victor Stinner
41a863cb81 Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
   (from the locale encoding), instead of decoding them implicitly from latin1
 * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
 * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
   character if unicode is NULL
 * Replace MIN/MAX macros by Py_MIN/Py_MAX
 * stringlib/undef.h undefines STRINGLIB_IS_UNICODE
 * stringlib/localeutil.h only supports Unicode
2012-02-24 00:37:51 +01:00
Victor Stinner
b429d3b09c Fix doc of an internal function: unicode_write_cstr() 2012-02-22 21:22:20 +01:00
Antoine Pitrou
ba6bafcfbe Fix compile failure under Windows 2012-02-22 16:41:50 +01:00