Commit graph

900 commits

Author SHA1 Message Date
Victor Stinner
126c559d05 _PyUnicode_Ready() for 16-bit wchar_t 2011-10-03 04:17:10 +02:00
Victor Stinner
2fd82278cb Fix compilation error on Windows
Fix also a compiler warning.
2011-10-03 04:06:05 +02:00
Victor Stinner
a3be613a56 Use PyUnicode_WCHAR_KIND to check if a string is a wstr string
Simplify the test on the wstr pointer in unicode_sizeof().
2011-10-03 02:16:37 +02:00
Victor Stinner
910337b42e Add _PyUnicode_CheckConsistency() macro to help debugging
* Document Unicode string states
* Use _PyUnicode_CheckConsistency() to ensure that objects are always
  consistent.
2011-10-03 03:20:16 +02:00
Victor Stinner
4fae54cb0e In release mode, PyUnicode_InternInPlace() does nothing if the input is NULL or
not a unicode, instead of failing with a fatal error.

Use assertions in debug mode (provide better error messages).
2011-10-03 02:01:52 +02:00
Victor Stinner
23e5668214 PyUnicode_Append() now works in-place when it's possible 2011-10-03 03:54:37 +02:00
Victor Stinner
fe226c0d37 Rewrite PyUnicode_Resize()
* Rename _PyUnicode_Resize() to unicode_resize()
* unicode_resize() creates a copy if the string cannot be resized instead
  of failing
* Optimize resize_copy() for wstr strings
* Temporarily disable resize_inplace()
2011-10-03 03:52:20 +02:00
Victor Stinner
829c0adca9 Add _PyUnicode_HAS_UTF8_MEMORY() macro 2011-10-03 01:08:02 +02:00
Victor Stinner
fe0c155c4f Write _PyUnicode_Dump() to help debugging 2011-10-03 02:59:31 +02:00
Victor Stinner
f42dc448e0 PyUnicode_CopyCharacters() fails when copying latin1 into ascii 2011-10-02 23:33:16 +02:00
Victor Stinner
c53be96c54 unicode_convert_wchar_to_ucs4() cannot fail 2011-10-02 21:33:54 +02:00
Victor Stinner
c3c7415639 Add _PyUnicode_DATA_ANY(op) private macro 2011-10-02 20:39:55 +02:00
Victor Stinner
a464fc141d unicode_empty and unicode_latin1 are PyObject* objects, not PyUnicodeObject* 2011-10-02 20:39:30 +02:00
Victor Stinner
267aa24365 PyUnicode_FindChar() raises an IndexError on invalid index 2011-10-02 01:08:37 +02:00
Victor Stinner
bc603d12b7 Optimize _PyUnicode_AsKind() for UCS1->UCS4 and UCS2->UCS4
* Ensure that the input string is ready
* Raise a ValueError instead of a fatal error
2011-10-02 01:00:40 +02:00
Victor Stinner
5a706cf8c0 Fix usage of PyUnicode_READY() in PyUnicode_GetLength() 2011-10-02 00:36:53 +02:00
Victor Stinner
cd9950fd09 PyUnicode_WriteChar() raises IndexError on invalid index
PyUnicode_WriteChar() also raises a ValueError if the string has more than 1
reference.
2011-10-02 00:34:53 +02:00
Victor Stinner
2fe5ced752 PyUnicode_ReadChar() raises an IndexError if the index is invalid
unicode_getitem() reuses PyUnicode_ReadChar()
2011-10-02 00:25:40 +02:00
Victor Stinner
202b62bd90 PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown 2011-10-01 23:48:37 +02:00
Victor Stinner
07ac3ebd7b Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t
Rewrite unicode_subtype_new(): allocate directly the right type.
2011-10-01 16:16:43 +02:00
Victor Stinner
e90fe6a8f4 Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
* Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()
* Rename existing _PyUnicode_UTF8_LENGTH() macro to PyUnicode_UTF8_LENGTH()
* PyUnicode_UTF8() and PyUnicode_UTF8_LENGTH() are more strict
2011-10-01 16:48:13 +02:00
Martin v. Löwis
0b1d348990 Issue 13085: Fix some memory leaks. Patch by Stefan Krah. 2011-10-01 16:35:40 +02:00
Benjamin Peterson
5c0fb00ad8 merge heads 2011-10-01 00:12:20 -04:00
Benjamin Peterson
31616ea2ff remove reference to non-existent file 2011-10-01 00:11:09 -04:00
Victor Stinner
de636f3c34 PyUnicode_Substring() now accepts end bigger than string length
Fix also a bug: call PyUnicode_READY() before reading string length.
2011-10-01 03:55:54 +02:00
Victor Stinner
c759f3e7ec Ooops, avoid a division by zero in unicode_repeat() 2011-10-01 03:09:58 +02:00
Victor Stinner
d3a83d5eb3 PyUnicode_FromObject() ensures that its output is a ready string 2011-10-01 03:09:33 +02:00
Victor Stinner
67ca64ce54 I want a super fast 'a' * n!
* Optimize unicode_repeat() for a special case with memset()
* Simplify integer overflow checking; remove the second check because
  PyUnicode_New() already does it and uses a smaller limit (Py_ssize_t vs
  size_t)
2011-10-01 02:47:29 +02:00
Victor Stinner
e9a2935c1f Fix usage of PyUnicode_READY in unicodeobject.c 2011-10-01 02:14:59 +02:00
Victor Stinner
12bab6dace Remove private substring() function, reuse public PyUnicode_Substring()
* PyUnicode_Substring() now fails if start or end is invalid
* PyUnicode_Substring() reuses PyUnicode_Copy() for non-exact strings
2011-10-01 01:53:49 +02:00
Victor Stinner
c841e7db1f Optimize PyUnicode_Copy(): don't recompute maximum character 2011-10-01 01:34:32 +02:00
Victor Stinner
2219e0a37e PyUnicode_FromObject() reuses PyUnicode_Copy()
* PyUnicode_Copy() is faster than substring()
* Fix also a compiler warning
2011-10-01 01:16:59 +02:00
Victor Stinner
034f6cf10c Add PyUnicode_Copy() function, include it in the public API 2011-09-30 02:26:44 +02:00
Victor Stinner
b153615008 PyUnicode_CopyCharacters() uses exceptions instead of assertions
Call PyErr_BadInternalCall() if inputs are not unicode strings.
2011-09-30 02:26:10 +02:00
Victor Stinner
d8f6510acc _PyUnicode_Ready() cannot be used on ready strings anymore
* Change its prototype: PyObject* instead of PyUnicodeObject*.
* Remove an old assertion, the result of PyUnicode_READY (_PyUnicode_Ready)
  must be checked instead
2011-09-29 19:43:17 +02:00
Victor Stinner
bc8b81bc4e Move _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() outside unicodeobject.h
Move these macros to unicodeobject.c
2011-09-29 19:31:34 +02:00
Victor Stinner
a0702ab1fe Add a note in PyUnicode_CopyCharacters() doc: it doesn't write null character
Also clean up the code (avoid the goto).
2011-09-29 14:14:38 +02:00
Victor Stinner
639418812f Use the new Py_ARRAY_LENGTH macro 2011-09-29 00:42:28 +02:00
Victor Stinner
b9dcffb51e Fix 'c' format of PyUnicode_Format()
formatbuf is now an array of Py_UCS4, not of Py_UNICODE
2011-09-29 00:39:24 +02:00
Victor Stinner
c17f540b7a Oops, fix my previous commit: unicode => to 2011-09-29 00:16:58 +02:00
Victor Stinner
b15d4d899c PyUnicode_CopyCharacters() marks the string as dirty (reset the hash) 2011-09-28 23:59:20 +02:00
Victor Stinner
f5ca1a21a5 PyUnicode_CopyCharacters() fails if 'to' has more than 1 reference 2011-09-28 23:54:59 +02:00
Ezio Melotti
2aa2b3b4d5 Clean up a few tabs that went in with PEP393. 2011-09-29 00:58:57 +03:00
Ezio Melotti
48a2f8fd97 #13054: sys.maxunicode is now always 0x10FFFF. 2011-09-29 00:18:19 +03:00
Victor Stinner
506f592769 Check size of wchar_t using the preprocessor 2011-09-28 22:34:18 +02:00
Victor Stinner
73f01c65c8 PyUnicode_CopyCharacters() initializes overflow 2011-09-28 22:28:04 +02:00
Victor Stinner
e57b1c0da1 Mark PyUnicode_FromUCS[124] as private 2011-09-28 22:20:48 +02:00
Victor Stinner
ff9e50fd04 Oops, fix Py_MIN/Py_MAX case 2011-09-28 22:17:19 +02:00
Victor Stinner
17222160e7 Mark _PyUnicode_FindMaxCharAndNumSurrogatePairs() as private 2011-09-28 22:15:37 +02:00
Victor Stinner
157f83fcfc Strip trailing spaces in unicodeobject.[ch] 2011-09-28 21:41:31 +02:00