Commit graph

815 commits

Author SHA1 Message Date
Benjamin Peterson
5fd4bd3796 avoid casting with this nice macro 2011-03-06 09:06:34 -06:00
Victor Stinner
2f283c2c19 Fix my previous commit (r88709) for str.encode(errors=...) 2011-03-02 01:21:46 +00:00
Victor Stinner
a5c68c3cb7 Issue #8923: cache str.encode() result
When a string is encoded to UTF-8 in strict mode, the result is cached into the
object. Examples: str.encode(), str.encode('utf-8'), PyUnicode_AsUTF8String()
and PyUnicode_AsEncodedString(unicode, "utf-8", NULL).
2011-03-02 01:03:14 +00:00
Victor Stinner
f3fd733f92 Remove useless argument of _PyUnicode_AsDefaultEncodedString() 2011-03-02 01:03:11 +00:00
Victor Stinner
6d970f4713 Issue #10831: PyUnicode_FromFormat() supports %li, %lli and %zi formats 2011-03-02 00:04:25 +00:00
Victor Stinner
e7faec1aa9 Fix my previous commit (r88702): initialize size_tflag in parse_format_flags() 2011-03-02 00:01:53 +00:00
Victor Stinner
968654515f Issue #10829: Refactor PyUnicode_FromFormat()
* Use the same function to parse the format string in the 3 steps
 * Fix crashs on invalid format strings
2011-03-01 23:44:09 +00:00
Victor Stinner
2b574a2332 Merged revisions 88697 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88697 | victor.stinner | 2011-03-01 23:46:52 +0100 (mar., 01 mars 2011) | 4 lines

  Issue #11246: Fix PyUnicode_FromFormat("%V")

  Decode the byte string from UTF-8 (with replace error handler) instead of
  ISO-8859-1 (in strict mode). Patch written by Ray Allen.
........
2011-03-01 22:48:49 +00:00
Victor Stinner
2512a8b62e Issue #11246: Fix PyUnicode_FromFormat("%V")
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
2011-03-01 22:46:52 +00:00
Alexander Belopolsky
4001847a98 PEP 7 conformance changes (whitespace only). 2011-02-26 01:02:56 +00:00
Alexander Belopolsky
1d52146a25 Issue #11303: Added shortcuts for utf8 and latin1 encodings.
Documented the list of optimized encodings as CPython implementation
detail.
2011-02-25 19:19:57 +00:00
Victor Stinner
659eb84457 Merged revisions 88481 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88481 | victor.stinner | 2011-02-21 22:13:44 +0100 (lun., 21 févr. 2011) | 4 lines

  Fix PyUnicode_FromFormatV("%c") for non-BMP char

  Issue #10830: Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
  narrow build.
........
2011-02-23 12:14:22 +00:00
Brett Cannon
b94767ff44 Issue #8914: fix various warnings from the Clang static analyzer v254. 2011-02-22 20:15:44 +00:00
Victor Stinner
5ed8b2c737 Fix PyUnicode_FromFormatV("%c") for non-BMP char
Issue #10830: Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
narrow build.
2011-02-21 21:13:44 +00:00
Victor Stinner
fd34b3788f Remove bootstrap code of PyUnicode_AsEncodedString()
Issue #11187: Remove bootstrap code (use ASCII) of
PyUnicode_AsEncodedString(), it was replaced by a better fallback (use
the locale encoding) in PyUnicode_EncodeFSDefault().

Prepare also empty sections in NEWS.
2011-02-21 20:51:28 +00:00
Alexander Belopolsky
b9cc00caab Removed unneeded #include 2010-12-22 02:35:20 +00:00
Benjamin Peterson
28a4dce6a8 remove (un)transform methods 2010-12-12 01:33:04 +00:00
Alexander Belopolsky
942af5a9a4 Issue #10557: Fixed error messages from float() and other numeric
types.  Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
Martin v. Löwis
4d0d471a80 Merge branches/pep-0384. 2010-12-03 20:14:31 +00:00
Georg Brandl
3b9406b08a Remove redundant check for PyBytes in unicode_encode. 2010-12-03 07:54:09 +00:00
Georg Brandl
02524629f3 #7475: add (un)transform method to bytes/bytearray and str, add back codecs that can be used with them from Python 2. 2010-12-02 18:06:51 +00:00
Georg Brandl
e5b99f0fb3 Remove redundant includes of headers that are already included by Python.h. 2010-11-30 09:41:01 +00:00
Victor Stinner
d5af0a5df0 PyUnicode_DecodeFSDefaultAndSize() raises MemoryError if _Py_char2wchar() fails 2010-11-08 23:34:29 +00:00
Victor Stinner
2f02a51135 PyUnicode_EncodeFS() raises an exception if _Py_wchar2char() fails
* Add error_pos optional argument to _Py_wchar2char()
 * PyUnicode_EncodeFS() raises a UnicodeEncodeError or MemoryError if
   _Py_wchar2char() fails
2010-11-08 22:43:46 +00:00
Victor Stinner
c911bbfd5d str, bytes, bytearray docstring: remove unnecessary [...] 2010-11-07 19:04:46 +00:00
Victor Stinner
e14e212221 Fix encode/decode method doc of str, bytes, bytearray types
* Specify the default encoding: write 'utf-8' instead of
   sys.getdefaultencoding(), because the default encoding is now constant
 * Specify the default errors value
2010-11-07 18:41:46 +00:00
Eric Smith
16562f41b0 Merged revisions 86277 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r86277 | eric.smith | 2010-11-06 15:27:37 -0400 (Sat, 06 Nov 2010) | 1 line

  Added more to docstrings for str.format, format_map, and __format__.
........
2010-11-06 19:29:45 +00:00
Eric Smith
51d2fd983b Added more to docstrings for str.format, format_map, and __format__. 2010-11-06 19:27:37 +00:00
David Malcolm
9696088b6d Issue #10288: The deprecated family of "char"-handling macros
(ISLOWER()/ISUPPER()/etc) have now been removed: use Py_ISLOWER() etc
instead.
2010-11-05 17:23:41 +00:00
Eric Smith
27bbca6f79 Issue #6081: Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict. 2010-11-04 17:06:58 +00:00
Victor Stinner
ad15872854 Simplify PyUnicode_Encode/DecodeFSDefault on Windows/Mac OS X
* Windows always uses mbcs
 * Mac OS X always uses utf-8
2010-10-27 00:25:46 +00:00
Victor Stinner
f933e1ab6f Issue #4388: On Mac OS X, decode command line arguments from UTF-8, instead of
the locale encoding. If the LANG (and LC_ALL and LC_CTYPE) environment variable
is not set, the locale encoding is ISO-8859-1, whereas most programs (including
Python) expect UTF-8. Python already uses UTF-8 for the filesystem encoding and
to encode command line arguments on this OS.
2010-10-20 22:58:25 +00:00
Victor Stinner
9a90900da5 PyUnicode_FromFormatV(): Fix %A format
It was not completly implemented. Add a test.
2010-10-18 20:59:24 +00:00
Benjamin Peterson
8f67d0893f make hashes always the size of pointers; introduce Py_hash_t #9778 2010-10-17 20:54:53 +00:00
Georg Brandl
ded5acf34a Merged revisions 81936 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r81936 | mark.dickinson | 2010-06-12 11:10:14 +0200 (Sa, 12 Jun 2010) | 2 lines

  Silence 'unused variable' gcc warning.  Patch by Éric Araujo.
........
2010-10-17 11:48:07 +00:00
Victor Stinner
168e117e0a Add an optional size argument to _Py_char2wchar()
_Py_char2wchar() callers usually need the result size in characters. Since it's
trivial to compute it in _Py_char2wchar() (O(1) whereas wcslen() is O(n)), add
an option to get it.
2010-10-16 23:16:16 +00:00
Victor Stinner
f3170ccef8 Use locale encoding if Py_FileSystemDefaultEncoding is not set
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
   PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
   Py_FileSystemDefaultEncoding is NULL
 * redecode_filenames() functions and _Py_code_object_list (issue #9630)
   are no more needed: remove them
2010-10-15 12:04:23 +00:00
Georg Brandl
66c221e993 #9418: first step of moving private string methods to _string module. 2010-10-14 07:04:07 +00:00
Victor Stinner
beb4135b8c PyUnicode_AsWideCharString() takes a PyObject*, not a PyUnicodeObject*
All unicode functions uses PyObject* except PyUnicode_AsWideChar(). Fix the
prototype for the new function PyUnicode_AsWideCharString().
2010-10-07 01:02:42 +00:00
Victor Stinner
5593d8aeb4 Issue #8670: PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() replace
UTF-16 surrogate pairs by single non-BMP characters for 16 bits Py_UNICODE
and 32 bits wchar_t (eg. Linux in narrow build).
2010-10-02 11:11:27 +00:00
Victor Stinner
1c24bd0252 Issue #8870: PyUnicode_AsWideCharString() doesn't count the trailing nul character
And write unit tests for PyUnicode_AsWideChar() and PyUnicode_AsWideCharString().
2010-10-02 11:03:13 +00:00
Victor Stinner
71e91a358b Fix PyUnicode_AsWideCharString(): set *size if size is not NULL 2010-09-29 17:55:12 +00:00
Victor Stinner
c39211f51e Issue #9630: Redecode filenames when setting the filesystem encoding
Redecode the filenames of:

 - all modules: __file__ and __path__ attributes
 - all code objects: co_filename attribute
 - sys.path
 - sys.meta_path
 - sys.executable
 - sys.path_importer_cache (keys)

Keep weak references to all code objects until initfsencoding() is called, to
be able to redecode co_filename attribute of all code objects.
2010-09-29 16:35:47 +00:00
Victor Stinner
137c34c027 Issue #9979: Create function PyUnicode_AsWideCharString(). 2010-09-29 10:25:54 +00:00
Benjamin Peterson
d4ac96a336 use return NULL; it's just as correct 2010-09-12 16:40:53 +00:00
Victor Stinner
4c7db315df Issue #9738, #9836: Fix refleak introduced by r84704 2010-09-12 07:51:18 +00:00
Benjamin Peterson
9be0b2e312 detect non-ascii characters much earlier (plugs ref leak) 2010-09-12 03:40:54 +00:00
Victor Stinner
1205f2774e Issue #9738: PyUnicode_FromFormat() and PyErr_Format() raise an error on
a non-ASCII byte in the format string.

Document also the encoding.
2010-09-11 00:54:47 +00:00
Victor Stinner
46408606d8 Rename PyUnicode_strdup() to PyUnicode_AsUnicodeCopy() 2010-09-03 16:18:00 +00:00
Victor Stinner
71133ff368 Create PyUnicode_strdup() function 2010-09-01 23:43:53 +00:00