2019-05-17 11:55:34 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. highlight:: c
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. _unicodeobjects:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								Unicode Objects and Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								--------------------------
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2012-08-11 16:51:50 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. sectionauthor:: Marc-André Lemburg <mal@lemburg.com>
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. sectionauthor:: Georg Brandl <georg@python.org>
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								Unicode Objects
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								^^^^^^^^^^^^^^^
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Since the implementation of :pep:`393` in Python 3.3, Unicode objects internally
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								use a variety of representations, in order to allow handling the complete range
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								of Unicode characters while staying memory efficient.  There are special cases
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								for strings where all code points are below 128, 256, or 65536; otherwise, code
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								points must be below 1114112 (which is the full Unicode range).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								UTF-8 representation is created on demand and cached in the Unicode object.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-08-05 10:48:51 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. note::
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   The :c:type:`Py_UNICODE` representation has been removed since Python 3.12
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   with deprecated APIs.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   See :pep:`623` for more information.
							 | 
						
					
						
							
								
									
										
										
										
											2020-08-05 10:48:51 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Unicode Type
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the basic Unicode object types used for the Unicode implementation in
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								Python:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:type:: Py_UCS4
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								            Py_UCS2
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								            Py_UCS1
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   These types are typedefs for unsigned integer types wide enough to contain
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   characters of 32 bits, 16 bits and 8 bits, respectively.  When dealing with
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   single Unicode characters, use :c:type:`Py_UCS4`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:type:: Py_UNICODE
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This is a typedef of :c:type:`wchar_t`, which is a 16-bit type or 32-bit type
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   depending on the platform.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      In previous versions, this was a 16-bit type or a 32-bit type depending on
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      whether you selected a "narrow" or "wide" Unicode version of Python at
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      build time.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-06-01 08:56:35 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. deprecated-removed:: 3.13 3.15
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:type:: PyASCIIObject
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								            PyCompactUnicodeObject
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								            PyUnicodeObject
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   These subtypes of :c:type:`PyObject` represent a Python Unicode object.  In
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   almost all cases, they shouldn't be used directly, since all API functions
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   that deal with Unicode objects take and return :c:type:`PyObject` pointers.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:var:: PyTypeObject PyUnicode_Type
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This instance of :c:type:`PyTypeObject` represents the Python Unicode type.  It
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   is exposed to Python code as ``str``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-04-22 14:59:18 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								The following APIs are C macros and static inlined functions for fast checks and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								access to internal read-only data of Unicode objects:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_Check(PyObject *obj)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return true if the object *obj* is a Unicode object or an instance of a Unicode
							 | 
						
					
						
							
								
									
										
										
										
											2021-01-06 12:38:26 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   subtype.  This function always succeeds.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_CheckExact(PyObject *obj)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return true if the object *obj* is a Unicode object, but not an instance of a
							 | 
						
					
						
							
								
									
										
										
										
											2021-01-06 12:38:26 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   subtype.  This function always succeeds.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_READY(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Returns ``0``. This API is kept only for backward compatibility.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. deprecated:: 3.10
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-13 13:15:41 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      This API does nothing since Python 3.12.
							 | 
						
					
						
							
								
									
										
										
										
											2020-08-05 10:48:51 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_GET_LENGTH(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return the length of the Unicode string, in code points.  *unicode* has to be a
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Unicode object in the "canonical" representation (not checked).
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS1* PyUnicode_1BYTE_DATA(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                Py_UCS2* PyUnicode_2BYTE_DATA(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                Py_UCS4* PyUnicode_4BYTE_DATA(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return a pointer to the canonical representation cast to UCS1, UCS2 or UCS4
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   integer types for direct character access.  No checks are performed if the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   canonical representation has the correct character size; use
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_KIND` to select the right function.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:macro:: PyUnicode_1BYTE_KIND
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								             PyUnicode_2BYTE_KIND
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								             PyUnicode_4BYTE_KIND
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return values of the :c:func:`PyUnicode_KIND` macro.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.12
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      ``PyUnicode_WCHAR_KIND`` has been removed.
							 | 
						
					
						
							
								
									
										
										
										
											2020-08-05 10:48:51 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_KIND(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return one of the PyUnicode kind constants (see above) that indicate how many
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   bytes per character this Unicode object uses to store its data.  *unicode* has to
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   be a Unicode object in the "canonical" representation (not checked).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: void* PyUnicode_DATA(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return a void pointer to the raw Unicode buffer.  *unicode* has to be a Unicode
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   object in the "canonical" representation (not checked).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 01:35:41 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: void PyUnicode_WRITE(int kind, void *data, \
							 | 
						
					
						
							
								
									
										
										
										
											2022-04-22 14:59:18 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                                     Py_ssize_t index, Py_UCS4 value)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Write into a canonical representation *data* (as obtained with
							 | 
						
					
						
							
								
									
										
										
										
											2022-04-22 14:59:18 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_DATA`).  This function performs no sanity checks, and is
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   intended for usage in loops.  The caller should cache the *kind* value and
							 | 
						
					
						
							
								
									
										
										
										
											2022-04-22 14:59:18 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *data* pointer as obtained from other calls.  *index* is the index in
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   the string (starts at 0) and *value* is the new code point value which should
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   be written to that location.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 01:35:41 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 PyUnicode_READ(int kind, void *data, \
							 | 
						
					
						
							
								
									
										
										
										
											2022-04-22 14:59:18 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                                       Py_ssize_t index)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Read a code point from a canonical representation *data* (as obtained with
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_DATA`).  No checks or ready calls are performed.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 PyUnicode_READ_CHAR(PyObject *unicode, Py_ssize_t index)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Read a character from a Unicode object *unicode*, which must be in the "canonical"
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   representation.  This is less efficient than :c:func:`PyUnicode_READ` if you
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   do multiple consecutive reads.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 PyUnicode_MAX_CHAR_VALUE(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the maximum code point that is suitable for creating another string
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   based on *unicode*, which must be in the "canonical" representation.  This is
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   always an approximation but more efficient than iterating over the string.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
											 
										 
										
											
												Merged revisions 60481,60485,60489-60492,60494-60496,60498-60499,60501-60503,60505-60506,60508-60509,60523-60524,60532,60543,60545,60547-60548,60552,60554,60556-60559,60561-60562,60569,60571-60572,60574,60576-60583,60585-60586,60589,60591,60594-60595,60597-60598,60600-60601,60606-60612,60615,60617,60619-60621,60623-60625,60627-60629,60631,60633,60635,60647,60650,60652,60654,60656,60658-60659,60664-60666,60668-60670,60672,60676,60678,60680-60683,60685-60686,60688,60690,60692-60694,60697-60700,60705-60706,60708,60711,60714,60720,60724-60730,60732,60736,60742,60744,60746,60748,60750-60751,60753,60756-60757,60759-60761,60763-60764,60766,60769-60770,60774-60784,60787-60845 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
  r60790 | raymond.hettinger | 2008-02-14 10:32:45 +0100 (Thu, 14 Feb 2008) | 4 lines
  Add diagnostic message to help figure-out why SocketServer tests occasionally crash
  when trying to remove a pid that in not in the activechildren list.
........
  r60791 | raymond.hettinger | 2008-02-14 11:46:57 +0100 (Thu, 14 Feb 2008) | 1 line
  Add fixed-point examples to the decimal FAQ
........
  r60792 | raymond.hettinger | 2008-02-14 12:01:10 +0100 (Thu, 14 Feb 2008) | 1 line
  Improve rst markup
........
  r60794 | raymond.hettinger | 2008-02-14 12:57:25 +0100 (Thu, 14 Feb 2008) | 1 line
  Show how to remove exponents.
........
  r60795 | raymond.hettinger | 2008-02-14 13:05:42 +0100 (Thu, 14 Feb 2008) | 1 line
  Fix markup.
........
  r60797 | christian.heimes | 2008-02-14 13:47:33 +0100 (Thu, 14 Feb 2008) | 1 line
  Implemented Martin's suggestion to clear the free lists during the garbage collection of the highest generation.
........
  r60798 | raymond.hettinger | 2008-02-14 13:49:37 +0100 (Thu, 14 Feb 2008) | 1 line
  Simplify moneyfmt() recipe.
........
  r60810 | raymond.hettinger | 2008-02-14 20:02:39 +0100 (Thu, 14 Feb 2008) | 1 line
  Fix markup
........
  r60811 | raymond.hettinger | 2008-02-14 20:30:30 +0100 (Thu, 14 Feb 2008) | 1 line
  No need to register subclass of ABCs.
........
  r60814 | thomas.heller | 2008-02-14 22:00:28 +0100 (Thu, 14 Feb 2008) | 1 line
  Try to correct a markup error that does hide the following paragraph.
........
  r60822 | christian.heimes | 2008-02-14 23:40:11 +0100 (Thu, 14 Feb 2008) | 1 line
  Use a static and interned string for __subclasscheck__ and __instancecheck__ as suggested by Thomas Heller in #2115
........
  r60827 | christian.heimes | 2008-02-15 07:57:08 +0100 (Fri, 15 Feb 2008) | 1 line
  Fixed repr() and str() of complex numbers. Complex suffered from the same problem as floats but I forgot to test and fix them.
........
  r60830 | christian.heimes | 2008-02-15 09:20:11 +0100 (Fri, 15 Feb 2008) | 2 lines
  Bug #2111: mmap segfaults when trying to write a block opened with PROT_READ
  Thanks to Thomas Herve for the fix.
........
  r60835 | eric.smith | 2008-02-15 13:14:32 +0100 (Fri, 15 Feb 2008) | 1 line
  In PyNumber_ToBase, changed from an assert to returning an error when PyObject_Index() returns something other than an int or long.  It should never be possible to trigger this, as PyObject_Index checks to make sure it returns an int or long.
........
  r60837 | skip.montanaro | 2008-02-15 20:03:59 +0100 (Fri, 15 Feb 2008) | 8 lines
  Two new functions:
    * place_summary_first copies the regrtest summary to the front of the file
      making it easier to scan quickly for problems.
    * count_failures gets the actual count of the number of failing tests, not
      just a 1 (some failures) or 0 (no failures).
........
  r60840 | raymond.hettinger | 2008-02-15 22:21:25 +0100 (Fri, 15 Feb 2008) | 1 line
  Update example to match the current syntax.
........
  r60841 | amaury.forgeotdarc | 2008-02-15 22:22:45 +0100 (Fri, 15 Feb 2008) | 8 lines
  Issue #2115: __slot__ attributes setting was 10x slower.
  Also correct a possible crash using ABCs.
  This change is exactly the same as an optimisation
  done 5 years ago, but on slot *access*:
  http://svn.python.org/view?view=rev&rev=28297
........
  r60842 | amaury.forgeotdarc | 2008-02-15 22:27:44 +0100 (Fri, 15 Feb 2008) | 2 lines
  Temporarily let these tests pass
........
  r60843 | kurt.kaiser | 2008-02-15 22:56:36 +0100 (Fri, 15 Feb 2008) | 2 lines
  ScriptBinding event handlers weren't returning 'break'. Patch 2050, Tal Einat.
........
  r60844 | kurt.kaiser | 2008-02-15 23:25:09 +0100 (Fri, 15 Feb 2008) | 4 lines
  Configured selection highlighting colors were ignored; updating highlighting
  in the config dialog would cause non-Python files to be colored as if they
  were Python source; improve use of ColorDelagator.  Patch 1334. Tal Einat.
........
  r60845 | amaury.forgeotdarc | 2008-02-15 23:44:20 +0100 (Fri, 15 Feb 2008) | 9 lines
  Re-enable tests, they were failing since gc.collect() clears the various freelists.
  They still remain fragile.
  For example, a call to assertEqual currently does not make any allocation
  (which surprised me at first).
  But this can change when gc.collect also deletes the numerous "zombie frames"
  attached to each function.
........
											
										 
										
											2008-02-16 07:38:31 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_IsIdentifier(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2020-02-11 14:29:33 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` if the string is a valid identifier according to the language
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   definition, section :ref:`identifiers`. Return ``0`` otherwise.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.9
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The function does not call :c:func:`Py_FatalError` anymore if the string
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      is not ready.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Unicode Character Properties
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""""""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								Unicode provides many different character properties. The most often needed ones
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								are available through these macros which are mapped to C functions depending on
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								the Python configuration.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISSPACE(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a whitespace character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISLOWER(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a lowercase character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISUPPER(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is an uppercase character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISTITLE(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a titlecase character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISLINEBREAK(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a linebreak character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISDECIMAL(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a decimal character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISDIGIT(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a digit character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISNUMERIC(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a numeric character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISALPHA(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is an alphabetic character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISALNUM(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is an alphanumeric character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-06-11 18:37:52 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_ISPRINTABLE(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-06-11 18:37:52 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` or ``0`` depending on whether *ch* is a printable character.
							 | 
						
					
						
							
								
									
										
										
										
											2008-06-11 18:37:52 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Nonprintable characters are those characters defined in the Unicode character
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   database as "Other" or "Separator", excepting the ASCII space (0x20) which is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   considered printable.  (Note that printable characters in this context are
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   those which should not be escaped when :func:`repr` is invoked on a string.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   It has no bearing on the handling of strings written to :data:`sys.stdout` or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :data:`sys.stderr`.)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These APIs can be used for fast direct character conversions:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 Py_UNICODE_TOLOWER(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the character *ch* converted to lower case.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 Py_UNICODE_TOUPPER(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the character *ch* converted to upper case.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 Py_UNICODE_TOTITLE(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the character *ch* converted to title case.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_TODECIMAL(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the character *ch* converted to a decimal positive integer.  Return
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 01:33:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   ``-1`` if this is not possible.  This function does not raise exceptions.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_TODIGIT(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the character *ch* converted to a single digit integer. Return ``-1`` if
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 01:33:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   this is not possible.  This function does not raise exceptions.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-01-11 14:33:06 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: double Py_UNICODE_TONUMERIC(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the character *ch* converted to a double. Return ``-1.0`` if this is not
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 01:33:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   possible.  This function does not raise exceptions.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								These APIs can be used to work with surrogates:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 03:38:49 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_IS_SURROGATE(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Check if *ch* is a surrogate (``0xD800 <= ch <= 0xDFFF``).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 03:38:49 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_IS_HIGH_SURROGATE(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-04-17 08:32:47 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Check if *ch* is a high surrogate (``0xD800 <= ch <= 0xDBFF``).
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 03:38:49 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int Py_UNICODE_IS_LOW_SURROGATE(Py_UCS4 ch)
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Check if *ch* is a low surrogate (``0xDC00 <= ch <= 0xDFFF``).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 03:38:49 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 Py_UNICODE_JOIN_SURROGATES(Py_UCS4 high, Py_UCS4 low)
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 22:13:29 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Join two surrogate code points and return a single :c:type:`Py_UCS4` value.
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *high* and *low* are respectively the leading and trailing surrogates in a
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 01:33:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   surrogate pair. *high* must be in the range [0xD800; 0xDBFF] and *low* must
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   be in the range [0xDC00; 0xDFFF].
							 | 
						
					
						
							
								
									
										
										
										
											2011-08-22 20:03:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Creating and accessing Unicode strings
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""""""""""""""""""""""""""""
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								To create Unicode objects and access their basic sequence properties, use these
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								APIs:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_New(Py_ssize_t size, Py_UCS4 maxchar)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Create a new Unicode object.  *maxchar* should be the true maximum code point
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   to be placed in the string.  As an approximation, it can be rounded up to the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   nearest value in the sequence 127, 255, 65535, 1114111.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This is the recommended way to allocate a new Unicode object.  Objects
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   created using this function are not resizable.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 12:35:55 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception and return ``NULL``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_FromKindAndData(int kind, const void *buffer, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                                                    Py_ssize_t size)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a new Unicode object with the given *kind* (possible values are
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:macro:`PyUnicode_1BYTE_KIND` etc., as returned by
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_KIND`).  The *buffer* must point to an array of *size*
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   units of 1, 2 or 4 bytes per character, as given by the kind.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2021-06-03 06:33:44 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If necessary, the input *buffer* is copied and transformed into the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   canonical representation.  For example, if the *buffer* is a UCS4 string
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (:c:macro:`PyUnicode_4BYTE_KIND`) and it consists only of codepoints in
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   the UCS1 range, it will be transformed into UCS1
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (:c:macro:`PyUnicode_1BYTE_KIND`).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_FromStringAndSize(const char *str, Py_ssize_t size)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object from the char buffer *str*.  The bytes will be
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   interpreted as being UTF-8 encoded.  The buffer is copied into the new
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   object.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The return value might be a shared object, i.e. modification of the data is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   not allowed.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This function raises :exc:`SystemError` when:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   * *size* < 0,
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   * *str* is ``NULL`` and *size* > 0
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-12 14:48:38 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.12
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      *str* == ``NULL`` with *size* > 0 is not allowed anymore.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject *PyUnicode_FromString(const char *str)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-04-15 02:14:19 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object from a UTF-8 encoded null-terminated char buffer
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *str*.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_FromFormat(const char *format, ...)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Take a C :c:func:`printf`\ -style *format* string and a variable number of
							 | 
						
					
						
							
								
									
										
										
										
											2019-05-08 11:02:34 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   arguments, calculate the size of the resulting Python Unicode string and return
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   a string with the values formatted into it.  The variable arguments must be C
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   types and must correspond exactly to the format characters in the *format*
							 | 
						
					
						
							
								
									
										
										
										
											2023-05-22 00:32:39 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   ASCII-encoded string.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   A conversion specifier contains two or more characters and has the following
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   components, which must occur in this order:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   #. The ``'%'`` character, which marks the start of the specifier.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   #. Conversion flags (optional), which affect the result of some conversion
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      types.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   #. Minimum field width (optional).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      If specified as an ``'*'`` (asterisk), the actual width is given in the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      next argument, which must be of type :c:expr:`int`, and the object to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      convert comes after the minimum field width and optional precision.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   #. Precision (optional), given as a ``'.'`` (dot) followed by the precision.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      If specified as ``'*'`` (an asterisk), the actual precision is given in
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      the next argument, which must be of type :c:expr:`int`, and the value to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      convert comes after the precision.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   #. Length modifier (optional).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   #. Conversion type.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The conversion flag characters are:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. tabularcolumns:: |l|L|
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +-------+-------------------------------------------------------------+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   | Flag  | Meaning                                                     |
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +=======+=============================================================+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   | ``0`` | The conversion will be zero padded for numeric values.      |
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +-------+-------------------------------------------------------------+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   | ``-`` | The converted value is left adjusted (overrides the ``0``   |
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   |       | flag if both are given).                                    |
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +-------+-------------------------------------------------------------+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The length modifiers for following integer conversions (``d``, ``i``,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``o``, ``u``, ``x``, or ``X``) specify the type of the argument
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (:c:expr:`int` by default):
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. tabularcolumns:: |l|L|
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +----------+-----------------------------------------------------+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   | Modifier | Types                                               |
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +==========+=====================================================+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   | ``l``    | :c:expr:`long` or :c:expr:`unsigned long`           |
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +----------+-----------------------------------------------------+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   | ``ll``   | :c:expr:`long long` or :c:expr:`unsigned long long` |
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   +----------+-----------------------------------------------------+
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   | ``j``    | :c:type:`intmax_t` or :c:type:`uintmax_t`           |
							 | 
						
					
						
							
								
									
										
										
										
											2023-05-22 00:32:39 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   +----------+-----------------------------------------------------+
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   | ``z``    | :c:type:`size_t` or :c:type:`ssize_t`               |
							 | 
						
					
						
							
								
									
										
										
										
											2023-05-22 00:32:39 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   +----------+-----------------------------------------------------+
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   | ``t``    | :c:type:`ptrdiff_t`                                 |
							 | 
						
					
						
							
								
									
										
										
										
											2023-05-22 00:32:39 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   +----------+-----------------------------------------------------+
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The length modifier ``l`` for following conversions ``s`` or ``V`` specify
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   that the type of the argument is :c:expr:`const wchar_t*`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The conversion specifiers are:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. list-table::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      :widths: auto
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      :header-rows: 1
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - Conversion Specifier
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Type
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Comment
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``%``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - *n/a*
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The literal ``%`` character.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``d``, ``i``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Specified by the length modifier
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The decimal representation of a signed C integer.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``u``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Specified by the length modifier
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The decimal representation of an unsigned C integer.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``o``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Specified by the length modifier
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The octal representation of an unsigned C integer.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``x``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Specified by the length modifier
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The hexadecimal representation of an unsigned C integer (lowercase).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``X``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Specified by the length modifier
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The hexadecimal representation of an unsigned C integer (uppercase).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``c``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`int`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - A single character.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``s``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`const char*` or :c:expr:`const wchar_t*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - A null-terminated C character array.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``p``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`const void*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The hex representation of a C  pointer.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          Mostly equivalent to ``printf("%p")`` except that it is guaranteed to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          start with the literal ``0x`` regardless of what the platform's
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          ``printf`` yields.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``A``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The result of calling :func:`ascii`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``U``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - A Unicode object.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``V``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyObject*`, :c:expr:`const char*` or :c:expr:`const wchar_t*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - A Unicode object (which may be ``NULL``) and a null-terminated
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          C character array as a second parameter (which will be used,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          if the first parameter is ``NULL``).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``S``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The result of calling :c:func:`PyObject_Str`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``R``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - The result of calling :c:func:`PyObject_Repr`.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-03-14 23:23:00 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      * - ``T``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Get the fully qualified name of an object type;
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          call :c:func:`PyType_GetFullyQualifiedName`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-04-08 19:27:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      * - ``#T``
							 | 
						
					
						
							
								
									
										
										
										
											2024-03-14 23:23:00 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Similar to ``T`` format, but use a colon (``:``) as separator between
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          the module name and the qualified name.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      * - ``N``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyTypeObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Get the fully qualified name of a type;
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          call :c:func:`PyType_GetFullyQualifiedName`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-04-08 19:27:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      * - ``#N``
							 | 
						
					
						
							
								
									
										
										
										
											2024-03-14 23:23:00 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								        - :c:expr:`PyTypeObject*`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								        - Similar to ``N`` format, but use a colon (``:``) as separator between
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								          the module name and the qualified name.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2013-05-06 23:11:54 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. note::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The width formatter unit is number of characters rather than bytes.
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The precision formatter unit is number of bytes or :c:type:`wchar_t`
							 | 
						
					
						
							
								
									
										
										
										
											2023-05-22 00:32:39 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      items (if the length modifier ``l`` is used) for ``"%s"`` and
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 21:37:16 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      ``"%V"`` (if the ``PyObject*`` argument is ``NULL``), and a number of
							 | 
						
					
						
							
								
									
										
										
										
											2013-05-06 23:11:54 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      characters for ``"%A"``, ``"%U"``, ``"%S"``, ``"%R"`` and ``"%V"``
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 21:37:16 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      (if the ``PyObject*`` argument is not ``NULL``).
							 | 
						
					
						
							
								
									
										
										
										
											2013-05-06 23:11:54 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-05-22 00:32:39 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. note::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Unlike to C :c:func:`printf` the ``0`` flag has effect even when
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      a precision is given for integer conversions (``d``, ``i``, ``u``, ``o``,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      ``x``, or ``X``).
							 | 
						
					
						
							
								
									
										
										
										
											2017-04-27 11:36:35 +08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2009-11-16 17:00:11 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.2
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-17 15:07:14 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      Support for ``"%lld"`` and ``"%llu"`` added.
							 | 
						
					
						
							
								
									
										
										
										
											2009-11-16 17:00:11 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-03-02 00:10:34 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Support for ``"%li"``, ``"%lli"`` and ``"%zi"`` added.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2013-05-06 23:11:54 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.4
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Support width and precision formatter for ``"%s"``, ``"%A"``, ``"%U"``,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      ``"%V"``, ``"%S"``, ``"%R"`` added.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-08-08 19:21:07 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.12
							 | 
						
					
						
							
								
									
										
										
										
											2023-05-22 00:32:39 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      Support for conversion specifiers ``o`` and ``X``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Support for length modifiers ``j`` and ``t``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Length modifiers are now applied to all integer conversions.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Length modifier ``l`` is now applied to conversion specifiers ``s`` and ``V``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Support for variable width and precision ``*``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Support for flag ``-``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-08-08 19:21:07 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      An unrecognized format character now sets a :exc:`SystemError`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      In previous versions it caused all the rest of the format string to be
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      copied as-is to the result string, and any extra arguments discarded.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-03-14 23:23:00 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.13
							 | 
						
					
						
							
								
									
										
										
										
											2024-04-08 19:27:25 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      Support for ``%T``, ``%#T``, ``%N`` and ``%#N`` formats added.
							 | 
						
					
						
							
								
									
										
										
										
											2024-03-14 23:23:00 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Identical to :c:func:`PyUnicode_FromFormat` except that it takes exactly two
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   arguments.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-04-27 14:53:11 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Copy an instance of a Unicode subtype to a new true Unicode object if
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   necessary. If *obj* is already a true Unicode object (not a subtype),
							 | 
						
					
						
							
								
									
										
										
										
											2023-08-07 15:40:59 -06:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   return a new :term:`strong reference` to the object.
							 | 
						
					
						
							
								
									
										
										
										
											2023-04-27 14:53:11 +09:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Objects other than Unicode or its subtypes will cause a :exc:`TypeError`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                               const char *encoding, const char *errors)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-04-15 00:56:21 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Decode an encoded object *obj* to a Unicode object.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2014-12-05 22:25:22 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :class:`bytes`, :class:`bytearray` and other
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :term:`bytes-like objects <bytes-like object>`
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   are decoded according to the given *encoding* and using the error handling
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   defined by *errors*. Both can be ``NULL`` to have the interface use the default
							 | 
						
					
						
							
								
									
										
										
										
											2016-04-15 00:56:21 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   values (see :ref:`builtincodecs` for details).
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   All other objects, including Unicode objects, cause a :exc:`TypeError` to be
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   set.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   The API returns ``NULL`` if there was an error.  The caller is responsible for
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   decref'ing the returned objects.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_GetLength(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the length of the Unicode object, in code points.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 12:35:55 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-08 22:45:38 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_CopyCharacters(PyObject *to, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                                                    Py_ssize_t to_start, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                                                    PyObject *from, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                                                    Py_ssize_t from_start, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                                                    Py_ssize_t how_many)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Copy characters from one Unicode object into another.  This function performs
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-27 02:52:40 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   character conversion when necessary and falls back to :c:func:`!memcpy` if
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   possible.  Returns ``-1`` and sets an exception on error, otherwise returns
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-08 22:45:38 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   the number of copied characters.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2012-01-04 03:59:16 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_Fill(PyObject *unicode, Py_ssize_t start, \
							 | 
						
					
						
							
								
									
										
										
										
											2012-01-04 00:33:50 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                        Py_ssize_t length, Py_UCS4 fill_char)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Fill a string with a character: write *fill_char* into
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``unicode[start:start+length]``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Fail if *fill_char* is bigger than the string maximum character, or if the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   string has more than 1 reference.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the number of written character, or return ``-1`` and raise an
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   exception on error.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_WriteChar(PyObject *unicode, Py_ssize_t index, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                                        Py_UCS4 character)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Write a character to a string.  The string must have been created through
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_New`.  Since Unicode strings are supposed to be immutable,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   the string must not be shared, or have been hashed yet.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   This function checks that *unicode* is a Unicode object, that the index is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   not out of bounds, and that the object can be modified safely (i.e. that it
							 | 
						
					
						
							
								
									
										
										
										
											2016-04-24 03:06:44 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   its reference count is one).
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 12:35:55 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``0`` on success, ``-1`` on error with an exception set.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4 PyUnicode_ReadChar(PyObject *unicode, Py_ssize_t index)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Read a character from a string.  This function checks that *unicode* is a
							 | 
						
					
						
							
								
									
										
										
										
											2022-04-22 14:59:18 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Unicode object and the index is not out of bounds, in contrast to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_READ_CHAR`, which performs no error checking.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 12:35:55 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return character on success, ``-1`` on error with an exception set.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Substring(PyObject *unicode, Py_ssize_t start, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                                              Py_ssize_t end)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return a substring of *unicode*, from character index *start* (included) to
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   character index *end* (excluded).  Negative indices are not supported.
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 12:35:55 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception and return ``NULL``.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4* PyUnicode_AsUCS4(PyObject *unicode, Py_UCS4 *buffer, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                                          Py_ssize_t buflen, int copy_null)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Copy the string *unicode* into a UCS4 buffer, including a null character, if
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *copy_null* is set.  Returns ``NULL`` and sets an exception on error (in
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-02 21:29:26 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   particular, a :exc:`SystemError` if *buflen* is smaller than the length of
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *unicode*).  *buffer* is returned on success.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_UCS4* PyUnicode_AsUCS4Copy(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Copy the string *unicode* into a new UCS4 buffer that is allocated using
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyMem_Malloc`.  If this fails, ``NULL`` is returned with a
							 | 
						
					
						
							
								
									
										
										
										
											2015-05-13 20:31:53 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :exc:`MemoryError` set.  The returned buffer always has an extra
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   null code point appended.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Locale Encoding
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								The current locale encoding can be used to decode text from the operating
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								system.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeLocaleAndSize(const char *str, \
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                                                        Py_ssize_t length, \
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                                                        const char *errors)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-03-04 17:02:06 +08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Decode a string from UTF-8 on Android and VxWorks, or from the current
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   locale encoding on other platforms. The supported
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   error handlers are ``"strict"`` and ``"surrogateescape"``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (:pep:`383`). The decoder uses ``"strict"`` error handler if
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-29 15:23:15 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *errors* is ``NULL``.  *str* must end with a null character but
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   cannot contain embedded null characters.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Use :c:func:`PyUnicode_DecodeFSDefaultAndSize` to decode a string from
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   the :term:`filesystem encoding and error handler`.
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-11-02 16:49:54 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This function ignores the :ref:`Python UTF-8 Mode <utf8-mode>`.
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-15 10:45:49 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. seealso::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The :c:func:`Py_DecodeLocale` function.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-15 10:45:49 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.7
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The function now also uses the current locale encoding for the
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-22 19:07:32 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      ``surrogateescape`` error handler, except on Android. Previously, :c:func:`Py_DecodeLocale`
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-15 10:45:49 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      was used for the ``surrogateescape``, and the current locale encoding was
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      used for ``strict``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeLocale(const char *str, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Similar to :c:func:`PyUnicode_DecodeLocaleAndSize`, but compute the string
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-27 02:52:40 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   length using :c:func:`!strlen`.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_EncodeLocale(PyObject *unicode, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-17 04:13:41 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-03-04 17:02:06 +08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object to UTF-8 on Android and VxWorks, or to the current
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   locale encoding on other platforms. The
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   supported error handlers are ``"strict"`` and ``"surrogateescape"``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (:pep:`383`). The encoder uses ``"strict"`` error handler if
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-17 00:45:56 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *errors* is ``NULL``. Return a :class:`bytes` object. *unicode* cannot
							 | 
						
					
						
							
								
									
										
										
										
											2012-11-28 12:33:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   contain embedded null characters.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-17 04:13:41 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Use :c:func:`PyUnicode_EncodeFSDefault` to encode a string to the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :term:`filesystem encoding and error handler`.
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-11-02 16:49:54 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This function ignores the :ref:`Python UTF-8 Mode <utf8-mode>`.
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-15 10:45:49 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-17 04:13:41 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. seealso::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The :c:func:`Py_EncodeLocale` function.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-17 04:13:41 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-15 10:45:49 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.7
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The function now also uses the current locale encoding for the
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-22 19:07:32 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      ``surrogateescape`` error handler, except on Android. Previously,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      :c:func:`Py_EncodeLocale`
							 | 
						
					
						
							
								
									
										
										
										
											2018-01-15 10:45:49 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      was used for the ``surrogateescape``, and the current locale encoding was
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      used for ``strict``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-17 04:13:41 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								File System Encoding
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Functions encoding to and decoding from the :term:`filesystem encoding and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								error handler` (:pep:`383` and :pep:`529`).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								To encode file names to :class:`bytes` during argument parsing, the ``"O&"``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								converter should be used, passing :c:func:`PyUnicode_FSConverter` as the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								conversion function:
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_FSConverter(PyObject* obj, void* result)
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-06 15:50:29 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   ParseTuple converter: encode :class:`str` objects -- obtained directly or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   through the :class:`os.PathLike` interface -- to :class:`bytes` using
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_EncodeFSDefault`; :class:`bytes` objects are output as-is.
							 | 
						
					
						
							
								
									
										
										
										
											2022-10-05 00:11:34 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *result* must be a :c:expr:`PyBytesObject*` which must be released when it is
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-13 23:59:58 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   no longer used.
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.1
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-06 15:50:29 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.6
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Accepts a :term:`path-like object`.
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-17 15:07:14 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-08 10:35:16 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								To decode file names to :class:`str` during argument parsing, the ``"O&"``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								converter should be used, passing :c:func:`PyUnicode_FSDecoder` as the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								conversion function:
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-13 23:59:58 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_FSDecoder(PyObject* obj, void* result)
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-13 23:59:58 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-06 19:36:01 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   ParseTuple converter: decode :class:`bytes` objects -- obtained either
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   directly or indirectly through the :class:`os.PathLike` interface -- to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :class:`str` using :c:func:`PyUnicode_DecodeFSDefaultAndSize`; :class:`str`
							 | 
						
					
						
							
								
									
										
										
										
											2022-10-05 00:11:20 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   objects are output as-is. *result* must be a :c:expr:`PyUnicodeObject*` which
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-06 19:36:01 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   must be released when it is no longer used.
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-13 23:59:58 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.2
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-06 19:36:01 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.6
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Accepts a :term:`path-like object`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-17 15:07:14 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeFSDefaultAndSize(const char *str, Py_ssize_t size)
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-11-02 16:49:54 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Decode a string from the :term:`filesystem encoding and error handler`.
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If you need to decode a string from the current locale encoding, use
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_DecodeLocaleAndSize`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. seealso::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The :c:func:`Py_DecodeLocale` function.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-16 23:56:01 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-08 10:35:16 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.6
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The :term:`filesystem error handler <filesystem encoding and error
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      handler>` is now used.
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-15 16:27:27 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeFSDefault(const char *str)
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-11-02 16:49:54 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Decode a null-terminated string from the :term:`filesystem encoding and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   error handler`.
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If the string length is known, use
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_DecodeFSDefaultAndSize`.
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-09 10:34:37 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-08 10:35:16 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.6
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The :term:`filesystem error handler <filesystem encoding and error
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      handler>` is now used.
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-09 10:34:37 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_EncodeFSDefault(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-15 16:27:27 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object to the :term:`filesystem encoding and error
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   handler`, and return :class:`bytes`. Note that the resulting :class:`bytes`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   object can contain null bytes.
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-15 16:27:27 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If you need to encode a string to the current locale encoding, use
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_EncodeLocale`.
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-17 04:13:41 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. seealso::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2014-08-01 12:28:48 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The :c:func:`Py_EncodeLocale` function.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-17 04:13:41 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-15 16:27:27 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.2
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-09-08 10:35:16 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.6
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      The :term:`filesystem error handler <filesystem encoding and error
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      handler>` is now used.
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-15 16:27:27 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								wchar_t Support
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								:c:type:`wchar_t` support for platforms which support it:
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_FromWideChar(const wchar_t *wstr, Py_ssize_t size)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object from the :c:type:`wchar_t` buffer *wstr* of the given *size*.
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Passing ``-1`` as the *size* indicates that the function must itself compute the length,
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   using :c:func:`!wcslen`.
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``NULL`` on failure.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_AsWideChar(PyObject *unicode, wchar_t *wstr, Py_ssize_t size)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Copy the Unicode object contents into the :c:type:`wchar_t` buffer *wstr*.  At most
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *size* :c:type:`wchar_t` characters are copied (excluding a possibly trailing
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   null termination character).  Return the number of :c:type:`wchar_t` characters
							 | 
						
					
						
							
								
									
										
										
										
											2024-02-13 22:23:10 +08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   copied or ``-1`` in case of an error.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   When *wstr* is ``NULL``, instead return the *size* that would be required
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   to store all of *unicode* including a terminating null.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Note that the resulting :c:expr:`wchar_t*`
							 | 
						
					
						
							
								
									
										
										
										
											2015-05-13 20:31:53 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   string may or may not be null-terminated.  It is the responsibility of the caller
							 | 
						
					
						
							
								
									
										
										
										
											2022-10-05 19:01:14 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   to make sure that the :c:expr:`wchar_t*` string is null-terminated in case this is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   required by the application. Also, note that the :c:expr:`wchar_t*` string
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-18 19:22:31 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   might contain null characters, which would cause the string to be truncated
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   when used with most C functions.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-07 01:02:42 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: wchar_t* PyUnicode_AsWideCharString(PyObject *unicode, Py_ssize_t *size)
							 | 
						
					
						
							
								
									
										
										
										
											2010-09-29 10:25:54 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Convert the Unicode object to a wide character string. The output string
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   always ends with a null character. If *size* is not ``NULL``, write the number
							 | 
						
					
						
							
								
									
										
										
										
											2015-05-13 20:31:53 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   of wide characters (excluding the trailing null termination character) into
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-22 21:35:22 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *\*size*. Note that the resulting :c:type:`wchar_t` string might contain
							 | 
						
					
						
							
								
									
										
										
										
											2017-06-27 16:03:14 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   null characters, which would cause the string to be truncated when used with
							 | 
						
					
						
							
								
									
										
										
										
											2022-10-05 19:01:14 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   most C functions. If *size* is ``NULL`` and the :c:expr:`wchar_t*` string
							 | 
						
					
						
							
								
									
										
										
										
											2017-06-27 16:03:14 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   contains null characters a :exc:`ValueError` is raised.
							 | 
						
					
						
							
								
									
										
										
										
											2010-09-29 10:25:54 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-27 02:52:40 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Returns a buffer allocated by :c:macro:`PyMem_New` (use
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyMem_Free` to free it) on success. On error, returns ``NULL``
							 | 
						
					
						
							
								
									
										
										
										
											2017-06-27 16:03:14 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   and *\*size* is undefined. Raises a :exc:`MemoryError` if memory allocation
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   is failed.
							 | 
						
					
						
							
								
									
										
										
										
											2010-09-29 10:25:54 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.2
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-06-27 16:03:14 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.7
							 | 
						
					
						
							
								
									
										
										
										
											2022-10-05 19:01:14 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      Raises a :exc:`ValueError` if *size* is ``NULL`` and the :c:expr:`wchar_t*`
							 | 
						
					
						
							
								
									
										
										
										
											2017-06-27 16:03:14 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      string contains null characters.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-09-29 10:25:54 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. _builtincodecs:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								Built-in Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								^^^^^^^^^^^^^^^
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2009-07-26 14:54:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Python provides a set of built-in codecs which are written in C for speed. All of
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								these codecs are directly usable via the following functions.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-04-14 07:43:53 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Many of the following APIs take two arguments encoding and errors, and they
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								have the same semantics as the ones of the built-in :func:`str` string object
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								constructor.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Setting encoding to ``NULL`` causes the default encoding to be used
							 | 
						
					
						
							
								
									
										
										
										
											2020-02-10 23:32:18 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								which is UTF-8.  The file system calls should use
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								:c:func:`PyUnicode_FSConverter` for encoding file names. This uses the
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-23 14:56:59 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								:term:`filesystem encoding and error handler` internally.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Error handling is set by errors which may also be set to ``NULL`` meaning to use
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								the default handling defined for the codec.  Default error handling for all
							 | 
						
					
						
							
								
									
										
										
										
											2009-07-26 14:54:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								built-in codecs is "strict" (:exc:`ValueError` is raised).
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-22 20:34:17 -08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								The codecs all use a similar interface.  Only deviations from the following
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								generic ones are documented for simplicity.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Generic Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the generic codec APIs:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Decode(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *encoding, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the encoded string *str*.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *encoding* and *errors* have the same meaning as the parameters of the same name
							 | 
						
					
						
							
								
									
										
										
										
											2013-10-09 13:26:17 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   in the :func:`str` built-in function.  The codec to be used is looked up
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   using the Python codec registry.  Return ``NULL`` if an exception was raised by
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsEncodedString(PyObject *unicode, \
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								                              const char *encoding, const char *errors)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object and return the result as Python bytes object.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *encoding* and *errors* have the same meaning as the parameters of the same
							 | 
						
					
						
							
								
									
										
										
										
											2013-10-09 13:26:17 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   name in the Unicode :meth:`~str.encode` method. The codec to be used is looked up
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   using the Python codec registry. Return ``NULL`` if an exception was raised by
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								UTF-8 Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the UTF-8 codec APIs:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF8(const char *str, Py_ssize_t size, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the UTF-8 encoded string
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *str*. Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF8Stateful(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *errors, Py_ssize_t *consumed)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *consumed* is ``NULL``, behave like :c:func:`PyUnicode_DecodeUTF8`. If
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *consumed* is not ``NULL``, trailing incomplete UTF-8 byte sequences will not be
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   treated as an error. Those bytes will not be decoded and the number of bytes
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   that have been decoded will be stored in *consumed*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsUTF8String(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2009-01-13 23:14:04 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object using UTF-8 and return the result as Python bytes
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   object.  Error handling is "strict".  Return ``NULL`` if an exception was
							 | 
						
					
						
							
								
									
										
										
										
											2009-01-13 23:14:04 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 22:13:29 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   The function fails if the string contains surrogate code points
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (``U+D800`` - ``U+DFFF``).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-01-22 23:07:07 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: const char* PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2015-05-13 20:31:53 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return a pointer to the UTF-8 encoding of the Unicode object, and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   store the size of the encoded representation (in bytes) in *size*.  The
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *size* argument can be ``NULL``; in this case no size will be stored.  The
							 | 
						
					
						
							
								
									
										
										
										
											2015-05-13 20:31:53 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   returned buffer always has an extra null byte appended (not included in
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *size*), regardless of whether there are any other null code points.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-10-20 20:03:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, set *size* to ``-1`` (if it's not NULL) and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   return ``NULL``.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 22:13:29 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   The function fails if the string contains surrogate code points
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (``U+D800`` - ``U+DFFF``).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This caches the UTF-8 representation of the string in the Unicode object, and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   subsequent calls will return a pointer to the same buffer.  The caller is not
							 | 
						
					
						
							
								
									
										
										
										
											2022-05-06 05:37:08 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   responsible for deallocating the buffer. The buffer is deallocated and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   pointers to it become invalid when the Unicode object is garbage collected.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-01-22 23:07:07 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.7
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The return type is now ``const char *`` rather of ``char *``.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-10-19 18:17:50 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.10
							 | 
						
					
						
							
								
									
										
										
										
											2023-06-06 10:40:32 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      This function is a part of the :ref:`limited API <limited-c-api>`.
							 | 
						
					
						
							
								
									
										
										
										
											2020-10-19 18:17:50 -04:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-01-22 23:07:07 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: const char* PyUnicode_AsUTF8(PyObject *unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-01-22 23:07:07 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.7
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The return type is now ``const char *`` rather of ``char *``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								UTF-32 Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the UTF-32 codec APIs:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *errors, int *byteorder)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-04-14 07:43:53 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Decode *size* bytes from a UTF-32 encoded buffer string and return the
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   corresponding Unicode object.  *errors* (if non-``NULL``) defines the error
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   handling. It defaults to "strict".
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *byteorder* is non-``NULL``, the decoder starts decoding using the given byte
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   order::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      *byteorder == -1: little endian
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      *byteorder == 0:  native order
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      *byteorder == 1:  big endian
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
											 
										 
										
											
												Merged revisions 74779-74786,74793,74795,74811,74860-74861,74863,74876,74886,74896,74901,74903,74908,74912,74930,74933,74943,74946,74952-74955,75015,75019,75032,75068,75076,75095,75098,75102,75129,75139,75230 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
  r74779 | michael.foord | 2009-09-13 11:13:36 -0500 (Sun, 13 Sep 2009) | 1 line
  Change to tutorial wording for reading text / binary files on Windows. Issue #6301.
........
  r74780 | michael.foord | 2009-09-13 11:40:02 -0500 (Sun, 13 Sep 2009) | 1 line
  Objects that compare equal automatically pass or fail assertAlmostEqual and assertNotAlmostEqual tests on unittest.TestCase. Issue 6567.
........
  r74781 | michael.foord | 2009-09-13 11:46:19 -0500 (Sun, 13 Sep 2009) | 1 line
  Note that sys._getframe is not guaranteed to exist in all implementations of Python, and a corresponding note in inspect.currentframe. Issue 6712.
........
  r74782 | michael.foord | 2009-09-13 12:07:46 -0500 (Sun, 13 Sep 2009) | 1 line
  Tutorial tweaks. Issue 6849.
........
  r74783 | michael.foord | 2009-09-13 12:28:35 -0500 (Sun, 13 Sep 2009) | 1 line
  unittest.TestLoader.loadTestsFromName honors the loader suiteClass attribute. Issue 6866.
........
  r74784 | georg.brandl | 2009-09-13 13:15:07 -0500 (Sun, 13 Sep 2009) | 1 line
  Typo fix.
........
  r74785 | michael.foord | 2009-09-13 14:07:03 -0500 (Sun, 13 Sep 2009) | 1 line
  Test discovery in unittest will only attempt to import modules that are importable; i.e. their names are valid Python identifiers. If an import fails during discovery this will be recorded as an error and test discovery will continue. Issue 6568.
........
  r74786 | michael.foord | 2009-09-13 14:08:18 -0500 (Sun, 13 Sep 2009) | 1 line
  Remove an extraneous space in unittest documentation.
........
  r74793 | georg.brandl | 2009-09-14 09:50:47 -0500 (Mon, 14 Sep 2009) | 1 line
  #6908: fix association of hashlib hash attributes.
........
  r74795 | benjamin.peterson | 2009-09-14 22:36:26 -0500 (Mon, 14 Sep 2009) | 1 line
  Py_SetPythonHome uses static storage #6913
........
  r74811 | georg.brandl | 2009-09-15 15:26:59 -0500 (Tue, 15 Sep 2009) | 1 line
  Add Armin Ronacher.
........
  r74860 | benjamin.peterson | 2009-09-16 21:46:54 -0500 (Wed, 16 Sep 2009) | 1 line
  kill bare except
........
  r74861 | benjamin.peterson | 2009-09-16 22:18:28 -0500 (Wed, 16 Sep 2009) | 1 line
  pep 8 defaults
........
  r74863 | benjamin.peterson | 2009-09-16 22:27:33 -0500 (Wed, 16 Sep 2009) | 1 line
  rationalize a bit
........
  r74876 | georg.brandl | 2009-09-17 11:15:53 -0500 (Thu, 17 Sep 2009) | 1 line
  #6932: remove paragraph that advises relying on __del__ being called.
........
  r74886 | benjamin.peterson | 2009-09-17 16:33:46 -0500 (Thu, 17 Sep 2009) | 1 line
  use macros
........
  r74896 | georg.brandl | 2009-09-18 02:22:41 -0500 (Fri, 18 Sep 2009) | 1 line
  #6936: for interactive use, quit() is just fine.
........
  r74901 | georg.brandl | 2009-09-18 04:14:52 -0500 (Fri, 18 Sep 2009) | 1 line
  #6905: use better exception messages in inspect when the argument is of the wrong type.
........
  r74903 | georg.brandl | 2009-09-18 04:18:27 -0500 (Fri, 18 Sep 2009) | 1 line
  #6938: "ident" is always a string, so use a format code which works.
........
  r74908 | georg.brandl | 2009-09-18 08:57:11 -0500 (Fri, 18 Sep 2009) | 1 line
  Use str.format() to fix beginner's mistake with %-style string formatting.
........
  r74912 | georg.brandl | 2009-09-18 11:19:56 -0500 (Fri, 18 Sep 2009) | 1 line
  Optimize optimization and fix method name in docstring.
........
  r74930 | georg.brandl | 2009-09-18 16:21:41 -0500 (Fri, 18 Sep 2009) | 1 line
  #6925: rewrite docs for locals() and vars() a bit.
........
  r74933 | georg.brandl | 2009-09-18 16:35:59 -0500 (Fri, 18 Sep 2009) | 1 line
  #6930: clarify description about byteorder handling in UTF decoder routines.
........
  r74943 | georg.brandl | 2009-09-19 02:35:07 -0500 (Sat, 19 Sep 2009) | 1 line
  #6944: the argument to PyArg_ParseTuple should be a tuple, otherwise a SystemError is set.  Also clean up another usage of PyArg_ParseTuple.
........
  r74946 | georg.brandl | 2009-09-19 03:43:16 -0500 (Sat, 19 Sep 2009) | 1 line
  Update bug tracker reference.
........
  r74952 | georg.brandl | 2009-09-19 05:42:34 -0500 (Sat, 19 Sep 2009) | 1 line
  #6946: fix duplicate index entries for datetime classes.
........
  r74953 | georg.brandl | 2009-09-19 07:04:16 -0500 (Sat, 19 Sep 2009) | 1 line
  Fix references to threading.enumerate().
........
  r74954 | georg.brandl | 2009-09-19 08:13:56 -0500 (Sat, 19 Sep 2009) | 1 line
  Add Doug.
........
  r74955 | georg.brandl | 2009-09-19 08:20:49 -0500 (Sat, 19 Sep 2009) | 1 line
  Add Mark Summerfield.
........
  r75015 | georg.brandl | 2009-09-22 05:55:08 -0500 (Tue, 22 Sep 2009) | 1 line
  Fix encoding name.
........
  r75019 | vinay.sajip | 2009-09-22 12:23:41 -0500 (Tue, 22 Sep 2009) | 1 line
  Fixed a typo, and added sections on optimization and using arbitrary objects as messages.
........
  r75032 | benjamin.peterson | 2009-09-22 17:15:28 -0500 (Tue, 22 Sep 2009) | 1 line
  fix typos/rephrase
........
  r75068 | benjamin.peterson | 2009-09-25 21:57:59 -0500 (Fri, 25 Sep 2009) | 1 line
  comment out ugly xxx
........
  r75076 | vinay.sajip | 2009-09-26 09:53:32 -0500 (Sat, 26 Sep 2009) | 1 line
  Tidied up name of parameter in StreamHandler
........
  r75095 | michael.foord | 2009-09-27 14:15:41 -0500 (Sun, 27 Sep 2009) | 1 line
  Test creation moved from TestProgram.parseArgs to TestProgram.createTests exclusively. Issue 6956.
........
  r75098 | michael.foord | 2009-09-27 15:08:23 -0500 (Sun, 27 Sep 2009) | 1 line
  Documentation improvement for load_tests protocol in unittest. Issue 6515.
........
  r75102 | skip.montanaro | 2009-09-27 21:12:27 -0500 (Sun, 27 Sep 2009) | 3 lines
  Patch from Thomas Barr so that csv.Sniffer will set doublequote property.
  Closes issue 6606.
........
  r75129 | vinay.sajip | 2009-09-29 02:08:54 -0500 (Tue, 29 Sep 2009) | 1 line
  Issue #7014: logging: Improved IronPython 2.6 compatibility.
........
  r75139 | raymond.hettinger | 2009-09-29 13:53:24 -0500 (Tue, 29 Sep 2009) | 3 lines
  Issue 7008: Better document str.title and show how to work around the apostrophe problem.
........
  r75230 | benjamin.peterson | 2009-10-04 08:38:38 -0500 (Sun, 04 Oct 2009) | 1 line
  test logging
........
											
										 
										
											2009-10-04 14:49:41 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If ``*byteorder`` is zero, and the first four bytes of the input data are a
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   byte order mark (BOM), the decoder switches to this byte order and the BOM is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   not copied into the resulting Unicode string.  If ``*byteorder`` is ``-1`` or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``1``, any byte order mark is copied to the output.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   After completion, *\*byteorder* is set to the current byte order at the end
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   of input data.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *byteorder* is ``NULL``, the codec starts in native order mode.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF32Stateful(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *errors, int *byteorder, Py_ssize_t *consumed)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *consumed* is ``NULL``, behave like :c:func:`PyUnicode_DecodeUTF32`. If
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *consumed* is not ``NULL``, :c:func:`PyUnicode_DecodeUTF32Stateful` will not treat
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   trailing incomplete UTF-32 byte sequences (such as a number of bytes not divisible
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   by four) as an error. Those bytes will not be decoded and the number of bytes
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   that have been decoded will be stored in *consumed*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsUTF32String(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return a Python byte string using the UTF-32 encoding in native byte
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   order. The string always starts with a BOM mark.  Error handling is "strict".
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								UTF-16 Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								These are the UTF-16 codec APIs:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *errors, int *byteorder)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-04-14 07:43:53 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Decode *size* bytes from a UTF-16 encoded buffer string and return the
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   corresponding Unicode object.  *errors* (if non-``NULL``) defines the error
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   handling. It defaults to "strict".
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *byteorder* is non-``NULL``, the decoder starts decoding using the given byte
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   order::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      *byteorder == -1: little endian
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      *byteorder == 0:  native order
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      *byteorder == 1:  big endian
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
											 
										 
										
											
												Merged revisions 74779-74786,74793,74795,74811,74860-74861,74863,74876,74886,74896,74901,74903,74908,74912,74930,74933,74943,74946,74952-74955,75015,75019,75032,75068,75076,75095,75098,75102,75129,75139,75230 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
  r74779 | michael.foord | 2009-09-13 11:13:36 -0500 (Sun, 13 Sep 2009) | 1 line
  Change to tutorial wording for reading text / binary files on Windows. Issue #6301.
........
  r74780 | michael.foord | 2009-09-13 11:40:02 -0500 (Sun, 13 Sep 2009) | 1 line
  Objects that compare equal automatically pass or fail assertAlmostEqual and assertNotAlmostEqual tests on unittest.TestCase. Issue 6567.
........
  r74781 | michael.foord | 2009-09-13 11:46:19 -0500 (Sun, 13 Sep 2009) | 1 line
  Note that sys._getframe is not guaranteed to exist in all implementations of Python, and a corresponding note in inspect.currentframe. Issue 6712.
........
  r74782 | michael.foord | 2009-09-13 12:07:46 -0500 (Sun, 13 Sep 2009) | 1 line
  Tutorial tweaks. Issue 6849.
........
  r74783 | michael.foord | 2009-09-13 12:28:35 -0500 (Sun, 13 Sep 2009) | 1 line
  unittest.TestLoader.loadTestsFromName honors the loader suiteClass attribute. Issue 6866.
........
  r74784 | georg.brandl | 2009-09-13 13:15:07 -0500 (Sun, 13 Sep 2009) | 1 line
  Typo fix.
........
  r74785 | michael.foord | 2009-09-13 14:07:03 -0500 (Sun, 13 Sep 2009) | 1 line
  Test discovery in unittest will only attempt to import modules that are importable; i.e. their names are valid Python identifiers. If an import fails during discovery this will be recorded as an error and test discovery will continue. Issue 6568.
........
  r74786 | michael.foord | 2009-09-13 14:08:18 -0500 (Sun, 13 Sep 2009) | 1 line
  Remove an extraneous space in unittest documentation.
........
  r74793 | georg.brandl | 2009-09-14 09:50:47 -0500 (Mon, 14 Sep 2009) | 1 line
  #6908: fix association of hashlib hash attributes.
........
  r74795 | benjamin.peterson | 2009-09-14 22:36:26 -0500 (Mon, 14 Sep 2009) | 1 line
  Py_SetPythonHome uses static storage #6913
........
  r74811 | georg.brandl | 2009-09-15 15:26:59 -0500 (Tue, 15 Sep 2009) | 1 line
  Add Armin Ronacher.
........
  r74860 | benjamin.peterson | 2009-09-16 21:46:54 -0500 (Wed, 16 Sep 2009) | 1 line
  kill bare except
........
  r74861 | benjamin.peterson | 2009-09-16 22:18:28 -0500 (Wed, 16 Sep 2009) | 1 line
  pep 8 defaults
........
  r74863 | benjamin.peterson | 2009-09-16 22:27:33 -0500 (Wed, 16 Sep 2009) | 1 line
  rationalize a bit
........
  r74876 | georg.brandl | 2009-09-17 11:15:53 -0500 (Thu, 17 Sep 2009) | 1 line
  #6932: remove paragraph that advises relying on __del__ being called.
........
  r74886 | benjamin.peterson | 2009-09-17 16:33:46 -0500 (Thu, 17 Sep 2009) | 1 line
  use macros
........
  r74896 | georg.brandl | 2009-09-18 02:22:41 -0500 (Fri, 18 Sep 2009) | 1 line
  #6936: for interactive use, quit() is just fine.
........
  r74901 | georg.brandl | 2009-09-18 04:14:52 -0500 (Fri, 18 Sep 2009) | 1 line
  #6905: use better exception messages in inspect when the argument is of the wrong type.
........
  r74903 | georg.brandl | 2009-09-18 04:18:27 -0500 (Fri, 18 Sep 2009) | 1 line
  #6938: "ident" is always a string, so use a format code which works.
........
  r74908 | georg.brandl | 2009-09-18 08:57:11 -0500 (Fri, 18 Sep 2009) | 1 line
  Use str.format() to fix beginner's mistake with %-style string formatting.
........
  r74912 | georg.brandl | 2009-09-18 11:19:56 -0500 (Fri, 18 Sep 2009) | 1 line
  Optimize optimization and fix method name in docstring.
........
  r74930 | georg.brandl | 2009-09-18 16:21:41 -0500 (Fri, 18 Sep 2009) | 1 line
  #6925: rewrite docs for locals() and vars() a bit.
........
  r74933 | georg.brandl | 2009-09-18 16:35:59 -0500 (Fri, 18 Sep 2009) | 1 line
  #6930: clarify description about byteorder handling in UTF decoder routines.
........
  r74943 | georg.brandl | 2009-09-19 02:35:07 -0500 (Sat, 19 Sep 2009) | 1 line
  #6944: the argument to PyArg_ParseTuple should be a tuple, otherwise a SystemError is set.  Also clean up another usage of PyArg_ParseTuple.
........
  r74946 | georg.brandl | 2009-09-19 03:43:16 -0500 (Sat, 19 Sep 2009) | 1 line
  Update bug tracker reference.
........
  r74952 | georg.brandl | 2009-09-19 05:42:34 -0500 (Sat, 19 Sep 2009) | 1 line
  #6946: fix duplicate index entries for datetime classes.
........
  r74953 | georg.brandl | 2009-09-19 07:04:16 -0500 (Sat, 19 Sep 2009) | 1 line
  Fix references to threading.enumerate().
........
  r74954 | georg.brandl | 2009-09-19 08:13:56 -0500 (Sat, 19 Sep 2009) | 1 line
  Add Doug.
........
  r74955 | georg.brandl | 2009-09-19 08:20:49 -0500 (Sat, 19 Sep 2009) | 1 line
  Add Mark Summerfield.
........
  r75015 | georg.brandl | 2009-09-22 05:55:08 -0500 (Tue, 22 Sep 2009) | 1 line
  Fix encoding name.
........
  r75019 | vinay.sajip | 2009-09-22 12:23:41 -0500 (Tue, 22 Sep 2009) | 1 line
  Fixed a typo, and added sections on optimization and using arbitrary objects as messages.
........
  r75032 | benjamin.peterson | 2009-09-22 17:15:28 -0500 (Tue, 22 Sep 2009) | 1 line
  fix typos/rephrase
........
  r75068 | benjamin.peterson | 2009-09-25 21:57:59 -0500 (Fri, 25 Sep 2009) | 1 line
  comment out ugly xxx
........
  r75076 | vinay.sajip | 2009-09-26 09:53:32 -0500 (Sat, 26 Sep 2009) | 1 line
  Tidied up name of parameter in StreamHandler
........
  r75095 | michael.foord | 2009-09-27 14:15:41 -0500 (Sun, 27 Sep 2009) | 1 line
  Test creation moved from TestProgram.parseArgs to TestProgram.createTests exclusively. Issue 6956.
........
  r75098 | michael.foord | 2009-09-27 15:08:23 -0500 (Sun, 27 Sep 2009) | 1 line
  Documentation improvement for load_tests protocol in unittest. Issue 6515.
........
  r75102 | skip.montanaro | 2009-09-27 21:12:27 -0500 (Sun, 27 Sep 2009) | 3 lines
  Patch from Thomas Barr so that csv.Sniffer will set doublequote property.
  Closes issue 6606.
........
  r75129 | vinay.sajip | 2009-09-29 02:08:54 -0500 (Tue, 29 Sep 2009) | 1 line
  Issue #7014: logging: Improved IronPython 2.6 compatibility.
........
  r75139 | raymond.hettinger | 2009-09-29 13:53:24 -0500 (Tue, 29 Sep 2009) | 3 lines
  Issue 7008: Better document str.title and show how to work around the apostrophe problem.
........
  r75230 | benjamin.peterson | 2009-10-04 08:38:38 -0500 (Sun, 04 Oct 2009) | 1 line
  test logging
........
											
										 
										
											2009-10-04 14:49:41 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If ``*byteorder`` is zero, and the first two bytes of the input data are a
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   byte order mark (BOM), the decoder switches to this byte order and the BOM is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   not copied into the resulting Unicode string.  If ``*byteorder`` is ``-1`` or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``1``, any byte order mark is copied to the output (where it will result in
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   either a ``\ufeff`` or a ``\ufffe`` character).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2022-02-22 20:34:17 -08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   After completion, ``*byteorder`` is set to the current byte order at the end
							 | 
						
					
						
							
								
									
										
											 
										 
										
											
												Merged revisions 74779-74786,74793,74795,74811,74860-74861,74863,74876,74886,74896,74901,74903,74908,74912,74930,74933,74943,74946,74952-74955,75015,75019,75032,75068,75076,75095,75098,75102,75129,75139,75230 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
  r74779 | michael.foord | 2009-09-13 11:13:36 -0500 (Sun, 13 Sep 2009) | 1 line
  Change to tutorial wording for reading text / binary files on Windows. Issue #6301.
........
  r74780 | michael.foord | 2009-09-13 11:40:02 -0500 (Sun, 13 Sep 2009) | 1 line
  Objects that compare equal automatically pass or fail assertAlmostEqual and assertNotAlmostEqual tests on unittest.TestCase. Issue 6567.
........
  r74781 | michael.foord | 2009-09-13 11:46:19 -0500 (Sun, 13 Sep 2009) | 1 line
  Note that sys._getframe is not guaranteed to exist in all implementations of Python, and a corresponding note in inspect.currentframe. Issue 6712.
........
  r74782 | michael.foord | 2009-09-13 12:07:46 -0500 (Sun, 13 Sep 2009) | 1 line
  Tutorial tweaks. Issue 6849.
........
  r74783 | michael.foord | 2009-09-13 12:28:35 -0500 (Sun, 13 Sep 2009) | 1 line
  unittest.TestLoader.loadTestsFromName honors the loader suiteClass attribute. Issue 6866.
........
  r74784 | georg.brandl | 2009-09-13 13:15:07 -0500 (Sun, 13 Sep 2009) | 1 line
  Typo fix.
........
  r74785 | michael.foord | 2009-09-13 14:07:03 -0500 (Sun, 13 Sep 2009) | 1 line
  Test discovery in unittest will only attempt to import modules that are importable; i.e. their names are valid Python identifiers. If an import fails during discovery this will be recorded as an error and test discovery will continue. Issue 6568.
........
  r74786 | michael.foord | 2009-09-13 14:08:18 -0500 (Sun, 13 Sep 2009) | 1 line
  Remove an extraneous space in unittest documentation.
........
  r74793 | georg.brandl | 2009-09-14 09:50:47 -0500 (Mon, 14 Sep 2009) | 1 line
  #6908: fix association of hashlib hash attributes.
........
  r74795 | benjamin.peterson | 2009-09-14 22:36:26 -0500 (Mon, 14 Sep 2009) | 1 line
  Py_SetPythonHome uses static storage #6913
........
  r74811 | georg.brandl | 2009-09-15 15:26:59 -0500 (Tue, 15 Sep 2009) | 1 line
  Add Armin Ronacher.
........
  r74860 | benjamin.peterson | 2009-09-16 21:46:54 -0500 (Wed, 16 Sep 2009) | 1 line
  kill bare except
........
  r74861 | benjamin.peterson | 2009-09-16 22:18:28 -0500 (Wed, 16 Sep 2009) | 1 line
  pep 8 defaults
........
  r74863 | benjamin.peterson | 2009-09-16 22:27:33 -0500 (Wed, 16 Sep 2009) | 1 line
  rationalize a bit
........
  r74876 | georg.brandl | 2009-09-17 11:15:53 -0500 (Thu, 17 Sep 2009) | 1 line
  #6932: remove paragraph that advises relying on __del__ being called.
........
  r74886 | benjamin.peterson | 2009-09-17 16:33:46 -0500 (Thu, 17 Sep 2009) | 1 line
  use macros
........
  r74896 | georg.brandl | 2009-09-18 02:22:41 -0500 (Fri, 18 Sep 2009) | 1 line
  #6936: for interactive use, quit() is just fine.
........
  r74901 | georg.brandl | 2009-09-18 04:14:52 -0500 (Fri, 18 Sep 2009) | 1 line
  #6905: use better exception messages in inspect when the argument is of the wrong type.
........
  r74903 | georg.brandl | 2009-09-18 04:18:27 -0500 (Fri, 18 Sep 2009) | 1 line
  #6938: "ident" is always a string, so use a format code which works.
........
  r74908 | georg.brandl | 2009-09-18 08:57:11 -0500 (Fri, 18 Sep 2009) | 1 line
  Use str.format() to fix beginner's mistake with %-style string formatting.
........
  r74912 | georg.brandl | 2009-09-18 11:19:56 -0500 (Fri, 18 Sep 2009) | 1 line
  Optimize optimization and fix method name in docstring.
........
  r74930 | georg.brandl | 2009-09-18 16:21:41 -0500 (Fri, 18 Sep 2009) | 1 line
  #6925: rewrite docs for locals() and vars() a bit.
........
  r74933 | georg.brandl | 2009-09-18 16:35:59 -0500 (Fri, 18 Sep 2009) | 1 line
  #6930: clarify description about byteorder handling in UTF decoder routines.
........
  r74943 | georg.brandl | 2009-09-19 02:35:07 -0500 (Sat, 19 Sep 2009) | 1 line
  #6944: the argument to PyArg_ParseTuple should be a tuple, otherwise a SystemError is set.  Also clean up another usage of PyArg_ParseTuple.
........
  r74946 | georg.brandl | 2009-09-19 03:43:16 -0500 (Sat, 19 Sep 2009) | 1 line
  Update bug tracker reference.
........
  r74952 | georg.brandl | 2009-09-19 05:42:34 -0500 (Sat, 19 Sep 2009) | 1 line
  #6946: fix duplicate index entries for datetime classes.
........
  r74953 | georg.brandl | 2009-09-19 07:04:16 -0500 (Sat, 19 Sep 2009) | 1 line
  Fix references to threading.enumerate().
........
  r74954 | georg.brandl | 2009-09-19 08:13:56 -0500 (Sat, 19 Sep 2009) | 1 line
  Add Doug.
........
  r74955 | georg.brandl | 2009-09-19 08:20:49 -0500 (Sat, 19 Sep 2009) | 1 line
  Add Mark Summerfield.
........
  r75015 | georg.brandl | 2009-09-22 05:55:08 -0500 (Tue, 22 Sep 2009) | 1 line
  Fix encoding name.
........
  r75019 | vinay.sajip | 2009-09-22 12:23:41 -0500 (Tue, 22 Sep 2009) | 1 line
  Fixed a typo, and added sections on optimization and using arbitrary objects as messages.
........
  r75032 | benjamin.peterson | 2009-09-22 17:15:28 -0500 (Tue, 22 Sep 2009) | 1 line
  fix typos/rephrase
........
  r75068 | benjamin.peterson | 2009-09-25 21:57:59 -0500 (Fri, 25 Sep 2009) | 1 line
  comment out ugly xxx
........
  r75076 | vinay.sajip | 2009-09-26 09:53:32 -0500 (Sat, 26 Sep 2009) | 1 line
  Tidied up name of parameter in StreamHandler
........
  r75095 | michael.foord | 2009-09-27 14:15:41 -0500 (Sun, 27 Sep 2009) | 1 line
  Test creation moved from TestProgram.parseArgs to TestProgram.createTests exclusively. Issue 6956.
........
  r75098 | michael.foord | 2009-09-27 15:08:23 -0500 (Sun, 27 Sep 2009) | 1 line
  Documentation improvement for load_tests protocol in unittest. Issue 6515.
........
  r75102 | skip.montanaro | 2009-09-27 21:12:27 -0500 (Sun, 27 Sep 2009) | 3 lines
  Patch from Thomas Barr so that csv.Sniffer will set doublequote property.
  Closes issue 6606.
........
  r75129 | vinay.sajip | 2009-09-29 02:08:54 -0500 (Tue, 29 Sep 2009) | 1 line
  Issue #7014: logging: Improved IronPython 2.6 compatibility.
........
  r75139 | raymond.hettinger | 2009-09-29 13:53:24 -0500 (Tue, 29 Sep 2009) | 3 lines
  Issue 7008: Better document str.title and show how to work around the apostrophe problem.
........
  r75230 | benjamin.peterson | 2009-10-04 08:38:38 -0500 (Sun, 04 Oct 2009) | 1 line
  test logging
........
											
										 
										
											2009-10-04 14:49:41 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   of input data.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *byteorder* is ``NULL``, the codec starts in native order mode.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF16Stateful(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *errors, int *byteorder, Py_ssize_t *consumed)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *consumed* is ``NULL``, behave like :c:func:`PyUnicode_DecodeUTF16`. If
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *consumed* is not ``NULL``, :c:func:`PyUnicode_DecodeUTF16Stateful` will not treat
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   trailing incomplete UTF-16 byte sequences (such as an odd number of bytes or a
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   split surrogate pair) as an error. Those bytes will not be decoded and the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   number of bytes that have been decoded will be stored in *consumed*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsUTF16String(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return a Python byte string using the UTF-16 encoding in native byte
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   order. The string always starts with a BOM mark.  Error handling is "strict".
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-02 20:05:19 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								UTF-7 Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the UTF-7 codec APIs:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF7(const char *str, Py_ssize_t size, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-02 20:05:19 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the UTF-7 encoded string
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *str*.  Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-02 20:05:19 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUTF7Stateful(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *errors, Py_ssize_t *consumed)
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-02 20:05:19 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *consumed* is ``NULL``, behave like :c:func:`PyUnicode_DecodeUTF7`.  If
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *consumed* is not ``NULL``, trailing incomplete UTF-7 base-64 sections will not
							 | 
						
					
						
							
								
									
										
										
										
											2010-08-02 20:05:19 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   be treated as an error.  Those bytes will not be decoded and the number of
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   bytes that have been decoded will be stored in *consumed*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Unicode-Escape Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the "Unicode Escape" codec APIs:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeUnicodeEscape(const char *str, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              Py_ssize_t size, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the Unicode-Escape encoded
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   string *str*.  Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsUnicodeEscapeString(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-11-20 17:20:19 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object using Unicode-Escape and return the result as a
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   bytes object.  Error handling is "strict".  Return ``NULL`` if an exception was
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   raised by the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Raw-Unicode-Escape Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the "Raw Unicode Escape" codec APIs:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeRawUnicodeEscape(const char *str, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              Py_ssize_t size, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the Raw-Unicode-Escape
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   encoded string *str*.  Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsRawUnicodeEscapeString(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object using Raw-Unicode-Escape and return the result as
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   a bytes object.  Error handling is "strict".  Return ``NULL`` if an exception
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   was raised by the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Latin-1 Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the Latin-1 codec APIs: Latin-1 corresponds to the first 256 Unicode
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								ordinals and only these are accepted by the codecs during encoding.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeLatin1(const char *str, Py_ssize_t size, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the Latin-1 encoded string
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *str*.  Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsLatin1String(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object using Latin-1 and return the result as Python bytes
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   object.  Error handling is "strict".  Return ``NULL`` if an exception was
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   raised by the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								ASCII Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the ASCII codec APIs.  Only 7-bit ASCII data is accepted. All other
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								codes generate errors.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeASCII(const char *str, Py_ssize_t size, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the ASCII encoded string
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *str*.  Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsASCIIString(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object using ASCII and return the result as Python bytes
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   object.  Error handling is "strict".  Return ``NULL`` if an exception was
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   raised by the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Character Map Codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								""""""""""""""""""""
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								This codec is special in that it can be used to implement many different codecs
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								(and this is in fact what was done to obtain most of the standard codecs
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-26 18:59:06 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								included in the :mod:`!encodings` package). The codec uses mappings to encode and
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								decode characters.  The mapping objects provided must support the
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-27 01:41:15 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								:meth:`~object.__getitem__` mapping interface; dictionaries and sequences work well.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-04-14 07:43:53 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								These are the mapping codec APIs:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *str, Py_ssize_t length, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              PyObject *mapping, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the encoded string *str*
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   using the given *mapping* object.  Return ``NULL`` if an exception was raised
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   by the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *mapping* is ``NULL``, Latin-1 decoding will be applied.  Else
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *mapping* must map bytes ordinals (integers in the range from 0 to 255)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   to Unicode strings, integers (which are then interpreted as Unicode
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ordinals) or ``None``.  Unmapped data bytes -- ones which cause a
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :exc:`LookupError`, as well as ones which get mapped to ``None``,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``0xFFFE`` or ``'\ufffe'``, are treated as undefined mappings and cause
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   an error.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsCharmapString(PyObject *unicode, PyObject *mapping)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object using the given *mapping* object and return the
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   result as a bytes object.  Error handling is "strict".  Return ``NULL`` if an
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   exception was raised by the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   The *mapping* object must map Unicode ordinal integers to bytes objects,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   integers in the range from 0 to 255 or ``None``.  Unmapped character
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ordinals (ones which cause a :exc:`LookupError`) as well as mapped to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``None`` are treated as "undefined mapping" and cause an error.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								The following codec API is special in that maps Unicode to Unicode.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Translate(PyObject *unicode, PyObject *table, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-08-13 19:16:02 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Translate a string by applying a character mapping table to it and return the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   resulting Unicode object. Return ``NULL`` if an exception was raised by the
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2020-08-13 19:16:02 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   The mapping table must map Unicode ordinal integers to Unicode ordinal integers
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   or ``None`` (causing deletion of the character).
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-27 01:41:15 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Mapping tables need only provide the :meth:`~object.__getitem__` interface; dictionaries
							 | 
						
					
						
							
								
									
										
										
										
											2020-08-13 19:16:02 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   and sequences work well.  Unmapped character ordinals (ones which cause a
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :exc:`LookupError`) are left untouched and are copied as-is.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *errors* has the usual meaning for codecs. It may be ``NULL`` which indicates to
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   use the default error handling.
							 | 
						
					
						
							
								
									
										
										
										
											2017-03-19 08:15:17 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								MBCS codecs for Windows
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""""""""""""
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								These are the MBCS codec APIs. They are currently only available on Windows and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								use the Win32 MBCS converters to implement the conversions.  Note that MBCS (or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								DBCS) is a class of encodings, not just one.  The target encoding is defined by
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								the user settings on the machine running the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeMBCS(const char *str, Py_ssize_t size, const char *errors)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode object by decoding *size* bytes of the MBCS encoded string *str*.
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``NULL`` if an exception was raised by the codec.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_DecodeMBCSStateful(const char *str, Py_ssize_t size, \
							 | 
						
					
						
							
								
									
										
										
										
											2018-12-19 15:31:40 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              const char *errors, Py_ssize_t *consumed)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *consumed* is ``NULL``, behave like :c:func:`PyUnicode_DecodeMBCS`. If
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *consumed* is not ``NULL``, :c:func:`PyUnicode_DecodeMBCSStateful` will not decode
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   trailing lead byte and the number of bytes that have been decoded will be stored
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   in *consumed*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_AsMBCSString(PyObject *unicode)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Encode a Unicode object using MBCS and return the result as Python bytes
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   object.  Error handling is "strict".  Return ``NULL`` if an exception was
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-22 21:56:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   raised by the codec.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-09 00:18:11 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_EncodeCodePage(int code_page, PyObject *unicode, const char *errors)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Encode the Unicode object using the specified code page and return a Python
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   bytes object.  Return ``NULL`` if an exception was raised by the codec. Use
							 | 
						
					
						
							
								
									
										
										
										
											2023-08-22 15:50:30 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:macro:`!CP_ACP` code page to get the MBCS encoder.
							 | 
						
					
						
							
								
									
										
										
										
											2011-12-09 00:18:11 +01:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-05-14 15:58:55 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								Methods & Slots
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								"""""""""""""""
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. _unicodemethodsandslots:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								Methods and Slot Functions
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								^^^^^^^^^^^^^^^^^^^^^^^^^^
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								The following APIs are capable of handling Unicode objects and strings on input
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								(we refer to them as strings in the descriptions) and return Unicode objects or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								integers as appropriate.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								They all return ``NULL`` or ``-1`` if an exception occurs.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Concat(PyObject *left, PyObject *right)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Concat two strings giving a new Unicode string.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Split(PyObject *unicode, PyObject *sep, Py_ssize_t maxsplit)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-10-30 12:03:20 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Split a string giving a list of Unicode strings.  If *sep* is ``NULL``, splitting
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   will be done at all whitespace substrings.  Otherwise, splits occur at the given
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   separator.  At most *maxsplit* splits will be done.  If negative, no limit is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   set.  Separators are not included in the resulting list.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Splitlines(PyObject *unicode, int keepends)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Split a Unicode string at line breaks, returning a list of Unicode strings.
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   CRLF is considered to be one line break.  If *keepends* is ``0``, the Line break
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   characters are not included in the resulting strings.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-04-14 07:43:53 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Join a sequence of strings using the given *separator* and return the resulting
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Unicode string.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_Tailmatch(PyObject *unicode, PyObject *substr, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                        Py_ssize_t start, Py_ssize_t end, int direction)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return ``1`` if *substr* matches ``unicode[start:end]`` at the given tail end
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   (*direction* == ``-1`` means to do a prefix match, *direction* == ``1`` a suffix match),
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``0`` otherwise. Return ``-1`` if an error occurred.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *unicode, PyObject *substr, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                               Py_ssize_t start, Py_ssize_t end, int direction)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return the first position of *substr* in ``unicode[start:end]`` using the given
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *direction* (*direction* == ``1`` means to do a forward search, *direction* == ``-1`` a
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   backward search).  The return value is the index of the first match; a value of
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``-1`` indicates that no match was found, and ``-2`` indicates that an error
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   occurred and an exception has been set.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_FindChar(PyObject *unicode, Py_UCS4 ch, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                               Py_ssize_t start, Py_ssize_t end, int direction)
							 | 
						
					
						
							
								
									
										
										
										
											2011-09-28 07:41:54 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Return the first position of the character *ch* in ``unicode[start:end]`` using
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   the given *direction* (*direction* == ``1`` means to do a forward search,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *direction* == ``-1`` a backward search).  The return value is the index of the
							 | 
						
					
						
							
								
									
										
										
										
											2011-09-28 07:41:54 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   first match; a value of ``-1`` indicates that no match was found, and ``-2``
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   indicates that an error occurred and an exception has been set.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-09-28 21:51:06 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.3
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-12-20 22:52:33 +08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. versionchanged:: 3.7
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								      *start* and *end* are now adjusted to behave like ``unicode[start:end]``.
							 | 
						
					
						
							
								
									
										
										
										
											2016-12-20 22:52:33 +08:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2011-09-28 07:41:54 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: Py_ssize_t PyUnicode_Count(PyObject *unicode, PyObject *substr, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                               Py_ssize_t start, Py_ssize_t end)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the number of non-overlapping occurrences of *substr* in
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   ``unicode[start:end]``.  Return ``-1`` if an error occurred.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Replace(PyObject *unicode, PyObject *substr, \
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								                              PyObject *replstr, Py_ssize_t maxcount)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Replace at most *maxcount* occurrences of *substr* in *unicode* with *replstr* and
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   return the resulting Unicode object. *maxcount* == ``-1`` means replace all
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   occurrences.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_Compare(PyObject *left, PyObject *right)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-10-27 21:41:19 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Compare two strings and return ``-1``, ``0``, ``1`` for less than, equal, and greater than,
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   respectively.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-11-16 10:17:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This function returns ``-1`` upon failure, so one should call
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyErr_Occurred` to check for errors.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-10-07 23:24:53 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   .. seealso::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The :c:func:`PyUnicode_Equal` function.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_Equal(PyObject *a, PyObject *b)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Test if two strings are equal:
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   * Return ``1`` if *a* is equal to *b*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   * Return ``0`` if *a* is not equal to *b*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   * Set a :exc:`TypeError` exception and return ``-1`` if *a* or *b* is not a
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								     :class:`str` object.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The function always succeeds if *a* and *b* are :class:`str` objects.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The function works for :class:`str` subclasses, but does not honor custom
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``__eq__()`` method.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. seealso::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      The :c:func:`PyUnicode_Compare` function.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.14
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-10-11 16:41:58 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_EqualToUTF8AndSize(PyObject *unicode, const char *string, Py_ssize_t size)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Compare a Unicode object with a char buffer which is interpreted as
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   being UTF-8 or ASCII encoded and return true (``1``) if they are equal,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   or false (``0``) otherwise.
							 | 
						
					
						
							
								
									
										
										
										
											2024-09-27 22:13:29 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If the Unicode object contains surrogate code points
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   (``U+D800`` - ``U+DFFF``) or the C string is not valid UTF-8,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   false (``0``) is returned.
							 | 
						
					
						
							
								
									
										
										
										
											2023-10-11 16:41:58 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   This function does not raise exceptions.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.13
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_EqualToUTF8(PyObject *unicode, const char *string)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Similar to :c:func:`PyUnicode_EqualToUTF8AndSize`, but compute *string*
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   length using :c:func:`!strlen`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   If the Unicode object contains null characters, false (``0``) is returned.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. versionadded:: 3.13
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_CompareWithASCIIString(PyObject *unicode, const char *string)
							 | 
						
					
						
							
								
									
										
										
										
											2008-07-01 19:12:34 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Compare a Unicode object, *unicode*, with *string* and return ``-1``, ``0``, ``1`` for less
							 | 
						
					
						
							
								
									
										
										
										
											2010-12-28 23:39:51 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   than, equal, and greater than, respectively. It is best to pass only
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ASCII-encoded strings, but the function interprets the input string as
							 | 
						
					
						
							
								
									
										
										
										
											2014-06-06 09:13:18 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   ISO-8859-1 if it contains non-ASCII characters.
							 | 
						
					
						
							
								
									
										
										
										
											2008-07-01 19:12:34 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2016-12-06 00:13:34 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   This function does not raise exceptions.
							 | 
						
					
						
							
								
									
										
										
										
											2016-11-16 10:17:58 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2008-07-01 19:12:34 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_RichCompare(PyObject *left,  PyObject *right, int op)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2019-05-08 11:02:34 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Rich compare two Unicode strings and return one of the following:
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   * ``NULL`` in case an exception was raised
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-21 10:52:07 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   * :c:data:`Py_True` or :c:data:`Py_False` for successful comparisons
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   * :c:data:`Py_NotImplemented` in case the type combination is unknown
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-07-21 10:52:07 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Possible values for *op* are :c:macro:`Py_GT`, :c:macro:`Py_GE`, :c:macro:`Py_EQ`,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:macro:`Py_NE`, :c:macro:`Py_LT`, and :c:macro:`Py_LE`.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_Format(PyObject *format, PyObject *args)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return a new string object from *format* and *args*; this is analogous to
							 | 
						
					
						
							
								
									
										
										
										
											2014-07-19 16:34:33 -07:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   ``format % args``.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicode_Contains(PyObject *unicode, PyObject *substr)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Check whether *substr* is contained in *unicode* and return true or false
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   accordingly.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   *substr* has to coerce to a one element Unicode string. ``-1`` is returned
							 | 
						
					
						
							
								
									
										
										
										
											2011-10-07 11:19:11 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   if there was an error.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: void PyUnicode_InternInPlace(PyObject **p_unicode)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   Intern the argument :c:expr:`*p_unicode` in place.  The argument must be the address of a
							 | 
						
					
						
							
								
									
										
										
										
											2019-05-08 11:02:34 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   pointer variable pointing to a Python Unicode string object.  If there is an
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   existing interned string that is the same as :c:expr:`*p_unicode`, it sets :c:expr:`*p_unicode` to
							 | 
						
					
						
							
								
									
										
										
										
											2023-08-07 15:40:59 -06:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   it (releasing the reference to the old string object and creating a new
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :term:`strong reference` to the interned string object), otherwise it leaves
							 | 
						
					
						
							
								
									
										
										
										
											2024-07-16 15:36:21 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:expr:`*p_unicode` alone and interns it.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-08-07 15:40:59 -06:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   (Clarification: even though there is a lot of talk about references, think
							 | 
						
					
						
							
								
									
										
										
										
											2024-07-16 15:36:21 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   of this function as reference-neutral. You must own the object you pass in;
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   after the call you no longer own the passed-in reference, but you newly own
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   the result.)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   This function never raises an exception.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, it leaves its argument unchanged without interning it.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Instances of subclasses of :py:class:`str` may not be interned, that is,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:expr:`PyUnicode_CheckExact(*p_unicode)` must be true. If it is not,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   then -- as with any other error -- the argument is left unchanged.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Note that interned strings are not “immortal”.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   You must keep a reference to the result to benefit from interning.
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2023-12-05 04:21:09 -05:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicode_InternFromString(const char *str)
							 | 
						
					
						
							
								
									
										
										
										
											2008-01-20 09:30:57 +00:00
										 
									 
								 
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2010-10-06 10:11:56 +00:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   A combination of :c:func:`PyUnicode_FromString` and
							 | 
						
					
						
							
								
									
										
										
										
											2024-07-16 15:36:21 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_InternInPlace`, meant for statically allocated strings.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return a new ("owned") reference to either a new Unicode string object
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   that has been interned, or an earlier interned string object with the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   same value.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Python may keep a reference to the result, or make it :term:`immortal`,
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   preventing it from being garbage-collected promptly.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   For interning an unbounded number of different strings, such as ones coming
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   from user input, prefer calling :c:func:`PyUnicode_FromString` and
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   :c:func:`PyUnicode_InternInPlace` directly.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   .. impl-detail::
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								      Strings interned this way are made :term:`immortal`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-17 17:10:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								PyUnicodeWriter
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								^^^^^^^^^^^^^^^
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								The :c:type:`PyUnicodeWriter` API can be used to create a Python :class:`str`
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								object.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-10-15 22:29:35 +03:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. versionadded:: 3.14
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-17 17:10:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:type:: PyUnicodeWriter
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   A Unicode writer instance.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   The instance must be destroyed by :c:func:`PyUnicodeWriter_Finish` on
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   success, or :c:func:`PyUnicodeWriter_Discard` on error.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyUnicodeWriter* PyUnicodeWriter_Create(Py_ssize_t length)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Create a Unicode writer instance.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Set an exception and return ``NULL`` on error.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: PyObject* PyUnicodeWriter_Finish(PyUnicodeWriter *writer)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Return the final Python :class:`str` object and destroy the writer instance.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Set an exception and return ``NULL`` on error.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: void PyUnicodeWriter_Discard(PyUnicodeWriter *writer)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Discard the internal Unicode buffer and destroy the writer instance.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-10-10 01:32:02 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   If *writer* is ``NULL``, no operation is performed.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-17 17:10:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_WriteChar(PyUnicodeWriter *writer, Py_UCS4 ch)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Write the single Unicode character *ch* into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_WriteUTF8(PyUnicodeWriter *writer, const char *str, Py_ssize_t size)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Decode the string *str* from UTF-8 in strict mode and write the output into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *size* is the string length in bytes. If *size* is equal to ``-1``, call
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``strlen(str)`` to get the string length.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-21 19:33:15 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								   See also :c:func:`PyUnicodeWriter_DecodeUTF8Stateful`.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_WriteWideChar(PyUnicodeWriter *writer, const wchar_t *str, Py_ssize_t size)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Writer the wide string *str* into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *size* is a number of wide characters. If *size* is equal to ``-1``, call
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``wcslen(str)`` to get the string length.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-17 17:10:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-24 17:40:39 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_WriteUCS4(PyUnicodeWriter *writer, Py_UCS4 *str, Py_ssize_t size)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Writer the UCS4 string *str* into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *size* is a number of UCS4 characters.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-17 17:10:52 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_WriteStr(PyUnicodeWriter *writer, PyObject *obj)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Call :c:func:`PyObject_Str` on *obj* and write the output into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_WriteRepr(PyUnicodeWriter *writer, PyObject *obj)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Call :c:func:`PyObject_Repr` on *obj* and write the output into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_WriteSubstring(PyUnicodeWriter *writer, PyObject *str, Py_ssize_t start, Py_ssize_t end)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Write the substring ``str[start:end]`` into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *str* must be Python :class:`str` object. *start* must be greater than or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   equal to 0, and less than or equal to *end*. *end* must be less than or
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   equal to *str* length.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_Format(PyUnicodeWriter *writer, const char *format, ...)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Similar to :c:func:`PyUnicode_FromFormat`, but write the output directly into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							
								
									
										
										
										
											2024-06-21 19:33:15 +02:00
										 
									 
								 
							 | 
							
								
									
										
									
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								.. c:function:: int PyUnicodeWriter_DecodeUTF8Stateful(PyUnicodeWriter *writer, const char *string, Py_ssize_t length, const char *errors, Py_ssize_t *consumed)
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   Decode the string *str* from UTF-8 with *errors* error handler and write the
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   output into *writer*.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *size* is the string length in bytes. If *size* is equal to ``-1``, call
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``strlen(str)`` to get the string length.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   *errors* is an error handler name, such as ``"replace"``. If *errors* is
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   ``NULL``, use the strict error handler.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   If *consumed* is not ``NULL``, set *\*consumed* to the number of decoded
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   bytes on success.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   If *consumed* is ``NULL``, treat trailing incomplete UTF-8 byte sequences
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   as an error.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On success, return ``0``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   On error, set an exception, leave the writer unchanged, and return ``-1``.
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								
							 | 
						
					
						
							| 
								
							 | 
							
								
							 | 
							
								
							 | 
							
							
								   See also :c:func:`PyUnicodeWriter_WriteUTF8`.
							 |