:mod:`struct` --- Interpret bytes as packed binary data
=======================================================

.. module:: struct
   :synopsis: Interpret bytes as packed binary data.

**Source code:** :source:`Lib/struct.py`

.. index::
   pair: C; structures
   triple: packing; binary; data

--------------

This module converts between Python values and C structs represented
as Python :class:`bytes` objects.  Compact :ref:`format strings <struct-format-strings>`
describe the intended conversions to/from Python values.
The module's functions and objects can be used for two largely
distinct applications: data exchange with external sources (files or
network connections), and data transfer between the Python application
and the C layer.

.. note::

   When no prefix character is given, native mode is the default. It
   packs or unpacks data based on the platform and compiler on which
   the Python interpreter was built.
   The result of packing a given C struct includes pad bytes which
   maintain proper alignment for the C types involved; similarly,
   alignment is taken into account when unpacking.  In contrast, when
   communicating data with external sources, the programmer is
   responsible for defining byte ordering and padding between elements.
   See :ref:`struct-alignment` for details.

Several :mod:`struct` functions (and methods of :class:`Struct`) take a *buffer*
argument.  This refers to objects that implement the :ref:`bufferobjects` and
provide either a readable or read-writable buffer.  The most common types used
for that purpose are :class:`bytes` and :class:`bytearray`, but many other types
that can be viewed as an array of bytes implement the buffer protocol, so that
they can be read/filled without additional copying from a :class:`bytes` object.

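
The following is a minimal sketch of the *buffer* argument in use. It assumes
a :class:`bytearray` as the writable buffer and a :class:`memoryview` as a
copy-free readable view; the variable names are illustrative only.

```python
import struct

# A writable buffer: 8 bytes, initially zeroed.
buf = bytearray(8)

# Write two little-endian unsigned 16-bit integers starting at offset 2.
struct.pack_into('<HH', buf, 2, 0x1234, 0xABCD)

# A memoryview exposes the same memory without copying; slicing it
# yields another view rather than a new bytes object.
view = memoryview(buf)
first, second = struct.unpack('<HH', view[2:6])

print(buf.hex())        # 00003412cdab0000
print(first, second)    # 4660 43981
```

Any other object that supports the buffer protocol could stand in for the
:class:`bytearray` here, provided it is writable where ``pack_into`` is used.
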
Functions and Exceptions
------------------------

The module defines the following exception and functions:


.. exception:: error

   Exception raised on various occasions; argument is a string describing what
   is wrong.


.. function:: pack(format, v1, v2, ...)

   Return a bytes object containing the values *v1*, *v2*, ... packed according
   to the format string *format*.  The arguments must match the values required by
   the format exactly.


.. function:: pack_into(format, buffer, offset, v1, v2, ...)

   Pack the values *v1*, *v2*, ... according to the format string *format* and
   write the packed bytes into the writable buffer *buffer* starting at
   position *offset*.  Note that *offset* is a required argument.


.. function:: unpack(format, buffer)

   Unpack from the buffer *buffer* (presumably packed by ``pack(format, ...)``)
   according to the format string *format*.  The result is a tuple even if it
   contains exactly one item.  The buffer's size in bytes must match the
   size required by the format, as reflected by :func:`calcsize`.


.. function:: unpack_from(format, /, buffer, offset=0)

   Unpack from *buffer* starting at position *offset*, according to the format
   string *format*.  The result is a tuple even if it contains exactly one
   item.  The buffer's size in bytes, starting at position *offset*, must be at
   least the size required by the format, as reflected by :func:`calcsize`.


.. function:: iter_unpack(format, buffer)

   Iteratively unpack from the buffer *buffer* according to the format
   string *format*.  This function returns an iterator which will read
   equally sized chunks from the buffer until all its contents have been
   consumed.  The buffer's size in bytes must be a multiple of the size
   required by the format, as reflected by :func:`calcsize`.

   Each iteration yields a tuple as specified by the format string.

   .. versionadded:: 3.4


.. function:: calcsize(format)

   Return the size of the struct (and hence of the bytes object produced by
   ``pack(format, ...)``) corresponding to the format string *format*.

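
As a short sketch of how :func:`calcsize`, :func:`unpack_from` and
:func:`iter_unpack` fit together, the snippet below stores three records back
to back and reads them again. The record layout ``'<Hi'`` and all variable
names are hypothetical, chosen only for illustration.

```python
import struct

# Hypothetical record layout: a little-endian unsigned 16-bit id
# followed by a signed 32-bit value, with no padding.
RECORD = '<Hi'
record_size = struct.calcsize(RECORD)    # 2 + 4 = 6 bytes

# Three records stored back to back.
data = b''.join(struct.pack(RECORD, i, -i) for i in (1, 2, 3))

# unpack_from reads one record at a given offset; the buffer may be
# longer than the format requires.
second = struct.unpack_from(RECORD, data, offset=record_size)

# iter_unpack walks the whole buffer; its length must be a multiple
# of the record size.
records = list(struct.iter_unpack(RECORD, data))

print(second)     # (2, -2)
print(records)    # [(1, -1), (2, -2), (3, -3)]
```
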
.. _struct-format-strings:

Format Strings
--------------

Format strings describe the data layout when
packing and unpacking data.  They are built up from :ref:`format characters<format-characters>`,
which specify the type of data being packed/unpacked.  In addition,
special characters control the :ref:`byte order, size and alignment<struct-alignment>`.
Each format string consists of an optional prefix character which
describes the overall properties of the data and one or more format
characters which describe the actual data values and padding.

.. _struct-alignment:

Byte Order, Size, and Alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, C types are represented in the machine's native format and byte
order, and properly aligned by skipping pad bytes if necessary (according to the
rules used by the C compiler).
This behavior is chosen so
that the bytes of a packed struct correspond exactly to the memory layout
of the corresponding C struct.
Whether to use native byte ordering
and padding or standard formats depends on the application.

.. index::
   single: @ (at); in struct format strings
   single: = (equals); in struct format strings
   single: < (less); in struct format strings
   single: > (greater); in struct format strings
   single: ! (exclamation); in struct format strings

Alternatively, the first character of the format string can be used to indicate
the byte order, size and alignment of the packed data, according to the
following table:

+-----------+------------------------+----------+-----------+
| Character | Byte order             | Size     | Alignment |
+===========+========================+==========+===========+
| ``@``     | native                 | native   | native    |
+-----------+------------------------+----------+-----------+
| ``=``     | native                 | standard | none      |
+-----------+------------------------+----------+-----------+
| ``<``     | little-endian          | standard | none      |
+-----------+------------------------+----------+-----------+
| ``>``     | big-endian             | standard | none      |
+-----------+------------------------+----------+-----------+
| ``!``     | network (= big-endian) | standard | none      |
+-----------+------------------------+----------+-----------+

If the first character is not one of these, ``'@'`` is assumed.

Native byte order is big-endian or little-endian, depending on the
host system. For example, Intel x86, AMD64 (x86-64), and Apple M1 are
little-endian; IBM z and many legacy architectures are big-endian.
Use :data:`sys.byteorder` to check the endianness of your system.

Native size and alignment are determined using the C compiler's
``sizeof`` expression.  This is always combined with native byte order.

Standard size depends only on the format character; see the table in
the :ref:`format-characters` section.

Note the difference between ``'@'`` and ``'='``: both use native byte order, but
the size and alignment of the latter are standardized.

The form ``'!'`` represents the network byte order which is always big-endian
as defined in `IETF RFC 1700 <IETF RFC 1700_>`_.

There is no way to indicate non-native byte order (force byte-swapping); use the
appropriate choice of ``'<'`` or ``'>'``.

Notes:

(1) Padding is only automatically added between successive structure members.
    No padding is added at the beginning or the end of the encoded struct.

(2) No padding is added when using non-native size and alignment, e.g.
    with ``'<'``, ``'>'``, ``'='``, and ``'!'``.

(3) To align the end of a structure to the alignment requirement of a
    particular type, end the format with the code for that type with a repeat
    count of zero.  See :ref:`struct-examples`.

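
The effect of the prefix characters described above can be sketched as
follows. The value ``0x01020304`` is arbitrary. Results for ``'@'`` are
platform-dependent, so the snippet only compares them with the standard-size
results.

```python
import struct
import sys

value = 0x01020304

little = struct.pack('<I', value)     # b'\x04\x03\x02\x01'
big = struct.pack('>I', value)        # b'\x01\x02\x03\x04'
network = struct.pack('!I', value)    # same bytes as '>'

# '=' uses the host byte order, but standard sizes and no alignment.
native_order = struct.pack('=I', value)
expected = little if sys.byteorder == 'little' else big
print(native_order == expected)       # True

# '@' (the default) may insert pad bytes between members; '=' never does.
padded_size = struct.calcsize('@ci')  # platform-dependent, commonly 8
packed_size = struct.calcsize('=ci')  # always 1 + 4 = 5
```
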
.. _format-characters:

Format Characters
^^^^^^^^^^^^^^^^^

Format characters have the following meaning; the conversion between C and
Python values should be obvious given their types.  The 'Standard size' column
refers to the size of the packed value in bytes when using standard size; that
is, when the format string starts with one of ``'<'``, ``'>'``, ``'!'`` or
``'='``.  When using native size, the size of the packed value is
platform-dependent.

+--------+--------------------------+--------------------+----------------+------------+
| Format | C Type                   | Python type        | Standard size  | Notes      |
+========+==========================+====================+================+============+
| ``x``  | pad byte                 | no value           |                | \(7)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``c``  | :c:expr:`char`           | bytes of length 1  | 1              |            |
+--------+--------------------------+--------------------+----------------+------------+
| ``b``  | :c:expr:`signed char`    | integer            | 1              | \(1), \(2) |
+--------+--------------------------+--------------------+----------------+------------+
| ``B``  | :c:expr:`unsigned char`  | integer            | 1              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``?``  | :c:expr:`_Bool`          | bool               | 1              | \(1)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``h``  | :c:expr:`short`          | integer            | 2              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``H``  | :c:expr:`unsigned short` | integer            | 2              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``i``  | :c:expr:`int`            | integer            | 4              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``I``  | :c:expr:`unsigned int`   | integer            | 4              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``l``  | :c:expr:`long`           | integer            | 4              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``L``  | :c:expr:`unsigned long`  | integer            | 4              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``q``  | :c:expr:`long long`      | integer            | 8              | \(2)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``Q``  | :c:expr:`unsigned long   | integer            | 8              | \(2)       |
|        | long`                    |                    |                |            |
+--------+--------------------------+--------------------+----------------+------------+
| ``n``  | :c:expr:`ssize_t`        | integer            |                | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``N``  | :c:expr:`size_t`         | integer            |                | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``e``  | \(6)                     | float              | 2              | \(4)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``f``  | :c:expr:`float`          | float              | 4              | \(4)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``d``  | :c:expr:`double`         | float              | 8              | \(4)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``s``  | :c:expr:`char[]`         | bytes              |                | \(9)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``p``  | :c:expr:`char[]`         | bytes              |                | \(8)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``P``  | :c:expr:`void \*`        | integer            |                | \(5)       |
+--------+--------------------------+--------------------+----------------+------------+

.. versionchanged:: 3.3
   Added support for the ``'n'`` and ``'N'`` formats.

.. versionchanged:: 3.6
   Added support for the ``'e'`` format.

Notes:

(1)
   .. index:: single: ? (question mark); in struct format strings

   The ``'?'`` conversion code corresponds to the :c:expr:`_Bool` type defined by
   C99. If this type is not available, it is simulated using a :c:expr:`char`. In
   standard mode, it is always represented by one byte.

(2)
   When attempting to pack a non-integer using any of the integer conversion
   codes, if the non-integer has a :meth:`__index__` method then that method is
   called to convert the argument to an integer before packing.

   .. versionchanged:: 3.2
      Added use of the :meth:`__index__` method for non-integers.

(3)
   The ``'n'`` and ``'N'`` conversion codes are only available for the native
   size (selected as the default or with the ``'@'`` byte order character).
   For the standard size, you can use whichever of the other integer formats
   fits your application.

(4)
   For the ``'f'``, ``'d'`` and ``'e'`` conversion codes, the packed
   representation uses the IEEE 754 binary32, binary64 or binary16 format (for
   ``'f'``, ``'d'`` or ``'e'`` respectively), regardless of the floating-point
   format used by the platform.

(5)
   The ``'P'`` format character is only available for the native byte ordering
   (selected as the default or with the ``'@'`` byte order character). The byte
   order character ``'='`` chooses to use little- or big-endian ordering based
   on the host system. The struct module does not interpret this as native
   ordering, so the ``'P'`` format is not available.

(6)
   The IEEE 754 binary16 "half precision" type was introduced in the 2008
   revision of the `IEEE 754 standard <ieee 754 standard_>`_. It has a sign
   bit, a 5-bit exponent and 11-bit precision (with 10 bits explicitly stored),
   and can represent numbers between approximately ``6.1e-05`` and ``6.5e+04``
   at full precision. This type is not widely supported by C compilers: on a
   typical machine, an unsigned short can be used for storage, but not for math
   operations. See the Wikipedia page on the `half-precision floating-point
   format <half precision format_>`_ for more information.

(7)
   When packing, ``'x'`` inserts one NUL byte.

(8)
   The ``'p'`` format character encodes a "Pascal string", meaning a short
   variable-length string stored in a *fixed number of bytes*, given by the count.
   The first byte stored is the length of the string, or 255, whichever is
   smaller.  The bytes of the string follow.  If the string passed in to
   :func:`pack` is too long (longer than the count minus 1), only the leading
   ``count-1`` bytes of the string are stored.  If the string is shorter than
   ``count-1``, it is padded with null bytes so that exactly count bytes in all
   are used.  Note that for :func:`unpack`, the ``'p'`` format character consumes
   ``count`` bytes, but that the string returned can never contain more than 255
   bytes.

(9)
   For the ``'s'`` format character, the count is interpreted as the length of the
   bytes, not a repeat count like for the other format characters; for example,
   ``'10s'`` means a single 10-byte string mapping to or from a single
   Python byte string, while ``'10c'`` means 10
   separate one byte character elements (e.g., ``cccccccccc``) mapping
   to or from ten different Python byte objects. (See :ref:`struct-examples`
   for a concrete demonstration of the difference.)
   If a count is not given, it defaults to 1.  For packing, the string is
   truncated or padded with null bytes as appropriate to make it fit. For
   unpacking, the resulting bytes object always has exactly the specified number
   of bytes.  As a special case, ``'0s'`` means a single, empty string (while
   ``'0c'`` means 0 characters).

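
The truncation and padding rules in notes (8) and (9) can be sketched with a
5-byte field. The field width and sample values below are arbitrary.

```python
import struct

# 's': the count is the field width in bytes. Short values are padded
# with NUL bytes, long values are truncated.
padded = struct.pack('5s', b'abc')             # b'abc\x00\x00'
truncated = struct.pack('5s', b'abcdefgh')     # b'abcde'

# 'p': the first byte stores the length; the remaining count-1 bytes
# hold the (possibly truncated, NUL-padded) string.
pascal = struct.pack('5p', b'abc')             # b'\x03abc\x00'
pascal_cut = struct.pack('5p', b'abcdefgh')    # b'\x04abcd'

# Unpacking 's' returns all 5 bytes including the padding, while 'p'
# returns only the bytes covered by the length prefix.
(s_value,) = struct.unpack('5s', padded)       # b'abc\x00\x00'
(p_value,) = struct.unpack('5p', pascal)       # b'abc'
```
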
A format character may be preceded by an integral repeat count.  For example,
the format string ``'4h'`` means exactly the same as ``'hhhh'``.

Whitespace characters between formats are ignored; however, a count and its
format character must not be separated by whitespace.

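
A brief sketch of the repeat-count and whitespace rules, using arbitrary
values. The last call is expected to fail because a space separates the count
from its format character.

```python
import struct

# A repeat count is shorthand for writing the character that many times.
same_repeat = (struct.pack('<4h', 1, 2, 3, 4)
               == struct.pack('<hhhh', 1, 2, 3, 4))

# Whitespace between format characters is ignored...
same_spacing = (struct.pack('< 2h i', 1, 2, 3)
                == struct.pack('<2hi', 1, 2, 3))

# ...but not between a count and the character it applies to.
count_error = None
try:
    struct.pack('<2 h', 1, 2)
except struct.error as exc:
    count_error = exc

print(same_repeat, same_spacing)    # True True
print(count_error is not None)      # True
```
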
When packing a value ``x`` using one of the integer formats (``'b'``,
``'B'``, ``'h'``, ``'H'``, ``'i'``, ``'I'``, ``'l'``, ``'L'``,
``'q'``, ``'Q'``), if ``x`` is outside the valid range for that format
then :exc:`struct.error` is raised.

.. versionchanged:: 3.1
   Previously, some of the integer formats wrapped out-of-range values and
   raised :exc:`DeprecationWarning` instead of :exc:`struct.error`.

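
The integer conversion rules can be sketched as follows. ``Register`` is a
hypothetical class defined only for this illustration; it shows that an
object with an :meth:`__index__` method is accepted by the integer formats,
while an out-of-range value is rejected.

```python
import struct

class Register:
    """Hypothetical integer-like wrapper, used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value

# __index__ is called to obtain the integer to pack (513 == 0x0201).
packed = struct.pack('<H', Register(513))    # b'\x01\x02'

# 256 does not fit in an unsigned char, so struct.error is raised.
message = None
try:
    struct.pack('<B', 256)
except struct.error as exc:
    message = str(exc)

print(packed)
print(message is not None)    # True
```
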
.. index:: single: ? (question mark); in struct format strings

For the ``'?'`` format character, the return value is either :const:`True` or
:const:`False`. When packing, the truth value of the argument object is used.
Either 0 or 1 in the native or standard bool representation will be packed, and
any non-zero value will be ``True`` when unpacking.

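
A minimal sketch of the ``'?'`` behavior, using the standard-size ``'<'``
prefix so that each boolean occupies exactly one byte:

```python
import struct

# Packing uses the truth value of each argument.
flags = struct.pack('<???', True, 0, 'non-empty')    # b'\x01\x00\x01'

# When unpacking, any non-zero byte becomes True.
values = struct.unpack('<???', b'\x00\x01\x7f')      # (False, True, True)
```
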
.. _struct-examples:

Examples
^^^^^^^^

.. note::
   Native byte order examples (designated by the ``'@'`` format prefix or
   lack of any prefix character) may not match what the reader's
   machine produces as
   that depends on the platform and compiler.

Pack and unpack integers of three different sizes, using big-endian
ordering::

    >>> from struct import *
    >>> pack(">bhl", 1, 2, 3)
    b'\x01\x00\x02\x00\x00\x00\x03'
    >>> unpack('>bhl', b'\x01\x00\x02\x00\x00\x00\x03')
    (1, 2, 3)
    >>> calcsize('>bhl')
    7

Attempt to pack an integer which is too large for the defined field::

    >>> pack(">h", 99999)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    struct.error: 'h' format requires -32768 <= number <= 32767

Demonstrate the difference between ``'s'`` and ``'c'`` format
characters::

    >>> pack("@ccc", b'1', b'2', b'3')
    b'123'
    >>> pack("@3s", b'123')
    b'123'

Unpacked fields can be named by assigning them to variables or by wrapping
the result in a named tuple::

    >>> record = b'raymond   \x32\x12\x08\x01\x08'
    >>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)

    >>> from collections import namedtuple
    >>> Student = namedtuple('Student', 'name serialnum school gradelevel')
    >>> Student._make(unpack('<10sHHb', record))
    Student(name=b'raymond   ', serialnum=4658, school=264, gradelevel=8)

The ordering of format characters may have an impact on size in native
mode since padding is implicit. In standard mode, the user is
responsible for inserting any desired padding.
Note in
the first ``pack`` call below that three NUL bytes were added after the
packed ``'#'`` to align the following integer on a four-byte boundary.
In this example, the output was produced on a little-endian machine::

    >>> pack('@ci', b'#', 0x12131415)
    b'#\x00\x00\x00\x15\x14\x13\x12'
    >>> pack('@ic', 0x12131415, b'#')
    b'\x15\x14\x13\x12#'
    >>> calcsize('@ci')
    8
    >>> calcsize('@ic')
    5

The following format ``'llh0l'`` results in two pad bytes being added
at the end, assuming the platform's longs are 4 bytes wide and aligned on
4-byte boundaries (the output shown is from a big-endian machine)::

    >>> pack('@llh0l', 1, 2, 3)
    b'\x00\x00\x00\x01\x00\x00\x00\x02\x00\x03\x00\x00'

.. seealso::

   Module :mod:`array`
      Packed binary storage of homogeneous data.

   Module :mod:`json`
      JSON encoder and decoder.

   Module :mod:`pickle`
      Python object serialization.

.. _applications:

Applications
------------

Two main applications for the :mod:`struct` module exist: data
interchange between Python and C code within an application or another
application compiled using the same compiler (:ref:`native formats<struct-native-formats>`), and
data interchange between applications using an agreed-upon data layout
(:ref:`standard formats<struct-standard-formats>`).  Generally speaking, the format strings
constructed for these two domains are distinct.

.. _struct-native-formats:

Native Formats
^^^^^^^^^^^^^^

When constructing format strings which mimic native layouts, the
compiler and machine architecture determine byte ordering and padding.
In such cases, the ``@`` format character should be used to specify
native byte ordering and data sizes.  Internal pad bytes are normally inserted
automatically.  It is possible that a zero-repeat format code will be
needed at the end of a format string to round up to the correct
byte boundary for proper alignment of consecutive chunks of data.

Consider these two simple examples (on a 64-bit, little-endian
machine)::

    >>> calcsize('@lhl')
    24
    >>> calcsize('@llh')
    18

Data is not padded to an 8-byte boundary at the end of the second
format string without the use of extra padding.  A zero-repeat format
code solves that problem::

    >>> calcsize('@llh0l')
    24

The ``'x'`` format code can be used to insert the pad bytes explicitly, but for
native formats it is better to use a zero-repeat format like ``'0l'``.

By default, native byte ordering and alignment are used, but it is
better to be explicit and use the ``'@'`` prefix character.

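
One way to check a native format string against the C layout it is meant to
mimic is to compare it with an equivalent :mod:`ctypes` structure. This is
only a sketch: ``Sample`` is a hypothetical structure corresponding to
``struct { long a; long b; short c; }``, and the comparison assumes that
:mod:`ctypes` is available on the platform.

```python
import ctypes
import struct

class Sample(ctypes.Structure):
    """Hypothetical C struct: long a; long b; short c;"""
    _fields_ = [
        ('a', ctypes.c_long),
        ('b', ctypes.c_long),
        ('c', ctypes.c_short),
    ]

# The C compiler pads the struct to a multiple of its strictest alignment.
c_size = ctypes.sizeof(Sample)

without_tail = struct.calcsize('@llh')    # no trailing padding
with_tail = struct.calcsize('@llh0l')     # rounded up to long alignment

# With the zero-repeat code, the packed bytes can be loaded directly
# into the ctypes structure.
packed = struct.pack('@llh0l', 1, 2, 3)
sample = Sample.from_buffer_copy(packed)

print(with_tail == c_size)             # True
print(sample.a, sample.b, sample.c)    # 1 2 3
```
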
.. _struct-standard-formats:

Standard Formats
^^^^^^^^^^^^^^^^

When exchanging data beyond your process, such as over a network or through
storage, be precise.  Specify the exact byte order, size, and alignment.  Do
not assume they match the native order of a particular machine.
For example, network byte order is big-endian, while many popular CPUs
are little-endian.  By defining this explicitly, the user need not
care about the specifics of the platform their code is running on.
The first character should typically be ``<`` or ``>``
(or ``!``).  Padding is the responsibility of the programmer.  The
zero-repeat format character won't work.  Instead, the user must
explicitly add ``'x'`` pad bytes where needed.  Revisiting the
examples from the previous section, we have::

    >>> calcsize('<qh6xq')
    24
    >>> pack('<qh6xq', 1, 2, 3) == pack('@lhl', 1, 2, 3)
    True
    >>> calcsize('@llh')
    18
    >>> pack('@llh', 1, 2, 3) == pack('<qqh', 1, 2, 3)
    True
    >>> calcsize('<qqh6x')
    24
    >>> calcsize('@llh0l')
    24
    >>> pack('@llh0l', 1, 2, 3) == pack('<qqh6x', 1, 2, 3)
    True

The above results (executed on a 64-bit machine) aren't guaranteed to
match when executed on different machines.  For example, the examples
below were executed on a 32-bit machine::

    >>> calcsize('<qqh6x')
    24
    >>> calcsize('@llh0l')
    12
    >>> pack('@llh0l', 1, 2, 3) == pack('<qqh6x', 1, 2, 3)
    False

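
As an illustration of a standard format used for data exchange, the sketch
below builds and parses a small message header in network byte order. The
header layout (version, flags, payload length, sequence number) is
hypothetical and not taken from any real protocol.

```python
import struct

# Hypothetical header, all fields in network (big-endian) byte order:
#   version (1 byte), flags (1 byte), payload length (2 bytes),
#   sequence number (4 bytes)
HEADER_FORMAT = '!BBHI'
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)    # 8 on every platform

payload = b'hello'
message = struct.pack(HEADER_FORMAT, 1, 0, len(payload), 42) + payload

# The receiver parses the fixed-size header first, then slices the payload.
version, flags, length, sequence = struct.unpack_from(HEADER_FORMAT, message)
body = message[HEADER_SIZE:HEADER_SIZE + length]

print(message[:HEADER_SIZE])    # b'\x01\x00\x00\x05\x00\x00\x00*'
print(body)                     # b'hello'
```

Because the format starts with ``'!'``, the same bytes are produced
regardless of the byte order or word size of the machine running the code.
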
.. _struct-objects:

Classes
-------

The :mod:`struct` module also defines the following type:


.. class:: Struct(format)

   Return a new Struct object which writes and reads binary data according to
   the format string *format*.  Creating a ``Struct`` object once and calling its
   methods is more efficient than calling module-level functions with the
   same format since the format string is only compiled once.

   .. note::

      The compiled versions of the most recent format strings passed to
      the module-level functions are cached, so programs that use only a few
      format strings needn't worry about reusing a single :class:`Struct`
      instance.

   Compiled Struct objects support the following methods and attributes:

   .. method:: pack(v1, v2, ...)

      Identical to the :func:`pack` function, using the compiled format.
      (``len(result)`` will equal :attr:`size`.)


   .. method:: pack_into(buffer, offset, v1, v2, ...)

      Identical to the :func:`pack_into` function, using the compiled format.


   .. method:: unpack(buffer)

      Identical to the :func:`unpack` function, using the compiled format.
      The buffer's size in bytes must equal :attr:`size`.


   .. method:: unpack_from(buffer, offset=0)

      Identical to the :func:`unpack_from` function, using the compiled format.
      The buffer's size in bytes, starting at position *offset*, must be at least
      :attr:`size`.


   .. method:: iter_unpack(buffer)

      Identical to the :func:`iter_unpack` function, using the compiled format.
      The buffer's size in bytes must be a multiple of :attr:`size`.

      .. versionadded:: 3.4

   .. attribute:: format

      The format string used to construct this Struct object.

      .. versionchanged:: 3.7
         The format string type is now :class:`str` instead of :class:`bytes`.

   .. attribute:: size

      The calculated size of the struct (and hence of the bytes object produced
      by the :meth:`pack` method) corresponding to :attr:`format`.

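
A short sketch of reusing one compiled :class:`Struct` object for several
records. The record layout (a 4-byte name field and an unsigned 16-bit score)
and the sample data are hypothetical.

```python
import struct

# Hypothetical fixed-size record: 4-byte name, unsigned 16-bit score.
record = struct.Struct('<4sH')

# Pack two records into one preallocated buffer.
buf = bytearray(record.size * 2)
record.pack_into(buf, 0, b'ann', 90)
record.pack_into(buf, record.size, b'bob', 75)

# Read them back, stripping the NUL padding added by the 's' format.
rows = [(name.rstrip(b'\x00'), score)
        for name, score in record.iter_unpack(buf)]

print(record.format, record.size)    # <4sH 6
print(rows)                          # [(b'ann', 90), (b'bob', 75)]
```
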
.. _half precision format: https://en.wikipedia.org/wiki/Half-precision_floating-point_format

.. _ieee 754 standard: https://en.wikipedia.org/wiki/IEEE_754-2008_revision

.. _IETF RFC 1700: https://tools.ietf.org/html/rfc1700