======================
MessagePack for Python
======================

.. image:: https://travis-ci.org/msgpack/msgpack-python.svg?branch=master
   :target: https://travis-ci.org/msgpack/msgpack-python
   :alt: Build Status

.. image:: https://readthedocs.org/projects/msgpack-python/badge/?version=latest
   :target: https://msgpack-python.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

Upgrading from msgpack-0.4
--------------------------

TL;DR: When upgrading from msgpack-0.4 or earlier, don't run ``pip install -U msgpack-python``.
Run ``pip uninstall msgpack-python; pip install msgpack`` instead.

The package name on PyPI changed to msgpack as of version 0.5.
I uploaded a transitional package (msgpack-python 0.5, which depends on msgpack)
to smooth the transition from msgpack-python to msgpack.

Sadly, this doesn't work for an upgrade install. After ``pip install -U msgpack-python``,
msgpack is removed and ``import msgpack`` fails.

What's this
-----------

`MessagePack <https://msgpack.org/>`_ is an efficient binary serialization format.
Like JSON, it lets you exchange data among multiple languages,
but it's faster and smaller.
This package provides CPython bindings for reading and writing MessagePack data.

Install
-------

::

    $ pip install msgpack

PyPy
^^^^

msgpack provides a pure Python implementation. PyPy can use this.

Windows
^^^^^^^

When you can't use a binary distribution, you need to install Visual Studio
or the Windows SDK on Windows.
Without the extension, the pure Python implementation runs slowly on CPython.

For Python 2.7, `Microsoft Visual C++ Compiler for Python 2.7 <https://www.microsoft.com/en-us/download/details.aspx?id=44266>`_
is the recommended solution.

For Python 3.5, `Microsoft Visual Studio 2015 <https://www.visualstudio.com/en-us/products/vs-2015-product-editions.aspx>`_
Community Edition or Express Edition can be used to build the extension module.

How to use
----------

One-shot pack & unpack
^^^^^^^^^^^^^^^^^^^^^^

Use ``packb`` for packing and ``unpackb`` for unpacking.
msgpack provides ``dumps`` and ``loads`` as aliases for compatibility with
``json`` and ``pickle``.

``pack`` and ``dump`` pack to a file-like object.
``unpack`` and ``load`` unpack from a file-like object.

.. code-block:: pycon

    >>> import msgpack
    >>> msgpack.packb([1, 2, 3], use_bin_type=True)
    '\x93\x01\x02\x03'
    >>> msgpack.unpackb(_)
    [1, 2, 3]

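The file-like variants can be sketched as follows. This is a minimal illustration, not part of the original examples: it assumes an in-memory ``io.BytesIO`` buffer standing in for a real file opened in binary mode, and the variable names are made up for the example.

```python
import io
import msgpack

# An in-memory buffer stands in for a real file opened in binary mode.
buf = io.BytesIO()

# pack() serializes the object and writes the resulting bytes to the stream.
msgpack.pack([1, 2, 3], buf, use_bin_type=True)

# Rewind before reading the data back.
buf.seek(0)

# unpack() reads from the stream and returns the deserialized object.
restored = msgpack.unpack(buf, use_list=True)
print(restored)
```

The same pattern should apply to a file opened with ``open(path, "wb")`` and ``open(path, "rb")``.
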
``unpackb`` unpacks a msgpack array to a Python list by default, but it can also unpack to a tuple:

.. code-block:: pycon

    >>> msgpack.unpackb(b'\x93\x01\x02\x03', use_list=False)
    (1, 2, 3)

You should always specify the ``use_list`` keyword argument for backward compatibility.
See performance issues relating to `use_list option`_ below.

Read the docstring for other options.

Streaming unpacking
^^^^^^^^^^^^^^^^^^^

``Unpacker`` is a "streaming unpacker". It unpacks multiple objects from one
stream (or from bytes provided through its ``feed`` method).

.. code-block:: python

    import msgpack
    from io import BytesIO

    # Write 100 messages back to back into one in-memory stream.
    buf = BytesIO()
    for i in range(100):
        # range objects are not serializable on Python 3, so pack a list.
        buf.write(msgpack.packb(list(range(i)), use_bin_type=True))

    buf.seek(0)

    # Iterating over the Unpacker yields one unpacked object per message.
    unpacker = msgpack.Unpacker(buf)
    for unpacked in unpacker:
        print(unpacked)

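The ``feed`` path mentioned above might look like the sketch below. It assumes the bytes arrive in arbitrary chunks, as they would from a socket; the chunk size and variable names are illustrative choices, not anything the library prescribes.

```python
import msgpack

# Three messages packed back to back, as they might arrive over a socket.
payload = b"".join(
    msgpack.packb([i, i * 2], use_bin_type=True) for i in range(3)
)

# No stream is passed in; data is supplied through feed() instead.
unpacker = msgpack.Unpacker(use_list=True)
received = []

# Chunk boundaries do not need to line up with message boundaries.
chunk_size = 2  # arbitrary size chosen for the illustration
for start in range(0, len(payload), chunk_size):
    unpacker.feed(payload[start:start + chunk_size])
    # Iterating yields every message that is complete so far.
    for message in unpacker:
        received.append(message)

print(received)
```

Iteration simply stops when the buffered bytes end in the middle of a message, and it picks up again after the next ``feed`` call.
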
Packing/unpacking of custom data type
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is also possible to pack/unpack custom data types. Here is an example for
``datetime.datetime``.

.. code-block:: python

    import datetime
    import msgpack

    useful_dict = {
        "id": 1,
        "created": datetime.datetime.now(),
    }

    def decode_datetime(obj):
        # Rebuild a datetime from the marker dict written by encode_datetime.
        if '__datetime__' in obj:
            obj = datetime.datetime.strptime(obj["as_str"], "%Y%m%dT%H:%M:%S.%f")
        return obj

    def encode_datetime(obj):
        # Replace a datetime with a plain dict that msgpack can serialize.
        if isinstance(obj, datetime.datetime):
            return {'__datetime__': True, 'as_str': obj.strftime("%Y%m%dT%H:%M:%S.%f")}
        return obj

    packed_dict = msgpack.packb(useful_dict, default=encode_datetime, use_bin_type=True)
    # encoding='utf-8' decodes keys and values to str so the hook can match them.
    this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime, encoding='utf-8')

``Unpacker``'s ``object_hook`` callback receives a dict; the
``object_pairs_hook`` callback may instead be used to receive a list of
key-value pairs.

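A minimal sketch of ``object_pairs_hook``, under the assumption that ``list`` and ``collections.OrderedDict`` are acceptable callbacks because both accept a sequence of key-value pairs. Bytes keys and integer values are used so that the example does not depend on any string decoding option.

```python
from collections import OrderedDict
import msgpack

# Bytes keys and integer values keep the example independent of decoding options.
source = OrderedDict([(b"b", 1), (b"a", 2)])
packed = msgpack.packb(source, use_bin_type=True)

# With list as the hook, each map is delivered as a list of (key, value) tuples.
pairs = msgpack.unpackb(packed, object_pairs_hook=list)

# With OrderedDict as the hook, the pairs are rebuilt into an ordered mapping.
ordered = msgpack.unpackb(packed, object_pairs_hook=OrderedDict)
print(pairs, ordered)
```
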
Extended types
^^^^^^^^^^^^^^

It is also possible to pack/unpack custom data types using the **ext** type.

.. code-block:: pycon

    >>> import msgpack
    >>> import array
    >>> def default(obj):
    ...     if isinstance(obj, array.array) and obj.typecode == 'd':
    ...         return msgpack.ExtType(42, obj.tobytes())
    ...     raise TypeError("Unknown type: %r" % (obj,))
    ...
    >>> def ext_hook(code, data):
    ...     if code == 42:
    ...         a = array.array('d')
    ...         a.frombytes(data)
    ...         return a
    ...     return msgpack.ExtType(code, data)
    ...
    >>> data = array.array('d', [1.2, 3.4])
    >>> packed = msgpack.packb(data, default=default, use_bin_type=True)
    >>> unpacked = msgpack.unpackb(packed, ext_hook=ext_hook)
    >>> data == unpacked
    True

Advanced unpacking control
^^^^^^^^^^^^^^^^^^^^^^^^^^

As an alternative to iteration, ``Unpacker`` objects provide ``unpack``,
``skip``, ``read_array_header`` and ``read_map_header`` methods. The former two
read an entire message from the stream, respectively de-serialising and returning
the result, or ignoring it. The latter two methods return the number of elements
in the upcoming container, so that each element in an array, or key-value pair
in a map, can be unpacked or skipped individually.

Each of these methods may optionally write the packed data it reads to a
callback function:

.. code-block:: python

    from io import BytesIO

    def distribute(unpacker, get_worker):
        nelems = unpacker.read_map_header()
        for i in range(nelems):
            # Select a worker for the given key
            key = unpacker.unpack()
            worker = get_worker(key)

            # Send the value as a packed message to worker
            bytestream = BytesIO()
            unpacker.skip(bytestream.write)
            worker.send(bytestream.getvalue())

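The array counterpart can be sketched in the same spirit. This is an illustrative example only: it reads the header of an array and then decides element by element whether to unpack or skip, without using the optional callback argument.

```python
import io
import msgpack

# One array message holding three elements of different shapes.
packed = msgpack.packb([1, [2, 3], 4], use_bin_type=True)
unpacker = msgpack.Unpacker(io.BytesIO(packed), use_list=True)

# Read only the array header; the elements themselves stay in the stream.
count = unpacker.read_array_header()

first = unpacker.unpack()  # deserialize the first element
unpacker.skip()            # discard the nested array without building it
last = unpacker.unpack()   # deserialize the last element
print(count, first, last)
```
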
Notes
-----

string and binary type
^^^^^^^^^^^^^^^^^^^^^^

Early versions of msgpack didn't distinguish string and binary types (like Python 1).
The type for representing both string and binary types was named **raw**.

For backward compatibility reasons, msgpack-python will still default all
strings to byte strings, unless you specify the ``use_bin_type=True`` option in
the packer. If you do so, it will use a newer type called **bin** to
serialize byte arrays, and **raw** comes to mean **str**. If you want to
distinguish **bin** and **raw** in the unpacker, specify ``encoding='utf-8'``.

**Note:** In a future version, the default value of ``use_bin_type`` will change to ``True``.
To keep this change from breaking your code, specify it explicitly
even when you want to use the old format.

Note that Python 2 defaults to byte-arrays over Unicode strings:

.. code-block:: pycon

    >>> import msgpack
    >>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs']))
    ['spam', 'eggs']
    >>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=True),
    ...                 encoding='utf-8')
    ['spam', u'eggs']

This is the same code in Python 3 (same behaviour, but Python 3 has a
different default):

.. code-block:: pycon

    >>> import msgpack
    >>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs']))
    [b'spam', b'eggs']
    >>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=True),
    ...                 encoding='utf-8')
    [b'spam', 'eggs']

ext type
^^^^^^^^

To use the **ext** type, pass a ``msgpack.ExtType`` object to the packer.

.. code-block:: pycon

    >>> import msgpack
    >>> packed = msgpack.packb(msgpack.ExtType(42, b'xyzzy'))
    >>> msgpack.unpackb(packed)
    ExtType(code=42, data='xyzzy')

You can use it with ``default`` and ``ext_hook``. See `Extended types`_ above.

Note about performance
----------------------

GC
^^

CPython's garbage collector is triggered as the number of allocated objects grows.
This means unpacking may cause useless GC runs.
You can use ``gc.disable()`` when unpacking a large message.

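One way to apply this advice is sketched below. The helper name ``unpack_without_gc`` is hypothetical and not part of msgpack; it pauses the collector only for the duration of the call and restores the previous state afterwards. Whether this pays off depends on the message, so it is worth measuring.

```python
import gc
import msgpack

def unpack_without_gc(data):
    """Unpack data with the cyclic garbage collector paused (hypothetical helper)."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return msgpack.unpackb(data, use_list=True)
    finally:
        # Re-enable the collector only if it was running before the call.
        if was_enabled:
            gc.enable()

packed = msgpack.packb(list(range(1000)), use_bin_type=True)
numbers = unpack_without_gc(packed)
print(len(numbers))
```
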
use_list option
^^^^^^^^^^^^^^^

List is the default sequence type in Python,
but a tuple is lighter than a list.
You can use ``use_list=False`` while unpacking when performance is important.

Python's dict can't use a list as a key, but MessagePack allows an array as a key of a mapping.
``use_list=False`` allows unpacking such a message.
Another way to unpack such an object is to use ``object_pairs_hook``.

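The hashability point can be illustrated with a small sketch. It does not unpack a map with array keys directly; it only shows, with made-up variable names, that ``use_list=False`` turns nested arrays into tuples, which is what makes them usable as dict keys.

```python
import msgpack

# A nested array: two inner arrays inside an outer one.
packed = msgpack.packb([[1, 2], [3, 4]], use_bin_type=True)

as_lists = msgpack.unpackb(packed, use_list=True)
as_tuples = msgpack.unpackb(packed, use_list=False)

# Tuples are hashable, so the unpacked value can serve as a dict key.
lookup = {as_tuples: "found"}
print(as_lists, as_tuples, lookup[((1, 2), (3, 4))])
```
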
Development
-----------

Test
^^^^

MessagePack uses ``pytest`` for testing.
Run the tests with the following command::

    $ py.test

..
    vim: filetype=rst