README: ` ->

Inada Naoki 2020-02-06 21:06:04 +09:00
parent 0dad821169
commit ff1f5f89d9


@ -28,10 +28,10 @@ msgpack is removed, and `import msgpack` fail.
### Compatibility with the old format ### Compatibility with the old format
You can use ``use_bin_type=False`` option to pack ``bytes`` You can use `use_bin_type=False` option to pack `bytes`
object into raw type in the old msgpack spec, instead of bin type in new msgpack spec. object into raw type in the old msgpack spec, instead of bin type in new msgpack spec.
You can unpack old msgpack format using ``raw=True`` option. You can unpack old msgpack format using `raw=True` option.
It unpacks str (raw) type in msgpack into Python bytes. It unpacks str (raw) type in msgpack into Python bytes.
See note below for detail. See note below for detail.
@ -42,23 +42,23 @@ See note below for detail.
* Python 2 * Python 2
* The extension module does not support Python 2 anymore. * The extension module does not support Python 2 anymore.
The pure Python implementation (``msgpack.fallback``) is used for Python 2. The pure Python implementation (`msgpack.fallback`) is used for Python 2.
* Packer * Packer
* ``use_bin_type=True`` by default. bytes are encoded in bin type in msgpack. * `use_bin_type=True` by default. bytes are encoded in bin type in msgpack.
**If you are still using Python 2, you must use unicode for all string types.** **If you are still using Python 2, you must use unicode for all string types.**
You can use ``use_bin_type=False`` to encode into old msgpack format. You can use `use_bin_type=False` to encode into old msgpack format.
* ``encoding`` option is removed. UTF-8 is used always. * `encoding` option is removed. UTF-8 is used always.
* Unpacker * Unpacker
* ``raw=False`` by default. It assumes str types are valid UTF-8 strings * `raw=False` by default. It assumes str types are valid UTF-8 strings
and decodes them to Python str (unicode) objects. and decodes them to Python str (unicode) objects.
* ``encoding`` option is removed. You can use ``raw=True`` to support old format. * `encoding` option is removed. You can use `raw=True` to support old format.
* Default value of ``max_buffer_size`` is changed from 0 to 100 MiB. * Default value of `max_buffer_size` is changed from 0 to 100 MiB.
* Default value of ``strict_map_key`` is changed to True to avoid hashdos. * Default value of `strict_map_key` is changed to True to avoid hashdos.
You need to pass ``strict_map_key=False`` if you have data which contain map keys You need to pass `strict_map_key=False` if you have data which contain map keys
whose type is not bytes or str. whose type is not bytes or str.
@ -70,10 +70,10 @@ See note below for detail.
### Pure Python implementation ### Pure Python implementation
The extension module in msgpack (``msgpack._cmsgpack``) does not support The extension module in msgpack (`msgpack._cmsgpack`) does not support
Python 2 and PyPy. Python 2 and PyPy.
But msgpack provides a pure Python implementation (``msgpack.fallback``) But msgpack provides a pure Python implementation (`msgpack.fallback`)
for PyPy and Python 2. for PyPy and Python 2.
Since the [pip](https://pip.pypa.io/) uses the pure Python implementation, Since the [pip](https://pip.pypa.io/) uses the pure Python implementation,
@ -89,18 +89,18 @@ Without extension, using pure Python implementation on CPython runs slowly.
## How to use ## How to use
NOTE: In examples below, I use ``raw=False`` and ``use_bin_type=True`` for users NOTE: In examples below, I use `raw=False` and `use_bin_type=True` for users
using msgpack < 1.0. These options are default from msgpack 1.0 so you can omit them. using msgpack < 1.0. These options are default from msgpack 1.0 so you can omit them.
### One-shot pack & unpack ### One-shot pack & unpack
Use ``packb`` for packing and ``unpackb`` for unpacking. Use `packb` for packing and `unpackb` for unpacking.
msgpack provides ``dumps`` and ``loads`` as an alias for compatibility with msgpack provides `dumps` and `loads` as an alias for compatibility with
``json`` and ``pickle``. `json` and `pickle`.
``pack`` and ``dump`` pack to a file-like object. `pack` and `dump` pack to a file-like object.
``unpack`` and ``load`` unpack from a file-like object. `unpack` and `load` unpack from a file-like object.
```pycon ```pycon
>>> import msgpack >>> import msgpack
@ -110,14 +110,14 @@ msgpack provides ``dumps`` and ``loads`` as an alias for compatibility with
[1, 2, 3] [1, 2, 3]
``` ```
``unpack`` unpacks msgpack's array to Python's list, but can also unpack to tuple: `unpack` unpacks msgpack's array to Python's list, but can also unpack to tuple:
```pycon ```pycon
>>> msgpack.unpackb(b'\x93\x01\x02\x03', use_list=False, raw=False) >>> msgpack.unpackb(b'\x93\x01\x02\x03', use_list=False, raw=False)
(1, 2, 3) (1, 2, 3)
``` ```
You should always specify the ``use_list`` keyword argument for backward compatibility. You should always specify the `use_list` keyword argument for backward compatibility.
See the performance notes on the `use_list` option below. See the performance notes on the `use_list` option below.
Read the docstring for other options. Read the docstring for other options.
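
The file-like variants described above can be sketched with an in-memory buffer. This is a minimal illustration, assuming `io.BytesIO` stands in for a real file; the sample dict is made up for the example.

```python
import io

import msgpack

# An in-memory buffer plays the role of a file opened in binary mode.
buf = io.BytesIO()

# pack() serializes the object and writes the bytes to the stream.
msgpack.pack({"id": 1, "tags": ["a", "b"]}, buf, use_bin_type=True)

# Rewind before reading, as you would when reopening a file.
buf.seek(0)

# unpack() reads the stream and deserializes one object from it.
restored = msgpack.unpack(buf, raw=False)
```

`dump` and `load` are aliases, so `msgpack.dump(obj, buf)` and `msgpack.load(buf)` should behave the same way.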
@ -125,8 +125,8 @@ Read the docstring for other options.
### Streaming unpacking ### Streaming unpacking
``Unpacker`` is a "streaming unpacker". It unpacks multiple objects from one `Unpacker` is a "streaming unpacker". It unpacks multiple objects from one
stream (or from bytes provided through its ``feed`` method). stream (or from bytes provided through its `feed` method).
```py ```py
import msgpack import msgpack
@ -147,7 +147,7 @@ stream (or from bytes provided through its ``feed`` method).
### Packing/unpacking of custom data type ### Packing/unpacking of custom data type
It is also possible to pack/unpack custom data types. Here is an example for It is also possible to pack/unpack custom data types. Here is an example for
``datetime.datetime``. `datetime.datetime`.
```py ```py
import datetime import datetime
@ -173,8 +173,8 @@ It is also possible to pack/unpack custom data types. Here is an example for
this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime, raw=False) this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime, raw=False)
``` ```
``Unpacker``'s ``object_hook`` callback receives a dict; the `Unpacker`'s `object_hook` callback receives a dict; the
``object_pairs_hook`` callback may instead be used to receive a list of `object_pairs_hook` callback may instead be used to receive a list of
key-value pairs. key-value pairs.
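
As a rough sketch of the difference, the hook below keeps the pairs as a list instead of building a dict. The name `keep_pairs` is hypothetical and only serves this example.

```python
import msgpack


def keep_pairs(pairs):
    # pairs arrives as a list of (key, value) tuples, in stream order.
    return list(pairs)


packed = msgpack.packb({"a": 1, "b": 2}, use_bin_type=True)

# With object_pairs_hook, each map is handed to the hook as key-value pairs.
result = msgpack.unpackb(packed, object_pairs_hook=keep_pairs, raw=False)
```

This can be useful when key order or duplicate keys matter, since a plain dict would hide both.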
@ -207,8 +207,8 @@ It is also possible to pack/unpack custom data types using the **ext** type.
### Advanced unpacking control ### Advanced unpacking control
As an alternative to iteration, ``Unpacker`` objects provide ``unpack``, As an alternative to iteration, `Unpacker` objects provide `unpack`,
``skip``, ``read_array_header`` and ``read_map_header`` methods. The former two `skip`, `read_array_header` and `read_map_header` methods. The former two
read an entire message from the stream, respectively de-serialising and returning read an entire message from the stream, respectively de-serialising and returning
the result, or ignoring it. The latter two methods return the number of elements the result, or ignoring it. The latter two methods return the number of elements
in the upcoming container, so that each element in an array, or key-value pair in the upcoming container, so that each element in an array, or key-value pair
@ -222,8 +222,8 @@ in a map, can be unpacked or skipped individually.
Early versions of msgpack didn't distinguish string and binary types. Early versions of msgpack didn't distinguish string and binary types.
The type for representing both string and binary types was named **raw**. The type for representing both string and binary types was named **raw**.
You can pack into and unpack from this old spec using ``use_bin_type=False`` You can pack into and unpack from this old spec using `use_bin_type=False`
and ``raw=True`` options. and `raw=True` options.
```pycon ```pycon
>>> import msgpack >>> import msgpack
@ -235,7 +235,7 @@ and ``raw=True`` options.
### ext type ### ext type
To use the **ext** type, pass ``msgpack.ExtType`` object to packer. To use the **ext** type, pass `msgpack.ExtType` object to packer.
```pycon ```pycon
>>> import msgpack >>> import msgpack
@ -244,7 +244,7 @@ To use the **ext** type, pass ``msgpack.ExtType`` object to packer.
ExtType(code=42, data='xyzzy') ExtType(code=42, data='xyzzy')
``` ```
You can use it with ``default`` and ``ext_hook``. See below. You can use it with `default` and `ext_hook`. See below.
### Security ### Security
@ -252,24 +252,24 @@ You can use it with ``default`` and ``ext_hook``. See below.
To unpack data received from an unreliable source, msgpack provides To unpack data received from an unreliable source, msgpack provides
two security options. two security options.
``max_buffer_size`` (default: 100*1024*1024) limits the internal buffer size. `max_buffer_size` (default: `100*1024*1024`) limits the internal buffer size.
It is used to limit the preallocated list size too. It is used to limit the preallocated list size too.
``strict_map_key`` (default: ``True``) limits the type of map keys to bytes and str. `strict_map_key` (default: `True`) limits the type of map keys to bytes and str.
While the msgpack spec doesn't limit the types of the map keys, While the msgpack spec doesn't limit the types of the map keys,
there is a risk of hashdos. there is a risk of hashdos.
If you need to support other types for map keys, use ``strict_map_key=False``. If you need to support other types for map keys, use `strict_map_key=False`.
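
A small sketch of both options in action. The tiny `max_buffer_size=16` is chosen only to make the limit easy to hit, and `strict_map_key=True` is passed explicitly so the example does not depend on the installed version's default.

```python
import msgpack
from msgpack.exceptions import BufferFull

# A map with an int key is valid msgpack, but strict_map_key=True rejects it.
packed = msgpack.packb({1: "one"}, use_bin_type=True)
try:
    msgpack.unpackb(packed, raw=False, strict_map_key=True)
    rejected = False
except ValueError:
    rejected = True

# Opt out only for data you trust.
trusted = msgpack.unpackb(packed, raw=False, strict_map_key=False)

# A small max_buffer_size makes feed() refuse input that does not fit.
unpacker = msgpack.Unpacker(max_buffer_size=16)
try:
    unpacker.feed(b"\x00" * 32)
    overflowed = False
except BufferFull:
    overflowed = True
```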
### Performance tips ### Performance tips
CPython's GC is triggered as the number of allocated objects grows. CPython's GC is triggered as the number of allocated objects grows.
This means unpacking may trigger needless GC passes. This means unpacking may trigger needless GC passes.
You can use ``gc.disable()`` when unpacking a large message. You can use `gc.disable()` when unpacking a large message.
List is the default sequence type of Python. List is the default sequence type of Python.
But tuple is lighter than list. But tuple is lighter than list.
You can use ``use_list=False`` while unpacking when performance is important. You can use `use_list=False` while unpacking when performance is important.
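
The two tips can be combined roughly as follows. This is a sketch, not a benchmark: `unpack_without_gc` is a hypothetical helper name, and whether it helps depends on your message size and workload.

```python
import gc

import msgpack


def unpack_without_gc(data):
    # Remember the current GC state so we only re-enable what we disabled.
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        # use_list=False yields tuples, which are lighter than lists.
        return msgpack.unpackb(data, use_list=False, raw=False)
    finally:
        if was_enabled:
            gc.enable()


payload = msgpack.packb([[i, str(i)] for i in range(1000)], use_bin_type=True)
rows = unpack_without_gc(payload)
```

The `try`/`finally` ensures the collector is restored even if unpacking raises.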
## Development ## Development