888 lines
36 KiB
TeX
\section{\module{pickle} --- Python object serialization}

\declaremodule{standard}{pickle}
\modulesynopsis{Convert Python objects to streams of bytes and back.}
% Substantial improvements by Jim Kerr <jbkerr@sr.hp.com>.
% Rewritten by Barry Warsaw <barry@zope.com>

\index{persistence}
\indexii{persistent}{objects}
\indexii{serializing}{objects}
\indexii{marshalling}{objects}
\indexii{flattening}{objects}
\indexii{pickling}{objects}

The \module{pickle} module implements a fundamental, but powerful
algorithm for serializing and de-serializing a Python object
structure. ``Pickling'' is the process whereby a Python object
hierarchy is converted into a byte stream, and ``unpickling'' is the
inverse operation, whereby a byte stream is converted back into an
object hierarchy. Pickling (and unpickling) is alternatively known as
``serialization'', ``marshalling'',\footnote{Don't confuse this with
the \refmodule{marshal} module.} or ``flattening'';
however, to avoid confusion, the terms used here are ``pickling'' and
``unpickling''.

This documentation describes both the \module{pickle} module and the
\refmodule{cPickle} module.

\subsection{Relationship to other Python modules}

The \module{pickle} module has an optimized cousin called the
\module{cPickle} module. As its name implies, \module{cPickle} is
written in C, so it can be up to 1000 times faster than
\module{pickle}. However, it does not support subclassing of the
\function{Pickler()} and \function{Unpickler()} classes, because in
\module{cPickle} these are functions, not classes. Most applications
have no need for this functionality, and can benefit from the improved
performance of \module{cPickle}. Other than that, the interfaces of
the two modules are nearly identical; the common interface is
described in this manual and differences are pointed out where
necessary. In the following discussions, we use the term ``pickle''
to collectively describe the \module{pickle} and
\module{cPickle} modules.

The data streams the two modules produce are guaranteed to be
interchangeable.

Python has a more primitive serialization module called
\refmodule{marshal}, but in general
\module{pickle} should always be the preferred way to serialize Python
objects. \module{marshal} exists primarily to support Python's
\file{.pyc} files.

The \module{pickle} module differs from \refmodule{marshal} in several
significant ways:

\begin{itemize}

\item The \module{pickle} module keeps track of the objects it has
already serialized, so that later references to the same object
won't be serialized again. \module{marshal} doesn't do this.

This has implications both for recursive objects and object
sharing. Recursive objects are objects that contain references
to themselves. These are not handled by \module{marshal}, and in fact,
attempting to marshal recursive objects will crash your Python
interpreter. Object sharing happens when there are multiple
references to the same object in different places in the object
hierarchy being serialized. \module{pickle} stores such objects
only once, and ensures that all other references point to the
master copy. Shared objects remain shared, which can be very
important for mutable objects.

\item \module{marshal} cannot be used to serialize user-defined
classes and their instances. \module{pickle} can save and
restore class instances transparently; however, the class
definition must be importable and live in the same module as
when the object was stored.

\item The \module{marshal} serialization format is not guaranteed to
be portable across Python versions. Because its primary job in
life is to support \file{.pyc} files, the Python implementers
reserve the right to change the serialization format in
non-backwards compatible ways should the need arise. The
\module{pickle} serialization format is guaranteed to be
backwards compatible across Python releases.

\end{itemize}

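As a brief illustration of the first point, the sketch below (written
for this section, not taken from the module's sources) pickles a
dictionary whose two values are the same list, and a list that
contains itself. The variable names are illustrative only, and the
\function{dumps()} and \function{loads()} functions it relies on are
described later in this section.

\begin{verbatim}
import pickle

shared = [1, 2, 3]
container = {'first': shared, 'second': shared}  # two references, one list

recursive = []
recursive.append(recursive)                      # a list that contains itself

container_copy = pickle.loads(pickle.dumps(container))
recursive_copy = pickle.loads(pickle.dumps(recursive))

# Sharing survives the round trip: both keys still name a single list,
# which is a new object distinct from the original.
print(container_copy['first'] is container_copy['second'])
print(container_copy['first'] is shared)

# The self-reference survives as well.
print(recursive_copy[0] is recursive_copy)
\end{verbatim}
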
\begin{notice}[warning]
The \module{pickle} module is not intended to be secure against
erroneous or maliciously constructed data. Never unpickle data
received from an untrusted or unauthenticated source.
\end{notice}

Note that serialization is a more primitive notion than persistence;
although \module{pickle} reads and writes file objects, it does not handle the
issue of naming persistent objects, nor the (even more complicated)
issue of concurrent access to persistent objects. The \module{pickle}
module can transform a complex object into a byte stream and it can
transform the byte stream into an object with the same internal
structure. Perhaps the most obvious thing to do with these byte
streams is to write them onto a file, but it is also conceivable to
send them across a network or store them in a database. The module
\refmodule{shelve} provides a simple interface
to pickle and unpickle objects on DBM-style database files.

\subsection{Data stream format}

The data format used by \module{pickle} is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as XDR\index{XDR}\index{External Data Representation}
(which can't represent pointer sharing); however, it means that
non-Python programs may not be able to reconstruct pickled Python
objects.

By default, the \module{pickle} data format uses a printable \ASCII{}
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable \ASCII{} (and of
some other characteristics of \module{pickle}'s representation) is that
for debugging or recovery purposes it is possible for a human to read
the pickled file with a standard text editor.

There are currently 3 different protocols which can be used for pickling.

\begin{itemize}

\item Protocol version 0 is the original \ASCII{} protocol and is backwards
compatible with earlier versions of Python.

\item Protocol version 1 is the old binary format which is also compatible
with earlier versions of Python.

\item Protocol version 2 was introduced in Python 2.3. It provides
much more efficient pickling of new-style classes.

\end{itemize}

Refer to PEP 307 for more information.

If a \var{protocol} is not specified, protocol 0 is used.
If \var{protocol} is specified as a negative value
or \constant{HIGHEST_PROTOCOL},
the highest protocol version available will be used.

\versionchanged[Introduced the \var{protocol} parameter]{2.3}

A binary format, which is slightly more efficient, can be chosen by
specifying a \var{protocol} version >= 1.

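The effect of the \var{protocol} argument can be tried out with a
small experiment. The following is only a sketch: the sample data is
made up, and it uses the \function{dumps()} and \function{loads()}
functions described in the next section. The protocol is always passed
explicitly, so the example does not depend on the default.

\begin{verbatim}
import pickle

data = {'name': 'spam', 'values': [1, 2, 3]}

text_form = pickle.dumps(data, 0)                          # printable protocol 0
binary_form = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)  # newest binary format
negative_form = pickle.dumps(data, -1)                     # negative: also newest

# Every protocol reconstructs an equal object.
print(pickle.loads(text_form) == data)
print(pickle.loads(binary_form) == data)
print(pickle.loads(negative_form) == data)

# The relative sizes depend on the data and on the Python version.
print(len(text_form))
print(len(binary_form))
\end{verbatim}
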
\subsection{Usage}

To serialize an object hierarchy, you first create a pickler, then you
call the pickler's \method{dump()} method. To de-serialize a data
stream, you first create an unpickler, then you call the unpickler's
\method{load()} method. The \module{pickle} module provides the
following constant:

\begin{datadesc}{HIGHEST_PROTOCOL}
The highest protocol version available. This value can be passed
as a \var{protocol} value.
\versionadded{2.3}
\end{datadesc}

\note{Be sure to always open pickle files created with protocols >= 1 in
binary mode. For the old ASCII-based pickle protocol 0 you can use
either text mode or binary mode as long as you stay consistent.

A pickle file written with protocol 0 in binary mode will contain
lone linefeeds as line terminators and therefore will look ``funny''
when viewed in Notepad or other editors which do not support this
format.}

The \module{pickle} module provides the
following functions to make the pickling process more convenient:

\begin{funcdesc}{dump}{obj, file\optional{, protocol}}
Write a pickled representation of \var{obj} to the open file object
\var{file}. This is equivalent to
\code{Pickler(\var{file}, \var{protocol}).dump(\var{obj})}.

If the \var{protocol} parameter is omitted, protocol 0 is used.
If \var{protocol} is specified as a negative value
or \constant{HIGHEST_PROTOCOL},
the highest protocol version will be used.

\versionchanged[Introduced the \var{protocol} parameter]{2.3}

\var{file} must have a \method{write()} method that accepts a single
string argument. It can thus be a file object opened for writing, a
\refmodule{StringIO} object, or any other custom
object that meets this interface.
\end{funcdesc}

\begin{funcdesc}{load}{file}
Read a string from the open file object \var{file} and interpret it as
a pickle data stream, reconstructing and returning the original object
hierarchy. This is equivalent to \code{Unpickler(\var{file}).load()}.

\var{file} must have two methods, a \method{read()} method that takes
an integer argument, and a \method{readline()} method that requires no
arguments. Both methods should return a string. Thus \var{file} can
be a file object opened for reading, a
\module{StringIO} object, or any other custom
object that meets this interface.

This function automatically determines whether the data stream was
written in binary mode or not.
\end{funcdesc}

\begin{funcdesc}{dumps}{obj\optional{, protocol}}
Return the pickled representation of the object as a string, instead
of writing it to a file.

If the \var{protocol} parameter is omitted, protocol 0 is used.
If \var{protocol} is specified as a negative value
or \constant{HIGHEST_PROTOCOL},
the highest protocol version will be used.

\versionchanged[The \var{protocol} parameter was added]{2.3}

\end{funcdesc}

\begin{funcdesc}{loads}{string}
Read a pickled object hierarchy from a string. Characters in the
string past the pickled object's representation are ignored.
\end{funcdesc}

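The four functions above can be exercised together as follows. This is
a minimal sketch rather than a reference example: the sample record is
invented, and an in-memory \class{io.BytesIO} buffer is assumed as the
file-like object, standing in for a file opened in binary mode on
interpreters where pickle data is a byte string.

\begin{verbatim}
import io
import pickle

record = {'id': 7, 'tags': ('a', 'b')}

# Any object with a write() method can receive the pickle.
stream = io.BytesIO()
pickle.dump(record, stream, 2)

stream.seek(0)                      # rewind before reading the data back
restored = pickle.load(stream)      # uses the stream's read() and readline()
print(restored == record)

# dumps() and loads() do the same work without a file object.  Data that
# follows the pickled object's representation is ignored by loads().
as_string = pickle.dumps(record, 2)
padded = as_string + b'trailing data'
print(pickle.loads(padded) == record)
\end{verbatim}
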
The \module{pickle} module also defines three exceptions:

\begin{excdesc}{PickleError}
A common base class for the other exceptions defined below. This
inherits from \exception{Exception}.
\end{excdesc}

\begin{excdesc}{PicklingError}
This exception is raised when an unpicklable object is passed to
the \method{dump()} method.
\end{excdesc}

\begin{excdesc}{UnpicklingError}
This exception is raised when there is a problem unpickling an object.
Note that other exceptions may also be raised during unpickling,
including (but not necessarily limited to) \exception{AttributeError},
\exception{EOFError}, \exception{ImportError}, and \exception{IndexError}.
\end{excdesc}

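Because several unrelated exceptions can surface while unpickling,
code that reads possibly damaged data usually catches them as a group.
The helper below, \function{try_loads()}, is a hypothetical name
introduced only for this sketch. Note that catching these exceptions
guards against corrupted input only; it does not make it safe to
unpickle data from an untrusted source.

\begin{verbatim}
import pickle

def try_loads(data):
    """Return (True, object) on success, or (False, exception) on failure."""
    try:
        return True, pickle.loads(data)
    except (pickle.UnpicklingError, AttributeError, EOFError,
            ImportError, IndexError) as exc:
        return False, exc

good = pickle.dumps([1, 2, 3], 2)
print(try_loads(good))
print(try_loads(good[:-3])[0])   # a truncated data stream
print(try_loads(b'')[0])         # an empty data stream
\end{verbatim}
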
The \module{pickle} module also exports two callables\footnote{In the
\module{pickle} module these callables are classes, which you could
subclass to customize the behavior. However, in the \refmodule{cPickle}
module these callables are factory functions and so cannot be
subclassed. One common reason to subclass is to control what
objects can actually be unpickled. See section~\ref{pickle-sub} for
more details.}, \class{Pickler} and \class{Unpickler}:

\begin{classdesc}{Pickler}{file\optional{, protocol}}
This takes a file-like object to which it will write a pickle data
stream.

If the \var{protocol} parameter is omitted, protocol 0 is used.
If \var{protocol} is specified as a negative value,
the highest protocol version will be used.

\versionchanged[Introduced the \var{protocol} parameter]{2.3}

\var{file} must have a \method{write()} method that accepts a single
string argument. It can thus be an open file object, a
\module{StringIO} object, or any other custom
object that meets this interface.
\end{classdesc}

\class{Pickler} objects define one (or two) public methods:

\begin{methoddesc}[Pickler]{dump}{obj}
Write a pickled representation of \var{obj} to the open file object
given in the constructor. Either the binary or \ASCII{} format will
be used, depending on the value of the \var{protocol} argument passed to the
constructor.
\end{methoddesc}

\begin{methoddesc}[Pickler]{clear_memo}{}
Clears the pickler's ``memo''. The memo is the data structure that
remembers which objects the pickler has already seen, so that shared
or recursive objects are pickled by reference and not by value. This
method is useful when re-using picklers.

\begin{notice}
Prior to Python 2.3, \method{clear_memo()} was only available on the
picklers created by \refmodule{cPickle}. In the \module{pickle} module,
picklers have an instance variable called \member{memo} which is a
Python dictionary. So to clear the memo for a \module{pickle} module
pickler, you could do the following:

\begin{verbatim}
mypickler.memo.clear()
\end{verbatim}

Code that does not need to support older versions of Python should
simply use \method{clear_memo()}.
\end{notice}
\end{methoddesc}

It is possible to make multiple calls to the \method{dump()} method of
the same \class{Pickler} instance. These must then be matched to the
same number of calls to the \method{load()} method of the
corresponding \class{Unpickler} instance. If the same object is
pickled by multiple \method{dump()} calls, the \method{load()} calls will
all yield references to the same object.\footnote{\emph{Warning}: this
is intended for pickling multiple objects without intervening
modifications to the objects or their parts. If you modify an object
and then pickle it again using the same \class{Pickler} instance, the
object is not pickled again --- a reference to it is pickled and the
\class{Unpickler} will return the old value, not the modified one.
There are two problems here: (1) detecting changes, and (2)
marshalling a minimal set of changes. Garbage collection may also
become a problem here.}

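The interaction between repeated \method{dump()} calls and the memo
can be seen in the following sketch. The \code{settings} dictionary is
invented for illustration, and an \class{io.BytesIO} buffer is assumed
in place of a real file. The \class{Unpickler} class used to read the
data back is described below.

\begin{verbatim}
import io
import pickle

stream = io.BytesIO()
pickler = pickle.Pickler(stream, 2)

settings = {'debug': False}
pickler.dump(settings)          # the dictionary is written in full
settings['debug'] = True
pickler.dump(settings)          # only a reference to the memo entry is written

pickler.clear_memo()
pickler.dump(settings)          # written in full again, with the new value

stream.seek(0)
unpickler = pickle.Unpickler(stream)
first = unpickler.load()
second = unpickler.load()
third = unpickler.load()

print(first is second)          # both loads refer to one object
print(second['debug'])          # the modification was not recorded
print(third['debug'])           # after clear_memo() it was
\end{verbatim}

Calling \method{clear_memo()} between dumps trades the sharing
guarantee for fresh values: the third object is a separate copy, no
longer linked to the first two.
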
\class{Unpickler} objects are defined as:

\begin{classdesc}{Unpickler}{file}
This takes a file-like object from which it will read a pickle data
stream. This class automatically determines whether the data stream
was written in binary mode or not, so it does not need a flag as in
the \class{Pickler} factory.

\var{file} must have two methods, a \method{read()} method that takes
an integer argument, and a \method{readline()} method that requires no
arguments. Both methods should return a string. Thus \var{file} can
be a file object opened for reading, a
\module{StringIO} object, or any other custom
object that meets this interface.
\end{classdesc}

\class{Unpickler} objects have one (or two) public methods:

\begin{methoddesc}[Unpickler]{load}{}
Read a pickled object representation from the open file object given
in the constructor, and return the reconstituted object hierarchy
specified therein.
\end{methoddesc}

\begin{methoddesc}[Unpickler]{noload}{}
This is just like \method{load()} except that it doesn't actually
create any objects. This is useful primarily for finding what are
called ``persistent ids'' that may be referenced in a pickle data
stream. See section~\ref{pickle-protocol} below for more details.

\strong{Note:} the \method{noload()} method is currently only
available on \class{Unpickler} objects created with the
\module{cPickle} module. \module{pickle} module \class{Unpickler}s do
not have the \method{noload()} method.
\end{methoddesc}

\subsection{What can be pickled and unpickled?}

The following types can be pickled:

\begin{itemize}

\item \code{None}, \code{True}, and \code{False}

\item integers, long integers, floating point numbers, complex numbers

\item normal and Unicode strings

\item tuples, lists, sets, and dictionaries containing only picklable objects

\item functions defined at the top level of a module

\item built-in functions defined at the top level of a module

\item classes that are defined at the top level of a module

\item instances of such classes whose \member{__dict__} or the result
of calling \method{__getstate__()} is picklable (see
section~\ref{pickle-protocol} for details)

\end{itemize}

Attempts to pickle unpicklable objects will raise the
\exception{PicklingError} exception; when this happens, an unspecified
number of bytes may have already been written to the underlying file.
Trying to pickle a highly recursive data structure may exceed the
maximum recursion depth; a \exception{RuntimeError} will be raised
in this case. You can carefully raise this limit with
\function{sys.setrecursionlimit()}.

Note that functions (built-in and user-defined) are pickled by ``fully
qualified'' name reference, not by value. This means that only the
function name is pickled, along with the name of the module the function
is defined in. Neither the function's code, nor any of its function
attributes are pickled. Thus the defining module must be importable
in the unpickling environment, and the module must contain the named
object, otherwise an exception will be raised.\footnote{The exception
raised will likely be an \exception{ImportError} or an
\exception{AttributeError} but it could be something else.}

Similarly, classes are pickled by named reference, so the same
restrictions in the unpickling environment apply. Note that none of
the class's code or data is pickled, so in the following example the
class attribute \code{attr} is not restored in the unpickling
environment:

\begin{verbatim}
class Foo:
    attr = 'a class attr'

picklestring = pickle.dumps(Foo)
\end{verbatim}

These restrictions are why picklable functions and classes must be
defined in the top level of a module.

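Pickling by name reference can be observed directly. The sketch below
is illustrative only; it uses the built-in function \function{len()}
and the \class{deque} class from the \module{collections} module as
convenient examples of objects defined at the top level of a module.

\begin{verbatim}
import collections
import pickle

# Only the names travel in the pickle, so unpickling hands back the
# very same objects, found again by importing their modules.
same_function = pickle.loads(pickle.dumps(len)) is len
same_class = pickle.loads(pickle.dumps(collections.deque)) is collections.deque
print(same_function)
print(same_class)

# With the printable protocol 0 the name is visible in the data itself.
print(b'len' in pickle.dumps(len, 0))
print(b'deque' in pickle.dumps(collections.deque, 0))
\end{verbatim}
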
Similarly, when class instances are pickled, their class's code and
data are not pickled along with them. Only the instance data are
pickled. This is done on purpose, so you can fix bugs in a class or
add methods to the class and still load objects that were created with
an earlier version of the class. If you plan to have long-lived
objects that will see many versions of a class, it may be worthwhile
to put a version number in the objects so that suitable conversions
can be made by the class's \method{__setstate__()} method.

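One way such a version number might be used is sketched here. The
\class{Account} class, its attributes, and the renaming of
\code{owner} to \code{holder} are all hypothetical, invented for this
example; the old-format instance is simulated by filling in an
instance dictionary by hand. The class must be defined at the top
level of an importable module for the pickle to load.

\begin{verbatim}
import pickle

class Account(object):
    """Hypothetical class; its version 1 stored ``owner``, not ``holder``."""

    def __init__(self, holder):
        self.holder = holder
        self.version = 2                # stored with every instance

    def __setstate__(self, state):
        if state.get('version', 1) < 2:
            state = dict(state)         # convert the old layout
            state['holder'] = state.pop('owner')
            state['version'] = 2
        self.__dict__.update(state)

# Simulate an instance that was created by version 1 of the class.
old = Account.__new__(Account)
old.__dict__.update({'owner': 'guido'})

restored = pickle.loads(pickle.dumps(old))
print(restored.holder)
print(restored.version)
\end{verbatim}
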
\subsection{The pickle protocol
\label{pickle-protocol}}\setindexsubitem{(pickle protocol)}

This section describes the ``pickling protocol'' that defines the
interface between the pickler/unpickler and the objects that are being
serialized. This protocol provides a standard way for you to define,
customize, and control how your objects are serialized and
de-serialized. The description in this section doesn't cover specific
customizations that you can employ to make the unpickling environment
slightly safer from untrusted pickle data streams; see section~\ref{pickle-sub}
for more details.

\subsubsection{Pickling and unpickling normal class
instances\label{pickle-inst}}

When a pickled class instance is unpickled, its \method{__init__()}
|
|
method is normally \emph{not} invoked. If it is desirable that the
|
|
\method{__init__()} method be called on unpickling, an old-style class
|
|
can define a method \method{__getinitargs__()}, which should return a
|
|
\emph{tuple} containing the arguments to be passed to the class
|
|
constructor (\method{__init__()} for example). The
|
|
\method{__getinitargs__()} method is called at
|
|
pickle time; the tuple it returns is incorporated in the pickle for
|
|
the instance.
|
|
\withsubitem{(copy protocol)}{\ttindex{__getinitargs__()}}
|
|
\withsubitem{(instance constructor)}{\ttindex{__init__()}}
|
|
|
|
\withsubitem{(copy protocol)}{\ttindex{__getnewargs__()}}
|
|
|
|
New-style types can provide a \method{__getnewargs__()} method that is
|
|
used for protocol 2. Implementing this method is needed if the type
|
|
establishes some internal invariants when the instance is created, or
|
|
if the memory allocation is affected by the values passed to the
|
|
\method{__new__()} method for the type (as it is for tuples and
|
|
strings). Instances of a new-style type \class{C} are created using
|
|
|
|
\begin{alltt}
|
|
obj = C.__new__(C, *\var{args})
|
|
\end{alltt}
|
|
|
|
where \var{args} is the result of calling \method{__getnewargs__()} on
|
|
the original object; if there is no \method{__getnewargs__()}, an
|
|
empty tuple is assumed.
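
As an illustration, here is a minimal sketch of a type whose \method{__new__()} requires arguments, assuming a current Python 3 interpreter. The \class{Celsius} class and its \code{label} attribute are hypothetical.

```python
import pickle

class Celsius(float):
    # Hypothetical immutable type: the value must be supplied to __new__().
    def __new__(cls, degrees, label):
        self = super().__new__(cls, degrees)
        self.label = label
        return self

    def __getnewargs__(self):
        # Called at pickling time; the tuple is passed to __new__() on loading.
        return (float(self), self.label)

reading = Celsius(21.5, 'office')
restored = pickle.loads(pickle.dumps(reading, protocol=2))
```

On loading, the unpickler calls \code{Celsius.__new__(Celsius, 21.5, 'office')}, so the float value and the label are both in place before any further state is applied.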

\withsubitem{(copy protocol)}{
  \ttindex{__getstate__()}\ttindex{__setstate__()}}
\withsubitem{(instance attribute)}{
  \ttindex{__dict__}}

Classes can further influence how their instances are pickled; if the
class defines the method \method{__getstate__()}, it is called and the
returned state is pickled as the contents for the instance, instead of
the contents of the instance's dictionary. If there is no
\method{__getstate__()} method, the instance's \member{__dict__} is
pickled.

Upon unpickling, if the class also defines the method
\method{__setstate__()}, it is called with the unpickled
state.\footnote{These methods can also be used to implement copying
class instances.} If there is no \method{__setstate__()} method, the
pickled state must be a dictionary and its items are assigned to the
new instance's dictionary. If a class defines both
\method{__getstate__()} and \method{__setstate__()}, the state object
needn't be a dictionary and these methods can do what they
want.\footnote{This protocol is also used by the shallow and deep
copying operations defined in the
\refmodule{copy} module.}
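
The following sketch shows a state object that is not a dictionary, assuming a current Python 3 interpreter. The \class{Point} class and its \code{cache} attribute are hypothetical.

```python
import pickle

class Point:
    # Hypothetical class whose pickled state is a tuple rather than a dict.
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.cache = {}          # derived data that is not worth saving

    def __getstate__(self):
        return (self.x, self.y)

    def __setstate__(self, state):
        # __init__() is not called on unpickling, so rebuild everything here.
        self.x, self.y = state
        self.cache = {}

p = Point(3, 4)
p.cache['norm'] = 5.0
q = pickle.loads(pickle.dumps(p))
```

Because \method{__setstate__()} receives exactly what \method{__getstate__()} returned, the two methods are free to agree on any picklable representation.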

\begin{notice}[warning]
  For new-style classes, if \method{__getstate__()} returns a false
  value, the \method{__setstate__()} method will not be called.
\end{notice}


\subsubsection{Pickling and unpickling extension types}

When the \class{Pickler} encounters an object of a type it knows
nothing about --- such as an extension type --- it looks in two places
for a hint of how to pickle it. One alternative is for the object to
implement a \method{__reduce__()} method. If provided, at pickling
time \method{__reduce__()} will be called with no arguments, and it
must return either a string or a tuple.

If a string is returned, it names a global variable whose contents are
pickled as normal. The string returned by \method{__reduce__} should
be the object's local name relative to its module; the pickle module
searches the module namespace to determine the object's module.

When a tuple is returned, it must be between two and five elements
long. Optional elements can either be omitted, or \code{None} can be provided
as their value. The semantics of each element are:

\begin{itemize}

\item A callable object that will be called to create the initial
version of the object. The next element of the tuple will provide
arguments for this callable, and later elements provide additional
state information that will subsequently be used to fully reconstruct
the pickled data.

In the unpickling environment this object must be either a class, a
callable registered as a ``safe constructor'' (see below), or it must
have an attribute \member{__safe_for_unpickling__} with a true value.
Otherwise, an \exception{UnpicklingError} will be raised in the
unpickling environment. Note that as usual, the callable itself is
pickled by name.

\item A tuple of arguments for the callable object.
\versionchanged[Formerly, this argument could also be \code{None}]{2.5}

\item Optionally, the object's state, which will be passed to
the object's \method{__setstate__()} method as described in
section~\ref{pickle-inst}. If the object has no
\method{__setstate__()} method, then, as above, the value must
be a dictionary and it will be added to the object's
\member{__dict__}.

\item Optionally, an iterator (and not a sequence) yielding successive
list items. These list items will be pickled, and appended to the
object using either \code{obj.append(\var{item})} or
\code{obj.extend(\var{list_of_items})}. This is primarily used for
list subclasses, but may be used by other classes as long as they have
\method{append()} and \method{extend()} methods with the appropriate
signature. (Whether \method{append()} or \method{extend()} is used
depends on which pickle protocol version is used as well as the number
of items to append, so both must be supported.)

\item Optionally, an iterator (not a sequence)
yielding successive dictionary items, which should be tuples of the
form \code{(\var{key}, \var{value})}. These items will be pickled
and stored to the object using \code{obj[\var{key}] = \var{value}}.
This is primarily used for dictionary subclasses, but may be used by
other classes as long as they implement \method{__setitem__}.

\end{itemize}
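
The tuple form described in the list above can be sketched as follows, assuming a current Python 3 interpreter. The \class{TaggedList} class is hypothetical; it uses the first, second, and fourth elements and passes \code{None} for the optional state.

```python
import pickle

class TaggedList(list):
    # Hypothetical list subclass carrying one extra attribute.
    def __init__(self, tag, items=()):
        super().__init__(items)
        self.tag = tag

    def __reduce__(self):
        return (
            TaggedList,          # callable that creates the initial object
            (self.tag,),         # tuple of arguments for the callable
            None,                # no state to pass to __setstate__()
            iter(self),          # iterator yielding successive list items
        )

original = TaggedList('primes', [2, 3, 5, 7])
restored = pickle.loads(pickle.dumps(original))
```

On loading, \code{TaggedList('primes')} is called first and the list items are then added to the new, empty object.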

It is sometimes useful to know the protocol version when implementing
\method{__reduce__}. This can be done by implementing a method named
\method{__reduce_ex__} instead of \method{__reduce__}.
\method{__reduce_ex__}, when it exists, is called in preference over
\method{__reduce__} (you may still provide \method{__reduce__} for
backwards compatibility). The \method{__reduce_ex__} method will be
called with a single integer argument, the protocol version.

The \class{object} class implements both \method{__reduce__} and
\method{__reduce_ex__}; however, if a subclass overrides
\method{__reduce__} but not \method{__reduce_ex__}, the
\method{__reduce_ex__} implementation detects this and calls
\method{__reduce__}.
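
A small sketch makes the preference order visible, assuming a current Python 3 interpreter. The \class{Sample} class and its \code{pickled_with} attribute are hypothetical and exist only to record the protocol number that was passed in.

```python
import pickle

class Sample:
    # Hypothetical class that records which protocol was used to pickle it.
    def __init__(self, values, pickled_with=None):
        self.values = values
        self.pickled_with = pickled_with

    def __reduce_ex__(self, protocol):
        # Called with the protocol version as its single argument.
        return (Sample, (self.values, protocol))

    def __reduce__(self):
        # Kept for callers that invoke __reduce__() directly.
        return (Sample, (self.values, None))

s = Sample([1, 2, 3])
via_0 = pickle.loads(pickle.dumps(s, 0))
via_2 = pickle.loads(pickle.dumps(s, 2))
```

Each restored object carries the protocol number it was pickled with, which shows that \method{__reduce_ex__} was called rather than \method{__reduce__}.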

An alternative to implementing a \method{__reduce__()} method on the
object to be pickled is to register a callable with the
\refmodule[copyreg]{copy_reg} module. This module provides a way
for programs to register ``reduction functions'' and constructors for
user-defined types. Reduction functions have the same semantics and
interface as the \method{__reduce__()} method described above, except
that they are called with a single argument, the object to be pickled.

The registered constructor is deemed a ``safe constructor'' for purposes
of unpickling as described above.
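
Registering a reduction function might look like the sketch below. It assumes a current Python 3 interpreter, where the module is imported as \module{copyreg}. The \class{Money} class and the \function{reduce_money()} function are hypothetical.

```python
import copyreg
import pickle

class Money:
    # Hypothetical stand-in for a type that cannot be modified directly.
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

def reduce_money(obj):
    # Same return convention as __reduce__(), but the object is an argument.
    return (Money, (obj.amount, obj.currency))

# Associate the reduction function with the type.
copyreg.pickle(Money, reduce_money)

restored = pickle.loads(pickle.dumps(Money(5, 'EUR')))
```

This approach is useful when the type is defined in code you do not control, since nothing has to be added to the class itself.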


\subsubsection{Pickling and unpickling external objects}

For the benefit of object persistence, the \module{pickle} module
supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a ``persistent id'',
which is just an arbitrary string of printable \ASCII{} characters.
The resolution of such names is not defined by the \module{pickle}
module; it will delegate this resolution to user-defined functions on
the pickler and unpickler.\footnote{The actual mechanism for
associating these user-defined functions is slightly different for
\module{pickle} and \module{cPickle}. The description given here
works the same for both implementations. Users of the \module{pickle}
module could also use subclassing to effect the same results,
overriding the \method{persistent_id()} and \method{persistent_load()}
methods in the derived classes.}

To define external persistent id resolution, you need to set the
\member{persistent_id} attribute of the pickler object and the
\member{persistent_load} attribute of the unpickler object.

To pickle objects that have an external persistent id, the pickler
must have a custom \function{persistent_id()} method that takes an
object as an argument and returns either \code{None} or the persistent
id for that object. When \code{None} is returned, the pickler simply
pickles the object as normal. When a persistent id string is
returned, the pickler will pickle that string, along with a marker
so that the unpickler will recognize the string as a persistent id.

To unpickle external objects, the unpickler must have a custom
\function{persistent_load()} function that takes a persistent id
string and returns the referenced object.

Here's a silly example that \emph{might} shed more light:

\begin{verbatim}
import pickle
from cStringIO import StringIO

src = StringIO()
p = pickle.Pickler(src)

def persistent_id(obj):
    if hasattr(obj, 'x'):
        return 'the value %d' % obj.x
    else:
        return None

p.persistent_id = persistent_id

class Integer:
    def __init__(self, x):
        self.x = x
    def __str__(self):
        return 'My name is integer %d' % self.x

i = Integer(7)
print i
p.dump(i)

datastream = src.getvalue()
print repr(datastream)
dst = StringIO(datastream)

up = pickle.Unpickler(dst)

class FancyInteger(Integer):
    def __str__(self):
        return 'I am the integer %d' % self.x

def persistent_load(persid):
    if persid.startswith('the value '):
        value = int(persid.split()[2])
        return FancyInteger(value)
    else:
        raise pickle.UnpicklingError, 'Invalid persistent id'

up.persistent_load = persistent_load

j = up.load()
print j
\end{verbatim}

In the \module{cPickle} module, the unpickler's
\member{persistent_load} attribute can also be set to a Python
list, in which case, when the unpickler reaches a persistent id, the
persistent id string will simply be appended to this list. This
functionality exists so that a pickle data stream can be ``sniffed''
for object references without actually instantiating all the objects
in a pickle.\footnote{We'll leave you with the image of Guido and Jim
sitting around sniffing pickles in their living rooms.} Setting
\member{persistent_load} to a list is usually used in conjunction with
the \method{noload()} method on the Unpickler.

% BAW: Both pickle and cPickle support something called
% inst_persistent_id() which appears to give unknown types a second
% shot at producing a persistent id. Since Jim Fulton can't remember
% why it was added or what it's for, I'm leaving it undocumented.

\subsection{Subclassing Unpicklers \label{pickle-sub}}

By default, unpickling will import any class that it finds in the
pickle data. You can control exactly what gets unpickled and what
gets called by customizing your unpickler. Unfortunately, exactly how
you do this is different depending on whether you're using
\module{pickle} or \module{cPickle}.\footnote{A word of caution: the
mechanisms described here use internal attributes and methods, which
are subject to change in future versions of Python. We intend to
someday provide a common interface for controlling this behavior,
which will work in either \module{pickle} or \module{cPickle}.}

In the \module{pickle} module, you need to derive a subclass from
\class{Unpickler}, overriding the \method{load_global()}
method. \method{load_global()} should read two lines from the pickle
data stream where the first line will be the name of the module
containing the class and the second line will be the name of the
instance's class. It then looks up the class, possibly importing the
module and digging out the attribute, then it appends what it finds to
the unpickler's stack. Later on, this class will be assigned to the
\member{__class__} attribute of an empty class, as a way of magically
creating an instance without calling its class's \method{__init__()}.
Your job (should you choose to accept it) would be to have
\method{load_global()} push onto the unpickler's stack a known safe
version of any class you deem safe to unpickle. It is up to you to
produce such a class. Or you could raise an error if you want to
disallow all unpickling of instances. If this sounds like a hack,
you're right. Refer to the source code to make this work.

Things are a little cleaner with \module{cPickle}, but not by much.
To control what gets unpickled, you can set the unpickler's
\member{find_global} attribute to a function or \code{None}. If it is
\code{None} then any attempts to unpickle instances will raise an
\exception{UnpicklingError}. If it is a function,
then it should accept a module name and a class name, and return the
corresponding class object. It is responsible for looking up the
class and performing any necessary imports, and it may raise an
error to prevent instances of the class from being unpickled.

The moral of the story is that you should be really careful about the
source of the strings your application unpickles.
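
The hooks named in this section belong to the implementations documented here. As a rough sketch of the same idea for a current Python 3 interpreter (an assumption, since this section predates it), the method to override there is \method{find_class()} on a \class{pickle.Unpickler} subclass. The allow-list \code{SAFE_BUILTINS} and the helper \function{restricted_loads()} are names chosen for illustration.

```python
import builtins
import io
import pickle

# Hypothetical allow-list; anything not named here is refused.
SAFE_BUILTINS = {'range', 'complex', 'set', 'frozenset'}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Accepts a module name and a class name, like find_global above.
        if module == 'builtins' and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            'global %s.%s is forbidden' % (module, name))

def restricted_loads(data):
    # Counterpart of pickle.loads() that uses the restricted unpickler.
    return RestrictedUnpickler(io.BytesIO(data)).load()

allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

try:
    restricted_loads(pickle.dumps(io.BytesIO))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

An allow-list of this kind narrows what a pickle can reference, but it does not make unpickling data from an untrusted source safe in general.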

\subsection{Example \label{pickle-example}}

For the simplest code, use the \function{dump()} and \function{load()}
functions. Note that a self-referencing list is pickled and restored
correctly.

\begin{verbatim}
import pickle

data1 = {'a': [1, 2.0, 3, 4+6j],
         'b': ('string', u'Unicode string'),
         'c': None}

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

output = open('data.pkl', 'wb')

# Pickle dictionary using protocol 0.
pickle.dump(data1, output)

# Pickle the list using the highest protocol available.
pickle.dump(selfref_list, output, -1)

output.close()
\end{verbatim}

The following example reads the resulting pickled data. When reading
a pickle-containing file, you should open the file in binary mode
because you can't be sure if the ASCII or binary format was used.

\begin{verbatim}
import pprint, pickle

pkl_file = open('data.pkl', 'rb')

data1 = pickle.load(pkl_file)
pprint.pprint(data1)

data2 = pickle.load(pkl_file)
pprint.pprint(data2)

pkl_file.close()
\end{verbatim}

Here's a larger example that shows how to modify pickling behavior for a
class. The \class{TextReader} class opens a text file, and returns
the line number and line contents each time its \method{readline()}
method is called. If a \class{TextReader} instance is pickled, all
attributes \emph{except} the file object member are saved. When the
instance is unpickled, the file is reopened, and reading resumes from
the last location. The \method{__setstate__()} and
\method{__getstate__()} methods are used to implement this behavior.

\begin{verbatim}
class TextReader:
    """Print and number lines in a text file."""
    def __init__(self, file):
        self.file = file
        self.fh = open(file)
        self.lineno = 0

    def readline(self):
        self.lineno = self.lineno + 1
        line = self.fh.readline()
        if not line:
            return None
        if line.endswith("\n"):
            line = line[:-1]
        return "%d: %s" % (self.lineno, line)

    def __getstate__(self):
        odict = self.__dict__.copy() # copy the dict since we change it
        del odict['fh']              # remove filehandle entry
        return odict

    def __setstate__(self, dict):
        fh = open(dict['file'])      # reopen file
        count = dict['lineno']       # read from file...
        while count:                 # until line count is restored
            fh.readline()
            count = count - 1
        self.__dict__.update(dict)   # update attributes
        self.fh = fh                 # save the file object
\end{verbatim}

A sample usage might be something like this:

\begin{verbatim}
>>> import TextReader
>>> obj = TextReader.TextReader("TextReader.py")
>>> obj.readline()
'1: #!/usr/local/bin/python'
>>> # (more invocations of obj.readline() here)
... obj.readline()
'7: class TextReader:'
>>> import pickle
>>> pickle.dump(obj, open('save.p', 'wb'))
\end{verbatim}

If you want to see that \refmodule{pickle} works across Python
processes, start another Python session before continuing. What
follows can happen from either the same process or a new process.

\begin{verbatim}
>>> import pickle
>>> reader = pickle.load(open('save.p', 'rb'))
>>> reader.readline()
'8: "Print and number lines in a text file."'
\end{verbatim}


\begin{seealso}
  \seemodule[copyreg]{copy_reg}{Pickle interface constructor
                                registration for extension types.}

  \seemodule{shelve}{Indexed databases of objects; uses \module{pickle}.}

  \seemodule{copy}{Shallow and deep object copying.}

  \seemodule{marshal}{High-performance serialization of built-in types.}
\end{seealso}


\section{\module{cPickle} --- A faster \module{pickle}}

\declaremodule{builtin}{cPickle}
\modulesynopsis{Faster version of \refmodule{pickle}, but not subclassable.}
\moduleauthor{Jim Fulton}{jim@zope.com}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}

The \module{cPickle} module supports serialization and
de-serialization of Python objects, providing an interface and
functionality nearly identical to the
\refmodule{pickle}\refstmodindex{pickle} module. There are several
differences, the most important being performance and subclassability.

First, \module{cPickle} can be up to 1000 times faster than
\module{pickle} because the former is implemented in C. Second, in
the \module{cPickle} module the callables \function{Pickler()} and
\function{Unpickler()} are functions, not classes. This means that
you cannot use them to derive custom pickling and unpickling
subclasses. Most applications have no need for this functionality and
should benefit from the greatly improved performance of the
\module{cPickle} module.

The pickle data streams produced by \module{pickle} and
\module{cPickle} use the same format, so it is possible to use
\module{pickle} and \module{cPickle} interchangeably with existing
pickles.\footnote{Since the pickle data format is actually a tiny
stack-oriented programming language, and some freedom is taken in the
encodings of certain objects, it is possible that the two modules
produce different data streams for the same input objects. However, it
is guaranteed that they will always be able to read each other's
data streams.}

There are additional minor differences in API between \module{cPickle}
and \module{pickle}; for most applications, however, they are
interchangeable. More documentation is provided in the
\module{pickle} module documentation, which
includes a list of the documented differences.