[3.14] gh-135252: Document Zstandard integration across zipfile, shutil, and tarfile (GH-135311) (#136254)

gh-135252: Document Zstandard integration across zipfile, shutil, and tarfile (GH-135311)

Document Zstandard integration across zipfile, shutil, and tarfile
(cherry picked from commit 938a5d7e62)

Co-authored-by: Emma Smith <emma@emmatyping.dev>
Miss Islington (bot) 2025-07-03 22:34:38 +02:00 committed by GitHub
parent ea84943574
commit e39f33259e
4 changed files with 77 additions and 19 deletions

View file

@@ -523,8 +523,14 @@ Advanced parameter control
 .. attribute:: compression_level
 A high-level means of setting other compression parameters that affect
-the speed and ratio of compressing data. Setting the level to zero uses
-:attr:`COMPRESSION_LEVEL_DEFAULT`.
+the speed and ratio of compressing data.
+Regular compression levels are greater than ``0``. Values greater than
+``20`` are considered "ultra" compression and require more memory than
+other levels. Negative values can be used to trade off faster compression
+for worse compression ratios.
+Setting the level to zero uses :attr:`COMPRESSION_LEVEL_DEFAULT`.
.. attribute:: window_log
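Reviewer note: a minimal sketch of how the level ranges described in this hunk partition the integers. The helper name `classify_level` is hypothetical and not part of `compression.zstd`. The demonstration at the bottom only runs where the module can be imported (Python 3.14 or later, built with Zstandard support).

```python
try:
    # Only importable on Python 3.14+ builds with Zstandard support.
    from compression import zstd
except ImportError:
    zstd = None


def classify_level(level: int) -> str:
    """Hypothetical helper: name the range a level falls into, per the hunk text."""
    if level == 0:
        return "default"   # zero selects COMPRESSION_LEVEL_DEFAULT
    if level < 0:
        return "fast"      # negative levels trade ratio for speed
    if level > 20:
        return "ultra"     # needs more memory than other levels
    return "regular"


if zstd is not None:
    # Ask the library for the valid range instead of hard-coding it.
    lower, upper = zstd.CompressionParameter.compression_level.bounds()
    print("valid levels:", lower, "to", upper)
    payload = b"example " * 1000
    for level in (-5, 0, 3, upper):
        size = len(zstd.compress(payload, level=level))
        print(level, classify_level(level), size)
```

Querying `bounds()` rather than assuming fixed limits keeps the sketch valid if the linked libzstd reports a different range.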

View file

@@ -618,7 +618,8 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules.
 *format* is the archive format: one of
 "zip" (if the :mod:`zlib` module is available), "tar", "gztar" (if the
 :mod:`zlib` module is available), "bztar" (if the :mod:`bz2` module is
-available), or "xztar" (if the :mod:`lzma` module is available).
+available), "xztar" (if the :mod:`lzma` module is available), or "zstdtar"
+(if the :mod:`compression.zstd` module is available).
 *root_dir* is a directory that will be the root directory of the
 archive, all paths in the archive will be relative to it; for example,
@@ -673,6 +674,8 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules.
 - *gztar*: gzip'ed tar-file (if the :mod:`zlib` module is available).
 - *bztar*: bzip2'ed tar-file (if the :mod:`bz2` module is available).
 - *xztar*: xz'ed tar-file (if the :mod:`lzma` module is available).
+- *zstdtar*: Zstandard compressed tar-file (if the :mod:`compression.zstd`
+module is available).
 You can register new formats or provide your own archiver for any existing
 formats, by using :func:`register_archive_format`.
@@ -716,8 +719,8 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules.
 *extract_dir* is the name of the target directory where the archive is
 unpacked. If not provided, the current working directory is used.
-*format* is the archive format: one of "zip", "tar", "gztar", "bztar", or
-"xztar". Or any other format registered with
+*format* is the archive format: one of "zip", "tar", "gztar", "bztar",
+"xztar", or "zstdtar". Or any other format registered with
 :func:`register_unpack_format`. If not provided, :func:`unpack_archive`
 will use the archive file name extension and see if an unpacker was
 registered for that extension. In case none is found,
@@ -789,6 +792,8 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules.
 - *gztar*: gzip'ed tar-file (if the :mod:`zlib` module is available).
 - *bztar*: bzip2'ed tar-file (if the :mod:`bz2` module is available).
 - *xztar*: xz'ed tar-file (if the :mod:`lzma` module is available).
+- *zstdtar*: Zstandard compressed tar-file (if the :mod:`compression.zstd`
+module is available).
 You can register new formats or provide your own unpacker for any existing
formats, by using :func:`register_unpack_format`.
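Reviewer note: a hedged sketch of how a caller might use the new "zstdtar" format name without assuming it is registered. The helpers `pick_tar_format` and `round_trip` are hypothetical names introduced for illustration. The fallback order ("gztar", then plain "tar") is an assumption, not something this change prescribes.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def pick_tar_format() -> str:
    """Prefer "zstdtar" when this interpreter registers it, else fall back."""
    available = {name for name, _description in shutil.get_archive_formats()}
    for candidate in ("zstdtar", "gztar", "tar"):
        if candidate in available:
            return candidate
    raise RuntimeError("no tar-based archive format is registered")


def round_trip(text: str) -> str:
    """Archive one file, unpack it elsewhere, and return the unpacked text."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "src")
        src.mkdir()
        (src / "hello.txt").write_text(text)

        base_name = str(Path(tmp, "backup"))
        archive = shutil.make_archive(base_name, pick_tar_format(), root_dir=str(src))

        dest = Path(tmp, "dest")
        dest.mkdir()
        # Pass the "data" extraction filter only where this Python offers it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        # No explicit format: unpack_archive infers it from the file extension.
        shutil.unpack_archive(archive, str(dest), **extra)
        return (dest / "hello.txt").read_text()


print(round_trip("hello zstd"))
```

Checking `shutil.get_archive_formats()` at run time mirrors the "if the module is available" wording in the list above.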

View file

@@ -18,8 +18,8 @@ higher-level functions in :ref:`shutil <archiving-operations>`.
 Some facts and figures:
-* reads and writes :mod:`gzip`, :mod:`bz2` and :mod:`lzma` compressed archives
-if the respective modules are available.
+* reads and writes :mod:`gzip`, :mod:`bz2`, :mod:`compression.zstd`, and
+:mod:`lzma` compressed archives if the respective modules are available.
 * read/write support for the POSIX.1-1988 (ustar) format.
@@ -47,6 +47,10 @@ Some facts and figures:
 or paths outside of the destination. Previously, the filter strategy
 was equivalent to :func:`fully_trusted <fully_trusted_filter>`.
+.. versionchanged:: 3.14
+Added support for Zstandard compression using :mod:`compression.zstd`.
 .. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)
 Return a :class:`TarFile` object for the pathname *name*. For detailed
@@ -71,6 +75,8 @@ Some facts and figures:
 +------------------+---------------------------------------------+
 | ``'r:xz'`` | Open for reading with lzma compression. |
 +------------------+---------------------------------------------+
+| ``'r:zst'`` | Open for reading with Zstandard compression.|
++------------------+---------------------------------------------+
 | ``'x'`` or | Create a tarfile exclusively without |
 | ``'x:'`` | compression. |
 | | Raise a :exc:`FileExistsError` exception |
@@ -88,6 +94,10 @@ Some facts and figures:
 | | Raise a :exc:`FileExistsError` exception |
 | | if it already exists. |
 +------------------+---------------------------------------------+
+| ``'x:zst'`` | Create a tarfile with Zstandard compression.|
+| | Raise a :exc:`FileExistsError` exception |
+| | if it already exists. |
++------------------+---------------------------------------------+
 | ``'a' or 'a:'`` | Open for appending with no compression. The |
 | | file is created if it does not exist. |
 +------------------+---------------------------------------------+
@@ -99,6 +109,8 @@ Some facts and figures:
 +------------------+---------------------------------------------+
 | ``'w:xz'`` | Open for lzma compressed writing. |
 +------------------+---------------------------------------------+
+| ``'w:zst'`` | Open for Zstandard compressed writing. |
++------------------+---------------------------------------------+
 Note that ``'a:gz'``, ``'a:bz2'`` or ``'a:xz'`` is not possible. If *mode*
 is not suitable to open a certain (compressed) file for reading,
@@ -115,6 +127,15 @@ Some facts and figures:
 For modes ``'w:xz'``, ``'x:xz'`` and ``'w|xz'``, :func:`tarfile.open` accepts the
 keyword argument *preset* to specify the compression level of the file.
+For modes ``'w:zst'``, ``'x:zst'`` and ``'w|zst'``, :func:`tarfile.open`
+accepts the keyword argument *level* to specify the compression level of
+the file. The keyword argument *options* may also be passed, providing
+advanced Zstandard compression parameters described by
+:class:`~compression.zstd.CompressionParameter`. The keyword argument
+*zstd_dict* can be passed to provide a :class:`~compression.zstd.ZstdDict`,
+a Zstandard dictionary used to improve compression of smaller amounts of
+data.
 For special purposes, there is a second format for *mode*:
 ``'filemode|[compression]'``. :func:`tarfile.open` will return a :class:`TarFile`
 object that processes its data as a stream of blocks. No random seeking will
@@ -146,6 +167,9 @@ Some facts and figures:
 | ``'r|xz'`` | Open an lzma compressed *stream* for |
 | | reading. |
 +-------------+--------------------------------------------+
+| ``'r|zst'`` | Open a Zstandard compressed *stream* for |
+| | reading. |
++-------------+--------------------------------------------+
 | ``'w|'`` | Open an uncompressed *stream* for writing. |
 +-------------+--------------------------------------------+
 | ``'w|gz'`` | Open a gzip compressed *stream* for |
@@ -157,6 +181,9 @@ Some facts and figures:
 | ``'w|xz'`` | Open an lzma compressed *stream* for |
 | | writing. |
 +-------------+--------------------------------------------+
+| ``'w|zst'`` | Open a Zstandard compressed *stream* for |
+| | writing. |
++-------------+--------------------------------------------+
 .. versionchanged:: 3.5
The ``'x'`` (exclusive creation) mode was added.
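Reviewer note: a small sketch of the `'w:zst'` and `'r:zst'` modes and the *level* keyword documented in the hunks above, written so it still runs where `compression.zstd` cannot be imported. The gzip fallback and the helper names `pack` and `unpack` are assumptions made for illustration only.

```python
import io
import tarfile

try:
    import compression.zstd  # availability probe only (Python 3.14+)
    SUFFIX, WRITE_KWARGS = "zst", {"level": 3}
except ImportError:
    # Assumed fallback for interpreters without Zstandard support.
    SUFFIX, WRITE_KWARGS = "gz", {"compresslevel": 3}


def pack(name: str, data: bytes) -> bytes:
    """Write a single in-memory member into a compressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=f"w:{SUFFIX}", **WRITE_KWARGS) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack(blob: bytes, name: str) -> bytes:
    """Read one member back using the matching explicit read mode."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode=f"r:{SUFFIX}") as tar:
        return tar.extractfile(name).read()


blob = pack("hello.txt", b"hello " * 100)
print(SUFFIX, len(blob), unpack(blob, "hello.txt")[:6])
```

The *options* and *zstd_dict* keywords mentioned in the new paragraph would be passed the same way as *level*; they are left out here to keep the sketch short.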

View file

@@ -129,14 +129,28 @@ The module defines the following items:
 .. versionadded:: 3.3
+.. data:: ZIP_ZSTANDARD
+The numeric constant for Zstandard compression. This requires the
+:mod:`compression.zstd` module.
 .. note::
-The ZIP file format specification has included support for bzip2 compression
-since 2001, and for LZMA compression since 2006. However, some tools
-(including older Python releases) do not support these compression
-methods, and may either refuse to process the ZIP file altogether,
-or fail to extract individual files.
+In APPNOTE 6.3.7, the method ID ``20`` was assigned to Zstandard
+compression. This was changed in APPNOTE 6.3.8 to method ID ``93`` to
+avoid conflicts, with method ID ``20`` being deprecated. For
+compatibility, the :mod:`!zipfile` module reads both method IDs but will
+only write data with method ID ``93``.
+.. versionadded:: 3.14
+.. note::
+The ZIP file format specification has included support for bzip2 compression
+since 2001, for LZMA compression since 2006, and Zstandard compression since
+2020. However, some tools (including older Python releases) do not support
+these compression methods, and may either refuse to process the ZIP file
+altogether, or fail to extract individual files.
 .. seealso::
@@ -176,10 +190,11 @@ ZipFile Objects
 *compression* is the ZIP compression method to use when writing the archive,
 and should be :const:`ZIP_STORED`, :const:`ZIP_DEFLATED`,
-:const:`ZIP_BZIP2` or :const:`ZIP_LZMA`; unrecognized
-values will cause :exc:`NotImplementedError` to be raised. If
-:const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2` or :const:`ZIP_LZMA` is specified
-but the corresponding module (:mod:`zlib`, :mod:`bz2` or :mod:`lzma`) is not
+:const:`ZIP_BZIP2`, :const:`ZIP_LZMA`, or :const:`ZIP_ZSTANDARD`;
+unrecognized values will cause :exc:`NotImplementedError` to be raised. If
+:const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2`, :const:`ZIP_LZMA`, or
+:const:`ZIP_ZSTANDARD` is specified but the corresponding module
+(:mod:`zlib`, :mod:`bz2`, :mod:`lzma`, or :mod:`compression.zstd`) is not
 available, :exc:`RuntimeError` is raised. The default is :const:`ZIP_STORED`.
 If *allowZip64* is ``True`` (the default) zipfile will create ZIP files that
@@ -194,6 +209,10 @@ ZipFile Objects
 (see :class:`zlib <zlib.compressobj>` for more information).
 When using :const:`ZIP_BZIP2` integers ``1`` through ``9`` are accepted
 (see :class:`bz2 <bz2.BZ2File>` for more information).
+When using :const:`ZIP_ZSTANDARD` integers ``-131072`` through ``22`` are
+commonly accepted (see
+:attr:`CompressionParameter.compression_level <compression.zstd.CompressionParameter.compression_level>`
+for more on retrieving valid values and their meaning).
 The *strict_timestamps* argument, when set to ``False``, allows to
 zip files older than 1980-01-01 at the cost of setting the
@@ -415,9 +434,10 @@ ZipFile Objects
 read or append. *pwd* is the password used for encrypted files as a :class:`bytes`
 object and, if specified, overrides the default password set with :meth:`setpassword`.
 Calling :meth:`read` on a ZipFile that uses a compression method other than
-:const:`ZIP_STORED`, :const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2` or
-:const:`ZIP_LZMA` will raise a :exc:`NotImplementedError`. An error will also
-be raised if the corresponding compression module is not available.
+:const:`ZIP_STORED`, :const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2`,
+:const:`ZIP_LZMA`, or :const:`ZIP_ZSTANDARD` will raise a
+:exc:`NotImplementedError`. An error will also be raised if the
+corresponding compression module is not available.
 .. versionchanged:: 3.6
Calling :meth:`read` on a closed ZipFile will raise a :exc:`ValueError`.
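Reviewer note: a hedged sketch of writing and reading a member with `ZIP_ZSTANDARD`, relying on the `RuntimeError` behaviour described in the *compression* hunk above to fall back when the backing module is missing. The helper names `write_zip` and `read_back` and the choice of Deflate as the fallback are assumptions for illustration.

```python
import io
import zipfile


def write_zip(name: str, data: bytes, level: int = 3):
    """Return (archive bytes, compression method actually used)."""
    # ZIP_ZSTANDARD only exists on Python 3.14+; otherwise use Deflate.
    method = getattr(zipfile, "ZIP_ZSTANDARD", zipfile.ZIP_DEFLATED)
    buf = io.BytesIO()
    try:
        archive = zipfile.ZipFile(buf, "w", compression=method, compresslevel=level)
    except RuntimeError:
        # The constant is defined but the compression module is unavailable.
        method = zipfile.ZIP_DEFLATED
        archive = zipfile.ZipFile(buf, "w", compression=method, compresslevel=level)
    with archive:
        archive.writestr(name, data)
    return buf.getvalue(), method


def read_back(blob: bytes, name: str):
    """Return (member bytes, compression method recorded for the member)."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.read(name), archive.getinfo(name).compress_type


blob, used = write_zip("hello.txt", b"hello " * 200)
data, recorded = read_back(blob, "hello.txt")
print(used, recorded, len(blob), len(data))
```

Level `3` is valid for both methods in this sketch, which is why the same *compresslevel* can be reused on the fallback path.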