diff --git a/Doc/library/gzip.rst b/Doc/library/gzip.rst
index 33c40676f74..8cea2649ee6 100644
--- a/Doc/library/gzip.rst
+++ b/Doc/library/gzip.rst
@@ -174,19 +174,30 @@ The module defines the following items:
 
    Compress the *data*, returning a :class:`bytes` object containing
    the compressed data. *compresslevel* and *mtime* have the same meaning as in
-   the :class:`GzipFile` constructor above.
+   the :class:`GzipFile` constructor above. When *mtime* is set to ``0``, this
+   function is equivalent to :func:`zlib.compress` with *wbits* set to ``31``.
+   The zlib function is faster.
 
    .. versionadded:: 3.2
    .. versionchanged:: 3.8
       Added the *mtime* parameter for reproducible output.
+   .. versionchanged:: 3.11
+      Speed is improved by compressing all data at once instead of in a
+      streamed fashion. Calls with *mtime* set to ``0`` are delegated to
+      :func:`zlib.compress` for better speed.
 
 .. function:: decompress(data)
 
    Decompress the *data*, returning a :class:`bytes` object containing the
-   uncompressed data.
+   uncompressed data. This function is capable of decompressing multi-member
+   gzip data (multiple gzip blocks concatenated together). When the data is
+   certain to contain only one member the :func:`zlib.decompress` function with
+   *wbits* set to 31 is faster.
 
    .. versionadded:: 3.2
-
+   .. versionchanged:: 3.11
+      Speed is improved by decompressing members at once in memory instead of in
+      a streamed fashion.
 
 .. _gzip-usage-examples:
 
diff --git a/Doc/library/zlib.rst b/Doc/library/zlib.rst
index ec60ea24db6..793c90f3c4e 100644
--- a/Doc/library/zlib.rst
+++ b/Doc/library/zlib.rst
@@ -47,7 +47,7 @@ The available exception and functions in this module are:
       platforms, use ``adler32(data) & 0xffffffff``.
 
 
-.. function:: compress(data, /, level=-1)
+.. function:: compress(data, /, level=-1, wbits=MAX_WBITS)
 
    Compresses the bytes in *data*, returning a bytes object containing compressed data.
    *level* is an integer from ``0`` to ``9`` or ``-1`` controlling the level of compression;
@@ -55,26 +55,8 @@ The available exception and functions in this module are:
    is slowest and produces the most. ``0`` (Z_NO_COMPRESSION) is no compression.
    The default value is ``-1`` (Z_DEFAULT_COMPRESSION). Z_DEFAULT_COMPRESSION represents a default
    compromise between speed and compression (currently equivalent to level 6).
-   Raises the :exc:`error` exception if any error occurs.
 
-   .. versionchanged:: 3.6
-      *level* can now be used as a keyword parameter.
-
-
-.. function:: compressobj(level=-1, method=DEFLATED, wbits=MAX_WBITS, memLevel=DEF_MEM_LEVEL, strategy=Z_DEFAULT_STRATEGY[, zdict])
-
-   Returns a compression object, to be used for compressing data streams that won't
-   fit into memory at once.
-
-   *level* is the compression level -- an integer from ``0`` to ``9`` or ``-1``.
-   A value of ``1`` (Z_BEST_SPEED) is fastest and produces the least compression,
-   while a value of ``9`` (Z_BEST_COMPRESSION) is slowest and produces the most.
-   ``0`` (Z_NO_COMPRESSION) is no compression. The default value is ``-1`` (Z_DEFAULT_COMPRESSION).
-   Z_DEFAULT_COMPRESSION represents a default compromise between speed and compression
-   (currently equivalent to level 6).
-
-   *method* is the compression algorithm. Currently, the only supported value is
-   :const:`DEFLATED`.
+   .. _compress-wbits:
 
    The *wbits* argument controls the size of the history buffer (or the
    "window size") used when compressing data, and whether a header and
@@ -94,6 +76,34 @@ The available exception and functions in this module are:
      window size logarithm, while including a basic :program:`gzip` header
      and trailing checksum in the output.
 
+   Raises the :exc:`error` exception if any error occurs.
+
+   .. versionchanged:: 3.6
+      *level* can now be used as a keyword parameter.
+
+   .. versionchanged:: 3.11
+      The *wbits* parameter is now available to set window bits and
+      compression type.
+
+.. function:: compressobj(level=-1, method=DEFLATED, wbits=MAX_WBITS, memLevel=DEF_MEM_LEVEL, strategy=Z_DEFAULT_STRATEGY[, zdict])
+
+   Returns a compression object, to be used for compressing data streams that won't
+   fit into memory at once.
+
+   *level* is the compression level -- an integer from ``0`` to ``9`` or ``-1``.
+   A value of ``1`` (Z_BEST_SPEED) is fastest and produces the least compression,
+   while a value of ``9`` (Z_BEST_COMPRESSION) is slowest and produces the most.
+   ``0`` (Z_NO_COMPRESSION) is no compression. The default value is ``-1`` (Z_DEFAULT_COMPRESSION).
+   Z_DEFAULT_COMPRESSION represents a default compromise between speed and compression
+   (currently equivalent to level 6).
+
+   *method* is the compression algorithm. Currently, the only supported value is
+   :const:`DEFLATED`.
+
+   The *wbits* parameter controls the size of the history buffer (or the
+   "window size"), and what header and trailer format will be used. It has
+   the same meaning as `described for compress() <#compress-wbits>`__.
+
    The *memLevel* argument controls the amount of memory used for the
    internal compression state. Valid values range from ``1`` to ``9``.
    Higher values use more memory, but are faster and produce smaller output.
diff --git a/Lib/gzip.py b/Lib/gzip.py
index 3d837b74480..0dddb51553f 100644
--- a/Lib/gzip.py
+++ b/Lib/gzip.py
@@ -403,6 +403,59 @@ def __iter__(self):
         return self._buffer.__iter__()
 
 
+def _read_exact(fp, n):
+    '''Read exactly *n* bytes from `fp`
+
+    This method is required because fp may be unbuffered,
+    i.e. return short reads.
+    '''
+    data = fp.read(n)
+    while len(data) < n:
+        b = fp.read(n - len(data))
+        if not b:
+            raise EOFError("Compressed file ended before the "
+                           "end-of-stream marker was reached")
+        data += b
+    return data
+
+
+def _read_gzip_header(fp):
+    '''Read a gzip header from `fp` and progress to the end of the header.
+
+    Returns last mtime if header was present or None otherwise.
+    '''
+    magic = fp.read(2)
+    if magic == b'':
+        return None
+
+    if magic != b'\037\213':
+        raise BadGzipFile('Not a gzipped file (%r)' % magic)
+
+    (method, flag, last_mtime) = struct.unpack(" bytes:
+    """
+    Write a simple gzip header with no extra fields.
+    :param compresslevel: Compresslevel used to determine the xfl bytes.
+    :param mtime: The mtime (must support conversion to a 32-bit integer).
+    :return: A bytes object representing the gzip header.
+    """
+    if mtime is None:
+        mtime = time.time()
+    if compresslevel == _COMPRESS_LEVEL_BEST:
+        xfl = 2
+    elif compresslevel == _COMPRESS_LEVEL_FAST:
+        xfl = 4
+    else:
+        xfl = 0
+    # Pack ID1 and ID2 magic bytes, method (8=deflate), header flags (no extra
+    # fields added to header), mtime, xfl and os (255 for unknown OS).
+    return struct.pack("