| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | \section{\module{bz2} --- | 
					
						
							|  |  |  |          Compression compatible with \program{bzip2}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \declaremodule{builtin}{bz2} | 
					
						
							|  |  |  | \modulesynopsis{Interface to compression and decompression | 
					
						
							|  |  |  |                 routines compatible with \program{bzip2}.} | 
					
						
							|  |  |  | \moduleauthor{Gustavo Niemeyer}{niemeyer@conectiva.com} | 
					
						
							|  |  |  | \sectionauthor{Gustavo Niemeyer}{niemeyer@conectiva.com} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \versionadded{2.3} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | This module provides a comprehensive interface for the bz2 compression library. | 
					
						
							|  |  |  | It implements a complete file interface, one-shot (de)compression functions, | 
					
						
							|  |  |  | and types for sequential (de)compression. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Here is a resume of the features offered by the bz2 module: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{itemize} | 
					
						
							|  |  |  | \item \class{BZ2File} class implements a complete file interface, including | 
					
						
							| 
									
										
										
										
											2002-11-15 16:38:06 +00:00
										 |  |  |       \method{readline()}, \method{readlines()}, | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  |       \method{writelines()}, \method{seek()}, etc; | 
					
						
							|  |  |  | \item \class{BZ2File} class implements emulated \method{seek()} support; | 
					
						
							|  |  |  | \item \class{BZ2File} class implements universal newline support; | 
					
						
							|  |  |  | \item \class{BZ2File} class offers an optimized line iteration using | 
					
						
							|  |  |  |       the readahead algorithm borrowed from file objects; | 
					
						
							|  |  |  | \item Sequential (de)compression supported by \class{BZ2Compressor} and | 
					
						
							|  |  |  |       \class{BZ2Decompressor} classes; | 
					
						
							|  |  |  | \item One-shot (de)compression supported by \function{compress()} and | 
					
						
							|  |  |  |       \function{decompress()} functions; | 
					
						
							|  |  |  | \item Thread safety uses individual locking mechanism; | 
					
						
							|  |  |  | \item Complete inline documentation; | 
					
						
							|  |  |  | \end{itemize} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{(De)compression of files} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Handling of compressed files is offered by the \class{BZ2File} class. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | \begin{classdesc}{BZ2File}{filename\optional{, mode\optional{, | 
					
						
							|  |  |  |                            buffering\optional{, compresslevel}}}} | 
					
						
							|  |  |  | Open a bz2 file. Mode can be either \code{'r'} or \code{'w'}, for reading  | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | (default) or writing. When opened for writing, the file will be created if | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | it doesn't exist, and truncated otherwise. If \var{buffering} is given, | 
					
						
							|  |  |  | \code{0} means unbuffered, and larger numbers specify the buffer size; | 
					
						
							|  |  |  | the default is \code{0}. If | 
					
						
							| 
									
										
										
										
											2002-11-25 18:51:43 +00:00
										 |  |  | \var{compresslevel} is given, it must be a number between \code{1} and | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | \code{9}; the default is \code{9}. | 
					
						
							| 
									
										
										
										
											2002-11-15 16:38:06 +00:00
										 |  |  | Add a \character{U} to mode to open the file for input with universal newline | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | support. Any line ending in the input file will be seen as a | 
					
						
							| 
									
										
										
										
											2002-11-15 16:38:06 +00:00
										 |  |  | \character{\e n} in Python.  Also, a file so opened gains the | 
					
						
							|  |  |  | attribute \member{newlines}; the value for this attribute is one of | 
					
						
							|  |  |  | \code{None} (no newline read yet), \code{'\e r'}, \code{'\e n'}, | 
					
						
							|  |  |  | \code{'\e r\e n'} or a tuple containing all the newline types | 
					
						
							|  |  |  | seen. Universal newlines are available only when reading. | 
					
						
							|  |  |  | Instances support iteration in the same way as normal \class{file} | 
					
						
							|  |  |  | instances. | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | \end{classdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{close}{} | 
					
						
							|  |  |  | Close the file. Sets data attribute \member{closed} to true. A closed file | 
					
						
							|  |  |  | cannot be used for further I/O operations. \method{close()} may be called | 
					
						
							|  |  |  | more than once without error. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{read}{\optional{size}} | 
					
						
							|  |  |  | Read at most \var{size} uncompressed bytes, returned as a string. If the | 
					
						
							|  |  |  | \var{size} argument is negative or omitted, read until EOF is reached. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{readline}{\optional{size}} | 
					
						
							|  |  |  | Return the next line from the file, as a string, retaining newline. | 
					
						
							|  |  |  | A non-negative \var{size} argument limits the maximum number of bytes to | 
					
						
							|  |  |  | return (an incomplete line may be returned then). Return an empty | 
					
						
							|  |  |  | string at EOF. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{readlines}{\optional{size}} | 
					
						
							|  |  |  | Return a list of lines read. The optional \var{size} argument, if given, | 
					
						
							|  |  |  | is an approximate bound on the total number of bytes in the lines returned. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{xreadlines}{} | 
					
						
							|  |  |  | For backward compatibility. \class{BZ2File} objects now include the | 
					
						
							| 
									
										
										
										
											2002-11-15 16:38:06 +00:00
										 |  |  | performance optimizations previously implemented in the | 
					
						
							|  |  |  | \refmodule{xreadlines} module. | 
					
						
							|  |  |  | \deprecated{2.3}{This exists only for compatibility with the method by | 
					
						
							|  |  |  |                  this name on \class{file} objects, which is | 
					
						
							|  |  |  |                  deprecated.  Use \code{for line in file} instead.} | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | \begin{methoddesc}[BZ2File]{seek}{offset\optional{, whence}} | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | Move to new file position. Argument \var{offset} is a byte count. Optional | 
					
						
							|  |  |  | argument \var{whence} defaults to \code{0} (offset from start of file, | 
					
						
							|  |  |  | offset should be \code{>= 0}); other values are \code{1} (move relative to | 
					
						
							|  |  |  | current position, positive or negative), and \code{2} (move relative to end | 
					
						
							|  |  |  | of file, usually negative, although many platforms allow seeking beyond | 
					
						
							|  |  |  | the end of a file). | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Note that seeking of bz2 files is emulated, and depending on the parameters | 
					
						
							|  |  |  | the operation may be extremely slow. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{tell}{} | 
					
						
							|  |  |  | Return the current file position, an integer (may be a long integer). | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{write}{data} | 
					
						
							|  |  |  | Write string \var{data} to file. Note that due to buffering, \method{close()} | 
					
						
							|  |  |  | may be needed before the file on disk reflects the data written. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2File]{writelines}{sequence_of_strings} | 
					
						
							|  |  |  | Write the sequence of strings to the file. Note that newlines are not added. | 
					
						
							|  |  |  | The sequence can be any iterable object producing strings. This is equivalent | 
					
						
							|  |  |  | to calling write() for each string. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Sequential (de)compression} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Sequential compression and decompression is done using the classes | 
					
						
							|  |  |  | \class{BZ2Compressor} and \class{BZ2Decompressor}. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | \begin{classdesc}{BZ2Compressor}{\optional{compresslevel}} | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | Create a new compressor object. This object may be used to compress | 
					
						
							|  |  |  | data sequentially. If you want to compress data in one shot, use the | 
					
						
							|  |  |  | \function{compress()} function instead. The \var{compresslevel} parameter, | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | if given, must be a number between \code{1} and \code{9}; the default | 
					
						
							|  |  |  | is \code{9}. | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | \end{classdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2Compressor]{compress}{data} | 
					
						
							|  |  |  | Provide more data to the compressor object. It will return chunks of compressed | 
					
						
							|  |  |  | data whenever possible. When you've finished providing data to compress, call | 
					
						
							|  |  |  | the \method{flush()} method to finish the compression process, and return what | 
					
						
							|  |  |  | is left in internal buffers. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2Compressor]{flush}{} | 
					
						
							|  |  |  | Finish the compression process and return what is left in internal buffers. You | 
					
						
							|  |  |  | must not use the compressor object after calling this method. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{classdesc}{BZ2Decompressor}{} | 
					
						
							|  |  |  | Create a new decompressor object. This object may be used to decompress | 
					
						
							|  |  |  | data sequentially. If you want to decompress data in one shot, use the | 
					
						
							|  |  |  | \function{decompress()} function instead. | 
					
						
							|  |  |  | \end{classdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{methoddesc}[BZ2Decompressor]{decompress}{data} | 
					
						
							|  |  |  | Provide more data to the decompressor object. It will return chunks of | 
					
						
							|  |  |  | decompressed data whenever possible. If you try to decompress data after the | 
					
						
							|  |  |  | end of stream is found, \exception{EOFError} will be raised. If any data was | 
					
						
							|  |  |  | found after the end of stream, it'll be ignored and saved in | 
					
						
							|  |  |  | \member{unused\_data} attribute. | 
					
						
							|  |  |  | \end{methoddesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{One-shot (de)compression} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2002-11-05 23:55:27 +00:00
										 |  |  | One-shot compression and decompression is provided through the | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | \function{compress()} and \function{decompress()} functions. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | \begin{funcdesc}{compress}{data\optional{, compresslevel}} | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | Compress \var{data} in one shot. If you want to compress data sequentially, | 
					
						
							|  |  |  | use an instance of \class{BZ2Compressor} instead. The \var{compresslevel} | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | parameter, if given, must be a number between \code{1} and \code{9}; | 
					
						
							|  |  |  | the default is \code{9}. | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2002-11-05 17:54:02 +00:00
										 |  |  | \begin{funcdesc}{decompress}{data} | 
					
						
							| 
									
										
										
										
											2002-11-05 16:50:05 +00:00
										 |  |  | Decompress \var{data} in one shot. If you want to decompress data | 
					
						
							|  |  |  | sequentially, use an instance of \class{BZ2Decompressor} instead. | 
					
						
							|  |  |  | \end{funcdesc} |