bz2 - Support for bzip2 compression
Source code: Lib/bz2.py
This module provides a comprehensive interface for compressing and decompressing data using the bzip2 compression algorithm.
The bz2 module contains:
- The open() function and BZ2File class for reading and writing compressed files.
- The BZ2Compressor and BZ2Decompressor classes for incremental (de)compression.
- The compress() and decompress() functions for one-shot (de)compression.
(De)compression of files
bz2.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)
Open a bzip2-compressed file in binary or text mode, returning a file object.
As with the constructor for BZ2File, the filename argument can be an actual filename (a str or bytes object), or an existing file object to read from or write to.
The mode argument can be any of 'r', 'rb', 'w', 'wb', 'x', 'xb', 'a' or 'ab' for binary mode, or 'rt', 'wt', 'xt', or 'at' for text mode. The default is 'rb'.
The compresslevel argument is an integer from 1 to 9, as for the BZ2File constructor.
For binary mode, this function is equivalent to the BZ2File constructor: BZ2File(filename, mode, compresslevel=compresslevel). In this case, the encoding, errors and newline arguments must not be provided.
For text mode, a BZ2File object is created, and wrapped in an io.TextIOWrapper instance with the specified encoding, error handling behavior, and line ending(s).
New in version 3.3.
Changed in version 3.4: The 'x' (exclusive creation) mode was added.
Changed in version 3.6: Accepts a path-like object.
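The text-mode behavior of bz2.open() described above can be sketched as follows. This is a minimal illustration, not a canonical recipe: the file name notes.txt.bz2 and the temporary directory are assumptions chosen for the example.

```python
import bz2
import os
import tempfile

# Work inside a throwaway directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "notes.txt.bz2")

    # 'wt' wraps the BZ2File in an io.TextIOWrapper, so str objects are written.
    with bz2.open(path, "wt", encoding="utf-8", newline="\n") as f:
        f.write("first line\n")
        f.write("second line\n")

    # 'rt' decodes the decompressed bytes back into str.
    with bz2.open(path, "rt", encoding="utf-8") as f:
        lines = f.readlines()

    # 'x' (exclusive creation) refuses to overwrite an existing file.
    exclusive_failed = False
    try:
        bz2.open(path, "xt", encoding="utf-8")
    except FileExistsError:
        exclusive_failed = True
```

Passing encoding, errors or newline together with a binary mode such as 'rb' is not allowed, so text-related arguments appear only with the 't' modes here.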
class bz2.BZ2File(filename, mode='r', *, compresslevel=9)
Open a bzip2-compressed file in binary mode.
If filename is a str or bytes object, open the named file directly. Otherwise, filename should be a file object, which will be used to read or write the compressed data.
The mode argument can be either 'r' for reading (default), 'w' for overwriting, 'x' for exclusive creation, or 'a' for appending. These can equivalently be given as 'rb', 'wb', 'xb' and 'ab' respectively.
If filename is a file object (rather than an actual file name), a mode of 'w' does not truncate the file, and is instead equivalent to 'a'.
If mode is 'w' or 'a', compresslevel can be an integer between 1 and 9 specifying the level of compression: 1 produces the least compression, and 9 (default) produces the most compression.
If mode is 'r', the input file may be the concatenation of multiple compressed streams.
BZ2File provides all of the members specified by io.BufferedIOBase, except for detach() and truncate(). Iteration and the with statement are supported.
BZ2File also provides the following method:
peek([n])
Return buffered data without advancing the file position. At least one byte of data will be returned (unless at EOF). The exact number of bytes returned is unspecified.
Note: While calling peek() does not change the file position of the BZ2File, it may change the position of the underlying file object (e.g. if the BZ2File was constructed by passing a file object for filename).
New in version 3.3.
Changed in version 3.3: The fileno(), readable(), seekable(), writable(), read1() and readinto() methods were added.
Changed in version 3.3: Support was added for filename being a file object instead of an actual filename.
Changed in version 3.3: The 'a' (append) mode was added, along with support for reading multi-stream files.
Changed in version 3.4: The 'x' (exclusive creation) mode was added.
Changed in version 3.5: The read() method now accepts an argument of None.
Changed in version 3.6: Accepts a path-like object.
Changed in version 3.9: The buffering parameter has been removed. It had been ignored and deprecated since Python 3.0. Pass an open file object to control how the file is opened. The compresslevel parameter became keyword-only.
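A short sketch of BZ2File wrapping an existing file object may help here. It uses an in-memory io.BytesIO as a stand-in for any binary file object (an assumption made for the example), writes two separate streams, and reads them back as one multi-stream input.

```python
import bz2
import io

raw = io.BytesIO()  # stands in for any binary file object

# Write two separate bzip2 streams into the same file object.
with bz2.BZ2File(raw, "w") as f:
    f.write(b"first stream\n")
# With a file object, 'w' does not truncate: the second stream is appended.
with bz2.BZ2File(raw, "w") as f:
    f.write(b"second stream\n")

# Closing the BZ2File leaves the caller's file object open, so rewind and read.
raw.seek(0)
with bz2.BZ2File(raw) as f:
    head = f.peek(1)             # at least one byte unless at EOF; length unspecified
    pos_after_peek = f.tell()    # peek() does not advance the BZ2File position
    content = f.read()           # reads across both concatenated streams
```

Because peek() may move the position of the underlying object, code that shares raw with other readers should not rely on raw.tell() after a call to peek().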
Incremental (de)compression
class bz2.BZ2Compressor(compresslevel=9)
Create a new compressor object. This object may be used to compress data incrementally. For one-shot compression, use the compress() function instead.
compresslevel, if given, must be an integer between 1 and 9. The default is 9.
compress(data)
Provide data to the compressor object. Returns a chunk of compressed data if possible, or an empty byte string otherwise.
When you have finished providing data to the compressor, call the flush() method to finish the compression process.
flush()
Finish the compression process. Returns the compressed data left in internal buffers.
The compressor object must not be used after this method has been called.
class bz2.BZ2Decompressor
Create a new decompressor object. This object may be used to decompress data incrementally. For one-shot decompression, use the decompress() function instead.
Note: This class does not transparently handle inputs containing multiple compressed streams, unlike decompress() and BZ2File. If you need to decompress a multi-stream input with BZ2Decompressor, you must use a new decompressor for each stream.
decompress(data, max_length=-1)
Decompress data (a bytes-like object), returning uncompressed data as bytes. Some of data may be buffered internally, for use in later calls to decompress(). The returned data should be concatenated with the output of any previous calls to decompress().
If max_length is nonnegative, returns at most max_length bytes of decompressed data. If this limit is reached and further output can be produced, the needs_input attribute will be set to False. In this case, the next call to decompress() may provide data as b'' to obtain more of the output.
If all of the input data was decompressed and returned (either because this was less than max_length bytes, or because max_length was negative), the needs_input attribute will be set to True.
Attempting to decompress data after the end of stream is reached raises an EOFError. Any data found after the end of the stream is ignored and saved in the unused_data attribute.
Changed in version 3.5: Added the max_length parameter.
eof
True if the end-of-stream marker has been reached.
New in version 3.3.
unused_data
Data found after the end of the compressed stream.
If this attribute is accessed before the end of the stream has been reached, its value will be b''.
needs_input
False if the decompress() method can provide more decompressed data before requiring new compressed input.
New in version 3.5.
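The interaction between max_length, needs_input and eof can be sketched as below. The helper decompress_bounded() and its max_chunk parameter are hypothetical names introduced for this illustration; the sketch assumes the whole compressed stream is already in memory and only bounds the size of each decompressed piece.

```python
import bz2


def decompress_bounded(payload, max_chunk=4096):
    """Yield decompressed pieces of at most max_chunk bytes (hypothetical helper)."""
    decomp = bz2.BZ2Decompressor()
    data = payload
    while not decomp.eof:
        chunk = decomp.decompress(data, max_length=max_chunk)
        # All input was handed over in the first call; later calls pass b''
        # to drain output that the decompressor is still holding back.
        data = b""
        if chunk:
            yield chunk
        elif decomp.needs_input:
            # No output and more input wanted, but there is none left.
            raise EOFError("compressed data ended before the end-of-stream marker")


original = b"z" * 100_000
pieces = list(decompress_bounded(bz2.compress(original)))
restored = b"".join(pieces)
```

In real streaming code the branch that raises would instead read the next block of compressed input and pass it to decompress(); that part is omitted to keep the sketch short.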
One-shot (de)compression
bz2.compress(data, compresslevel=9)
Compress data, a bytes-like object.
compresslevel, if given, must be an integer between 1 and 9. The default is 9.
For incremental compression, use a BZ2Compressor instead.
bz2.decompress(data)
Decompress data, a bytes-like object.
If data is the concatenation of multiple compressed streams, decompress all of the streams.
For incremental decompression, use a BZ2Decompressor instead.
Changed in version 3.3: Support for multi-stream inputs was added.
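To illustrate the multi-stream behavior, the sketch below compares the one-shot decompress() function with a manual loop that creates a new BZ2Decompressor per stream and continues from unused_data. The helper decompress_all_streams() is a hypothetical name used only for this example and is not part of the module.

```python
import bz2

# Two independent bzip2 streams, concatenated byte-for-byte.
multi = bz2.compress(b"alpha ") + bz2.compress(b"beta")

# The one-shot function handles every concatenated stream by itself.
one_shot = bz2.decompress(multi)


def decompress_all_streams(data):
    """Decompress concatenated streams with one decompressor each (hypothetical helper)."""
    parts = []
    while data:
        decomp = bz2.BZ2Decompressor()
        parts.append(decomp.decompress(data))
        if not decomp.eof:
            raise ValueError("compressed data ended before the end-of-stream marker")
        # Bytes after the end of this stream belong to the next one.
        data = decomp.unused_data
    return b"".join(parts)


manual = decompress_all_streams(multi)
```

Both approaches give the same bytes; the manual loop is only worth writing when the streams need to be processed or inspected individually.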
Examples of usage
Below are some examples of typical usage of the bz2 module.
Using compress() and decompress() to demonstrate round-trip compression:
>>> import bz2
>>> data = b"""\
... Donec rhoncus quis sapien sit amet molestie. Fusce scelerisque vel augue
... nec ullamcorper. Nam rutrum pretium placerat. Aliquam vel tristique lorem,
... sit amet cursus ante. In interdum laoreet mi, sit amet ultrices purus
... pulvinar a. Nam gravida euismod magna, non varius justo tincidunt feugiat.
... Aliquam pharetra lacus non risus vehicula rutrum. Maecenas aliquam leo
... felis. Pellentesque semper nunc sit amet nibh ullamcorper, ac elementum
... dolor luctus. Curabitur lacinia mi ornare consectetur vestibulum."""
>>> c = bz2.compress(data)
>>> len(data) / len(c) # Data compression ratio
1.513595166163142
>>> d = bz2.decompress(c)
>>> data == d # Check equality to original object after round-trip
True
Using BZ2Compressor for incremental compression:
>>> import bz2
>>> def gen_data(chunks=10, chunksize=1000):
...     """Yield incremental blocks of chunksize bytes."""
...     for _ in range(chunks):
...         yield b"z" * chunksize
...
>>> comp = bz2.BZ2Compressor()
>>> out = b""
>>> for chunk in gen_data():
...     # Provide data to the compressor object
...     out = out + comp.compress(chunk)
...
>>> # Finish the compression process. Call this once you have
>>> # finished providing data to the compressor.
>>> out = out + comp.flush()
The example above uses a very "nonrandom" stream of data (a stream of b"z" chunks). Random data tends to compress poorly, while ordered, repetitive data usually yields a high compression ratio.
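The difference between repetitive and random input can be checked with a small experiment. This is only a sketch: the 10,000-byte size is an arbitrary choice, and the exact ratios vary between runs because os.urandom() returns different bytes each time.

```python
import bz2
import os

size = 10_000
repetitive = b"z" * size
random_bytes = os.urandom(size)  # effectively incompressible input

# Ratio of original size to compressed size; higher means better compression.
ratio_repetitive = size / len(bz2.compress(repetitive))
ratio_random = size / len(bz2.compress(random_bytes))

print(f"repetitive: {ratio_repetitive:.1f}x, random: {ratio_random:.2f}x")
```

A ratio near or below 1 for the random input means the compressed form is about as large as the original, or slightly larger because of format overhead.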
Writing and reading a bzip2-compressed file in binary mode:
>>> import bz2
>>> data = b"""\
... Donec rhoncus quis sapien sit amet molestie. Fusce scelerisque vel augue
... nec ullamcorper. Nam rutrum pretium placerat. Aliquam vel tristique lorem,
... sit amet cursus ante. In interdum laoreet mi, sit amet ultrices purus
... pulvinar a. Nam gravida euismod magna, non varius justo tincidunt feugiat.
... Aliquam pharetra lacus non risus vehicula rutrum. Maecenas aliquam leo
... felis. Pellentesque semper nunc sit amet nibh ullamcorper, ac elementum
... dolor luctus. Curabitur lacinia mi ornare consectetur vestibulum."""
>>> with bz2.open("myfile.bz2", "wb") as f:
...     # Write compressed data to file
...     unused = f.write(data)
>>> with bz2.open("myfile.bz2", "rb") as f:
...     # Decompress data from file
...     content = f.read()
>>> content == data # Check equality to original object after round-trip
True