gzip: Support for gzip files
Source code: Lib/gzip.py
This module provides a simple interface to compress and decompress files just like the GNU programs gzip and gunzip would.
The data compression is provided by the zlib module.
The gzip module provides the GzipFile class, as well as the open(), compress() and decompress() convenience functions. The GzipFile class reads and writes gzip-format files, automatically compressing or decompressing the data so that it looks like an ordinary file object.
Note that additional file formats which can be decompressed by the gzip and gunzip programs, such as those produced by compress and pack, are not supported by this module.
The module defines the following items:
-
gzip.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)
Open a gzip-compressed file in binary or text mode, returning a file object.
The filename argument can be an actual filename (a str or bytes object), or an existing file object to read from or write to.
The mode argument can be any of 'r', 'rb', 'a', 'ab', 'w', 'wb', 'x' or 'xb' for binary mode, or 'rt', 'at', 'wt', or 'xt' for text mode. The default is 'rb'.
The compresslevel argument is an integer from 0 to 9, as for the GzipFile constructor.
For binary mode, this function is equivalent to the GzipFile constructor: GzipFile(filename, mode, compresslevel). In this case, the encoding, errors and newline arguments must not be provided.
For text mode, a GzipFile object is created, and wrapped in an io.TextIOWrapper instance with the specified encoding, error handling behavior, and line ending(s).
Changed in version 3.3: Added support for filename being a file object, support for text mode, and the encoding, errors and newline arguments.
Changed in version 3.4: Added support for the 'x', 'xb' and 'xt' modes.
Changed in version 3.6: Accepts a path-like object.
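As an illustration of the binary and text modes described above, here is a minimal sketch. The file name and the temporary directory are hypothetical, chosen only so the example cleans up after itself.

```python
import gzip
import os
import tempfile

# Hypothetical file name inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt.gz")

    # 'wt' selects text mode: str in, encoded and compressed on write.
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
        f.write("first line\nsecond line\n")

    # 'rt' decompresses and decodes, so iteration yields str lines.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]

    # The default 'rb' mode returns the decompressed bytes instead.
    with gzip.open(path) as f:
        raw = f.read()
```

Passing encoding, errors or newline only makes sense with one of the text modes; in binary mode they must be left out.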
-
exception gzip.BadGzipFile
An exception raised for invalid gzip files. It inherits from OSError. EOFError and zlib.error can also be raised for invalid gzip files.
New in version 3.8.
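Because three different exception types can signal invalid input, code that handles untrusted data may want to catch all of them. The helper below, try_decompress, is a hypothetical name used only for this sketch.

```python
import gzip
import zlib


def try_decompress(blob):
    """Return the decompressed bytes, or None if blob is not valid gzip data."""
    try:
        return gzip.decompress(blob)
    except (gzip.BadGzipFile, EOFError, zlib.error):
        # Any of these can be raised for an invalid gzip stream.
        return None


valid = gzip.compress(b"payload")
not_gzip = try_decompress(b"this is not gzip data")
truncated = try_decompress(valid[:-4])  # part of the stream cut off
intact = try_decompress(valid)
```

Since BadGzipFile inherits from OSError, an existing `except OSError` clause also catches it.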
-
class gzip.GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)
Constructor for the GzipFile class, which simulates most of the methods of a file object, with the exception of the truncate() method. At least one of fileobj and filename must be given a non-trivial value.
The new class instance is based on fileobj, which can be a regular file, an io.BytesIO object, or any other object which simulates a file. It defaults to None, in which case filename is opened to provide a file object.
When fileobj is not None, the filename argument is used only for inclusion in the gzip file header, which may include the original filename of the uncompressed file. It defaults to the filename of fileobj, if discernible; otherwise, it defaults to the empty string, and in this case the original filename is not included in the header.
The mode argument can be any of 'r', 'rb', 'a', 'ab', 'w', 'wb', 'x', or 'xb', depending on whether the file will be read or written. The default is the mode of fileobj if discernible; otherwise, the default is 'rb'. In future Python releases the mode of fileobj will not be used. It is better to always specify mode for writing.
Note that the file is always opened in binary mode. To open a compressed file in text mode, use open() (or wrap your GzipFile with an io.TextIOWrapper).
The compresslevel argument is an integer from 0 to 9 controlling the level of compression; 1 is fastest and produces the least compression, and 9 is slowest and produces the most compression. 0 is no compression. The default is 9.
The mtime argument is an optional numeric timestamp to be written to the last modification time field in the stream when compressing. It should only be provided in compression mode. If omitted or None, the current time is used. See the mtime attribute for more details.
Calling a GzipFile object's close() method does not close fileobj, since you might wish to append more material after the compressed data. This also allows you to pass an io.BytesIO object opened for writing as fileobj, and retrieve the resulting memory buffer using the io.BytesIO object's getvalue() method.
GzipFile supports the io.BufferedIOBase interface, including iteration and the with statement. Only the truncate() method isn't implemented.
GzipFile also provides the following method and attribute:
-
peek(n)
Read n uncompressed bytes without advancing the file position. At most one single read on the compressed stream is done to satisfy the call. The number of bytes returned may be more or less than requested.
Note: While calling peek() does not change the file position of the GzipFile, it may change the position of the underlying file object (e.g. if the GzipFile was constructed with the fileobj parameter).
New in version 3.2.
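A short sketch of peek() in use, assuming an in-memory stream and an invented payload. Because the returned length is not guaranteed, the result is sliced rather than trusted to be exactly n bytes.

```python
import gzip
import io

payload = b"HEADER:rest of the record"
compressed = io.BytesIO(gzip.compress(payload))

with gzip.GzipFile(fileobj=compressed, mode="rb") as f:
    # peek() may return more or fewer bytes than requested, so slice
    # instead of relying on the length of the result.
    head = f.peek(7)[:7]
    # The GzipFile position has not moved, so read() still returns
    # the data from the very beginning.
    everything = f.read()
```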
-
mtime
When decompressing, the value of the last modification time field in the most recently read header may be read from this attribute, as an integer. The initial value before reading any headers is None.
All gzip compressed streams are required to contain this timestamp field. Some programs, such as gunzip, make use of the timestamp. The format is the same as the return value of time.time() and the st_mtime attribute of the object returned by os.stat().
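The interplay between the mtime constructor argument and the mtime attribute can be sketched as follows. The timestamp value is arbitrary and chosen only for illustration.

```python
import gzip
import io

buf = io.BytesIO()
# Write a stream with an explicit (arbitrary) modification time.
with gzip.GzipFile(fileobj=buf, mode="wb", mtime=1_000_000_000) as f:
    f.write(b"data")

buf.seek(0)
with gzip.GzipFile(fileobj=buf, mode="rb") as f:
    before = f.mtime  # no header has been read yet
    f.read()
    after = f.mtime   # taken from the header that was just read
```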
Changed in version 3.1: Support for the with statement was added, along with the mtime constructor argument and mtime attribute.
Changed in version 3.2: Support for zero-padded and unseekable files was added.
Changed in version 3.3: The io.BufferedIOBase.read1() method is now implemented.
Changed in version 3.4: Added support for the 'x' and 'xb' modes.
Changed in version 3.5: Added support for writing arbitrary bytes-like objects. The read() method now accepts an argument of None.
Changed in version 3.6: Accepts a path-like object.
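The following sketch shows the io.BytesIO pattern mentioned in the class description: the GzipFile is closed by the with statement, while the underlying buffer stays open so that getvalue() can be called. The data and the compression level are illustrative choices.

```python
import gzip
import io

buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=6) as f:
    f.write(b"line one\n")
    f.write(b"line two\n")

# Leaving the with block closed the GzipFile but not buf.
still_open = not buf.closed
compressed = buf.getvalue()

# Reading back: a GzipFile supports iteration like other binary files.
with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as f:
    lines = list(f)
```

The mode is given explicitly in both directions, in line with the advice above to always specify it.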
-
gzip.compress(data, compresslevel=9, *, mtime=None)
Compress the data, returning a bytes object containing the compressed data. compresslevel and mtime have the same meaning as in the GzipFile constructor above.
New in version 3.2.
Changed in version 3.8: Added the mtime parameter for reproducible output.
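To illustrate what reproducible output means here, the sketch below fixes mtime so that two calls on the same input produce the same bytes. The value 0 is one possible fixed timestamp; any constant would serve.

```python
import gzip

data = b"same input " * 100

# With a fixed mtime the header no longer depends on the current time.
first = gzip.compress(data, compresslevel=9, mtime=0)
second = gzip.compress(data, compresslevel=9, mtime=0)
identical = first == second

# A lower level trades compression ratio for speed; the result still
# decompresses to the same data.
fast = gzip.compress(data, compresslevel=1, mtime=0)
```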
-
gzip.decompress(data)
Decompress the data, returning a bytes object containing the uncompressed data.
New in version 3.2.
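A minimal round trip through compress() and decompress(). Both functions work on bytes, so text has to be encoded and decoded explicitly; UTF-8 is an assumption made for this sketch.

```python
import gzip

text = "Lots of content here"

# compress() takes bytes, so encode the text first.
blob = gzip.compress(text.encode("utf-8"))

# decompress() returns bytes, which are decoded back to str.
restored = gzip.decompress(blob).decode("utf-8")
```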
Examples of usage
Example of how to read a compressed file:
import gzip
with gzip.open('/home/joe/file.txt.gz', 'rb') as f:
    file_content = f.read()
Example of how to create a compressed GZIP file:
import gzip
content = b"Lots of content here"
with gzip.open('/home/joe/file.txt.gz', 'wb') as f:
    f.write(content)
Example of how to GZIP compress an existing file:
import gzip
import shutil
with open('/home/joe/file.txt', 'rb') as f_in:
    with gzip.open('/home/joe/file.txt.gz', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
Example of how to GZIP compress a binary string:
import gzip
s_in = b"Lots of content here"
s_out = gzip.compress(s_in)
See also
- Module zlib
The basic data compression module needed to support the gzip file format.
Command Line Interface
The gzip module provides a simple command line interface to compress or decompress files.
Once executed, the gzip module keeps the input file(s).
Changed in version 3.8: Added a new command line interface with a usage message. When you execute the CLI, the default compression level is 6.
Command line options
-
--fast
Indicates the fastest compression method (less compression).
-
--best
Indicates the slowest compression method (best compression).
-
-d, --decompress
Decompress the given file.
-
-h, --help
Show the help message.
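The command line interface can also be driven from Python, which keeps the example self-contained. This is a sketch under assumptions: the file name is hypothetical, and the ".gz" output name reflects how current CPython releases behave rather than anything stated in this section.

```python
import gzip
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, "report.txt")  # hypothetical input file
    with open(src, "wb") as f:
        f.write(b"some text to compress\n" * 50)

    # Equivalent to running: python -m gzip --best report.txt
    subprocess.run([sys.executable, "-m", "gzip", "--best", src], check=True)

    # The input file is kept; the compressed copy is assumed to be
    # written next to it with a ".gz" suffix.
    input_kept = os.path.exists(src)
    with gzip.open(src + ".gz", "rb") as f:
        round_trip = f.read()
```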