10. Brief Tour of the Standard Library
10.1. Operating System Interface
The os module provides dozens of functions for interacting with the operating system:
>>> import os
>>> os.getcwd() # Return the current working directory
'C:\\Python310'
>>> os.chdir('/server/accesslogs') # Change current working directory
>>> os.system('mkdir today') # Run the command mkdir in the system shell
0
Be sure to use the import os style instead of from os import *. This keeps os.open() from shadowing the built-in open() function, which operates much differently.
The built-in dir() and help() functions are useful as interactive aids for working with large modules like os:
>>> import os
>>> dir(os)
<returns a list of all module functions>
>>> help(os)
<returns an extensive manual page created from the module's docstrings>
For daily file and directory management tasks, the shutil module provides a higher level interface that is easier to use:
>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
'archive.db'
>>> shutil.move('/build/executables', 'installdir')
'installdir'
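The paths in the session above only exist on the author's machine. As a minimal sketch that can be run anywhere, the following repeats the same two calls inside a temporary directory. The file and directory names are reused from the example, and the use of tempfile is an assumption made here only to keep the demonstration self-contained.

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so nothing outside it is touched.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, 'data.db')
    with open(source, 'w') as f:
        f.write('example contents')

    # copyfile() copies the contents and returns the destination path.
    copied = shutil.copyfile(source, os.path.join(workdir, 'archive.db'))

    # When the destination is an existing directory, move() places the
    # file inside it and returns the final path.
    os.mkdir(os.path.join(workdir, 'installdir'))
    moved = shutil.move(copied, os.path.join(workdir, 'installdir'))

    names = sorted(os.listdir(workdir))
    moved_name = os.path.basename(moved)

print(names)
print(moved_name)
```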
10.2. File Wildcards
The glob module provides a function for making file lists from directory wildcard searches:
>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']
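glob.glob() also accepts a recursive=True argument, in which case the pattern ** matches any number of nested directories. The sketch below builds a small, hypothetical directory tree in a temporary location and compares a plain search with a recursive one.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical layout: two scripts, one text file, one nested script.
    for name in ('primes.py', 'quote.py', 'notes.txt'):
        open(os.path.join(workdir, name), 'w').close()
    os.mkdir(os.path.join(workdir, 'pkg'))
    open(os.path.join(workdir, 'pkg', 'util.py'), 'w').close()

    # Plain wildcard: only the top level is searched.
    top_level = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(workdir, '*.py')))

    # Recursive wildcard: '**' also descends into subdirectories.
    everything = sorted(
        os.path.relpath(path, workdir)
        for path in glob.glob(os.path.join(workdir, '**', '*.py'),
                              recursive=True))

print(top_level)
print(everything)
```

The results are sorted explicitly because glob returns matches in arbitrary order.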
10.3. Command Line Arguments
Common utility scripts often need to process command line arguments. These arguments are stored in the sys module’s argv attribute as a list. For instance, the following output results from running python demo.py one two three at the command line:
>>> import sys
>>> print(sys.argv)
['demo.py', 'one', 'two', 'three']
The argparse module provides a more sophisticated mechanism to process command line arguments. The following script extracts one or more filenames and an optional number of lines to be displayed:
import argparse
parser = argparse.ArgumentParser(
    prog='top',
    description='Show top lines from each file')
parser.add_argument('filenames', nargs='+')
parser.add_argument('-l', '--lines', type=int, default=10)
args = parser.parse_args()
print(args)
When run at the command line with python top.py --lines=5 alpha.txt beta.txt, the script sets args.lines to 5 and args.filenames to ['alpha.txt', 'beta.txt'].
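The same parser can be tried without a shell, because parse_args() accepts an explicit list of strings in place of sys.argv. The following sketch passes the arguments from the example directly and then checks the default for --lines; the variable name defaults is introduced here for illustration.

```python
import argparse

parser = argparse.ArgumentParser(
    prog='top',
    description='Show top lines from each file')
parser.add_argument('filenames', nargs='+')
parser.add_argument('-l', '--lines', type=int, default=10)

# Equivalent to: python top.py --lines=5 alpha.txt beta.txt
args = parser.parse_args(['--lines=5', 'alpha.txt', 'beta.txt'])
print(args.lines, args.filenames)

# Without --lines, the default of 10 applies.
defaults = parser.parse_args(['alpha.txt'])
print(defaults.lines)
```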
10.4. Error Output Redirection and Program Termination
The sys module also has attributes for stdin, stdout, and stderr. The latter is useful for emitting warnings and error messages to make them visible even when stdout has been redirected:
>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one
The most direct way to terminate a script is to use sys.exit().
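sys.exit() works by raising the SystemExit exception, and its argument becomes the exit status of the process. The sketch below catches that exception so the status can be inspected instead of ending the interpreter. The helper name fail and the status value 3 are hypothetical choices for illustration.

```python
import sys

def fail(message, code=1):
    """Report a problem on stderr, then request termination."""
    print(message, file=sys.stderr)
    sys.exit(code)

try:
    fail('Warning, log file not found', code=3)
except SystemExit as exc:
    # A real script would normally let the exception propagate;
    # it is caught here only to show the status it carries.
    exit_code = exc.code

print(exit_code)
```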
10.5. String Pattern Matching
The re module provides regular expression tools for advanced string processing. For complex matching and manipulation, regular expressions offer succinct, optimized solutions:
>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'
When only simple capabilities are needed, string methods are preferred because they are easier to read and debug:
>>> 'tea for too'.replace('too', 'two')
'tea for two'
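One place where the two approaches differ is word boundaries: str.replace() substitutes every occurrence of the substring, while a pattern using \b can restrict the match to whole words. The input string below is a made-up example chosen to show the difference.

```python
import re

text = 'too many tools'

# The string method replaces the substring wherever it occurs.
by_method = text.replace('too', 'two')

# The \b anchors limit the substitution to the standalone word.
by_regex = re.sub(r'\btoo\b', 'two', text)

print(by_method)   # two many twols
print(by_regex)    # two many tools
```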
10.6. Mathematics
The math module gives access to the underlying C library functions for floating point math:
>>> import math
>>> math.cos(math.pi / 4)
0.70710678118654757
>>> math.log(1024, 2)
10.0
The random module provides tools for making random selections:
>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10) # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random() # random float
0.17970987693706186
>>> random.randrange(6) # random integer chosen from range(6)
4
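The values shown above will differ from run to run. When repeatable results are needed, for example in tests, a generator can be created with a fixed seed. The sketch below uses two independent random.Random instances; the seed 2024 is an arbitrary choice.

```python
import random

# Two generators created with the same seed produce the same sequence.
first = random.Random(2024)
second = random.Random(2024)

rolls_a = [first.randrange(6) for _ in range(5)]
rolls_b = [second.randrange(6) for _ in range(5)]

print(rolls_a == rolls_b)   # True
```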
The statistics module calculates basic statistical properties (the mean, median, variance, etc.) of numeric data:
>>> import statistics
>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> statistics.mean(data)
1.6071428571428572
>>> statistics.median(data)
1.25
>>> statistics.variance(data)
1.3720238095238095
The SciPy project <https://scipy.org> has many other modules for numerical computations.
10.7. Internet Access
There are a number of modules for accessing the internet and processing internet protocols. Two of the simplest are urllib.request for retrieving data from URLs and smtplib for sending mail:
>>> from urllib.request import urlopen
>>> with urlopen('http://worldtimeapi.org/api/timezone/etc/UTC.txt') as response:
...     for line in response:
...         line = line.decode()             # Convert bytes to a str
...         if line.startswith('datetime'):
...             print(line.rstrip())         # Remove trailing newline
...
datetime: 2022-01-01T01:36:47.689215+00:00
>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
... """To: jcaesar@example.org
... From: soothsayer@example.org
...
... Beware the Ides of March.
... """)
>>> server.quit()
(Note that the second example needs a mailserver running on localhost.)
10.8. Dates and Times
The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation. The module also supports objects that are timezone aware.
>>> # dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'
>>> # dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368
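The session above uses naive date objects only. As a small sketch of the timezone-aware side of the module, the following attaches UTC to a datetime and converts it to a fixed offset. The five-hour offset is a hypothetical value, not tied to any particular region.

```python
from datetime import datetime, timedelta, timezone

# An aware datetime carries its own UTC offset.
utc_moment = datetime(2003, 12, 2, 12, 0, tzinfo=timezone.utc)

# A fixed offset of five hours behind UTC (hypothetical).
offset = timezone(timedelta(hours=-5))
local_moment = utc_moment.astimezone(offset)

print(local_moment.isoformat())      # 2003-12-02T07:00:00-05:00
print(utc_moment == local_moment)    # True: same instant, different offset
```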
10.9. Data Compression
Common data archiving and compression formats are directly supported by modules including: zlib, gzip, bz2, lzma, zipfile and tarfile.
>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)
226805979
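The other modules in the list follow a similar pattern. As a rough sketch, gzip offers one-shot compress() and decompress() functions that work on bytes in memory. The input here is the same phrase repeated 100 times, an arbitrary choice that makes the saving more visible than it is for a 41-byte string.

```python
import gzip

s = b'witch which has which witches wrist watch' * 100

packed = gzip.compress(s)
restored = gzip.decompress(packed)

print(len(s), len(packed))   # original size, compressed size
print(restored == s)         # True: the round trip is lossless
```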
10.10. Performance Measurement
Some Python users develop a deep interest in knowing the relative performance of different approaches to the same problem. Python provides a measurement tool that answers those questions immediately.
For example, it may be tempting to use the tuple packing and unpacking feature instead of the traditional approach to swapping arguments. The timeit module quickly demonstrates a modest performance advantage:
>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.57535828626024577
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.54962537085770791
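Single measurements like these are sensitive to whatever else the machine is doing. One common way to reduce that noise is to repeat the measurement and keep the minimum, which timeit.repeat() makes straightforward. The loop count of 100,000 and the three repetitions below are arbitrary choices; no timings are quoted because they depend on the machine.

```python
import timeit

setup = 'a=1; b=2'
number = 100_000   # fewer loops than the default of one million

# repeat() returns one total time per repetition; keep the best.
temp_swap = min(timeit.repeat('t=a; a=b; b=t', setup, repeat=3, number=number))
tuple_swap = min(timeit.repeat('a,b = b,a', setup, repeat=3, number=number))

print(f'temporary variable: {temp_swap:.4f} s')
print(f'tuple unpacking:    {tuple_swap:.4f} s')
```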
In contrast to timeit’s fine level of granularity, the profile and pstats modules provide tools for identifying time critical sections in larger blocks of code.
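A minimal profiling sketch follows. It uses cProfile, the C-accelerated counterpart of profile in the standard library, which is a substitution made here for speed. The functions slow_sum and work are hypothetical stand-ins for a larger program.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Deliberately simple loop that dominates the run time."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def work():
    return [slow_sum(20_000) for _ in range(20)]

# Collect timing data only while work() is running.
profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Format the collected data, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists each function with its call count and time, which points to slow_sum as the place to optimize.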
10.11. Quality Control
One approach for developing high quality software is to write tests for each function as it is developed and to run those tests frequently during the development process.
The doctest module provides a tool for scanning a module and validating tests embedded in a program’s docstrings. Test construction is as simple as cutting-and-pasting a typical call along with its results into the docstring. This improves the documentation by providing the user with an example and it allows the doctest module to make sure the code remains true to the documentation:
def average(values):
    """Computes the arithmetic mean of a list of numbers.

    >>> print(average([20, 30, 70]))
    40.0
    """
    return sum(values) / len(values)

import doctest
doctest.testmod()   # automatically validate the embedded tests
The unittest module is not as effortless as the doctest module, but it allows a more comprehensive set of tests to be maintained in a separate file:
import unittest

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        with self.assertRaises(ZeroDivisionError):
            average([])
        with self.assertRaises(TypeError):
            average(20, 30, 70)

unittest.main()  # Calling from the command line invokes all tests
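By default unittest.main() exits the interpreter when the tests finish, which is inconvenient in an interactive session. One alternative, sketched below, is to load the test case into a suite and run it with a TextTestRunner, which returns a result object. The class name TestAverage is hypothetical, and average() is repeated so the block is self-contained.

```python
import unittest

def average(values):
    return sum(values) / len(values)

class TestAverage(unittest.TestCase):

    def test_mean(self):
        self.assertEqual(average([20, 30, 70]), 40.0)

    def test_empty(self):
        with self.assertRaises(ZeroDivisionError):
            average([])

# Build a suite from the test case and run it without exiting.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAverage)
result = unittest.TextTestRunner(verbosity=0).run(suite)

print(result.testsRun, result.wasSuccessful())
```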
10.12. Batteries Included
Python has a “batteries included” philosophy. This is best seen through the sophisticated and robust capabilities of its larger packages. For example:
- The xmlrpc.client and xmlrpc.server modules make implementing remote procedure calls into an almost trivial task. Despite the modules’ names, no direct knowledge or handling of XML is needed.
- The email package is a library for managing email messages, including MIME and other RFC 2822-based message documents. Unlike smtplib and poplib, which actually send and receive messages, the email package has a complete toolset for building or decoding complex message structures (including attachments) and for implementing internet encoding and header protocols.
- The json package provides robust support for parsing this popular data interchange format. The csv module supports direct reading and writing of files in Comma-Separated Value format, commonly supported by databases and spreadsheets. XML processing is supported by the xml.etree.ElementTree, xml.dom and xml.sax packages. Together, these modules and packages greatly simplify data interchange between Python applications and other tools.
- The sqlite3 module is a wrapper for the SQLite database library, providing a persistent database that can be updated and accessed using slightly nonstandard SQL syntax.
- Internationalization is supported by a number of modules including gettext, locale, and the codecs package.
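As a brief illustration of the data interchange modules mentioned above, the sketch below pushes the same small set of records through json, csv and sqlite3. The record contents, the table name files and the use of in-memory buffers instead of real files are all assumptions made to keep the example self-contained.

```python
import csv
import io
import json
import sqlite3

# Hypothetical records used for all three formats.
records = [{'name': 'alpha', 'lines': 5}, {'name': 'beta', 'lines': 10}]

# json: serialize to a string and parse it back.
restored = json.loads(json.dumps(records))

# csv: write a header and one row per record to an in-memory buffer.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['name', 'lines'])
writer.writeheader()
writer.writerows(records)
csv_rows = buffer.getvalue().splitlines()

# sqlite3: store the records in an in-memory database and query them.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE files (name TEXT, lines INTEGER)')
conn.executemany('INSERT INTO files VALUES (:name, :lines)', records)
total = conn.execute('SELECT SUM(lines) FROM files').fetchone()[0]
conn.close()

print(restored == records)   # True
print(csv_rows)              # ['name,lines', 'alpha,5', 'beta,10']
print(total)                 # 15
```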