Functional Programming HOWTO
Author: A. M. Kuchling
Release: 0.32
In this document, we’ll take a tour of Python’s features suitable for implementing programs in a functional style. After an introduction to the concepts of functional programming, we’ll look at language features such as iterators and generators and relevant library modules such as itertools and functools.
Introduction
This section explains the basic concept of functional programming; if you’re just interested in learning about Python language features, skip to the next section on Iterators.
Programming languages support decomposing problems in several different ways:
- Most programming languages are procedural: programs are lists of instructions that tell the computer what to do with the program’s input. C, Pascal, and even Unix shells are procedural languages.
- In declarative languages, you write a specification that describes the problem to be solved, and the language implementation figures out how to perform the computation efficiently. SQL is the declarative language you’re most likely to be familiar with; a SQL query describes the data set you want to retrieve, and the SQL engine decides whether to scan tables or use indexes, which subclauses should be performed first, etc.
- Object-oriented programs manipulate collections of objects. Objects have internal state and support methods that query or modify this internal state in some way. Smalltalk and Java are object-oriented languages. C++ and Python are languages that support object-oriented programming, but don’t force the use of object-oriented features.
- Functional programming decomposes a problem into a set of functions. Ideally, functions only take inputs and produce outputs, and don’t have any internal state that affects the output produced for a given input. Well-known functional languages include the ML family (Standard ML, OCaml, and other variants) and Haskell.
The designers of some computer languages choose to emphasize one particular approach to programming. This often makes it difficult to write programs that use a different approach. Other languages are multi-paradigm languages that support several different approaches. Lisp, C++, and Python are multi-paradigm; you can write programs or libraries that are largely procedural, object-oriented, or functional in all of these languages. In a large program, different sections might be written using different approaches; the GUI might be object-oriented while the processing logic is procedural or functional, for example.
In a functional program, input flows through a set of functions. Each function operates on its input and produces some output. Functional style discourages functions with side effects that modify internal state or make other changes that aren’t visible in the function’s return value. Functions that have no side effects at all are called purely functional. Avoiding side effects means not using data structures that get updated as a program runs; every function’s output must only depend on its input.
Some languages are very strict about purity and don’t even have assignment statements such as a=3 or c = a + b, but it’s difficult to avoid all side effects, such as printing to the screen or writing to a disk file. Another example is a call to the print() or time.sleep() function, neither of which returns a useful value. Both are called only for their side effects of sending some text to the screen or pausing execution for a second.
Python programs written in functional style usually won’t go to the extreme of avoiding all I/O or all assignments; instead, they’ll provide a functional-appearing interface but will use non-functional features internally. For example, the implementation of a function will still use assignments to local variables, but won’t modify global variables or have other side effects.
Functional programming can be considered the opposite of object-oriented programming. Objects are little capsules containing some internal state along with a collection of method calls that let you modify this state, and programs consist of making the right set of state changes. Functional programming wants to avoid state changes as much as possible and works with data flowing between functions. In Python you might combine the two approaches by writing functions that take and return instances representing objects in your application (e-mail messages, transactions, etc.).
Functional design may seem like an odd constraint to work under. Why should you avoid objects and side effects? There are theoretical and practical advantages to the functional style:
- Formal provability.
- Modularity.
- Composability.
- Ease of debugging and testing.
Formal provability
A theoretical benefit is that it’s easier to construct a mathematical proof that a functional program is correct.
For a long time researchers have been interested in finding ways to mathematically prove programs correct. This is different from testing a program on numerous inputs and concluding that its output is usually correct, or reading a program’s source code and concluding that the code looks right; the goal is instead a rigorous proof that a program produces the right result for all possible inputs.
The technique used to prove programs correct is to write down invariants, properties of the input data and of the program’s variables that are always true. For each line of code, you then show that if invariants X and Y are true before the line is executed, the slightly different invariants X’ and Y’ are true after the line is executed. This continues until you reach the end of the program, at which point the invariants should match the desired conditions on the program’s output.
Functional programming’s avoidance of assignments arose because assignments are difficult to handle with this technique; assignments can break invariants that were true before the assignment without producing any new invariants that can be propagated onward.
Unfortunately, proving programs correct is largely impractical and not relevant to Python software. Even trivial programs require proofs that are several pages long; the proof of correctness for a moderately complicated program would be enormous, and few or none of the programs you use daily (the Python interpreter, your XML parser, your web browser) could be proven correct. Even if you wrote down or generated a proof, there would then be the question of verifying the proof; maybe there’s an error in it, and you wrongly believe you’ve proved the program correct.
Modularity
A more practical benefit of functional programming is that it forces you to break apart your problem into small pieces. Programs are more modular as a result. It’s easier to specify and write a small function that does one thing than a large function that performs a complicated transformation. Small functions are also easier to read and to check for errors.
Ease of debugging and testing
Testing and debugging a functional-style program is easier.
Debugging is simplified because functions are generally small and clearly specified. When a program doesn’t work, each function is an interface point where you can check that the data are correct. You can look at the intermediate inputs and outputs to quickly isolate the function that’s responsible for a bug.
Testing is easier because each function is a potential subject for a unit test. Functions don’t depend on system state that needs to be replicated before running a test; instead you only have to synthesize the right input and then check that the output matches expectations.
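A minimal sketch of what such a test can look like follows. The function strip_blank() and its test are hypothetical, written only to show that a side-effect-free function needs no setup beyond its input.

```python
def strip_blank(lines):
    """Hypothetical pure function: return the non-blank lines, stripped."""
    return [line.strip() for line in lines if line.strip()]

def test_strip_blank():
    # No files, databases, or globals to prepare: just input and expected output.
    assert strip_blank([" a \n", "\n", "b"]) == ["a", "b"]
    assert strip_blank([]) == []

test_strip_blank()
```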
Composability
As you work on a functional-style program, you’ll write a number of functions with varying inputs and outputs. Some of these functions will be unavoidably specialized to a particular application, but others will be useful in a wide variety of programs. For example, a function that takes a directory path and returns all the XML files in the directory, or a function that takes a filename and returns its contents, can be applied to many different situations.
Over time you’ll form a personal library of utilities. Often you’ll assemble new programs by arranging existing functions in a new configuration and writing a few functions specialized for the current task.
Iterators
I’ll start by looking at a Python language feature that’s an important foundation for writing functional-style programs: iterators.
An iterator is an object representing a stream of data; this object returns the data one element at a time. A Python iterator must support a method called __next__() that takes no arguments and always returns the next element of the stream. If there are no more elements in the stream, __next__() must raise the StopIteration exception. Iterators don’t have to be finite, though; it’s perfectly reasonable to write an iterator that produces an infinite stream of data.
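To make the protocol concrete, here is a small sketch of a hand-written iterator. The class name Countdown is made up for this example; it also defines __iter__() returning itself, which is what lets iter() and the for statement accept an iterator directly.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself here, so iter() and for loops accept it.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # no more elements in the stream
        value = self.current
        self.current -= 1
        return value

print(list(Countdown(3)))  # [3, 2, 1]
```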
The built-in iter() function takes an arbitrary object and tries to return an iterator that will return the object’s contents or elements, raising TypeError if the object doesn’t support iteration. Several of Python’s built-in data types support iteration, the most common being lists and dictionaries. An object is called iterable if you can get an iterator for it.
You can experiment with the iteration interface manually:
>>> L = [1, 2, 3]
>>> it = iter(L)
>>> it
<...iterator object at ...>
>>> it.__next__() # same as next(it)
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
Python expects iterable objects in several different contexts, the most important being the for statement. In the statement for X in Y, Y must be an iterator or some object for which iter() can create an iterator. These two statements are equivalent:
for i in iter(obj):
    print(i)

for i in obj:
    print(i)
Iterators can be materialized as lists or tuples by using the list() or tuple() constructor functions:
>>> L = [1, 2, 3]
>>> iterator = iter(L)
>>> t = tuple(iterator)
>>> t
(1, 2, 3)
Sequence unpacking also supports iterators: if you know an iterator will return N elements, you can unpack them into an N-tuple:
>>> L = [1, 2, 3]
>>> iterator = iter(L)
>>> a, b, c = iterator
>>> a, b, c
(1, 2, 3)
Built-in functions such as max() and min() can take a single iterator argument and will return the largest or smallest element. The "in" and "not in" operators also support iterators: X in iterator is true if X is found in the stream returned by the iterator. You’ll run into obvious problems if the iterator is infinite; max() and min() will never return, and if the element X never appears in the stream, the "in" and "not in" operators won’t return either.
Note that you can only go forward in an iterator; there’s no way to get the previous element, reset the iterator, or make a copy of it. Iterator objects can optionally provide these additional capabilities, but the iterator protocol only specifies the __next__() method. Functions may therefore consume all of the iterator’s output, and if you need to do something different with the same stream, you’ll have to create a new iterator.
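The following short sketch shows this consumption in practice: once max() has run through an iterator, nothing is left for a second pass, and a fresh iterator is needed. The variable names are arbitrary.

```python
numbers = [3, 1, 2]
it = iter(numbers)

largest = max(it)       # max() consumes the whole iterator
leftover = list(it)     # nothing is left for a second pass

fresh = iter(numbers)   # a new iterator starts again from the beginning
smallest = min(fresh)

print(largest, leftover, smallest)  # 3 [] 1
```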
Data Types That Support Iterators
We’ve already seen how lists and tuples support iterators. In fact, any Python sequence type, such as strings, will automatically support creation of an iterator.
Calling iter() on a dictionary returns an iterator that will loop over the dictionary’s keys:
>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>>> for key in m:
...     print(key, m[key])
Jan 1
Feb 2
Mar 3
Apr 4
May 5
Jun 6
Jul 7
Aug 8
Sep 9
Oct 10
Nov 11
Dec 12
Note that starting with Python 3.7, dictionary iteration order is guaranteed to be the same as the insertion order. In earlier versions, the behaviour was unspecified and could vary between implementations.
Applying iter() to a dictionary always loops over the keys, but dictionaries have methods that return other iterators. If you want to iterate over values or key/value pairs, you can explicitly call the values() or items() methods to get an appropriate iterator.
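A brief sketch of the three kinds of iteration, using a shortened version of the month dictionary from above (the expected output assumes Python 3.7 or later, where insertion order is preserved):

```python
m = {'Jan': 1, 'Feb': 2, 'Mar': 3}

keys = list(iter(m))        # iterating the dictionary itself yields keys
values = list(m.values())   # values only
pairs = list(m.items())     # (key, value) tuples

print(keys)    # ['Jan', 'Feb', 'Mar']
print(values)  # [1, 2, 3]
print(pairs)   # [('Jan', 1), ('Feb', 2), ('Mar', 3)]
```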
The dict() constructor can accept an iterator that returns a finite stream of (key, value) tuples:
>>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
>>> dict(iter(L))
{'Italy': 'Rome', 'France': 'Paris', 'US': 'Washington DC'}
Files also support iteration by calling the readline() method until there are no more lines in the file. This means you can read each line of a file like this:
for line in file:
    # do something for each line
    ...
Sets can take their contents from an iterable and let you iterate over the set’s elements:
S = {2, 3, 5, 7, 11, 13}
for i in S:
    print(i)
Generator expressions and list comprehensions
Two common operations on an iterator’s output are 1) performing some operation for every element, 2) selecting a subset of elements that meet some condition. For example, given a list of strings, you might want to strip off trailing whitespace from each line or extract all the strings containing a given substring.
List comprehensions and generator expressions (short form: “listcomps” and “genexps”) are a concise notation for such operations, borrowed from the functional programming language Haskell (https://www.haskell.org/). You can strip the leading and trailing whitespace from each string in a stream with the following code:
line_list = [' line 1\n', 'line 2 \n', ...]
# Generator expression -- returns iterator
stripped_iter = (line.strip() for line in line_list)
# List comprehension -- returns list
stripped_list = [line.strip() for line in line_list]
You can select only certain elements by adding an "if" condition:
stripped_list = [line.strip() for line in line_list
if line != ""]
With a list comprehension, you get back a Python list; stripped_list is a list containing the resulting lines, not an iterator. Generator expressions return an iterator that computes the values as necessary, not needing to materialize all the values at once. This means that list comprehensions aren’t useful if you’re working with iterators that return an infinite stream or a very large amount of data. Generator expressions are preferable in these situations.
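As a sketch of why laziness matters, the example below applies a generator expression to itertools.count(), which never ends. The equivalent list comprehension over the same stream would never finish. The names squares and first_five are arbitrary.

```python
import itertools

# itertools.count(1) is an infinite stream: 1, 2, 3, ...
# The generator expression computes each square only when it is requested.
squares = (n * n for n in itertools.count(1))

first_five = [next(squares) for _ in range(5)]
print(first_five)  # [1, 4, 9, 16, 25]
```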
Generator expressions are surrounded by parentheses (“()”) and list comprehensions are surrounded by square brackets (“[]”). Generator expressions have the form:
( expression for expr in sequence1
             if condition1
             for expr2 in sequence2
             if condition2
             for expr3 in sequence3 ...
             if condition3
             for exprN in sequenceN
             if conditionN )
Again, for a list comprehension only the outside brackets are different (square brackets instead of parentheses).
The elements of the generated output will be the successive values of expression. The if clauses are all optional; if present, expression is only evaluated and added to the result when condition is true.
Generator expressions always have to be written inside parentheses, but the parentheses signalling a function call also count. If you want to create an iterator that will be immediately passed to a function you can write:
obj_total = sum(obj.count for obj in list_all_objects())
The for...in clauses contain the sequences to be iterated over. The sequences do not have to be the same length, because they are iterated over from left to right, not in parallel. For each element in sequence1, sequence2 is looped over from the beginning. sequence3 is then looped over for each resulting pair of elements from sequence1 and sequence2.
To put it another way, a list comprehension or generator expression is equivalent to the following Python code:
for expr1 in sequence1:
    if not (condition1):
        continue   # Skip this element
    for expr2 in sequence2:
        if not (condition2):
            continue   # Skip this element
        ...
        for exprN in sequenceN:
            if not (conditionN):
                continue   # Skip this element
            # Output the value of
            # the expression.
This means that when there are multiple for...in clauses but no if clauses, the length of the resulting output will be equal to the product of the lengths of all the sequences. If you have two lists of length 3, the output list is 9 elements long:
>>> seq1 = 'abc'
>>> seq2 = (1, 2, 3)
>>> [(x, y) for x in seq1 for y in seq2]
[('a', 1), ('a', 2), ('a', 3),
('b', 1), ('b', 2), ('b', 3),
('c', 1), ('c', 2), ('c', 3)]
To avoid introducing an ambiguity into Python’s grammar, if expression is creating a tuple, it must be surrounded with parentheses. The first list comprehension below is a syntax error, while the second one is correct:
# Syntax error
[x, y for x in seq1 for y in seq2]
# Correct
[(x, y) for x in seq1 for y in seq2]
Generators
Generators are a special class of functions that simplify the task of writing iterators. Regular functions compute a value and return it, but generators return an iterator that returns a stream of values.
You’re doubtless familiar with how regular function calls work in Python or C. When you call a function, it gets a private namespace where its local variables are created. When the function reaches a return statement, the local variables are destroyed and the value is returned to the caller. A later call to the same function creates a new private namespace and a fresh set of local variables. But, what if the local variables weren’t thrown away on exiting a function? What if you could later resume the function where it left off? This is what generators provide; they can be thought of as resumable functions.
Here’s the simplest example of a generator function:
>>> def generate_ints(N):
...     for i in range(N):
...         yield i
Any function containing a yield keyword is a generator function; this is detected by Python’s bytecode compiler which compiles the function specially as a result.
When you call a generator function, it doesn’t return a single value; instead it returns a generator object that supports the iterator protocol. On executing the yield expression, the generator outputs the value of i, similar to a return statement. The big difference between yield and a return statement is that on reaching a yield the generator’s state of execution is suspended and local variables are preserved. On the next call to the generator’s __next__() method, the function will resume executing.
Here’s a sample usage of the generate_ints() generator:
>>> gen = generate_ints(3)
>>> gen
<generator object generate_ints at ...>
>>> next(gen)
0
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
File "stdin", line 1, in <module>
File "stdin", line 2, in generate_ints
StopIteration
You could equally write for i in generate_ints(5), or a, b, c = generate_ints(3).
Inside a generator function, return value causes StopIteration(value) to be raised from the __next__() method. Once this happens, or the bottom of the function is reached, the procession of values ends and the generator cannot yield any further values.
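A small sketch of how the returned value can be observed: the value passed to return ends up in the value attribute of the StopIteration exception. The generator name two_then_done is hypothetical.

```python
def two_then_done():
    yield 1
    yield 2
    return 'finished'   # carried by the StopIteration exception

gen = two_then_done()
print(next(gen))  # 1
print(next(gen))  # 2
try:
    next(gen)
except StopIteration as exc:
    result = exc.value

print(result)  # finished
```

A plain for loop over the generator would silently swallow the StopIteration, so the return value is only visible when you call next() yourself and catch the exception.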
You could achieve the effect of generators manually by writing your own class and storing all the local variables of the generator as instance variables. For example, returning a list of integers could be done by setting self.count to 0, and having the __next__() method increment self.count and return it. However, for a moderately complicated generator, writing a corresponding class can be much messier.
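For comparison, here is one possible hand-written counterpart of generate_ints(N). The class name GenerateInts is made up for this sketch; note how the loop variable has to become an instance attribute and the end-of-stream check has to be written out explicitly.

```python
class GenerateInts:
    """Hand-written counterpart of generate_ints(N); hypothetical name."""

    def __init__(self, N):
        self.N = N
        self.count = 0   # plays the role of the generator's local variable i

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.N:
            raise StopIteration
        value = self.count
        self.count += 1
        return value

print(list(GenerateInts(3)))  # [0, 1, 2]
```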
The test suite included with Python’s library, Lib/test/test_generators.py, contains a number of more interesting examples. Here’s one generator that implements an in-order traversal of a tree using generators recursively.
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
Two other examples in test_generators.py produce solutions for the N-Queens problem (placing N queens on an NxN chess board so that no queen threatens another) and the Knight’s Tour (finding a route that takes a knight to every square of an NxN chessboard without visiting any square twice).
Passing values into a generator
In Python 2.4 and earlier, generators only produced output. Once a generator’s code was invoked to create an iterator, there was no way to pass any new information into the function when its execution was resumed. You could hack together this ability by making the generator look at a global variable or by passing in some mutable object that callers then modify, but these approaches are messy.
In Python 2.5 there’s a simple way to pass values into a generator. yield became an expression, returning a value that can be assigned to a variable or otherwise operated on:
val = (yield i)
I recommend that you always put parentheses around a yield expression when you’re doing something with the returned value, as in the above example. The parentheses aren’t always necessary, but it’s easier to always add them instead of having to remember when they’re needed.
(PEP 342 explains the exact rules, which are that a yield-expression must always be parenthesized except when it occurs at the top-level expression on the right-hand side of an assignment. This means you can write val = yield i but have to use parentheses when there’s an operation, as in val = (yield i) + 12.)
Values are sent into a generator by calling its send(value) method. This method resumes the generator’s code and the yield expression returns the specified value. If the regular __next__() method is called, the yield returns None.
Here’s a simple counter that increments by 1 and allows changing the value of the internal counter.
def counter(maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1
And here’s an example of changing the counter:
>>> it = counter(10)
>>> next(it)
0
>>> next(it)
1
>>> it.send(8)
8
>>> next(it)
9
>>> next(it)
Traceback (most recent call last):
  File "t.py", line 15, in <module>
    next(it)
StopIteration
Because yield will often be returning None, you should always check for this case. Don’t just use its value in expressions unless you’re sure that the send() method will be the only method used to resume your generator function.
In addition to send(), there are two other methods on generators:
- throw(value) is used to raise an exception inside the generator; the exception is raised by the yield expression where the generator’s execution is paused.
- close() raises a GeneratorExit exception inside the generator to terminate the iteration. On receiving this exception, the generator’s code must either raise GeneratorExit or StopIteration; catching the exception and doing anything else is illegal and will trigger a RuntimeError. close() will also be called by Python’s garbage collector when the generator is garbage-collected.
If you need to run cleanup code when a GeneratorExit occurs, I suggest using a try: ... finally: suite instead of catching GeneratorExit.
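The sketch below illustrates both methods under simple assumptions. The generator names numbers and resilient, and the events list used to record what happened, are hypothetical and exist only for this example.

```python
events = []   # hypothetical log, used only to make the order of events visible

def numbers():
    try:
        n = 0
        while True:
            yield n
            n += 1
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        events.append('cleanup')

gen = numbers()
first = next(gen)
second = next(gen)
gen.close()
print(first, second, events)  # 0 1 ['cleanup']

def resilient():
    while True:
        try:
            yield 'ok'
        except ValueError:
            # The exception passed to throw() surfaces at the paused yield.
            yield 'recovered'

gen2 = resilient()
started = next(gen2)
after_throw = gen2.throw(ValueError('boom'))
print(started, after_throw)  # ok recovered
```

The finally suite in numbers() runs without the generator having to mention GeneratorExit at all, which is why it is the simpler choice for cleanup code.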
The cumulative effect of these changes is to turn generators from one-way producers of information into both producers and consumers.
Generators also become coroutines, a more generalized form of subroutines. Subroutines are entered at one point and exited at another point (the top of the function, and a return statement), but coroutines can be entered, exited, and resumed at many different points (the yield statements).
Built-in functions
Let’s look in more detail at built-in functions often used with iterators.
Two of Python’s built-in functions, map() and filter(), duplicate the features of generator expressions:
map(f, iterA, iterB, ...) returns an iterator over the sequence f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ....
>>> def upper(s):
...     return s.upper()
>>> list(map(upper, ['sentence', 'fragment']))
['SENTENCE', 'FRAGMENT']
>>> [upper(s) for s in ['sentence', 'fragment']]
['SENTENCE', 'FRAGMENT']
You can of course achieve the same effect with a list comprehension.
filter(predicate, iter) returns an iterator over all the sequence elements that meet a certain condition, and is similarly duplicated by list comprehensions. A predicate is a function that returns the truth value of some condition; for use with filter(), the predicate must take a single value.
>>> def is_even(x):
...     return (x % 2) == 0
>>> list(filter(is_even, range(10)))
[0, 2, 4, 6, 8]
This can also be written as a list comprehension:
>>> [x for x in range(10) if is_even(x)]
[0, 2, 4, 6, 8]
enumerate(iter, start=0) counts off the elements in the iterable, returning 2-tuples containing the count (from start) and each element.
>>> for item in enumerate(['subject', 'verb', 'object']):
...     print(item)
(0, 'subject')
(1, 'verb')
(2, 'object')
enumerate() is often used when looping through a list and recording the indexes at which certain conditions are met:
f = open('data.txt', 'r')
for i, line in enumerate(f):
    if line.strip() == '':
        print('Blank line at line #%i' % i)
sorted(iterable, key=None, reverse=False) collects all the elements of the iterable into a list, sorts the list, and returns the sorted result. The key and reverse arguments are passed through to the constructed list’s sort() method.
>>> import random
>>> # Generate 8 random numbers between [0, 10000)
>>> rand_list = random.sample(range(10000), 8)
>>> rand_list
[769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
>>> sorted(rand_list)
[769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
>>> sorted(rand_list, reverse=True)
[9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
(For a more detailed discussion of sorting, see the Sorting HOWTO.)
The any(iter) and all(iter) built-ins look at the truth values of an iterable’s contents. any() returns True if any element in the iterable is a true value, and all() returns True if all of the elements are true values:
>>> any([0, 1, 0])
True
>>> any([0, 0, 0])
False
>>> any([1, 1, 1])
True
>>> all([0, 1, 0])
False
>>> all([0, 0, 0])
False
>>> all([1, 1, 1])
True
zip(iterA, iterB, ...) takes one element from each iterable and returns them in a tuple:
zip(['a', 'b', 'c'], (1, 2, 3)) =>
('a', 1), ('b', 2), ('c', 3)
It doesn’t construct an in-memory list and exhaust all the input iterators before returning; instead tuples are constructed and returned only if they’re requested. (The technical term for this behaviour is lazy evaluation.)
This iterator is intended to be used with iterables that are all of the same length. If the iterables are of different lengths, the resulting stream will be the same length as the shortest iterable.
zip(['a', 'b'], (1, 2, 3)) =>
('a', 1), ('b', 2)
You should avoid doing this, though, because an element may be taken from the longer iterators and discarded. This means you can’t go on to use the iterators further because you risk skipping a discarded element.
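The discarded element is easy to see in a small experiment. In this sketch, zip() reads 3 from the longer iterator before discovering that the shorter one is exhausted, so 3 is lost. The variable names are arbitrary.

```python
longer = iter([1, 2, 3, 4])
shorter = iter(['a', 'b'])

pairs = list(zip(longer, shorter))
remaining = list(longer)

print(pairs)      # [(1, 'a'), (2, 'b')]
print(remaining)  # [4]; the element 3 was taken by zip() and discarded
```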
The itertools module
The itertools module contains a number of commonly-used iterators as well as functions for combining several iterators. This section will introduce the module’s contents by showing small examples.
The module’s functions fall into a few broad classes:
- Functions that create a new iterator based on an existing iterator.
- Functions for treating an iterator’s elements as function arguments.
- Functions for selecting portions of an iterator’s output.
- A function for grouping an iterator’s output.
Creating new iterators创建新迭代器¶
itertools.count(start, step)
returns an infinite stream of evenly spaced values. 返回无限个等距值流。You can optionally supply the starting number, which defaults to 0, and the interval between numbers, which defaults to 1:您可以选择提供起始数字(默认为0)和数字之间的间隔(默认为1):
itertools.count() =>
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
itertools.count(10) =>
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
itertools.count(10, 5) =>
10, 15, 20, 25, 30, 35, 40, 45, 50, 55, ...
itertools.cycle(iter) saves a copy of the contents of a provided iterable and returns a new iterator that returns its elements from first to last. The new iterator will repeat these elements infinitely.
itertools.cycle([1, 2, 3, 4, 5]) =>
1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
itertools.repeat(elem, [n])
returns the provided element n times, or returns the element endlessly if n is not provided.
itertools.repeat('abc') =>
abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
itertools.repeat('abc', 5) =>
abc, abc, abc, abc, abc
itertools.chain(iterA, iterB, ...) takes an arbitrary number of iterables as input, and returns all the elements of the first iterator, then all the elements of the second, and so on, until all of the iterables have been exhausted.
itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
a, b, c, 1, 2, 3
itertools.islice(iter, [start], stop, [step])
returns a stream that’s a slice of the iterator. With a single stop argument, it will return the first stop elements. If you supply a starting index, you’ll get stop-start elements, and if you supply a value for step, elements will be skipped accordingly. Unlike Python’s string and list slicing, you can’t use negative values for start, stop, or step.
itertools.islice(range(10), 8) =>
0, 1, 2, 3, 4, 5, 6, 7
itertools.islice(range(10), 2, 8) =>
2, 3, 4, 5, 6, 7
itertools.islice(range(10), 2, 8, 2) =>
2, 4, 6
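The "=>" notation above is shorthand rather than runnable code. As a rough sketch of how these functions might be tried interactively, islice() can bound the infinite iterators so that list() terminates; the variable names here are illustrative only:

```python
import itertools

# islice() takes a finite prefix of an otherwise infinite stream.
first_counts = list(itertools.islice(itertools.count(10, 5), 4))
cycled = list(itertools.islice(itertools.cycle('ab'), 5))

# repeat() with a count and chain() are already finite.
repeated = list(itertools.repeat('abc', 3))
chained = list(itertools.chain(['a', 'b'], (1, 2)))

print(first_counts)  # [10, 15, 20, 25]
print(cycled)        # ['a', 'b', 'a', 'b', 'a']
print(repeated)      # ['abc', 'abc', 'abc']
print(chained)       # ['a', 'b', 1, 2]
```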
itertools.tee(iter, [n])
replicates an iterator; it returns n independent iterators that will all return the contents of the source iterator. If you don’t supply a value for n, the default is 2. Replicating iterators requires saving some of the contents of the source iterator, so this can consume significant memory if the iterator is large and one of the new iterators is consumed more than the others.
itertools.tee( itertools.count() ) =>
iterA, iterB
where iterA ->
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
and iterB ->
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
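A small sketch of the independence of the replicated iterators, using a finite source so the results can be listed; the names iter_a and iter_b are just placeholders:

```python
import itertools

source = iter(range(5))
iter_a, iter_b = itertools.tee(source)

# Advance only iter_a; tee() buffers these items so iter_b can still see them.
first_three = [next(iter_a) for _ in range(3)]

# iter_b starts from the beginning regardless of how far iter_a has moved.
all_b = list(iter_b)
rest_a = list(iter_a)

print(first_three)  # [0, 1, 2]
print(all_b)        # [0, 1, 2, 3, 4]
print(rest_a)       # [3, 4]
```

After calling tee(), the source iterator should not be used directly, since advancing it would hide items from the copies.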
Calling functions on elements
The operator module contains a set of functions corresponding to Python’s operators. Some examples are operator.add(a, b) (adds two values), operator.ne(a, b) (same as a != b), and operator.attrgetter('id') (returns a callable that fetches the .id attribute).
itertools.starmap(func, iter)
assumes that the iterable will return a stream of tuples, and calls func using these tuples as the arguments:
itertools.starmap(os.path.join,
[('/bin', 'python'), ('/usr', 'bin', 'java'),
('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')])
=>
/bin/python, /usr/bin/java, /usr/bin/perl, /usr/bin/ruby
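Because the os.path.join result depends on the platform’s path separator, a platform-independent sketch may be easier to check. The example below uses the built-in pow() and operator.add() with starmap(), and a hypothetical User record (not part of the original text) to illustrate attrgetter():

```python
import itertools
import operator
from collections import namedtuple

# Each tuple is unpacked into the function's positional arguments.
powers = list(itertools.starmap(pow, [(2, 5), (3, 2), (10, 3)]))
sums = list(itertools.starmap(operator.add, [(1, 2), (3, 4)]))

# Hypothetical record type, used only to show attrgetter().
User = namedtuple('User', ['id', 'name'])
get_id = operator.attrgetter('id')
user_id = get_id(User(id=42, name='alice'))

print(powers)   # [32, 9, 1000]
print(sums)     # [3, 7]
print(user_id)  # 42
```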
Selecting elements
Another group of functions chooses a subset of an iterator’s elements based on a predicate.
itertools.filterfalse(predicate, iter) is the opposite of filter(), returning all elements for which the predicate returns false:
itertools.filterfalse(is_even, itertools.count()) =>
1, 3, 5, 7, 9, 11, 13, 15, ...
itertools.takewhile(predicate, iter)
returns elements for as long as the predicate returns true. Once the predicate returns false, the iterator will signal the end of its results.
def less_than_10(x):
    return x < 10
itertools.takewhile(less_than_10, itertools.count()) =>
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
itertools.takewhile(is_even, itertools.count()) =>
0
itertools.dropwhile(predicate, iter) discards elements while the predicate returns true, and then returns the rest of the iterable’s results.
itertools.dropwhile(less_than_10, itertools.count()) =>
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
itertools.dropwhile(is_even, itertools.count()) =>
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
itertools.compress(data, selectors)
takes two iterators and returns only those elements of data for which the corresponding element of selectors is true, stopping whenever either one is exhausted:
itertools.compress([1, 2, 3, 4, 5], [True, True, False, False, True]) =>
1, 2, 5
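The selection functions above can be exercised together in one runnable sketch. The predicates are redefined here so the block is self-contained, and islice() is an added convenience for truncating the infinite results:

```python
import itertools

def is_even(x):
    return x % 2 == 0

def less_than_10(x):
    return x < 10

# filterfalse() keeps the elements the predicate rejects.
odds = list(itertools.islice(
    itertools.filterfalse(is_even, itertools.count()), 5))

# takewhile() stops at the first failure, so this one is finite.
small = list(itertools.takewhile(less_than_10, itertools.count()))

# dropwhile() skips the leading run, then passes everything through.
after = list(itertools.islice(
    itertools.dropwhile(less_than_10, itertools.count()), 3))

# compress() stops when the shorter of its two inputs runs out.
picked = list(itertools.compress('abcde', [1, 0, 1]))

print(odds)    # [1, 3, 5, 7, 9]
print(small)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(after)   # [10, 11, 12]
print(picked)  # ['a', 'c']
```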
Combinatoric functions
itertools.combinations(iterable, r) returns an iterator giving all possible r-tuple combinations of the elements contained in iterable.
itertools.combinations([1, 2, 3, 4, 5], 2) =>
(1, 2), (1, 3), (1, 4), (1, 5),
(2, 3), (2, 4), (2, 5),
(3, 4), (3, 5),
(4, 5)
itertools.combinations([1, 2, 3, 4, 5], 3) =>
(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5),
(2, 3, 4), (2, 3, 5), (2, 4, 5),
(3, 4, 5)
The elements within each tuple remain in the same order as iterable returned them. For example, the number 1 is always before 2, 3, 4, or 5 in the examples above. A similar function, itertools.permutations(iterable, r=None), removes this constraint on the order, returning all possible arrangements of length r:
itertools.permutations([1, 2, 3, 4, 5], 2) =>
(1, 2), (1, 3), (1, 4), (1, 5),
(2, 1), (2, 3), (2, 4), (2, 5),
(3, 1), (3, 2), (3, 4), (3, 5),
(4, 1), (4, 2), (4, 3), (4, 5),
(5, 1), (5, 2), (5, 3), (5, 4)
itertools.permutations([1, 2, 3, 4, 5]) =>
(1, 2, 3, 4, 5), (1, 2, 3, 5, 4), (1, 2, 4, 3, 5),
...
(5, 4, 3, 2, 1)
If you don’t supply a value for r the length of the iterable is used, meaning that all the elements are permuted.
Note that these functions produce all of the possible combinations by position and don’t require that the contents of iterable are unique:
itertools.permutations('aba', 3) =>
('a', 'b', 'a'), ('a', 'a', 'b'), ('b', 'a', 'a'),
('b', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a')
The identical tuple ('a', 'a', 'b') occurs twice, but the two ‘a’ strings came from different positions.
The itertools.combinations_with_replacement(iterable, r) function relaxes a different constraint: elements can be repeated within a single tuple. Conceptually an element is selected for the first position of each tuple and then is replaced before the second element is selected.
itertools.combinations_with_replacement([1, 2, 3, 4, 5], 2) =>
(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
(2, 2), (2, 3), (2, 4), (2, 5),
(3, 3), (3, 4), (3, 5),
(4, 4), (4, 5),
(5, 5)
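One way to check the listings above is to count the results. The sketch below also shows one possible approach, not taken from the original text, for collapsing the positional duplicates that permutations() produces when the input has repeated values:

```python
import itertools

items = [1, 2, 3, 4, 5]

# Counts of the three listings shown above for r = 2.
n_combos = len(list(itertools.combinations(items, 2)))
n_perms = len(list(itertools.permutations(items, 2)))
n_with_repl = len(list(itertools.combinations_with_replacement(items, 2)))

# A set removes tuples that are equal by value but differ by position.
distinct = sorted(set(itertools.permutations('aba', 3)))

print(n_combos, n_perms, n_with_repl)  # 10 20 15
print(distinct)  # [('a', 'a', 'b'), ('a', 'b', 'a'), ('b', 'a', 'a')]
```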
Grouping elements
The last function I’ll discuss, itertools.groupby(iter, key_func=None), is the most complicated. key_func(elem) is a function that can compute a key value for each element returned by the iterable. If you don’t supply a key function, the key is simply each element itself.
groupby() collects all the consecutive elements from the underlying iterable that have the same key value, and returns a stream of 2-tuples containing a key value and an iterator for the elements with that key.
city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
             ('Anchorage', 'AK'), ('Nome', 'AK'),
             ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
             ...
            ]
def get_state(city_state):
    return city_state[1]
itertools.groupby(city_list, get_state) =>
('AL', iterator-1),
('AK', iterator-2),
('AZ', iterator-3), ...
where
iterator-1 =>
('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
iterator-2 =>
('Anchorage', 'AK'), ('Nome', 'AK')
iterator-3 =>
('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
groupby() assumes that the underlying iterable’s contents will already be sorted based on the key. Note that the returned iterators also use the underlying iterable, so you have to consume the results of iterator-1 before requesting iterator-2 and its corresponding key.
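The sorting assumption is easy to trip over, so here is a small runnable sketch using a deliberately unsorted, shortened version of the city list. It shows the fragmented groups you get without sorting, and then one common pattern for collecting each group into a list before moving on to the next key:

```python
import itertools

# Shortened, deliberately unsorted sample data.
city_list = [('Decatur', 'AL'), ('Anchorage', 'AK'),
             ('Huntsville', 'AL'), ('Nome', 'AK')]

def get_state(city_state):
    return city_state[1]

# Without sorting, each run of equal keys becomes its own group.
unsorted_keys = [key for key, group in
                 itertools.groupby(city_list, get_state)]

# Sort by the same key first, and consume each group inside the loop.
grouped = {}
for state, cities in itertools.groupby(
        sorted(city_list, key=get_state), get_state):
    grouped[state] = [name for name, _ in cities]

print(unsorted_keys)  # ['AL', 'AK', 'AL', 'AK']
print(grouped)        # {'AK': ['Anchorage', 'Nome'], 'AL': ['Decatur', 'Huntsville']}
```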
The functools module
The functools module, available since Python 2.5, contains some higher-order functions. A higher-order function takes one or more functions as input and returns a new function. The most useful tool in this module is the functools.partial() function.
For programs written in a functional style, you’ll sometimes want to construct variants of existing functions that have some of the parameters filled in. Consider a Python function f(a, b, c); you may wish to create a new function g(b, c) that’s equivalent to f(1, b, c); you’re filling in a value for one of f()’s parameters. This is called “partial function application”.
The constructor for partial() takes the arguments (function, arg1, arg2, ..., kwarg1=value1, kwarg2=value2). The resulting object is callable, so you can just call it to invoke function with the filled-in arguments.
Here’s a small but realistic example:
import functools
def log(message, subsystem):
    """Write the contents of 'message' to the specified subsystem."""
    print('%s: %s' % (subsystem, message))
    ...
server_log = functools.partial(log, subsystem='server')
server_log('Unable to open socket')
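The f(a, b, c) and g(b, c) case described earlier fills in a positional argument rather than a keyword. A minimal sketch, with f defined here as a stand-in that simply returns its arguments:

```python
import functools

def f(a, b, c):
    # Stand-in function: returns its arguments so the binding is visible.
    return (a, b, c)

# Fill in the first positional parameter; g behaves like f(1, b, c).
g = functools.partial(f, 1)
result = g(2, 3)

print(result)   # (1, 2, 3)
print(g.func is f, g.args, g.keywords)  # True (1,) {}
```

The partial object keeps the wrapped function and the stored arguments in its func, args, and keywords attributes, which can help when debugging.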
functools.reduce(func, iter, [initial_value]) cumulatively performs an operation on all the iterable’s elements and, therefore, can’t be applied to infinite iterables. func must be a function that takes two elements and returns a single value. functools.reduce() takes the first two elements A and B returned by the iterator and calculates func(A, B). It then requests the third element, C, calculates func(func(A, B), C), combines this result with the fourth element returned, and continues until the iterable is exhausted. If the iterable returns no values at all, a TypeError exception is raised. If the initial value is supplied, it’s used as a starting point and func(initial_value, A) is the first calculation.
>>> import operator, functools
>>> functools.reduce(operator.concat, ['A', 'BB', 'C'])
'ABBC'
>>> functools.reduce(operator.concat, [])
Traceback (most recent call last):
...
TypeError: reduce() of empty sequence with no initial value
>>> functools.reduce(operator.mul, [1, 2, 3], 1)
6
>>> functools.reduce(operator.mul, [], 1)
1
If you use operator.add() with functools.reduce(), you’ll add up all the elements of the iterable. This case is so common that there’s a special built-in called sum() to compute it:
>>> import functools, operator
>>> functools.reduce(operator.add, [1, 2, 3, 4], 0)
10
>>> sum([1, 2, 3, 4])
10
>>> sum([])
0
For many uses of functools.reduce(), though, it can be clearer to just write the obvious for loop:
import functools
import operator
# Instead of:
product = functools.reduce(operator.mul, [1, 2, 3], 1)
# You can write:
product = 1
for i in [1, 2, 3]:
    product *= i
A related function is itertools.accumulate(iterable, func=operator.add). It performs the same calculation, but instead of returning only the final result, accumulate() returns an iterator that also yields each partial result:
itertools.accumulate([1, 2, 3, 4, 5]) =>
1, 3, 6, 10, 15
itertools.accumulate([1, 2, 3, 4, 5], operator.mul) =>
1, 2, 6, 24, 120
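As a runnable sketch of the relationship between the two functions, the last value produced by accumulate() matches what functools.reduce() returns for the same inputs. The running-maximum case is an extra illustration showing that any two-argument function can be supplied:

```python
import functools
import itertools
import operator

data = [1, 2, 3, 4, 5]

running_sum = list(itertools.accumulate(data))
running_product = list(itertools.accumulate(data, operator.mul))

# Any two-argument callable works, for example the built-in max().
running_max = list(itertools.accumulate([3, 1, 4, 1, 5], max))

print(running_sum)      # [1, 3, 6, 10, 15]
print(running_product)  # [1, 2, 6, 24, 120]
print(running_max)      # [3, 3, 4, 4, 5]
print(running_sum[-1] == functools.reduce(operator.add, data))  # True
```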
The operator module
The operator module was mentioned earlier. It contains a set of functions corresponding to Python’s operators. These functions are often useful in functional-style code because they save you from writing trivial functions that perform a single operation.
Some of the functions in this module are:
- Math operations: add(), sub(), mul(), floordiv(), abs(), …
- Logical operations: not_(), truth().
- Bitwise operations: and_(), or_(), invert().
- Comparisons: eq(), ne(), lt(), le(), gt(), and ge().
- Object identity: is_(), is_not().
Consult the operator module’s documentation for a complete list.
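A brief sketch of where these functions tend to replace one-line helper functions. The sample data is made up for illustration, and operator.itemgetter() is another member of the module that is not in the list above:

```python
import functools
import operator

pairs = [('b', 2), ('a', 3), ('c', 1)]

# itemgetter(1) stands in for a function that returns item[1].
by_count = sorted(pairs, key=operator.itemgetter(1))

# map() with a two-argument function walks both lists in parallel.
products = list(map(operator.mul, [1, 2, 3], [4, 5, 6]))

# reduce() with operator.mul multiplies 1 * 2 * 3 * 4 * 5.
factorial_5 = functools.reduce(operator.mul, range(1, 6))

print(by_count)     # [('c', 1), ('b', 2), ('a', 3)]
print(products)     # [4, 10, 18]
print(factorial_5)  # 120
```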
Small functions and the lambda expression
When writing functional-style programs, you’ll often need little functions that act as predicates or that combine elements in some way.
If there’s a Python built-in or a module function that’s suitable, you don’t need to define a new function at all:
stripped_lines = [line.strip() for line in lines]
existing_files = filter(os.path.exists, file_list)
If the function you need doesn’t exist, you need to write it. One way to write small functions is to use the lambda expression. lambda takes a number of parameters and an expression combining these parameters, and creates an anonymous function that returns the value of the expression:
adder = lambda x, y: x+y
print_assign = lambda name, value: name + '=' + str(value)
An alternative is to just use the def statement and define a function in the usual way:
def adder(x, y):
    return x + y
def print_assign(name, value):
    return name + '=' + str(value)
Which alternative is preferable? That’s a style question; my usual course is to avoid using lambda.
One reason for my preference is that lambda is quite limited in the functions it can define. The result has to be computable as a single expression, which means you can’t have multiway if... elif... else comparisons or try... except statements. If you try to do too much in a lambda expression, you’ll end up with an overly complicated expression that’s hard to read. Quick, what’s the following code doing?
import functools
total = functools.reduce(lambda a, b: (0, a[1] + b[1]), items)[1]
You can figure it out, but it takes time to disentangle the expression to figure out what’s going on. Using a short nested def statement makes things a little bit better:
import functools
def combine(a, b):
    return 0, a[1] + b[1]
total = functools.reduce(combine, items)[1]
But it would be best of all if I had simply used a for loop:
total = 0
for a, b in items:
    total += b
Or the sum() built-in and a generator expression:
total = sum(b for a, b in items)
Many uses of functools.reduce() are clearer when written as for loops.
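To confirm that the four versions above compute the same thing, here is a runnable sketch. The contents of items are an assumption made for illustration (a list of name and count pairs), since the original snippets leave items undefined:

```python
import functools

# Assumed shape of items: (name, count) pairs.
items = [('apples', 3), ('pears', 2), ('plums', 5)]

# The lambda version: the first tuple slot is a dummy, the second is the sum.
total_lambda = functools.reduce(
    lambda a, b: (0, a[1] + b[1]), items)[1]

# The same logic with a named helper.
def combine(a, b):
    return 0, a[1] + b[1]

total_def = functools.reduce(combine, items)[1]

# The plain loop.
total_loop = 0
for name, count in items:
    total_loop += count

# The sum() built-in with a generator expression.
total_sum = sum(count for name, count in items)

print(total_lambda, total_def, total_loop, total_sum)  # 10 10 10 10
```

Note that the two reduce() versions raise TypeError on an empty list because no initial value is given, while the loop and sum() versions simply produce 0.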
Fredrik Lundh once suggested the following set of rules for refactoring uses of lambda:
1. Write a lambda function.
2. Write a comment explaining what the heck that lambda does.
3. Study the comment for a while, and think of a name that captures the essence of the comment.
4. Convert the lambda to a def statement, using that name.
5. Remove the comment.
I really like these rules, but you’re free to disagree about whether this lambda-free style is better.
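As an illustration of the rules in action, here is a hypothetical before-and-after; the word list and the function name length_then_alpha are invented for this sketch:

```python
words = ['pear', 'fig', 'banana']

# Steps 1 and 2: a lambda plus a comment explaining it.
# Sort key: shorter words first, ties broken alphabetically.
by_lambda = sorted(words, key=lambda w: (len(w), w))

# Steps 3 to 5: the comment becomes the name, and the comment goes away.
def length_then_alpha(word):
    return (len(word), word)

by_def = sorted(words, key=length_then_alpha)

print(by_lambda)  # ['fig', 'pear', 'banana']
print(by_def)     # ['fig', 'pear', 'banana']
```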
Revision History and Acknowledgements
The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: Ian Bicking, Nick Coghlan, Nick Efford, Raymond Hettinger, Jim Jewett, Mike Krell, Leandro Lameiro, Jussi Salmela, Collin Winter, Blake Winton.
Version 0.1: posted June 30 2006.
Version 0.11: posted July 1 2006. Typo fixes.
Version 0.2: posted July 10 2006. Merged genexp and listcomp sections into one. Typo fixes.
Version 0.21: Added more references suggested on the tutor mailing list.
Version 0.30: Adds a section on the functional module written by Collin Winter; adds short section on the operator module; a few other edits.
References
General
Structure and Interpretation of Computer Programs, by Harold Abelson and Gerald Jay Sussman with Julie Sussman. Full text at https://mitpress.mit.edu/sicp/. In this classic textbook of computer science, chapters 2 and 3 discuss the use of sequences and streams to organize the data flow inside a program. The book uses Scheme for its examples, but many of the design approaches described in these chapters are applicable to functional-style Python code.
http://www.defmacro.org/ramblings/fp.html: A general introduction to functional programming that uses Java examples and has a lengthy historical introduction.
https://en.wikipedia.org/wiki/Functional_programming: General Wikipedia entry describing functional programming.
https://en.wikipedia.org/wiki/Coroutine: Entry for coroutines.
https://en.wikipedia.org/wiki/Currying: Entry for the concept of currying.
Python-specific
http://gnosis.cx/TPiP/: The first chapter of David Mertz’s book Text Processing in Python discusses functional programming for text processing, in the section titled “Utilizing Higher-Order Functions in Text Processing”.
Mertz also wrote a 3-part series of articles on functional programming for IBM’s DeveloperWorks site; see part 1, part 2, and part 3.
Python documentation
Documentation for the itertools module.
Documentation for the functools module.
Documentation for the operator module.
PEP 289: “Generator Expressions”
PEP 342: “Coroutines via Enhanced Generators” describes the new generator features in Python 2.5.