Aggregation Pipeline

The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms them into aggregated results. For example, the following pipeline has two stages:

db.orders.aggregate([
   { $match: { status: "A" } },
   { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])

First Stage: The $match stage filters the documents by the status field and passes to the next stage those documents that have status equal to "A".

Second Stage: The $group stage groups the documents by the cust_id field to calculate the sum of the amount for each unique cust_id.
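To make the two stages concrete, the following sketch reproduces their effect on an in-memory array in plain JavaScript. It does not call MongoDB; the sample documents and their values are hypothetical, and only the field names (cust_id, amount, status) come from the example above.

```javascript
// Hypothetical sample documents shaped like the "orders" collection.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Stage 1, like { $match: { status: "A" } }: keep only matching documents.
const matched = orders.filter((doc) => doc.status === "A");

// Stage 2, like { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }:
// keep one running total per distinct cust_id.
const totals = new Map();
for (const doc of matched) {
  totals.set(doc.cust_id, (totals.get(doc.cust_id) || 0) + doc.amount);
}
const results = [...totals].map(([_id, total]) => ({ _id, total }));

console.log(results);
// [ { _id: 'A123', total: 750 }, { _id: 'B212', total: 200 } ]
```

The order of the output here follows insertion order in the Map; a real $group stage does not guarantee any particular output order.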

Pipeline

The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the pipeline. Pipeline stages do not need to produce one output document for every input document; for example, some stages may generate new documents or filter out documents.

Pipeline stages can appear multiple times in the pipeline, with the exception of the $out, $merge, and $geoNear stages. For a list of all available stages, see Aggregation Pipeline Stages.

MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregate command to run the aggregation pipeline.
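As a rough illustration of the two entry points, the sketch below builds one pipeline and passes it both to the shell helper and to the aggregate command through db.runCommand(). The collection name orders and its fields are taken from the introductory example; the guard around db is an addition so the snippet can also be loaded outside the mongo shell.

```javascript
// The pipeline is an ordinary array of stage documents.
const pipelineStages = [
  { $match: { status: "A" } },
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
];

// The aggregate command names the collection, carries the same pipeline,
// and takes a cursor document (empty here to accept the default batch size).
const aggregateCommand = {
  aggregate: "orders",
  pipeline: pipelineStages,
  cursor: {},
};

// `db` exists only inside the mongo shell, so skip the calls elsewhere.
if (typeof db !== "undefined") {
  db.orders.aggregate(pipelineStages); // shell helper method
  db.runCommand(aggregateCommand);     // underlying database command
}
```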

For examples of aggregation pipeline usage, see Aggregation with User Preference Data and Aggregation with the Zip Code Data Set.

Starting in MongoDB 4.2, you can use the aggregation pipeline for updates in the following commands and shell methods:

Command	`mongo` Shell Methods
`findAndModify`	db.collection.findOneAndUpdate(), db.collection.findAndModify()
`update`	db.collection.updateOne(), db.collection.updateMany(), db.collection.update(), Bulk.find.update(), Bulk.find.updateOne(), Bulk.find.upsert()
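A minimal sketch of such an update follows. The second argument to updateOne() is an array of stages rather than a plain update document. The fields price, tax, total, and modified are hypothetical and serve only to show a value computed from other fields of the same document.

```javascript
// Update expressed as an aggregation pipeline (MongoDB 4.2 or later).
// $set can reference existing fields of the document being updated;
// $$NOW is the built-in variable holding the current datetime.
const updatePipeline = [
  {
    $set: {
      total: { $add: ["$price", "$tax"] }, // hypothetical fields
      modified: "$$NOW",
    },
  },
];

// `db` exists only inside the mongo shell, so skip the call elsewhere.
if (typeof db !== "undefined") {
  db.orders.updateOne({ _id: 1 }, updatePipeline);
}
```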

See also

Updates with Aggregation Pipeline

Pipeline Expressions

Some pipeline stages take a pipeline expression as the operand. Pipeline expressions specify the transformation to apply to the input documents. Expressions have a document structure and can contain other expressions.

Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other documents: expression operations provide in-memory transformation of documents.

Generally, expressions are stateless and are only evaluated when seen by the aggregation process, with one exception: accumulator expressions.

The accumulators, used in the $group stage, maintain their state (e.g. totals, maximums, minimums, and related data) as documents progress through the pipeline. Some accumulators are also available in the $project stage; however, when used in the $project stage, the accumulators do not maintain their state across documents.
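The difference can be sketched with $sum used in each stage. In $group it adds up a field across all documents in a group; in $project it is evaluated separately for every document, here over an array field. The itemPrices array field is a hypothetical addition for illustration.

```javascript
// In $group, $sum keeps a running total across the documents of each group.
const groupPipeline = [
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
];

// In $project, $sum is computed per document with no state carried over;
// here it sums the elements of a (hypothetical) array field.
const projectPipeline = [
  { $project: { cust_id: 1, itemTotal: { $sum: "$itemPrices" } } },
];

// `db` exists only inside the mongo shell, so skip the calls elsewhere.
if (typeof db !== "undefined") {
  db.orders.aggregate(groupPipeline);   // one output document per cust_id
  db.orders.aggregate(projectPipeline); // one output document per input document
}
```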

Starting in version 4.4, MongoDB provides the $accumulator and $function aggregation operators. These operators let users define custom aggregation expressions in JavaScript.
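As an illustration only, the sketch below uses $function to label each document from its amount field. The output field amountLabel and the threshold of 100 are made up for the example; the body, args, and lang fields are the operator's documented parameters. Server-side JavaScript must be enabled for this to run on a deployment.

```javascript
// Custom expression written in JavaScript (MongoDB 4.4 or later).
const labelPipeline = [
  {
    $addFields: {
      amountLabel: {
        $function: {
          // Receives the values listed in `args`, in order.
          body: function (amount) {
            return amount >= 100 ? "large" : "small";
          },
          args: ["$amount"],
          lang: "js",
        },
      },
    },
  },
];

// `db` exists only inside the mongo shell, so skip the call elsewhere.
if (typeof db !== "undefined") {
  db.orders.aggregate(labelPipeline);
}
```

Built-in expression operators are generally preferable when they can express the same logic; a case like this one could be written with $cond instead.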

For more information on expressions, see Expressions.

Aggregation Pipeline Behavior

In MongoDB, the aggregate command operates on a single collection, logically passing the entire collection into the aggregation pipeline. To optimize the operation, use the following strategies wherever possible to avoid scanning the entire collection.

Pipeline Operators and Indexes

MongoDB’s query planner analyzes an aggregation pipeline to determine whether indexes can be used to improve pipeline performance. For example, the following pipeline stages can take advantage of indexes:

Note

The following pipeline stages do not represent a complete list of all stages that can use an index.

$match

The $match stage can use an index to filter documents if it occurs at the beginning of a pipeline.

$sort

The $sort stage can use an index as long as it is not preceded by a $project, $unwind, or $group stage.

$group

The $group stage can sometimes use an index to find the first document in each group if all of the following criteria are met:

The $group stage is preceded by a $sort stage that sorts the field to group by,
There is an index on the grouped field which matches the sort order, and
The only accumulator used in the $group stage is $first.

See Optimization to Return the First Document of Each Group for an example.
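A pipeline shaped to meet the three criteria might look like the sketch below. The ord_date field and the compound index mentioned in the comment are assumptions for illustration; whether the planner actually applies the optimization on a given deployment is best confirmed with explain output.

```javascript
// Assumes an index whose key order matches the sort, for example:
//   db.orders.createIndex({ cust_id: 1, ord_date: 1 })
const firstPerGroup = [
  // 1. Sort on the grouping field (then on the field to take first).
  { $sort: { cust_id: 1, ord_date: 1 } },
  // 2. Group on the sorted field, using $first as the only accumulator.
  { $group: { _id: "$cust_id", firstOrderDate: { $first: "$ord_date" } } },
];

// `db` exists only inside the mongo shell, so skip the call elsewhere.
if (typeof db !== "undefined") {
  db.orders.aggregate(firstPerGroup);
}
```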

$geoNear

The $geoNear pipeline operator takes advantage of a geospatial index. When used, $geoNear must appear as the first stage in an aggregation pipeline.
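For illustration, a pipeline that starts with $geoNear could be written as follows. The collection name places, the location field, the coordinates, and the 2000 meter limit are all hypothetical; the stage assumes a geospatial index already exists on the collection.

```javascript
// Assumes a geospatial index, for example:
//   db.places.createIndex({ location: "2dsphere" })
const nearbyPipeline = [
  {
    // $geoNear must be the first stage of the pipeline.
    $geoNear: {
      near: { type: "Point", coordinates: [-73.99279, 40.719296] },
      distanceField: "dist.calculated", // output field for the computed distance
      maxDistance: 2000,                // meters, for a GeoJSON point
      spherical: true,
    },
  },
  { $limit: 5 },
];

// `db` exists only inside the mongo shell, so skip the call elsewhere.
if (typeof db !== "undefined") {
  db.places.aggregate(nearbyPipeline);
}
```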

Changed in version 3.2: Starting in MongoDB 3.2, indexes can cover an aggregation pipeline. In MongoDB 2.6 and 3.0, indexes could not cover an aggregation pipeline because, even when the pipeline used an index, aggregation still required access to the actual documents.

Early Filtering

If your aggregation operation requires only a subset of the data in a collection, use the $match, $limit, and $skip stages to restrict the documents that enter at the beginning of the pipeline. When placed at the beginning of a pipeline, $match operations use suitable indexes to scan only the matching documents in a collection.

Placing a $match pipeline stage followed by a $sort stage at the start of the pipeline is logically equivalent to a single query with a sort and can use an index. When possible, place $match operators at the beginning of the pipeline.
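The sketch below places $match and $sort ahead of the grouping work and notes the query they correspond to. The ord_date field and the index in the comment are assumptions; the explain call is one way to check which plan the server chooses.

```javascript
// Assumes an index that supports both the filter and the sort, for example:
//   db.orders.createIndex({ status: 1, ord_date: -1 })
const earlyFilterPipeline = [
  { $match: { status: "A" } },   // filter first so later stages see fewer documents
  { $sort: { ord_date: -1 } },   // sort directly after the match
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
];

// The first two stages are logically like:
//   db.orders.find({ status: "A" }).sort({ ord_date: -1 })

// `db` exists only inside the mongo shell, so skip the call elsewhere.
if (typeof db !== "undefined") {
  // Inspect the chosen plan instead of running the aggregation.
  printjson(db.orders.explain().aggregate(earlyFilterPipeline));
}
```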

Considerations

Sharded Collections

The aggregation pipeline supports operations on sharded collections. See Aggregation Pipeline and Sharded Collections.

Aggregation Pipeline vs Map-Reduce

The aggregation pipeline provides better performance and a more coherent interface than map-reduce.

Various map-reduce operations can be rewritten using aggregation pipeline operators, such as $group and $merge. For map-reduce operations that require custom functionality, MongoDB provides the $accumulator and $function aggregation operators starting in version 4.4. These operators let users define custom aggregation expressions in JavaScript.

See Map-Reduce Examples for details.

Limitations

The aggregation pipeline has some limitations on value types and result size. See Aggregation Pipeline Limits for details.

Pipeline Optimization

The aggregation pipeline has an internal optimization phase that provides improved performance for certain sequences of operators. For details, see Aggregation Pipeline Optimization.