$accumulator (aggregation)

On this page本页内容

Definition定义

$accumulator

New in version 4.4.版本4.4中的新功能。

Defines a custom accumulator operator.定义自定义累加器运算符Accumulators are operators that maintain their state (e.g. totals, maximums, minimums, and related data) as documents progress through the pipeline.累加器是在文件通过管道时保持其状态(例如,总计、最大值、最小值和相关数据)的运算符。Use the $accumulator operator to execute your own JavaScript functions to implement behavior not supported by the MongoDB Query Language.使用$accumulator运算符执行自己的JavaScript函数,以实现MongoDB查询语言不支持的行为。See also $function.另请参见$function

Important

Executing JavaScript inside of an aggregation operator may decrease performance.在聚合运算符中执行JavaScript可能会降低性能。Only use the $accumulator operator if the provided pipeline operators cannot fulfill your application’s needs.仅当提供的管道运算符无法满足应用程序的需要时,才使用$accumulator运算符。

$accumulator is available in the following pipeline stages:$accumulator可在以下管道阶段使用:

Syntax语法

The $accumulator operator has the following syntax:$accumulator运算符语法如下所示:

{
  $accumulator: {
    init: <code>,
    initArgs: <array expression>,        // Optional
    accumulate: <code>,
    accumulateArgs: <array expression>,
    merge: <code>,
    finalize: <code>,                    // Optional
    lang: <string>
  }
}
Field字段Type类型Description描述
init String or Code字符串还是代码

Function used to initialize the state.用于初始化状态的函数。The init function receives its arguments from the initArgs array expression.init函数从initArgs数组表达式接收参数。You can specify the function definition as either BSON type Code or String.可以将函数定义指定为BSON类型代码或字符串。

The init function has the following form:init函数具有以下形式:

function (<initArg1>, <initArg2>, ...) {
  ...
  return <initialState>
}
initArgs Array

Optional.可选。Arguments passed to the init function.传递给init函数的参数。

initArgs has the following form:具有以下形式:

[ <initArg1>, <initArg2>, ... ]

Important

When used in a $bucketAuto stage, initArgs cannot refer to the group key (i.e., you cannot use the $<fieldName> syntax).$bucketAuto阶段中使用时,initArgs不能引用组键(即,不能使用$<fieldName>语法)。Instead, in a $bucketAuto stage, you can only specify constant values in initArgs.相反,在$bucketAuto阶段中,只能在initArgs中指定常量值。

accumulate String or Code

Function used to accumulate documents.用于积累文档的函数。The accumulate function receives its arguments from the current state and accumulateArgs array expression.accumulate函数从当前状态和accumulateArgs数组表达式接收参数。The result of the accumulate function becomes the new state.accumulate函数的结果成为新状态。You can specify the function definition as either BSON type Code or String.可以将函数定义指定为BSON类型代码或字符串。

The accumulate function has the following form:accumulate函数的形式如下:

function(state, <accumArg1>, <accumArg2>, ...) {
  ...
  return <newState>
}
accumulateArgs Array

Arguments passed to the accumulate function.传递给accumulate函数的参数。You can use accumulateArgs to specify what field value(s) to pass to the accumulate function.可以使用accumulateArgs 指定要传递给accumulate函数的字段值。

accumulateArgs has the following form:具有以下形式:

[ <accumArg1>, <accumArg2>, ... ]
merge String or Code字符串或代码

Function used to merge two internal states.用于合并两个内部状态的函数。merge must be either a String or Code BSON type.merge必须是字符串或代码BSON类型。merge returns the combined result of the two merged states.merge返回两个合并状态的合并结果。For information on when the merge function is called, see Merge Two States with $merge.有关何时调用merge函数的信息,请参阅使用$merge合并两个状态

The merge function has the following form:merge函数的形式如下:

function (<state1>, <state2>) {
  <logic to merge state1 and state2>
  return <newState>
}
finalize String or Code字符串或代码

Optional.可选。Function used to update the result of the accumulation.用于更新累积结果的函数。

The finalize function has the following form:finalize函数具有以下形式:

function (state) {
  ...
  return <finalState>
}
lang String

The language used in the $accumulator code.$accumulator代码中使用的语言。

Important

Currently, the only supported value for lang is js.目前,lang唯一支持的值是js

Behavior行为

The following steps outline how the $accumulator operator processes documents:以下步骤概述了$accumulator运算符如何处理文档:

  1. The operator begins at an initial state, defined by the init function.运算符从初始状态开始,初始状态由init函数定义。
  2. For each document, the operator updates the state based on the accumulate function.对于每个文档,运算符都会根据accumulate函数更新状态。The accumulate function’s first argument is the current state, and additional arguments are be specified in the accumulateArgs array.accumulate函数的第一个参数是当前状态,其他参数可以在accumulateArgs数组中指定。
  3. When the operator needs to merge multiple intermediate states, it executes the merge function. 当运算符需要合并多个中间状态时,它会执行merge函数。For more information on when the merge function is called, see Merge Two States with $merge.有关何时调用merge函数的更多信息,请参阅使用$merge合并两个状态
  4. If a finalize function has been defined, once all documents have been processed and the state has been updated accordingly, finalize converts the state to a final output.如果定义了finalize函数,则在处理完所有文档并相应更新状态后,finalize会将状态转换为最终输出。

Merge Two States with $merge$merge合并两个州

As part of its internal operations, the $accumulator operator may need to merge two separate, intermediate states.作为内部操作的一部分,$accumulator运算符可能需要合并两个独立的中间状态。The merge function specifies how the operator should merge two states.merge函数指定运算符应该如何合并两个状态。

For example, $accumulator may need to combine two states when:例如,在以下情况下,$accumulator可能需要组合两种状态:

  • $accumulator is run on a sharded cluster.$accumulator在分片集群上运行。The operator needs to merge the results from each shard to obtain the final result.运算符需要合并每个碎片的结果以获得最终结果。
  • A single $accumulator operation exceeds its specified memory limit. 单个$accumulator操作超过其指定的内存限制。If you specify the allowDiskUse option, the operator stores the in-progress operation on disk and finishes the operation in memory. 如果指定allowDiskUse选项,运算符会将正在进行的操作存储在磁盘上,并在内存中完成该操作。Once the operation finishes, the results from disk and memory are merged together using the merge function.操作完成后,使用merge函数将磁盘和内存的结果合并在一起。

Note

The merge function always merges two states at a time.merge函数总是一次合并两个状态。In the event that more than two states must be merged, the resulting merge of two states is merged with a single state.如果必须合并两个以上的状态,则两个状态的合并结果将与单个状态合并。This process repeats until all states are merged.此过程将重复,直到所有状态合并。

Javascript Enabled启用Javascript

To use $accumulator, you must have server-side scripting enabled.要使用$accumulator,必须启用服务器端脚本。

If you do not use $accumulator (or $function, $where, or mapReduce), disable server-side scripting:如果不使用$accumulator(或$function$wheremapReduce),请禁用服务器端脚本:

See also 另请参阅Run MongoDB with Secure Configuration Options使用安全配置选项运行MongoDB.

Examples示例

Use $accumulator to Implement the $avg Operator使用$accumulator实现$avg运算符

Note

This example walks through using the $accumulator operator to implement the $avg operator, which is already supported by MongoDB. 本例介绍如何使用$accumulator运算符实现$avg运算符,MongoDB已经支持该运算符。The goal of this example is not to implement new functionality, but to illustrate the behavior and syntax of the $accumulator operator with familiar logic.本例的目标不是实现新功能,而是用熟悉的逻辑说明$accumulator运算符的行为和语法。

From the mongo shell, create a sample collection named books with the following documents:mongo shell创建一个名为books的样本集合,其中包含以下文档:

db.books.insertMany([
  { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])

The following operation groups the documents by author, and uses $accumulator to compute the average number of copies across books for each author:下面的操作groupsauthor对文档进行分组,并使用$accumulator计算每个作者在不同书籍中的平均副本数:

db.books.aggregate([
{
  $group :
  {
    _id : "$author",
    avgCopies:
    {
      $accumulator:
      {
        init: function() {                        // Set the initial state
          return { count: 0, sum: 0 }
        },

        accumulate: function(state, numCopies) {  // Define how to update the state
          return {
            count: state.count + 1,
            sum: state.sum + numCopies
          }
        },

        accumulateArgs: ["$copies"],              // Argument required by the accumulate function

        merge: function(state1, state2) {         // When the operator performs a merge,
          return {                                // add the fields from the two states
            count: state1.count + state2.count,
            sum: state1.sum + state2.sum
          }
        },

        finalize: function(state) {               // After collecting the results from all documents,
          return (state.sum / state.count)        // calculate the average
        },
        lang: "js"
      }
    }
  }
}
])

Result

This operation returns the following result:此操作返回以下结果:

{ "_id" : "Dante", "avgCopies" : 1.6666666666666667 }
{ "_id" : "Homer", "avgCopies" : 10 }

Behavior行为

The $accumulator defines an initial state where count and sum are both set to 0. $accumulator定义了一个初始状态,其中countsum都设置为0For each document that the $accumulator processes, it updates the state by:对于$accumulator处理的每个文档,它通过以下方式更新状态:

  • Incrementing the count by 1 andcount增加1,然后
  • Adding the values of the document’s copies field to the sum. 将文档copies字段的值添加到sumThe accumulate function can access the copies field because it is passed in the accumulateArgs field.accumulate函数可以访问copies字段,因为它是在accumulateArgs字段中传递的。

With each document that is processed, the accumulate function returns the updated state.对于处理的每个文档,accumulate函数都会返回更新的状态。

Once all documents have been processed, the finalize function divides the sum of the copies by the count of documents to obtain the average. 处理完所有文档后,finalize函数将副本总数sum除以文档数count,以获得平均值。This removes the need to keep a running computed average, since the finalize function receives the cumulative sum and count of all documents.由于finalize函数接收所有文档的累积sumcount,因此无需保持运行计算的平均值。

Comparison with $avg$avg比较

This operation is equivalent to the following pipeline, which uses the $avg operator:此操作相当于使用$avg运算符的以下管道:

db.books.aggregate([
{
  $group : {
    _id : "$author",
    avgCopies: { $avg: "$copies" }
  }
}
])

Use initArgs to Vary the Initial State by Group使用initArgs按组更改初始状态

You can use the initArgs option in to vary the initial state of $accumulator. 您可以在中使用initArgs选项来更改$accumulator的初始状态。This can be useful if you want to, for example:如果您想,这可能很有用,例如:

  • Use the value of a field which is not in your state to affect your state, or使用不在您的状态中的字段的值来影响您的状态,或
  • Set the initial state to a different value based on the group being processed.根据正在处理的组,将初始状态设置为不同的值。

From the mongo shell, create a sample collection named restaurants with the following documents:mongo shell中,创建一个名为restaurants的样本集合,其中包含以下文档:

db.restaurants.insertMany([
  { "_id" : 1, "name" : "Food Fury", "city" : "Bettles", "cuisine" : "American" },
  { "_id" : 2, "name" : "Meal Macro", "city" : "Bettles", "cuisine" : "Chinese" },
  { "_id" : 3, "name" : "Big Crisp", "city" : "Bettles", "cuisine" : "Latin" },
  { "_id" : 4, "name" : "The Wrap", "city" : "Onida", "cuisine" : "American" },
  { "_id" : 5, "name" : "Spice Attack", "city" : "Onida", "cuisine" : "Latin" },
  { "_id" : 6, "name" : "Soup City", "city" : "Onida", "cuisine" : "Chinese" },
  { "_id" : 7, "name" : "Crave", "city" : "Pyote", "cuisine" : "American" },
  { "_id" : 8, "name" : "The Gala", "city" : "Pyote", "cuisine" : "Chinese" }
])

Suppose an application allows users to query this data to find restaurants.假设一个应用程序允许用户查询这些数据来查找餐馆。It may be useful to show more results for the city where the user lives.为用户居住的城市显示更多结果可能很有用。For this example, we assume that the user’s city is called in a variable called userProfileCity.在本例中,我们假设用户的城市在名为userProfileCity的变量中调用。

The following aggregation pipeline groups the documents by city.以下聚合管道groupscity对文档进行分组。The operation uses the $accumulator to display a different number of results from each city depending on whether the restaurant’s city matches the city in the user’s profile:该操作使用$accumulator显示来自每个城市的不同数量的结果,具体取决于餐厅的城市是否与用户配置文件中的城市匹配:

Note

To execute this example in the mongo shell, replace <userProfileCity> in the initArgs with a string containing an actual city value, such as Bettles.要在mongo shell中执行此示例,请将initArgs中的<userProfileCity>替换为包含实际城市值的字符串,例如Bettles

db.restaurants.aggregate([
{
  $group :
  {
    _id : { city: "$city" },
    restaurants:
    {
      $accumulator:
      {
        init: function(city, userProfileCity) {        // Set the initial state
          return {
            max: city === userProfileCity ? 3 : 1,     // If the group matches the user's city, return 3 restaurants
            restaurants: []                            // else, return 1 restaurant
          }
        },

        initArgs: ["$city", <userProfileCity>],        // Argument to pass to the init function

        accumulate: function(state, restaurantName) {  // Define how to update the state
          if (state.restaurants.length < state.max) {
            state.restaurants.push(restaurantName);
          }
          return state;
        },

        accumulateArgs: ["$name"],                     // Argument required by the accumulate function

        merge: function(state1, state2) {
          return {
            max: state1.max,
            restaurants: state1.restaurants.concat(state2.restaurants).slice(0, state1.max)
          }
        },

        finalize: function(state) {                   // Adjust the state to only return field we need
          return state.restaurants
        }

        lang: "js"
      }
    }
  }
}
])

Results后果

If the value of userProfileCity is Bettles, this operation returns the following result:如果userProfileCity的值为Bettles,则此操作返回以下结果:

{ "_id" : { "city" : "Bettles" }, "restaurants" : { "restaurants" : [ "Food Fury", "Meal Macro", "Big Crisp" ] } }
{ "_id" : { "city" : "Onida" }, "restaurants" : { "restaurants" : [ "The Wrap" ] } }
{ "_id" : { "city" : "Pyote" }, "restaurants" : { "restaurants" : [ "Crave" ] } }

If the value of userProfileCity is Onida, this operation returns the following result:如果userProfileCity的值为Onida,则此操作返回以下结果:

{ "_id" : { "city" : "Bettles" }, "restaurants" : { "restaurants" : [ "Food Fury" ] } }
{ "_id" : { "city" : "Onida" }, "restaurants" : { "restaurants" : [ "The Wrap", "Spice Attack", "Soup City" ] } }
{ "_id" : { "city" : "Pyote" }, "restaurants" : { "restaurants" : [ "Crave" ] } }

If the value of userProfileCity is Pyote, this operation returns the following result:如果userProfileCity的值为Pyote,则此操作返回以下结果:

{ "_id" : { "city" : "Bettles" }, "restaurants" : { "restaurants" : [ "Food Fury" ] } }
{ "_id" : { "city" : "Onida" }, "restaurants" : { "restaurants" : [ "The Wrap" ] } }
{ "_id" : { "city" : "Pyote" }, "restaurants" : { "restaurants" : [ "Crave", "The Gala" ] } }

If the value of userProfileCity is any other value, this operation returns the following result:如果userProfileCity的值是任何其他值,则此操作返回以下结果:

{ "_id" : { "city" : "Bettles" }, "restaurants" : { "restaurants" : [ "Food Fury" ] } }
{ "_id" : { "city" : "Onida" }, "restaurants" : { "restaurants" : [ "The Wrap" ] } }
{ "_id" : { "city" : "Pyote" }, "restaurants" : { "restaurants" : [ "Crave" ] } }

Behavior行为

The init function defines an initial state containing max and restaurants fields.init函数定义一个初始状态,其中包含max字段和restaurants字段。The max field sets the maximum number of restaurants for that particular group.max字段设置该特定组的最大餐厅数。If the document’s city field matches userProfileCity, that group contains a maximum of 3 restaurants.如果文档的city字段与userProfileCity匹配,则该组最多包含3家餐厅。Otherwise, if the document _id does not match userProfileCity, the group contains at most a single restaurant. 否则,如果文档_iduserProfileCity不匹配,则该组最多包含一家餐厅。The init function receives both the city userProfileCity arguments from the initArgs array.init函数从initArgs数组接收city userProfileCity参数。

For each document that the $accumulator processes, it pushes the name of the restaurant to the restaurants array, provided that name would not put the length of restaurants over the max value. 对于$accumulator处理的每个文档,它都会将餐厅的name推送到restaurants数组,前提是该名称不会使restaurants的长度超过max值。With each document that is processed, the accumulate function returns the updated state.对于处理的每个文档,accumulate函数都会返回更新的状态。

The merge function defines how to merge two states.merge函数定义如何合并两个状态。The function concatenates the restaurant arrays from each state together, and the length of the resulting array is limited using the slice() method to ensure that it does not exceed the max value.该函数将每个状态的restaurant数组连接在一起,并使用slice()方法限制生成数组的长度,以确保其不超过max值。

Once all documents have been processed, the finalize function modifies the resulting state to only return the names of the restaurants. 处理完所有文档后,finalize函数会修改结果状态,只返回餐厅的名称。Without this function, the max field would also be included in the output, which does not fulfill any needs for the application.如果没有此功能,输出中也会包含max字段,这无法满足应用程序的任何需求。