$bucket (aggregation)

Definition

$bucket

New in version 3.4.

Categorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries, and outputs one document per bucket. Each output document contains an _id field whose value specifies the inclusive lower bound of the bucket. The output option specifies the fields included in each output document.

$bucket only produces output documents for buckets that contain at least one input document.

Syntax

{
  $bucket: {
    groupBy: <expression>,
    boundaries: [ <lowerbound1>, <lowerbound2>, ... ],
    default: <literal>,
    output: {
      <output1>: { <$accumulator expression> },
      ...
      <outputN>: { <$accumulator expression> }
    }
  }
}

The $bucket document contains the following fields:

Field	Type	Description
groupBy	expression	An expression to group documents by. To specify a field path, prefix the field name with a dollar sign `$` and enclose it in quotes. Unless `$bucket` includes a `default` specification, each input document must resolve the `groupBy` field path or expression to a value that falls within one of the ranges specified by `boundaries`.
boundaries	array	An array of values based on the `groupBy` expression that specify the boundaries for each bucket. Each adjacent pair of values acts as the inclusive lower boundary and the exclusive upper boundary for the bucket. You must specify at least two boundaries. The specified values must be in ascending order and all of the same type. The exception is if the values are of mixed numeric types, such as `[ 10, NumberLong(20), NumberInt(30) ]`. Example: an array of `[ 0, 5, 10 ]` creates two buckets: [0, 5) with inclusive lower bound `0` and exclusive upper bound `5`, and [5, 10) with inclusive lower bound `5` and exclusive upper bound `10`.
default	literal	Optional. A literal that specifies the `_id` of an additional bucket that contains all documents whose `groupBy` expression result does not fall into a bucket specified by `boundaries`. If unspecified, each input document must resolve the `groupBy` expression to a value within one of the bucket ranges specified by `boundaries` or the operation throws an error. The `default` value must be less than the lowest `boundaries` value, or greater than or equal to the highest `boundaries` value. The `default` value can be of a different type than the entries in `boundaries`.
output	document	Optional. A document that specifies the fields to include in the output documents in addition to the `_id` field. To specify the field to include, you must use accumulator expressions, in the form `<outputfield1>: { <accumulator>: <expression1> }, ... <outputfieldN>: { <accumulator>: <expressionN> }`. If you do not specify an `output` document, the operation returns a `count` field containing the number of documents in each bucket. If you specify an `output` document, only the fields specified in the document are returned; i.e. the `count` field is not returned unless it is explicitly included in the `output` document.
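
The half-open ranges that boundaries defines can be illustrated outside MongoDB. The following is a minimal sketch in plain JavaScript, not MongoDB's implementation: the helper name findBucketId is hypothetical, and it assumes numeric values and boundaries already in ascending order.

```javascript
// Hypothetical helper: returns the _id (inclusive lower bound) of the bucket
// that contains `value`, or null when the value falls outside every range.
// Assumes numeric boundaries in ascending order; not MongoDB's implementation.
function findBucketId(value, boundaries) {
  if (boundaries.length < 2) {
    throw new Error("at least two boundaries are required");
  }
  for (let i = 0; i < boundaries.length - 1; i++) {
    const lower = boundaries[i];
    const upper = boundaries[i + 1];
    // Lower bound is inclusive, upper bound is exclusive.
    if (value >= lower && value < upper) {
      return lower;
    }
  }
  return null;
}

const exampleBoundaries = [0, 5, 10];
console.log(findBucketId(0, exampleBoundaries));   // 0
console.log(findBucketId(4.9, exampleBoundaries)); // 0
console.log(findBucketId(5, exampleBoundaries));   // 5
console.log(findBucketId(10, exampleBoundaries));  // null
```

The value 10 matches no range because the highest boundary is only ever an exclusive upper bound; in a real pipeline such a document needs a default bucket.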

Behavior

$bucket requires at least one of the following conditions to be met or the operation throws an error:

Each input document resolves the groupBy expression to a value within one of the bucket ranges specified by boundaries, or
A default value is specified to bucket documents whose groupBy values are outside of the boundaries or of a different BSON type than the values in boundaries.

If the groupBy expression resolves to an array or a document, $bucket arranges the input documents into buckets using the comparison logic from $sort.
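
The error rule above can be modeled with a short plain-JavaScript sketch. This is a simplified illustration, not MongoDB's implementation: simulateBucket and its parameters are hypothetical names, only numeric values are treated as falling inside a range, and the result order (boundaries first, default bucket last) is a choice made for this sketch.

```javascript
// Simplified model of the $bucket error rule; not MongoDB's implementation.
// `simulateBucket` is a hypothetical name. Non-numeric values (including
// undefined, standing in for a missing field) never match a range here.
function simulateBucket(values, boundaries, defaultId) {
  const counts = new Map();
  for (const value of values) {
    let id = defaultId;
    for (let i = 0; i < boundaries.length - 1; i++) {
      const inRange =
        typeof value === "number" &&
        value >= boundaries[i] &&
        value < boundaries[i + 1];
      if (inRange) {
        id = boundaries[i];
        break;
      }
    }
    if (id === undefined) {
      // Neither condition is met: no matching range and no default bucket.
      throw new Error("value " + String(value) + " matches no bucket and no default is set");
    }
    counts.set(id, (counts.get(id) || 0) + 1);
  }
  // Emit only buckets that received at least one value.
  const ids = boundaries.slice(0, -1).concat(defaultId === undefined ? [] : [defaultId]);
  return ids
    .filter((id) => counts.has(id))
    .map((id) => ({ _id: id, count: counts.get(id) }));
}

const sampleYears = [1926, 1902, undefined, 1931];
const withDefault = simulateBucket(sampleYears, [1890, 1910, 1920, 1940], "Unknown");
console.log(withDefault);
// Three buckets: 1890 (count 1), 1920 (count 2), "Unknown" (count 1).
// The empty [1910, 1920) bucket is not emitted.
```

Calling simulateBucket with the same values but without a third argument throws, which mirrors the case where neither condition holds.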

Examples

Bucket by Year and Filter by Bucket Results

From the mongo shell, create a sample collection named artists with the following documents:

db.artists.insertMany([
  { "_id" : 1, "last_name" : "Bernard", "first_name" : "Emil", "year_born" : 1868, "year_died" : 1941, "nationality" : "France" },
  { "_id" : 2, "last_name" : "Rippl-Ronai", "first_name" : "Joszef", "year_born" : 1861, "year_died" : 1927, "nationality" : "Hungary" },
  { "_id" : 3, "last_name" : "Ostroumova", "first_name" : "Anna", "year_born" : 1871, "year_died" : 1955, "nationality" : "Russia" },
  { "_id" : 4, "last_name" : "Van Gogh", "first_name" : "Vincent", "year_born" : 1853, "year_died" : 1890, "nationality" : "Holland" },
  { "_id" : 5, "last_name" : "Maurer", "first_name" : "Alfred", "year_born" : 1868, "year_died" : 1932, "nationality" : "USA" },
  { "_id" : 6, "last_name" : "Munch", "first_name" : "Edvard", "year_born" : 1863, "year_died" : 1944, "nationality" : "Norway" },
  { "_id" : 7, "last_name" : "Redon", "first_name" : "Odilon", "year_born" : 1840, "year_died" : 1916, "nationality" : "France" },
  { "_id" : 8, "last_name" : "Diriks", "first_name" : "Edvard", "year_born" : 1855, "year_died" : 1930, "nationality" : "Norway" }
])

The following operation groups the documents into buckets according to the year_born field and filters based on the count of documents in the buckets:

db.artists.aggregate( [
  // First Stage
  {
    $bucket: {
      groupBy: "$year_born",                        // Field to group by
      boundaries: [ 1840, 1850, 1860, 1870, 1880 ], // Boundaries for the buckets
      default: "Other",                             // Bucket id for documents which do not fall into a bucket
      output: {                                     // Output for each bucket
        "count": { $sum: 1 },
        "artists" :
          {
            $push: {
              "name": { $concat: [ "$first_name", " ", "$last_name"] },
              "year_born": "$year_born"
            }
          }
      }
    }
  },
  // Second Stage
  {
    $match: { count: {$gt: 3} }
  }
] )

First Stage

The $bucket stage groups the documents into buckets by the year_born field. The buckets have the following boundaries:

[1840, 1850) with inclusive lower bound 1840 and exclusive upper bound 1850.
[1850, 1860) with inclusive lower bound 1850 and exclusive upper bound 1860.
[1860, 1870) with inclusive lower bound 1860 and exclusive upper bound 1870.
[1870, 1880) with inclusive lower bound 1870 and exclusive upper bound 1880.

If a document did not contain the year_born field or its year_born field was outside the ranges above, it would be placed in the default bucket with the _id value "Other".

The stage includes the output document to determine the fields to return:

_id	Inclusive lower bound of the bucket.
count	Count of documents in the bucket.
artists	Array of documents containing information on each artist in the bucket. Each document contains the artist's name, which is a concatenation (i.e. $concat) of the artist's first_name and last_name, and the artist's year_born.

This stage passes the following documents to the next stage:

{ "_id" : 1840, "count" : 1, "artists" : [ { "name" : "Odilon Redon", "year_born" : 1840 } ] }

{ "_id" : 1850, "count" : 2, "artists" : [ { "name" : "Vincent Van Gogh", "year_born" : 1853 },
                                           { "name" : "Edvard Diriks", "year_born" : 1855 } ] }

{ "_id" : 1860, "count" : 4, "artists" : [ { "name" : "Emil Bernard", "year_born" : 1868 },
                                           { "name" : "Joszef Rippl-Ronai", "year_born" : 1861 },
                                           { "name" : "Alfred Maurer", "year_born" : 1868 },
                                           { "name" : "Edvard Munch", "year_born" : 1863 } ] }

{ "_id" : 1870, "count" : 1, "artists" : [ { "name" : "Anna Ostroumova", "year_born" : 1871 } ] }

Second Stage

The $match stage filters the output from the previous stage to return only buckets that contain more than 3 documents.

The operation returns the following document:

{ "_id" : 1860, "count" : 4, "artists" :
  [
    { "name" : "Emil Bernard", "year_born" : 1868 },
    { "name" : "Joszef Rippl-Ronai", "year_born" : 1861 },
    { "name" : "Alfred Maurer", "year_born" : 1868 },
    { "name" : "Edvard Munch", "year_born" : 1863 }
  ]
}
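
As a cross-check of the counts in this example, the two stages can be approximated in plain JavaScript. This sketch is not how MongoDB evaluates the pipeline: it relies on the boundaries here being evenly spaced decades, so flooring each year to its decade gives the same _id that $bucket assigns. The variable names are illustrative.

```javascript
// year_born values copied from the sample artists collection.
const yearsBorn = [1868, 1861, 1871, 1853, 1868, 1863, 1840, 1855];

// First stage (approximation): the boundaries are evenly spaced decades,
// so flooring to the decade reproduces each bucket's inclusive lower bound.
const countsByDecade = {};
for (const year of yearsBorn) {
  const decade = Math.floor(year / 10) * 10;
  countsByDecade[decade] = (countsByDecade[decade] || 0) + 1;
}

// Second stage: equivalent of $match: { count: { $gt: 3 } }.
const matched = Object.entries(countsByDecade)
  .filter(([, count]) => count > 3)
  .map(([decade, count]) => ({ _id: Number(decade), count: count }));

console.log(countsByDecade); // counts: 1840 -> 1, 1850 -> 2, 1860 -> 4, 1870 -> 1
console.log(matched);        // a single bucket: _id 1860 with count 4
```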

Use $bucket with $facet to Bucket by Multiple Fields

You can use the $facet stage to perform multiple $bucket aggregations in a single stage.

From the mongo shell, create a sample collection named artwork with the following documents:

db.artwork.insertMany([
  { "_id" : 1, "title" : "The Pillars of Society", "artist" : "Grosz", "year" : 1926,
      "price" : NumberDecimal("199.99") },
  { "_id" : 2, "title" : "Melancholy III", "artist" : "Munch", "year" : 1902,
      "price" : NumberDecimal("280.00") },
  { "_id" : 3, "title" : "Dancer", "artist" : "Miro", "year" : 1925,
      "price" : NumberDecimal("76.04") },
  { "_id" : 4, "title" : "The Great Wave off Kanagawa", "artist" : "Hokusai",
      "price" : NumberDecimal("167.30") },
  { "_id" : 5, "title" : "The Persistence of Memory", "artist" : "Dali", "year" : 1931,
      "price" : NumberDecimal("483.00") },
  { "_id" : 6, "title" : "Composition VII", "artist" : "Kandinsky", "year" : 1913,
      "price" : NumberDecimal("385.00") },
  { "_id" : 7, "title" : "The Scream", "artist" : "Munch", "year" : 1893
      /* No price*/ },
  { "_id" : 8, "title" : "Blue Flower", "artist" : "O'Keefe", "year" : 1918,
      "price" : NumberDecimal("118.42") }
])

The following operation uses two $bucket stages within a $facet stage to create two groupings, one by price and the other by year:

db.artwork.aggregate( [
  {
    $facet: {                               // Top-level $facet stage
      "price": [                            // Output field 1
        {
          $bucket: {
              groupBy: "$price",            // Field to group by
              boundaries: [ 0, 200, 400 ],  // Boundaries for the buckets
              default: "Other",             // Bucket id for documents which do not fall into a bucket
              output: {                     // Output for each bucket
                "count": { $sum: 1 },
                "artwork" : { $push: { "title": "$title", "price": "$price" } },
                "averagePrice": { $avg: "$price" }
              }
          }
        }
      ],
      "year": [                                      // Output field 2
        {
          $bucket: {
            groupBy: "$year",                        // Field to group by
            boundaries: [ 1890, 1910, 1920, 1940 ],  // Boundaries for the buckets
            default: "Unknown",                      // Bucket id for documents which do not fall into a bucket
            output: {                                // Output for each bucket
              "count": { $sum: 1 },
              "artwork": { $push: { "title": "$title", "year": "$year" } }
            }
          }
        }
      ]
    }
  }
] )

First Facet

The first facet groups the input documents by price. The buckets have the following boundaries:

[0, 200) with inclusive lower bound 0 and exclusive upper bound 200.
[200, 400) with inclusive lower bound 200 and exclusive upper bound 400.
“Other”, the default bucket containing documents without prices or prices outside the ranges above.

The $bucket stage includes the output document to determine the fields to return:

`_id`	Inclusive lower bound of the bucket.
`count`	Count of documents in the bucket.
`artwork`	Array of documents containing information on each artwork in the bucket.
`averagePrice`	Uses the `$avg` operator to display the average price of all artwork in the bucket.

Second Facet

The second facet groups the input documents by year. The buckets have the following boundaries:

[1890, 1910) with inclusive lower bound 1890 and exclusive upper bound 1910.
[1910, 1920) with inclusive lower bound 1910 and exclusive upper bound 1920.
[1920, 1940) with inclusive lower bound 1920 and exclusive upper bound 1940.
“Unknown”, the default bucket containing documents without years or years outside the ranges above.

The $bucket stage includes the output document to determine the fields to return:

`count`	Count of documents in the bucket.
`artwork`	Array of documents containing information on each artwork in the bucket.

Output

The operation returns the following document:

{
  "price" : [ // Output of first facet
    {
      "_id" : 0,
      "count" : 4,
      "artwork" : [
        { "title" : "The Pillars of Society", "price" : NumberDecimal("199.99") },
        { "title" : "Dancer", "price" : NumberDecimal("76.04") },
        { "title" : "The Great Wave off Kanagawa", "price" : NumberDecimal("167.30") },
        { "title" : "Blue Flower", "price" : NumberDecimal("118.42") }
      ],
      "averagePrice" : NumberDecimal("140.4375")
    },
    {
      "_id" : 200,
      "count" : 2,
      "artwork" : [
        { "title" : "Melancholy III", "price" : NumberDecimal("280.00") },
        { "title" : "Composition VII", "price" : NumberDecimal("385.00") }
      ],
      "averagePrice" : NumberDecimal("332.50")
    },
    {
      // Includes documents without prices and prices greater than 400
      "_id" : "Other",
      "count" : 2,
      "artwork" : [
        { "title" : "The Persistence of Memory", "price" : NumberDecimal("483.00") },
        { "title" : "The Scream" }
      ],
      "averagePrice" : NumberDecimal("483.00")
    }
  ],
  "year" : [ // Output of second facet
    {
      "_id" : 1890,
      "count" : 2,
      "artwork" : [
        { "title" : "Melancholy III", "year" : 1902 },
        { "title" : "The Scream", "year" : 1893 }
      ]
    },
    {
      "_id" : 1910,
      "count" : 2,
      "artwork" : [
        { "title" : "Composition VII", "year" : 1913 },
        { "title" : "Blue Flower", "year" : 1918 }
      ]
    },
    {
      "_id" : 1920,
      "count" : 3,
      "artwork" : [
        { "title" : "The Pillars of Society", "year" : 1926 },
        { "title" : "Dancer", "year" : 1925 },
        { "title" : "The Persistence of Memory", "year" : 1931 }
      ]
    },
    {
      // Includes documents without a year
      "_id" : "Unknown",
      "count" : 1,
      "artwork" : [
        { "title" : "The Great Wave off Kanagawa" }
      ]
    }
  ]
}
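
The averagePrice values in the first facet can be verified by hand. The sketch below redoes the arithmetic in plain JavaScript, using integer cents to sidestep floating-point rounding; averageCents and pricesByBucket are illustrative names, not part of any MongoDB API. It assumes the documented $avg behavior of ignoring non-numeric and missing values, which is why the "Other" bucket averages to 483.00 even though it holds two documents.

```javascript
// Prices from the sample artwork collection, grouped as in the first facet
// and expressed in integer cents. undefined marks the document with no price.
const pricesByBucket = {
  "0": [19999, 7604, 16730, 11842],
  "200": [28000, 38500],
  "Other": [48300, undefined],
};

// Mirrors $avg ignoring non-numeric values: average over numeric entries only.
function averageCents(prices) {
  const numeric = prices.filter((p) => typeof p === "number");
  const total = numeric.reduce((sum, p) => sum + p, 0);
  return total / numeric.length;
}

for (const [id, prices] of Object.entries(pricesByBucket)) {
  console.log(id, averageCents(prices) / 100);
}
// 0 140.4375
// 200 332.5
// Other 483
```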

See also

$bucketAuto