Aggregation with the Zip Code Data Set

The examples in this document use the zipcodes collection. This collection is available at: media.mongodb.org/zips.json. Use mongoimport to load this data set into your mongod instance.

Data Model

Each document in the zipcodes collection has the following form:

{
  "_id": "10280",
  "city": "NEW YORK",
  "state": "NY",
  "pop": 5574,
  "loc": [
    -74.016323,
    40.710537
  ]
}

aggregate() Method

All of the following examples use the aggregate() helper in the mongo shell.

The aggregate() method uses the aggregation pipeline to process documents into aggregated results. An aggregation pipeline consists of stages with each stage processing the documents as they pass along the pipeline. Documents pass through the stages in sequence.

The aggregate() method in the mongo shell provides a wrapper around the aggregate database command. See the documentation for your driver for a more idiomatic interface for data aggregation operations.
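To make the idea of documents passing through stages in sequence more concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how the server implements aggregation. The names runPipeline, matchState, and projectCityPop are hypothetical, and the second sample document is made up for illustration.

```javascript
// Hypothetical in-memory model of a pipeline: each stage is a function
// that takes an array of documents and returns a new array.
function runPipeline(docs, stages) {
  // Feed the output of each stage into the next one, in order.
  return stages.reduce((current, stage) => stage(current), docs);
}

// Two illustrative stages, loosely modeled on $match and $project.
const matchState = (state) => (docs) => docs.filter((d) => d.state === state);
const projectCityPop = (docs) => docs.map((d) => ({ city: d.city, pop: d.pop }));

const sample = [
  { _id: "10280", city: "NEW YORK", state: "NY", pop: 5574 }, // from the Data Model section
  { _id: "00001", city: "ALPHA", state: "AA", pop: 100 },     // made-up document
];

const pipelineResult = runPipeline(sample, [matchState("NY"), projectCityPop]);
console.log(pipelineResult);
```

Because each stage only sees the output of the previous one, the order of the stages matters: filtering before projecting works here, while projecting first would drop the state field that the filter needs.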

Return States with Populations above 10 Million

The following aggregation operation returns all states with a total population of at least 10 million:

db.zipcodes.aggregate( [
   { $group: { _id: "$state", totalPop: { $sum: "$pop" } } },
   { $match: { totalPop: { $gte: 10*1000*1000 } } }
] )

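In this example, the aggregation pipeline consists of the $group stage followed by the $match stage. The $group stage groups the documents by the state field and uses $sum to calculate a totalPop value for each state, producing one document per state. The $match stage then keeps only the grouped documents whose totalPop is greater than or equal to 10 million.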

The equivalent SQL for this aggregation operation is:

SELECT state, SUM(pop) AS totalPop
FROM zipcodes
GROUP BY state
HAVING SUM(pop) >= (10*1000*1000)
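The same group-then-filter logic can also be sketched in plain JavaScript over an in-memory array. This is only an illustration of what the two stages compute. The helper names groupByState and matchMinPop are hypothetical, and the sample documents are made up rather than taken from the real data set.

```javascript
// Made-up sample documents in the zipcodes shape (not real census figures).
const zipcodes = [
  { _id: "00001", city: "ALPHA", state: "AA", pop: 6000000 },
  { _id: "00002", city: "BETA", state: "AA", pop: 5000000 },
  { _id: "00003", city: "GAMMA", state: "BB", pop: 9000000 },
];

// Rough equivalent of { $group: { _id: "$state", totalPop: { $sum: "$pop" } } }.
function groupByState(docs) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.state, (totals.get(doc.state) || 0) + doc.pop);
  }
  // Emit one output document per distinct state.
  return Array.from(totals, ([state, totalPop]) => ({ _id: state, totalPop }));
}

// Rough equivalent of { $match: { totalPop: { $gte: min } } }.
function matchMinPop(groups, min) {
  return groups.filter((g) => g.totalPop >= min);
}

const largeStates = matchMinPop(groupByState(zipcodes), 10 * 1000 * 1000);
console.log(largeStates); // [ { _id: 'AA', totalPop: 11000000 } ]
```

Note that the filter runs on the grouped totals, not on the individual zip code documents, which is why $match comes after $group in the pipeline and why the SQL version uses HAVING rather than WHERE.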

See also

$group, $match, $sum

Return Average City Population by State

The following aggregation operation returns the average populations for cities in each state:

db.zipcodes.aggregate( [
   { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } },
   { $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } }
] )

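In this example, the aggregation pipeline consists of the $group stage followed by another $group stage. The first $group stage groups the documents by the combination of city and state and uses $sum to calculate the population of each city [1]. The second $group stage groups those city documents by the _id.state field and uses $avg to calculate avgCityPop, the average city population for each state.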

The documents that result from this aggregation operation resemble the following:

{
  "_id" : "MN",
  "avgCityPop" : 5335
}
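The two-step grouping can be sketched in plain JavaScript to show why the first stage is needed. This is an in-memory illustration only. The helper names cityTotals and averageByState are hypothetical, and the sample documents are made up.

```javascript
// Made-up sample: the city ALPHA spans two zip codes.
const zipcodes = [
  { _id: "00001", city: "ALPHA", state: "AA", pop: 100 },
  { _id: "00002", city: "ALPHA", state: "AA", pop: 300 },
  { _id: "00003", city: "BETA", state: "AA", pop: 200 },
  { _id: "00004", city: "GAMMA", state: "BB", pop: 50 },
];

// Rough equivalent of the first $group: sum pop per (state, city) pair.
function cityTotals(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const key = JSON.stringify([doc.state, doc.city]);
    totals.set(key, (totals.get(key) || 0) + doc.pop);
  }
  return Array.from(totals, ([key, pop]) => {
    const [state, city] = JSON.parse(key);
    return { _id: { state, city }, pop };
  });
}

// Rough equivalent of the second $group: average the city totals per state.
function averageByState(cities) {
  const acc = new Map();
  for (const c of cities) {
    const entry = acc.get(c._id.state) || { sum: 0, count: 0 };
    entry.sum += c.pop;
    entry.count += 1;
    acc.set(c._id.state, entry);
  }
  return Array.from(acc, ([state, { sum, count }]) => ({
    _id: state,
    avgCityPop: sum / count,
  }));
}

const averages = averageByState(cityTotals(zipcodes));
console.log(averages);
```

In this sample, state AA has two cities with totals of 400 and 200, so its average city population is 300. Averaging the three zip code documents directly would give 200 instead, which is the average population per zip code rather than per city.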

See also

$group, $sum, $avg

Return Largest and Smallest Cities by State

The following aggregation operation returns the smallest and largest cities by population for each state:

db.zipcodes.aggregate( [
   { $group:
      {
        _id: { state: "$state", city: "$city" },
        pop: { $sum: "$pop" }
      }
   },
   { $sort: { pop: 1 } },
   { $group:
      {
        _id : "$_id.state",
        biggestCity:  { $last: "$_id.city" },
        biggestPop:   { $last: "$pop" },
        smallestCity: { $first: "$_id.city" },
        smallestPop:  { $first: "$pop" }
      }
   },

   // The following $project stage is optional and
   // modifies the output format.
   { $project:
      {
        _id: 0,
        state: "$_id",
        biggestCity:  { name: "$biggestCity",  pop: "$biggestPop" },
        smallestCity: { name: "$smallestCity", pop: "$smallestPop" }
      }
   }
] )

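In this example, the aggregation pipeline consists of a $group stage, a $sort stage, another $group stage, and a $project stage. The first $group stage sums pop for each combination of city and state. The $sort stage orders those city documents by pop in ascending order. The second $group stage groups the sorted documents by _id.state; because the input is sorted from smallest to largest, $first yields the smallest city and its population, and $last yields the largest. The final $project stage renames _id to state and nests the city names and populations in the biggestCity and smallestCity embedded documents.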

The output documents of this aggregation operation resemble the following:

{
  "state" : "RI",
  "biggestCity" : {
    "name" : "CRANSTON",
    "pop" : 176404
  },
  "smallestCity" : {
    "name" : "CLAYVILLE",
    "pop" : 45
  }
}
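The sort-then-pick-first-and-last technique can be sketched in plain JavaScript as follows. This is an in-memory illustration that starts from per-city totals, that is, the output of the first $group stage. The helper name extremesByState is hypothetical and the sample data is made up.

```javascript
// Made-up per-city totals, shaped like the output of the first $group stage.
const cityPops = [
  { _id: { state: "AA", city: "ALPHA" }, pop: 400 },
  { _id: { state: "AA", city: "BETA" }, pop: 200 },
  { _id: { state: "AA", city: "DELTA" }, pop: 900 },
  { _id: { state: "BB", city: "GAMMA" }, pop: 50 },
];

function extremesByState(cities) {
  // Rough equivalent of { $sort: { pop: 1 } }: ascending by population.
  const sorted = [...cities].sort((a, b) => a.pop - b.pop);

  const byState = new Map();
  for (const c of sorted) {
    const current = { name: c._id.city, pop: c.pop };
    const entry = byState.get(c._id.state);
    if (!entry) {
      // First document seen for this state: the smallest city ($first).
      byState.set(c._id.state, {
        state: c._id.state,
        biggestCity: current,
        smallestCity: current,
      });
    } else {
      // Every later document is at least as large, so the last one wins ($last).
      entry.biggestCity = current;
    }
  }
  return Array.from(byState.values());
}

const extremes = extremesByState(cityPops);
console.log(JSON.stringify(extremes, null, 2));
```

A state with a single city, such as BB in this sample, reports that city as both its biggest and its smallest. The approach relies entirely on the sort order, so removing the sort step would make the first and last entries arbitrary.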
[1] A city can have more than one zip code associated with it, as different sections of the city can each have a different zip code.