$graphLookup (aggregation)

Changed in version 3.4.

Definition

$graphLookup

Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.

The $graphLookup search process is summarized below:

  1. Input documents flow into the $graphLookup stage of an aggregation operation.
  2. $graphLookup targets the search to the collection designated by the from parameter (see below for the full list of search parameters).
  3. For each input document, the search begins with the value designated by startWith.
  4. $graphLookup matches the startWith value against the field designated by connectToField in other documents in the from collection.
  5. For each matching document, $graphLookup takes the value of the connectFromField and checks every document in the from collection for a matching connectToField value. For each match, $graphLookup adds the matching document in the from collection to an array field named by the as parameter.

    This step continues recursively until no more matching documents are found, or until the operation reaches a recursion depth specified by the maxDepth parameter. $graphLookup then appends the array field to the input document. $graphLookup returns results after completing its search on all input documents.
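
The steps above can be sketched in plain JavaScript over an in-memory array. This is only an illustration of the search order, not MongoDB's implementation: the function name graphLookupSketch, the sample documents, and the field names key and parent are all hypothetical, startWith is passed as an already evaluated value rather than an expression, and maxDepth, depthField, and restrictSearchWithMatch are left out.

```javascript
// In-memory sketch of the search steps; `fromDocs` stands in for the `from` collection.
function graphLookupSketch(inputDoc, fromDocs, opts) {
  const { startWith, connectFromField, connectToField, as } = opts;
  // Treat a missing value as "nothing to follow" and a scalar as a one-element list.
  const asArray = (v) => (v === undefined || v === null ? [] : Array.isArray(v) ? v : [v]);

  const visited = new Map();          // _id -> matched document
  let frontier = asArray(startWith);  // step 3: values to search for in this round

  while (frontier.length > 0) {
    const next = [];
    for (const doc of fromDocs) {
      // Step 4: compare the current values with the connectToField of each document.
      const matches = asArray(doc[connectToField]).some((v) => frontier.includes(v));
      if (matches && !visited.has(doc._id)) {
        visited.set(doc._id, doc);
        // Step 5: the connectFromField values are followed in the next round.
        next.push(...asArray(doc[connectFromField]));
      }
    }
    frontier = next;
  }
  // The collected documents are appended to the input document under `as`.
  return { ...inputDoc, [as]: [...visited.values()] };
}

// Hypothetical data: a chain c -> b -> a.
const nodes = [
  { _id: 1, key: "a" },
  { _id: 2, key: "b", parent: "a" },
  { _id: 3, key: "c", parent: "b" },
];

const result = graphLookupSketch(nodes[2], nodes, {
  startWith: nodes[2].parent,
  connectFromField: "parent",
  connectToField: "key",
  as: "ancestors",
});
console.log(result.ancestors.map((d) => d.key).join(", ")); // b, a
```

The visited map is what stops the loop on cyclic data: a document that has already been collected is never followed a second time.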

$graphLookup has the following prototype form:

{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}

$graphLookup takes a document with the following fields:

Field Description
from Target collection for the $graphLookup operation to search, recursively matching the connectFromField to the connectToField. The from collection cannot be sharded and must be in the same database as any other collections used in the operation. For information, see Sharded Collections.
startWith Expression that specifies the value of the connectFromField with which to start the recursive search. Optionally, startWith may be an array of values, each of which is individually followed through the traversal process.
connectFromField Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is individually followed through the traversal process.
connectToField Field name in other documents against which to match the value of the field specified by the connectFromField parameter.
as

Name of the array field added to each output document. Contains the documents traversed in the $graphLookup stage to reach the document.

Note

Documents returned in the as field are not guaranteed to be in any order.

maxDepth Optional. Non-negative integral number specifying the maximum recursion depth.
depthField Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth for the document, represented as a NumberLong. Recursion depth value starts at zero, so the first lookup corresponds to zero depth.
restrictSearchWithMatch

Optional. A document specifying additional conditions for the recursive search. The syntax is identical to query filter syntax.

Note

You cannot use any aggregation expression in this filter. For example, a query document such as

{ lastName: { $ne: "$lastName" } }

will not work in this context to find documents in which the lastName value is different from the lastName value of the input document, because "$lastName" will act as a string literal, not a field path.
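
The difference between a string literal and a field path can be illustrated with a small in-memory comparison. This is a hedged sketch only: the documents and variable names are hypothetical, and the array filter calls merely imitate the $ne comparison rather than running a real query.

```javascript
// Hypothetical input document and candidate documents.
const inputDoc = { name: "A", lastName: "Soto" };
const candidates = [
  { name: "B", lastName: "Soto" },
  { name: "C", lastName: "Hale" },
];

// What the filter actually does: compare against the literal text "$lastName".
const literal = "$lastName";
const matched = candidates.filter((doc) => doc.lastName !== literal);

// What the author of the filter probably intended: compare against the input document.
const intended = candidates.filter((doc) => doc.lastName !== inputDoc.lastName);

console.log(matched.map((doc) => doc.name).join(", "));  // B, C
console.log(intended.map((doc) => doc.name).join(", ")); // C
```

Because no candidate has the literal value "$lastName" as its last name, the literal comparison lets every candidate through, including the one that shares the input document's last name.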

Considerations

Sharded Collections

The collection specified in from cannot be sharded. However, the collection on which you run the aggregate() method can be sharded. That is, in the following:

db.collection.aggregate([
   { $graphLookup: { from: "fromCollection", ... } }
])

  • The collection can be sharded.
  • The fromCollection cannot be sharded.

To join multiple sharded collections, consider:

  • Modifying client applications to perform manual lookups instead of using the $graphLookup aggregation stage.
  • If possible, using an embedded data model that removes the need to join collections.

Max Depth

Setting the maxDepth field to 0 is equivalent to a non-recursive $graphLookup search stage.

Memory

The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate() operation, the $graphLookup stage ignores the option. If there are other stages in the aggregate() operation, the allowDiskUse: true option is in effect for these other stages.

See aggregation pipeline limitations for more information.

Views and Collation

If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.

Examples

Within a Single Collection

A collection named employees has the following documents:

{ "_id" : 1, "name" : "Dev" }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }

The following $graphLookup operation recursively matches on the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:

db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )

The operation returns the following:

{
   "_id" : 1,
   "name" : "Dev",
   "reportingHierarchy" : [ ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "reportsTo" : "Dev",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 3,
   "name" : "Ron",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 4,
   "name" : "Andrew",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 5,
   "name" : "Asya",
   "reportsTo" : "Ron",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
   ]
}
{
   "_id" : 6,
   "name" : "Dan",
   "reportsTo" : "Andrew",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
   ]
}

The following table provides a traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:

Start value

The reportsTo value of the document:

{ ... "reportsTo" : "Ron" }
Depth 0
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
Depth 1
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
Depth 2
{ "_id" : 1, "name" : "Dev" }

The output generates the hierarchy Asya -> Ron -> Eliot -> Dev.
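
The traversal path for Asya can be reproduced by hand in plain JavaScript. This is a minimal sketch, assuming the employees documents are held in an in-memory array; the helper name reportingChain is hypothetical, and a simple loop is enough here only because each employee has at most one reportsTo value.

```javascript
const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 4, name: "Andrew", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
  { _id: 6, name: "Dan", reportsTo: "Andrew" },
];

// Walk up the chain: each reportsTo value is looked up in the name field.
function reportingChain(name) {
  const byName = new Map(employees.map((e) => [e.name, e]));
  const chain = [];
  let current = byName.get(name);
  // The length check is a guard against cycles in malformed data.
  while (current && current.reportsTo !== undefined && chain.length < employees.length) {
    current = byName.get(current.reportsTo);
    if (current) chain.push({ depth: chain.length, name: current.name });
  }
  return chain;
}

const chain = reportingChain("Asya");
console.log(["Asya", ...chain.map((step) => step.name)].join(" -> "));
// Asya -> Ron -> Eliot -> Dev
```

Each entry's depth mirrors the table above: Ron is found at depth 0, Eliot at depth 1, and Dev at depth 2.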

Across Multiple Collections

Like $lookup, $graphLookup can access another collection in the same database.

In the following example, a database contains two collections:

  • A collection airports with the following documents:

    { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
    { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }
    { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
    { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }
    { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
  • A collection travelers with the following documents:

    { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" }
    { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" }
    { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }

For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field to the airport field. The operation specifies a maximum recursion depth of 2.

db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )

The operation returns the following results:

{
   "_id" : 1,
   "name" : "Dev",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 3,
   "name" : "Jeff",
   "nearestAirport" : "BOS",
   "destinations" : [
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 4,
        "airport" : "LHR",
        "connects" : [ "PWM" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(0) }
   ]
}

The following table provides a traversal path for the recursive search, up to depth 2, where the starting airport is JFK:

Start value

The nearestAirport value from the travelers collection:

{ ... "nearestAirport" : "JFK" }
Depth 0
{ "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
Depth 1
{ "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }
{ "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
Depth 2
{ "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }
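
The depth values in the results can be checked with a breadth-first search over the airports documents. The following is a sketch under stated assumptions: the documents are held in an in-memory array, the helper name destinations is hypothetical, and the returned object simply maps each airport code to the depth at which it is first reached, which is the role numConnections plays in the output above.

```javascript
const airports = [
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"] },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"] },
  { _id: 2, airport: "ORD", connects: ["JFK"] },
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"] },
  { _id: 4, airport: "LHR", connects: ["PWM"] },
];

// Breadth-first search from one airport code, stopping after maxDepth.
function destinations(start, maxDepth) {
  const depthByAirport = new Map();
  let frontier = [start];
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const doc of airports) {
      if (frontier.includes(doc.airport) && !depthByAirport.has(doc.airport)) {
        depthByAirport.set(doc.airport, depth);
        next.push(...doc.connects); // each array element is followed individually
      }
    }
    frontier = next;
  }
  return Object.fromEntries(depthByAirport);
}

console.log(JSON.stringify(destinations("JFK", 2))); // {"JFK":0,"BOS":1,"ORD":1,"PWM":2}
console.log(JSON.stringify(destinations("BOS", 2))); // {"BOS":0,"JFK":1,"PWM":1,"ORD":2,"LHR":2}
```

Starting from JFK, LHR is absent because it would first be reached at depth 3, beyond the limit of 2. Starting from BOS, all five airports fall within the limit, which matches the results for Jeff.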

With a Query Filter

The following example uses a collection with a set of documents containing names of people along with arrays of their friends and their hobbies. An aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.

A collection named people contains the following documents:

{
  "_id" : 1,
  "name" : "Tanya Jordan",
  "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
  "hobbies" : [ "tennis", "unicycling", "golf" ]
}
{
  "_id" : 2,
  "name" : "Carole Hale",
  "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
  "hobbies" : [ "archery", "golf", "woodworking" ]
}
{
  "_id" : 3,
  "name" : "Terry Hawkins",
  "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
  "hobbies" : [ "knitting", "frisbee" ]
}
{
  "_id" : 4,
  "name" : "Joseph Dennis",
  "friends" : [ "Angelo Ward", "Carole Hale" ],
  "hobbies" : [ "tennis", "golf", "topiary" ]
}
{
  "_id" : 5,
  "name" : "Angelo Ward",
  "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
  "hobbies" : [ "travel", "ceramics", "golf" ]
}
{
  "_id" : 6,
  "name" : "Shirley Soto",
  "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
  "hobbies" : [ "frisbee", "set theory" ]
}

The following aggregation operation uses three stages:

  • $match matches on documents with a name field containing the string "Tanya Jordan". Returns one output document.
  • $graphLookup connects the output document's friends field with the name field of other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents in which the hobbies array contains golf. Returns one output document.
  • $project shapes the output document. The names listed in connections who play golf are taken from the name field of the documents listed in the input document's golfers array.

db.people.aggregate( [
  { $match: { "name": "Tanya Jordan" } },
  { $graphLookup: {
      from: "people",
      startWith: "$friends",
      connectFromField: "friends",
      connectToField: "name",
      as: "golfers",
      restrictSearchWithMatch: { "hobbies" : "golf" }
    }
  },
  { $project: {
      "name": 1,
      "friends": 1,
      "connections who play golf": "$golfers.name"
    }
  }
] )

The operation returns the following document:

{
   "_id" : 1,
   "name" : "Tanya Jordan",
   "friends" : [
      "Shirley Soto",
      "Terry Hawkins",
      "Carole Hale"
   ],
   "connections who play golf" : [
      "Joseph Dennis",
      "Tanya Jordan",
      "Angelo Ward",
      "Carole Hale"
   ]
}
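
The result can be traced by hand with an in-memory version of the search. This is a sketch only: the people documents are copied into an array (without _id), the helper name golfConnections is hypothetical, and the golf check stands in for restrictSearchWithMatch by being applied at every step of the traversal.

```javascript
const people = [
  { name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] },
];

// Follow friends -> name, but only through people whose hobbies include golf.
function golfConnections(startName) {
  const start = people.find((p) => p.name === startName);
  const found = new Set();
  let frontier = start ? start.friends : [];
  while (frontier.length > 0) {
    const next = [];
    for (const person of people) {
      const reachable = frontier.includes(person.name);
      const playsGolf = person.hobbies.includes("golf");
      if (reachable && playsGolf && !found.has(person.name)) {
        found.add(person.name);
        next.push(...person.friends); // non-golfers are never traversed further
      }
    }
    frontier = next;
  }
  return [...found].sort(); // sorted only to make the printed order stable
}

console.log(golfConnections("Tanya Jordan").join(", "));
// Angelo Ward, Carole Hale, Joseph Dennis, Tanya Jordan
```

Tanya Jordan appears in her own result because Carole Hale lists her as a friend and she plays golf. Angelo Ward is reached through Joseph Dennis rather than through Terry Hawkins or Shirley Soto, since neither of those two passes the filter.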

Additional Resource

Webinar: Working with Graph Data in MongoDB