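The $graphLookup aggregation stage performs a recursive search on a collection, with options for restricting the search by recursion depth and by query filter.
The $graphLookup search process is summarized below:
- Input documents flow into the $graphLookup stage of an aggregation operation.
- $graphLookup targets the search to the collection designated by the from parameter (the full list of search parameters is given below).
- For each input document, the search begins with the value designated by startWith.
- $graphLookup matches the startWith value against the field designated by connectToField in the documents of the from collection.
- For each matching document, $graphLookup takes the value of connectFromField and checks every document in the from collection for a matching connectToField value, then adds each matching document from the from collection to the array field named by the as parameter.
 This step continues recursively until no more matching documents are found or the operation reaches the recursion depth specified by maxDepth. $graphLookup then appends the array field to the input document, and returns results after completing its search on all input documents.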
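The steps above can be sketched in plain JavaScript as a breadth-first traversal over an in-memory array. This is only an illustrative model, not how the server implements the stage: the helper name graphLookupSketch is hypothetical, startWith is simplified to a field name instead of an expression, and only scalar field values are handled.

```javascript
// Illustrative in-memory model of the $graphLookup search loop.
// Assumptions: `graphLookupSketch` is a made-up helper, `startWith` is a plain
// field name (not an aggregation expression), and field values are scalars.
function graphLookupSketch(inputDocs, opts) {
  const { from, startWith, connectFromField, connectToField, as } = opts;
  const maxDepth = opts.maxDepth === undefined ? Infinity : opts.maxDepth;

  return inputDocs.map((input) => {
    const found = new Map(); // _id -> matched document, so each document is added once
    // Values to match against connectToField at the current depth.
    let frontier = [input[startWith]].filter((v) => v !== undefined);

    for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
      const next = [];
      for (const doc of from) {
        if (frontier.includes(doc[connectToField]) && !found.has(doc._id)) {
          found.set(doc._id, doc);
          // Follow this document's connectFromField value in the next round.
          if (doc[connectFromField] !== undefined) next.push(doc[connectFromField]);
        }
      }
      frontier = next;
    }
    // Append the array field to a copy of the input document.
    return { ...input, [as]: [...found.values()] };
  });
}

const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 4, name: "Andrew", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
  { _id: 6, name: "Dan", reportsTo: "Andrew" }
];

const results = graphLookupSketch(employees, {
  from: employees,
  startWith: "reportsTo",
  connectFromField: "reportsTo",
  connectToField: "name",
  as: "reportingHierarchy"
});

const asya = results.find((d) => d.name === "Asya");
console.log(asya.reportingHierarchy.map((d) => d.name).join(", ")); // Ron, Eliot, Dev
```

The sketch keeps a set of already-found documents so that cycles in the data cannot cause an endless loop; the real stage likewise returns each reachable document once per input document.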
Syntax
{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}
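Parameter fields:
| Field | Description |
|---|---|
| from | Target collection for the $graphLookup operation to search, recursively matching connectFromField to connectToField. The from collection must be in the same database as the other collections used in the operation. It may be the same collection the aggregation runs on (the employees example in this article searches its own collection). |
| startWith | Expression that specifies the value of connectFromField with which to start the recursive search. startWith may also evaluate to an array of values, each of which is followed individually through the traversal. |
| connectFromField | Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is followed individually through the traversal. |
| connectToField | Field name in other documents against which to match the value of the field specified by connectFromField. |
| as | Name of the array field added to each output document. It contains the documents traversed in the $graphLookup stage (note that the order of the array elements is not guaranteed). |
| maxDepth | Optional. Non-negative integer specifying the maximum recursion depth. |
| depthField | Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth of the document, stored as a NumberLong. Recursion depth starts at zero, so the first lookup corresponds to depth zero. |
| restrictSearchWithMatch | Optional. A document specifying additional conditions for the recursive search, written in query filter syntax. Aggregation expressions cannot be used in this filter. For example, { lastName: { $ne: "$lastName" } } does not find documents whose lastName differs from the lastName of the input document, because "$lastName" is treated as a string literal, not as a field path. |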
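Usage
Sharded Collections
Starting in MongoDB 5.1, a sharded collection can be specified in the from parameter. However, $graphLookup cannot target a sharded collection inside a transaction.
Max Depth
Setting the maxDepth field to 0 is equivalent to a non-recursive $graphLookup search stage.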
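As a hedged illustration of this setting, the sketch below builds a stage with maxDepth: 0 for the employees collection used in the examples of this article. The variable name directManagerStage and the output field directManager are assumptions made for the example. With maxDepth: 0 only the initial startWith lookup is performed, so in that sample data each employee's array would hold at most the direct manager.

```javascript
// Hypothetical stage definition: maxDepth: 0 stops after the initial startWith lookup.
const directManagerStage = {
  $graphLookup: {
    from: "employees",
    startWith: "$reportsTo",
    connectFromField: "reportsTo",
    connectToField: "name",
    maxDepth: 0,          // depth 0 only: no recursive steps are taken
    as: "directManager"   // assumed output field name
  }
};

// In a mongosh session connected to a database holding the collection,
// the stage would be run as:
//   db.employees.aggregate( [ directManagerStage ] )
console.log(JSON.stringify(directManagerStage, null, 2));
```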
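Memory
The $graphLookup stage must stay within a 100 MB memory limit. According to the MongoDB documentation, specifying allowDiskUse: true for the aggregation does not lift this limit for $graphLookup itself, which ignores the option; the option does take effect for the other stages in the same pipeline.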
Views and Collation
If an aggregation involves multiple views, for example through $lookup or $graphLookup, the views must have the same collation.
Examples
Within a Single Collection
The employees collection contains the following documents:
{ "_id" : 1, "name" : "Dev" }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }
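The following $graphLookup operation recursively matches the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:
db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )
The operation returns the following results: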
{
   "_id" : 1,
   "name" : "Dev",
   "reportingHierarchy" : [ ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "reportsTo" : "Dev",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 3,
   "name" : "Ron",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 4,
   "name" : "Andrew",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
   ]
}
{
   "_id" : 5,
   "name" : "Asya",
   "reportsTo" : "Ron",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
   ]
}
{
   "_id" : 6,
   "name" : "Dan",
   "reportsTo" : "Andrew",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
   ]
}
The following table shows the traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:
| Step | Value or matched document |
|---|---|
| Start value | The reportsTo value of the document: { ... "reportsTo" : "Ron" } |
| Depth 0 | { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" } |
| Depth 1 | { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" } |
| Depth 2 | { "_id" : 1, "name" : "Dev" } |
The output yields the hierarchy Asya -> Ron -> Eliot -> Dev.
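Because the order of the elements in the as array is not guaranteed, the chain Asya -> Ron -> Eliot -> Dev cannot be read directly off the array. One possible way to recover it, sketched below, is to record the depth with depthField and sort on it in a later stage. The field name level and the variable orderedHierarchyPipeline are assumptions for illustration, and the $sortArray operator requires MongoDB 5.2 or later.

```javascript
// Hypothetical pipeline: record each manager's depth, then sort the array by it.
const orderedHierarchyPipeline = [
  {
    $graphLookup: {
      from: "employees",
      startWith: "$reportsTo",
      connectFromField: "reportsTo",
      connectToField: "name",
      depthField: "level",        // assumed field name; 0 = direct manager
      as: "reportingHierarchy"
    }
  },
  {
    // Overwrite the array with a copy sorted by ascending depth.
    $set: {
      reportingHierarchy: {
        $sortArray: { input: "$reportingHierarchy", sortBy: { level: 1 } }
      }
    }
  }
];

// In a mongosh session the pipeline would be run as:
//   db.employees.aggregate( orderedHierarchyPipeline )
console.log(JSON.stringify(orderedHierarchyPipeline, null, 2));
```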
Across Multiple Collections
Like $lookup, $graphLookup can access another collection in the same database.
For example, create two collections in the same database:
- The airports collection contains the following documents:
db.airports.insertMany( [
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
] )
- The travelers collection contains the following documents:
db.travelers.insertMany( [
   { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" },
   { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" },
   { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
] )
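For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field against the airport field. The operation specifies a maximum recursion depth of 2.
db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )
The operation returns the following results: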
{
   "_id" : 1,
   "name" : "Dev",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
      { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
      { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
      { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
      { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
      { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
      { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 3,
   "name" : "Jeff",
   "nearestAirport" : "BOS",
   "destinations" : [
      { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(2) },
      { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(1) },
      { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ], "numConnections" : NumberLong(2) },
      { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(1) },
      { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(0) }
   ]
}
The following table shows the traversal path of the recursive search, up to depth 2, where the starting airport is JFK:
| Step | Value or matched documents |
|---|---|
| Start value | The nearestAirport value from the travelers collection: { ... "nearestAirport" : "JFK" } |
| Depth 0 | { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] } |
| Depth 1 | { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }, { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] } |
| Depth 2 | { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] } |
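The traversal in the table can be reproduced with a small in-memory sketch that follows every element of the connects array, stops after a maximum depth, and records the depth at which each airport is first reached. The helper name reachableAirports is hypothetical, and the code only models the behaviour described in this article; it is not MongoDB's implementation.

```javascript
const airports = [
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"] },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"] },
  { _id: 2, airport: "ORD", connects: ["JFK"] },
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"] },
  { _id: 4, airport: "LHR", connects: ["PWM"] }
];

// Hypothetical helper: breadth-first search modelling connectFromField: "connects",
// connectToField: "airport", maxDepth, and a depth value like depthField.
function reachableAirports(start, maxDepth) {
  const depthByAirport = new Map(); // airport code -> depth of first match
  let frontier = [start];           // values to match against "airport"
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const doc of airports) {
      if (frontier.includes(doc.airport) && !depthByAirport.has(doc.airport)) {
        depthByAirport.set(doc.airport, depth);
        next.push(...doc.connects); // each array element is followed individually
      }
    }
    frontier = next;
  }
  return Object.fromEntries(depthByAirport);
}

console.log(reachableAirports("JFK", 2)); // { JFK: 0, BOS: 1, ORD: 1, PWM: 2 }
console.log(reachableAirports("BOS", 2)); // { BOS: 0, JFK: 1, PWM: 1, ORD: 2, LHR: 2 }
```

LHR is missing from the JFK result because it would first be reached at depth 3, which exceeds the maximum depth of 2.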
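With a Query Filter
The following example uses a collection whose documents contain a person's name along with arrays of their friends and hobbies. The aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.
The people collection contains the following documents: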
{
   "_id" : 1,
   "name" : "Tanya Jordan",
   "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
   "hobbies" : [ "tennis", "unicycling", "golf" ]
}
{
   "_id" : 2,
   "name" : "Carole Hale",
   "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
   "hobbies" : [ "archery", "golf", "woodworking" ]
}
{
   "_id" : 3,
   "name" : "Terry Hawkins",
   "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
   "hobbies" : [ "knitting", "frisbee" ]
}
{
   "_id" : 4,
   "name" : "Joseph Dennis",
   "friends" : [ "Angelo Ward", "Carole Hale" ],
   "hobbies" : [ "tennis", "golf", "topiary" ]
}
{
   "_id" : 5,
   "name" : "Angelo Ward",
   "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
   "hobbies" : [ "travel", "ceramics", "golf" ]
}
{
   "_id" : 6,
   "name" : "Shirley Soto",
   "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
   "hobbies" : [ "frisbee", "set theory" ]
}
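The following aggregation operation uses three stages:
- $match matches documents whose name field contains the string "Tanya Jordan". It returns one output document.
- $graphLookup connects the friends field of the output document with the name field of other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents whose hobbies array contains golf. It returns one output document.
- $project shapes the output document. The names listed under connections who play golf are taken from the name field of the documents in the golfers array of the input document.
db.people.aggregate( [
   { $match: { "name": "Tanya Jordan" } },
   {
      $graphLookup: {
         from: "people",
         startWith: "$friends",
         connectFromField: "friends",
         connectToField: "name",
         as: "golfers",
         restrictSearchWithMatch: { "hobbies" : "golf" }
      }
   },
   {
      $project: {
         "name": 1,
         "friends": 1,
         "connections who play golf": "$golfers.name"
      }
   }
] )
The operation returns the following document: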
{
   "_id" : 1,
   "name" : "Tanya Jordan",
   "friends" : [
      "Shirley Soto",
      "Terry Hawkins",
      "Carole Hale"
   ],
   "connections who play golf" : [
      "Joseph Dennis",
      "Tanya Jordan",
      "Angelo Ward",
      "Carole Hale"
   ]
}
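To see why Tanya Jordan herself appears in the result while Shirley Soto and Terry Hawkins do not, the traversal can be modelled in plain JavaScript. The sketch below assumes that the restrictSearchWithMatch condition is applied to every lookup during the traversal, so a person who does not play golf is neither returned nor used to reach further people. The helper name golfConnections is hypothetical.

```javascript
const people = [
  { name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] }
];

// Hypothetical helper modelling startWith: "$friends", connectFromField: "friends",
// connectToField: "name", restrictSearchWithMatch: { hobbies: "golf" }.
function golfConnections(startName) {
  const start = people.find((p) => p.name === startName);
  const found = new Set();
  let frontier = start.friends;
  while (frontier.length > 0) {
    const next = [];
    for (const person of people) {
      const passesFilter = person.hobbies.includes("golf"); // stands in for the match filter
      if (passesFilter && frontier.includes(person.name) && !found.has(person.name)) {
        found.add(person.name);
        next.push(...person.friends); // only matching people are traversed further
      }
    }
    frontier = next;
  }
  return [...found].sort(); // sorted only to make the output deterministic
}

console.log(golfConnections("Tanya Jordan"));
// [ 'Angelo Ward', 'Carole Hale', 'Joseph Dennis', 'Tanya Jordan' ]
```

Tanya Jordan is reached again through Carole Hale's friends list and passes the golf filter, which is why she is listed among her own connections.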