$sample (aggregation)


Definition

$sample

New in version 3.2.

Randomly selects the specified number of documents from its input.

The $sample stage has the following syntax:

{ $sample: { size: <positive integer> } }
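Because size must be a positive integer, it can help to validate the value before building the stage. The following is a minimal sketch; sampleStage is a hypothetical helper name introduced here for illustration and is not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $sample stage and
// rejects any size that is not a positive integer.
function sampleStage(size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer, got " + size);
  }
  return { $sample: { size: size } };
}

// Possible use in the shell: db.users.aggregate([ sampleStage(3) ])
```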

Behavior

$sample uses one of two methods to obtain N random documents, depending on the size of the collection, the size of N, and $sample's position in the pipeline.

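If all the following conditions are met, $sample uses a pseudo-random cursor to select documents: $sample is the first stage of the pipeline, N is less than 5% of the total documents in the collection, and the collection contains more than 100 documents.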

If any of the above conditions is not met, $sample performs a collection scan followed by a random sort to select N documents. In this case, the $sample stage is subject to the sort memory restrictions.
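The choice between the two methods can be pictured as a simple predicate. This is an illustrative sketch only: the function name and parameter names are hypothetical, and the thresholds (first stage, more than 100 documents, N under 5% of the collection) are assumptions based on MongoDB's published description of $sample, not on server code.

```javascript
// Illustrative only (hypothetical name and parameters). Returns true when
// $sample would be expected to use the pseudo-random cursor, and false
// when it would fall back to a collection scan plus random sort.
function usesPseudoRandomCursor({ isFirstStage, n, collectionCount }) {
  return (
    isFirstStage &&
    collectionCount > 100 &&
    n * 20 < collectionCount // N is less than 5% of the documents
  );
}
```

Under these assumptions, sampling 3 documents from the 7-document collection used in the example on this page would take the scan-and-sort path, since the collection has fewer than 100 documents.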

Warning

$sample may output the same document more than once in its result set. For more information, see Cursor Isolation.
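If duplicates matter to the application, one option is to filter them on the client after the results are fetched. The following is a minimal sketch, assuming every document carries an _id whose string form is unique; dedupeById is a hypothetical helper name, not a MongoDB function.

```javascript
// Hypothetical client-side helper: keeps the first occurrence of each
// _id and preserves the original order of the results.
function dedupeById(docs) {
  const seen = new Set();
  const unique = [];
  for (const doc of docs) {
    // Include the type so that the number 1 and the string "1" stay distinct.
    const key = typeof doc._id + ":" + String(doc._id);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(doc);
    }
  }
  return unique;
}
```

After deduplication the result may contain fewer documents than the requested size.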

Example

Given a collection named users with the following documents:

{ "_id" : 1, "name" : "dave123", "q1" : true, "q2" : true }
{ "_id" : 2, "name" : "dave2", "q1" : false, "q2" : false  }
{ "_id" : 3, "name" : "ahn", "q1" : true, "q2" : true  }
{ "_id" : 4, "name" : "li", "q1" : true, "q2" : false  }
{ "_id" : 5, "name" : "annT", "q1" : false, "q2" : true  }
{ "_id" : 6, "name" : "li", "q1" : true, "q2" : true  }
{ "_id" : 7, "name" : "ty", "q1" : false, "q2" : true  }

The following aggregation operation randomly selects 3 documents from the collection:

db.users.aggregate(
   [ { $sample: { size: 3 } } ]
)

The operation returns three random documents.
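As a rough client-side analogue of the scan-then-random-sort path, the sketch below shuffles an in-memory copy of the same seven documents and keeps the first three. It is an illustration under stated assumptions, not the server's implementation: sampleInMemory and its random parameter are hypothetical names, and unlike $sample this version never returns the same document twice.

```javascript
// Hypothetical in-memory analogue of "random sort, then take N".
// `random` defaults to Math.random and must return a value in [0, 1).
function sampleInMemory(docs, size, random = Math.random) {
  const copy = docs.slice(); // leave the caller's array untouched
  // Fisher-Yates shuffle: put the copy into a uniformly random order.
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy.slice(0, size);
}

const users = [
  { _id: 1, name: "dave123", q1: true, q2: true },
  { _id: 2, name: "dave2", q1: false, q2: false },
  { _id: 3, name: "ahn", q1: true, q2: true },
  { _id: 4, name: "li", q1: true, q2: false },
  { _id: 5, name: "annT", q1: false, q2: true },
  { _id: 6, name: "li", q1: true, q2: true },
  { _id: 7, name: "ty", q1: false, q2: true }
];

// Three documents in random order; the selection differs between runs.
const picked = sampleInMemory(users, 3);
```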