Create Queries that Ensure Selectivity创建确保选择性的查询

Selectivity is the ability of a query to narrow results using the index. 选择性是指查询使用索引缩小结果范围的能力。Effective indexes are more selective and allow MongoDB to use the index for a larger portion of the work associated with fulfilling the query.有效的索引更具选择性,允许MongoDB将索引用于与完成查询相关的大部分工作。

To ensure selectivity, write queries that limit the number of possible documents with the indexed field. 为了确保选择性,可以编写查询来限制索引字段中可能的文档数量。Write queries that are appropriately selective relative to your indexed data.编写相对于索引数据具有适当选择性的查询。

Example示例

Suppose you have a field called status where the possible values are new and processed. 假设您有一个名为status的字段,其中可能的值是newprocessedIf you add an index on status you’ve created a low-selectivity index. 如果在status上添加索引,则创建了低选择性索引。The index will be of little help in locating records.索引对查找记录几乎没有帮助。

A better strategy, depending on your queries, would be to create a compound index that includes the low-selectivity field and another field. 根据您的查询,更好的策略是创建一个包含低选择性字段和另一个字段的复合索引For example, you could create a compound index on status and created_at.例如,您可以在statuscreated_at上创建一个复合索引。

Another option, again depending on your use case, might be to use separate collections, one for each status.另一种选择,同样取决于您的用例,可能是使用单独的集合,每个状态对应一个集合。

Example示例

Consider an index { a : 1 } (i.e. an index on the key a sorted in ascending order) on a collection where a has three values evenly distributed across the collection:考虑一个索引{ a : 1 }(即,在升序排序的键a上的索引),在集合中,a三个值均匀分布在集合上:

{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 1, b: "cd" }
{ _id: ObjectId(), a: 1, b: "ef" }
{ _id: ObjectId(), a: 2, b: "jk" }
{ _id: ObjectId(), a: 2, b: "lm" }
{ _id: ObjectId(), a: 2, b: "no" }
{ _id: ObjectId(), a: 3, b: "pq" }
{ _id: ObjectId(), a: 3, b: "rs" }
{ _id: ObjectId(), a: 3, b: "tv" }

If you query for { a: 2, b: "no" } MongoDB must scan 3 documents in the collection to return the one matching result. 如果查询{ a: 2, b: "no" },MongoDB必须扫描集合中的3个文档,以返回一个匹配结果。Similarly, a query for { a: { $gt: 1}, b: "tv" } must scan 6 documents, also to return one result.类似地,对{ a: { $gt: 1}, b: "tv" }的查询必须扫描6个文档,也要返回一个结果。

Consider the same index on a collection where a has nine values evenly distributed across the collection:考虑一个集合上的相同索引,其中a有九个值均匀分布在集合上:

{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 2, b: "cd" }
{ _id: ObjectId(), a: 3, b: "ef" }
{ _id: ObjectId(), a: 4, b: "jk" }
{ _id: ObjectId(), a: 5, b: "lm" }
{ _id: ObjectId(), a: 6, b: "no" }
{ _id: ObjectId(), a: 7, b: "pq" }
{ _id: ObjectId(), a: 8, b: "rs" }
{ _id: ObjectId(), a: 9, b: "tv" }

If you query for { a: 2, b: "cd" }, MongoDB must scan only one document to fulfill the query. 如果查询{ a: 2, b: "cd" },MongoDB必须只扫描一个文档才能完成查询。The index and query are more selective because the values of a are evenly distributed and the query can select a specific document using the index.索引和查询更具选择性,因为a的值分布均匀,查询可以使用索引选择特定文档。

However, although the index on a is more selective, a query such as { a: { $gt: 5 }, b: "tv" } would still need to scan 4 documents.然而,尽管a上的索引更具选择性,但是像{ a: { $gt: 5 }, b: "tv" }这样的查询仍然需要扫描4个文档。

If overall selectivity is low, and if MongoDB must read a number of documents to return results, then some queries may perform faster without indexes. 如果总体选择性较低,并且MongoDB必须读取大量文档才能返回结果,那么一些查询在没有索引的情况下可能会执行得更快。To determine performance, see Measure Index Use.要确定性能,请参阅度量索引使用