MongoDB Atlas Search
Atlas Search makes it easy to build fast, relevance-based search capabilities on top of your MongoDB data. Try it today on MongoDB Atlas, our fully managed database as a service.
MongoDB provides text indexes to support text search queries on string content. text indexes can include any field whose value is a string or an array of string elements.
text Index Version | Description |
---|---|
Version 3 | Version 3 of the text index. The default version for text indexes created in MongoDB 3.2 and later. |
Version 2 | Version 2 of the text index. The default version for text indexes created in the MongoDB 2.6 and 3.0 series. |
Version 1 | Version 1 of the text index. MongoDB 2.4 can only support version 1. |
To override the default version and specify a different version, include the option { "textIndexVersion": <version> } when creating the index.
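As a minimal sketch of overriding the version, the snippet below passes textIndexVersion in the options document of db.collection.createIndex(). The collection name reviews and the field comments are hypothetical, and the call is guarded so that it only runs inside mongosh, where the global db exists.

```javascript
// Index keys and options; `reviews` and `comments` are hypothetical names.
const versionedKeys = { comments: "text" };
const versionedOptions = { textIndexVersion: 2 };

// `db` is only defined inside mongosh, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  db.reviews.createIndex(versionedKeys, versionedOptions);
}
```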
Important
A collection can have at most one text index.
To create a text index, use the db.collection.createIndex() method. To index a field that contains a string or an array of string elements, include the field and specify the string literal "text" in the index document, for example { comments: "text" }.
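A minimal sketch of the single-field case, assuming a hypothetical reviews collection whose comments field holds strings. The guard keeps the snippet harmless outside mongosh.

```javascript
// Index document: the field name mapped to the string literal "text".
const singleFieldKeys = { comments: "text" };

// `db` is only defined inside mongosh; `reviews` is a hypothetical collection.
if (typeof db !== "undefined") {
  db.reviews.createIndex(singleFieldKeys);
}
```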
You can index multiple fields for the text index. For example, the index document { subject: "text", comments: "text" } creates a text index on the fields subject and comments.
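The multi-field case might look like the sketch below. The reviews collection is a hypothetical name. Because a collection can have at most one text index, this assumes the collection does not already have one.

```javascript
// Both fields are covered by the same, single text index.
const multiFieldKeys = { subject: "text", comments: "text" };

// `db` is only defined inside mongosh; `reviews` is a hypothetical collection.
if (typeof db !== "undefined") {
  db.reviews.createIndex(multiFieldKeys);
}
```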
A compound index can include text index keys in combination with ascending/descending index keys. For more information, see Compound Index.
In order to drop a text index, use the index name. See Use the Index Name to Drop a text Index for more information.
For a text index, the weight of an indexed field denotes the significance of the field relative to the other indexed fields in terms of the text search score.
For each indexed field in the document, MongoDB multiplies the number of matches by the weight and sums the results. Using this sum, MongoDB then calculates the score for the document. See the $meta operator for details on returning and sorting by text scores.
The default weight is 1 for the indexed fields. To adjust the weights for the indexed fields, include the weights option in the db.collection.createIndex() method.
For more information on using weights to control the results of a text search, see Control Search Results with Weights.
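The sketch below shows one way the weights option could be passed, together with a small helper that illustrates the "matches times weight, summed" step described above. The collection name, field names, and weight values are hypothetical, and weightedMatchSum is an illustrative helper rather than a MongoDB function; the final score that MongoDB derives from this sum is not reproduced here.

```javascript
// Hypothetical fields and weights; unlisted fields keep the default weight of 1.
const weightedKeys = { subject: "text", comments: "text" };
const weightedOptions = { weights: { subject: 10, comments: 5 } };

// Illustration only: multiply each field's match count by its weight and sum.
function weightedMatchSum(matchCounts, weights) {
  return Object.keys(matchCounts).reduce(
    (sum, field) => sum + matchCounts[field] * (weights[field] ?? 1),
    0
  );
}

// `db` is only defined inside mongosh; `reviews` is a hypothetical collection.
if (typeof db !== "undefined") {
  db.reviews.createIndex(weightedKeys, weightedOptions);
}
```

With two matches in subject and one in comments, the helper yields 2 * 10 + 1 * 5 = 25.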
Note
Wildcard Text Indexes are distinct from Wildcard Indexes. Wildcard indexes cannot support queries using the $text operator.
While Wildcard Text Indexes and Wildcard Indexes share the wildcard $** field pattern, they are distinct index types. Only Wildcard Text Indexes support the $text operator.
When creating a text index on multiple fields, you can also use the wildcard specifier ($**). With a wildcard text index, MongoDB indexes every field that contains string data for each document in the collection. For example, the index document { "$**": "text" } creates a text index using the wildcard specifier.
This index allows for text search on all fields with string content. Such an index can be useful with highly unstructured data if it is unclear which fields to include in the text index, or for ad-hoc querying.
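A wildcard text index could be created as in the sketch below. The collection name articles is hypothetical; note that the $** key must be quoted in JavaScript.

```javascript
// The wildcard specifier covers every field that contains string data.
const wildcardKeys = { "$**": "text" };

// `db` is only defined inside mongosh; `articles` is a hypothetical collection.
if (typeof db !== "undefined") {
  db.articles.createIndex(wildcardKeys);
}
```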
Wildcard text indexes are text indexes on multiple fields. As such, you can assign weights to specific fields during index creation to control the ranking of the results. For more information on using weights to control the results of a text search, see Control Search Results with Weights.
Wildcard text indexes, as with all text indexes, can be part of a compound index. For example, the index document { a: 1, "$**": "text" } creates a compound index on the field a as well as the wildcard specifier.
As with all compound text indexes, since a precedes the text index key, the query predicate must include an equality match condition on a in order to perform a $text search with this index. For information on compound text indexes, see Compound Text Indexes.
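The sketch below pairs such a compound index with a query that satisfies the equality requirement. The collection name articles, the value 5, and the search string are hypothetical.

```javascript
// Compound index: ascending key `a` followed by the wildcard text key.
const compoundKeys = { a: 1, "$**": "text" };

// The predicate carries an equality match on `a` alongside the $text search.
const compoundQuery = { a: 5, $text: { $search: "coffee" } };

// `db` is only defined inside mongosh; `articles` is a hypothetical collection.
if (typeof db !== "undefined") {
  db.articles.createIndex(compoundKeys);
  db.articles.find(compoundQuery).forEach((doc) => printjson(doc));
}
```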
Changed in version 3.2.
The version 3 text index supports the common C, simple S, and, for Turkish languages, the special T case foldings as specified in Unicode 8.0 Character Database Case Folding.
The case foldings expand the case insensitivity of the text index to include characters with diacritics, such as é and É, and characters from non-Latin alphabets, such as "И" and "и" in the Cyrillic alphabet.
Version 3 of the text index is also diacritic insensitive. As such, the index also does not distinguish between é, É, e, and E.
Previous versions of the text index are case insensitive for [A-z] only; i.e. case insensitive for non-diacritic Latin characters only. For all other characters, earlier versions of the text index treat them as distinct.
Changed in version 3.2.
With version 3, the text index is diacritic insensitive. That is, the index does not distinguish between characters that contain diacritical marks and their non-marked counterparts, such as é, ê, and e. More specifically, the text index strips the characters categorized as diacritics in Unicode 8.0 Character Database Prop List.
Version 3 of the text index is also case insensitive to characters with diacritics. As such, the index also does not distinguish between é, É, e, and E.
Previous versions of the text index treat characters with diacritics as distinct.
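To make the combined effect of case and diacritic insensitivity concrete, the sketch below approximates it in plain JavaScript. This is an illustration under stated assumptions, not MongoDB's implementation: foldForComparison is a hypothetical helper, it uses toLowerCase() in place of the Unicode C, S, and T case foldings, and it relies on the JavaScript engine's Unicode data rather than Unicode 8.0 specifically.

```javascript
// Approximation: decompose, strip diacritics and combining marks, lowercase.
function foldForComparison(term) {
  return term
    .normalize("NFD")
    .replace(/[\p{Diacritic}\p{M}]/gu, "")
    .toLowerCase();
}

// Terms that a version 3 text index would not distinguish fold to one form.
const foldedLatin = ["\u00e9", "\u00c9", "e", "E"].map(foldForComparison);
const foldedCyrillic = ["\u0418", "\u0438"].map(foldForComparison);
```

Here "\u00e9" and "\u00c9" are é and É, and "\u0418" and "\u0438" are the Cyrillic И and и.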
Changed in version 3.2.
For tokenization, the version 3 text index uses the delimiters categorized under Dash, Hyphen, Pattern_Syntax, Quotation_Mark, Terminal_Punctuation, and White_Space in Unicode 8.0 Character Database Prop List.
For example, if given a string "Il a dit qu'il «était le meilleur joueur du monde»", the text index treats «, », and spaces as delimiters.
Previous versions of the index treat « as part of the term "«était" and » as part of the term "monde»".
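The delimiter-based splitting can be approximated with Unicode property escapes, as in the sketch below. It is an illustration only: tokenize is a hypothetical helper, the Hyphen property is omitted because JavaScript regular expressions do not expose it, and the engine's Unicode version may be newer than 8.0.

```javascript
// Delimiters approximated with the Unicode properties JavaScript supports.
const delimiterPattern =
  /[\p{Dash}\p{Pattern_Syntax}\p{Quotation_Mark}\p{Terminal_Punctuation}\p{White_Space}]+/u;

// Split on runs of delimiters and drop the empty pieces at the edges.
function tokenize(text) {
  return text.split(delimiterPattern).filter((token) => token.length > 0);
}

const tokens = tokenize("Il a dit qu'il «était le meilleur joueur du monde»");
```

In this approximation the guillemets never end up inside a token, so "était" and "monde" come out as clean terms.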
The text index tokenizes and stems the terms in the indexed fields for the index entries. The text index stores one index entry for each unique stemmed term in each indexed field for each document in the collection. The index uses simple language-specific suffix stemming.
MongoDB supports text search for various languages. text indexes drop language-specific stop words (e.g. in English, the, an, a, and, etc.) and use simple language-specific suffix stemming. For a list of the supported languages, see Text Search Languages.
If you specify a language value of "none", then the text index uses simple tokenization with no list of stop words and no stemming.
To specify a language for the text index, see Specify a Language for Text Index.
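As a sketch of the "none" case, the snippet below sets the default_language option when creating the index. The collection name reviews and the field comments are hypothetical.

```javascript
// With "none", the index uses simple tokenization: no stop words, no stemming.
const noLanguageKeys = { comments: "text" };
const noLanguageOptions = { default_language: "none" };

// `db` is only defined inside mongosh; `reviews` is a hypothetical collection.
if (typeof db !== "undefined") {
  db.reviews.createIndex(noLanguageKeys, noLanguageOptions);
}
```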
sparse
text indexes are always sparse and ignore the sparse option. If a document lacks a text index field (or the field is null or an empty array), MongoDB does not add an entry for the document to the text index. For inserts, MongoDB inserts the document but does not add to the text index.
For a compound index that includes a text index key along with keys of other types, only the text index field determines whether the index references a document. The other keys do not determine whether the index references the documents or not.
A collection can have at most one text index.
You cannot use hint() if the query includes a $text query expression.
Sort operations cannot obtain sort order from a text index, even from a compound text index; i.e. sort operations cannot use the ordering in the text index.
A compound index can include a text index key in combination with ascending/descending index keys. However, these compound indexes have the following restrictions:
- A compound text index cannot include any other special index types, such as multi-key or geospatial index fields.
- If the compound text index includes keys preceding the text index key, to perform a $text search, the query predicate must include equality match conditions on the preceding keys.
- When creating a compound text index, all text index keys must be listed adjacently in the index specification document.
See also Text Index and Sort for additional limitations.
For an example of a compound text index, see Limit the Number of Entries Scanned.
To drop a text index, pass the name of the index to the db.collection.dropIndex() method. To get the name of the index, run the db.collection.getIndexes() method.
For information on the default naming scheme for text indexes as well as overriding the default name, see Specify Name for text Index.
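The lookup-then-drop flow might be sketched as follows. The helper findTextIndexName is hypothetical, and it assumes that getIndexes() reports a text index with the key { _fts: "text", _ftsx: 1 }; the sample index list and the name comments_text are made up for illustration.

```javascript
// Return the name of the first text index in a getIndexes()-style array.
function findTextIndexName(indexes) {
  const textIndex = indexes.find(
    (index) => index.key && index.key._fts === "text"
  );
  return textIndex ? textIndex.name : null;
}

// Hypothetical sample of what getIndexes() might return.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { _fts: "text", _ftsx: 1 }, name: "comments_text" },
];
const sampleName = findTextIndexName(sampleIndexes);

// `db` is only defined inside mongosh; `reviews` is a hypothetical collection.
if (typeof db !== "undefined") {
  const name = findTextIndexName(db.reviews.getIndexes());
  if (name !== null) {
    db.reviews.dropIndex(name);
  }
}
```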
text indexes only support simple binary comparison and do not support collation.
To create a text index on a collection that has a non-simple collation, you must explicitly specify {collation: {locale: "simple"}} when creating the index.
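A sketch of that situation: a collection is created with a non-simple default collation, and the text index is then created with the simple collation spelled out. The collection name reviewsFr, the locale "fr", and the field comments are hypothetical.

```javascript
// Hypothetical collection-level default collation.
const collectionOptions = { collation: { locale: "fr" } };

// The text index must state the simple collation explicitly.
const simpleCollationKeys = { comments: "text" };
const simpleCollationOptions = { collation: { locale: "simple" } };

// `db` is only defined inside mongosh.
if (typeof db !== "undefined") {
  db.createCollection("reviewsFr", collectionOptions);
  db.reviewsFr.createIndex(simpleCollationKeys, simpleCollationOptions);
}
```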
text indexes have the following storage requirements and performance costs:
- text indexes can be large.
- Building a text index is very similar to building a large multi-key index and will take longer than building a simple ordered (scalar) index on the same data.
- When building a text index on an existing collection, ensure that you have a sufficiently high limit on open file descriptors. See the recommended settings.
- text indexes will impact insertion throughput because MongoDB must add an index entry for each unique post-stemmed word in each indexed field of each new source document.
- text indexes do not store phrases or information about the proximity of words in the documents. As a result, phrase queries will run much more effectively when the entire collection fits in RAM.
The text index supports $text query operations. For examples of text search, see the $text reference page. For examples of $text operations in aggregation pipelines, see Text Search in the Aggregation Pipeline.
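As a brief sketch of a $text query, the snippet below searches an existing text index and sorts by the text score exposed through $meta. The collection name reviews, the search string, and the projected field name score are hypothetical.

```javascript
// Search terms are OR-ed by default; `score` is a hypothetical field name.
const textQuery = { $text: { $search: "coffee shop" } };
const scoreProjection = { score: { $meta: "textScore" } };

// `db` is only defined inside mongosh; `reviews` is a hypothetical collection.
if (typeof db !== "undefined") {
  db.reviews
    .find(textQuery, scoreProjection)
    .sort({ score: { $meta: "textScore" } })
    .forEach((doc) => printjson(doc));
}
```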