$text¶

Definition¶

$text¶

$text performs a text search on the content of the fields indexed with a text index. A $text expression has the following syntax:

Changed in version 3.2.

{
  $text:
    {
      $search: <string>,
      $language: <string>,
      $caseSensitive: <boolean>,
      $diacriticSensitive: <boolean>
    }
}

The $text operator accepts a text query document with the following fields:

Field	Type	Description
`$search`	string	A string of terms that MongoDB parses and uses to query the text index. MongoDB performs a logical `OR` search of the terms unless specified as a phrase. See Behavior for more information on the field.
`$language`	string	Optional. The language that determines the list of stop words for the search and the rules for the stemmer and tokenizer. If not specified, the search uses the default language of the index. For supported languages, see Text Search Languages. If you specify a language value of `"none"`, then the text search uses simple tokenization with no list of stop words and no stemming.
`$caseSensitive`	boolean	Optional. A boolean flag to enable or disable case sensitive search. Defaults to `false`; i.e. the search defers to the case insensitivity of the text index. For more information, see Case Insensitivity. New in version 3.2.
`$diacriticSensitive`	boolean	Optional. A boolean flag to enable or disable diacritic sensitive search against version 3 text indexes. Defaults to `false`; i.e. the search defers to the diacritic insensitivity of the text index. Text searches against earlier versions of the text index are inherently diacritic sensitive and cannot be diacritic insensitive. As such, the `$diacriticSensitive` option has no effect with earlier versions of the text index. For more information, see Diacritic Insensitivity. New in version 3.2.

The $text operator, by default, does not return results sorted in terms of the results’ scores. For more information on sorting by the text search scores, see the Text Score documentation.
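
As a hedged illustration of the fields listed above, the following sketch assembles a $text query document in plain JavaScript. The helper name buildTextQuery and its options argument are assumptions introduced here and are not part of MongoDB; the helper only checks the field types from the table and leaves out optional fields that were not supplied.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $text query document
// from a search string and the optional fields described in the table.
function buildTextQuery(search, options = {}) {
  if (typeof search !== "string") {
    throw new TypeError("$search must be a string");
  }
  const text = { $search: search };

  // $language is an optional string, e.g. "es", "spanish", or "none".
  if (options.language !== undefined) {
    if (typeof options.language !== "string") {
      throw new TypeError("$language must be a string");
    }
    text.$language = options.language;
  }

  // $caseSensitive and $diacriticSensitive are optional booleans.
  for (const flag of ["caseSensitive", "diacriticSensitive"]) {
    if (options[flag] !== undefined) {
      if (typeof options[flag] !== "boolean") {
        throw new TypeError("$" + flag + " must be a boolean");
      }
      text["$" + flag] = options[flag];
    }
  }
  return { $text: text };
}

const query = buildTextQuery("coffee", { language: "none", caseSensitive: true });
console.log(JSON.stringify(query));
```

In the shell, a document built this way could be passed as the filter argument of find().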

Behavior¶

Restrictions¶

A query can specify, at most, one $text expression.
The $text query cannot appear in $nor expressions.
The $text query cannot appear in $elemMatch query expressions or $elemMatch projection expressions.
To use a $text query in an $or expression, all clauses in the $or array must be indexed.
You cannot use hint() if the query includes a $text query expression.
You cannot specify $natural sort order if the query includes a $text expression.
You cannot combine the $text expression, which requires a special text index, with a query operator that requires a different type of special index. For example, you cannot combine a $text expression with the $near operator.
Views do not support text search.

If using the $text operator in aggregation, the following restrictions also apply.

The $match stage that includes a $text must be the first stage in the pipeline.
A text operator can only occur once in the stage.
The text operator expression cannot appear in $or or $not expressions.
The text search, by default, does not return the matching documents in order of matching scores. To sort by descending score, use the $meta aggregation expression in the $sort stage.
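
A minimal sketch of a pipeline that respects these restrictions, written as a plain JavaScript array: the $match stage with $text comes first, and the $sort stage uses the $meta expression to order by descending score. The checker function textMatchIsFirstStage is a hypothetical helper added for illustration; it inspects only top-level $match stages and is not a MongoDB API.

```javascript
// Pipeline with the $text match as the first stage, sorted by text score.
const pipeline = [
  { $match: { $text: { $search: "coffee bake" } } },
  { $sort: { score: { $meta: "textScore" } } }
];

// Hypothetical check: a $match stage containing $text may only be at index 0.
function textMatchIsFirstStage(stages) {
  return stages.every((stage, index) => {
    const match = stage.$match;
    const hasText = match !== null && typeof match === "object" && "$text" in match;
    return !hasText || index === 0;
  });
}

console.log(textMatchIsFirstStage(pipeline));
```

In the shell, the array could be passed to aggregate() on a collection that has a text index.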

`$search` Field¶

In the $search field, specify a string of words that the text operator parses and uses to query the text index.

The text operator treats most punctuation in the string as delimiters, except a hyphen-minus (-), which negates a term, and escaped double quotes (\"), which specify a phrase.

Phrases¶

To match on a phrase, as opposed to individual terms, enclose the phrase in escaped double quotes (\"), as in:

"\"ssl certificate\""

If the $search string includes a phrase and individual terms, text search will only match the documents that include the phrase.

For example, passed a $search string:

"\"ssl certificate\" authority key"

The $text operator searches for the phrase "ssl certificate".

Negations¶

Prefixing a word with a hyphen-minus (-) negates a word:

The negated word excludes documents that contain the negated word from the result set.
When passed a search string that only contains negated words, text search will not match any documents.
A hyphenated word, such as pre-market, is not a negation. If used in a hyphenated word, the $text operator treats the hyphen-minus (-) as a delimiter. To negate the word market in this instance, include a space between pre and -market, i.e., pre -market.

The $text operator adds all negations to the query with the logical AND operator.
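
The phrase and negation rules above can be approximated with a short classifier. This is a rough sketch in plain JavaScript, not MongoDB's tokenizer: the function name classifySearchString is hypothetical, and the sketch ignores stop words, stemming, language rules, and negated phrases. It only shows how quoted phrases, leading hyphen-minus signs, and inner hyphens are told apart.

```javascript
// Illustrative only: a rough classifier for a $search string.
function classifySearchString(search) {
  const phrases = [];
  const negated = [];
  const terms = [];

  // Pull out double-quoted phrases first and replace them with a space.
  const rest = search.replace(/"([^"]*)"/g, (match, phrase) => {
    const trimmed = phrase.trim();
    if (trimmed !== "") phrases.push(trimmed);
    return " ";
  });

  for (const token of rest.split(/\s+/)) {
    // A hyphen-minus at the start of a token marks a negation.
    const isNegated = token.startsWith("-");
    // Other punctuation, including a hyphen inside a word, is a delimiter.
    const words = token.split(/[^\p{L}\p{N}]+/u).filter((w) => w !== "");
    words.forEach((word, i) => {
      // Assumption of this sketch: only the word directly after the
      // leading hyphen-minus is negated.
      if (isNegated && i === 0) negated.push(word);
      else terms.push(word);
    });
  }
  return { phrases, negated, terms };
}

console.log(classifySearchString('"ssl certificate" authority key'));
console.log(classifySearchString("pre-market"));
console.log(classifySearchString("pre -market"));
```

With this sketch, pre-market yields two ordinary terms, while pre -market yields the term pre and the negated word market, mirroring the distinction described above.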

Match Operation¶

Stop Words¶

The $text operator ignores language-specific stop words, such as "the" and "and" in English.

Stemmed Words¶

For case insensitive and diacritic insensitive text searches, the $text operator matches on the complete stemmed word. So if a document field contains the word blueberry, a search on the term blue will not match. However, blueberry or blueberries will match.

Case Sensitive Search and Stemmed Words¶

For case sensitive search (i.e. $caseSensitive: true), if the suffix stem contains uppercase letters, the $text operator matches on the exact word.

Diacritic Sensitive Search and Stemmed Words¶

For diacritic sensitive search (i.e. $diacriticSensitive: true), if the suffix stem contains the diacritic mark or marks, the $text operator matches on the exact word.

Case Insensitivity¶

Changed in version 3.2.

The $text operator defaults to the case insensitivity of the text index:

The version 3 text index is case insensitive for Latin characters with or without diacritics and characters from non-Latin alphabets, such as the Cyrillic alphabet. See text index for details.
Earlier versions of the text index are case insensitive for Latin characters without diacritic marks; i.e. for [A-z].

`$caseSensitive` Option¶

To support case sensitive search where the text index is case insensitive, specify $caseSensitive: true.

Case Sensitive Search Process¶

When performing a case sensitive search ($caseSensitive: true) where the text index is case insensitive, the $text operator:

First searches the text index for case insensitive and diacritic matches.
Then, to return just the documents that match the case of the search terms, the $text query operation includes an additional stage to filter out the documents that do not match the specified case.

For case sensitive search (i.e. $caseSensitive: true), if the suffix stem contains uppercase letters, the $text operator matches on the exact word.

Specifying $caseSensitive: true may impact performance.
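
The two-stage process can be pictured with a toy model in plain JavaScript. This is a sketch under simplifying assumptions, not how the server is implemented: it splits on whitespace, does no stemming, and the function name caseSensitiveSearch is hypothetical. The sample documents are a subset of those used in the Examples section.

```javascript
// Toy data: a subset of the articles used in the Examples section.
const coffeeDocs = [
  { _id: 1, subject: "coffee" },
  { _id: 2, subject: "Coffee Shopping" },
  { _id: 7, subject: "coffee and cream" }
];

// Hypothetical model of a case sensitive search for a single term.
function caseSensitiveSearch(documents, term) {
  const words = (doc) => doc.subject.split(/\s+/);

  // Stage 1: case insensitive lookup (stands in for the text index).
  const candidates = documents.filter((doc) =>
    words(doc).some((w) => w.toLowerCase() === term.toLowerCase())
  );

  // Stage 2: filter out candidates that do not match the specified case.
  const matches = candidates.filter((doc) => words(doc).includes(term));

  return { candidates, matches };
}

const caseResult = caseSensitiveSearch(coffeeDocs, "Coffee");
console.log(caseResult.candidates.map((d) => d._id));
console.log(caseResult.matches.map((d) => d._id));
```

The extra filtering pass over the candidates is one way to see why the option may impact performance.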

See also

Stemmed Words

Diacritic Insensitivity¶

Changed in version 3.2.

The $text operator defaults to the diacritic insensitivity of the text index:

The version 3 text index is diacritic insensitive. That is, the index does not distinguish between characters that contain diacritical marks and their non-marked counterpart, such as é, ê, and e.
Earlier versions of the text index are diacritic sensitive.

`$diacriticSensitive` Option¶

To support diacritic sensitive text search against the version 3 text index, specify $diacriticSensitive: true.

Text searches against earlier versions of the text index are inherently diacritic sensitive and cannot be diacritic insensitive. As such, the $diacriticSensitive option for the $text operator has no effect with earlier versions of the text index.

Diacritic Sensitive Search Process¶

To perform a diacritic sensitive text search ($diacriticSensitive: true) against a version 3 text index, the $text operator:

First searches the text index, which is diacritic insensitive.
Then, to return just the documents that match the diacritic marked characters of the search terms, the $text query operation includes an additional stage to filter out the documents that do not match.

Specifying $diacriticSensitive: true may impact performance.
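
A comparable toy model for the diacritic sensitive process, again in plain JavaScript and again only a sketch: the function name diacriticSensitiveSearch and the folding helpers are hypothetical, tokenization is by whitespace, and no stemming is applied. Diacritics are removed by decomposing to NFD and stripping combining marks, which is an assumption of this sketch rather than a statement about the server.

```javascript
// Toy data: two articles from the Examples section.
const cafeDocs = [
  { _id: 5, subject: "Café Con Leche" },
  { _id: 8, subject: "Cafe con Leche" }
];

// Lowercase and compose, so composed and decomposed forms compare equal.
const foldCase = (s) => s.normalize("NFC").toLowerCase();
// Additionally strip combining marks to ignore diacritics.
const foldDiacritics = (s) =>
  foldCase(s).normalize("NFD").replace(/\p{M}/gu, "");

// Hypothetical model of a diacritic sensitive search for a single term.
function diacriticSensitiveSearch(documents, term) {
  const words = (doc) => doc.subject.split(/\s+/);

  // Stage 1: diacritic insensitive lookup (stands in for the text index).
  const candidates = documents.filter((doc) =>
    words(doc).some((w) => foldDiacritics(w) === foldDiacritics(term))
  );

  // Stage 2: filter out candidates whose diacritics differ from the term.
  const matches = candidates.filter((doc) =>
    words(doc).some((w) => foldCase(w) === foldCase(term))
  );

  return { candidates, matches };
}

const diacriticResult = diacriticSensitiveSearch(cafeDocs, "CAFÉ");
console.log(diacriticResult.candidates.map((d) => d._id));
console.log(diacriticResult.matches.map((d) => d._id));
```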

To perform a diacritic sensitive search against an earlier version of the text index, the $text operator searches the text index which is diacritic sensitive.

For diacritic sensitive search, if the suffix stem contains the diacritic mark or marks, the $text operator matches on the exact word.

See also

Stemmed Words

Text Score¶

The $text operator assigns a score to each document that contains the search term in the indexed fields. The score represents the relevance of a document to a given text search query. The score can be part of a sort() method specification as well as part of the projection expression. The { $meta: "textScore" } expression provides information on the processing of the $text operation. See $meta projection operator for details on accessing the score for projection or sort.

Examples¶

The following examples assume a collection articles that has a version 3 text index on the field subject:

db.articles.createIndex( { subject: "text" } )

Populate the collection with the following documents:

db.articles.insertMany(
   [
     { _id: 1, subject: "coffee", author: "xyz", views: 50 },
     { _id: 2, subject: "Coffee Shopping", author: "efg", views: 5 },
     { _id: 3, subject: "Baking a cake", author: "abc", views: 90  },
     { _id: 4, subject: "baking", author: "xyz", views: 100 },
     { _id: 5, subject: "Café Con Leche", author: "abc", views: 200 },
     { _id: 6, subject: "Сырники", author: "jkl", views: 80 },
     { _id: 7, subject: "coffee and cream", author: "efg", views: 10 },
     { _id: 8, subject: "Cafe con Leche", author: "xyz", views: 10 }
   ]
)

Search for a Single Word¶

The following query specifies a $search string of coffee:

db.articles.find( { $text: { $search: "coffee" } } )

This query returns the documents that contain the term coffee in the indexed subject field, or more precisely, the stemmed version of the word:

{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
{ "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 }
{ "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 }

See also

Case Insensitivity, Stemmed Words

Match Any of the Search Terms¶

If the search string is a space-delimited string, the $text operator performs a logical OR search on each term and returns documents that contain any of the terms.

The following query specifies a $search string of three terms delimited by space, "bake coffee cake":

db.articles.find( { $text: { $search: "bake coffee cake" } } )

This query returns documents that contain either bake or coffee or cake in the indexed subject field, or more precisely, the stemmed version of these words:

{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
{ "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 }
{ "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 }
{ "_id" : 3, "subject" : "Baking a cake", "author" : "abc", "views" : 90 }
{ "_id" : 4, "subject" : "baking", "author" : "xyz", "views" : 100 }

See also

Case Insensitivity, Stemmed Words

Search for a Phrase¶

To match the exact phrase as a single term, escape the quotes.

The following query searches for the phrase coffee shop:

db.articles.find( { $text: { $search: "\"coffee shop\"" } } )

This query returns documents that contain the phrase coffee shop:

{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }

See also

Phrases

Exclude Documents That Contain a Term¶

A negated term is a term that is prefixed by a minus sign -. If you negate a term, the $text operator will exclude the documents that contain those terms from the results.

The following example searches for documents that contain the word coffee but do not contain the term shop, or more precisely the stemmed version of the words:

db.articles.find( { $text: { $search: "coffee -shop" } } )

The query returns the following documents:

{ "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 }
{ "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 }

See also

Negations, Stemmed Words

Search a Different Language¶

Use the optional $language field in the $text expression to specify a language that determines the list of stop words and the rules for the stemmer and tokenizer for the search string.

If you specify a language value of "none", then the text search uses simple tokenization with no list of stop words and no stemming.

The following query specifies es, i.e. Spanish, as the language that determines the tokenization, stemming, and stop words:

db.articles.find(
   { $text: { $search: "leche", $language: "es" } }
)

The query returns the following documents:

{ "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
{ "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }

The $text expression can also accept the language by name, spanish. See Text Search Languages for the supported languages.

See also

Case Insensitivity

Case and Diacritic Insensitive Search¶

Changed in version 3.2.

The $text operator defers to the case and diacritic insensitivity of the text index. The version 3 text index is diacritic insensitive and expands its case insensitivity to include the Cyrillic alphabet as well as characters with diacritics. For details, see text Index Case Insensitivity and text Index Diacritic Insensitivity.

The following query performs a case and diacritic insensitive text search for the terms сы́рники or CAFÉS:

db.articles.find( { $text: { $search: "сы́рники CAFÉS" } } )

Using the version 3 text index, the query matches the following documents.

{ "_id" : 6, "subject" : "Сырники", "author" : "jkl", "views" : 80 }
{ "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
{ "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }

With the previous versions of the text index, the query would not match any document.

Perform Case Sensitive Search¶

Changed in version 3.2.

To enable case sensitive search, specify $caseSensitive: true. Specifying $caseSensitive: true may impact performance.

Case Sensitive Search for a Term¶

The following query performs a case sensitive search for the term Coffee:

db.articles.find( { $text: { $search: "Coffee", $caseSensitive: true } } )

The search matches just the document:

{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }

Case Sensitive Search for a Phrase¶

The following query performs a case sensitive search for the phrase Café Con Leche:

db.articles.find( {
   $text: { $search: "\"Café Con Leche\"", $caseSensitive: true }
} )

The search matches just the document:

{ "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }

Case Sensitivity with Negated Term¶

A negated term is a term that is prefixed by a minus sign -. If you negate a term, the $text operator will exclude the documents that contain those terms from the results. You can also specify case sensitivity for negated terms.

The following example performs a case sensitive search for documents that contain the word Coffee but do not contain the lower-case term shop, or more precisely the stemmed version of the words:

db.articles.find( { $text: { $search: "Coffee -shop", $caseSensitive: true } } )

The query matches the following document:

{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }

Diacritic Sensitive Search¶

Changed in version 3.2.

To enable diacritic sensitive search against a version 3 text index, specify $diacriticSensitive: true. Specifying $diacriticSensitive: true may impact performance.

Diacritic Sensitive Search for a Term¶

The following query performs a diacritic sensitive text search on the term CAFÉ, or more precisely the stemmed version of the word:

db.articles.find( { $text: { $search: "CAFÉ", $diacriticSensitive: true } } )

The query only matches the following document:

{ "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }

Diacritic Sensitivity with Negated Term¶

The $diacriticSensitive option applies also to negated terms. A negated term is a term that is prefixed by a minus sign -. If you negate a term, the $text operator will exclude the documents that contain those terms from the results.

The following query performs a diacritic sensitive text search for documents that contain the term leches but not the term cafés, or more precisely the stemmed version of the words:

db.articles.find(
  { $text: { $search: "leches -cafés", $diacriticSensitive: true } }
)

The query matches the following document:

{ "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }

Text Search Score Examples¶

Return the Text Search Score¶

The following query performs a text search for the term cake and uses the $meta operator in the projection document to append the relevance score to each matching document:

db.articles.find(
   { $text: { $search: "cake" } },
   { score: { $meta: "textScore" } }
)

The returned document includes an additional field score that contains the document’s relevance score:

{ "_id" : 3, "subject" : "Baking a cake", "author" : "abc", "views" : 90, "score" : 0.75 }

See also

$meta

Sort by Text Search Score¶

Starting in MongoDB 4.4, you can specify the { $meta: "textScore" } expression in the sort() without also specifying the expression in the projection. For example,
```
db.articles.find(
   { $text: { $search: "cake" } }
).sort( { score: { $meta: "textScore" } } )
```
As a result, you can sort the resulting documents by their search relevance without projecting the textScore.

In earlier versions, to include the { $meta: "textScore" } expression in the sort(), you must also include the same expression in the projection.

Starting in MongoDB 4.4, if you include the { $meta: "textScore" } expression in both the projection and sort(), the projection and sort documents can have different field names for the expression.

For example, in the following operation, the projection uses a field named score for the expression and the sort() uses the field named ignoredName.
```
db.articles.find(
   { $text: { $search: "cake" } } ,
   { score: { $meta: "textScore" } }
).sort( { ignoredName: { $meta: "textScore" } } )
```
In previous versions of MongoDB, if { $meta: "textScore" } is included in both the projection and sort, you must specify the same field name for the expression.

In MongoDB 4.2 and earlier, to sort by the text score, include the same $meta expression in both the projection document and the sort expression. The following query searches for the term coffee and sorts the results by the descending score:
```
db.articles.find(
   { $text: { $search: "coffee" } },
   { score: { $meta: "textScore" } }
).sort( { score: { $meta: "textScore" } } )
```
The query returns the matching documents sorted by descending score.

See also

$meta

Return Top 2 Matching Documents¶

Use the limit() method in conjunction with a sort() to return the top n matching documents.

The following query searches for the term coffee and sorts the results by the descending score, limiting the results to the top two matching documents:

db.articles.find(
   { $text: { $search: "coffee" } },
   { score: { $meta: "textScore" } }
).sort( { score: { $meta: "textScore" } } ).limit(2)

See also

$meta

Text Search with Additional Query and Sort Expressions¶

The following query searches for documents where the author equals "xyz" and the indexed field subject contains the terms coffee or bake. The operation also specifies a sort order of ascending date, then descending text search score:

db.articles.find(
   { author: "xyz", $text: { $search: "coffee bake" } },
   { score: { $meta: "textScore" } }
).sort( { date: 1, score: { $meta: "textScore" } } )

See also

Text Search in the Aggregation Pipeline
