Search Text

Text search, using the $text query operator, lets you search string-type fields in your collection for words or phrases. The operator performs a logical OR on the terms in the search string, which are separated by spaces. You can also pass options to the operator to control case sensitivity, word stemming (for example, plural forms and tense), and stop words for a supported language. This is particularly useful for unstructured text such as transcripts, essays, or web pages.

The $text query operator requires that the fields you search are covered by a text index on your collection. The examples below show how to create a text index and how to use the $text query operator.

Note

Atlas Search makes it easy to build fast, relevance-based search capabilities on top of your MongoDB data. Try it today on MongoDB Atlas, our fully managed database as a service.

The following examples use the movies collection in the sample_mflix database. To enable text searches on the title field, create the text index using the following command:

db.movies.createIndex({ title: "text" });

We use this text index for the examples, but you can create a compound text index that broadens your text queries to multiple fields as follows:

db.movies.createIndex({ title: "text", fullplot: "text" });

You can create only one text index per collection. Every text search queries all the fields specified in that index for matches.
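
The shell commands above translate directly to the Node.js driver used in the later examples. The following is a minimal sketch, assuming `movies` is a driver `Collection` handle for `sample_mflix.movies`; the helper names `textIndexSpec` and `ensureTextIndex` are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: build a text index specification for the given fields.
// textIndexSpec(["title", "fullplot"]) gives { title: "text", fullplot: "text" }.
function textIndexSpec(fields) {
  if (!Array.isArray(fields) || fields.length === 0) {
    throw new Error("Provide at least one field name");
  }
  return Object.fromEntries(fields.map((field) => [field, "text"]));
}

// Assumes `movies` is a Collection obtained from a connected MongoClient.
// createIndex resolves to the name of the index.
async function ensureTextIndex(movies, fields) {
  return movies.createIndex(textIndexSpec(fields));
}
```

Because a collection holds a single text index, changing the set of indexed fields typically means dropping the existing text index before creating the new one.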

See the MongoDB server manual for more information on creating text indexes.

This example queries for Star Trek movies by searching for titles containing the word "trek". To query using multiple words, separate the words with spaces; the query then matches documents that contain any of the search terms (logical OR).

  const query = { $text: { $search: "trek" } };

  // Return only the `title` of each matched document
  const projection = {
    _id: 0,
    title: 1,
  };

  // find documents based on our query and projection
  const cursor = movies.find(query).project(projection);

This operation returns the following documents:

{ title: 'Trek Nation' }
{ title: 'Star Trek' }
{ title: 'Star Trek Into Darkness' }
{ title: 'Star Trek: Nemesis' }
{ title: 'Star Trek: Insurrection' }
{ title: 'Star Trek: Generations' }
{ title: 'Star Trek: First Contact' }
{ title: 'Star Trek: The Motion Picture' }
{ title: 'Star Trek VI: The Undiscovered Country' }
{ title: 'Star Trek V: The Final Frontier' }
{ title: 'Star Trek IV: The Voyage Home' }
{ title: 'Star Trek III: The Search for Spock' }
{ title: 'Star Trek II: The Wrath of Khan' }

Success! The query found every document in the movies collection with a title including the word "trek". Unfortunately, the search included one unintended item: "Trek Nation," which is a movie about Star Trek and not part of the Star Trek movie series. To solve this, we can query with a more specific phrase.

To make your query more specific, try using the phrase "star trek" instead of just the word "trek". To search by phrase, surround your multi-word phrase with escaped quotes (\"<term>\"):

  const query = { $text: { $search: "\"star trek\"" } };

  // Return only the `title` of each matched document
  const projection = {
    _id: 0,
    title: 1,
  };

  // find documents based on our query and projection
  const cursor = movies.find(query).project(projection);

Querying by the phrase "star trek" instead of just the term "trek" matches the following documents:

{ title: 'Star Trek' }
{ title: 'Star Trek Into Darkness' }
{ title: 'Star Trek: Nemesis' }
{ title: 'Star Trek: Insurrection' }
{ title: 'Star Trek: Generations' }
{ title: 'Star Trek: First Contact' }
{ title: 'Star Trek: The Motion Picture' }
{ title: 'Star Trek VI: The Undiscovered Country' }
{ title: 'Star Trek V: The Final Frontier' }
{ title: 'Star Trek IV: The Voyage Home' }
{ title: 'Star Trek III: The Search for Spock' }
{ title: 'Star Trek II: The Wrath of Khan' }

These results include all movies in the database that contain the phrase "star trek", which in this case results in only fictional Star Trek movies. Unfortunately, though, this query returned "Star Trek Into Darkness", a movie that was not part of the original series of movies. To resolve this issue, we can omit that document with a negation.

To use a negated term, place a negative sign (-) in front of the term you would like to omit from the result set. The query operation omits any documents that contain this term from the search result. Since this query includes two distinct terms, separate them with a space.

  const query = { $text: { $search: "\"star trek\" -\"into darkness\"" } };

  // Include only the `title` field of each matched document
  const projection = {
    _id: 0,
    title: 1,
  };

  // find documents based on our query and projection
  const cursor = movies.find(query).project(projection);

Querying with the negated term yields the following documents:

{ title: 'Star Trek' }
{ title: 'Star Trek: Nemesis' }
{ title: 'Star Trek: Insurrection' }
{ title: 'Star Trek: Generations' }
{ title: 'Star Trek: First Contact' }
{ title: 'Star Trek: The Motion Picture' }
{ title: 'Star Trek VI: The Undiscovered Country' }
{ title: 'Star Trek V: The Final Frontier' }
{ title: 'Star Trek IV: The Voyage Home' }
{ title: 'Star Trek III: The Search for Spock' }
{ title: 'Star Trek II: The Wrath of Khan' }
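
The phrase and negation syntax described above can be assembled by a small helper instead of hand-escaping quotes. This is a sketch only: `buildSearchString` and its option names are hypothetical, and it does not escape double quotes that appear inside the inputs.

```javascript
// Hypothetical helper: assemble a $search string from its parts.
//   terms    - single words, matched with logical OR
//   phrases  - multi-word phrases, wrapped in double quotes
//   excluded - words or phrases to negate with a leading "-"
function buildSearchString({ terms = [], phrases = [], excluded = [] }) {
  // Quote only values that contain whitespace, so single words stay bare.
  const quote = (text) => (/\s/.test(text) ? `"${text}"` : text);
  const parts = [
    ...terms,
    ...phrases.map((phrase) => `"${phrase}"`),
    ...excluded.map((text) => `-${quote(text)}`),
  ];
  // A space between parts keeps each "-" at the start of its own term.
  return parts.join(" ");
}

const query = {
  $text: {
    $search: buildSearchString({
      phrases: ["star trek"],
      excluded: ["into darkness"],
    }),
  },
};
console.log(query.$text.$search); // "star trek" -"into darkness"
```

Building the string with template literals avoids the backslash escapes needed when the quotes are written inside a double-quoted JavaScript string.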

Note

Your query operation may return a reference to a cursor that contains matching documents. To learn how to examine data stored in the cursor, see the Cursor Fundamentals page.
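
As a brief illustration of consuming a cursor, the sketch below drains one into an array of titles with `for await`. It assumes `cursor` is the value returned by `movies.find(query).project(projection)` in the examples above; the function name `collectTitles` is hypothetical.

```javascript
// Hypothetical helper: read every document from a cursor and keep its title.
// Works with any async iterable, which includes the driver's find cursors.
async function collectTitles(cursor) {
  const titles = [];
  for await (const doc of cursor) {
    titles.push(doc.title);
  }
  return titles;
}
```

With a live connection you might call `const titles = await collectTitles(cursor);`. For small result sets, `await cursor.toArray()` is an alternative that loads all documents at once.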

Now that the result set reflects the desired results, you can use the text search textScore, accessed using the $meta operator in the query projection, to order the results by relevance:

  const query = { $text: { $search: "\"star trek\" -\"into darkness\"" } };

  // sort returned documents by descending text relevance score
  const sort = { score: { $meta: "textScore" } };

  // Include only the `title` and `score` fields in each returned document
  const projection = {
    _id: 0,
    title: 1,
    score: { $meta: "textScore" },
  };

  // find documents based on our query, sort, and projection
  const cursor = movies
    .find(query)
    .sort(sort)
    .project(projection);

Querying in this way returns the following documents in the following order. In general, text relevance increases as a string matches more terms and decreases as the unmatched portion of the string lengthens.

{ title: 'Star Trek', score: 1.5 }
{ title: 'Star Trek: Generations', score: 1.3333333333333333 }
{ title: 'Star Trek: Insurrection', score: 1.3333333333333333 }
{ title: 'Star Trek: Nemesis', score: 1.3333333333333333 }
{ title: 'Star Trek: The Motion Picture', score: 1.25 }
{ title: 'Star Trek: First Contact', score: 1.25 }
{ title: 'Star Trek II: The Wrath of Khan', score: 1.2 }
{ title: 'Star Trek III: The Search for Spock', score: 1.2 }
{ title: 'Star Trek IV: The Voyage Home', score: 1.2 }
{ title: 'Star Trek V: The Final Frontier', score: 1.2 }
{ title: 'Star Trek VI: The Undiscovered Country', score: 1.2 }
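
The scores above are consistent with a simple model in which each matched term contributes 0.5 + 0.5 / n, where n is the number of words in the title that remain after stop words such as "the" and "of" are dropped. This model is inferred from the listed output for illustration only; it is not taken from the $text documentation and ignores factors such as repeated terms and field weights. The function name `approximateScore` is hypothetical.

```javascript
// Assumed model, inferred from the sample output rather than from the manual:
// every matched term adds 0.5 + 0.5 / tokenCount to the score, where
// tokenCount is the number of non-stop-word tokens in the indexed field.
function approximateScore(matchedTerms, tokenCount) {
  return matchedTerms * (0.5 + 0.5 / tokenCount);
}

// "star" and "trek" match in each title; only the title length changes.
console.log(approximateScore(2, 2)); // 'Star Trek' -> 1.5
console.log(approximateScore(2, 4)); // 'Star Trek: First Contact' -> 1.25
```

Under this model, a longer title dilutes the contribution of each matched term, which matches the downward trend in the listing.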

For more information about the $text operator and its options, see the manual entry.