$regexFindAll
New in version 4.2.
Provides regular expression (regex) pattern matching capability in aggregation expressions. The operator returns an array of documents that contain information on each match. If no match is found, it returns an empty array.
MongoDB uses Perl compatible regular expressions (i.e. "PCRE") version 8.41 with UTF-8 support.
Prior to MongoDB 4.2, an aggregation pipeline can only use the query operator $regex in the $match stage. For more information on using regex in a query, see $regex.
The $regexFindAll operator has the following syntax:
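As a minimal sketch of the expression's shape, the operator can be written as a plain JavaScript object with the three fields input, regex, and options. The field path "$description", the pattern /line/, the output field name returnObject, and the collection name in the comment are placeholder values chosen for illustration.

```javascript
// Shape of a $regexFindAll expression, written as a plain JavaScript object.
// "$description" and /line/ are placeholder values, not taken from a real schema.
const expression = {
  $regexFindAll: {
    input: "$description", // string, or an expression that resolves to a string
    regex: /line/,         // regex pattern to apply
    options: "i"           // optional flags, given here instead of on the pattern
  }
};

// The expression can be placed in a stage such as $addFields.
// "returnObject" is a hypothetical output field name.
const pipeline = [{ $addFields: { returnObject: expression } }];

// In mongosh this would run as: db.products.aggregate(pipeline)
console.log(Object.keys(expression.$regexFindAll).join(", "));
```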
The expression takes the fields input, regex, and options.
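The operator returns an array of documents, one for each match. Each document contains the matching string (match), the code point index of the match in the input (idx), and a captures array that holds the groups captured by the match. Capture groups are specified with unescaped parentheses () in the regex pattern.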
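The shape of the returned array can be approximated locally. The sketch below is a rough emulation in plain JavaScript, not MongoDB's implementation: regexFindAllSketch is a hypothetical helper, it uses the JavaScript regex engine rather than PCRE, and the input string is a made-up sample.

```javascript
// Rough local emulation of the documents $regexFindAll returns.
// regexFindAllSketch is a hypothetical helper, not part of any MongoDB driver.
function regexFindAllSketch(input, regex) {
  // String.prototype.matchAll requires the global flag; add it if it is missing.
  const flags = regex.flags.includes("g") ? regex.flags : regex.flags + "g";
  const globalRegex = new RegExp(regex.source, flags);
  return Array.from(input.matchAll(globalRegex), (m) => ({
    match: m[0],
    // Count code points before the match rather than UTF-16 code units.
    idx: Array.from(input.slice(0, m.index)).length,
    // Groups that did not take part in the match are reported as null.
    captures: m.slice(1).map((c) => (c === undefined ? null : c)),
  }));
}

const results = regexFindAllSketch("First line\nsecond line", /lin(e|k)/);
console.log(JSON.stringify(results));
// [{"match":"line","idx":6,"captures":["e"]},{"match":"line","idx":18,"captures":["e"]}]
```

A pattern with no match yields an empty array from this helper, which mirrors the empty-array behavior described for the operator.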
$regexFindAll ignores the collation specified for the collection, db.collection.aggregate(), and the index, if used.
For example, create a sample collection with collation strength 1 (i.e. compare base characters only and ignore other differences such as case and diacritics):
Insert the following documents:
Using the collection's collation, the following operation performs a case-insensitive and diacritic-insensitive match:
The operation returns the following 3 documents:
However, the aggregation expression $regexFind ignores collation; that is, the following regular expression pattern matching examples are case-sensitive and diacritic-sensitive:
Both operations return the following:
To perform case-insensitive regex pattern matching, use the i Option instead. See i Option for an example.
captures Output Behavior
If your regex pattern contains capture groups and the pattern finds a match in the input, the captures array in the results corresponds to the groups captured by the matching string. Capture groups are specified with unescaped parentheses () in the regex pattern. The length of the captures array equals the number of capture groups in the pattern, and the order of the array matches the order in which the capture groups appear.
Create a sample collection named contacts with the following documents:
The following pipeline applies the regex pattern /(C(ar)*)ol/ to the fname field:
The regex pattern finds a match with fname values Carol and Colleen:
The pattern contains the capture group (C(ar)*), which contains the nested group (ar). The elements in the captures array correspond to the two capture groups. If a group captures nothing in a matching document (e.g. Colleen and the group (ar)), $regexFindAll replaces the group with a null placeholder.
As shown in the previous example, the captures array contains an element for each capture group (using null for non-captures). Consider the following example, which searches for phone numbers with New York City area codes by applying a logical or of capture groups to the phone field. Each group represents a New York City area code:
For documents that are matched by the regex pattern, the captures array includes the matching capture group and replaces any groups that did not capture with null:
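The null placeholders described above can be reproduced with a short local sketch. It uses the JavaScript regex engine and looks at the first match only, so it approximates the captures array rather than reproducing the operator. captureGroups is a hypothetical helper, and the area-code pattern and phone number are made-up sample values.

```javascript
// captureGroups is a hypothetical helper that mimics the captures array
// for the first match only; groups that captured nothing become null.
function captureGroups(input, regex) {
  const m = input.match(regex);
  return m === null ? null : m.slice(1).map((c) => (c === undefined ? null : c));
}

// Nested groups from the text: (C(ar)*) contains (ar).
console.log(JSON.stringify(captureGroups("Carol", /(C(ar)*)ol/)));   // ["Car","ar"]
console.log(JSON.stringify(captureGroups("Colleen", /(C(ar)*)ol/))); // ["C",null]

// Logical "or" of groups: only the alternative that matched is non-null.
// The pattern and phone number below are made-up sample values.
const areaCodes = /^(718)|^(212)|^(917)/;
console.log(JSON.stringify(captureGroups("212-555-0100", areaCodes))); // [null,"212",null]
```

In each case the array length equals the number of groups in the pattern, whether or not every group took part in the match.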
To illustrate the behavior of the $regexFindAll operator as discussed in this example, create a sample collection products with the following documents:
By default, $regexFindAll performs a case-sensitive match. For example, the following aggregation performs a case-sensitive $regexFindAll on the description field. The regex pattern /line/ does not specify any grouping:
The operation returns the following:
The following regex pattern /lin(e|k)/ specifies a grouping (e|k) in the pattern:
The operation returns the following:
In the returned documents, the idx field is the code point index and not the byte index. To illustrate, consider the following example that uses the regex pattern /tier/:
The operation returns the following, where only the last record matches the pattern and the returned idx is 2 (instead of 3 if using a byte index).
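The difference between the two indexes can be checked locally. In this sketch the input "métier" is an assumed sample value (written with the escape \u00e9 so the accented letter is a single code point); it has one character before "tier" that occupies two bytes in UTF-8.

```javascript
// "m\u00e9tier" ("métier") is an assumed sample value:
// é is a single code point but takes two bytes in UTF-8.
const sample = "m\u00e9tier";
const unitOffset = sample.indexOf("tier");    // UTF-16 code unit offset in JavaScript
const prefix = sample.slice(0, unitOffset);   // text before the match: "mé"

const codePointIdx = Array.from(prefix).length;            // code points before the match
const byteIdx = new TextEncoder().encode(prefix).length;   // UTF-8 bytes before the match

console.log(codePointIdx, byteIdx); // 2 3
```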
i Option
Note
You cannot specify options in both the regex and the options field.
To perform case-insensitive pattern matching, include the i option as part of the regex field or in the options field:
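The two placements can be sketched as plain JavaScript objects. The field path "$description" and the pattern are placeholders; the point is only where the i flag goes.

```javascript
// Placeholders: "$description" and the pattern "line" are illustrative only.

// i given as part of the regex field
const inlineFlag = { $regexFindAll: { input: "$description", regex: /line/i } };

// i given in the options field, with the pattern as a regex or as a string
const optionsWithRegex = { $regexFindAll: { input: "$description", regex: /line/, options: "i" } };
const optionsWithString = { $regexFindAll: { input: "$description", regex: "line", options: "i" } };

// Not allowed: flags in both places, such as regex: /line/i together with options: "i".
console.log(inlineFlag.$regexFindAll.regex.flags, optionsWithRegex.$regexFindAll.options);
```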
For example, the following aggregation performs a case-insensitive $regexFindAll on the description field. The regex pattern /line/ does not specify any grouping:
The operation returns the following documents:
m Option
Note
You cannot specify options in both the regex and the options field.
To match the specified anchors (e.g. ^, $) for each line of a multiline string, include the m option as part of the regex field or in the options field:
The following example includes both the i and the m options to match lines starting with either the letter s or S for multiline strings:
The operation returns the following:
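As a minimal local sketch of what the i and m flags do together, the snippet below runs /^s/ with both flags against a made-up three-line string. It uses the JavaScript regex engine, where the extra g flag is only there because matchAll requires it; the operator itself does not need it.

```javascript
// Made-up multiline input: three lines, two of which start with s or S.
const lines = "Swim\nrun\nski";

// m makes ^ match at the start of every line; i ignores case.
// g is a JavaScript requirement for matchAll, not a $regexFindAll option.
const lineStarts = Array.from(lines.matchAll(/^s/gim), (m) => ({
  match: m[0],
  idx: m.index, // equals the code point index here because the input is ASCII
}));

console.log(JSON.stringify(lineStarts));
// [{"match":"S","idx":0},{"match":"s","idx":9}]
```

Without the m flag only the "S" at index 0 would match, because ^ would anchor to the start of the whole string.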
x Option
Note
You cannot specify options in both the regex and the options field.
To ignore all unescaped white space characters and comments (text from an unescaped hash # character through the next newline character) in the pattern, include the x option in the options field:
The following example includes the x option to skip unescaped white spaces and comments:
The operation returns the following:
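JavaScript regex literals have no x flag, so a sketch of this option passes the pattern as a string and puts x in the options field. The field path "$description" and the commented pattern are illustrative assumptions.

```javascript
// Placeholders: "$description" and the pattern text are illustrative only.
const extendedExpression = {
  $regexFindAll: {
    input: "$description",
    // With options "x", the spaces and the trailing "# ..." comment are ignored,
    // so the effective pattern is lin(e|k).
    regex: "lin(e|k)  # matches line or link",
    options: "x",
  },
};

console.log(extendedExpression.$regexFindAll.options);
```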
s Option
Note
You cannot specify options in both the regex and the options field.
To allow the dot character (i.e. .) in the pattern to match all characters including the new line character, include the s option in the options field:
The following example includes the s option to allow the dot character (i.e. .) to match all characters including new line, as well as the i option to perform a case-insensitive match:
The operation returns the following:
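The effect of s can be seen locally with the same flag in the JavaScript regex engine. The input string and the pattern /m.*line/ below are assumed sample values.

```javascript
// Assumed sample value: the text "line" appears only after a newline.
const description = "Multiple\nline descriptions";

// Without s, the dot stops at the newline, so the pattern cannot reach "line".
const withoutS = /m.*line/i.test(description);

// With s, the dot also matches the newline.
const withS = description.match(/m.*line/is);

console.log(withoutS, JSON.stringify(withS[0])); // false "Multiple\nline"
```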
$regexFindAll to Parse Email from String
Create a sample collection feedback with the following documents:
The following aggregation uses the $regexFindAll to extract all emails from the comment field (case insensitive).
The pipeline uses the $addFields stage to add a new field email to the document. The new field is an array that contains the result of performing the $regexFindAll on the comment field:
The pipeline uses the $set stage to reset the email array elements to the "email.match" value(s). If the current value of email is null, the new value of email is set to null.
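The two stages described above might look like the following sketch. The email pattern is a simplified assumption rather than a full address validator, the comment string is made up, and the last lines only approximate the result locally with the JavaScript regex engine.

```javascript
// Simplified email pattern (an assumption, not a complete validator).
const emailPattern = /[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/i;

// Stage 1 stores the array of match documents in "email";
// stage 2 replaces it with just the matched strings.
const emailPipeline = [
  { $addFields: { email: { $regexFindAll: { input: "$comment", regex: emailPattern } } } },
  { $set: { email: "$email.match" } },
];
// In mongosh this would run as: db.feedback.aggregate(emailPipeline)

// Local approximation of the result for one made-up comment.
const comment = "Reach me at Ann.Lee@example.com or ann@example.org";
const emails = Array.from(
  comment.matchAll(new RegExp(emailPattern.source, "gi")),
  (m) => m[0]
);
console.log(JSON.stringify(emails)); // ["Ann.Lee@example.com","ann@example.org"]
```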
Create a sample collection feedback with the following documents:
To reply to the feedback, assume you want to parse the local part of the email address to use as the name in the greetings. Using the captures field returned in the $regexFindAll results, you can parse out the local part of each email address:
The pipeline uses the $addFields stage to add a new field names to the document. The new field contains the result of performing the $regexFindAll on the comment field:
The pipeline uses the $set stage with the $reduce operator to reset names to an array that contains the "$names.captures" elements.
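A sketch of this pipeline follows. Each match contributes its own captures array, so "$names.captures" is an array of arrays, and $reduce with $concatArrays flattens it. The pattern is a simplified assumption with one capture group around the local part, and the comment string is made up; the last lines approximate the flattening locally.

```javascript
// Simplified pattern (an assumption) with one capture group around the local part.
const namePattern = /([a-z0-9_.+-]+)@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/i;

const namesPipeline = [
  { $addFields: { names: { $regexFindAll: { input: "$comment", regex: namePattern } } } },
  {
    $set: {
      names: {
        $reduce: {
          input: "$names.captures", // array of per-match captures arrays
          initialValue: [],
          in: { $concatArrays: ["$$value", "$$this"] }, // flatten into one array
        },
      },
    },
  },
];
// In mongosh this would run as: db.feedback.aggregate(namesPipeline)

// Local approximation of the same flattening for one made-up comment.
const feedbackComment = "Reach me at Ann.Lee@example.com or ann@example.org";
const capturesPerMatch = Array.from(
  feedbackComment.matchAll(new RegExp(namePattern.source, "gi")),
  (m) => m.slice(1)
);
const names = capturesPerMatch.reduce((value, current) => value.concat(current), []);
console.log(JSON.stringify(names)); // ["Ann.Lee","ann"]
```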
See also
For more information on the behavior of the captures array and additional examples, see captures Output Behavior.