Chapter 12 Functions and Operators

12.8.2 Regular Expressions正则表达式

Table 12.14 ~~Regular Expression Functions and Operators~~正则表达式函数和运算符

~~Name~~名称	~~Description~~描述
`NOT REGEXP`	~~Negation of REGEXP~~REGEXP的否定
`REGEXP`	~~Whether string matches regular expression~~字符串是否匹配正则表达式
`REGEXP_INSTR()`	~~Starting index of substring matching regular expression~~子串匹配正则表达式的起始索引
`REGEXP_LIKE()`	~~Whether string matches regular expression~~字符串是否匹配正则表达式
`REGEXP_REPLACE()`	~~Replace substrings matching regular expression~~替换与正则表达式匹配的子字符串
`REGEXP_SUBSTR()`	~~Return substring matching regular expression~~返回子字符串匹配正则表达式
`RLIKE`	~~Whether string matches regular expression~~字符串是否匹配正则表达式

~~A regular expression is a powerful way of specifying a pattern for a complex search.~~ 正则表达式是为复杂搜索指定模式的强大方法。This section discusses the functions and operators available for regular expression matching and illustrates, with examples, some of the special characters and constructs that can be used for regular expression operations. 本节讨论可用于正则表达式匹配的函数和运算符，并举例说明可用于正则表达式操作的一些特殊字符和构造。~~See also Section 3.3.4.7, “Pattern Matching”.~~另请参阅第3.3.4.7节，“模式匹配”。

~~MySQL implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe.~~ MySQL使用国际Unicode组件（ICU）实现正则表达式支持，ICU提供完全的Unicode支持，并且是多字节安全的。~~(Prior to MySQL 8.0.4, MySQL used Henry Spencer's implementation of regular expressions, which operates in byte-wise fashion and is not multibyte safe.~~ （在MySQL 8.0.4之前，MySQL使用了Henry Spencer的正则表达式实现，它以字节方式运行，并且不是多字节安全的。~~For information about ways in which applications that use regular expressions may be affected by the implementation change, see Regular Expression Compatibility Considerations.)~~有关使用正则表达式的应用程序受实现更改影响的方式的信息，请参阅正则表达式兼容性注意事项。）

Regular Expression Functions and Operators正则表达式函数和运算符

expr NOT REGEXP pat, expr NOT RLIKE pat

~~This is the same as NOT (expr REGEXP pat).~~这相当于NOT (expr REGEXP pat)。

expr REGEXP pat, expr RLIKE pat

~~Returns 1 if the string expr matches the regular expression specified by the pattern pat, 0 otherwise.~~ 如果字符串expr与模式pat指定的正则表达式匹配，则返回1，否则返回0。~~If expr or pat is NULL, the return value is NULL.~~如果expr或pat为NULL，则返回值为NULL。

~~REGEXP and RLIKE are synonyms for REGEXP_LIKE().~~REGEXP和RLIKE是REGEXP_LIKE()的同义词。

~~For additional information about how matching occurs, see the description for REGEXP_LIKE().~~有关如何进行匹配的其他信息，请参阅REGEXP_LIKE()的说明。

mysql> SELECT 'Michael!' REGEXP '.*';
+------------------------+
| 'Michael!' REGEXP '.*' |
+------------------------+
|                      1 |
+------------------------+
mysql> SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
+---------------------------------------+
| 'new*\n*line' REGEXP 'new\\*.\\*line' |
+---------------------------------------+
|                                     0 |
+---------------------------------------+
mysql> SELECT 'a' REGEXP '^[a-d]';
+---------------------+
| 'a' REGEXP '^[a-d]' |
+---------------------+
|                   1 |
+---------------------+
mysql> SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
+----------------+-----------------------+
| 'a' REGEXP 'A' | 'a' REGEXP BINARY 'A' |
+----------------+-----------------------+
|              1 |                     0 |
+----------------+-----------------------+

REGEXP_INSTR(expr, pat[, pos[, occurrence[, return_option[, match_type]]]])

~~Returns the starting index of the substring of the string expr that matches the regular expression specified by the pattern pat, 0 if there is no match.~~ 返回与pat模式指定的正则表达式匹配的字符串expr的子字符串的起始索引，如果不匹配，则返回0。~~If expr or pat is NULL, the return value is NULL.~~ 如果expr或pat为NULL，则返回值为NULL。~~Character indexes begin at 1.~~字符索引从1开始。

~~REGEXP_INSTR() takes these optional arguments:~~REGEXP_INSTR()接受以下可选参数：
- pos~~: The position in expr at which to start the search.~~ ：expr中开始搜索的位置。~~If omitted, the default is 1.~~如果省略，则默认值为1。
- occurrence~~: Which occurrence of a match to search for.~~ ：要搜索匹配项的哪次出现。~~If omitted, the default is 1.~~如果省略，则默认值为1。
- return_option~~: Which type of position to return.~~ ：要返回的位置类型。~~If this value is 0, REGEXP_INSTR() returns the position of the matched substring's first character.~~ 如果此值为0，则REGEXP_INSTR()返回匹配子字符串的第一个字符的位置。~~If this value is 1, REGEXP_INSTR() returns the position following the matched substring.~~ 如果此值为1，则REGEXP_INSTR()返回匹配子字符串后面的位置。~~If omitted, the default is 0.~~如果省略，则默认值为0。
- match_type~~: A string that specifies how to perform matching.~~ ：指定如何执行匹配的字符串。~~The meaning is as described for REGEXP_LIKE().~~其含义如REGEXP_LIKE()所述。
~~For additional information about how matching occurs, see the description for REGEXP_LIKE().~~有关如何进行匹配的其他信息，请参阅REGEXP_LIKE()的说明。
```
mysql> SELECT REGEXP_INSTR('dog cat dog', 'dog');
+------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog') |
+------------------------------------+
|                                  1 |
+------------------------------------+
mysql> SELECT REGEXP_INSTR('dog cat dog', 'dog', 2);
+---------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog', 2) |
+---------------------------------------+
|                                     9 |
+---------------------------------------+
mysql> SELECT REGEXP_INSTR('aa aaa aaaa', 'a{2}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{2}') |
+-------------------------------------+
|                                   1 |
+-------------------------------------+
mysql> SELECT REGEXP_INSTR('aa aaa aaaa', 'a{4}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{4}') |
+-------------------------------------+
|                                   8 |
+-------------------------------------+
```
REGEXP_LIKE(expr, pat[, match_type])

~~Returns 1 if the string expr matches the regular expression specified by the pattern pat, 0 otherwise.~~ 如果字符串expr与模式pat指定的正则表达式匹配，则返回1，否则返回0。~~If expr or pat is NULL, the return value is NULL.~~如果expr或pat为NULL，则返回值为NULL。

~~The pattern can be an extended regular expression, the syntax for which is discussed in Regular Expression Syntax.~~ 模式可以是一个扩展的正则表达式，其语法将在正则表达式语法中讨论。~~The pattern need not be a literal string.~~ 模式不必是文本字符串。~~For example, it can be specified as a string expression or table column.~~例如，可以将其指定为字符串表达式或表列。

~~The optional match_type argument is a string that may contain any or all the following characters specifying how to perform matching:~~可选的match类型参数是一个字符串，可以包含指定如何执行匹配的以下任何或所有字符：
- c~~: Case-sensitive matching.~~：区分大小写的匹配。
- i~~: Case-insensitive matching.~~：不区分大小写匹配。
- m~~: Multiple-line mode. Recognize line terminators within the string.~~ ：多行模式。识别字符串中的行终止符。~~The default behavior is to match line terminators only at the start and end of the string expression.~~默认行为是仅在字符串表达式的开始和结束处匹配行终止符。
- n~~: The . character matches line terminators.~~ ：.字符匹配行终止符。~~The default is for . matching to stop at the end of a line.~~.的默认值是匹配到在一行文字的尽头停下来。
- u~~: Unix-only line endings.~~ ：仅Unix行尾。~~Only the newline character is recognized as a line ending by the ., ^, and $ match operators.~~只有换行符被识别为以.、^和$匹配运算符结尾的行。
~~If characters specifying contradictory options are specified within match_type, the rightmost one takes precedence.~~如果在match_type中指定了指定矛盾选项的字符，则最右边的字符优先。

~~By default, regular expression operations use the character set and collation of the expr and pat arguments when deciding the type of a character and performing the comparison.~~ 默认情况下，在确定字符类型和执行比较时，正则表达式操作使用字符集以及expr和pat参数的排序规则。~~If the arguments have different character sets or collations, coercibility rules apply as described in Section 10.8.4, “Collation Coercibility in Expressions”.~~ 如果参数具有不同的字符集或排序规则，则强制规则适用于第10.8.4节，“表达式中的排序规则强制性”。~~Arguments may be specified with explicit collation indicators to change comparison behavior.~~可以使用显式排序指示符指定参数以更改比较行为。
```
mysql> SELECT REGEXP_LIKE('CamelCase', 'CAMELCASE');
+---------------------------------------+
| REGEXP_LIKE('CamelCase', 'CAMELCASE') |
+---------------------------------------+
|                                     1 |
+---------------------------------------+
mysql> SELECT REGEXP_LIKE('CamelCase', 'CAMELCASE' COLLATE utf8mb4_0900_as_cs);
+------------------------------------------------------------------+
| REGEXP_LIKE('CamelCase', 'CAMELCASE' COLLATE utf8mb4_0900_as_cs) |
+------------------------------------------------------------------+
|                                                                0 |
+------------------------------------------------------------------+
```
match_type may be specified with the c or i characters to override the default case sensitivity. match_type可以用c或i字符指定，以覆盖默认的大小写敏感度。~~Exception: If either argument is a binary string, the arguments are handled in case-sensitive fashion as binary strings, even if match_type contains the i character.~~例外：如果其中一个参数是二进制字符串，则参数将以区分大小写的方式作为二进制字符串处理，即使match_type包含i字符。

~~Note~~注意

~~MySQL uses C escape syntax in strings (for example, \n to represent the newline character).~~ MySQL在字符串中使用C转义语法（例如，\n表示换行符）。~~If you want your expr or pat argument to contain a literal \, you must double it.~~ 如果希望expr或pat参数包含文字\，则必须将其加倍。~~(Unless the NO_BACKSLASH_ESCAPES SQL mode is enabled, in which case no escape character is used.)~~（如果启用了NO_BACKSLASH_ESCAPESSQL模式，则不使用转义字符。）
```
mysql> SELECT REGEXP_LIKE('Michael!', '.*');
+-------------------------------+
| REGEXP_LIKE('Michael!', '.*') |
+-------------------------------+
|                             1 |
+-------------------------------+
mysql> SELECT REGEXP_LIKE('new*\n*line', 'new\\*.\\*line');
+----------------------------------------------+
| REGEXP_LIKE('new*\n*line', 'new\\*.\\*line') |
+----------------------------------------------+
|                                            0 |
+----------------------------------------------+
mysql> SELECT REGEXP_LIKE('a', '^[a-d]');
+----------------------------+
| REGEXP_LIKE('a', '^[a-d]') |
+----------------------------+
|                          1 |
+----------------------------+
mysql> SELECT REGEXP_LIKE('a', 'A'), REGEXP_LIKE('a', BINARY 'A');
+-----------------------+------------------------------+
| REGEXP_LIKE('a', 'A') | REGEXP_LIKE('a', BINARY 'A') |
+-----------------------+------------------------------+
|                     1 |                            0 |
+-----------------------+------------------------------+
```
```
mysql> SELECT REGEXP_LIKE('abc', 'ABC');
+---------------------------+
| REGEXP_LIKE('abc', 'ABC') |
+---------------------------+
|                         1 |
+---------------------------+
mysql> SELECT REGEXP_LIKE('abc', 'ABC', 'c');
+--------------------------------+
| REGEXP_LIKE('abc', 'ABC', 'c') |
+--------------------------------+
|                              0 |
+--------------------------------+
```
REGEXP_REPLACE(expr, pat, repl[, pos[, occurrence[, match_type]]])

~~Replaces occurrences in the string expr that match the regular expression specified by the pattern pat with the replacement string repl, and returns the resulting string.~~ 用替换字符串repl替换字符串expr中与pat模式指定的正则表达式匹配的匹配项，并返回结果字符串。~~If expr, pat, or repl is NULL, the return value is NULL.~~如果expr、pat或repl为NULL，则返回值为NULL。

~~REGEXP_REPLACE() takes these optional arguments:~~REGEXP_REPLACE()接受以下可选参数：
- pos~~: The position in expr at which to start the search.~~ ：expr中开始搜索的位置。~~If omitted, the default is 1.~~如果省略，则默认值为1。
- occurrence~~: Which occurrence of a match to replace.~~ ：要替换的匹配项的哪次出现。~~If omitted, the default is 0 (which means “replace all occurrences”).~~如果省略，则默认值为0（表示“替换所有引用”）。
- match_type~~: A string that specifies how to perform matching.~~ ：指定如何执行匹配的字符串。~~The meaning is as described for REGEXP_LIKE().~~其含义如REGEXP_LIKE()所述。
~~Prior to MySQL 8.0.17, the result returned by this function used the UTF-16 character set; in MySQL 8.0.17 and later, the character set and collation of the expression searched for matches is used.~~ 在MySQL8.0.17之前，此函数返回的结果使用UTF-16字符集；在MySQL8.0.17及更高版本中，使用搜索匹配项的表达式的字符集和排序规则。(Bug #94203, Bug #29308212)

~~For additional information about how matching occurs, see the description for REGEXP_LIKE().~~有关如何进行匹配的其他信息，请参阅REGEXP_LIKE()的说明。
```
mysql> SELECT REGEXP_REPLACE('a b c', 'b', 'X');
+-----------------------------------+
| REGEXP_REPLACE('a b c', 'b', 'X') |
+-----------------------------------+
| a X c                             |
+-----------------------------------+
mysql> SELECT REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3);
+----------------------------------------------------+
| REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3) |
+----------------------------------------------------+
| abc def X                                          |
+----------------------------------------------------+
```
REGEXP_SUBSTR(expr, pat[, pos[, occurrence[, match_type]]])

~~Returns the substring of the string expr that matches the regular expression specified by the pattern pat, NULL if there is no match.~~ 返回与pat模式指定的正则表达式匹配的字符串expr的子字符串，如果不匹配，则返回NULL。~~If expr or pat is NULL, the return value is NULL.~~如果expr或pat为NULL，则返回值为NULL。

~~REGEXP_SUBSTR() takes these optional arguments:~~REGEXP_SUBSTR()接受以下可选参数：
- pos~~: The position in expr at which to start the search.~~ ：expr中开始搜索的位置。~~If omitted, the default is 1.~~如果省略，则默认值为1。
- occurrence~~: Which occurrence of a match to search for.~~ ：要搜索的匹配项的哪次出现。~~If omitted, the default is 1.~~如果省略，则默认值为1。
- match_type~~: A string that specifies how to perform matching.~~ ：指定如何执行匹配的字符串。~~The meaning is as described for REGEXP_LIKE().~~其含义如REGEXP_LIKE()所述。
~~Prior to MySQL 8.0.17, the result returned by this function used the UTF-16 character set; in MySQL 8.0.17 and later, the character set and collation of the expression searched for matches is used.~~ 在MySQL8.0.17之前，此函数返回的结果使用UTF-16字符集；在MySQL8.0.17及更高版本中，使用搜索匹配项的表达式的字符集和排序规则。(Bug #94203, Bug #29308212)

~~For additional information about how matching occurs, see the description for REGEXP_LIKE().~~有关如何进行匹配的其他信息，请参阅REGEXP_LIKE()的说明。
```
mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
+----------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+') |
+----------------------------------------+
| abc                                    |
+----------------------------------------+
mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
+----------------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3) |
+----------------------------------------------+
| ghi                                          |
+----------------------------------------------+
```

Regular Expression Syntax正则表达式语法

~~A regular expression describes a set of strings.~~ 正则表达式描述一组字符串。~~The simplest regular expression is one that has no special characters in it.~~ 最简单的正则表达式是没有特殊字符的正则表达式。~~For example, the regular expression hello matches hello and nothing else.~~例如，正则表达式hello匹配hello，而不是其他。

~~Nontrivial regular expressions use certain special constructs so that they can match more than one string.~~ 非平凡正则表达式使用某些特殊构造，以便它们可以匹配多个字符串。~~For example, the regular expression hello|world contains the | alternation operator and matches either the hello or world.~~例如，正则表达式hello|world包含|交替运算符并匹配hello或world。

As a more complex example, the regular expression B[an]*s matches any of the strings Bananas, Baaaaas, Bs, and any other string starting with a B, ending with an s, and containing any number of a or n characters in between.作为一个更复杂的例子，正则表达式B[an]*s匹配任一字符串banana、baaaas、Bs，以及任何其他以B开头、以s结尾、中间包含任意数量的a或n字符的字符串。

~~The following list covers some of the basic special characters and constructs that can be used in regular expressions.~~ 下面的列表介绍了一些可以在正则表达式中使用的基本特殊字符和构造。~~For information about the full regular expression syntax supported by the ICU library used to implement regular expression support, visit the International Components for Unicode website.~~有关用于实现正则表达式支持的ICU库所支持的完整正则表达式语法的信息，请访问International Components For Unicode网站。

^

~~Match the beginning of a string.~~匹配字符串的开头。

mysql> SELECT REGEXP_LIKE('fo\nfo', '^fo$');                   -> 0
mysql> SELECT REGEXP_LIKE('fofo', '^fo');                      -> 1

$

~~Match the end of a string.~~匹配字符串的结尾。

mysql> SELECT REGEXP_LIKE('fo\no', '^fo\no$');                 -> 1
mysql> SELECT REGEXP_LIKE('fo\no', '^fo$');                    -> 0

.

Match any character (including carriage return and newline, although to match these in the middle of a string, the m (multiple line) match-control character or the (?m) within-pattern modifier must be given).匹配任何字符（包括回车符和换行符，但要在字符串中间匹配这些字符，必须指定m（多行）匹配控制字符或(?m)模式内修饰符）。
```
mysql> SELECT REGEXP_LIKE('fofo', '^f.*$');                    -> 1
mysql> SELECT REGEXP_LIKE('fo\r\nfo', '^f.*$');                -> 0
mysql> SELECT REGEXP_LIKE('fo\r\nfo', '^f.*$', 'm');           -> 1
mysql> SELECT REGEXP_LIKE('fo\r\nfo', '(?m)^f.*$');           -> 1
```

a*

~~Match any sequence of zero or more a characters.~~匹配零个或多个a字符的任意序列。

mysql> SELECT REGEXP_LIKE('Ban', '^Ba*n');                     -> 1
mysql> SELECT REGEXP_LIKE('Baaan', '^Ba*n');                   -> 1
mysql> SELECT REGEXP_LIKE('Bn', '^Ba*n');                      -> 1

a+

~~Match any sequence of one or more a characters.~~匹配一个或多个a字符的任意序列。

mysql> SELECT REGEXP_LIKE('Ban', '^Ba+n');                     -> 1
mysql> SELECT REGEXP_LIKE('Bn', '^Ba+n');                      -> 0

a?

~~Match either zero or one a character.~~匹配零或一个a字符。

mysql> SELECT REGEXP_LIKE('Bn', '^Ba?n');                      -> 1
mysql> SELECT REGEXP_LIKE('Ban', '^Ba?n');                     -> 1
mysql> SELECT REGEXP_LIKE('Baan', '^Ba?n');                    -> 0

de|abc

~~Alternation; match either of the sequences de or abc.~~交替；匹配de或abc序列。

mysql> SELECT REGEXP_LIKE('pi', 'pi|apa');                     -> 1
mysql> SELECT REGEXP_LIKE('axe', 'pi|apa');                    -> 0
mysql> SELECT REGEXP_LIKE('apa', 'pi|apa');                    -> 1
mysql> SELECT REGEXP_LIKE('apa', '^(pi|apa)$');                -> 1
mysql> SELECT REGEXP_LIKE('pi', '^(pi|apa)$');                 -> 1
mysql> SELECT REGEXP_LIKE('pix', '^(pi|apa)$');                -> 0

(abc)*

~~Match zero or more instances of the sequence abc.~~匹配序列abc的零个或多个实例。

mysql> SELECT REGEXP_LIKE('pi', '^(pi)*$');                    -> 1
mysql> SELECT REGEXP_LIKE('pip', '^(pi)*$');                   -> 0
mysql> SELECT REGEXP_LIKE('pipi', '^(pi)*$');                  -> 1

{1}, {2,3}

~~Repetition; {n} and {m,n} notation provide a more general way of writing regular expressions that match many occurrences of the previous atom (or “piece”) of the pattern.~~ 重复，{n} 和{m,n}表示法提供了一种更通用的方法来编写正则表达式，这些正则表达式与模式的前一个原子（或“片段”）的许多匹配项相匹配。m and n are integers.m和n是整数。
- a*
  
  ~~Can be written as a{0,}.~~可以写为a{0,}。
- a+
  
  ~~Can be written as a{1,}.~~可以写为a{1,}。
- a?
  
  ~~Can be written as a{0,1}.~~可以写为a{0,1}。
~~To be more precise, a{n} matches exactly n instances of a.~~ 更准确地说，a{n}正好匹配a的n个实例。a{n,} ~~matches n or more instances of a.~~ 匹配a的n个或多个实例。a{m,n} ~~matches m through n instances of a, inclusive.~~ 匹配a的m到n个实例（含）。~~If both m and n are given, m must be less than or equal to n.~~如果m和n都给定，则m必须小于或等于n。
```
mysql> SELECT REGEXP_LIKE('abcde', 'a[bcd]{2}e');              -> 0
mysql> SELECT REGEXP_LIKE('abcde', 'a[bcd]{3}e');              -> 1
mysql> SELECT REGEXP_LIKE('abcde', 'a[bcd]{1,10}e');           -> 1
```
[a-dX], [^a-dX]

~~Matches any character that is (or is not, if ^ is used) either a, b, c, d or X.~~ 匹配a、b、c、d或X中的任一字符（如果用了^则是匹配不属于这些字符的字符）。~~A - character between two other characters forms a range that matches all characters from the first character to the second.~~ 其他两个字符之间的-字符形成一个范围，该范围匹配从第一个字符到第二个字符的所有字符。~~For example, [0-9] matches any decimal digit.~~ 例如，[0-9]匹配任何十进制数字。~~To include a literal ] character, it must immediately follow the opening bracket [.~~ 要包含文字]字符，它必须紧跟在左括号[后面。~~To include a literal - character, it must be written first or last.~~ 要包含文字-字符，必须写在最前面或最后面。~~Any character that does not have a defined special meaning inside a [] pair matches only itself.~~[]对中没有定义特殊含义的任何字符只匹配自身。
```
mysql> SELECT REGEXP_LIKE('aXbc', '[a-dXYZ]');                 -> 1
mysql> SELECT REGEXP_LIKE('aXbc', '^[a-dXYZ]$');               -> 0
mysql> SELECT REGEXP_LIKE('aXbc', '^[a-dXYZ]+$');              -> 1
mysql> SELECT REGEXP_LIKE('aXbc', '^[^a-dXYZ]+$');             -> 0
mysql> SELECT REGEXP_LIKE('gheis', '^[^a-dXYZ]+$');            -> 1
mysql> SELECT REGEXP_LIKE('gheisa', '^[^a-dXYZ]+$');           -> 0
```
[=character_class=]

~~Within a bracket expression (written using [ and ]), [=character_class=] represents an equivalence class.~~ 在括号表达式（使用[和]编写）中，[=character_class=]表示等价类。~~It matches all characters with the same collation value, including itself.~~ 它匹配具有相同排序规则值的所有字符，包括其自身。~~For example, if o and (+) are the members of an equivalence class, [[=o=]], [[=(+)=]], and [o(+)] are all synonymous.~~ 例如，如果o和(+)是等价类的成员，[[=o=]]、[[=(+)=]]和[o(+)]都是同义词。~~An equivalence class may not be used as an endpoint of a range.~~等价类不能用作范围的端点。

[:character_class:]

~~Within a bracket expression (written using [ and ]), [:character_class:] represents a character class that matches all characters belonging to that class.~~ 在方括号表达式（使用[和]编写）中，[:character_class:]表示与属于该类的所有字符匹配的字符类。~~The following table lists the standard class names.~~ 下表列出了标准类名。~~These names stand for the character classes defined in the ctype(3) manual page.~~ 这些名称代表ctype(3)手册页中定义的字符类。~~A particular locale may provide other class names.~~ 特定的区域设置可以提供其他类名。~~A character class may not be used as an endpoint of a range.~~字符类不能用作范围的端点。

~~Character Class Name~~字符类名	~~Meaning~~含义
`alnum`	~~Alphanumeric characters~~字母数字字符
`alpha`	~~Alphabetic characters~~字母字符
`blank`	~~Whitespace characters~~空白字符
`cntrl`	~~Control characters~~控制字符
`digit`	~~Digit characters~~数字字符
`graph`	~~Graphic characters~~图形字符
`lower`	~~Lowercase alphabetic characters~~小写字母字符
`print`	~~Graphic or space characters~~图形或空格字符
`punct`	~~Punctuation characters~~标点符号
`space`	~~Space, tab, newline, and carriage return~~空格、制表符、换行符和回车符
`upper`	~~Uppercase alphabetic characters~~大写字母字符
`xdigit`	~~Hexadecimal digit characters~~十六进制数字字符

mysql> SELECT REGEXP_LIKE('justalnums', '[[:alnum:]]+');       -> 1
mysql> SELECT REGEXP_LIKE('!!', '[[:alnum:]]+');               -> 0

~~To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters.~~ 要在正则表达式中使用特殊字符的文本实例，请在其前面加两个反斜杠（\）字符。~~The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other.~~ MySQL解析器解释其中一个反斜杠，正则表达式库解释另一个反斜杠。~~For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:~~例如，要匹配包含特殊+字符的字符串1+2，只有以下正则表达式中的最后一个才是正确的：

mysql> SELECT REGEXP_LIKE('1+2', '1+2');                       -> 0
mysql> SELECT REGEXP_LIKE('1+2', '1\+2');                      -> 0
mysql> SELECT REGEXP_LIKE('1+2', '1\\+2');                     -> 1

Regular Expression Resource Control正则表达式资源控制

~~REGEXP_LIKE() and similar functions use resources that can be controlled by setting system variables:~~REGEXP_LIKE()和类似的函数使用的资源可以通过设置系统变量来控制：

~~The match engine uses memory for its internal stack.~~ 匹配引擎使用内存作为其内部堆栈。~~To control the maximum available memory for the stack in bytes, set the regexp_stack_limit system variable.~~要控制堆栈的最大可用内存（字节），请设置regexp_stack_limit系统变量。
~~The match engine operates in steps.~~ 匹配引擎分步运行。~~To control the maximum number of steps performed by the engine (and thus indirectly the execution time), set the regexp_time_limit system variable.~~ 要控制引擎执行的最大步数（从而间接控制执行时间），请设置regexp_time_limit系统变量。~~Because this limit is expressed as number of steps, it affects execution time only indirectly.~~ 因为这个限制表示为步数，所以它只间接影响执行时间。~~Typically, it is on the order of milliseconds.~~通常，它是以毫秒为单位的。

Regular Expression Compatibility Considerations正则表达式兼容性注意事项

~~Prior to MySQL 8.0.4, MySQL used the Henry Spencer regular expression library to support regular expression operations, rather than International Components for Unicode (ICU).~~ 在MySQL 8.0.4之前，MySQL使用Henry Spencer正则表达式库来支持正则表达式操作，而不是Unicode(ICU)的国际组件。~~The following discussion describes differences between the Spencer and ICU libraries that may affect applications:~~以下讨论描述了Spencer和ICU库之间可能影响应用程序的差异：

~~With the Spencer library, the REGEXP and RLIKE operators work in byte-wise fashion, so they are not multibyte safe and may produce unexpected results with multibyte character sets.~~ 在Spencer库中，REGEXP和RLIKE操作符以字节方式工作，因此它们不是多字节安全的，并且可能对多字节字符集产生意外的结果。~~In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.~~此外，这些运算符按字节值比较字符，即使给定的排序规则将重音字符视为相等，重音字符也可能不相等。

~~ICU has full Unicode support and is multibyte safe.~~ ICU完全支持Unicode，并且是多字节安全的。~~Its regular expression functions treat all strings as UTF-16.~~ 它的正则表达式函数将所有字符串视为UTF-16。~~You should keep in mind that positional indexes are based on 16-bit chunks and not on code points.~~ 您应该记住，位置索引基于16位块，而不是代码点。~~This means that, when passed to such functions, characters using more than one chunk may produce unanticipated results, such as those shown here:~~这意味着，当传递给此类函数时，使用多个块的字符可能会产生意想不到的结果，如以下所示：
```
mysql> SELECT REGEXP_INSTR('🍣🍣b', 'b');
+--------------------------+
| REGEXP_INSTR('??b', 'b') |
+--------------------------+
|                        5 |
+--------------------------+
1 row in set (0.00 sec)

mysql> SELECT REGEXP_INSTR('🍣🍣bxxx', 'b', 4);
+--------------------------------+
| REGEXP_INSTR('??bxxx', 'b', 4) |
+--------------------------------+
|                              5 |
+--------------------------------+
1 row in set (0.00 sec)
```
~~Characters within the Unicode Basic Multilingual Plane, which includes characters used by most modern languages, are safe in this regard:~~Unicode基本多语言平面中的字符（包括大多数现代语言使用的字符）在这方面是安全的：
```
mysql> SELECT REGEXP_INSTR('бжb', 'b');
+----------------------------+
| REGEXP_INSTR('бжb', 'b')   |
+----------------------------+
|                          3 |
+----------------------------+
1 row in set (0.00 sec)

mysql> SELECT REGEXP_INSTR('עבb', 'b');
+----------------------------+
| REGEXP_INSTR('עבb', 'b')   |
+----------------------------+
|                          3 |
+----------------------------+
1 row in set (0.00 sec)

mysql> SELECT REGEXP_INSTR('µå周çб', '周');
+------------------------------------+
| REGEXP_INSTR('µå周çб', '周')       |
+------------------------------------+
|                                  3 |
+------------------------------------+
1 row in set (0.00 sec)
```
~~Emoji, such as the “sushi” character 🍣 (U+1F363) used in the first two examples, are not included in the Basic Multilingual Plane, but rather in Unicode's Supplementary Multilingual Plane.~~ Emoji，例如前两个示例中使用的“sushi”字符🍣（U+1F363)，不包含在基本的多语言平面中，而是包含在Unicode的补充多语言平面中。~~Another issue can arise with emoji and other 4-byte characters when REGEXP_SUBSTR() or a similar function begins searching in the middle of a character.~~ 当REGEXP_SUBSTR()或类似函数开始在字符中间搜索时，emoji和其他4字节字符可能会出现另一个问题。~~Each of the two statements in the following example starts from the second 2-byte position in the first argument.~~ 以下示例中的两个语句都从第一个参数中的第二个2字节位置开始。~~The first statement works on a string consisting solely of 2-byte (BMP) characters.~~ 第一个语句处理仅由2字节（BMP)字符组成的字符串。~~The second statement contains 4-byte characters which are incorrectly interpreted in the result because the first two bytes are stripped off and so the remainder of the character data is misaligned.~~第二条语句包含的4字节字符在结果中被错误地解释，因为前两个字节被剥离，因此字符数据的其余部分没有对齐。
```
mysql> SELECT REGEXP_SUBSTR('周周周周', '.*', 2);
+----------------------------------------+
| REGEXP_SUBSTR('周周周周', '.*', 2)     |
+----------------------------------------+
| 周周周                                 |
+----------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT REGEXP_SUBSTR('🍣🍣🍣🍣', '.*', 2);
+--------------------------------+
| REGEXP_SUBSTR('????', '.*', 2) |
+--------------------------------+
| ?㳟揘㳟揘㳟揘                  |
+--------------------------------+
1 row in set (0.00 sec)
```
~~For the . operator, the Spencer library matches line-terminator characters (carriage return, newline) anywhere in string expressions, including in the middle.~~ 对于.运算符，斯宾塞库匹配字符串终止符中任何地方的行终止符（回车符，换行符），包括中间。~~To match line terminator characters in the middle of strings with ICU, specify the m match-control character.~~要将字符串中间的行结束符字符与ICU匹配，请指定m匹配控制字符。
~~The Spencer library supports word-beginning and word-end boundary markers ([[:<:]] and [[:>:]] notation).~~ Spencer库支持单词开头和单词结尾边界标记（[[:<:]]和[[:>:]]表示法）。~~ICU does not.~~ ICU没有。~~For ICU, you can use \b to match word boundaries; double the backslash because MySQL interprets it as the escape character within strings.~~对于ICU，可以使用\b来匹配单词边界；加倍反斜杠，因为MySQL将其解释为字符串中的转义字符。
~~The Spencer library supports collating element bracket expressions ([.characters.] notation).~~ Spencer库支持排序元素括号表达式（[.characters.]表示法）。~~ICU does not.~~ICU不是这样的。
~~For repetition counts ({n} and {m,n} notation), the Spencer library has a maximum of 255.~~ 对于重复计数（{n}和{m,n}表示法），斯宾塞库的最大值为255。~~ICU has no such limit, although the maximum number of match engine steps can be limited by setting the regexp_time_limit system variable.~~ICU没有这样的限制，尽管可以通过设置regexp_time_limit系统变量来限制匹配引擎的最大步数。

~~ICU interprets parentheses as metacharacters.~~ ICU将圆括号解释为元字符。~~To specify a literal open or close parenthesis ( in a regular expression, it must be escaped:~~若要在正则表达式中指定文字开括号或闭括号(，必须对其进行转义：

mysql> SELECT REGEXP_LIKE('(', '(');
ERROR 3692 (HY000): Mismatched parenthesis in regular expression.
mysql> SELECT REGEXP_LIKE('(', '\\(');
+-------------------------+
| REGEXP_LIKE('(', '\\(') |
+-------------------------+
|                       1 |
+-------------------------+
mysql> SELECT REGEXP_LIKE(')', ')');
ERROR 3692 (HY000): Mismatched parenthesis in regular expression.
mysql> SELECT REGEXP_LIKE(')', '\\)');
+-------------------------+
| REGEXP_LIKE(')', '\\)') |
+-------------------------+
|                       1 |
+-------------------------+

~~ICU also interprets square brackets as metacharacters, but only the opening square bracket need be escaped to be used as a literal character:~~ICU还将方括号解释为元字符，但只有开头的方括号需要转义才能用作文字字符：

mysql> SELECT REGEXP_LIKE('[', '[');
ERROR 3696 (HY000): The regular expression contains an
unclosed bracket expression.
mysql> SELECT REGEXP_LIKE('[', '\\[');
+-------------------------+
| REGEXP_LIKE('[', '\\[') |
+-------------------------+
|                       1 |
+-------------------------+
mysql> SELECT REGEXP_LIKE(']', ']');
+-----------------------+
| REGEXP_LIKE(']', ']') |
+-----------------------+
|                     1 |
+-----------------------+