The Java Tutorials have been written for JDK 8. Examples and practices described in this page don't take advantage of improvements introduced in later releases and might use technology no longer available. See Java Language Changes for a summary of updated language features in Java SE 9 and subsequent releases.
See JDK Release Notes for information about new features, enhancements, and removed or deprecated options for all JDK releases.
In the previous section, we saw how quantifiers attach to one character, character class, or capturing group at a time. But until now, we have not discussed the notion of capturing groups in any detail.
Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o", and "g". The portion of the input string that matches the capturing group will be saved in memory for later recall via backreferences (as discussed below in the section, Backreferences).
As described in the Pattern API, capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups:
((A)(B(C)))
(A)
(B(C))
(C)
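The numbering above can be checked with a minimal sketch. The class name GroupNumberingDemo, the helper captured, and the input "ABC" are chosen here for illustration and are not part of the original lesson.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumberingDemo {
    // Returns the text captured by the given group number when
    // ((A)(B(C))) is matched against the illustrative input "ABC".
    static String captured(int group) {
        Matcher m = Pattern.compile("((A)(B(C)))").matcher("ABC");
        return m.matches() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        // Groups are numbered by their opening parenthesis, left to right.
        for (int i = 1; i <= 4; i++) {
            System.out.println("group " + i + ": " + captured(i));
        }
        // Prints:
        // group 1: ABC
        // group 2: A
        // group 3: BC
        // group 4: C
    }
}
```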
To find out how many groups are present in the expression, call the groupCount method on a matcher object. The groupCount method returns an int showing the number of capturing groups present in the matcher's pattern. In this example, groupCount would return the number 4, showing that the pattern contains 4 capturing groups.
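As a short sketch of this call, the following helper (the names GroupCountDemo and countGroups are hypothetical) builds a matcher only to ask for its group count. Because the count depends on the pattern rather than on the input, an empty input string is enough.

```java
import java.util.regex.Pattern;

public class GroupCountDemo {
    // Counts the capturing groups in a regex. The count is a property of
    // the pattern, so no match attempt is needed before calling groupCount.
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(countGroups("((A)(B(C)))")); // prints 4
        System.out.println(countGroups("(dog)"));       // prints 1
    }
}
```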
There is also a special group, group 0, which always represents the entire expression. This group is not included in the total reported by groupCount. Groups beginning with (? are pure, non-capturing groups that do not capture text and do not count towards the group total; the exception is the named capturing group, written (?<name>...), which does capture. (You'll see examples of non-capturing groups later in the section Methods of the Pattern Class.)
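To illustrate group 0 and non-capturing groups together, here is a small sketch. The pattern (?:dog)(cat), the input "dogcat", and the class and method names are assumptions made for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingDemo {
    // Builds a matcher for the illustrative pattern and input and runs the match.
    static Matcher matched() {
        Matcher m = Pattern.compile("(?:dog)(cat)").matcher("dogcat");
        if (!m.matches()) {
            throw new IllegalStateException("pattern should match the sample input");
        }
        return m;
    }

    public static void main(String[] args) {
        Matcher m = matched();
        // (?:dog) groups without capturing, so only (cat) is counted.
        System.out.println(m.groupCount()); // prints 1
        // Group 0 is the whole match and is not part of groupCount.
        System.out.println(m.group(0));     // prints dogcat
        System.out.println(m.group(1));     // prints cat
    }
}
```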
It's important to understand how groups are numbered because some Matcher methods accept an int specifying a particular group number as a parameter:
public int start(int group)
public int end(int group)
public String group(int group)
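The three methods listed above can be exercised together. The sketch below assumes a hypothetical pattern (\d+)-(\d+) and input "range 10-25"; the class name GroupOffsetsDemo and the helper describe are likewise illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupOffsetsDemo {
    // Describes one group of the first match: its text plus the start index
    // (inclusive) and end index (exclusive) within the input.
    static String describe(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return "group " + group + " = \"" + m.group(group) + "\" at ["
                + m.start(group) + ", " + m.end(group) + ")";
    }

    public static void main(String[] args) {
        // In Java source, each regex backslash is doubled inside the string literal.
        for (int g = 0; g <= 2; g++) {
            System.out.println(describe("(\\d+)-(\\d+)", "range 10-25", g));
        }
        // Prints:
        // group 0 = "10-25" at [6, 11)
        // group 1 = "10" at [6, 8)
        // group 2 = "25" at [9, 11)
    }
}
```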
The section of the input string matching the capturing group(s) is saved in memory for later recall via backreference. A backreference is specified in the regular expression as a backslash (\) followed by a digit indicating the number of the group to be recalled. For example, the expression (\d\d) defines one capturing group matching two digits in a row, which can be recalled later in the expression via the backreference \1.
To match any 2 digits, followed by the exact same two digits, you would use (\d\d)\1 as the regular expression:
Enter your regex: (\d\d)\1
Enter input string to search: 1212
I found the text "1212" starting at index 0 and ending at index 4.
If you change the last two digits, the match will fail:
Enter your regex: (\d\d)\1
Enter input string to search: 1234
No match found.
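The two sessions above can be reproduced without the interactive prompt. The following is a minimal sketch, not the tutorial's own test harness: the class BackreferenceDemo and its search method are hypothetical names, and the output wording simply mirrors the transcript.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // Searches the input for the first match and reports it in the same
    // wording as the console sessions shown in the text.
    static String search(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            return "I found the text \"" + m.group() + "\" starting at index "
                    + m.start() + " and ending at index " + m.end() + ".";
        }
        return "No match found.";
    }

    public static void main(String[] args) {
        // The regex (\d\d)\1 is written with doubled backslashes in Java source.
        String regex = "(\\d\\d)\\1";
        System.out.println(search(regex, "1212"));
        // I found the text "1212" starting at index 0 and ending at index 4.
        System.out.println(search(regex, "1234"));
        // No match found.
    }
}
```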
For nested capturing groups, backreferencing works in exactly the same way: Specify a backslash followed by the number of the group to be recalled.
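As a sketch of the nested case, consider the illustrative pattern ((\d)\d)\2\1, which is not taken from the original lesson. Group 1 is the outer two-digit group and group 2 is its first digit, so the pattern requires two digits, then the first digit again, then both digits again. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class NestedBackreferenceDemo {
    // Group 1 = ((\d)\d), the outer pair; group 2 = (\d), the inner first digit.
    // \2 recalls the inner group and \1 recalls the outer group.
    private static final Pattern NESTED = Pattern.compile("((\\d)\\d)\\2\\1");

    // Returns true when the whole input matches the nested pattern.
    static boolean matchesNested(String input) {
        return NESTED.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "12" + "1" + "12": both backreferences are satisfied.
        System.out.println(matchesNested("12112")); // prints true
        // The third character is "2", but \2 requires "1".
        System.out.println(matchesNested("12212")); // prints false
    }
}
```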