The Java™ Tutorials

String Literals

The most basic form of pattern matching supported by this API is the match of a string literal. For example, if the regular expression is foo and the input string is foo, the match will succeed because the strings are identical. Try this out with the test harness:

Enter your regex: foo
Enter input string to search: foo
I found the text foo starting at index 0 and ending at index 3.

This match was a success. Note that while the input string is 3 characters long, the start index is 0 and the end index is 3. By convention, ranges are inclusive of the beginning index and exclusive of the end index, as shown in the following figure:

The string literal foo, with numbered cells and index values.

Each character in the string resides in its own cell, with the index positions pointing between the cells. The string "foo" starts at index 0 and ends at index 3, even though the characters themselves only occupy cells 0, 1, and 2.
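
The index convention described above can be checked outside the test harness. The following is a minimal sketch using java.util.regex; the class name LiteralMatchDemo and the helper describeFirstMatch are illustrative names, not part of the tutorial, and the message format simply mirrors the harness output shown earlier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralMatchDemo {

    // Hypothetical helper: reports the first match in the same style as the test harness.
    static String describeFirstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            return "No match found.";
        }
        // start() is the inclusive begin index; end() is the exclusive end index.
        return String.format("I found the text %s starting at index %d and ending at index %d.",
                matcher.group(), matcher.start(), matcher.end());
    }

    public static void main(String[] args) {
        String input = "foo";
        System.out.println(describeFirstMatch("foo", input));

        Matcher matcher = Pattern.compile("foo").matcher(input);
        if (matcher.find()) {
            // Because the end index is exclusive, end - start equals the match length,
            // and substring(start, end) recovers exactly the matched text.
            System.out.println(matcher.end() - matcher.start());                  // 3
            System.out.println(input.substring(matcher.start(), matcher.end()));  // foo
        }
    }
}
```

The half-open range is the same convention String.substring uses, which is why the two indices can be passed to it directly.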

With subsequent matches, you'll notice some overlap; the start index for the next match is the same as the end index of the previous match:

Enter your regex: foo
Enter input string to search: foofoofoo
I found the text foo starting at index 0 and ending at index 3.
I found the text foo starting at index 3 and ending at index 6.
I found the text foo starting at index 6 and ending at index 9.
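
The same sequence of matches can be reproduced with a loop over Matcher.find(). This is a sketch rather than the harness itself; RepeatedMatchDemo and matchRanges are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedMatchDemo {

    // Hypothetical helper: collects a "start-end" pair for every match in the input.
    static List<String> matchRanges(String regex, String input) {
        List<String> ranges = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        // Each call to find() resumes scanning where the previous match ended.
        while (matcher.find()) {
            ranges.add(matcher.start() + "-" + matcher.end());
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Each start index equals the previous end index because end is exclusive.
        System.out.println(matchRanges("foo", "foofoofoo")); // [0-3, 3-6, 6-9]
    }
}
```

The shared index values do not mean the matches share characters: the first match covers cells 0 to 2, the second covers cells 3 to 5, and so on.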

Metacharacters

This API also supports a number of special characters that affect the way a pattern is matched. Change the regular expression to cat. (with a trailing dot) and the input string to cats. The output will appear as follows:

Enter your regex: cat.
Enter input string to search: cats
I found the text cats starting at index 0 and ending at index 4.

The match still succeeds, even though the dot "." is not present in the input string. It succeeds because the dot is a metacharacter, a character with special meaning interpreted by the matcher. The metacharacter "." means "any character", which is why the match succeeds in this example.
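
A short sketch of the dot's behavior follows. It uses Pattern.matches, which requires the whole input to match (unlike the harness, which searches for matches anywhere in the input); the class and method names are illustrative only.

```java
import java.util.regex.Pattern;

class DotMetacharacterDemo {

    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // The dot stands in for the "s", so the whole input matches.
        System.out.println(matchesWhole("cat.", "cats"));  // true
        // Any other single character works as well.
        System.out.println(matchesWhole("cat.", "cat9"));  // true
        // The dot must consume exactly one character, so "cat" alone does not match.
        System.out.println(matchesWhole("cat.", "cat"));   // false
        // By default the dot does not match a line terminator (that requires Pattern.DOTALL).
        System.out.println(matchesWhole("cat.", "cat\n")); // false
    }
}
```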

The metacharacters supported by this API are: <([{\^-=$!|]})?*+.>


Note: In certain situations the special characters listed above will not be treated as metacharacters. You'll encounter this as you learn more about how regular expressions are constructed. You can, however, use this list to check whether a specific character will ever be considered a metacharacter. For example, the characters @ and # never carry a special meaning.

There are two ways to force a metacharacter to be treated as an ordinary character: precede the metacharacter with a backslash, or enclose it within \Q (which starts the quote) and \E (which ends it).

When using the \Q and \E technique, the \Q and \E can be placed at any location within the expression, provided that the \Q comes first.
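
Quoting with a backslash and quoting with \Q and \E can both be sketched in Java as follows. Note that in Java source code each regex backslash must itself be written as \\ inside a string literal. The class QuotingDemo and the helper found are hypothetical names; Pattern.quote is a standard library method that is not mentioned in this lesson but produces a quoted literal pattern for a given string.

```java
import java.util.regex.Pattern;

class QuotingDemo {

    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Unquoted: the dot is a metacharacter, so "cats" is matched.
        System.out.println(found("cat.", "cats"));        // true

        // Backslash: "cat\\." is the regex cat\. and requires a literal dot.
        System.out.println(found("cat\\.", "cats"));      // false
        System.out.println(found("cat\\.", "cat."));      // true

        // \Q...\E: everything between the two markers is taken literally.
        System.out.println(found("\\Qcat.\\E", "cats"));  // false
        System.out.println(found("\\Qcat.\\E", "cat."));  // true

        // Pattern.quote builds a literal pattern string for arbitrary text.
        System.out.println(found(Pattern.quote("cat."), "cat.")); // true
    }
}
```

The backslash form suits a single metacharacter, while \Q and \E are more convenient when a longer run of text should be taken literally.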

