The Java Tutorials have been written for JDK 8. Examples and practices described in this page don't take advantage of improvements introduced in later releases and might use technology no longer available. See Java Language Changes for a summary of updated language features in Java SE 9 and subsequent releases.
See JDK Release Notes for information about new features, enhancements, and removed or deprecated options for all JDK releases.
Until now, we've only used the test harness to create Pattern objects in their most basic form. This section explores advanced techniques such as creating patterns with flags and using embedded flag expressions. It also explores some additional useful methods that we haven't yet discussed.
The Pattern class defines an alternate compile method that accepts a set of flags affecting the way the pattern is matched. The flags parameter is a bit mask that may include any of the following public static fields:
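Pattern.CANON_EQ: Enables canonical equivalence. When this flag is specified, two characters match if, and only if, their full canonical decompositions match. The expression "a\u030A", for example, will match the string "\u00E5" when this flag is specified. There is no embedded flag expression for this flag.
Pattern.CASE_INSENSITIVE: Enables case-insensitive matching. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag. Case-insensitive matching can also be enabled via the embedded flag expression (?i).
Pattern.COMMENTS: Permits whitespace and comments in the pattern. In this mode, whitespace is ignored, and embedded comments starting with # are ignored until the end of a line. Comments mode can also be enabled via the embedded flag expression (?x).
Pattern.DOTALL: Enables dotall mode. In dotall mode, the expression . matches any character, including a line terminator. By default, this expression does not match line terminators. Dotall mode can also be enabled via the embedded flag expression (?s).
Pattern.LITERAL: Enables literal parsing of the pattern. When this flag is specified, the input string that specifies the pattern is treated as a sequence of literal characters, so metacharacters and escape sequences have no special meaning. The flags CASE_INSENSITIVE and UNICODE_CASE retain their impact on matching when used in conjunction with this flag. There is no embedded flag expression for this flag.
Pattern.MULTILINE: Enables multiline mode. In multiline mode, the expressions ^ and $ match just after or just before, respectively, a line terminator or the end of the input sequence. By default, these expressions only match at the beginning and the end of the entire input sequence. Multiline mode can also be enabled via the embedded flag expression (?m).
Pattern.UNICODE_CASE: Enables Unicode-aware case folding. When this flag is specified, case-insensitive matching, when enabled by the CASE_INSENSITIVE flag, is done in a manner consistent with the Unicode Standard. Unicode-aware case folding can also be enabled via the embedded flag expression (?u).
Pattern.UNIX_LINES: Enables UNIX lines mode. In this mode, only the '\n' line terminator is recognized in the behavior of ., ^, and $. UNIX lines mode can also be enabled via the embedded flag expression (?d).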
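To make a few of these flags concrete, the following is a minimal sketch that counts matches with and without a flag. The class name FlagEffectsDemo, the helper countMatches, and the sample inputs are illustrative assumptions, not part of the test harness.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample inputs are illustrative only.
class FlagEffectsDemo {

    // Counts how many times the regex is found in the input under the given flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // DOTALL: "." does not match a line terminator unless the flag is set.
        System.out.println(countMatches("a.b", 0, "a\nb"));              // 0
        System.out.println(countMatches("a.b", Pattern.DOTALL, "a\nb")); // 1

        // MULTILINE: "^" and "$" also match at line boundaries.
        System.out.println(countMatches("^\\w+$", 0, "one\ntwo"));                 // 0
        System.out.println(countMatches("^\\w+$", Pattern.MULTILINE, "one\ntwo")); // 2

        // COMMENTS: whitespace and "#" comments inside the pattern are ignored.
        System.out.println(countMatches("d o g  # the word dog", Pattern.COMMENTS, "dog")); // 1
    }
}
```

Passing 0 as the flags argument means that no flags are set, which behaves like the one-argument compile method.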
In the following steps we will modify the test harness, RegexTestHarness.java, to create a pattern with case-insensitive matching.
First, modify the code to invoke the alternate version of compile:
Pattern pattern = Pattern.compile(console.readLine("%nEnter your regex: "), Pattern.CASE_INSENSITIVE);
Then compile and run the test harness to get the following results:
Enter your regex: dog
Enter input string to search: DoGDOg
I found the text "DoG" starting at index 0 and ending at index 3.
I found the text "DOg" starting at index 3 and ending at index 6.
As you can see, the string literal "dog" matches both occurrences, regardless of case. To compile a pattern with multiple flags, separate the flags to be included using the bitwise OR operator "|". For clarity, the following code samples hardcode the regular expression instead of reading it from the Console:
pattern = Pattern.compile("[az]$", Pattern.MULTILINE | Pattern.UNIX_LINES);
You could also specify an int variable instead:
final int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
Pattern pattern = Pattern.compile("aa", flags);
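As a rough illustration of how combined flags behave, the sketch below inspects the bit mask through Pattern.flags() and shows that UNICODE_CASE changes the result for a non-ASCII letter. The class name CombinedFlagsDemo and its helper methods are assumptions made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class CombinedFlagsDemo {

    // Tests whether a given flag bit is present in the compiled pattern's flags.
    static boolean isSet(Pattern p, int flag) {
        return (p.flags() & flag) != 0;
    }

    // Compiles the regex with the given flags and tests it against the whole input.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        final int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        Pattern pattern = Pattern.compile("aa", flags);

        System.out.println(isSet(pattern, Pattern.CASE_INSENSITIVE)); // true
        System.out.println(isSet(pattern, Pattern.UNICODE_CASE));     // true
        System.out.println(isSet(pattern, Pattern.MULTILINE));        // false

        // "\u00E9" is a lowercase e with acute accent, "\u00C9" is its uppercase form.
        // CASE_INSENSITIVE alone only folds US-ASCII letters.
        System.out.println(matches("\u00E9", Pattern.CASE_INSENSITIVE, "\u00C9")); // false
        System.out.println(matches("\u00E9", flags, "\u00C9"));                    // true
    }
}
```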
It's also possible to enable various flags using embedded flag expressions. Embedded flag expressions are an alternative to the two-argument version of compile, and are specified in the regular expression itself. The following example uses the original test harness, RegexTestHarness.java, with the embedded flag expression (?i) to enable case-insensitive matching.
Enter your regex: (?i)foo
Enter input string to search: FOOfooFoOfoO
I found the text "FOO" starting at index 0 and ending at index 3.
I found the text "foo" starting at index 3 and ending at index 6.
I found the text "FoO" starting at index 6 and ending at index 9.
I found the text "foO" starting at index 9 and ending at index 12.
Once again, all matches succeed regardless of case.
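The same comparison can be made without the interactive harness. The sketch below, which assumes a hypothetical class EmbeddedFlagDemo with a findAll helper, runs the embedded (?i) form and the two-argument compile form against the same input and collects the matches from each.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class EmbeddedFlagDemo {

    // Returns every substring of the input that the regex finds, in order.
    static List<String> findAll(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "FOOfooFoOfoO";

        // Flag embedded in the regular expression itself.
        System.out.println(findAll("(?i)foo", 0, input));
        // prints [FOO, foo, FoO, foO]

        // Same flag passed to the two-argument compile method.
        System.out.println(findAll("foo", Pattern.CASE_INSENSITIVE, input));
        // prints [FOO, foo, FoO, foO]
    }
}
```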
The embedded flag expressions that correspond to Pattern's publicly accessible fields are presented in the following table:
Constant | Equivalent Embedded Flag Expression
Pattern.CANON_EQ | None
Pattern.CASE_INSENSITIVE | (?i)
Pattern.COMMENTS | (?x)
Pattern.MULTILINE | (?m)
Pattern.DOTALL | (?s)
Pattern.LITERAL | None
Pattern.UNICODE_CASE | (?u)
Pattern.UNIX_LINES | (?d)
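Because Pattern.LITERAL has no embedded equivalent, it can only be requested through the flags argument of compile. The following sketch, using a hypothetical class LiteralFlagDemo, shows the metacharacter . being treated as an ordinary character, and CASE_INSENSITIVE still taking effect alongside LITERAL.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class LiteralFlagDemo {

    // Compiles the regex with the given flags and tests it against the whole input.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Without LITERAL, "." matches any character.
        System.out.println(matches("a.b", 0, "axb"));               // true

        // With LITERAL, "." only matches an actual period.
        System.out.println(matches("a.b", Pattern.LITERAL, "axb")); // false
        System.out.println(matches("a.b", Pattern.LITERAL, "a.b")); // true

        // CASE_INSENSITIVE keeps its effect when combined with LITERAL.
        int flags = Pattern.LITERAL | Pattern.CASE_INSENSITIVE;
        System.out.println(matches("a.b", flags, "A.B"));           // true
    }
}
```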
matches(String,CharSequence) Method
The Pattern class defines a convenient matches method that lets you quickly check whether a given input string matches a pattern. The entire input must match; a pattern that merely occurs somewhere inside the input is not enough. As with all public static methods, you should invoke matches by its class name, such as Pattern.matches("\\d","1");. In this example, the method returns true, because the digit "1" matches the regular expression \d.
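A short sketch may help show that Pattern.matches tests the whole input rather than searching within it. The class name MatchesDemo and the containsMatch helper are assumptions for this example; the helper uses Matcher.find to show the contrasting "search" behavior.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class MatchesDemo {

    // Returns true if the regex is found anywhere inside the input.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\d", "1"));   // true
        System.out.println(Pattern.matches("\\d", "12"));  // false: "\d" covers only one digit
        System.out.println(Pattern.matches("\\d+", "12")); // true
        System.out.println(Pattern.matches("\\d", "a1b")); // false: the whole input must match
        System.out.println(containsMatch("\\d", "a1b"));   // true: a digit occurs in the input
    }
}
```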
split(String) Method
The split method is a great tool for gathering the text that lies on either side of the pattern that's been matched. As shown below in SplitDemo.java, the split method could extract the words "one two three four five" from the string "one:two:three:four:five":
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class SplitDemo {

    private static final String REGEX = ":";
    private static final String INPUT = "one:two:three:four:five";

    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);
        String[] items = p.split(INPUT);
        for (String s : items) {
            System.out.println(s);
        }
    }
}
OUTPUT:
one
two
three
four
five
For simplicity, we've matched a string literal, the colon (:), instead of a complex regular expression. Because split is a method of a compiled Pattern, you can use it to get the text that falls on either side of any regular expression. Here's the same example, SplitDemo2.java, modified to split on digits instead:
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class SplitDemo2 {

    private static final String REGEX = "\\d";
    private static final String INPUT = "one9two4three7four1five";

    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);
        String[] items = p.split(INPUT);
        for (String s : items) {
            System.out.println(s);
        }
    }
}
OUTPUT:
one
two
three
four
five
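As a further sketch, the delimiter can be a longer expression, and the two-argument form split(CharSequence, int) controls how many pieces are produced. The class name SplitSketch, the splitOn helper, and the sample inputs below are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; names and inputs are illustrative only.
class SplitSketch {

    // Splits the input around matches of the regex, honoring the given limit.
    static String[] splitOn(String regex, String input, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        // A comma surrounded by optional whitespace acts as the delimiter.
        System.out.println(Arrays.toString(splitOn("\\s*,\\s*", "one , two,three ,  four", 0)));
        // prints [one, two, three, four]

        // A positive limit caps the number of pieces; the remainder is left intact.
        System.out.println(Arrays.toString(splitOn("\\s*,\\s*", "one , two,three ,  four", 2)));
        // prints [one, two,three ,  four]

        // With a limit of zero, trailing empty strings are discarded.
        System.out.println(splitOn(":", "a:b::", 0).length);  // 2
        // With a negative limit, they are kept.
        System.out.println(splitOn(":", "a:b::", -1).length); // 4
    }
}
```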
You may find the following methods to be of some use as well:
public static String quote(String s): Returns a literal pattern String for the specified String. This method produces a String that can be used to create a Pattern that would match String s as if it were a literal pattern.
public String toString(): Returns the String representation of this pattern.
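The following sketch shows one way quote and toString might be used together. The class name QuoteDemo and the sample text "1+1=2" are assumptions chosen because the text contains the metacharacter +.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and inputs are illustrative only.
class QuoteDemo {

    // Tests whether the input is exactly the given text, treated literally.
    static boolean matchesLiterally(String text, String input) {
        return Pattern.matches(Pattern.quote(text), input);
    }

    public static void main(String[] args) {
        String text = "1+1=2";

        // Unquoted, "+" is a quantifier, so the regex does not match its own text.
        System.out.println(Pattern.matches(text, "1+1=2"));  // false
        System.out.println(Pattern.matches(text, "11=2"));   // true

        // Quoted, every character is matched literally.
        System.out.println(matchesLiterally(text, "1+1=2")); // true
        System.out.println(matchesLiterally(text, "11=2"));  // false

        // toString returns the regular expression the pattern was compiled from.
        System.out.println(Pattern.compile("a*b").toString()); // a*b
    }
}
```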
Pattern Method Equivalents in java.lang.String
Regular expression support also exists in java.lang.String through several methods that mimic the behavior of java.util.regex.Pattern. For convenience, key excerpts from their API are presented below.
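public boolean matches(String regex): Tells whether or not this string matches the given regular expression. An invocation of this method of the form str.matches(regex) yields exactly the same result as the expression Pattern.matches(regex, str).
public String[] split(String regex, int limit): Splits this string around matches of the given regular expression. An invocation of this method of the form str.split(regex, n) yields the same result as the expression Pattern.compile(regex).split(str, n).
public String[] split(String regex): Splits this string around matches of the given regular expression. This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero.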
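The equivalences stated above can be checked directly. This sketch assumes a hypothetical class StringEquivalentsDemo whose helpers compare each String method with its Pattern counterpart.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; names and inputs are illustrative only.
class StringEquivalentsDemo {

    // Compares str.matches(regex) with Pattern.matches(regex, str).
    static boolean sameMatches(String regex, String str) {
        return str.matches(regex) == Pattern.matches(regex, str);
    }

    // Compares str.split(regex, n) with Pattern.compile(regex).split(str, n).
    static boolean sameSplit(String regex, String str, int n) {
        return Arrays.equals(str.split(regex, n), Pattern.compile(regex).split(str, n));
    }

    public static void main(String[] args) {
        System.out.println("1".matches("\\d"));    // true
        System.out.println(sameMatches("\\d", "1")); // true

        System.out.println(Arrays.toString("one:two:three".split(":", 2)));
        // prints [one, two:three]
        System.out.println(sameSplit(":", "one:two:three", 2)); // true

        System.out.println(Arrays.toString("one:two:three".split(":")));
        // prints [one, two, three]
    }
}
```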
There is also a replace method that replaces one CharSequence with another:
public String replace(CharSequence target,CharSequence replacement)
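A brief sketch of replace follows. It treats both arguments as literal text rather than as regular expressions; for contrast, the example also calls String.replaceAll, a separate java.lang.String method that does interpret its first argument as a regular expression. The class name ReplaceDemo and its helper are assumptions for this example.

```java
// Hypothetical demo class; names and inputs are illustrative only.
class ReplaceDemo {

    // Replaces each literal occurrence of target with replacement.
    static String replaceLiteral(String s, CharSequence target, CharSequence replacement) {
        return s.replace(target, replacement);
    }

    public static void main(String[] args) {
        // Replacement proceeds from the beginning of the string to the end.
        System.out.println(replaceLiteral("aaa", "aa", "b"));  // ba

        // The target "." is literal text here, not the "any character" metacharacter.
        System.out.println(replaceLiteral("a.b.c", ".", "-")); // a-b-c

        // replaceAll treats "." as a regular expression, so every character is replaced.
        System.out.println("a.b.c".replaceAll(".", "-"));      // -----
    }
}
```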