The Java™ Tutorials

Predefined Character Classes

The Pattern API contains a number of useful predefined character classes, which offer convenient shorthands for commonly used regular expressions:

Construct   Description
.           Any character (may or may not match line terminators)
\d          A digit: [0-9]
\D          A non-digit: [^0-9]
\s          A whitespace character: [ \t\n\x0B\f\r]
\S          A non-whitespace character: [^\s]
\w          A word character: [a-zA-Z_0-9]
\W          A non-word character: [^\w]

In the table above, each construct in the left-hand column is shorthand for the character class in the right-hand column. For example, \d matches a single digit (0-9), and \w matches a word character (any lowercase letter, any uppercase letter, the underscore character, or any digit). Use the predefined classes whenever possible. They make your code easier to read and help avoid errors introduced by malformed character classes.
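As a rough illustration of the table, the following sketch checks that each shorthand agrees with its explicit character class on a handful of sample characters, and shows how the dot behaves around a line terminator. The class name PredefinedClassDemo and the helpers sameResult and dotMatches are made up for this example; only java.util.regex.Pattern and its DOTALL flag come from the standard library. The sketch assumes the default matching mode, without flags such as UNICODE_CHARACTER_CLASS.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the tutorial's own sources.
final class PredefinedClassDemo {

    // Returns true when both regexes agree on whether they match the whole input.
    static boolean sameResult(String shorthand, String explicit, String input) {
        return Pattern.matches(shorthand, input) == Pattern.matches(explicit, input);
    }

    // Returns true when a single dot matches the whole input under the given flags.
    static boolean dotMatches(String input, int flags) {
        return Pattern.compile(".", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"7", "a", "_", " ", "\t", "!"};
        for (String s : samples) {
            boolean agree = sameResult("\\d", "[0-9]", s)
                    && sameResult("\\s", "[ \\t\\n\\x0B\\f\\r]", s)
                    && sameResult("\\w", "[a-zA-Z_0-9]", s);
            System.out.println("shorthands agree with explicit classes: " + agree);
        }

        // By default the dot does not match a line terminator; DOTALL changes that.
        System.out.println(dotMatches("\n", 0));              // prints false
        System.out.println(dotMatches("\n", Pattern.DOTALL)); // prints true
    }
}
```

The last two lines are one way to read the "may or may not match line terminators" note in the table: the outcome depends on the flags the pattern is compiled with.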

Constructs beginning with a backslash are called escaped constructs. We previewed escaped constructs in the String Literals section, where we mentioned the use of backslash and \Q and \E for quotation. If you are using an escaped construct within a string literal, you must precede the backslash with another backslash for the string to compile. For example:

private final String REGEX = "\\d"; // a single digit

In this example \d is the regular expression; the extra backslash is required for the code to compile. The test harness reads the expressions directly from the Console, however, so the extra backslash is unnecessary.
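To make the double-backslash rule concrete, the sketch below shows that the Java literal "\\d" holds only two characters, a backslash followed by d, which is exactly what the regex engine receives. The class name EscapeDemo and the method isSingleDigit are invented for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class illustrating string-literal escaping.
final class EscapeDemo {

    // The source text "\\d" compiles to the two-character string \d.
    static final String REGEX = "\\d";

    // Returns true when the input consists of exactly one digit.
    static boolean isSingleDigit(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(REGEX.length());      // prints 2
        System.out.println(REGEX);               // prints \d
        System.out.println(isSingleDigit("5"));  // prints true
        System.out.println(isSingleDigit("a"));  // prints false
    }
}
```

Writing the literal with a single backslash, as "\d", would be rejected by the Java compiler as an illegal escape sequence, which is why the doubling is needed in source code but not in text typed at the console.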

The following examples demonstrate the use of predefined character classes.

Enter your regex: .
Enter input string to search: @
I found the text "@" starting at index 0 and ending at index 1.

Enter your regex: . 
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.

Enter your regex: .
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \d
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.

Enter your regex: \d
Enter input string to search: a
No match found.

Enter your regex: \D
Enter input string to search: 1
No match found.

Enter your regex: \D
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \s
Enter input string to search:  
I found the text " " starting at index 0 and ending at index 1.

Enter your regex: \s
Enter input string to search: a
No match found.

Enter your regex: \S
Enter input string to search:  
No match found.

Enter your regex: \S
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \w
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \w
Enter input string to search: !
No match found.

Enter your regex: \W
Enter input string to search: a
No match found.

Enter your regex: \W
Enter input string to search: !
I found the text "!" starting at index 0 and ending at index 1.

In the first three examples, the regular expression is simply . (the "dot" metacharacter), which indicates "any character." Therefore, the match is successful in all three cases (a randomly selected @ character, a digit, and a letter). The remaining examples each use a single regular expression construct from the Predefined Character Classes table. You can refer to this table to figure out the logic behind each match: \d matches digits, \s matches whitespace characters, and \w matches word characters.

A capital letter means the opposite: \D matches non-digits, \S matches non-whitespace characters, and \W matches non-word characters.
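The sessions above can also be reproduced without typing into an interactive console. The following is a minimal sketch, not the tutorial's actual test harness: the class name MatchReporter and its report method are hypothetical, and the message format simply imitates the output shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the interactive test harness.
final class MatchReporter {

    // Collects one message per match, or a single "No match found." message.
    static List<String> report(String regex, String input) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            lines.add(String.format(
                    "I found the text \"%s\" starting at index %d and ending at index %d.",
                    matcher.group(), matcher.start(), matcher.end()));
        }
        if (lines.isEmpty()) {
            lines.add("No match found.");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Backslashes are doubled because these regexes are Java string literals.
        String[][] cases = {
            {".", "@"}, {"\\d", "1"}, {"\\d", "a"}, {"\\D", "a"},
            {"\\s", " "}, {"\\S", " "}, {"\\w", "a"}, {"\\W", "!"}
        };
        for (String[] c : cases) {
            System.out.println("regex: " + c[0] + "  input: \"" + c[1] + "\"");
            report(c[0], c[1]).forEach(System.out::println);
        }
    }
}
```

Returning the messages as a list, rather than printing them directly, is a design choice made here so that the results are easy to check in code; the real harness may be structured differently.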

