The Java™ Tutorials

Scanning

Objects of type Scanner are useful for breaking down formatted input into tokens and translating individual tokens according to their data type.

Breaking Input into Tokens

By default, a scanner uses white space to separate tokens. (White space characters include blanks, tabs, and line terminators. For the full list, refer to the documentation for Character.isWhitespace.) To see how scanning works, let's look at ScanXan, a program that reads the individual words in xanadu.txt and prints them out, one per line.

import java.io.*;
import java.util.Scanner;

public class ScanXan {
    public static void main(String[] args) throws IOException {

        Scanner s = null;

        try {
            s = new Scanner(new BufferedReader(new FileReader("xanadu.txt")));

            while (s.hasNext()) {
                System.out.println(s.next());
            }
        } finally {
            if (s != null) {
                s.close();
            }
        }
    }
}

Notice that ScanXan invokes Scanner's close method when it is done with the scanner object. Even though a scanner is not a stream, you need to close it to indicate that you're done with its underlying stream.
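The effect of close on the underlying source can be observed directly. The following is a minimal sketch, not part of the original tutorial: the class name CloseDemo and the use of a StringReader in place of the file-backed reader are assumptions made so the example runs without xanadu.txt. It also uses try-with-resources, which calls close for you and is an alternative to the explicit finally block in ScanXan.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Scanner;

// Hypothetical demo class; the name is not from the tutorial.
class CloseDemo {

    // Returns true if closing the scanner also closed the reader it wraps.
    static boolean closesUnderlyingReader() {
        // StringReader stands in for the file-backed reader used by ScanXan.
        StringReader reader = new StringReader("In Xanadu did Kubla Khan");

        // try-with-resources calls s.close() when the block exits,
        // whether it exits normally or because of an exception.
        try (Scanner s = new Scanner(reader)) {
            while (s.hasNext()) {
                s.next();
            }
        }

        try {
            reader.read(); // a closed StringReader rejects further reads
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(closesUnderlyingReader());
    }
}
```

If the reader were still open after the scanner was closed, the read call would succeed and the method would return false.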

The output of ScanXan looks like this:

In
Xanadu
did
Kubla
Khan
A
stately
pleasure-dome
...

To use a different token separator, invoke useDelimiter(), specifying a regular expression. For example, suppose you wanted the token separator to be a comma, optionally followed by white space. You would invoke:

s.useDelimiter(",\\s*");
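To see that delimiter in a complete program, here is a small sketch. It scans a String rather than a file so that it is self-contained; the class name DelimiterDemo, the helper method tokens, and the sample input are illustrative assumptions, not part of the tutorial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; the name is not from the tutorial.
class DelimiterDemo {

    // Splits the input on a comma optionally followed by white space.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner s = new Scanner(input)) {
            s.useDelimiter(",\\s*");
            while (s.hasNext()) {
                result.add(s.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // White space alone no longer separates tokens,
        // so "Kubla Khan" is returned as a single token.
        for (String token : tokens("Kubla Khan,  Xanadu,pleasure-dome")) {
            System.out.println(token);
        }
    }
}
```

Because the delimiter replaces the default rather than adding to it, a blank inside a field is kept as part of the token.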

Translating Individual Tokens

The ScanXan example treats all input tokens as simple String values. Scanner also supports tokens for all of the Java language's primitive types (except for char), as well as BigInteger and BigDecimal. Also, numeric values can use thousands separators. Thus, in a US locale, Scanner correctly reads the string "32,767" as representing an integer value.
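As a brief illustration of typed tokens, the sketch below reads an int and a BigDecimal from strings that contain thousands separators. The class name TypedTokens and its helper methods are assumptions introduced for this example; the US locale is set explicitly so the result does not depend on the machine's default.

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical demo class; the name is not from the tutorial.
class TypedTokens {

    // Reads the first token as an int, using US number conventions.
    static int readInt(String input) {
        try (Scanner s = new Scanner(input)) {
            s.useLocale(Locale.US);
            return s.nextInt();
        }
    }

    // Reads the first token as a BigDecimal, using US number conventions.
    static BigDecimal readBigDecimal(String input) {
        try (Scanner s = new Scanner(input)) {
            s.useLocale(Locale.US);
            return s.nextBigDecimal();
        }
    }

    public static void main(String[] args) {
        System.out.println(readInt("32,767"));
        System.out.println(readBigDecimal("1,000,000.1"));
    }
}
```

The typed methods such as nextInt throw InputMismatchException when the next token does not match, which is why ScanSum checks hasNextDouble before calling nextDouble.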

We have to mention the locale, because thousands separators and decimal symbols are locale specific. So, the following example would not work correctly in all locales if we didn't specify that the scanner should use the US locale. That's not something you usually have to worry about, because your input data usually comes from sources that use the same locale as you do. But this example is part of the Java Tutorial and gets distributed all over the world.
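The locale dependence can be checked with a short sketch. The class name LocaleDemo, the helper isDouble, and the choice of Locale.GERMANY (where the comma is the decimal symbol and the period is the thousands separator) are assumptions chosen for illustration.

```java
import java.util.Locale;
import java.util.Scanner;

// Hypothetical demo class; the name is not from the tutorial.
class LocaleDemo {

    // Reports whether the first token parses as a double in the given locale.
    static boolean isDouble(String input, Locale locale) {
        try (Scanner s = new Scanner(input)) {
            s.useLocale(locale);
            return s.hasNextDouble();
        }
    }

    public static void main(String[] args) {
        // The same token is a number in one locale and not in the other.
        System.out.println(isDouble("3.14159", Locale.US));
        System.out.println(isDouble("3.14159", Locale.GERMANY));
        System.out.println(isDouble("3,14159", Locale.GERMANY));
    }
}
```

Under these assumptions the three calls report true, false, and true: in the German locale the period is read as a grouping separator, and "3.14159" is not a validly grouped number.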

The ScanSum example reads a list of double values and adds them up. Here's the source:

import java.io.FileReader;
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Scanner;
import java.util.Locale;

public class ScanSum {
    public static void main(String[] args) throws IOException {

        Scanner s = null;
        double sum = 0;

        try {
            s = new Scanner(new BufferedReader(new FileReader("usnumbers.txt")));
            s.useLocale(Locale.US);

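            while (s.hasNext()) {
                if (s.hasNextDouble()) {
                    // Next token parses as a double in the US locale: add it.
                    sum += s.nextDouble();
                } else {
                    // Not a number: consume the token so the loop advances.
                    s.next();
                }
            }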
        } finally {
            // Guard against a null scanner if opening the file failed.
            if (s != null) {
                s.close();
            }
        }

        System.out.println(sum);
    }
}

And here's the sample input file, usnumbers.txt:

8.5
32,767
3.14159
1,000,000.1

The output string is "1032778.74159". The period appears regardless of the default locale, because System.out.println(double) converts the value with Double.toString, which is not locale-sensitive. To produce output that follows a particular locale's conventions, we could use formatting, as described in the next topic, Formatting.
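As a preview of the formatting approach, the sketch below renders the same sum with an explicit locale by using String.format. The class name FormatSum, the helper render, the five-digit precision, and the choice of Locale.GERMANY are illustrative assumptions; the Formatting topic covers the format syntax itself.

```java
import java.util.Locale;

// Hypothetical demo class; the name is not from the tutorial.
class FormatSum {

    // Renders the value with five decimal places in the given locale.
    static String render(double value, Locale locale) {
        return String.format(locale, "%.5f", value);
    }

    public static void main(String[] args) {
        // The sum of the four values in usnumbers.txt, as reported above.
        double sum = 1032778.74159;

        System.out.println(render(sum, Locale.US));      // period as decimal symbol
        System.out.println(render(sum, Locale.GERMANY)); // comma as decimal symbol
    }
}
```

Passing the locale to each call keeps the choice local to that output, whereas changing the default locale would affect the whole program.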

