The Java Tutorials have been written for JDK 8.Java教程是为JDK 8编写的。Examples and practices described in this page don't take advantage of improvements introduced in later releases and might use technology no longer available.本页中描述的示例和实践没有利用后续版本中引入的改进,并且可能使用不再可用的技术。See Java Language Changes for a summary of updated language features in Java SE 9 and subsequent releases.有关Java SE 9及其后续版本中更新的语言特性的摘要,请参阅Java语言更改。
See JDK Release Notes for information about new features, enhancements, and removed or deprecated options for all JDK releases.有关所有JDK版本的新功能、增强功能以及已删除或不推荐的选项的信息,请参阅JDK发行说明。
The BreakIterator class is locale-sensitive, because text boundaries vary with language. For example, the syntax rules for line breaks are not the same for all languages. To determine which locales the BreakIterator class supports, invoke the getAvailableLocales method, as follows:
Locale[] locales = BreakIterator.getAvailableLocales();
You can analyze four kinds of boundaries with the BreakIterator class: character, word, sentence, and potential line break. When instantiating a BreakIterator, you invoke the appropriate factory method:
getCharacterInstancegetWordInstancegetSentenceInstancegetLineInstanceEach instance of BreakIterator can detect just one type of boundary. If you want to locate both character and word boundaries, for example, you create two separate instances.
A BreakIterator has an imaginary cursor that points to the current boundary in a string of text. You can move this cursor within the text with the previous and the next methods. For example, if you've created a BreakIterator with getWordInstance, the cursor moves to the next word boundary in the text every time you invoke the next method. The cursor-movement methods return an integer indicating the position of the boundary. This position is the index of the character in the text string that would follow the boundary. Like string indexes, the boundaries are zero-based. The first boundary is at 0, and the last boundary is the length of the string. The following figure shows the word boundaries detected by the next and previous methods in a line of text:

You should use the BreakIterator class only with natural-language text. To tokenize a programming language, use the StreamTokenizer class.
The sections that follow give examples for each type of boundary analysis. The coding examples are from the source code file named BreakIteratorDemo.java.