Chapter 2. Grammars

Table of Contents

2.1. Context-Free Grammars
2.2. The Lexical Grammar
2.3. The Syntactic Grammar
2.4. Grammar Notation

This chapter describes the context-free grammars used in this specification to define the lexical and syntactic structure of a program.

2.1. Context-Free Grammars

A context-free grammar consists of a number of productions. Each production has an abstract symbol called a nonterminal as its left-hand side, and a sequence of one or more nonterminal and terminal symbols as its right-hand side. For each grammar, the terminal symbols are drawn from a specified alphabet.

Starting from a sentence consisting of a single distinguished nonterminal, called the goal symbol, a given context-free grammar specifies a language, namely, the set of possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.
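The replacement process described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the toy grammar (nonterminals Sum and Digit), the class name DerivationSketch, and the length bound are all hypothetical and do not appear in this specification. The sketch always replaces the leftmost nonterminal and collects every sentence of terminals up to a given length.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DerivationSketch {
    // Hypothetical toy grammar: map keys are nonterminals, every other symbol is a terminal.
    //   Sum:   Digit  |  Digit + Sum
    //   Digit: 0  |  1
    static final Map<String, List<List<String>>> GRAMMAR = Map.of(
        "Sum", List.of(List.of("Digit"), List.of("Digit", "+", "Sum")),
        "Digit", List.of(List.of("0"), List.of("1")));

    /** Returns every sentence of at most maxSymbols terminals derivable from the goal symbol. */
    static Set<String> sentences(String goal, int maxSymbols) {
        Set<String> result = new TreeSet<>();
        Deque<List<String>> pending = new ArrayDeque<>();
        pending.add(List.of(goal));
        while (!pending.isEmpty()) {
            List<String> form = pending.remove();
            if (form.size() > maxSymbols) {
                continue; // no right-hand side is empty, so a form never gets shorter
            }
            // Find the leftmost nonterminal, if any.
            int i = 0;
            while (i < form.size() && !GRAMMAR.containsKey(form.get(i))) {
                i++;
            }
            if (i == form.size()) {
                result.add(String.join(" ", form)); // only terminals remain: a sentence
                continue;
            }
            // Replace that nonterminal with each of its right-hand sides in turn.
            for (List<String> rhs : GRAMMAR.get(form.get(i))) {
                List<String> next = new ArrayList<>(form.subList(0, i));
                next.addAll(rhs);
                next.addAll(form.subList(i + 1, form.size()));
                pending.add(next);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sentences("Sum", 3));
    }
}
```

With the bound set to 3, the toy grammar yields the two one-digit sentences and the four sentences of the form "digit + digit". The bound is needed only because the language of this grammar is infinite.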

2.2. The Lexical Grammar

A lexical grammar for the Java programming language is given in §3 (Lexical Structure). This grammar has as its terminal symbols the characters of the Unicode character set. It defines a set of productions, starting from the goal symbol Input (§3.5), that describe how sequences of Unicode characters (§3.1) are translated into a sequence of input elements (§3.5).

These input elements, with white space (§3.6) and comments (§3.7) discarded, form the terminal symbols for the syntactic grammar for the Java programming language and are called tokens (§3.5). These tokens are the identifiers (§3.8), keywords (§3.9), literals (§3.10), separators (§3.11), and operators (§3.12) of the Java programming language.
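As a rough illustration of the relationship between input elements and tokens, the following sketch filters a hand-written list of input elements down to its tokens. The class TokenFilterSketch, the Kind enum, and the InputElement type are hypothetical names introduced for this example; the sketch does not perform lexical analysis itself.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenFilterSketch {
    // Kinds of input element; the constant names are illustrative only.
    enum Kind { WHITE_SPACE, COMMENT, IDENTIFIER, KEYWORD, LITERAL, SEPARATOR, OPERATOR }

    static final class InputElement {
        final Kind kind;
        final String text;

        InputElement(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Keeps only the tokens: every input element that is neither white space nor a comment. */
    static List<String> tokenTexts(List<InputElement> elements) {
        List<String> tokens = new ArrayList<>();
        for (InputElement element : elements) {
            if (element.kind != Kind.WHITE_SPACE && element.kind != Kind.COMMENT) {
                tokens.add(element.text);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hand-written input elements for the source text "int x = 1; // one".
        List<InputElement> elements = List.of(
            new InputElement(Kind.KEYWORD, "int"),
            new InputElement(Kind.WHITE_SPACE, " "),
            new InputElement(Kind.IDENTIFIER, "x"),
            new InputElement(Kind.WHITE_SPACE, " "),
            new InputElement(Kind.OPERATOR, "="),
            new InputElement(Kind.WHITE_SPACE, " "),
            new InputElement(Kind.LITERAL, "1"),
            new InputElement(Kind.SEPARATOR, ";"),
            new InputElement(Kind.WHITE_SPACE, " "),
            new InputElement(Kind.COMMENT, "// one"));
        System.out.println(tokenTexts(elements));
    }
}
```

Only the five tokens int, x, =, 1, and ; survive the filter; these are what the syntactic grammar sees as terminal symbols.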

2.3. The Syntactic Grammar

The syntactic grammar for the Java programming language is given in Chapters 4, 6-10, 14, and 15. This grammar has as its terminal symbols the tokens defined by the lexical grammar. It defines a set of productions, starting from the goal symbol CompilationUnit (§7.3), that describe how sequences of tokens can form syntactically correct programs.

For convenience, the syntactic grammar is presented all together in Chapter 19.

2.4. Grammar Notation

Terminal symbols are shown in fixed width font in the productions of the lexical and syntactic grammars, and throughout this specification whenever the text is directly referring to such a terminal symbol. These are to appear in a program exactly as written.

Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined, followed by a colon. One or more alternative definitions for the nonterminal then follow on succeeding lines.

For example, the syntactic production:

IfThenStatement:
if ( Expression ) Statement

states that the nonterminal IfThenStatement represents the token if, followed by a left parenthesis token, followed by an Expression, followed by a right parenthesis token, followed by a Statement.

The syntax {x} on the right-hand side of a production denotes zero or more occurrences of x.

For example, the syntactic production:

ArgumentList:
Argument {, Argument}

states that an ArgumentList consists of an Argument, followed by zero or more occurrences of a comma and an Argument. The result is that an ArgumentList may contain any positive number of arguments.
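One common way to recognize a {x} repetition is a loop that runs once per occurrence. The sketch below applies that idea to the ArgumentList production. It makes a simplifying assumption that is not part of this specification: an Argument is taken to be any single token other than a comma. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ArgumentListSketch {
    /**
     * Parses tokens against the production  ArgumentList: Argument {, Argument}.
     * Assumption for this sketch: an Argument is any single token other than ",".
     */
    static List<String> parseArgumentList(List<String> tokens) {
        List<String> arguments = new ArrayList<>();
        int pos = 0;
        arguments.add(argument(tokens, pos++)); // the mandatory first Argument
        while (pos < tokens.size()) {           // {, Argument}: zero or more repetitions
            if (!tokens.get(pos).equals(",")) {
                throw new IllegalArgumentException("expected ',' at index " + pos);
            }
            pos++;
            arguments.add(argument(tokens, pos++));
        }
        return arguments;
    }

    // Returns the token at pos if it can serve as an Argument, otherwise rejects the input.
    private static String argument(List<String> tokens, int pos) {
        if (pos >= tokens.size() || tokens.get(pos).equals(",")) {
            throw new IllegalArgumentException("expected Argument at index " + pos);
        }
        return tokens.get(pos);
    }

    public static void main(String[] args) {
        System.out.println(parseArgumentList(List.of("a", ",", "b", ",", "c")));
    }
}
```

Because the first Argument sits outside the loop, an empty token list is rejected, which matches the statement that an ArgumentList contains a positive number of arguments.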

The syntax [x] on the right-hand side of a production denotes zero or one occurrences of x. That is, x is an optional symbol. The alternative which contains the optional symbol actually defines two alternatives: one that omits the optional symbol and one that includes it.

This means that:

BreakStatement:
break [Identifier] ;

is a convenient abbreviation for:

BreakStatement:
break ;
break Identifier ;

As another example, it means that:

BasicForStatement:
for ( [ForInit] ; [Expression] ; [ForUpdate] ) Statement

is a convenient abbreviation for:

BasicForStatement:
for ( ; [Expression] ; [ForUpdate] ) Statement
for ( ForInit ; [Expression] ; [ForUpdate] ) Statement

which in turn is an abbreviation for:

BasicForStatement:
for ( ; ; [ForUpdate] ) Statement
for ( ; Expression ; [ForUpdate] ) Statement
for ( ForInit ; ; [ForUpdate] ) Statement
for ( ForInit ; Expression ; [ForUpdate] ) Statement

which in turn is an abbreviation for:

BasicForStatement:
for ( ; ; ) Statement
for ( ; ; ForUpdate ) Statement
for ( ; Expression ; ) Statement
for ( ; Expression ; ForUpdate ) Statement
for ( ForInit ; ; ) Statement
for ( ForInit ; ; ForUpdate ) Statement
for ( ForInit ; Expression ; ) Statement
for ( ForInit ; Expression ; ForUpdate ) Statement

so the nonterminal BasicForStatement actually has eight alternative right-hand sides.
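The step-by-step expansion of optional symbols can be mechanized. The following sketch is an illustration only; the class OptionalExpansionSketch and the convention of writing each optional symbol as a single string such as "[ForInit]" are assumptions of this example. Each optional symbol doubles the number of alternatives, so a right-hand side with n optional symbols expands to 2^n alternatives.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionalExpansionSketch {
    /**
     * Expands every optional symbol "[X]" in a right-hand side into two alternatives,
     * one that omits X and one that includes it.
     * Assumption for this sketch: each optional symbol is a single string such as "[ForInit]".
     */
    static List<String> expand(List<String> rightHandSide) {
        List<List<String>> alternatives = new ArrayList<>();
        alternatives.add(new ArrayList<>());
        for (String symbol : rightHandSide) {
            boolean optional =
                symbol.length() > 2 && symbol.startsWith("[") && symbol.endsWith("]");
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : alternatives) {
                if (optional) {
                    extended.add(new ArrayList<>(prefix)); // alternative that omits the symbol
                    List<String> included = new ArrayList<>(prefix);
                    included.add(symbol.substring(1, symbol.length() - 1));
                    extended.add(included);                // alternative that includes it
                } else {
                    List<String> unchanged = new ArrayList<>(prefix);
                    unchanged.add(symbol);
                    extended.add(unchanged);
                }
            }
            alternatives = extended;
        }
        List<String> result = new ArrayList<>();
        for (List<String> alternative : alternatives) {
            result.add(String.join(" ", alternative));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> basicFor = List.of(
            "for", "(", "[ForInit]", ";", "[Expression]", ";", "[ForUpdate]", ")", "Statement");
        for (String alternative : expand(basicFor)) {
            System.out.println(alternative);
        }
    }
}
```

Run on the abbreviated BasicForStatement right-hand side, the sketch should print the same eight alternatives as the fully expanded production above, in the same order.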

A very long right-hand side may be continued on a second line by clearly indenting the second line.

For example, the syntactic grammar contains this production:

which defines one right-hand side for the nonterminal NormalClassDeclaration.

The phrase (one of) on the right-hand side of a production signifies that each of the symbols on the following line or lines is an alternative definition.

For example, the lexical grammar contains the production:

ZeroToThree:
(one of)
0 1 2 3

which is merely a convenient abbreviation for:

ZeroToThree:
0
1
2
3

When an alternative in a production appears to be a token, it represents the sequence of characters that would make up such a token.

Thus, the production:

BooleanLiteral:
(one of)
true false

is shorthand for:

BooleanLiteral:
t r u e
f a l s e

The right-hand side of a production may specify that certain expansions are not permitted by using the phrase "but not" and then indicating the expansions to be excluded.

For example:

Identifier:

Finally, a few nonterminals are defined by a narrative phrase in roman type where it would be impractical to list all the alternatives.

For example:

RawInputCharacter:
any Unicode character