The syntax for expressing joins permits nested joins. 表示联接的语法允许嵌套联接。The following discussion refers to the join syntax described in Section 13.2.10.2, “JOIN Clause”.以下讨论涉及第13.2.10.2节,“JOIN子句”中描述的连接语法。
The syntax of 与SQL标准相比,扩展了table_factor
is extended in comparison with the SQL Standard. table_factor
的语法。The latter accepts only 后者只接受table_reference
, not a list of them inside a pair of parentheses. table_reference
,而不接受一对括号内的table_reference
列表。This is a conservative extension if we consider each comma in a list of 这是一个保守的扩展,如果我们把table_reference
items as equivalent to an inner join. table_reference
列表中的每个逗号等同于内部连接。For example:例如:
SELECT * FROM t1 LEFT JOIN (t2, t3, t4) ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
Is equivalent to:相当于:
SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4) ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
In MySQL, 在MySQL中,CROSS JOIN
is syntactically equivalent to INNER JOIN
; they can replace each other. CROSS JOIN
在语法上等同于INNER JOIN
;它们可以相互替换。In standard SQL, they are not equivalent. 在标准SQL中,它们不是等价的。INNER JOIN
is used with an ON
clause; CROSS JOIN
is used otherwise.INNER JOIN
与ON
子句一起使用;否则将使用CROSS JOIN
。
In general, parentheses can be ignored in join expressions containing only inner join operations. 通常,在只包含内部联接操作的联接表达式中可以忽略括号。Consider this join expression:考虑这个连接表达式:
t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.b=t3.b OR t2.b IS NULL) ON t1.a=t2.a
After removing parentheses and grouping operations to the left, that join expression transforms into this expression:删除左边的括号和分组操作后,该联接表达式将转换为以下表达式:
(t1 LEFT JOIN t2 ON t1.a=t2.a) LEFT JOIN t3 ON t2.b=t3.b OR t2.b IS NULL
Yet, the two expressions are not equivalent. 然而,这两个表达式并不等价。To see this, suppose that the tables 要了解这一点,假设表t1
, t2
, and t3
have the following state:t1
、t2
和t3
具有以下状态:
Table 表t1
contains rows (1)
, (2)
t1
包含行(1)
和(2)
Table 表t2
contains row (1,101)
t2
包含行(1,101)
Table 表t3
contains row (101)
t3
包含行(101)
In this case, the first expression returns a result set including the rows 在这种情况下,第一个表达式返回一个结果集,其中包括行(1,1,101,101)
, (2,NULL,NULL,NULL)
, whereas the second expression returns the rows (1,1,101,101)
, (2,NULL,NULL,101)
:(1,1,101,101)
,(2,NULL,NULL,NULL)
,而第二个表达式返回行(1,1,101,101)
,(2,NULL,NULL,101)
:
mysql>SELECT *
FROM t1
LEFT JOIN
(t2 LEFT JOIN t3 ON t2.b=t3.b OR t2.b IS NULL)
ON t1.a=t2.a;
+------+------+------+------+ | a | a | b | b | +------+------+------+------+ | 1 | 1 | 101 | 101 | | 2 | NULL | NULL | NULL | +------+------+------+------+ mysql>SELECT *
FROM (t1 LEFT JOIN t2 ON t1.a=t2.a)
LEFT JOIN t3
ON t2.b=t3.b OR t2.b IS NULL;
+------+------+------+------+ | a | a | b | b | +------+------+------+------+ | 1 | 1 | 101 | 101 | | 2 | NULL | NULL | 101 | +------+------+------+------+
In the following example, an outer join operation is used together with an inner join operation:在以下示例中,外部联接操作与内部联接操作一起使用:
t1 LEFT JOIN (t2, t3) ON t1.a=t2.a
That expression cannot be transformed into the following expression:无法将该表达式转换为以下表达式:
t1 LEFT JOIN t2 ON t1.a=t2.a, t3
For the given table states, the two expressions return different sets of rows:对于给定的表状态,这两个表达式返回不同的行集:
mysql>SELECT *
FROM t1 LEFT JOIN (t2, t3) ON t1.a=t2.a;
+------+------+------+------+ | a | a | b | b | +------+------+------+------+ | 1 | 1 | 101 | 101 | | 2 | NULL | NULL | NULL | +------+------+------+------+ mysql>SELECT *
FROM t1 LEFT JOIN t2 ON t1.a=t2.a, t3;
+------+------+------+------+ | a | a | b | b | +------+------+------+------+ | 1 | 1 | 101 | 101 | | 2 | NULL | NULL | 101 | +------+------+------+------+
Therefore, if we omit parentheses in a join expression with outer join operators, we might change the result set for the original expression.因此,如果在具有外部联接运算符的联接表达式中省略括号,则可能会更改原始表达式的结果集。
More exactly, we cannot ignore parentheses in the right operand of the left outer join operation and in the left operand of a right join operation. 更确切地说,我们不能忽略左外部联接操作的右操作数和右联接操作的左操作数中的括号。In other words, we cannot ignore parentheses for the inner table expressions of outer join operations. 换句话说,我们不能忽略外部联接操作的内部表表达式的括号。Parentheses for the other operand (operand for the outer table) can be ignored.可以忽略其他操作数(外部表的操作数)的括号。
The following expression:下面的表达:
(t1,t2) LEFT JOIN t3 ON P(t2.b,t3.b)
Is equivalent to this expression for any tables 对于任何表t1,t2,t3
and any condition P
over attributes t2.b
and t3.b
:t1,t2,t3
和属性t2.b
和t3.b
上的任何条件P
,都等效于此表达式:
t1, t2 LEFT JOIN t3 ON P(t2.b,t3.b)
Whenever the order of execution of join operations in a join expression (只要联接表达式(joined_table
) is not from left to right, we talk about nested joins. joined_table
)中联接操作的执行顺序不是从左到右,我们就讨论嵌套联接。Consider the following queries:考虑下面的查询:
SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.b=t3.b) ON t1.a=t2.a WHERE t1.a > 1 SELECT * FROM t1 LEFT JOIN (t2, t3) ON t1.a=t2.a WHERE (t2.b=t3.b OR t2.b IS NULL) AND t1.a > 1
Those queries are considered to contain these nested joins:这些查询被视为包含以下嵌套联接:
t2 LEFT JOIN t3 ON t2.b=t3.b t2, t3
In the first query, the nested join is formed with a left join operation. 在第一个查询中,使用左连接操作形成嵌套连接。In the second query, it is formed with an inner join operation.在第二个查询中,它由一个内部联接操作组成。
In the first query, the parentheses can be omitted: The grammatical structure of the join expression dictates the same order of execution for join operations. 在第一个查询中,可以省略括号:连接表达式的语法结构规定了连接操作的相同执行顺序。For the second query, the parentheses cannot be omitted, although the join expression here can be interpreted unambiguously without them. 对于第二个查询,不能省略括号,尽管这里的连接表达式可以在没有括号的情况下进行明确的解释。In our extended syntax, the parentheses in 在我们的扩展语法中,第二个查询的(t2, t3)
of the second query are required, although theoretically the query could be parsed without them: We still would have unambiguous syntactical structure for the query because LEFT JOIN
and ON
play the role of the left and right delimiters for the expression (t2,t3)
.(t2, t3)
中的括号是必需的,尽管理论上可以在没有括号的情况下解析查询:我们仍然可以为查询提供明确的语法结构,因为LEFT JOIN
和ON
充当表达式(t2,t3)
的左右分隔符。
The preceding examples demonstrate these points:前面的例子说明了以下几点:
For join expressions involving only inner joins (and not outer joins), parentheses can be removed and joins evaluated left to right. 对于只涉及内部联接(而不涉及外部联接)的联接表达式,可以删除括号并从左到右计算联接。In fact, tables can be evaluated in any order.事实上,可以按任何顺序计算表。
The same is not true, in general, for outer joins or for outer joins mixed with inner joins. 通常,对于外部联接或与内部联接混合的外部联接,情况并非如此。Removal of parentheses may change the result.删除括号可能会改变结果。
Queries with nested outer joins are executed in the same pipeline manner as queries with inner joins. 具有嵌套外部联接的查询以与具有内部联接的查询相同的管道方式执行。More exactly, a variation of the nested-loop join algorithm is exploited. 更准确地说,利用了嵌套循环联接算法的一种变体。Recall the algorithm by which the nested-loop join executes a query (see Section 8.2.1.7, “Nested-Loop Join Algorithms”). 回想一下嵌套循环联接执行查询的算法(请参阅第8.2.1.7节,“嵌套循环联接算法”)。Suppose that a join query over 3 tables 假设3个表T1,T2,T3
has this form:T1,T2,T3
上的联接查询具有以下形式:
SELECT * FROM T1 INNER JOIN T2 ON P1(T1,T2) INNER JOIN T3 ON P2(T2,T3) WHERE P(T1,T2,T3)
Here, 这里,P1(T1,T2)
and P2(T3,T3)
are some join conditions (on expressions), whereas P(T1,T2,T3)
is a condition over columns of tables T1,T2,T3
.P1(T1,T2)
和P2(T3,T3)
是一些连接条件(在表达式上),而P(T1,T2,T3)
是表T1,T2,T3
列上的条件。
The nested-loop join algorithm would execute this query in the following manner:嵌套循环联接算法将按以下方式执行此查询:
FOR each row t1 in T1 { FOR each row t2 in T2 such that P1(t1,t2) { FOR each row t3 in T3 such that P2(t2,t3) { IF P(t1,t2,t3) { t:=t1||t2||t3; OUTPUT t; } } } }
The notation 记号法t1||t2||t3
indicates a row constructed by concatenating the columns of rows t1
, t2
, and t3
. t1||t2||t3
表示通过连接行t1
、t2
和t3
的列而构造的行。In some of the following examples, 在以下一些示例中,表名出现时的NULL
where a table name appears means a row in which NULL
is used for each column of that table. NULL
表示该表的每列都使用NULL
的行。For example, 例如,t1||t2||NULL
indicates a row constructed by concatenating the columns of rows t1
and t2
, and NULL
for each column of t3
. t1||t2||NULL
表示通过连接行t1
和t2
的列而构造的行,而NULL
表示t3
的每列。Such a row is said to be 这样的一行称为NULL
-complemented.NULL
补行。
Now consider a query with nested outer joins:现在考虑嵌套外部连接的查询:
SELECT * FROM T1 LEFT JOIN (T2 LEFT JOIN T3 ON P2(T2,T3)) ON P1(T1,T2) WHERE P(T1,T2,T3)
For this query, modify the nested-loop pattern to obtain:对于此查询,修改嵌套循环模式以获得:
FOR each row t1 in T1 { BOOL f1:=FALSE; FOR each row t2 in T2 such that P1(t1,t2) { BOOL f2:=FALSE; FOR each row t3 in T3 such that P2(t2,t3) { IF P(t1,t2,t3) { t:=t1||t2||t3; OUTPUT t; } f2=TRUE; f1=TRUE; } IF (!f2) { IF P(t1,t2,NULL) { t:=t1||t2||NULL; OUTPUT t; } f1=TRUE; } } IF (!f1) { IF P(t1,NULL,NULL) { t:=t1||NULL||NULL; OUTPUT t; } } }
In general, for any nested loop for the first inner table in an outer join operation, a flag is introduced that is turned off before the loop and is checked after the loop. 通常,对于外部联接操作中第一个内部表的任何嵌套循环,会引入一个标志,该标志在循环之前关闭,并在循环之后检查。The flag is turned on when for the current row from the outer table a match from the table representing the inner operand is found. 当外部表中的当前行与表示内部操作数的表中的匹配时,将启用该标志。If at the end of the loop cycle the flag is still off, no match has been found for the current row of the outer table. 如果在循环周期结束时该标志仍处于关闭状态,则未找到外部表的当前行的匹配项。In this case, the row is complemented by 在这种情况下,行由内部表的列的NULL
values for the columns of the inner tables. NULL
值补充。The result row is passed to the final check for the output or into the next nested loop, but only if the row satisfies the join condition of all embedded outer joins.结果行将传递给输出的最终检查或下一个嵌套循环,但前提是该行满足所有嵌入外部联接的联接条件。
In the example, the outer join table expressed by the following expression is embedded:在该示例中,嵌入了由以下表达式表示的外部联接表:
(T2 LEFT JOIN T3 ON P2(T2,T3))
For the query with inner joins, the optimizer could choose a different order of nested loops, such as this one:对于具有内部联接的查询,优化器可以选择不同顺序的嵌套循环,例如:
FOR each row t3 in T3 { FOR each row t2 in T2 such that P2(t2,t3) { FOR each row t1 in T1 such that P1(t1,t2) { IF P(t1,t2,t3) { t:=t1||t2||t3; OUTPUT t; } } } }
For queries with outer joins, the optimizer can choose only such an order where loops for outer tables precede loops for inner tables. 对于具有外部联接的查询,优化器只能选择外部表的循环优先于内部表的循环的顺序。Thus, for our query with outer joins, only one nesting order is possible. 因此,对于具有外部联接的查询,只有一个嵌套顺序是可能的。For the following query, the optimizer evaluates two different nestings. 对于下面的查询,优化器计算两个不同的嵌套。In both nestings, 在这两种嵌套中,T1
must be processed in the outer loop because it is used in an outer join. T1
必须在外部循环中处理,因为它用于外部联接。T2
and T3
are used in an inner join, so that join must be processed in the inner loop. T2
和T3
用于内部联接,因此联接必须在内部循环中处理。However, because the join is an inner join, 但是,由于联接是内部联接,因此T2
and T3
can be processed in either order.T2
和T3
可以按任意顺序处理。
SELECT * T1 LEFT JOIN (T2,T3) ON P1(T1,T2) AND P2(T1,T3) WHERE P(T1,T2,T3)
One nesting evaluates 一个嵌套计算T2
, then T3
:T2
,然后计算T3
:
FOR each row t1 in T1 { BOOL f1:=FALSE; FOR each row t2 in T2 such that P1(t1,t2) { FOR each row t3 in T3 such that P2(t1,t3) { IF P(t1,t2,t3) { t:=t1||t2||t3; OUTPUT t; } f1:=TRUE } } IF (!f1) { IF P(t1,NULL,NULL) { t:=t1||NULL||NULL; OUTPUT t; } } }
The other nesting evaluates 另一个嵌套计算T3
, then T2
:T3
,然后计算T2
:
FOR each row t1 in T1 { BOOL f1:=FALSE; FOR each row t3 in T3 such that P2(t1,t3) { FOR each row t2 in T2 such that P1(t1,t2) { IF P(t1,t2,t3) { t:=t1||t2||t3; OUTPUT t; } f1:=TRUE } } IF (!f1) { IF P(t1,NULL,NULL) { t:=t1||NULL||NULL; OUTPUT t; } } }
When discussing the nested-loop algorithm for inner joins, we omitted some details whose impact on the performance of query execution may be huge. We did not mention so-called “pushed-down” conditions. 在讨论内部联接的嵌套循环算法时,我们忽略了一些细节,这些细节可能会对查询执行的性能产生巨大影响。我们没有提到所谓的“下推”条件。Suppose that our 假设我们的WHERE
condition P(T1,T2,T3)
can be represented by a conjunctive formula:WHERE
条件P(T1,T2,T3)
可以用以下联接公式表示:
P(T1,T2,T2) = C1(T1) AND C2(T2) AND C3(T3).
In this case, MySQL actually uses the following nested-loop algorithm for the execution of the query with inner joins:在这种情况下,MySQL实际上使用以下嵌套循环算法来执行带有内部联接的查询:
FOR each row t1 in T1 such that C1(t1) { FOR each row t2 in T2 such that P1(t1,t2) AND C2(t2) { FOR each row t3 in T3 such that P2(t2,t3) AND C3(t3) { IF P(t1,t2,t3) { t:=t1||t2||t3; OUTPUT t; } } } }
You see that each of the conjuncts 您可以看到,每个接合点C1(T1)
, C2(T2)
, C3(T3)
are pushed out of the most inner loop to the most outer loop where it can be evaluated. C1(T1)
、C2(T2)
、C3(T3)
都被从最内环推到最外环,在那里可以对其进行评估。If 如果C1(T1)
is a very restrictive condition, this condition pushdown may greatly reduce the number of rows from table T1
passed to the inner loops. C1(T1)
是一个非常严格的条件,那么这种条件下推可能会大大减少从表T1
传递到内部循环的行数。As a result, the execution time for the query may improve immensely.因此,查询的执行时间可能会大大缩短。
For a query with outer joins, the 对于具有外部联接的查询,只有在发现外部表中的当前行与内部表中的行匹配后,才会检查WHERE
condition is to be checked only after it has been found that the current row from the outer table has a match in the inner tables. WHERE
条件。Thus, the optimization of pushing conditions out of the inner nested loops cannot be applied directly to queries with outer joins. Here we must introduce conditional pushed-down predicates guarded by the flags that are turned on when a match has been encountered.因此,将条件推出内部嵌套循环的优化不能直接应用于具有外部联接的查询。这里我们必须引入条件下推谓词,这些谓词由遇到匹配时打开的标志保护。
Recall this example with outer joins:回想一下使用外部联接的示例:
P(T1,T2,T3)=C1(T1) AND C(T2) AND C3(T3)
For that example, the nested-loop algorithm using guarded pushed-down conditions looks like this:对于该示例,使用保护下推条件的嵌套循环算法如下所示:
FOR each row t1 in T1 such that C1(t1) { BOOL f1:=FALSE; FOR each row t2 in T2 such that P1(t1,t2) AND (f1?C2(t2):TRUE) { BOOL f2:=FALSE; FOR each row t3 in T3 such that P2(t2,t3) AND (f1&&f2?C3(t3):TRUE) { IF (f1&&f2?TRUE:(C2(t2) AND C3(t3))) { t:=t1||t2||t3; OUTPUT t; } f2=TRUE; f1=TRUE; } IF (!f2) { IF (f1?TRUE:C2(t2) && P(t1,t2,NULL)) { t:=t1||t2||NULL; OUTPUT t; } f1=TRUE; } } IF (!f1 && P(t1,NULL,NULL)) { t:=t1||NULL||NULL; OUTPUT t; } }
In general, pushed-down predicates can be extracted from join conditions such as 通常,下推谓词可以从连接条件中提取,例如P1(T1,T2)
and P(T2,T3)
. P1(T1,T2)
和P(T2,T3)
。In this case, a pushed-down predicate is guarded also by a flag that prevents checking the predicate for the 在这种情况下,下推谓词还受到一个标志的保护,该标志防止检查谓词中由相应的外部联接操作生成的NULL
-complemented row generated by the corresponding outer join operation.NULL
补码行。
Access by key from one inner table to another in the same nested join is prohibited if it is induced by a predicate from the 如果同一嵌套联接中的一个内部表是由WHERE
condition.WHERE
条件的谓词诱导的,则禁止通过键从一个内部表访问另一个内部表。