8.2.2.4 Optimizing Derived Tables, View References, and Common Table Expressions with Merging or Materialization通过合并或物化优化派生表、视图引用和公共表表达式

The optimizer can handle derived table references using two strategies (which also apply to view references and common table expressions):优化器可以使用两种策略(也适用于视图引用和公共表表达式)处理派生表引用:

Example 1:例1:

SELECT * FROM (SELECT * FROM t1) AS derived_t1;

With merging of the derived table derived_t1, that query is executed similar to:合并派生表derived_t1后,执行的查询类似于:

SELECT * FROM t1;

Example 2:例2:

SELECT *
  FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2 ON t1.f2=derived_t2.f1
  WHERE t1.f1 > 0;

With merging of the derived table derived_t2, that query is executed similar to:合并派生表derived_t2后,执行的查询类似于:

SELECT t1.*, t2.f1
  FROM t1 JOIN t2 ON t1.f2=t2.f1
  WHERE t1.f1 > 0;

With materialization, derived_t1 and derived_t2 are each treated as a separate table within their respective queries.通过物化,derived_t1derived_t2在各自的查询中都被视为一个单独的表。

The optimizer handles derived tables, view references, and common table expressions the same way: It avoids unnecessary materialization whenever possible, which enables pushing down conditions from the outer query to derived tables and produces more efficient execution plans. 优化器以相同的方式处理派生表、视图引用和公共表表达式:它尽可能避免不必要的物化,从而可以将条件从外部查询下推到派生表,并生成更高效的执行计划。(For an example, see Section 8.2.2.2, “Optimizing Subqueries with Materialization”.)(例如,请参阅第8.2.2.2节,“使用物化优化子查询”。)

If merging would result in an outer query block that references more than 61 base tables, the optimizer chooses materialization instead.如果合并将导致引用超过61个基表的外部查询块,那么优化器将选择物化。

The optimizer propagates an ORDER BY clause in a derived table or view reference to the outer query block if these conditions are all true:如果以下条件均为真,优化器将派生表或视图引用中的ORDER BY子句传播到外部查询块:

Otherwise, the optimizer ignores the ORDER BY clause.否则,优化器将忽略ORDER BY子句。

The following means are available to influence whether the optimizer attempts to merge derived tables, view references, and common table expressions into the outer query block:以下方法可用于影响优化器是否尝试将派生表、视图引用和公共表表达式合并到外部查询块中:

If the optimizer chooses the materialization strategy rather than merging for a derived table, it handles the query as follows:如果优化器选择物化策略而不是合并派生表,它将按如下方式处理查询:

Consider the following EXPLAIN statement, for a SELECT query that contains a derived table:对于包含派生表的SELECT查询,请考虑下面的EXPLAIN语句:

EXPLAIN SELECT * FROM (SELECT * FROM t1) AS derived_t1;

The optimizer avoids materializing the derived table by delaying it until the result is needed during SELECT execution. 优化器通过将派生表延迟到SELECT执行过程中需要结果时,从而避免具体化派生表。In this case, the query is not executed (because it occurs in an EXPLAIN statement), so the result is never needed.在这种情况下,查询不会执行(因为它发生在EXPLAIN语句中),因此永远不需要结果。

Even for queries that are executed, delay of derived table materialization may enable the optimizer to avoid materialization entirely. 即使对于已执行的查询,派生表物化的延迟也可能使优化器完全避免物化。When this happens, query execution is quicker by the time needed to perform materialization. 当这种情况发生时,查询执行会比执行物化所需的时间更快。Consider the following query, which joins the result of a derived table to another table:考虑下面的查询,它将派生表的结果与另一个表连接:

SELECT *
  FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2
          ON t1.f2=derived_t2.f1
  WHERE t1.f1 > 0;

If the optimization processes t1 first and the WHERE clause produces an empty result, the join must necessarily be empty and the derived table need not be materialized.如果优化首先处理t1,并且WHERE子句生成一个空结果,那么连接必须是空的,并且派生表不需要具体化。

For cases when a derived table requires materialization, the optimizer may add an index to the materialized table to speed up access to it. 对于派生表需要物化的情况,优化器可以向物化表添加索引以加快对它的访问。If such an index enables ref access to the table, it can greatly reduce amount of data read during query execution. 如果这样的索引允许ref访问表,则可以大大减少查询执行期间读取的数据量。Consider the following query:考虑下面的查询:

SELECT *
 FROM t1 JOIN (SELECT DISTINCT f1 FROM t2) AS derived_t2
         ON t1.f1=derived_t2.f1;

The optimizer constructs an index over column f1 from derived_t2 if doing so would enable use of ref access for the lowest cost execution plan. 如果这样做将允许对最低成本的执行计划使用ref访问,那么优化器将在derived_t2f1列上构造一个索引。After adding the index, the optimizer can treat the materialized derived table the same as a regular table with an index, and it benefits similarly from the generated index. 添加索引后,优化器可以将物化的派生表视为具有索引的常规表,并从生成的索引中获得类似的好处。The overhead of index creation is negligible compared to the cost of query execution without the index. 与没有索引的查询执行成本相比,创建索引的开销可以忽略不计。If ref access would result in higher cost than some other access method, the optimizer creates no index and loses nothing.如果ref访问会导致比其他访问方法更高的成本,那么优化器不会创建索引,也不会丢失任何内容。

For optimizer trace output, a merged derived table or view reference is not shown as a node. 对于优化器跟踪输出,合并的派生表或视图引用不显示为节点。Only its underlying tables appear in the top query's plan.顶部查询的计划中仅显示其基础表。

What is true for materialization of derived tables is also true for common table expressions (CTEs). 对于派生表的物化是正确的,对于公共表表达式(CTE)也是正确的。In addition, the following considerations pertain specifically to CTEs.此外,以下注意事项特别适用于CTE。

If a CTE is materialized by a query, it is materialized once for the query, even if the query references it several times.如果一个CTE由一个查询具体化,那么即使查询多次引用它,它也会为该查询具体化一次。

A recursive CTE is always materialized.递归CTE总是具体化的。

If a CTE is materialized, the optimizer automatically adds relevant indexes if it estimates that indexing can speed up access by the top-level statement to the CTE. 如果CTE被物化,那么如果优化器估计索引可以加快顶级语句对CTE的访问,那么它会自动添加相关索引。This is similar to automatic indexing of derived tables, except that if the CTE is referenced multiple times, the optimizer may create multiple indexes, to speed up access by each reference in the most appropriate way.这类似于派生表的自动索引,只是如果CTE被多次引用,优化器可能会创建多个索引,以最适当的方式加速每个引用的访问。

The MERGE and NO_MERGE optimizer hints can be applied to CTEs. MERGE优化器和NO_MERGE优化器提示可应用于CTE。Each CTE reference in the top-level statement can have its own hint, permitting CTE references to be selectively merged or materialized. 顶级语句中的每个CTE引用都可以有自己的提示,允许有选择地合并或具体化CTE引用。The following statement uses hints to indicate that cte1 should be merged and cte2 should be materialized:下面的语句使用提示指示cte1应该合并、cte2应该具体化:

WITH
  cte1 AS (SELECT a, b FROM table1),
  cte2 AS (SELECT c, d FROM table2)
SELECT /*+ MERGE(cte1) NO_MERGE(cte2) */ cte1.b, cte2.d
FROM cte1 JOIN cte2
WHERE cte1.a = cte2.c;

The ALGORITHM clause for CREATE VIEW does not affect materialization for any WITH clause preceding the SELECT statement in the view definition. CREATE VIEWALGORITHM子句不影响视图定义中SELECT语句前面的WITH子句的具体化。Consider this statement:考虑一下这个说法:

CREATE ALGORITHM={TEMPTABLE|MERGE} VIEW v1 AS WITH ... SELECT ...

The ALGORITHM value affects materialization only of the SELECT, not the WITH clause.ALGORITHM值仅影响SELECT的具体化,而不影响WITH子句。

Prior to MySQL 8.0.16, if internal_tmp_disk_storage_engine=MYISAM, an error occurred for any attempt to materialize a CTE using an on-disk temporary table, since for CTEs, the storage engine used for on-disk internal temporary tables could not be MyISAM. 在MySQL 8.0.16之前,如果internal_tmp_disk_storage_engine=MYISAM,则任何使用磁盘上临时表具体化CTE的尝试都会发生错误,因为对于CTE,用于磁盘上内部临时表的存储引擎不能是MYISAMBeginning with MySQL 8.0.16, this is no longer an issue, since TempTable now always uses InnoDB for on-disk internal temporary tables.从MySQL 8.0.16开始,这不再是一个问题,因为TempTable现在总是使用InnoDB作为磁盘上的内部临时表。

As mentioned previously, a CTE, if materialized, is materialized once, even if referenced multiple times. 如前所述,CTE如果被物化,则被物化一次,即使被多次引用。To indicate one-time materialization, optimizer trace output contains an occurrence of creating_tmp_table plus one or more occurrences of reusing_tmp_table.为了指示一次性物化,优化器跟踪输出包含creating_tmp_table的出现和reusing_tmp_table的一次或多次出现。

CTEs are similar to derived tables, for which the materialized_from_subquery node follows the reference. CTE类似于派生表,其materialized_from_subquery节点遵循引用。This is true for a CTE that is referenced multiple times, so there is no duplication of materialized_from_subquery nodes (which would give the impression that the subquery is executed multiple times, and produce unnecessarily verbose output). 这对于多次引用的CTE是正确的,因此不存在materialized_from_subquery节点的复制(这会给人子查询执行多次的印象,并产生不必要的详细输出)。Only one reference to the CTE has a complete materialized_from_subquery node with the description of its subquery plan. 只有一个对CTE的引用具有完整的materialized_from_subquery节点及其子查询计划的描述。Other references have a reduced materialized_from_subquery node. 其他引用具有一个减量的materialized_from_subquery节点。The same idea applies to EXPLAIN output in TRADITIONAL format: Subqueries for other references are not shown.同样的想法也适用于以TRADITIONAL格式EXPLAIN输出:不显示其他引用的子查询。