Chapter 8 Optimization

8.2.1.13 Condition Filtering条件筛选

~~In join processing, prefix rows are those rows passed from one table in a join to the next.~~ 在联接处理中，前缀行是从联接中的一个表传递到下一个表的行。~~In general, the optimizer attempts to put tables with low prefix counts early in the join order to keep the number of row combinations from increasing rapidly.~~ 通常，优化器会尝试在联接的早期放置前缀计数较低的表，以防止行组合的数量快速增加。To the extent that the optimizer can use information about conditions on rows selected from one table and passed to the next, the more accurately it can compute row estimates and choose the best execution plan.优化器可以使用从一个表中选择并传递到下一个表的行上的条件信息，因此它可以更准确地计算行估计并选择最佳执行计划。

~~Without condition filtering, the prefix row count for a table is based on the estimated number of rows selected by the WHERE clause according to whichever access method the optimizer chooses.~~ 在没有条件筛选的情况下，表的前缀行数基于WHERE子句根据优化器选择的访问方法选择的估计行数。~~Condition filtering enables the optimizer to use other relevant conditions in the WHERE clause not taken into account by the access method, and thus improve its prefix row count estimates.~~ 条件筛选使优化器能够在WHERE子句中使用访问方法未考虑的其他相关条件，从而改进其前缀行数估计。For example, even though there might be an index-based access method that can be used to select rows from the current table in a join, there might also be additional conditions for the table in the WHERE clause that can filter (further restrict) the estimate for qualifying rows passed to the next table.例如，即使可能有一种基于索引的访问方法可用于从联接中的当前表中选择行，WHERE子句中的表也可能有其他条件，可以过滤（进一步限制）传递到下一个表的符合条件行的估计。

~~A condition contributes to the filtering estimate only if:~~只有在以下情况下，条件才有助于过滤估计：

~~It refers to the current table.~~它指的是当前表。
~~It depends on a constant value or values from earlier tables in the join sequence.~~它取决于一个常量值或联接序列中早期表中的值。
~~It was not already taken into account by the access method.~~访问方法尚未将其考虑在内。

~~In EXPLAIN output, the rows column indicates the row estimate for the chosen access method, and the filtered column reflects the effect of condition filtering.~~ 在EXPLAIN输出中，rows列表示所选访问方法的行估计值，筛选列反映条件筛选的效果。~~filtered values are expressed as percentages.~~ filtered值以百分比表示。~~The maximum value is 100, which means no filtering of rows occurred.~~ 最大值为100，这意味着没有对行进行筛选。~~Values decreasing from 100 indicate increasing amounts of filtering.~~值从100减少表示过滤量增加。

~~The prefix row count (the number of rows estimated to be passed from the current table in a join to the next) is the product of the rows and filtered values.~~ 前缀行数（在联接中从当前表传递到下一个表的估计行数）是rows和filtered值的乘积。~~That is, the prefix row count is the estimated row count, reduced by the estimated filtering effect.~~ 也就是说，前缀行计数是估计的行计数，由估计的过滤效果减少。~~For example, if rows is 1000 and filtered is 20%, condition filtering reduces the estimated row count of 1000 to a prefix row count of 1000 × 20% = 1000 × .2 = 200.~~例如，如果rows为1000，而filtered为20%，则条件筛选会将估计的行数1000减少为前缀行数1000 × 20%=1000×.2=200。

~~Consider the following query:~~考虑下面的查询：

SELECT *
  FROM employee JOIN department ON employee.dept_no = department.dept_no
  WHERE employee.first_name = 'John'
  AND employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01';

~~Suppose that the data set has these characteristics:~~假设数据集具有以下特征：

~~The employee table has 1024 rows.~~employee表有1024行。
~~The department table has 12 rows.~~department表有12行。
~~Both tables have an index on dept_no.~~两个表都有dept_no索引。
~~The employee table has an index on first_name.~~employee表在first_name上有一个索引。
~~8 rows satisfy this condition on employee.first_name:~~8行在employee.first_name上满足此条件：
```
employee.first_name = 'John'
```
~~150 rows satisfy this condition on employee.hire_date:~~150行在employee.hire_date上满足此条件：
```
employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01'
```

~~1 row satisfies both conditions:~~1行同时满足两个条件：

employee.first_name = 'John'
AND employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01'

~~Without condition filtering, EXPLAIN produces output like this:~~如果没有条件筛选，EXPLAIN将生成如下输出：

+----+------------+--------+------------------+---------+---------+------+----------+
| id | table      | type   | possible_keys    | key     | ref     | rows | filtered |
+----+------------+--------+------------------+---------+---------+------+----------+
| 1  | employee   | ref    | name,h_date,dept | name    | const   | 8    | 100.00   |
| 1  | department | eq_ref | PRIMARY          | PRIMARY | dept_no | 1    | 100.00   |
+----+------------+--------+------------------+---------+---------+------+----------+

~~For employee, the access method on the name index picks up the 8 rows that match a name of 'John'.~~ 对于employee，name索引上的访问方法会选择与名称'John'匹配的8行。~~No filtering is done (filtered is 100%), so all rows are prefix rows for the next table: The prefix row count is rows × filtered = 8 × 100% = 8.~~未进行筛选（filtered为100%），因此所有行都是下一个表的前缀行：前缀行计数为rows × filtered=8 × 100%=8。

~~With condition filtering, the optimizer additionally takes into account conditions from the WHERE clause not taken into account by the access method.~~ 通过条件筛选，优化器还将考虑访问方法未考虑的WHERE子句中的条件。~~In this case, the optimizer uses heuristics to estimate a filtering effect of 16.31% for the BETWEEN condition on employee.hire_date.~~ 在这种情况下，优化器使用启发式方法估计employee.hire_date上的BETWEEN条件的筛选效果为16.31%。~~As a result, EXPLAIN produces output like this:~~因此，EXPLAIN生成如下输出：

+----+------------+--------+------------------+---------+---------+------+----------+
| id | table      | type   | possible_keys    | key     | ref     | rows | filtered |
+----+------------+--------+------------------+---------+---------+------+----------+
| 1  | employee   | ref    | name,h_date,dept | name    | const   | 8    | 16.31    |
| 1  | department | eq_ref | PRIMARY          | PRIMARY | dept_no | 1    | 100.00   |
+----+------------+--------+------------------+---------+---------+------+----------+

~~Now the prefix row count is rows × filtered = 8 × 16.31% = 1.3, which more closely reflects actual data set.~~现在，前缀行计数是rows × filtered=8 × 16.31%=1.3，这更能反映实际数据集。

~~Normally, the optimizer does not calculate the condition filtering effect (prefix row count reduction) for the last joined table because there is no next table to pass rows to.~~ 通常，优化器不会计算上一个联接表的条件筛选效果（前缀行数减少），因为没有下一个要向其传递行的表。~~An exception occurs for EXPLAIN: To provide more information, the filtering effect is calculated for all joined tables, including the last one.~~EXPLAIN出现异常：为了提供更多信息，将对所有联接的表（包括最后一个表）计算过滤效果。

To control whether the optimizer considers additional filtering conditions, use the condition_fanout_filter flag of the optimizer_switch system variable (see Section 8.9.2, “Switchable Optimizations”). 要控制优化器是否考虑其他过滤条件，请使用optimizer_switch变量的condition_fanout_filter标志（请参阅第8.9.2节，“可切换的优化”）。~~This flag is enabled by default but can be disabled to suppress condition filtering (for example, if a particular query is found to yield better performance without it).~~默认情况下启用该标志，但可以禁用该标志以抑制条件筛选（例如，如果发现某个特定查询在没有条件筛选的情况下可以产生更好的性能）。

~~If the optimizer overestimates the effect of condition filtering, performance may be worse than if condition filtering is not used.~~ 如果优化器高估了条件筛选的效果，则性能可能会比不使用条件筛选时更差。~~In such cases, these techniques may help:~~在这种情况下，这些技术可能有助于：

~~If a column is not indexed, index it so that the optimizer has some information about the distribution of column values and can improve its row estimates.~~如果列未编制索引，请对其编制索引，以便优化器获得有关列值分布的一些信息，并可以改进其行估计。
~~Similarly, if no column histogram information is available, generate a histogram (see Section 8.9.6, “Optimizer Statistics”).~~同样，如果没有可用的列直方图信息，则生成直方图（请参阅第8.9.6节，“优化器统计”）。
~~Change the join order.~~ 更改联接顺序。~~Ways to accomplish this include join-order optimizer hints (see Section 8.9.3, “Optimizer Hints”), STRAIGHT_JOIN immediately following the SELECT, and the STRAIGHT_JOIN join operator.~~实现这一点的方法包括连接顺序优化器提示（请参阅第8.9.3节，“优化器提示”）、紧接着SELECT的STRAIGHT_JOIN以及STRAIGHT_JOIN操作符。
~~Disable condition filtering for the session:~~禁用会话的条件筛选：
```
SET optimizer_switch = 'condition_fanout_filter=off';
```
~~Or, for a given query, using an optimizer hint:~~或者，对于给定查询，使用优化器提示：
```
SELECT /*+ SET_VAR(optimizer_switch = 'condition_fanout_filter=off') */ ...
```