In join processing, prefix rows are those rows passed from one table in a join to the next. 在联接处理中,前缀行是从联接中的一个表传递到下一个表的行。In general, the optimizer attempts to put tables with low prefix counts early in the join order to keep the number of row combinations from increasing rapidly. 通常,优化器会尝试在联接的早期放置前缀计数较低的表,以防止行组合的数量快速增加。To the extent that the optimizer can use information about conditions on rows selected from one table and passed to the next, the more accurately it can compute row estimates and choose the best execution plan.优化器可以使用从一个表中选择并传递到下一个表的行上的条件信息,因此它可以更准确地计算行估计并选择最佳执行计划。
Without condition filtering, the prefix row count for a table is based on the estimated number of rows selected by the 在没有条件筛选的情况下,表的前缀行数基于WHERE
clause according to whichever access method the optimizer chooses. WHERE
子句根据优化器选择的访问方法选择的估计行数。Condition filtering enables the optimizer to use other relevant conditions in the 条件筛选使优化器能够在WHERE
clause not taken into account by the access method, and thus improve its prefix row count estimates. WHERE
子句中使用访问方法未考虑的其他相关条件,从而改进其前缀行数估计。For example, even though there might be an index-based access method that can be used to select rows from the current table in a join, there might also be additional conditions for the table in the 例如,即使可能有一种基于索引的访问方法可用于从联接中的当前表中选择行,WHERE
clause that can filter (further restrict) the estimate for qualifying rows passed to the next table.WHERE
子句中的表也可能有其他条件,可以过滤(进一步限制)传递到下一个表的符合条件行的估计。
A condition contributes to the filtering estimate only if:只有在以下情况下,条件才有助于过滤估计:
It refers to the current table.它指的是当前表。
It depends on a constant value or values from earlier tables in the join sequence.它取决于一个常量值或联接序列中早期表中的值。
It was not already taken into account by the access method.访问方法尚未将其考虑在内。
In 在EXPLAIN
output, the rows
column indicates the row estimate for the chosen access method, and the filtered
column reflects the effect of condition filtering. EXPLAIN
输出中,rows
列表示所选访问方法的行估计值,筛选列反映条件筛选的效果。filtered
values are expressed as percentages. filtered
值以百分比表示。The maximum value is 100, which means no filtering of rows occurred. 最大值为100,这意味着没有对行进行筛选。Values decreasing from 100 indicate increasing amounts of filtering.值从100减少表示过滤量增加。
The prefix row count (the number of rows estimated to be passed from the current table in a join to the next) is the product of the 前缀行数(在联接中从当前表传递到下一个表的估计行数)是rows
and filtered
values. rows
和filtered
值的乘积。That is, the prefix row count is the estimated row count, reduced by the estimated filtering effect. 也就是说,前缀行计数是估计的行计数,由估计的过滤效果减少。For example, if 例如,如果rows
is 1000 and filtered
is 20%, condition filtering reduces the estimated row count of 1000 to a prefix row count of 1000 × 20% = 1000 × .2 = 200.rows
为1000,而filtered
为20%,则条件筛选会将估计的行数1000减少为前缀行数1000 × 20%=1000×.2=200。
Consider the following query:考虑下面的查询:
SELECT * FROM employee JOIN department ON employee.dept_no = department.dept_no WHERE employee.first_name = 'John' AND employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01';
Suppose that the data set has these characteristics:假设数据集具有以下特征:
The employee
table has 1024 rows.employee
表有1024行。
The department
table has 12 rows.department
表有12行。
Both tables have an index on 两个表都有dept_no
.dept_no
索引。
The employee
table has an index on first_name
.employee
表在first_name
上有一个索引。
8 rows satisfy this condition on 8行在employee.first_name
:employee.first_name
上满足此条件:
employee.first_name = 'John'
150 rows satisfy this condition on 150行在employee.hire_date
:employee.hire_date
上满足此条件:
employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01'
1 row satisfies both conditions:1行同时满足两个条件:
employee.first_name = 'John' AND employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01'
Without condition filtering, 如果没有条件筛选,EXPLAIN
produces output like this:EXPLAIN
将生成如下输出:
+----+------------+--------+------------------+---------+---------+------+----------+ | id | table | type | possible_keys | key | ref | rows | filtered | +----+------------+--------+------------------+---------+---------+------+----------+ | 1 | employee | ref | name,h_date,dept | name | const | 8 | 100.00 | | 1 | department | eq_ref | PRIMARY | PRIMARY | dept_no | 1 | 100.00 | +----+------------+--------+------------------+---------+---------+------+----------+
For 对于employee
, the access method on the name
index picks up the 8 rows that match a name of 'John'
. employee
,name
索引上的访问方法会选择与名称'John'
匹配的8行。No filtering is done (未进行筛选(filtered
is 100%), so all rows are prefix rows for the next table: The prefix row count is rows
× filtered
= 8 × 100% = 8.filtered
为100%),因此所有行都是下一个表的前缀行:前缀行计数为rows
× filtered
=8 × 100%=8。
With condition filtering, the optimizer additionally takes into account conditions from the 通过条件筛选,优化器还将考虑访问方法未考虑的WHERE
clause not taken into account by the access method. WHERE
子句中的条件。In this case, the optimizer uses heuristics to estimate a filtering effect of 16.31% for the 在这种情况下,优化器使用启发式方法估计BETWEEN
condition on employee.hire_date
. employee.hire_date
上的BETWEEN
条件的筛选效果为16.31%。As a result, 因此,EXPLAIN
produces output like this:EXPLAIN
生成如下输出:
+----+------------+--------+------------------+---------+---------+------+----------+ | id | table | type | possible_keys | key | ref | rows | filtered | +----+------------+--------+------------------+---------+---------+------+----------+ | 1 | employee | ref | name,h_date,dept | name | const | 8 | 16.31 | | 1 | department | eq_ref | PRIMARY | PRIMARY | dept_no | 1 | 100.00 | +----+------------+--------+------------------+---------+---------+------+----------+
Now the prefix row count is 现在,前缀行计数是rows
× filtered
= 8 × 16.31% = 1.3, which more closely reflects actual data set.rows
× filtered
=8 × 16.31%=1.3,这更能反映实际数据集。
Normally, the optimizer does not calculate the condition filtering effect (prefix row count reduction) for the last joined table because there is no next table to pass rows to. 通常,优化器不会计算上一个联接表的条件筛选效果(前缀行数减少),因为没有下一个要向其传递行的表。An exception occurs for EXPLAIN
: To provide more information, the filtering effect is calculated for all joined tables, including the last one.EXPLAIN
出现异常:为了提供更多信息,将对所有联接的表(包括最后一个表)计算过滤效果。
To control whether the optimizer considers additional filtering conditions, use the 要控制优化器是否考虑其他过滤条件,请使用condition_fanout_filter
flag of the optimizer_switch
system variable (see Section 8.9.2, “Switchable Optimizations”). optimizer_switch
变量的condition_fanout_filter
标志(请参阅第8.9.2节,“可切换的优化”)。This flag is enabled by default but can be disabled to suppress condition filtering (for example, if a particular query is found to yield better performance without it).默认情况下启用该标志,但可以禁用该标志以抑制条件筛选(例如,如果发现某个特定查询在没有条件筛选的情况下可以产生更好的性能)。
If the optimizer overestimates the effect of condition filtering, performance may be worse than if condition filtering is not used. 如果优化器高估了条件筛选的效果,则性能可能会比不使用条件筛选时更差。In such cases, these techniques may help:在这种情况下,这些技术可能有助于:
If a column is not indexed, index it so that the optimizer has some information about the distribution of column values and can improve its row estimates.如果列未编制索引,请对其编制索引,以便优化器获得有关列值分布的一些信息,并可以改进其行估计。
Similarly, if no column histogram information is available, generate a histogram (see Section 8.9.6, “Optimizer Statistics”).同样,如果没有可用的列直方图信息,则生成直方图(请参阅第8.9.6节,“优化器统计”)。
Change the join order. 更改联接顺序。Ways to accomplish this include join-order optimizer hints (see Section 8.9.3, “Optimizer Hints”), 实现这一点的方法包括连接顺序优化器提示(请参阅第8.9.3节,“优化器提示”)、紧接着STRAIGHT_JOIN
immediately following the SELECT
, and the STRAIGHT_JOIN
join operator.SELECT
的STRAIGHT_JOIN
以及STRAIGHT_JOIN
操作符。
Disable condition filtering for the session:禁用会话的条件筛选:
SET optimizer_switch = 'condition_fanout_filter=off';
Or, for a given query, using an optimizer hint:或者,对于给定查询,使用优化器提示:
SELECT /*+ SET_VAR(optimizer_switch = 'condition_fanout_filter=off') */ ...