The most general way to satisfy a 满足GROUPBY子句的最常用方法是扫描整个表并创建一个新的临时表,其中每个组中的所有行都是连续的,然后使用此临时表来发现组并应用聚合函数(如果有)。GROUP BY
clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any). In some cases, MySQL is able to do much better than that and avoid creation of temporary tables by using index access.在某些情况下,使用MySQL来避免创建表要比使用临时表好得多。
The most important preconditions for using indexes for 为GROUP BY
are that all GROUP BY
columns reference attributes from the same index, and that the index stores its keys in order (as is true, for example, for a BTREE
index, but not for a HASH
index). GROUP BY
使用索引的最重要的先决条件是,所有GROUP BY
列都引用同一索引中的属性,并且索引按顺序存储其键(例如,对于BTREE
索引是如此,但对于哈希索引则不是如此)。Whether use of temporary tables can be replaced by index access also depends on which parts of an index are used in a query, the conditions specified for these parts, and the selected aggregate functions.临时表的使用是否可以替换为索引访问还取决于查询中使用了索引的哪些部分、为这些部分指定的条件以及所选的聚合函数。
There are two ways to execute a 有两种方法可以通过索引访问执行GROUP BY
query through index access, as detailed in the following sections. GROUP BY
查询,详见以下各节。The first method applies the grouping operation together with all range predicates (if any). 第一种方法将分组操作与所有范围谓词(如果有)一起应用。The second method first performs a range scan, and then groups the resulting tuples.第二种方法首先执行范围扫描,然后对生成的元组进行分组。
Loose Index Scan can also be used in the absence of 在某些情况下,如果没有GROUP BY
under some conditions. GROUP BY
,也可以使用松散索引扫描。See Skip Scan Range Access Method.请参阅“跳过扫描范围访问方法”。
The most efficient way to process 处理GROUP BY
is when an index is used to directly retrieve the grouping columns. GROUP BY
的最有效方法是使用索引直接检索分组列。With this access method, MySQL uses the property of some index types that the keys are ordered (for example, 通过这种访问方法,MySQL使用一些索引类型的属性,这些索引类型的键是按顺序排列的(例如,BTREE
). BTREE
)。This property enables use of lookup groups in an index without having to consider all keys in the index that satisfy all 此属性允许在索引中使用查找组,而不必考虑索引中满足所有WHERE
conditions. WHERE
条件的所有键。This access method considers only a fraction of the keys in an index, so it is called a Loose Index Scan. 这种访问方法只考虑索引中的一小部分键,因此称为“松散索引扫描”。When there is no 当没有WHERE
clause, a Loose Index Scan reads as many keys as the number of groups, which may be a much smaller number than that of all keys. WHERE
子句时,松散索引扫描读取的键数与组数相同,这可能比所有键的数目小得多。If the 如果WHERE
clause contains range predicates (see the discussion of the range
join type in Section 8.8.1, “Optimizing Queries with EXPLAIN”), a Loose Index Scan looks up the first key of each group that satisfies the range conditions, and again reads the smallest possible number of keys. WHERE
子句包含范围谓词(请参阅第8.8.1节,“使用EXPLAIN优化查询”中对range
联接类型的讨论),松散索引扫描将查找满足范围条件的每个组的第一个键,并再次读取尽可能少的键。This is possible under the following conditions:这在以下条件下是可能的:
The query is over a single table.查询是在单个表上进行的。
The GROUP BY
names only columns that form a leftmost prefix of the index and no other columns. GROUP BY
只命名构成索引最左边前缀的列,不命名其他列。(If, instead of (如果查询具有GROUP BY
, the query has a DISTINCT
clause, all distinct attributes refer to columns that form a leftmost prefix of the index.) DISTINCT
子句而不是GROUP BY
,则所有DISTINCT
属性都引用构成索引最左侧前缀的列。)For example, if a table 例如,如果表t1
has an index on (c1,c2,c3)
, Loose Index Scan is applicable if the query has GROUP BY c1, c2
. t1
在(c1,c2,c3)
上有索引,则如果查询具有GROUP BY c1、c2
,则适用松散索引扫描。It is not applicable if the query has 如果查询具有GROUP BY c2, c3
(the columns are not a leftmost prefix) or GROUP BY c1, c2, c4
(c4
is not in the index).GROUP BY c2, c3
(列不是最左边的前缀)或GROUP BY c1, c2, c4
(c4
不在索引中),则此选项不适用。
The only aggregate functions used in the select list (if any) are 选择列表中使用的唯一聚合函数(如果有)是MIN()
and MAX()
, and all of them refer to the same column. MIN()
和MAX()
,它们都引用同一列。The column must be in the index and must immediately follow the columns in the 该列必须位于索引中,并且必须紧跟在GROUP BY
.GROUP BY
中的列之后。
Any other parts of the index than those from the 除了查询中引用的GROUP BY
referenced in the query must be constants (that is, they must be referenced in equalities with constants), except for the argument of MIN()
or MAX()
functions.GROUP BY
之外,索引的任何其他部分都必须是常量(也就是说,它们必须与常量相等地引用),但MIN()
或MAX()
函数的参数除外。
For columns in the index, full column values must be indexed, not just a prefix. 对于索引中的列,必须为完整列值编制索引,而不仅仅是前缀。For example, with 例如,对于c1 VARCHAR(20), INDEX (c1(10))
, the index uses only a prefix of c1
values and cannot be used for Loose Index Scan.c1 VARCHAR(20), INDEX (c1(10))
,索引仅使用c1
值的前缀,不能用于松散索引扫描。
If Loose Index Scan is applicable to a query, the 如果松散索引扫描适用于查询,则EXPLAIN
output shows Using index for group-by
in the Extra
column.EXPLAIN
输出会在Extra
列中显示Using index for group-by
。
Assume that there is an index 假设表idx(c1,c2,c3)
on table t1(c1,c2,c3,c4)
. t1(c1,c2,c3,c4)
上有一个索引idx(c1,c2,c3)
。The Loose Index Scan access method can be used for the following queries:松散索引扫描访问方法可用于以下查询:
SELECT c1, c2 FROM t1 GROUP BY c1, c2; SELECT DISTINCT c1, c2 FROM t1; SELECT c1, MIN(c2) FROM t1 GROUP BY c1; SELECT c1, c2 FROM t1 WHERE c1 <const
GROUP BY c1, c2; SELECT MAX(c3), MIN(c3), c1, c2 FROM t1 WHERE c2 >const
GROUP BY c1, c2; SELECT c2 FROM t1 WHERE c1 <const
GROUP BY c1, c2; SELECT c1, c2 FROM t1 WHERE c3 =const
GROUP BY c1, c2;
The following queries cannot be executed with this quick select method, for the reasons given:由于以下原因,无法使用此快速选择方法执行以下查询:
There are aggregate functions other than 存在除MIN()
or MAX()
:MIN()
或MAX()
以外的聚合函数:
SELECT c1, SUM(c2) FROM t1 GROUP BY c1;
The columns in the GROUP BY
clause do not form a leftmost prefix of the index:GROUP BY
子句中的列不构成索引的最左侧前缀:
SELECT c1, c2 FROM t1 GROUP BY c2, c3;
The query refers to a part of a key that comes after the 查询是指在GROUP BY
part, and for which there is no equality with a constant:GROUP BY
部分之后的键的一部分,该部分与常量不相等:
SELECT c1, c3 FROM t1 GROUP BY c1, c2;
Were the query to include 如果查询包含WHERE c3 =
, Loose Index Scan could be used.const
WHERE c3 =
,则可以使用松散索引扫描。const
The Loose Index Scan access method can be applied to other forms of aggregate function references in the select list, in addition to the 除已支持的MIN()
and MAX()
references already supported:MIN()
和MAX()
引用外,松散索引扫描访问方法还可应用于选择列表中其他形式的聚合函数引用:
支持AVG(DISTINCT)
, SUM(DISTINCT)
, and COUNT(DISTINCT)
are supported. AVG(DISTINCT)
、SUM(DISTINCT)
和COUNT(DISTINCT)
。AVG(DISTINCT)
and SUM(DISTINCT)
take a single argument. AVG(DISTINCT)
和SUM(DISTINCT)
采用单个参数。COUNT(DISTINCT)
can have more than one column argument.COUNT(DISTINCT)
可以有多个列参数。
There must be no 查询中不能有GROUP BY
or DISTINCT
clause in the query.GROUP BY
或DISTINCT
子句。
The Loose Index Scan limitations described previously still apply.前面描述的松散索引扫描限制仍然适用。
Assume that there is an index 假设表idx(c1,c2,c3)
on table t1(c1,c2,c3,c4)
. t1(c1,c2,c3,c4)
上有一个索引idx(c1,c2,c3)
。The Loose Index Scan access method can be used for the following queries:松散索引扫描访问方法可用于以下查询:
SELECT COUNT(DISTINCT c1), SUM(DISTINCT c1) FROM t1; SELECT COUNT(DISTINCT c1, c2), COUNT(DISTINCT c2, c1) FROM t1;
A Tight Index Scan may be either a full index scan or a range index scan, depending on the query conditions.紧密索引扫描可以是完全索引扫描,也可以是范围索引扫描,具体取决于查询条件。
When the conditions for a Loose Index Scan are not met, it still may be possible to avoid creation of temporary tables for 当不满足松散索引扫描的条件时,仍然可以避免为GROUP BY
queries. GROUP BY
查询创建临时表。If there are range conditions in the 如果WHERE
clause, this method reads only the keys that satisfy these conditions. WHERE
子句中存在范围条件,则此方法仅读取满足这些条件的键。Otherwise, it performs an index scan. 否则,它将执行索引扫描。Because this method reads all keys in each range defined by the 由于此方法读取WHERE
clause, or scans the whole index if there are no range conditions, it is called a Tight Index Scan. WHERE
子句定义的每个范围内的所有键,或者如果没有范围条件,则扫描整个索引,因此称为“紧密索引扫描”。With a Tight Index Scan, the grouping operation is performed only after all keys that satisfy the range conditions have been found.对于紧密索引扫描,只有在找到满足范围条件的所有键之后,才能执行分组操作。
For this method to work, it is sufficient that there be a constant equality condition for all columns in a query referring to parts of the key coming before or in between parts of the 要使此方法起作用,查询中的所有列都有一个恒定的相等条件就足够了,该查询引用的键部分位于GROUP BY
key. GROUP BY
键部分之前或之间。The constants from the equality conditions fill in any “gaps” in the search keys so that it is possible to form complete prefixes of the index. 相等条件中的常量将填充搜索键中的任何“空白”,以便可以形成索引的完整前缀。These index prefixes then can be used for index lookups. 然后,这些索引前缀可用于索引查找。If the 如果GROUP BY
result requires sorting, and it is possible to form search keys that are prefixes of the index, MySQL also avoids extra sorting operations because searching with prefixes in an ordered index already retrieves all the keys in order.GROUP BY
结果需要排序,并且可以形成作为索引前缀的搜索键,MySQL还可以避免额外的排序操作,因为在有序索引中使用前缀进行搜索已经按顺序检索所有键。
Assume that there is an index 假设表t1(c1,c2,c3,c4)上有一个索引idx(c1,c2,c3)
on table t1(c1,c2,c3,c4)
. idx(c1,c2,c3)
。The following queries do not work with the Loose Index Scan access method described previously, but still work with the Tight Index Scan access method.以下查询不适用于前面描述的松散索引扫描访问方法,但仍适用于紧密索引扫描访问方法。
There is a gap in the GROUP BY
, but it is covered by the condition c2 = 'a'
:GROUP BY
中有一个间隙,但它被条件c2 = 'a'
覆盖:
SELECT c1, c2, c3 FROM t1 WHERE c2 = 'a' GROUP BY c1, c3;
The GROUP BY
does not begin with the first part of the key, but there is a condition that provides a constant for that part:GROUP BY
不以键的第一部分开始,但有一个条件为该部分提供常量:
SELECT c1, c2, c3 FROM t1 WHERE c1 = 'a' GROUP BY c2, c3;