Storage engines collect statistics about tables for use by the optimizer. Table statistics are based on value groups, where a value group is a set of rows with the same key prefix value. For optimizer purposes, an important statistic is the average value group size.
MySQL uses the average value group size in the following ways:
To estimate how many rows must be read for each ref access
To estimate how many rows a partial join produces, that is, the number of rows produced by an operation of the form
(...) JOIN tbl_name ON tbl_name.key = expr
As the average value group size for an index increases, the index is less useful for those two purposes because the average number of rows per lookup increases. For the index to be good for optimization purposes, it is best that each index value target a small number of rows in the table. When a given index value yields a large number of rows, the index is less useful and MySQL is less likely to use it.
The average value group size is related to table cardinality, which is the number of value groups. The SHOW INDEX statement displays a cardinality value based on N/S, where N is the number of rows in the table and S is the average value group size. That ratio yields an approximate number of value groups in the table.
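The relationship between N, S, and cardinality can be checked with a little arithmetic. The following is a minimal sketch in Python, not MySQL's implementation: the function name index_stats and the sample key values are hypothetical, and the input is assumed to be non-empty.

```python
from collections import Counter

def index_stats(key_values):
    """Return (N, S, cardinality) for a non-empty list of key prefix values."""
    groups = Counter(key_values)   # one entry per value group
    n = len(key_values)            # N: number of rows
    s = n / len(groups)            # S: average value group size
    return n, s, n / s             # N/S: approximate number of value groups

# Six rows in three value groups: "a" (3 rows), "b" (2 rows), "c" (1 row).
rows = ["a", "a", "a", "b", "b", "c"]
print(index_stats(rows))  # (6, 2.0, 3.0)
```

With an average of 2 rows per value group, a lookup on a single key value is expected to read about 2 rows, and N/S recovers the 3 value groups.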
For a join based on the <=> comparison operator, NULL is not treated differently from any other value: NULL <=> NULL, just as N <=> N for any other N.
However, for a join based on the = operator, NULL is different from non-NULL values: expr1 = expr2 is not true when expr1 or expr2 (or both) are NULL. This affects ref accesses for comparisons of the form tbl_name.key = expr: MySQL does not access the table if the current value of expr is NULL, because the comparison cannot be true.
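The difference between the two operators can be modeled in a few lines. This is an illustrative Python sketch that uses None to stand in for NULL; the function names and the dictionary-based index are hypothetical and are not part of MySQL.

```python
def sql_eq(a, b):
    """Model of `=`: the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def sql_null_safe_eq(a, b):
    """Model of `<=>`: NULL equals NULL and differs from every other value."""
    if a is None or b is None:
        return a is None and b is None
    return a == b

def ref_lookup(index, expr):
    """Hypothetical ref access for tbl_name.key = expr.

    `index` maps each key value to the list of matching row ids.
    """
    if expr is None:
        return []                  # the comparison cannot be true: skip the table
    return index.get(expr, [])

index = {None: [1, 2], 5: [3]}     # two rows with a NULL key, one row with key 5
print(sql_eq(None, None))            # None
print(sql_null_safe_eq(None, None))  # True
print(ref_lookup(index, None))       # []
print(ref_lookup(index, 5))          # [3]
```

Even though the index contains rows with NULL keys, a lookup with a NULL expr reads none of them, which is why the number of NULL values does not matter for = comparisons.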
For = comparisons, it does not matter how many NULL values are in the table. For optimization purposes, the relevant value is the average size of the non-NULL value groups. However, MySQL does not currently enable that average size to be collected or used.
For InnoDB and MyISAM tables, you have some control over collection of table statistics by means of the innodb_stats_method and myisam_stats_method system variables, respectively. These variables have three possible values, which differ as follows:
When the variable is set to nulls_equal, all NULL values are treated as identical (that is, they all form a single value group).
If the NULL value group size is much higher than the average non-NULL value group size, this method skews the average value group size upward. This makes the index appear to the optimizer to be less useful than it really is for joins that look for non-NULL values. Consequently, the nulls_equal method may cause the optimizer not to use the index for ref accesses when it should.
When the variable is set to nulls_unequal, NULL values are not considered the same. Instead, each NULL value forms a separate value group of size 1.
If you have many NULL values, this method skews the average value group size downward. If the average non-NULL value group size is large, counting NULL values each as a group of size 1 causes the optimizer to overestimate the value of the index for joins that look for non-NULL values. Consequently, the nulls_unequal method may cause the optimizer to use this index for ref lookups when other methods may be better.
When the variable is set to nulls_ignored, NULL values are ignored.
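The effect of the three settings on the average value group size can be illustrated with a small calculation. The sketch below is a simplified Python model, not the storage engines' actual code: avg_group_size is a hypothetical helper, None stands in for NULL, and nulls_ignored is assumed to drop NULL rows from both the row count and the group count. The input is assumed to contain at least one non-NULL value.

```python
def avg_group_size(key_values, method):
    """Average value group size under a simplified model of each stats method."""
    non_null = [v for v in key_values if v is not None]
    null_count = len(key_values) - len(non_null)
    groups = len(set(non_null))            # number of non-NULL value groups
    if method == "nulls_equal":
        rows = len(key_values)
        groups += 1 if null_count else 0   # all NULLs form a single group
    elif method == "nulls_unequal":
        rows = len(key_values)
        groups += null_count               # each NULL is its own group of size 1
    elif method == "nulls_ignored":
        rows = len(non_null)               # assumption: NULL rows are not counted
    else:
        raise ValueError(f"unknown method: {method}")
    return rows / groups

# Two non-NULL groups of 2 rows each, plus 6 NULL values (10 rows in total).
keys = [1, 1, 2, 2] + [None] * 6
for method in ("nulls_equal", "nulls_unequal", "nulls_ignored"):
    print(method, round(avg_group_size(keys, method), 2))
# nulls_equal 3.33
# nulls_unequal 1.25
# nulls_ignored 2.0
```

The true average non-NULL group size here is 2. In this model, nulls_equal reports a larger value (3.33) because of the single large NULL group, and nulls_unequal reports a smaller one (1.25) because of the six size-1 groups, matching the upward and downward skew described above.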
If you tend to use many joins that use <=> rather than =, NULL values are not special in comparisons and one NULL is equal to another. In this case, nulls_equal is the appropriate statistics method.
The innodb_stats_method system variable has a global value; the myisam_stats_method system variable has both global and session values. Setting the global value affects statistics collection for tables from the corresponding storage engine. Setting the session value affects statistics collection only for the current client connection. This means that you can force a table's statistics to be regenerated with a given method without affecting other clients by setting the session value of myisam_stats_method.
To regenerate MyISAM table statistics, you can use any of the following methods:
Execute myisamchk --stats_method=method_name --analyze
Change the table to cause its statistics to go out of date (for example, insert a row and then delete it), and then set myisam_stats_method and issue an ANALYZE TABLE statement
Some caveats regarding the use of innodb_stats_method and myisam_stats_method:
You can force table statistics to be collected explicitly, as just described. However, MySQL may also collect statistics automatically. For example, if during the course of executing statements for a table, some of those statements modify the table, MySQL may collect statistics. (This may occur for bulk inserts or deletes, or for some ALTER TABLE statements.) If this happens, the statistics are collected using whatever value innodb_stats_method or myisam_stats_method has at the time. Thus, if you collect statistics using one method, but the system variable is set to the other method when a table's statistics are collected automatically later, the other method is used.
There is no way to tell which method was used to generate statistics for a given table.
These variables apply only to InnoDB and MyISAM tables. Other storage engines have only one method for collecting table statistics. Usually it is closer to the nulls_equal method.