Chapter 8 Optimization

8.3.8 InnoDB and MyISAM Index Statistics Collection

Storage engines collect statistics about tables for use by the optimizer. Table statistics are based on value groups, where a value group is a set of rows with the same key prefix value. For optimizer purposes, an important statistic is the average value group size.
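To make the definition concrete, the following minimal sketch counts value groups for a small, made-up set of index entries. The function names and sample data are illustrative assumptions, not part of MySQL.

```python
from itertools import groupby

def value_group_sizes(index_entries, prefix_len):
    """Sizes of the value groups formed by the first `prefix_len` key parts."""
    # Entries with the same key prefix become adjacent once sorted.
    prefixes = sorted(entry[:prefix_len] for entry in index_entries)
    return [len(list(group)) for _, group in groupby(prefixes)]

def average_value_group_size(index_entries, prefix_len):
    """Mean number of rows per value group (0.0 for an empty index)."""
    sizes = value_group_sizes(index_entries, prefix_len)
    return sum(sizes) / len(sizes) if sizes else 0.0

# Hypothetical two-part index entries (col1, col2); no NULLs in this example.
entries = [(1, "x"), (1, "x"), (1, "y"), (2, "x")]

print(value_group_sizes(entries, 1))         # [3, 1]
print(average_value_group_size(entries, 1))  # 2.0
print(value_group_sizes(entries, 2))         # [2, 1, 1]
```

A longer key prefix can only split groups further, so the average value group size never grows as more key parts are included.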

MySQL uses the average value group size in the following ways:

To estimate how many rows must be read for each ref access
To estimate how many rows a partial join produces, that is, the number of rows produced by an operation of the form
```
(...) JOIN tbl_name ON tbl_name.key = expr
```

As the average value group size for an index increases, the index is less useful for those two purposes because the average number of rows per lookup increases. For the index to be good for optimization purposes, it is best that each index value target a small number of rows in the table. When a given index value yields a large number of rows, the index is less useful and MySQL is less likely to use it.

The average value group size is related to table cardinality, which is the number of value groups. The SHOW INDEX statement displays a cardinality value based on N/S, where N is the number of rows in the table and S is the average value group size. That ratio yields an approximate number of value groups in the table.
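The N/S relationship can be checked with simple arithmetic. The sketch below uses made-up key values and a hypothetical helper name. It only illustrates the ratio and does not reproduce how a storage engine samples its statistics.

```python
def cardinality_estimate(row_count, avg_group_size):
    """Approximate number of value groups: N / S."""
    return row_count / avg_group_size

# Hypothetical single-column key values for an 8-row table.
keys = [1, 1, 1, 1, 2, 2, 3, 4]
n = len(keys)            # N = 8 rows
s = n / len(set(keys))   # S = 8 rows / 4 groups = 2.0

print(cardinality_estimate(n, s))            # 4.0
print(cardinality_estimate(1_000_000, 50))   # 20000.0
```

With exact statistics, N/S is simply the number of distinct key values. With sampled statistics, it is only an approximation.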

For a join based on the <=> comparison operator, NULL is not treated differently from any other value: NULL <=> NULL, just as N <=> N for any other N.

However, for a join based on the = operator, NULL is different from non-NULL values: expr1 = expr2 is not true when expr1 or expr2 (or both) are NULL. This affects ref accesses for comparisons of the form tbl_name.key = expr: MySQL does not access the table if the current value of expr is NULL, because the comparison cannot be true.
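The difference between the two operators can be modeled in a few lines. In this sketch, Python's None stands in for SQL NULL, and ref_lookup is a hypothetical helper that imitates the behavior described above. It is not MySQL's implementation.

```python
def sql_equals(a, b):
    """SQL `=`: the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def sql_null_safe_equals(a, b):
    """SQL `<=>`: NULL <=> NULL is true, NULL <=> non-NULL is false."""
    if a is None or b is None:
        return a is None and b is None
    return a == b

def ref_lookup(index, expr, null_safe=False):
    """Hypothetical ref access over (key, row) pairs."""
    if expr is None and not null_safe:
        # `key = NULL` can never be true, so the table is not accessed at all.
        return []
    return [row for key, row in index if sql_null_safe_equals(key, expr)]

index = [(None, "r1"), (1, "r2"), (1, "r3")]

print(sql_equals(None, None))                    # None
print(sql_null_safe_equals(None, None))          # True
print(ref_lookup(index, None))                   # []
print(ref_lookup(index, None, null_safe=True))   # ['r1']
print(ref_lookup(index, 1))                      # ['r2', 'r3']
```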

For = comparisons, it does not matter how many NULL values are in the table. For optimization purposes, the relevant value is the average size of the non-NULL value groups. However, MySQL does not currently enable that average size to be collected or used.

For InnoDB and MyISAM tables, you have some control over collection of table statistics by means of the innodb_stats_method and myisam_stats_method system variables, respectively. These variables have three possible values, which differ as follows:

When the variable is set to nulls_equal, all NULL values are treated as identical (that is, they all form a single value group).

If the NULL value group size is much higher than the average non-NULL value group size, this method skews the average value group size upward. This makes the index appear to the optimizer to be less useful than it really is for joins that look for non-NULL values. Consequently, the nulls_equal method may cause the optimizer not to use the index for ref accesses when it should.
When the variable is set to nulls_unequal, NULL values are not considered the same. Instead, each NULL value forms a separate value group of size 1.

If you have many NULL values, this method skews the average value group size downward. If the average non-NULL value group size is large, counting NULL values each as a group of size 1 causes the optimizer to overestimate the value of the index for joins that look for non-NULL values. Consequently, the nulls_unequal method may cause the optimizer to use this index for ref lookups when other methods may be better.
When the variable is set to nulls_ignored, NULL values are ignored.
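The effect of the three settings on the average value group size can be illustrated with a simplified model for a single-column key. The function and the sample data below are assumptions made for illustration. In particular, treating nulls_ignored as "drop the NULL rows, then average the remaining groups" is a simplification of what the storage engines actually do.

```python
def average_group_size(values, method):
    """Average value group size under a simplified model of the three
    stats methods; None stands in for SQL NULL."""
    non_null = [v for v in values if v is not None]
    null_count = len(values) - len(non_null)
    groups = len(set(non_null))
    if method == "nulls_equal":
        rows = len(values)
        groups += 1 if null_count else 0   # all NULLs form one group
    elif method == "nulls_unequal":
        rows = len(values)
        groups += null_count               # each NULL is its own group
    elif method == "nulls_ignored":
        rows = len(non_null)               # NULL rows are not counted (assumed model)
    else:
        raise ValueError(f"unknown method: {method}")
    return rows / groups if groups else 0.0

# 10 distinct non-NULL values with 2 rows each, plus 80 NULLs.
mostly_null = [k for k in range(10) for _ in range(2)] + [None] * 80
# 2 distinct non-NULL values with 40 rows each, plus 20 NULLs.
big_groups = ["a"] * 40 + ["b"] * 40 + [None] * 20

for name, data in (("mostly_null", mostly_null), ("big_groups", big_groups)):
    for method in ("nulls_equal", "nulls_unequal", "nulls_ignored"):
        print(name, method, round(average_group_size(data, method), 2))
```

For mostly_null, the non-NULL groups average 2 rows, but nulls_equal reports about 9.09, so the result is skewed upward. For big_groups, the non-NULL groups average 40 rows, but nulls_unequal reports about 4.55, so the result is skewed downward.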

If you tend to use many joins that use <=> rather than =, NULL values are not special in comparisons and one NULL is equal to another. In this case, nulls_equal is the appropriate statistics method.

The innodb_stats_method system variable has a global value; the myisam_stats_method system variable has both global and session values. Setting the global value affects statistics collection for tables from the corresponding storage engine. Setting the session value affects statistics collection only for the current client connection. This means that you can force a table's statistics to be regenerated with a given method without affecting other clients by setting the session value of myisam_stats_method.

To regenerate MyISAM table statistics, you can use any of the following methods:

Execute myisamchk --stats_method=method_name --analyze
Change the table to cause its statistics to go out of date (for example, insert a row and then delete it), and then set myisam_stats_method and issue an ANALYZE TABLE statement

Some caveats regarding the use of innodb_stats_method and myisam_stats_method:

You can force table statistics to be collected explicitly, as just described. However, MySQL may also collect statistics automatically. For example, if during the course of executing statements for a table, some of those statements modify the table, MySQL may collect statistics. (This may occur for bulk inserts or deletes, or some ALTER TABLE statements, for example.) If this happens, the statistics are collected using whatever value innodb_stats_method or myisam_stats_method has at the time. Thus, if you collect statistics using one method, but the system variable is set to the other method when a table's statistics are collected automatically later, the other method is used.
There is no way to tell which method was used to generate statistics for a given table.
These variables apply only to InnoDB and MyISAM tables. Other storage engines have only one method for collecting table statistics. Usually it is closer to the nulls_equal method.