The optimizer uses estimated statistics about key distributions to choose the indexes for an execution plan, based on the relative selectivity of the index. Operations such as ANALYZE TABLE cause InnoDB to sample random pages from each index on a table to estimate the cardinality of the index. This sampling technique is known as a random dive.
The innodb_stats_persistent_sample_pages variable controls the number of sampled pages. You can adjust the setting at runtime to manage the quality of statistics estimates used by the optimizer. The default value is 20. Consider modifying the setting when encountering the following issues:
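As a sketch, the setting can be changed at runtime with SET GLOBAL and takes effect the next time statistics are recalculated, for example by ANALYZE TABLE. The table name t1 below is a placeholder:

```sql
-- Increase the number of random-dive pages from the default of 20.
SET GLOBAL innodb_stats_persistent_sample_pages = 64;

-- Recalculate persistent statistics for a table (t1 is a placeholder name).
ANALYZE TABLE t1;

-- Verify the current setting.
SHOW VARIABLES LIKE 'innodb_stats_persistent_sample_pages';
```

Note that changing the global value does not itself refresh existing statistics; they are updated when ANALYZE TABLE runs or when automatic recalculation is triggered.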
Statistics are not accurate enough and the optimizer chooses suboptimal plans, as shown in EXPLAIN output. You can check the accuracy of statistics by comparing the actual cardinality of an index (determined by running SELECT DISTINCT on the index columns) with the estimates in the mysql.innodb_index_stats table.
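As a sketch of such a comparison, assuming a table t1 in the test schema with an index on column c1 (both names are placeholders):

```sql
-- Actual number of distinct values in the indexed column.
SELECT COUNT(DISTINCT c1) FROM t1;

-- Estimated cardinality recorded in the persistent statistics table;
-- the n_diff_pfx* rows hold the distinct-value estimates per key prefix.
SELECT index_name, stat_name, stat_value, sample_size
FROM mysql.innodb_index_stats
WHERE database_name = 'test'
  AND table_name = 't1'
  AND stat_name LIKE 'n_diff_pfx%';
```

A large gap between the actual count and the stat_value estimate suggests the sampling is too coarse for this data distribution.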
If statistics are not accurate enough, increase the value of innodb_stats_persistent_sample_pages until the statistics estimates are sufficiently accurate. Increasing the value too far, however, could cause ANALYZE TABLE to run slowly.
ANALYZE TABLE is too slow. In this case, decrease innodb_stats_persistent_sample_pages until ANALYZE TABLE execution time is acceptable. Decreasing the value too far, however, could lead to the first problem: inaccurate statistics and suboptimal query execution plans.
If a balance cannot be achieved between accurate statistics and ANALYZE TABLE execution time, consider reducing the number of indexed columns in the table or limiting the number of partitions to reduce ANALYZE TABLE complexity. The number of columns in the table's primary key is also important to consider, because primary key columns are appended to each nonunique index.
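To illustrate the effect of primary key columns, consider a hypothetical table (all names are placeholders). The secondary index idx_c3 implicitly includes the primary key columns, so sampling for it covers more columns than the index definition alone suggests:

```sql
-- Hypothetical table: the nonunique index idx_c3 implicitly carries the
-- primary key columns (c1, c2), so ANALYZE TABLE estimates statistics
-- for the key prefixes (c3), (c3, c1), and (c3, c1, c2).
CREATE TABLE t1 (
  c1 INT,
  c2 INT,
  c3 VARCHAR(10),
  PRIMARY KEY (c1, c2),
  KEY idx_c3 (c3)
) ENGINE=InnoDB;
```

A wide primary key therefore increases the work ANALYZE TABLE performs for every secondary index on the table.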
For related information, see Section 15.8.10.3, “Estimating ANALYZE TABLE Complexity for InnoDB Tables”.