Design your tables to minimize the space they occupy on disk. This can result in large performance improvements by reducing the amount of data written to and read from disk. Smaller tables normally require less main memory while their contents are being actively processed during query execution. Any space reduction for table data also results in smaller indexes that can be processed faster.
MySQL supports many different storage engines (table types) and row formats. For each table, you can decide which storage and indexing method to use. Choosing the proper table format for your application can give you a big performance gain. See Chapter 15, The InnoDB Storage Engine, and Chapter 16, Alternative Storage Engines.
You can get better performance for a table and minimize storage space by using the techniques listed here:
Use the most efficient (smallest) data types possible. MySQL has many specialized types that save disk space and memory. For example, use the smaller integer types if possible to get smaller tables. MEDIUMINT is often a better choice than INT because a MEDIUMINT column uses 25% less space (3 bytes instead of 4).
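As a rough illustration of the point above, the following sketch declares a hypothetical page_hits table using the smallest integer types that fit the expected value ranges. The table name, column names, and value ranges are assumptions made for this example only.

```sql
-- Hypothetical table: names and expected value ranges are assumptions.
CREATE TABLE page_hits (
  page_id   MEDIUMINT UNSIGNED NOT NULL,  -- 3 bytes, range 0 to 16777215
  status    TINYINT UNSIGNED   NOT NULL,  -- 1 byte,  range 0 to 255
  hit_count INT UNSIGNED       NOT NULL   -- 4 bytes, range 0 to 4294967295
);
```

Choose the smaller type only when you are confident the values can never exceed its range; widening a column later requires an ALTER TABLE.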
Declare columns to be NOT NULL if possible. It makes SQL operations faster, by enabling better use of indexes and eliminating overhead for testing whether each value is NULL. You also save some storage space, one bit per column. If you really need NULL values in your tables, use them. Just avoid the default setting that allows NULL values in every column.
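A minimal sketch of this advice, using a hypothetical customer_contact table (all names are assumptions): every column is declared NOT NULL except the one that is genuinely optional.

```sql
-- Hypothetical table: only the truly optional column allows NULL.
CREATE TABLE customer_contact (
  id    INT UNSIGNED NOT NULL,
  name  VARCHAR(100) NOT NULL,
  email VARCHAR(255) NOT NULL,
  fax   VARCHAR(30)  NULL,      -- optional value, so NULL is permitted here
  PRIMARY KEY (id)
);
```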
InnoDB tables are created using the DYNAMIC row format by default. To use a row format other than DYNAMIC, configure innodb_default_row_format, or specify the ROW_FORMAT option explicitly in a CREATE TABLE or ALTER TABLE statement.
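The options just described might be used as follows. This is a sketch only: the table t1 and its columns are hypothetical, and changing the global setting assumes an account with sufficient privileges.

```sql
-- Change the default row format for InnoDB tables created afterwards.
SET GLOBAL innodb_default_row_format = COMPACT;

-- Override the default for one table at creation time (hypothetical table).
CREATE TABLE t1 (
  id   INT NOT NULL PRIMARY KEY,
  note VARCHAR(200)
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;

-- Change the row format of an existing table.
ALTER TABLE t1 ROW_FORMAT=COMPACT;

-- Check which row format the table currently uses.
SELECT TABLE_NAME, ROW_FORMAT
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 't1';
```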
The compact family of row formats, which includes COMPACT, DYNAMIC, and COMPRESSED, decreases row storage space at the cost of increasing CPU use for some operations. If your workload is a typical one that is limited by cache hit rates and disk speed, it is likely to be faster. If it is a rare case that is limited by CPU speed, it might be slower.
The compact family of row formats also optimizes CHAR column storage when using a variable-length character set such as utf8mb3 or utf8mb4. With ROW_FORMAT=REDUNDANT, CHAR(N) occupies N × the maximum byte length of the character set. Many languages can be written primarily using single-byte utf8 characters, so a fixed storage length often wastes space. With the compact family of row formats, InnoDB allocates a variable amount of storage in the range of N to N × the maximum byte length of the character set for these columns by stripping trailing spaces. The minimum storage length is N bytes to facilitate in-place updates in typical cases. For more information, see Section 15.10, “InnoDB Row Formats”.
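To make the arithmetic concrete, consider a CHAR(10) column in utf8mb4, whose characters occupy at most 4 bytes each. The table below is hypothetical; the byte counts in the comments follow directly from the N and N × maximum byte length rule stated above.

```sql
-- utf8mb4: maximum byte length per character is 4.
-- ROW_FORMAT=REDUNDANT: CHAR(10) always occupies 10 × 4 = 40 bytes.
-- Compact family:       CHAR(10) occupies between 10 and 40 bytes,
--                       depending on the characters actually stored.
CREATE TABLE country_code (        -- hypothetical table name
  code CHAR(10) CHARACTER SET utf8mb4 NOT NULL
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;
```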
To minimize space even further by storing table data in compressed form, specify ROW_FORMAT=COMPRESSED when creating InnoDB tables, or run the myisampack command on an existing MyISAM table. (InnoDB compressed tables are readable and writable, while MyISAM compressed tables are read-only.)
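For the InnoDB case, a compressed table could be declared roughly as follows. The audit_log table is hypothetical, the sketch assumes the table is created in a file-per-table tablespace, and the KEY_BLOCK_SIZE value is an optional tuning choice picked only for illustration.

```sql
-- Hypothetical table stored in compressed form.
-- Assumes innodb_file_per_table is enabled.
CREATE TABLE audit_log (
  id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  message TEXT            NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;  -- 8KB compressed pages
```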
For MyISAM tables, if you do not have any variable-length columns (VARCHAR, TEXT, or BLOB columns), a fixed-size row format is used. This is faster but may waste some space. See Section 16.2.3, “MyISAM Table Storage Formats”. You can hint that you want to have fixed length rows even if you have VARCHAR columns with the CREATE TABLE option ROW_FORMAT=FIXED.
The primary index of a table should be as short as possible. This makes identification of each row easy and efficient. For InnoDB tables, the primary key columns are duplicated in each secondary index entry, so a short primary key saves considerable space if you have many secondary indexes.
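The following sketch shows one way to keep the primary key short. The orders table and its columns are hypothetical: a 4-byte surrogate key serves as the primary key, while the longer external reference is kept in a single unique secondary index.

```sql
-- Hypothetical table: short surrogate primary key, long reference kept separate.
CREATE TABLE orders (
  order_id    INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- 4-byte primary key
  order_ref   CHAR(36)     NOT NULL,                 -- long external identifier
  customer_id INT UNSIGNED NOT NULL,
  created_at  DATETIME     NOT NULL,
  PRIMARY KEY (order_id),
  UNIQUE KEY uk_order_ref (order_ref),
  KEY idx_customer (customer_id),   -- each entry also stores order_id
  KEY idx_created  (created_at)     -- each entry also stores order_id
) ENGINE=InnoDB;
```

Had order_ref been the primary key, every entry in idx_customer and idx_created would carry the 36-character value instead of the 4-byte integer.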
Create only the indexes that you need to improve query performance. Indexes are good for retrieval, but slow down insert and update operations. If you access a table mostly by searching on a combination of columns, create a single composite index on them rather than a separate index for each column. The first part of the index should be the column most used. If you always use many columns when selecting from the table, the first column in the index should be the one with the most duplicates, to obtain better compression of the index.
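A minimal sketch of a composite index, assuming a hypothetical employee table that is searched mostly by last name, or by last name and first name together:

```sql
-- Hypothetical table and index names.
CREATE TABLE employee (
  id         INT UNSIGNED NOT NULL PRIMARY KEY,
  last_name  VARCHAR(50)  NOT NULL,
  first_name VARCHAR(50)  NOT NULL
);

-- One composite index instead of a separate index on each column.
-- The most frequently used column comes first.
CREATE INDEX idx_name ON employee (last_name, first_name);

-- Both queries can use idx_name, because each filters on a leftmost
-- prefix of the indexed columns.
SELECT id FROM employee WHERE last_name = 'Smith' AND first_name = 'Ann';
SELECT id FROM employee WHERE last_name = 'Smith';
```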
If it is very likely that a long string column has a unique prefix within its first few characters, it is better to index only this prefix, using MySQL's support for creating an index on the leftmost part of the column (see Section 13.1.15, “CREATE INDEX Statement”). Shorter indexes are faster, not only because they require less disk space, but because they also give you more hits in the index cache, and thus fewer disk seeks. See Section 5.1.1, “Configuring the Server”.
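A prefix index might be set up as sketched below. The article table is hypothetical and the prefix length of 20 is an arbitrary assumption; the first query is one way to estimate how close to unique a given prefix is before committing to it.

```sql
-- Hypothetical table with a long string column.
CREATE TABLE article (
  id    INT UNSIGNED NOT NULL PRIMARY KEY,
  title VARCHAR(255) NOT NULL
);

-- Estimate prefix selectivity: a result close to 1 means the first
-- 20 characters are nearly unique across rows.
SELECT COUNT(DISTINCT LEFT(title, 20)) / COUNT(*) AS prefix_selectivity
FROM article;

-- Index only the leftmost 20 characters of the column.
CREATE INDEX idx_title_prefix ON article (title(20));
```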
In some circumstances, it can be beneficial to split a table that is scanned very often into two. This is especially true if it is a dynamic-format table and it is possible to use a smaller static-format table to find the relevant rows when scanning.
Declare columns with identical information in different tables with identical data types, to speed up joins based on the corresponding columns.
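For example, in the following sketch the join column has exactly the same type in both tables, so the join can compare values without conversion. The customer and invoice tables and their columns are hypothetical.

```sql
-- Hypothetical tables: customer_id is INT UNSIGNED in both.
CREATE TABLE customer (
  customer_id INT UNSIGNED NOT NULL PRIMARY KEY,
  name        VARCHAR(100) NOT NULL
);

CREATE TABLE invoice (
  invoice_id  INT UNSIGNED NOT NULL PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,   -- same type as customer.customer_id
  KEY idx_customer (customer_id)
);

SELECT i.invoice_id, c.name
FROM invoice AS i
JOIN customer AS c ON c.customer_id = i.customer_id;
```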
Keep column names simple, so that you can use the same name across different tables and simplify join queries. For example, in a table named customer, use a column name of name instead of customer_name. To make your names portable to other SQL servers, consider keeping them shorter than 18 characters.
Normally, try to keep all data nonredundant (observing what is referred to in database theory as third normal form). Instead of repeating lengthy values such as names and addresses, assign them unique IDs, repeat these IDs as needed across multiple smaller tables, and join the tables in queries by referencing the IDs in the join clause.
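A small sketch of this approach, using hypothetical city and shipment tables: the lengthy city name is stored once, and other rows refer to it by a compact ID.

```sql
-- Hypothetical lookup table: each lengthy value is stored once.
CREATE TABLE city (
  city_id SMALLINT UNSIGNED NOT NULL PRIMARY KEY,
  name    VARCHAR(100)      NOT NULL
);

-- Rows repeat the 2-byte ID rather than the city name.
CREATE TABLE shipment (
  shipment_id INT UNSIGNED      NOT NULL PRIMARY KEY,
  city_id     SMALLINT UNSIGNED NOT NULL,
  shipped_on  DATE              NOT NULL,
  KEY idx_city (city_id)
);

-- Join on the ID to recover the full value when it is needed.
SELECT s.shipment_id, s.shipped_on, c.name AS city_name
FROM shipment AS s
JOIN city AS c ON c.city_id = s.city_id;
```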
If speed is more important than disk space and the maintenance costs of keeping multiple copies of data, for example in a business intelligence scenario where you analyze all the data from large tables, you can relax the normalization rules, duplicating information or creating summary tables to gain more speed.
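One possible shape for such a summary table is sketched below. The sale and daily_sales_summary tables are hypothetical, and how often the summary is refreshed is left to the application.

```sql
-- Hypothetical detail table.
CREATE TABLE sale (
  sale_id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  sold_on DATE            NOT NULL,
  amount  DECIMAL(10,2)   NOT NULL
);

-- Hypothetical summary table holding precomputed daily totals.
CREATE TABLE daily_sales_summary (
  sold_on      DATE          NOT NULL PRIMARY KEY,
  sale_count   INT UNSIGNED  NOT NULL,
  total_amount DECIMAL(14,2) NOT NULL
);

-- Rebuild the summary rows from the detail data; REPLACE overwrites
-- any existing row for the same day.
REPLACE INTO daily_sales_summary (sold_on, sale_count, total_amount)
SELECT sold_on, COUNT(*), SUM(amount)
FROM sale
GROUP BY sold_on;
```

Reports can then read the small summary table instead of scanning every row in sale, at the cost of keeping the two tables in sync.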