MySQL functions are tagged internally as deterministic or nondeterministic. MySQL函数在内部被标记为确定性或非确定性。A function is nondeterministic if, given fixed values for its arguments, it can return different results for different invocations. 函数是不确定的,如果给定其参数的固定值,它可以为不同的调用返回不同的结果。Examples of nondeterministic functions: 非确定性函数的示例:RAND()
, UUID()
.RAND()
,UUID()
。
If a function is tagged nondeterministic, a reference to it in a 如果某个函数被标记为不确定函数,则WHERE
clause is evaluated for every row (when selecting from one table) or combination of rows (when selecting from a multiple-table join).WHERE
子句中对该函数的引用将针对每一行(从一个表中选择时)或行的组合(从多个表联接中选择时)进行求值。
MySQL also determines when to evaluate functions based on types of arguments, whether the arguments are table columns or constant values. MySQL还根据参数类型确定何时对函数求值,参数是表列还是常量值。A deterministic function that takes a table column as argument must be evaluated whenever that column changes value.当表列的值发生变化时,必须对以该列为参数的确定性函数求值。
Nondeterministic functions may affect query performance. For example, some optimizations may not be available, or more locking might be required. 非确定性函数可能会影响查询性能。例如,某些优化可能不可用,或者可能需要更多的锁定。The following discussion uses 下面的讨论使用RAND()
but applies to other nondeterministic functions as well.RAND()
,但也适用于其他非确定性函数。
Suppose that a table 假设表t
has this definition:t
具有以下定义:
CREATE TABLE t (id INT NOT NULL PRIMARY KEY, col_a VARCHAR(100));
Consider these two queries:考虑这两个查询:
SELECT * FROM t WHERE id = POW(1,2); SELECT * FROM t WHERE id = FLOOR(1 + RAND() * 49);
Both queries appear to use a primary key lookup because of the equality comparison against the primary key, but that is true only for the first of them:由于与主键的相等性比较,这两个查询似乎都使用主键查找,但这仅适用于第一个查询:
The first query always produces a maximum of one row because 第一个查询始终最多生成一行,因为带有常量参数的POW()
with constant arguments is a constant value and is used for index lookup.POW()
是一个常量值,用于索引查找。
The second query contains an expression that uses the nondeterministic function 第二个查询包含一个使用不确定函数RAND()
, which is not constant in the query but in fact has a new value for every row of table t
. RAND()
的表达式,该函数在查询中不是常量,但实际上对于表t
的每一行都有一个新值。Consequently, the query reads every row of the table, evaluates the predicate for each row, and outputs all rows for which the primary key matches the random value. 因此,查询读取表的每一行,计算每一行的谓词,并输出主键与随机值匹配的所有行。This might be zero, one, or multiple rows, depending on the 这可能是零行、一行或多行,具体取决于id
column values and the values in the RAND()
sequence.id
列值和RAND()
序列中的值。
The effects of nondeterminism are not limited to 不确定性的影响不限于SELECT
statements. SELECT
语句。This 此UPDATE
statement uses a nondeterministic function to select rows to be modified:UPDATE
语句使用不确定函数选择要修改的行:
UPDATE t SET col_a = some_expr
WHERE id = FLOOR(1 + RAND() * 49);
Presumably the intent is to update at most a single row for which the primary key matches the expression. 大概目的是更新主键与表达式匹配的最多一行。However, it might update zero, one, or multiple rows, depending on the 但是,它可能会更新零行、一行或多行,具体取决于id
column values and the values in the RAND()
sequence.id
列值和RAND()
序列中的值。
The behavior just described has implications for performance and replication:刚才描述的行为对性能和复制有影响:
Because a nondeterministic function does not produce a constant value, the optimizer cannot use strategies that might otherwise be applicable, such as index lookups. 由于非确定性函数不会产生常量值,因此优化器无法使用其他可能适用的策略,例如索引查找。The result may be a table scan.结果可能是表扫描。
InnoDB
might escalate to a range-key lock rather than taking a single row lock for one matching row.InnoDB
可能升级为范围键锁,而不是对一个匹配行使用单行锁。
Updates that do not execute deterministically are unsafe for replication.不确定执行的更新对于复制是不安全的。
The difficulties stem from the fact that the 困难源于这样一个事实:RAND()
function is evaluated once for every row of the table. RAND()
函数对表中的每一行求值一次。To avoid multiple function evaluations, use one of these techniques:要避免多个功能评估,请使用以下技术之一:
Move the expression containing the nondeterministic function to a separate statement, saving the value in a variable. 将包含不确定性函数的表达式移动到单独的语句中,将值保存在变量中。In the original statement, replace the expression with a reference to the variable, which the optimizer can treat as a constant value:在原始语句中,将表达式替换为对变量的引用,优化器可以将该变量视为常量值:
SET @keyval = FLOOR(1 + RAND() * 49);
UPDATE t SET col_a = some_expr
WHERE id = @keyval;
Assign the random value to a variable in a derived table. 将随机值指定给派生表中的变量。This technique causes the variable to be assigned a value, once, prior to its use in the comparison in the 此技术使变量在WHERE
clause:WHERE
子句中的比较中使用之前被赋值一次:
UPDATE /*+ NO_MERGE(dt) */ t, (SELECT FLOOR(1 + RAND() * 49) AS r) AS dt
SET col_a = some_expr
WHERE id = dt.r;
As mentioned previously, a nondeterministic expression in the 如前所述,WHERE
clause might prevent optimizations and result in a table scan. WHERE
子句中的不确定表达式可能会阻止优化并导致表扫描。However, it may be possible to partially optimize the 但是,如果其他表达式是确定性的,则可以部分优化WHERE
clause if other expressions are deterministic. WHERE
子句。For example:例如:
SELECT * FROM t WHERE partial_key=5 AND some_column=RAND();
If the optimizer can use 如果优化器可以使用partial_key
to reduce the set of rows selected, RAND()
is executed fewer times, which diminishes the effect of nondeterminism on optimization.partial_key
来减少所选行的集合,则RAND()
的执行次数会减少,从而减少不确定性对优化的影响。