Table 12.16 XML FunctionsXML函数
ExtractValue() | |
UpdateXML() |
This section discusses XML and related functionality in MySQL.本节讨论MySQL中的XML和相关功能。
It is possible to obtain XML-formatted output from MySQL in the mysql and mysqldump clients by invoking them with the 通过使用--xml
option. --xml
选项调用MySQL和mysqldump客户机,可以从MySQL获得XML格式的输出。See Section 4.5.1, “mysql — The MySQL Command-Line Client”, and Section 4.5.4, “mysqldump — A Database Backup Program”.请参阅第4.5.1节,“mysql命令行客户端”和第4.5.4节,“mysqldump数据库备份程序”。
Two functions providing basic XPath 1.0 (XML Path Language, version 1.0) capabilities are available. 有两个函数提供基本的XPath1.0(XML路径语言1.0版)功能。Some basic information about XPath syntax and usage is provided later in this section; however, an in-depth discussion of these topics is beyond the scope of this manual, and you should refer to the XML Path Language (XPath) 1.0 standard for definitive information. 有关XPath语法和用法的一些基本信息将在本节后面提供;但是,对这些主题的深入讨论超出了本手册的范围,您应该参考XML路径语言(XPath)1.0标准以获取最终信息。A useful resource for those new to XPath or who desire a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.对于那些不熟悉XPath或希望复习基础知识的人来说,Zvon.org XPath教程是一个有用的资源,它有多种语言版本。
These functions remain under development. 这些功能仍在开发中。We continue to improve these and other aspects of XML and XPath functionality in MySQL 8.0 and onwards. 我们将继续改进MySQL8.0及以后版本中XML和XPath功能的这些方面和其他方面。You may discuss these, ask questions about them, and obtain help from other users with them in the MySQL XML User Forum.您可以在MySQL-XML用户论坛上讨论这些问题,询问有关它们的问题,并从其他用户那里获得帮助。
XPath expressions used with these functions support user variables and local stored program variables. 与这些函数一起使用的XPath表达式支持用户变量和本地存储程序变量。User variables are weakly checked; variables local to stored programs are strongly checked (see also Bug #26518):弱检查用户变量;强烈检查存储程序的本地变量(另请参见错误#26518):
User variables (weak checking).用户变量(弱检查)。 Variables using the syntax 使用语法的变量$@
(that is, user variables) are not checked. variable_name
$@
(即,用户变量)未经检查。variable_name
No warnings or errors are issued by the server if a variable has the wrong type or has previously not been assigned a value. 如果变量的类型不正确或以前没有分配值,则服务器不会发出警告或错误。This also means the user is fully responsible for any typographical errors, since no warnings are given if (for example) 这也意味着用户对任何印刷错误负有全部责任,因为如果(例如)$@myvairable
is used where $@myvariable
was intended.$@myvairable
用在$@myvariable
是有意的位置,就不会给任何警告。
Example:示例:
mysql>SET @xml = '<a><b>X</b><b>Y</b></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SET @i =1, @j = 2;
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @i, ExtractValue(@xml, '//b[$@i]');
+------+--------------------------------+ | @i | ExtractValue(@xml, '//b[$@i]') | +------+--------------------------------+ | 1 | X | +------+--------------------------------+ 1 row in set (0.00 sec) mysql>SELECT @j, ExtractValue(@xml, '//b[$@j]');
+------+--------------------------------+ | @j | ExtractValue(@xml, '//b[$@j]') | +------+--------------------------------+ | 2 | Y | +------+--------------------------------+ 1 row in set (0.00 sec) mysql>SELECT @k, ExtractValue(@xml, '//b[$@k]');
+------+--------------------------------+ | @k | ExtractValue(@xml, '//b[$@k]') | +------+--------------------------------+ | NULL | | +------+--------------------------------+ 1 row in set (0.00 sec)
Variables in stored programs (strong checking).存储程序中的变量(强检查)。 Variables using the syntax 使用语法$
can be declared and used with these functions when they are called inside stored programs. variable_name
$
的变量可以在存储程序中调用时声明并与这些函数一起使用。variable_name
Such variables are local to the stored program in which they are defined, and are strongly checked for type and value.这类变量是定义它们的存储程序的本地变量,并且对其类型和值进行严格检查。
Example:例如:
mysql>DELIMITER |
mysql>CREATE PROCEDURE myproc ()
->BEGIN
->DECLARE i INT DEFAULT 1;
->DECLARE xml VARCHAR(25) DEFAULT '<a>X</a><a>Y</a><a>Z</a>';
-> ->WHILE i < 4 DO
->SELECT xml, i, ExtractValue(xml, '//a[$i]');
->SET i = i+1;
->END WHILE;
->END |
Query OK, 0 rows affected (0.01 sec) mysql>DELIMITER ;
mysql>CALL myproc();
+--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 1 | X | +--------------------------+---+------------------------------+ 1 row in set (0.00 sec) +--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 2 | Y | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec) +--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 3 | Z | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec)
Parameters.参数。 Variables used in XPath expressions inside stored routines that are passed in as parameters are also subject to strong checking.作为参数传入的存储例程中XPath表达式中使用的变量也要接受强检查。
Expressions containing user variables or variables local to stored programs must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.包含用户变量或存储程序本地变量的表达式必须符合XPath 1.0规范中给出的包含变量的XPath表达式的规则。
A user variable used to store an XPath expression is treated as an empty string. 用于存储XPath表达式的用户变量被视为空字符串。Because of this, it is not possible to store an XPath expression as a user variable. 因此,不可能将XPath表达式存储为用户变量。(Bug #32911)
ExtractValue(
xml_frag
, xpath_expr
)
ExtractValue()
takes two string arguments, a fragment of XML markup xml_frag
and an XPath expression xpath_expr
(also known as a locator); it returns the text (CDATA
) of the first text node which is a child of the element or elements matched by the XPath expression.ExtractValue()
接受两个字符串参数,一个是XML标记xml_frag
,另一个是XPath表达式xpath_expr
(也称为定位器);它返回第一个文本节点的文本(CDATA
),该节点是与XPath表达式匹配的元素的子元素。
Using this function is the equivalent of performing a match using the 使用此函数相当于在附加xpath_expr
after appending /text()
. /text()
之后使用xpath_expr
执行匹配。In other words, 换句话说,ExtractValue('<a><b>Sakila</b></a>', '/a/b')
and ExtractValue('<a><b>Sakila</b></a>', '/a/b/text()')
produce the same result.ExtractValue('<a><b>Sakila</b></a>', '/a/b')
和ExtractValue('<a><b>Sakila</b></a>', '/a/b/text()')
产生同样的结果。
If multiple matches are found, the content of the first child text node of each matching element is returned (in the order matched) as a single, space-delimited string.如果找到多个匹配项,则每个匹配元素的第一个子文本节点的内容将作为单个空格分隔的字符串返回(按匹配的顺序)。
If no matching text node is found for the expression (including the implicit 如果由于任何原因找不到表达式(包括隐含的/text()
)—for whatever reason, as long as xpath_expr
is valid, and xml_frag
consists of elements which are properly nested and closed—an empty string is returned. /text()
)的匹配文本节点,只要xpath_expr
有效,并且xml_frag
由正确嵌套和闭合的元素构成,就会返回一个空字符串。No distinction is made between a match on an empty element and no match at all. 空元素上的匹配与完全不匹配之间没有区别。This is by design.这是故意的。
If you need to determine whether no matching element was found in 如果需要确定是否在xml_frag
or such an element was found but contained no child text nodes, you should test the result of an expression that uses the XPath count()
function. xml_frag
中找不到匹配的元素,或者找到了这样的元素但不包含子文本节点,则应该测试使用XPathcount()
函数的表达式的结果。For example, both of these statements return an empty string, as shown here:例如,这两条语句都返回一个空字符串,如下所示:
mysql>SELECT ExtractValue('<a><b/></a>', '/a/b');
+-------------------------------------+ | ExtractValue('<a><b/></a>', '/a/b') | +-------------------------------------+ | | +-------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a><c/></a>', '/a/b');
+-------------------------------------+ | ExtractValue('<a><c/></a>', '/a/b') | +-------------------------------------+ | | +-------------------------------------+ 1 row in set (0.00 sec)
However, you can determine whether there was actually a matching element using the following:但是,您可以使用以下公式确定是否确实存在匹配的元素:
mysql>SELECT ExtractValue('<a><b/></a>', 'count(/a/b)');
+-------------------------------------+ | ExtractValue('<a><b/></a>', 'count(/a/b)') | +-------------------------------------+ | 1 | +-------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a><c/></a>', 'count(/a/b)');
+-------------------------------------+ | ExtractValue('<a><c/></a>', 'count(/a/b)') | +-------------------------------------+ | 0 | +-------------------------------------+ 1 row in set (0.01 sec)
ExtractValue()
returns only CDATA
, and does not return any tags that might be contained within a matching tag, nor any of their content (see the result returned as val1
in the following example).ExtractValue()
只返回CDATA
,不返回匹配标记中可能包含的任何标记,也不返回它们的任何内容(请参见以下示例中作为val1
返回的结果)。
mysql>SELECT
->ExtractValue('<a>ccc<b>ddd</b></a>', '/a') AS val1,
->ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b') AS val2,
->ExtractValue('<a>ccc<b>ddd</b></a>', '//b') AS val3,
->ExtractValue('<a>ccc<b>ddd</b></a>', '/b') AS val4,
->ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
+------+------+------+------+---------+ | val1 | val2 | val3 | val4 | val5 | +------+------+------+------+---------+ | ccc | ddd | ddd | | ddd eee | +------+------+------+------+---------+
This function uses the current SQL collation for making comparisons with 此函数使用当前SQL排序规则与contains()
, performing the same collation aggregation as other string functions (such as CONCAT()
), in taking into account the collation coercibility of their arguments; see Section 10.8.4, “Collation Coercibility in Expressions”, for an explanation of the rules governing this behavior.contains()
进行比较,执行与其他字符串函数(如CONCAT()
)相同的排序规则聚合,同时考虑其参数的排序规则强制性;请参阅第10.8.4节“表达式中的排序规则强制性”,以了解有关此行为规则的解释。
(Previously, binary—that is, case-sensitive—comparison was always used.)(以前,总是使用二进制,即区分大小写的比较。)
如果NULL
is returned if xml_frag
contains elements which are not properly nested or closed, and a warning is generated, as shown in this example:xml_frag
包含未正确嵌套或关闭的元素,则返回NULL
,并生成警告,如本例所示:
mysql>SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+ | ExtractValue('<a>c</a><b', '//a') | +-----------------------------------+ | NULL | +-----------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Warning Code: 1525 Message: Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+ | ExtractValue('<a>c</a><b/>', '//a') | +-------------------------------------+ | c | +-------------------------------------+ 1 row in set (0.00 sec)
UpdateXML(
xml_target
, xpath_expr
, new_xml
)
This function replaces a single portion of a given fragment of XML markup 此函数用新的XML片段xml_target
with a new XML fragment new_xml
, and then returns the changed XML. new_xml
替换给定XML标记xml_target
片段的单个部分,然后返回更改后的XML。The portion of 被替换的xml_target
that is replaced matches an XPath expression xpath_expr
supplied by the user.xml_target
部分与用户提供的XPath表达式xpath_expr
匹配。
If no expression matching 如果找不到与xpath_expr
is found, or if multiple matches are found, the function returns the original xml_target
XML fragment. xpath_expr
匹配的表达式,或者找到多个匹配项,则函数返回原先的xml_target
xml片段。All three arguments should be strings.所有三个参数都应该是字符串。
mysql>SELECT
->UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
->UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
->UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
->UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
->UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
->\G
*************************** 1. row *************************** val1: <e>fff</e> val2: <a><b>ccc</b><d></d></a> val3: <a><e>fff</e><d></d></a> val4: <a><b>ccc</b><e>fff</e></a> val5: <a><d></d><b>ccc</b><d></d></a>
A discussion in depth of XPath syntax and usage are beyond the scope of this manual. 对XPath语法和用法的深入讨论超出了本手册的范围。Please see the XML Path Language (XPath) 1.0 specification for definitive information. 请参阅XML路径语言(XPath)1.0规范以获取最终信息。A useful resource for those new to XPath or who are wishing a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.对于那些不熟悉XPath或者希望复习基础知识的人来说,Zvon.org xpath教程是一个有用的资源,它有多种语言版本。
Descriptions and examples of some basic XPath expressions follow:一些基本XPath表达式的描述和示例如下:
/
tag
Matches 当且仅当<
if and only if tag
/><
is the root element.tag
/><
是根元素时,匹配tag
/><
。tag
/>
Example: 示例:在/a
has a match in <a><b/></a>
because it matches the outermost (root) tag. <a><b/></a>
中,/a
有一个匹配,因为它匹配了最外层(根)标签。It does not match the inner 它没有匹配a
element in <b><a/></b>
because in this instance it is the child of another element.<b><a/></b>
里面的内部a
元素,是因为在这个实例中,它是另一个元素的子元素。
/
tag1
/tag2
Matches 当且仅当<
if and only if it is a child of tag2
/><
, and tag1
/><
is the root element.tag1
/><
是tag2
/><
的子元素,并且tag1
/><
是根元素时,匹配该tag1
/><
元素。tag2
/>
Example: 示例:/a/b
matches the b
element in the XML fragment <a><b/></a>
because it is a child of the root element a
. /a/b
匹配XML片段<a><b/></a>
中的b
元素,因为它是根元素a
的子元素。It does not have a match in 它在<b><a/></b>
because in this case, b
is the root element (and hence the child of no other element). <b><a/></b>
中没有匹配,因为在此情形中,b
是根元素(也就是说不是任何元素的子元素)。Nor does the XPath expression have a match in 该表达在<a><c><b/></c></a>
; here, b
is a descendant of a
, but not actually a child of a
.<a><c><b/></c></a>
中也没有匹配,此处,b
是a
的后代,但是并不是a
的真正子元素。
This construct is extendable to three or more elements. 此构造可扩展到三个或更多元素。For example, the XPath expression 例如,XPath表达式/a/b/c
matches the c
element in the fragment <a><b><c/></b></a>
./a/b/c
匹配片段<a><b><c/></b></a>
中的c
元素。
//
tag
Matches any instance of 匹配<
.tag
><
的任一个实例。tag
>
Example: 示例://a
matches the a
element in any of the following: <a><b><c/></b></a>
; <c><a><b/></a></b>
; <c><b><a/></b></c>
.//a
匹配以下任意一项中的a
元素:<a><b><c/></b></a>
;<c><a><b/></a></b>
;<c><b><a/></b></c>
。
//
can be combined with /
. //
能够与/
组合。For example, 例如,//a/b
matches the b
element in either of the fragments <a><b/></a>
or <c><a><b/></a></c>
.//a/b
匹配片段<a><b/></a>
或<c><a><b/></a></c>
中的b
元素。
//
is the equivalent of tag
/descendant-or-self::*/
. tag
//
等同于tag
/descendant-or-self::*/
。tag
A common error is to confuse this with 一个常见的错误是将这与/descendant-or-self::
, although the latter expression can actually lead to very different results, as can be seen here:tag
/descendant-or-self::
搞混,虽然后一种表达式实际上会导致完全不同的结果,如下所示:tag
mysql>SET @xml = '<a><b><c>w</c><b>x</b><d>y</d>z</b></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @xml;
+-----------------------------------------+ | @xml | +-----------------------------------------+ | <a><b><c>w</c><b>x</b><d>y</d>z</b></a> | +-----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '//b[1]');
+------------------------------+ | ExtractValue(@xml, '//b[1]') | +------------------------------+ | x z | +------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '//b[2]');
+------------------------------+ | ExtractValue(@xml, '//b[2]') | +------------------------------+ | | +------------------------------+ 1 row in set (0.01 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::*/b[1]');
+---------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::*/b[1]') | +---------------------------------------------------+ | x z | +---------------------------------------------------+ 1 row in set (0.06 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::*/b[2]');
+---------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::*/b[2]') | +---------------------------------------------------+ | | +---------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::b[1]');
+-------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::b[1]') | +-------------------------------------------------+ | z | +-------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::b[2]');
+-------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::b[2]') | +-------------------------------------------------+ | x | +-------------------------------------------------+ 1 row in set (0.00 sec)
The *
operator acts as a “wildcard” that matches any element. *
运算符充当匹配任何元素的“通配符”。For example, the expression 例如,/*/b
matches the b
element in either of the XML fragments <a><b/></a>
or <c><b/></c>
. /*/b
匹配XML片段<a><b/></a>
或<c><b/></c>
中的b
元素。However, the expression does not produce a match in the fragment 但是,表达式不会在片段<b><a/></b>
because b
must be a child of some other element. <b><a/></b>
中生成匹配项,因为b
必须是其他元素的子元素。The wildcard may be used in any position: The expression 通配符可以用在任何位置:表达式/*/b/*
matches any child of a b
element that is itself not the root element./*/b/*
匹配本身不是根元素的b
元素的任何子元素。
You can match any of several locators using the 可以使用|
(UNION
) operator. |
(UNION
)操作符匹配多个定位器中的任意一个。For example, the expression 例如,表达式//b|//c
matches all b
and c
elements in the XML target.//b|//c
匹配XML目标中的所有b
和c
元素。
It is also possible to match an element based on the value of one or more of its attributes. 也可以基于元素的一个或多个属性的值来匹配元素。This done using the syntax 这是使用语法
. tag
[@attribute
="value
"]
来完成的。tag
[@attribute
="value
"]For example, the expression 例如,表达式//b[@id="idB"]
matches the second b
element in the fragment <a><b id="idA"/><c/><b id="idB"/></a>
. //b[@id="idB"]
匹配片段<a><b id="idA"/><c/><b id="idB"/></a>
中的第二个b
元素。To match against any element having 要针对任何具有
, use the XPath expression attribute
="value
"//*[
.attribute
="value
"]
的元素作匹配,请使用XPath表达式attribute
="value
"//*[
。attribute
="value
"]
To filter multiple attribute values, simply use multiple attribute-comparison clauses in succession. 要过滤多个属性值,只需连续使用多个属性比较子句。For example, the expression 例如,表达式//b[@c="x"][@d="y"]
matches the element <b c="x" d="y"/>
occurring anywhere in a given XML fragment.//b[@c="x"][@d="y"]
与出现在给定XML片段中的任意位置的元素<b c="x" d="y"/>
匹配。
To find elements for which the same attribute matches any of several values, you can use multiple locators joined by the 要查找同一属性与多个值中的任何值匹配的元素,可以使用多个由|
operator. |
运算符连接的定位器。For example, to match all 例如,若要匹配所有的b
elements whose c
attributes have either of the values 23 or 17, use the expression //b[@c="23"]|//b[@c="17"]
. b
元素,其c
属性具有值23或17,请使用表达式//b[@c="23"]|//b[@c="17"]
。You can also use the logical 你还可以使用逻辑or
operator for this purpose: //b[@c="23" or @c="17"]
.or
运算符以实现这个目的://b[@c="23" or @c="17"]
。
The difference between or
and |
is that or
joins conditions, while |
joins result sets.or
和|
的区别在于or
连接条件,而|
连接结果集。
XPath Limitations.XPath限制。 The XPath syntax supported by these functions is currently subject to the following limitations:这些函数支持的XPath语法目前受到以下限制:
Nodeset-to-nodeset comparison (such as 节点集到节点集的比较(例如'/a/b[@c=@d]'
) is not supported.'/a/b[@c=@d]'
不受支持)。
All of the standard XPath comparison operators are supported. 支持所有标准XPath比较运算符。(Bug #22823)
Relative locator expressions are resolved in the context of the root node. 在根节点的上下文中解析相对定位器表达式。For example, consider the following query and result:例如,考虑以下查询和结果:
mysql>SELECT ExtractValue(
->'<a><b c="1">X</b><b c="2">Y</b></a>',
->'a/b'
->) AS result;
+--------+ | result | +--------+ | X Y | +--------+ 1 row in set (0.03 sec)
In this case, the locator 在这种情况下,定位器a/b
resolves to /a/b
.a/b
解析为/a/b
。
Relative locators are also supported within predicates. 谓词中也支持相对定位器。In the following example, 在下面的示例中,d[../@c="1"]
is resolved as /a/b[@c="1"]/d
:d[../@c="1"]
被解析为/a/b[@c="1"]/d
:
mysql>SELECT ExtractValue(
->'<a>
-><b c="1"><d>X</d></b>
-><b c="2"><d>X</d></b>
-></a>',
->'a/b/d[../@c="1"]')
->AS result;
+--------+ | result | +--------+ | X | +--------+ 1 row in set (0.00 sec)
Locators prefixed with expressions that evaluate as scalar values—including variable references, literals, numbers, and scalar function calls—are not permitted, and their use results in an error.不允许使用带有表达式前缀的定位器,这些表达式的计算结果为标量值,包括变量引用、文字、数字和标量函数调用,使用这些定位器会导致错误。
The ::
operator is not supported in combination with node types such as the following:::
运算符不支持与以下节点类型组合使用:
axis
::comment()
axis
::text()
axis
::processing-instructions()
axis
::node()
However, name tests (such as 然而,名称嵌套(譬如
and axis
::name
) are supported, as shown in these examples:axis
::*
和axis
::name
)是受支持的。axis
::*
mysql>SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b');
+-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b') | +-------------------------------------------------------+ | x | +-------------------------------------------------------+ 1 row in set (0.02 sec) mysql>SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*');
+-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*') | +-------------------------------------------------------+ | x y | +-------------------------------------------------------+ 1 row in set (0.01 sec)
“Up-and-down” navigation is not supported in cases where the path would lead “above” the root element. 如果路径指向根元素的“上方”,则不支持“上下”导航。That is, you cannot use expressions which match on descendants of ancestors of a given element, where one or more of the ancestors of the current element is also an ancestor of the root element (see Bug #16321).也就是说,不能使用与给定元素的祖先后代匹配的表达式,其中当前元素的一个或多个祖先也是根元素的祖先(请参见Bug#16321)。
The following XPath functions are not supported, or have known issues as indicated:以下XPath函数不受支持,或存在已知问题,如图所示:
id()
lang()
local-name()
name()
namespace-uri()
normalize-space()
starts-with()
string()
substring-after()
substring-before()
translate()
The following axes are not supported:不支持以下轴:
following-sibling
following
preceding-sibling
preceding
XPath expressions passed as arguments to 作为参数传递给ExtractValue()
and UpdateXML()
may contain the colon character (:
) in element selectors, which enables their use with markup employing XML namespaces notation. ExtractValue()
和UpdateXML()
的XPath表达式在元素选择器中可能包含冒号字符(:
),这使它们能够与使用XML名称空间表示法的标记一起使用。For example:例如:
mysql>SET @xml = '<a>111<b:c>222<d>333</d><e:f>444</e:f></b:c></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT ExtractValue(@xml, '//e:f');
+-----------------------------+ | ExtractValue(@xml, '//e:f') | +-----------------------------+ | 444 | +-----------------------------+ 1 row in set (0.00 sec) mysql>SELECT UpdateXML(@xml, '//b:c', '<g:h>555</g:h>');
+--------------------------------------------+ | UpdateXML(@xml, '//b:c', '<g:h>555</g:h>') | +--------------------------------------------+ | <a>111<g:h>555</g:h></a> | +--------------------------------------------+ 1 row in set (0.00 sec)
This is similar in some respects to what is permitted by Apache Xalan and some other parsers, and is much simpler than requiring namespace declarations or the use of the 这在某些方面与Apache Xalan和其他一些解析器所允许的类似,并且比要求命名空间声明或使用namespace-uri()
and local-name()
functions.namespace-uri()
和local-name()
函数简单得多。
Error handling.错误处理。 For both 对于EExtractValue()
and UpdateXML()
, the XPath locator used must be valid and the XML to be searched must consist of elements which are properly nested and closed. xtractValue()
和UpdateXML()
,使用的XPath定位器必须有效,并且要搜索的XML必须由正确嵌套和关闭的元素组成。If the locator is invalid, an error is generated:如果定位器无效,则生成错误:
mysql> SELECT ExtractValue('<a>c</a><b/>', '/&a');
ERROR 1105 (HY000): XPATH syntax error: '&a'
If 如果xml_frag
does not consist of elements which are properly nested and closed, NULL
is returned and a warning is generated, as shown in this example:xml_frag
不包含正确嵌套和关闭的元素,则返回NULL
并生成警告,如下例所示:
mysql>SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+ | ExtractValue('<a>c</a><b', '//a') | +-----------------------------------+ | NULL | +-----------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Warning Code: 1525 Message: Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+ | ExtractValue('<a>c</a><b/>', '//a') | +-------------------------------------+ | c | +-------------------------------------+ 1 row in set (0.00 sec)
The replacement XML used as the third argument to 不会检查用作UpdateXML()
is not checked to determine whether it consists solely of elements which are properly nested and closed.UpdateXML()
的第三个参数的替换XML,以确定它是否仅由正确嵌套和关闭的元素组成。
XPath Injection.XPath注入。 code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. 当恶意代码被引入系统以获得对权限和数据的未经授权的访问时,就会发生代码注入。It is based on exploiting assumptions made by developers about the type and content of data input from users. 它基于开发人员对用户输入的数据类型和内容所做的假设。XPath is no exception in this regard.XPath在这方面也不例外。
A common scenario in which this can happen is the case of application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:一种常见的情况是,应用程序通过将登录名和密码的组合与XML文件中的组合相匹配来处理授权,使用如下XPath表达式:
//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id
This is the XPath equivalent of an SQL statement like this one:这是类似以下SQL语句的XPath等价物:
SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';
A PHP application employing XPath might handle the login process like this:使用XPath的PHP应用程序可能会像这样处理登录过程:
<?php $file = "users.xml"; $login = $POST["login"]; $password = $POST["password"]; $xpath = "//user[login/text()=$login and password/text()=$password]/attribute::id"; if( file_exists($file) ) { $xml = simplexml_load_file($file); if($result = $xml->xpath($xpath)) echo "You are now logged in as user $result[0]."; else echo "Invalid login name or password."; } else exit("Failed to open $file."); ?>
No checks are performed on the input. 不检查输入。This means that a malevolent user can “short-circuit” the test by entering 这意味着恶意用户可以通过为登录名和密码输入' or 1=1
for both the login name and password, resulting in $xpath
being evaluated as shown here:' or 1=1
来“短路”测试,从而导致$xpath
的计算结果如下所示:
//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id
Since the expression inside the square brackets always evaluates as 由于方括号内的表达式的计算结果始终为true
, it is effectively the same as this one, which matches the id
attribute of every user
element in the XML document:true
,因此它实际上与此表达式相同,后者匹配XML文档中每个用户元素的id属性:
//user/attribute::id
One way in which this particular attack can be circumvented is simply by quoting the variable names to be interpolated in the definition of 规避此特定攻击的一种方法是简单地引用要在$xpath
, forcing the values passed from a Web form to be converted to strings:$xpath
定义中插入的变量名,强制将从Web表单传递的值转换为字符串:
$xpath = "//user[login/text()='$login' and password/text()='$password']/attribute::id";
This is the same strategy that is often recommended for preventing SQL injection attacks. 这与通常建议用于防止SQL注入攻击的策略相同。In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:通常,防止XPath注入攻击应遵循的做法与防止SQL注入的做法相同:
Never accepted untested data from users in your application.从不接受应用程序中用户的未测试数据。
Check all user-submitted data for type; reject or convert data that is of the wrong type检查所有用户提交的数据类型;拒绝或转换错误类型的数据
Test numeric data for out of range values; truncate, round, or reject values that are out of range. 测试数值数据是否超出范围;截断、舍入或拒绝超出范围的值。Test strings for illegal characters and either strip them out or reject input containing them.测试字符串中是否有非法字符,并将其删除或拒绝包含这些字符的输入。
Do not output explicit error messages that might provide an unauthorized user with clues that could be used to compromise the system; log these to a file or database table instead.不要输出明确的错误消息,这些消息可能会向未经授权的用户提供可用于危害系统的线索;将这些记录到文件或数据库表中。
Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).正如SQL注入攻击可以用来获取有关数据库模式的信息一样,XPath注入也可以用来遍历XML文件以揭示其结构,正如Amit Klein的论文Blind XPath Injection(PDF文件,46KB)中所讨论的那样。
It is also important to check the output being sent back to the client. 检查发送回客户机的输出也很重要。Consider what can happen when we use the MySQL 考虑一下当我们使用MySQL ExtractValue()
function:ExtractValue()
函数时会发生什么:
mysql>SELECT ExtractValue(
->LOAD_FILE('users.xml'),
->'//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
->) AS id;
+-------------------------------+ | id | +-------------------------------+ | 00327 13579 02403 42354 28570 | +-------------------------------+ 1 row in set (0.01 sec)
Because 由于ExtractValue()
returns multiple matches as a single space-delimited string, this injection attack provides every valid ID contained within users.xml
to the user as a single row of output. ExtractValue()
以单个空格分隔的字符串形式返回多个匹配项,因此此注入攻击将users.xml
中包含的每个有效ID作为一行输出提供给用户。As an extra safeguard, you should also test output before returning it to the user. 作为额外的保护措施,您还应该在将输出返回给用户之前测试输出。Here is a simple example:下面是一个简单的例子:
mysql>SELECT @id = ExtractValue(
->LOAD_FILE('users.xml'),
->'//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
->);
Query OK, 0 rows affected (0.00 sec) mysql>SELECT IF(
->INSTR(@id, ' ') = 0,
->@id,
->'Unable to retrieve user ID')
->AS singleID;
+----------------------------+ | singleID | +----------------------------+ | Unable to retrieve user ID | +----------------------------+ 1 row in set (0.00 sec)
In general, the guidelines for returning data to users securely are the same as for accepting user input. 一般来说,安全地将数据返回给用户的准则与接受用户输入的准则相同。These can be summed up as:这些可以概括为:
Always test outgoing data for type and permissible values.始终测试输出数据的类型和允许值。
Never permit unauthorized users to view error messages that might provide information about the application that could be used to exploit it.决不允许未经授权的用户查看错误消息,这些错误消息可能会提供有关可用于攻击它的应用程序的信息。