12.12 XML FunctionsXML函数

Table 12.16 XML FunctionsXML函数

Name名称Description描述
ExtractValue()Extract a value from an XML string using XPath notation使用XPath表示法从XML字符串中提取值
UpdateXML()Return replaced XML fragment返回替换的XML片段

This section discusses XML and related functionality in MySQL.本节讨论MySQL中的XML和相关功能。

Note注意

It is possible to obtain XML-formatted output from MySQL in the mysql and mysqldump clients by invoking them with the --xml option. 通过使用--xml选项调用MySQLmysqldump客户机,可以从MySQL获得XML格式的输出。See Section 4.5.1, “mysql — The MySQL Command-Line Client”, and Section 4.5.4, “mysqldump — A Database Backup Program”.请参阅第4.5.1节,“mysql命令行客户端”第4.5.4节,“mysqldump数据库备份程序”

Two functions providing basic XPath 1.0 (XML Path Language, version 1.0) capabilities are available. 有两个函数提供基本的XPath1.0(XML路径语言1.0版)功能。Some basic information about XPath syntax and usage is provided later in this section; however, an in-depth discussion of these topics is beyond the scope of this manual, and you should refer to the XML Path Language (XPath) 1.0 standard for definitive information. 有关XPath语法和用法的一些基本信息将在本节后面提供;但是,对这些主题的深入讨论超出了本手册的范围,您应该参考XML路径语言(XPath)1.0标准以获取最终信息。A useful resource for those new to XPath or who desire a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.对于那些不熟悉XPath或希望复习基础知识的人来说,Zvon.org XPath教程是一个有用的资源,它有多种语言版本。

Note注意

These functions remain under development. 这些功能仍在开发中。We continue to improve these and other aspects of XML and XPath functionality in MySQL 8.0 and onwards. 我们将继续改进MySQL8.0及以后版本中XML和XPath功能的这些方面和其他方面。You may discuss these, ask questions about them, and obtain help from other users with them in the MySQL XML User Forum.您可以在MySQL-XML用户论坛上讨论这些问题,询问有关它们的问题,并从其他用户那里获得帮助。

XPath expressions used with these functions support user variables and local stored program variables. 与这些函数一起使用的XPath表达式支持用户变量和本地存储程序变量。User variables are weakly checked; variables local to stored programs are strongly checked (see also Bug #26518):弱检查用户变量;强烈检查存储程序的本地变量(另请参见错误#26518):

Expressions containing user variables or variables local to stored programs must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.包含用户变量或存储程序本地变量的表达式必须符合XPath 1.0规范中给出的包含变量的XPath表达式的规则。

Note注意

A user variable used to store an XPath expression is treated as an empty string. 用于存储XPath表达式的用户变量被视为空字符串。Because of this, it is not possible to store an XPath expression as a user variable. 因此,不可能将XPath表达式存储为用户变量。(Bug #32911)

Note注意

A discussion in depth of XPath syntax and usage are beyond the scope of this manual. 对XPath语法和用法的深入讨论超出了本手册的范围。Please see the XML Path Language (XPath) 1.0 specification for definitive information. 请参阅XML路径语言(XPath)1.0规范以获取最终信息。A useful resource for those new to XPath or who are wishing a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.对于那些不熟悉XPath或者希望复习基础知识的人来说,Zvon.org xpath教程是一个有用的资源,它有多种语言版本。

Descriptions and examples of some basic XPath expressions follow:一些基本XPath表达式的描述和示例如下:

XPath Limitations.XPath限制。  The XPath syntax supported by these functions is currently subject to the following limitations:这些函数支持的XPath语法目前受到以下限制:

XPath expressions passed as arguments to ExtractValue() and UpdateXML() may contain the colon character (:) in element selectors, which enables their use with markup employing XML namespaces notation. 作为参数传递给ExtractValue()UpdateXML()的XPath表达式在元素选择器中可能包含冒号字符(:),这使它们能够与使用XML名称空间表示法的标记一起使用。For example:例如:

mysql> SET @xml = '<a>111<b:c>222<d>333</d><e:f>444</e:f></b:c></a>';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT ExtractValue(@xml, '//e:f');
+-----------------------------+
| ExtractValue(@xml, '//e:f') |
+-----------------------------+
| 444                         |
+-----------------------------+
1 row in set (0.00 sec)

mysql> SELECT UpdateXML(@xml, '//b:c', '<g:h>555</g:h>');
+--------------------------------------------+
| UpdateXML(@xml, '//b:c', '<g:h>555</g:h>') |
+--------------------------------------------+
| <a>111<g:h>555</g:h></a>                   |
+--------------------------------------------+
1 row in set (0.00 sec)

This is similar in some respects to what is permitted by Apache Xalan and some other parsers, and is much simpler than requiring namespace declarations or the use of the namespace-uri() and local-name() functions.这在某些方面与Apache Xalan和其他一些解析器所允许的类似,并且比要求命名空间声明或使用namespace-uri()local-name()函数简单得多。

Error handling.错误处理。  For both ExtractValue() and UpdateXML(), the XPath locator used must be valid and the XML to be searched must consist of elements which are properly nested and closed. 对于ExtractValue()UpdateXML(),使用的XPath定位器必须有效,并且要搜索的XML必须由正确嵌套和关闭的元素组成。If the locator is invalid, an error is generated:如果定位器无效,则生成错误:

mysql> SELECT ExtractValue('<a>c</a><b/>', '/&a');
ERROR 1105 (HY000): XPATH syntax error: '&a'

If xml_frag does not consist of elements which are properly nested and closed, NULL is returned and a warning is generated, as shown in this example:如果xml_frag不包含正确嵌套和关闭的元素,则返回NULL并生成警告,如下例所示:

mysql> SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+
| ExtractValue('<a>c</a><b', '//a') |
+-----------------------------------+
| NULL                              |
+-----------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Warning
   Code: 1525
Message: Incorrect XML value: 'parse error at line 1 pos 11:
         END-OF-INPUT unexpected ('>' wanted)'
1 row in set (0.00 sec)

mysql> SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+
| ExtractValue('<a>c</a><b/>', '//a') |
+-------------------------------------+
| c                                   |
+-------------------------------------+
1 row in set (0.00 sec)
Important重要

The replacement XML used as the third argument to UpdateXML() is not checked to determine whether it consists solely of elements which are properly nested and closed.不会检查用作UpdateXML()的第三个参数的替换XML,以确定它是否仅由正确嵌套和关闭的元素组成。

XPath Injection.XPath注入。  code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. 当恶意代码被引入系统以获得对权限和数据的未经授权的访问时,就会发生代码注入。It is based on exploiting assumptions made by developers about the type and content of data input from users. 它基于开发人员对用户输入的数据类型和内容所做的假设。XPath is no exception in this regard.XPath在这方面也不例外。

A common scenario in which this can happen is the case of application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:一种常见的情况是,应用程序通过将登录名和密码的组合与XML文件中的组合相匹配来处理授权,使用如下XPath表达式:

//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id

This is the XPath equivalent of an SQL statement like this one:这是类似以下SQL语句的XPath等价物:

SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';

A PHP application employing XPath might handle the login process like this:使用XPath的PHP应用程序可能会像这样处理登录过程:

<?php

  $file     =   "users.xml";

  $login    =   $POST["login"];
  $password =   $POST["password"];

  $xpath = "//user[login/text()=$login and password/text()=$password]/attribute::id";

  if( file_exists($file) )
  {
    $xml = simplexml_load_file($file);

    if($result = $xml->xpath($xpath))
      echo "You are now logged in as user $result[0].";
    else
      echo "Invalid login name or password.";
  }
  else
    exit("Failed to open $file.");

?>

No checks are performed on the input. 不检查输入。This means that a malevolent user can short-circuit the test by entering ' or 1=1 for both the login name and password, resulting in $xpath being evaluated as shown here:这意味着恶意用户可以通过为登录名和密码输入' or 1=1来“短路”测试,从而导致$xpath的计算结果如下所示:

//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id

Since the expression inside the square brackets always evaluates as true, it is effectively the same as this one, which matches the id attribute of every user element in the XML document:由于方括号内的表达式的计算结果始终为true,因此它实际上与此表达式相同,后者匹配XML文档中每个用户元素的id属性:

//user/attribute::id

One way in which this particular attack can be circumvented is simply by quoting the variable names to be interpolated in the definition of $xpath, forcing the values passed from a Web form to be converted to strings:规避此特定攻击的一种方法是简单地引用要在$xpath定义中插入的变量名,强制将从Web表单传递的值转换为字符串:

$xpath = "//user[login/text()='$login' and password/text()='$password']/attribute::id";

This is the same strategy that is often recommended for preventing SQL injection attacks. 这与通常建议用于防止SQL注入攻击的策略相同。In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:通常,防止XPath注入攻击应遵循的做法与防止SQL注入的做法相同:

Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).正如SQL注入攻击可以用来获取有关数据库模式的信息一样,XPath注入也可以用来遍历XML文件以揭示其结构,正如Amit Klein的论文Blind XPath Injection(PDF文件,46KB)中所讨论的那样。

It is also important to check the output being sent back to the client. 检查发送回客户机的输出也很重要。Consider what can happen when we use the MySQL ExtractValue() function:考虑一下当我们使用MySQL ExtractValue()函数时会发生什么:

mysql> SELECT ExtractValue(
    ->     LOAD_FILE('users.xml'),
    ->     '//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
    -> ) AS id;
+-------------------------------+
| id                            |
+-------------------------------+
| 00327 13579 02403 42354 28570 |
+-------------------------------+
1 row in set (0.01 sec)

Because ExtractValue() returns multiple matches as a single space-delimited string, this injection attack provides every valid ID contained within users.xml to the user as a single row of output. 由于ExtractValue()以单个空格分隔的字符串形式返回多个匹配项,因此此注入攻击将users.xml中包含的每个有效ID作为一行输出提供给用户。As an extra safeguard, you should also test output before returning it to the user. 作为额外的保护措施,您还应该在将输出返回给用户之前测试输出。Here is a simple example:下面是一个简单的例子:

mysql> SELECT @id = ExtractValue(
    ->     LOAD_FILE('users.xml'),
    ->     '//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT IF(
    ->     INSTR(@id, ' ') = 0,
    ->     @id,
    ->     'Unable to retrieve user ID')
    -> AS singleID;
+----------------------------+
| singleID                   |
+----------------------------+
| Unable to retrieve user ID |
+----------------------------+
1 row in set (0.00 sec)

In general, the guidelines for returning data to users securely are the same as for accepting user input. 一般来说,安全地将数据返回给用户的准则与接受用户输入的准则相同。These can be summed up as:这些可以概括为: