OfficeOpenXML.com
 
Anatomy of a WordprocessingML File

Package Structure

A WordprocessingML or docx file is a zip file (a package) containing a number of "parts". Parts are typically UTF-8 or UTF-16 encoded XML files, although, strictly defined, a part is a stream of bytes. The package may also contain other media files, such as images and video. The structure is organized according to the Open Packaging Conventions.

You can look at the file structure and the files by simply renaming any docx file to a zip file and unzipping it.

[Figure: WordprocessingML file structure]
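The same inspection can be done programmatically, since a zip library will open a docx file directly without renaming it. The following is a minimal sketch using Python's standard zipfile module. The helper names list_parts and build_sample_package are hypothetical, and the sample package is a stand-in that only mimics the layout of a real document.

```python
import io
import zipfile


def list_parts(docx):
    """Return the names of all items in a .docx package, sorted.

    `docx` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as package:
        return sorted(package.namelist())


def build_sample_package():
    """Build a tiny in-memory zip that only mimics a package layout.

    The XML bodies are placeholders, not valid WordprocessingML.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_parts(build_sample_package()):
        print(name)
```

To inspect a real document, pass its path instead of the sample buffer, for example list_parts("report.docx"), where the file name is only an illustration.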

Content Types

Every package must have a [Content_Types].xml, found at the root of the package. This file declares the content types of the parts in the package. Every part and its type must be accounted for in [Content_Types].xml, either by a Default element that maps a file extension to a type or by an Override element that names a specific part. The following is a content type for the main document part:

<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>

It's important to keep this in mind when adding new parts to the package.
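As an illustration of how a consumer might look up the type of a part, here is a small sketch that parses [Content_Types].xml with Python's standard library. It assumes the usual lookup order (an Override for the exact part name first, then a Default for the file extension). The function name content_type_of and the sample XML are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the elements inside [Content_Types].xml.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Illustrative sample, not taken from a real document.
SAMPLE_CONTENT_TYPES = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""


def content_type_of(content_types_xml, part_name):
    """Return the declared content type of `part_name`, or None."""
    root = ET.fromstring(content_types_xml)

    # 1. An Override entry for this exact part name takes precedence.
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName", "").lower() == part_name.lower():
            return override.get("ContentType")

    # 2. Otherwise fall back to a Default entry for the file extension.
    file_name = part_name.rsplit("/", 1)[-1]
    extension = file_name.rsplit(".", 1)[-1].lower() if "." in file_name else ""
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")

    # 3. The part is not covered, so the package would be invalid.
    return None


if __name__ == "__main__":
    print(content_type_of(SAMPLE_CONTENT_TYPES, "/word/document.xml"))
    print(content_type_of(SAMPLE_CONTENT_TYPES, "/word/media/image1.png"))
```

In this sample the image part returns None, which is the situation to avoid when adding a new part: either an Override or a matching Default has to be added along with it.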

Relationships

Every package contains a relationships part that defines the relationships between the other parts and to resources outside of the package. This separates the relationships from content and makes it easy to change relationships without changing the sources that reference targets.

[Figure: package relationships part]

For an OOXML package, there is always a relationships part (.rels) within the _rels folder at the root of the package. It holds the package relationships, which identify the starting parts of the package. For example, the following defines the identity of the start part for the content:

<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>

There are also typically relationships within .rels for app.xml and core.xml.
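A consumer usually starts from the package relationships to locate the main document rather than assuming the path word/document.xml. The sketch below shows one way to do that with Python's standard library. The function name find_start_part and the sample .rels content are illustrative, and the relationship type URIs are the ones used by transitional (non-strict) documents, which is an assumption about the file being read.

```python
import xml.etree.ElementTree as ET

# Namespace of the elements inside any .rels part.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Relationship type that marks the start part (transitional documents).
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/officeDocument"
)

# Illustrative package-level .rels content.
SAMPLE_PACKAGE_RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
  <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/>
  <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/>
</Relationships>"""


def find_start_part(package_rels_xml):
    """Return the Target of the officeDocument relationship, or None."""
    root = ET.fromstring(package_rels_xml)
    for relationship in root.findall(f"{{{RELS_NS}}}Relationship"):
        if relationship.get("Type") == OFFICE_DOCUMENT:
            return relationship.get("Target")
    return None


if __name__ == "__main__":
    print(find_start_part(SAMPLE_PACKAGE_RELS))
```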

In addition to the relationships part for the package, each part that is the source of one or more relationships will have its own relationships part. Each such relationships part is found within a _rels sub-folder next to the part and is named by appending '.rels' to the name of the part. Typically the main content part (document.xml) has its own relationships part (document.xml.rels). It will contain relationships to the other parts of the content, such as styles.xml, themes.xml, and footer.xml, as well as the URIs for external links.
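The naming rule is mechanical, so it can be expressed in a few lines. This is a minimal sketch in Python. The helper name rels_part_name is hypothetical, and part names are assumed to be absolute, with "/" standing for the package itself.

```python
import posixpath


def rels_part_name(part_name):
    """Return the name of the relationships part for `part_name`.

    The relationships part lives in a `_rels` folder next to the source
    part and is named after it with `.rels` appended. Passing "/" yields
    the package-level relationships part.
    """
    folder, file_name = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", file_name + ".rels")


if __name__ == "__main__":
    print(rels_part_name("/word/document.xml"))
    print(rels_part_name("/"))
```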

[Figure: document relationships part]

A relationship can be either explicit or implicit. For an explicit relationship, a resource is referenced using the Id attribute of a <Relationship> element. That is, the Id in the source maps directly to the Id of a relationship item, which in turn names the target explicitly.

For example, a document might contain a hyperlink such as this:

<w:hyperlink r:id="rId4">

The r:id="rId4" references the following relationship within the relationships part for the document (document.xml.rels).

<Relationship Id="rId4" Type="http://. . ./hyperlink" Target="http://www.google.com/" TargetMode="External"/>
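To make the explicit lookup concrete, here is a short Python sketch that resolves the r:id of each hyperlink through the relationships part. The function names and the two XML fragments are illustrative only. The Type value is left elided exactly as in the example above, since the lookup here relies on the Id alone.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Illustrative fragment of document.xml.
SAMPLE_DOCUMENT = f"""<w:p xmlns:w="{W_NS}" xmlns:r="{R_NS}">
  <w:hyperlink r:id="rId4"><w:r><w:t>Google</w:t></w:r></w:hyperlink>
</w:p>"""

# Illustrative document.xml.rels; the Type is elided as in the text.
SAMPLE_DOCUMENT_RELS = f"""<Relationships xmlns="{RELS_NS}">
  <Relationship Id="rId4" Type="http://. . ./hyperlink" Target="http://www.google.com/" TargetMode="External"/>
</Relationships>"""


def load_relationships(rels_xml):
    """Map each relationship Id to a (Target, TargetMode) tuple."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): (rel.get("Target"), rel.get("TargetMode", "Internal"))
        for rel in root.findall(f"{{{RELS_NS}}}Relationship")
    }


def hyperlink_targets(document_xml, rels_xml):
    """Return the target of every w:hyperlink that carries an r:id."""
    relationships = load_relationships(rels_xml)
    root = ET.fromstring(document_xml)
    targets = []
    for link in root.iter(f"{{{W_NS}}}hyperlink"):
        relationship_id = link.get(f"{{{R_NS}}}id")
        if relationship_id in relationships:
            targets.append(relationships[relationship_id][0])
    return targets


if __name__ == "__main__":
    print(hyperlink_targets(SAMPLE_DOCUMENT, SAMPLE_DOCUMENT_RELS))
```

Because the URL lives only in the relationships part, changing the link target means editing document.xml.rels and leaving document.xml untouched.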

For an implicit relationship, there is no such direct reference to a <Relationship> Id. Instead, the reference is understood. For example, a document might contain a reference to a footnote as shown below.

<w:footnoteReference w:id="2">

In this case, the reference to the footnote with w:id="2" is understood to be in the Footnotes part that exists when there are footnotes. In the Footnotes part we will see the following.

<w:footnote w:id="2">
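The implicit case can be sketched in the same way: the footnote reference carries only a w:id, and the reader is expected to look that id up in the Footnotes part. The following Python example is a minimal illustration. The function name footnote_texts and both XML fragments are made up for the example.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W_ID = f"{{{W_NS}}}id"

# Illustrative fragment of document.xml.
SAMPLE_DOCUMENT = f"""<w:p xmlns:w="{W_NS}">
  <w:r><w:footnoteReference w:id="2"/></w:r>
</w:p>"""

# Illustrative Footnotes part (footnotes.xml).
SAMPLE_FOOTNOTES = f"""<w:footnotes xmlns:w="{W_NS}">
  <w:footnote w:id="2">
    <w:p><w:r><w:t>Sample footnote text.</w:t></w:r></w:p>
  </w:footnote>
</w:footnotes>"""


def footnote_texts(document_xml, footnotes_xml):
    """Return the text of each footnote referenced in the document.

    A reference whose id is missing from the Footnotes part yields None.
    """
    # Index the Footnotes part by w:id.
    by_id = {}
    for footnote in ET.fromstring(footnotes_xml).iter(f"{{{W_NS}}}footnote"):
        text = "".join(t.text or "" for t in footnote.iter(f"{{{W_NS}}}t"))
        by_id[footnote.get(W_ID)] = text

    # Resolve each reference by id; no <Relationship> Id is involved.
    references = ET.fromstring(document_xml).iter(f"{{{W_NS}}}footnoteReference")
    return [by_id.get(reference.get(W_ID)) for reference in references]


if __name__ == "__main__":
    print(footnote_texts(SAMPLE_DOCUMENT, SAMPLE_FOOTNOTES))
```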

Parts Specific to WordprocessingML Documents

Below is a list of the possible parts of a WordprocessingML package that are specific to WordprocessingML documents. Keep in mind that a document may only have a few of these parts. For example, if a document has no footnotes, then a footnotes part will not be included in the package.

Part | Description
Comments | Contains the comments in the document. There may be a comments part for the main document and one for the glossary, if there is a glossary.
Document Settings | Specifies the settings for the document, such as whether to hide spelling and grammatical errors, whether to track revisions, and write protection. There may be a document settings part for the main document and one for the glossary, if there is a glossary.
Endnotes | Contains the endnotes for a document. There may be an endnotes part for the main document and one for the glossary, if there is a glossary.
Font Table | Specifies information about the fonts used in the document. The application will use the information in the part to determine which fonts to use to display the document when the specified fonts are not available on the system. There may be a font table for the main document and one for the glossary, if there is a glossary.
Footer | Contains the information for a footer. Note that each section of a document may contain a footer for the first page, odd pages, and even pages. So there may be multiple footer parts, depending upon how many sections there are in the document and the types of footers for the sections.
Footnotes | Contains the footnotes for the document. There may be a footnotes part for the main document and one for the glossary, if there is a glossary.
Glossary | This is a supplementary document storage location which may contain content that is carried with the document but is not visible from the main document contents. It is intended for storage of optional document fragments. Only one is permitted.
Header | Contains the information for a header. Note that each section of a document may contain a header for the first page, odd pages, and even pages. So there may be multiple header parts, depending upon how many sections there are in the document and the types of headers for the sections.
Main Document | Contains the body of the document.
Numbering Definitions | Contains the definition for the structure of each numbering definition in the document. There may be a numbering definitions part for the main document and one for the glossary, if there is a glossary.
Style Definitions | Contains the definitions for a set of styles used by the document. There may be a style definitions part for the main document and one for the glossary, if there is a glossary.
Web Settings | Contains the definitions for web-specific settings used by the document. These settings fall into two categories: settings related to HTML documents (that is, frameset definitions) that can be used in WordprocessingML documents, and settings which affect how the document is handled when saved as HTML. There may be a web settings part for the main document and one for the glossary, if there is a glossary.
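One way to see which of these parts a given document actually carries is to scan the Override entries of [Content_Types].xml. The sketch below assumes that WordprocessingML-specific parts use content types beginning with the prefix seen earlier for the main document, and it derives a short label from the remainder of the type. The function name wordprocessing_parts, the prefix constant, and the sample XML are illustrative, and only parts declared through Override entries are reported.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Assumed common prefix of WordprocessingML-specific content types.
WML_PREFIX = "application/vnd.openxmlformats-officedocument.wordprocessingml."
XML_SUFFIX = "+xml"

# Illustrative [Content_Types].xml for a document with footnotes.
SAMPLE_CONTENT_TYPES = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
  <Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml"/>
  <Override PartName="/word/footnotes.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.footnotes+xml"/>
  <Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
</Types>"""


def wordprocessing_parts(content_types_xml):
    """Map part names to a short kind such as 'styles' or 'footnotes'."""
    root = ET.fromstring(content_types_xml)
    found = {}
    for override in root.findall(f"{{{CT_NS}}}Override"):
        content_type = override.get("ContentType", "")
        if content_type.startswith(WML_PREFIX) and content_type.endswith(XML_SUFFIX):
            kind = content_type[len(WML_PREFIX):-len(XML_SUFFIX)]
            found[override.get("PartName")] = kind
    return found


if __name__ == "__main__":
    parts = wordprocessing_parts(SAMPLE_CONTENT_TYPES)
    for part_name, kind in parts.items():
        print(part_name, kind)
    print("has footnotes:", "footnotes" in parts.values())
```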

Parts Shared by Other OOXML Documents

There are a number of part types that may appear in any OOXML package. Below are some of the more relevant parts for WordprocessingML documents.

Part | Description
Embedded package | Contains a complete package, either internal or external to the referencing package. For example, a WordprocessingML document might contain a spreadsheet or presentation document.
Extended File Properties (often found at docProps/app.xml) | Contains properties specific to an OOXML document, such as the template used, the number of pages and words, and the application name and version.
File Properties, Core | Core file properties enable the user to discover and set common properties within a package, such as creator name, creation date, and title. Dublin Core properties (a set of metadata terms used to describe resources) are used whenever possible.
Font | Contains a font embedded directly into the document. Fonts can be stored either as a bitmapped font, in which each glyph is stored as a raster image, or in a format conforming to ISO/IEC 14496-22:2007.
Image | Documents often contain images. An image can be stored in a package as a zip item. The item must be identified by an image part relationship and the appropriate content type.
Theme | DrawingML is a shared language across the OOXML document types. It includes a theme part that is included in WordprocessingML documents when the document uses a theme. The theme part contains information about a document's theme, such as the color scheme, font scheme, and format scheme.
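As an example of working with one of these shared parts, the sketch below reads a few Dublin Core values from a core properties part (often docProps/core.xml). It is a minimal illustration using Python's standard library. The function name read_core_properties and the sample values are invented, and only three of the possible properties are read.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative core properties part with made-up values.
SAMPLE_CORE = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>Sample Title</dc:title>
  <dc:creator>Sample Author</dc:creator>
  <dcterms:created xsi:type="dcterms:W3CDTF">2020-01-02T03:04:05Z</dcterms:created>
</cp:coreProperties>"""


def read_core_properties(core_xml):
    """Return title, creator, and creation date (None when absent)."""
    root = ET.fromstring(core_xml)
    wanted = {
        "title": "dc:title",
        "creator": "dc:creator",
        "created": "dcterms:created",
    }
    result = {}
    for key, path in wanted.items():
        element = root.find(path, NS)
        result[key] = element.text if element is not None else None
    return result


if __name__ == "__main__":
    print(read_core_properties(SAMPLE_CORE))
```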