WordprocessingML

A WordprocessingML document is composed of a collection of stories (§3:2.1). Each story is one of the following: the main document (§3:2.2), the glossary document (§3:2.13), a subdocument (§3:2.18.2), a header (§3:2.11.1), a footer (§3:2.11.2), a comment (§3:2.14.5), a frame, a text box (§3:2.18.1), a footnote (§3:2.12.1), or an endnote (§3:2.12.2).

The only required story is the main document. It is the target of the package relationship whose type is:

http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument
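
As an illustration of how a consumer might locate the main document, the following minimal sketch reads the package-level relationships part and returns the target of the relationship with the type shown above. It assumes the package is a ZIP archive whose root relationships part is stored at `_rels/.rels`. The function names, the `rId1` identifier, and the `word/document.xml` target are illustrative choices, not requirements stated in this subsection, and the sample archive is deliberately incomplete (it holds only the relationships part).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the package relationships part (assumed location: _rels/.rels).
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type that identifies the main document, as quoted in the text.
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/officeDocument"
)


def main_document_target(package_bytes):
    """Return the Target of the officeDocument relationship, or None."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        rels = ET.fromstring(pkg.read("_rels/.rels"))
    for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOCUMENT:
            return rel.get("Target")
    return None


def build_sample_package():
    """Build an in-memory archive holding only a relationships part.

    Hypothetical sample data: a real package contains further parts.
    """
    rels_xml = (
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" '
        'Target="word/document.xml"/>'
        "</Relationships>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("_rels/.rels", rels_xml)
    return buf.getvalue()


if __name__ == "__main__":
    print(main_document_target(build_sample_package()))
```

Looking the main document up by relationship type, rather than by a fixed part name, keeps the sketch independent of where a producer chose to store the part.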

A typical path from root to leaf in the XML tree would comprise these XML elements (§3:2.2):

  • document – the root element of the main document (§3:2.3).

  • body – body (§3:2.7.1). Can contain multiple paragraphs. Can also contain section properties specified in a sectPr element.

  • p – paragraph (§3:2.4.1). Can contain one or more runs. Can also contain paragraph properties specified in a pPr element, which in turn can contain default run properties (also referred to as character properties) specified in an rPr element (§3:2.4.4).

  • r – run (§3:2.4.2). Can contain multiple types of run content, primarily text ranges. Can also contain run properties (rPr). The run is a fundamental concept within OpenXML. A run is a contiguous piece of text with identical properties; a run contains no additional text markup. For example, if a sentence were to contain the words “this is three runs”, with only the word “three” formatted differently from the rest, then it would be represented by at least three runs: “this is ”, “three”, and “ runs”. In this respect, OpenXML differs significantly from formats that allow for arbitrary nesting of properties, such as HTML.

  • t – text range (§3:2.4.3.1). Contains an arbitrary amount of text with no formatting, line breaks, tables, graphics, or other non-text material. The formatting for the text is inherited from the run properties and the paragraph properties. This element often uses the xml:space="preserve" attribute.
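
The element path listed above can be sketched in code. The following example builds a document → body → p → r → t tree for the sentence “this is three runs” and then reads the text back run by run. It assumes the WordprocessingML main namespace and, purely for illustration, marks the middle run as bold with a `b` element inside `rPr`. The helper names (`w`, `add_run`, `build_document`, `run_texts`) are hypothetical, and the tree is a fragment rather than a complete package.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (assumed for this sketch).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Clark-notation name of the xml:space attribute.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def w(tag):
    """Qualify a local element name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def add_run(paragraph, text, bold=False):
    """Append a run (r) holding one text range (t) to a paragraph (p)."""
    run = ET.SubElement(paragraph, w("r"))
    if bold:
        # Run properties come first inside the run; bold is an illustrative choice.
        rpr = ET.SubElement(run, w("rPr"))
        ET.SubElement(rpr, w("b"))
    t = ET.SubElement(run, w("t"))
    t.text = text
    if text != text.strip():
        # Keep leading or trailing spaces significant.
        t.set(XML_SPACE, "preserve")
    return run


def build_document():
    """Build document > body > p with three runs."""
    document = ET.Element(w("document"))
    body = ET.SubElement(document, w("body"))
    p = ET.SubElement(body, w("p"))
    add_run(p, "this is ")
    add_run(p, "three", bold=True)
    add_run(p, " runs")
    return document


def run_texts(document):
    """Return the text of each run, in document order."""
    return [
        "".join(t.text or "" for t in r.findall(w("t")))
        for r in document.iter(w("r"))
    ]


if __name__ == "__main__":
    doc = build_document()
    print(run_texts(doc))
    print(ET.tostring(doc, encoding="unicode"))
```

Because properties are not nested, the formatted word needs its own run; the text before and after it falls into separate runs that carry no `rPr` of their own.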

In this subsection, we have touched upon direct formatting of text by specifying paragraph and run properties. Direct formatting falls at the end of an order of application that also includes character, paragraph, numbering, and table styles, as well as document defaults (§3:2.8.10). Those styles are themselves organized into inheritance hierarchies (§3:2.8.9).
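
The idea of an order of application can be illustrated with a simplified sketch in which each layer is a plain dictionary of properties and later layers override earlier ones. Only the position of direct formatting at the end is taken from the text above; the relative order of the other layers shown here is an assumption, as are the layer names, the property names, and the sample values. The sketch also ignores the inheritance hierarchies among styles.

```python
# Assumed order of application, earliest first. Only the fact that direct
# formatting comes last is stated in the text; the rest is an assumption.
APPLICATION_ORDER = [
    "document_defaults",
    "table_style",
    "numbering_style",
    "paragraph_style",
    "character_style",
    "direct_formatting",
]


def resolve_properties(layers):
    """Merge property layers so that later layers override earlier ones.

    `layers` maps a layer name to a dict of properties; missing layers
    are treated as empty.
    """
    effective = {}
    for name in APPLICATION_ORDER:
        effective.update(layers.get(name, {}))
    return effective


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to show the override behavior.
    layers = {
        "document_defaults": {"font": "Times New Roman", "size": 20},
        "paragraph_style": {"size": 24},
        "direct_formatting": {"bold": True},
    }
    print(resolve_properties(layers))
```

In this simplified model, a property set by direct formatting always wins, while a property that no later layer mentions falls through from the document defaults.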

The subsection “Minimal WordprocessingML Document” below lists a WordprocessingML document in full.