XProc 3.0 Steps: An Introduction

Editor's Draft

This Version:
https://xproc.github.io/3.0-specification/master/head/steps-intro/
Latest Version:
http://spec.xproc.org/master/head/steps-intro/
Editors:
Achim Berndzen
Gerrit Imsieke
Erik Siegel
Norman Walsh
Repository:
This specification on GitHub
Report an issue
Changes:
Commits for this specification

This document is also available in these non-normative formats: XML.


Abstract

This specification describes general features of the step vocabulary of XProc 3.0: An XML Pipeline Language.

Status of this Document

This document is an editor's draft that has no official standing.

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is derived from XProc: An XML Pipeline Language published by the W3C. See also Appendix D, Credits.


1 Introduction

Many atomic steps are available for [XProc 3.0]. They are described in several specifications. A conformant processor must implement all of the steps in [Steps 3.0]. Additional steps may also be implemented.

This specification describes the general background common to all steps.

The types given for options should be understood as follows:

  • Types in the XML Schema namespace, identified as QNames with the xs: prefix, are interpreted as per the XML Schema specification, with one exception: anywhere an xs:QName is specified, an EQName is allowed.

  • XPathExpression: A string per [W3C XML Schema: Part 2], including whitespace normalization, with the further requirement that it be a conformant Expression per [XPath 3.1].

  • XSLTMatchPattern: An XSLT pattern.

  • XPathSequenceType: An XPath sequence type.

  • ContentType: A media type as defined in [Media Types].

  • ContentTypes: A whitespace-separated list of media types as defined in [Media Types].
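
As a non-normative illustration of these types, the following fragment sketches a call to p:add-attribute from [Steps 3.0]. It assumes that the step's match option is an XSLTMatchPattern and that its attribute-name option is an xs:QName, so that an EQName is accepted. The namespace URI and attribute value are hypothetical, and the fragment assumes that the p prefix is bound to the XProc namespace.

```xml
<!-- Hypothetical fragment: adds a namespaced "status" attribute to the document element. -->
<!-- attribute-name is written as an EQName (Q{uri}local). Because option shortcuts
     are value templates, the literal braces are doubled to escape them. -->
<p:add-attribute match="/*"
                 attribute-name="Q{{http://example.com/ns}}status"
                 attribute-value="draft"/>
```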

Option values are often expressed using the shortcut syntax. In these cases, the option shortcuts are generally treated as value templates. However, for options of type map() or array(), an expression is required, because no non-expression string can ever be a legal value for a map or array. If such shortcuts were treated as value templates, every value would have to be written as a value template and every curly brace within the expression would have to be escaped. To avoid this, shortcut values of type map or array are defined to be expressions directly.
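
The difference can be sketched as follows. This non-normative fragment assumes that attribute-value on p:add-attribute is a string option and that headers on p:http-request is declared as a map. The variable $label and the request URI are hypothetical.

```xml
<!-- String option: the shortcut is a value template, so the braces
     delimit an XPath expression ($label is assumed to be in scope). -->
<p:add-attribute match="/*" attribute-name="class" attribute-value="{$label}"/>

<!-- Map option: the shortcut is an expression, so the braces of the
     map constructor are written directly, without escaping. -->
<p:http-request href="https://example.com/api"
                headers="map{'Accept': 'application/json'}">
  <p:with-input port="source"><p:empty/></p:with-input>
</p:http-request>
```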

Some aspects of documents are generally unchanged by steps:

  • When a step in this library produces an output document, the base URI of the output is the base URI of the step's primary input document unless the step's process explicitly sets an xml:base attribute or the step's description explicitly states how the base URI is constructed.

  • Each step describes how it modifies the document properties of the documents that flow through it.

    A great many steps indicate that they preserve some or all of the properties of the input document. It should be noted that in some cases the transformation performed by the step will violate the condition associated with some property. In general, the steps cannot know this and the pipeline author is responsible for managing the properties with greater care in this case.

Also, in this specification, several steps use this element for result information:

<c:result>
    string
</c:result>
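
For instance, assuming a step such as p:count from [Steps 3.0] reports the number of documents it received in this element, a sequence of three input documents might yield the following result (the value shown is illustrative):

```xml
<c:result xmlns:c="http://www.w3.org/ns/xproc-step">3</c:result>
```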

When a step uses an XPath expression to compute an option value, the XPath context is as defined in [XProc 3.0].

When a step specifies a particular version of a technology, implementations must implement that version or a subsequent version that is backwards compatible with that version. At user option, they may implement other, non-backwards-compatible versions.

1.1 Common options

All XProc steps must accept the option p:timeout. As a debugging tool, this option can be used to tell the XProc processor that a step should be terminated if its execution takes more time than expected. The value of the p:timeout option must be an xs:positiveInteger. It is interpreted as the maximum number of seconds the step is expected to run. The measurement of time starts after all bindings for input ports and all options are evaluated. If the XProc processor detects that the step has exceeded this time, it should terminate the execution of the step as soon as possible, do cleanup to ensure proper execution of other steps in the pipeline, and then raise a dynamic error.

It is a dynamic error (err:XD0053) if a step runs longer than its timeout value. It is implementation-defined whether a processor supports timeouts, and if it does, how precisely, and by what method, the execution time of a step is measured.

Note

Since the exact time a step takes to perform its task depends on the computer used, the XProc processor's execution strategy, the system load, etc., this feature cannot be used as an exact timing tool in XProc. Execution times may vary between executions of the same pipeline on the same computer with the same XProc processor due to execution contingencies. Developers are advised to choose the value for p:timeout generously, so that the dynamic error is raised only in extreme cases.

Note

The name of this option is p:timeout even on steps in the XProc namespace. This limits the extent to which the option restricts the space of available option names for steps. (If the option name wasn’t in a namespace, then no step could have an option named “timeout”, which seems unreasonable.)
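
As a non-normative sketch of how the option might be supplied, the fragment below sets p:timeout on a p:xslt step through p:with-option. It assumes that the option can be set like any other option by its qualified name. The stylesheet URI and the 120-second limit are hypothetical, and a processor that does not support timeouts may ignore the value.

```xml
<!-- Hypothetical fragment: ask the processor to stop the transformation
     if it runs for more than 120 seconds. -->
<p:xslt>
  <p:with-input port="stylesheet" href="style.xsl"/>
  <!-- The option name is in the XProc namespace, even on an XProc step. -->
  <p:with-option name="p:timeout" select="120"/>
</p:xslt>
```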

2 Step Errors

Several of the steps in the standard step library can generate dynamic errors.

[Definition: A dynamic error is one which occurs while a pipeline is being evaluated.] Examples of dynamic errors include references to URIs that cannot be resolved, steps which fail, and pipelines that exhaust the capacity of an implementation (such as memory or disk space).

If a step fails due to a dynamic error, failure propagates upwards until either a p:try is encountered or the entire pipeline fails. In other words, outside of a p:try, step failure causes the entire pipeline to fail.
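
The propagation rule can be illustrated with a minimal, non-normative pipeline. It assumes the p:try, p:catch, p:load, and p:identity constructs as described in [XProc 3.0] and [Steps 3.0]. The document URI and the placeholder element are hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:try>
    <!-- Raises a dynamic error if the (hypothetical) URI cannot be resolved. -->
    <p:load href="http://example.com/missing.xml"/>

    <p:catch>
      <!-- The failure stops here: a placeholder document is produced
           instead of the whole pipeline failing. -->
      <p:identity>
        <p:with-input>
          <p:inline><unavailable/></p:inline>
        </p:with-input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

Without the enclosing p:try, the same failure in p:load would cause the entire pipeline to fail.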

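The following errors can be raised by steps in this specification:

  • err:XD0053: It is a dynamic error if a step runs longer than its timeout value. See Section 1.1, “Common options”.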

A Conformance

Conformant processors must implement all of the features described in this specification except those that are explicitly identified as optional.

Some aspects of processor behavior are not completely specified; those features are either implementation-dependent or implementation-defined.

[Definition: An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.]

[Definition: An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.]

A.1 Implementation-defined features

The following features are implementation-defined:

  1. It is implementation-defined whether a processor supports timeouts, and if it does, how precisely, and by what method, the execution time of a step is measured. See Section 1.1, “Common options”.

A.2 Implementation-dependent features

The following features are implementation-dependent:

B References

    [RFC 2119] Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. Network Working Group, IETF, Mar 1997.

    [XProc 3.0] XProc 3.0: An XML Pipeline Language. Achim Berndzen, Gerrit Imsieke, Erik Siegel and Norman Walsh, editors.

    [Steps 3.0] XProc 3.0: Standard Step Library. Achim Berndzen, Gerrit Imsieke, Erik Siegel and Norman Walsh, editors.

    [W3C XML Schema: Part 2] XML Schema Part 2: Datatypes Second Edition. Paul V. Biron and Ashok Malhotra, editors. World Wide Web Consortium, 28 October 2004.

    [XPath 3.1] XML Path Language (XPath) 3.1. Jonathan Robie, Michael Dyck, Josh Spiegel, editors. W3C Recommendation. 21 March 2017.

    [Media Types] RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. N. Freed, et al. Network Working Group, IETF, November 1996.

C Glossary

    dynamic error

    A dynamic error is one which occurs while a pipeline is being evaluated.

    implementation-defined

    An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.

    implementation-dependent

    An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.

D Credits

    This document is derived from XProc: An XML Pipeline Language published by the W3C. It was developed by the XML Processing Model Working Group and edited by Norman Walsh, Alex Miłowski, and Henry Thompson.

    The editors of this specification extend their gratitude to everyone who contributed to this document and all of the versions that came before it.