XProc 3.0: Standard Step Library
Editor's Draft
- This Version:
- https://xproc.github.io/3.0-steps/master/head/steps/
- Latest Version:
- http://spec.xproc.org/master/head/steps/
- Editors:
- Norman Walsh
- Achim Berndzen
- Gerrit Imsieke
- Erik Siegel
- Repository:
- This specification on GitHub
- Report an issue
- Changes:
- Diff against current “status quo” draft
- Commits for this specification
This document is also available in these non-normative formats: XML, automatic change markup from the previous draft courtesy of DeltaXML.
Copyright © 2018 @@FIXME:
Abstract
This specification describes the standard step vocabulary of XProc 3.0: An XML Pipeline Language.
Status of this Document
This document is an editor's draft that has no official standing.
This section describes the status of this document at the time of its publication. Other documents may supersede this document.
This document is derived from XProc: An XML Pipeline Language published by the W3C.
Table of Contents
- 1 Introduction
- 2 The required steps
- 2.1 p:add-attribute
- 2.2 p:add-xml-base
- 2.3 p:cast-content-type
- 2.4 p:compare
- 2.5 p:count
- 2.6 p:delete
- 2.7 p:directory-list
- 2.7.1 Directory list details
- 2.8 p:error
- 2.9 p:escape-markup
- 2.10 p:filter
- 2.11 p:hash
- 2.12 p:http-request
- 2.12.1 Specifying a request
- 2.12.2 Request Entity body conversion
- 2.12.3 Managing the response
- 2.12.4 Converting Response Entity Bodies
- 2.12.5 HTTP Request Example
- 2.13 p:identity
- 2.14 p:insert
- 2.15 p:label-elements
- 2.16 p:load
- 2.16.1 Loading XML data
- 2.16.2 Loading text data
- 2.16.3 Loading JSON data
- 2.16.4 Loading HTML data
- 2.16.5 Loading binary data
- 2.17 p:load-directory-list
- 2.18 p:make-absolute-uris
- 2.19 p:namespace-rename
- 2.20 p:pack
- 2.21 p:rename
- 2.22 p:replace
- 2.23 p:set-attributes
- 2.24 p:set-properties
- 2.25 p:sink
- 2.26 p:split-sequence
- 2.27 p:store
- 2.28 p:string-replace
- 2.29 p:tee
- 2.30 p:text-count
- 2.31 p:text-head
- 2.32 p:text-join
- 2.33 p:text-replace
- 2.34 p:text-sort
- 2.35 p:text-tail
- 2.36 p:unescape-markup
- 2.37 p:unwrap
- 2.38 p:uuid
- 2.39 p:wrap-sequence
- 2.40 p:wrap
- 2.41 p:www-form-urldecode
- 2.42 p:www-form-urlencode
- 2.43 p:xinclude
- 2.44 p:xquery
- 2.44.1 Example
- 2.44.2 Document properties
- 2.45 p:xslt
- 3 Step Errors
- A Conformance
- B References
- C Glossary
- D Ancillary files
- E Credits
1 Introduction
This specification describes the standard, required atomic XProc steps. A machine-readable description of these steps may be found in steps.xpl.
Many atomic steps are available for [XProc 3.0]. They are described in several specifications. This specification describes the general background common to all steps. A conformant processor must implement all of the steps in this specification. Additional steps may also be implemented.
The types given for options should be understood as follows:
- Types in the XML Schema namespace, identified as QNames with the xs: prefix: as per the XML Schema specification, with one exception. Anywhere an xs:QName is specified, an EQName is allowed.
- XPathExpression: As a string per [W3C XML Schema: Part 2], including whitespace normalization, and the further requirement to be a conformant Expression per [XPath 3.1].
- XSLTSelectionPattern: An XSLT pattern.
- XPathSequenceType: An XPath sequence type.
- ContentType: A media type as defined in [RFC 2046].
- ContentTypes: A whitespace-separated list of media types as defined in [RFC 2046].
Option values are often expressed using the shortcut syntax. In these cases, the option shortcuts are generally treated as value templates. However, for options of type map() or array(), an expression is required (there is no non-expression string which can ever be a legal value for a map or array). If such shortcuts were treated as value templates, every value would have to be written as a value template and every curly brace inside the expression would have to be escaped. Values of type map or array are therefore defined to be expressions directly.
Some aspects of documents are generally unchanged by steps:
- When a step in this library produces an output document, the base URI of the output is the base URI of the step's primary input document unless the step's process explicitly sets an xml:base attribute or the step's description explicitly states how the base URI is constructed.
- Each step describes how it modifies the document properties of the documents that flow through it.
A great many steps indicate that they preserve some or all of the properties of the input document. It should be noted that in some cases the transformation performed by the step will violate the condition associated with some property. In general, the steps cannot know this and the pipeline author is responsible for managing the properties with greater care in this case.
Also, in this specification, several steps use this element for result information:
<c:result>
string
</c:result>
When a step uses an XPath to compute an option value, the XPath context is as defined in [XProc 3.0].
When a step specifies a particular version of a technology, implementations must implement that version or a subsequent version that is backwards compatible with that version. At user-option, they may implement other non-backwards compatible versions.
In this specification the words must, must not, should, should not, may and recommended are to be interpreted as described in [RFC 2119].
2 The required steps
A conformant processor must support all of these steps.
2.1 p:add-attribute
The p:add-attribute step adds a single attribute to a set of matching elements. The input document specified on the source port is processed for matches specified by the match pattern in the match option. For each of these matches, the attribute whose name is specified by the attribute-name option is set to the attribute value specified by the attribute-value option.
The resulting document is produced on the result output port and consists of an exact copy of the input with the exception of the matched elements. Each of the matched elements is copied to the output with the addition of the specified attribute with the specified value.
<p:declare-step type="p:add-attribute">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" as="xs:string" select="'/*'"/> <!-- XSLTSelectionPattern -->
  <p:option name="attribute-name" required="true" as="xs:QName"/>
  <p:option name="attribute-value" required="true" as="xs:string"/>
</p:declare-step>
The value of the match option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023) if the match pattern matches a node which is not an element.
The value of the attribute-value option must be a legal attribute value according to XML.
If an attribute with the same name as the expanded name from the attribute-name option exists on the matched element, the value specified in the attribute-value option is used to set the value of that existing attribute. That is, the value of the existing attribute is changed to the attribute-value value.
Note
If multiple attributes need to be set on the same element(s), the p:set-attributes step can be used to set them all at once.
This step cannot be used to add namespace declarations. It is a dynamic error (err:XC0059) if the QName value in the attribute-name option uses the prefix “xmlns” or any other prefix that resolves to the namespace name http://www.w3.org/2000/xmlns/. Note, however, that while namespace declarations cannot be added explicitly by this step, adding an attribute whose name is in a namespace for which there is no namespace declaration in scope on the matched element may result in a namespace binding being added by namespace fixup.
If an attribute named xml:base is added or changed, the base URI of the element must also be amended accordingly.
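The core behaviour (set the attribute on every matched element, overwriting an existing attribute of the same name) can be illustrated with a minimal Python sketch. It is not an XProc implementation: the function name add_attribute is hypothetical, and ElementTree's limited path syntax stands in for a real XSLTSelectionPattern.

```python
import xml.etree.ElementTree as ET


def add_attribute(root, path, name, value):
    """Set attribute `name` to `value` on every element selected by `path`.

    `path` is an ElementTree path (a small XPath subset); it is only a
    stand-in for the XSLTSelectionPattern of the match option.
    """
    for element in root.findall(path):
        # Element.set adds the attribute, or overwrites an existing one.
        element.set(name, value)
    return root


doc = ET.fromstring('<doc><para role="old"/><para/><note/></doc>')
add_attribute(doc, ".//para", "role", "new")
roles = [para.get("role") for para in doc.iter("para")]
```

Both para elements end up with role="new", including the one that previously carried role="old"; the unmatched note element is left untouched.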
Document properties
All document properties are preserved.
2.2 p:add-xml-base
The p:add-xml-base step exposes the base URI via explicit xml:base attributes. The input document from the source port is replicated to the result port with xml:base attributes added to or corrected on each element as specified by the options on this step.
<p:declare-step type="p:add-xml-base">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="all" as="xs:boolean" select="false()"/>
  <p:option name="relative" as="xs:boolean" select="true()"/>
</p:declare-step>
The value of the all option must be a boolean.
The value of the relative option must be a boolean.
It is a dynamic error (err:XC0058) if the all and relative options are both true.
The p:add-xml-base step modifies its input as follows:
- For the document element: force the element to have an xml:base attribute with the document's [base URI] property's value as its value.
- For other elements:
  - If the all option has the value true, force the element to have an xml:base attribute with the element's [base URI] value as its value.
  - If the element's [base URI] is different from its parent's [base URI], force the element to have an xml:base attribute with the following value: if the value of the relative option is true, a string which, when resolved against the parent's [base URI], will give the element's [base URI], otherwise the element's [base URI].
  - Otherwise, if there is an xml:base attribute present, remove it.
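The rules above amount to a small decision procedure per element. The following Python sketch is one possible reading, not a normative algorithm. The function names are hypothetical, and relativize is a deliberately simple helper that assumes hierarchical URIs with absolute paths and no query or fragment.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def relativize(parent_base, element_base):
    """Return a reference that resolves against parent_base to element_base.

    Simplifying assumptions: hierarchical URIs, absolute paths, no query or
    fragment. Falls back to the absolute URI if scheme or authority differ.
    """
    parent, element = urlsplit(parent_base), urlsplit(element_base)
    if (parent.scheme, parent.netloc) != (element.scheme, element.netloc):
        return element_base
    parent_dir = posixpath.dirname(parent.path) or "/"
    rel = posixpath.relpath(element.path, start=parent_dir)
    # relpath drops a trailing slash; restore it for directory-like URIs.
    return rel + "/" if element.path.endswith("/") else rel


def xml_base_value(element_base, parent_base=None, all_elements=False, relative=True):
    """Return the xml:base value for an element, or None to remove the attribute.

    parent_base is None for the document element.
    """
    if all_elements and relative:
        raise ValueError("err:XC0058: all and relative are both true")
    if parent_base is None or all_elements:
        return element_base
    if element_base != parent_base:
        return relativize(parent_base, element_base) if relative else element_base
    return None


book = "http://example.org/docs/book.xml"
chapter = "http://example.org/docs/ch/one.xml"
chapter_ref = xml_base_value(chapter, book)
```

Here chapter_ref is a relative reference, and resolving it against the parent's base URI with urljoin gives back the chapter's base URI, which is the property the relative option requires.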
Document properties
All document properties are preserved.
2.3 p:cast-content-type
The p:cast-content-type step changes the media type of its input.
<p:declare-step type="p:cast-content-type">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result" content-types="*/*"/>
  <p:option name="content-type" required="true" as="xs:string"/>
  <p:option name="parameters" as="map(xs:QName,item()*)?"/>
</p:declare-step>
The input document is transformed from one media type to another. It is a dynamic error (err:XC0070) if the supplied content-type is not a valid media type of the form “type/subtype+ext”. It is a dynamic error (err:XC0071) if the p:cast-content-type step cannot perform the requested cast.
The parameters option can be used to supply parameters to control casting. The semantics of the keys and the allowed values for these keys are implementation-defined. It is a dynamic error (err:XC0079) if the map parameters contains an entry whose key is defined by the implementation and whose value is not valid for that key.
- Casting from one XML media type to another simply changes the “content-type” document property.
- Casting from an HTML media type to an XML media type, if the input document is already an XPath data model document, simply changes the “content-type” document property. Casting an HTML document that isn’t already an XPath data model document into XML is implementation-defined.
- Casting from an XML media type to an HTML media type simply changes the “content-type” document property.
- Casting from a JSON media type to an XML media type converts the JSON into XML. The precise nature of the conversion from JSON to XML is implementation-defined. If the input document is a text node or other string representation, implementations should use fn:json-to-xml by default.
- Casting from an XML media type to a JSON media type converts the XML into JSON. The precise nature of the conversion from XML to JSON is implementation-defined. If the input document is an XML representation of JSON as defined in [XPath and XQuery Functions and Operators 3.1], implementations must use fn:xml-to-json by default. If the input document has a c:param-set document element, a map must be returned that represents the document's c:param elements.
- Casting from a non-XML media type to an XML media type produces an XML document with a c:data document element. The original media type will be preserved in the content-type attribute on the c:data element.
  <c:data
    content-type = ContentType
    charset? = string
    encoding? = string>
    string
  </c:data>
  The content of the c:data element is the base64 encoded representation of the non-XML content.
- Casting from an XML media type to a non-XML media type must support the case where the input document is a c:data document. The resulting document will have the specified media type and a representation that is the content of the c:data element after decoding the base64 encoded content.
  - It is a dynamic error (err:XC0072) if the content of the c:data element is not a valid base64 string.
  - It is a dynamic error (err:XC0073) if the c:data element does not have a content-type attribute.
  - It is a dynamic error (err:XC0074) if the content-type is supplied and is not the same as the content-type specified on the c:data element.
  Casting from an XML media type to a non-XML media type when the input document is not a c:data document is implementation-defined.
- Casting from one non-XML media type to another non-XML media type is implementation-defined.
In all cases except when the input document is a c:data element, it is a dynamic error (err:XC0075) if the content-type is not supplied.
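The c:data round trip can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, errors are modelled as ValueError carrying the error code in the message, and ignoring whitespace inside the encoded content is an assumption rather than something this specification states.

```python
import base64
import binascii
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def wrap_as_c_data(payload, content_type):
    """Wrap non-XML bytes in a c:data element, recording the original media type."""
    data = ET.Element(f"{{{C_NS}}}data", {"content-type": content_type})
    data.text = base64.b64encode(payload).decode("ascii")
    return data


def unwrap_c_data(data, requested_type=None):
    """Decode a c:data element back to (media type, bytes)."""
    declared = data.get("content-type")
    if declared is None:
        raise ValueError("err:XC0073: c:data has no content-type attribute")
    if requested_type is not None and requested_type != declared:
        raise ValueError("err:XC0074: content-type differs from the one on c:data")
    # Assumption: whitespace inside the encoded content is ignored.
    text = "".join((data.text or "").split())
    try:
        payload = base64.b64decode(text, validate=True)
    except binascii.Error:
        raise ValueError("err:XC0072: c:data content is not valid base64") from None
    return declared, payload


wrapped = wrap_as_c_data(b"\x89PNG\r\n", "image/png")
media_type, payload = unwrap_c_data(wrapped, "image/png")
```

Casting to XML and back returns the original bytes and media type; a mismatched content-type, a missing content-type attribute, or undecodable content raises the corresponding error.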
Document properties
All document properties are preserved except the content-type
property which is updated accordingly.
2.4 p:compare
The p:compare step compares two documents for equality.
<p:declare-step type="p:compare">
  <p:input port="source" primary="true" content-types="*/*"/>
  <p:input port="alternate" content-types="*/*"/>
  <p:output port="result" content-types="application/xml"/>
  <p:output port="differences" content-types="*/*" sequence="true"/>
  <p:option name="parameters" as="map(xs:QName,item()*)?"/>
  <p:option name="method" as="xs:QName?"/>
  <p:option name="fail-if-not-equal" as="xs:boolean" select="false()"/>
</p:declare-step>
This step takes single documents on each of two ports and compares them. If method is not specified, or if deep-equal is specified, the comparison uses fn:deep-equal (as defined in [XPath and XQuery Functions and Operators 3.1]). Implementations of p:compare must support the deep-equal method; other supported methods are implementation-defined. It is a dynamic error (err:XC0076) if the comparison method specified in p:compare is not supported by the implementation. It is a dynamic error (err:XC0077) if the media types of the documents supplied are incompatible with the comparison method.
It is a dynamic error (err:XC0019) if the documents are not equal according to the specified comparison method, and the value of the fail-if-not-equal option is true. If the documents are equal, or if the value of the fail-if-not-equal option is false, a c:result document is produced with contents true if the documents are equal, otherwise false.
If fail-if-not-equal is false, and the documents differ, an implementation-defined summary of the differences between the two documents may appear on the differences port.
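The interplay between the comparison result and fail-if-not-equal can be summarised in a short Python sketch. The function name compare is hypothetical, Python's == operator is only a stand-in for fn:deep-equal, and the c:result document is shown as a plain string without its namespace declaration.

```python
def compare(source, alternate, fail_if_not_equal=False, equal=None):
    """Return the text of the c:result document, or raise for err:XC0019.

    `equal` is the comparison method; Python's == is a stand-in for
    fn:deep-equal here.
    """
    same = (equal or (lambda a, b: a == b))(source, alternate)
    if not same and fail_if_not_equal:
        raise ValueError("err:XC0019: documents are not equal")
    return "<c:result>true</c:result>" if same else "<c:result>false</c:result>"


equal_result = compare("<a/>", "<a/>")
differing_result = compare("<a/>", "<b/>")
```

With fail-if-not-equal left at its default of false, differing documents yield a false result instead of an error.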
Document properties
No document properties are preserved.
2.5 p:count
The p:count step counts the number of documents in the source input sequence and returns a single document on result containing that number. The generated document contains a single c:result element whose content is the string representation of the number of documents in the sequence.
<p:declare-step type="p:count">
  <p:input port="source" content-types="*/*" sequence="true"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="limit" as="xs:integer" select="0"/>
</p:declare-step>
If the limit option is specified and is greater than zero, the p:count step will count at most that many documents. This provides a convenient mechanism to discover, for example, if a sequence consists of more than 1 document, without requiring every single document to be buffered before processing can continue.
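The benefit of limit is easiest to see with a lazily produced sequence. In the Python sketch below (the names count_documents and documents are hypothetical), a generator records how many documents were actually pulled, and itertools.islice stops consuming once the limit is reached.

```python
from itertools import islice


def count_documents(documents, limit=0):
    """Count documents, reading at most `limit` of them when limit > 0."""
    if limit > 0:
        documents = islice(documents, limit)
    return sum(1 for _ in documents)


consumed = []


def documents():
    """Lazily yield 1000 placeholder documents, recording each one produced."""
    for number in range(1000):
        consumed.append(number)
        yield f"<doc n='{number}'/>"


# A result of 2 with limit=2 is enough to know there is more than 1 document.
limited_count = count_documents(documents(), limit=2)
```

Only two documents are ever produced in this run, even though the sequence could have supplied a thousand.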
Document properties
No document properties are preserved.
2.6 p:delete
The p:delete step deletes items specified by a match pattern from the source input document and produces the resulting document, with the deleted items removed, on the result port.
<p:declare-step type="p:delete">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTSelectionPattern -->
</p:declare-step>
The value of the match option must be an XSLTSelectionPattern. A match pattern may match multiple items to be deleted.
If an element is selected by the match option, the entire subtree rooted at that element is deleted.
This step cannot be used to remove namespaces. It is a dynamic error (err:XC0062) if the match option matches a namespace node. Also, note that deleting an attribute named xml:base does not change the base URI of the element on which it occurred.
Document properties
All document properties are preserved.
2.7 p:directory-list
The p:directory-list step produces a list of the contents of a specified directory.
<p:declare-step type="p:directory-list">
  <p:output port="result" content-types="application/xml"/>
  <p:option name="path" required="true" as="xs:anyURI"/>
  <p:option name="detailed" as="xs:boolean" select="false()"/>
  <p:option name="include-filter" as="xs:string"/> <!-- RegularExpression -->
  <p:option name="exclude-filter" as="xs:string"/> <!-- RegularExpression -->
</p:declare-step>
Conformant processors must support directory paths whose scheme is file. It is implementation-defined what other schemes are supported by p:directory-list, and what the interpretation of 'directory', 'file' and 'contents' is for those schemes.
It is a dynamic error (err:XC0017) if the absolute path does not identify a directory. It is a dynamic error (err:XC0012) if the contents of the directory path are not available to the step due to access restrictions in the environment in which the pipeline is run.
If the detailed option is true, the pipeline author is requesting additional information about the matching entries; see Section 2.7.1, “Directory list details”.
If present, the value of the include-filter or exclude-filter option must be a whitespace-separated list of regular expressions as specified in [XPath and XQuery Functions and Operators 3.1], section 5.6.1 “Regular expression syntax”.
If any include-filter pattern matches a directory entry's name, the entry is included in the output. If any exclude-filter pattern matches a directory entry's name, the entry is excluded from the output. If both options are provided, the include filter is processed first, then the exclude filter. As a result, an item is included if it matches (at least) one of the include-filter values and none of the exclude-filter values.
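The combined effect of the two filters can be sketched in Python. This is an approximation under stated assumptions: the function name is_listed is hypothetical, Python's re module is not the XPath regular expression dialect, and unanchored matching with re.search is assumed because the text above does not say whether a pattern must match the whole name.

```python
import re


def is_listed(name, include_filter=None, exclude_filter=None):
    """Decide whether a directory entry name appears in the output.

    Each filter is a whitespace-separated list of regular expressions.
    """
    includes = include_filter.split() if include_filter else []
    excludes = exclude_filter.split() if exclude_filter else []
    # Include filter first: if present, at least one pattern must match.
    if includes and not any(re.search(pattern, name) for pattern in includes):
        return False
    # Then the exclude filter: no pattern may match.
    if any(re.search(pattern, name) for pattern in excludes):
        return False
    return True


names = ["a.xml", "b.txt", "a.bak.xml", "style.xsl"]
listed = [n for n in names if is_listed(n, r"\.xml$ \.xsl$", r"\.bak\.")]
```

Here a.bak.xml matches an include pattern but is still dropped because it also matches the exclude pattern.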
Note
There is no way to specify a list of values using attribute value templates. If the option shortcut syntax is used to provide the include-filter or exclude-filter option, it will consist of a single regular expression. To specify a list of regular expressions, you must use the p:with-option syntax.
The result document produced for the specified directory path has a c:directory document element whose base URI is the directory path and whose name attribute is the last segment of the directory path (that is, the directory's (local) name).
<c:directory
name = string>
(c:file |
c:directory |
c:other)*
</c:directory>
Its contents are determined as follows, based on the entries in the directory identified by the directory path. For each entry in the directory, if either no filter was specified, or the (local) name of the entry matches the filter pattern, a c:file, a c:directory, or a c:other element is produced, as follows:
- A c:directory is produced for each subdirectory not determined to be special.
- A c:file is produced for each file not determined to be special.
  <c:file
    name = string
    content-types? = ContentTypes />
- Any file or directory determined to be special by the p:directory-list step may be output using a c:other element but the criteria for marking a file as special are implementation-defined.
  <c:other
    name = string />
When a directory entry is a subdirectory, that directory's entries are not output as part of that entry's c:directory. A user must apply this step again to the subdirectory to list subdirectory contents.
Each of the elements c:file, c:directory, and c:other has a name attribute when it appears within the top-level c:directory element, whose value is a relative IRI reference, giving the (local) file or directory name.
2.7.1 Directory list details
If detailed is false, then only the name attribute is expected on c:file, c:directory, or c:other elements. Any other attributes on c:file, c:directory, or c:other are implementation-defined.
If detailed is true, then the pipeline author is expecting additional details about each entry. The following attributes should be provided by the implementation:
- readable: “true” if the entry is readable.
- writable: “true” if the entry is writable.
- hidden: “true” if the entry is hidden.
- last-modified: The last modification time of the entry, expressed as a lexical xs:dateTime in UTC.
- size: The size of the entry in bytes.
The precise meaning of these properties is implementation-defined and may vary according to the URI scheme of the path. If the value of an attribute is “false” or if it has no meaningful value, the attribute may be omitted.
Any other attributes on c:file, c:directory, or c:other are implementation-defined.
Document properties
No document properties are preserved.
2.8 p:error
The p:error step generates a dynamic error using the input provided to the step.
<p:declare-step type="p:error">
  <p:input port="source" sequence="true" content-types="application/xml text/xml */*+xml text/*"/>
  <p:output port="result" sequence="true" content-types="application/xml text/xml */*+xml text/*"/>
  <p:option name="code" required="true" as="xs:QName"/>
</p:declare-step>
This step uses the document provided on its input as the content of the error raised. An instance of the c:errors element will be produced on the error output port, as is always the case for dynamic errors. The error generated can be caught by a p:try just like any other dynamic error.
For authoring convenience, the p:error step is declared with a single, primary output port. With respect to connections, this port behaves like any other output port even though nothing can ever appear on it since the step always fails.
For example, given the following invocation:
<p:error xmlns:my="http://www.example.org/error"
         name="bad-document" code="my:unk12">
  <p:input port="source">
    <p:inline>
      <message>The document element is unknown.</message>
    </p:inline>
  </p:input>
</p:error>
The error vocabulary element (and document) generated on the error output port would be:
<c:errors xmlns:c="http://www.w3.org/ns/xproc-step"
          xmlns:p="http://www.w3.org/ns/xproc"
          xmlns:my="http://www.example.org/error">
  <c:error name="bad-document" type="p:error" code="my:unk12"><message>The document element is unknown.</message>
  </c:error>
</c:errors>
The href, line and column, or offset attributes might also be present on the c:error element to identify the location of the p:error element in the pipeline.
Document properties
No document properties are preserved.
2.9 p:escape-markup
The p:escape-markup step applies XML serialization to the children of the document element and replaces those children with their serialization. The outcome is a single element with text content that represents the "escaped" syntax of the children as they were serialized.
<p:declare-step type="p:escape-markup">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="serialization" as="map(xs:QName,item()*)?"/>
</p:declare-step>
This step supports the standard serialization options as specified in [Serialization]. These options control how the output markup is produced before it is escaped.
For example, the input:
<description>
  <div xmlns="http://www.w3.org/1999/xhtml">
    <p>This is a chunk of XHTML.</p>
  </div>
</description>
produces:
<description>
&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;p&gt;This is a chunk of XHTML.&lt;/p&gt;
&lt;/div&gt;
</description>
Note
The result of this step is an XML document that contains the Unicode characters that result from escaping the input. These are not encoded characters in a serialized octet stream; therefore, the serialization options related to encoding characters (byte-order-mark, encoding, and normalization-form) do not apply. They are omitted from the standard serialization options on this step.
By default, this step must not generate an XML declaration in the escaped result.
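The effect of the step can be imitated with Python's ElementTree, as a rough sketch only. The function name escape_markup is hypothetical, ElementTree's serializer stands in for the serialization method, and none of the serialization options are modelled.

```python
import xml.etree.ElementTree as ET


def escape_markup(root):
    """Replace the children of `root` with their serialization as text."""
    # tostring() includes each child's tail text, so mixed content survives.
    serialized = (root.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in root
    )
    for child in list(root):
        root.remove(child)
    root.text = serialized
    return root


doc = ET.fromstring("<description><p>Hi &amp; bye</p></description>")
escape_markup(doc)
escaped = ET.tostring(doc, encoding="unicode")
```

After the call, the document element has no element children; its text content is the markup of the former children, and serializing the result shows that markup with its angle brackets and ampersands escaped.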
Document properties
No document properties are preserved.
2.10 p:filter
The p:filter step selects portions of the source document based on a (possibly dynamically constructed) XPath select expression.
<p:declare-step type="p:filter">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" sequence="true" content-types="application/xml"/>
  <p:option name="select" required="true" as="xs:string"/> <!-- XPathExpression -->
</p:declare-step>
This step behaves just like a p:input with a select expression except that the select expression is computed dynamically.
Document properties
All document properties are preserved.
2.11 p:hash
The p:hash step generates a hash, or digital “fingerprint”, for some value and injects it into the source document.
<p:declare-step type="p:hash">
  <p:input port="source" primary="true" content-types="*/*"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="parameters" as="map(xs:QName,item()*)?"/>
  <p:option name="value" required="true" as="xs:string"/>
  <p:option name="algorithm" required="true" as="xs:QName"/>
  <p:option name="match" as="xs:string" select="'/*'"/> <!-- XSLTSelectionPattern -->
  <p:option name="version" as="xs:string?"/>
</p:declare-step>
The value of the algorithm option must be a QName. If it does not have a prefix, then it must be one of the following values: “crc”, “md”, or “sha”.
If a version is not specified, the default version is algorithm-defined. For “crc” it is 32, for “md” it is 5, for “sha” it is 1.
A hash is constructed from the string specified in the value option using the specified algorithm and version. Implementations must support [CRC32], [MD5], and [SHA1] hashes. It is implementation-defined what other algorithms are supported. The resulting hash should be returned as a string of hexadecimal characters.
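The three required algorithm and version combinations map directly onto Python's standard library, as the following sketch shows. The function name compute_hash is hypothetical, and encoding the value as UTF-8 before hashing is an assumption; the text above does not say how the string is converted to bytes.

```python
import hashlib
import zlib

# Default version per unprefixed algorithm name, as listed above.
DEFAULT_VERSION = {"crc": "32", "md": "5", "sha": "1"}


def compute_hash(value, algorithm, version=None):
    """Return the hash of `value` as a string of hexadecimal characters."""
    version = version or DEFAULT_VERSION.get(algorithm)
    data = value.encode("utf-8")  # assumption: hash the UTF-8 bytes
    if (algorithm, version) == ("crc", "32"):
        return format(zlib.crc32(data), "08x")
    if (algorithm, version) == ("md", "5"):
        return hashlib.md5(data).hexdigest()
    if (algorithm, version) == ("sha", "1"):
        return hashlib.sha1(data).hexdigest()
    raise ValueError(f"err:XC0036: unsupported hash {algorithm} version {version}")


md5_of_hello = compute_hash("hello", "md")
sha1_of_hello = compute_hash("hello", "sha", "1")
crc_of_hello = compute_hash("hello", "crc")
```

Omitting the version selects the default for the algorithm, so compute_hash("hello", "md") and compute_hash("hello", "md", "5") give the same result; any other combination is rejected in this sketch, whereas a real processor may support more.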
The value of the match option must be an XSLTSelectionPattern.
The hash of the specified value is computed using the algorithm and parameters specified. It is a dynamic error (err:XC0036) if the requested hash algorithm is not one that the processor understands or if the value or parameters are not appropriate for that algorithm.
The matched nodes are specified with the match pattern in the match option. For each matching node, the string value of the computed hash is used in the output (if more than one node matches, the same hash value is used in each match). Nodes that do not match are copied without change.
If the expression given in the match option matches an attribute, the hash is used as the new value of the attribute in the output. If the attribute is named “xml:base”, the base URI of the element must also be amended accordingly.
If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the hash.
Document properties
All document properties are preserved.
2.12 p:http-request
The p:http-request step provides for interaction with resources over HTTP or related protocols. The input document provided on the source port specifies a request by a single c:request element. This element specifies the method, resource, and other request properties as well as possibly including an entity body (content) for the request.
<p:declare-step type="p:http-request">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result" sequence="true" content-types="*/*"/>
  <p:option name="serialization" as="map(xs:QName,item()*)?"/>
</p:declare-step>
The serialization option is provided to control the serialization of any content which is sent as part of the request. The effect of these options is as specified in [XProc 3.0]. See Section 2.12.2, “Request Entity body conversion” for a discussion of when serialization occurs in constructing a request.
It is a dynamic error (err:XC0040) if the document element of the document that arrives on the source port is not c:request.
Editorial Note
Can the input document be JSON?
2.12.1 Specifying a request
An HTTP request is represented by a c:request element.
<c:request
method = NCName
href? = anyURI
detailed? = boolean
status-only? = boolean
username? = string
password? = string
auth-method? = string
send-authorization? = boolean
override-content-type? = ContentType
timeout? = positiveInteger
fail-on-timeout? = boolean>
(c:header*,
(c:multipart |
c:body)?)
</c:request>
It is a dynamic error (err:XC0006) if the method is not specified on a c:request. It is a dynamic error (err:XC0005) if the request contains a c:body or c:multipart but the method does not allow for an entity body being sent with the request.
It is a dynamic error (err:XC0004) if the status-only attribute has the value true and the detailed attribute does not have the value true.
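Some of these static checks on the c:request element can be sketched in Python. The sketch covers only err:XC0040, err:XC0006 and err:XC0004. It leaves out err:XC0005 because this section does not list which methods allow an entity body. The function names are hypothetical, and accepting "1" as a true boolean follows the xs:boolean lexical space rather than anything stated here.

```python
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def is_true(value):
    """Interpret an optional xs:boolean attribute value; absent means false."""
    return value is not None and value.strip() in ("true", "1")


def check_request(request):
    """Validate a c:request element and return its method in upper case."""
    if request.tag != f"{{{C_NS}}}request":
        raise ValueError("err:XC0040: document element is not c:request")
    if request.get("method") is None:
        raise ValueError("err:XC0006: method is not specified")
    if is_true(request.get("status-only")) and not is_true(request.get("detailed")):
        raise ValueError("err:XC0004: status-only is true but detailed is not")
    # The method value is not case-sensitive, so normalise it.
    return request.get("method").upper()


ok = ET.fromstring(f'<c:request xmlns:c="{C_NS}" method="get" href="http://example.org/"/>')
method = check_request(ok)
```

A request with status-only="true" passes only if detailed="true" is also present.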
The method attribute specifies the method to be used against the IRI specified by the href attribute, e.g. GET or POST (the value is not case-sensitive). If the href attribute is not absolute, it will be resolved against the base URI of the element on which it occurs.
Note
In the case of simple “GET” requests, implementors are encouraged to support as many protocols as practical. In particular, pipeline authors may attempt to use p:http-request to load documents with computed URIs using the file: scheme.
If the username attribute is specified, the username, password, auth-method, and send-authorization attributes are used to handle authentication according to the selected authentication method.
For the purposes of avoiding an authentication challenge, if the send-authorization attribute has the value true and the authentication method specified by the auth-method attribute supports generation of an Authorization header without a challenge, then an Authorization header is generated and sent on the first request. If the send-authorization attribute is absent or has the value false, then the first request is sent without an Authorization header.
If the initial response to the request is an authentication challenge, the auth-method, username, password and any relevant data from the challenge are used to generate an Authorization header and the request is sent again. If that authorization fails, the request is not retried.
Appropriate values for the auth-method attribute are “Basic” or “Digest” but other values are allowed. If the authentication method is “Basic” or “Digest”, authentication is handled as per [RFC 2617]. The interpretation of auth-method values on c:request other than “Basic” or “Digest” is implementation-defined.
It is a dynamic error (err:XC0003) if a username or password is specified without specifying an auth-method, if the requested auth-method isn't supported, or if the authentication challenge contains an authentication method that isn't supported. All implementations are required to support “Basic” and “Digest” authentication per [RFC 2617].
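“Basic” is an authentication method that can produce an Authorization header without a challenge, so it illustrates the send-authorization behaviour well. The Python sketch below is illustrative only: the function names are hypothetical, comparing the auth-method value case-insensitively is an assumption, and “Digest” is treated as always requiring a challenge because it depends on data supplied by the server.

```python
import base64


def basic_authorization(username, password):
    """Build a Basic Authorization header value as described in RFC 2617."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def first_request_headers(username=None, password=None, auth_method=None,
                          send_authorization=False):
    """Return the authentication headers to send on the first request."""
    if (username is not None or password is not None) and auth_method is None:
        raise ValueError("err:XC0003: username or password without auth-method")
    preemptive = (
        username is not None
        and send_authorization
        and auth_method.lower() == "basic"  # assumption: case-insensitive
    )
    if preemptive:
        return {"Authorization": basic_authorization(username, password or "")}
    # Otherwise wait for a challenge before sending credentials.
    return {}


headers = first_request_headers("Aladdin", "open sesame", "Basic", send_authorization=True)
```

With send-authorization absent or false, the same call returns no headers and the credentials are used only after the server issues a challenge.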
The attribute timeout
controls the time the XProc processor is waiting for the request to be answered. If a value is given for timeout
it is taken as the number of seconds to wait for the response to be delivered. If no response is received after that time, the request is terminated.
It is a dynamic error (err:XC0078) if fail-on-timeout is specified as true, a value is given for timeout, and the p:http-request is not finished in the time specified by timeout.
If fail-on-timeout is false, a c:response document with status=408 is generated.
If no value is given for fail-on-timeout, false is assumed. If no value is given for timeout, fail-on-timeout is ignored.
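A minimal, non-normative sketch of a request that uses these two attributes follows. The URI and the 30-second value are assumptions made for illustration.

```xml
<!-- Wait at most 30 seconds; raise err:XC0078 instead of producing a 408 response. -->
<c:request xmlns:c="http://www.w3.org/ns/xproc-step"
           method="GET"
           href="http://example.com/slow-service"
           timeout="30"
           fail-on-timeout="true"/>
```

With fail-on-timeout="false" (or with the attribute omitted), the same request would instead yield a c:response document with status=408 when the time limit is exceeded.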
Note
Please note the difference between option p:timeout
on p:http-request
and the attribute timeout
in combination with fail-on-timeout="true"
on c:request
. If the step does not finish in the set time, D0053
is raised. If the request does not finish in time and fail-on-timeout is true, C0078
is raised. The actual times after which a timeout is detected may also differ slightly.
The c:header
element specifies a header name and value, either for inclusion in a request, or as received in a response.
<c:header
name = string
value = string />
The request is formulated from the attribute values on the c:request
element and its c:header
and c:multipart
or c:body
children, if present, and transmitted to the host (and port, if present) specified by the href
attribute. The details of how the request entity body, if any, is constructed are given in Section 2.12.2, “Request Entity body conversion”.
When the request is formulated, the step and/or protocol implementation may add headers as necessary to either complete the request or as appropriate for the content specified (e.g. transfer encodings). A user of this step is guaranteed that their requested headers and content will be sent with the exception of any conflicts with protocol-related headers.
The p:http-request
step allows users to specify independently values that are not always independent. For example, some combinations of c:header
values (e.g., Content-Type
) may be inconsistent with values that the step and/or protocol implementation must set. In a few cases, the step provides more than one mechanism to specify what is actually a single value (e.g., the boundary string in multipart messages). It is a dynamic error (err:XC0020
) if the user specifies a value or values that are inconsistent with each other or with the requirements of the step or protocol.
2.12.2 Request Entity body conversion
The c:multipart
element specifies a multi-part body, per [RFC 2046], either for inclusion in a request or as received in a response.
<c:multipart
content-type = ContentType
boundary = string>
c:body+
</c:multipart>
In the context of a request, the media type of the c:multipart
must be a multipart media type (i.e. have a main type of 'multipart'). If the content-type
attribute is not specified, a value of “multipart/mixed
” will be assumed. (Whether or not, and to what extent, “multipart/byte-ranges
” responses are supported is implementation-defined.)
The boundary
attribute is required and is used to provide a multipart boundary marker. The implementation must use this boundary marker and must prefix the value with the string “--
” when formulating the multipart message. It is a dynamic error (err:XC0002
) if the value starts with the string “--
”.
If the boundary is also specified as a parameter in the content-type
attribute, then that parameter value and the boundary
attribute value must be the same.
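The following non-normative sketch shows one way a multipart request could be written. The URI, the boundary string, and the body content are assumptions chosen for this example.

```xml
<c:request xmlns:c="http://www.w3.org/ns/xproc-step"
           method="POST"
           href="http://example.com/upload">
  <!-- The boundary value must not begin with two hyphens; the implementation adds that prefix itself. -->
  <c:multipart content-type="multipart/mixed" boundary="part-boundary-001">
    <c:body content-type="text/plain">A short description of the attached document.</c:body>
    <c:body content-type="application/xml">
      <doc>
        <title>My document</title>
      </doc>
    </c:body>
  </c:multipart>
</c:request>
```

Because the content-type attribute here carries no boundary parameter, there is no second boundary value that could conflict with the boundary attribute.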
The c:body
element holds the body or body part of the message. Each of the attributes controls some aspect of encoding the request body or decoding the body element's content when the request is formulated. These are specified as follows:
<c:body
content-type = ContentType
encoding? = string
id? = string
description? = string
disposition? = string>
anyNode*
</c:body>
The content-type
attribute specifies the media type of the body or body part, that is, the value of its Content-Type
header. If the media type is not an XML type or text, the content must already be base64-encoded.
The encoding
attribute controls the decoding of the element content for formulating the body. A value of base64
indicates the element's content is a base64 encoded string whose byte stream should be sent as the message body. An implementation may support encodings other than base64
but these encodings and their names are implementation-defined. It is a dynamic error (err:XC0052
) if the encoding specified is not supported by the implementation.
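As a non-normative sketch, a binary body could be supplied as base64 text like this. The URI and the media type are assumptions for illustration.

```xml
<c:request xmlns:c="http://www.w3.org/ns/xproc-step"
           method="POST"
           href="http://example.com/binary-endpoint">
  <!-- "SGVsbG8=" is the base64 encoding of the five ASCII bytes of "Hello". -->
  <c:body content-type="application/octet-stream" encoding="base64">SGVsbG8=</c:body>
</c:request>
```

The decoded byte stream, not the base64 text, is what would be sent as the message body.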
Note
The p:http-request
step provides only a single set of serialization options for XML media types. There's no direct support for sending a multipart message with two XML parts encoded differently.
For each body or body part, the id
attribute specifies the value of the Content-ID
header; the description
attribute specifies the value of the Content-Description
header; and the disposition
attribute specifies the value of the Content-Disposition
header.
If an entity body is to be sent as part of a request (e.g. a POST
), either a c:body
element, specifying the request entity body, or a c:multipart
element, specifying multiple entity body parts, may be used. When c:multipart
is used it may contain multiple c:body
children. A c:body
specifies the construction of a body or body part as follows:
If the content-type
attribute does not specify an XML media type, or the encoding
attribute is “base64
”, then it is a dynamic error (err:XC0028
) if the content of the c:body
element does not consist entirely of characters, and the entity body or body part will consist of exactly those characters.
Otherwise (the content-type
attribute does specify an XML media type and the encoding
attribute is not 'base64'), it is a dynamic error (err:XC0022
) if the content of the c:body
element does not consist of exactly one element, optionally preceded and/or followed by any number of processing instructions, comments or whitespace characters, and the entity body or body part will consist of the serialization of a document node containing that content. The serialization of that document is controlled by the serialization options on the p:http-request
step itself.
For example, the following input to a p:http-request
step will POST a small XML document:
<c:request xmlns:c="http://www.w3.org/ns/xproc-step"
           method="POST" href="http://example.com/someservice">
  <c:body content-type="application/xml">
    <doc>
      <title>My document</title>
    </doc>
  </c:body>
</c:request>
The corresponding request should look something like this:
POST http://example.com/someservice HTTP/1.1
Host: example.com
Content-Type: application/xml; charset="utf-8"

<?xml version='1.0'?>
<doc>
  <title>My document</title>
</doc>
2.12.3 Managing the response
Note
Where do we say that for URI schemes (such as file:
and ftp:
) where a content type is not provided by the underlying request, the content type is implementation-dependent?
The handling of the response to the request and the generation of the step's result document is controlled by the status-only
, override-content-type
and detailed
attributes on the c:request
input.
The override-content-type
attribute controls interpretation of the response's Content-Type
header. If this attribute is present, the response will be treated as if it returned the Content-Type
given by its value. The original Content-Type
header will however be reflected unchanged as a c:header
in the result document. It is a dynamic error (err:XC0030
) if the override-content-type
value cannot be used (e.g. text/plain
to override image/png
).
If the override-content-type
includes an encoding parameter, then that encoding must be used to read the document.
If the status-only
attribute has the value true
, the result document will contain only header information. The entity of the response will not be processed to produce a c:body
or c:multipart
element.
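The two fragments below are a non-normative sketch of how these attributes might appear on a request. The URIs are placeholders, and each fragment is a separate c:request document.

```xml
<!-- Headers only: no c:body or c:multipart is produced for the response entity. -->
<c:request xmlns:c="http://www.w3.org/ns/xproc-step"
           method="GET"
           href="http://example.com/report"
           detailed="true"
           status-only="true"/>

<!-- Treat the response as XML even if the server labels it differently. -->
<c:request xmlns:c="http://www.w3.org/ns/xproc-step"
           method="GET"
           href="http://example.com/feed"
           override-content-type="application/xml"/>
```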
The c:response
element represents an HTTP response. The response's status code is encoded in the status
attribute and the headers and entity body are processed into c:header
and c:multipart
or c:body
content.
<c:response
status? = integer>
(c:header*,
(c:multipart |
c:body)?)
</c:response>
The value of the detailed
attribute determines the content of the result document. If it is true
, the response to the request is handled as follows:
A single c:response element is produced with the status attribute containing the status of the response received.
Each response header is translated into a c:header element.
Unless the status-only attribute has the value true, the entity body of the response is converted into a c:body or c:multipart element via the rules given in Section 2.12.4, “Converting Response Entity Bodies”.
Otherwise (the detailed
attribute is not specified or its value is false
), the response to the request is handled as follows:
If the media type (as determined by the override-content-type attribute or the Content-Type response header) is an XML media type, the entity is decoded if necessary, then parsed as an XML document:
The parser which p:http-request employs must process the external subset; all general and external parsed entities must be fully expanded.
Editorial Note
The requirement to process the external subset comes from p:load, we probably don't want to impose that on all p:http-request calls. Need a way to control it?
It may perform xml:id processing, but it must not perform any other processing, such as expanding XIncludes.
The parser must be conformant to Namespaces in XML.
Parsing the document must not fail due to validation errors.
The resulting XML document is produced on the result output port as the entire output of the step.
Otherwise, the entity body of the response is converted into a c:body or c:multipart element via the rules given in Section 2.12.4, “Converting Response Entity Bodies”.
In either case the base URI of the output document is the resolved value of the href
attribute from the input c:request
.
2.12.3.1 Redirects
One possible response from an HTTP request is a redirect, indicated by a status code in the three-hundred range. The precise semantics of the 3xx return codes are laid out by section 10.3 Redirection 3xx in [RFC 2616].
The p:http-request
step should follow redirect requests (in a manner consistent with [RFC 2616]) if they are returned by the server.
2.12.4 Converting Response Entity Bodies
The entity of a response may be multipart per [RFC 2046]. In those situations, the result document will be a c:multipart
element that contains multiple c:body
elements inside.
Note
Although it is technically possible for any of the individual parts of a multipart message to also be multipart, XProc does not provide a standard representation for such messages. The interpretation of a multipart message inside another multipart message is implementation-dependent.
The result of the p:http-request
step is an XML document. For media types (images, binaries, etc.) that can't be represented as a sequence of Unicode characters, the response is encoded as base64
and then returned as text children of the c:body
element. If the content is base64-encoded, the encoding
attribute on c:body
must be set to “base64
”.
Editorial Note
This section hasn't been updated to reflect the fact that non-XML documents are now possible. It should probably say something like:
If the document identified has a non-XML content type, no extra processing is mandated. The number and variety of media types that an implementation can load is implementation-defined.
If the media type of the response is a text type with a charset
parameter that is a Unicode character encoding (per [Unicode TR#17]) or is recognized as a non-XML media type whose contents are encoded as a sequence of Unicode characters (e.g. it has a charset parameter or the definition of the media type is such that it requires Unicode), the content of the constructed c:body
element is the translation of the text into a sequence of Unicode characters.
If the response is an XML media type, the content of the constructed c:body
element is the result of decoding the body as necessary, then parsing it with an XML parser.
The parser which p:http-request employs must process the external subset; all general and external parsed entities must be fully expanded.
Editorial Note
The requirement to process the external subset comes from p:load, we probably don't want to impose that on all p:http-request calls. Need a way to control it?
It may perform xml:id processing, but it must not perform any other processing, such as expanding XIncludes.
The parser must be conformant to Namespaces in XML.
Parsing the document must not fail due to validation errors.
If the content is not well-formed, the step fails.
Editorial Note
This prose should be consolidated into a single place.
In a c:body
in a response, the content-type
attribute must be an exact copy of the value returned in the Content-Type
header. That is, it must reflect the content type actually returned, not any override value that may have been specified, and it must include any parameters returned by the server.
In the case of a multipart response, the same rules apply when constructing a c:body
element for each body part encountered.
Note
Given the above description, any content identified as text/html
will be encoded as (escaped) text or base64-encoded in the c:body
element, as HTML isn't always well-formed XML. A user can attempt to convert such content into XML using the p:unescape-markup
step.
2.12.5 HTTP Request Example
A simple form might be posted as follows:
<c:request method="POST" href="http://www.example.com/form-action"
           xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:body content-type="application/x-www-form-urlencoded">name=W3C&amp;spec=XProc</c:body>
</c:request>
and if the response was an XHTML document, the result document would be:
<c:response status="200" xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:header name="Date" value="Wed, 09 May 2007 23:12:24 GMT"/>
  <c:header name="Server" value="Apache/1.3.37 (Unix) PHP/4.4.5"/>
  <c:header name="Vary" value="negotiate,accept"/>
  <c:header name="TCN" value="choice"/>
  <c:header name="P3P" value="policyref='http://www.w3.org/2001/05/P3P/p3p.xml'"/>
  <c:header name="Cache-Control" value="max-age=600"/>
  <c:header name="Expires" value="Wed, 09 May 2007 23:22:24 GMT"/>
  <c:header name="Last-Modified" value="Tue, 08 May 2007 16:10:49 GMT"/>
  <c:header name="ETag" value="'4640a109;42380ddc'"/>
  <c:header name="Accept-Ranges" value="bytes"/>
  <c:header name="Keep-Alive" value="timeout=2, max=100"/>
  <c:header name="Connection" value="Keep-Alive"/>
  <c:body content-type="application/xhtml+xml">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>OK</title></head>
      <body><p>OK!</p></body>
    </html>
  </c:body>
</c:response>
Document properties
No document properties are preserved.
2.13 p:identity
The p:identity
step makes a verbatim copy of its input available on its output.
<p:declare-step type="p:identity">
  <p:input port="source" sequence="true" content-types="*/*"/>
  <p:output port="result" sequence="true" content-types="*/*"/>
</p:declare-step>
If the implementation supports passing PSVI annotations between steps, the p:identity
step must preserve any annotations that appear in the input.
Document properties
All document properties are preserved.
2.14 p:insert
The p:insert
step inserts the insertion
port's document into the source
port's document relative to the matching elements in the source
port's document.
<p:declare-step type="p:insert">
  <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
  <p:input port="insertion" sequence="true" content-types="application/xml text/* */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" as="xs:string" select="'/*'"/>  <!-- XSLTSelectionPattern -->
  <p:option name="position" required="true" as="xs:token" values="('first-child','last-child','before','after')"/>
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023
) if that pattern matches an attribute or a namespace node. Multiple matches are allowed, in which case multiple copies of the insertion
documents will occur. If no elements match, then the document is unchanged.
The value of the position
option must be an NMTOKEN in the following list:
“first-child” - the insertion is made as the first child of the match;
“last-child” - the insertion is made as the last child of the match;
“before” - the insertion is made as the immediate preceding sibling of the match;
“after” - the insertion is made as the immediate following sibling of the match.
It is a dynamic error (err:XC0025
) if the match pattern matches anything other than an element or a document node and the value of the position
option is “first-child
” or “last-child
”. It is a dynamic error (err:XC0024
) if the match pattern matches a document node and the value of the position
is “before
” or “after
”.
As the inserted elements are part of the output of the step they are not considered in determining matching elements. If an empty sequence appears on the insertion
port, the result will be the same as the source.
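The behaviour described above can be sketched, non-normatively, as follows. The element names (doc, section, note) are invented for this example, and the p:with-input and p:inline binding syntax is assumed from the XProc 3.0 language specification rather than defined in this section.

```xml
<!-- Assumed source document: <doc><section/><section/></doc> -->
<p:insert xmlns:p="http://www.w3.org/ns/xproc"
          match="/doc/section" position="last-child">
  <p:with-input port="insertion">
    <p:inline><note>Reviewed</note></p:inline>
  </p:with-input>
</p:insert>
<!-- Result under the rules above: one copy of the insertion per match, i.e.
     <doc><section><note>Reviewed</note></section><section><note>Reviewed</note></section></doc> -->
```

Because both section elements match, the insertion document is copied twice; the inserted note elements are not themselves considered as candidates for matching.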
Document properties
All document properties on the source
port are preserved. No document properties on the insertion
port are preserved.
2.15 p:label-elements
The p:label-elements
step generates a label for each matched element and stores that label in the specified attribute.
<p:declare-step type="p:label-elements">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="attribute" as="xs:QName" select="'xml:id'"/>
  <p:option name="label" as="xs:string" select="'concat(&quot;_&quot;,$p:index)'"/>  <!-- XPathExpression -->
  <p:option name="match" as="xs:string" select="'*'"/>  <!-- XSLTSelectionPattern -->
  <p:option name="replace" as="xs:boolean" select="true()"/>
</p:declare-step>
The value of the label
option is an XPath expression used to generate the value of the attribute label.
The value of the match
option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023
) if that pattern matches anything other than element nodes.
The value of the replace
option must be a boolean value and is used to indicate whether existing attribute values are replaced.
This step operates by generating attribute labels for each element matched. For every matched element, the expression is evaluated with the context node set to the matched element. An attribute is added to the matched element using the attribute name specified by the attribute
option and the string value of the result of evaluating the expression. If the attribute already exists on the matched element, the value is replaced with the string value only if the replace
option has the value of true
.
If this step is used to add or change the value of an attribute named “xml:base
”, the base URI of the element must also be amended accordingly.
An implementation must bind the variable “p:index
” in the static context of each evaluation of the XPath expression to the position of the element in the sequence of matched elements. In other words, the first element (in document order) matched gets the value “1
”, the second gets the value “2
”, the third, “3
”, etc.
The result of the p:label-elements step is the input document with the attribute labels associated with matched elements. All other non-matching content remains the same.
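A non-normative sketch of the labelling process follows. The source document and the "sec-" prefix are assumptions made for illustration.

```xml
<!-- Assumed source document: <doc><section/><section id="old"/></doc> -->
<p:label-elements xmlns:p="http://www.w3.org/ns/xproc"
                  match="section"
                  attribute="id"
                  label="concat('sec-', $p:index)"/>
<!-- Result under the rules above: <doc><section id="sec-1"/><section id="sec-2"/></doc>
     The existing id="old" is overwritten because replace defaults to true. -->
```

Setting replace="false" in the same invocation would leave id="old" in place on the second section while still labelling the first.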
Document properties
All document properties are preserved.
2.16 p:load
The p:load
step has no inputs but produces as its result a document (or documents) specified by an IRI.
<p:declare-step type="p:load">
  <p:output port="result" sequence="true" content-types="*/*"/>
  <p:option name="href" required="true" as="xs:anyURI"/>
  <p:option name="parameters" as="map(xs:QName,item()*)?"/>
  <p:option name="content-type" as="xs:string?"/>
  <p:option name="document-properties" as="map(xs:QName, item()*)?"/>
</p:declare-step>
It is a dynamic error (err:XD0064
) if the href
URI is not a valid xs:anyURI
. If the href
is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option
or p:load
in the case of a syntactic shortcut value).
The document or documents identified by the href
URI are loaded and returned. If the URI protocol supports redirection, then redirects must be followed.
It is a dynamic error (err:XD0011
) if the resource referenced by a p:load
element does not exist or cannot be accessed.
The behavior of this step depends on the content type of the document or documents loaded. The content type of each document is determined as follows:
If a content-type property is specified in document-properties or content-type is present, then each document must be interpreted according to that content type. It is a dynamic error (err:XD0062) if the content-type is specified and the document-properties has a “content-type” that is not the same.
If the documents are retrieved with a URI protocol that specifies a content type (for example, http:), then the document must be interpreted according to that content type.
In the absence of an explicit type, the content type is implementation-defined.
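For instance, the first rule can be sketched as below (non-normative). The file name is hypothetical; its extension is deliberately one from which a processor could not be expected to infer a content type.

```xml
<!-- Force the resource to be loaded as a text document, whatever the processor would otherwise infer. -->
<p:load xmlns:p="http://www.w3.org/ns/xproc"
        href="notes/readme.dat"
        content-type="text/plain"/>
```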
The parameters
map contains additional, optional parameters that may influence the way that content is loaded. The interpretation of this map varies according to the content type. Parameter names that are in no namespace are treated as strings using only the local-name where appropriate.
Broadly speaking, there are five categories of data that might be loaded: XML, text, JSON, HTML, and “other” binary data.
2.16.1 Loading XML data
For an XML media type, the content is loaded and parsed as XML.
It is a dynamic error (err:XD0049
) if the loaded content is not a well-formed XML document.
If the dtd-validate
parameter is true
, then DTD validation must be performed when parsing the document. It is a dynamic error (err:XD0023
) if a DTD validation is performed and either the document is not valid or no DTD is found. It is a dynamic error (err:XD0043
) if the dtd-validate
parameter is true
and the processor does not support DTD validation.
Additional XML parameters are implementation-defined.
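A non-normative sketch of requesting DTD validation follows. The file name is hypothetical, and the example assumes that a string key is acceptable for a parameter name in no namespace, as the description of the parameters map suggests.

```xml
<p:load xmlns:p="http://www.w3.org/ns/xproc" href="book.xml">
  <!-- err:XD0023 is raised if book.xml is invalid or no DTD is found;
       err:XD0043 is raised if the processor cannot perform DTD validation. -->
  <p:with-option name="parameters" select="map{'dtd-validate': true()}"/>
</p:load>
```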
2.16.2 Loading text data
For a text media type, the content is loaded as a text document. (A text document is an XPath data model document consisting of a single text node.)
It is a dynamic error (err:XD0060
) if the content-type
specifies an encoding that is not supported by the processor.
Text parameters are implementation-defined.
2.16.3 Loading JSON data
For a JSON media type, the content is loaded and parsed as JSON.
The parameters specified for the fn:parse-json
function in [XPath and XQuery Functions and Operators 3.1] must be supported. Additional JSON parameters are implementation-defined.
It is a dynamic error (err:XD0057
) if the loaded content does not conform to the JSON grammar, unless the parameter liberal
is true
and the processor chooses to accept the deviation.
It is a dynamic error (err:XD0058
) if the parameter duplicates
is reject
and the value of loaded content contains a JSON object with duplicate keys.
It is a dynamic error (err:XD0059
) if the parameter map contains an entry whose key is defined in the specification of fn:parse-json
and whose value is not valid for that key, or if it contains an entry with the key fallback when the parameter escape
with the value true()
is also present.
2.16.4 Loading HTML data
For an HTML media type, the content is loaded and parsed into an XPath data model document that contains a tree of elements, attributes, and other nodes.
The precise way in which HTML documents are parsed into the XPath data model is implementation-defined.
HTML parameters are implementation-defined.
2.16.5 Loading binary data
An XProc processor may load other, arbitrary data types. How a processor interprets other media types is implementation-defined.
Parameters for other media types are implementation-defined.
Document properties
No document properties are preserved. The properties specified in document-properties
are applied.
2.17 p:load-directory-list
The p:load-directory-list
step has no inputs but produces as its result a sequence of documents loaded from a directory, specified by an IRI. If the directory's IRI is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option
or p:load-directory-list
in the case of a syntactic shortcut value).
<p:declare-step type="p:load-directory-list">
  <p:output port="result" content-types="application/xml"/>
  <p:option name="path" required="true" as="xs:anyURI"/>
  <p:option name="include-filter" as="xs:string"/>  <!-- RegularExpression -->
  <p:option name="exclude-filter" as="xs:string"/>  <!-- RegularExpression -->
</p:declare-step>
Conformant processors must support directory paths whose scheme is file
. It is implementation-defined what other schemes are supported by p:load-directory-list
, and what the interpretation of 'directory', 'file' and 'contents' is for those schemes.
It is a dynamic error (err:XC0017
) if the absolute path does not identify a directory. It is a dynamic error (err:XC0012
) if the contents of the directory path are not available to the step due to access restrictions in the environment in which the pipeline is run.
If present, the value of the include-filter
or exclude-filter
option must be a whitespace-separated list of regular expressions as specified in [XPath and XQuery Functions and Operators 3.1], section 5.6.1 “Regular Expression Syntax
”.
If the include-filter
pattern matches a directory entry's name, the entry is included in the output. If the exclude-filter
pattern matches a directory entry's name, the entry is excluded from the output. If both options are provided, the include filter is processed first, then the exclude filter. As a result, an item is included if it matches (at least) one of the include-filter
values and none of the exclude-filter
values.
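The interaction of the two filters can be sketched, non-normatively, as follows. The directory URI and the regular expressions are assumptions chosen for illustration.

```xml
<!-- Include entries whose names end in .xml or .json, then drop any whose names start with "draft". -->
<p:load-directory-list xmlns:p="http://www.w3.org/ns/xproc"
                       path="file:///project/docs/"
                       include-filter="^.*\.xml$ ^.*\.json$"
                       exclude-filter="^draft.*$"/>
```

Under the rules above, an entry named report.xml would be included, draft.xml would be excluded by the exclude filter, and notes.txt would be excluded because it matches no include-filter value.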
Document properties
No document properties are preserved.
2.18 p:make-absolute-uris
The p:make-absolute-uris
step makes an element or attribute's value in the source document an absolute IRI value in the result document.
<p:declare-step type="p:make-absolute-uris">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTSelectionPattern -->
  <p:option name="base-uri" as="xs:anyURI?"/>
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023
) if the pattern matches anything other than element or attribute nodes.
The value of the base-uri
option must be an anyURI
. It is interpreted as an IRI reference. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option
or p:make-absolute-uris
in the case of a syntactic shortcut value).
For every element or attribute in the input document which matches the specified pattern, its XPath string-value is resolved against the specified base URI and the resulting absolute IRI is used as the matched node's entire contents in the output.
The base URI used for resolution defaults to the matched attribute's element or the matched element's base URI unless the base-uri
option is specified. When the base-uri
option is specified, the option value is used as the base URI regardless of any contextual base URI value in the document. This option value is resolved against the base URI of the p:with-option
element used to set the option.
If the IRI reference specified by the base-uri
option on p:make-absolute-uris
is not valid, or if it is absent and the input document has no base URI, the results are implementation-dependent.
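As a non-normative sketch, relative image references could be made absolute as follows. The element and attribute names and the base URI are assumptions for this example.

```xml
<!-- Assumed source document: <page><img src="images/logo.png"/></page> -->
<p:make-absolute-uris xmlns:p="http://www.w3.org/ns/xproc"
                      match="img/@src"
                      base-uri="http://example.com/docs/"/>
<!-- Result under the rules above:
     <page><img src="http://example.com/docs/images/logo.png"/></page> -->
```

Without the base-uri option, the same reference would instead be resolved against the base URI of the img element.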
Document properties
All document properties are preserved.
2.19 p:namespace-rename
The p:namespace-rename
step renames any namespace declaration or use of a namespace in a document to a new IRI value.
<p:declare-step type="p:namespace-rename">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="from" required="true" as="xs:anyURI"/>
  <p:option name="to" required="true" as="xs:anyURI"/>
  <p:option name="apply-to" as="xs:token" select="'all'" values="('all','elements','attributes')"/>
</p:declare-step>
The value of the from
option must be an anyURI
. It should be either empty or absolute, but will not be resolved in any case.
The value of the to
option must be an anyURI
. It should be empty or absolute, but will not be resolved in any case.
The value of the apply-to
option must be one of “all
”, “elements
”, or “attributes
”. If the value is “elements
”, only elements will be renamed; if the value is “attributes
”, only attributes will be renamed; if the value is “all
”, both elements and attributes will be renamed.
It is a dynamic error (err:XC0014
) if the XML namespace (http://www.w3.org/XML/1998/namespace
) or the XMLNS namespace (http://www.w3.org/2000/xmlns/
) is the value of either the from
option or the to
option.
If the value of the from
option is the same as the value of the to
option, the input is reproduced unchanged on the output. Otherwise, namespace bindings, namespace attributes and element and attribute names are changed as follows:
Namespace bindings: If the from option is present and its value is not the empty string, then every binding of a prefix (or the default namespace) in the input document whose value is the same as the value of the from option is
replaced in the output with a binding to the value of the to option, provided it is present and not the empty string;
otherwise (the to option is not specified or has an empty string as its value) absent from the output.
If the from option is absent, or its value is the empty string, then no bindings are changed or removed.
Elements and attributes: If the from option is present and its value is not the empty string, for every element and attribute, as appropriate, in the input whose namespace name is the same as the value of the from option, in the output its namespace name is
replaced with the value of the to option, provided it is present and not the empty string;
otherwise (the to option is not specified or has an empty string as its value) changed to have no value.
If the from option is absent, or its value is the empty string, then for every element and attribute, as appropriate, whose namespace name has no value, in the output its namespace name is set to the value of the to option.
Namespace attributes: If the from option is present and its value is not the empty string, for every namespace attribute in the input whose value is the same as the value of the from option, in the output
the namespace attribute's value is replaced with the value of the to option, provided it is present and not the empty string;
otherwise (the to option is not specified or has an empty string as its value) the namespace attribute is absent.
Note
The apply-to
option is primarily intended to make it possible to avoid renaming attributes when the from
option specifies no namespace, since many attributes are in no namespace.
Care should be taken when specifying no namespace with the to
option. Prefixed names in content, for example QNames and XPath expressions, may end up with no appropriate namespace binding.
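The common case of moving a vocabulary from one namespace to another can be sketched, non-normatively, as follows. Both namespace URIs and the element names are invented for this example.

```xml
<!-- Assumed source document:
     <old:doc xmlns:old="http://example.com/ns/old"><old:p/></old:doc> -->
<p:namespace-rename xmlns:p="http://www.w3.org/ns/xproc"
                    from="http://example.com/ns/old"
                    to="http://example.com/ns/new"/>
<!-- Result under the rules above (the prefix is kept, its binding changes):
     <old:doc xmlns:old="http://example.com/ns/new"><old:p/></old:doc> -->
```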
Document properties
All document properties are preserved.
2.20 p:pack
The p:pack
step merges two document sequences in a pair-wise fashion.
<p:declare-step type="p:pack">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true" primary="true"/>
  <p:input port="alternate" sequence="true" content-types="application/xml"/>
  <p:output port="result" sequence="true"/>
  <p:option name="wrapper" required="true" as="xs:QName"/>
</p:declare-step>
The step takes each pair of documents, in order, one from the source
port and one from the alternate
port, wraps them with a new element node whose QName is the value specified in the wrapper
option, and writes that element to the result
port as a document.
If the step reaches the end of one input sequence before the other, then it simply wraps each of the remaining documents in the longer sequence.
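The pairing rule can be sketched in Python as follows. This only illustrates the semantics described above; it is not how any XProc processor implements the step. The function name `pack` and the use of `xml.etree.ElementTree` elements as stand-ins for documents are assumptions made for this example.

```python
from itertools import zip_longest
import xml.etree.ElementTree as ET


def pack(source, alternate, wrapper):
    """Wrap documents pair-wise; leftovers of the longer sequence are wrapped alone."""
    result = []
    # zip_longest pads the shorter sequence with None once it is exhausted.
    for left, right in zip_longest(source, alternate):
        wrapped = ET.Element(wrapper)
        for doc in (left, right):
            if doc is not None:
                wrapped.append(doc)
        result.append(wrapped)
    return result


source = [ET.fromstring("<a/>"), ET.fromstring("<b/>"), ET.fromstring("<c/>")]
alternate = [ET.fromstring("<x/>")]
packed = pack(source, alternate, "pair")
children = [[child.tag for child in doc] for doc in packed]
```

Here the first wrapper holds a document from each port, and the remaining two wrappers hold only a `source` document each.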
Note
In the common case, where the document element of a document in the result
sequence has two element children, any comments, processing instructions, or white space text nodes that occur between them may have come from either of the input documents; this step does not attempt to distinguish which one.
Document properties
No document properties are preserved.
2.21 p:rename
The p:rename
step renames elements, attributes, or processing-instruction targets in a document.
<p:declare-step type="p:rename">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" as="xs:string" select="'/*'"/>  <!-- XSLTSelectionPattern -->
  <p:option name="new-name" required="true" as="xs:QName"/>
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023
) if the pattern matches anything other than element, attribute or processing instruction nodes.
Each element, attribute, or processing-instruction in the input matched by the match pattern specified in the match
option is renamed in the output to the name specified by the new-name
option.
If the match option matches an attribute and the element on which it occurs already has an attribute whose expanded name is the same as the expanded name of the specified new-name, then the result is as if the current attribute named “new-name” was deleted before renaming the matched attribute.
With respect to attributes named “xml:base”, the following semantics apply: renaming an attribute from “xml:base” to something else has no effect on the underlying base URI of the element; however, if an attribute is renamed from something else to “xml:base”, the base URI of the element must also be amended accordingly.
If the pattern matches processing instructions, then it is the processing instruction target that is renamed. It is a dynamic error (err:XC0013
) if the pattern matches a processing instruction and the new name has a non-null namespace.
Document properties
All document properties are preserved.
2.22 p:replace
The p:replace
step replaces matching nodes in its primary input with the document element of the replacement
port's document.
<p:declare-step type="p:replace">
  <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
  <p:input port="replacement" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTSelectionPattern -->
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023
) if that pattern matches anything other than element, text, processing-instruction, or comment nodes. Multiple matches are allowed, in which case multiple copies of the replacement
document will occur.
Every node in the primary input matching the specified pattern is replaced in the output by the document element of the replacement document. Only non-nested matches are replaced. That is, once a node is replaced, its descendants cannot be matched.
Document properties
All document properties are preserved.
2.23 p:set-attributes
The p:set-attributes
step sets attributes on matching elements.
<p:declare-step type="p:set-attributes">
  <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
  <p:input port="attributes" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" as="xs:string" select="'/*'"/>  <!-- XSLTSelectionPattern -->
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023
) if that pattern matches anything other than element nodes.
Each attribute on the document element of the document that appears on the attributes
port is copied to each element that matches the match
expression.
If an attribute with the same name as one of the attributes to be copied already exists, the value specified on the attributes port's document is used. The result port of this step produces a copy of the source port's document with the matching elements' attributes modified.
The matching elements are specified by the match pattern in the match option. All matching elements are processed. If no elements match, the step will not change any elements.
This step must not copy namespace declarations. If the attributes copied from the attributes port use namespaces, prefixes, or prefixes bound to different namespaces, the document produced on the result output port will require namespace fixup.
If an attribute named xml:base
is added or changed, the base URI of the element must also be amended accordingly.
Document properties
All document properties are preserved.
2.24 p:set-properties
The p:set-properties
step sets document properties on the source document.
<p:declare-step type="p:set-properties">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result" content-types="*/*"/>
  <p:option name="properties" required="true" as="map(xs:QName,item()*)"/>
  <p:option name="merge" select="false()" as="xs:boolean"/>
</p:declare-step>
The document properties of the document on the source
port are augmented with the values specified in the properties
option. The document produced on the result
port has the same representation but the adjusted property values.
If the merge
option is true, then the supplied properties are added to the existing properties. If it is false, the document’s properties are replaced by the new set.
It is a dynamic error (err:XC0069
) if the properties
map contains a key equal to the string “content-type
”.
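The effect of the merge option can be sketched in Python. This is an illustration only: the function name `set_properties` is hypothetical, property names are modelled as plain strings rather than QNames, and the sketch follows the text above literally, so it does not model how a processor treats properties it manages itself.

```python
def set_properties(existing, supplied, merge=False):
    """Return the property map of the result document."""
    if "content-type" in supplied:
        # Mirrors the err:XC0069 rule described above.
        raise ValueError("err:XC0069: content-type cannot be set")
    if merge:
        # Supplied values override existing ones; other properties are kept.
        return {**existing, **supplied}
    # Without merge, the supplied map replaces the existing properties.
    return dict(supplied)


existing = {"base-uri": "file:/tmp/in.xml", "status": "draft"}
merged = set_properties(existing, {"status": "final"}, merge=True)
replaced = set_properties(existing, {"status": "final"})
```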
Document properties
If merge
is true, document properties not overridden by settings in the properties
map are preserved, otherwise the resulting document has the properties specified in the properties
map.
2.25 p:sink
The p:sink
step accepts a sequence of documents and discards them. It has no output.
<p:declare-step type="p:sink">
  <p:input port="source" content-types="*/*" sequence="true"/>
</p:declare-step>
Document properties
Not applicable.
2.26 p:split-sequence
The p:split-sequence
step accepts a sequence of documents and divides it into two sequences.
<p:declare-step type="p:split-sequence">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true"/>
  <p:output port="matched" sequence="true" primary="true" content-types="application/xml"/>
  <p:output port="not-matched" sequence="true" content-types="application/xml"/>
  <p:option name="initial-only" as="xs:boolean" select="false()"/>
  <p:option name="test" required="true" as="xs:string"/>  <!-- XPathExpression -->
</p:declare-step>
The value of the test
option must be an XPathExpression.
The XPath expression in the test
option is applied to each document in the input sequence. If the effective boolean value of the expression is true, the document is copied to the matched
port; otherwise it is copied to the not-matched
port.
If the initial-only
option is true, then when the first document that does not satisfy the test expression is encountered, it and all the documents that follow it are written to the not-matched
port. In other words, it only writes the initial series of matched documents (which may be empty) to the matched
port. All other documents are written to the not-matched
port, irrespective of whether or not they match.
The XPath context for the test
option changes over time. For each document that appears on the source
port, the expression is evaluated with that document as the context document. The context position (position()
) is the position of that document within the sequence and the context size (last()
) is the total number of documents in the sequence.
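The routing of documents, including the initial-only behaviour, can be sketched in Python. The names are hypothetical: `split_sequence` stands in for the step, and `test` is a Python callable that receives the document, its context position, and the context size in place of an XPath expression.

```python
def split_sequence(documents, test, initial_only=False):
    """Return (matched, not_matched) following the rules described above."""
    matched, not_matched = [], []
    size = len(documents)  # the context size, last()
    failed = False
    for position, doc in enumerate(documents, start=1):
        if initial_only and failed:
            # After the first failure everything goes to not-matched.
            not_matched.append(doc)
        elif test(doc, position, size):
            matched.append(doc)
        else:
            failed = True
            not_matched.append(doc)
    return matched, not_matched


docs = ["a", "bb", "c", "dd"]
is_short = lambda doc, position, size: len(doc) == 1
plain = split_sequence(docs, is_short)
initial = split_sequence(docs, is_short, initial_only=True)
all_but_last = split_sequence(docs, lambda doc, position, size: position < size)
```

The last call depends on the context size, which is why a test that uses last() forces the whole sequence to be buffered.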
Note
In principle, this component cannot stream because it must buffer all of the input sequence in order to find the context size. In practice, if the test expression does not use the last()
function, the implementation can stream and ignore the context size.
If the implementation supports passing PSVI annotations between steps, the p:split-sequence
step must preserve any annotations that appear in the input.
Document properties
All document properties are preserved.
2.27 p:store
The p:store
step stores (a possibly serialized version of) its input to a URI. This step outputs a reference to the location of the stored document.
<p:declare-step type="p:store">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result" content-types="application/xml" primary="true"/>
  <p:option name="href" required="true" as="xs:anyURI"/>
  <p:option name="serialization" as="map(xs:QName,item()*)?"/>
</p:declare-step>
The value of the href
option must be an anyURI
. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option
or p:store
in the case of a syntactic shortcut value).
The step attempts to store the document to the specified URI. It is a dynamic error (err:XC0050) if the URI scheme is not supported or the step cannot store to the specified location.
The output of this step is a document containing a single c:result
element whose content is the absolute URI of the document stored by the step.
The serialization
option is provided to control the serialization of content when it is stored. Serialization is described in [XProc 3.0].
Document properties
No document properties are preserved.
2.28 p:string-replace
The p:string-replace
step matches nodes in the document provided on the source
port and replaces them with the string result of evaluating an XPath expression.
<p:declare-step type="p:string-replace">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTSelectionPattern -->
  <p:option name="replace" required="true" as="xs:string"/>  <!-- XPathExpression -->
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern.
The value of the replace
option must be an XPathExpression.
The matched nodes are specified with the match pattern in the match
option. For each matching node, the XPath expression provided by the replace
option is evaluated with the matching node as the XPath context node. The string value of the result is used in the output. Nodes that do not match are copied without change.
If the expression given in the match
option matches an attribute, the string value of the replace
expression is used as the new value of the attribute in the output. If the attribute is named “xml:base
”, the base URI of the element must also be amended accordingly.
If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the string value of the replace
expression.
Document properties
All document properties are preserved.
2.29 p:tee
The p:tee
step stores (a possibly serialized version of) its input to a URI. The step outputs its input unchanged.
<p:declare-step type="p:tee">
  <p:input port="source" content-types="*/*" sequence="true"/>
  <p:output port="result" sequence="true" content-types="*/*"/>
  <p:option name="href" required="true" as="xs:anyURI"/>
  <p:option name="serialization" as="map(xs:QName,item()*)?"/>
  <p:option name="enable" as="xs:boolean" select="true()"/>
</p:declare-step>
The value of the href
option must be an anyURI
. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option
or p:tee
in the case of a syntactic shortcut value).
The step attempts to store the document to the specified URI. It is a dynamic error (err:XC0050
) if the URI scheme is not supported or the step cannot store to the specified location.
The output of this step is what appears on its input port, unchanged.
The serialization
option is provided to control the serialization of content when it is stored. Serialization is described in [XProc 3.0].
If the enable option is false, no attempt to store the document will be made. The step will act exactly like a p:identity step. The values for href and serialization will be ignored.
Document properties
All document properties are preserved.
2.30 p:text-count
The p:text-count
step counts the number of lines in a text document and returns a single XML document containing that number.
<p:declare-step type="p:text-count">
  <p:input port="source" primary="true" sequence="false" content-types="text/*"/>
  <p:output port="result" primary="true" sequence="false" content-types="application/xml"/>
</p:declare-step>
The p:text-count step counts the number of lines in the text document appearing on its source port. It returns on its result port an XML document containing a single c:result element whose content is the string representing this count.
Lines are identified as described in XML, 2.11 End-of-Line Handling.
Document properties
No document properties are preserved.
2.31 p:text-head
The p:text-head
step returns lines from the beginning of a text document.
<p:declare-step type="p:text-head">
  <p:input port="source" primary="true" sequence="false" content-types="text/*"/>
  <p:output port="result" primary="true" sequence="false" content-types="text/*"/>
  <p:option name="count" required="true" as="xs:integer"/>
</p:declare-step>
The p:text-head
step returns on its result
port lines from the text document that appears on its source
port:
- If the count option is positive, the p:text-head step returns the first count lines.
- If the count option is zero, the p:text-head step returns all lines.
- If the count option is negative, the p:text-head step returns all lines except the first count lines (that is, it skips as many lines as the absolute value of count).
Lines are identified as described in XML, 2.11 End-of-Line Handling.
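The three cases for count can be sketched in Python. The function name `text_head` is hypothetical, and the sketch works on a list of lines that has already been split, so it says nothing about line identification or about how the result is serialized. A negative count is read as skipping abs(count) lines, which is an interpretation of the rule above.

```python
def text_head(lines, count):
    """Select lines from the start of a document according to count."""
    if count > 0:
        return lines[:count]   # the first count lines
    if count == 0:
        return list(lines)     # all lines
    return lines[-count:]      # everything after the first abs(count) lines


lines = ["one", "two", "three", "four"]
first_two = text_head(lines, 2)
everything = text_head(lines, 0)
skip_one = text_head(lines, -1)
```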
Document properties
All document properties are preserved.
2.32 p:text-join
The p:text-join
step concatenates text documents.
<p:declare-step type="p:text-join">
  <p:input port="source" primary="true" sequence="true" content-types="text/*"/>
  <p:output port="result" primary="true" sequence="false" content-types="text/*"/>
  <p:option name="separator" required="false" as="xs:string"/>
  <p:option name="prefix" required="false" as="xs:string"/>
  <p:option name="suffix" required="false" as="xs:string"/>
</p:declare-step>
The p:text-join
step concatenates the text documents appearing on its source
port into a single document on its result
port. The documents will be concatenated in order of appearance.
- When the separator option is specified, its value will be inserted in between adjacent documents.
- When the prefix option is specified, the document appearing on the result port will always start with its value (also when there are no documents on the source port).
- When the suffix option is specified, the document appearing on the result port will always end with its value (also when there are no documents on the source port).
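A minimal Python sketch of the concatenation rules follows. The function name `text_join` is hypothetical, documents are modelled as plain strings, and unspecified options are modelled as empty strings.

```python
def text_join(documents, separator="", prefix="", suffix=""):
    """Concatenate text documents in order of appearance."""
    # The separator only appears between adjacent documents;
    # prefix and suffix are emitted even for an empty sequence.
    return prefix + separator.join(documents) + suffix


joined = text_join(["a", "b", "c"], separator=", ", prefix="[", suffix="]")
empty = text_join([], separator=", ", prefix="[", suffix="]")
```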
Document properties
No document properties are preserved.
2.33 p:text-replace
The p:text-replace
step replaces all occurrences of substrings in a text document that match a supplied regular expression with a given replacement string.
<p:declare-step type="p:text-replace">
  <p:input port="source" primary="true" sequence="false" content-types="text/*"/>
  <p:output port="result" primary="true" sequence="false" content-types="text/*"/>
  <p:option name="pattern" required="true" as="xs:string"/>
  <p:option name="replacement" required="true" as="xs:string"/>
  <p:option name="flags" required="false" as="xs:string"/>
</p:declare-step>
The p:text-replace
step replaces all occurrences of substrings in the text document appearing on its source
port that match a supplied regular expression with a given replacement string. The result is returned (as another text document) on its result
port.
This step is a convenience wrapper around the XPath fn:replace
function to ease text replacements in the document flow of a pipeline.
The pattern, replacement and flags options have the same meaning as the parameters of the same names of the fn:replace function.
Document properties
All document properties are preserved.
2.34 p:text-sort
The p:text-sort
step sorts lines in a text document.
<p:declare-step type="p:text-sort">
  <p:input port="source" primary="true" sequence="false" content-types="text/*"/>
  <p:output port="result" primary="true" sequence="false" content-types="text/*"/>
  <p:option name="order" required="false" as="xs:string" select="'ascending'" values="('ascending', 'descending')"/>
  <p:option name="case-order" required="false" as="xs:string" values="('upper-first', 'lower-first')"/>
  <p:option name="lang" required="false" as="xs:language"/>
  <p:option name="data-type" required="false" as="xs:string" select="'text'" values="('text', 'number')"/>
  <p:option name="collation" required="false" as="xs:string" select="'http://www.w3.org/2005/xpath-functions/collation/codepoint'"/>
  <p:option name="stable" required="false" as="xs:boolean" select="true()"/>
</p:declare-step>
The p:text-sort step sorts the lines in the text document appearing on its source port and returns the result as another text document on its result port. The full lines are used as the sorting key.
- The order option defines whether the lines are processed in ascending or descending order. Its value must be one of ascending or descending. The default is ascending.
- The case-order option defines whether upper-case letters are to be collated before or after lower-case letters. Its value must be one of upper-first or lower-first. The default is language-dependent.
- The lang option defines the language whose collating conventions are to be used. Its value must be a valid language code (e.g. en-US). The default depends on the processing environment.
- The data-type option defines whether the lines are to be collated alphabetically or numerically. Its value must be one of text or number. The default is text.
- The collation option identifies how strings are to be compared with each other. Its value must be a valid collation URI. The only collation XProc processors must support is the Unicode Codepoint Collation, http://www.w3.org/2005/xpath-functions/collation/codepoint, which is also the default. Support for other collations is implementation-defined.
- If the stable option is set to false, this indicates that there is no requirement to retain the original order of items that have equal values for all the sort keys.
Lines are identified as described in XML, 2.11 End-of-Line Handling.
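The order and data-type options can be sketched in Python. This is a partial illustration only: the function name `text_sort` is hypothetical, it works on a list of lines that has already been split, and it does not model the case-order, lang, or collation options. Python compares strings by code point and its sort is always stable, which happens to match the default collation and the default for stable; converting lines with `float` is a stand-in for numeric collation and will fail on non-numeric lines.

```python
def text_sort(lines, order="ascending", data_type="text"):
    """Sort lines by code point or by numeric value; equal keys keep their order."""
    key = float if data_type == "number" else (lambda line: line)
    return sorted(lines, key=key, reverse=(order == "descending"))


by_codepoint = text_sort(["b", "a", "C"])
descending = text_sort(["b", "a", "C"], order="descending")
numeric = text_sort(["10", "9", "2.5"], data_type="number")
```

Note that upper-case "C" sorts before the lower-case letters under code point comparison.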
Document properties
All document properties are preserved.
2.35 p:text-tail
The p:text-tail
step returns lines from the end of a text document.
<p:declare-step type="p:text-tail">
  <p:input port="source" primary="true" sequence="false" content-types="text/*"/>
  <p:output port="result" primary="true" sequence="false" content-types="text/*"/>
  <p:option name="count" required="true" as="xs:integer"/>
</p:declare-step>
The p:text-tail
step returns on its result
port lines from the text document that appears on its source
port:
- If the count option is positive, the p:text-tail step returns the last count lines.
- If the count option is zero, the p:text-tail step returns all lines.
- If the count option is negative, the p:text-tail step returns all lines except the last count lines (that is, it drops as many lines as the absolute value of count).
Lines are identified as described in XML, 2.11 End-of-Line Handling.
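The three cases for count can be sketched in Python. The function name `text_tail` is hypothetical, the input is a list of lines that has already been split, and a negative count is read as dropping abs(count) lines from the end, which is an interpretation of the rule above.

```python
def text_tail(lines, count):
    """Select lines from the end of a document according to count."""
    if count > 0:
        return lines[-count:]  # the last count lines
    if count == 0:
        return list(lines)     # all lines
    return lines[:count]       # everything before the last abs(count) lines


lines = ["one", "two", "three", "four"]
last_two = text_tail(lines, 2)
everything = text_tail(lines, 0)
drop_one = text_tail(lines, -1)
```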
Document properties
All document properties are preserved.
2.36 p:unescape-markup
The p:unescape-markup
step takes the string value of the document element and parses the content as if it was a Unicode character stream containing serialized XML. The output consists of the same document element with children that result from the parse. This is the reverse of the p:escape-markup
step.
<p:declare-step type="p:unescape-markup">
  <p:input port="source" content-types="application/xml text/xml */*+xml text/*"/>
  <p:output port="result" content-types="application/xml text/xml */*+xml"/>
  <p:option name="namespace" as="xs:anyURI?"/>
  <p:option name="content-type" as="xs:string" select="'application/xml'"/>
  <p:option name="encoding" as="xs:string?"/>
  <p:option name="charset" as="xs:string?"/>
</p:declare-step>
The value of the namespace
option must be an anyURI
. It should be absolute, but will not be resolved.
When the string value is parsed, the original document element is preserved so that the result will be well-formed XML even if the content consists of multiple, sibling elements.
The namespace
option specifies a default namespace. Elements that are in no namespace in the unescaped content will be placed into this namespace unless there is an in-scope namespace declaration that specifies a different namespace (or explicitly undeclares the default namespace).
The content-type
option may be used to specify an alternate content type for the string value. An implementation may use a different parser to produce XML content depending on the specified content-type. For example, an implementation might provide an HTML to XHTML parser (e.g. [HTML Tidy] or [TagSoup]) for the content type 'text/html
'.
All implementations must support the content type application/xml, and must use a standard XML parser for it. It is a dynamic error (err:XC0051) if the content-type specified is not supported by the implementation. Behavior of p:unescape-markup for content types other than application/xml is implementation-defined.
The encoding
option specifies how the data is encoded. All implementations must support the base64
encoding (and the absence of an encoding option, which implies that the content is plain Unicode text). It is a dynamic error (err:XC0052
) if the encoding specified is not supported by the implementation.
If an encoding
is specified, a charset
may also be specified. The character set may be specified as a parameter on the content-type
or via the separate charset
option. If it is specified in both places, the value of the charset
option must be used.
If the specified encoding
is base64
, then the character set must be specified. It is a dynamic error (err:XC0010
) if an encoding of base64
is specified and the character set is not specified or if the specified character set is not supported by the implementation.
The octet-stream that results from decoding the text must be interpreted using the character encoding named by the value of the charset
option to produce a sequence of Unicode characters to parse.
If no encoding
is specified, the character set is ignored, irrespective of where it was specified.
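The interaction between the encoding and charset options can be sketched in Python. The function name `decode_content` is hypothetical, only the base64 encoding is modelled, and the error codes are carried in exception messages purely for illustration. The sketch covers only the step from the string value to the Unicode characters that are handed to the parser.

```python
import base64


def decode_content(text, encoding=None, charset=None):
    """Turn the string value into the Unicode text that will be parsed."""
    if encoding is None:
        # No encoding: the text is plain Unicode and any charset is ignored.
        return text
    if encoding != "base64":
        raise ValueError("err:XC0052: unsupported encoding")
    if charset is None:
        raise ValueError("err:XC0010: base64 requires a character set")
    try:
        return base64.b64decode(text).decode(charset)
    except LookupError as exc:
        # Python raises LookupError for character sets it does not know.
        raise ValueError("err:XC0010: unsupported character set") from exc


encoded = base64.b64encode("<p>caf\u00e9</p>".encode("utf-8")).decode("ascii")
decoded = decode_content(encoded, encoding="base64", charset="utf-8")
untouched = decode_content("<p>plain</p>", charset="utf-8")
```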
For example, with the namespace option set to the XHTML namespace, the following input:
<description>
&lt;p>This is a chunk.&lt;/p>
&lt;p>This is another chunk.&lt;/p>
</description>
would produce:
<description>
<p xmlns="http://www.w3.org/1999/xhtml">This is a chunk.</p>
<p xmlns="http://www.w3.org/1999/xhtml">This is another chunk.</p>
</description>
Document properties
No document properties are preserved.
2.37 p:unwrap
The p:unwrap
step replaces matched elements with their children.
<p:declare-step type="p:unwrap">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" as="xs:string" select="'/*'"/>  <!-- XSLTSelectionPattern -->
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023
) if that pattern matches anything other than element nodes.
Every element in the source
document that matches the specified match
pattern is replaced by its children, effectively “unwrapping” the children from their parent. Non-element nodes and unmatched elements are passed through unchanged.
Note
The matching applies to the entire document, not just the “top-most” matches. A pattern of the form h:div will replace all h:div elements, not just the top-most ones.
This step produces a single document; if the document element is unwrapped, the result might not be well-formed XML.
Document properties
No document properties are preserved.
2.38 p:uuid
The p:uuid
step generates a [UUID] and injects it into the source
document.
<p:declare-step type="p:uuid">
  <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="match" as="xs:string" select="'/*'"/>  <!-- XSLTSelectionPattern -->
  <p:option name="version" as="xs:integer?"/>
</p:declare-step>
The value of the match
option must be an XSLTSelectionPattern. The value of the version
option must be an integer.
If the version
is specified, that version of UUID must be computed. It is a dynamic error (err:XC0060
) if the processor does not support the specified version
of the UUID algorithm. If the version
is not specified, the version of UUID computed is implementation-defined.
Implementations must support version 4 UUIDs. Support for other versions of UUID, and the mechanism by which the necessary inputs are made available for computing other versions, is implementation-defined.
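The version rules can be sketched in Python using the standard `uuid` module. The function name `generate_uuid` is hypothetical. Choosing version 4 when no version is given is an assumption of this sketch (the choice is implementation-defined), as is rejecting every other version.

```python
import uuid


def generate_uuid(version=None):
    """Return a UUID string, supporting only version 4 in this sketch."""
    if version is None or version == 4:
        return str(uuid.uuid4())
    # Mirrors the err:XC0060 rule for unsupported versions.
    raise ValueError(f"err:XC0060: UUID version {version} is not supported")


default_value = generate_uuid()
explicit_value = generate_uuid(4)
```

A processor would generate the value once and reuse it for every matching node in the document.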
The matched nodes are specified with the match pattern in the match
option. For each matching node, the generated UUID is used in the output (if more than one node matches, the same UUID is used in each match). Nodes that do not match are copied without change.
If the expression given in the match
option matches an attribute, the UUID is used as the new value of the attribute in the output. If the attribute is named “xml:base
”, the base URI of the element must also be amended accordingly.
If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the UUID.
Document properties
All document properties are preserved.
2.39 p:wrap-sequence
The p:wrap-sequence
step accepts a sequence of documents and produces either a single document or a new sequence of documents.
<p:declare-step type="p:wrap-sequence">
  <p:input port="source" content-types="application/xml */*+xml text/*" sequence="true"/>
  <p:output port="result" sequence="true" content-types="application/xml"/>
  <p:option name="wrapper" required="true" as="xs:QName"/>
  <p:option name="group-adjacent" as="xs:string?"/>  <!-- XPathExpression -->
</p:declare-step>
The value of the group-adjacent
option must be an XPathExpression.
In its simplest form, p:wrap-sequence
takes a sequence of documents and produces a single, new document by placing each document in the source
sequence inside a new document element as sequential siblings. The name of the document element is the value specified in the wrapper
option.
The group-adjacent
option can be used to group adjacent documents. The XPath context for the group-adjacent
option changes over time. For each document that appears on the source
port, the expression is evaluated with that document as the context document. The context position (position()
) is the position of that document within the sequence and the context size (last()
) is the total number of documents in the sequence. Whenever two or more sequentially adjacent documents have the same “group adjacent” value, they are wrapped together in a single wrapper element.
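Grouping of adjacent documents can be sketched in Python with `itertools.groupby`, which likewise groups only consecutive items that share a key. The function name `wrap_sequence` is hypothetical, documents are modelled as `xml.etree.ElementTree` elements, and `group_adjacent` is a Python callable on the document alone (context position and size are not modelled).

```python
from itertools import groupby
import xml.etree.ElementTree as ET


def wrap_sequence(documents, wrapper, group_adjacent=None):
    """Wrap all documents together, or each run of adjacent equal-key documents."""
    if group_adjacent is None:
        groups = [list(documents)]
    else:
        groups = [list(run) for _, run in groupby(documents, key=group_adjacent)]
    result = []
    for group in groups:
        wrapped = ET.Element(wrapper)
        wrapped.extend(group)
        result.append(wrapped)
    return result


docs = [ET.Element("a", k="1"), ET.Element("b", k="1"), ET.Element("c", k="2")]
single = wrap_sequence(docs, "group")
grouped = wrap_sequence(docs, "group", group_adjacent=lambda doc: doc.get("k"))
grouped_tags = [[child.tag for child in doc] for doc in grouped]
```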
Document properties
No document properties are preserved.
2.40 p:wrap
The p:wrap
step wraps matching nodes in the source
document with a new parent element.
<p:declare-step type="p:wrap">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="wrapper" required="true" as="xs:QName"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTSelectionPattern -->
  <p:option name="group-adjacent" as="xs:string?"/>  <!-- XPathExpression -->
</p:declare-step>
The value of the match option must be an XSLTSelectionPattern. It is a dynamic error (err:XC0023) if the pattern matches anything other than document, element, text, processing instruction, or comment nodes.
The value of the group-adjacent
option must be an XPathExpression.
If the node matched is the document node (match="/"
), the result is a new document where the document element is a new element node whose QName is the value specified in the wrapper
option. That new element contains copies of all of the children of the original document node.
When the match pattern does not match the document node, every node that matches the specified match pattern is replaced with a new element node whose QName is the value specified in the wrapper option. The content of that new element is a copy of the original, matching node. The p:wrap step performs a "deep" wrapping: the children of the matching node and their descendants are processed and wrappers are added to all matching nodes.
The group-adjacent
option can be used to group adjacent matching nodes in a single wrapper element. The specified XPath expression is evaluated for each matching node with that node as the XPath context node. Whenever two or more adjacent matching nodes have the same “group adjacent” value, they are wrapped together in a single wrapper element.
Two matching nodes are considered adjacent if and only if they are siblings and either there are no nodes between them or all intervening, non-matching nodes are whitespace text, comment, or processing instruction nodes.
Document properties
No document properties are preserved.
2.41 p:www-form-urldecode
The p:www-form-urldecode step decodes an x-www-form-urlencoded string into an XML representation.
<p:declare-step type="p:www-form-urldecode">
  <p:output port="result" content-types="application/xml"/>
  <p:option name="value" required="true" as="xs:string"/>
</p:declare-step>
The value
option is interpreted as a string of parameter values encoded using the x-www-form-urlencoded
algorithm. Each name/value pair is written in a c:param
element. The entire set of parameters is written (as a c:param-set
) on the result
output port.
It is a dynamic error (err:XC0037) if the value provided is not a properly x-www-form-urlencoded value. It is a dynamic error (err:XC0061) if the name of any encoded parameter is not a valid xs:NCName. In other words, this step can only decode simple name/value pairs where the names do not contain colons or any characters that cannot be used in XML names.
The order of the c:param elements in the result is the same as the order of the encoded parameters, reading from left to right.
If any parameter name occurs more than once in the encoded string, the resulting parameter set will contain a c:param for each instance.
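A minimal sketch of the step in use follows. The encoded string is invented for illustration, and the result assumes that each c:param carries its name and value as attributes and that the c prefix is bound to http://www.w3.org/ns/xproc-step, as in XProc 1.0. Neither detail is stated in this section.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- The ampersands are escaped because the string is written
       in an XML attribute; "spec" deliberately occurs twice. -->
  <p:www-form-urldecode value="name=W3C&amp;spec=XProc&amp;spec=XSLT"/>
</p:declare-step>
```

Under those assumptions, the result port would be expected to carry a document along these lines, with one c:param per encoded pair in left-to-right order:

```xml
<c:param-set xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:param name="name" value="W3C"/>
  <c:param name="spec" value="XProc"/>
  <c:param name="spec" value="XSLT"/>
</c:param-set>
```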
Document properties
No document properties are preserved.
2.42 p:www-form-urlencode
The p:www-form-urlencode step encodes a set of parameter values as an x-www-form-urlencoded string and injects it into the source document.
<p:declare-step type="p:www-form-urlencode">
  <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml"/>
  <p:option name="parameters" as="map(xs:QName,item()*)?"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTSelectionPattern -->
</p:declare-step>
The value of the match option must be an XSLTSelectionPattern.
The set of parameters is encoded as a single x-www-form-urlencoded string of name/value pairs. When parameters are encoded into name/value pairs, only the local name of each parameter is used. The namespace name is ignored and no prefix or colon appears in the name.
The order of the parameters is implementation-dependent.
The matched nodes are specified with the match pattern in the match option. For each matching node, the encoded string is used in the output. Nodes that do not match are copied without change.
If the expression given in the match option matches an attribute, the encoded string is used as the new value of the attribute in the output. If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the encoded string.
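The following sketch shows one way the step might be invoked so that the encoded string replaces an attribute value. The request element, its query attribute, and the parameter names are hypothetical. The map keys are built with fn:QName so that they are xs:QName values in no namespace, matching the declared type of the parameters option.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Replace the value of /request/@query with the encoded parameters. -->
  <p:www-form-urlencode match="/request/@query">
    <p:with-option name="parameters"
                   select="map{ QName('', 'name'): 'W3C',
                                QName('', 'spec'): 'XProc' }"/>
  </p:www-form-urlencode>
</p:declare-step>
```

Given a source document with a request document element that has a query attribute, the result would be a copy in which that attribute holds either name=W3C&spec=XProc or spec=XProc&name=W3C, because the order of the parameters is implementation-dependent.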
Document properties
No document properties are preserved.
2.43 p:xinclude
The p:xinclude step applies [XInclude] processing to the source document.
<p:declare-step type="p:xinclude">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" content-types="application/xml text/xml */*+xml"/>
  <p:option name="fixup-xml-base" as="xs:boolean" select="false()"/>
  <p:option name="fixup-xml-lang" as="xs:boolean" select="false()"/>
</p:declare-step>
The value of the fixup-xml-base option must be a boolean. If it is true, base URI fixup will be performed as per [XInclude].
The value of the fixup-xml-lang option must be a boolean. If it is true, language fixup will be performed as per [XInclude].
The included documents are located with the base URI of the input document and are not provided as input to the step.
It is a dynamic error (err:XC0029) if an XInclude error occurs during processing.
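As a small illustration, a pipeline that expands inclusions and requests base URI fixup might look like the sketch below. Nothing here goes beyond the options declared above. The source document is assumed to contain xi:include elements whose targets can be resolved against its base URI.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Expand xi:include elements; request xml:base fixup but
       leave xml:lang fixup at its default (false). -->
  <p:xinclude fixup-xml-base="true"/>
</p:declare-step>
```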
Document properties
All document properties are preserved.
2.44 p:xquery
The p:xquery step applies an [XQuery 1.0] query to the sequence of documents provided on the source port.
<p:declare-step type="p:xquery">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true" primary="true"/>
  <p:input port="query" content-types="application/xml */*+xml text/*"/>
  <p:output port="result" sequence="true" content-types="*/*"/>
  <p:option name="parameters" as="map(xs:QName,item()*)?"/>
  <p:option name="version" as="xs:string?"/>
</p:declare-step>
If a sequence of documents is provided on the source port, the first document is used as the initial context item. The whole sequence is also the default collection. If no documents are provided on the source port, the initial context item is undefined and the default collection is empty.
The query port must receive a single document:
- If the document root element is c:query, the text descendants of this element are considered the query.
<c:query>string</c:query>
- If the document root element is in the XQueryX namespace, the document is treated as an XQueryX-encoded query. Support for XQueryX is implementation-defined.
- If the query document has an XML media type, then the string value of the document must be treated as the query. If the media type has a “text” type, then it must be interpreted as the query.
- Otherwise, the interpretation of the query is implementation-defined.
If the step specifies a version, then that version of XQuery must be used to process the query. It is a dynamic error (err:XC0038) if the specified version is not available. If the step does not specify a version, the implementation may use any version it has available and may use any means to determine what version to use, including, but not limited to, examining the version of the query.
The result of the p:xquery step must be a sequence of documents. It is a dynamic error (err:XC0057) if the sequence that results from evaluating the XQuery contains items other than documents and elements. Any elements that appear in the result sequence will be treated as documents with the element as their document element.
For example:
<c:query>
declare namespace atom="http://www.w3.org/2005/Atom";
/atom:feed/atom:entry
</c:query>
The output of this step may include PSVI annotations.
The static context of the XQuery processor is augmented in the following way:
- Statically known default collection type
document()*
- Statically known namespaces:
Unchanged from the implementation defaults. No namespace declarations in the XProc pipeline are automatically exposed in the static context.
The dynamic context of the XQuery processor is augmented in the following way:
- Context item
The first document that appears on the source port.
- Context position
1
- Context size
1
- Variable values
Any parameters passed in the parameters option augment any implementation-defined variable bindings known to the XQuery processor.
- Function implementations
The function implementations provided by the XQuery processor.
- Current dateTime
The point in time returned as the current dateTime is implementation-defined.
- Implicit timezone
The implicit timezone is implementation-defined.
- Available documents
The set of available documents (those that may be retrieved with a URI) is implementation-dependent.
- Available collections
The set of available collections is implementation-dependent.
- Default collection
The sequence of documents provided on the source port.
2.44.1 Example
The following pipeline applies XInclude processing and schema validation before using XQuery:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:xinclude/>
  <p:validate-with-xml-schema name="validate">
    <p:with-input port="schema" href="http://example.com/path/to/schema.xsd"/>
  </p:validate-with-xml-schema>
  <p:xquery>
    <p:with-input port="query" href="countp.xq"/>
  </p:xquery>
</p:declare-step>
Where countp.xq might contain:
<count>{count(.//p)}</count>
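A variation on this example passes a value to the query through the parameters option. This is a sketch only: the file name count-by-name.xq and the variable name are hypothetical. It also assumes that the processor binds entries of the parameters map to external variables of the same name declared in the query, which is one reading of the "Variable values" entry above.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:xquery>
    <p:with-input port="query" href="count-by-name.xq"/>
    <!-- The key is an xs:QName in no namespace, intended to match
         an external variable $name declared by the query. -->
    <p:with-option name="parameters"
                   select="map{ QName('', 'name'): 'p' }"/>
  </p:xquery>
</p:declare-step>
```

Here count-by-name.xq would be expected to contain a declaration such as "declare variable $name external;" and to use $name in place of the literal element name.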
2.44.2 Document properties
No document properties are preserved.
2.45 p:xslt
The p:xslt step applies an XSLT stylesheet to a document.
<p:declare-step type="p:xslt">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true" primary="true"/>
  <p:input port="stylesheet" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" primary="true" sequence="true" content-types="*/*"/>
  <p:output port="secondary" sequence="true"/>
  <p:option name="parameters" as="map(xs:QName,item())?"/>
  <p:option name="initial-mode" as="xs:QName?"/>
  <p:option name="template-name" as="xs:QName?"/>
  <p:option name="output-base-uri" as="xs:anyURI?"/>
  <p:option name="version" as="xs:string?"/>
</p:declare-step>
If present, the value of the initial-mode option must be a QName.
If present, the value of the template-name option must be a QName.
If present, the value of the output-base-uri option must be an anyURI. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option or p:xslt in the case of a syntactic shortcut value).
If the step specifies a version, then that version of XSLT must be used to process the transformation. It is a dynamic error (err:XC0038) if the specified version is not available. If the step does not specify a version, the implementation may use any version it has available and may use any means to determine what version to use, including, but not limited to, examining the version of the stylesheet.
The XSLT stylesheet provided on the stylesheet port is applied to the document on the source port. Any parameters passed in the parameters option are used to define top-level stylesheet parameters. The primary result document of the transformation, if there is one, appears on the result port. At most one document can appear on the result port. All other result documents appear on the secondary port. The order in which result documents appear on the secondary port is implementation-dependent. If XSLT 1.0 is used, an empty sequence of documents must appear on the secondary port.
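The port behaviour described above might be exercised with a pipeline along the following lines. It is a sketch under assumptions: the stylesheet name split.xsl, the chunks port name, and the depth parameter are hypothetical, and the stylesheet is assumed to create additional result documents with xsl:result-document.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <!-- The primary result document of the transformation, if any. -->
  <p:output port="result" primary="true" sequence="true"/>
  <!-- Every other result document, in implementation-dependent order. -->
  <p:output port="chunks" sequence="true">
    <p:pipe step="transform" port="secondary"/>
  </p:output>

  <p:xslt name="transform" version="2.0">
    <p:with-input port="stylesheet" href="split.xsl"/>
    <!-- Intended to set the top-level stylesheet parameter "depth". -->
    <p:with-option name="parameters"
                   select="map{ QName('', 'depth'): 2 }"/>
  </p:xslt>
</p:declare-step>
```

Declaring the result output with sequence="true" allows for the case where the transformation produces no primary result document.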
If a sequence of documents is provided on the source port, the first document is used as the primary input document. The whole sequence is also the default collection. If no documents are provided on the source port, the primary input document is undefined and the default collection is empty. It is a dynamic error (err:XC0039) if a sequence of documents (including an empty sequence) is provided to an XSLT 1.0 step.
A dynamic error occurs if the XSLT processor signals a fatal error. This includes the case where the transformation terminates due to an xsl:message instruction with a terminate attribute value of “yes”. How XSLT message termination errors are reported to the XProc processor is implementation-dependent.
The invocation of the transformation is controlled by the initial-mode and template-name options that set the initial mode and/or named template in the XSLT transformation where processing begins. It is a dynamic error (err:XC0056) if the specified initial mode or named template cannot be applied to the specified stylesheet.
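For instance, a transformation could be started at a named template as sketched below. The stylesheet name report.xsl and the template name main are hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result" sequence="true"/>

  <!-- Begin processing at the named template "main" instead of
       applying templates to the source document. -->
  <p:xslt template-name="main">
    <p:with-input port="stylesheet" href="report.xsl"/>
  </p:xslt>
</p:declare-step>
```

If report.xsl does not declare a template named main, the step would be expected to fail with err:XC0056.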
If the output-base-uri option is specified, it sets the context's output base URI per the XSLT 2.0 specification; otherwise, the base URI of the result document is the base URI of the first document in the source port's sequence. If the value of the output-base-uri option is not absolute, it will be resolved using the base URI of its p:option element. An XSLT 1.0 step should use the value of the output-base-uri option as the base URI of its output, if the option is specified.
If XSLT 2.0 is used, the outputs of this step may include PSVI annotations.
The static and initial dynamic contexts of the XSLT processor are the contexts defined in the step XPath context with the following adjustments.
The dynamic context is augmented as follows:
- Context item
The first document that appears on the source port.
- Variable values
Any parameters passed in the parameters option are available as variable bindings to the XSLT processor.
- Function implementations
The function implementations provided by the XSLT processor.
- Default collection
The sequence of documents provided on the source port.
Document properties
No document properties are preserved.
3 Step Errors
Several of the steps in the standard step library can generate dynamic errors.
[Definition: A dynamic error is one which occurs while a pipeline is being evaluated.] Examples of dynamic errors include references to URIs that cannot be resolved, steps which fail, and pipelines that exhaust the capacity of an implementation (such as memory or disk space).
If a step fails due to a dynamic error, failure propagates upwards until either a p:try is encountered or the entire pipeline fails. In other words, outside of a p:try, step failure causes the entire pipeline to fail.
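The sketch below illustrates how a failure might be contained. The p:try and p:catch elements are defined in the core XProc specification rather than in this document, so the exact content model shown here is an assumption. The xinclude-failed placeholder element is hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <p:group>
      <!-- May raise err:XC0029 if an XInclude error occurs. -->
      <p:xinclude/>
    </p:group>
    <p:catch>
      <!-- Runs only if a step in the group failed; emits a
           placeholder document instead of failing the pipeline. -->
      <p:identity>
        <p:with-input port="source">
          <p:inline><xinclude-failed/></p:inline>
        </p:with-input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

Without the p:try, the same XInclude error would propagate upwards and cause the entire pipeline to fail.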
The following errors can be raised by steps in this specification:
err:XC0002
It is a dynamic error if the value starts with the string “--”.
err:XC0003
It is a dynamic error if a username or password is specified without specifying an auth-method, if the requested auth-method isn't supported, or the authentication challenge contains an authentication method that isn't supported.
See: Specifying a request
err:XC0004
It is a dynamic error if the status-only attribute has the value true and the detailed attribute does not have the value true.
See: Specifying a request
err:XC0005
It is a dynamic error if the request contains a c:body or c:multipart but the method does not allow for an entity body being sent with the request.
See: Specifying a request
err:XC0006
It is a dynamic error if the method is not specified on a c:request.
See: Specifying a request
err:XC0010
It is a dynamic error if an encoding of base64 is specified and the character set is not specified or if the specified character set is not supported by the implementation.
See: p:unescape-markup
err:XC0012
It is a dynamic error if the contents of the directory path are not available to the step due to access restrictions in the environment in which the pipeline is run.
err:XC0013
It is a dynamic error if the pattern matches a processing instruction and the new name has a non-null namespace.
See: p:rename
err:XC0014
It is a dynamic error if the XML namespace (http://www.w3.org/XML/1998/namespace) or the XMLNS namespace (http://www.w3.org/2000/xmlns/) is the value of either the from option or the to option.
See: p:namespace-rename
err:XC0017
It is a dynamic error if the absolute path does not identify a directory.
err:XC0019
It is a dynamic error if the documents are not equal according to the specified comparison method, and the value of the fail-if-not-equal option is true.
See: p:compare
err:XC0020
It is a dynamic error if the user specifies a value or values that are inconsistent with each other or with the requirements of the step or protocol.
See: Specifying a request
err:XC0022
It is a dynamic error if the content of the c:body element does not consist of exactly one element, optionally preceded and/or followed by any number of processing instructions, comments or whitespace characters.
err:XC0023
It is a dynamic error if the match pattern matches a node which is not an element.
See: p:add-attribute, p:insert, p:label-elements, p:make-absolute-uris, p:rename, p:replace, p:set-attributes, p:unwrap, p:wrap
err:XC0024
It is a dynamic error if the match pattern matches a document node and the value of the position is “before” or “after”.
See: p:insert
err:XC0025
It is a dynamic error if the match pattern matches anything other than an element or a document node and the value of the position option is “first-child” or “last-child”.
See: p:insert
err:XC0028
It is a dynamic error if the content of the c:body element does not consist entirely of characters.
err:XC0029
It is a dynamic error if an XInclude error occurs during processing.
See: p:xinclude
err:XC0030
It is a dynamic error if the override-content-type value cannot be used (e.g. text/plain to override image/png).
err:XC0036
It is a dynamic error if the requested hash algorithm is not one that the processor understands or if the value or parameters are not appropriate for that algorithm.
See: p:hash
err:XC0037
It is a dynamic error if the value provided is not a properly x-www-form-urlencoded value.
See: p:www-form-urldecode
err:XC0038
It is a dynamic error if the specified version is not available.
err:XC0039
It is a dynamic error if a sequence of documents (including an empty sequence) is provided to an XSLT 1.0 step.
See: p:xslt
err:XC0040
It is a dynamic error if the document element of the document that arrives on the source port is not c:request.
See: p:http-request
err:XC0050
It is a dynamic error if the URI scheme is not supported or the step cannot store to the specified location.
err:XC0051
It is a dynamic error if the content-type specified is not supported by the implementation.
See: p:unescape-markup
err:XC0052
It is a dynamic error if the encoding specified is not supported by the implementation.
err:XC0056
It is a dynamic error if the specified initial mode or named template cannot be applied to the specified stylesheet.
See: p:xslt
err:XC0057
It is a dynamic error if the sequence that results from evaluating the XQuery contains items other than documents and elements.
See: p:xquery
err:XC0058
It is a dynamic error if the all and relative options are both true.
See: p:add-xml-base
err:XC0059
It is a dynamic error if the QName value in the attribute-name option uses the prefix “xmlns” or any other prefix that resolves to the namespace name http://www.w3.org/2000/xmlns/.
See: p:add-attribute
err:XC0060
It is a dynamic error if the processor does not support the specified version of the UUID algorithm.
See: p:uuid
err:XC0061
It is a dynamic error if the name of any encoded parameter is not a valid xs:NCName.
See: p:www-form-urldecode
err:XC0062
It is a dynamic error if the match option matches a namespace node.
See: p:delete
err:XC0069
It is a dynamic error if the properties map contains a key equal to the string “content-type”.
See: p:set-properties
err:XC0070
It is a dynamic error if the supplied content-type is not a valid media type of the form “type/subtype+ext”.
See: p:cast-content-type
err:XC0071
It is a dynamic error if the p:cast-content-type step cannot perform the requested cast.
See: p:cast-content-type
err:XC0072
It is a dynamic error if the c:data element contains content that is not a valid base64 string.
See: p:cast-content-type
err:XC0073
It is a dynamic error if the c:data element does not have a content-type attribute.
See: p:cast-content-type
err:XC0074
It is a dynamic error if the content-type is supplied and is not the same as the content-type specified on the c:data element.
See: p:cast-content-type
err:XC0076
It is a dynamic error if the comparison method specified in p:compare is not supported by the implementation.
See: p:compare
err:XC0077
It is a dynamic error if the media types of the documents supplied are incompatible with the comparison method.
See: p:compare
err:XC0078
It is a dynamic error if fail-on-timeout is specified as true and a value is given for timeout and the p:http-request is not finished in the time specified by timeout.
See: Specifying a request
err:XC0079
It is a dynamic error if the map parameters contains an entry whose key is defined by the implementation and whose value is not valid for that key.
See: p:cast-content-type
err:XC0075
In all cases except when the input document is a c:data element, it is a dynamic error if the content-type is not supplied.
See: p:cast-content-type
A Conformance
Conformant processors must implement all of the features described in this specification except those that are explicitly identified as optional.
Some aspects of processor behavior are not completely specified; those features are either implementation-dependent or implementation-defined.
[Definition: An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.]
[Definition: An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.]
A.1 Implementation-defined features
The following features are implementation-defined:
- The semantics of the keys and the allowed values for these keys are implementation-defined. See Section 2.3, “p:cast-content-type”.
- Casting an HTML document that isn’t already an XPath data model document into XML is implementation-defined. See Section 2.3, “p:cast-content-type”.
- The precise nature of the conversion from JSON to XML is implementation-defined. See Section 2.3, “p:cast-content-type”.
- The precise nature of the conversion from XML to JSON is implementation-defined. See Section 2.3, “p:cast-content-type”.
- Casting from an XML media type to a non-XML media type when the input document is not a c:data document is implementation-defined. See Section 2.3, “p:cast-content-type”.
- Casting from one non-XML media type to another non-XML media type is implementation-defined. See Section 2.3, “p:cast-content-type”.
- Implementations of p:compare must support the deep-equal method; other supported methods are implementation-defined. See Section 2.4, “p:compare”.
- If fail-if-not-equal is false, and the documents differ, an implementation-defined summary of the differences between the two documents may appear on the differences port. See Section 2.4, “p:compare”.
- Conformant processors must support directory paths whose scheme is file. It is implementation-defined what other schemes are supported by p:directory-list, and what the interpretation of 'directory', 'file' and 'contents' is for those schemes. See Section 2.7, “p:directory-list”.
- Any file or directory determined to be special by the p:directory-list step may be output using a c:other element but the criteria for marking a file as special are implementation-defined. See Section 2.7, “p:directory-list”.
- Any other attributes on c:file, c:directory, or c:other are implementation-defined. See Section 2.7.1, “Directory list details”.
- The precise meaning of these properties is implementation-defined and may vary according to the URI scheme of the path. See Section 2.7.1, “Directory list details”.
- It is implementation-defined what other algorithms are supported. See Section 2.11, “p:hash”.
- The interpretation of auth-method values on c:request other than “Basic” or “Digest” is implementation-defined. See Section 2.12.1, “Specifying a request”.
- Whether or not, and to what extent, “multipart/byte-ranges” responses are supported is implementation-defined. See Section 2.12.2, “Request Entity body conversion”.
- An implementation may support encodings other than base64 but these encodings and their names are implementation-defined. See Section 2.12.2, “Request Entity body conversion”.
- Pipeline authors that need to preserve cookies across several p:http-request calls in the same pipeline or across multiple invocations of the same or different pipelines will have to rely on implementation-defined mechanisms. See Section 2.12.3.2, “Cookies”.
- In the absence of an explicit type, the content type is implementation-defined. See Section 2.16, “p:load”.
- Additional XML parameters are implementation-defined. See Section 2.16.1, “Loading XML data”.
- Text parameters are implementation-defined. See Section 2.16.2, “Loading text data”.
- Additional JSON parameters are implementation-defined. See Section 2.16.3, “Loading JSON data”.
- The precise way in which HTML documents are parsed into the XPath data model is implementation-defined. See Section 2.16.4, “Loading HTML data”.
- HTML parameters are implementation-defined. See Section 2.16.4, “Loading HTML data”.
- How a processor interprets other media types is implementation-defined. See Section 2.16.5, “Loading binary data”.
- Parameters for other media types are implementation-defined. See Section 2.16.5, “Loading binary data”.
- Conformant processors must support directory paths whose scheme is file. It is implementation-defined what other schemes are supported by p:load-directory-list, and what the interpretation of 'directory', 'file' and 'contents' is for those schemes. See Section 2.17, “p:load-directory-list”.
- Support for other collations is implementation-defined. See Section 2.34, “p:text-sort”.
- Behavior of p:unescape-markup for content-types other than application/xml is implementation-defined. See Section 2.36, “p:unescape-markup”.
- If the version is not specified, the version of UUID computed is implementation-defined. See Section 2.38, “p:uuid”.
- Support for other versions of UUID, and the mechanism by which the necessary inputs are made available for computing other versions, is implementation-defined. See Section 2.38, “p:uuid”.
- Support for XQueryX is implementation-defined. See Section 2.44, “p:xquery”.
- Otherwise, the interpretation of the query is implementation-defined. See Section 2.44, “p:xquery”.
- The point in time returned as the current dateTime is implementation-defined. See Section 2.44, “p:xquery”.
- The implicit timezone is implementation-defined. See Section 2.44, “p:xquery”.
A.2 Implementation-dependent features
The following features are implementation-dependent:
- The interpretation of a multipart message inside another multipart message is implementation-dependent. See Section 2.12.4, “Converting Response Entity Bodies”.
- If the IRI reference specified by the base-uri option on p:make-absolute-uris is not valid, or if it is absent and the input document has no base URI, the results are implementation-dependent. See Section 2.18, “p:make-absolute-uris”.
- The order of the parameters is implementation-dependent. See Section 2.42, “p:www-form-urlencode”.
- The set of available documents (those that may be retrieved with a URI) is implementation-dependent. See Section 2.44, “p:xquery”.
- The set of available collections is implementation-dependent. See Section 2.44, “p:xquery”.
- The order in which result documents appear on the secondary port is implementation-dependent. See Section 2.45, “p:xslt”.
- How XSLT message termination errors are reported to the XProc processor is implementation-dependent. See Section 2.45, “p:xslt”.
B References
B.1 Normative References
[XProc 3.0] XProc 3.0: An XML Pipeline Language. Norman Walsh, Achim Berndzen, Gerrit Imsieke and Erik Siegel, editors.
[W3C XML Schema: Part 2] XML Schema Part 2: Datatypes Second Edition. Paul V. Biron and Ashok Malhotra, editors. World Wide Web Consortium, 28 October 2004.
[XPath 3.1] XML Path Language (XPath) 3.1. Jonathan Robie, Michael Dyck, Josh Spiegel, editors. W3C Recommendation. 21 March 2017.
[XPath and XQuery Functions and Operators 3.1] XPath and XQuery Functions and Operators 3.1. Michael Kay, editor. W3C Recommendation. 21 March 2017.
[XInclude] XML Inclusions (XInclude) Version 1.0 (Second Edition). Jonathan Marsh, David Orchard, and Daniel Veillard, editors. W3C Recommendation. 15 November 2006.
[Serialization] XSLT and XQuery Serialization 3.1. Andrew Coleman and C. M. Sperberg-McQueen, editors. W3C Recommendation. 21 March 2017.
[XQuery 1.0] XQuery 1.0: An XML Query Language. Scott Boag, Don Chamberlin, Mary Fernández, et al., editors. W3C Recommendation. 23 January 2007.
[MD5] RFC 1321: The MD5 Message-Digest Algorithm. R. Rivest. Network Working Group, IETF, April 1992.
[RFC 2045] RFC 2045: MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. N. Freed, N. Borenstein, editors. Internet Engineering Task Force. November, 1996.
[RFC 2046] RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. N. Freed, N. Borenstein, editors. Internet Engineering Task Force. November, 1996.
[RFC 2119] Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. Network Working Group, IETF, Mar 1997.
[RFC 2616] RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1. R. Fielding, J. Gettys, J. Mogul, et al., editors. Internet Engineering Task Force. June, 1999.
[RFC 2617] RFC 2617: HTTP Authentication: Basic and Digest Access Authentication. J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, A. Luotonen, L. Stewart. June, 1999.
[Unicode TR#17] Unicode Technical Report #17: Character Encoding Model. Ken Whistler, Mark Davis, and Asmus Freytag, authors. The Unicode Consortium. 11 November 2008.
[HTML Tidy] HTML Tidy Library Project. SourceForge project.
B.2 Informative References
[CRC32] “32-Bit Cyclic Redundancy Codes for Internet Applications”, The International Conference on Dependable Systems and Networks: 459. 10.1109/DSN.2002.1028931. P. Koopman. June 2002.
C Glossary
- dynamic error
A dynamic error is one which occurs while a pipeline is being evaluated.
- implementation-defined
An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.
- implementation-dependent
An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.
D Ancillary files
This specification includes by reference a number of ancillary files.
- steps.xpl
An XProc step library for the declared steps.
E Credits
This document is derived from XProc: An XML Pipeline Language published by the W3C. It was developed by the XML Processing Model Working Group and edited by Norman Walsh, Alex Miłowski, and Henry Thompson.
The editors of this specification extend their gratitude to everyone who contributed to this document and all of the versions that came before it.