The p:ixml step performs Invisible XML processing per [Invisible XML]. It transforms a non-XML input into XML by applying the specified Invisible XML grammar.
Summary
Input port | Primary | Sequence | Content types |
---|---|---|---|
grammar | | ✔ | text xml |
source | ✔ | | any -xml -html |
Output port | Primary | Sequence | Content types |
---|---|---|---|
result | ✔ | ✔ | any |
Option name | Type | Default value |
---|---|---|
fail-on-error | xs:boolean | true() |
parameters | map(xs:QName, item()*)? | () |
Errors
Error code | Description |
---|---|
err:XC0205 | It is a dynamic error if the source document cannot be parsed by the provided grammar. |
err:XC0211 | It is a dynamic error if more than one document appears on the grammar port. |
err:XC0212 | It is a dynamic error if the grammar provided is not a valid Invisible XML grammar. |
Implementation details
Implementation | Description |
---|---|
Defined | It is implementation-defined if other result formats are possible. |
Defined | The parameters are implementation-defined. |
Declaration
<p:declare-step type="p:ixml">
  <p:input port="grammar" sequence="true" content-types="text xml"/>
  <p:input port="source" primary="true" content-types="any -xml -html"/>
  <p:output port="result" sequence="true" content-types="any"/>
  <p:option name="parameters" as="map(xs:QName, item()*)?"/>
  <p:option name="fail-on-error" as="xs:boolean" select="true()"/>
</p:declare-step>
If no grammar is provided on the grammar port, the grammar for Invisible XML itself is assumed, so the source is parsed as an Invisible XML grammar. If an XML or text grammar is provided, it must be an Invisible XML grammar. It is a dynamic error (err:XC0212) if the grammar provided is not a valid Invisible XML grammar. It is a dynamic error (err:XC0211) if more than one document appears on the grammar port.
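As an illustrative sketch of a grammar supplied in XML form rather than as text, the pipeline below puts a one-rule grammar on the grammar port. The grammar (intended to be equivalent to the text grammar greeting: "hello" .) and the input string are invented for this example; the element and attribute names follow the XML representation of grammars described in the Invisible XML specification.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.1">
  <p:output port="result"/>
  <p:ixml>
    <!-- Hypothetical grammar in XML form: one rule matching the literal "hello" -->
    <p:with-input port="grammar">
      <p:inline>
        <ixml>
          <rule name="greeting">
            <alt>
              <literal string="hello"/>
            </alt>
          </rule>
        </ixml>
      </p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">hello</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
```

Under these assumptions, a conformant processor would be expected to return a single greeting element containing the text hello.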
The source to be processed is usually text, but there’s nothing in principle that prevents an Invisible XML grammar from applying to an arbitrary sequence of characters.
The result should be XML. It is implementation-defined if other result formats are possible. (An implementation might, for example, provide a way for the p:ixml step to compile an Invisible XML grammar into some format that can be processed more efficiently.)
The parameters are implementation-defined. An implementation might provide parameters to select among different ambiguous parses or choose alternate representations.
If fail-on-error is true, the step will raise an error if the input cannot be parsed by the grammar. It is a dynamic error (err:XC0205) if the source document cannot be parsed by the provided grammar. If fail-on-error is false, no error will be raised. The Invisible XML specification provides a mechanism to identify failed parses in the output.
Several examples demonstrate features of the step.
In this first example, no grammar is provided, so the pipeline parses the Invisible XML grammar on the source port and returns its XML representation:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.1">
  <p:output port="result"/>
  <p:ixml>
    <p:with-input port="grammar"><p:empty/></p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April"; "May"; "June"; "July"; "August"; "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
      </p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
This would produce an XML version of the grammar:
<ixml>
  <rule name="date">
    <alt>
      <option>
        <nonterminal name="s"/>
      </option>
      <nonterminal name="day"/>
      <nonterminal name="s"/>
      <nonterminal name="month"/>
      <option>
        <alts>
          <alt>
            <nonterminal name="s"/>
            <nonterminal name="year"/>
          </alt>
        </alts>
      </option>
    </alt>
  </rule>
  <!-- … remaining rules elided for brevity … -->
</ixml>
If the grammar is provided on the grammar port, it can be used to parse input, the string “31 December 2021” in this case:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.1">
  <p:output port="result"/>
  <p:ixml>
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April"; "May"; "June"; "July"; "August"; "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
      </p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">31 December 2021</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
This would produce an XML version of the date:
<date><day>31</day><month>December</month><year>2021</year></date>
If a parse fails, the implementation must indicate this, but it may also provide information about where the processing failed.
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.1">
  <p:output port="result"/>
  <p:ixml fail-on-error="false">
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April"; "May"; "June"; "July"; "August"; "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
      </p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">31 Mumble 2021</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
Here the output might be something like this:
<error xmlns:ixml="http://invisiblexml.org/NS"
       xmlns:ex="http://example.com/NS"
       ixml:state="failed"
       ex:lastChar="4">
  <parse>
    month -> • M a r c h
    month -> M • a r c h
  </parse>
  <parse>
    month -> • M a y
    month -> M • a y
  </parse>
</error>
In the case of failure, Invisible XML requires that the root element carry an ixml:state attribute containing the token “failed”. It doesn’t constrain the implementation’s choice of the root element or the content of the document.
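Because only the ixml:state attribute is standardized, a pipeline that sets fail-on-error to false can test for the “failed” token rather than depend on the shape of the error document. The following is a minimal sketch, not a definitive pattern: the rejected element is a hypothetical placeholder, and the test uses the standard XPath 3.1 function contains-token because ixml:state holds a space-separated list of tokens.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ixml="http://invisiblexml.org/NS"
                version="3.1">
  <p:output port="result"/>
  <p:ixml fail-on-error="false">
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April"; "May"; "June"; "July"; "August"; "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
      </p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">31 Mumble 2021</p:inline>
    </p:with-input>
  </p:ixml>
  <p:choose>
    <!-- The root element of a failed parse carries the token "failed" in ixml:state -->
    <p:when test="contains-token(/*/@ixml:state, 'failed')">
      <p:identity>
        <p:with-input>
          <!-- Hypothetical placeholder document for input that did not parse -->
          <p:inline><rejected/></p:inline>
        </p:with-input>
      </p:identity>
    </p:when>
    <p:otherwise>
      <!-- Successful (or merely ambiguous) parses pass through unchanged -->
      <p:identity/>
    </p:otherwise>
  </p:choose>
</p:declare-step>
```

Testing for the token rather than comparing the whole attribute value keeps the check valid if a processor reports more than one state on the same root element.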
An ixml grammar may be ambiguous. In the grammar below, there are three different possible ways to parse the input. By default, one of them is returned.
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.1">
  <p:output port="result"/>
  <p:ixml>
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">
letters: X, (A; B; C) .
A: digits .
B: digits .
C: digits .
X: "a" .
digits: ["0"-"9"]+ .
      </p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">a123</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
This might return any one of these parses:
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><C><digits>123</digits></C></letters>
or
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><A><digits>123</digits></A></letters>
or
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><B><digits>123</digits></B></letters>
All are equally correct.
An implementation might provide a parameter to allow the author to select a particular parse:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/"
                version="3.1">
  <p:output port="result"/>
  <p:ixml parameters="map{'ex:select':2}">
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">
letters: X, (A; B; C) .
A: digits .
B: digits .
C: digits .
X: "a" .
digits: ["0"-"9"]+ .
      </p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">a123</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
This might return:
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><A><digits>123</digits></A></letters>
Or a processor might provide a parameter to return all of the parses.
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/"
                version="3.1">
  <p:output port="result" sequence="true"/>
  <p:ixml parameters="map{'ex:select':'all'}">
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">
letters: X, (A; B; C) .
A: digits .
B: digits .
C: digits .
X: "a" .
digits: ["0"-"9"]+ .
      </p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">a123</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
This might return three documents:
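<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><C><digits>123</digits></C></letters>
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><B><digits>123</digits></B></letters>
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><A><digits>123</digits></A></letters>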
As before, there is nothing standardized about the results in this case.
Document properties
No document properties are preserved.
This step can raise dynamic errors.
[Definition: A dynamic error is one which occurs while a pipeline is being evaluated.] Examples of dynamic errors include references to URIs that cannot be resolved, steps which fail, and pipelines that exhaust the capacity of an implementation (such as memory or disk space). For a more complete discussion of dynamic errors, see Dynamic Errors in XProc 3.0: An XML Pipeline Language.
If a step fails due to a dynamic error, failure propagates upwards until either a p:try is encountered or the entire pipeline fails. In other words, outside of a p:try, step failure causes the entire pipeline to fail.
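As a sketch of how that propagation can be contained for this step, the pipeline below wraps p:ixml in a p:try and catches only err:XC0205. It assumes the standard XProc error namespace for the err prefix; the unparsed element is a hypothetical placeholder, and the grammar and input repeat the date example.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:err="http://www.w3.org/ns/xproc-error"
                version="3.1">
  <p:output port="result"/>
  <p:try>
    <!-- fail-on-error defaults to true, so an unparseable source raises err:XC0205 -->
    <p:ixml>
      <p:with-input port="grammar">
        <p:inline content-type="text/plain">
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April"; "May"; "June"; "July"; "August"; "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
        </p:inline>
      </p:with-input>
      <p:with-input port="source">
        <p:inline content-type="text/plain">31 Mumble 2021</p:inline>
      </p:with-input>
    </p:ixml>
    <!-- Only parse failures are handled here; any other error still propagates -->
    <p:catch code="err:XC0205">
      <p:identity>
        <p:with-input>
          <!-- Hypothetical fallback document -->
          <p:inline><unparsed/></p:inline>
        </p:with-input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

Restricting the p:catch to a single error code means that an invalid grammar (err:XC0212) or multiple grammar documents (err:XC0211) would still cause the pipeline to fail.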
The following errors can be raised by this step:
err:XC0205
It is a dynamic error if the source document cannot be parsed by the provided grammar.
See: p:ixml
err:XC0211
It is a dynamic error if more than one document appears on the grammar port.
See: p:ixml
err:XC0212
It is a dynamic error if the grammar provided is not a valid Invisible XML grammar.
See: p:ixml
A. Conformance
Conformant processors must implement all of the features described in this specification except those that are explicitly identified as optional.
Some aspects of processor behavior are not completely specified; those features are either implementation-dependent or implementation-defined.
[Definition: An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.]
[Definition: An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.]
The following features are implementation-defined:
- It is implementation-defined if other result formats are possible. See Section 2.1, “p:ixml”.
- The parameters are implementation-defined. See Section 2.1, “p:ixml”.
This step has no implementation-dependent features.
B. References
[XProc 3.1] XProc 3.1: An XML Pipeline Language. Norman Walsh, Achim Berndzen, Gerrit Imsieke and Erik Siegel, editors.
[Invisible XML] Invisible XML Specification, version 1.0. Steven Pemberton, editor. Version 2022-06-20.
C. Glossary
- dynamic error
A dynamic error is one which occurs while a pipeline is being evaluated.
- implementation-defined
An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.
- implementation-dependent
An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.
D. Ancillary files
This specification includes by reference a number of ancillary files.
- steps.xpl
An XProc step library for the declared steps.