This specification describes the p:ixml XProc step. A machine-readable description of this step may be found in steps.xpl.
Familiarity with the general nature of [XProc 3.0] steps is assumed; for background details, see [XProc 3.0 Steps].
The p:ixml step performs Invisible XML processing per [Invisible XML]. It transforms a non-XML input into XML by applying the specified Invisible XML grammar.
<p:declare-step type="p:ixml">
  <p:input port="grammar" sequence="true" content-types="any"/>
  <p:input port="source" primary="true" content-types="any -xml -html"/>
  <p:output port="result" content-types="any"/>
  <p:option name="parameters" as="map(xs:QName, item()*)?"/>
  <p:option name="fail-on-error" as="xs:boolean" select="true()"/>
</p:declare-step>
If no grammar is provided on the grammar port, the grammar for Invisible XML is assumed. If an XML or text grammar is provided, it should be an Invisible XML grammar. If any other grammar format is provided, its interpretation is implementation-defined.
The source to be processed is usually text, but there’s nothing in principle that prevents an Invisible XML grammar from applying to an arbitrary sequence of characters.
The result should be XML. It is implementation-defined if other result formats are possible. (An implementation might, for example, provide a way for the p:ixml step to compile an Invisible XML grammar into some format that can be processed more efficiently.)
The parameters are implementation-defined. An implementation might provide parameters to select among different ambiguous parses or choose alternate representations.
If fail-on-error is true, the step will raise an error if the input cannot be parsed by the grammar. It is a dynamic error (err:XC0205) if the source document cannot be parsed by the provided grammar. If fail-on-error is false, no error will be raised. The Invisible XML specification provides a mechanism to identify failed parses in the output.
The following pipeline parses an Invisible XML grammar and returns its XML representation:
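Such a pipeline might look like the following minimal sketch. It leaves the grammar port empty, so the grammar for Invisible XML is assumed, and supplies an ixml grammar for dates as the source. Only the shape of the date rule is taken from this specification; the remaining rules of the inline grammar are assumptions made for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:ixml>
    <!-- No grammar is provided: the grammar for Invisible XML is assumed. -->
    <p:with-input port="grammar"><p:empty/></p:with-input>
    <!-- The source is itself an ixml grammar, supplied as plain text.
         Rules other than "date" are assumed for illustration. -->
    <p:with-input port="source">
      <p:inline content-type="text/plain">date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: ["0"-"9"] .
month: "January"; "February"; "March"; "April";
       "May"; "June"; "July"; "August";
       "September"; "October"; "November"; "December" .
year: (digit, digit)?, digit, digit .</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
```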
This would produce an XML version of the grammar:
<ixml>
  <rule name="date">
    <alt>
      <option>
        <nonterminal name="s"/>
      </option>
      <nonterminal name="day"/>
      <nonterminal name="s"/>
      <nonterminal name="month"/>
      <option>
        <alts>
          <alt>
            <nonterminal name="s"/>
            <nonterminal name="year"/>
          </alt>
        </alts>
      </option>
    </alt>
  </rule>
  <!-- … remaining rules elided for brevity … -->
</ixml>
Providing the “date” grammar allows the step to parse dates:
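A sketch of such a pipeline follows. The grammar is supplied inline on the grammar port and the text to parse on the source port. The rules other than date are assumed for illustration, and the source string is chosen to match the date in the result shown in this section.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:ixml>
    <!-- The "date" grammar; rules other than "date" are assumed. -->
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: ["0"-"9"] .
month: "January"; "February"; "March"; "April";
       "May"; "June"; "July"; "August";
       "September"; "October"; "November"; "December" .
year: (digit, digit)?, digit, digit .</p:inline>
    </p:with-input>
    <!-- The text to parse, with no leading or trailing whitespace. -->
    <p:with-input port="source">
      <p:inline content-type="text/plain">31 December 2021</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
```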
This would produce an XML version of the date:
<date><day>31</day><month>December</month><year>2021</year></date>
If a parse fails, the implementation must indicate this, but it may also provide information about where the processing failed.
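As a sketch, a pipeline that lets a failed parse through to the result port could set fail-on-error to false. The file name date.ixml (assumed to hold the date grammar as text) and the source string are hypothetical, chosen only so that the parse fails inside the month name.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- With fail-on-error false, err:XC0205 is not raised; the failure
       is reported in the result document instead. -->
  <p:ixml fail-on-error="false">
    <p:with-input port="grammar">
      <!-- Hypothetical file containing the "date" grammar. -->
      <p:document href="date.ixml" content-type="text/plain"/>
    </p:with-input>
    <p:with-input port="source">
      <!-- Hypothetical input that is not a valid date. -->
      <p:inline content-type="text/plain">31 Mxy 2021</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
```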
Here the output might be something like this:
<ixml xmlns:ixml="http://invisiblexml.org/NS"
      xmlns:ex="http://example.com/NS"
      ixml:state="failed"
      ex:lastChar="4">
  <parse>
    month -> • M a r c h
    month -> M • a r c h
  </parse>
  <parse>
    month -> • M a y
    month -> M • a y
  </parse>
</ixml>
There is nothing standard about this markup except the ixml:state attribute with the value “failed”.
An ixml grammar may be ambiguous. In the grammar below, there are three different possible ways to parse the input. By default, one of them is returned.
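The following sketch shows one grammar with this property. The grammar and the source string are assumptions, reconstructed to be consistent with the element names in the results shown in this section: the rules A, B, and C all match the same digits, so the input has three parses.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:ixml>
    <!-- Assumed grammar: A, B, and C are interchangeable, so any
         input that matches has three possible parses. -->
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">letters: X, (A; B; C) .
X: ["a"-"z"] .
A: digits .
B: digits .
C: digits .
digits: ["0"-"9"]+ .</p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">a123</p:inline>
    </p:with-input>
  </p:ixml>
</p:declare-step>
```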
This might return any one of these parses:
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><C><digits>123</digits></C></letters>
or
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><A><digits>123</digits></A></letters>
or
<letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"><X>a</X><B><digits>123</digits></B></letters>
All are equally correct.
An implementation might provide a parameter to allow the author to select a particular parse:
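A sketch of how such a parameter could be passed follows. The parameter name ex:select-parse, its namespace, and its value are entirely hypothetical; no processor is known to define them, and the grammar is the same assumed ambiguous grammar with rules A, B, and C.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:ixml>
    <p:with-input port="grammar">
      <p:inline content-type="text/plain">letters: X, (A; B; C) .
X: ["a"-"z"] .
A: digits .
B: digits .
C: digits .
digits: ["0"-"9"]+ .</p:inline>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">a123</p:inline>
    </p:with-input>
    <!-- Hypothetical implementation-defined parameter; the key is
         built with QName() to match the map(xs:QName, item()*) type. -->
    <p:with-option name="parameters"
                   select="map{ QName('http://example.com/NS', 'ex:select-parse'): 'A' }"/>
  </p:ixml>
</p:declare-step>
```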
This might return:
<letters><X>a</X><A><digits>123</digits></A></letters>
Or a processor might provide a parameter to return all of the parses.
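Again as a sketch only: the parameter name ex:all-parses is hypothetical, and letters.ixml is an assumed file holding an ambiguous grammar as text.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:ixml>
    <p:with-input port="grammar">
      <!-- Hypothetical file containing the ambiguous grammar. -->
      <p:document href="letters.ixml" content-type="text/plain"/>
    </p:with-input>
    <p:with-input port="source">
      <p:inline content-type="text/plain">a123</p:inline>
    </p:with-input>
    <!-- Hypothetical implementation-defined parameter asking for
         every parse rather than a single one. -->
    <p:with-option name="parameters"
                   select="map{ QName('http://example.com/NS', 'ex:all-parses'): true() }"/>
  </p:ixml>
</p:declare-step>
```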
This might return:
<ixml parseCount='3'>
  <letters><X>a</X><C><digits>123</digits></C></letters>
  <letters><X>a</X><B><digits>123</digits></B></letters>
  <letters><X>a</X><A><digits>123</digits></A></letters>
</ixml>
As before, there is nothing standardized about the results in this case.
No document properties are preserved.
This step can raise dynamic errors.
[Definition: A dynamic error is one which occurs while a pipeline is being evaluated.] Examples of dynamic errors include references to URIs that cannot be resolved, steps which fail, and pipelines that exhaust the capacity of an implementation (such as memory or disk space). For a more complete discussion of dynamic errors, see Dynamic Errors in XProc 3.0: An XML Pipeline Language.
If a step fails due to a dynamic error, failure propagates upwards until either a p:try is encountered or the entire pipeline fails. In other words, outside of a p:try, step failure causes the entire pipeline to fail.
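For illustration, a pipeline could contain a parse failure with p:try and p:catch, as in the minimal sketch below. The one-rule grammar, the source string, and the not-parsed placeholder element are assumptions; the error code is the one this step raises when fail-on-error is true.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:err="http://www.w3.org/ns/xproc-error"
                version="3.0">
  <p:output port="result"/>

  <p:try>
    <!-- fail-on-error defaults to true, so a source that cannot be
         parsed raises err:XC0205. -->
    <p:ixml>
      <p:with-input port="grammar">
        <p:inline content-type="text/plain">number: ["0"-"9"]+ .</p:inline>
      </p:with-input>
      <p:with-input port="source">
        <p:inline content-type="text/plain">abc</p:inline>
      </p:with-input>
    </p:ixml>
    <p:catch code="err:XC0205">
      <!-- Recover with a placeholder document instead of letting the
           whole pipeline fail. -->
      <p:identity>
        <p:with-input>
          <p:inline><not-parsed/></p:inline>
        </p:with-input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

Without the p:try, the same failure would propagate upwards and the whole pipeline would fail.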
The following errors can be raised by this step:
err:XC0205
It is a dynamic error if the source document cannot be parsed by the provided grammar.
See: p:ixml
A. Conformance
Conformant processors must implement all of the features described in this specification except those that are explicitly identified as optional.
Some aspects of processor behavior are not completely specified; those features are either implementation-dependent or implementation-defined.
[Definition: An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.]
[Definition: An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.]
The following features are implementation-defined:
- If any other grammar format is provided, its interpretation is implementation-defined. See Section 2, “p:ixml”.
- It is implementation-defined if other result formats are possible. See Section 2, “p:ixml”.
- The parameters are implementation-defined. See Section 2, “p:ixml”.
This step has no implementation-dependent features.
B. References
[XProc 3.0] XProc 3.0: An XML Pipeline Language. Norman Walsh, Achim Berndzen, Gerrit Imsieke and Erik Siegel, editors.
[XProc 3.0 Steps] XProc 3.0 Steps: An Introduction. Norman Walsh, Achim Berndzen, Gerrit Imsieke and Erik Siegel, editors.
[Invisible XML] Invisible XML Specification, version 1.0. Steven Pemberton, editor. Version 2022-06-20.
C. Glossary
- dynamic error
A dynamic error is one which occurs while a pipeline is being evaluated.
- implementation-defined
An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.
- implementation-dependent
An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.
D. Ancillary files
This specification includes by reference a number of ancillary files.
- steps.xpl
An XProc step library for the declared steps.