<?xml version="1.0" encoding="UTF-8"?><specification xmlns="http://docbook.org/ns/docbook" xmlns:cs="http://www.w3.org/XML/XProc/2006/04/components#" xmlns:e="http://www.w3.org/1999/XSL/Spec/ElementSyntax" xmlns:p="http://www.w3.org/ns/xproc" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="step-ixml" class="ed" role="step" version="5.0-extension w3c-xproc">
<info>
<title>XProc 3.1: Invisible XML</title>
<!-- defaults to date formatted <pubdate>2014-12-18</pubdate> -->
<copyright><year>2021</year>
<year>2022</year><year>2023</year><year>2024</year><year>2025</year>
<holder>the Contributors to the XProc 3.1 Standard Step Library
specifications</holder>
</copyright>

<bibliomisc role="repository">XProc/specification</bibliomisc>
<bibliomisc role="w3c-cg" xlink:href="https://www.w3.org/community/xproc-next/">XProc Next</bibliomisc>

<bibliorelation type="isformatof" xlink:href="specification.xml">XML</bibliorelation>
<authorgroup>
  <author>
    <personname>Norman Walsh</personname>
  </author>
</authorgroup>

<abstract>
<para>This specification describes the <code>p:invisible-xml</code>
step for
<citetitle>XProc 3.1: An XML Pipeline Language</citetitle>.</para>
</abstract>

<legalnotice xml:id="sotd" role="status">
  <para>This specification was published by the
  <link xlink:href="https://www.w3.org/community/xproc-next/">XProc
  Next Community Group</link>. It is not a W3C Standard nor is it on
  the W3C Standards Track. Please note that under the
  <link xlink:href="https://www.w3.org/community/about/agreements/cla/">W3C
  Community Contributor License Agreement (CLA)</link> there is a limited
  opt-out and other conditions apply. Learn more about <link xlink:href="https://www.w3.org/community/">W3C Community and Business
  Groups</link>.
  </para>
  
  <para>If you wish to make comments regarding this document, please
  send them to
  <link xlink:href="mailto:xproc-dev@w3.org">xproc-dev@w3.org</link>.
  (<link xlink:href="mailto:xproc-dev-request@w3.org?subject=subscribe">subscribe</link>,
  <link xlink:href="https://lists.w3.org/Archives/Public/xproc-dev/">archives</link>).
  </para>

<note role="editorial">
<para>This draft is the “editor’s working draft” and may continue to evolve.
</para>
</note>
</legalnotice>
</info>

<section xml:id="introduction">
<title>Introduction</title>

<para>This specification describes the <code>p:invisible-xml</code> XProc step.
A machine-readable description of this step may be found in
<link xlink:href="steps.xpl">steps.xpl</link>.
</para>
  
<para>Familarity with the general nature of <biblioref linkend="xproc31"/>
steps is assumed.</para>
</section>

<section xml:id="library">
<title>Step library</title>

<section xml:id="c.ixml">
<title>p:invisible-xml</title>

<para>The <tag>p:invisible-xml</tag> step performs Invisible XML processing per
<biblioref linkend="invisible-xml"/>. It transforms a non-XML input into XML by applying
the specified Invisible XML grammar.</para>

<p:declare-step type="p:invisible-xml">
  <p:input port="grammar" sequence="true" content-types="text xml"/>
  <p:input port="source" primary="true" content-types="any -xml -html"/>
  <p:output port="result" sequence="true" content-types="any"/>
  <p:option name="parameters" as="map(xs:QName, item()*)?"/>    
  <p:option name="fail-on-error" as="xs:boolean" select="true()"/>
</p:declare-step>

<para>If no grammar is provided on the <port>grammar</port> port, the grammar for
Invisible XML is assumed. If an XML or text grammar is provided it
<rfc2119>must</rfc2119> be an Invisible XML grammar. 
<error code="C0212">It is a <glossterm>dynamic error</glossterm> if the grammar
provided is not a valid Invisible XML grammar.</error>
<error code="C0211">It is a <glossterm>dynamic error</glossterm> if more than one
document appears on the <port>grammar</port> port.</error></para>

<para>The <port>source</port> to be processed is usually text, but
there’s nothing in principle that prevents an Invisible XML grammar
from applying to an arbitrary sequence of characters.</para>

<para>The <port>result</port> <rfc2119>should</rfc2119> be XML. 
<impl>It is <glossterm>implementation-defined</glossterm> if other result
formats are possible.</impl> (An implementation might, for example, provide a
way for the <tag>p:invisible-xml</tag> step to compile an Invisible XML grammar into some
format that can be processed more efficiently.)</para>

<itemizedlist>
<listitem>
<para><impl>The <option>parameters</option> are 
<glossterm>implementation-defined</glossterm>.</impl> An implementation might
provide parameters to select among different ambiguous parses or choose
alternate representations.</para>
</listitem>
<listitem>
<para>If <option>fail-on-error</option> is <code>true</code>, the step will
raise an error if the input cannot be parsed by the grammar.
<error code="C0205">It is a <glossterm>dynamic error</glossterm>
if the source document cannot be parsed by the provided grammar.</error>
If <option>fail-on-error</option> is <code>false</code>, no error will be raised.
</para>
<para>The Invisible XML specification provides a mechanism to identify failed parses
in the output.</para>
</listitem>
</itemizedlist>

<section xml:id="example-ixml">
<title>Examples</title>

<para>Several examples demonstrate features of the step.</para>

<section xml:id="example-parse-ixml">
<title>Parsing an Invisible XML grammar</title>

<para>In this first example, no grammar is provided, so the pipeline parses the
Invisible XML grammar on the <port>source</port> port and returns its XML
representation:</para>

<programlisting language="xml">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="3.1"&gt;
&lt;p:output port="result"/&gt;

&lt;p:ixml&gt;
  &lt;p:with-input port="grammar"&gt;&lt;p:empty /&gt;&lt;/p:with-input&gt;
  &lt;p:with-input port="source"&gt;
    &lt;p:inline content-type="text/plain"&gt;
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April";
       "May"; "June"; "July"; "August";
       "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
    &lt;/p:inline&gt;
  &lt;/p:with-input&gt;
&lt;/p:ixml&gt;

&lt;/p:declare-step&gt;
</programlisting>

<para>This would produce an XML version of the grammar:</para>

<programlisting language="xml">&lt;ixml&gt;
   &lt;rule name="date"&gt;
      &lt;alt&gt;
         &lt;option&gt;
            &lt;nonterminal name="s"/&gt;
         &lt;/option&gt;
         &lt;nonterminal name="day"/&gt;
         &lt;nonterminal name="s"/&gt;
         &lt;nonterminal name="month"/&gt;
         &lt;option&gt;
            &lt;alts&gt;
               &lt;alt&gt;
                  &lt;nonterminal name="s"/&gt;
                  &lt;nonterminal name="year"/&gt;
               &lt;/alt&gt;
            &lt;/alts&gt;
         &lt;/option&gt;
      &lt;/alt&gt;
   &lt;/rule&gt;
   &lt;!-- … remaining rules elided for brevity … --&gt;
&lt;/ixml&gt;</programlisting>

</section>
<section xml:id="example-parse-date">
<title>Parsing a date</title>

<para>If the grammar is provided on the <port>grammar</port> port, it can be
used to parse input, the string “31 December 2021” in this case:</para>

<programlisting language="xml">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="3.1"&gt;
&lt;p:output port="result"/&gt;

&lt;p:ixml&gt;
  &lt;p:with-input port="grammar"&gt;
    &lt;p:inline content-type="text/plain"&gt;
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April";
       "May"; "June"; "July"; "August";
       "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
    &lt;/p:inline&gt;
  &lt;/p:with-input&gt;
  &lt;p:with-input port="source"&gt;
    &lt;p:inline content-type="text/plain"&gt;31 December 2021&lt;/p:inline&gt;
  &lt;/p:with-input&gt;
&lt;/p:ixml&gt;

&lt;/p:declare-step&gt;
</programlisting>

<para>This would produce an XML version of the date:</para>

<programlisting language="xml">&lt;date&gt;&lt;day&gt;31&lt;/day&gt;&lt;month&gt;December&lt;/month&gt;&lt;year&gt;2021&lt;/year&gt;&lt;/date&gt;</programlisting>

</section>
<section xml:id="example-parse-fail">
<title>Failed parses</title>

<para>If a parse fails, the implementation <rfc2119>must</rfc2119> indicate
this, but it may also provide information about where the processing failed.</para>

<programlisting language="xml">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="3.1"&gt;
&lt;p:output port="result"/&gt;

&lt;p:ixml fail-on-error="false"&gt;
  &lt;p:with-input port="grammar"&gt;
    &lt;p:inline content-type="text/plain"&gt;
date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April";
       "May"; "June"; "July"; "August";
       "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .
    &lt;/p:inline&gt;
  &lt;/p:with-input&gt;
  &lt;p:with-input port="source"&gt;
    &lt;p:inline content-type="text/plain"&gt;31 Mumble 2021&lt;/p:inline&gt;
  &lt;/p:with-input&gt;
&lt;/p:ixml&gt;

&lt;/p:declare-step&gt;
</programlisting>

<para>Here the output might be something like this:</para>

<programlisting language="xml">&lt;error xmlns:ixml="http://invisiblexml.org/NS"
       xmlns:ex="http://example.com/NS"
       ixml:state="failed" ex:lastChar="4"&gt;
&lt;parse&gt;
month -&gt;  •  M  a  r  c  h
month -&gt;  M  •  a  r  c  h
&lt;/parse&gt;
&lt;parse&gt;
month -&gt;  •  M  a  y
month -&gt;  M  •  a  y
&lt;/parse&gt;
&lt;/error&gt;</programlisting>

<para>In the case of failure, Invisible XML requires that the <tag class="attribute">ixml:state</tag> attribute appear on the root element
containing the token “<code>failed</code>”. It doesn’t constrain the implementation’s
choice of the root element or the content of the document.
</para>

</section>
<section xml:id="example-parse-ambiguous-1">
<title>Ambiguous parses</title>

<para>An ixml grammar may be ambiguous. In the grammar below, there are three
different possible ways to parse the input. By default, one of them is returned.
</para>

<programlisting language="xml">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="3.1"&gt;
&lt;p:output port="result"/&gt;

&lt;p:ixml&gt;
  &lt;p:with-input port="grammar"&gt;
    &lt;p:inline content-type="text/plain"&gt;
letters: X, (A; B; C) .
A: digits .
B: digits .
C: digits .
X: "a" .
digits: ["0"-"9"]+ .
    &lt;/p:inline&gt;
  &lt;/p:with-input&gt;
  &lt;p:with-input port="source"&gt;
    &lt;p:inline content-type="text/plain"&gt;a123&lt;/p:inline&gt;
  &lt;/p:with-input&gt;
&lt;/p:ixml&gt;

&lt;/p:declare-step&gt;
</programlisting>

<para>This might return any one of these parses:</para>

<programlisting language="xml">&lt;letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"&gt;&lt;X&gt;a&lt;/X&gt;&lt;C&gt;&lt;digits&gt;123&lt;/digits&gt;&lt;/C&gt;&lt;/letters&gt;</programlisting>

<para>or</para>

<programlisting language="xml">&lt;letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"&gt;&lt;X&gt;a&lt;/X&gt;&lt;A&gt;&lt;digits&gt;123&lt;/digits&gt;&lt;/A&gt;&lt;/letters&gt;</programlisting>

<para>or</para>

<programlisting language="xml">&lt;letters ixml:state="ambiguous" xmlns:ixml="http://invisiblexml.org/NS"&gt;&lt;X&gt;a&lt;/X&gt;&lt;B&gt;&lt;digits&gt;123&lt;/digits&gt;&lt;/B&gt;&lt;/letters&gt;</programlisting>

<para>All are equally correct.</para>
</section>

<section xml:id="example-parse-ambiguous-2">
<title>Ambiguous parse selection</title>

<para>An implementation might provide a parameter to allow the author to
select a particular parse:
</para>

<programlisting language="xml">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/"
                version="3.1"&gt;
&lt;p:output port="result"/&gt;

&lt;p:ixml parameters="map{'ex:select':2}"&gt;
  &lt;p:with-input port="grammar"&gt;
    &lt;p:inline content-type="text/plain"&gt;
letters: X, (A; B; C) .
A: digits .
B: digits .
C: digits .
X: "a" .
digits: ["0"-"9"]+ .
    &lt;/p:inline&gt;
  &lt;/p:with-input&gt;
  &lt;p:with-input port="source"&gt;
    &lt;p:inline content-type="text/plain"&gt;a123&lt;/p:inline&gt;
  &lt;/p:with-input&gt;
&lt;/p:ixml&gt;

&lt;/p:declare-step&gt;
</programlisting>

<para>This might return:</para>

<programlisting language="xml">&lt;letters ixml:state="ambiguous"&gt;&lt;X&gt;a&lt;/X&gt;&lt;A&gt;&lt;digits&gt;123&lt;/digits&gt;&lt;/A&gt;&lt;/letters&gt;</programlisting>
</section>

<section xml:id="example-parse-ambiguous-3">
<title>Multiple ambiguous outputs</title>

<para>Or a processor might provide a parameter to return all of the parses.</para>

<programlisting language="xml">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/"
                version="3.1"&gt;
&lt;p:output port="result"/&gt;

&lt;p:ixml parameters="map{'ex:select':'all'}"&gt;
  &lt;p:with-input port="grammar"&gt;
    &lt;p:inline content-type="text/plain"&gt;
letters: X, (A; B; C) .
A: digits .
B: digits .
C: digits .
X: "a" .
digits: ["0"-"9"]+ .
    &lt;/p:inline&gt;
  &lt;/p:with-input&gt;
  &lt;p:with-input port="source"&gt;
    &lt;p:inline content-type="text/plain"&gt;a123&lt;/p:inline&gt;
  &lt;/p:with-input&gt;
&lt;/p:ixml&gt;

&lt;/p:declare-step&gt;
</programlisting>

<para>This might return three documents:</para>

<programlisting language="xml">&lt;letters ixml:state="ambiguous"&gt;&lt;X&gt;a&lt;/X&gt;&lt;C&gt;&lt;digits&gt;123&lt;/digits&gt;&lt;/C&gt;&lt;/letters&gt;
&lt;letters ixml:state="ambiguous"&gt;&lt;X&gt;a&lt;/X&gt;&lt;B&gt;&lt;digits&gt;123&lt;/digits&gt;&lt;/B&gt;&lt;/letters&gt;
&lt;letters ixml:state="ambiguous"&gt;&lt;X&gt;a&lt;/X&gt;&lt;A&gt;&lt;digits&gt;123&lt;/digits&gt;&lt;/A&gt;&lt;/letters&gt;</programlisting>

<para>As before, there is nothing standardized about the results in this case.</para>
</section>
</section>

<section>
<title>Formerly the <code>p:ixml</code> step</title>

<para>In earlier drafts of this specification, the <tag>p:invisible-xml</tag> step
was named <code>p:ixml</code>. Changing the name of the step to <tag>p:invisible-xml</tag>
brings the name into better alignment with the naming conventions used for most other
XProc steps and aligns its name with the <code>fn:invisible-xml()</code> function
in XPath 4.0.</para>

<para>Implementors may wish to support both names for some period of time in order
to avoid breaking changes in existing pipelines. <impl>Support for the alternative name <code>p:ixml</code>
is <glossterm>implementation-defined</glossterm>.</impl></para>
</section>
  
<simplesect>
<title>Document properties</title>
<para feature="exec-preserves-none">No document properties are preserved.</para>
</simplesect>
</section>
</section>

<section xml:id="errors">
<title>Step Errors</title>

<para>This step can raise
<glossterm role="unwrapped" baseform="dynamic error">dynamic errors</glossterm>.
</para>

<para><termdef xml:id="dt-dynamic-error">A <firstterm>dynamic
error</firstterm> is one which occurs while a pipeline is being
evaluated.</termdef> Examples of dynamic errors include references to
URIs that cannot be resolved, steps which fail, and pipelines that
exhaust the capacity of an implementation (such as memory or disk
space). For a more complete discussion of dynamic errors, see
<xspecref spec="xproc" xref="dynamic-errors"/>.
</para>

<para>If a step fails due to a dynamic error, failure propagates
upwards until either a <tag>p:try</tag> is encountered or the entire
pipeline fails. In other words, outside of a <tag>p:try</tag>, step
failure causes the entire pipeline to fail.</para>

<para>The following errors can be raised by this step:</para>

<?step-error-list level="none"?>

</section>

<appendix xml:id="conformance">
<title>Conformance</title>

<para>Conformant processors <rfc2119>must</rfc2119> implement all of the features
described in this specification except those that are explicitly identified
as optional.</para>

<para>Some aspects of processor behavior are not completely specified; those
features are either <glossterm role="unwrapped">implementation-dependent</glossterm> or
<glossterm role="unwrapped">implementation-defined</glossterm>.</para>

<para><termdef xml:id="dt-implementation-dependent">An
<firstterm>implementation-dependent</firstterm> feature is one where the
implementation has discretion in how it is performed.
Implementations are not required to document or explain
how <glossterm role="unwrapped">implementation-dependent</glossterm> features are performed.</termdef>
</para>

<para><termdef xml:id="dt-implementation-defined">An
<firstterm>implementation-defined</firstterm> feature is one where the
implementation has discretion in how it is performed.
Conformant implementations <rfc2119>must</rfc2119> document
how <glossterm role="unwrapped">implementation-defined</glossterm> features are performed.</termdef>
</para>

<section xml:id="implementation-defined">
<title>Implementation-defined features</title>

<para>The following features are implementation-defined:</para>

<?implementation-defined-features?>
</section>

<section xml:id="implementation-dependent">
<title>Implementation-dependent features</title>

<para>This step has no implementation-dependent features.</para>

<?implementation-dependent-features?>
</section>
</appendix>

<appendix xml:id="references">
<title>References</title>
<bibliolist>
<bibliomixed xml:id="xproc31"/>
<bibliomixed xml:id="invisible-xml"/>
</bibliolist>
</appendix>

<!-- This glossary will automatically be elided if there are no
     terms marked up as 'firstterm's in this specification. -->

    <glossary xml:id="glossary">
      <title>Glossary</title>
      <para>Glossary needs to be generated</para>
    </glossary>
  

<appendix version="5.0-extension w3c-xproc" xml:id="ancillary-files">
<title>Ancillary files</title>

<para>This specification includes by reference a number of
ancillary files.</para>

<variablelist>
<varlistentry>
<term><link xlink:href="steps.xpl"/></term>
<listitem>
<para>An XProc step library for the declared steps.
</para>
</listitem>
</varlistentry>
</variablelist>

</appendix>

</specification>