XProc 3.0: Optional Validation Steps

Editor's Draft

This Version:
https://xproc.github.io/3.0-specification/master/head/validation/
Latest Version:
http://spec.xproc.org/master/head/validation/
Editors:
Achim Berndzen
Gerrit Imsieke
Erik Siegel
Norman Walsh
Repository:
This specification on GitHub
Report an issue
Changes:
Commits for this specification

This document is also available in these non-normative formats: XML.


Abstract

This specification describes the p:validate-with-relax-ng, p:validate-with-schematron, and p:validate-with-xml-schema steps for XProc 3.0: An XML Pipeline Language.

Status of this Document

This document is an editor's draft that has no official standing.

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is derived from XProc: An XML Pipeline Language published by the W3C.


1 Introduction

This specification describes the p:validate-with-relax-ng, p:validate-with-schematron, and p:validate-with-xml-schema steps. Each is independently optional. A machine-readable description of these steps may be found in steps.xpl.

Familiarity with the general nature of [XProc 3.0] steps is assumed; for background details, see [XProc 3.0 Steps].

2 Validate with RELAX NG

The p:validate-with-relax-ng step applies [RELAX NG] validation to the source document.

<p:declare-step type="p:validate-with-relax-ng">
     <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
     <p:input port="schema" content-types="application/xml */*+xml text/*"/>
     <p:output port="result" content-types="application/xml"/>
     <p:option name="dtd-attribute-values" select="false()" as="xs:boolean"/>
     <p:option name="dtd-id-idref-warnings" select="false()" as="xs:boolean"/>
     <p:option name="assert-valid" select="true()" as="xs:boolean"/>
</p:declare-step>

The values of the dtd-attribute-values and dtd-id-idref-warnings options must be booleans.

If the schema document has an XML media type, then it must be interpreted as a RELAX NG Grammar. If the schema document has the media type “application/relax-ng-compact-syntax” or the media type has a “text” type, then it must be interpreted as a [RELAX NG Compact Syntax] document for validation.
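As a minimal sketch of how the media type selects the schema syntax, the following pipeline supplies a compact-syntax schema and states its content type explicitly on p:document. The file name schema.rnc is hypothetical, and the sketch does not assume that an implementation would infer the content type from the file extension.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true"/>

  <p:validate-with-relax-ng>
    <!-- schema.rnc is a hypothetical file name. The explicit content type
         marks the document as RELAX NG Compact Syntax. -->
    <p:with-input port="schema">
      <p:document href="schema.rnc"
                  content-type="application/relax-ng-compact-syntax"/>
    </p:with-input>
  </p:validate-with-relax-ng>
</p:declare-step>
```

A schema in the XML syntax could be supplied the same way without the content-type attribute, provided it is loaded with an XML media type.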

If the dtd-attribute-values option is true, then the attribute value defaulting conventions of [RELAX NG DTD Compatibility] are also applied.

If the dtd-id-idref-warnings option is true, then the validator should treat a schema that is incompatible with the ID/IDREF/IDREFS feature of [RELAX NG DTD Compatibility] as if the document were invalid.

It is a dynamic error (err:XC0053) if the assert-valid option is true and the input document is not valid.

The output from this step is a copy of the input, possibly augmented by application of the [RELAX NG DTD Compatibility]. The output of this step may include PSVI annotations.

Support for [RELAX NG DTD Compatibility] is implementation-defined.

2.1 Document properties

All document properties on the source port are preserved on the result port. No document properties on the schema port are preserved.

3 Validate with Schematron

The p:validate-with-schematron step applies [Schematron] processing to the source document.

<p:declare-step type="p:validate-with-schematron">
     <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
     <p:input port="schema" content-types="application/xml text/xml */*+xml"/>
     <p:output port="result" primary="true" content-types="application/xml"/>
     <p:output port="report" sequence="true" content-types="application/xml"/>
     <p:option name="parameters" as="map(xs:QName,item())"/>
     <p:option name="phase" select="'#ALL'" as="xs:string"/>
     <p:option name="assert-valid" select="true()" as="xs:boolean"/>
</p:declare-step>

It is a dynamic error (err:XC0054) if the assert-valid option is true and any Schematron assertions fail or reports succeed.

The value of the phase option identifies the Schematron validation phase with which validation begins.

The parameters option provides name/value pairs which correspond to Schematron external variables.
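The phase and parameters options might be set as in the following sketch. The schema file rules.sch, the phase name publication, and the external variable max-depth are all hypothetical; the map key is built with fn:QName so that it matches the declared map(xs:QName,item()) type.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true"/>

  <!-- "publication" is a hypothetical phase assumed to be defined in rules.sch -->
  <p:validate-with-schematron phase="publication">
    <!-- rules.sch is a hypothetical Schematron schema -->
    <p:with-input port="schema" href="rules.sch"/>
    <!-- max-depth is a hypothetical external variable in no namespace -->
    <p:with-option name="parameters"
                   select="map{ QName('', 'max-depth'): 3 }"/>
  </p:validate-with-schematron>
</p:declare-step>
```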

The result output from this step is a copy of the input.

Schematron assertions and reports, if any, must appear on the report port. The output should be in Schematron Validation Report Language (SVRL).

The output of this step may include PSVI annotations.
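To inspect the validation report rather than fail the pipeline, assert-valid can be set to false and the report port read explicitly. The following is a sketch under that assumption; the step name check and the file rules.sch are hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source" primary="true"/>
  <!-- Copy of the source document, taken from the step's primary output -->
  <p:output port="result" primary="true"/>
  <!-- Validation report (expected to be SVRL) read from the step named "check" -->
  <p:output port="report" sequence="true" pipe="report@check"/>

  <!-- With assert-valid false, failed assertions do not raise err:XC0054 -->
  <p:validate-with-schematron name="check" assert-valid="false">
    <p:with-input port="schema" href="rules.sch"/>
  </p:validate-with-schematron>
</p:declare-step>
```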

3.1 Document properties

All document properties on the source port are preserved on the result port. No document properties on the schema port are preserved. No document properties are preserved on the report port.

4 Validate with XML Schema

The p:validate-with-xml-schema step applies [W3C XML Schema: Part 1] validity assessment to the source input.

<p:declare-step type="p:validate-with-xml-schema">
     <p:input port="source" primary="true" content-types="application/xml text/xml */*+xml"/>
     <p:input port="schema" sequence="true" content-types="application/xml text/xml */*+xml"/>
     <p:output port="result" content-types="application/xml"/>
     <p:option name="use-location-hints" select="false()" as="xs:boolean"/>
     <p:option name="try-namespaces" select="false()" as="xs:boolean"/>
     <p:option name="assert-valid" select="true()" as="xs:boolean"/>
     <p:option name="mode" select="'strict'" as="xs:token"/>       <!-- "strict" | "lax" -->
     <p:option name="version" as="xs:string"/>
</p:declare-step>

The values of the use-location-hints, try-namespaces, and assert-valid options must be booleans.

The value of the mode option must be an NMTOKEN whose value is either “strict” or “lax”.

Validation is performed against the set of schemas represented by the documents on the schema port. These schemas must be used in preference to any schema locations provided by schema location hints encountered during schema validation, that is, schema locations supplied for xs:import or xsi:schemaLocation, or determined by schema-processor-defined namespace-based strategies, for the namespaces covered by the documents available on the schema port.

If xs:include elements occur within the supplied schema documents, they are treated like any other external documents (see [XProc 3.0]). It is implementation-defined whether the documents supplied on the schema port are considered when resolving xs:include elements in the schema documents provided.

The use-location-hints and try-namespaces options allow the pipeline author to control how the schema processor should attempt to locate schema documents necessary but not provided on the schema port. Any schema documents provided on the schema port must be used in preference to schema documents located by other means.

If the use-location-hints option is “true”, the processor should make use of schema location hints to locate schema documents. If the option is “false”, the processor should ignore any such hints.

If the try-namespaces option is “true”, the processor should attempt to dereference the namespace URI to locate schema documents. If the option is “false”, the processor should not dereference namespace URIs.
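As an illustrative sketch, a pipeline that supplies no schema documents and relies on location hints in the instance document might look like this. Binding the schema port to p:empty is a choice made here for explicitness, and whether the hints actually resolve depends on the processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true"/>

  <!-- Follow schema location hints, but do not dereference namespace URIs -->
  <p:validate-with-xml-schema use-location-hints="true"
                              try-namespaces="false">
    <!-- No schema documents are supplied on the schema port -->
    <p:with-input port="schema">
      <p:empty/>
    </p:with-input>
  </p:validate-with-xml-schema>
</p:declare-step>
```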

The mode option allows the pipeline author to control how schema validation begins. The “strict” mode means that the document element must be declared and schema-valid, otherwise it will be treated as invalid. The “lax” mode means that the absence of a declaration for the document element does not itself count as an unsuccessful outcome of validation.

If the step specifies a version, then that version of XML Schema must be used to process the validation. It is a dynamic error (err:XC0038) if the specified version is not available. If the step does not specify a version, the implementation may use any version it has available and may use any means to determine what version to use, including, but not limited to, examining the version of the schema(s).
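The mode and version options could be combined as sketched below. The value "1.1" is an assumption about how a processor names XML Schema versions, since this specification only types the option as xs:string; schema.xsd is a hypothetical file.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true"/>

  <!-- "1.1" is an assumed version string; err:XC0038 is raised if the
       requested version is not available -->
  <p:validate-with-xml-schema mode="lax" version="1.1">
    <!-- schema.xsd is a hypothetical schema document -->
    <p:with-input port="schema" href="schema.xsd"/>
  </p:validate-with-xml-schema>
</p:declare-step>
```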

It is a dynamic error (err:XC0053) if the assert-valid option is true and the input document is not valid. If the assert-valid option is false, it is not an error for the document to be invalid. In this case, if the implementation does not support the PSVI, p:validate-with-xml-schema is essentially just an “identity” step, but if the implementation does support the PSVI, then the resulting document will have additional type information (at least for the subtrees that are valid).

When XML Schema validation assessment is performed, the processor is invoked in the mode specified by the mode option. It is a dynamic error (err:XC0055) if the implementation does not support the specified mode.

The result of the assessment is a document with the Post-Schema-Validation-Infoset (PSVI) ([W3C XML Schema: Part 1]) annotations, if the pipeline implementation supports such annotations. If not, the input document is reproduced with any defaulting of attributes and elements performed as specified by the XML Schema recommendation.

4.1 Document properties

All document properties on the source port are preserved on the result port. No document properties on the schema port are preserved.

5 Step Errors

These steps can raise dynamic errors.

[Definition: A dynamic error is one which occurs while a pipeline is being evaluated.] Examples of dynamic errors include references to URIs that cannot be resolved, steps which fail, and pipelines that exhaust the capacity of an implementation (such as memory or disk space). For a more complete discussion of dynamic errors, see Dynamic Errors in XProc 3.0: An XML Pipeline Language.

If a step fails due to a dynamic error, failure propagates upwards until either a p:try is encountered or the entire pipeline fails. In other words, outside of a p:try, step failure causes the entire pipeline to fail.
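The propagation rule can be illustrated with a p:try that recovers from a validation failure. This is a sketch only: schema.xsd and the invalid-document element are hypothetical, and the p:catch is restricted to err:XC0053 so that other errors still propagate.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:err="http://www.w3.org/ns/xproc-error"
                version="3.0">
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true"/>

  <p:try>
    <!-- Raises err:XC0053 if the source document is not valid -->
    <p:validate-with-xml-schema>
      <p:with-input port="schema" href="schema.xsd"/>
    </p:validate-with-xml-schema>
    <!-- Only validation failures are caught here -->
    <p:catch code="err:XC0053">
      <p:identity>
        <p:with-input>
          <!-- Hypothetical placeholder emitted instead of the invalid document -->
          <invalid-document/>
        </p:with-input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

Without the p:try wrapper, the same validation failure would cause the entire pipeline to fail.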

The following errors can be raised by these steps:

err:XC0038

It is a dynamic error if the specified version is not available.

See: Validate with XML Schema

err:XC0053

It is a dynamic error if the assert-valid option is true and the input document is not valid.

See: Validate with RELAX NG, Validate with XML Schema

err:XC0054

It is a dynamic error if the assert-valid option is true and any Schematron assertions fail or reports succeed.

See: Validate with Schematron

err:XC0055

It is a dynamic error if the implementation does not support the specified mode.

See: Validate with XML Schema

A Conformance

Conformant processors must implement all of the features described in this specification except those that are explicitly identified as optional.

Some aspects of processor behavior are not completely specified; those features are either implementation-dependent or implementation-defined.

[Definition: An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.]

[Definition: An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.]

A.1 Implementation-defined features

The following features are implementation-defined:

  1. It is implementation-defined whether the documents supplied on the schema port are considered when resolving xs:include elements in the schema documents provided. See Section 4, “Validate with XML Schema”.

A.2 Implementation-dependent features

The following features are implementation-dependent:

B References

[XProc 3.0] XProc 3.0: An XML Pipeline Language. Achim Berndzen, Gerrit Imsieke, Erik Siegel and Norman Walsh, editors.

[XProc 3.0 Steps] XProc 3.0 Steps: An Introduction. Achim Berndzen, Gerrit Imsieke, Erik Siegel and Norman Walsh, editors.

[Schematron] ISO/IEC JTC 1/SC 34. ISO/IEC 19757-3:2016(E) Document Schema Definition Languages (DSDL) — Part 3: Rule-based validation — Schematron 2016.

[RELAX NG Compact Syntax] ISO/IEC JTC 1/SC 34. ISO/IEC 19757-2:2003/Amd 1:2006 Document Schema Definition Languages (DSDL) — Part 2: Grammar-based validation — RELAX NG AMENDMENT 1 Compact Syntax 2006.

[RELAX NG DTD Compatibility] RELAX NG DTD Compatibility. OASIS Committee Specification. 3 December 2001.

[W3C XML Schema: Part 1] XML Schema Part 1: Structures Second Edition. Henry S. Thompson, David Beech, Murray Maloney, et al., editors. World Wide Web Consortium, 28 October 2004.

C Glossary

dynamic error

A dynamic error is one which occurs while a pipeline is being evaluated.

implementation-defined

An implementation-defined feature is one where the implementation has discretion in how it is performed. Conformant implementations must document how implementation-defined features are performed.

implementation-dependent

An implementation-dependent feature is one where the implementation has discretion in how it is performed. Implementations are not required to document or explain how implementation-dependent features are performed.

D Ancillary files

This specification includes by reference a number of ancillary files.

steps.xpl

An XProc step library for the declared steps.