For a full glossary of terms, see C Glossary.
[Definition: The software responsible for transforming source trees into result trees using an XSLT stylesheet is referred to as the processor. This is sometimes expanded to XSLT processor to avoid any confusion with other processors, for example an XML processor.]
[Definition: A specific product that performs the functions of an XSLT processor is referred to as an implementation.]
[Definition: The term tree is used (as in [XDM 3.0]) to refer to the aggregate consisting of a parentless node together with all its descendant nodes, plus all their attributes and namespaces.]
Note:
The use of the term tree in this document does not imply the use of a data structure in memory that holds the entire contents of the document at one time. It implies rather a logical view of the XML input and output in which elements have a hierarchic relationship to each other. When a source document is being processed in a streaming manner, access to the nodes in this tree is constrained, but it is still viewed and described as a tree.
The output of a transformation consists of the following:
[Definition: A principal result: this can be any sequence of items (as defined in [XDM 3.0]).] The principal result is the value returned by the function or template in the stylesheet that is nominated as the entry point, as described in 2.3 Initiating a Transformation.
[Definition: Zero or more secondary results: each secondary result can be any sequence of items (as defined in [XDM 3.0]).] A secondary result is the value returned by evaluating the body of an xsl:result-document instruction.
Zero or more messages. Messages are generated by the xsl:message and xsl:assert instructions, and are described in 23.1 Messages and 23.2 Assertions.
Static or dynamic errors: see 2.14 Error Handling.
The principal result and the secondary results may be post-processed as described in 2.3.6 Post-processing the Raw Result.
[Definition: The term result tree is used to refer to any tree constructed by instructions in the stylesheet. A result tree is either a final result tree or a temporary tree.]
[Definition: A final result tree is a result tree that forms part of the output of a transformation: specifically, a tree built by post-processing the items in the principal result or in a secondary result. Once created, the contents of a final result tree are not accessible within the stylesheet itself.] Any final result tree may be serialized as described in 26 Serialization.
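As a minimal sketch of the distinction between the principal result and a secondary result, consider the following stylesheet. It is illustrative only: the element names report and summary and the URI summary.xml are assumptions made for the example, not names defined by this specification.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entry point: the value returned by this template rule
       is the principal result. -->
  <xsl:template match="/">
    <report>
      <xsl:copy-of select="*"/>
    </report>

    <!-- The value returned by the body of xsl:result-document
         is a secondary result (hypothetical URI). -->
    <xsl:result-document href="summary.xml">
      <summary count="{count(//*)}"/>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

If both results are post-processed into trees, each becomes a final result tree, and neither is accessible to the stylesheet once created.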
[Definition: The term source tree means any tree provided as input to the transformation. This includes the document containing the global context item if any, documents containing nodes present in the initial match selection, documents containing nodes supplied as the values of stylesheet parameters, documents obtained from the results of functions such as document, docFO30, and collectionFO30, documents read using the xsl:source-document instruction, and documents returned by extension functions or extension instructions. In the context of a particular XSLT instruction, the term source tree means any tree provided as input to that instruction; this may be a source tree of the transformation as a whole, or it may be a temporary tree produced during the course of the transformation.]
[Definition: The term temporary tree means any tree that is neither a source tree nor a final result tree.] Temporary trees are used to hold intermediate results during the execution of the transformation.
The use of the term “tree” in phrases such as source tree, result tree, and temporary tree is not confined to documents that the processor materializes in memory in their entirety. The processor may, and in some cases must, use streaming techniques to limit the amount of memory used to hold source and result documents. When streaming is used, the nodes of the tree may never all be in memory at the same time, but at an abstract level the information is still modeled as a tree of nodes, and the document is therefore still described as a tree. Unless otherwise stated, the term “tree” refers to a tree rooted at a parentless node: that is, the term does not include subtrees of larger trees. Every node therefore belongs to exactly one tree.
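The following sketch shows a temporary tree in use. The element names item, list, and row and the mode name row are hypothetical, chosen only for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- $sorted holds a temporary tree: it is constructed by the
         stylesheet, so it is a result tree, but it is not itself
         part of the output of the transformation. -->
    <xsl:variable name="sorted">
      <xsl:for-each select="//item">
        <xsl:sort select="@name"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:variable>

    <!-- For this instruction, the temporary tree acts as the
         source tree. -->
    <list>
      <xsl:apply-templates select="$sorted/item" mode="row"/>
    </list>
  </xsl:template>

  <xsl:template match="item" mode="row">
    <row name="{@name}"/>
  </xsl:template>

</xsl:stylesheet>
```

The copied item elements in $sorted belong to the temporary tree, not to the original source tree, which is consistent with the rule that every node belongs to exactly one tree.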
In this specification the phrases must, must not, should, should not, may, required, and recommended, when used in normative text and rendered in capitals, are to be interpreted as described in [RFC2119].
Where the phrase must, must not, or required relates to the behavior of the XSLT processor, then an implementation is not conformant unless it behaves as specified, subject to the more detailed rules in 27 Conformance.
Where the phrase must, must not, or required relates to a stylesheet then the processor must enforce this constraint on stylesheets by reporting an error if the constraint is not satisfied.
Where the phrase should, should not, or recommended relates to a stylesheet then a processor may produce warning messages if the constraint is not satisfied, but must not treat this as an error.
[Definition: In this specification, the term implementation-defined refers to a feature where the implementation is allowed some flexibility, and where the choices made by the implementation must be described in documentation that accompanies any conformance claim.]
[Definition: The term implementation-dependent refers to a feature where the behavior may vary from one implementation to another, and where the vendor is not expected to provide a full specification of the behavior.] (This might apply, for example, to limits on the size of source documents that can be transformed.)
In all cases where this specification leaves the behavior implementation-defined or implementation-dependent, the implementation has the option of providing mechanisms that allow the user to influence the behavior.
A paragraph labeled as a Note or described as an example is non-normative.
Many terms used in this document are defined in the XPath specification [XPath 3.0] or the XDM specification [XDM 3.0]. Particular attention is drawn to the following:
[Definition: The term atomization is defined in Section 2.4.2 Atomization XP30. It is a process that takes as input a sequence of items, and returns a sequence of atomic values, in which the nodes are replaced by their typed values as defined in [XDM 3.0]. If the XPath 3.1 Feature is implemented, then arrays (see 27.7.1 Arrays) are atomized by atomizing their members, recursively.] For some items (for example, elements with element-only content, function items, and maps), atomization generates a dynamic error.
[Definition: The term typed value is defined in Section 5.15 typed-value Accessor DM30. Every node, other than an element whose type annotation identifies it as having element-only content, has a typed value. For example, the typed value of an attribute of type xs:IDREFS is a sequence of zero or more xs:IDREF values.]
[Definition: The term string value is defined in Section 5.13 string-value Accessor DM30. Every node has a string value. For example, the string value of an element is the concatenation of the string values of all its descendant text nodes.]
[Definition: The term XPath 1.0 compatibility mode is defined in Section 2.1.1 Static Context XP30. This is a setting in the static context of an XPath expression; it has two values, true and false. When the value is set to true, the semantics of function calls and certain other operations are adjusted to give a greater degree of backwards compatibility between XPath 3.0 and XPath 1.0.]
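The terms string value, typed value, and atomization can be illustrated with a small sketch. It assumes a processor that is not schema-aware, so that the nodes are untyped; the element names item, name, price, and demo are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <xsl:variable name="item" as="element()">
      <item><name>Widget</name><price>9.50</price></item>
    </xsl:variable>
    <demo>
      <!-- String value of the element: the concatenation of the
           string values of its descendant text nodes. -->
      <string-value>
        <xsl:value-of select="string($item)"/>
      </string-value>
      <!-- The arithmetic operator atomizes its operand: the untyped
           price element yields an xs:untypedAtomic value, which is
           then converted to xs:double for the multiplication. -->
      <doubled>
        <xsl:value-of select="$item/price * 2"/>
      </doubled>
    </demo>
  </xsl:template>

</xsl:stylesheet>
```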
[Definition: An XSLT element is an element in the XSLT namespace whose syntax and semantics are defined in this specification.] For a non-normative list of XSLT elements, see D Element Syntax Summary.
In this document the specification of each XSLT element is preceded by a summary of its syntax in the form of a model for elements of that element type. A full list of all these specifications can be found in D Element Syntax Summary. The meaning of the syntax summary notation is as follows:
An attribute that is required is shown with its name in bold. An attribute that may be omitted is shown with a question mark following its name.
An attribute that is deprecated is shown in a grayed font within square brackets.
The string that occurs in the place of an attribute value specifies the allowed values of the attribute. If this is surrounded by curly brackets ({...}), then the attribute value is treated as an attribute value template, and the string occurring within curly brackets specifies the allowed values of the result of evaluating the attribute value template. Alternative allowed values are separated by |. A quoted string indicates a value equal to that specific string. An unquoted, italicized name specifies a particular type of value.
The types used, and their meanings, are as follows:
boolean
One of the strings "yes", "true", or "1" to indicate the value true, or one of the strings "no", "false", or "0" to indicate the value false. Note: the values are synonyms; where this specification uses a phrase such as “If required='yes' is specified ...” this is to be interpreted as meaning “If the attribute named required is present, and has the value yes, true, or 1 (after stripping leading and trailing whitespace) ...”.
string
Any string
expression
An XPath expression
pattern
A pattern as described in 5.5 Patterns.
item-type
An ItemTypeXP30 as defined in the XPath 3.0 specification (or in XPath 3.1 if the processor implements the XPath 3.1 Feature)
sequence-type
A SequenceTypeXP30 as defined in the XPath 3.0 specification (or in XPath 3.1 if the processor implements the XPath 3.1 Feature)
uri; uris
A URI, for example a namespace URI or a collation URI; a whitespace-separated list of URIs
qname
A lexical QName as defined in 5.1.1 Qualified Names
eqname; eqnames
An EQName as defined in 5.1.1 Qualified Names; a whitespace-separated list of EQNames
token; tokens
A string containing no significant whitespace; a whitespace-separated list of such strings
nmtoken; nmtokens
A string conforming to the XML schema rules for the type xs:NMTOKEN; a whitespace-separated list of such strings.
char
A string comprising a single Unicode character
integer
An integer, that is a string in the lexical space of the schema type xs:integer
decimal
A decimal value, that is a string in the lexical space of the schema type xs:decimal
ncname
An unprefixed name: a string in the value space of the schema type xs:NCName
prefix
An xs:NCName representing a namespace prefix, which must be in scope for the element on which it appears
id
An xs:NCName used as a unique identifier for an element in the containing XML document
Except where the set of allowed values of an attribute is specified using the italicized name string or char, leading and trailing whitespace in the attribute value is ignored. In the case of an attribute value template, this applies to the effective value obtained when the attribute value template is expanded.
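As an illustration of the boolean type and of whitespace handling, the following sketch declares three stylesheet parameters whose required attributes are all interpreted as true. The parameter names are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Three synonymous spellings of the boolean value true. -->
  <xsl:param name="first"  required="yes"/>
  <xsl:param name="second" required="true"/>
  <!-- Leading and trailing whitespace is ignored, because the
       attribute type is boolean rather than string or char. -->
  <xsl:param name="third"  required=" 1 "/>

  <!-- A spelling of the boolean value false ("no" and "false"
       are synonyms). -->
  <xsl:output indent="0"/>

</xsl:stylesheet>
```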
XPath comments (delimited by (: ... :)) are permitted anywhere that inter-token whitespace is permitted in attributes whose type is given as expression, pattern, item-type, or sequence-type, and are not permitted in attributes of other types (other than within expressions enclosed by curly braces within an attribute value template).
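A short sketch of where XPath comments can appear, using hypothetical element names order, line, and total:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- match has type pattern, so an XPath comment is permitted. -->
  <xsl:template match="order (: purchase orders only :)">
    <total>
      <!-- select has type expression, so a comment is permitted
           wherever inter-token whitespace could appear. -->
      <xsl:value-of select="sum(line/@price (: unit prices :))"/>
    </total>
  </xsl:template>

</xsl:stylesheet>
```

By contrast, an attribute such as the name attribute of xsl:variable, whose type is eqname, does not admit XPath comments.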
Unless the element is required to be empty, the model element contains a comment specifying the allowed content. The allowed content is specified in a similar way to an element type declaration in XML; sequence constructor means that any mixture of text nodes, literal result elements, extension instructions, and XSLT elements from the instruction category is allowed; other-declarations means that any mixture of XSLT elements from the declaration category is allowed, together with user-defined data elements.
The element is prefaced by comments indicating if it belongs to the instruction category or declaration category or both. The category of an element only affects whether it is allowed in the content of elements that allow a sequence constructor or other-declarations.
This example illustrates the notation used to describe XSLT elements.
<!-- Category: instruction -->
<xsl:example-element
  select = expression
  debug? = boolean
  validation? = { "strict" | "lax" } >
  <!-- Content: ((xsl:variable | xsl:param)*, xsl:sequence) -->
</xsl:example-element>
This example defines a (non-existent) element xsl:example-element. The element is classified as an instruction. It takes the following attributes:
A mandatory select attribute, whose value is an XPath expression
An optional debug attribute, whose value must be yes, true, or 1 to indicate true, or no, false, or 0 to indicate false.
An optional validation attribute, whose value must be strict or lax; the curly brackets indicate that the value can be defined as an attribute value template, allowing a value such as validation="{$val}", where the variable val is evaluated to yield "strict" or "lax" at run-time.
The content of an xsl:example-element instruction is defined to be a sequence of zero or more xsl:variable and xsl:param elements, followed by an xsl:sequence element.
[ERR XTSE0010] It is a static error if an XSLT-defined element is used in a context where it is not permitted, if a required attribute is omitted, or if the content of the element does not correspond to the content that is allowed for the element.
The rules in the element syntax summary (both for the element structure and for its attributes) apply to the stylesheet content after preprocessing as described in 3.13 Stylesheet Preprocessing.
Attributes are validated as follows. These rules apply to the value of the attribute after removing leading and trailing whitespace.
[ERR XTSE0020] It is a static error if an attribute (other than an attribute written using curly brackets in a position where an attribute value template is permitted) contains a value that is not one of the permitted values for that attribute.
[ERR XTDE0030] It is a dynamic error if the effective value of an attribute written using curly brackets, in a position where an attribute value template is permitted, is a value that is not one of the permitted values for that attribute. If the processor is able to detect the error statically (for example, when any XPath expressions within the curly brackets can be evaluated statically), then the processor may optionally signal this as a static error.
Special rules apply if the construct appears in part of the stylesheet that is processed with forwards compatible behavior: see 3.10 Forwards Compatible Processing.
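The difference between the static error XTSE0020 and the dynamic error XTDE0030 can be sketched using the order attribute of xsl:sort, which is an attribute value template whose permitted values are ascending and descending. The parameter name direction and the element names items and item are assumptions made for this example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical parameter: a caller could supply any string. -->
  <xsl:param name="direction" as="xs:string" select="'ascending'"/>

  <xsl:template match="items">
    <xsl:copy>
      <xsl:for-each select="item">
        <!-- Written with curly brackets: if $direction is neither
             "ascending" nor "descending", the problem is found only
             when the template is expanded (XTDE0030). -->
        <xsl:sort select="@name" order="{$direction}"/>
        <!-- Written without curly brackets, order="upward" would
             instead be the static error XTSE0020. -->
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```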
[Definition: Some constructs defined in this specification are described as being deprecated. The use of this term implies that stylesheet authors should not use the construct, and that the construct may be removed in a later version of this specification.]
Note:
This specification includes a non-normative XML Schema for XSLT stylesheet modules (see H Schemas for XSLT 3.0 Stylesheets). The syntax summaries described in this section are normative.
XSLT defines a set of standard functions which are additional to those defined in [Functions and Operators 3.0]. A list of these functions appears in G.2 List of XSLT-defined functions. The signatures of these functions are described using the same notation as used in [Functions and Operators 3.0]. The names of many of these functions are in the standard function namespace.
This document does not specify any application programming interfaces or other interfaces for initiating a transformation. This section, however, describes the information that is supplied when a transformation is initiated. Except where otherwise indicated, the information is required.
The execution of a stylesheet necessarily involves two activities: static analysis and dynamic evaluation. Static analysis consists of those tasks that can be performed by inspection of the stylesheet alone, including the binding of static variables, the evaluation of [xsl:]use-when expressions (see 3.13.1 Conditional Element Inclusion) and shadow attributes (see 3.13.2 Shadow Attributes), and the detection of static errors. Dynamic evaluation consists of tasks which in general cannot be carried out until a source document is available.
Dynamic evaluation is further divided into two activities: priming the stylesheet, and invoking a selected component.
Priming the stylesheet provides the dynamic context for evaluation, and supplies all the information needed to establish the values of global variables.
Invoking a component (such as a template or function) causes evaluation of that template or function to produce a result, which is an arbitrary XDM value.
[Definition: The result of invoking the selected component, after any required conversion to the declared result type of the component, is referred to as the raw result.]
The raw result of the invocation is the immediate result of evaluating the sequence constructor contained in the target template or function, modified by applying the function conversion rules to convert the immediate result to the type declared in the as attribute of the xsl:template or xsl:function declaration, if present.
This raw result may optionally be post-processed to construct a result tree, to serialize the result, or both, as described in 2.3.6 Post-processing the Raw Result.
Implementations may allow static analysis and dynamic evaluation to be initiated independently, so that the cost of static analysis can be amortized over multiple transformations using the same stylesheet. Implementations may also allow priming of a stylesheet and invocation of components to be initiated independently, in which case a single act of priming the stylesheet may be followed by a series of independent component invocations. Although this specification does not require such a separation, this section distinguishes information that is needed before static analysis can proceed, information that is needed to prime the stylesheet, and information that is needed when invoking components.
The language is designed to allow the static analysis of each package to be performed independently of other packages, with only basic knowledge of the properties of components made available by used packages. Beyond this, the specification leaves it to implementations to decide how to organize this process. When packages are not used explicitly, the entire stylesheet is treated as a single package.
The following information is needed prior to static analysis of a package:
The location of the package manifest, or in the absence of a package manifest, the stylesheet module that is to act as the principal stylesheet module of the package. The complete package is assembled by recursively expanding the xsl:import and xsl:include declarations in the principal stylesheet module, as described in 3.11.2 Stylesheet Inclusion and 3.11.3 Stylesheet Import.
Information about the packages referenced from this package using xsl:use-package declarations. The information needed will include the names and signatures of public components exported by the referenced package.
A set (possibly empty) of values for static parameters (see 9.5 Global Variables and Parameters). These values are available for use within static expressions (notably in [xsl:]use-when expressions and shadow attributes) as well as non-static expressions in the stylesheet. As a minimum, values must be supplied for any static parameters declared with the attribute required="yes".
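The role of static parameters during static analysis can be sketched as follows. The parameter names debug and common-module and the mode name trace are hypothetical, and how values for static parameters are supplied is a matter for the implementation's API.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Static parameters: their values are needed before static
       analysis. "debug" has a default; "common-module" is declared
       with required="yes" and so must be supplied. -->
  <xsl:param name="debug" as="xs:boolean" static="yes"
             select="false()"/>
  <xsl:param name="common-module" as="xs:string" static="yes"
             required="yes"/>

  <!-- Shadow attribute: _href is evaluated statically to give
       the effective href attribute. -->
  <xsl:include _href="{$common-module}"/>

  <!-- Conditional inclusion: this declaration is discarded during
       preprocessing unless $debug is true. -->
  <xsl:template match="*" mode="trace" use-when="$debug">
    <xsl:message select="'visiting', name()"/>
  </xsl:template>

</xsl:stylesheet>
```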
Conceptually, the output of the static analysis of a package is an object which might be referred to (without constraining the implementation) as a compiled package. Prior to dynamic evaluation, all the compiled packages needed for execution must be checked for consistency, and component references must be resolved. This process may be referred to, again without constraining the implementation, as linking.
The information needed when priming a stylesheet is as follows:
A set (possibly empty) of values for non-static stylesheet parameters (see 9.5 Global Variables and Parameters). These values are available for use within expressions in the stylesheet. As a minimum, values must be supplied for any parameters declared with the attribute required="yes".
A supplied value is converted if necessary to the declared type of the stylesheet parameter using the function conversion rules.
Note:
Non-static stylesheet parameters are implicitly public, which ensures that all the parameters in the stylesheet for which values can be supplied externally have distinct names. Static parameters, by contrast, are local to a package.
[Definition: An item that acts as the global context item for the transformation. This item acts as the context item when evaluating the select expression or sequence constructor of a global variable declaration within the top-level package, as described in 5.3.3.1 Maintaining Position: the Focus. The global context item may also be available in a named template when the stylesheet is invoked as described in 2.3.4 Call-Template Invocation.]
Note:
In previous releases of this specification, a single node was typically supplied to represent the source document for the transformation. This node was used as the target node for the implicit call on xsl:apply-templates used to start the transformation process (now called the initial match selection), and the root node of the containing tree was used as the context item for evaluation of global variables (now called the global context item). This relationship between the initial match selection and the global context item is likely to be found for compatibility reasons in a transformation API designed to work with earlier versions of this specification, but it is no longer a necessary relationship; the two values can in principle be completely independent of each other.
Stylesheet authors wanting to write code that can be invoked using legacy APIs should not rely on the caller being able to supply different values for the initial match selection and the global context item.
The value given to the global context item (and the values given to stylesheet parameters) cannot be nodes in a streamed document. This rule ensures that all global variables can freely navigate within the relevant tree, with no constraints imposed by the streamability rules.
The global context item is potentially used when initializing global variables and parameters. If the initialization of any global variable or parameter depends on the context item, a dynamic error can occur if the context item is absent. It is implementation-defined whether this error occurs during priming of the stylesheet or subsequently when the variable is referenced; and it is implementation-defined whether the error occurs at all if the variable or parameter is never referenced. The error can be suppressed by use of xsl:try and xsl:catch within the sequence constructor used to initialize the variable or parameter. It cannot be suppressed by use of xsl:try around a reference to the global variable.
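A minimal sketch of suppressing this error inside the initializer is shown below. The variable name root-name and the fallback string are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- If no global context item is supplied, evaluating "/" fails.
       The xsl:try is placed inside the initializer, which is where
       the error must be caught. -->
  <xsl:variable name="root-name" as="xs:string">
    <xsl:try>
      <xsl:sequence select="name(/*)"/>
      <xsl:catch errors="*" select="'(no global context item)'"/>
    </xsl:try>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <!-- Wrapping this reference in xsl:try would not help: the
         error belongs to the initialization of the variable. -->
    <root-name>
      <xsl:value-of select="$root-name"/>
    </root-name>
  </xsl:template>

</xsl:stylesheet>
```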
In a library package, the context item, context position, and context size used for evaluation of global variables will be absent, and the evaluation of any expression that references these values will result in a dynamic error. This will also be the case in the top-level package if no global context item is supplied.
Note:
If a context item is available within a global variable declaration, then the context position and context size will always be 1 (one).
Note:
For maximum reusability of code, it is best to avoid use of the context item when initializing global variables and parameters. Instead, all external information should be supplied using named stylesheet parameters. Especially when these use namespaces to avoid conflicts, there is then no risk of confusion between the information supplied externally to different packages.
When a stylesheet parameter is defined in a library package, it is possible for a using package to supply a value for the parameter by overriding the parameter declaration within an xsl:override element. If the using package is the top-level package then the overriding declaration can refer to the global context item.
A mechanism for obtaining a document node and a media type, given an absolute URI. The total set of available documents (modeled as a mapping from URIs to document nodes) forms part of the context for evaluating XPath expressions, specifically the docFO30 function. The XSLT document function additionally requires the media type of the resource representation, for use in interpreting any fragment identifier present within a URI Reference.
Note:
The set of documents that are available to the stylesheet is implementation-dependent, as is the processing that is carried out to construct a tree representing the resource retrieved using a given URI. Some possible ways of constructing a document (specifically, rules for constructing a document from an Infoset or from a PSVI) are described in [XDM 3.0].
Once a stylesheet is primed, the values of global variables remain stable through all component invocations. In addition, priming a stylesheet creates an execution scopeFO30 during which the dynamic context and all calls on deterministicFO30 functions remain stable; for example two calls on the current-dateTimeFO30 function within an execution scope are defined to return the same result.
Parameters passed to the transformation by the client application when a stylesheet is primed are matched against stylesheet parameters (see 9.5 Global Variables and Parameters), not against the template parameters of any template executed during the course of the transformation.
[ERR XTDE0050] It is a dynamic error if a stylesheet declares a visible stylesheet parameter that is explicitly or implicitly mandatory, and no value for this parameter is supplied when the stylesheet is primed. A stylesheet parameter is visible if it is not masked by another global variable or parameter with the same name and higher import precedence. If the parameter is a static parameter then the value must be supplied prior to the static analysis phase.
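The following sketch shows stylesheet parameters that are explicitly mandatory, optional, and (on the assumption that a parameter with no default and a declared type that does not permit the empty sequence counts as implicitly mandatory) implicitly mandatory. The parameter names are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Explicitly mandatory: priming the stylesheet without a value
       for this parameter raises XTDE0050. -->
  <xsl:param name="customer-id" as="xs:integer" required="yes"/>

  <!-- Optional: the default is used when no value is supplied. -->
  <xsl:param name="currency" as="xs:string" select="'EUR'"/>

  <!-- Assumed to be implicitly mandatory: there is no default, and
       the declared type does not allow an empty sequence. -->
  <xsl:param name="region" as="xs:string"/>

</xsl:stylesheet>
```

A supplied value for customer-id is converted, if necessary, to xs:integer using the function conversion rules.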
[Definition: A stylesheet may be evaluated by supplying a value to be processed, together with an initial mode. The value (which can be any sequence of items) is referred to as the initial match selection. The processing then corresponds to the effect of the xsl:apply-templates instruction.]
The initial match selection will often be a single document node, traditionally called the source document of the transformation; but in general, it can be any sequence. If the initial match selection is an empty sequence, the result of the transformation will be empty, since no template rules are evaluated.
Processing proceeds by finding the template rules that match the items in the initial match selection, and evaluating these template rules with a focus based on the initial match selection. The template rules are evaluated in final output state.
The following information is needed when dynamic evaluation is to start with a template rule:
The initial match selection. An API that chooses to maintain compatibility with previous versions of this specification should allow a method of invocation in which a singleton node is provided, which is then used in two ways: the node itself acts as the initial match selection, and the root node of the containing tree acts as the global context item.
Optionally, an initial mode.
[Definition: The initial mode is the mode used to select template rules for processing items in the initial match selection when apply-templates invocation is used to initiate a transformation.]
In searching for the template rule that best matches the items in the initial match selection, the processor considers only those rules that apply to the initial mode.
If no initial mode is supplied explicitly, then the initial mode is that named in the default-mode attribute of the (explicit or implicit) xsl:package element of the top-level package or, in the absence of such an attribute, the unnamed mode.
[ERR XTDE0044] It is a dynamic error if the invocation of the stylesheet specifies an initial mode when no initial match selection is supplied (either explicitly, or defaulted to the global context item).
A (named or unnamed) mode M is eligible as an initial mode if one of the following conditions applies, where P is the top-level package of the stylesheet:
M is explicitly declared in an xsl:mode declaration within P, and has public or final visibility (either by virtue of its visibility attribute, or by virtue of an xsl:expose declaration).
M is the unnamed mode.
M is named in the default-mode attribute of the (explicit or implicit) xsl:package element of P.
M is declared in a package used by P, and is given public or final visibility in P by means of an xsl:accept declaration.
The effective value of the declared-modes attribute of the explicit or implicit xsl:package element of P is no, and M appears as a mode-name in the mode attribute of a template rule declared within P.
[ERR XTDE0045] It is a dynamic error if the invocation of the stylesheet specifies an initial mode and the specified mode is not eligible as an initial mode (as defined above).
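The eligibility conditions can be sketched with a small package. The package name and the mode names report and internal are hypothetical.

```xml
<xsl:package name="http://example.com/orders" package-version="1.0"
    version="3.0" declared-modes="yes"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Explicitly declared with public visibility: eligible as an
       initial mode. -->
  <xsl:mode name="report" visibility="public"/>

  <!-- No visibility attribute and no xsl:expose declaration, so the
       mode is private: nominating it as the initial mode raises
       XTDE0045. -->
  <xsl:mode name="internal"/>

  <xsl:template match="/" mode="report">
    <report>
      <xsl:apply-templates select="*" mode="internal"/>
    </report>
  </xsl:template>

  <xsl:template match="*" mode="internal">
    <entry name="{name()}"/>
  </xsl:template>

</xsl:package>
```

Because declared-modes is yes, the last condition in the list does not apply here; it is relevant mainly to stylesheets written without explicit mode declarations.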
Parameters, which will be passed to the template rules used to process items in the input sequence. The parameters consist of two sets of (QName, value) pairs, one set for tunnel parameters and one for non-tunnel parameters, in which the QName identifies the name of a parameter and the value provides the value of the parameter. Either or both sets of parameters may be empty. The effect is the same as when a template is invoked using xsl:apply-templates with an xsl:with-param child specifying tunnel="yes" or tunnel="no" as appropriate. If a parameter is supplied that is not declared or used, the value is simply ignored. These parameters are not used to set stylesheet parameters.
A supplied value is converted if necessary to the declared type of the template parameter using the function conversion rules.
Details of how the result of the initial template is to be returned. For details, see 2.3.6 Post-processing the Raw Result.
The raw result of the invocation is the result of processing the supplied input sequence as if by a call on xsl:apply-templates in the specified mode: specifically, each item in the input sequence is processed by selecting and evaluating the best matching template rule, and converting the result (if necessary) to the type declared in the as attribute of that template using the function conversion rules; and the results of processing each item are then concatenated into a single sequence, respecting the order of items in the input sequence.
Note:
If the initial mode is declared-streamable, then a streaming processor should allow some or all of the items in the initial match selection to be nodes supplied in streamable form, and any nodes that are supplied in this form must then be processed using streaming.
Since the global context item cannot be a streamed node, in cases where the transformation is to proceed by applying streamable templates to a streamed input document, the global context item must either be absent, or must be something that differs from the initial match selection.
Note:
The design of the API for invoking a transformation should provide some means for users to designate the unnamed mode as the initial mode in cases where it is not the default mode.
It is a dynamic error [see ERR XTDE0700] if the template rule selected for processing any item in the initial match selection defines a template parameter that specifies required="yes" and no value is supplied for that parameter.
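As a sketch of how template parameters interact with apply-templates invocation, the following template rule declares one non-tunnel and one tunnel parameter. The parameter names title and lang and the element name doc are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template match="/">
    <!-- Non-tunnel parameter declared as required: if this rule is
         selected for an item in the initial match selection and the
         caller supplies no value, XTDE0700 is raised. -->
    <xsl:param name="title" as="xs:string" required="yes"/>

    <!-- Tunnel parameter with a default, matched only against the
         set of tunnel parameters supplied by the caller. -->
    <xsl:param name="lang" as="xs:string" tunnel="yes"
               select="'en'"/>

    <doc title="{$title}" xml:lang="{$lang}">
      <xsl:apply-templates/>
    </doc>
  </xsl:template>

</xsl:stylesheet>
```

Values supplied for these template parameters do not set stylesheet parameters, even where the names coincide.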
Note:
A stylesheet can process further source documents in addition to those supplied when the transformation is invoked. These additional documents can be loaded using the functions document (see 20.1 fn:document) or docFO30 or collectionFO30 (see [Functions and Operators 3.0]), or using the xsl:source-document instruction; alternatively, they can be supplied as stylesheet parameters (see 9.5 Global Variables and Parameters), or returned as the result of an extension function (see 24.1 Extension Functions).
[Definition: A stylesheet may be evaluated by selecting a
named template to be evaluated; this is referred to as the initial named
template.] The effect is analogous to the effect of
executing an xsl:call-template
instruction. The following
information is needed in this case:
Optionally, the name of the initial named template which is to be executed as the entry point to the transformation. If no template name is supplied, the default template name is xsl:initial-template. The selected template must exist within the stylesheet.
Optionally, a context item for evaluation of this named template, defaulting to the global context item if it exists. This is constrained by any xsl:context-item element appearing within the selected xsl:template element. The initial named template is evaluated with a singleton focus based on this context item if it exists, or with an absent focus otherwise.
Parameters, which will be passed to the selected template rule. The parameters consist of two sets of (QName, value) pairs, one set for tunnel parameters and one for non-tunnel parameters, in which the QName identifies the name of a parameter and the value provides the value of the parameter. Either or both sets of parameters may be empty. The effect is the same as when a template is invoked using xsl:call-template with an xsl:with-param child specifying tunnel="yes" or tunnel="no" as appropriate. If a parameter is supplied that is not declared or used, the value is simply ignored. These parameters are not used to set stylesheet parameters.
A supplied value is converted if necessary to the declared type of the template parameter using the function conversion rules.
Details of how the result of the initial named template is to be returned. For details, see 2.3.6 Post-processing the Raw Result.
The raw result of the invocation is the result of evaluating the initial named template, after conversion of the result to the type declared in the as attribute of that template using the function conversion rules, if such conversion is necessary.
The initial named template is evaluated in final output state.
[ERR XTDE0040] It is a dynamic error if the invocation of the stylesheet specifies a template name that does not match the expanded QName of a named template defined in the stylesheet, whose visibility is public or final.
It is a dynamic error
[see ERR XTDE0700] if the initial named template, or any of the template rules invoked to
process items in the initial match selection, defines a
template parameter that
specifies required="yes"
and no value is supplied for that parameter.
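A minimal sketch of a stylesheet designed for call-template invocation is shown below. It relies on the default template name xsl:initial-template; the parameter who and the greeting element are hypothetical names used only for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs"
                version="3.0" expand-text="yes">

  <!-- Entry point used when the caller names no template. -->
  <xsl:template name="xsl:initial-template" as="element(greeting)">
    <!-- Hypothetical required parameter: invoking the stylesheet
         without a value for it raises XTDE0700. -->
    <xsl:param name="who" as="xs:string" required="yes"/>
    <greeting>Hello, {$who}</greeting>
  </xsl:template>

</xsl:stylesheet>
```

The template does not refer to the context item, so it can be evaluated whether or not a global context item is supplied.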
[Definition: A stylesheet may be evaluated by calling a named stylesheet function, referred to as the initial function.] The following additional information is needed in this case:
The name and arity of a stylesheet function which is to be executed as the entry point to the transformation.
Note:
In the design of a concrete API, the arity may be inferred from the length of the parameter list.
A list of values to act as parameters to the initial function. The number of values in the list must be the same as the arity of the function.
A supplied value is converted if necessary to the declared type of the function parameter using the function conversion rules.
Details of how the result of the initial function is to be returned. For details, see 2.3.6 Post-processing the Raw Result.
The raw result of the invocation is the result of evaluating the initial function, after conversion of the result to the type declared in the as attribute of that function using the function conversion rules, if such conversion is necessary.
Note:
The initial function (like all stylesheet functions) is evaluated with an absent focus.
If the initial function is declared-streamable, a streaming processor should allow the value of the first argument to be supplied in streamable form, and if it is supplied in this form, then it must be processed using streaming.
[ERR XTDE0041] It is a dynamic error if the invocation of the stylesheet specifies a function name and arity that do not match the expanded QName and arity of a named stylesheet function defined in the stylesheet, whose visibility is public or final.
When a transformation is invoked by calling an initial function, the entire transformation executes in temporary output state, which means that calls on xsl:result-document are not permitted.
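The following sketch shows a stylesheet function that could serve as an initial function. The function name f:total, its namespace URI, and the parameter name prices are assumptions made for this example. The visibility attribute is set explicitly so that the function satisfies the public-or-final requirement stated above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="http://example.com/functions"
                version="3.0">

  <!-- Hypothetical entry point with arity 1. -->
  <xsl:function name="f:total" as="xs:decimal" visibility="public">
    <!-- The single supplied argument is converted to xs:decimal*
         using the function conversion rules. -->
    <xsl:param name="prices" as="xs:decimal*"/>
    <xsl:sequence select="sum($prices)"/>
  </xsl:function>

</xsl:stylesheet>
```

A caller would designate the function by its expanded QName and arity 1, and supply a list containing one value; the mechanism for doing so is defined by the processor's API rather than by this specification.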
There are three ways the result of a transformation may be delivered. (This applies both to the principal result, described here, and also to secondary results, generated using xsl:result-document.)
The raw result (a sequence of values) may be returned directly to the calling application.
A result tree may be constructed from the raw result.
By default, a result tree is constructed if the build-tree attribute of the unnamed output definition has the effective value yes. An API for invoking transformations may allow this setting to be overridden by the calling application. If result tree construction is requested, it is performed as described in 2.3.6.1 Result Tree Construction.
Alternatively, the raw result may be serialized as described in 2.3.6.2 Serializing the Result. The decision whether or not to serialize the result is determined by the rules of the transformation API provided by the processor, and is not influenced by anything in the stylesheet.
Note:
This specification does not constrain the design of application programming interfaces or the choice of defaults. In previous versions of this specification, result tree construction was a mandatory process, while serialization was optional. When invoking stylesheet functions directly, however, result tree construction and serialization may be inappropriate as defaults. These considerations may affect the design of APIs.
In previous versions of XSLT, results were delivered either in serialized form (as a character or byte stream), or as a tree. In the latter case processors typically would use either their own tree representation, or a standardized tree representation such as the W3C Document Object Model (DOM) (see [DOM Level 2]), adapted to the data structures offered by the programming language in which the API is defined.
To deliver a raw result, processors need to define a representation not only of XDM nodes but also of sequences, atomic values, maps and even functions. As with the return of a simple tree, this may involve a trade-off between strict fidelity to the XDM data model and usability in the particular programming language environment. It is not a requirement that an API should return results in a way that exposes every property of the XDM data model; for example there may be APIs that do not expose the precise type annotation of a returned node or atomic value, or that fail to expose the base URI or document URI of a node, or that provide no way of determining whether two nodes in the result sequence are the same node in the sense of the XPath is operator.
The way in which maps and functions (and where XPath 3.1 is supported, arrays) are returned requires careful design choices. It is recommended that an API should be capable of returning any XDM value without error, and that there should be minimal loss of information if the raw results output by one transformation are subsequently used as input to another transformation.
If a result tree is to be constructed from the raw result, then this is done by applying the rules for the process of sequence normalization as defined in [XSLT and XQuery Serialization]. This process takes as input the serialization parameters defined in the unnamed output definition of the top-level package; though the only parameter that is actually used by this process is item-separator. In particular, sequence normalization is carried out regardless of any method attribute in the unnamed output definition.
The sequence normalization process either returns a document node, or raises a serialization error. The content of the document node is not necessarily well-formed (the document node may have any number of element or text nodes among its children).
Note:
More specifically, the process raises a serialization error if any item in the raw result is an attribute node, a namespace node, or a function (including a map, but not an array: arrays are flattened).
The tree that is constructed is referred to as a final result tree.
If the raw result is an empty sequence, the final result tree will consist of a document node with no children.
The base URI of the document node is set to the base output URI.
Note:
The item-separator property has no effect if the raw result of the transformation is a sequence of length zero or one, which in practice will often be the case, especially in a traditional scenario such as transformation of an XML document to HTML.
If there is no item-separator, then a single space is inserted between adjacent atomic values; for example if the raw result is the sequence 1 to 5, then sequence normalization produces a tree comprising a document node with a single child, the child being a text node with the string value 1 2 3 4 5.
If there is an item-separator, then it is used not only between adjacent atomic values, but between any pair of items in the raw result. For example if the raw result is a sequence of two element nodes A and B, and the item-separator is a comma, then the result of sequence normalization will be a document node with three children: a copy of A, a text node whose string value is a single comma, and a copy of B.
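The item-separator case can be sketched as a stylesheet whose raw result is a sequence of two element nodes. The element names A and B follow the wording of the note; the choice to request tree construction explicitly with build-tree="yes" is made here only for clarity.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0">

  <!-- Unnamed output definition: request a result tree and
       separate adjacent items of the raw result with a comma. -->
  <xsl:output build-tree="yes" item-separator=","/>

  <!-- The raw result is a sequence of two parentless element nodes. -->
  <xsl:template name="xsl:initial-template">
    <A/>
    <B/>
  </xsl:template>

</xsl:stylesheet>
```

Under the rules described above, sequence normalization would wrap these items in a document node with three children: a copy of A, a text node containing a comma, and a copy of B. A calling application that asks for the raw result instead would receive the two element nodes with no separator.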
See 2.7 Parsing and Serialization.
The raw result may optionally be serialized as described in 26 Serialization. The serialization is controlled by the serialization parameters defined in the unnamed output definition of the top-level package.
Note:
The first phase of serialization, called sequence normalization, takes place for some output methods but not others. For example, if the json output method (defined in [XSLT and XQuery Serialization 3.1]) is selected, then the process of constructing a tree is bypassed.
The effect of serialization is to generate a sequence of octets, representing the serialized result in some character encoding. The processor’s API may define mechanisms enabling this sequence of octets to be written to persistent storage at some location. The default location is the location identified by the base output URI.
In previous versions of this specification it was stated that when the raw result of the initial template or function is an empty sequence, a result tree should be produced if and only if the transformation generates no secondary results (that is, if it does not invoke xsl:result-document). This provision is most likely to have a noticeable effect if the transformation produces serialized results, and these results are written to persistent storage: the effect is then that a transformation producing an empty principal result will overwrite any existing content at the base output URI location if and only if the transformation produces no other output. Processor APIs offering backwards compatibility with earlier versions of XSLT must respect this behavior, but there is no requirement for new processor APIs to do so.
[Definition: The base output URI is a URI to be used as the base URI when resolving a relative URI reference allocated to a final result tree. If the transformation generates more than one final result tree, then typically each one will be allocated a URI relative to this base URI.] The way in which a base output URI is established is implementation-defined. Each invocation of the stylesheet may supply a different base output URI. It is acceptable for the base output URI to be absent, provided no constructs (such as xsl:result-document) are evaluated that depend on the value of the base output URI.
Note:
It will often be convenient for the base output URI to be the same as the location to which the principal result document is serialized, but this relationship is not a necessary one.
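As a sketch of a construct that depends on the base output URI, the stylesheet below writes one secondary result per chapter. The source vocabulary (book, chapter, title, para) and the output file names are hypothetical. Each relative href would be resolved against the base output URI supplied for the invocation.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0" expand-text="yes">

  <xsl:template match="/">
    <!-- Assumes a source document of the form book/chapter/(title, para*). -->
    <xsl:for-each select="book/chapter">
      <!-- href is an attribute value template: chapter1.html, chapter2.html, ... -->
      <xsl:result-document href="chapter{position()}.html" method="html">
        <html>
          <head><title>{title}</title></head>
          <body><xsl:copy-of select="para"/></body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

In this sketch the principal result is an empty sequence, and all output is delivered as secondary results.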
The main executable components of a stylesheet are templates and functions. The body of a template or function is a sequence constructor, which is a sequence of elements and text nodes that can be evaluated to produce a result.
A sequence constructor is a sequence of sibling nodes in the stylesheet, each of which is either an XSLT instruction, a literal result element, a text node, or an extension instruction.
[Definition: An instruction is either an XSLT instruction or an extension instruction.]
[Definition: An XSLT instruction is an XSLT element whose syntax summary in this specification contains the annotation <!-- category: instruction -->.]
Extension instructions are described in 24.2 Extension Instructions.
The main categories of XSLT instruction are as follows:
instructions that create new nodes: xsl:document, xsl:element, xsl:attribute, xsl:processing-instruction, xsl:comment, xsl:value-of, xsl:text, xsl:namespace;
instructions that copy nodes: xsl:copy, xsl:copy-of;
an instruction that returns an arbitrary sequence by evaluating an XPath expression: xsl:sequence;
instructions that cause conditional or repeated evaluation of nested instructions: xsl:if, xsl:choose, xsl:try, xsl:for-each, xsl:for-each-group, xsl:fork, xsl:iterate and its subordinate instructions xsl:next-iteration and xsl:break;
instructions that generate output conditionally if elements are or are not empty: xsl:on-empty, xsl:on-non-empty, xsl:where-populated;
instructions that invoke templates: xsl:apply-templates, xsl:apply-imports, xsl:call-template, xsl:next-match;
instructions that declare variables: xsl:variable;
instructions to assist debugging: xsl:message, xsl:assert;
other specialized instructions: xsl:number, xsl:analyze-string, xsl:fork, xsl:result-document, xsl:source-document, xsl:perform-sort, xsl:merge.
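To show how several of these categories combine in one sequence constructor, here is a small sketch using xsl:iterate with its subordinate instructions, xsl:choose, and a literal result element. The element name total, the parameter name sum, and the threshold of 20 are arbitrary choices for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs"
                version="3.0" expand-text="yes">

  <xsl:template name="xsl:initial-template">
    <xsl:iterate select="1 to 10">
      <!-- Running total carried from one iteration to the next. -->
      <xsl:param name="sum" as="xs:integer" select="0"/>
      <!-- Evaluated only if the input is exhausted without a break. -->
      <xsl:on-completion>
        <total>{$sum}</total>
      </xsl:on-completion>
      <xsl:choose>
        <!-- Stop before the running total would exceed 20. -->
        <xsl:when test="$sum + . gt 20">
          <xsl:break>
            <total>{$sum}</total>
          </xsl:break>
        </xsl:when>
        <xsl:otherwise>
          <xsl:next-iteration>
            <xsl:with-param name="sum" select="$sum + ."/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
```

With the values shown, the running total reaches 15 after five items; adding the sixth would give 21, so the loop breaks and a single total element containing 15 is returned.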
The classic method of executing an XSLT transformation is to apply template rules to the root node of an input document (see 2.3.3 Apply-Templates Invocation). The operation of applying templates to a node searches the stylesheet for the best matching template rule for that node. This template rule is then evaluated. A common coding pattern, especially when XSLT is used to convert XML documents into display formats such as HTML, is to have one template rule for each kind of element in the source document, and for that template rule to generate some appropriate markup elements, and to apply templates recursively to its own children. The effect is to perform a recursive traversal of the source tree, in which each node is processed using the best-fit template rule for that node. The final result of the transformation is then the tree produced by this recursive process. This result can then be optionally serialized (see 2.3.6 Post-processing the Raw Result).
This example uses rule-based processing to convert a simple XML input document into an HTML output document.
The input document takes the form:
<PERSONAE PLAY="OTHELLO">
  <TITLE>Dramatis Personae</TITLE>
  <PERSONA>DUKE OF VENICE</PERSONA>
  <PERSONA>BRABANTIO, a senator.</PERSONA>
  <PERSONA>Other Senators.</PERSONA>
  <PERSONA>GRATIANO, brother to Brabantio.</PERSONA>
  <PERSONA>LODOVICO, kinsman to Brabantio.</PERSONA>
  <PERSONA>OTHELLO, a noble Moor in the service of the Venetian state.</PERSONA>
  <PERSONA>CASSIO, his lieutenant.</PERSONA>
  <PERSONA>IAGO, his ancient.</PERSONA>
  <PERSONA>RODERIGO, a Venetian gentleman.</PERSONA>
  <PERSONA>MONTANO, Othello's predecessor in the government of Cyprus.</PERSONA>
  <PERSONA>Clown, servant to Othello. </PERSONA>
  <PERSONA>DESDEMONA, daughter to Brabantio and wife to Othello.</PERSONA>
  <PERSONA>EMILIA, wife to Iago.</PERSONA>
  <PERSONA>BIANCA, mistress to Cassio.</PERSONA>
  <PERSONA>Sailor, Messenger, Herald, Officers, Gentlemen, Musicians, and Attendants.</PERSONA>
</PERSONAE>
The stylesheet to render this as HTML can be written as a set of template rules:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0"
                expand-text="yes">

  <xsl:strip-space elements="PERSONAE"/>

  <xsl:template match="PERSONAE">
    <html>
      <head>
        <title>The Cast of {@PLAY}</title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="TITLE">
    <h1>{.}</h1>
  </xsl:template>

  <xsl:template match="PERSONA[count(tokenize(., ',')) = 2]">
    <p><b>{substring-before(., ',')}</b>: {substring-after(., ',')}</p>
  </xsl:template>

  <xsl:template match="PERSONA">
    <p><b>{.}</b></p>
  </xsl:template>

</xsl:stylesheet>
There are four template rules here:
The first rule matches the outermost element, named PERSONAE (it could equally have used match="/" to match the document node). The effect of this rule is to create the skeleton of the output HTML page. Technically, the body of the template is a sequence constructor comprising a single literal result element (the html element); this in turn contains a sequence constructor comprising two literal result elements (the head and body elements). The head element is populated with a literal title element whose content is computed as a mixture of fixed and variable text using a text value template. The body element is populated by evaluating an xsl:apply-templates instruction.
The effect of the xsl:apply-templates instruction is to process the children of the PERSONAE element in the source tree: that is, the TITLE and PERSONA elements. (It would also process any whitespace text node children, but these have been stripped by virtue of the xsl:strip-space declaration.) Each of these child elements is processed by the best matching template rule for that element, which will be one of the other three rules in the stylesheet.
The template rule for the TITLE element outputs an h1 element to the HTML result document, and populates this with the value of ".", the context item. That is, it copies the text content of the TITLE element to the output h1 element.
The last two rules match PERSONA elements. The first rule matches PERSONA elements whose text content contains exactly one comma; the second rule matches all PERSONA elements, but it has lower priority than the first rule, so in practice it only applies to PERSONA elements that contain no comma or multiple commas.
For both rules the body of the rule is a sequence constructor containing a single literal result element, the p element. These literal result elements contain further sequence constructors comprising literal result elements and text nodes. In each of these examples the text nodes are in the form of a text value template: in general this is a combination of fixed text together with XPath expressions enclosed in curly braces, which are evaluated to form the content of the containing literal result element.
[Definition: A stylesheet contains a set of template rules (see 6 Template Rules). A template rule has three parts: a pattern that is matched against selected items (often but not necessarily nodes), a (possibly empty) set of template parameters, and a sequence constructor that is evaluated to produce a sequence of items.] In many cases these items are newly constructed nodes, which are then written to a result tree.
The results of some expressions and instructions in a stylesheet may depend on information provided contextually. This context information is divided into two categories: the static context, which is known during static analysis of the stylesheet, and the dynamic context, which is not known until the stylesheet is evaluated. Although information in the static context is known at analysis time, it is sometimes used during stylesheet evaluation.
Some context information can be set by means of declarations within the stylesheet itself. For example, the namespace bindings used for any XPath expression are determined by the namespace declarations present in containing elements in the stylesheet. Other information may be supplied externally or implicitly: an example is the current date and time.
The context information used in processing an XSLT stylesheet includes as a subset
all the context information required when evaluating XPath expressions. The XPath 3.0 specification defines a static and dynamic
context that the host language (in this case, XSLT) may initialize, which affects
the
results of XPath expressions used in that context. XSLT augments the context with
additional information: this additional information is used firstly by XSLT
constructs outside the scope of XPath (for example, the xsl:sort
element), and secondly, by functions that are defined in the XSLT specification (such
as key
and current-group
) that are
available for use in XPath expressions appearing within a stylesheet.
The static context for an expression or other construct in a stylesheet is determined by the place in which it appears lexically. The details vary for different components of the static context, but in general, elements within a stylesheet module affect the static context for their descendant elements within the same stylesheet module.
The dynamic context is maintained as a stack. When an instruction or expression is evaluated, it may add dynamic context information to the stack; when evaluation is complete, the dynamic context reverts to its previous state. An expression that accesses information from the dynamic context always uses the value at the top of the stack.
The most commonly used component of the dynamic context is the context item. This is an implicit variable whose value is the item currently being processed (it may be a node, an atomic value, or a function item). The value of the context item can be referenced within an XPath expression using the expression . (dot).
Full details of the static and dynamic context are provided in 5.3 The Static and Dynamic Context.
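The stack-like behavior of the dynamic context can be seen in the following sketch, where xsl:for-each temporarily changes the context item and the previous value is restored afterwards. The source vocabulary (order, line, the id and sku attributes) and the result element names are hypothetical.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0" expand-text="yes">

  <xsl:template match="order">
    <!-- Here the context item is the order element. -->
    <summary id="{@id}">
      <xsl:for-each select="line">
        <!-- Inside xsl:for-each the context item is each line
             element in turn, and position() counts the lines. -->
        <item>{position()}: {@sku}</item>
      </xsl:for-each>
      <!-- After xsl:for-each the context item is the order element again. -->
      <count>{count(line)}</count>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```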
An XSLT stylesheet
describes a process that constructs a set of results from a set of inputs. The inputs
are the data provided at stylesheet invocation, as described in 2.3 Initiating a Transformation. The results include the principal result
(an arbitrary sequence), which is the result of the initial component invocation,
together with any secondary results
produced using xsl:result-document
instructions.
The stylesheet does not describe how a source tree is constructed. Some possible ways of constructing source trees are described in [XDM 3.0]. Frequently an implementation will operate in conjunction with an XML parser (or more strictly, in the terminology of [XML 1.0], an XML processor), to build a source tree from an input XML document. An implementation may also provide an application programming interface allowing the tree to be constructed directly, or allowing it to be supplied in the form of a DOM Document object (see [DOM Level 2]). This is outside the scope of this specification. Users should be aware, however, that since the input to the transformation is a tree conforming to the XDM data model as described in [XDM 3.0], constructs that might exist in the original XML document, or in the DOM, but which are not within the scope of the data model, cannot be processed by the stylesheet and cannot be guaranteed to remain unchanged in the transformation output. Such constructs include CDATA section boundaries, the use of entity references, and the DOCTYPE declaration and internal DTD subset.
[Definition: A frequent requirement is to output a final result tree as an XML document (or in other formats such as HTML). This process is referred to as serialization.]
Like parsing, serialization is not part of the transformation process, and it is not required that an XSLT processor must be able to perform serialization. However, for pragmatic reasons, this specification describes declarations (the xsl:output element and the xsl:character-map declarations, see 26 Serialization), and attributes on the xsl:result-document instruction, that allow a stylesheet to specify the desired properties of a serialized output file. When serialization is not being performed, either because the implementation does not support the serialization option, or because the user is executing the transformation in a way that does not invoke serialization, then the content of the xsl:output and xsl:character-map declarations has no effect. Under these circumstances the processor may report any errors in an xsl:output or xsl:character-map declaration, or in the serialization attributes of xsl:result-document, but is not required to do so.
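For illustration, a stylesheet might declare its serialization preferences as in the sketch below. The character map name symbols and the particular mapping are assumptions made for this example; as described above, these declarations have no effect when the result is not serialized.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0">

  <!-- Unnamed output definition: properties requested for the
       serialized principal result. -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes"
              use-character-maps="symbols"/>

  <!-- Hypothetical character map: write the copyright sign
       as the literal string &copy; during serialization. -->
  <xsl:character-map name="symbols">
    <xsl:output-character character="&#169;" string="&amp;copy;"/>
  </xsl:character-map>

  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```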
In previous versions of the XSLT language, it has been possible to structure a
stylesheet as a collection of modules, using the xsl:include
and
xsl:import
declarations to express the dependency of one module on
others.
In XSLT 3.0 an additional layer of modularization of stylesheet code is enabled through the introduction of packages. A package is a collection of stylesheet modules with a controlled interface to the packages that use it: for example, it defines which functions and templates defined in the package are visible to callers, which are purely internal, and which are not only public but capable of being overridden by other functions and templates supplied by the using package.
Packages are introduced with several motivations, which broadly divide into two categories:
Software engineering benefits: greater re-use of code, greater robustness through ease of testing, controlled evolution of code in response to new requirements, ability to deliver code that users cannot see or modify.
Efficiency benefits: the ability to avoid compiling libraries repeatedly when they are used in multiple stylesheets, and to avoid holding multiple copies of the same library in memory simultaneously.
Packages are designed to allow separate compilation: that is, a package can be compiled independently of the packages that use it. This specification does not define a process model for compilation, or expand on what it means to compile different packages independently. Nor does it mandate that implementations offer any feature along these lines. It merely defines language features that are designed to make separate compilation of packages possible.
To achieve this, packages (unlike modules):
Must not contain unresolved references to functions, templates, or variables declared in other packages;
Have strict rules governing the ability to override declarations in a library package with declarations in a package that uses the library;
Constrain the visibility of component names and of context declarations such as the declarations of keys and decimal formats;
Can declare a mode (a collection of template rules) as final, which disallows the addition of new overriding template rules in a using package;
Require explicit disambiguation where naming conflicts arise, for example when a package uses two other packages that both export like-named components;
Allow multiple specializations of library components to coexist in the same application.
A package is defined in XSLT by means of an XML document whose outermost element is an xsl:package element. This is referred to as the package manifest. The xsl:package element has optional child elements xsl:use-package and xsl:expose describing properties of the package. The package manifest may refer to an external top-level stylesheet module using an xsl:include or xsl:import declaration, or it may contain the body of a stylesheet module inline (the two approaches can also be mixed).
Although this specification defines packages as constructs written using a defined XSLT syntax, implementations may provide mechanisms that allow packages to be written using other languages (for example, XQuery).
When no packages are explicitly defined, the entire stylesheet is treated as a single package; the effect is as if the xsl:stylesheet or xsl:transform element of the principal stylesheet module were replaced by an xsl:package element with no other information in the package manifest.
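A package manifest with an inline stylesheet body might look like the following sketch. The package name, version, namespace URI, and function names are all hypothetical. The xsl:expose declaration makes one function visible to using packages, while the helper function keeps the default private visibility.

```xml
<xsl:package name="http://example.com/lib/format"
             package-version="1.0.0"
             version="3.0"
             xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
             xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:f="http://example.com/lib/format/functions">

  <!-- Only f:label with arity 1 is part of the package's public interface. -->
  <xsl:expose component="function" names="f:label#1" visibility="public"/>

  <xsl:function name="f:label" as="xs:string">
    <xsl:param name="n" as="xs:integer"/>
    <xsl:sequence select="f:pad($n) || '.'"/>
  </xsl:function>

  <!-- Internal helper: not visible to using packages. -->
  <xsl:function name="f:pad" as="xs:string">
    <xsl:param name="n" as="xs:integer"/>
    <xsl:sequence select="format-integer($n, '000')"/>
  </xsl:function>

</xsl:package>
```

A using package would reference this library with an xsl:use-package declaration naming http://example.com/lib/format; how the processor locates the package from that name is implementation-defined.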
XSLT defines a number of features that allow the language to be extended by implementers, or, if implementers choose to provide the capability, by users. These features have been designed, so far as possible, so that they can be used without sacrificing interoperability. Extensions other than those explicitly defined in this specification are not permitted.
These features are all based on XML namespaces; namespaces are used to ensure that the extensions provided by one implementer do not clash with those of a different implementer.
The most common way of extending the language is by providing additional functions, which can be invoked from XPath expressions. These are known as extension functions, and are described in 24.1 Extension Functions.
It is also permissible to extend the language by providing new instructions. These are referred to as extension instructions, and are described in 24.2 Extension Instructions. A stylesheet that uses extension instructions in a particular namespace must declare that it is doing so by using the [xsl:]extension-element-prefixes attribute.
Extension instructions and extension functions defined according to these rules may be provided by the implementer of the XSLT processor, and the implementer may also provide facilities to allow users to create further extension instructions and extension functions.
This specification defines how extension instructions and extension functions are invoked, but the facilities for creating new extension instructions and extension functions are implementation-defined. For further details, see 24 Extensibility and Fallback.
The XSLT language can also be extended by the use of extension attributes (see 3.2 Extension Attributes), and by means of user-defined data elements (see 3.7.3 User-defined Data Elements).
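The declaration of an extension namespace can be sketched as follows. The ext:log instruction and its namespace are invented for this example and do not correspond to any real product. The xsl:fallback child gives the stylesheet a defined behavior on processors that do not implement the instruction.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ext="http://example.com/xslt/extensions"
                extension-element-prefixes="ext"
                version="3.0">

  <xsl:template match="/">
    <!-- Hypothetical extension instruction. -->
    <ext:log level="info">
      <!-- Evaluated only if ext:log is not available. -->
      <xsl:fallback>
        <xsl:message>ext:log is not available; continuing without it</xsl:message>
      </xsl:fallback>
    </ext:log>
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```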
An XSLT stylesheet can make use of information from a schema. An XSLT transformation can take place in the absence of a schema (and, indeed, in the absence of a DTD), but where the source document has undergone schema validity assessment, the XSLT processor has access to the type information associated with individual nodes, not merely to the untyped text.
Information from a schema can be used both statically (when the stylesheet is compiled), and dynamically (during evaluation of the stylesheet to transform a source document).
There are places within a stylesheet, and within XPath expressions and patterns in a stylesheet, where it is possible to refer to named type definitions in a schema, or to element and attribute declarations. For example, it is possible to declare the types expected for the parameters of a function. This is done using a SequenceType.
[Definition: A SequenceType constrains the type and number of items in a sequence. The term is used both to denote the concept, and to refer to the syntactic form in which sequence types are expressed in the XPath grammar: specifically SequenceType in [XPath 3.0], or SequenceType in [XPath 3.1], depending on whether or not the XPath 3.1 Feature is implemented.]
[Definition: Type definitions and element and attribute declarations are referred to collectively as schema components.]
[Definition: The schema components that may be referenced by name in a package are referred to as the in-scope schema components.]
The set of in-scope schema components may vary between one package and another, but as explained in 3.15 Importing Schema Components, the schema components used in different packages must be consistent with each other.
The conformance rules for XSLT 3.0, defined in 27 Conformance, distinguish between a basic XSLT processor and a schema-aware XSLT processor. As the names suggest, a basic XSLT processor does not support the features of XSLT that require access to schema information, either statically or dynamically. A stylesheet that works with a basic XSLT processor will produce the same results with a schema-aware XSLT processor provided that the source documents are untyped (that is, they are not validated against a schema). However, if source documents are validated against a schema then the results may be different from the case where they are not validated. Some constructs that work on untyped data may fail with typed data (for example, an attribute of type xs:date cannot be used as an argument of the substring FO30 function) and other constructs may produce different results depending on the datatype (for example, given the element <product price="10.00" discount="2.00"/>, the expression @price gt @discount will return true if the attributes have type xs:decimal, but will return false if they are untyped).
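One way to write code that behaves the same with typed and untyped input is to convert values explicitly. The following is a sketch only: the released attribute and the output attributes are hypothetical names. With untyped attributes, the value comparison @price gt @discount compares the values as strings, which is why it returns false for "10.00" and "2.00"; the explicit casts below avoid that dependency on type annotations.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:template match="product">
    <product>
      <!-- Casting both operands forces a numeric comparison whether the
           attributes are annotated as xs:decimal or are untyped. -->
      <xsl:if test="xs:decimal(@price) gt xs:decimal(@discount)">
        <xsl:attribute name="discounted" select="'yes'"/>
      </xsl:if>
      <!-- string() yields the string value of the attribute, so substring()
           accepts it even if @released (hypothetical) is annotated as xs:date. -->
      <xsl:attribute name="year" select="substring(string(@released), 1, 4)"/>
    </product>
  </xsl:template>

</xsl:stylesheet>
```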
There is a standard set of type definitions that are always available as in-scope schema components in every stylesheet. These are defined in 3.14 Built-in Types.
The remainder of this section describes facilities that are available only with a schema-aware XSLT processor.
Additional schema components (type definitions, element declarations, and attribute declarations) may be added to the in-scope schema components by means of the xsl:import-schema declaration in a stylesheet.
The xsl:import-schema declaration may reference an external schema document by means of a URI, or it may contain an inline xs:schema element.
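The two forms might look as follows. This is an illustrative sketch for a schema-aware processor: the namespaces, the file name invoice.xsd, and the type name percentage are all assumptions, not taken from this specification.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- External schema document, located by URI (hypothetical file). -->
  <xsl:import-schema namespace="http://example.com/ns/invoice"
                     schema-location="invoice.xsd"/>

  <!-- Inline schema: the namespace attribute matches targetNamespace. -->
  <xsl:import-schema namespace="http://example.com/ns/local">
    <xs:schema targetNamespace="http://example.com/ns/local">
      <xs:simpleType name="percentage">
        <xs:restriction base="xs:decimal">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="100"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:schema>
  </xsl:import-schema>

</xsl:stylesheet>
```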
It is only necessary to import a schema explicitly if one or more of its schema components are referenced explicitly by name in the stylesheet; it is not necessary to import a schema merely because the stylesheet is used to process a source document that has been assessed against that schema. It is possible to make use of the information resulting from schema assessment (for example, the fact that a particular attribute holds a date) even if no schema has been imported by the stylesheet.
Importing a schema does not of itself say anything about the type of the source document that the stylesheet is expected to process. The imported type definitions can be used for temporary nodes or for nodes on a result tree just as much as for nodes in source documents. It is possible to make assertions about the type of an input document by means of tests within the stylesheet. For example:
<xsl:mode typed="lax"/>
<xsl:global-context-item use="required"
    as="document-node(schema-element(my:invoice))"/>
This example will cause the transformation to fail with an error message, unless the global context item is valid against the top-level element declaration my:invoice, and has been annotated as such.
The setting typed="lax" further ensures that in any match pattern for a template rule in this mode, an element name that corresponds to the name of an element declaration in the schema is taken as referring to elements validated against that declaration: for example, match="employee" will only match a validated employee element. Selecting this option enables the XSLT processor to do more compile-time type-checking against the schema; for example, it allows the processor to produce warning or error messages when path expressions contain misspelt element names, or confuse an element with an attribute.
It is also true that importing a schema does not of itself say anything about the structure of the result tree. It is possible to request validation of a result tree against the schema by using the xsl:result-document instruction, for example:
<xsl:template match="/">
  <xsl:result-document validation="strict">
    <xhtml:html>
      <xsl:apply-templates/>
    </xhtml:html>
  </xsl:result-document>
</xsl:template>
This example will cause the transformation to fail with an error message unless the document element of the result document is valid against the top-level element declaration xhtml:html.
It is possible that a source document may contain nodes whose type annotation is not one of the types imported by the stylesheet. This creates a potential problem because in the case of an expression such as data(.) instance of xs:integer the system needs to know whether the type named in the type annotation of the context node is derived by restriction from the type xs:integer. This information is not explicitly available in an XDM tree, as defined in [XDM 3.0]. The implementation may choose one of several strategies for dealing with this situation:
The processor may signal a dynamic error if a source document is found to contain a type annotation that is not known to the processor.
The processor may maintain additional metadata, beyond that described in [XDM 3.0], that allows the source document to be processed as if all the necessary schema information had been imported using xsl:import-schema. Such metadata might be held in the data structure representing the source document itself, or it might be held in a system catalog or repository.
The processor may be configured to use a fixed set of schemas, which are automatically used to validate all source documents before they can be supplied as input to a transformation. In this case it is impossible for a source document to have a type annotation that the processor is not aware of.
The processor may be configured to treat the source document as if no schema processing had been performed, that is, effectively to strip all type annotations from elements and attributes on input, marking them instead as having type xs:untyped and xs:untypedAtomic respectively.
Where a stylesheet author chooses to make assertions about the types of nodes or of variables and parameters, it is possible for an XSLT processor to perform static analysis of the stylesheet (that is, analysis in the absence of any source document). Such analysis may reveal errors that would otherwise not be discovered until the transformation is actually executed. An XSLT processor is not required to perform such static type-checking. Under some circumstances (see 2.14 Error Handling) type errors that are detected early may be reported as static errors. In addition an implementation may report any condition found during static analysis as a warning, provided that this does not prevent the stylesheet being evaluated as described by this specification.
A stylesheet can also control the type annotations of nodes that it constructs in a result tree. This can be done in a number of ways.
It is possible to request explicit validation of a complete document, that is, a result tree rooted at a document node. Validation is either strict or lax, as described in [XML Schema Part 1]. If validation of a result tree fails (strictly speaking, if the outcome of the validity assessment is invalid), then the transformation fails, but in all other cases, the element and attribute nodes of the tree will be annotated with the names of the types to which these nodes conform. These type annotations will be discarded if the result tree is serialized as an XML document, but they remain available when the result tree is passed to an application (perhaps another stylesheet) for further processing.
It is also possible to validate individual element and attribute nodes as they are constructed. This is done using the type and validation attributes of the xsl:element, xsl:attribute, xsl:copy, and xsl:copy-of instructions, or the xsl:type and xsl:validation attributes of a literal result element.
When elements, attributes, or document nodes are copied, either explicitly using the xsl:copy or xsl:copy-of instructions, or implicitly when nodes in a sequence are attached to a new parent node, the options validation="strip" and validation="preserve" are available, to control whether existing type annotations are to be retained or not.
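The node-level options above can be sketched as follows, assuming a schema-aware processor. The element and attribute names (order, shipment, shipped, quantity, item, date, qty) are hypothetical. Only built-in types are referenced, so no xsl:import-schema declaration is shown.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:template match="order">
    <shipment>
      <!-- type attribute on xsl:element: the new element is validated
           against xs:date and annotated with that type. -->
      <xsl:element name="shipped" type="xs:date">
        <xsl:value-of select="@date"/>
      </xsl:element>
      <!-- xsl:type attribute on a literal result element. -->
      <quantity xsl:type="xs:integer">
        <xsl:value-of select="@qty"/>
      </quantity>
      <!-- Copy child elements, retaining any existing type annotations;
           validation="strip" would discard them instead. -->
      <xsl:copy-of select="item" validation="preserve"/>
    </shipment>
  </xsl:template>

</xsl:stylesheet>
```

If the value of date is not a valid xs:date, validation of the shipped element fails and the transformation reports an error, as described for whole-document validation above.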
When nodes in a temporary tree are validated, type information is available for use by operations carried out on the temporary tree, in the same way as for a source document that has undergone schema assessment.
For details of how validation of element and attribute nodes works, see 25.4 Validation.
[Definition: The term streaming refers to a manner of processing in which XML documents (such as source and result documents) are not represented by a complete tree of nodes occupying memory proportional to document size, but instead are processed “on the fly” as a sequence of events, similar in concept to the stream of events notified by an XML parser to represent markup in lexical XML.]
[Definition: A streamed document is a source tree that is processed using streaming, that is, without constructing a complete tree of nodes in memory.]
[Definition: A streamed node is a node in a streamed document.]
Many processors implementing earlier versions of this specification have adopted an architecture that allows streaming of the result tree directly to a serializer, without first materializing the complete result tree in memory. Streaming of the source tree, however, has proved to be more difficult without subsetting the language. This has created a situation where documents exceeding the capacity of virtual memory could not be transformed. XSLT 3.0 therefore introduces facilities allowing stylesheets to be written in a way that makes streaming of source documents possible, without excessive reliance on processor-specific optimization techniques.
Streaming achieves two important objectives: it allows large documents to be transformed without requiring correspondingly large amounts of memory; and it allows the processor to start producing output before it has finished receiving its input, thus reducing latency.
This specification does not attempt to legislate precisely which implementation techniques fall under the definition of streaming, and which do not. A number of techniques are available that reduce memory requirements, while still requiring a degree of buffering, or allocation of memory to partial results. A stylesheet that requests streaming of a source document is indicating that the processor should avoid assuming that the entire source document will fit in memory; in return, the stylesheet must be written in a way that makes streaming possible. This specification does not attempt to describe the algorithms that the processor should actually use, or to impose quantitative constraints on the resources that these algorithms should consume.
Nothing in this specification, nor in its predecessors [XSLT 1.0] and [XSLT 2.0], prevents a processor using streaming whenever it sees an opportunity to do so. However, experience has shown that in order to achieve streaming, it is often necessary to write stylesheet code in such a way as to make this possible. Therefore, XSLT 3.0 provides explicit constructs allowing the stylesheet author to request streaming, and defines explicit static constraints on the structure of the code which are designed to make streaming possible.
A processor that claims conformance with the streaming option offers a guarantee that when streaming is requested for a source document, and when the stylesheet conforms to the rules that make the processing guaranteed-streamable, then an algorithm will be adopted in which memory consumption is either completely independent of document size, or increases only very slowly as document size increases, allowing documents to be processed that are orders-of-magnitude larger than the physical memory available. A processor that does not claim conformance with the streaming option must still process a stylesheet and deliver the correct results, but is not required to use streaming algorithms, and may therefore fail with out-of-memory errors when presented with large source documents.
Apart from the fact that there are constructs to request streaming, and rules that must be followed to guarantee that streaming is possible, the language has been designed so there are as few differences as possible between streaming and non-streaming evaluation. The semantics of the language continue to be expressed in terms of the XDM data model, which is substantively unchanged; but readers must take care to observe that when terms like “node” and “axis” are used, the concepts are completely abstract and may have no direct representation in the run-time execution environment.
Streamed processing of a document can be initiated in one of three ways:
The initial mode can be declared as a streamable mode. In this case the initial match selection will generally be a document node (or sequence of document nodes), supplied by the calling application in a form that allows streaming (that is, in some form other than a tree in memory; for example, as a reference to a push or pull XML parser primed to deliver a stream of events). The type of these nodes can be constrained by using the attribute on-no-match="fail" on the initial mode, and using this mode only for processing the top-level nodes.
Streamed processing of any document can be initiated using the xsl:source-document instruction. This has an attribute href whose value is the URI of a document to be processed, and an attribute streamable that indicates whether it is to be processed using streaming; the actual processing to be applied is defined by the instructions written as children of the xsl:source-document instruction.
Streamed merging of a set of input documents can be initiated using the xsl:merge instruction.
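The first two of these mechanisms can be sketched in a single stylesheet. This is an illustration under stated assumptions: the document transactions.xml, its element names, and the template name count-transactions are hypothetical, and the choice of on-no-match="shallow-copy" is simply one convenient setting for the example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Way 1: the unnamed mode is declared streamable, so a document supplied
       by the caller as the initial match selection is processed by streaming.
       With no template rules, shallow-copy reproduces the input. -->
  <xsl:mode streamable="yes" on-no-match="shallow-copy"/>

  <!-- Way 2: a named entry point that streams a document identified by URI. -->
  <xsl:template name="count-transactions">
    <xsl:source-document href="transactions.xml" streamable="yes">
      <count>
        <xsl:value-of select="count(transactions/transaction)"/>
      </count>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```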
The rules for streamability, which are defined in detail in 19 Streamability, impose two main constraints:
The only nodes reachable from the node that is currently being processed are its attributes and namespaces, its ancestors and their attributes and namespaces, and its descendants and their attributes and namespaces. The siblings of the node, and the siblings of its ancestors, are not reachable in the tree, and any attempt to use their values is a static error.
When processing a given node in the tree, each descendant node can only be visited once. Essentially this allows two styles of processing: either visit each of the children once, and then process that child with the same restrictions applied; or process all the descendants in a single pass, in which case it is not possible while processing a descendant to make any further downward selection.
The second restriction, that only one visit to the children is allowed, means that XSLT code that was not designed with streaming in mind will often need to be rewritten to make it streamable. In many cases it is possible to do this using a technique sometimes called windowing or burst-mode streaming (note this is not quite the same meaning as windowing in XQuery 3.0). Many XML documents consist of a large number of elements, each of manageable size, representing transactions or business objects where each such element can be processed independently: in such cases, an effective design pattern is to write a streaming transformation that takes a snapshot of each element in turn, processing the snapshot using the full power of the XSLT language. Each snapshot is a tree built in memory and is therefore fully navigable. For details see the snapshot and copy-of functions.
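A minimal sketch of the burst-mode pattern follows, assuming a hypothetical document transactions.xml whose transaction elements each contain line children with an amount attribute. Each transaction is copied into memory in turn, so the body of the xsl:for-each can make more than one downward selection from it.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="summarize">
    <xsl:source-document href="transactions.xml" streamable="yes">
      <!-- copy-of(.) materializes one transaction at a time as an
           in-memory tree; the rest of the document remains streamed. -->
      <xsl:for-each select="transactions/transaction/copy-of(.)">
        <!-- Two separate downward selections, which would not be permitted
             on the streamed node itself. -->
        <summary id="{@id}"
                 lines="{count(line)}"
                 total="{sum(line/@amount)}"/>
      </xsl:for-each>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```

Memory use in this design depends on the size of the largest individual transaction rather than on the size of the whole document.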
The new facility of accumulators allows applications complete control over how much information is retained (and by implication, how much memory is required) in the course of a pass over a streamed document. An accumulator computes a value for every node in a streamed document: or more accurately, two values, one for the first visit to a node (before visiting its descendants), and a second value for the second visit to the node (after visiting the descendants). The computation is structured in such a way that the value for a given node can depend only on the value for the previous node in document order together with the data available when positioned at the current node (for example, the attribute values). Based on the well-established fold operation of functional programming languages, accumulators provide the convenience and economy of mutable variables while remaining within the constraints of a purely declarative processing model.
When streaming is initiated, for example using the xsl:source-document instruction, it is necessary to declare which accumulators are applicable to the streamed document.
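As an illustrative sketch, the following stylesheet declares an accumulator that counts transaction elements and makes it applicable to a streamed document through the use-accumulators attribute. The accumulator name tx-count, the template name, and the document structure are assumptions made for the example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The value starts at 0 and is incremented on the first visit to
       each transaction element; $value holds the previous value. -->
  <xsl:accumulator name="tx-count" as="xs:integer"
                   initial-value="0" streamable="yes">
    <xsl:accumulator-rule match="transaction" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template name="number-transactions">
    <!-- use-accumulators declares which accumulators apply to this document. -->
    <xsl:source-document href="transactions.xml" streamable="yes"
                         use-accumulators="tx-count">
      <xsl:for-each select="transactions/transaction">
        <row n="{accumulator-before('tx-count')}" id="{@id}"/>
      </xsl:for-each>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```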
Streaming applications often fall into one of the following categories:
Aggregation applications, where a single aggregation operation (perhaps count FO30, sum FO30, exists FO30, or distinct-values FO30) is applied to a set of elements selected from the streamed source document by means of a path expression.
Record-at-a-time applications, where the source document consists of a long sequence of elements with similar structure (“records”), and each “record” is processed using the same logic, independently of any other “records”. This kind of processing is facilitated using the snapshot and copy-of functions mentioned earlier.
Grouping applications, where the output follows the structure of the input, except that an extra layer of hierarchy is added. For example, the input might be a flat series of banking transactions in date/time order, and the output might contain the same transactions grouped by date.
Accumulator applications, which are the same as record-at-a-time applications, except that the processing of one “record” might depend on data encountered earlier in the document. A classic example is processing a sequence of banking transactions in which the input transaction contains a debit or credit amount, and the output adds a running total (the account balance). The xsl:iterate instruction has been introduced to facilitate this style of processing.
Isomorphic transformations, in which there is an ordered (often largely one-to-one) relationship between the nodes of the source tree and the nodes of the result tree: for example, transformations that involve only the renaming or selective deletion of nodes, or scalar manipulations of the values held in the leaf nodes. Such transformations are most conveniently expressed using recursive application of template rules. This is possible with a streamed input document only if all the template rules adhere to the constraints required for streamability. To enforce these rules, while still allowing unrestricted processing of other documents within the same transformation, all streaming evaluation must be carried out using a specific mode, which is declared to be a streaming mode by means of an xsl:mode declaration in the stylesheet.
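The running-total case mentioned above can be sketched with xsl:iterate. The document transactions.xml, with an account element containing transaction elements that carry date and value attributes, is an assumption made for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:template name="running-balance">
    <xsl:source-document href="transactions.xml" streamable="yes">
      <xsl:iterate select="account/transaction">
        <!-- The iteration parameter carries the balance from one
             transaction to the next. -->
        <xsl:param name="balance" as="xs:decimal" select="0.00"/>
        <xsl:variable name="newBalance" as="xs:decimal"
                      select="$balance + xs:decimal(@value)"/>
        <balance date="{@date}" value="{$newBalance}"/>
        <xsl:next-iteration>
          <xsl:with-param name="balance" select="$newBalance"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```

Only the attributes of the current transaction and the single parameter value are needed at each step, so nothing from earlier transactions has to be retained apart from the balance.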
There are important classes of application in which streaming is possible only if multiple streams can be processed in parallel. This specification therefore provides facilities:
allowing multiple sorted input sequences to be merged into one sorted output sequence (the xsl:merge instruction)
allowing multiple output sequences to be generated during a single pass of an input sequence (the xsl:fork instruction).
These facilities have been designed in such a way that they can readily be implemented using streaming, that is, without materializing the input or output sequences in memory.
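Both instructions can be sketched briefly. The file names, element names, and attribute names below are hypothetical, and the merge example assumes that each input file is already sorted by its timestamp attribute.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:fork: two results computed in a single pass over one input. -->
  <xsl:template name="order-statistics">
    <xsl:source-document href="orders.xml" streamable="yes">
      <xsl:fork>
        <xsl:sequence>
          <count><xsl:value-of select="count(orders/order)"/></count>
        </xsl:sequence>
        <xsl:sequence>
          <total><xsl:value-of select="sum(orders/order/@amount)"/></total>
        </xsl:sequence>
      </xsl:fork>
    </xsl:source-document>
  </xsl:template>

  <!-- xsl:merge: two pre-sorted inputs merged into one sorted output. -->
  <xsl:template name="merge-logs">
    <merged>
      <xsl:merge>
        <xsl:merge-source for-each-source="('log-1.xml', 'log-2.xml')"
                          select="log/entry" streamable="yes">
          <xsl:merge-key select="@timestamp"/>
        </xsl:merge-source>
        <xsl:merge-action>
          <!-- Entries sharing the current merge key value. -->
          <xsl:copy-of select="current-merge-group()"/>
        </xsl:merge-action>
      </xsl:merge>
    </merged>
  </xsl:template>

</xsl:stylesheet>
```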
Streaming can be combined with schema-aware processing: that is, the streamed input to a transformation can be subjected to on-the-fly validation, a process which typically accepts an input stream from the XML parser and delivers an output stream (of type-annotated nodes) to the transformation processor. The XSD specification is designed so that validation is, with one or two exceptions, a streamable process. The exceptions include:
There may be a need to allocate memory to hold keys, in order to enforce uniqueness and referential integrity constraints (xs:unique, xs:key, xs:keyref).
In XSD 1.1, assertions can be defined by means of XPath expressions. These are not constrained to be streamable; in the general case, any subtree of the document that is validated using an assertion may need to be buffered in memory while the assertion is processed.
Applications that need to run in finite memory may therefore need to avoid these XSD features, or to use them with care.
XSD is designed so that the intended type of an element (the “governing type”) can be determined as soon as the start tag of the element is encountered: the process of validation checks whether the content of the element actually conforms to this type, and by the time the end tag is encountered, the process will have established either that the element is valid against the governing type, or that it is invalid.
By default, dynamic errors occurring during streamed processing are fatal: they typically cause the transformation to fail immediately. XSLT 3.0 introduces the ability to catch dynamic errors and recover from them. Schema invalidity, however, is treated as a dynamic error of the instruction that processes the entire input stream, so after a validation failure, no further processing of that input stream is possible.
In consequence, a streamed validator that is running in tandem with a streamed transformation can present the transformer with element nodes that carry a provisional type annotation representing the type that the element will have if it turns out to be valid. As soon as a node is encountered that violates this assumption, the validator should stop the flow of data to the transformer, so that the transformer never sees invalid data. This allows the stylesheet code to be compiled with the assumption of type-safety: at run-time, all nodes seen by the transformation will conform to their XSLT-declared types (for example, a type declared implicitly using match="schema-element(invoice)" on an xsl:template element).
A streamed transformation that only accesses part of the input document (for example, a header at the start of a document) is not required to continue reading once the data it needs has been read. This means that XML well-formedness or validity errors occurring in the unread part of the input stream may go undetected.
The facilities in this specification designed to enable large data sets to be processed in a streaming manner are oriented almost entirely to XML data. This does not mean that there is never a requirement to stream non-XML data, or that the Working Group has ignored this requirement; rather, the Working Group has concluded that for the most part, streaming of non-XML data can be achieved by implementations without the need for specific language features in XSLT.
To make streamed processing of unparsed text files easier, the function unparsed-text-lines FO30 has been introduced. This is not only more convenient for stylesheet authors than reading the entire input using the unparsed-text FO30 function and then tokenizing the result; it is also easier for implementations to optimize, allowing each line of text to be discarded from memory after it has been processed.
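A short sketch of line-by-line processing follows. The file name input.txt and the output element names are assumptions; whether lines are actually discarded from memory as they are processed remains a matter for the implementation.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="read-lines">
    <lines>
      <!-- Each item in the sequence is one line of the file as a string. -->
      <xsl:for-each select="unparsed-text-lines('input.txt')">
        <line n="{position()}">
          <xsl:value-of select="."/>
        </line>
      </xsl:for-each>
    </lines>
  </xsl:template>

</xsl:stylesheet>
```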
For all functions that access external data, including document, doc FO30, collection FO30, unparsed-text FO30, unparsed-text-lines FO30, and (in XPath 3.1) json-doc FO31, the requirements on determinism can now be relaxed using implementation-defined configuration options. This is significant because it means that when a transformation reads the same external resource more than once, it becomes legitimate for the contents of the resource to be different on different invocations, and this eliminates the need for the processor to cache the contents of the resource in memory.
In the XDM data model, every value is a sequence, and (as with most functional programming languages), processing of sequences of items is pervasive throughout the XSLT and XPath languages and their function library. Good performance of a functional programming language often depends on sequence-based operations being pipelined, and being evaluated in a lazy fashion (that is, many operations process items in a sequence one at a time, in order; and many operations can deliver a result without processing the entire sequence). The semantics of XSLT and XPath permit pipelined and lazy evaluation (for example, the error handling semantics are carefully written to ensure this), but they do not require it: the details are left to implementations. Pipelined processing of a sequence is not the same thing as streamed processing of a tree, and where the XSLT specification talks of operations being “guaranteed streamable”, this is always referring to processing of trees, not of sequences.
The facilities for streaming of XML trees include operations such as copy-of and snapshot which are able to take a sequence of streamed nodes as input, and produce a sequence of in-memory (unstreamed) nodes as output. It is also possible to generate a sequence of strings or other atomic values through the process of atomization. The actual memory usage of a streamed XSLT application may depend significantly on whether the processing of the resulting sequence of in-memory nodes or atomic values is pipelined or not. The specification, however, has nothing to say on this matter: it is considered an area where implementers can exercise their discretion and ingenuity.
Streaming of JSON input receives little attention in this specification. One can envisage an implementation of the json-to-xml function in which the XML delivered by the function consists of streamed nodes; but the Working Group has not researched the feasibility of such an implementation in any detail.
[Definition: An error that can be detected by examining a stylesheet before execution starts (that is, before the source document and values of stylesheet parameters are available) is referred to as a static error.]
Generally, errors in the structure of the stylesheet, or in the syntax of XPath expressions contained in the stylesheet, are classified as static errors. Where this specification states that an element in the stylesheet must or must not appear in a certain position, or that it must or must not have a particular attribute, or that an attribute must or must not have a value satisfying specified conditions, then any contravention of this rule is a static error unless otherwise specified.
A processor must provide a mode of operation that takes a (possibly erroneous) stylesheet package as input and enables the user to determine whether or not that package contains any static errors.
Note:
The manner in which static errors are reported, and the behavior when there are multiple static errors, are left as design choices for the implementer. It is recommended that the error codes defined in this specification should be made available in any diagnostics.
A processor may also provide a mode of operation in which static errors in parts of the stylesheet that are not evaluated can go unreported.
Note:
For example, when operating in this mode, a processor might report static errors in a template rule only if the input document contains nodes that match that template rule. Such a mode of operation can provide performance benefits when large and well-tested stylesheets are used to process source documents that might only use a small part of the XML vocabulary that the stylesheet is designed to handle.
[Definition: An error that is not capable of detection until a source document is being transformed is referred to as a dynamic error.]
When a dynamic error occurs, and is not caught using xsl:catch, the processor must signal the error, and the transformation fails.
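As a sketch of catching a dynamic error, the following template attempts to convert an attribute to xs:date and recovers if the conversion fails. The element and attribute names (invoice, due) are hypothetical. The variable $err:description is available within xsl:catch, with the err prefix bound to the error namespace.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    exclude-result-prefixes="xs err">

  <xsl:template match="invoice">
    <due>
      <xsl:try>
        <!-- Raises a dynamic error if @due is not a valid date. -->
        <xsl:value-of select="xs:date(@due)"/>
        <!-- errors="*" catches any dynamic error raised in the xsl:try. -->
        <xsl:catch errors="*">
          <xsl:message>Invalid due date: <xsl:value-of select="$err:description"/></xsl:message>
          <xsl:text>unknown</xsl:text>
        </xsl:catch>
      </xsl:try>
    </due>
  </xsl:template>

</xsl:stylesheet>
```

Without the xsl:try and xsl:catch, the same invalid value would cause the transformation to fail.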
Because different implementations may optimize execution of the stylesheet in different ways, the detection of dynamic errors is to some degree implementation-dependent. In cases where an implementation is able to produce a principal result or secondary result without evaluating a particular construct, the implementation is never required to evaluate that construct solely in order to determine whether doing so causes a dynamic error. For example, if a variable is declared but never referenced, an implementation may choose whether or not to evaluate the variable declaration, which means that if evaluating the variable declaration causes a dynamic error, some implementations will signal this error and others will not.
There are some cases where this specification requires that a construct must not be evaluated: for example, the content of an xsl:if instruction must not be evaluated if the test condition is false. This means that an implementation must not signal any dynamic errors that would arise if the construct were evaluated.
An implementation may signal a dynamic error before any source document is available, but only if it can determine that the error would be signaled for every possible source document and every possible set of parameter values. For example, some circularity errors fall into this category: see 9.11 Circular Definitions.
There are also some dynamic errors where the specification gives a processor license to signal the error during the analysis phase even if the construct might never be executed; an example is the use of an invalid QName as a literal argument to a function such as key, or the use of an invalid regular expression in the regex attribute of the xsl:analyze-string instruction.
A dynamic error is also signaled during the static analysis phase if the error occurs during evaluation of a static expression.
The XPath specification states (see Section 2.3.1 Kinds of Errors XP30) that if any expression (at any level) can be evaluated during the analysis phase (because all its explicit operands are known and it has no dependencies on the dynamic context), then any error in performing this evaluation may be reported as a static error. For XPath expressions used in an XSLT stylesheet, however, any such errors must not be reported as static errors in the stylesheet unless they would occur in every possible evaluation of that stylesheet; instead, they must be signaled as dynamic errors, and signaled only if the XPath expression is actually evaluated.
An XPath processor may report statically that the expression 1 div 0 fails with a “divide by zero” error. But suppose this XPath expression occurs in an XSLT construct such as:
<xsl:choose>
  <xsl:when test="system-property('xsl:version') = '1.0'">
    <xsl:value-of select="1 div 0"/>
  </xsl:when>
  <xsl:otherwise>
    <xsl:value-of select="xs:double('INF')"/>
  </xsl:otherwise>
</xsl:choose>
Then the XSLT processor must not report an error, because the relevant XPath construct appears in a context where it will never be executed by an XSLT 2.0 or 3.0 processor. (An XSLT 1.0 processor will execute this code successfully, returning positive infinity, because it uses double arithmetic rather than decimal arithmetic.)
[Definition: Certain errors are classified as type errors. A type error occurs when the value supplied as input to an operation is of the wrong type for that operation, for example when an integer is supplied to an operation that expects a node.] If a type error occurs in an instruction that is actually evaluated, then it must be signaled in the same way as a dynamic error. Alternatively, an implementation may signal a type error during the analysis phase in the same way as a static error, even if it occurs in part of the stylesheet that is never evaluated, provided it can establish that execution of a particular construct would never succeed.
It is implementation-defined whether type errors are signaled statically.
The following construct contains a type error, because 42 is not allowed as the value of the select expression of the xsl:number instruction (it must be a node). An implementation may optionally signal this as a static error, even though the offending instruction will never be evaluated, and the type error would therefore never be signaled as a dynamic error.
<xsl:if test="false()">
  <xsl:number select="42"/>
</xsl:if>
On the other hand, in the following example it is not possible to determine statically whether the operand of xsl:number will have a suitable dynamic type. An implementation may produce a warning in such cases, but it must not treat it as an error.
<xsl:template match="para">
  <xsl:param name="p" as="item()"/>
  <xsl:number select="$p"/>
</xsl:template>
If more than one error arises, an implementation is not required to signal any errors other than the first one that it detects. It is implementation-dependent which of the several errors is signaled. This applies both to static errors and to dynamic errors. An implementation is allowed to signal more than one error, but if any errors have been signaled, it must not finish as if the transformation were successful.
When a transformation signals one or more dynamic errors, the final state of any persistent resources updated by the transformation is implementation-dependent. Implementations are not required to restore such resources to their initial state. In particular, where a transformation produces multiple result documents, it is possible that one or more serialized result documents may be written successfully before the transformation terminates, but the application cannot rely on this behavior.
Everything said above about error handling applies equally to errors in evaluating XSLT instructions, and errors in evaluating XPath expressions. Static errors and dynamic errors may occur in both cases.
[Definition: If a transformation has successfully produced a principal result or secondary result, it is still possible that errors may occur in serializing that result. For example, it may be impossible to serialize the result using the encoding selected by the user. Such an error is referred to as a serialization error.] If the processor performs serialization, then it must do so as specified in 26 Serialization, and in particular it must signal any serialization errors that occur.
Errors are identified by a QName. For errors defined in this specification, the namespace of the QName is always http://www.w3.org/2005/xqt-errors (and is therefore not given explicitly), while the local part is an 8-character code in the form PPSSNNNN. Here PP is always XT (meaning XSLT), and SS is one of SE (static error), DE (dynamic error), or TE (type error). Note that the allocation of an error to one of these categories is purely for convenience and carries no normative implications about the way the error is handled. Many errors, for example, can be reported either dynamically or statically. These error codes are used to label error conditions in this specification, and are summarized in E Summary of Error Conditions.
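The PPSSNNNN structure can be illustrated with a small stylesheet function that reads the category letters out of an error code. This is only a sketch: the function name f:error-category and its namespace http://example.com/ns/functions are hypothetical, and the pattern assumes that NNNN is always four digits.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/ns/functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: returns a label for the SS part of an XSLT error code. -->
  <xsl:function name="f:error-category" as="xs:string">
    <xsl:param name="code" as="xs:QName"/>
    <xsl:variable name="local" as="xs:string"
        select="string(local-name-from-QName($code))"/>
    <xsl:variable name="ns" as="xs:string"
        select="string(namespace-uri-from-QName($code))"/>
    <xsl:choose>
      <!-- Reject codes outside the standard namespace or not shaped like XTSSNNNN
           (assumption: NNNN is four digits). -->
      <xsl:when test="$ns ne 'http://www.w3.org/2005/xqt-errors'
                      or not(matches($local, '^XT[A-Z]{2}[0-9]{4}$'))">
        <xsl:sequence select="'not an XSLT-defined code'"/>
      </xsl:when>
      <!-- Characters 3 and 4 of the local part are the SS category letters. -->
      <xsl:when test="substring($local, 3, 2) eq 'SE'">
        <xsl:sequence select="'static error'"/>
      </xsl:when>
      <xsl:when test="substring($local, 3, 2) eq 'DE'">
        <xsl:sequence select="'dynamic error'"/>
      </xsl:when>
      <xsl:when test="substring($local, 3, 2) eq 'TE'">
        <xsl:sequence select="'type error'"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:sequence select="'unknown category'"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

</xsl:stylesheet>
```

Called as f:error-category(QName('http://www.w3.org/2005/xqt-errors', 'err:XTSE0010')), the function would return the label for a static error. Because the category letters carry no normative meaning, such a label is useful for diagnostics only and should not be used to predict when a processor will report the error.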
Errors defined in related specifications ([XPath 3.0], [Functions and Operators 3.0], [XSLT and XQuery Serialization]) use QNames with a similar structure, in the same namespace. When errors occur in processing XPath expressions, an XSLT processor should use the original error code reported by the XPath processor, unless a more specific XSLT error code is available.
Implementations must use the codes defined in these specifications when signaling dynamic errors, to ensure that xsl:catch behaves in an interoperable way across implementations. Stylesheet authors should note, however, that there are many examples of errors where more than one rule in this specification is violated, and where the processor therefore has discretion in deciding which error code to associate with the condition: there is therefore no guarantee that different processors will always use the same error code for the same erroneous input.
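As an illustration of why standard codes matter for xsl:catch, the following sketch catches an integer division by zero by its error code, FOAR0001, which is defined in [Functions and Operators 3.0]. The parameter name divisor and the result element are assumptions made for the example. The second xsl:catch is a fallback for the case where a processor associates a different code with the condition.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    exclude-result-prefixes="xs err">

  <!-- Hypothetical parameter; the default of 0 forces the error. -->
  <xsl:param name="divisor" as="xs:integer" select="0"/>

  <xsl:template name="xsl:initial-template">
    <result>
      <xsl:try>
        <!-- Integer division by zero raises err:FOAR0001. -->
        <xsl:value-of select="10 idiv $divisor"/>
        <!-- Matches only the standard division-by-zero code. -->
        <xsl:catch errors="err:FOAR0001">
          <xsl:text>division by zero</xsl:text>
        </xsl:catch>
        <!-- Fallback: report the local part of whatever code was raised. -->
        <xsl:catch errors="*">
          <xsl:value-of select="local-name-from-QName($err:code)"/>
        </xsl:catch>
      </xsl:try>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Matching on a specific code first and falling back to a wildcard keeps the stylesheet robust when an erroneous input violates more than one rule.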
Additional errors defined by an implementation (or by an application) may use QNames in an implementation-defined (or user-defined) namespace without risk of collision.
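A minimal sketch of an application-defined error follows. The namespace http://example.com/ns/app-errors, the code E0001, the message text, and the caught element are all hypothetical. The error is raised with the fn:error function and caught by matching every code in the application's namespace.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    xmlns:app="http://example.com/ns/app-errors"
    exclude-result-prefixes="err app">

  <xsl:template name="xsl:initial-template">
    <xsl:try>
      <!-- Raise an error whose QName lies in the application's own namespace. -->
      <xsl:sequence select="error(
          QName('http://example.com/ns/app-errors', 'app:E0001'),
          'Order total is negative')"/>
      <!-- app:* matches any error code in the application's namespace. -->
      <xsl:catch errors="app:*">
        <caught code="{local-name-from-QName($err:code)}"
                namespace="{namespace-uri-from-QName($err:code)}">
          <xsl:value-of select="$err:description"/>
        </caught>
      </xsl:catch>
    </xsl:try>
  </xsl:template>

</xsl:stylesheet>
```

Because the code lives in a namespace the application controls, it cannot be confused with a code in the http://www.w3.org/2005/xqt-errors namespace, even if the local parts happen to coincide.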