The facilities described in this section are designed to allow items in a sequence to be grouped based on common values; for example, they allow grouping of elements having the same value for a particular attribute, or elements with the same name, or elements with common values for any other expression. Since grouping identifies items with duplicate values, the same facilities also allow selection of the distinct values in a sequence of items, that is, the elimination of duplicates.
Note:
Simple elimination of duplicates can also be achieved using the function distinct-values: see [Functions and Operators 3.0].
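As a minimal sketch of duplicate elimination using the grouping facilities, the following stylesheet lists each distinct country once. It assumes an input document shaped like the cities example used later in this section (a cities element containing city elements with a country attribute); the output element names countries and country are purely illustrative.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <cities><city country="..."/>...</cities> -->
  <xsl:template match="/">
    <countries>
      <!-- One group per distinct @country value, in order of first appearance -->
      <xsl:for-each-group select="cities/city" group-by="@country">
        <country><xsl:value-of select="current-grouping-key()"/></country>
      </xsl:for-each-group>
    </countries>
  </xsl:template>

</xsl:stylesheet>
```

A call such as distinct-values(cities/city/@country) would return the same set of values, but in an implementation-dependent order, whereas the groups here are processed in order of first appearance.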
In addition, these facilities allow grouping based on sequential position, for example selecting groups of adjacent para elements. The facilities also provide an easy way to do fixed-size grouping, for example identifying groups of three adjacent nodes, which is useful when arranging data in multiple columns.
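Fixed-size grouping can be sketched with group-adjacent and a key computed from the context position. The example below is illustrative only: the items and item element names are hypothetical, and the table layout is just one possible use.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <items><item>...</item>...</items> (hypothetical names) -->
  <xsl:template match="items">
    <table>
      <!-- position() is the item's position in the population, so the key
           (position() - 1) idiv 3 is 0 for items 1 to 3, 1 for items 4 to 6,
           and so on; adjacent items with equal keys form one group. -->
      <xsl:for-each-group select="item"
                          group-adjacent="(position() - 1) idiv 3">
        <tr>
          <xsl:for-each select="current-group()">
            <td><xsl:value-of select="."/></td>
          </xsl:for-each>
        </tr>
      </xsl:for-each-group>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

The last row holds fewer than three cells when the number of items is not a multiple of three.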
For each group of items identified, it is possible to evaluate a sequence constructor for the group. Grouping is nestable to multiple levels so that groups of distinct items can be identified, then from among the distinct groups selected, further sub-grouping of distinct items in the current group can be done.
It is also possible for one item to participate in more than one group.
The xsl:for-each-group Element
<!-- Category: instruction -->
<xsl:for-each-group
select = expression
group-by? = expression
group-adjacent? = expression
group-starting-with? = pattern
group-ending-with? = pattern
composite? = boolean
collation? = { uri } >
<!-- Content: (xsl:sort*, sequence-constructor) -->
</xsl:for-each-group>
This element is an instruction that may be used anywhere within a sequence constructor.
[Definition: The xsl:for-each-group instruction allocates the items in an input sequence into groups of items (that is, it establishes a collection of sequences) based either on common values of a grouping key, or on a pattern that the initial or final item in a group must match.] The sequence constructor that forms the content of the xsl:for-each-group instruction is evaluated once for each of these groups.
[Definition: The sequence of items to be grouped, which is referred to as the population, is determined by evaluating the XPath expression contained in the select attribute.]
[Definition: The population is treated as a sequence; the order of items in this sequence is referred to as population order.]
A group is never empty. If the population is empty, the number of groups will be zero.
The assignment of items to groups depends on the group-by, group-adjacent, group-starting-with, and group-ending-with attributes.
[ERR XTSE1080] These four attributes are mutually exclusive: it is a static error if none of these four attributes is present or if more than one of them is present.
[ERR XTSE1090] It is a static error to specify the collation attribute or the composite attribute if neither the group-by attribute nor the group-adjacent attribute is specified.
[Definition: If either of the group-by or group-adjacent attributes is present, then for each item in the population a set of grouping keys is calculated, as follows: the expression contained in the group-by or group-adjacent attribute is evaluated; the result is atomized; and any xs:untypedAtomic values are cast to xs:string. If composite="yes" is specified, there is a single grouping key whose value is the resulting sequence; otherwise, there is a set of grouping keys, consisting of the distinct atomic values present in the result sequence.]
When calculating grouping keys for an item in the population, the expression contained in the group-by or group-adjacent attribute is evaluated with that item as the context item, with its position in population order as the context position, and with the size of the population as the context size.
If the group-by attribute is present, and if the composite attribute is omitted or takes the value no, then an item in the population may have multiple grouping keys: that is, the group-by expression evaluates to a sequence, and each item in the sequence is treated as a separate grouping key. The item is included in as many groups as there are distinct grouping keys (which may be zero).
If the group-adjacent attribute is used, and if the composite attribute is omitted or takes the value no, then each item in the population must have exactly one grouping key value.
[ERR XTTE1100] It is a type error if the result of evaluating the group-adjacent expression is an empty sequence or a sequence containing more than one item, unless composite="yes" is specified.
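One way to satisfy the exactly-one-key requirement is sketched below. It assumes a hypothetical input in which the children of a body element may or may not carry a class attribute; wrapping the attribute in string() yields a zero-length string rather than an empty sequence when the attribute is absent, so the group-adjacent expression always produces a single value.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <body> whose child elements optionally have @class -->
  <xsl:template match="body">
    <body>
      <!-- group-adjacent="@class" would raise XTTE1100 for any child without
           the attribute; string(@class) always returns exactly one xs:string. -->
      <xsl:for-each-group select="*" group-adjacent="string(@class)">
        <div class="{current-grouping-key()}">
          <xsl:copy-of select="current-group()"/>
        </div>
      </xsl:for-each-group>
    </body>
  </xsl:template>

</xsl:stylesheet>
```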
Grouping keys are compared using the rules for the deep-equal function (see [Functions and Operators 3.0]). This means that values of type xs:untypedAtomic will be cast to xs:string before the comparison, and that items that are not comparable using the eq operator are considered to be not equal, that is, they are allocated to different groups. It also means that the value NaN is considered equal to itself. If the values are strings, or untyped atomic values, then if there is a collation attribute the values are compared using the collation specified as the effective value of the collation attribute, resolved if relative against the base URI of the xsl:for-each-group element. If there is no collation attribute then the default collation is used.
[ERR XTDE1110] It is a dynamic error if the collation URI specified to xsl:for-each-group (after resolving against the base URI) is a collation that is not recognized by the implementation. (For notes, [see ERR XTDE1035].)
For more information on collations, see 13.1.3 Sorting Using Collations.
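The effect of the collation attribute on string grouping keys can be sketched as follows. The example assumes the cities input used later in this section, and assumes that the processor recognizes the Unicode Collation Algorithm URI with a strength parameter; if it does not, the rules above for unrecognized collations apply.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <cities><city country="..."/>...</cities> -->
  <xsl:template match="cities">
    <countries>
      <!-- With a secondary-strength UCA collation, keys that differ only in
           case (for example "France" and "FRANCE") fall into the same group. -->
      <xsl:for-each-group select="city" group-by="@country"
          collation="http://www.w3.org/2013/collation/UCA?strength=secondary">
        <!-- The reported key is that of the initial item of the group -->
        <country name="{current-grouping-key()}"
                 cities="{count(current-group())}"/>
      </xsl:for-each-group>
    </countries>
  </xsl:template>

</xsl:stylesheet>
```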
The way in which an xsl:for-each-group element is evaluated depends on which of the four group-defining attributes is present:
If the group-by attribute is present, the items in the population are examined, in population order. For each item J, the expression in the group-by attribute is evaluated to produce a sequence of zero or more grouping key values. If composite="yes" is specified, there will be a single grouping key, which will in general be a sequence of zero or more atomic values; otherwise, there will be zero or more grouping keys, each of which will be a single atomic value. For each one of these grouping keys, if there is already a group created to hold items having that grouping key value, J is appended to that group; otherwise a new group is created for items with that grouping key value, and J becomes its first member.
An item in the population may thus be appended to zero, one, or many groups. An item will never be appended more than once to the same group; if two or more grouping keys for the same item are equal, then the duplicates are ignored. An item here means the item at a particular position within the population—if the population contains the same node at several different positions in the sequence then a group may indeed contain duplicate nodes.
The number of groups will be the same as the number of distinct grouping key values present in the population.
If the population contains values of different numeric types that differ from each other by small amounts, then the eq operator is not transitive, because of rounding effects occurring during type promotion. The effect of this is described in 14.5 Non-Transitivity.
If the group-adjacent attribute is present, the items in the population are examined, in population order. If an item has the same value for the grouping key as its preceding item within the population (in population order), then it is appended to the same group as its preceding item; otherwise a new group is created and the item becomes its first member.
If the group-starting-with attribute is present, then its value must be a pattern.
The items in the population are examined in population order. If an item matches the pattern, or is the first item in the population, then a new group is created and the item becomes its first member. Otherwise, the item is appended to the same group as its preceding item within the population.
If the group-ending-with attribute is present, then its value must be a pattern.
The items in the population are examined in population order. If an item is the first item in the population, or if the previous item in the population matches the pattern, then a new group is created and the item becomes its first member. Otherwise, the item is appended to the same group as its preceding item within the population.
In all cases the order of items within each group is predictable, and reflects the original population order, in that the items are processed in population order and each item is appended at the end of zero or more groups.
Note:
As always, a different algorithm may be used if it achieves the same effect.
[Definition: For each group, the item within the group that is first in population order is known as the initial item of the group.]
The sequence constructor contained in the xsl:for-each-group element is evaluated once for each of the groups, in processing order. The sequences that result are concatenated, in processing order, to form the result of the xsl:for-each-group element. Within the sequence constructor, the context item is the initial item of the relevant group, the context position is the position of this group in the processing order of the groups, and the context size is the number of groups. This has the effect that within the sequence constructor, a call on position() takes successive values 1, 2, ... last().
Two pieces of information are available during the processing of each group (that is, while evaluating the sequence constructor contained in the xsl:for-each-group instruction, and also while evaluating the sort key of a group as expressed by the select attribute or sequence constructor of an xsl:sort child of the xsl:for-each-group element):
[Definition: The current group is the group itself, as a sequence of items].
[Definition: The current grouping key is a single atomic value, or in the case of a composite key, a sequence of atomic values, containing the grouping key of the items in the current group.]
Information about the current group and the current grouping key is held in the dynamic context, and is available using the current-group and current-grouping-key functions respectively.
In XSLT 2.0, the current group and the current grouping key were passed unchanged through calls of xsl:apply-templates and xsl:call-template, and also xsl:apply-imports and xsl:next-match. This behavior is retained in XSLT 3.0 except in the case where streaming is in use: specifically, if the xsl:apply-templates, xsl:call-template, xsl:apply-imports, or xsl:next-match instruction occurs within a declared-streamable construct (typically, within an xsl:source-document instruction, or within a streamable template rule), then the current group and current grouping key are set to absent in the called template. The reason for this is to allow the streamability of an xsl:for-each-group instruction to be assessed statically, as described in 19.8.4.19 Streamability of xsl:for-each-group.
The current-group function returns the group currently being processed by an xsl:for-each-group instruction.
This function is deterministic, context-dependent, and focus-independent (these terms are defined in [Functions and Operators 3.0]).
The evaluation context for XPath expressions includes a component called the current group, which is a sequence.
The function current-group returns the sequence of items making up the current group.
The current group is bound during evaluation of the xsl:for-each-group instruction. If no xsl:for-each-group instruction is being evaluated, the current group will be absent: that is, any reference to it will cause a dynamic error.
The effect of invocation constructs on the current group is as follows:
If the invocation construct is contained within a declared-streamable construct (for example, if it is within an xsl:source-document instruction with the attribute streamable="yes", or within a streamable template), then the invocation construct sets the current group to absent. In this situation the scope of the current group is effectively static; it can only be referenced within the body of the xsl:for-each-group instruction to which it applies.
If the invocation construct is a (static or dynamic) function call, then the invocation construct sets the current group to absent.
Otherwise the invocation construct leaves the current group unchanged. In this situation the scope of the current group is effectively dynamic: it can be referenced within called templates and attribute sets.
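The dynamic scoping described in the last case can be sketched as follows. This is an illustration under stated assumptions: no streaming is in use, the input is the cities document used later in this section, and the template name summarize is hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <cities><city country="..." pop="..."/>...</cities> -->
  <xsl:template match="cities">
    <summary>
      <xsl:for-each-group select="city" group-by="@country">
        <!-- xsl:call-template is an invocation construct; outside a
             declared-streamable construct it leaves the current group and
             current grouping key unchanged. -->
        <xsl:call-template name="summarize"/>
      </xsl:for-each-group>
    </summary>
  </xsl:template>

  <!-- Hypothetical named template that relies on the caller's current group -->
  <xsl:template name="summarize">
    <country name="{current-grouping-key()}"
             total="{sum(current-group()/@pop)}"/>
  </xsl:template>

</xsl:stylesheet>
```

If the same logic were moved into a stylesheet function, the current group would be absent inside the function body, so the group would have to be passed explicitly as an argument.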
The current group is initially absent during the evaluation of global variables and stylesheet parameters, during the evaluation of the use attribute or contained sequence constructor of xsl:key, and during the evaluation of the initial-value attribute of xsl:accumulator and the select attribute or contained sequence constructor of xsl:accumulator-rule.
[ERR XTSE1060] It is a static error if the current-group function is used within a pattern.
[ERR XTDE1061] It is a dynamic error if the current-group function is used when the current group is absent, or when it is invoked in the course of evaluating a pattern. The error may be reported statically if it can be detected statically.
Like other XSLT extensions to the dynamic evaluation context, the current group is not retained as part of the closure of a function value. This means that the expression current-group#0 is valid and returns a function value, but any invocation of this function will fail with a dynamic error [see ERR XTDE1061].
The current-grouping-key function returns the grouping key of the group currently being processed using the xsl:for-each-group instruction.
This function is deterministic, context-dependent, and focus-independent (these terms are defined in [Functions and Operators 3.0]).
The evaluation context for XPath expressions includes a component called the current grouping key, which is a sequence of atomic values. The current grouping key is the grouping key shared in common by all the items within the current group.
The function current-grouping-key returns the current grouping key.
The current grouping key is bound during evaluation of an xsl:for-each-group instruction that has a group-by or group-adjacent attribute. If no xsl:for-each-group instruction is being evaluated, the current grouping key will be absent, which means that any reference to it causes a dynamic error. The current grouping key is also set to absent during the evaluation of an xsl:for-each-group instruction with a group-starting-with or group-ending-with attribute.
The effect of invocation constructs on the current grouping key is as follows:
If the invocation construct is contained within a declared-streamable construct (for example, if it is within an xsl:source-document instruction with the attribute streamable="yes", or within a streamable template), then the invocation construct sets the current grouping key to absent. In this situation the scope of the current grouping key is effectively static; it can only be referenced within the body of the xsl:for-each-group instruction to which it applies.
If the invocation construct is a (static or dynamic) function call, then the invocation construct sets the current grouping key to absent.
Otherwise the invocation construct leaves the current grouping key unchanged. In this situation the scope of the current grouping key is effectively dynamic: it can be referenced within called templates and attribute sets.
The current grouping key is initially absent during the evaluation of global variables and stylesheet parameters, during the evaluation of the use attribute or contained sequence constructor of xsl:key, and during the evaluation of the initial-value attribute of xsl:accumulator and the select attribute or contained sequence constructor of xsl:accumulator-rule.
While an xsl:for-each-group instruction with a group-by or group-adjacent attribute is being evaluated, the current grouping key will be a single atomic value if composite="no" is specified (explicitly or implicitly), or a sequence of atomic values if composite="yes" is specified.
At other times, the current grouping key will be absent.
The grouping keys of all items in a group are not necessarily identical. For example, one might be an xs:float while another is a numerically equal xs:decimal. The current-grouping-key function returns the grouping key of the initial item in the group, after atomization and casting of xs:untypedAtomic values to xs:string.
The function takes no arguments.
[ERR XTSE1070] It is a static error if the current-grouping-key function is used within a pattern.
[ERR XTDE1071] It is a dynamic error if the current-grouping-key function is used when the current grouping key is absent, or when it is invoked in the course of evaluating a pattern. The error may be reported statically if it can be detected statically.
Like other XSLT extensions to the dynamic evaluation context, the current grouping key is not retained as part of the closure of a function value. This means that the expression current-grouping-key#0 is valid and returns a function value, but any invocation of this function will fail with a dynamic error [see ERR XTDE1071].
[Definition: There is a total ordering among groups referred to as the order of first appearance. A group G is defined to precede a group H in order of first appearance if the initial item of G precedes the initial item of H in population order. If two groups G and H have the same initial item (because the item is in both groups) then G precedes H if the grouping key of G precedes the grouping key of H in the sequence that results from evaluating the group-by expression of this initial item.]
[Definition: There is another total ordering among groups referred to as processing order. If group R precedes group S in processing order, then in the result sequence returned by the xsl:for-each-group instruction the items generated by processing group R will precede the items generated by processing group S.]
If there are no xsl:sort elements immediately within the xsl:for-each-group element, the processing order of the groups is the order of first appearance.
Otherwise, the xsl:sort elements immediately within the xsl:for-each-group element define the processing order of the groups (see 13 Sorting). They do not affect the order of items within each group. Multiple sort key components are allowed, and are evaluated in major-to-minor order. If two groups have the same values for all their sort key components, they are processed in order of first appearance if the sort key specification is stable, otherwise in an implementation-dependent order.
The select expression of an xsl:sort element is evaluated once for each group. During this evaluation, the context item is the initial item of the group, the context position is the position of this item within the set of initial items (that is, one item for each group in the population) in population order, the context size is the number of groups, the current group is the group whose sort key value is being determined, and the current grouping key is the grouping key for that group. If the xsl:for-each-group instruction uses the group-starting-with or group-ending-with attributes, then the current grouping key is absent.
For example, this means that if the grouping key is @category, you can sort the groups in order of their grouping key by writing <xsl:sort select="current-grouping-key()"/>; or you can sort the groups in order of size by writing <xsl:sort select="count(current-group())"/>.
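Sorting groups by size can be sketched in a complete stylesheet as follows. The input shape (a cities element with city children carrying a country attribute) is assumed, and the output element and attribute names are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <cities><city country="..."/>...</cities> -->
  <xsl:template match="cities">
    <ranking>
      <xsl:for-each-group select="city" group-by="@country">
        <!-- Largest groups first. The sort key is evaluated once per group,
             with current-group() bound to the group being ranked. -->
        <xsl:sort select="count(current-group())" order="descending"/>
        <!-- position() and last() refer to the groups in processing order -->
        <country rank="{position()}" of="{last()}"
                 name="{current-grouping-key()}"
                 cities="{count(current-group())}"/>
      </xsl:for-each-group>
    </ranking>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a document in which Italia and France each have two cities and Deutschland has one, with Italia appearing first, the groups would be processed in the order Italia, France, Deutschland: the tie is resolved by order of first appearance because the sort is stable by default.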
The following example groups a list of nodes based on common values. The resulting groups are numbered but not sorted (they appear in order of first appearance), and a total is calculated for each group.
Source XML document:
<cities>
  <city name="Milano" country="Italia" pop="5"/>
  <city name="Paris" country="France" pop="7"/>
  <city name="München" country="Deutschland" pop="4"/>
  <city name="Lyon" country="France" pop="2"/>
  <city name="Venezia" country="Italia" pop="1"/>
</cities>
More specifically, the aim is to produce a four-column table, containing one row for each distinct country. The four columns are to contain first, a sequence number giving the number of the row; second, the name of the country; third, a comma-separated alphabetical list of the city names within that country; and fourth, the sum of the pop attribute for the cities in that country.
Desired output:
<table>
  <tr>
    <th>Position</th>
    <th>Country</th>
    <th>List of Cities</th>
    <th>Population</th>
  </tr>
  <tr>
    <td>1</td>
    <td>Italia</td>
    <td>Milano, Venezia</td>
    <td>6</td>
  </tr>
  <tr>
    <td>2</td>
    <td>France</td>
    <td>Lyon, Paris</td>
    <td>9</td>
  </tr>
  <tr>
    <td>3</td>
    <td>Deutschland</td>
    <td>München</td>
    <td>4</td>
  </tr>
</table>
Solution:
<table xsl:version="3.0"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <tr>
    <th>Position</th>
    <th>Country</th>
    <th>List of Cities</th>
    <th>Population</th>
  </tr>
  <xsl:for-each-group select="cities/city" group-by="@country">
    <tr>
      <td><xsl:value-of select="position()"/></td>
      <td><xsl:value-of select="current-grouping-key()"/></td>
      <td>
        <xsl:for-each select="current-group()/@name">
          <xsl:sort select="."/>
          <xsl:if test="position() ne 1">, </xsl:if>
          <xsl:value-of select="."/>
        </xsl:for-each>
      </td>
      <td><xsl:value-of select="sum(current-group()/@pop)"/></td>
    </tr>
  </xsl:for-each-group>
</table>
Sometimes it is necessary to use a composite grouping key: for example, suppose the source document is similar to the one used in the previous example, but allows multiple entries for the same country and city, such as:
<cities>
  <city name="Milano" country="Italia" year="1950" pop="5.23"/>
  <city name="Milano" country="Italia" year="1960" pop="5.29"/>
  <city name="Padova" country="Italia" year="1950" pop="0.69"/>
  <city name="Padova" country="Italia" year="1960" pop="0.93"/>
  <city name="Paris" country="France" year="1951" pop="7.2"/>
  <city name="Paris" country="France" year="1961" pop="7.6"/>
</cities>
Now suppose we want to list the average value of @pop for each (country, name) combination. One way to handle this is to concatenate the parts of the key, for example <xsl:for-each-group select="cities/city" group-by="concat(@country, '/', @name)">. A second solution is to nest one xsl:for-each-group element directly inside another. XSLT 3.0 introduces a third option, which is to define the grouping key as composite:
<xsl:for-each-group select="cities/city"
                    group-by="@name, @country"
                    composite="yes">
  <p>
    <xsl:value-of select="current-grouping-key()[1] || ', ' ||
                          current-grouping-key()[2] || ': ' ||
                          avg(current-group()/@pop)"/>
  </p>
</xsl:for-each-group>
Note:
The string concatenation operator || is new in XPath 3.0.
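For comparison, the second approach mentioned above (nesting one xsl:for-each-group inside another) might look like the following sketch. It assumes the same cities input; the variable name country is illustrative, and is needed because the inner instruction rebinds the current grouping key.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="cities">
    <!-- Outer grouping: one group per country -->
    <xsl:for-each-group select="city" group-by="@country">
      <!-- Save the outer key before the inner grouping replaces it -->
      <xsl:variable name="country" select="current-grouping-key()"/>
      <!-- Inner grouping: subdivide the current country group by city name -->
      <xsl:for-each-group select="current-group()" group-by="@name">
        <p>
          <xsl:value-of select="current-grouping-key() || ', ' ||
                                $country || ': ' ||
                                avg(current-group()/@pop)"/>
        </p>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Unlike the composite-key version, this form processes all the cities of one country together, which is convenient when per-country output (such as a heading) is also wanted.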
The next example identifies a group not by the presence of a common value, but rather by adjacency in document order. A group consists of an h2 element, followed by all the p elements up to the next h2 element.
Source XML document:
<body>
  <h2>Introduction</h2>
  <p>XSLT is used to write stylesheets.</p>
  <p>XQuery is used to query XML databases.</p>
  <h2>What is a stylesheet?</h2>
  <p>A stylesheet is an XML document used to define a transformation.</p>
  <p>Stylesheets may be written in XSLT.</p>
  <p>XSLT 2.0 introduces new grouping constructs.</p>
</body>
Desired output:
<chapter>
  <section title="Introduction">
    <para>XSLT is used to write stylesheets.</para>
    <para>XQuery is used to query XML databases.</para>
  </section>
  <section title="What is a stylesheet?">
    <para>A stylesheet is an XML document used to define a transformation.</para>
    <para>Stylesheets may be written in XSLT.</para>
    <para>XSLT 2.0 introduces new grouping constructs.</para>
  </section>
</chapter>
Solution:
<xsl:template match="body">
  <chapter>
    <xsl:for-each-group select="*" group-starting-with="h2">
      <section title="{self::h2}">
        <xsl:for-each select="current-group()[self::p]">
          <para><xsl:value-of select="."/></para>
        </xsl:for-each>
      </section>
    </xsl:for-each-group>
  </chapter>
</xsl:template>
The use of title="{self::h2}" rather than title="{.}" is to handle the case where the first element is not an h2 element.
The next example illustrates how a group of related elements can be identified by the last element in the group, rather than the first. Here the absence of the attribute continued="yes" indicates the end of the group.
Source XML document:
<doc>
  <page continued="yes">Some text</page>
  <page continued="yes">More text</page>
  <page>Yet more text</page>
  <page continued="yes">Some words</page>
  <page continued="yes">More words</page>
  <page>Yet more words</page>
</doc>
Desired output:
<doc>
  <pageset>
    <page>Some text</page>
    <page>More text</page>
    <page>Yet more text</page>
  </pageset>
  <pageset>
    <page>Some words</page>
    <page>More words</page>
    <page>Yet more words</page>
  </pageset>
</doc>
Solution:
<xsl:template match="doc">
  <doc>
    <xsl:for-each-group select="*"
                        group-ending-with="page[not(@continued='yes')]">
      <pageset>
        <xsl:for-each select="current-group()">
          <page><xsl:value-of select="."/></page>
        </xsl:for-each>
      </pageset>
    </xsl:for-each-group>
  </doc>
</xsl:template>
The next example shows how an item can be added to multiple groups. Book titles will be added to one group for each indexing term marked up within the title.
Source XML document:
<titles>
  <title>A Beginner's Guide to <ix>Java</ix></title>
  <title>Learning <ix>XML</ix></title>
  <title>Using <ix>XML</ix> with <ix>Java</ix></title>
</titles>
Desired output:
<h2>Java</h2>
<p>A Beginner's Guide to Java</p>
<p>Using XML with Java</p>
<h2>XML</h2>
<p>Learning XML</p>
<p>Using XML with Java</p>
Solution:
<xsl:template match="titles">
  <xsl:for-each-group select="title" group-by="ix">
    <h2><xsl:value-of select="current-grouping-key()"/></h2>
    <xsl:for-each select="current-group()">
      <p><xsl:value-of select="."/></p>
    </xsl:for-each>
  </xsl:for-each-group>
</xsl:template>
In this example, the membership of a node within a group is based both on adjacency of the nodes in document order, and on common values. In this case, the grouping key is a boolean condition, true or false, so the effect is that a grouping establishes a maximal sequence of nodes for which the condition is true, followed by a maximal sequence for which it is false, and so on.
Source XML document:
<p>Do <em>not</em>:
  <ul>
    <li>talk,</li>
    <li>eat, or</li>
    <li>use your mobile telephone</li>
  </ul>
  while you are in the cinema.</p>
Desired output:
<p>Do <em>not</em>:</p>
<ul>
  <li>talk,</li>
  <li>eat, or</li>
  <li>use your mobile telephone</li>
</ul>
<p>while you are in the cinema.</p>
Solution:
This requires creating a p element around each maximal sequence of sibling nodes that does not include a ul or ol element. This can be done by using group-adjacent, with a grouping key that is true if the node is a ul or ol element, and false otherwise:
<xsl:template match="p">
  <xsl:for-each-group select="node()"
                      group-adjacent="self::ul or self::ol">
    <xsl:choose>
      <xsl:when test="current-grouping-key()">
        <xsl:copy-of select="current-group()"/>
      </xsl:when>
      <xsl:otherwise>
        <p>
          <xsl:copy-of select="current-group()"/>
        </p>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:template>
If the population contains values of different numeric types that differ from each other by small amounts, then the eq operator is not transitive, because of rounding effects occurring during type promotion. It is thus possible to have three values A, B, and C among the grouping keys of the population such that A eq B, B eq C, but A ne C.
For example, this arises when computing
<xsl:for-each-group group-by="." select="
    xs:float('1.0'),
    xs:decimal('1.0000000000100000000001'),
    xs:double('1.00000000001')"/>
because the values of type xs:float and xs:double both compare equal to the value of type xs:decimal but not equal to each other.
In this situation the results must be equivalent to the results obtained by the following algorithm:
For each item J in the population in population order, for each of the grouping keys K for that item in sequence, the processor identifies those existing groups G such that the grouping key of the initial item of G is equal to K.
If there is exactly one group G, then J is added to this group, unless J is already a member of this group.
If there is no group G, then a new group is created with J as its first item.
If there is more than one group G (which can only happen in exceptional circumstances involving non-transitivity), then one of these groups is selected in an implementation-dependent way, and J is added to this group, unless J is already a member of this group.
The effect of these rules is that (a) every item in a non-singleton group has a grouping key that is equal to that of at least one other item in that group, (b) for any two distinct groups, there is at least one pair of items (one from each group) whose grouping keys are not equal to each other.
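The three pairwise comparisons behind the xs:float, xs:decimal, and xs:double example above can be inspected with a small stylesheet such as the following sketch. It assumes the processor is invoked with the default initial template (xsl:initial-template); the output element and attribute names are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template name="xsl:initial-template">
    <!-- The three grouping keys from the example above -->
    <xsl:variable name="f" select="xs:float('1.0')"/>
    <xsl:variable name="d" select="xs:decimal('1.0000000000100000000001')"/>
    <xsl:variable name="b" select="xs:double('1.00000000001')"/>
    <!-- Each attribute records the result of one eq comparison -->
    <comparisons float-eq-decimal="{$f eq $d}"
                 decimal-eq-double="{$d eq $b}"
                 float-eq-double="{$f eq $b}"/>
  </xsl:template>

</xsl:stylesheet>
```

According to the explanation above, the first two comparisons succeed and the third does not, which is why the number of groups produced for this population (one or two) can depend on the implementation.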