<!-- Category: instruction -->
<xsl:number
value? = expression
select? = expression
level? = "single" | "multiple" | "any"
count? = pattern
from? = pattern
format? = { string }
lang? = { language }
letter-value? = { "alphabetic" | "traditional" }
ordinal? = { string }
start-at? = { string }
grouping-separator? = { char }
grouping-size? = { integer } />
The xsl:number
instruction is used to create a formatted number. The
result of the instruction is a newly constructed text node containing the formatted
number as its string value.
[Definition: The xsl:number instruction performs two tasks: firstly, determining a place marker (this is a sequence of integers, to allow for hierarchic numbering schemes such as 1.12.2 or 3(c)ii), and secondly, formatting the place marker for output as a text node in the result sequence.] The place marker to be formatted can either be supplied directly, in the value attribute, or it can be computed based on the position of a selected node within the tree that contains it.
[ERR XTSE0975] It is a static error if the value attribute of xsl:number is present unless the select, level, count, and from attributes are all absent.
Note:
The facilities described in this section are specifically designed to enable the
calculation and formatting of section numbers, paragraph numbers, and the like. For
formatting of other numeric quantities, the format-number
FO30
function may be more suitable.
Furthermore, formatting of integers where there is no requirement to calculate the
position of a node in the document can now be accomplished using the
format-integer
FO30 function, which borrows many concepts from
the xsl:number
specification.
The start-at Attribute
The effective value of the start-at attribute must be a string representing a whitespace-separated sequence of one or more integers, each one optionally preceded by a minus sign. More specifically, the value must be a string matching the regular expression -?[0-9]+(\s+-?[0-9]+)*. This
sequence of integers is used to re-base the sequence of integers being
formatted. Specifically if $S is the sequence of integers represented by
the start-at
attribute, and $V is the sequence of integers to
be formatted, then the following transformation is applied to $V:
for $i in 1 to count($V) return if ($i le count($S)) then $V[$i] + $S[$i] - 1 else $V[$i] + $S[last()] - 1
Note:
This means that if there are N integers in the start-at
attribute, then these are used to re-base the first N numbers, while
numbers after the Nth are re-based using the last (Nth)
integer in the start-at
attribute. If the start-at
attribute contains more integers than are required, the surplus is ignored.
For example, if the attribute is given as start-at="3 0 0", and the number sequence to be formatted is (1, 1, 1, 1), then the re-based sequence is 3, 0, 0, 0.
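The re-basing rule can be sketched in Python as follows. This is an illustrative sketch only, not part of the specification: the function name `rebase` and the use of `ValueError` for a malformed attribute value are assumptions made for the example.

```python
import re

# Lexical form required of start-at: one or more optionally negative integers,
# separated by whitespace (space, tab, newline, carriage return).
START_AT = re.compile(r"-?[0-9]+([ \t\n\r]+-?[0-9]+)*")

def rebase(values, start_at):
    """Re-base the integers in `values` using the start-at string."""
    if not START_AT.fullmatch(start_at):
        # Hypothetical error handling; the error raised is an assumption.
        raise ValueError("invalid start-at value: %r" % start_at)
    starts = [int(token) for token in start_at.split()]
    # The i-th value uses the i-th start integer; values beyond the end of
    # the start-at list reuse its last integer. Surplus start integers are ignored.
    return [v + starts[min(i, len(starts) - 1)] - 1 for i, v in enumerate(values)]

print(rebase([1, 1, 1, 1], "3 0 0"))  # the example from the note above
```

With the inputs from the note, the sketch returns the same sequence 3, 0, 0, 0.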
The place marker to be formatted may be
specified by an expression. The value
attribute contains the expression. The value of this expression is atomized using the procedure defined in [XPath 3.0], and each value $V in the atomized sequence is then
converted to the integer value returned by the XPath expression
xs:integer(round(number($V)))
. If
the start-at
attribute is present, this sequence is then re-based as
described in 12.1 The start-at Attribute. The resulting sequence of
integers is used as the place marker to be formatted.
If the instruction is processed with XSLT 1.0 behavior, then:
All items in the atomized sequence after the first are discarded;
If the atomized sequence is empty, it is replaced by a sequence containing the
xs:double
value NaN
as its only item;
If any value in the sequence cannot be converted to an integer (this includes
the case where the sequence contains a NaN
value) then the string
NaN
is inserted into the formatted result string in its proper
position. The error described in the following paragraph does not apply in this
case.
[ERR XTDE0980] It is a dynamic error if any undiscarded item
in the atomized sequence supplied as the value of the value
attribute of xsl:number
cannot be converted to an integer, or
if the resulting integer is less than 0 (zero).
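As a rough illustration of the conversion and of error XTDE0980 (ignoring XSLT 1.0 behavior), the following Python sketch converts already-atomized values to a place marker. The function name `to_place_marker` is hypothetical, Python's `float` is only an approximation of the XPath `number` function, and `ValueError` stands in for the dynamic error.

```python
import math

def to_place_marker(values):
    """Approximate xs:integer(round(number($V))) for each atomized value."""
    result = []
    for v in values:
        # float() rejects non-numeric strings with ValueError, whereas XPath
        # number() would yield NaN; both outcomes end in an error here.
        x = float(v)
        if math.isnan(x) or math.isinf(x):
            raise ValueError("XTDE0980: value cannot be converted to an integer")
        # XPath round() sends halves toward positive infinity, unlike
        # Python's built-in round(), which rounds halves to even.
        n = math.floor(x + 0.5)
        if n < 0:
            raise ValueError("XTDE0980: resulting integer is less than zero")
        result.append(n)
    return result

print(to_place_marker([2.5, "3", 0]))
```

Zero is accepted, in line with the rule that only integers less than zero are an error.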
Note:
The value zero does not arise when numbering nodes in a source document, but it
can arise in other numbering sequences. It is permitted specifically because the
rules of the xsl:number
instruction are also invoked by functions
such as format-time
FO30: the minutes and seconds component of a
time value can legitimately be zero.
The resulting sequence is formatted as a string using the effective values of the attributes specified in
12.4 Number to String Conversion Attributes; each of these attributes is interpreted as an attribute value template. After
conversion, the xsl:number
element constructs a new text node
containing the resulting string, and returns this node.
If no value
attribute is specified, then the xsl:number
instruction returns a new text node containing a formatted place marker that is based on the position of a
selected node within its containing document. If the select
attribute is
present, then the expression contained in the select
attribute is
evaluated to determine the selected node. If the select
attribute is
omitted, then the selected node is the context
node.
[ERR XTTE0990] It is a type error if the
xsl:number
instruction is evaluated, with no
value
or select
attribute, when the context item is not a node.
[ERR XTTE1000] It is a type error if the result of
evaluating the select
attribute of the xsl:number
instruction is anything other than a single node.
The following attributes control how the selected node is to be numbered:
The level
attribute specifies rules for selecting the nodes that
are taken into account in allocating a number; it has the values
single
, multiple
or any
. The default
is single
.
The count attribute is a pattern that specifies which nodes are to be counted at those levels. If the count attribute is not specified, then it defaults to the pattern that matches any node with the same node kind as the selected node and, if the selected node has an expanded QName, with the same expanded QName as the selected node.
The from
attribute is a pattern that specifies where counting starts.
In addition, the attributes specified in 12.4 Number to String Conversion Attributes are used for number
to string conversion, as in the case when the value
attribute is
specified.
The xsl:number
element first constructs a sequence of positive
integers using the level
, count
and from
attributes. Where level
is single
or any
, this
sequence will either be empty or contain a single number; where level
is
multiple
, the sequence may be of any length. The sequence is
constructed as follows:
Let matches-count($node) be a function that returns true if and only if the given node $node matches the pattern given in the count attribute, or the implied pattern (according to the rules given above) if the count attribute is omitted.
Let matches-from($node) be a function that returns true if and only if the given node $node matches the pattern given in the from attribute, or if $node is the root node of a tree. If the from attribute is omitted, then the function returns true if and only if $node is the root node of a tree.
Let $S be the selected node.
When level="single":
Let $A be the node sequence selected by the following expression:
$S/ancestor-or-self::node()[matches-count(.)][1]
(this selects the innermost ancestor-or-self node that matches the count pattern)
Let $F be the node sequence selected by the expression:
$S/ancestor-or-self::node()[matches-from(.)][1]
(this selects the innermost ancestor-or-self node that matches the from pattern)
Let $AF be the value of:
$A[ancestor-or-self::node()[. is $F]]
(this selects $A if it is in the subtree rooted at $F, or the empty sequence otherwise)
If $AF is empty, return the empty sequence, ()
Otherwise return the value of:
1 + count($AF/preceding-sibling::node()[matches-count(.)])
(the number of preceding siblings of the counted node that match the count pattern, plus one).
When level="multiple":
Let $A be the node sequence selected by the expression:
$S/ancestor-or-self::node()[matches-count(.)]
(the set of ancestor-or-self nodes that match the count pattern)
Let $F be the node sequence selected by the expression:
$S/ancestor-or-self::node()[matches-from(.)][1]
(the innermost ancestor-or-self node that matches the from pattern)
Let $AF be the value of:
$A[ancestor-or-self::node()[. is $F]]
(the nodes selected in the first step that are in the subtree rooted at the node selected in the second step)
Return the result of the expression:
for $af in $AF return 1 + count($af/preceding-sibling::node()[matches-count(.)])
(a sequence of integers containing, for each of these nodes, one plus the number of preceding siblings that match the count pattern)
When level="any":
Let $A be the node sequence selected by the expression:
$S/(preceding::node()|ancestor-or-self::node())[matches-count(.)]
(the set of nodes consisting of the selected node together with all nodes, other than attributes and namespaces, that precede the selected node in document order, provided that they match the count pattern)
Let $F be the node sequence selected by the expression:
$S/(preceding::node()|ancestor-or-self::node())[matches-from(.)][last()]
(the last node in document order that matches the from pattern and that precedes the selected node, using the same definition)
Let $AF be the node sequence $A[. is $F or . >> $F]
(the nodes selected in the first step, excluding those that precede the node selected in the second step)
If $AF is empty, return the empty sequence, ()
Otherwise return the value of the expression count($AF)
The resulting sequence of numbers is referred to as the place marker.
If the start-at
attribute is present, then the
place marker is re-based as described in 12.1 The start-at Attribute.
The sequence of numbers is then converted into a string using the effective values of the attributes specified in
12.4 Number to String Conversion Attributes; each of these attributes is interpreted as an attribute value template. After
conversion, the resulting string is used to create a text node, which forms the
result of the xsl:number
instruction.
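The three level rules above can be sketched over a toy tree in Python. This is a simplified illustration under stated assumptions, not a conforming implementation: the `Node` class, the function names, and the use of Python predicates (`counts`, `starts`) in place of the count and from patterns are all hypothetical, and attribute and namespace nodes are not modelled.

```python
class Node:
    """Tiny stand-in for a tree node: a name, a parent and ordered children."""
    def __init__(self, name, *children):
        self.name, self.parent, self.children = name, None, list(children)
        for child in self.children:
            child.parent = self

def ancestors_or_self(node):
    """Yield the node, then each ancestor, innermost first."""
    while node is not None:
        yield node
        node = node.parent

def preorder(node):
    """Yield the subtree rooted at node in document order."""
    yield node
    for child in node.children:
        yield from preorder(child)

def sibling_number(node, counts):
    """One plus the number of preceding siblings matching the count predicate."""
    siblings = node.parent.children if node.parent else [node]
    return 1 + sum(1 for s in siblings[:siblings.index(node)] if counts(s))

def place_marker(selected, level, counts, starts=lambda n: False):
    def is_from(n):
        # matches-from: the from predicate, or the root node of the tree.
        return n.parent is None or starts(n)
    chain = list(ancestors_or_self(selected))       # innermost first
    if level == "any":
        upto = []                                    # nodes up to $S in document order
        for n in preorder(chain[-1]):
            upto.append(n)
            if n is selected:
                break
        f_index = max(i for i, n in enumerate(upto) if is_from(n))
        total = sum(1 for n in upto[f_index:] if counts(n))
        return [total] if total else []
    f = next(n for n in chain if is_from(n))         # $F, innermost from match
    a = [n for n in chain if counts(n)]              # $A candidates
    if level == "single":
        a = a[:1]                                    # innermost match only
    af = [n for n in reversed(a) if any(x is f for x in ancestors_or_self(n))]
    return [sibling_number(n, counts) for n in af]

# Usage: two chapters, each with two sections; `target` is the last paragraph.
target = Node("para")
doc = Node("doc",
           Node("chapter", Node("section", Node("para")), Node("section")),
           Node("chapter", Node("section", Node("para")),
                Node("section", Node("para"), target)))
named = lambda *names: (lambda n: n.name in names)
print(place_marker(target, "multiple", named("chapter", "section")))
print(place_marker(target, "any", named("para"), named("chapter")))
```

The sketch returns a Python list in every case, so an empty list plays the role of the empty sequence.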
The following will number the items in an ordered list:
<xsl:template match="ol/item">
  <fo:block>
    <xsl:number/>
    <xsl:text>. </xsl:text>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>
The following two rules will number title
elements. This is intended
for a document that contains a sequence of chapters followed by a sequence of
appendices, where both chapters and appendices contain sections, which in turn
contain subsections. Chapters are numbered 1, 2, 3; appendices are numbered A, B,
C; sections in chapters are numbered 1.1, 1.2, 1.3; sections in appendices are
numbered A.1, A.2, A.3. Subsections within a chapter are numbered 1.1.1, 1.1.2,
1.1.3; subsections within an appendix are numbered A.1.1, A.1.2, A.1.3.
<xsl:template match="title">
  <fo:block>
    <xsl:number level="multiple" count="chapter|section|subsection" format="1.1 "/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="appendix//title" priority="1">
  <fo:block>
    <xsl:number level="multiple" count="appendix|section|subsection" format="A.1 "/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>
Note:
This specification is aligned with that of the
format-integer
FO30 function, but there are differences; for
example grouping separators are part of the primary format token in
format-integer
FO30, but are indicated by separate attributes
in xsl:number
.
The following attributes are used to control conversion of a sequence of numbers into a string. The numbers are integers greater than or equal to 0 (zero). The attributes are all optional.
The main attribute is format
. The default value for the
format
attribute is 1
. The format
attribute
is split into a sequence of tokens where each token is a maximal sequence of
alphanumeric characters or a maximal sequence of non-alphanumeric characters.
Alphanumeric means any character that has a Unicode category of Nd,
Nl, No, Lu, Ll, Lt, Lm or Lo (see [UNICODE]). The alphanumeric tokens (format tokens)
indicate the format to be used for each number in the sequence; in most cases the
format token is the same as the required representation of the number 1 (one).
Each non-alphanumeric token is either a prefix, a separator, or a suffix. If there is a non-alphanumeric token but no format token, then the single non-alphanumeric token is used as both the prefix and the suffix. The prefix, if it exists, is the non-alphanumeric token that precedes the first format token: the prefix always appears exactly once in the constructed string, at the start. The suffix, if it exists, is the non-alphanumeric token that follows the last format token: the suffix always appears exactly once in the constructed string, at the end. All other non-alphanumeric tokens (those that occur between two format tokens) are separator tokens and are used to separate numbers in the sequence.
The nth format token is used to format the nth number in the
sequence. If there are more numbers than format tokens, then the last format token
is
used to format remaining numbers. If there are no format tokens, then a format token
of 1
is used to format all numbers. Each number after the first is
separated from the preceding number by the separator token preceding the format token
used to format that number, or, if that is the first format token, then by
.
(dot).
Given the sequence of numbers 5, 13, 7 and the format string A-001(i), which contains the three format tokens A, 001 and i, the output will be the string E-013(vii).
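The tokenization and assembly rules described above can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and `format_one` handles only ASCII decimal tokens and the tokens A, a, i and I, falling back to the token 1 for everything else.

```python
import unicodedata
from itertools import groupby

ALNUM = {"Nd", "Nl", "No", "Lu", "Ll", "Lt", "Lm", "Lo"}

def is_alnum(ch):
    return unicodedata.category(ch) in ALNUM

def split_format(fmt):
    """Return (prefix, format_tokens, separators, suffix) for a format string."""
    runs = [(alnum, "".join(chars)) for alnum, chars in groupby(fmt, key=is_alnum)]
    tokens = [text for alnum, text in runs if alnum]
    if not tokens:
        # No format token: the lone punctuation run is both prefix and suffix.
        only = runs[0][1] if runs else ""
        return only, ["1"], [], only
    prefix = "" if runs[0][0] else runs[0][1]
    suffix = "" if runs[-1][0] else runs[-1][1]
    separators = [text for alnum, text in runs[1:-1] if not alnum]
    return prefix, tokens, separators, suffix

ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"), (90, "xc"),
         (50, "l"), (40, "xl"), (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]

def format_one(n, token):
    """Format one number; only a few token kinds are covered in this sketch."""
    if token in ("A", "a") and n >= 1:
        out = ""
        while n > 0:                      # 1 -> A, 26 -> Z, 27 -> AA
            n, rem = divmod(n - 1, 26)
            out = chr(ord(token) + rem) + out
        return out
    if token in ("i", "I") and n >= 1:
        out = ""
        for value, letters in ROMAN:
            while n >= value:
                out, n = out + letters, n - value
        return out.upper() if token == "I" else out
    # ASCII decimal tokens such as 1 or 001 set a minimum width.
    width = len(token) if token.lstrip("0") == "1" else 1
    return str(n).zfill(width)

def format_numbers(numbers, fmt="1"):
    prefix, tokens, separators, suffix = split_format(fmt)
    parts = []
    for i, n in enumerate(numbers):
        t = min(i, len(tokens) - 1)       # reuse the last token when tokens run out
        if i > 0:
            parts.append(separators[t - 1] if t > 0 else ".")
        parts.append(format_one(n, tokens[t]))
    return prefix + "".join(parts) + suffix

print(format_numbers([5, 13, 7], "A-001(i)"))
```

Applied to the example above, the sketch produces E-013(vii). With the format string "1.1 " and the numbers 1, 2, 3, it produces "1.2.3 ", because the third number reuses the last format token and the separator that precedes it.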
Format tokens are interpreted as follows:
Any token where the last character has a decimal digit value of 1 (as specified
in the Unicode character property database, see [UNICODE]), and the Unicode
value of preceding characters is one less than the Unicode value of the last
character generates a decimal representation of the number where each number is
at least as long as the format token. The digits used in the decimal
representation are the set of digits containing the digit character used in the
format token. Thus, a format token 1
generates the sequence
0 1 2 ... 10 11 12 ...
, and a format token 01
generates the sequence 00 01 02 ... 09 10 11 12 ... 99 100 101
. A
format token of ١
(Arabic-Indic digit one) generates the
sequence ١
then ٢
then
٣
...
A format token A
generates the sequence A B C ... Z AA AB
AC...
.
A format token a
generates the sequence a b c ... z aa ab
ac...
.
A format token i
generates the sequence i ii iii iv v vi vii
viii ix x ...
.
A format token I
generates the sequence I II III IV V VI VII
VIII IX X ...
.
A format token w
generates numbers written as lower-case words,
for example in English, one two three four ...
.
A format token W
generates numbers written as upper-case words,
for example in English, ONE TWO THREE FOUR ...
.
A format token Ww
generates numbers written as title-case words,
for example in English, One Two Three Four ...
.
Any other format token indicates a numbering sequence in which that token
represents the number 1 (one) (but see the note below).
It is implementation-defined which numbering sequences,
additional to those listed above, are supported. If an implementation does not
support a numbering sequence represented by the given token, it
must use a format token of 1
.
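The first rule in the list above (decimal-digit tokens, including digits from other scripts) can be sketched in Python with the standard `unicodedata` module. The function name `decimal_token_formatter` is hypothetical, and the sketch assumes a non-empty token and that the digits of a script occupy ten consecutive code points starting at its zero.

```python
import unicodedata

def decimal_token_formatter(token):
    """Return a formatter for a decimal-digit format token, or None otherwise."""
    last = token[-1]
    # The last character must have the decimal digit value 1 ...
    if unicodedata.decimal(last, None) != 1:
        return None
    # ... and every preceding character must be the zero of the same digit family.
    zero = chr(ord(last) - 1)
    if any(ch != zero for ch in token[:-1]):
        return None
    width = len(token)

    def fmt(n):
        ascii_digits = str(n).zfill(width)          # pad to the token's length
        return "".join(chr(ord(zero) + int(d)) for d in ascii_digits)

    return fmt

print(decimal_token_formatter("01")(9))
print(decimal_token_formatter("\u0661")(12))        # Arabic-Indic digits
```

A token such as A is not a decimal-digit token, so the sketch returns None for it and a caller would move on to the other rules.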
Note:
In some traditional numbering sequences, additional signs are added to denote that the letters should be interpreted as numbers; these are not included in the format token. One example (see also the examples below) is classical Greek, where a dexia keraia, and sometimes an aristeri keraia, is added.
For all format tokens other than the first kind above (one that consists of decimal
digits), there may be implementation-defined lower and upper bounds on the range of numbers
that can be formatted using this format token; indeed, for some numbering sequences
there may be intrinsic limits. For example, the format
token ① (circled digit one) has a range imposed
by the Unicode character repertoire (zero to 20 in Unicode versions prior to 3.2,
or zero to 50 in subsequent versions). For the numbering sequences
described above any upper bound imposed by the implementation must
not be less than 1000 (one thousand) and any lower bound must not be
greater than 1. Numbers that fall outside this range must be
formatted using the format token 1
. The numbering sequence associated
with the format token 1
has a lower bound of 0 (zero).
The above expansions of numbering sequences for format tokens such as a
and i
are indicative but not prescriptive. There are various conventions
in use for how alphabetic sequences continue when the alphabet is exhausted, and
differing conventions for how roman numerals are written (for example,
IV
versus IIII
as the representation of the number 4).
Sometimes alphabetic sequences are used that omit letters such as i
and
o
. This specification does not prescribe the detail of any sequence
other than those sequences consisting entirely of decimal digits.
Many numbering sequences are language-sensitive. This applies especially to the
sequence selected by the tokens w
, W
and Ww
.
It also applies to other sequences, for example different languages using the
Cyrillic alphabet use different sequences of characters, each starting with the
letter #x410 (Cyrillic capital letter A). In such cases, the lang
attribute specifies which language’s conventions are to be used; its effective
value
must either be a string in the value space of
xs:language
, or a zero-length string. If no
lang
value is specified, or if the
value is a zero-length string, the language that is used is implementation-defined. The set of
languages for which numbering is supported is implementation-defined. If a language is
requested that is not supported, the processor may use a fallback language identified by removing
successive hyphen-separated suffixes from the supplied value until a supported
language code is obtained; failing this, the processor uses the language
that it would use if the lang
attribute were omitted.
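The optional fallback described above can be sketched as follows. The function name `resolve_language`, the `supported` set and the `default` argument are hypothetical; a processor is permitted, not required, to behave this way, and the case-insensitive comparison is an assumption of the sketch.

```python
def resolve_language(requested, supported, default):
    """Pick a numbering language by stripping hyphen-separated suffixes."""
    if not requested:
        # Absent or zero-length lang: the choice is left to the implementation.
        return default
    candidate = requested.lower()
    known = {code.lower() for code in supported}
    while True:
        if candidate in known:
            return candidate
        if "-" not in candidate:
            # Nothing left to strip: behave as if lang had been omitted.
            return default
        candidate = candidate.rsplit("-", 1)[0]

print(resolve_language("de-AT-1996", {"de", "en"}, "en"))
```

Here de-AT-1996 is reduced first to de-AT and then to de, which is in the supported set.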
The optional ordinal
attribute is used to
indicate whether cardinal or ordinal numbers are required, and to select other
options relating to the grammatical context of the number to be formatted. The
allowed set of values is implementation-defined. If the attribute
is absent, or if its value is zero-length, or if its value is no
or
0
or false
, then cardinal numbers appropriate to the
selected language are output. If the value is yes
or 1
or
true
, then ordinal numbers appropriate to the target language are
output. Other values are implementation-defined.
For example, in English, the value ordinal="yes"
when used with the
format token 1
outputs the sequence 1st 2nd 3rd 4th ...
,
and when used with the format token w
outputs the sequence first
second third fourth ...
.
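For the English case with the format token 1, the behavior might be sketched as below. The function name `format_english` is hypothetical, the suffix rules are ordinary English usage rather than anything prescribed here, and only the attribute values named above are recognized.

```python
def format_english(n, ordinal=None):
    """Format n as a decimal number, with an English ordinal suffix if requested."""
    if ordinal in (None, "", "no", "0", "false"):
        return str(n)                      # cardinal number
    if ordinal not in ("yes", "1", "true"):
        # Other values are implementation-defined; this sketch rejects them.
        raise ValueError("unsupported ordinal value: %r" % ordinal)
    if 10 <= n % 100 <= 20:
        suffix = "th"                      # 11th, 12th, 13th, ...
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "%d%s" % (n, suffix)

print([format_english(n, "yes") for n in (1, 2, 3, 4)])
```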
Note:
In some languages, the form of numbers (especially ordinal numbers) varies
depending on the grammatical context: they may have different genders and may
decline with the noun that they qualify. In such cases the value of the
ordinal
attribute may be used to indicate the variation of the
cardinal or ordinal number required, in an implementation-defined way.
The way in which the variation is indicated will depend on the conventions of the language.
For inflected languages that vary the ending of the word, the approach recommended
in the previous version of this specification was to indicate the required ending,
preceded by a hyphen: for example in German, appropriate values might be
ordinal="-e"
, ordinal="-er"
,
ordinal="-es"
, ordinal="-en"
.
Another approach, which might usefully be adopted by an implementation based on
the open-source ICU localization library [ICU], or any other library
making use of the Unicode Common Locale Data Repository [Unicode CLDR], is
to allow the value of the attribute to be the name of a registered numbering rule
set for the language in question, conventionally prefixed with a percent sign: for
example, ordinal="%spellout-ordinal-masculine"
, or
ordinal="%spellout-cardinal-year"
. (The attribute name
ordinal
in this case is a misnomer, but serves the purpose.)
The specification format="1" ordinal="-º" lang="it"
, if supported,
should produce the sequence:
1º 2º 3º 4º ...
The specification format="Ww" ordinal="-o" lang="it"
, if supported,
should produce the sequence:
Primo Secondo Terzo Quarto Quinto ...
The letter-value
attribute disambiguates between numbering sequences
that use letters. In many languages there are two commonly used numbering sequences
that use letters. One numbering sequence assigns numeric values to letters in
alphabetic sequence, and the other assigns numeric values to each letter in some
other manner traditional in that language. In English, these would correspond to the
numbering sequences specified by the format tokens a
and i
.
In some languages, the first member of each sequence is the same, and so the format
token alone would be ambiguous. A value of alphabetic
specifies the
alphabetic sequence; a value of traditional
specifies the other
sequence. If the letter-value
attribute is not specified, then it is
implementation-dependent how
any ambiguity is resolved.
Note:
Implementations may use extension
attributes on xsl:number
to provide additional
control over the way in which numbers are formatted.
The grouping-separator
attribute gives the separator used as a grouping
(for example, thousands) separator in decimal numbering sequences, and the optional
grouping-size
specifies the size (normally 3) of the grouping. For
example, grouping-separator=","
and grouping-size="3"
would
produce numbers of the form 1,000,000
while
grouping-separator="."
and grouping-size="2"
would
produce numbers of the form 1.00.00.00
. If only one of the
grouping-separator
and grouping-size
attributes is
specified, then it is ignored.
The effective
value of the grouping-separator
attribute
may be any string, including a zero-length string.
The effective value of the grouping-size attribute must be a string in the lexical space of xs:integer. If the resulting integer is positive then it defines the number of digits between adjacent grouping separators; if it is zero or negative, then no grouping separators are inserted.
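The grouping rules can be sketched in Python as follows. The function name `apply_grouping` is hypothetical, and `None` is used here to represent an attribute that was not specified.

```python
def apply_grouping(digits, separator=None, size=None):
    """Insert `separator` between groups of `size` digits, counting from the right."""
    if separator is None or size is None:
        # If only one of the two attributes is specified, it is ignored.
        return digits
    size = int(size)
    if size <= 0:
        # Zero or negative grouping size: no separators are inserted.
        return digits
    groups = []
    while len(digits) > size:
        groups.insert(0, digits[-size:])
        digits = digits[:-size]
    groups.insert(0, digits)
    return separator.join(groups)

print(apply_grouping("1000000", ",", "3"))
print(apply_grouping("1000000", ".", "2"))
```

The two calls reproduce the forms 1,000,000 and 1.00.00.00 given in the text above.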
These examples use non-Latin characters which might not display correctly in all browsers, depending on the system configuration.
Description | Format Token | Sequence |
---|---|---|
French cardinal words | format="Ww" lang="fr" | Un, Deux, Trois, Quatre |
German ordinal words | format="w" ordinal="-e" lang="de" | erste, zweite, dritte, vierte |
Katakana numbering | format="ア" | ア, イ, ウ, エ, オ, カ, キ, ク, ケ, コ, サ, シ, ス, セ, ソ, タ, チ, ツ, テ, ト, ナ, ニ, ヌ, ネ, ノ, ハ, ヒ, フ, ヘ, ホ, マ, ミ, ム, メ, モ, ヤ, ユ, ヨ, ラ, リ, ル, レ, ロ, ワ, ヰ, ヱ, ヲ, ン |
Katakana numbering in iroha order | format="イ" | イ, ロ, ハ, ニ, ホ, ヘ, ト, チ, リ, ヌ, ル, ヲ, ワ, カ, ヨ, タ, レ, ソ, ツ, ネ, ナ, ラ, ム, ウ, ヰ, ノ, オ, ク, ヤ, マ, ケ, フ, コ, エ, テ, ア, サ, キ, ユ, メ, ミ, シ, ヱ, ヒ, モ, セ, ス |
Thai numbering | format="๑" | ๑, ๒, ๓, ๔, ๕, ๖, ๗, ๘, ๙, ๑๐, ๑๑, ๑๒, ๑๓, ๑๔, ๑๕, ๑๖, ๑๗, ๑๘, ๑๙, ๒๐ |
Traditional Hebrew numbering | format="א" letter-value="traditional" | א, ב, ג, ד, ה, ו, ז, ח, ט, י, יא, יב, יג, יד, טו, טז, יז, יח, יט, כ |
Traditional Georgian numbering | format="ა" letter-value="traditional" | ა, ბ, გ, დ, ე, ვ, ზ, ჱ, თ, ი, ია, იბ, იგ, იდ, იე, ივ, იზ, იჱ, ით, კ |
Classical Greek numbering (see note) | format="α" letter-value="traditional" | αʹ, βʹ, γʹ, δʹ, εʹ, ϛʹ, ζʹ, ηʹ, θʹ, ιʹ, ιαʹ, ιβʹ, ιγʹ, ιδʹ, ιεʹ, ιϛʹ, ιζʹ, ιηʹ, ιθʹ, κʹ |
Old Slavic numbering | format="а" letter-value="traditional" | А, В, Г, Д, Е, Ѕ, З, И, Ѳ, Ӏ, АӀ, ВӀ, ГӀ, ДӀ, ЕӀ, ЅӀ, ЗӀ, ИӀ, ѲӀ, К |
Note that Classical Greek is an example where the format token is not the same as the representation of the number 1.