12 Numbering

<!-- Category: instruction -->
<xsl:number
  value? = expression
  select? = expression
  level? = "single" | "multiple" | "any"
  count? = pattern
  from? = pattern
  format? = { string }
  lang? = { language }
  letter-value? = { "alphabetic" | "traditional" }
  ordinal? = { string }
  start-at? = { string }
  grouping-separator? = { char }
  grouping-size? = { integer } />

The xsl:number instruction is used to create a formatted number. The result of the instruction is a newly constructed text node containing the formatted number as its string value.

[Definition: The xsl:number instruction performs two tasks: firstly, determining a place marker (this is a sequence of integers, to allow for hierarchic numbering schemes such as 1.12.2 or 3(c)ii), and secondly, formatting the place marker for output as a text node in the result sequence.] The place marker to be formatted can either be supplied directly, in the value attribute, or it can be computed based on the position of a selected node within the tree that contains it.

[ERR XTSE0975] It is a static error if the value attribute of xsl:number is present unless the select, level, count, and from attributes are all absent.

Note:

The facilities described in this section are specifically designed to enable the calculation and formatting of section numbers, paragraph numbers, and the like. For formatting of other numeric quantities, the format-number function [Functions and Operators 3.0] may be more suitable.

Furthermore, where there is no requirement to calculate the position of a node in the document, integers can now be formatted using the format-integer function [Functions and Operators 3.0], which borrows many concepts from the xsl:number specification.

12.1 The start-at Attribute

The effective value of the start-at attribute must be a string representing a whitespace-separated sequence of one or more integers, each one optionally preceded by a minus sign. More specifically, the value must be a string matching the regular expression -?[0-9]+(\s+-?[0-9]+)*. This sequence of integers is used to re-base the sequence of integers being formatted. Specifically, if $S is the sequence of integers represented by the start-at attribute, and $V is the sequence of integers to be formatted, then the following transformation is applied to $V:

for $i in 1 to count($V) return
  if ($i le count($S))
  then $V[$i] + $S[$i] - 1
  else $V[$i] + $S[last()] - 1

Note:

This means that if there are N integers in the start-at attribute, then these are used to re-base the first N numbers, while numbers after the Nth are re-based using the last (Nth) integer in the start-at attribute. If the start-at attribute contains more integers than are required, the surplus is ignored.

For example, if the attribute is given as start-at="3 0 0", and the number sequence to be formatted is (1, 1, 1, 1), then the re-based sequence is 3, 0, 0, 0.
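The re-basing rule can be tried out with a supplied value, as in the following minimal sketch. It assumes an XSLT 3.0 processor that is invoked with the initial template xsl:initial-template; the stylesheet is illustrative only and is not part of this specification.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Entry point: no source document is needed. -->
  <xsl:template name="xsl:initial-template">
    <!-- $V is (1, 1, 1, 1) and $S is (3, 0, 0). The last integer
         of $S is reused to re-base the fourth number. -->
    <xsl:number value="1, 1, 1, 1" format="1.1" start-at="3 0 0"/>
  </xsl:template>
</xsl:stylesheet>
```

By the rules above, the re-based sequence is 3, 0, 0, 0, so the text node produced should contain 3.0.0.0.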

12.2 Formatting a Supplied Number

The place marker to be formatted may be specified by an expression. The value attribute contains the expression. The value of this expression is atomized using the procedure defined in [XPath 3.0], and each value $V in the atomized sequence is then converted to the integer value returned by the XPath expression xs:integer(round(number($V))). If the start-at attribute is present, this sequence is then re-based as described in 12.1 The start-at Attribute. The resulting sequence of integers is used as the place marker to be formatted.

If the instruction is processed with XSLT 1.0 behavior, then:

[ERR XTDE0980] It is a dynamic error if any undiscarded item in the atomized sequence supplied as the value of the value attribute of xsl:number cannot be converted to an integer, or if the resulting integer is less than 0 (zero).

Note:

The value zero does not arise when numbering nodes in a source document, but it can arise in other numbering sequences. It is permitted specifically because the rules of the xsl:number instruction are also invoked by functions such as format-time [Functions and Operators 3.0]: the minutes and seconds components of a time value can legitimately be zero.

The resulting sequence is formatted as a string using the effective values of the attributes specified in 12.4 Number to String Conversion Attributes; each of these attributes is interpreted as an attribute value template. After conversion, the xsl:number element constructs a new text node containing the resulting string, and returns this node.

Example: Numbering a Sorted List

The following example numbers a sorted list:

<xsl:template match="items">
  <xsl:for-each select="item">
    <xsl:sort select="."/>
    <p>
      <xsl:number value="position()" format="1. "/>
      <xsl:value-of select="."/>
    </p>
  </xsl:for-each>
</xsl:template>
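The conversion of supplied values to integers can be illustrated with a mixed sequence. This is a hedged sketch, assuming an XSLT 3.0 processor invoked with the initial template xsl:initial-template and not running with XSLT 1.0 behavior; the values chosen are arbitrary.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="xsl:initial-template">
    <!-- Each item $V is converted using xs:integer(round(number($V))):
         2.5 becomes 3, the string '7' becomes 7, and 3.4 becomes 3. -->
    <xsl:number value="2.5, '7', 3.4" format="1-1"/>
  </xsl:template>
</xsl:stylesheet>
```

Since the XPath round function rounds 2.5 to 3, the place marker is (3, 7, 3) and the expected result is the string 3-7-3.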

12.3 Numbering based on Position in a Document

If no value attribute is specified, then the xsl:number instruction returns a new text node containing a formatted place marker that is based on the position of a selected node within its containing document. If the select attribute is present, then the expression contained in the select attribute is evaluated to determine the selected node. If the select attribute is omitted, then the selected node is the context node.

[ERR XTTE0990] It is a type error if the xsl:number instruction is evaluated, with no value or select attribute, when the context item is not a node.

[ERR XTTE1000] It is a type error if the result of evaluating the select attribute of the xsl:number instruction is anything other than a single node.

The following attributes control how the selected node is to be numbered:

In addition, the attributes specified in 12.4 Number to String Conversion Attributes are used for number to string conversion, as in the case when the value attribute is specified.

The xsl:number element first constructs a sequence of positive integers using the level, count and from attributes. Where level is single or any, this sequence will either be empty or contain a single number; where level is multiple, the sequence may be of any length. The sequence is constructed as follows:

Let matches-count($node) be a function that returns true if and only if the given node $node matches the pattern given in the count attribute, or the implied pattern (according to the rules given above) if the count attribute is omitted.

Let matches-from($node) be a function that returns true if and only if the given node $node matches the pattern given in the from attribute, or if $node is the root node of a tree. If the from attribute is omitted, then the function returns true if and only if $node is the root node of a tree.

Let $S be the selected node.

When level="single":

When level="multiple":

When level="any":

The resulting sequence of numbers is referred to as the place marker.

If the start-at attribute is present, then the place marker is re-based as described in 12.1 The start-at Attribute.

The sequence of numbers is then converted into a string using the effective values of the attributes specified in 12.4 Number to String Conversion Attributes; each of these attributes is interpreted as an attribute value template. After conversion, the resulting string is used to create a text node, which forms the result of the xsl:number instruction.

Example: Numbering the Items in an Ordered List

The following will number the items in an ordered list:

<xsl:template match="ol/item">
  <fo:block>
    <xsl:number/>
    <xsl:text>. </xsl:text>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

 

Example: Multi-Level Numbering

The following two rules will number title elements. This is intended for a document that contains a sequence of chapters followed by a sequence of appendices, where both chapters and appendices contain sections, which in turn contain subsections. Chapters are numbered 1, 2, 3; appendices are numbered A, B, C; sections in chapters are numbered 1.1, 1.2, 1.3; sections in appendices are numbered A.1, A.2, A.3. Subsections within a chapter are numbered 1.1.1, 1.1.2, 1.1.3; subsections within an appendix are numbered A.1.1, A.1.2, A.1.3.

<xsl:template match="title">
  <fo:block>
     <xsl:number level="multiple"
                 count="chapter|section|subsection"
                 format="1.1 "/>
     <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="appendix//title" priority="1">
  <fo:block>
     <xsl:number level="multiple"
                 count="appendix|section|subsection"
                 format="A.1 "/>
     <xsl:apply-templates/>
  </fo:block>
</xsl:template>
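The two rules can be exercised without a source document by holding a small sample in a variable. The sketch below assumes an XSLT 3.0 processor invoked with the initial template xsl:initial-template; the sample content and the variable name sample are hypothetical, and text output replaces the fo:block elements to keep the result easy to read.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample input, built as a temporary tree. -->
  <xsl:variable name="sample">
    <doc>
      <chapter>
        <title>Intro</title>
        <section><title>Scope</title></section>
        <section><title>Terms</title></section>
      </chapter>
      <appendix>
        <title>Tables</title>
        <section><title>Codes</title></section>
      </appendix>
    </doc>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <xsl:apply-templates select="$sample//title"/>
  </xsl:template>

  <!-- Titles in chapters: one number per matching ancestor. -->
  <xsl:template match="title">
    <xsl:number level="multiple"
                count="chapter|section|subsection"
                format="1.1 "/>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Titles in appendices: the first number is formatted as a letter. -->
  <xsl:template match="appendix//title" priority="1">
    <xsl:number level="multiple"
                count="appendix|section|subsection"
                format="A.1 "/>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For this sample the expected lines are "1 Intro", "1.1 Scope", "1.2 Terms", "A Tables" and "A.1 Codes". The appendix is numbered A rather than B because the chapter that precedes it does not match the count pattern of the second rule.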

 

Example: Numbering Notes within a Chapter

This example numbers notes sequentially within a chapter, starting from the number 100:

<xsl:template match="note">
  <fo:block>
     <xsl:number level="any" from="chapter" format="(1) " start-at="100"/>
     <xsl:apply-templates/>
  </fo:block>
</xsl:template>

12.4 Number to String Conversion Attributes

Note:

This specification is aligned with that of the format-integer function [Functions and Operators 3.0], but there are differences; for example, grouping separators are part of the primary format token in format-integer, but are indicated by separate attributes in xsl:number.

The following attributes are used to control conversion of a sequence of numbers into a string. The numbers are integers greater than or equal to 0 (zero). The attributes are all optional.

The main attribute is format. The default value for the format attribute is 1. The format attribute is split into a sequence of tokens where each token is a maximal sequence of alphanumeric characters or a maximal sequence of non-alphanumeric characters. Alphanumeric means any character that has a Unicode category of Nd, Nl, No, Lu, Ll, Lt, Lm or Lo (see [UNICODE]). The alphanumeric tokens (format tokens) indicate the format to be used for each number in the sequence; in most cases the format token is the same as the required representation of the number 1 (one).

Each non-alphanumeric token is either a prefix, a separator, or a suffix. If there is a non-alphanumeric token but no format token, then the single non-alphanumeric token is used as both the prefix and the suffix. The prefix, if it exists, is the non-alphanumeric token that precedes the first format token: the prefix always appears exactly once in the constructed string, at the start. The suffix, if it exists, is the non-alphanumeric token that follows the last format token: the suffix always appears exactly once in the constructed string, at the end. All other non-alphanumeric tokens (those that occur between two format tokens) are separator tokens and are used to separate numbers in the sequence.

The nth format token is used to format the nth number in the sequence. If there are more numbers than format tokens, then the last format token is used to format remaining numbers. If there are no format tokens, then a format token of 1 is used to format all numbers. Each number after the first is separated from the preceding number by the separator token preceding the format token used to format that number, or, if that is the first format token, then by . (dot).

Example: Formatting a List of Numbers

Given the sequence of numbers 5, 13, 7 and the format attribute value A-001(i), which contains the three format tokens A, 001 and i, the output will be the string E-013(vii).
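Two further cases of the tokenization rules, a sequence longer than the list of format tokens and a format with no format token at all, are sketched below. The sketch assumes an XSLT 3.0 processor invoked with the initial template xsl:initial-template; the format strings are chosen purely for illustration.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="xsl:initial-template">
    <!-- Tokens: prefix "[", format tokens 1, a and A with the
         separators "." and "-", and suffix "]". The fourth number
         reuses the last format token and the separator before it. -->
    <xsl:number value="1, 2, 3, 4" format="[1.a-A]"/>
    <xsl:text>&#10;</xsl:text>

    <!-- No format token: "#" serves as both prefix and suffix, the
         numbers use the token 1, and the separator defaults to ".". -->
    <xsl:number value="1, 2" format="#"/>
  </xsl:template>
</xsl:stylesheet>
```

Applying the rules above by hand gives [1.b-C-D] for the first instruction and #1.2# for the second.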

Format tokens are interpreted as follows:

For all format tokens other than the first kind above (one that consists of decimal digits), there may be implementation-defined lower and upper bounds on the range of numbers that can be formatted using this format token; indeed, for some numbering sequences there may be intrinsic limits. For example, the format token &#x2460; (circled digit one, ①) has a range imposed by the Unicode character repertoire (zero to 20 in Unicode versions prior to 3.2, or zero to 50 in subsequent versions). For the numbering sequences described above any upper bound imposed by the implementation must not be less than 1000 (one thousand) and any lower bound must not be greater than 1. Numbers that fall outside this range must be formatted using the format token 1. The numbering sequence associated with the format token 1 has a lower bound of 0 (zero).

The above expansions of numbering sequences for format tokens such as a and i are indicative but not prescriptive. There are various conventions in use for how alphabetic sequences continue when the alphabet is exhausted, and differing conventions for how roman numerals are written (for example, IV versus IIII as the representation of the number 4). Sometimes alphabetic sequences are used that omit letters such as i and o. This specification does not prescribe the detail of any sequence other than those sequences consisting entirely of decimal digits.

Many numbering sequences are language-sensitive. This applies especially to the sequence selected by the tokens w, W and Ww. It also applies to other sequences: for example, different languages using the Cyrillic alphabet use different sequences of characters, each starting with the letter #x410 (Cyrillic capital letter A). In such cases, the lang attribute specifies which language’s conventions are to be used; its effective value must either be a string in the value space of xs:language, or a zero-length string. If no lang value is specified, or if the value is a zero-length string, the language that is used is implementation-defined. The set of languages for which numbering is supported is implementation-defined. If a language is requested that is not supported, the processor may use a fallback language identified by removing successive hyphen-separated suffixes from the supplied value until a supported language code is obtained; failing this, the processor uses the language that it would use if the lang attribute were omitted.

The optional ordinal attribute is used to indicate whether cardinal or ordinal numbers are required, and to select other options relating to the grammatical context of the number to be formatted. The allowed set of values is implementation-defined. If the attribute is absent, or if its value is zero-length, or if its value is no or 0 or false, then cardinal numbers appropriate to the selected language are output. If the value is yes or 1 or true, then ordinal numbers appropriate to the target language are output. Other values are implementation-defined.

For example, in English, the value ordinal="yes" when used with the format token 1 outputs the sequence 1st 2nd 3rd 4th ..., and when used with the format token w outputs the sequence first second third fourth ....
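A stylesheet that requests these English ordinals might look like the following sketch. It assumes an XSLT 3.0 processor invoked with the initial template xsl:initial-template; because the set of supported languages and ordinal values is implementation-defined, the result may differ between processors.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="xsl:initial-template">
    <xsl:for-each select="1 to 4">
      <!-- Numeric ordinal, for example 1st, if English is supported. -->
      <xsl:number value="." format="1" ordinal="yes" lang="en"/>
      <xsl:text> </xsl:text>
      <!-- Ordinal in words, for example first, if English is supported. -->
      <xsl:number value="." format="w" ordinal="yes" lang="en"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

On a processor that supports English ordinals, each output line should pair the two forms, starting with "1st first" and "2nd second".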

Note:

In some languages, the form of numbers (especially ordinal numbers) varies depending on the grammatical context: they may have different genders and may decline with the noun that they qualify. In such cases the value of the ordinal attribute may be used to indicate the variation of the cardinal or ordinal number required, in an implementation-defined way.

The way in which the variation is indicated will depend on the conventions of the language.

For inflected languages that vary the ending of the word, the approach recommended in the previous version of this specification was to indicate the required ending, preceded by a hyphen: for example in German, appropriate values might be ordinal="-e", ordinal="-er", ordinal="-es", ordinal="-en".

Another approach, which might usefully be adopted by an implementation based on the open-source ICU localization library [ICU], or any other library making use of the Unicode Common Locale Data Repository [Unicode CLDR], is to allow the value of the attribute to be the name of a registered numbering rule set for the language in question, conventionally prefixed with a percent sign: for example, ordinal="%spellout-ordinal-masculine", or ordinal="%spellout-cardinal-year". (The attribute name ordinal in this case is a misnomer, but serves the purpose.)

Example: Ordinal Numbering in Italian

The specification format="1" ordinal="-º" lang="it", if supported, should produce the sequence:

1º 2º 3º 4º ...

The specification format="Ww" ordinal="-o" lang="it", if supported, should produce the sequence:

Primo Secondo Terzo Quarto Quinto ...

The letter-value attribute disambiguates between numbering sequences that use letters. In many languages there are two commonly used numbering sequences that use letters. One numbering sequence assigns numeric values to letters in alphabetic sequence, and the other assigns numeric values to each letter in some other manner traditional in that language. In English, these would correspond to the numbering sequences specified by the format tokens a and i. In some languages, the first member of each sequence is the same, and so the format token alone would be ambiguous. A value of alphabetic specifies the alphabetic sequence; a value of traditional specifies the other sequence. If the letter-value attribute is not specified, then it is implementation-dependent how any ambiguity is resolved.

Note:

Implementations may use extension attributes on xsl:number to provide additional control over the way in which numbers are formatted.

The grouping-separator attribute gives the separator used as a grouping (for example, thousands) separator in decimal numbering sequences, and the optional grouping-size specifies the size (normally 3) of the grouping. For example, grouping-separator="," and grouping-size="3" would produce numbers of the form 1,000,000 while grouping-separator="." and grouping-size="2" would produce numbers of the form 1.00.00.00. If only one of the grouping-separator and grouping-size attributes is specified, then it is ignored.

The effective value of the grouping-separator attribute may be any string, including a zero-length string.

The effective value of the grouping-size attribute must be a string in the lexical space of xs:integer. If the resulting integer is positive then it defines the number of digits between adjacent grouping separators; if it is zero or negative, then no grouping separators are inserted.
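The grouping rules can be checked with a few supplied values, as in this minimal sketch. It assumes an XSLT 3.0 processor invoked with the initial template xsl:initial-template; the numbers are arbitrary.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="xsl:initial-template">
    <!-- Groups of three digits separated by a comma. -->
    <xsl:number value="1000000" grouping-separator="," grouping-size="3"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Groups of two digits separated by a dot. -->
    <xsl:number value="1000000" grouping-separator="." grouping-size="2"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Only one of the two attributes is present, so it is ignored. -->
    <xsl:number value="1000000" grouping-separator=","/>
    <xsl:text>&#10;</xsl:text>

    <!-- A grouping size of zero inserts no separators. -->
    <xsl:number value="1000000" grouping-separator="," grouping-size="0"/>
  </xsl:template>
</xsl:stylesheet>
```

Following the rules above, the four lines of output should be 1,000,000, then 1.00.00.00, then 1000000 twice.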

Example: Format Tokens and the Resulting Sequences

These examples use non-Latin characters which might not display correctly in all browsers, depending on the system configuration.

Format tokens for use with xsl:number
Description | Format Token | Sequence
French cardinal words | format="Ww" lang="fr" | Un, Deux, Trois, Quatre
German ordinal words | format="w" ordinal="-e" lang="de" | erste, zweite, dritte, vierte
Katakana numbering | format="&#x30A2;" | ア, イ, ウ, エ, オ, カ, キ, ク, ケ, コ, サ, シ, ス, セ, ソ, タ, チ, ツ, テ, ト, ナ, ニ, ヌ, ネ, ノ, ハ, ヒ, フ, ヘ, ホ, マ, ミ, ム, メ, モ, ヤ, ユ, ヨ, ラ, リ, ル, レ, ロ, ワ, ヰ, ヱ, ヲ, ン
Katakana numbering in iroha order | format="&#x30A4;" | イ, ロ, ハ, ニ, ホ, ヘ, ト, チ, リ, ヌ, ル, ヲ, ワ, カ, ヨ, タ, レ, ソ, ツ, ネ, ナ, ラ, ム, ウ, ヰ, ノ, オ, ク, ヤ, マ, ケ, フ, コ, エ, テ, ア, サ, キ, ユ, メ, ミ, シ, ヱ, ヒ, モ, セ, ス
Thai numbering | format="&#x0E51;" | ๑, ๒, ๓, ๔, ๕, ๖, ๗, ๘, ๙, ๑๐, ๑๑, ๑๒, ๑๓, ๑๔, ๑๕, ๑๖, ๑๗, ๑๘, ๑๙, ๒๐
Traditional Hebrew numbering | format="&#x05D0;" letter-value="traditional" | א, ב, ג, ד, ה, ו, ז, ח, ט, י, יא, יב, יג, יד, טו, טז, יז, יח, יט, כ
Traditional Georgian numbering | format="&#x10D0;" letter-value="traditional" | ა, ბ, გ, დ, ე, ვ, ზ, ჱ, თ, ი, ია, იბ, იგ, იდ, იე, ივ, იზ, იჱ, ით, კ
Classical Greek numbering (see note) | format="&#x03B1;" letter-value="traditional" | αʹ, βʹ, γʹ, δʹ, εʹ, ϛʹ, ζʹ, ηʹ, θʹ, ιʹ, ιαʹ, ιβʹ, ιγʹ, ιδʹ, ιεʹ, ιϛʹ, ιζʹ, ιηʹ, ιθʹ, κʹ
Old Slavic numbering | format="&#x0430;" letter-value="traditional" | А, В, Г, Д, Е, Ѕ, З, И, Ѳ, Ӏ, АӀ, ВӀ, ГӀ, ДӀ, ЕӀ, ЅӀ, ЗӀ, ИӀ, ѲӀ, К

Note that Classical Greek is an example where the format token is not the same as the representation of the number 1.