JSON is a popular format for exchange of structured data on the web: it is specified in [RFC 7159]. This section describes facilities allowing JSON data to be processed using XSLT.
Note:
RFC 7159 is taken as the definitive specification of JSON for the purposes of this document. The RFC explains its relationship with other JSON specifications such as [ECMA-404].
Note:
XPath 3.1 incorporates the functions defined in this section. It also provides additional JSON capability, in the form of functions parse-json, json-doc, and extensions to the serializeFO30 function. These facilities are incorporated in XSLT 3.0 only if the XPath 3.1 feature is supported. They depend on support for arrays.
This specification defines a mapping from JSON data to XML (specifically, to XDM instances). A function json-to-xml is provided to take a JSON string as input and convert it to the XML representation. Two stylesheet modules are provided to perform the reverse transformation: one produces JSON in compact linear form, the other in indented form suitable for display, editing, or printing.
The XML representation is designed to be capable of representing any valid JSON text other than one that uses characters which are not valid in XML. The transformation is lossless: that is, distinct JSON texts convert to distinct XML representations. When converting JSON to XML, options are provided to reject unsupported characters, to replace them with a substitute character, or to leave them in backslash-escaped form.
The following example demonstrates the correspondence of a JSON text and the corresponding XML representation.
Consider the following JSON text:
{
  "desc"    : "Distances between several cities, in kilometers.",
  "updated" : "2014-02-04T18:50:45",
  "uptodate": true,
  "author"  : null,
  "cities"  : {
    "Brussels": [
      {"to": "London",    "distance": 322},
      {"to": "Paris",     "distance": 265},
      {"to": "Amsterdam", "distance": 173}
    ],
    "London": [
      {"to": "Brussels",  "distance": 322},
      {"to": "Paris",     "distance": 344},
      {"to": "Amsterdam", "distance": 358}
    ],
    "Paris": [
      {"to": "Brussels",  "distance": 265},
      {"to": "London",    "distance": 344},
      {"to": "Amsterdam", "distance": 431}
    ],
    "Amsterdam": [
      {"to": "Brussels",  "distance": 173},
      {"to": "London",    "distance": 358},
      {"to": "Paris",     "distance": 431}
    ]
  }
}
The XML representation of this text is as follows. Whitespace is included in the XML representation for purposes of illustration, and is ignored by the stylesheets that convert XML to JSON, but it will not be present in the output of the json-to-xml function.
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string key='desc'>Distances between several cities, in kilometers.</string>
  <string key='updated'>2014-02-04T18:50:45</string>
  <boolean key="uptodate">true</boolean>
  <null key="author"/>
  <map key='cities'>
    <array key="Brussels">
      <map>
        <string key="to">London</string>
        <number key="distance">322</number>
      </map>
      <map>
        <string key="to">Paris</string>
        <number key="distance">265</number>
      </map>
      <map>
        <string key="to">Amsterdam</string>
        <number key="distance">173</number>
      </map>
    </array>
    <array key="London">
      <map>
        <string key="to">Brussels</string>
        <number key="distance">322</number>
      </map>
      <map>
        <string key="to">Paris</string>
        <number key="distance">344</number>
      </map>
      <map>
        <string key="to">Amsterdam</string>
        <number key="distance">358</number>
      </map>
    </array>
    <array key="Paris">
      <map>
        <string key="to">Brussels</string>
        <number key="distance">265</number>
      </map>
      <map>
        <string key="to">London</string>
        <number key="distance">344</number>
      </map>
      <map>
        <string key="to">Amsterdam</string>
        <number key="distance">431</number>
      </map>
    </array>
    <array key="Amsterdam">
      <map>
        <string key="to">Brussels</string>
        <number key="distance">173</number>
      </map>
      <map>
        <string key="to">London</string>
        <number key="distance">358</number>
      </map>
      <map>
        <string key="to">Paris</string>
        <number key="distance">431</number>
      </map>
    </array>
  </map>
</map>
An XSD 1.0 schema for the XML representation is provided in B.1 Schema for the XML Representation of JSON. It is not necessary to import this schema (using xsl:import-schema) unless the stylesheet makes explicit reference to the components defined in the schema. If the stylesheet does import a schema for the namespace http://www.w3.org/2005/xpath-functions, then:
The processor (if it is schema-aware) must recognize an xsl:import-schema declaration for this namespace, whether or not the schema-location is supplied.
If a schema-location is provided, then the schema document at that location must be equivalent to the schema document at B.1 Schema for the XML Representation of JSON; the effect if it is not is implementation-dependent.
The rules governing the mapping from JSON to XML are as follows. In these rules, the phrase “an element named N” is to be interpreted as meaning “an element node whose local name is N and whose namespace URI is http://www.w3.org/2005/xpath-functions”.
The JSON value null is represented by an element named null, with empty content.
The JSON values true and false are represented by an element named boolean, with content conforming to the type xs:boolean.
A JSON number is represented by an element named number, with content conforming to the type xs:double, with the additional restriction that the value must not be positive or negative infinity, nor NaN.
A JSON string is represented by an element named string, with content conforming to the type xs:string.
A JSON array is represented by an element named array. The content is a sequence of child elements representing the members of the array in order, each such element being the representation of the array member obtained by applying these rules recursively.
A JSON object is represented by an element named map. The content is a sequence of child elements each of which represents one of the name/value pairs in the object. The representation of the name/value pair N:V is obtained by taking the element that represents the value V (by applying these rules recursively) and adding an attribute with name key (in no namespace), whose value is N as an instance of xs:string.
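The mapping rules above can be sketched outside XSLT as a short recursive function. The following Python is an illustration only, not part of this specification: the names JsonNumber, to_element and json_text_to_element are hypothetical, Python's json module stands in for a conforming JSON parser (it collapses duplicate object names, for instance), and numbers are copied in their JSON lexical form rather than being cast to xs:double.

```python
import json
import xml.etree.ElementTree as ET

FN_NS = "http://www.w3.org/2005/xpath-functions"


class JsonNumber(str):
    """Holds the lexical form of a JSON number, so it is not mistaken for a string."""


def _reject_constant(name):
    # NaN and Infinity are not JSON and have no XML representation here.
    raise ValueError("not a JSON number: " + name)


def to_element(value, key=None):
    """Apply the mapping rules recursively to one parsed JSON value."""
    if value is None:
        elem = ET.Element("{%s}null" % FN_NS)
    elif isinstance(value, bool):
        elem = ET.Element("{%s}boolean" % FN_NS)
        elem.text = "true" if value else "false"
    elif isinstance(value, JsonNumber):  # must be tested before plain str
        elem = ET.Element("{%s}number" % FN_NS)
        elem.text = str(value)
    elif isinstance(value, str):
        elem = ET.Element("{%s}string" % FN_NS)
        elem.text = value
    elif isinstance(value, list):
        elem = ET.Element("{%s}array" % FN_NS)
        for member in value:
            elem.append(to_element(member))
    elif isinstance(value, dict):
        elem = ET.Element("{%s}map" % FN_NS)
        for name, member in value.items():  # insertion order is retained
            elem.append(to_element(member, key=name))
    else:
        raise TypeError("unsupported value: %r" % (value,))
    if key is not None:
        elem.set("key", key)  # attribute in no namespace
    return elem


def json_text_to_element(json_text):
    parsed = json.loads(
        json_text,
        parse_int=JsonNumber,
        parse_float=JsonNumber,
        parse_constant=_reject_constant,
    )
    return to_element(parsed)


root = json_text_to_element('{"x": 1, "y": [3,4,5]}')
print(ET.tostring(root, encoding="unicode"))
```

The sketch ignores the escaped and escaped-key attributes and performs no check for characters that are invalid in XML.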
The attribute escaped="true" may be specified on a string element to indicate that the string value contains backslash-escaped characters that are to be interpreted according to the JSON rules. The attribute escaped-key="true" may be specified on any element with a key attribute to indicate that the key contains backslash-escaped characters that are to be interpreted according to the JSON rules. Both attributes have the default value false.
The JSON grammar for number is a subset of the lexical space of the XSD type xs:double. The mapping from JSON number values to xs:double values is defined by the XPath rules for casting from xs:string to xs:double. Note that these rules will never generate an error for out-of-range values; instead very large or very small values will be converted to +INF or -INF. Since JSON does not impose limits on the range or precision of numbers, the conversion is not guaranteed to be lossless.
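The loss of range and precision can be seen in any language whose floating-point type is an IEEE 754 double. The sketch below uses Python's float as an analogue of xs:double; it is an assumption that Python's string-to-float conversion behaves like the XPath cast for these inputs, and the function names are hypothetical.

```python
from decimal import Decimal


def json_number_to_double(lexical):
    """Convert the lexical form of a JSON number to an IEEE 754 double."""
    return float(lexical)


def is_lossless(lexical):
    """True if the double holds exactly the value written in the JSON text."""
    return Decimal(lexical) == Decimal(json_number_to_double(lexical))


print(json_number_to_double("1e999"))             # out of range: infinity, no error
print(json_number_to_double("-1e999"))            # out of range: negative infinity
print(json_number_to_double("9007199254740993"))  # 2**53 + 1 is not representable
print(is_lossless("0.5"), is_lossless("0.1"))
```

Decimal is used only to compare exact values; it plays no part in the conversion itself.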
Although the order of entries in a JSON object is generally considered to have no significance, the function json-to-xml and the stylesheets that perform the reverse transformation both retain order.
The XDM representation of a JSON value may either be untyped (all elements annotated as xs:untyped, attributes as xs:untypedAtomic), or it may be typed. If it is typed, then it must have the type annotations obtained by validating the untyped representation against the schema given in B.1 Schema for the XML Representation of JSON. If it is untyped, then it must be an XDM instance such that validation against this schema would succeed.
This section describes conventions which in principle can be adopted by the specification of any function. At the time of writing, the functions that invoke these conventions are xml-to-json and json-to-xml.
As a matter of convention, a number of functions defined in this document take a parameter whose value is a map, defining options controlling the detail of how the function is evaluated. Maps are a new data type introduced in XSLT 3.0.
For example, the function fn:xml-to-json has an options parameter allowing specification of whether the output is to be indented. A call might be written:
fn:xml-to-json($input, map{'indent':true()})
[Definition: Functions that take an options parameter adopt common conventions on how the options are used. These are referred to as the option parameter conventions. These rules apply only to functions that explicitly refer to them.]
Where a function adopts the option parameter conventions, the following rules apply:
The value of the relevant argument must be a map. The entries in the map are referred to as options: the key of the entry is called the option name, and the associated value is the option value. Option names defined in this specification are always strings (single xs:string values). Option values may be of any type.
The type of the options parameter in the function signature is always given as map(*).
Although option names are described above as strings, the actual key may be any value that compares equal to the required string (using the eq operator with Unicode codepoint collation). For example, instances of xs:untypedAtomic or xs:anyURI are equally acceptable.
Note:
This means that the implementation of the function can check for the presence and value of particular options using the functions map:contains and/or map:get.
It is not an error if the options map contains options with names other than those described in this specification. Implementations may attach an implementation-defined meaning to such entries, and may define errors that arise if such entries are present with invalid values. Implementations must ignore such entries unless they have a specific implementation-defined meaning. Implementations that define additional options in this way should use values of type xs:QName as the option names, using an appropriate namespace.
All entries in the options map are optional, and supplying an empty map has the same effect as omitting the relevant argument in the function call, assuming this is permitted.
For each named option, the function specification defines a required type for the option value. The value that is actually supplied in the map is converted to this required type using the function conversion rulesXP31. A type error [ERR XPTY0004] XP30 occurs if conversion of the supplied value to the required type is not possible, or if this conversion delivers a coerced function whose invocation fails with a type error. A dynamic error occurs if the supplied value after conversion is not one of the permitted values for the option in question: the error codes for this error are defined in the specification of each function.
Note:
It is the responsibility of each function implementation to invoke this conversion; it does not happen automatically as a consequence of the function calling rules.
In cases where an option is list-valued, by convention the value may be supplied either as a sequence or as an array. Accepting a sequence is convenient if the value is generated programmatically using an XPath expression, while accepting an array allows the options to be held in an external file in JSON format, to be read using a call on the fn:json-doc function.
In cases where the value of an option is itself a map, the specification of the particular function must indicate whether or not these rules apply recursively to the contents of that map.
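As an informal illustration of the option parameter conventions, the following Python sketch resolves a supplied options map against a table of option definitions. It is a simplification under stated assumptions: the function name resolve_options and the shape of the definitions table are hypothetical, a Python dict stands in for an XDM map, and a plain isinstance test stands in for the function conversion rules, which do considerably more.

```python
def resolve_options(supplied, definitions):
    """Return the effective value of every defined option.

    definitions maps each option name to (required type, default value).
    """
    if not isinstance(supplied, dict):
        raise TypeError("the options argument must be a map")
    resolved = {}
    for name, (required_type, default) in definitions.items():
        if name not in supplied:
            # Every entry is optional; an absent entry takes its default.
            resolved[name] = default
            continue
        value = supplied[name]
        if not isinstance(value, required_type):
            # Stand-in for the type error raised when conversion fails.
            raise TypeError("option %r requires %s" % (name, required_type.__name__))
        resolved[name] = value
    # Entries with names not in the definitions are ignored, not rejected.
    return resolved


XML_TO_JSON_OPTIONS = {"indent": (bool, False)}

print(resolve_options({}, XML_TO_JSON_OPTIONS))
print(resolve_options({"indent": True, "vendor:extra": 1}, XML_TO_JSON_OPTIONS))
```

An empty map yields the defaults, which mirrors the rule that supplying an empty map has the same effect as omitting the argument.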
Parses a string supplied in the form of a JSON text, returning the results in the form of an XML document node.
fn:json-to-xml($json-text as xs:string, $options as map(*)) as document-node()
This function is nondeterministicFO30, context-dependentFO30, and focus-independentFO30. It depends on static base URI.
The effect of the one-argument form of this function is the same as calling the two-argument form with an empty map as the value of the $options argument.
The first argument is a JSON-text (see below) in the form of a string. The function parses this string to return an XDM node.
The $options argument can be used to control the way in which the parsing takes place. The value of the argument is a map. The options defined in this specification have keys that are strings. The effect of any map entries whose keys are not defined in this specification is implementation-defined; implementation-defined options should use QNames as keys. Implementations must ignore any entries in the map whose keys are not defined in this specification, unless the key has a specific implementation-defined meaning.
The entries that may appear in the $options map are as follows. The keys are xs:string values:
Key | Value | Meaning |
---|---|---|
liberal | | Determines whether deviations from the syntax of RFC 7159 are permitted. The value must be a boolean. |
liberal | false | The input must consist of an optional byte order mark (which is ignored) followed by a string that conforms to the grammar of JSON-text in [RFC 7159]. An error must be raised (see below) if the input does not conform to the grammar. |
liberal | true | The input may contain deviations from the grammar of [RFC 7159], which are handled in an implementation-defined way. (Note: some popular extensions include allowing quotes on keys to be omitted, allowing a comma to appear after the last item in an array, allowing leading zeroes in numbers, and allowing control characters such as tab and newline to be present in unescaped form.) Since the extensions accepted are implementation-defined, an error may be raised (see below) if the input does not conform to the grammar. |
validate | | If the $options map contains an entry with the key "validate", then the value must be an xs:boolean. The default is true for a schema-aware processor, false for a non-schema-aware processor. If the value true is supplied and the processor is not schema-aware, a dynamic error results [see ERR XTDE3245]. It is not necessary that the containing stylesheet should import the relevant schema. |
validate | true | Indicates that the resulting XDM instance must be typed; that is, the element and attribute nodes must carry the type annotations that result from validation against the schema given at B.1 Schema for the XML Representation of JSON, or against an implementation-defined schema if the liberal option has the value true. |
validate | false | Indicates that the XDM instance must be untyped. |
escape | | Determines whether special characters are represented in the XDM output in backslash-escaped form. The required type is xs:boolean. |
escape | false (default) | All characters in the input that are valid in the version of XML supported by the implementation, whether or not they are represented in the input by means of an escape sequence, are represented as unescaped characters in the result. Any characters or codepoints that are not valid XML characters (for example, unpaired surrogates) are passed to the fallback function as described below; in the absence of a fallback function, they are replaced by the Unicode REPLACEMENT CHARACTER (xFFFD). The attributes escaped and escaped-key will not be present in the XDM output. |
escape | true | JSON escape sequences are used in the result to represent special characters in the JSON input, as defined below, whether or not they were represented using JSON escape sequences in the input. The characters that are considered "special" for this purpose are: \t ), or a six-character escape sequence otherwise (for example \uDEAD). Characters other than these will not be escaped in the result, even if they were escaped in the input. In the result: |
fallback | | Provides a function which is called when an invalid character is encountered. |
fallback | Function with signature function(xs:string) as xs:string | When an invalid character is encountered this function is called supplying the escaped form of the character as the argument. The function returns a string which is inserted into the result in place of the invalid character. The function also has the option of raising a dynamic error. |
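The handling of invalid characters when escape is false can be sketched as follows. This Python is illustrative only: it assumes XML 1.0 character validity, assumes that the "escaped form" passed to the fallback function is the six-character \uHHHH sequence, and uses the hypothetical names is_xml10_char and replace_invalid.

```python
def is_xml10_char(ch):
    """True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def replace_invalid(text, fallback=None):
    """Replace each character that is not valid in XML.

    With a fallback, the function receives the escaped form (assumed here
    to be \\uHHHH) and its result is inserted; without one, U+FFFD is used.
    """
    out = []
    for ch in text:
        if is_xml10_char(ch):
            out.append(ch)
        elif fallback is not None:
            out.append(fallback("\\u%04X" % ord(ch)))
        else:
            out.append("\uFFFD")
    return "".join(out)


print(replace_invalid("a\x00b") == "a\uFFFDb")
print(replace_invalid("a\x01b", lambda escaped: "[" + escaped + "]"))
```

A fallback that raises an exception corresponds to the option of raising a dynamic error.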
The various structures that can occur in JSON are transformed recursively to XDM values according to the rules given in 22.1 XML Representation of JSON.
The function returns a document node, whose only child is the element node representing the outermost construct in the JSON text.
The function is not deterministicFO30: that is, if the function is called twice with the same arguments, it is implementation-dependent whether the same node is returned on both occasions.
The base URI of the returned document node is taken from the static base URI of the function call.
[ERR XTDE3240] It is a dynamic error if the value of $json-text does not conform to the JSON grammar as defined by [RFC 7159], allowing implementation-defined extensions if the liberal option is set to true.
[ERR XTDE3245] It is a dynamic error if the value of the validate option is true and the processor is not schema-aware.
[ERR XTDE3250] It is a dynamic error if the value of $json-text contains an escaped representation of a character (or codepoint) that is not a valid character in the version of XML supported by the implementation, unless the escape option is set to true.
[ERR XTDE3260] It is a dynamic error if the value of $options includes an entry whose key is liberal, validate, escape, or fallback, and whose value is not a permitted value for that key.
To read a JSON file, this function can be used in conjunction with the unparsed-textFO30 function.
ECMA-404 differs from RFC 4627 in two respects: it does not allow the input to depart from the JSON grammar, but it does allow the top-level construct in the input to be a string, boolean, number, or null, rather than requiring an object or array.
Many JSON implementations allow commas to be used after the last item in an object or array, although the specification does not permit it. The liberal option (with the value true) is provided to allow such deviations from the specification to be accepted. Some JSON implementations also allow constructors such as new Date("2000-12-13") to appear as values: setting the liberal option to true allows such extensions to be accepted, but does not guarantee it. If such extensions are accepted, the resulting value is implementation-defined, and will not necessarily conform to the schema at B.1 Schema for the XML Representation of JSON.
The expression json-to-xml('{"x": 1, "y": [3,4,5]}')
returns
<map xmlns="http://www.w3.org/2005/xpath-functions">
<number key="x">1</number>
<array key="y">
<number>3</number>
<number>4</number>
<number>5</number>
</array>
</map>
.
The expression json-to-xml('"abcd"', map{'liberal': false()})
returns <string xmlns="http://www.w3.org/2005/xpath-functions">abcd</string>
.
The expression json-to-xml('{"x": "\\", "y": "\u0025"}')
returns
<map xmlns="http://www.w3.org/2005/xpath-functions">
<string key="x">\</string>
<string key="y">%</string>
</map>
.
The expression json-to-xml('{"x": "\\", "y": "\u0025"}', map{'escape': true()})
returns
<map xmlns="http://www.w3.org/2005/xpath-functions">
<string escaped="true" key="x">\\</string>
<string key="y">%</string>
</map>
. (But see the detailed rules for alternative values of the escaped
attribute
on the second string
element.)
The following example illustrates use of the fallback function to handle characters that are invalid in XML.
let $jsonstr := unparsed-text('http://example.com/endpoint'),
    $options := map {
      'liberal': true(),
      'fallback': function($char as xs:string) as xs:string {
        let $c0chars := map {
              '\u0000':'[NUL]',
              '\u0001':'[SOH]',
              '\u0002':'[STX]',
              ...
              '\u001E':'[RS]',
              '\u001F':'[US]'
            },
            $replacement := $c0chars($char)
        return
          if (exists($replacement))
          then $replacement
          else error(xs:QName('err:invalid-char'),
                     'Error: ' || $char || ' is not a C0 control character.')
      }
    }
return json-to-xml($jsonstr, $options)
Converts an XML tree, whose format corresponds to the XML representation of JSON defined in this specification, into a string conforming to the JSON grammar.
fn:xml-to-json($input as node()?, $options as map(*)) as xs:string?
This function is deterministicFO30, context-independentFO30, and focus-independentFO30.
The effect of the one-argument form of this function is the same as calling the two-argument form with an empty map as the value of the $options argument.
The first argument $input is a node; the subtree rooted at this node will typically be the XML representation of a JSON document as defined in 22.1 XML Representation of JSON.
If $input is the empty sequence, the function returns the empty sequence.
The $options argument can be used to control the way in which the conversion takes place. The option parameter conventions apply.
The entries that may appear in the $options map are as follows:
Key | Value | Meaning |
---|---|---|
indent | | Determines whether additional whitespace should be added to the output to improve readability. The required type is xs:boolean. |
indent | false | The processor must not insert any insignificant whitespace between JSON tokens. |
indent | true | The processor may insert whitespace between JSON tokens in order to improve readability. The specification imposes no constraints on how this is done. |
The node supplied as $input must be one of the following: [ERR FOJS0006] FO31
An element node whose name matches the name of a global element declaration in the schema given in B.1 Schema for the XML Representation of JSON and whose type annotation matches the type of that element declaration (indicating that the element has been validated against this schema).
An element node whose name matches the name of a global element declaration in the schema given in B.1 Schema for the XML Representation of JSON, and whose content after stripping all attributes (at any depth) in namespaces other than http://www.w3.org/2005/xpath-functions is such that validation against the schema given in B.1 Schema for the XML Representation of JSON would succeed.
Note:
The reason attributes in alien namespaces are stripped is to avoid the need for a non-schema-aware processor to take into account the effect of attributes such as xsi:type and xsi:nil that would affect the outcome of schema validation.
An element node E having a key attribute and/or an escaped-key attribute provided that E would satisfy one of the above conditions if the key and/or escaped-key attributes were removed.
A document node having exactly one element child and no text node children, where the element child satisfies any of the conditions above.
Furthermore, $input must satisfy the following constraint (which cannot be conveniently expressed in the schema). Every element M that is a descendant-or-self of $input and has local name map and namespace URI http://www.w3.org/2005/xpath-functions must satisfy the following rule: there must not be two distinct children of M (say C1 and C2) such that the normalized key of C1 is equal to the normalized key of C2. The normalized key of an element C is as follows:
If C has the attribute value escaped-key="true", then the value of the key attribute of C, with all JSON escape sequences replaced by the corresponding Unicode characters according to the JSON escaping rules.
Otherwise (the escaped-key attribute of C is absent or set to false), the value of the key attribute of C.
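The normalized-key comparison can be sketched in a few lines. The Python below is an illustration under stated assumptions: the names normalized_key and duplicate_keys are hypothetical, each child is modelled as a (key, escaped-key) pair, invalid escape sequences are left untouched rather than reported, and a \uHHHH surrogate pair is not combined into a single character.

```python
import re

# Characters denoted by the two-character JSON escape sequences.
_SIMPLE = {'"': '"', "\\": "\\", "/": "/",
           "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t"}
_ESCAPE = re.compile(r'\\(u[0-9A-Fa-f]{4}|["\\/bfnrt])')


def normalized_key(key, escaped_key=False):
    """Return the key with JSON escapes expanded when escaped-key is true."""
    if not escaped_key:
        return key

    def expand(match):
        body = match.group(1)
        if body[0] == "u":
            return chr(int(body[1:], 16))
        return _SIMPLE[body]

    return _ESCAPE.sub(expand, key)


def duplicate_keys(children):
    """children: iterable of (key, escaped_key) pairs for one map element."""
    seen, duplicates = set(), set()
    for key, escaped_key in children:
        normalized = normalized_key(key, escaped_key)
        if normalized in seen:
            duplicates.add(normalized)
        seen.add(normalized)
    return duplicates


# The keys \n and \u000A denote the same character once escapes are expanded.
print(duplicate_keys([("\\n", True), ("\\u000A", True)]) == {"\n"})
```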
Nodes in the input tree are handled by applying the following rules, recursively. In these rules the term "an element named N" means "an element node whose local name is N and whose namespace URI is http://www.w3.org/2005/xpath-functions".
A document node having a single element node child is processed by processing that child.
An element named null results in the output null.
An element $E named boolean results in the output true or false depending on the result of xs:boolean(fn:string($E)).
An element $E named number results in the output of the string result of xs:string(xs:double(fn:string($E))).
An element named string results in the output of the string value of the element, enclosed in quotation marks, with any special characters in the string escaped as described below.
An element named array results in the output of the children of the array element, each processed by applying these rules recursively: the items in the resulting list are enclosed between square brackets, and separated by commas.
An element named map results in the output of a sequence of map entries corresponding to the children of the map element, enclosed between curly braces and separated by commas. Each entry comprises the value of the key attribute of the child element, enclosed in quotation marks and escaped as described below, followed by a colon, followed by the result of processing the child element by applying these rules recursively.
Comments, processing instructions, and whitespace text node children of map and array are ignored.
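The recursive rules above can be sketched in Python as follows. This is an illustration only, with several simplifications that are assumptions of the sketch rather than part of this specification: the function names are hypothetical, json.dumps stands in for the string escaping rules (it differs in detail and ignores the escaped and escaped-key attributes), number formatting only approximates xs:string(xs:double(...)), and no schema validation or duplicate-key check is performed.

```python
import json
import xml.etree.ElementTree as ET

FN_NS = "http://www.w3.org/2005/xpath-functions"


def _local_name(elem):
    """Return the local name, insisting on the expected namespace."""
    namespace, _, local = elem.tag.rpartition("}")
    if namespace != "{" + FN_NS:
        raise ValueError("unexpected element: " + elem.tag)
    return local


def _number(text):
    value = float(text.strip())
    if value != value or value in (float("inf"), float("-inf")):
        raise ValueError("NaN and infinity are not allowed")
    if value.is_integer() and abs(value) < 1e6:
        return str(int(value))  # for example 1, not 1.0
    return repr(value)          # approximation of the XPath cast


def _boolean(text):
    text = text.strip()
    if text in ("true", "1"):
        return "true"
    if text in ("false", "0"):
        return "false"
    raise ValueError("not an xs:boolean: " + text)


def xml_to_json(elem):
    """Convert one element of the XML representation to a JSON string."""
    name = _local_name(elem)
    if name == "null":
        return "null"
    if name == "boolean":
        return _boolean(elem.text or "")
    if name == "number":
        return _number(elem.text or "")
    if name == "string":
        return json.dumps(elem.text or "", ensure_ascii=False)
    if name == "array":
        # Iterating an element visits child elements only, so whitespace
        # text between the children is ignored.
        return "[" + ",".join(xml_to_json(child) for child in elem) + "]"
    if name == "map":
        entries = (
            json.dumps(child.get("key"), ensure_ascii=False) + ":" + xml_to_json(child)
            for child in elem
        )
        return "{" + ",".join(entries) + "}"
    raise ValueError("unexpected element: " + name)


doc = ET.fromstring(
    '<array xmlns="http://www.w3.org/2005/xpath-functions">'
    "<number>1</number><string>is</string><boolean>1</boolean></array>"
)
print(xml_to_json(doc))
```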
Strings are escaped as follows:
If the attribute escaped="true" is present for a string value, or escaped-key="true" for a key value, then:
any valid JSON escape sequence present in the string is copied unchanged to the output;
any invalid JSON escape sequence results in a dynamic error [ERR FOJS0007] FO31;
any unescaped occurrence of quotation mark, backspace, form-feed, newline, carriage return, tab, or solidus is replaced by \", \b, \f, \n, \r, \t or \/ respectively;
any other codepoint in the range 1-31 or 127-159 is replaced by an escape in the form \uHHHH where HHHH is the upper-case hexadecimal representation of the codepoint value.
Otherwise (that is, in the absence of the attribute escaped="true" for a string value, or escaped-key="true" for a key value):
any occurrence of backslash is replaced by \\;
any occurrence of quotation mark, backspace, form-feed, newline, carriage return, or tab is replaced by \", \b, \f, \n, \r, or \t respectively;
any other codepoint in the range 1-31 or 127-159 is replaced by an escape in the form \uHHHH where HHHH is the upper-case hexadecimal representation of the codepoint value.
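The two sets of escaping rules can be expressed compactly in code. The following Python is a sketch only: the name escape_json_string is hypothetical, the escaped parameter models the escaped="true" or escaped-key="true" attribute, a ValueError stands in for the dynamic error, and the surrounding quotation marks are left to the caller.

```python
import re

# Two-character escapes for the characters named in both rule sets.
_SHORT = {'"': '\\"', "\b": "\\b", "\f": "\\f",
          "\n": "\\n", "\r": "\\r", "\t": "\\t"}
# A valid JSON escape sequence: backslash plus one of "\/bfnrt, or uHHHH.
_VALID_ESCAPE = re.compile(r'\\(?:["\\/bfnrt]|u[0-9A-Fa-f]{4})')


def _escape_char(ch, escape_solidus):
    if ch in _SHORT:
        return _SHORT[ch]
    if ch == "/" and escape_solidus:
        return "\\/"
    codepoint = ord(ch)
    if 1 <= codepoint <= 31 or 127 <= codepoint <= 159:
        return "\\u%04X" % codepoint  # upper-case hexadecimal
    return ch


def escape_json_string(value, escaped=False):
    """Escape a string value or key according to the rules above."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\":
            if escaped:
                # Existing escape sequences are copied through unchanged.
                match = _VALID_ESCAPE.match(value, i)
                if match is None:
                    raise ValueError("invalid JSON escape sequence at offset %d" % i)
                out.append(match.group(0))
                i = match.end()
            else:
                out.append("\\\\")
                i += 1
        else:
            # The solidus is escaped only in the escaped="true" case.
            out.append(_escape_char(ch, escape_solidus=escaped))
            i += 1
    return "".join(out)


print(escape_json_string('say "hi"\n'))
print(escape_json_string("a/b\\n", escaped=True))
```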
A dynamic error is raised [ERR FOJS0005] FO31 if the value of $options includes an entry whose key is defined in this specification, and whose value is not a permitted value for that key.
A dynamic error is raised [ERR FOJS0006] FO31 if the value of $input is not a document or element node or is not valid according to the schema for the XML representation of JSON, or if a map element has two children whose normalized key values are the same.
A dynamic error is raised [ERR FOJS0007] FO31 if the value of $input includes a string labeled with escaped="true", or a key labeled with escaped-key="true", where the content of the string or key contains an invalid JSON escape sequence: specifically, where it contains a backslash (\) that is not followed by one of the characters ", \, /, b, f, n, r, t, or u, or where it contains the characters \u not followed by four hexadecimal digits (that is [0-9A-Fa-f]{4}).
The rule requiring schema validity has a number of consequences, including the following:
The input cannot contain no-namespace attributes, or attributes in the namespace http://www.w3.org/2005/xpath-functions, except where explicitly allowed by the schema. Attributes in other namespaces, however, are ignored.
Nodes that do not affect schema validity, such as comments, processing instructions, namespace nodes, and whitespace text node children of map and array, are ignored.
Numeric values are restricted to those that are valid in JSON: the schema disallows positive and negative infinity and NaN.
Duplicate keys within a map are disallowed. Most cases of duplicate keys are prevented by the rules in the schema; additional cases (where the keys are equal only after expanding JSON escape sequences) are prevented by the prose rules of this function. For example, the key values \n and \u000A are treated as duplicates even though the rules in the schema do not treat them as such.
The rule allowing the top-level element to have a key attribute (which is ignored) allows any element in the output of the fn:json-to-xml function to be processed: for example, it is possible to take a JSON document, convert it to XML, select a subtree based on the value of a key attribute, and then convert this subtree back to JSON, perhaps after a transformation. The rule means that an element with the appropriate name will be accepted if it has been validated against one of the types mapWithinMapType, arrayWithinMapType, stringWithinMapType, numberWithinMapType, booleanWithinMapType, or nullWithinMapType.
The input <array xmlns="http://www.w3.org/2005/xpath-functions"><number>1</number><string>is</string><boolean>1</boolean></array> produces the result [1,"is",true].
The input <map xmlns="http://www.w3.org/2005/xpath-functions"><number key="Sunday">1</number><number key="Monday">2</number></map> produces the result {"Sunday":1,"Monday":2}.
Given an XML structure that does not use the XML representation of JSON defined in 22.1 XML Representation of JSON, there are two practical ways to convert it to JSON: either perform a transformation to the XML representation of JSON and then call the xml-to-json function; or transform it to JSON directly by using custom template rules.
To assist with the second approach, a stylesheet is provided in B.2 Stylesheet for converting XML to JSON. This stylesheet includes a function j:xml-to-json which, apart from being in a different namespace, is functionally very similar to the xml-to-json function described in the previous section. (It differs in doing less validation of the input than the function specification requires, and in the details of how special characters are escaped.)
The implementation of the function is exposed, using template rules to perform a recursive descent of the supplied input, and the behavior of the function can therefore be customized (typically by importing the stylesheet and adding additional template rules) to handle arbitrary XML input.
The stylesheet is provided under the W3C software license for the convenience of users. There is no requirement for any conformant XSLT processor to make this stylesheet available. Processors may implement the xml-to-json function by invoking this stylesheet (adapted to achieve full conformance), but there is no requirement to do so.