[Bug 26784] New: [SER 3.1] Comments on JSON Serialization Method


Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26784

            Bug ID: 26784
           Summary: [SER 3.1] Comments on JSON Serialization Method
           Product: XPath / XQuery / XSLT
           Version: Working drafts
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Serialization 3.1
          Assignee: [hidden email]
          Reporter: [hidden email]
        QA Contact: [hidden email]

Apologies: this review could have been done months ago, but I only just noticed
the spec.

The comment applies to XSLT and XQuery Serialization 3.1 dated 24 April 2014,
mainly section 9.

1. Serializing maps. After converting keys to strings there may be duplicate
keys, e.g. the string "2014-09-11" and the date 2014-09-11 are not equal but
both convert to the same string. What happens?

2. Serializing maps. The spec should state explicitly that the serialization
order of entries is implementation-dependent.

3. The proposal is that nodes in the input should be atomized. I think it would
generally be more useful if they are serialized (using the XML output method
with default properties plus omit-xml-declaration="yes"). Also note,
atomization of a node does not produce a single atomic value, it produces a
sequence of atomic values.

4. indent: I think we should specify that if indent="no", the serializer must
output no whitespace between tokens (none is required to satisfy the grammar).
Because this could generate very long lines, while indent="yes" might generate
very verbose output, perhaps there's a need for an intermediate option to
insert a newline occasionally to limit the line length?

5. suppress-indentation: I can't see how this is relevant to JSON
serialization.

6. We should specify which characters should be escaped using JSON escape
sequences. Probably only (a) those where escaping is mandatory, e.g. backslash
and double-quotes, plus \n, \t etc; plus any characters that can't be
represented directly in the chosen encoding. (So encoding="us-ascii" forces
escaping of non-ASCII characters).

7. Section 9 (JSON output method) says that the effect of item-separator is
described in section 2 (Sequence Normalization), but section 2 (on my reading)
says that it does not apply to the JSON output method. (An alternative reading
is that sequence normalization is mandatory for all output methods except JSON,
where it is presumably optional...)

8. There seem to be values for which no JSON serialization is defined. For
example, sequences. What is the JSON serialization of (1 to 10)? Is this an
error? I would be inclined to serialize any sequence of length > 1 as an array.

9. Other values that cannot be serialized into legal JSON include INF, NaN, and
function items. These should probably be serialization errors.
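
As a rough illustration of items 4, 6 and 9, here is how one existing JSON
serializer, Python's standard json module, behaves. This is only an analogy: it
says nothing about what the Serialization 3.1 spec requires, and the sample
values and the helper name rejects are made up for the sketch.

```python
import json

# Item 4: compact output with no whitespace between tokens.
compact = json.dumps({"a": [1, 2]}, separators=(",", ":"))

# Item 6: mandatory escapes (double quote, backslash, control characters).
escaped = json.dumps('say "hi"\\\n')

# Item 6, continued: characters outside the target encoding get escaped.
# ensure_ascii=True stands in for encoding="us-ascii" (an assumption of
# this sketch, not a spec parameter mapping).
ascii_only = json.dumps("caf\u00e9", ensure_ascii=True)
direct = json.dumps("caf\u00e9", ensure_ascii=False)


# Item 9: INF and NaN have no legal JSON form; allow_nan=False turns
# them into an error instead of emitting non-standard tokens.
def rejects(value):
    try:
        json.dumps(value, allow_nan=False)
    except ValueError:
        return True
    return False
```

With the defaults (allow_nan=True) the same module would emit the tokens
Infinity and NaN, which are not legal JSON; that is the situation item 9
proposes to rule out.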

--
You are receiving this mail because:
You are the QA Contact for the bug.


[Bug 26784] [SER 3.1] Comments on JSON Serialization Method

Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26784

C. M. Sperberg-McQueen <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from C. M. Sperberg-McQueen <[hidden email]> ---
The following remarks reflect the tentative views of the editors of the
serialization spec, after some discussion.

On 1 (duplicate map keys), we note that RFC 7159 does not impose a
uniqueness constraint on member names in an object, although it notes
that some JSON parsers will raise an error on them. The WG seems to
face a choice here:

  (A) Make no substantive change, on the grounds that RFC 7159 does
      not make duplicate object-member names an error.  

      Optionally, add a note to warn users that if they care about
      avoiding duplicate object names, they need to take steps to
      ensure that they are not emitted.

  (B) raise an error, on the grounds that object member names are
      normally supposed to be unique in JSON, RFC 7159 explicitly
      warns of interoperability issues with them, and many users will
      want a serialization error rather than an error in the next JSON
      ingest step.  

      Optionally, add a serialization option to specify that duplicate
      object-member names should not raise an error.  (We'd be happy
      to let try/catch handle this, but try/catch doesn't seem to work
      for implicit serialization.)
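
The two options can be sketched in Python as follows. This is a minimal
illustration, not proposed spec text: the function names stringify_keys and
to_json_object and the allow_duplicates flag are hypothetical, and the sample
map reuses the string/date collision from item 1 of the original report.

```python
import datetime
import json


def stringify_keys(mapping, allow_duplicates=False):
    """Convert map keys to strings.

    allow_duplicates=False models option (B): a collision is an error.
    allow_duplicates=True models option (A): collisions pass through.
    """
    pairs = []
    seen = set()
    for key, value in mapping.items():
        name = key.isoformat() if isinstance(key, datetime.date) else str(key)
        if name in seen and not allow_duplicates:
            raise ValueError(f"duplicate object-member name: {name!r}")
        seen.add(name)
        pairs.append((name, value))
    return pairs


def to_json_object(pairs):
    # Built by hand because a Python dict cannot hold duplicate names.
    members = ",".join(f"{json.dumps(k)}:{json.dumps(v)}" for k, v in pairs)
    return "{" + members + "}"


# Two distinct keys that convert to the same string.
sample = {"2014-09-11": "string key", datetime.date(2014, 9, 11): "date key"}

text_a = to_json_object(stringify_keys(sample, allow_duplicates=True))
# Python's own parser silently keeps the last duplicate member.
reparsed = json.loads(text_a)
```

Under option (A) the output contains the member name twice, and a lenient
consumer such as Python's json.loads silently drops one of the values. Calling
stringify_keys(sample) without the flag raises instead, which is the behaviour
option (B) argues for.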

On 2 (implementation-dependent order of map keys): agreed.  Good
catch.

On 3 (atomization vs serialization).  The WG will need to decide this.

If we understand correctly, the differences between the current draft
and the proposal to use serialization rather than atomization can be
illustrated with the following sample map:

  map {
    "able" := <e>42</e>,
    "baker" := <f>1 2 3 4 5</f>,
    "charlie" := <f><e>1</e><e>2</e><e>3</e><e>4</e><e>5</e></f>,
    "dog" := <p>The design goals for XML are:
               <list type="ordered">
                 <item>
                   <p>XML shall be straightforwardly usable over
                      the Internet.</p>
                 </item>
                 ...
               </list>
             </p>
  }

Under the current draft (augmented with the rules suggested below to
serialize non-empty sequences as JSON arrays and escape JSON strings
as needed), if all the elements are untyped, this would (I think)
produce

  { "able" : 42,
    "baker" : [1, 2, 3, 4, 5],
    "charlie" : "12345",
    "dog" : "The design goals for XML are:\n\t\n\t\t\n\t\t\t\nXML ..."
  }

If the 'f' element is typed as an element-only element, then the
"charlie" member would produce an error when atomization attempted to
access the typed value of the 'f' element.

If we perform serialization rather than atomization, the results
should be more like this (is this a correct understanding of the
intention?):

  {
    "able" : "<e>42</e>",
    "baker" : "<f>1 2 3 4 5</f>",
    "charlie" : "<f><e>1</e><e>2</e><e>3</e><e>4</e><e>5</e></f>",
    "dog" : "<p>The design goals for XML are:\n\t<list type=\"ordered\">\n\t\t<item>\n\t\t\t<p>XML ..."
  }

On the one hand:  atomization serializes more elements as data values,
which seems likely to be often what JSON users desire.  And on the
other:  serialization serializes more elements without errors, which
also seems likely to be a desired property.  
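
The contrast can be tried out with a small Python sketch using the standard
library's ElementTree. It is an approximation under stated assumptions: the
helper names atomized and serialized are hypothetical, atomization is modelled
as the string value of an untyped element (so typed values and the
element-only error case are not covered), and ET.tostring stands in for the
XML output method with omit-xml-declaration="yes".

```python
import json
import xml.etree.ElementTree as ET


def atomized(xml_text):
    # String value of the element: all descendant text concatenated.
    elem = ET.fromstring(xml_text)
    return "".join(elem.itertext())


def serialized(xml_text):
    # Markup of the element as a string, without an XML declaration.
    elem = ET.fromstring(xml_text)
    return ET.tostring(elem, encoding="unicode")


# The "charlie" entry from the sample map.
charlie = "<f><e>1</e><e>2</e><e>3</e><e>4</e><e>5</e></f>"

as_value = json.dumps({"charlie": atomized(charlie)})
as_markup = json.dumps({"charlie": serialized(charlie)})
```

Here as_value carries only the concatenated text, while as_markup keeps the
element structure inside a JSON string.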

On 4 and 5 (indentation): agreed.

On 6 (character escaping):  agreed.

On 7 (item-separator):  we propose to specify that item-separator is
not relevant to JSON output.

On 8 (serialize non-empty non-singleton sequences as arrays):  agreed.

On 9 (INF, NaN, function items, ...):  agreed.

We hope to have draft wording reflecting these changes for the WG to
review tomorrow.


[Bug 26784] [SER 3.1] Comments on JSON Serialization Method

Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26784

--- Comment #2 from C. M. Sperberg-McQueen <[hidden email]> ---
A revised version of the 3.1 Serialization spec which addresses most of the
issues raised here (but not item 3) is at [1] (member-only link).

[1]
https://www.w3.org/XML/Group/qtspecs/specifications/xslt-xquery-serialization-31/html/Overview-diff.html


[Bug 26784] [SER 3.1] Comments on JSON Serialization Method

Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26784

Jonathan Robie <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #3 from Jonathan Robie <[hidden email]> ---
Is the schema up to date?

https://www.w3.org/XML/Group/qtspecs/specifications/xslt-xquery-serialization-31/html/Overview-diff.html#serparams-schema


[Bug 26784] [SER 3.1] Comments on JSON Serialization Method

Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26784

Andrew Coleman <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #4 from Andrew Coleman <[hidden email]> ---
(In reply to Jonathan Robie from comment #3)
> Is the schema up to date?
>
> https://www.w3.org/XML/Group/qtspecs/specifications/xslt-xquery-serialization-31/html/Overview-diff.html#serparams-schema

Yes, it has been updated with the new serialization parameters.


[Bug 26784] [SER 3.1] Comments on JSON Serialization Method

Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26784

Andrew Coleman <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Andrew Coleman <[hidden email]> ---
Closing as agreed in joint teleconference on 2014-10-14
