[Bug 25185] New: Usage absorption can take crawling expressions when TDU derives from xs:anyAtomicType

Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25185

            Bug ID: 25185
           Summary: Usage absorption can take crawling expressions when
                    TDU derives from xs:anyAtomicType
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: XSLT 3.0
          Assignee: [hidden email]
          Reporter: [hidden email]
        QA Contact: [hidden email]

Functions that take an atomized value with an occurrence indicator of one
or zero-or-one can be allowed to take a crawling expression as an argument.

This is true because it is an error if the expression returns more than one
node, and it is possible, in the same way as for fn:count(x), to determine at
runtime whether more than one node is returned, resulting in the dynamic
error XPTY0004[1].

This simplifies scenarios where the user is not interested in the depth of a
certain node, but knows beforehand that there will only ever be one matching
node, and if not, accepts it as an error scenario.

Example:

<xsl:value-of select="string(proto//version)" />

Currently, this is not streamable, but if there is more than one match, this
would result in an error. If there is zero or one match, it is streamable.
Hence, it is streamable in both cases and we can consider this a normal
consuming expression.

This rule can apply to all functions (and therefore can simply be added to the
general streamability rules), even user-defined ones, that have an argument
that derives from xs:anyAtomicType with occurrence indicator one or
zero-or-one. For instance, included are fn:ceiling, fn:dateTime, fn:string,
fn:concat, fn:format-date, fn:error, but excluded are fn:data, fn:deep-equal,
fn:min, fn:max.
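
As a rough illustration of the eligibility test described above, here is a minimal Python sketch. It is not part of the report or of any specification: the function name accepts_crawling_argument is hypothetical, and it assumes, as a simplification, that a declared item type is atomic exactly when it is written as a plain QName (no parentheses, unlike item() or node()).

```python
def accepts_crawling_argument(sequence_type: str) -> bool:
    """Return True if a parameter declared with this SequenceType would
    qualify under the proposed rule: an atomic item type with an
    occurrence indicator of exactly-one (none) or zero-or-one ("?").

    Simplifying assumption: any item type without "(" is atomic.
    """
    st = sequence_type.strip()
    if st and st[-1] in "?*+":
        item_type, indicator = st[:-1].strip(), st[-1]
    else:
        item_type, indicator = st, ""
    is_atomic = "(" not in item_type
    return is_atomic and indicator in ("", "?")


# Parameter types as declared for a few of the functions named above.
print(accepts_crawling_argument("xs:anyAtomicType?"))  # fn:concat arguments
print(accepts_crawling_argument("xs:anyAtomicType*"))  # fn:min, fn:max
print(accepts_crawling_argument("item()*"))            # fn:data, fn:deep-equal
```

Under this model the first call qualifies and the other two do not, which matches the included and excluded groups in the report for those functions.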

This bug report was "inspired" by researching backwards compatibility behavior
for bug 24506, comment 5.

[1] http://www.w3.org/TR/xpath-30/#ERRXPTY0004

--
You are receiving this mail because:
You are the QA Contact for the bug.

--- Comment #1 from Michael Kay <[hidden email]> ---
I think you are right.

(You say "it is an error if the expression returns more than one node", which
is not strictly correct; it's OK if the expression returns more than one node
provided the atomized value is a singleton. So all selected nodes except one
must have a typed value that is an empty sequence)

Some cases may be tricky to handle, but I think it works in theory.

--- Comment #2 from Michael Kay <[hidden email]> ---
For info, Saxon handles this case because before it does streamability
analysis, it does type checking, and where an expression such as
string(//title) expects a singleton, it rewrites it as
string(zero-or-one(data(//title))). The zero-or-one() makes it streamable under
the current rules.

--- Comment #3 from Abel Braaksma <[hidden email]> ---
> and where an expression such as string(//title) expects a
> singleton, it rewrites it as string(zero-or-one(data(//title))).

Actually, under current streamability rules, that would not be streamable,
because of the addition of fn:data() and the crawling posture of //title. If
you were to rewrite it as string(zero-or-one(//title)) it would work.

--- Comment #4 from Michael Kay <[hidden email]> ---
> If you were to rewrite it as string(zero-or-one(//title)) it would work.

Yes, but it would produce the wrong answer if there are two title elements and
one of them has an empty sequence as its typed value.
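
The difference between the two rewrites can be sketched with a small Python model. This is an illustration only, not Saxon's implementation: the Node class and the data and zero_or_one helpers are hypothetical stand-ins for the XDM node, fn:data and fn:zero-or-one, and the error is modeled as a plain ValueError labeled with the FORG0003 code that fn:zero-or-one raises.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    typed_value: list = field(default_factory=list)  # may be empty


def data(nodes):
    """Model of atomization: concatenate the typed values of the nodes."""
    atoms = []
    for node in nodes:
        atoms.extend(node.typed_value)
    return atoms


def zero_or_one(seq):
    """Model of fn:zero-or-one: fail if the sequence has more than one item."""
    if len(seq) > 1:
        raise ValueError("FORG0003: more than one item")
    return seq


# Two title elements, one of which has an empty typed value.
titles = [Node("title", ["Streaming"]), Node("title", [])]

# zero-or-one(data(//title)): cardinality is checked after atomization.
print(zero_or_one(data(titles)))

# zero-or-one(//title): cardinality is checked on the nodes, so this fails
# even though the atomized value is a singleton.
try:
    zero_or_one(titles)
except ValueError as err:
    print("error:", err)
```

In this model the first form succeeds with a single atomic value, while the second rejects an input that string(//title) itself would accept.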

--- Comment #5 from Michael Kay <[hidden email]> ---
I've convinced myself this is streamable, though it's quite tricky in
pathological cases, for example where the crawling sequence includes both
list-valued elements and their text node children. If the crawling sequence
includes both the element

<list> </list>

and its whitespace text node child, then the atomized value of the element is
an empty sequence and the atomized value of the text node is a whitespace
string, so the atomized value of the node sequence is the single whitespace
xs:string value " "...
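
The pathological case above can be worked through in a few lines of Python. This is a simplified model with hypothetical helper names: it assumes the element has a list type whose typed value is obtained by splitting the content on XML whitespace, and it ignores the distinction between xs:string and xs:untypedAtomic.

```python
import re

# XML Schema whitespace: space, tab, carriage return, line feed.
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def typed_value_of_list_element(string_value):
    """List-typed element: one atomic value per whitespace-separated token."""
    return [tok for tok in XML_WHITESPACE.split(string_value) if tok]


def typed_value_of_text_node(string_value):
    """Text node: a single atomic value equal to its string value."""
    return [string_value]


def atomize(nodes):
    """Atomize a node sequence given as (kind, string_value) pairs."""
    atoms = []
    for kind, string_value in nodes:
        if kind == "list-element":
            atoms.extend(typed_value_of_list_element(string_value))
        else:
            atoms.extend(typed_value_of_text_node(string_value))
    return atoms


# The crawling sequence selects <list> </list> and its text node child.
crawled = [("list-element", " "), ("text", " ")]
print(atomize(crawled))
```

Two nodes are selected, but the element contributes nothing, so the atomized sequence is a singleton and passes a zero-or-one check.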

--- Comment #6 from Michael Kay <[hidden email]> ---
In fact, the logic for atomizing a sequence under these conditions is not
really any easier than atomizing an arbitrary crawling sequence. The only
difference is that for an arbitrary crawling sequence you potentially need
memory proportional to the size of the largest atomic value times the depth of
nesting of nodes in the sequence.

Because expressions like //title when used in an atomizing context almost
invariably do NOT select overlapping nodes, I suggest that rather than handle
the special case where the required value is singleton atomic, we allow all
cases where the required type is an atomic sequence. That is, we deem data(X)
to be streamable if X is crawling.

--- Comment #7 from Michael Kay <[hidden email]> ---
PROPOSAL
========

The proposal is that we distinguish atomization from other absorption
operations. For atomization, we permit the operand to be crawling.

Specifically, we introduce a fifth kind of operand usage, called atomization,
which differs from absorption in that in the general streamability rules, in
the table in 1.b.iii.B, the entry for "Atomization/Crawling" is "Consuming"
rather than "Free-Ranging".

This operand usage would apply whenever the semantics of the operation invoke
atomization. For example: function calls where the required type is atomic; the
data() function; AVTs; the select attribute of xsl:value-of. It would also
apply to the small number of operations that get the string value of a node,
for example string() and string-length(). In fact, it would apply to most cases
where we currently use usage="absorption", with the exception of constructs
like xsl:for-each and xsl:apply-templates and xsl:iterate where the processing
of descendant elements is defined by user-written code rather than built-in
code.

I'm then inclined to rename the existing usage=absorption as usage=consumption,
to preserve one-letter abbreviations for usages, and because there's a clear
link between usage=consumption and sweep=consuming.

A typical implementation will work as follows: when it encounters the start tag
of a selected node, it opens a buffer for the string value of a node, and adds
this buffer to the end of a queue. When it encounters a text node, it copies
the value to all currently open string-value buffers. When it encounters the
end tag for a selected node, it computes the atomized value of that node and
seals the buffer; it then delivers (and dequeues) the atomized value of all
buffers that are sealed and that are not queued behind one that is still open.
The number of open buffers on the queue is determined by the amount of nesting
of selected nodes in the crawling sequence, which in the vast majority of
practical cases will be one; if there are no nested nodes in the crawling
sequence, then each atomic value will be delivered as soon as the end tag for
the corresponding node is encountered.
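
The queue-of-buffers scheme described above can be sketched in Python as follows. This is one possible reading of the description, not the code of any actual processor. It assumes a hypothetical event stream of ("start", name), ("text", value) and ("end", name) tuples, that only elements are selected, and that the input is untyped so the atomized value of an element is simply its string value.

```python
from collections import deque


class Buffer:
    """String-value buffer for one selected node."""

    def __init__(self):
        self.parts = []
        self.sealed = False
        self.value = None


def stream_atomize(events, is_selected):
    """Yield atomized values of selected elements in document order."""
    queue = deque()   # buffers in the order their start tags were seen
    open_stack = []   # one entry per open element; None if not selected
    for kind, payload in events:
        if kind == "start":
            buf = Buffer() if is_selected(payload) else None
            if buf is not None:
                queue.append(buf)
            open_stack.append(buf)
        elif kind == "text":
            # Copy the text to every currently open string-value buffer.
            for buf in open_stack:
                if buf is not None:
                    buf.parts.append(payload)
        elif kind == "end":
            buf = open_stack.pop()
            if buf is not None:
                buf.value = "".join(buf.parts)
                buf.sealed = True
            # Deliver sealed buffers not queued behind one that is still open.
            while queue and queue[0].sealed:
                yield queue.popleft().value


# <doc><title>A<title>B</title>C</title><title>D</title></doc>
events = [
    ("start", "doc"),
    ("start", "title"), ("text", "A"),
    ("start", "title"), ("text", "B"), ("end", "title"),
    ("text", "C"), ("end", "title"),
    ("start", "title"), ("text", "D"), ("end", "title"),
    ("end", "doc"),
]
print(list(stream_atomize(events, lambda name: name == "title")))
```

With the nested title in this sample, the inner value is held back until the outer element closes, so the results still come out in document order. For non-nested selections each value is delivered as soon as its end tag arrives.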

We could extend the same mechanism to all absorption operations on crawling
sequences (for example, xsl:apply-templates and xsl:for-each). The reason I
don't propose doing this is that (a) the amount of data in each buffer is
unbounded (as it depends on user code), and (b) with operations like
apply-templates, as distinct from atomization, it is much more likely that the
result of the crawling expression will actually contain nested nodes.

Michael Kay <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Michael Kay <[hidden email]> ---
The WG today accepted the technical effect of comment 7, but with advice to the
editor to reconsider the presentation and terminology. The word "consumption"
was disliked. There was a suggestion that we could make do with the existing 4
operand usages, changing those that don't fit in (like apply-templates,
for-each, and iterate) so they no longer use the GSR or are in some way handled
as exceptions.

Michael Kay <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #9 from Michael Kay <[hidden email]> ---
I found it was possible to make this change without changing any existing
concepts; apart from changing the relevant entry in the GSR table from
free-ranging to consuming, there is very little impact apart from a few
examples and explanations. Instructions such as for-each, for-each-group, and
iterate did not need to change; they are not affected because they don't rely
on the GSR. The rules for apply-templates need to change to disallow a climbing
or crawling select expression without appeal to the GSRs. The new rules for
calls to streamable user-defined functions already contained this provision.

There is some risk that I didn't find all the incidental places affected by
this change, that is, notes and examples.

--- Comment #10 from Abel Braaksma <[hidden email]> ---
Great to hear the change is smaller than anticipated. Let us know once the
changes are applied, then I'll spend some time to go over them, including the
existing examples and rules.

--- Comment #11 from Abel Braaksma <[hidden email]> ---
Correction to the previous comment: the change was applied to the internal working
draft.

Michael Kay <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |REOPENED
         Resolution|FIXED                       |---

--- Comment #12 from Michael Kay <[hidden email]> ---
Unfortunately this change introduced a bug.

Consider:

<xsl:for-each select="//*">
  <xsl:copy-of select="."/>
</xsl:for-each>

The rules for xsl:for-each say (rule 3)

(a) The posture of the instruction is the posture of the contained sequence
constructor, assessed with the context posture and context item type set to the
posture and type of the select expression.

(b) The sweep of the instruction is the wider of the sweep of the select
expression and the sweep of the contained sequence constructor.

The context posture is crawling, and the posture of xsl:copy-of follows the GSR
with a single operand with posture=crawling, sweep=motionless. As a result of
the change to the table in the GSR, specifically the CRAWLING/ABSORPTION entry,
this is now CONSUMING (previously FREE-RANGING). So the xsl:for-each as a whole
is grounded/consuming, whereas the intent (in comment 7) was that this would
still be roaming/free-ranging.

As far as xsl:for-each is concerned, I think we need to add a rule that if the
select expression is crawling and the body is consuming then the PS is
roaming/free-ranging. Similar changes may also be needed for apply-templates
and for-each-group.

--- Comment #13 from Michael Kay <[hidden email]> ---
Abel pointed out that it might be possible to implement the intent by using the
rules for focus-changing expressions: in effect if the controlling part of a
focus-changing expression is crawling then the controlled part must be
motionless.

--- Comment #14 from Michael Kay <[hidden email]> ---
I looked at the suggestion in comment 13, and I don't think it works in the way
suggested. Generally, focus-changing constructs such as xsl:for-each don't use
the GSR, they have individual rules, and commoning them up would be difficult.
So I think we need to augment the rules for each focus-changing construct.

Specifically:

xsl:for-each (new rule 3): if the select expression is crawling and the
contained sequence constructor is consuming, then the instruction is roaming
and free-ranging

xsl:iterate (new rule 4): if the select expression is crawling and the
contained sequence constructor is consuming, then the instruction is roaming
and free-ranging

xsl:for-each-group (new rule 7): if the select expression is crawling and the
contained sequence constructor is consuming, then the instruction is roaming
and free-ranging

path expressions and simple mapping expressions: no change

xsl:apply-templates: no change.
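
The effect of the new xsl:for-each rule can be sketched in Python. This is a partial model built only from the rules quoted in this thread: the function names are hypothetical, the other numbered rules for xsl:for-each are not modeled, and the caller is assumed to have already assessed the body with the context posture taken from the select expression. The sweep ordering (motionless, then consuming, then free-ranging) is what "wider" refers to.

```python
SWEEP_ORDER = ["motionless", "consuming", "free-ranging"]


def wider_sweep(a, b):
    """Return the wider of two sweeps."""
    return max(a, b, key=SWEEP_ORDER.index)


def for_each_posture_and_sweep(select_posture, select_sweep,
                               body_posture, body_sweep,
                               apply_new_rule=True):
    """Return (posture, sweep) for xsl:for-each under the modeled rules."""
    # New rule: crawling select with a consuming body is not streamable.
    if (apply_new_rule and select_posture == "crawling"
            and body_sweep == "consuming"):
        return ("roaming", "free-ranging")
    # Existing rule: posture of the body, wider of the two sweeps.
    return (body_posture, wider_sweep(select_sweep, body_sweep))


# Comment 12's example: select="//*" is crawling and consuming, and after
# the GSR table change the xsl:copy-of body is assessed as grounded/consuming.
args = ("crawling", "consuming", "grounded", "consuming")
print(for_each_posture_and_sweep(*args, apply_new_rule=False))
print(for_each_posture_and_sweep(*args))
```

Without the new rule the example is assessed as grounded and consuming, which is the bug reported in comment 12. With it, the instruction is roaming and free-ranging as originally intended.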

Michael Kay <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #15 from Michael Kay <[hidden email]> ---
The WG accepted the proposal in comment #14.

Michael Kay <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #16 from Michael Kay <[hidden email]> ---
The changes have been applied.
