[Bug 27257] New: anyURI_b006 seems to be valid

[Bug 27257] New: anyURI_b006 seems to be valid

Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27257

            Bug ID: 27257
           Summary: anyURI_b006 seems to be valid
           Product: XML Schema Test Suite
           Version: 2006-11-06
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Microsoft tests
          Assignee: [hidden email]
          Reporter: [hidden email]
        QA Contact: [hidden email]

Bug 4048 [1] resulted in the expected result for the anyURI_b006 test being
marked "invalid" because "//" (double slash) is considered an invalid URI.
However, according to the reading of rfc2396 [2] presented below, a double
slash should be considered a valid URI.

Section "5. Relative URI References" from rfc2396.txt [2] states that:

   A relative reference beginning with two slash characters is termed a
   network-path reference, as defined by <net_path> in Section 3.  

Section "3. URI Syntactic Components" from rfc2396 [2] states:

      net_path      = "//" authority [ abs_path ]

Section "3.2. Authority Component" from rfc2396 [2] states:

      authority     = server | reg_name

So if the 'server' component can be empty, then '//' should be considered a
valid URI. By the following reasoning, the 'server' component can be empty.

Section "3.2.2. Server-based Naming Authority" from rfc2396 [2] states:

      server        = [ [ userinfo "@" ] hostport ]

That is, according to the BNF rule above, the 'server' component is allowed to
be empty, so '//' can be considered an empty relative network-path reference.
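
As a rough illustration of this reading, the quoted productions can be
transcribed into a regular expression. This is only a sketch: the class name
Rfc2396NetPathSketch and the character classes standing in for userinfo,
hostport and abs_path are simplifying assumptions rather than the full rfc2396
grammar, and 'authority' is narrowed to its 'server' alternative.

```java
import java.util.regex.Pattern;

final class Rfc2396NetPathSketch {
    // Simplified stand-ins (assumptions): the real rfc2396 productions for
    // userinfo, hostport and abs_path are more detailed than these classes.
    static final String USERINFO = "[A-Za-z0-9\\-_.!~*'();:&=+$,%]*";
    static final String HOSTPORT = "[A-Za-z0-9.\\-]+(?::[0-9]*)?";

    // server = [ [ userinfo "@" ] hostport ]
    // The outer brackets make the whole production optional.
    static final String SERVER = "(?:(?:" + USERINFO + "@)?" + HOSTPORT + ")?";

    static final String ABS_PATH = "(?:/[^?#\\s]*)";

    // net_path = "//" authority [ abs_path ], with authority taken as 'server'
    static final Pattern NET_PATH = Pattern.compile("//" + SERVER + ABS_PATH + "?");

    static boolean isNetPath(String s) {
        return NET_PATH.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNetPath("//"));               // empty server, no abs_path
        System.out.println(isNetPath("//example.org/a"));  // server plus abs_path
        System.out.println(isNetPath("/a"));               // not a network-path reference
    }
}
```

Under these assumptions the bare string "//" matches, because both the optional
'server' and the optional abs_path are allowed to be absent.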

I understand that Section 3.2.2 of rfc2396 [2] begins by stating:

   URL schemes that involve the direct use of an IP-based protocol to a
   specified server on the Internet use a common syntax for the server
   component of the URI's scheme-specific data:

      <userinfo>@<host>:<port>

   where <userinfo> may consist of a user name and, optionally, scheme-
   specific information about how to gain authorization to access the
   server. The parts "<userinfo>@" and ":<port>" may be omitted.

So it might seem that from:
1. the definition '<userinfo>@<host>:<port>', and
2. the excerpt above, 'The parts "<userinfo>@" and ":<port>" may be
omitted',
it follows that the '<host>' part is obligatory.
However, section "1.6. Syntax Notation and Common Elements" states:

   This document uses two conventions to describe and define the syntax
   for URI.  The first, called the layout form, is a general description
   of the order of components and component separators, as in

      <first>/<second>;<third>?<fourth>

   The component names are enclosed in angle-brackets and any characters
   outside angle-brackets are literal separators.  Whitespace should be
   ignored.  These descriptions are used informally and do not define
   the syntax requirements.

Note the last sentence: "These descriptions are used informally and do not
define the syntax requirements." Hence I believe no conclusions about syntax
should be drawn from the layout form '<userinfo>@<host>:<port>' of the
'server' component.

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=4048
[2] http://www.ietf.org/rfc/rfc2396.txt

--
You are receiving this mail because:
You are the QA Contact for the bug.

Re: [Bug 27257] New: anyURI_b006 seems to be valid

Henry S. Thompson
bugzilla writes:


> Bug 4048 [1] resulted in marking the expected result for anyURI_b006 test as
> "invalid" because "//" (double slash) is considered as invalid URI. However
> according to reading of rfc2396 [2] presented below double slash should be
> considered as valid URI.

2396 was obsoleted by 3986 [3], whose BNF does _not_ allow the
authority to be empty:

  relative-part = "//" authority path-abempty
  authority     = [ userinfo "@" ] host [ ":" port ]

ht

[3] http://tools.ietf.org/html/rfc3986
--
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: [hidden email]
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]

[Bug 27257] anyURI_b006 seems to be valid

Bugzilla from bugzilla@jessica.w3.org
In reply to this post by Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27257

Henry S. Thompson <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #1 from Henry S. Thompson <[hidden email]> ---
2396 was obsoleted by 3986 [3], whose BNF does _not_ allow the
authority to be empty:

  relative-part = "//" authority path-abempty
  authority     = [ userinfo "@" ] host [ ":" port ]

ht

[3] http://tools.ietf.org/html/rfc3986

[Bug 27257] anyURI_b006 seems to be valid

Bugzilla from bugzilla@jessica.w3.org
In reply to this post by Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27257

--- Comment #2 from Georgiy Rakov <[hidden email]> ---
Yes, but XML Schema Part 2: Datatypes Second Edition [4] references rfc2396
rather than rfc3986.

[4] http://www.w3.org/TR/xmlschema-2/

Georgiy.

[Bug 27257] anyURI_b006 seems to be valid

Bugzilla from bugzilla@jessica.w3.org
In reply to this post by Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27257

--- Comment #3 from Henry S. Thompson <[hidden email]> ---
Indeed it does.  And 2396 says it has been replaced by 3986.  See recent
discussion about 'tight binding' vs. 'loose binding':


http://lists.w3.org/Archives/Public/www-xml-schema-comments/2014OctDec/0004.html

[Bug 27257] anyURI_b006 seems to be valid

Bugzilla from bugzilla@jessica.w3.org
In reply to this post by Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27257

--- Comment #4 from Michael Kay <[hidden email]> ---
I have some sympathy with Georgiy on this one. XSD 1.0 references RFC 2396. The
problem is that RFC 2396 is a mess.

When I raised this as a bug in bug #4048, I was probably influenced by the fact
that the java.net.URI class rejects "//", with the error:

java.net.URISyntaxException: Expected authority at index 2: //
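
The rejection can be reproduced with a few lines of Java. This is a minimal
sketch: the class and method names are made up for illustration, and the
message it prints should be the one quoted above.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class DoubleSlashDemo {
    // Returns null if java.net.URI accepts the string,
    // otherwise the message of the URISyntaxException it throws.
    static String rejection(String s) {
        try {
            new URI(s);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(rejection("//"));
    }
}
```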

I suspect that the designers of class java.net.URI noted that very often when
the RFC mentions the term "authority", it means a non-empty authority. Examples
of this usage are: "A base URI without an authority component", "some URI
schemes do not allow an <authority> component", "If the authority component is
defined".

The Javadoc comments for java.net.URI say:

"This constructor parses the given string exactly as specified by the grammar
in RFC 2396, Appendix A, except for the following deviations:

(1) An empty authority component is permitted as long as it is followed by a
non-empty path, a query component, or a fragment component. This allows the
parsing of URIs such as "file:///foo/bar", which seems to be the intent of RFC
2396 although the grammar does not permit it. If the authority component is
empty then the user-information, host, and port components are undefined.

(2) ..."
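
Deviation (1) can be observed directly. The following is a small sketch with
hypothetical class and method names; it only uses the public java.net.URI
accessors and shows how the parser treats an empty authority that is followed
by a non-empty path, compared with a bare "//".

```java
import java.net.URI;

final class EmptyAuthorityDemo {
    // Summarises how java.net.URI splits a string.
    // URI.create throws IllegalArgumentException on a syntax error.
    static String describe(String s) {
        URI u = URI.create(s);
        return "authority=" + u.getAuthority()
                + " host=" + u.getHost()
                + " port=" + u.getPort()
                + " path=" + u.getPath();
    }

    // True if java.net.URI parses the string without complaint.
    static boolean accepted(String s) {
        try {
            URI.create(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("file:///foo/bar"));
        System.out.println(accepted("//"));
    }
}
```

For "file:///foo/bar" the authority and host come back as null and the port as
-1, which is how the class reports the "undefined" components mentioned in the
Javadoc, while "//" on its own is still rejected.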

So I think the justification for rejecting "//" is the belief that RFC 2396
doesn't mean what it says.

[Bug 27257] anyURI_b006 seems to be valid

Bugzilla from bugzilla@jessica.w3.org
In reply to this post by Bugzilla from bugzilla@jessica.w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27257

--- Comment #5 from Georgiy Rakov <[hidden email]> ---
If I understand correctly, the intention is to treat the reference to rfc2396
in [4] in a 'loose binding' manner (is this correct?). But the W3C spec [4]
doesn't state that its reference to rfc2396 is a 'loose binding'. BTW: rfc2396
doesn't have any references to rfc3986, but even if such a reference existed, I
believe it wouldn't be obvious that it should have a 'superseding' effect when
applied to [4].

So as I see it, there is no normative spec stating that rfc2396 should be
superseded by rfc3986 when applied to the W3C spec [4]. I believe 'tight
binding' is the 'default understanding' (it's closer to a literal
interpretation of the text).

Nor are there any comments saying that rfc2396 should be understood with some
corrections taken into account (as Michael said, rfc2396 is a mess).
