path-abempty in URI


path-abempty in URI

Tom Petch

I would like clarification about the ABNF in RFC3986.

My understanding from reading the text is that <path-abempty> exists to ensure
that there is always an authority between the // that comes after scheme: and
before the // that may start a path, so that
     scheme://example.com//a?valid=yes
is permissible while
     scheme:////?invalid
is not.  But the ABNF for authority is (selectively)
     authority     = host
     host          = reg-name
     reg-name      = *( unreserved / pct-encoded / sub-delims )
which allows authority to be zero characters so that
     scheme:////SERVERA/////////////////////////?abnf=ok
where the path is //SERVERA///...
is allowed.
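
[Editorial sketch, not part of the original post.] One way to check this reading is to split the three example URIs with the regular expression given in RFC 3986, Appendix B. The function name split_uri and the dictionary keys are assumptions made for illustration. Note that the Appendix B expression only splits a reference into components; it does not validate against the full ABNF.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri(uri):
    """Split a URI into components (hypothetical helper, illustration only)."""
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),  # None when there is no "//" at all
        "path": m.group(5),
        "query": m.group(7),
    }

if __name__ == "__main__":
    for uri in (
        "scheme://example.com//a?valid=yes",
        "scheme:////?invalid",
        "scheme:////SERVERA/////////////////////////?abnf=ok",
    ):
        print(uri, "->", split_uri(uri))
```

For the last two URIs the authority component comes out as the empty string, and everything from the third slash onward lands in the path.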

Comments?

Tom Petch




Re: path-abempty in URI

Frank Ellermann

Tom Petch wrote:
 
> My understanding from reading the text is that <path-abempty>
> exists to ensure that there is always an authority between
> the // that comes after scheme: and before the // that may
> start a path

The "//" after the scheme ":" is only used in conjunction with
hier-part = "//" authority path-abempty

You can also have a scheme ":" with path-absolute, path-empty,
and path-rootless, and in these three cases there's no "//"
after the scheme ":".

1 - path-empty is 0<pchar>, no slashes in sight, next stop "?".
2 - path-absolute is "/" segment-nz etc., segment-nz is 1*pchar
    So here you have precisely one slash after the scheme ":".
3 - path-rootless starts with segment-nz, no "/" after the ":"

That leaves "//" authority path-abempty to get an interesting
number of slashes after the scheme ":".

Ignoring optional parts, <authority> is at least a <host>, and
<host> is IP-literal / IPv4address / reg-name.  I'm positive
that the first two are never empty, but <reg-name> can be empty.

path-abempty is zero or more "/" segment, and segment is zero
or more pchar.  So you can have file:///etc (three slashes),
and in theory also more slashes if the segments are "empty".
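
[Editorial sketch, not part of the original post.] The path rules described above can be transcribed into regular expressions to see which strings each one accepts. The constant names are assumptions made for illustration; the character sets follow the RFC 3986 definitions of unreserved, sub-delims and pct-encoded.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
SEGMENT = PCHAR + "*"      # segment    = *pchar  (may be empty)
SEGMENT_NZ = PCHAR + "+"   # segment-nz = 1*pchar (never empty)

# path-abempty  = *( "/" segment )
PATH_ABEMPTY = re.compile(r"(?:/" + SEGMENT + r")*")
# path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(
    r"/(?:" + SEGMENT_NZ + r"(?:/" + SEGMENT + r")*)?"
)
# path-rootless = segment-nz *( "/" segment )
PATH_ROOTLESS = re.compile(SEGMENT_NZ + r"(?:/" + SEGMENT + r")*")

if __name__ == "__main__":
    for candidate in ("", "/etc", "//SERVERA///", "etc"):
        print(
            repr(candidate),
            "abempty:", bool(PATH_ABEMPTY.fullmatch(candidate)),
            "absolute:", bool(PATH_ABSOLUTE.fullmatch(candidate)),
            "rootless:", bool(PATH_ROOTLESS.fullmatch(candidate)),
        )
```

Only path-abempty accepts a path that begins with two slashes, because its segments may be empty; path-absolute insists on a non-empty first segment after its single leading slash.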

In practice file: is the only URI scheme I know that allows an
empty reg-name (in that case instead of localhost), are there
any other schemes with a similar "feature" ?

>      scheme:////SERVERA/////////////////////////?abnf=ok
> where the path is //SERVERA///... is allowed.  Comments?

Yes, you're right.  I didn't know that, thanks for info, bye





Re: path-abempty in URI

Tom Petch

----- Original Message -----
From: "Frank Ellermann" <[hidden email]>
To: <[hidden email]>
Sent: Saturday, January 07, 2006 7:42 PM
Subject: Re: path-abempty in URI

>
> Tom Petch wrote:
>
> > My understanding from reading the text is that <path-abempty>
> > exists to ensure that there is always an authority between
> > the // that comes after scheme: and before the // that may
> > start a path
>
> The "//" after the scheme ":" is only used in conjunction with
> hier-part = "//" authority path-abempty
>
> You can also have a scheme ":" with path-absolute, path-empty,
> and path-rootless, and in these three cases there's no "//"
> after the scheme ":".
>
> 1 - path-empty is 0<pchar>, no slashes in sight, next stop "?".
> 2 - path-absolute is "/" segment-nz etc., segment-nz is 1*pchar
>     So here you have precisely one slash after the scheme ":".
> 3 - path-rootless starts with segment-nz, no "/" after the ":"
>
> That leaves "//" authority path-abempty to get an interesting
> number of slashes after the scheme ":".
>
> Ignoring optional parts, <authority> is at least a <host>, and
> <host> is IP-literal / IPv4address / reg-name.  I'm positive
> that the first two are never empty, but <reg-name> can be empty.
>
> path-abempty is zero or more "/" segment, and segment is zero
> or more pchar.  So you can have file:///etc (three slashes),
> and in theory also more slashes if the segments are "empty".
>
> In practice file: is the only URI scheme I know that allows an
> empty reg-name (in that case instead of localhost), are there
> any other schemes with a similar "feature" ?
>
> >      scheme:////SERVERA/////////////////////////?abnf=ok
> > where the path is //SERVERA///... is allowed.  Comments?
>
> Yes, you're right.  I didn't know that, thanks for info, bye
>
It was not so much a question of allowing an empty reg-name, as of requiring an
authority to be present, to be of at least one character.  I wanted to check
that my reading of the ABNF for authority in URI was correct, and you say it is.
In which case, importing the rules from URI is not of itself enough, some
additional textual comment is needed so that
    scheme://///////////
while conforming to the ABNF, is not regarded as valid.
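
[Editorial sketch, not part of the original post.] A scheme that wanted this stricter rule could layer a check on top of the generic syntax, roughly as below. The function name has_nonempty_authority is hypothetical, and the splitting expression is the one from RFC 3986, Appendix B. This only illustrates the kind of extra constraint meant here; it is not something the generic URI syntax requires.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def has_nonempty_authority(uri):
    """Scheme-specific extra rule (assumption): authority must be present
    and at least one character long."""
    m = URI_RE.match(uri)
    authority = m.group(4)  # None if no "//", "" if "//" with nothing after
    return bool(authority)

if __name__ == "__main__":
    for uri in (
        "scheme://///////////",
        "scheme://example.com//a?valid=yes",
        "file:///etc/hosts",
    ):
        print(uri, "->", has_nonempty_authority(uri))
```

Such a rule also rejects file:///etc/hosts, so it would have to be stated per scheme rather than applied to URIs in general.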

Of course it is possible to write ABNF to achieve (almost) anything, but
sometimes I see it as better to get there by adding a comment:-)

Tom Petch



Re: path-abempty in URI

Roy T. Fielding

On Jan 7, 2006, at 12:20 PM, Tom Petch wrote:

>>> My understanding from reading the text is that <path-abempty>
>>> exists to ensure that there is always an authority between
>>> the // that comes after scheme: and before the // that may
>>> start a path

yes, but the authority may be empty  (i.e., authority == "").

> It was not so much a question of allowing an empty reg-name, as of
> requiring an authority to be present, to be of at least one character.

It does no such thing.  See <file:///etc/hosts>

> I wanted to check
> that my reading of the ABNF for authority in URI was correct

No, your reading is not correct.

    authority     = [ userinfo "@" ] host [ ":" port ]
    host          = IP-literal / IPv4address / reg-name
    reg-name      = *( unreserved / pct-encoded / sub-delims )

which means that a zero-length reg-name produces an empty authority
and thus is valid both in ABNF and in practice.
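
[Editorial sketch, not part of the original post.] The three rules above can be approximated with a regular expression to confirm that the empty string is a valid authority. The constant names are assumptions, and two simplifications are made and flagged in the comments: IP-literal is reduced to "anything in square brackets", and IPv4address is left out because any dotted quad also matches reg-name syntactically.

```python
import re

UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# reg-name = *( unreserved / pct-encoded / sub-delims )
REG_NAME = r"(?:[" + UNRESERVED + SUB_DELIMS + r"]|" + PCT_ENCODED + r")*"
# userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
USERINFO = r"(?:[" + UNRESERVED + SUB_DELIMS + r":]|" + PCT_ENCODED + r")*"
# port = *DIGIT
PORT = r"[0-9]*"
# Simplification (assumption): the real IP-literal grammar is much stricter.
IP_LITERAL = r"\[[^\]]*\]"
# IPv4address omitted: digits and "." are unreserved, so reg-name covers it.
HOST = r"(?:" + IP_LITERAL + r"|" + REG_NAME + r")"

# authority = [ userinfo "@" ] host [ ":" port ]
AUTHORITY = re.compile(r"(?:" + USERINFO + r"@)?" + HOST + r"(?::" + PORT + r")?")

if __name__ == "__main__":
    for candidate in ("", "example.com", "user@example.com:8080", "a/b"):
        print(repr(candidate), "->", bool(AUTHORITY.fullmatch(candidate)))
```

The empty string is accepted because both optional parts can be absent and reg-name can match zero characters, which is what makes file:///etc/hosts parse with an empty authority.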

....Roy



Re: path-abempty in URI

Christopher R. Hertel
In reply to this post by Frank Ellermann

Frank Ellermann wrote:
:
:
> In practice file: is the only URI scheme I know that allows an
> empty reg-name (in that case instead of localhost), are there
> any other schemes with a similar "feature" ?

The SMB URI scheme (still in draft) allows:  smb://

That form is used to identify the top of the local workgroup hierarchy (so
it would return a list of locally visible workgroups).

Chris -)-----

--
"Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   [hidden email]
OnLineBook -- http://ubiqx.org/cifs/    -)-----   [hidden email]