URI templates: comma-separated variable lists

URI templates: comma-separated variable lists

James Manger
draft-gregorio-uritemplate-06.txt allows a comma-separated list of variables in any expression, regardless of the expression operator.

However, a list should not be used with 5 of the 8 operators (+ # . / and no operator) because an expansion will be ambiguous if any variable in the list is undefined. The server cannot tell which value goes with which variable.

Consider if the template "{alpha,beta,gamma}" is expanded to "23,6". The user could have provided alpha=23 beta=6, or alpha=23 gamma=6, or beta=23 gamma=6 (or beta=[23,6] but that is a separate bug).
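
The ambiguity can be reproduced with a minimal sketch of simple (no-operator) list expansion. It assumes that undefined variables are skipped and the remaining values are joined with commas; `expand_simple_list` is a hypothetical helper rather than part of any library, and percent-encoding is left out for brevity.

```python
def expand_simple_list(names, values):
    """Expand a no-operator expression such as {alpha,beta,gamma}.

    Undefined variables (missing or None) are skipped and the remaining
    values are joined with commas. Percent-encoding is omitted.
    """
    defined = [str(values[n]) for n in names if values.get(n) is not None]
    return ",".join(defined)


names = ["alpha", "beta", "gamma"]
candidates = [
    {"alpha": 23, "beta": 6},
    {"alpha": 23, "gamma": 6},
    {"beta": 23, "gamma": 6},
]

# Three different assignments collapse to the same expansion.
expansions = [expand_simple_list(names, c) for c in candidates]
print(expansions)
```

All three candidate assignments yield the same string, so the expansion alone does not say which variable was left undefined.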

A template like "{alpha,beta,gamma}" is almost certainly a mistake by the template author. It only makes any sense if all the variables are mandatory, but in that case it is clearer to use "{alpha},{beta},{gamma}" -- with no chance of ambiguity (and simpler lower-level expressions).

I suggest changing the spec ABNF to only allow variable lists with operators that produce name=value pairs (or drop variable lists entirely).

--
James Manger

Re: URI templates: comma-separated variable lists

Joe Gregorio-2
On Wed, Aug 24, 2011 at 9:50 AM, James Manger <[hidden email]> wrote:
> draft-gregorio-uritemplate-06.txt allows a comma-separated list of variables in any expression, regardless of the expression operator.
>
> However, a list should not be used with 5 of the 8 operators (+ # . / and no operator) because an expansion will be ambiguous if any variable in the list is undefined. The server cannot tell which value goes with which variable.

URI Templates are for expansion and not parsing, so the use case of
trying to figure out which value goes with which variable is not a
supported use case.
If it were then there would be more problems than with just lists. For
example, how would the template

  "{a}{b}"

match this incoming string:

  "foobar"

  a = "f"
  b = "oobar"

or would it be:

  a = "foo"
  b = "bar"

or maybe

  a = "foobar"
  b = ""
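
The full set of candidate matches can be listed with a short sketch. `candidate_splits` is a hypothetical helper that assumes both variables expand with the simple operator and nothing separates them, so every cut point in the string is a possible boundary.

```python
def candidate_splits(text):
    """Return every (a, b) pair whose concatenation equals text."""
    return [(text[:i], text[i:]) for i in range(len(text) + 1)]


splits = candidate_splits("foobar")
for a, b in splits:
    print(f"a = {a!r}, b = {b!r}")
```

A six-character string gives seven candidate pairs, including the three listed above.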

  -joe

--
Joe Gregorio        http://bitworking.org


Re: URI templates: comma-separated variable lists

Kev Burns
See page 12, paragraph 2
http://tools.ietf.org/html/draft-gregorio-uritemplate-06#page-12

Some URI Templates can be used in reverse for the purpose of variable
matching: comparing the template to a fully formed URI in order to
extract the variable parts from that URI and assign them to the named
variables.  Variable matching only works well if the template
expressions are delimited by the beginning or end of the URI or by
characters that cannot be part of the expansion, such as reserved
characters surrounding a simple string expression.  In general,
regular expression languages are better suited for variable matching.

- Kev



RE: URI templates: comma-separated variable lists

Manger, James H
In reply to this post by Joe Gregorio-2
On Wed, Aug 24, 2011 at 9:50 AM, James Manger <[hidden email]> wrote:
>> draft-gregorio-uritemplate-06.txt allows a comma-separated list of variables in any expression, regardless of the expression operator.
>>
>> However, a list should not be used with 5 of the 8 operators (+ # . / and no operator) because an expansion will be ambiguous if any variable in the list is undefined. The server cannot tell which value goes with which variable.

Joe Gregorio responded
> URI Templates are for expansion and not parsing, so the use case of
> trying to figure out which value goes with which variable is not a
> supported use case.

The server will almost always want to parse a URI built from a template.
That doesn't mean the template has to provide *instructions* on how to parse.
The spec correctly notes that regular expressions are generally better
suited as parsing instructions.

The problem with {alpha,beta,gamma} as a template is that NOTHING
(not even a regex) can be used to parse the expansion.
That should be a huge warning sign that this construct is not useful.

The URI templates spec shouldn't go to heroic efforts to try to prevent
template authors from writing templates whose expansions cannot be parsed.
It should not, however, offer constructs that will NEVER be parseable,
despite servers wanting to parse out variable values in 99.9% of cases.


> If it were then there would be more problems than with just lists. For
> example, how would the template
>
>  "{a}{b}"

This is clearly an unusual template because you can immediately see that
parsing is ambiguous. If you see such a template your first thought is that
the template author has made a mistake.
The spec doesn't have to forbid this template as in limited circumstances
it might be reasonable. Perhaps the author knows 'a' will be a 4-digit year
so it can parse the result with the regex "(\d\d\d\d)(.*)".
The parser could be fooled by unexpected 'a' and 'b' values but in some
limited context that might not be an important risk
(eg author, user, processor, and server are mutually trusting).
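
As a rough illustration of that limited case, the sketch below applies the quoted regex to expansions of "{a}{b}". `parse_year_prefixed` is a hypothetical helper, and the input strings are made up for the example.

```python
import re

# Pattern from the text: 'a' is assumed to be a 4-digit year,
# 'b' is whatever follows.
YEAR_THEN_REST = re.compile(r"(\d\d\d\d)(.*)")


def parse_year_prefixed(expansion):
    """Return (a, b) for an expansion of "{a}{b}", or None if it does not fit."""
    match = YEAR_THEN_REST.fullmatch(expansion)
    return match.groups() if match else None


print(parse_year_prefixed("2011report"))  # a="2011", b="report" as intended
print(parse_year_prefixed("201134x"))     # a="20", b="1134x" would also parse
print(parse_year_prefixed("foobar"))      # no leading year, so no match
```

The second call shows how the parser can be fooled: values a="20" and b="1134x" expand to "201134x", which is read back as a="2011" and b="34x".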


I don't want "{a,b}" allowed in templates given that it triggers the
same reaction as "{a}{b}": it is probably a mistake by the author.

--
James Manger


Re: URI templates: comma-separated variable lists

Roy T. Fielding
On Aug 24, 2011, at 5:08 PM, Manger, James H wrote:

> On Wed, Aug 24, 2011 at 9:50 AM, James Manger <[hidden email]> wrote:
>>> draft-gregorio-uritemplate-06.txt allows a comma-separated list of variables in any expression, regardless of the expression operator.
>>>
>>> However, a list should not be used with 5 of the 8 operators (+ # . / and no operator) because an expansion will be ambiguous if any variable in the list is undefined. The server cannot tell which value goes with which variable.
>
> Joe Gregorio responded
>> URI Templates are for expansion and not parsing, so the use case of
>> trying to figure out which value goes with which variable is not a
>> supported use case.
>
> The server will almost always want to parse a URI built from a template.

An origin server will always parse a URI, whether it is built from
a template or not.  There are many resources for which {x,y} is the
template.

In fact, the original server-side imagemaps were map?{x,y}.
Likewise, resources in Apache Sling are all name{.selectors*}
where the selectors are only distinguished by name (not by
order of appearance in the URI).

> That doesn't mean the template has to provide *instructions* on how to parse.
> The spec correctly notes that regular expressions are generally better
> suited as parsing instructions.
>
> The problem with {alpha,beta,gamma} as a template is that NOTHING
> (not even a regex) can be used to parse the expansion.

That simply isn't true.

> That should be a huge warning sign that this construct is not useful.

Or that you have created a paper tiger.  If the values are provided
by the server, they can be parsed.  If the values are unique, they
can be parsed.  If the values are never undefined, they can be parsed.
In short, your argument is false.
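
A minimal sketch of the "never undefined" case, using the {x,y} imagemap template as the example. `parse_fixed_list` is a hypothetical helper; it assumes every variable is always defined and that a comma inside a value would have been percent-encoded during expansion.

```python
from urllib.parse import unquote


def parse_fixed_list(names, expansion):
    """Map an expansion of {n1,n2,...} back to names, assuming all are defined."""
    parts = expansion.split(",")
    if len(parts) != len(names):
        # A different count means some variable was undefined after all.
        raise ValueError(f"expected {len(names)} values, got {len(parts)}")
    return {name: unquote(part) for name, part in zip(names, parts)}


coords = parse_fixed_list(["x", "y"], "10,20")
print(coords)
```

When the count of values does not match the count of names, the helper raises instead of guessing, which is the situation the earlier messages describe as ambiguous.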

If a server has no use for a given template construct, that construct
will not be published by that server.  Each expression does not need
to handle all possible cases -- only the ones that matter to that
resource.  If, for any reason, the server needs to distinguish each
variable syntactically, then it is free to use only those URI Template
expressions that do exactly that.

....Roy



RE: URI templates: comma-separated variable lists

Manger, James H
> There are many resources for which {x,y} is the template.

I am sure there are many resources for which {x},{y} is the template,
which isn't ambiguous when one variable is undefined (in contrast to {x,y}).

> In fact, the original server-side imagemaps were map?{x,y}.

I think you are using {x,y} as a shortcut for {x},{y}.
It does save 2 characters, but that is too small a gain for the
ambiguity introduced if not all the variables are defined.


> Likewise, resources in Apache Sling are all name{.selectors*}
> where the selectors are only distinguished by name (not by
> order of appearance in the URI).

That template has 1 variable (selectors) not a comma-separated list so I am not sure how it is relevant to this issue.

If the selectors were explicitly listed the template might be
  "name{.lang}{.fmt}{.ver}"
or
  "name{.lang,fmt,ver}"
Both offer *identical* semantics.
Not allowing the 2nd loses no functionality.
They both only work if all lang values and all fmt values and all ver values are distinct. That is quite a limiting constraint. I don't think we should offer a minor shortcut to template authors that only works under such limitations.
Authors will take the shortcut (eg mimic an example in the spec) and it will bite them when the limitations don't apply to their context.
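
The equivalence of the two templates can be checked with a small sketch. It assumes the "." operator emits "." before the first defined value and "." between subsequent values, as in the published RFC 6570; `expand_label`, `expand_separate` and `expand_combined` are hypothetical helpers, and percent-encoding is omitted.

```python
from itertools import product

NAMES = ("lang", "fmt", "ver")


def expand_label(names, values):
    """Expand {.n1,n2,...}: skip undefined variables, prefix and join with '.'."""
    defined = [str(values[n]) for n in names if values.get(n) is not None]
    return "." + ".".join(defined) if defined else ""


def expand_separate(values):
    # "name{.lang}{.fmt}{.ver}"
    return "name" + "".join(expand_label([n], values) for n in NAMES)


def expand_combined(values):
    # "name{.lang,fmt,ver}"
    return "name" + expand_label(list(NAMES), values)


# Compare both forms for every combination of defined/undefined variables.
samples = {"lang": "en", "fmt": "html", "ver": "2"}
results = []
for mask in product([True, False], repeat=3):
    values = {n: v for (n, v), keep in zip(samples.items(), mask) if keep}
    results.append((expand_separate(values), expand_combined(values)))

print(expand_combined({"lang": "en", "ver": "2"}))
```

Both forms agree for all eight combinations. With fmt undefined the result is "name.en.2", and nothing in that string says whether "2" was fmt or ver unless the value sets are known to be distinct.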


>> The problem with {alpha,beta,gamma} as a template is that NOTHING
>> (not even a regex) can be used to parse the expansion.

> That simply isn't true.

Ok, it is only true in the general case (when there are no extra constraints on which variables are defined and their values).

Why would an author not use level-1 expressions "{alpha},{beta},{gamma}" instead of a level-3 expression "{alpha,beta,gamma}" when all variables are required? Saving 4 characters is too small a benefit.


Adding a new comma operator {,var} (acting like the . and / operators) would be a more consistent way to cover the corner case where, say, alpha and beta are required but gamma is optional.


>> That should be a huge warning sign that this construct is not useful.

> Or that you have created a paper tiger.  If the values are provided
> by the server, they can be parsed.  If the values are unique, they
> can be parsed.  If the values are never undefined, they can be parsed.
> In short, your argument is false.

Then add a note to the spec saying:
  "The 2-char-per-extra-variable saving that is possible by allowing a
   comma-separated list of variables MUST only be used when the
   template author is certain that (i) the variable values are provided
   by the server, or (ii) all possible values of all the variables are
   unique, or (iii) the variables are never undefined."

With this text, the shortcut doesn't look so appealing.


> If a server has no use for a given template construct, that construct
> will not be published by that server.

True. But it does have to be defined, explained, and illustrated with examples in the spec. It needs to be understood by authors when choosing how to write templates. It needs to be implemented in code. It needs to be handled (and rejected) by tools that automatically convert a template to a regex for parsing where possible.

--
James Manger


Re: URI templates: comma-separated variable lists

Bob Aman-2
In reply to this post by Joe Gregorio-2
The first time you used (.*)(.*) in a regular expression, the solution
wasn't to fix the behavior of the regular expression engine; the
solution was to write a different regular expression. I feel like the
reverse URI template use-case is strongly analogous.
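
A short sketch of the analogy, using Python's `re` module. The second pattern is an assumed rewrite that corresponds to a template like "{a},{b}"; the input strings are illustrative only.

```python
import re

# Two adjacent greedy groups: the first one takes everything.
greedy = re.fullmatch(r"(.*)(.*)", "foobar").groups()

# A different expression with a delimiter between the groups splits cleanly.
delimited = re.fullmatch(r"([^,]*),([^,]*)", "foo,bar").groups()

print(greedy)
print(delimited)
```

The engine behaves the same in both cases; only the expression changed, in the same way that a template author can choose "{a},{b}" over "{a}{b}".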