Questions about ALPN

Questions about ALPN

Mark Nottingham-2
After our recent discussions, I went back and looked at the LC draft of ALPN:
  http://tools.ietf.org/html/draft-ietf-tls-applayerprotoneg-02
and noticed a few things.

For a while now, we've been talking about using the ALPN string for more than just negotiation within TLS; possible uses include in the Upgrade "dance", in the Alt-Svc / Alternate-Protocol header, and possibly within DNS.

However, the range of characters in an ALPN string is broad; in fact, UTF-8 is only a possibility:

"""
   o  Identification Sequence: The precise set of octet values that
      identifies the protocol.  This could be the UTF-8 encoding
      [RFC3629] of the protocol name.
"""

While that's fine for TLS, it becomes problematic in other environments like HTTP headers or DNS, because encoding will be necessary.

Furthermore, there's no ABNF for other specifications to refer to.

I think the range of an ALPN protocol identifier needs to be more constrained, to make it more amenable to reuse. Alternatively, maybe the protocol identifier registry shouldn't be specified by this document; it could be delegated to the URI scheme that is in use, for example. Or, maybe the Port Number registry could be reused (as SVN has done).
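To make the concern concrete, here is a minimal sketch of what a constrained identifier syntax could look like. It assumes, purely for illustration, that identifiers would be limited to the characters HTTP allows in a `token`; the ALPN draft defines no such rule, and the name `is_header_safe` is hypothetical.

```python
import string

# Characters permitted in an HTTP "token"; used here only as an
# illustrative constraint. The ALPN draft imposes no such limit.
TCHAR = frozenset(
    (string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~").encode("ascii")
)

def is_header_safe(protocol_id: bytes) -> bool:
    """Return True if the ID is non-empty and made only of token characters."""
    return len(protocol_id) > 0 and all(octet in TCHAR for octet in protocol_id)

# Try a few candidate identifiers, including a raw octet and a UTF-8 name.
for candidate in (b"spdy3", b"http/1.1", b"\x01", "caf\u00e9".encode("utf-8")):
    print(candidate, is_header_safe(candidate))
```

Under this assumed rule even "http/1.1" fails, because "/" is not a token character, so the pre-registered values would themselves need some form of encoding or quoting outside TLS.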

In some ways, that makes more sense, since ALPN is so TLS-specific; in some of the discussions I've had, people get confused when I talk about ALPN protocol identifiers being used for non-TLS protocols.

Note that I'm not against having ALPN define this registry -- I just think we need to talk about it and understand what the consequences of doing so will be.

Also, the document pre-registers a number of values:

"""
      Protocol: HTTP/1.1
      Identification Sequence: 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31 ("http/1.1")
      Specification:  http://tools.ietf.org/html/rfc2616

      Protocol: SPDY/1
      Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x31 ("spdy/1")
      Specification: http://dev.chromium.org/spdy/spdy-protocol/spdy-protocol-draft1

      Protocol: SPDY/2
      Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x32 ("spdy/2")
      Specification: http://dev.chromium.org/spdy/spdy-protocol/spdy-protocol-draft2

      Protocol: SPDY/3
      Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x33 ("spdy/3")
      Specification:  http://dev.chromium.org/spdy/spdy-protocol/spdy-protocol-draft3
"""

The reference for HTTP/1.1 needs to be draft-ietf-httpbis-p1-messaging, and I don't think that SPDY/1 has ever been used (except for early internal testing by Mike and Roberto, perhaps).
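As a quick check of the registrations quoted above, the following sketch decodes each identification sequence back into its octets. The hex values are copied from the draft text; the helper name `decode_sequence` is made up for this example.

```python
# Identification sequences copied from the draft's initial registrations.
REGISTRATIONS = {
    "HTTP/1.1": "0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31",
    "SPDY/1": "0x73 0x70 0x64 0x79 0x2f 0x31",
    "SPDY/2": "0x73 0x70 0x64 0x79 0x2f 0x32",
    "SPDY/3": "0x73 0x70 0x64 0x79 0x2f 0x33",
}

def decode_sequence(sequence: str) -> bytes:
    """Turn a space-separated list of 0xNN octets into a bytes object."""
    return bytes(int(octet, 16) for octet in sequence.split())

for protocol, sequence in REGISTRATIONS.items():
    print(protocol, decode_sequence(sequence))
```

Each sequence decodes to the lower-case string shown in parentheses in the draft, which differs in case from the protocol name in the first column.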

I'll eventually send this feedback to TLS, but wanted to bring it up here first; please discuss.

Cheers,

--
Mark Nottingham   http://www.mnot.net/





Re: Questions about ALPN

Roberto Peon-2
SPDY/1 was never released externally, and so isn't particularly useful except as an example.

Arguably the string used within ALPN is a subset of the form that might be used in alt-protocol or alt-svc (there is no need to talk about ports, IPs, hosts, transports, etc. within the ALPN string; that choice was already made), and so should probably be defined in a separate document from both the HTTP and ALPN specs.

-=R




Re: Questions about ALPN

Martin Thomson-3
In reply to this post by Mark Nottingham-2
On 14 October 2013 14:19, Mark Nottingham <[hidden email]> wrote:
> While that's fine for TLS, it becomes problematic in other environments like HTTP headers or DNS, because encoding will be necessary.

The question is whether you consider it a problem if ALPN can
negotiate protocols that others cannot.  Or, if you consider upgrade
to be the only place where a mechanism of this nature exists, that
those "others" might be required to perform some escaping, with all
the bugs that entails.

We're defining our strings in the ASCII range, so I didn't see a big
problem with the extra flexibility.


Re: Questions about ALPN

Martin J. Dürst
In reply to this post by Mark Nottingham-2


On 2013/10/15 6:19, Mark Nottingham wrote:

> After our recent discussions, I went back and looked at the LC draft of ALPN:
>    http://tools.ietf.org/html/draft-ietf-tls-applayerprotoneg-02
> and noticed a few things.
>
> For a while now, we've been talking about using the ALPN string for more than just negotiation within TLS; possible uses include in the Upgrade "dance", in the Alt-Svc / Alternate-Protocol header, and possibly within DNS.
>
> However, the range of characters in an ALPN string is broad; in fact, UTF-8 is only a possibility:
>
> """
>     o  Identification Sequence: The precise set of octet values that
>        identifies the protocol.  This could be the UTF-8 encoding
>        [RFC3629] of the protocol name.
> """

I have no idea why arbitrary octet values would be needed here. If
anybody knows, please explain.

I'm generally well known for strongly supporting UTF-8 in many places,
but I have difficulty seeing why we'd need anything but ASCII (without
most punctuation, ...) for something like protocol identifiers.

Regards,   Martin.


Re: Questions about ALPN

Andrei Popov
In reply to this post by Mark Nottingham-2

Hi Mark,

 

SPDY and HTTP/1.1 protocol IDs are defined in the ALPN draft so that ALPN can be immediately used to negotiate the same protocols as NPN. We had originally defined an HTTP/2 protocol ID as well, but removed it from the latest versions of the draft so that the HTTPbis WG can specify the appropriate protocol ID(s).

 

ALPN protocol IDs are defined as opaque non-empty octet strings, which allows for a compact representation of a large number of IDs. This does not prevent the HTTPbis WG from defining US-ASCII or UTF-8 protocol IDs for use on the Web. Do you have concerns about the representation of protocol IDs in ALPN?
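A minimal sketch of how such opaque IDs might be serialized, assuming the extension carries them as a list with a 2-byte total length in which each ID has its own 1-byte length prefix (so each ID is 1 to 255 octets). The function name is hypothetical, and the layout should be checked against the draft before relying on it.

```python
import struct

def encode_protocol_name_list(protocol_ids):
    """Serialize IDs as a 2-byte total length followed by each ID
    prefixed with its own 1-byte length (assumed layout)."""
    body = bytearray()
    for protocol_id in protocol_ids:
        if not 1 <= len(protocol_id) <= 255:
            raise ValueError("each protocol ID must be 1..255 octets long")
        body.append(len(protocol_id))  # 1-byte length prefix
        body += protocol_id            # the opaque octets, unchanged
    return struct.pack("!H", len(body)) + bytes(body)

wire = encode_protocol_name_list([b"spdy/3", b"http/1.1"])
print(wire.hex())
```

Nothing in this encoding inspects the octets themselves, which is why any non-empty octet string, printable or not, can be carried.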

 

Cheers,

 

Andrei


Re: Questions about ALPN

Mark Nottingham-2
In reply to this post by Mark Nottingham-2

On 16/10/2013, at 9:41 AM, Andrei Popov <[hidden email]> wrote:

> Hi Mark,
>  
> SPDY and HTTP/1.1 protocol IDs are defined in the ALPN draft so that ALPN can be immediately used to negotiate the same protocols as NPN. We had originally defined an HTTP/2 protocol ID as well, but removed it from the latest versions of the draft so that the HTTPbis WG can specify the appropriate protocol ID(s).
>  
> ALPN protocol IDs are defined as opaque non-empty octet strings, which allows for a compact representation of a large number of IDs. This does not prevent the HTTPbis WG from defining US-ASCII or UTF-8 protocol IDs for use on the Web.

Hi Andrei,

I understand; just a bit uncomfortable with using a registry that allows some fields that aren't valid. When we specify the header, we'll either have to specify an encoding (which probably won't get implemented), or just say that values outside a certain range are unusable.

Our intent is to have multiple ways to convey these ids (e.g., in DNS, in headers), which means there needs to be coordination about what syntactic limits they have. The normal place to do that is in the registry requirements.
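The two options for a header (specify an encoding, or declare some values unusable) can be sketched as follows. Percent-encoding is just one possible choice, picked here as an assumption; all function names are hypothetical.

```python
import string
from urllib.parse import unquote_to_bytes

# Octets emitted literally: HTTP token characters minus "%", which is
# reserved here as the escape character (an assumption of this sketch).
LITERAL = frozenset(
    (string.ascii_letters + string.digits + "!#$&'*+-.^_`|~").encode("ascii")
)

def to_header_form(protocol_id: bytes) -> str:
    """Option 1: percent-encode every octet that cannot appear literally."""
    return "".join(
        chr(octet) if octet in LITERAL else "%{:02X}".format(octet)
        for octet in protocol_id
    )

def from_header_form(text: str) -> bytes:
    """Reverse to_header_form, recovering the original octets."""
    return unquote_to_bytes(text)

def to_header_form_strict(protocol_id: bytes) -> str:
    """Option 2: refuse any ID that would need encoding."""
    if not protocol_id or any(octet not in LITERAL for octet in protocol_id):
        raise ValueError("protocol ID is unusable in a header")
    return protocol_id.decode("ascii")

print(to_header_form(b"http/1.1"))
print(to_header_form(b"\x01"))
```

Either way, every consumer of the registry has to agree on the same rule, which is the coordination problem described above.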

Of course, we can work around this with some effort; it's just odd. Are there other strong motivating use cases that require this flexibility?

Cheers,


--
Mark Nottingham   http://www.mnot.net/





Re: Questions about ALPN

Amos Jeffries-2
In reply to this post by Andrei Popov
On 19/10/2013 7:12 a.m., Andrei Popov wrote:

>
> Hi Mark,
>
> SPDY and HTTP/1.1 protocol IDs are defined in the ALPN draft so that
> ALPN can be immediately used to negotiate the same protocols as NPN.
> We had originally defined an HTTP/2 protocol ID as well, but removed
> it from the latest versions of the draft so that the HTTPbis WG can
> specify the appropriate protocol ID(s).
>
> ALPN protocol IDs are defined as opaque non-empty octet strings, which
> allows for a compact representation of a large number of IDs. This
> does not prevent the HTTPbis WG from defining US-ASCII or UTF-8
> protocol IDs for use on the Web. Do you have concerns about the
> representation of protocol IDs in ALPN?
>

In addition to the worry about the ALPN draft defining any IDs at all, there is the
issue of case-sensitivity for the ones it does list.
When I last read the draft it most definitely did NOT specify SPDY and
HTTP/1.1; it specified some non-existent spdy/* and http/1.1 tokens.

Has that been corrected?

Amos


RE: Questions about ALPN

Andrei Popov
In reply to this post by Mark Nottingham-2
While ALPN was introduced primarily to enable the negotiation of HTTP and SPDY protocol versions within the TLS handshake, the intent is for other (non-Web) applications to be able to negotiate their protocols using the same TLS extension. It is conceivable that different applications will prefer different representations of their protocol IDs. Even on this thread, I believe both US-ASCII and UTF-8 protocol IDs have been mentioned. From this perspective, having the flexibility at the TLS layer appears beneficial. Treating application protocol IDs as opaque octet strings also allows efficient protocol ID matching at the TLS layer.
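The matching step can be sketched in a few lines. This assumes, for illustration only, that the server picks according to its own preference order; `select_protocol` is a hypothetical name, not an API from any TLS library.

```python
def select_protocol(server_preferences, client_offers):
    """Pick the first server-preferred ID that the client also offered.

    IDs are compared as opaque bytes: no decoding, no case folding.
    """
    offered = set(client_offers)
    for protocol_id in server_preferences:
        if protocol_id in offered:
            return protocol_id
    return None  # no protocol in common

chosen = select_protocol([b"spdy/3", b"http/1.1"], [b"http/1.1", b"spdy/2"])
print(chosen)
```

Because the comparison is on raw octets, an offer of b"HTTP/1.1" would not match a server configured with b"http/1.1".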

Cheers,

Andrei

-----Original Message-----
From: Mark Nottingham [mailto:[hidden email]]
Sent: Friday, October 18, 2013 6:18 PM
To: Andrei Popov
Cc: [hidden email]
Subject: Re: Questions about ALPN


On 16/10/2013, at 9:41 AM, Andrei Popov <[hidden email]> wrote:

> Hi Mark,
>  
> SPDY and HTTP/1.1 protocol IDs are defined in the ALPN draft so that ALPN can be immediately used to negotiate the same protocols as NPN. We had originally defined an HTTP/2 protocol ID as well, but removed it from the latest versions of the draft so that the HTTPbis WG can specify the appropriate protocol ID(s).
>  
> ALPN protocol IDs are defined as opaque non-empty octet strings, which allows for a compact representation of a large number of IDs. This does not prevent the HTTPbis WG from defining US-ASCII or UTF-8 protocol IDs for use on the Web.

Hi Andrei,

I understand; just a bit uncomfortable with using a registry that allows some fields that aren't valid. When we specify the header, we'll either have to specify an encoding (which probably won't get implemented), or just say that values outside a certain range are unusable.

Our intent is to have multiple ways to convey these ids (e.g., in DNS, in headers), which means there needs to be coordination about what syntactic limits they have. The normal place to do that is in the registry requirements.

Of course, we can work around this with some effort; it's just odd. Are there other strong motivating use cases that require this flexibility?

Cheers,


--
Mark Nottingham   http://www.mnot.net/





RE: Questions about ALPN

Andrei Popov
In reply to this post by Amos Jeffries-2
Hi Amos,

The ALPN draft currently specifies the following initial set of registrations:
      Protocol: HTTP/1.1
      Identification Sequence: 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31 ("http/1.1")

      Protocol: SPDY/1
      Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x31 ("spdy/1")

      Protocol: SPDY/2
      Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x32 ("spdy/2")

      Protocol: SPDY/3
      Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x33 ("spdy/3")

These protocol IDs are currently used in a number of deployed implementations, so they would have to be preserved even if their upper-case versions were also registered. To avoid confusion, I would recommend against registering upper-case versions of these protocol IDs.

Thanks,

Andrei

-----Original Message-----
From: Amos Jeffries [mailto:[hidden email]]
Sent: Friday, October 18, 2013 7:50 PM
To: [hidden email]
Subject: Re: Questions about ALPN

On 19/10/2013 7:12 a.m., Andrei Popov wrote:

>
> Hi Mark,
>
> SPDY and HTTP/1.1 protocol IDs are defined in the ALPN draft so that
> ALPN can be immediately used to negotiate the same protocols as NPN.
> We had originally defined an HTTP/2 protocol ID as well, but removed
> it from the latest versions of the draft so that the HTTPbis WG can
> specify the appropriate protocol ID(s).
>
> ALPN protocol IDs are defined as opaque non-empty octet strings, which
> allows for a compact representation of a large number of IDs. This
> does not prevent the HTTPbis WG from defining US-ASCII or UTF-8
> protocol IDs for use on the Web. Do you have concerns about the
> representation of protocol IDs in ALPN?
>

In addition to the worry about the ALPN draft defining any IDs at all, there is the issue of case-sensitivity for the ones it does list.
When I last read the draft it most definitely did NOT specify SPDY and
HTTP/1.1; it specified some non-existent spdy/* and http/1.1 tokens.

Has that been corrected?

Amos



Re: Questions about ALPN

Martin Thomson-3
On 21 October 2013 16:48, Andrei Popov <[hidden email]> wrote:
>       Protocol: SPDY/1
>       Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x31 ("spdy/1")


Presumably, SPDY/1 doesn't exist in any real-world implementation,
since it was never deployed.


Re: Questions about ALPN

Roberto Peon-2
correct.
-=R




RE: Questions about ALPN

Andrei Popov

If folks feel strongly about removing SPDY/1 from the list of initial registrations, I don’t mind making this change.

 

Cheers,

 

Andrei

 

From: Roberto Peon [mailto:[hidden email]]
Sent: Monday, October 21, 2013 8:49 PM
To: Martin Thomson
Cc: Andrei Popov; Amos Jeffries; [hidden email]
Subject: Re: Questions about ALPN

 

correct.

-=R

 

On Mon, Oct 21, 2013 at 8:07 PM, Martin Thomson <[hidden email]> wrote:

On 21 October 2013 16:48, Andrei Popov <[hidden email]> wrote:
>       Protocol: SPDY/1
>       Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x31 ("spdy/1")

Presumably, SPDY/1 doesn't exist in any real-world implementation,
since it was never deployed.

 


Re: Questions about ALPN

Amos Jeffries-2
In reply to this post by Andrei Popov
On 22/10/2013 12:48 p.m., Andrei Popov wrote:

> Hi Amos,
>
> The ALPN draft currently specifies the following initial set of registrations:
>        Protocol: HTTP/1.1
>        Identification Sequence: 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31 ("http/1.1")
>
>        Protocol: SPDY/1
>        Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x31 ("spdy/1")
>
>        Protocol: SPDY/2
>        Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x32 ("spdy/2")
>
>        Protocol: SPDY/3
>        Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x33 ("spdy/3")
>
> These protocol IDs are currently used in a number of deployed implementations, so they would have to be preserved even if their upper-case versions were also registered. To avoid confusion, I would recommend against registering upper-case versions of these protocol IDs.

https://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/p1-messaging.html#http.version 
clearly states the octets for the protocol name (HTTP-name) are
%x48.54.54.50.
RFC 2068 and 2616 are not quite so loud about it: the BNF definition is
written in upper case ("HTTP" "/" ...) and only permits lower-case to
match against the upper-case definition. In practice the implementations
use upper-case.

"http/1.1" is a malformed version for the RFC 2068/2616 HTTP-version,
existing only within the fluffy, tolerant BNF interpretation. That
tolerance is in the process of being removed from the HTTP/1.1
specifications, as the draft linked above shows.
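A small sketch of the strict rule, assuming the grammar HTTP-version = HTTP-name "/" DIGIT "." DIGIT with HTTP-name fixed to the octets %x48.54.54.50. The helper name `is_http_version` is made up for this example.

```python
import re

# HTTP-name is the four octets 0x48 0x54 0x54 0x50 ("HTTP"), matched
# case-sensitively, followed by "/" DIGIT "." DIGIT.
HTTP_VERSION = re.compile(rb"HTTP/[0-9]\.[0-9]")

def is_http_version(value: bytes) -> bool:
    """Return True only for an exact, case-sensitive HTTP-version."""
    return HTTP_VERSION.fullmatch(value) is not None

print(is_http_version(b"HTTP/1.1"))
print(is_http_version(b"http/1.1"))
```

Under this rule the ALPN identification sequence "http/1.1" is not a valid HTTP-version, which is the mismatch being pointed out.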

When the registries containing the HTTP version name token are considered,
ALPN is clearly the odd one out.

I would suggest that it is far easier for the few existing ALPN
implementations to update themselves (having already demonstrated high
speed of rollout) with a new draft or even a formal RFC version in a
backward-compatible way than it will be to get uncounted numbers of HTTP
implementations to add duplicate parsing code for accepting the ALPN
lower-case token, or possibly to violate the updated HTTP
specification in the process if they choose to re-implement existing
parser code case-insensitively.

Amos


RE: Questions about ALPN

Andrei Popov
I totally agree that the name of the HTTP protocol is all uppercase, and this is how this protocol is named in the ALPN protocol IDs registry:
"Protocol: HTTP/1.1".

ALPN protocol IDs, on the other hand, are opaque octet strings, not intended for display to the user, not required to consist of printable characters, not required to match names of protocols. E.g. 0x01 could be a valid ALPN protocol ID.

In the early days of ALPN, a combination of printable characters happened to be chosen as the protocol ID for HTTP/1.1:
"Identification Sequence: 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31 ("http/1.1")".

The use of printable characters has some advantages (sometimes makes network capture analysis easier) and disadvantages (may result in more bytes on the wire). We may now wish that a different protocol ID had been chosen, we may even register alternative protocol IDs for HTTP/1.1. However, an interoperable implementation will have to keep sending the old protocol ID for years to come. Based on this, I believe that registering alternative protocol IDs for HTTP/1.1 will create more confusion and inefficiency than keeping the current (possibly imperfect) protocol ID.
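The size trade-off is simple arithmetic. The sketch below assumes each ID is carried with a 1-byte length prefix; `wire_cost` is a hypothetical helper.

```python
def wire_cost(protocol_id: bytes) -> int:
    """Octets one ID occupies in the list: an assumed 1-byte length
    prefix plus the ID itself."""
    return 1 + len(protocol_id)

printable = wire_cost(b"http/1.1")  # readable in a packet capture
compact = wire_cost(b"\x01")        # opaque single-octet ID
print(printable, compact, printable - compact)
```

The printable form costs a handful of extra octets per offered protocol, in exchange for being readable in a network capture.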

Cheers,

Andrei

-----Original Message-----
From: Amos Jeffries [mailto:[hidden email]]
Sent: Wednesday, October 23, 2013 6:30 AM
To: [hidden email]
Subject: Re: Questions about ALPN

On 22/10/2013 12:48 p.m., Andrei Popov wrote:

> Hi Amos,
>
> The ALPN draft currently specifies the following initial set of registrations:
>        Protocol: HTTP/1.1
>        Identification Sequence: 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31 ("http/1.1")
>
>        Protocol: SPDY/1
>        Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x31 ("spdy/1")
>
>        Protocol: SPDY/2
>        Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x32 ("spdy/2")
>
>        Protocol: SPDY/3
>        Identification Sequence: 0x73 0x70 0x64 0x79 0x2f 0x33 ("spdy/3")
>
> These protocol IDs are currently used in a number of deployed implementations, so they would have to be preserved even if their upper-case versions were also registered. To avoid confusion, I would recommend against registering upper-case versions of these protocol IDs.

https://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/p1-messaging.html#http.version
clearly states the octets for the protocol name (HTTP-name) are %x48.54.54.50.
RFC 2068 and 2616 are not quite so loud about it: the BNF definition is written in upper case ("HTTP" "/" ...) and only permits lower-case to match against the upper-case definition. In practice the implementations use upper-case.

"http/1.1" is a malformed version for the RFC 2068/2616 HTTP-version, existing only within the fluffy, tolerant BNF interpretation. That tolerance is in the process of being removed from the HTTP/1.1 specifications, as the draft linked above shows.

When the registries containing the HTTP version name token are considered, ALPN is clearly the odd one out.

I would suggest that it is far easier for the few existing ALPN implementations to update themselves (having already demonstrated high speed of rollout) with a new draft or even a formal RFC version in a backward-compatible way than it will be to get uncounted numbers of HTTP implementations to add duplicate parsing code for accepting the ALPN lower-case token, or possibly to violate the updated HTTP specification in the process if they choose to re-implement existing parser code case-insensitively.

Amos



Re: Questions about ALPN

Amos Jeffries-2
On 24/10/2013 11:25 a.m., Andrei Popov wrote:
> I totally agree that the name of the HTTP protocol is all uppercase, and this is how this protocol is named in the ALPN protocol IDs registry:
> "Protocol: HTTP/1.1".
>
> ALPN protocol IDs, on the other hand, are opaque octet strings, not intended for display to the user, not required to consist of printable characters, not required to match names of protocols. E.g. 0x01 could be a valid ALPN protocol ID.

Then please make it opaque and not something that to the human eye reads
like an invalid protocol token from the IANA protocol registry.

> In the early days of ALPN, a combination of printable characters happened to be chosen as the protocol ID for HTTP/1.1:
> "Identification Sequence: 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31 ("http/1.1")".

"early days" wow. All of 12 months ago.

Which consultation process made that decision BTW?
* it does not appear to have included the official HTTP protocol
working group (HTTPbis).
* neither does it seem to have involved anyone from IANA, since they
would likely have pointed you at the official registry of standardized
protocol names http://www.ietf.org/assignments/service-names/ - this is
THE list of protocols.


> The use of printable characters has some advantages (sometimes makes network capture analysis easier) and disadvantages (may result in more bytes on the wire).

Are you trying to say "HTTP" are not equally printable characters to the
ones chosen, without sounding like an idiot?

>   We may now wish that a different protocol ID had been chosen, we may even register alternative protocol IDs for HTTP/1.1. However, an interoperable implementation will have to keep sending the old protocol ID for years to come.

Completely wrong. Handle the old specifiers from experimental ALPN draft
implementers as you would future undefined or unsupported protocol tags,
and upgrade the implementations to conform to whatever ALPN document
becomes an RFC.

You are going to have to do that anyway. Yet you seem to be treating this as
if Draft status means the document is set in stone. If the
implementations can't handle changes they are already broken.

>   Based on this, I believe that registering alternative protocol IDs for HTTP/1.1 will create more confusion and inefficiency than keeping the current (possibly imperfect) protocol ID.

... and that is exactly what is being done with this new lower-cased
protocol name, which differs from how things have been done for 20 years
or more.

Amos


Re: Questions about ALPN

Martin J. Dürst
In reply to this post by Andrei Popov


On 2013/10/22 6:03, Andrei Popov wrote:
> While ALPN was introduced primarily to enable the negotiation of HTTP and SPDY protocol versions within the TLS handshake, the intent is for other (non-Web) applications to be able to negotiate their protocols using the same TLS extension. It is conceivable that different applications will prefer different representations of their protocol IDs.

Yes. If some of them, like HTTP, prefer upper-case because it has always
been upper-case, and others prefer lower-case, that's not going to be a
problem.

> Even on this thread, I believe both US-ASCII and UTF-8 protocol IDs have been mentioned.

I saw UTF-8 mentioned as a theoretical idea, but I didn't see any actual
UTF-8 example. Please tell me in case I missed it.

More strongly, as I have said before, I think for *protocol*
identifiers, UTF-8 is entirely and completely unnecessary.

> From this perspective, having the flexibility at the TLS layer appears beneficial.

Flexibility is good, too much flexibility is bad.

> Treating application protocol IDs as opaque octet strings also allows efficient protocol ID matching at the TLS layer.

There's a huge difference between *allowing* arbitrary octet strings
(which is completely unnecessary and actually problematic if you have
octets e.g. in the C0 range show up in displays) and *comparing* them
octet-by-octet (which is good for efficiency).

So please fix them to say that they are limited to printable ASCII and
are compared byte-by-byte. That will be flexible enough without being
too flexible, and efficient on top of it.
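The proposal separates two things, which the sketch below keeps as two functions: a restriction on which octets are allowed, and a plain octet-by-octet comparison. Treating "printable ASCII" as 0x21 through 0x7E (excluding space) is an assumption of this example, and both function names are hypothetical.

```python
def is_printable_ascii(protocol_id: bytes) -> bool:
    """True if the ID is non-empty and every octet is in 0x21..0x7E."""
    return len(protocol_id) > 0 and all(
        0x21 <= octet <= 0x7E for octet in protocol_id
    )

def same_protocol(a: bytes, b: bytes) -> bool:
    """Octet-by-octet comparison: no case folding, no normalization."""
    return a == b

print(is_printable_ascii(b"http/1.1"))          # allowed
print(is_printable_ascii(b"\x1b[2J"))           # contains a C0 control octet
print(same_protocol(b"HTTP/1.1", b"http/1.1"))  # distinct identifiers
```

The restriction keeps control octets out of logs and displays, while the comparison stays as cheap as it is for fully opaque strings.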

Regards,   Martin.


Re: Questions about ALPN

Joseph Salowey (jsalowey)
While the primary motivation for ALPN is the HTTP work, it may not be the only consumer, and we did not see the need to restrict the possible values. If the work associated with HTTP wants to restrict the values that it uses, then that can be done within the context of that work. Other usages would still be free to define values that are expressed in other ways. I think this design allows for flexibility so we do not have to define an extension for each usage.

I still don't see why allowing additional representations beyond what HTTP wants to use is problematic, as long as we can represent what HTTP needs. You will still need to handle unwanted character sets, since implementations may choose not to follow the specification for malicious or benign reasons.
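Handling unexpected input defensively might look like the sketch below. It assumes the list layout of a 2-byte total length followed by IDs that each carry a 1-byte length prefix; the function name is hypothetical.

```python
def parse_protocol_name_list(data: bytes):
    """Parse a length-prefixed list of protocol IDs, rejecting malformed input."""
    if len(data) < 2:
        raise ValueError("truncated list length")
    declared = int.from_bytes(data[:2], "big")
    body = data[2:]
    if declared != len(body):
        raise ValueError("list length does not match payload")
    ids = []
    offset = 0
    while offset < len(body):
        length = body[offset]
        offset += 1
        if length == 0:
            raise ValueError("empty protocol ID")
        if offset + length > len(body):
            raise ValueError("protocol ID runs past end of list")
        ids.append(body[offset:offset + length])
        offset += length
    return ids

offers = parse_protocol_name_list(b"\x00\x10\x06spdy/3\x08http/1.1")
print(offers)
```

After parsing, IDs that the receiver does not recognize can simply be ignored, whatever octets they contain, since they never need to be decoded as text.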

Cheers,

Joe



On Oct 26, 2013, at 3:30 PM, "Martin J. Dürst" <[hidden email]>
 wrote:

>
>
> On 2013/10/22 6:03, Andrei Popov wrote:
>> While ALPN was introduced primarily to enable the negotiation of HTTP and SPDY protocol versions within the TLS handshake, the intent is for other (non-Web) applications to be able to negotiate their protocols using the same TLS extension. It is conceivable that different applications will prefer different representations of their protocol IDs.
>
> Yes. If some of them, like HTTP, prefer upper-case because it has always been upper-case, and others prefer lower-case, that's not going to be a problem.
>
>> Even on this thread, I believe both US-ASCII and UTF-8 protocol IDs have been mentioned.
>
> I saw UTF-8 mentioned as a theoretical idea, but I didn't see any actual UTF-8 example. Please tell me in case I missed it.
>
> More strongly, as I have said before, I think for *protocol* identifiers, UTF-8 is entirely and completely unnecessary.
>
>> From this perspective, having the flexibility at the TLS layer appears beneficial.
>
> Flexibility is good, too much flexibility is bad.
>
>> Treating application protocol IDs as opaque octet strings also allows efficient protocol ID matching at the TLS layer.
>
> There's a huge difference between *allowing* arbitrary octet strings (which is completely unnecessary and actually problematic if you have octets e.g. in the C0 range show up in displays) and *comparing* them octet-by-octet (which is good for efficiency).
>
> So please fix them to say that they are limited to printable ASCII and are compared byte-by-byte. That will be flexible enough without being too flexible, and efficient on top of it.
>
> Regards,   Martin.
>



Re: Questions about ALPN

Amos Jeffries-2
On 28/10/2013 3:24 p.m., Joseph Salowey (jsalowey) wrote:
> While the primary motivation for ALPN is the HTTP work, it may not be the only consumer and we did not see the need to restrict the possible values.   If the work associated to HTTP wants to restrict the values that they use then that can be done within the context of that work.  Other usages would still be free to define values that are expressed in other ways.   I think this design allows for flexibility so we do not have to define an extension for each usage.

The comparison methodology Martin is proposing has nothing HTTP-specific
about it. It is simply the most flexible and cross-protocol compatible
definition you can use.

Case-insensitivity and UTF-8, which have both been put forward as
properties of the ALPN token, are in fact major *limitations* on what can
be done. UTF-8 implies all the octet mapping between language variants
and alphabets. Case-insensitivity implies mapping between octet case
values even in 7-bit ASCII.

Defining it as opaque 8-bit content with octet-by-octet/byte-by-byte
comparison is *The* most flexible definition to use by far.

> I still don't see a reason why allowing additional representations beyond what HTTP wants to use as problematic as long as we can represent what HTTP needs.  You will still need to handle unwanted character sets since implementations may choose not to follow the specification for malicious and benign reasons.

Representations beyond what HTTP wants are not the issue. Preventing
other non-HTTP definitions from "accidentally" mapping to the HTTP token
*is* a problem. Going past octet-by-octet comparison to anything more
complex introduces a potential for mapping errors and adds needless
complexity.

Amos


> Cheers, Joe
>
> On Oct 26, 2013, at 3:30 PM, "Martin J. Dürst" <[hidden email]> wrote:

>>
>> On 2013/10/22 6:03, Andrei Popov wrote:
>>> While ALPN was introduced primarily to enable the negotiation of HTTP and SPDY protocol versions within the TLS handshake, the intent is for other (non-Web) applications to be able to negotiate their protocols using the same TLS extension. It is conceivable that different applications will prefer different representations of their protocol IDs.
>> Yes, If some of them, like HTTP, prefer upper-case because it has always been upper-case, and others prefer lower-case, that's not going to be a problem.
>>
>>> Even on this thread, I believe both US-ASCII and UTF-8 protocol IDs have been mentioned.
>> I saw UTF-8 mentioned as a theoretical idea, but I didn't see any actual UTF-8 example. Please tell me in case I missed it.
>>
>> More strongly, as I have said before, I think for *protocol* identifiers, UTF-8 is entirely and completely unnecessary.
>>
>>>  From this perspective, having the flexibility at the TLS layer appears beneficial.
>> Flexibility is good, too much flexibility is bad.
>>
>>> Treating application protocol IDs as opaque octet strings also allows efficient protocol ID matching at the TLS layer.
>> There's a huge difference between *allowing* arbitrary octet strings (which is completely unnecessary and actually problematic if you have octets e.g. in the C0 range show up in displays) and *comparing* them octet-by-octet (which is good for efficiency).
>>
>> So please fix them to say that they are limited to printable ASCII and are compared byte-by-byte. That will be flexible enough without being too flexible, and efficient on top of it.
>>
>> Regards,   Martin.
>>
>



Re: Questions about ALPN

Mark Nottingham-2
In reply to this post by Joseph Salowey (jsalowey)
Joe,

The crux of the matter for me is that we're talking about defining parallel mechanisms that also use this registry, but have syntactic constraints -- to wit, DNS, and HTTP headers.

That means we'll need to either define an encoding mechanism to allow these theoretical non-ASCII protocol identifiers to be used, or we'll have to say that certain protocol identifiers just won't work. Neither seems like a great solution.
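
As a rough illustration of the first option, the sketch below percent-encodes every octet that is not a plain HTTP token character. This is a hypothetical scheme made up for this example, not something any draft defines. The function names and the exact safe-character set are assumptions.

```python
import string

# Assumed safe set: HTTP token characters, minus "%" which is used
# here as the escape character. This choice is illustrative only.
TOKEN_SAFE = frozenset(
    (string.ascii_letters + string.digits + "!#$&'*+-.^_`|~").encode("ascii")
)


def encode_for_header(protocol_id: bytes) -> str:
    """Percent-encode every octet outside the assumed safe set."""
    return "".join(
        chr(b) if b in TOKEN_SAFE else "%{:02X}".format(b)
        for b in protocol_id
    )


def decode_from_header(text: str) -> bytes:
    """Reverse encode_for_header; raises ValueError on a bad escape."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            # Two hex digits follow the escape character.
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)
```

Under these assumptions even "http/1.1" needs an escape for the "/", and every consumer (HTTP headers, DNS, anything else) would need its own agreed encoding plus a rule for comparing encoded forms.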

Let's push on it the other way, though: has anyone stated a requirement for non-ASCII protocol identifiers? We've talked about the downside of so much flexibility -- what does it buy us?

Cheers,



On 28/10/2013, at 1:24 PM, "Joseph Salowey (jsalowey)" <[hidden email]> wrote:

> While the primary motivation for ALPN is the HTTP work, it may not be the only consumer and we did not see the need to restrict the possible values.   If the work associated to HTTP wants to restrict the values that they use then that can be done within the context of that work.  Other usages would still be free to define values that are expressed in other ways.   I think this design allows for flexibility so we do not have to define an extension for each usage.  
>
> I still don't see a reason why allowing additional representations beyond what HTTP wants to use as problematic as long as we can represent what HTTP needs.  You will still need to handle unwanted character sets since implementations may choose not to follow the specification for malicious and benign reasons.
>
> Cheers,
>
> Joe
>
>
>
> On Oct 26, 2013, at 3:30 PM, "Martin J. Dürst" <[hidden email]>
> wrote:
>
>>
>>
>> On 2013/10/22 6:03, Andrei Popov wrote:
>>> While ALPN was introduced primarily to enable the negotiation of HTTP and SPDY protocol versions within the TLS handshake, the intent is for other (non-Web) applications to be able to negotiate their protocols using the same TLS extension. It is conceivable that different applications will prefer different representations of their protocol IDs.
>>
>> Yes, If some of them, like HTTP, prefer upper-case because it has always been upper-case, and others prefer lower-case, that's not going to be a problem.
>>
>>> Even on this thread, I believe both US-ASCII and UTF-8 protocol IDs have been mentioned.
>>
>> I saw UTF-8 mentioned as a theoretical idea, but I didn't see any actual UTF-8 example. Please tell me in case I missed it.
>>
>> More strongly, as I have said before, I think for *protocol* identifiers, UTF-8 is entirely and completely unnecessary.
>>
>>> From this perspective, having the flexibility at the TLS layer appears beneficial.
>>
>> Flexibility is good, too much flexibility is bad.
>>
>>> Treating application protocol IDs as opaque octet strings also allows efficient protocol ID matching at the TLS layer.
>>
>> There's a huge difference between *allowing* arbitrary octet strings (which is completely unnecessary and actually problematic if you have octets e.g. in the C0 range show up in displays) and *comparing* them octet-by-octet (which is good for efficiency).
>>
>> So please fix them to say that they are limited to printable ASCII and are compared byte-by-byte. That will be flexible enough without being too flexible, and efficient on top of it.
>>
>> Regards,   Martin.
>>
>

--
Mark Nottingham   http://www.mnot.net/





Re: Questions about ALPN

Alexey Melnikov
On 28/10/2013 05:00, Mark Nottingham wrote:
> Joe,
>
> The crux of the matter for me is that we're talking about defining parallel mechanisms that also use this registry, but have syntactic constraints -- to wit, DNS, and HTTP headers.
>
> That means we'll need to either define an encoding mechanism to allow these theoretical non-ASCII protocol identifiers to be used, or we'll have to say that certain protocol identifiers just won't work. Neither seems like a great solution.
>
> Let's push on it the other way, though: has anyone stated a requirement for non-ASCII protocol identifiers? We've talked about the downside of so much flexibility -- what does it buy us?
I agree with Mark. Identifiers don't need to be anything other
than ASCII. If somebody wants to stuff binary data in ALPN to go with
protocol identifiers, then UTF-8 is not flexible enough.

So I would also like to see use cases for UTF-8 data which is not US-ASCII.
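
For comparison, the restriction Martin asked for is cheap to check. The sketch below is only an illustration: the function name is hypothetical, and taking "printable ASCII" to mean octets 0x21 through 0x7E (excluding space) is an assumption. The 1 to 255 octet length limit reflects my understanding of the length-prefixed encoding used by the TLS extension.

```python
def is_printable_ascii_id(protocol_id: bytes) -> bool:
    """Check a protocol identifier against an assumed restricted profile.

    Assumptions (not from the draft text): printable ASCII means the
    octets 0x21..0x7E, and the identifier is 1 to 255 octets long.
    """
    if not 1 <= len(protocol_id) <= 255:
        return False
    return all(0x21 <= octet <= 0x7E for octet in protocol_id)
```

Identifiers such as "http/1.1" and "spdy/3" pass this check. Anything containing C0 control octets or non-ASCII UTF-8 sequences is rejected, and comparison can still be done byte-by-byte.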

> Cheers,
>
>
>
> On 28/10/2013, at 1:24 PM, "Joseph Salowey (jsalowey)" <[hidden email]> wrote:
>
>> While the primary motivation for ALPN is the HTTP work, it may not be the only consumer and we did not see the need to restrict the possible values.   If the work associated to HTTP wants to restrict the values that they use then that can be done within the context of that work.  Other usages would still be free to define values that are expressed in other ways.   I think this design allows for flexibility so we do not have to define an extension for each usage.
>>
>> I still don't see a reason why allowing additional representations beyond what HTTP wants to use as problematic as long as we can represent what HTTP needs.  You will still need to handle unwanted character sets since implementations may choose not to follow the specification for malicious and benign reasons.
>>
>> Cheers,
>>
>> Joe
>>
>>
>>
>> On Oct 26, 2013, at 3:30 PM, "Martin J. Dürst" <[hidden email]>
>> wrote:
>>
>>>
>>> On 2013/10/22 6:03, Andrei Popov wrote:
>>>> While ALPN was introduced primarily to enable the negotiation of HTTP and SPDY protocol versions within the TLS handshake, the intent is for other (non-Web) applications to be able to negotiate their protocols using the same TLS extension. It is conceivable that different applications will prefer different representations of their protocol IDs.
>>> Yes, If some of them, like HTTP, prefer upper-case because it has always been upper-case, and others prefer lower-case, that's not going to be a problem.
>>>
>>>> Even on this thread, I believe both US-ASCII and UTF-8 protocol IDs have been mentioned.
>>> I saw UTF-8 mentioned as a theoretical idea, but I didn't see any actual UTF-8 example. Please tell me in case I missed it.
>>>
>>> More strongly, as I have said before, I think for *protocol* identifiers, UTF-8 is entirely and completely unnecessary.
>>>
>>>>  From this perspective, having the flexibility at the TLS layer appears beneficial.
>>> Flexibility is good, too much flexibility is bad.
>>>
>>>> Treating application protocol IDs as opaque octet strings also allows efficient protocol ID matching at the TLS layer.
>>> There's a huge difference between *allowing* arbitrary octet strings (which is completely unnecessary and actually problematic if you have octets e.g. in the C0 range show up in displays) and *comparing* them octet-by-octet (which is good for efficiency).
>>>
>>> So please fix them to say that they are limited to printable ASCII and are compared byte-by-byte. That will be flexible enough without being too flexible, and efficient on top of it.
>>>
>>> Regards,   Martin.
>>>

