Origin vs Authority; use of HTTPS (draft-nottingham-site-meta-01)

Thomas Roessler

Reading draft-nottingham-site-meta-01...

> 4. Discovering host-meta Files

> The metadata for a given authority can be discovered by  
> dereferencing the path /host-meta on the same authority. For  
> example, for an HTTP URI [RFC2616], the following request would  
> obtain metadata for the authority "www.example.com:80";

Editorial nit: That semicolon wants to be a colon.

> GET /host-meta HTTP/1.1
> Host: www.example.com

It is somewhat unclear what the scope of the host-meta file is, or  
more precisely, how the URI for the host-meta file is derived from the  
URI of the resource that the metadata apply to.

Section 4 seems to suggest that the URI is maybe generated by  
dereferencing the relative URI reference /host-meta using the  
resource's URI as the base URI, but it doesn't say that clearly; the  
use of "authority" suggests that the choice of the protocol is  
actually up to the implementation.
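
To make the reading of Section 4 above concrete, here is a minimal sketch of deriving the host-meta URI by resolving the relative reference /host-meta against the resource's own URI. The function name `host_meta_uri` is hypothetical, and this is only one possible interpretation of the draft, not text from it.

```python
from urllib.parse import urljoin


def host_meta_uri(resource_uri: str) -> str:
    """Resolve the relative reference /host-meta with the resource URI as base.

    Hypothetical helper: scheme, host and port are inherited from the base URI,
    so an https resource maps to an https host-meta URI.
    """
    return urljoin(resource_uri, "/host-meta")


print(host_meta_uri("http://www.example.com/a/b?c=d"))
print(host_meta_uri("https://www.example.com:8443/app/"))
```

Under this reading the scheme and port always travel with the resource URI, so the choice of protocol is not left to the implementation.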

From the previous apps-discuss thread, it seems like the main use  
case for permitting metadata to leak across schemes (and therefore,  
typically ports -- though ports and schemes are strictly speaking  
orthogonal) lies with URI schemes that do not have a resource  
retrieval operation readily available, e.g., mailto.

On the other hand, I'm extremely wary about anything near HTTP that  
might tear down origin boundaries without a great deal of care.  E.g.,  
a purely authority-based approach might permit metadata to leak from  
the HTTP part of a site (where no integrity protection is given) into  
its HTTPS part (where integrity protection and authenticity of data is  
deemed important), possibly permitting attacks against web  
applications that are ostensibly protected -- as is alluded to in the  
security considerations.

The obvious solution to that part of the puzzle is to let the  
mechanism default to the same URI scheme, unless there is a specific  
convention to the contrary.  That should cover any URI schemes for  
which a safe retrieval operation is defined (HTTP, HTTPS, FTP come to  
mind).

For other URI schemes, one could either punt on this issue completely,  
define a default fall-back to HTTP (or HTTPS, depending on which of  
the two better matches the security properties of the protocol in  
question), or actually say explicitly what's the correct scheme.
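
A rough sketch of the "same scheme by default, punt otherwise" option might look like the following. The set of retrievable schemes and the name `discovery_uri` are assumptions made for illustration; nothing here is specified by the draft.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption: schemes treated as having a safe retrieval operation.
RETRIEVABLE_SCHEMES = {"http", "https", "ftp"}


def discovery_uri(resource_uri: str) -> Optional[str]:
    """Return the host-meta URI on the same scheme, or None to punt."""
    parts = urlsplit(resource_uri)
    scheme = parts.scheme.lower()
    if scheme not in RETRIEVABLE_SCHEMES or not parts.netloc:
        # e.g. mailto: has no retrieval operation and no authority component,
        # so no host-meta location is guessed for it.
        return None
    return urlunsplit((scheme, parts.netloc, "/host-meta", "", ""))


print(discovery_uri("https://www.example.com/login"))
print(discovery_uri("mailto:user@example.com"))
```

Returning None for mailto corresponds to punting; a protocol-specific convention could supply its own mapping instead.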

Thoughts?

--
Thomas Roessler, W3C  <[hidden email]>


Re: Origin vs Authority; use of HTTPS (draft-nottingham-site-meta-01)

Adam Barth

Wow, this draft is scary.  I haven't seen the prior discussion of this
draft, but we should learn from the mistakes of Flash's
crossdomain.xml policy file.  In particular, you should require that
the host-meta file be served with a specific MIME type (and ignore
the response if the MIME type is wrong).  This protects servers that
let users upload content from having attackers upload a bogus
host-meta file.
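
A minimal sketch of that check, for illustration: the client compares the response's Content-Type against one expected media type and discards anything else. The media type string below is a placeholder assumption (the thread does not name one), and `accept_host_meta` is a hypothetical helper.

```python
from typing import Optional

# Assumption: placeholder media type; the actual registered type may differ.
EXPECTED_TYPE = "application/host-meta"


def media_type(content_type_header: Optional[str]) -> str:
    """Extract the lowercased type/subtype, dropping parameters like charset."""
    if not content_type_header:
        return ""
    return content_type_header.split(";", 1)[0].strip().lower()


def accept_host_meta(content_type_header: Optional[str]) -> bool:
    """Accept the response only if it carries exactly the expected media type."""
    return media_type(content_type_header) == EXPECTED_TYPE


print(accept_host_meta("application/host-meta; charset=utf-8"))
print(accept_host_meta("text/plain"))
```

A file uploaded by a user would typically be served with a generic type such as text/plain or application/octet-stream, so it would be ignored under this rule.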

Also, if you want this feature to be useful for Web browsers, you
should align the scope of the host-meta file with the notion of origin
(not authority).  Section 4 seems to imply that the scope is
"www.example.com:80" but Section 6 implies the scope is
"https://www.example.com".  In fact, computing the origin of a URL
correctly is more complex than this draft assumes.  For details, see
my origin draft.

That said, I think host-meta would be super useful if specified correctly.

Adam


On Tue, Feb 10, 2009 at 6:57 AM, Thomas Roessler <[hidden email]> wrote:

> Reading draft-nottingham-site-meta-01...
>
>> 4. Discovering host-meta Files
>
>> The metadata for a given authority can be discovered by dereferencing the
>> path /host-meta on the same authority. For example, for an HTTP URI
>> [RFC2616], the following request would obtain metadata for the authority
>> "www.example.com:80";
>
> Editorial nit: That semicolon wants to be a colon.
>
>> GET /host-meta HTTP/1.1
>> Host: www.example.com
>
> It is somewhat unclear what the scope of the host-meta file is, or more
> precisely, how the URI for the host-meta file is derived from the URI of the
> resource that the metadata apply to.
>
> Section 4 seems to suggest that the URI is maybe generated by dereferencing
> the relative URI reference /host-meta using the resource's URI as the base
> URI, but it doesn't say that clearly; the use of "authority" suggests that
> the choice of the protocol is actually up to the implementation.
>
> From the previous apps-discuss thread, it seems like the main use case for
> permitting metadata to leak across schemes (and therefore, typically ports
> -- though ports and schemes are strictly speaking orthogonal) lies with URI
> schemes that do not have a resource retrieval operation readily available,
> e.g., mailto.
>
> On the other hand, I'm extremely wary about anything near HTTP that might
> tear down origin boundaries without a great deal of care.  E.g., a purely
> authority-based approach might permit metadata to leak from the HTTP part of
> a site (where no integrity protection is given) into its HTTPS part (where
> integrity protection and authenticity of data is deemed important), possibly
> permitting attacks against web applications that are ostensibly protected --
> as is alluded to in the security considerations.
>
> The obvious solution to that part of the puzzle is to let the mechanism
> default to the same URI scheme, unless there is a specific convention to the
> contrary.  That should cover any URI schemes for which a safe retrieval
> operation is defined (HTTP, HTTPS, FTP come to mind).
>
> For other URI schemes, one could either punt on this issue completely,
> define a default fall-back to HTTP (or HTTPS, depending on which of the two
> better matches the security properties of the protocol in question), or
> actually say explicitly what's the correct scheme.
>
> Thoughts?
>
> --
> Thomas Roessler, W3C  <[hidden email]>
>
>

Re: Origin vs Authority; use of HTTPS (draft-nottingham-site-meta-01)

Mark Nottingham


Hi Thomas,

Gentle reminder; the draft asks for discussion on www-talk. Sending  
followups there (I should have mentioned this in the announcement,  
sorry)...

Responses below.


On 11/02/2009, at 1:57 AM, Thomas Roessler wrote:

> Reading draft-nottingham-site-meta-01...
>
>> 4. Discovering host-meta Files
>
>> The metadata for a given authority can be discovered by  
>> dereferencing the path /host-meta on the same authority. For  
>> example, for an HTTP URI [RFC2616], the following request would  
>> obtain metadata for the authority "www.example.com:80";
>
> Editorial nit: That semicolon wants to be a colon.

Ack.


>> GET /host-meta HTTP/1.1
>> Host: www.example.com
>
> It is somewhat unclear what the scope of the host-meta file is, or  
> more precisely, how the URI for the host-meta file is derived from  
> the URI of the resource that the metadata apply to.
>
> Section 4 seems to suggest that the URI is maybe generated by  
> dereferencing the relative URI reference /host-meta using the  
> resource's URI as the base URI, but it doesn't say that clearly; the  
> use of "authority" suggests that the choice of the protocol is  
> actually up to the implementation.

Well, the authority is host + port; common sense tells us that it's  
unlikely that the same (host, port) tuple that we speak HTTP on is  
also going to support SMTP or XMPP. I'm not saying that common sense  
is universal, however.


> From the previous apps-discuss thread, it seems like the main use  
> case for permitting metadata to leak across schemes (and therefore,  
> typically ports -- though ports and schemes are strictly speaking  
> orthogonal) lies with URI schemes that do not have a resource  
> retrieval operation readily available, e.g., mailto.

My understanding of the discussion's resolution was that this is not a  
goal for this spec any more; i.e., if there's any boundary-hopping, it  
will be defined by the protocol or application in use.


> On the other hand, I'm extremely wary about anything near HTTP that  
> might tear down origin boundaries without a great deal of care.

Very much agreed.


>  E.g., a purely authority-based approach might permit metadata to  
> leak from the HTTP part of a site (where no integrity protection is  
> given) into its HTTPS part (where integrity protection and  
> authenticity of data is deemed important), possibly permitting  
> attacks against web applications that are ostensibly protected -- as  
> is alluded to in the security considerations.
>
> The obvious solution to that part of the puzzle is to let the  
> mechanism default to the same URI scheme, unless there is a specific  
> convention to the contrary.  That should cover any URI schemes for  
> which a safe retrieval operation is defined (HTTP, HTTPS, FTP come  
> to mind).

I'm happy to clarify this by either adding scheme/protocol to the  
(host, port) tuple (although we'll probably have to come up with a  
different term than "authority"; PLEASE don't say "endpoint" ;), which  
will affect both the default scoping of application as well as the  
discovery mechanism, or just limiting it to discovery.


> For other URI schemes, one could either punt on this issue  
> completely, define a default fall-back to HTTP (or HTTPS, depending  
> on which of the two better matches the security properties of the  
> protocol in question), or actually say explicitly what's the correct  
> scheme.

I'm inclined to punt on it. Default fall-back to HTTP makes too many  
assumptions.

Thanks,

--
Mark Nottingham

Re: Origin vs Authority; use of HTTPS (draft-nottingham-site-meta-01)

Thomas Roessler

On 11 Feb 2009, at 01:31, Mark Nottingham wrote:

> Gentle reminder; the draft asks for discussion on www-talk. Sending  
> followups there (I should have mentioned this in the announcement,  
> sorry)...

(and I should read instructions.... apologies)

>> The obvious solution to that part of the puzzle is to let the  
>> mechanism default to the same URI scheme, unless there is a  
>> specific convention to the contrary.  That should cover any URI  
>> schemes for which a safe retrieval operation is defined (HTTP,  
>> HTTPS, FTP come to mind).
>
> I'm happy to clarify this by either adding scheme/protocol to the  
> (host, port) tuple (although we'll probably have to come up with a  
> different term than "authority"; PLEASE don't say "endpoint" ;),  
> which will affect both the default scoping of application as well as  
> the discovery mechanism, or just limiting it to discovery.

I'd use the (scheme, host, port) triple to identify the endpoints that  
we're dealing with here, both for scope and discovery. Adam Barth's  
draft-abarth-origin gives a canonicalization procedure for these  
tuples.  That will be useful when the tuples derived from different  
URIs need to be compared, to determine whether one is in the same site  
metadata scope as the other.

Calling that kind of triple "an origin" seems fine, and is consistent  
with the usage of that word in draft-abarth-origin and elsewhere.

The benefit of using the triple for both discovery and scope is that  
you don't acquire yet another possible cross-origin channel in the  
browser.
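
As a rough illustration of comparing such triples, the sketch below normalizes a URI to (scheme, host, port) and tests two URIs for the same scope. It is a deliberate simplification and not the procedure from draft-abarth-origin (it ignores internationalized host names, for instance); the default-port table and the names `Origin`, `origin_of` and `same_scope` are assumptions.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Assumption: default ports for the schemes mentioned in this thread.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


class Origin(NamedTuple):
    scheme: str
    host: str
    port: int


def origin_of(uri: str) -> Origin:
    """Normalize a URI to a (scheme, host, port) triple."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if parts.hostname is None or scheme not in DEFAULT_PORTS:
        raise ValueError(f"no origin triple defined here for {uri!r}")
    # urlsplit lowercases hostname; fill in the scheme's default port if absent.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return Origin(scheme, parts.hostname, port)


def same_scope(uri_a: str, uri_b: str) -> bool:
    """True when both URIs fall under the same site metadata scope."""
    return origin_of(uri_a) == origin_of(uri_b)


print(same_scope("http://Example.com/a", "http://example.com:80/b"))
print(same_scope("http://example.com/", "https://example.com/"))
```

Using the same triple for discovery and for scope means metadata fetched over http never applies to https resources on the same host.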


>> For other URI schemes, one could either punt on this issue  
>> completely, define a default fall-back to HTTP (or HTTPS, depending  
>> on which of the two better matches the security properties of the  
>> protocol in question), or actually say explicitly what's the  
>> correct scheme.
>
> I'm inclined to punt on it. Default fall-back to HTTP makes too many  
> assumptions.

Same inclination here, actually.


Re: Origin vs Authority; use of HTTPS (draft-nottingham-site-meta-01)

Adam Barth

On Tue, Feb 10, 2009 at 4:31 PM, Mark Nottingham <[hidden email]> wrote:
> Well, the authority is host + port; common sense tells us that it's unlikely
> that the same (host, port) tuple that we speak HTTP on is also going to
> support SMTP or XMPP. I'm not saying that common sense is universal,
> however.

These assumptions are often violated in attack scenarios, especially
by active network attackers who are very capable of hiding the honest
https://example.com server behind a spoofed http://example.com:443
server.
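
The two URIs in that scenario can be compared directly. In this small sketch (helper names are hypothetical), a (host, port) comparison cannot tell them apart, while a comparison that includes the scheme can.

```python
from urllib.parse import urlsplit

# Assumption: default ports for the two schemes in the example.
DEFAULT_PORTS = {"http": 80, "https": 443}


def authority(uri: str) -> tuple:
    """(host, port) only, which is what an authority-based scope compares."""
    parts = urlsplit(uri)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme.lower()]
    return (parts.hostname, port)


def scheme_and_authority(uri: str) -> tuple:
    """(scheme, host, port), which is what an origin-based scope compares."""
    return (urlsplit(uri).scheme.lower(),) + authority(uri)


honest = "https://example.com/"
spoofed = "http://example.com:443/"

print(authority(honest) == authority(spoofed))                        # True
print(scheme_and_authority(honest) == scheme_and_authority(spoofed))  # False
```

With an authority-only scope, metadata served by the unauthenticated http server on port 443 would be attributed to the same scope as the https site.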

Adam