spoofing and IRIs


spoofing and IRIs

Larry Masinter

(resending after fixing access problem)

Right now, the “Security Considerations” section of http://tools.ietf.org/html/draft-ietf-iri-3987bis-00#section-10 contains a relatively short discussion of the issues around spoofing.

I’d like to replace most of that section with a summary and a pointer to the Unicode Technical Report #36

http://unicode.org/reports/tr36/tr36-8.html

which expands the discussion quite a bit. I think a summary might take the form:

=============draft============

There are serious difficulties with relying on a human to verify that a presentation of an IRI to them (whether visually or read out loud) is the same as another identifier or is the one intended. These problems exist with ASCII-only URIs (bl00mberg.com vs. bloomberg.com) but are enormously exacerbated when using the larger character repertoire of Unicode; these problems are elaborated in [UTR#36]. There seems to be little hope of relying on either administrative or technical means to reduce the availability of such exploits, to the extent that user agents SHOULD NOT rely on visual or perceptual comparison or verification of IRIs as any means of validating or assuring safety, correctness or appropriateness of an IRI.

[UTR#36] also identifies additional security considerations that are applicable to IRIs.

=============draft============
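
As an illustration of the draft's point (a sketch only, not proposed text), the following Python snippet lists the code points behind strings that a reader could easily take for the same host name. The ASCII spoof comes from the draft; the Cyrillic variant and the helper name `code_points` are assumptions made up for this example.

```python
import unicodedata

def code_points(s):
    """Return (U+XXXX, Unicode name) for each character of s."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in s]

genuine = "bloomberg.com"
ascii_spoof = "bl00mberg.com"               # digit zero for letter o (from the draft)
unicode_spoof = "bl\u043e\u043emberg.com"   # assumed example: Cyrillic small letter o

# The strings have the same length and look alike, yet none are equal.
assert genuine != ascii_spoof
assert genuine != unicode_spoof

print(code_points(genuine)[2])        # ('U+006F', 'LATIN SMALL LETTER O')
print(code_points(ascii_spoof)[2])    # ('U+0030', 'DIGIT ZERO')
print(code_points(unicode_spoof)[2])  # ('U+043E', 'CYRILLIC SMALL LETTER O')
```

Software can tell these apart trivially; the difficulty described in the draft is that a person looking at (or hearing) the rendered form often cannot.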

 

 

Basically, I want to push the issue of spoofing in IRIs to another document.

Thoughts?

Comments?

Larry
--
http://larry.masinter.net

 

 


Re: spoofing and IRIs

sm-7
Hi Larry,
At 21:19 27-02-10, Larry Masinter wrote:
>I'd like to replace most of that section with a summary and a
>pointer to the Unicode Technical Report #36

See Section 4.4 of draft-ietf-idnabis-defs-13.  There is also a
pointer to the Unicode Technical Report #36.

Regards,
-sm




RE: spoofing and IRIs

Larry Masinter
Going through the Security Considerations of
draft-ietf-idnabis-defs-13 vs. the current
"Security Considerations" of the current IRI document

here's looking at
http://tools.ietf.org/html/draft-ietf-idnabis-defs 
section 4:


4.1 general: The mapping difference should be referenced
  in the IRI document security considerations?
  Not recapitulated?

* Do we need to review IDNA2008-Bidi against the
  BIDI advice in the IRI document?
  (I talked with Martin about possibly moving the
   BIDI discussions to a separate document,  mainly
  to facilitate letting other editors work on the
  BIDI sections)?

4.2 U-label lengths
  Are there any additional concerns about URI length
  limits that should be addressed here? Are there
  IRI length limits that are different than the URI
  length limit?

4.3 Local Character Set: I think for IRIs there are
  related issues with the document character set?
  Are there special issues for the query parameters
  being remapped according to the document encoding?

4.4 (this is the 'spoofing' issue) Do you like what
  idnabis-defs says better than what I wrote below?
  I kind of wanted to punt the whole thing to
  UTR36.

4.5 The part of this that's relevant to IRIs is
the "comparison" function.

4.6-4.8 not sure how these would apply.
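
To make the 4.2 length question concrete, here is a rough Python sketch that compares the size of one label counted in characters, in UTF-8 octets, and in its ASCII-compatible form (which is what the 63-octet DNS label limit applies to). The helper name `label_lengths` is hypothetical. Python's built-in "punycode" codec performs only the RFC 3492 encoding, with none of the IDNA2008 validation or mapping, and the sketch assumes the label contains at least one non-ASCII character.

```python
def label_lengths(u_label):
    """Return (characters, UTF-8 octets, A-label octets) for one non-ASCII label.

    Assumption: plain RFC 3492 Punycode plus the "xn--" prefix stands in for
    the A-label; no IDNA validation or case mapping is applied here.
    """
    a_label = "xn--" + u_label.encode("punycode").decode("ascii")
    return len(u_label), len(u_label.encode("utf-8")), len(a_label)

# "bücher" encodes to "xn--bcher-kva"
print(label_lengths("b\u00fccher"))   # (6, 7, 13)
```

The three counts differ for the same label, so any length limit stated for an IRI needs to say which representation it is measured in.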





Larry






RE: spoofing and IRIs

sm-7
Hi Larry,
At 22:25 27-02-10, Larry Masinter wrote:

>Going through the Security Considerations of
>draft-ietf-idnabis-defs-13 vs. the current
>"Security Considerations" of the current IRI document
>
>here's looking at
>http://tools.ietf.org/html/draft-ietf-idnabis-defs
>section 4:
>
>
>4.1 general: The mapping difference should be referenced
>   in the IRI document security considerations?
>   Not recapitulated?

Yes.

>* Do we need to review IDNA2008-Bidi against the
>   BIDI advice in the IRI document?
>   (I talked with Martin about possibly moving the
>    BIDI discussions to a separate document,  mainly
>   to facilitate letting other editors work on the
>   BIDI sections)?

I suggest expert review by a native speaker in addition to reviewing
draft-ietf-idnabis-bidi-07.

>4.2 U-label lengths
>   Are there any additional concerns about URI length
>   limits that should be addressed here? Are there
>   IRI length limits that are different than the URI
>   length limit?

I haven't looked into this in the context of IRI.

>4.3 Local Character Set: I think for IRIs there are
>   related issues with the document character set?
>   Are there special issues for the query parameters
>   being remapped according to the document encoding?

I'll give the same answer as above.

>4.4 (this is the 'spoofing' issue) Do you like what
>   idnabis-defs says better than what I wrote below?
>   I kind of wanted to punt the whole thing to
>   UTR36.

Section 4.4 refers to visually similar characters (sometimes called
"confusables").  Your text talks about presentation, whether visual
or read out loud.  Both texts note that there may not be a technical
solution to the problem.  Your text conveys the idea that this is a
difficult problem to solve.  I have a preference for the text in
Section 4.4 because of its second paragraph.  I would put in
a pointer to UTR36 as that document is more elaborate.

Regards,
-sm



Re: spoofing and IRIs

Maciej Stachowiak
In reply to this post by Larry Masinter
On Feb 27, 2010, at 9:19 PM, Larry Masinter wrote:

> (resending after fixing access problem)
>
> Right now, the “Security Considerations” section of http://tools.ietf.org/html/draft-ietf-iri-3987bis-00#section-10 contains a relatively short discussion of the issues around spoofing.
>
> I’d like to replace most of that section with a summary and a pointer to the Unicode Technical Report #36
> which expands the discussion quite a bit. I think a summary might take the form:
>
> =============draft============
> There are serious difficulties with relying on a human to verify that a presentation of an IRI to them (whether visually or read out loud) is the same as another identifier or is the one intended. These problems exist with ASCII-only URIs (bl00mberg.com vs. bloomberg.com) but are enormously exacerbated when using the larger character repertoire of Unicode; these problems are elaborated in [UTR#36]. There seems to be little hope of relying on either administrative or technical means to reduce the availability of such exploits, to the extent that user agents SHOULD NOT rely on visual or perceptual comparison or verification of IRIs as any means of validating or assuring safety, correctness or appropriateness of an IRI.
>
> [UTR#36] also identifies additional security considerations that are applicable to IRIs.
> =============draft============
>
> Basically, I want to push the issue of spoofing in IRIs to another document.
>
> Thoughts?
>
> Comments?

I think there's one piece of your summary that is oddly stated: "... to the extent that user agents SHOULD NOT rely on visual or perceptual comparison or verification of IRIs as any means of validating or assuring safety". User agents don't do any visual comparisons of IRIs directly for their own purposes; they do character-by-character comparisons. The problem is with users themselves, not user agents, doing visual comparisons. Also, while UTR#36 has many specific suggestions for improving IRI security, they are not all for user agents. Some are recommendations for procedures when registering domain names. The UA recommendations do not amount to completely removing the user's reliance on visual comparison, although they may somewhat mitigate the risk of showing the user certain kinds of visually confusable URIs. I'm not sure the recommendations of UTR#36 can be summarized adequately in a short paragraph.
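
The distinction between machine comparison and human comparison can be sketched in a few lines of Python. This is only an illustration, not a description of how any particular user agent compares IRIs; the helper `same_identifier` and the sample strings are assumptions invented for the example.

```python
import unicodedata

def same_identifier(a, b, form="NFC"):
    """Compare two strings code point by code point after Unicode normalization."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

precomposed = "caf\u00e9"             # e-acute as a single code point
decomposed = "cafe\u0301"             # e followed by a combining acute accent
latin = "example.org"
mixed = "ex\u0430mple.org"            # assumed spoof: Cyrillic small letter a (U+0430)

print(precomposed == decomposed)                 # False: raw code points differ
print(same_identifier(precomposed, decomposed))  # True: equal once normalized to NFC
print(same_identifier(latin, mixed, "NFKC"))     # False: normalization does not unify scripts
```

A program gets a definite answer in every case. The last pair is exactly the kind that a person reading the rendered strings is likely to judge identical.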

Regards,
Maciej








RE: spoofing and IRIs

Larry Masinter
I hope amending “user agents SHOULD NOT rely on visual or perceptual”…

to “user agents SHOULD NOT rely on users doing visual or perceptual”… addresses the first concern you raised.

I agree with your point that UTR#36 as currently organized contains much information that is extraneous to the IRI spoofing issue, and it would be helpful to point those out. But it seems to be the most extensive analysis of the issues, and is more likely to be maintained as Unicode evolves, so I think referencing it is useful. (By cc to the authors listed (Mark Davis and Michel Suignard), I’d like to ask if UTR#36 could be reorganized or the sections annotated to explicitly call out which sections apply to IRIs, to IDNA, or have other scope.)

My intent wasn’t really to summarize UTR#36, but to note that, considering all of the risks noted in UTR#36, there isn’t really any way that visual comparison *can* be done safely, and that the “best practice” for mitigating the security risks associated with visual or perceptual comparison or verification is to just not rely on it at all.

I’m not sure if you agree or disagree with that as an appropriate “Security Considerations” section, but I’d be happy to hear any counter-proposals for how to deal with this messy issue without having to resolve all of the security mitigations before progressing this document.

My hope is that everyone can agree on an overall strategy for getting work done here quickly: focus the main IRI document on the concrete requirements, syntax and processing rules for IRIs, and move any of the more difficult, messier, controversial and evolving best-practice issues to other documents that could evolve independently, and on their own schedule. That’s a general approach to modularizing documents that I’ve tried to take elsewhere.

Larry
--
http://larry.masinter.net

 

 

 



RE: spoofing and IRIs

John C Klensin
In reply to this post by Larry Masinter


--On Saturday, February 27, 2010 22:25 -0800 Larry Masinter
<[hidden email]> wrote:

> Going through the Security Considerations of
> draft-ietf-idnabis-defs-13 vs. the current
> "Security Considerations" of the current IRI document
>
> here's looking at
> http://tools.ietf.org/html/draft-ietf-idnabis-defs 
> section 4:
>...

Larry,

Suggestions, fwiw (mostly drawing comments from other notes
together):

(1) Reference that doc.  As others have pointed out, it
addresses UTR 36, but contains some material that may be more
directly relevant to IRIs generally and their domain name
components in particular.

(2) Point out that neither of those documents (...idnabis-defs
nor UTR36) really addresses "sound alike" (especially to someone
not familiar with the relevant language) issues rather than
"look alike" or "might be expected to be treated alike" ones.
In conjunction with this, note that the problem is not just with
the false positive comparisons that characterize the spoofing
problem but with perceptual false negatives:  people who are
under the delusion that IRIs (or URIs or domain names) are to be
interpreted by humans and who are not computer experts often
expect orthographic variations to compare equal.  Differences in
US and UK spelling, Simplified and Traditional Chinese and maybe
pinyin, conventions about representation of extended Latin
strings in basic Latin characters, and writing of Japanese in
either Kana or Kanji all fall into that category for at least
some populations.

(3) Note that these are problems for _both_ humans and human
perception and user agents that try to guess at strings and
other issues with which humans might have problems so that the
users can be warned.   You've noted the example of trying to
distinguish between familiar and unfamiliar scripts.  Others
have noted that mixed-script situations and the use of some
specific characters can be problematic.  For example, as a
problem very specific to IRIs, there are many characters in
Unicode that could plausibly be confused with forward slashes
and other reserved punctuation.

(4) Of course, we also have the human interface design question
of whether or not one should try to do anything (and possibly
create false expectations and an unreasonable sense of
confidence in being protected) when it is clear that a
comprehensive solution is impossible.  If one inspects browsers
and other IRI-using programs, the consensus seems to be "yes, do
what one can".  That is not the only plausible conclusion and
there is certainly no consensus as to what one should actually
do.   I think it would be wise for the document to say that.
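
As a rough illustration of point (3), the Python sketch below flags non-ASCII characters whose Unicode names suggest they could be mistaken for a slash. This is a crude name-based heuristic invented for this example, not the confusables data from UTR#36; the function name `slash_lookalikes` and the sample IRI are assumptions.

```python
import unicodedata

def slash_lookalikes(iri):
    """List (index, U+XXXX, name) for non-ASCII characters named like a slash."""
    hits = []
    for i, ch in enumerate(iri):
        if ord(ch) < 128:
            continue  # the real "/" and "\" are ASCII and are never flagged
        name = unicodedata.name(ch, "")
        if "SLASH" in name or "SOLIDUS" in name:
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits

# Assumed example: U+2215 makes the host part look like "example.com/login..."
suspect = "http://example.com\u2215login.example.net/"
print(slash_lookalikes(suspect))                   # [(18, 'U+2215', 'DIVISION SLASH')]
print(slash_lookalikes("http://example.com/a"))    # []
```

A check like this catches only the characters it knows to look for, which is in line with point (4): partial measures are possible, but they do not add up to comprehensive protection.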

best,
   john