[iri] #121: BIDI: Some users are requiring right-to-left label ordering.


[iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.

 BIDI section 2 requires adding embedding marks, which force a "western"
 left-to-right ordering of labels.  I have requirements from customers,
 including government customers, that require a right-to-left ordering of
 labels in at least some cases.

 This preference seems to be a user preference, with, perhaps, a strong
 language bias.

 Specifically, how is a user reading an Arabic domain name from the side of
 the bus over a phone going to read it?  And how will the person on the end
 of the phone type it?  My investigation shows that native speakers will
 prefer reading a domain name from the right in BIDI contexts.

--
----------------------------+----------------------------------------------
 Reporter:  shawnste@…      |      Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect          |     Status:  new
 Priority:  major           |  Milestone:
Component:  bidi-           |    Version:
  guidelines                |   Keywords:
 Severity:  -               |
----------------------------+----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121>
iri <http://tools.ietf.org/wg/iri/>



Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.


Comment (by duerst@…):

 Shawn, thanks for creating this issue. Can you give more details about
 your customers' requirements? For example, is right-to-left ordering meant
 to work per component or per run? At what point should a mixed IRI (one
 that includes both RTL and LTR components) be displayed right-to-left: even
 if only a single component, e.g. a single path component (directory), is
 RTL? Are there details that vary per "customer", and if yes, what?

--
---------------------------+-----------------------------------------------
 Reporter:  shawnste@…     |       Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect         |      Status:  new
 Priority:  major          |   Milestone:
Component:  bidi-          |     Version:
  guidelines               |  Resolution:
 Severity:  -              |
 Keywords:                 |
---------------------------+-----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:1>
iri <http://tools.ietf.org/wg/iri/>



Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.


Comment (by shawnste@…):

 The primary concern would be a simple domain name, even without http:// :)
 Of course an IRI needs to be consistent with that.

 The customers have been focused primarily on the domain portion.  By the
 time we look at the query string they've "lost interest".  So RTL in the
 domain should probably force reading order.

 Interestingly, however, the key indicator isn't the domain itself, but
 rather the context/mindset of the user. If they're dealing with Arabic,
 they may expect the URL to render labels from right-to-left, even if it's
 entirely ASCII!  Specifically, if the browser's UI language is Arabic, or
 if the Address Field is in Right To Left Reading order, this expectation
 increases.

 The bias also seems to be related to culture and/or experience.  A software
 engineer who majored in math in one country may feel more comfortable with
 left-to-right behavior than a person in another country who is not
 computer- or math-focused.

 I know it doesn't help this RFC, but keying off the address box
 directionality might be good.  In a document, keying off the primary
 document language might work.  That doesn't provide the consistency
 necessary here.

 I don't think that "any RTL means all-RTL" works very well, because a
 simple Arabic query string to Bing probably doesn't mean that the address
 needs to be flipped.  Any RTL within the domain portion (or local part of
 an email address) probably does indicate that the labels should be ordered
 from right to left.

 I realize that following these rules may end up with behavior that is
 "fuzzier" than some are comfortable with, however the goal here is human
 readable (by the 90%, not engineers).  Machines and Engineers already know
 how to "read" it, we've got byte order if nothing else; our biases should
 not impact the "see a domain name on the side of the bus and type it into
 my phone" case.

 In summary: Follow the order of the address box if the user sets that.  If
 there is no other context, any RTL in the primary portion (e.g. the domain)
 of the IRI should trigger RTL ordering of the labels, i.e. put the whole
 thing in right-to-left marks instead of left-to-right marks.
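
 A minimal sketch of the rule summarized above, in Python. The function
 name `label_order` and the `address_box_dir` parameter are hypothetical,
 and using `urlsplit` to isolate the domain is an assumption made for
 illustration; none of this comes from the draft.

```python
import unicodedata
from typing import Optional
from urllib.parse import urlsplit

# Unicode bidi classes for strong right-to-left characters
# (R: Hebrew and similar scripts, AL: Arabic letters).
STRONG_RTL = {"R", "AL"}


def has_rtl(text: str) -> bool:
    """True if any character in text is a strong right-to-left character."""
    return any(unicodedata.bidirectional(ch) in STRONG_RTL for ch in text)


def label_order(iri: str, address_box_dir: Optional[str] = None) -> str:
    """Return "rtl" or "ltr" as the ordering of labels for display.

    address_box_dir is a hypothetical user/UI setting; if it is set,
    it wins. Otherwise only the domain portion is inspected.
    """
    if address_box_dir in ("ltr", "rtl"):
        return address_box_dir
    # Naive handling of bare domain names: treat them as the authority part.
    target = iri if "://" in iri else "//" + iri
    host = urlsplit(target).hostname or ""
    return "rtl" if has_rtl(host) else "ltr"
```

 With this rule an Arabic query string sent to an all-ASCII host leaves the
 ordering LTR, while a strong RTL character anywhere in the domain flips it
 unless the address box says otherwise.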

--
---------------------------+-----------------------------------------------
 Reporter:  shawnste@…     |       Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect         |      Status:  new
 Priority:  major          |   Milestone:
Component:  bidi-          |     Version:
  guidelines               |  Resolution:
 Severity:  -              |
 Keywords:                 |
---------------------------+-----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:2>
iri <http://tools.ietf.org/wg/iri/>



Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.

Changes (by adil@…):

 * keywords:   => bidi
 * status:  new => closed
 * resolution:   => wontfix


Comment:

 Shawn, being one of the people that wants to see Arabic URLs flowing right
 to left I fully understand what you are saying. I have gone around in
 circles a few times with this and I concluded that this version of the
 Bidi-IRI document is not where we should resolve the issue.

 Firstly, internet addresses are only a subset of the uses of IRIs, and I
 need to take into account the general purpose of the IRI. IRIs are rendered
 by a wide variety of devices that have only a few things in common. The
 primary concern is that an IRI containing bidi characters is rendered
 consistently on all these devices.

 Secondly, a full solution to getting URLs to render readably right-to-left
 requires either a modification to the Unicode bidi algorithm (which
 Mark Davis proposed) or a restriction on the characters that can be used
 for registering right-to-left domain names (e.g. only allowing Arabic
 alphabetic characters in an Arabic domain name). Both of these are
 out of the scope of this document.

 I think what is needed (independently of this document) is a specification
 for URLs that are safe to be drawn right-to-left. Then, if a browser
 recognizes a safe URL it can draw the URL right-to-left without concern.
 This specification can be advertised to domain name registrars and web
 companies. In theory we could then have the Googles and Facebooks of this
 world using and advertising URLs that are right-to-left.

 I am setting this issue as won't fix but if you disagree please comment
 here and I will reopen it.

--
---------------------------+-----------------------------------------------
 Reporter:  shawnste@…     |       Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect         |      Status:  closed
 Priority:  major          |   Milestone:
Component:  bidi-          |     Version:
  guidelines               |  Resolution:  wontfix
 Severity:  -              |
 Keywords:  bidi           |
---------------------------+-----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:3>
iri <http://tools.ietf.org/wg/iri/>



Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.


Comment (by shawnste@…):

 Well, I disagree with pretty much every point :)
 * clearly everything won't be consistent because plain text that doesn't
 know how to detect an IRI isn't going to behave as expected.
 * I think that the importance isn't consistency between devices, but
 rather the ability for users to consistently transcribe the IRI.  That
 includes not only display on devices, but input through whatever keyboards
 from sticky notes that were transcribed by hand from an IRI on the side of
 a bus.
 * Related, I don't think they can be "unnatural".
 * There's a lot of pressure to ensure that RTL domains are "correctly"
 rendered in RTL fashion.  So I think we'd get better consistency if the
 guidelines took that into account instead of having software developers
 each try to do something "better" in an inconsistent fashion.
 * Though fixing the BIDI Algorithm would help, it's not required.  Indeed,
 the proposed behavior uses bidi override marks to get the desired
 behavior.  The same thing can be done for RTL.  Granted a better BIDI
 algorithm for IRIs would make "plain text" better, but it’s not required.
 * As noted, this isn’t necessarily easily gleaned from the script(s) being
 used, as some cultural and user preferences also influence it.
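
 For reference, the directional formatting characters involved can be
 listed in a short Python sketch. The ticket description says the draft
 adds embedding marks; the helper name `embed` and its `direction`
 parameter are hypothetical, and the code only shows that the same wrapping
 is mechanically available for either direction.

```python
import unicodedata

# Explicit directional embeddings and the terminator that closes them.
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Explicit overrides: these force every enclosed character to one
# direction, so they are a different tool from embeddings.
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE


def embed(text: str, direction: str = "ltr") -> str:
    """Wrap text in an embedding of the given direction (hypothetical helper)."""
    mark = RLE if direction == "rtl" else LRE
    return f"{mark}{text}{PDF}"


if __name__ == "__main__":
    for ch in (LRE, RLE, PDF, LRO, RLO):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

 Whether a renderer should pick the LRE or the RLE form for a given IRI is
 exactly the open question in this ticket; the sketch does not decide it.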

 I disagree that there’s anything particularly interesting about “safe”.  I
 think that as long as the sections are consistently from left to right or
 right to left it doesn’t matter whether it’s drawn http://www.microsoft.com
 or com.microsoft.www//:http.  Indeed if that were the user preference,
 independent of the actual script, then they’d always be consistent for
 that user.  If there does prove to be a spoofing problem with
 http://www.spoof.me.com/com.microsoft.www//:http type things, those are
 fairly easy for malware filters to detect.  Also 90% of users can’t tell
 that http://www.microsoft.safe-secure.com isn’t a great place to enter a
 credit card #.  At the machine level, the rendering is irrelevant since
 it’s always stored the same way.

 I really need a way, even optional if need be, of rendering for RTL
 before I can "sign off" on this draft :)

--
---------------------------+-----------------------------------------------
 Reporter:  shawnste@…     |       Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect         |      Status:  closed
 Priority:  major          |   Milestone:
Component:  bidi-          |     Version:
  guidelines               |  Resolution:  wontfix
 Severity:  -              |
 Keywords:  bidi           |
---------------------------+-----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:4>
iri <http://tools.ietf.org/wg/iri/>



Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.

Changes (by duerst@…):

 * status:  closed => reopened
 * resolution:  wontfix =>


Comment:

 Reopening it for Shawn. We definitely need wider consensus on how to
 proceed with this.

--
---------------------------+-----------------------------------------------
 Reporter:  shawnste@…     |       Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect         |      Status:  reopened
 Priority:  major          |   Milestone:
Component:  bidi-          |     Version:
  guidelines               |  Resolution:
 Severity:  -              |
 Keywords:  bidi           |
---------------------------+-----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:5>
iri <http://tools.ietf.org/wg/iri/>



RE: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

masinter
My read on the situation:

It would be helpful if we could get some agreed text describing the nature of the problem. It sounds to me that there might be agreement on the problem (more or less), just not on whether there are feasible (partial) solutions.

If we have agreement on the problem statement, then we can:

* document partial solutions (with caveats)
* say we don't believe there are any feasible solutions at this time

It would also be useful to get a survey of what current implementations actually do now, along with some concrete examples of the nature of the problems.

> I really need a way, even optional if need be, of rendering for RTL before I can "sign off" on this draft :)

There's no magic, just "rough consensus and running code":

* If all of the implementations agree, then we can document that.
* If there are multiple implementations currently, we can try to pick one.
* If we don't like any of the implementations, we can say so.
* If there are no implementations or even demos or samples of implementations, we shouldn't hold our breath hoping one will appear.

Larry
--
http://larry.masinter.net



RE: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Shawn Steele
IE is not great at the moment: it gets into the mixed-up situations that we all know are undesirable.

A "concrete example" seems hard, but one that I'm keen on is a partial web name on the side of a bus, in Arabic, eg: CCC.BBB.AAA.  Note that I'm intentionally leaving out the http:// and any default.html or whatever.  I have a difficult time imagining any Arabic speaker copying that onto a notepad other than by writing from right to left.  I also expect that they would then naturally type it the same way they wrote it.  I think we have to build from there, that's how 90% of the people use an IRI.  Nobody's going to type the http://, particularly in Arabic, because it requires a keyboard change, and the browser will add it for them.

In those 90% useful cases there is no mixed Latin/Arabic, it's just a domain name.  It's nice if we present mixed up stuff a little more orderly, but nobody cares about the part after the domain name.

I believe that we need to allow the same thing we have with LTR ordering, except for RTL.  Where it gets confusing to me is when to choose LTR or RTL behavior.  A few options seem possible:

* User Preference
* System/Application Preference (eg: I'm looking at an Arabic web site, so I'll show RTL labels.  I'm looking at an English web site, I'll show LTR labels).
* If there're any RTL characters, do the whole thing as RTL
* Restrict the RTL/LTR test to the primary part of the IRI, eg: domain.

Caveats are that many of those probably allow homographs in some cases (Maybe not User Preference, since they'd know it'd always be one direction or the other.)  I'm not worried about those cases as SmartScreen will easily filter those out if necessary.  It'd be harder if we didn't force RTL/LTR on the whole thing (eg: had current BIDI algorithm behavior).

-Shawn





Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.


Comment (by duerst@…):

 Hello Shawn,

 Two points of clarification:

 - At http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:4, you
 write "Indeed, the proposed behavior uses bidi override marks to get the
 desired behavior.", but it's not override marks, it's embedding marks.
 Otherwise, not a single RTL domain label or path component would be
 readable. (maybe that's what you meant, but in that case, please be
 careful with terminology)

 - At http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:7, you
 wrote about partial web names in all-Arabic on the side of a bus, e.g.
 CCC.BBB.AAA. In this specific case, the current spec (RFC 3987 and
 draft-ietf-iri-bidi-guidelines-02.txt) will do the right thing (because the
 Unicode Bidi algorithm reorders by runs, not by components). In that case,
 no embedding may be necessary. This is explicitly mentioned:

 {{{
    Also, a bidirectional relative IRI reference that only contains strong
    right-to-left characters and weak characters (such as symbols) and that
    starts and ends with a strong right-to-left character and appears in
    a text with right-to-left base directionality (such as used for
    Arabic or Hebrew) and is preceded and followed by whitespace and
    strong characters does not need an embedding.

 }}}
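
 The character-level part of the quoted condition can be sketched in
 Python as follows. The function name `may_skip_embedding` is hypothetical,
 and the sketch checks only the reference itself: the surrounding
 conditions (right-to-left base directionality, neighbouring whitespace
 and strong characters) are assumed to be verified elsewhere.

```python
import unicodedata

# Bidi classes as reported by unicodedata.bidirectional().
STRONG_RTL = {"R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}


def may_skip_embedding(ref: str) -> bool:
    """Check the character-level conditions from the quoted paragraph.

    True only if every character is strong RTL or weak, and the
    reference starts and ends with a strong RTL character.
    """
    if not ref:
        return False
    classes = [unicodedata.bidirectional(ch) for ch in ref]
    if any(c not in STRONG_RTL and c not in WEAK for c in classes):
        return False
    return classes[0] in STRONG_RTL and classes[-1] in STRONG_RTL
```

 A two-label all-Hebrew or all-Arabic name separated by a full stop passes
 this check; any ASCII letter in the reference makes it fail.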

--
---------------------------+-----------------------------------------------
 Reporter:  shawnste@…     |       Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect         |      Status:  reopened
 Priority:  major          |   Milestone:
Component:  bidi-          |     Version:
  guidelines               |  Resolution:
 Severity:  -              |
 Keywords:  bidi           |
---------------------------+-----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:8>
iri <http://tools.ietf.org/wg/iri/>



RE: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Shawn Steele
>> (maybe that's what you meant, but in that case, please be
>> careful with terminology)

Yes, sorry.

>>
>> - At http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:7, you
>> wrote about partial web names in all-Arabic on the side of a bus, e.g.
>> CCC.BBB.AAA. In this specific case, the current spec (RFC 3987 and
>> draft-ietf-iri-bidi-guidelines-02.txt) will do the right thing (because the
>> Unicode Bidi algorithm reorders by runs, not by components). In that case,
>> no embedding may be necessary. This is explicitly mentioned:

I actually got confused a bit and reread the specification.  Now I like the behavior even less :)

Our investigation found that the parts of an IRI are treated like a list.  If I have a list like (Afra, Joe, Mary, Maysun, Mohamed, Phil), I'm not going to change the order of the list because of my language; I expect it to stay (AFRA, joe, mary, MAYSUN, MOHAMED, phil), not (AFRA, joe, mary, MOHAMED, MAYSUN, phil).  (Though I confess to mixing metaphors, because I used alphabetization to sort my list and clearly that'd be different in different scripts.  I imagine I'm getting the idea across though; maybe it was an org chart that just so happens to have people arranged alphabetically by transliterated Latin name :)).
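
The reordering in that list example can be reproduced with a deliberately simplified Python model. It assumes the usual convention that UPPERCASE stands for right-to-left text, treats each name as an atomic label, and ignores both character order inside a label and the resolution of separators, so it is an illustration of run-based reordering rather than an implementation of the Unicode Bidi algorithm. The function name is hypothetical.

```python
from itertools import groupby
from typing import List


def visual_order_in_ltr_context(labels: List[str]) -> List[str]:
    """Model run-based reordering: adjacent RTL labels form one run,
    and each RTL run is displayed in reverse inside an LTR context."""
    out: List[str] = []
    # str.isupper is the stand-in test for "this label is RTL".
    for is_rtl, run in groupby(labels, key=str.isupper):
        items = list(run)
        out.extend(reversed(items) if is_rtl else items)
    return out


if __name__ == "__main__":
    names = ["AFRA", "joe", "mary", "MAYSUN", "MOHAMED", "phil"]
    print(visual_order_in_ltr_context(names))
```

The two adjacent uppercase names swap places while everything else stays put, which is the rearrangement objected to above.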

Similarly, http://www.microsoft.com/en-us/default.aspx is ordered something like a://b.c.d/e/f.g.  A list can keep its order rendered as either a://b.c.d/e/f.g or g.f/e/d.c.b//:a.  Which is appropriate depends on the situation, but if we start rearranging the order of the labels it gets really confusing.  At that point 99% of the populace would lose all hope of realizing there's an order to an IRI.  (Right now few people could correctly parse one anyway, but it'd get way worse.)

IMO, which way the parts are ordered is less important than the fact they're consistently ordered.
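
A consistent reversal of that kind can be sketched in a few lines of Python. The function name `reverse_component_order` is hypothetical, and the delimiter set (`:`, `/`, `.`) is an assumption that covers only the examples in this thread, not queries, fragments or userinfo. It models the visual order of labels only; the stored IRI is unchanged.

```python
import re


def reverse_component_order(iri: str) -> str:
    """Reverse the sequence of labels and delimiters, keeping the
    characters inside each label in their original order."""
    # The capture group makes re.split keep the delimiters as tokens.
    tokens = re.split(r"([:/.])", iri)
    return "".join(reversed(tokens))


if __name__ == "__main__":
    print(reverse_component_order("a://b.c.d/e/f.g"))
    print(reverse_component_order("http://www.microsoft.com"))
```

Applying the function twice gives back the original string, so either rendering carries the same list in the same order, just read from the other end.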

-Shawn


Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Slim Amamou-3
I support this view. I'd add that it's acceptable to me if the LTR
order for the components is enforced on IRIs. The other solution is to
state in the RFC that every IRI spec MUST define an overall ordering
for the components, either LTR or RTL.




--
Slim Amamou | سليم عمامو
http://alixsys.com


RE: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Shawn Steele
I disagree (obviously) about enforcing the LTR ordering though...  I think that's up to the situation/user :)

-Shawn

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Slim Amamou
Sent: ,  29,  2012 8:32
To: Shawn Steele
Cc: iri issue tracker; [hidden email]; [hidden email]; [hidden email]; [hidden email]
Subject: Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

I support this view. I'd add that It's acceptable for me if the LTR order for the components is enforced on IRIs. The other solution is to state in the RFC that every IRI spec MUST define an overall ordering for the components either LTR or RTL.

On Thu, Mar 29, 2012 at 4:10 PM, Shawn Steele <[hidden email]> wrote:

>(...)
>
> Our investigation is that the parts of an IRI are treated like a list.  If I have a list like (Afra, Joe, Mary, Maysun, Mohamed, Phil), I'm not going to change the order of the list because of my language, I expect it to stay (AFRA, joe, mary, MAYSUN, MOHAMED, phil), not (AFRA, joe, mary, MOHAMED, MAYSUN, phil).  (Though I confess to mixing metaphors because I used alphabitization to sort my list and clearly in different scripts that'd be different.  I imagine I'm getting the idea across though, maybe it was an org chart that just so happens to have people arranged alphabetically by transliterated Latin name :)).
>
> Similarly for http://www.microsoft.com/en-us/default.aspx, it's ordered something like a://b.c.d/e/f.g  -- A list can keep its order rendered as either a://b.c.d/e/f.g or g.f/e/d.c.b//:a.  Which is appropriate depends on the situation, but if we start rearranging the order of the labels it gets really confusing.  At that point 99% of the populace would lose all hope of realizing there's an order to an IRI.  (Right now few people could correctly parse one anyway, but it'd get way worse).
>
> IMO, which way the parts are ordered is less important than the fact they're consistently ordered.
>
> -Shawn
>



--
Slim Amamou | سليم عمامو
http://alixsys.com


Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Slim Amamou-3
It can't be the user's choice. It's either LTR or RTL by the spec, because
if a user from Bahrain on a trip to the UK had to write down a URL
written on a bus in London, he would transcribe it inverted.

On Thu, Mar 29, 2012 at 5:09 PM, Shawn Steele
<[hidden email]> wrote:
> I disagree (obviously) about enforcing the LTR ordering though....  I think that's up to the situation/user :)
>



--
Slim Amamou | سليم عمامو
http://alixsys.com


RE: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Shawn Steele
Yes, it gets complicated.  However, while in London, everything else on the bus is in LTR context, while on their computer at home everything is in RTL context.  The thinking is more like:

A) An Arabic user will (eventually) use a lot of Arabic domain names, so they'll be used to the ARABIC.WWW://http form.  
B) So then http://www.english.com will seem funny to them, if their browser's still aligning stuff to the right, etc.

I think that there are enough other contextual differences when switching languages/travelling that realizing that a URL on a double decker bus in London needs to be handled in the right way isn't that hard.  Certainly there's a bigger immediate danger in driving on the wrong side of the road :)

For "us", we have the order the data is stored in.  For readers I don't think there's a "perfect" solution that is unambiguous in all cases, so I would prefer to err on the side of having things readable in the user's normal way of thinking.

-Shawn

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Slim Amamou
Sent: ,  29,  2012 9:18
To: Shawn Steele
Cc: iri issue tracker; [hidden email]; [hidden email]; [hidden email]; [hidden email]
Subject: Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

It can't be the user's choice. It's either LTR or RTL by the spec, because if a user from Bahrain on a trip to the UK had to write down a URL written on a bus in London, he would transcribe it inverted.

On Thu, Mar 29, 2012 at 5:09 PM, Shawn Steele <[hidden email]> wrote:
> I disagree (obviously) about enforcing the LTR ordering though....  I
> think that's up to the situation/user :)
>



--
Slim Amamou | سليم عمامو
http://alixsys.com


Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

iri issue tracker
In reply to this post by iri issue tracker
#121: BIDI: Some users are requiring right-to-left label ordering.


Comment (by adil@…):

 I think I should clarify what I meant:

 Right now if you have a 'simple' URL that is all in Arabic it will be
 rendered right-to-left even given the restrictions of this document. So,
 using the normal bidi notation (capitals for rtl characters):
 Logical order:
   `http://ABC.DEF.GHI/JKL`
 Appears as:
   `http://LKJ/IHG.FED.CBA`
 Or without the http.. :
   `LKJ/IHG.FED.CBA`

 This is why I believe the current situation satisfies the 'side of a bus
 URL' criterion for a small subset of right-to-left URLs.

 The point is to strictly define what that subset is and create tools and
 documents to verify it so that web sites and browsers can display them.
 Also within this subset of URLs it is possible to have browsers draw these
 in the URL bar right-to-left and right aligned. But I do not know if this
 document is the place for such a definition.
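
 The reordering in the example above can be sketched in a few lines. This
 is only a toy model, not the Unicode Bidi Algorithm: it assumes the same
 notation as the comment (uppercase ASCII stands for RTL characters), an
 overall LTR context, and that every non-letter between two RTL letters
 joins their run. The function name `display_ltr` is hypothetical.

```python
import re

# Toy model only: uppercase ASCII letters stand in for RTL characters,
# lowercase letters for LTR characters, as in the notation above.
# An RTL run starts and ends on an RTL letter; separators such as '.'
# and '/' that sit between two RTL letters are absorbed into the run.
RTL_RUN = re.compile(r"[A-Z](?:[^A-Za-z]*[A-Z])*")


def display_ltr(logical: str) -> str:
    """Return an approximate visual order for `logical` in an LTR context."""
    # Each RTL run is shown reversed; everything else keeps its position.
    return RTL_RUN.sub(lambda m: m.group(0)[::-1], logical)


print(display_ltr("http://ABC.DEF.GHI/JKL"))  # http://LKJ/IHG.FED.CBA
print(display_ltr("http://ABC.com/JKL"))      # http://CBA.com/LKJ
```

 The second call shows why the "simple" subset matters: as soon as one
 LTR label sits between RTL labels, the single right-to-left run breaks
 into separate pieces.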

--
---------------------------+-----------------------------------------------
 Reporter:  shawnste@…     |       Owner:  draft-ietf-iri-bidi-guidelines@…
     Type:  defect         |      Status:  reopened
 Priority:  major          |   Milestone:
Component:  bidi-          |     Version:
  guidelines               |  Resolution:
 Severity:  -              |
 Keywords:  bidi           |
---------------------------+-----------------------------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/121#comment:9>
iri <http://tools.ietf.org/wg/iri/>



RE: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Shawn Steele
>Also within this subset of URLs it is possible to have browsers draw these  in the URL bar
>right-to-left and right aligned. But I do not know if this  document is the place for such a definition.

The document explicitly prohibits alternate renderings, like LKJ/IHG.FED.CBA//:http

I find the current behavior very bad since it treats the IRI like unstructured text.  However there is a structure; there's an order to the labels.  If we'd never heard of the BIDI algorithm, our first attempt, from a clean slate, to solve this problem would not allow the ordering of the labels to be exchanged.  The only reason we're considering that is because we've seen what the Bidi Algorithm does to other text in completely different contexts.

My requirements are:
1) The logical order of the parts MUST be preserved.
2) There MUST be a way for mostly Arabic, etc. IRIs to be rendered right to left.
        * So the corollary of 1 & 2 is that the protocol has to go on the right
3) I'd really like a MAY that allows some flexibility for 2; when it's LTR and when it's RTL.  I don't think we're going to get it perfect in our first pass.

At a minimum, I'd suggest that any RTL characters in the domain or email local parts should force 2).  
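
A rough sketch of requirements 1) and 2), plus the "any RTL character in the domain forces 2)" suggestion, might look like the following. It is an illustration under assumptions, not something the draft defines: it uses the thread's toy notation (uppercase ASCII stands for RTL characters), treats any label containing a lowercase letter or digit as LTR, and the names `render_components_rtl` and `render` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Labels are runs of letters, digits and hyphens; anything else is a delimiter run.
TOKEN = re.compile(r"[A-Za-z0-9-]+|[^A-Za-z0-9-]+")


def render_components_rtl(logical: str) -> str:
    """Lay the components out right to left without changing their sequence."""
    shown = []
    for tok in reversed(TOKEN.findall(logical)):
        if any(c.islower() or c.isdigit() for c in tok):
            shown.append(tok)        # LTR label: keeps its internal order
        else:
            shown.append(tok[::-1])  # RTL label or delimiter: mirrored
    return "".join(shown)


def render(logical: str) -> str:
    """Switch to the RTL layout when the domain contains any RTL character."""
    host = urlsplit(logical).netloc  # netloc keeps the original letter case
    if any(c.isupper() for c in host):
        return render_components_rtl(logical)
    return logical


print(render("http://ABC.DEF.GHI/JKL"))            # LKJ/IHG.FED.CBA//:http
print(render("http://www.english.com"))            # http://www.english.com
print(render_components_rtl("a://b.c.d/e/f.g"))    # g.f/e/d.c.b//:a
```

Reading the first result from right to left gives the scheme, then the domain labels, then the path, which is the same sequence a reader of the LTR form gets from left to right.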

-Shawn

Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Martin J. Dürst
Sorry for the delay in writing this answer.

On 2012/03/30 2:09, Shawn Steele wrote:
>> Also within this subset of URLs it is possible to have browsers draw these  in the URL bar
>> right-to-left and right aligned. But I do not know if this  document is the place for such a definition.
>
> The document explicitly prohibits alternate renderings, like LKJ/IHG.FED.CBA//:http

Yes, it currently does. I personally don't necessarily think we need to
keep it that strict. But we need to be very sure of what the trade-offs
are, and there are definitely very strong trade-offs.

One thing that may be possible to remove is the condition that the
embedding be LTR, thus also allowing RTL embedding. But I understand
that wouldn't yet make you happy.


> I find the current behavior very bad since it treats the IRI like unstructured text.

Indeed IRIs are treated like unstructured text, but that may not
necessarily be bad.


> However there is a structure; there's an order to the labels.

Yes. Some people are very aware of that structure, others aren't.


> If we'd never heard of the BIDI algorithm, our first attempt, from a clean slate, to solve this problem would not allow the ordering of the labels to be exchanged.

I think that was indeed the case, until we realized that in order to do
that, one of two things is needed:
1) You have to insert Bidi marks into the IRI, which means it's no
longer the same IRI, or
2) You end up with different displays between places that "know" there's
an IRI (e.g. browser address bar) and places that don't
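
Point 1) can be illustrated with a short sketch. The IRI below is made up for the example (example.com plus two Hebrew letters in the path), and U+200E LEFT-TO-RIGHT MARK is used as one representative Bidi mark; the same holds for the other invisible Bidi controls.

```python
from urllib.parse import quote

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, but still a character

# Hypothetical IRI: the path is the two Hebrew letters alef and bet.
plain = "http://example.com/\u05d0\u05d1"
marked = "http://example.com/" + LRM + "\u05d0\u05d1"

# The two strings look alike on screen, yet they are different IRIs.
print(plain == marked)  # False

# Percent-encoding the UTF-8 bytes makes the extra character visible.
plain_uri = quote(plain, safe=":/")
marked_uri = quote(marked, safe=":/")
print(plain_uri)   # http://example.com/%D7%90%D7%91
print(marked_uri)  # http://example.com/%E2%80%8E%D7%90%D7%91
```

Anything that compares, caches, or dereferences the two strings will treat them as distinct resources, which is why marks cannot simply be added for display and left in the IRI.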


> The only reason we're considering that is because we've seen what the Bidi Algorithm does to other text in completely different contexts.

Actually, the current solution was proposed by Mati Allouche, and he
argued that it would be possible for people to understand the ordering
because of the heuristics they use when reading mixed text:

Read some text in the main direction, if you meet text in the other
direction, jump to the end of that run of text and read "backwards",
then continue with the text in the main direction. That's a different
heuristic to the one you have used as an equivalent, namely the list
(which the Unicode Bidi Algorithm actually also would "mess up" so that
sequential RTL items would be ordered RTL overall; not sure what people
usually do in these cases, whether they fix it up or not).

Mati said that this would not necessarily help URI/IRI experts, but
might actually be quite easy for non-experts, potentially the easiest
solution (easier than the strict component logical order) for them. I'm
not in a location where I have enough non-IRI-expert average bidi users
around me to test this.


> My requirements are:
> 1) The logical order of the parts MUST be preserved.

That sounds like a very logical requirement :-). As always in the IETF,
any arguments/data to support that would be very much appreciated (your
list equivalent is certainly counting towards that).


> 2) There MUST be a way for mostly Arabic, etc. IRIs to be rendered right to left.
> * So the corollary of 1 & 2 is that the protocol has to go on the right

By protocol, do you mean the scheme name (such as ftp:, mailto:, http:,
https:,...)?


> 3) I'd really like a MAY that allows some flexibility for 2; when it's LTR and when it's RTL.

You mean some flexibility depending on context? We could also make that
"MUST respect context". But then there's the problem that the context of
a side of a bus is rather vague :-).


> I don't think we're going to get it perfect in our first pass.

We are already at the second pass. The first pass was RFC 3987.


> At a minimum, I'd suggest that any RTL characters in the domain or email local parts should force 2).

In my personal view, I think that might be overkill. I'm not sure I'd
want everything turned around just because of a few RTL characters. But
if that's what everybody agrees on, I won't stand in the way.


The really tough problem for anything that reorders by component (what
you call 'logical order of parts') is that it may be easy to write a
standard that says so, but it's difficult to implement. Any thoughts
about that?


Regards,    Martin.


Re: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

Martin J. Dürst
In reply to this post by Slim Amamou-3
Hello Slim,

On 2012/03/30 0:31, Slim Amamou wrote:
> I support this view. I'd add that it's acceptable for me if the LTR
> order for the components is enforced on IRIs. The other solution is to
> state in the RFC that every IRI spec MUST define an overall ordering
> for the components either LTR or RTL.

What do you mean by "every IRI spec"? There is only one IRI spec.
Currently, it's RFC 3987, but we are working on an update.

Regards,    Martin.

>
> On Thu, Mar 29, 2012 at 4:10 PM, Shawn Steele
> <[hidden email]>  wrote:
>> (...)
>>
>> Our investigation is that the parts of an IRI are treated like a list.  If I have a list like (Afra, Joe, Mary, Maysun, Mohamed, Phil), I'm not going to change the order of the list because of my language, I expect it to stay (AFRA, joe, mary, MAYSUN, MOHAMED, phil), not (AFRA, joe, mary, MOHAMED, MAYSUN, phil).  (Though I confess to mixing metaphors because I used alphabetization to sort my list and clearly in different scripts that'd be different.  I imagine I'm getting the idea across though, maybe it was an org chart that just so happens to have people arranged alphabetically by transliterated Latin name :)).
>>
>> Similarly for http://www.microsoft.com/en-us/default.aspx, it's ordered something like a://b.c.d/e/f.g  -- A list can keep its order rendered as either a://b.c.d/e/f.g or g.f/e/d.c.b//:a.  Which is appropriate depends on the situation, but if we start rearranging the order of the labels it gets really confusing.  At that point 99% of the populace would lose all hope of realizing there's an order to an IRI.  (Right now few people could correctly parse one anyway, but it'd get way worse).
>>
>> IMO, which way the parts are ordered is less important than the fact they're consistently ordered.
>>
>> -Shawn
>>
>
>
>
