@lang and @xml:lang in XHTML+RDFa 1.1


@lang and @xml:lang in XHTML+RDFa 1.1

Shane McCarron
As most of you know, the XHTML 2 working group is in the process of
adding @lang to XHTML 1.1 as part of its efforts to ensure that XHTML
1.1 is maximally useful to assistive technologies.  Since we claim that
XHTML+RDFa is a superset of XHTML 1.1, I believe we need to add @lang
here too.

I am in the process of producing an XHTML+RDFa 1.1 editors draft, and I
am pondering what the inclusion of @lang means for the processing
rules.  To me, it makes perfect sense to say that @lang and @xml:lang
both can define the language of an element, and that @xml:lang takes
precedence.  This is what XHTML 1.1 says. Does anyone see a problem with
this?
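
To make the proposed rule concrete, here is a minimal sketch in Python of
"both attributes can declare the language, and @xml:lang wins". It is only
an illustration, not text from any draft: the function name
declared_language is hypothetical, and the sample markup is made up.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def declared_language(element):
    """Return the language an element declares itself, or None.

    @xml:lang takes precedence over @lang when both are present.
    An empty string is a real declaration meaning "no language".
    """
    if XML_LANG in element.attrib:
        return element.attrib[XML_LANG]
    return element.attrib.get("lang")

# Made-up sample: the two attributes disagree on <p>, only @lang on <em>.
sample = ET.fromstring(
    '<p lang="en" xml:lang="fr">Bonjour <em lang="de">Welt</em></p>'
)
print(declared_language(sample))     # fr
print(declared_language(sample[0]))  # de
```

Inheritance from ancestors is deliberately left out here; the sketch only
covers which attribute wins on a single element.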

-Shane


Re: @lang and @xml:lang in XHTML+RDFa 1.1

Ivan Herman-2


On 01/02/2010 01:58 AM, Shane McCarron wrote:

> As most of you know, the XHTML 2 working group is in the process of
> adding @lang to XHTML 1.1 as part of its efforts to ensure that XHTML
> 1.1 is maximally useful to assistive technologies.  Since we claim that
> XHTML+RDFa is a superset of XHTML 1.1, I believe we need to add @lang
> here too.
>
> I am in the process of producing an XHTML+RDFa 1.1 editors draft, and I
> am pondering what the inclusion of @lang means for the processing
> rules.  To me, it makes perfect sense to say that @lang and @xml:lang
> both can define the language of an element, and that @xml:lang takes
> precedence.  This is what XHTML 1.1 says. Does anyone see a problem with
> this?
This sounds perfectly fine with me. It makes a small but irritating
difference between HTML5 and XHTML versions go away, too.

Thanks!

Ivan

>
> -Shane
>

--

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf



Re: @lang and @xml:lang in XHTML+RDFa 1.1

KANZAKI Masahide-2
In reply to this post by Shane McCarron
Hello,

2010/1/2 Shane McCarron <[hidden email]>:
> I am in the process of producing an XHTML+RDFa 1.1 editors draft, and I am
> pondering what the inclusion of @lang means for the processing rules.  To
> me, it makes perfect sense to say that @lang and @xml:lang both can define
> the language of an element, and that @xml:lang takes precedence.  This is
> what XHTML 1.1 says. Does anyone see a problem with this?


As I posted almost a year ago [1], I've been thinking that the use of
X/HTML lang (xml:lang) attribute to generate RDF lang tag might be one
of the most problematic parts in RDFa.

Generally, HTML authors put the lang attribute on the <html> element,
perhaps to take advantage of assistive technology, without noticing
that the attribute value is inherited down the element tree. So we
tend to write something like:

<html lang="ja" xml:lang="ja">
...
<p xmlns:dc="http://purl.org/dc/terms/">
Updated: <span property="dc:modified">2010-01-13</span>...
</p>

which generates a weird triple:

<> dc:modified "2010-01-13"@ja .

I've seen a bunch of nonsensical lang-tagged triples come from RDFa
documents. We can cancel the lang attribute with an empty lang="" or
xml:lang="", or add a datatype attribute, but very few people do
this in practice. And if there is an English or French name marked
with property="foaf:name", that name will be tagged with @ja.

Another concern is that if a user copies and pastes the above paragraph
into his/her own HTML, it will still be valid RDFa but will generate a
different triple (without the lang tag).
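
The inheritance effect described above can be reproduced with a small
sketch in Python. This is a deliberately simplified stand-in for an RDFa
processor, not a real one: it ignores @content, @datatype and subject
resolution, and the function name plain_literals is hypothetical.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def plain_literals(element, inherited=None):
    """Yield (property, text, language) for every element with @property.

    The language is taken from @xml:lang, then @lang, then the nearest
    ancestor. An empty declaration cancels the inherited language.
    """
    lang = element.attrib.get(XML_LANG, element.attrib.get("lang", inherited))
    if "property" in element.attrib:
        text = "".join(element.itertext())
        yield (element.attrib["property"], text, lang or None)
    for child in element:
        yield from plain_literals(child, lang)

SPAN = '<span property="dc:modified"{extra}>2010-01-13</span>'
PARAGRAPH = "<p>Updated: " + SPAN + "</p>"
PAGE = '<html lang="ja" xml:lang="ja"><body>' + PARAGRAPH + "</body></html>"

# Whole page: the date inherits "ja" from <html>.
print(list(plain_literals(ET.fromstring(PAGE.format(extra="")))))
# Same paragraph pasted elsewhere: no language at all.
print(list(plain_literals(ET.fromstring(PARAGRAPH.format(extra="")))))
# Author cancels the language on the span with xml:lang="".
print(list(plain_literals(ET.fromstring(PAGE.format(extra=' xml:lang=""')))))
```

The first call yields the date tagged "ja", while the other two yield it
with no language, which is the copy/paste discrepancy mentioned above.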

If the lang attribute is not included in the RDFa 1.1 processing model,
but is still allowed for accessibility purposes, then we can write only
the lang attribute on the <html> element to avoid this confusion. I'm
not sure whether this is a good solution, but RDFa 1.1 should be a
chance to clear up this unwelcome situation.

best regards,


[1] http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/0154.html

--
@prefix : <http://www.kanzaki.com/ns/sig#> . <> :from [:name
"KANZAKI Masahide"; :nick "masaka"; :email "[hidden email]"].


Re: @lang and @xml:lang in XHTML+RDFa 1.1

Toby Inkster-4
> <html lang="ja" xml:lang="ja">
> ...
> <p xmlns:dc="http://purl.org/dc/terms/">
> Updated: <span property="dc:modified">2010-01-13</span>...
> </p>
>
> which generates weird triple:
>
> <> dc:modified "2010-01-13"@ja .

It is a weird triple, but the XHTML itself is at fault -- the RDF
generated suffers as a consequence.

Note that the word "Updated", which is an English word but not, as far as I
know, a Japanese one, is tagged as being in Japanese too and will be
interpreted as such by all implementations of the xml:lang attribute --
not just RDFa processors.

This is an annoyance of language tagging in XHTML generally, and I don't
think it's RDFa's job to fix it. RDFa should simply use XHTML's built-in
mechanism for declaring languages (no matter how annoying it may be to use
correctly) rather than trying to invent its own rules.

One minor tweak that *might* alleviate some of the pain in authoring
documents that include multiple languages and/or a mixture of linguistic
and non-linguistic content would be to ask RDFa processors to implement
special handling for a few ISO 639-2 codes. Here are my suggestions:

1. "mul" is the code for "multiple languages". This would generate a
literal tagged with language @mul as you'd expect, however it would be
treated the same as xml:lang="" in terms of passing the language on to
descendant elements. Example:

  <div xml:lang="mul" property="ex:test1" content="Foo">
    <span property="ex:test2">Bar</span>
    <span property="ex:test3" xml:lang="en">Baz</span>
  </div>

would generate:

  <> ex:test1 "Foo"@mul .
  <> ex:test2 "Bar" .
  <> ex:test3 "Baz"@en .

This would allow authors to mark up the fact that an area of the page
contains multiple languages, and that RDFa processors should not try to
interpret the language of descendant elements without further prompts.

2. "zxx" is the code for non-linguistic content. Processors could
recognise xml:lang="zxx" as being equivalent to xml:lang="".
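
As a rough sketch of how the two suggestions could behave, here is some
Python. It is not an RDFa processor and not a proposed implementation:
it only handles @property, @content and element text, and the function
name literals_with_special_codes is made up for illustration.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def literals_with_special_codes(element, inherited=""):
    """Yield (property, value, language) with "mul" and "zxx" special-cased."""
    declared = element.attrib.get(XML_LANG, element.attrib.get("lang"))
    if declared is None:
        own = passed_down = inherited   # nothing declared: plain inheritance
    elif declared.lower() == "zxx":
        own = passed_down = ""          # suggestion 2: same as xml:lang=""
    elif declared.lower() == "mul":
        own, passed_down = "mul", ""    # suggestion 1: tag here, reset below
    else:
        own = passed_down = declared
    if "property" in element.attrib:
        value = element.attrib.get("content")
        if value is None:
            value = "".join(element.itertext())
        yield (element.attrib["property"], value, own or None)
    for child in element:
        yield from literals_with_special_codes(child, passed_down)

example = ET.fromstring(
    '<div xml:lang="mul" property="ex:test1" content="Foo">'
    '<span property="ex:test2">Bar</span>'
    '<span property="ex:test3" xml:lang="en">Baz</span>'
    "</div>"
)
for triple in literals_with_special_codes(example):
    print(triple)
```

Run on the example above, this yields "Foo" tagged mul, "Bar" with no
language and "Baz" tagged en, matching the three triples listed under
suggestion 1.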

-Toby


Re: @lang and @xml:lang in XHTML+RDFa 1.1

KANZAKI Masahide-2
Hi Toby, thanks for the reply.

2010/1/13 Toby Inkster <[hidden email]>:

>> <html lang="ja" xml:lang="ja">
>> ...
>> <p xmlns:dc="http://purl.org/dc/terms/">
>> Updated: <span property="dc:modified">2010-01-13</span>...
>> </p>
>>
>> which generates weird triple:
>>
>> <> dc:modified "2010-01-13"@ja .
>
> It is a weird triple, but the XHTML itself is at fault -- the RDF
> generated suffers as a consequence.

> Note that the word "Updated", which is an English word but not, as far as I
> know, a Japanese one, is tagged as being in Japanese too and will be
> interpreted as such by all implementations of the xml:lang attribute --
> not just RDFa processors.

Sorry, that's a consequence of my changing the example from the Japanese
equivalent to 'Updated' for non-Japanese readers. Even if I changed
lang/xml:lang to "en", we would have the same problem: "2010-01-13"@en
is not the intended result.


> This is an annoyance of language tagging in XHTML generally, and I don't
> think it's RDFa's job to fix it. RDFa should simply use XHTML's built-in
> mechanism for declaring languages (no matter how annoying it may be to use
> correctly) rather than trying to invent its own rules.


I agree with you on some points, but language tagging in XHTML is
primarily for assistive technologies, not for precise data annotation.
It is no big problem for screen readers when a <span> element containing
a date string carries language information, but it matters for RDFa
processors.

(I know simply ignoring the lang attribute is not a very good solution, though...)



> 1. "mul" is the code for "multiple languages". This would generate a
> literal tagged with language @mul as you'd expect, however it would be
> treated the same as xml:lang="" in terms of passing the language on to
> descendant elements. Example:
>
>  <div xml:lang="mul" property="ex:test1" content="Foo">
>    <span property="ex:test2">Bar</span>
>    <span property="ex:test3" xml:lang="en">Baz</span>
>  </div>
>
> would generate:
>
>  <> ex:test1 "Foo"@mul .
>  <> ex:test2 "Bar" .
>  <> ex:test3 "Baz"@en .
>
> This would allow authors to mark up the fact that an area of the page
> contains multiple languages, and that RDFa processors should not try to
> interpret the language of descendant elements without further prompts.
>
> 2. "zxx" is the code for non-linguistic content. Processors could
> recognise xml:lang="zxx" as being equivalent to xml:lang="".

Unfortunately, those do not help assistive technologies.

Thanks for the discussion.

--
@prefix : <http://www.kanzaki.com/ns/sig#> . <> :from [:name
"KANZAKI Masahide"; :nick "masaka"; :email "[hidden email]"].


Re: @lang and @xml:lang in XHTML+RDFa 1.1

Gregg Kellogg
In reply to this post by KANZAKI Masahide-2
In practice, the way I handle this is to perform range mapping on untyped literals. By looking up the range of the predicate, I can often infer the appropriate datatype for the literal, turning

        <> dc:modified "2010-01-13"@ja .

into

        <> dc:modified "2010-01-13"^^<http://www.w3.org/2001/XMLSchema#date> .

(having asserted xs:date as a range of dc:modified).

In this way, I can map untyped literals (possibly with a language) to appropriately typed literals.
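
A minimal sketch of that range mapping in Python follows. It is an
illustration under assumptions, not the actual code: the RANGES table
stands for whatever ranges the consumer has chosen to assert, and the
function name retag_by_range and the tuple layout are hypothetical.

```python
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
DC_MODIFIED = "http://purl.org/dc/terms/modified"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Ranges asserted by the consumer (an assumption of this sketch).
RANGES = {DC_MODIFIED: XSD_DATE}

def retag_by_range(predicate, value, language=None, datatype=None):
    """Return (value, language, datatype) after range mapping.

    An untyped literal whose predicate has a known range becomes a typed
    literal and loses any language tag. Everything else passes through.
    """
    if datatype is None and predicate in RANGES:
        return (value, None, RANGES[predicate])
    return (value, language, datatype)

# "2010-01-13"@ja becomes "2010-01-13"^^xsd:date
print(retag_by_range(DC_MODIFIED, "2010-01-13", language="ja"))
# A predicate with no asserted range keeps its language tag.
print(retag_by_range(FOAF_NAME, "KANZAKI Masahide", language="ja"))
```

This only helps for predicates that have an entry in the table; literals
on any other predicate keep whatever language they inherited.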

Gregg

On Jan 13, 2010, at 4:55 AM, KANZAKI Masahide wrote:

> Hello,
>
> 2010/1/2 Shane McCarron <[hidden email]>:
>> I am in the process of producing an XHTML+RDFa 1.1 editors draft, and I am
>> pondering what the inclusion of @lang means for the processing rules.  To
>> me, it makes perfect sense to say that @lang and @xml:lang both can define
>> the language of an element, and that @xml:lang takes precedence.  This is
>> what XHTML 1.1 says. Does anyone see a problem with this?
>
>
> As I posted almost a year ago [1], I've been thinking that the use of
> X/HTML lang (xml:lang) attribute to generate RDF lang tag might be one
> of the most problematic parts in RDFa.
>
> Generally, HTML authors put the lang attribute on the <html> element,
> perhaps to take advantage of assistive technology, without noticing
> that the attribute value is inherited down the element tree. So we
> tend to write something like:
>
> <html lang="ja" xml:lang="ja">
> ...
> <p xmlns:dc="http://purl.org/dc/terms/">
> Updated: <span property="dc:modified">2010-01-13</span>...
> </p>
>
> which generates a weird triple:
>
> <> dc:modified "2010-01-13"@ja .
>
> I've seen a bunch of nonsensical lang-tagged triples come from RDFa
> documents. We can cancel the lang attribute with an empty lang="" or
> xml:lang="", or add a datatype attribute, but very few people do
> this in practice. And if there is an English or French name marked
> with property="foaf:name", that name will be tagged with @ja.
>
> Another concern is that if a user copies and pastes the above paragraph
> into his/her own HTML, it will still be valid RDFa but will generate a
> different triple (without the lang tag).
>
> If the lang attribute is not included in the RDFa 1.1 processing model,
> but is still allowed for accessibility purposes, then we can write only
> the lang attribute on the <html> element to avoid this confusion. I'm
> not sure whether this is a good solution, but RDFa 1.1 should be a
> chance to clear up this unwelcome situation.
>
> best regards,
>
>
> [1] http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/0154.html
>
> --
> @prefix : <http://www.kanzaki.com/ns/sig#> . <> :from [:name
> "KANZAKI Masahide"; :nick "masaka"; :email "[hidden email]"].
>



Re: @lang and @xml:lang in XHTML+RDFa 1.1

KANZAKI Masahide-2
Hi Gregg, thanks for the reply.

2010/1/14 Gregg Kellogg <[hidden email]>:
> In practice, the way I handle this is to perform range mapping on untyped literals. By looking up the range of the predicate, I can often infer the appropriate datatype for the literal, turning
>
>        <> dc:modified "2010-01-13"@ja .
>
> into
>
>        <> dc:modified "2010-01-13"^^<http://www.w3.org/2001/XMLSchema#date> .
>
> (having asserted xs:date as a range of dc:modified).

Well, that's a good idea for RDFa consumers, though not all terms have
a range declared in their schema (e.g. properties from simple Dublin
Core). And the basic problem of lang tag handling in RDFa still
remains...

Anyway, thanks for the suggestion.

cheers,

--
@prefix : <http://www.kanzaki.com/ns/sig#> . <> :from [:name
"KANZAKI Masahide"; :nick "masaka"; :email "[hidden email]"].