Various issues with using CURIEs in OWL

Bijan Parsia-4
The OWL Working Group had intended to delegate our URI abbreviation  
mechanisms, both for in-spec and in-concrete-syntax use, to the CURIE  
spec. OWL has a number of different concrete serializations (including  
2 XML based and 2 non-XML based), all of which use (or I would like to  
use) CURIEs.

Unfortunately, while trying to use the CURIE spec, I (and others) have  
found that the current CURIE spec does not meet the WG needs even  
putting aside concerns about the ultimate disposition of the document:

1) For non-XML host languages: The CURIE spec provides no mechanism  
(although it provides permission) for excluding characters from the  
syntax of the local part of CURIEs. This means that in host languages  
which use symbols like ")" or "[" as part of their syntax, we run into  
parsing ambiguities. Note that safe CURIEs do not solve this problem  
as the safe CURIE delimiters are common host language delimiters.
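
To make the ambiguity concrete, here is a minimal sketch. The two patterns are hypothetical simplifications (they are not the normative CURIE or SPARQL grammars), and the functional-style expression is only an illustrative host-language fragment. The permissive pattern stands in for a reference that may contain ")"; the restricted one stands in for a QName-like local part.

```python
import re

# Hypothetical, simplified patterns for illustration only.
PREFIX = r"[A-Za-z_][A-Za-z0-9_.-]*"
PERMISSIVE_REF = r"[^\s]*"           # reference may contain ")" and similar
RESTRICTED_REF = r"[A-Za-z0-9_.-]*"  # QName-like local part, no delimiters


def first_curie(text, ref_pattern):
    """Return the first prefix:reference match in text, or None."""
    match = re.search(rf"{PREFIX}:{ref_pattern}", text)
    return match.group(0) if match else None


# An assumed host-language fragment that uses ")" as its own delimiter.
expr = "Declaration(Class(ex:Wheel))"

print(first_curie(expr, PERMISSIVE_REF))  # ex:Wheel))
print(first_curie(expr, RESTRICTED_REF))  # ex:Wheel
```

With the permissive reference, the closing parentheses of the host language are swallowed into the CURIE, which is the parsing problem described above.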

PROPOSED FIX: Ideally, there would be a "minimalistic" CURIE profile,  
something like SPARQL's abbreviation mechanism. Even QNames  
would be fine (though we'd recommend the spec point out that to cover  
all URIs there should be a non-abbreviated form).

Note that *permission* to make a subset isn't all that helpful. I  
mean, then we're just doing our own thing, yeah?

EDITORIAL NOTE: Many of us found the organization of the spec, and  
especially of the normative parts, very confusing. See:
    <http://www.w3.org/mid/943ED7DE-FBC9-4110-B17B-AF9F8043A0A1@...>

I suggest that "Usage" and "Examples" be consolidated, and that there  
are two normative sections, "Syntax" and "Incorporating CURIEs into  
Host Languages" which contain the respective constraints. The second  
section could usefully be broken down into "XML host languages" and  
"Non-XML host languages".

2) For XML host languages: The requirement to support the XML  
namespace based prefix declaration mechanism, even when an alternative  
mechanism is supplied, is simply a non-starter. Many in the XML world  
are hostile to the namespace based overloading (even for proper QNames!  
see RELAX NG and Schematron). But being forced to support *two*  
mechanisms, especially when one of them isn't desired, is  
unnecessarily restrictive and leads to the second mechanism not being  
used:
    <http://www.w3.org/mid/29397.1237034265@ubehebe>

3) For XML host languages: There's no reason not to have a standard  
prefix declaration mechanism in the XML namespace. What value is there  
in letting XML host languages coin a bunch of different ones?

For example, <xml:Prefix name="" IRI=""/> is (basically) the syntax  
we're adopting, except with Prefix in the OWL namespace.

4) Processing: In some languages, multiple declarations of a prefix  
have an overriding behavior. In OWL we chose to make that a syntax  
error. The CURIE spec should make clear the processing model.

To sum up, I personally don't think the CURIE spec helps either with  
implementation interop or with spec factoring, though I think it could  
be made to. Thus, in its current form, there's no point in citing it  
and, thus, no real point in it being a recommendation. The minimal  
necessary changes from my pov are:
        A) A proper XML mechanism with no requirement to support xmlns
        B) Sensible profiles (I suggest, QName/RDF, SPARQL, and ALL)
        C) A processing model
C could maybe be dropped. A is totally required. I just won't adhere,  
or recommend anyone adhere, to the requirement to use xmlns. It's a  
nonstarter. Thus I won't use or recommend people use the CURIE spec  
(in its current form) for XML based host languages.

I won't use or recommend citing the CURIE spec without B for non-XML  
host languages. If you are happy with this being "using curies" then  
ok :)

Hope this helps.

Cheers,
Bijan.

Re: Various issues with using CURIEs in OWL

Shane McCarron
My (personal) comments inline:

Bijan Parsia wrote:

> The OWL Working Group had intended to delegate our URI abbreviation
> mechanisms both for in-spec and in-concrete-syntax use. OWL has a
> number of different concrete serializations (including 2 XML based and
> 2 non-XML based), all of which use (or I would like to use) CURIEs.
>
> Unfortunately, while trying to use the CURIE spec, I (and others) have
> found that the current CURIE spec does not meet the WG needs even
> putting aside concerns about the ultimate disposition of the document:
>
> 1) For non-XML host languages: The CURIE spec provides no mechanism
> (although it provides permission) for excluding characters from the
> syntax of the local part of CURIEs. This means that in host languages
> which use symbols like ")" or "[" as part of their syntax, we run into
> parsing ambiguities. Note that safe CURIEs do not solve this problem
> as the safe CURIE delimiters are common host language delimiters.
>
> PROPOSED FIX: Ideally, there would be a "minimalistic" CURIE profile,
> something like SPARQL's abbreviation mechanism. Even QNames
> would be fine (though we'd recommend the spec point out that to cover
> all URIs there should be a non-abbreviated form).
The lexical form of a CURIE is an optional prefix, separator, and a
reference.  Are you saying that the characters permitted in prefix
(NCName) or reference (irelative-ref as defined in the IRI spec) are too
rich a set of characters?  And that in your use you needed to make this
collection of characters less rich?  If so, I agree that this is
permitted by the specification.
>
> Note that *permission* to make a subset isn't all that helpful. I
> mean, then we're just doing our own thing, yeah?
Not really - it means you are defining a subset or profile of a common
mechanism, and that a CURIE expressed in that subset would be
semantically still a CURIE.  One reason for using a common datatype is
that it helps with comprehension.

>
>
> EDITORIAL NOTE: Many of us found the organization of the spec, and
> especially of the normative parts, very confusing. See:
>    
> <http://www.w3.org/mid/943ED7DE-FBC9-4110-B17B-AF9F8043A0A1@...>
>
> I suggest that "Usage" and "Examples" be consolidated, and that there
> are two normative sections, "Syntax" and "Incorporating CURIEs into
> Host Languages" which contain the respective constraints. The second
> section could usefully be broken down into "XML host languages" and
> "Non-XML host languages".
Thanks for this.  We are already done with CR more or less, but I will
see what I can do.

>
> 2) For XML host languages: The requirement to support the XML
> namespace based prefix declaration mechanism, even when an alternative
> mechanism is supplied, is simply a non-starter. Many in the XML world
> are hostile to the namespace based overloading (even for proper QNames!
> see RELAX NG and Schematron). But being forced to support *two*
> mechanisms, especially when one of them isn't desired, is
> unnecessarily restrictive and leads to the second mechanism not being
> used:
>    <http://www.w3.org/mid/29397.1237034265@ubehebe>
The XHTML 2 Working Group has already agreed to remove this
restriction.  In fact, what we agreed was that it was the host
language's responsibility to define its prefix mapping mechanism(s).
>
> 3) For XML host languages: There's no reason not to have a standard
> prefix declaration mechanism in the XML namespace. What value is there
> in letting XML host languages coin a bunch of different ones?
>
> For example, <xml:Prefix name="" IRI=""/> is (basically) the syntax
> we're adopting, except with Prefix in the OWL namespace.
Perhaps.  The XHTML 2 Working Group does not have authority to mess in
the xml space.  I am sure the group will discuss your suggestion.
>
> 4) Processing: In some languages, multiple declarations of a prefix
> have an overriding behavior. In OWL we chose to make that a syntax
> error. The CURIE spec should make clear the processing model.
We believe the processing model is completely host-language specific.  
The concept of a CURIE, that is an abbreviation that maps to an IRI, is
general.  The expression of that concept in a host language is
necessarily going to be related to that host language.  For example,
were you to use CURIEs in HTML you would not want to use some "xml"
mechanism to map a prefix.

>
> To sum, I, personally, don't think the CURIE spec helps either with
> implementation interop or with spec factoring, though I think it could
> be made to. Thus, in its current form, there's no point in citing it
> and, thus, no real point in it being a recommendation. The minimal
> necessary changes from my pov are:
>     A) A proper XML mechanism with no requirement to support xmlns
>     B) Sensible profiles (I suggest, QName/RDF, SPARQL, and ALL)
>     C) A processing model
> C could maybe be dropped. A is totally required. I just won't adhere,
> or recommend anyone adhere, to the requirement to use xmlns. It's a
> nonstarter. Thus I won't use or recommend people use the CURIE spec
> (in its current form) for XML based host languages.
I think we have already addressed this requirement.  Thanks for
reinforcing it though.
>
> I won't use or recommend citing the CURIE spec without B for non-XML
> host languages. If you are happy with this being "using curies" then
> ok :)
>
> Hope this helps.
I think it did!  I really appreciate your taking the time to send this.  
The working group will get you a formal response in due course.
>
> Cheers,
> Bijan.

--
Shane P. McCarron                          Phone: +1 763 786-8160 x120
Managing Director                            Fax: +1 763 786-8180
ApTest Minnesota                            Inet: [hidden email]



Re: Various issues with using CURIEs in OWL

Bijan Parsia-4
(Sean is my AC rep.)
On 10 Apr 2009, at 00:15, Shane McCarron wrote:

> My (personal) comments inline:
>
> Bijan Parsia wrote:
>> The OWL Working Group had intended to delegate our URI abbreviation  
>> mechanisms both for in-spec and in-concrete-syntax use. OWL has a  
>> number of different concrete serializations (including 2 XML based  
>> and 2 non-XML based), all of which use (or I would like to use)  
>> CURIEs.
>>
>> Unfortunately, while trying to use the CURIE spec, I (and others)  
>> have found that the current CURIE spec does not meet the WG needs  
>> even putting aside concerns about the ultimate disposition of the  
>> document:
>>
>> 1) For non-XML host languages: The CURIE spec provides no mechanism  
>> (although it provides permission) for excluding characters from the  
>> syntax of the local part of CURIEs. This means that in host  
>> languages which use symbols like ")" or "[" as part of their  
>> syntax, we run into parsing ambiguities. Note that safe CURIEs do  
>> not solve this problem as the safe CURIE delimiters are common host  
>> language delimiters.
>>
>> PROPOSED FIX: Ideally, there would be a "minimalistic" CURIE  
>> profile, something like SPARQL's abbreviation mechanism.  
>> Even QNames would be fine (though we'd recommend the spec point out  
>> that to cover all URIs there should be a non-abbreviated form).
> The lexical form of a CURIE is an optional prefix, separator, and a  
> reference.  Are you saying that the characters permitted in prefix  
> (NCName) or reference (irelative-ref as defined in the IRI spec) are  
> too rich a set of characters?

Reference, yes.

>  And that in your use you needed to make this collection of  
> characters less rich?

Yes.

>  If so, I agree that this is permitted by the specification.

But this gives me no reason to use the spec, esp. with a normative  
reference.

Without a specific subsetting mechanism (e.g., for the datatype, one  
could define by restriction) I think adopting a different set of  
CURIEs just means not adopting the CURIE spec.

Contrast our use of the IRI and SPARQL specs:
        http://www.w3.org/2007/OWL/wiki/Syntax#IRIs

fullIRI := an IRI as defined in [RFC3987], enclosed in a pair of < (U+3C)
and > (U+3E) characters
prefixName := a finite sequence of characters matching the PNAME_NS  
production of [SPARQL]
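
As a rough sketch of how these two productions can be told apart lexically: the regular expression below is an ASCII-only simplification of SPARQL's PNAME_NS (the real production admits a much larger character set), and the function name `classify` is hypothetical.

```python
import re

# ASCII-only simplification of PNAME_NS: an optional prefix followed by ":".
# The prefix may contain "." internally but may not end with it.
PNAME_NS = re.compile(r"(?:[A-Za-z](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?)?:")


def classify(token):
    """Classify a token as a fullIRI, a prefixName, or unknown."""
    if len(token) >= 2 and token.startswith("<") and token.endswith(">"):
        # fullIRI: strip the enclosing < and > (IRI validity is not checked).
        return ("fullIRI", token[1:-1])
    if PNAME_NS.fullmatch(token):
        return ("prefixName", token)
    return ("unknown", token)


print(classify("<http://example.org/a>"))  # ('fullIRI', 'http://example.org/a')
print(classify("owl:"))                    # ('prefixName', 'owl:')
print(classify("owl"))                     # ('unknown', 'owl')
```

The point of the contrast is that both productions are defined by direct reference to an existing grammar, so there is nothing left to subset by hand.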

I think there are three reasonable categories of CURIE:

        Exactly QName
        What SPARQL currently does
        Full irelative-ref for reference

There are a couple of others I could imagine (e.g., with %-encoding for  
strict ASCII). But without at least these I don't think the CURIE spec  
is something SPARQL or OWL should use.
>> Note that *permission* to make a subset isn't all that helpful. I  
>> mean, then we're
>> just doing our own thing, yeah?
> Not really - it means you are defining a subset or profile of a  
> common mechanism,

We disagree strongly. Without a defined subsetting mechanism, it's  
just not helpful. It *might* have been helpful with defined processing  
models...but we don't have that.

Thus, you've not convinced me. At the moment I am better off ignoring  
the CURIE spec.

> and that a CURIE expressed in that subset would be semantically  
> still a CURIE.  One reason for using a common datatype is that it  
> helps with comprehension.

? Comprehension support is not a goal. Specification factoring or  
implementation interop are.

I find it very hard to believe that having to read another spec  
improves comprehension.

>> EDITORIAL NOTE: Many of us found the organization of the spec, and  
>> especially of the normative parts, very confusing. See:
>>   <http://www.w3.org/mid/943ED7DE-FBC9-4110-B17B-AF9F8043A0A1@...>
>>
>> I suggest that "Usage" and "Examples" be consolidated, and that  
>> there are two normative sections, "Syntax" and "Incorporating  
>> CURIEs into Host Languages" which contain the respective  
>> constraints. The second section could usefully be broken down into  
>> "XML host languages" and "Non-XML host languages".
> Thanks for this.  We are already done with CR more or less, but I  
> will see what I can do.

I don't see how you can get out of CR to PR, looking at your  
implementation report. At this stage, I'm now asking Sean, my AC rep,  
to oppose such a transition.

Speaking as a spec implementor who sincerely tried to use the CURIE  
spec, I think there are problems that merit serious changes to the  
design of the language. This means another LC, if I'm not mistaken.

>> 2) For XML host languages: The requirement to support the XML  
>> namespace based prefix declaration mechanism, even when an  
>> alternative mechanism is supplied, is simply a non-starter. Many in  
>> the XML world are hostile to the namespace based overloading (even  
>> for proper QNames! see RELAX NG and Schematron). But being forced  
>> to support *two* mechanisms, especially when one of them isn't  
>> desired, is unnecessarily restrictive and leads to the second  
>> mechanism not being used:
>>   <http://www.w3.org/mid/29397.1237034265@ubehebe>
> The XHTML 2 Working Group has already agreed to remove this  
> restriction.

Great. That seems to trigger another LC.

>  In fact, what we agreed was that it was the host language's  
> responsibility to define its prefix mapping mechanism(s).

Well...if that means that we all reinvent ours, then I don't think  
it's a good idea. For me, this means that the CURIE spec is not a  
sensible rec track document, but would be better as a note.

>> 3) For XML host languages: There's no reason not to have a standard  
>> prefix declaration mechanism in the XML namespace. What value is  
>> there in letting XML host languages coin a bunch of different ones?
>>
>> For example, <xml:Prefix name="" IRI=""/> is (basically) the syntax  
>> we're adopting, except with Prefix in the OWL namespace.
> Perhaps.  The XHTML 2 Working Group does not have authority to mess  
> in the xml space.

OK, use your own namespace. The xml namespace would be better.

>  I am sure the group will discuss your suggestion.

Thanks!

>> 4) Processing: In some languages, multiple declarations of a prefix  
>> have an overriding behavior. In OWL we chose to make that a syntax  
>> error. The CURIE spec should make clear the processing model.
> We believe the processing model is completely host-language specific.

I don't think that's helpful. There are at least 2 sensible, fairly  
common, processing models:
        Error on redefinition
        Lexically nearest wins

Both are in common use. Define them. Provide a way to reference them.
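
The two models can be sketched as follows. This is only an illustration under stated assumptions: the function names, the exception class, and the example IRIs are hypothetical, and the strict variant rejects any repeated declaration, even one that maps the prefix to the same IRI.

```python
class PrefixRedefinitionError(ValueError):
    """Raised when a prefix is declared more than once."""


def build_table_strict(declarations):
    """Model 1: error on redefinition.

    `declarations` is an iterable of (prefix, iri) pairs in document order.
    """
    table = {}
    for prefix, iri in declarations:
        if prefix in table:
            raise PrefixRedefinitionError(f"prefix {prefix!r} declared twice")
        table[prefix] = iri
    return table


def build_table_nearest_wins(scopes):
    """Model 2: lexically nearest wins.

    `scopes` is a list of {prefix: iri} dicts ordered from outermost to
    innermost; inner declarations override outer ones.
    """
    table = {}
    for scope in scopes:
        table.update(scope)
    return table


def expand(curie, table):
    """Expand prefix:reference by concatenating the mapped IRI and reference."""
    prefix, _, reference = curie.partition(":")
    return table[prefix] + reference


outer = {"ex": "http://example.org/a#"}
inner = {"ex": "http://example.org/b#"}

# Nearest wins: the inner declaration shadows the outer one.
print(expand("ex:Wheel", build_table_nearest_wins([outer, inner])))

# Strict: the same pair of declarations is rejected.
try:
    build_table_strict([*outer.items(), *inner.items()])
except PrefixRedefinitionError as err:
    print("syntax error:", err)
```

The same input therefore yields an IRI under one model and an error under the other, which is why a host language needs a way to say which model it follows.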

>  The concept of a CURIE, that is an abbreviation that maps to an  
> IRI, is general.  The expression of that concept in a host language  
> is necessarily going to be related to that host language.  For  
> example, were you to use CURIEs in HTML you would not want to use  
> some "xml" mechanism to map a prefix.

Sure, but, uhm, HTML is not an XML host language. And I'm confused as  
to why we're talking syntax rather than processing.

>> To sum, I, personally, don't think the CURIE spec helps either with  
>> implementation interop or with spec factoring, though I think it  
>> could be made to. Thus, in its current form, there's no point in  
>> citing it and, thus, no real point in it being a recommendation.  
>> The minimal necessary changes from my pov are:
>>    A) A proper XML mechanism with no requirement to support xmlns
>>    B) Sensible profiles (I suggest, QName/RDF, SPARQL, and ALL)
>>    C) A processing model
>> C could maybe be dropped. A is totally required. I just won't  
>> adhere, or recommend anyone adhere, to the requirement to use  
>> xmlns. It's a nonstarter. Thus I won't use or recommend people use  
>> the CURIE spec (in its current form) for XML based host languages.
> I think we have already addressed this requirement.  Thanks for  
> reinforcing it though.

Great! I look forward to the next LC.

>> I won't use or recommend citing the CURIE spec without B for  
>> non-XML host languages. If you are happy with this being "using  
>> curies" then ok :)
>>
>> Hope this helps.
> I think it did!  I really appreciate your taking the time to send  
> this.  The working group will get you a formal response in due course.

Great!

Cheers,
Bijan.

Re: Various issues with using CURIEs in OWL

Bijan Parsia-4
In reply to this post by Shane McCarron
Another way of putting the concern:
    If
       * the set of CURIEs is not common (since we can subset in
         different ways), and
       * the syntax (even in XML) is not common, and
       * the processing model(s) is (are) not common
    then
       why is it better to have a recommendation than for us all to
       continue as we've been doing?

Especially when you consider the cost of standardization.

Cheers,
Bijan.

Re: Various issues with using CURIEs in OWL

Shane McCarron
In reply to this post by Bijan Parsia-4


Bijan Parsia wrote:

>
>
>>> EDITORIAL NOTE: Many of us found the organization of the spec, and
>>> especially of the normative parts, very confusing. See:
>>>  
>>> <http://www.w3.org/mid/943ED7DE-FBC9-4110-B17B-AF9F8043A0A1@...>
>>>
>>>
>>> I suggest that "Usage" and "Examples" be consolidated, and that
>>> there are two normative sections, "Syntax" and "Incorporating CURIEs
>>> into Host Languages" which contain the respective constraints. The
>>> second section could usefully be broken down into "XML host
>>> languages" and "Non-XML host languages".
>> Thanks for this.  We are already done with CR more or less, but I
>> will see what I can do.
>
> I don't see how you can get out of CR to PR, looking at your
> implementation report. At this stage, I'm now asking Sean, my AC rep,
> to oppose such a transition.
Well - the implementation report has not been updated recently.  I
didn't mean that we were transitioning tomorrow.  But there are a number
of implementations and uses for CURIEs now.  I believe we have satisfied
the CR exit criteria, or will have once we are done gathering the
information.
>
> Speaking as a spec implementor who sincerely tried to use the CURIE
> spec, I think there are problems that merit serious changes to the
> design of the language. This means another LC, if I'm not mistaken.
Perhaps - not really up to me.
...
> Well...if that means that we all reinvent ours, then I don't think
> it's a good idea. For me, this means that the CURIE spec is not a
> sensible rec track document, but would be better as a note.

CURIEs are in use in many rec track documents currently, as well as in
the RDFa Recommendation.  The reason CURIEs are separated out is so that
all of these documents can have a consistent definition.  I am
interpreting your core objection here to be that there are no defined
profiles that would make it easier to have such a consistent definition.
As I said, I am sure the working group will get back to you formally.

>>  The concept of a CURIE, that is an abbreviation that maps to an IRI,
>> is general.  The expression of that concept in a host language is
>> necessarily going to be related to that host language.  For example,
>> were you to use CURIEs in HTML you would not want to use some "xml"
>> mechanism to map a prefix.
>
> Sure, but, uhm, HTML is not an XML host language. And I'm confused as
> to why we're talking syntax rather than processing.
I only mentioned it because you suggested defining a prefixing mechanism
in the XML namespace.  I was pointing out that HTML wouldn't be able to
use such a mechanism.  Sorry if I was unclear.  Moreover, SPARQL would
not be able to use a similar mechanism.  The ability to have the host
language define the syntactical mechanism is essential.  I think I agree
it would be best if the resulting CURIEs were all similar - but as you
point out that is unlikely since there are different constraints that
host languages will want to put on the reference portion.
