RELAX NG and xml:id

RELAX NG and xml:id

MURATA Makoto (FAMILY Given)

Dear colleagues,

I have a question about interactions of xml:id and
validation.

Consider a RELAX NG schema that defines xml:id as xsd:NCName rather
than xsd:ID and a DTD-free instance valid against this schema.  The
schema does NOT use the RELAX NG DTD compatibility specification.

Since RELAX NG validation does not change the information set,
the [attribute type] property of the attribute xml:id is unknown.

I believe that there is nothing wrong in applying "ID attribute
normalization" and "ID type assignment" to xml:id in this instance
document.  In my understanding, xml:id tries to separate ID processing
from validation as much as possible.

However, Section 4 of the xml:id recommendation says:

        The declared type of the attribute, if it has one, is "ID".
        All declarations for xml:id attributes must specify "ID" as
        the type of the attribute.

Does this sentence prohibit my scenario?  The pattern for xml:id
specifies xsd:NCName rather than xsd:ID.

Furthermore, what will happen if the xml:id attribute is validated
against wildcards?  For example:

  anyAtt = attribute * { xsd:string }.

Such wildcards are useful when we would like to allow foreign elements
to contain any attribute.   Since the RELAX NG DTD compatibility
specification allows the use of xsd:ID only when we precisely know
the element name as well as the attribute name, we cannot
have:
  anyElement = element * {attribute xml:id {xsd:ID}?, anyElement*}

If the "anyAtt" define statement shown above is what you mean by
"declaration", we cannot allow xml:id within foreign elements
without giving up RELAX NG validation.

Cheers,
Makoto




Re: RELAX NG and xml:id

Daniel Veillard

On Wed, Aug 13, 2008 at 07:12:03PM +0900, Murata Makoto wrote:
>
> Dear colleagues,

  Hello,

> I have a question about interactions of xml:id and
> validation.
>
> Consider a RELAX NG schema that defines xml:id as xsd:NCName rather
> than xsd:ID and a DTD-free instance valid against this schema.  The
> schema does NOT use the RELAX NG DTD compatibility specification.
>
> Since RELAX NG validation does not change the information set,
> the [attribute type] property of the attribute xml:id is unknown.

  Well, in my opinion xml:id support is really a property of the XML
parser. Validation that is not DTD validation (RELAX NG, XSD, etc.)
comes, at least logically, as a separate step after parsing.
   So if the parser is xml:id aware, then after parsing and before the
information set is handed to RELAX NG, the ID type assignment (section 4,
bullet 2) has been performed and the attribute is of type ID, not unknown.
   But if the parser is not xml:id aware, the attribute would be of type
unknown at that point.
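
  As a rough illustration of that ordering, here is a minimal Python sketch
of an xml:id-aware step that runs on the parsed tree before any RELAX NG
validation. It uses only the standard xml.etree.ElementTree module; the
function name apply_xml_id and the ID table it returns are made up for this
example and are not part of any real parser's API.

```python
import xml.etree.ElementTree as ET

# Expanded name of xml:id as ElementTree reports it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def apply_xml_id(root):
    """Hypothetical xml:id step: normalize values, record IDs, report duplicates.

    Returns (id_table, errors). The tree is updated in place with the
    normalized values; nothing here depends on a DTD or a schema.
    """
    id_table = {}
    errors = []
    for elem in root.iter():
        raw = elem.get(XML_ID)
        if raw is None:
            continue
        # ID attribute normalization: drop leading/trailing spaces and
        # collapse runs of spaces (#x20 only) into a single space.
        value = " ".join(part for part in raw.split(" ") if part)
        elem.set(XML_ID, value)
        # ID type assignment: treat the value as an ID, so it must be unique.
        if value in id_table:
            errors.append("duplicate xml:id value: " + value)
        else:
            id_table[value] = elem
    return id_table, errors
```

  A schema validator handed this tree afterwards sees the normalized values
and can check them against whatever datatype its pattern names.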

> I believe that there is nothing wrong in applying "ID attribute
> normalization" and "ID type assignment" to xml:id in this instance
> document.

  Agreed, this will happen (at least logically) before the RELAX NG processor
receives the information from the parser.

> In my understanding, xml:id tries to separate ID processing
> from validation as much as possible.

  Yes, it tries to implement IDness at the parser level, i.e. to provide
IDness even if no DTD is available.

> However, Section 4 of the xml:id recommendation says:
>
> The declared type of the attribute, if it has one, is "ID".
> All declarations for xml:id attributes must specify "ID" as
> the type of the attribute.
>
> Does this sentence prohibit my scenario?  The pattern for xml:id
> specifies xsd:NCName rather than xsd:ID.

  In spirit yes: at the validation level you should not conflict with
what a non-validating parser that supports xml:id would provide.
  In practice I would not see that as a hard problem myself, since
RELAX NG does not modify the infoset. So basically that rule in your
schema is, in my opinion, just verifying that the values passed are
compatible with xsd:NCName; it is type checking, not type definition.

  Since the infoset is not changed, the only impact of the RNG mismatch
is that you won't be able to catch some problems:
   - conflicting IDs, but assuming xml:id processing and no DTD, one would
     expect IDness to be only xml:id based, and conflicts will be reported
     by the parser itself
   - ID/IDREF mismatches

So by miscategorizing the attribute you lose some quality of checking;
it sounds like a schema bug, but one of limited impact.
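
  To make the "type checking, not type definition" point concrete, here is
a small Python sketch of what an xsd:NCName check on xml:id amounts to. It
is only an approximation: the regular expression covers ASCII names, whereas
the real NCName production admits many more characters, and the function
name check_xml_id_as_ncname is invented for the example.

```python
import re
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# ASCII-only approximation of NCName (assumption: the real production
# also allows a large set of non-ASCII name characters).
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def check_xml_id_as_ncname(root):
    """Return the xml:id values that fail the approximate NCName check.

    The tree is only read, never modified, mirroring the fact that
    RELAX NG validation does not change the infoset.
    """
    bad = []
    for elem in root.iter():
        value = elem.get(XML_ID)
        if value is not None and not NCNAME_ASCII.match(value):
            bad.append(value)
    return bad
```

  Note that two elements carrying the same well-formed value both pass this
check; catching that duplicate is left to the xml:id processing done by the
parser.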

> Furthermore, what will happen if the xml:id attribute is validated
> against wildcards?  For example:
>
>   anyAtt = attribute * { xsd:string }.

  That sounds similar to the previous case to me: you use a generic rule
but as a result lose some quality in the checking.

> Such wildcards are useful when we would like to allow foreign elements
> to contain any attribute.   Since the RELAX NG DTD compatibility
> specification allows the use of xsd:ID only when we precisely know
> the element name as well as the attribute name, we cannot
> have:
>   anyElement = element * {attribute xml:id {xsd:ID}?, anyElement*}
>
> If the "anyAtt" define statement shown above is what you mean by
> "declaration", we cannot allow xml:id within foreign elements
> without giving up RELAX NG validation.

  I think that's an extreme viewpoint. To me it just means that for foreign
elements you will rely on the parser itself to detect xml:id IDness
and conflicts between IDs declared in the full document. But you will lose
ID-IDREF reference checking for foreign elements, which again sounds like a
rather limited loss, because you would expect ID-IDREF linking to happen
between elements of a common vocabulary, not foreign elements pertaining to
a different logic.
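
  As a sketch of what that trade-off looks like, the Python function below
collects xml:id values from the whole document but checks references only
on elements of the host vocabulary. Everything specific here is an
assumption made for illustration: the function name dangling_refs, the
reference attribute name passed in as ref_attr, and the namespace URI passed
in as host_ns.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def dangling_refs(root, ref_attr, host_ns):
    """Return reference values on host-vocabulary elements that match no xml:id.

    IDs are gathered from every element, foreign or not, because xml:id
    IDness is established at the parser level for the full document.
    """
    ids = set()
    for elem in root.iter():
        value = elem.get(XML_ID)
        if value is not None:
            ids.add(value)

    missing = []
    host_prefix = "{" + host_ns + "}"
    for elem in root.iter():
        if not elem.tag.startswith(host_prefix):
            continue  # foreign element: its references are not checked
        target = elem.get(ref_attr)
        if target is not None and target not in ids:
            missing.append(target)
    return missing
```

  A host element may still point at an ID carried by a foreign element; only
references written on the foreign elements themselves go unchecked.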

  my 2 cents.

Daniel

--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[hidden email]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/


RE: RELAX NG and xml:id

Grosso, Paul
In reply to this post by MURATA Makoto (FAMILY Given)

Hello Makoto,

The XML Core WG discussed your email during today's telcon,
and the WG is basically in agreement with Daniel's response at
http://lists.w3.org/Archives/Public/public-xml-id/2008Aug/0001

The crux of our opinion is that xml:id handling is done at parse
time, and any RELAX NG validation is done on top of that and doesn't
modify the infoset, so there is no risk of a clash. We don't see any
incompatibility between RELAX NG and xml:id.

Please feel free to continue this discussion if you still have concerns.

paul

Paul Grosso
for the XML Core WG



Re: RELAX NG and xml:id

MURATA Makoto (FAMILY Given)

Paul and Daniel,

I am very impressed by your timely and thoughtful reply!  

My question arose from discussions around the ODF schema.
When ODF 1.2 introduces xml:id, can we create an RNG schema
that is consistent with both the xml:id recommendation and
the RNG DTD compatibility specification?

I am happy with Daniel's reply and believe that the use of
xsd:NCName in RELAX NG schemas for xml:id is fine.  However,
it is probably a good idea to publish errata to xml:id and make
things clearer.  Let me suggest a note in D.3.

  Note: Since RELAX NG validation does not change the infoset,
  the use of other datatypes does not cause any problems to xml:id
  processing.

Cheers,

--
MURATA Makoto (FAMILY Given) <[hidden email]>

