Request/Question: XML specification - unclear character data definition

jan.petvalsky.bugreport
Good afternoon.

 

I can see that the XML 1.1 and XML 1.0 specifications define character data differently:

 

http://www.w3.org/TR/xml11/#NT-Char

http://www.w3.org/TR/REC-xml/#NT-Char

 

Because of this inconsistent definition, the same XML document can be accepted by one XML processor and rejected by another.

This should be fixed so that, at least according to the specification, any Unicode character is valid.
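
For illustration, the difference between the two Char productions linked above can be written out as a small Python check. This is only a sketch: the ranges are transcribed from the two productions (plus the RestrictedChar production of XML 1.1), while the helper names `in_ranges` and `classify` are made up for this example and do not come from either specification.

```python
# Char production of XML 1.0: tab, LF, CR, then everything from space upward
# (excluding surrogates, #xFFFE and #xFFFF).
XML10_CHAR = [(0x9, 0xA), (0xD, 0xD), (0x20, 0xD7FF),
              (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]

# Char production of XML 1.1: everything except #x0, surrogates, #xFFFE, #xFFFF.
XML11_CHAR = [(0x1, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]

# RestrictedChar production of XML 1.1: allowed only as character references.
XML11_RESTRICTED = [(0x1, 0x8), (0xB, 0xC), (0xE, 0x1F),
                    (0x7F, 0x84), (0x86, 0x9F)]


def in_ranges(cp, ranges):
    """Return True if code point cp falls inside any (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)


def classify(cp):
    """Report how each XML version treats the code point cp (hypothetical helper)."""
    return {
        "xml10_char": in_ranges(cp, XML10_CHAR),
        "xml11_char": in_ranges(cp, XML11_CHAR),
        "xml11_restricted": in_ranges(cp, XML11_RESTRICTED),
    }


if __name__ == "__main__":
    # NUL, tab, vertical tab, unit separator, and the letter A
    for cp in (0x00, 0x09, 0x0B, 0x1F, 0x41):
        print(f"#x{cp:04X}", classify(cp))
```

Under these ranges, the vertical tab (#x000B) is not a Char at all in XML 1.0, while in XML 1.1 it is a Char but a restricted one. #x0000 is excluded by both productions.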

 

If you think this is already the case, it is not, especially for the control characters “#x0000 – #x001F”, which are handled differently.

Some processors allow such a character to be written as a “&#” character reference, but others do not accept this. What is worse, I think the same rule
is applied to CDATA sections, so a processor may allow “&#” but reject the same character inside a CDATA section.
If you think these characters are rarely used, that may be wrong: for example, the vertical tab seems to be in use in Microsoft Office.
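
As a concrete data point for one processor, the sketch below feeds a vertical tab to Python's standard `xml.etree.ElementTree` parser, which is built on expat and, as far as I know, implements XML 1.0 only. The helper `is_accepted` and the three sample documents are assumptions made up for this example; other processors, in particular XML 1.1 ones, may behave differently.

```python
import xml.etree.ElementTree as ET


def is_accepted(document: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Vertical tab (#x0B) written as a character reference.
AS_REFERENCE = "<a>&#11;</a>"

# Vertical tab written literally inside a CDATA section.
IN_CDATA = "<a><![CDATA[\x0b]]></a>"

# Horizontal tab (#x09) as a character reference, for comparison.
TAB_REFERENCE = "<a>&#9;</a>"

if __name__ == "__main__":
    for name, doc in [("reference", AS_REFERENCE),
                      ("cdata", IN_CDATA),
                      ("tab reference", TAB_REFERENCE)]:
        print(name, "accepted" if is_accepted(doc) else "rejected")
```

This parser rejects the vertical tab in both forms and accepts the horizontal tab, which is what the XML 1.0 Char production would lead one to expect.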

 

I hope that you will read this and not throw it in the bin. I also hope that you will mark the obsolete XML version as obsolete.

 

See also: http://stackoverflow.com/questions/9526951/xml-and-unicode-specifications-whats-a-legal-character

 

Thank you in advance for correcting the specification.

 

PS: Frankly speaking, I would like to have an XML 2.0, to be called short-xml, in which a paired tag could be
written in a short form (e.g. <tag>…</tag> would be the same as <tag>…</>).

 

Pětvalský Jan