Rather, there are two different kinds of XML documents, XML 1.0 and XML
1.1. An XML processor may accept XML 1.0 only, or XML 1.1 only, or both.
(For that matter, it might accept JSON or any other format as well.)
> It should be fixed that at least from specification definition that
> any UNICODE character is valid.
The character U+0000 was intentionally excluded from XML 1.1: it may
not appear in character content, not even as a character reference.
This is unlikely to change in future.
> It is possible to rewrite that by “&#” for some processors, but
> this not accepted by others.
A character reference to a control character such as U+0001 (written
&#1;) is acceptable in XML 1.1 documents, but not in XML 1.0 documents.
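
To make the difference concrete, here is a rough Python sketch of the
Char productions of the two versions, as I read them from Section 2.2 of
each Recommendation. The function names are my own and hypothetical,
not part of any XML library, and a real processor checks far more than
this.

```python
# Sketch: which code points may appear as numeric character references
# (e.g. "&#1;") under the Char productions of XML 1.0 and XML 1.1.
# All function names here are hypothetical, chosen for illustration.

def is_char_xml10(cp: int) -> bool:
    """Char production of XML 1.0."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_char_xml11(cp: int) -> bool:
    """Char production of XML 1.1; U+0000 is still excluded."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_restricted_xml11(cp: int) -> bool:
    """RestrictedChar of XML 1.1: legal only when written as a reference."""
    return (
        0x1 <= cp <= 0x8
        or 0xB <= cp <= 0xC
        or 0xE <= cp <= 0x1F
        or 0x7F <= cp <= 0x84
        or 0x86 <= cp <= 0x9F
    )


def charref_allowed(cp: int, version: str) -> bool:
    """True if "&#<cp>;" is well-formed in a document of the given version."""
    if version == "1.0":
        return is_char_xml10(cp)
    if version == "1.1":
        return is_char_xml11(cp)
    raise ValueError(f"unknown XML version: {version!r}")
```

With these definitions, charref_allowed(0x1, "1.0") is False and
charref_allowed(0x1, "1.1") is True, while charref_allowed(0x0, v) is
False for either version.
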
> I hope that you read and not put to bin. I hope that you also mark
> that XML version that is obsolete as obsolete.
XML 1.0 is not obsolete. XML 1.1 is intended only for specific use
cases that XML 1.0 cannot handle.
There is indeed a problem with Section 2.2, which reads "Legal characters
are tab, carriage return, line feed, and the legal characters of
Unicode and ISO/IEC 10646." Obviously, TAB, CR, and LF are already
legal characters of Unicode, so naming them separately is redundant. As
I just noted on that page, the First Edition of
XML (1998) read "the legal graphic characters of Unicode." For whatever
reason, the word "graphic" was removed from the Second Edition of 2000,
perhaps because it is inaccurate: XML allows many characters that are
not graphic characters. A correct, if not necessarily clear, revision
would be to add the words "except those in general category Cc".
> PS: Frankly speaking I would like to have XML 2.0 that it will be
> called short-xml, so pair tag will be possible to write in short form
> (e.g. <tag>…</tag> is same as <tag>…</>).
This is also unlikely to happen. The failure of XML 1.1 has made us
very unwilling to work on any successor format that does not have *major*
advantages over XML 1.0.