Amaya XHTML chokes on URL ampersand delimiters

stevan_white (Bugzilla)
In pages to be parsed as XHTML, Amaya fails on URLs containing ampersand
("&") delimiters.

For example, <a href="http://a.b.org/c?d&e">URL with search string</a>

Amaya complains "not well-formed (invalid token)" after the token following
the first ampersand.

Without the XHTML DOCTYPE, this does not happen.

For a real-life example, see:
<a
href="http://www.kindernetz.de/oli/tierlexikon/index.php?tid=115&reiter=steckbrief">
Eurasian Coot.</>


Attachment: amaya-url-parse.html (768 bytes)

Re: Amaya XHTML chokes on URL ampersand delimiters

Stanimir Stamenkov

/Steve White/:

> In pages to be parsed as XHTML, Amaya fails on URLs containing
> ampersand ("&") delimiters.
>
> For example, <a href="http://a.b.org/c?d&e">URL with search string</a>
>
> Amaya complains "not well-formed (invalid token)" after the token
> following the first ampersand.

That really is not well-formed; see the XML specification. As in
SGML, & is a markup character (it opens a general entity reference),
so to include a literal ampersand in character data or in an
attribute value you must write the predefined entity reference &amp;
instead. The same goes for the other markup-significant characters:
&lt; for <, &gt; for >, &quot; for ", and &apos; for '.
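As a rough illustration of the substitution described above, here is a
minimal Python sketch. It is not how Amaya works internally; it only
uses the standard library's xml.sax.saxutils.escape to write the
predefined entity references, then checks that an XML parser hands the
original URL back. The variable names are made up for the example, and
the closing tag is written out as </a>.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# The real-life URL from the report, with a raw "&" between parameters.
url = "http://www.kindernetz.de/oli/tierlexikon/index.php?tid=115&reiter=steckbrief"

# escape() rewrites &, < and > by default; the extra mapping also
# covers the two quote characters, which matters inside attributes.
escaped = escape(url, {'"': "&quot;", "'": "&apos;"})

# Build the link with the escaped value in the href attribute.
markup = '<a href="%s">Eurasian Coot.</a>' % escaped

# An XML parser accepts this and turns &amp; back into a plain "&".
link = ET.fromstring(markup)
parsed_href = link.get("href")
```

The escaped form is only the way the URL is written in the markup; the
parser hands the application the URL with a plain "&" again, so the
link target does not change.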

> Without the XHTML DOCTYPE, this does not happen.
>
> For a real-life example, see:
> <a
> href="http://www.kindernetz.de/oli/tierlexikon/index.php?tid=115&reiter=steckbrief">
> Eurasian Coot.</>

I'm not sure what the SGML rules say here (whether malformed or
undeclared entity references are simply discarded), but in
"compatibility" HTML parsing, which is what you get when no DOCTYPE
declaration is included, Amaya just tries to recover from the error
in the markup.
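To make the difference concrete, a lenient HTML parser generally lets
the bare ampersand through. The sketch below uses Python's
html.parser.HTMLParser as a stand-in for that kind of forgiving
parsing; it says nothing about Amaya's actual recovery code, and the
HrefCollector class is a hypothetical helper written for this example.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href values of <a> start tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


collector = HrefCollector()
# Same markup as in the report, with the unescaped "&" left in place.
collector.feed('<a href="http://a.b.org/c?d&e">URL with search string</a>')
collector.close()
```

Here the parser raises no error and collector.hrefs holds the URL with
the ampersand intact, which is the recovering behaviour described
above.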

--
Stanimir


Re: Amaya XHTML chokes on URL ampersand delimiters

Irene Vatton
In reply to this post by stevan_white (Bugzilla)

On Sunday 06 November 2005 14:43, Steve White wrote:

> In pages to be parsed as XHTML, Amaya fails on URLs containing ampersand
> ("&") delimiters.
>
> For example, <a href="http://a.b.org/c?d&e">URL with search string</a>
>
> Amaya complains "not well-formed (invalid token)" after the token
> following the first ampersand.
>
> Without the XHTML DOCTYPE, this does not happen.
>
> For a real-life example, see:
> <a
> href="http://www.kindernetz.de/oli/tierlexikon/index.php?tid=115&reiter=steckbrief">
> Eurasian Coot.</>

When the document has the XHTML DOCTYPE, it is parsed with an XML parser, so
the document must be well-formed: you must replace & with &amp;.
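The behaviour can be reproduced outside Amaya with any conforming XML
parser. The following is a minimal sketch using Python's
xml.etree.ElementTree, which is built on expat and happens to report
the same message text as in the original report; whether Amaya uses
the same library is not stated in this thread. The helper name
check_well_formed is invented for the example.

```python
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return None if markup parses as XML, else the parser's error text."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


# The example from the report: a raw "&" inside the href attribute.
raw = '<a href="http://a.b.org/c?d&e">URL with search string</a>'

# Naive fix, acceptable here only because the input contains no
# entity references that are already escaped.
fixed = raw.replace("&", "&amp;")

error_raw = check_well_formed(raw)      # an error message string
error_fixed = check_well_formed(fixed)  # None: the fixed markup parses
```

A blanket replace would double-escape text that already contains
&amp;, so for real documents it is safer to escape each value once at
the point where the markup is generated.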

--
     Irène.
-----
Irène Vatton                     INRIA Rhône-Alpes
INRIA                               ZIRST
e-mail: [hidden email]       655 avenue de l'Europe
Tel.: +33 4 76 61 53 61             Montbonnot
Fax:  +33 4 76 61 52 07             38334 Saint Ismier Cedex - France