Tidy does fancy entity replacements, but it shouldn't

Timo Hummel

Hi everybody,

I'm using tidy to clean up partial HTML documents. However, tidy messes up some entities where it shouldn't. Example:

This is a testcase where I need &auml; to stay &auml; and ä should stay ä

I played around with the tidy options, but either &auml; is converted to ä, ä is converted to &auml;, or both ä and &auml; are converted to &#228;.

Is there an option to ignore entities? I just need to clean up broken elements (e.g. incorrect nesting).

With best regards,
Timo A. Hummel

----
four for business AG

Lilistrasse 83/C  |  63067 Offenbach
phone: +49 69 801082-0  |  fax: +49 69 801082-79
mail: [hidden email]  |  web: http://www.4fb.de       


Re: Tidy does fancy entity replacements, but it shouldn't

Bjoern Hoehrmann

* Timo Hummel wrote:
>This is a testcase where I need &auml; to stay &auml; and ä should stay ä

This is not possible; information about entities is lost during parsing.
You can only define whether it should use ö, &ouml;, or &#246; for all
such characters.
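
To illustrate the point, here is a minimal Python sketch of why the distinction cannot survive a parse and re-serialize round trip. It is not Tidy's code or API: `parse_text` and `serialize` are hypothetical helper names, and the three style names are assumptions made up for this example. It only uses the standard `html` module to show that the named entity and the literal character decode to the same string, so the writer can apply just one output style to all of them.

```python
import html
from html.entities import codepoint2name


def parse_text(source: str) -> str:
    """Decode character references, roughly as a parser does when it builds text nodes."""
    return html.unescape(source)


def serialize(text: str, style: str) -> str:
    """Write the text back out with ONE uniform style for non-ASCII characters.

    style is "raw", "named" or "numeric" (hypothetical names for this sketch).
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # ASCII is written as-is, except for markup-significant characters.
            out.append(html.escape(ch, quote=False))
        elif style == "raw":
            out.append(ch)
        elif style == "named" and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            # Numeric reference; also the fallback when no named entity exists.
            out.append(f"&#{cp};")
    return "".join(out)


source = "&auml; stay &auml; and ä should stay ä"
parsed = parse_text(source)

# After parsing, nothing records which ä came from an entity.
print(parsed)
for style in ("raw", "named", "numeric"):
    print(style, "->", serialize(parsed, style))
```

Whichever style is chosen, all four characters come out the same way, which matches the three outcomes described in the original question.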
--
Björn Höhrmann · mailto:[hidden email] · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/