[XHR] Sending a Document with a mismatched encoding

Cameron McCormack-4

It seems to be possible to send a Document using XHR where the encoding
specified by the Content-Type charset parameter differs from the actual
encoding used to encode the serialisation.  For example:

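  var r = new XMLHttpRequest();
  r.open("POST", "somewhere");
  // Declare US-ASCII, which cannot represent the "á" used below.
  r.setRequestHeader("Content-Type", "application/xml;charset=US-ASCII");
  // Create a document whose root element name is a non-ASCII character.
  var doc = document.implementation.createDocument(null, "á", null);
  r.send(doc);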

Since passing a String to send() will cause the charset to be fixed up
to match the actual encoding used (UTF-8, in that case), shouldn’t
passing a Document to send() do the same?
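
The fix-up being asked about could be sketched as a standalone function. This is only an illustration under stated assumptions: `fixUpCharset` is a hypothetical helper name, not part of XHR, and it assumes the caller already knows which encoding was used for the serialisation.

```javascript
// Hypothetical helper: rewrite the charset parameter of a Content-Type
// value so that it names the encoding actually used for the body.
// If there is no charset parameter, the value is returned unchanged.
function fixUpCharset(contentType, actualEncoding) {
  // Matches ;charset=value or ;charset="value", case-insensitively.
  var charsetParam = /(;\s*charset\s*=\s*)("[^"]*"|[^;\s]*)/i;
  return contentType.replace(charsetParam, function (match, prefix) {
    return prefix + actualEncoding;
  });
}

var fixed = fixUpCharset("application/xml;charset=US-ASCII", "UTF-8");
console.log(fixed); // application/xml;charset=UTF-8
```

Applied to the header from the example above, the declared charset would then agree with a UTF-8 serialisation.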

--
Cameron McCormack ≝ http://mcc.id.au/

Re: [XHR] Sending a Document with a mismatched encoding

Cameron McCormack-4

Cameron McCormack:
> Since passing a String to send() will cause the charset to be fixed up
> to match the actual encoding used (UTF-8, in that case), shouldn’t
> passing a Document to send() do the same?

For that matter, how about defaulting to sending

  Content-Type: text/plain; charset=UTF-8

if a String is passed to send() without a Content-Type having been given
by setRequestHeader()?
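
The proposed default could be sketched roughly as follows. `chooseContentType` and its arguments are hypothetical names used only for illustration. The sketch models the author-supplied header as a plain value and does not describe how any implementation actually works.

```javascript
// Hypothetical helper: pick the Content-Type to send for a request body.
// authorContentType is the value given via setRequestHeader(), or null
// if the author never set one.
function chooseContentType(authorContentType, body) {
  if (authorContentType !== null) {
    // The author chose a type, so this sketch leaves it alone.
    return authorContentType;
  }
  if (typeof body === "string") {
    // A String body is sent as UTF-8, so the default says so.
    return "text/plain; charset=UTF-8";
  }
  // Other body types are outside the scope of this sketch.
  return null;
}

console.log(chooseContentType(null, "hello")); // text/plain; charset=UTF-8
```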

--
Cameron McCormack ≝ http://mcc.id.au/

Re: [XHR] Sending a Document with a mismatched encoding

Boris Zbarsky
In reply to this post by Cameron McCormack-4

Cameron McCormack wrote:
> It seems to be possible to send a Document using XHR where the encoding
> specified by the Content-Type charset parameter differs from the actual
> encoding used to encode the serialisation.

For what it's worth, this got changed from Gecko 1.8 to Gecko 1.9.
Firefox 3 will tweak the charset parameter to correspond to the
inputEncoding of the Document being serialized (which is the encoding
used for the serialization).

We discovered that this will in fact cause some issues, since some
servers treat the data as UTF-8 no matter what headers we send, and the
default inputEncoding in Gecko of documents created via createDocument
is ISO-8859-1 at the moment.  We will likely change that to UTF-8...
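
To illustrate why a server that always assumes UTF-8 has trouble with an ISO-8859-1 body, compare the bytes for the element name from the original example. This is only a sketch: the ISO-8859-1 byte is written out by hand (á is U+00E1, the single byte 0xE1 in that encoding), `decodeAsUtf8` is a hypothetical stand-in for such a server, and the code assumes an environment that provides TextEncoder and TextDecoder.

```javascript
// UTF-8 encoding of "á" (U+00E1): the two bytes 0xC3 0xA1.
var utf8Bytes = new TextEncoder().encode("\u00e1");

// ISO-8859-1 encoding of the same character: one byte, written by hand
// because TextEncoder only produces UTF-8.
var latin1Bytes = Uint8Array.of(0xe1);

// Model a server that decodes every body as UTF-8, whatever the
// Content-Type header says. Returns null if the bytes are not valid UTF-8.
function decodeAsUtf8(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    return null;
  }
}

console.log(Array.from(utf8Bytes));       // [ 195, 161 ]
console.log(decodeAsUtf8(utf8Bytes));     // á
console.log(decodeAsUtf8(latin1Bytes));   // null
```

The lone 0xE1 byte is the start of a multi-byte sequence in UTF-8, so on its own it cannot be decoded as UTF-8 even though it is a complete character in ISO-8859-1.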

Oh, and if there is no charset parameter to start with, Gecko will add
one for both Document and String arguments, just like you suggest in
your followup mail.

-Boris