C#: XML UTF8 encoding/streaming problems (trash characters)

I was working on an ASP.NET website and needed to cache some XML files and stream them later on request.
So I wrote a routine that created the XML files, and then a handler that opened a file stream on the requested XML file and wrote its contents to the response.
Everything seemed to work as it should. The files were created in the correct folder, and when I made a request from my browser I got my XML back.

But then something strange happened. The goal was to deliver the XML to another application. When that application made the request, the response included some rubbish characters at the beginning. This caused an error when parsing the XML.

It took me a while to figure out what was going wrong. It turns out these characters are the byte order mark (BOM), the three bytes EF BB BF that mark a file as UTF-8 encoded. A text editor reads the mark and does not display it to the user. My handler, however, streamed the file byte for byte, so the mark ended up in the response as well.
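To see these bytes for yourself, here is a minimal sketch that prints the preamble Encoding.UTF8 places at the start of a stream. The class and method names (BomDemo, PreambleHex) are made up for this illustration and are not part of my original code.

```csharp
using System;
using System.Text;

public static class BomDemo
{
    // Returns the bytes Encoding.UTF8 writes before any content, as hex.
    public static string PreambleHex()
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        return BitConverter.ToString(preamble);
    }

    public static void Main()
    {
        // Prints the three-byte UTF-8 byte order mark: EF-BB-BF
        Console.WriteLine(PreambleHex());
    }
}
```

An application that treats the response as plain text sees these three bytes as stray characters in front of the opening XML tag.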

Luckily there is a fix to get rid of these characters.
Here is my old code for writing the XML files:

XmlTextWriter _xml = new XmlTextWriter(_Stream, Encoding.UTF8);

As you can see, I created an XmlTextWriter that writes to a stream in UTF-8 format.

Now replace this line with:

XmlTextWriter _xml = new XmlTextWriter(_Stream, new UTF8Encoding(false));

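This will solve your problem. The false argument (the encoderShouldEmitUTF8Identifier parameter of the UTF8Encoding constructor) tells the encoder not to emit the UTF-8 identifier, which is exactly the extra characters at the beginning of the file.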
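If you want to check the difference yourself, the following sketch writes the same small document with both encodings into a MemoryStream and looks at the first byte. It is only an illustration under my own assumptions: the names XmlBomDemo and WriteXml and the sample root element are invented here, and a MemoryStream stands in for the file stream used on the site.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlBomDemo
{
    // Writes a tiny XML document with the given encoding and returns the raw bytes.
    public static byte[] WriteXml(Encoding encoding)
    {
        var stream = new MemoryStream();
        using (var xml = new XmlTextWriter(stream, encoding))
        {
            xml.WriteStartDocument();
            xml.WriteElementString("root", "hello");
            xml.WriteEndDocument();
        }
        // ToArray still works after the writer has closed the stream.
        return stream.ToArray();
    }

    public static void Main()
    {
        byte[] withMark = WriteXml(Encoding.UTF8);
        byte[] withoutMark = WriteXml(new UTF8Encoding(false));

        // With Encoding.UTF8 the output starts with 0xEF, the first byte of the mark.
        Console.WriteLine(withMark[0].ToString("X2"));
        // With UTF8Encoding(false) the output starts with 0x3C, the '<' of the XML declaration.
        Console.WriteLine(withoutMark[0].ToString("X2"));
    }
}
```

Both outputs still declare encoding="utf-8" in the XML declaration; only the three leading bytes differ.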

I hope this saves someone the headache of hunting for where these strange characters come from.

2 thoughts on “C#: XML UTF8 encoding/streaming problems (trash characters)”

  1. Martin,

    Your random piece of knowledge on XML encoding saved my behind. I figured out it was the encoding that was leaving the trash behind but I had no idea why. Thanks for your help.
