Character encoding/decoding quirks?

andyc56 · January 20, 2017

I noticed when reading this article on room gain that there appear to be a few character encoding or decoding errors on the page. It looks like what was intended as an apostrophe or similar character ended up undergoing some sort of illegal conversion somewhere along the line. The affected text is:

"The difference between the two measurements will show the effect the boundaries and acoustics of the space has on the subwoofer�s basic raw response as delivered to the listening position."

and

"Based on these results, we can then gauge what other subwoofers� maximum headroom and basic response shape would be in the same placement, either through looking at outdoor groundplane results such as those presented here or inside of a simulation program."

Looking at the HTML in a hex editor, the black diamond with question mark is the three-byte hex sequence \xef\xbf\xbd, which is the UTF-8 encoding of U+FFFD, the Unicode replacement character (the black diamond with white question mark). I noticed the character encoding of the HTML file is listed as iso-8859-1, which seems unusual to me. If the CMS is using UTF-8 in its database, errors can occur with some characters when converting them to ISO-8859-1 for display. UTF-8 can encode every Unicode character, but ISO-8859-1 can encode at most 256 of them.
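
To make the byte-level claim concrete, here is a minimal Python sketch. It assumes the original character was a typographic apostrophe (U+2019), which is only a guess, and the helper name `can_encode` is made up for illustration. The last step shows one plausible way such text could end up as U+FFFD (Windows-1252 bytes decoded as UTF-8); it is not necessarily what the site's CMS actually did.

```python
# U+FFFD is the Unicode replacement character (black diamond with question mark).
replacement_bytes = "\ufffd".encode("utf-8")
print(replacement_bytes.hex(" "))  # the three bytes seen in the hex editor

# Assumption: the character that got mangled was U+2019, a typographic apostrophe.
APOSTROPHE = "\u2019"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(APOSTROPHE, "utf-8"))       # UTF-8 covers all of Unicode
print(can_encode(APOSTROPHE, "iso-8859-1"))  # ISO-8859-1 stops at U+00FF

# One plausible failure path (an assumption): text stored as Windows-1252 bytes
# is decoded as UTF-8, and the invalid byte 0x92 is replaced with U+FFFD.
raw = ("subwoofer" + APOSTROPHE + "s").encode("cp1252")
lossy = raw.decode("utf-8", errors="replace")
print(lossy)
```

Once the replacement has happened the original character is gone, so the page text itself has to be corrected; changing the declared charset alone would not bring the apostrophes back.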

Anyway, I just thought I'd mention it in case nobody had seen it.

Kyle · January 28, 2017

Content type should be utf-8, and those chars need to be replaced. We'll do so with the new site. Thanks for pointing that out.
