Character encoding: the light comes on...
Apr 27, 2004
Well, after being given a Thai language file for the Forum, a Bulgarian language file for the Ringmaker, and throwing myself at character sets I couldn't read at all... I think I'm finally starting to get the hang of this character encoding business.
I mean, I understood the basics before, just not how the different character sets worked together, or how text ends up displayed depending on whether the document declares the right encoding.
For instance, why a character would display correctly or incorrectly in UTF-8 was voodoo to me. Now I understand!
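To make the "voodoo" concrete, here is a minimal sketch in Python (not the PHP I actually work in, and not code from either script mentioned in this post). It assumes Python's built-in `cp874` codec as a stand-in for windows-874, and the sample character is my own pick. The point is that the same character is stored as different bytes in each encoding, so it only displays correctly when the bytes are read with the encoding they were written in.

```python
# One Thai character: KO KAI, U+0E01.
text = "\u0e01"

# The same character becomes different bytes in each encoding.
as_cp874 = text.encode("cp874")   # a single byte: 0xA1
as_utf8 = text.encode("utf-8")    # three bytes: 0xE0 0xB8 0x81

# Bytes read with the encoding they were written in: displays correctly.
right = as_utf8.decode("utf-8")

# windows-874 bytes read as UTF-8: 0xA1 is not a valid start byte in UTF-8,
# so the decoder substitutes U+FFFD, the "unknown character" marker.
wrong = as_cp874.decode("utf-8", errors="replace")

# UTF-8 bytes read as windows-874: each byte is treated as its own character,
# so one character turns into several unrelated ones.
misread = as_utf8.decode("cp874", errors="replace")

print(as_cp874.hex(), as_utf8.hex())
print(right == text, wrong == text, misread == text)
```

Only the first decode round-trips; the other two are the garbage you see when a page is labelled with the wrong character set.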
Anyway, I found a nice script at PHP.net to recode Bulgarian to UTF-8, but I couldn't find an equivalent for windows-874 (Thai).
So after much searching I asked on Usenet and what do you know? I got pointed to a Perl script that recoded Thai and from there it was a simple matter to translate it to PHP. If you'd like the function I came up with, you can download it from my PHP page.
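For the curious, the general idea behind this kind of recoding can be sketched as follows. This is a rough Python illustration, not the PHP function from my PHP page and not the Perl script I was pointed to; the name `thai_874_to_utf8` is made up for this example. It assumes the Thai letters in windows-874 sit in the byte ranges 0xA1-0xDA and 0xDF-0xFB and map onto Unicode at a fixed offset of 0x0D60, and it deliberately ignores the handful of extra windows-874 characters (euro sign, curly quotes, non-breaking space and so on), replacing them with U+FFFD.

```python
def thai_874_to_utf8(data: bytes) -> bytes:
    """Recode windows-874 bytes to UTF-8 (Thai letters and ASCII only)."""
    chars = []
    for b in data:
        if b < 0x80:
            # ASCII is identical in both encodings.
            chars.append(chr(b))
        elif 0xA1 <= b <= 0xDA or 0xDF <= b <= 0xFB:
            # Thai block: byte value + 0x0D60 gives U+0E01..U+0E5B.
            chars.append(chr(b + 0x0D60))
        else:
            # Anything else is outside this sketch: mark it as unknown.
            chars.append("\ufffd")
    return "".join(chars).encode("utf-8")


# Usage: build some windows-874 bytes with Python's own codec, then recode.
sample = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"   # a Thai greeting
legacy_bytes = sample.encode("cp874")
recoded = thai_874_to_utf8(legacy_bytes)
print(recoded == sample.encode("utf-8"))
```

In practice a built-in converter would do the same job where one is available (in Python, `legacy_bytes.decode("cp874").encode("utf-8")`); the hand-rolled table is only worth it when, as in my case, no ready-made converter for the character set turns up.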