Special Chars in our xAJAX Components
October 5th, 2006
As you may be aware, we have been investigating why, under certain environments, non-Latin characters do not display correctly on websites when using our xAJAX-based components such as Tags or mosKB.
xAJAX uses UTF-8, so we know it was not an issue with xAJAX. Joomla 1.0.11 is not very UTF-8 friendly, and even the Joomla developers have blogged about the difficulties in achieving true UTF-8 on a site.
After many weeks testing under different environments we are pleased to offer some more thoughts so that, if you are experiencing issues with the chars not displaying correctly, you can attempt these changes to your site to ensure the best compatibility. These changes have fixed every site we have personally seen that had problems so we hope that one of them will fix your site if you are having issues.
Credits and thanks to some great webpages
Read on to see ways you can help yourself…
Just so you don’t get the idea that only “serious programmers” can understand the problem, and as a taster for the type of problems you can have, right now (i.e. they may fix it later) on IBM’s new PHP Blog @ developerworks, here’s what you see if you right click > Page Info in Firefox;
Firefox says it regards the character encoding as being ISO-8859-1. That’s actually coming from an HTTP header - if you click on the “Headers” tab you see;
Content-Type: text/html;charset=ISO-8859-1
Meanwhile amongst the HTML meta tags (scroll down past the whitespace) though you find;
Now that’s not a train smash (yet) but it should raise the flag that something isn’t quite right. The meta tag will be ignored by browsers so content will be regarded as being encoded as ISO-8859-1, thanks to the HTTP header.
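To make the precedence rule concrete, here is a minimal sketch of how a browser ends up picking the header's charset over the meta tag's. The function names `charsetFromContentType` and `effectiveCharset` are made up for this illustration, and the meta value passed in is an assumed example rather than the value from the page discussed above.

```php
<?php
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
function charsetFromContentType($contentType)
{
    if (preg_match('/charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $contentType, $m)) {
        return strtolower($m[1]);
    }
    return null; // no charset parameter present
}

// Hypothetical helper: the HTTP header wins; the meta tag is only a fallback.
function effectiveCharset($httpHeaderValue, $metaContentValue)
{
    $fromHeader = charsetFromContentType($httpHeaderValue);
    if ($fromHeader !== null) {
        return $fromHeader;
    }
    return charsetFromContentType($metaContentValue);
}

// Header value as shown above; the meta value is an assumed example.
echo effectiveCharset('text/html;charset=ISO-8859-1', 'text/html; charset=UTF-8'), "\n";
// prints iso-8859-1
```

If the server sends no charset in the header at all, the same sketch falls through to the meta tag, which is why fixing only the meta tag sometimes appears to work on one host and not on another.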
But what about mbstring, iconv etc.?
Yep there’s PHP extensions to help with character encoding issues but (if you use a shared host, you’ve probably already got that sinking feeling) they’re not enabled by default in PHP4. Two of particular note;
- iconv: The iconv extension became a default part of PHP5 but it doesn’t offer you a magic wand that will make all problems go away. It probably has most value when either migrating old content to UTF-8 or when interfacing with systems that can’t deliver you US-ASCII, ISO-8859-1 or UTF-8, such as an RSS feed your PHP script reads which is encoded with BIG5.
- mbstring: The mbstring extension is potentially a magic wand, as it provides a mechanism to override a large number of PHP’s string functions. Bad news is it’s not available by default in PHP. Third-hand reports say it used to be pretty unstable but in the last year or so has stabilized (more detail appreciated).
It may be you can take advantage of these extensions in your own environment but if you’re writing software for other people to install for themselves, that makes them bad news.
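If you do distribute code to servers you don't control, one common approach is to test for the extension at runtime and fall back when it is missing. The sketch below is only an illustration: `utf8CharacterCount` and `toUtf8` are hypothetical helper names, and the fallback assumes PCRE was built with UTF-8 support.

```php
<?php
// Hypothetical helper: count characters (not bytes) in a UTF-8 string.
function utf8CharacterCount($str)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback: with the /u modifier, "." matches one whole UTF-8 character.
    // Returns false if $str is not valid UTF-8.
    return preg_match_all('/./us', $str, $matches);
}

// Hypothetical helper: convert text to UTF-8 only when iconv is installed.
function toUtf8($str, $fromEncoding)
{
    if (function_exists('iconv')) {
        return iconv($fromEncoding, 'UTF-8', $str);
    }
    return false; // no iconv: the caller has to decide what to do
}

$word = "h\xC3\xA9llo"; // "héllo" written out as UTF-8 bytes
echo strlen($word), ' bytes, ', utf8CharacterCount($word), " characters\n";
// prints 6 bytes, 5 characters
```

For the BIG5 feed mentioned above, the call would look like `toUtf8($feedBody, 'BIG5')`, where `$feedBody` stands for whatever string your script read from the feed.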
And PHP 6?
Then all our problems magically vanish.
Specifically PHP 6 should have native understanding of Unicode and default to UTF-8 for output as well as a bunch of other stuff, building on the International Components for Unicode (ICU) project.
Practical Issues :: Declaring UTF-8 - Read here to know how to fix the issues
Now we come to the exciting part!
If you are running Joomla 1.0.x you can do these things to force your encoding types to UTF-8.
In your template's index.php file, add the following line as the FIRST LINE in the file:
<?php header('Content-Type: text/html; charset=utf-8'); ?>
Next, place the following line RIGHT AFTER the opening <head> tag in your template's index.php file:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
If you have/had any other Content-Type meta tags like this one, remove them, as this is the important one.
At this point check your site and see if your special non-English characters are displaying right - if so GREAT - if not, continue reading.
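Put together, the top of a template's index.php might look roughly like the sketch below. The surrounding markup is an assumed skeleton, not a copy of any real Joomla template, and the `headers_sent()` guard is an optional extra rather than part of the fix itself.

```php
<?php
// Must come before ANY output (even a blank line), or the header cannot be sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<!-- the rest of the template's head section (assumed) goes here -->
</head>
<body>
<!-- template body (assumed) -->
</body>
</html>
```

The header and the meta tag now agree, so the browser gets the same answer whichever one it consults.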
TAGs Component has a function to aid UTF-8 encoding
However under certain circumstances this function on some servers produces unpredictable results. As such you should make a small change to the /components/com_tag/tag.class.php file.
In the file search for:
function fixChars($str){
Directly below this line add the following code:
return $str;
Save the file, upload it back to your server and test your site again.
The details above have fixed 100% of sites we have had access to - if they don't fix your site, please submit a request at http://www.phil-taylor.com/send-request AFTER YOU HAVE TRIED THE ABOVE things, and we will take a personal look.
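After the edit, the start of the function should look something like this sketch. It is written here as a standalone function purely for illustration; in tag.class.php it may sit inside a class, and the original body (not reproduced here) stays where it is.

```php
<?php
function fixChars($str){
    return $str; // the added line: hand the string back untouched

    // The component's original conversion code would remain below this point;
    // it is never reached once the early return is in place.
}

$sample = "Caf\xC3\xA9"; // "Café" written out as UTF-8 bytes
var_dump(fixChars($sample) === $sample); // bool(true)
```

Because the function now returns its input unchanged, the page relies entirely on the UTF-8 header and meta tag set in the template.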


