Localizing PHP web sites using gettext
Developing multilingual web sites with PHP is fairly easy. A common approach is to have an include file for every supported language, containing an array that maps string ids to localized text (for example, "WelcomeText" => "Welcome to our homepage." would be output using something like <?= $strings["WelcomeText"] ?>). However, this approach has several problems. When the application is updated and strings are added, there is no way to determine which strings are new and whether they are present in every language (unless you write a script for it). It is also unclear what should happen if a newly added string has not yet been translated into a specific language.
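To make the array approach and its weakness concrete, here is a minimal sketch of the kind of checking script mentioned above. The array names, the second string id and the missing_translations() helper are made up for illustration; in a real project each array would live in its own include file.

```php
<?php
// Hypothetical per-language arrays; normally one include file per language.
$strings_en = [
    "WelcomeText" => "Welcome to our homepage.",
    "LoginText"   => "Please log in.",
];
$strings_de = [
    "WelcomeText" => "Willkommen auf unserer Homepage.",
    // "LoginText" has not been translated yet.
];

// Return the string ids that exist in the reference language
// but are missing from the given translation.
function missing_translations(array $reference, array $translation): array
{
    return array_keys(array_diff_key($reference, $translation));
}

// Lists the ids that still need a German translation.
print_r(missing_translations($strings_en, $strings_de));
```

A script like this has to be written and run by hand for every language, which is exactly the bookkeeping gettext takes over.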
Using gettext with PHP
A widely used framework for internationalization is gettext. It can be used with a variety of programming languages, including PHP. There are basically two ways to use gettext in your PHP applications: you can use the native gettext PHP extension, or a library implemented in PHP that does not need any extension, such as php-gettext. I will use the native PHP extension, but once you have read this post you should be able to use the php-gettext library, too (have a look at the included example).
Using gettext in your application
Using gettext to get translated strings couldn’t be easier. Just call gettext("Text to be translated") and you get a localized version of “Text to be translated” if one is available, or the original string otherwise. If you’re lazy, you can use the alias _() instead of gettext().
Let’s try this out. Create a new PHP file (we’ll call it test.php), and insert the following code:
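A minimal test.php consistent with the rest of this walkthrough could look like the sketch below. It assumes the PHP gettext extension is installed and that no translation has been set up yet, so the string comes back unchanged.

```php
<?php
// test.php: print a single translatable string.
// _() is the built-in alias for gettext(); with no translation
// loaded it returns its argument unchanged.
echo _("Hello World!");
```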
When you open that page in your browser, you will see “Hello World!”.
Localizing your application
Now that you have created a first version of your script, you may want to create localized versions for different languages. To do that, you need either the gettext utilities (a Windows version is also available) or the graphical editor poEdit, which we will be using. So install and launch poEdit.
Create a new catalog (File -> new catalog). Choose the language you want to translate the application to. I’m going to use German, GERMANY and iso-8859-1 for both character sets. The last option (plural forms) is an advanced feature of gettext which I’m going to explain later, so for now, just leave it blank. On the paths tab, set the base path to the directory containing your test.php file, and add “.” as a path. On the keywords tab you can add the names of additional functions that call gettext. You may want to use this if you’re using php-gettext. Now click OK to open the save dialog.
Before saving, create a subfolder called “locale” in your script directory. Inside locale, create a subdirectory for every language you support. We’ll create one for “de_DE” (the first part is the language, the second part is the country). Inside that folder create another one called “LC_MESSAGES”. You should now have a directory structure like locale/de_DE/LC_MESSAGES/. Save the file inside that directory as “messages.po”.
poEdit will now automatically scan all source files inside the path you specified earlier and extract all strings that are passed to gettext() or _() (or to any other functions you may have added on the keywords tab). Click OK. Now that’s cool! poEdit has just extracted all the strings you want to localize.
In the upper half of the poEdit window, you have a list of strings (the original string on the left and your translation on the right). Select the first string in the list. In the lower left-hand corner, you have two text boxes. The first one contains the original string, and the second one is still empty. Type your translation into that box. I’ll enter “Hallo Welt!”. If you had more entries in your file, you could navigate between them by pressing ctrl-up and ctrl-down. Save the file. poEdit automatically creates “messages.mo” in the same directory as “messages.po”. This is the compiled version that will later be used by PHP.
Initializing the gettext library
Let’s see if our script can now display the localized string. The first thing you have to do is tell the gettext library which locale you want to use and where the language files are stored. Let’s create a new file called “localization.php” that will handle all the gettext initialization. Copy the following code into localization.php (note: this code is for the PHP gettext extension; PHP libraries such as php-gettext may require different initialization):
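<?php
// Default locale; can be overridden with ?locale=... in the URL.
$locale = "de_DE";
if (isset($_GET["locale"])) $locale = $_GET["locale"];
// Apply the locale to the environment and to PHP itself.
putenv("LC_ALL=$locale");
setlocale(LC_ALL, $locale);
// Look for messages.mo under ./locale/<locale>/LC_MESSAGES/.
bindtextdomain("messages", "./locale");
textdomain("messages");
?>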
The script first sets the default locale to “de_DE”. The isset() check lets you override this by appending ?locale=… to the URL. You should replace this with real code to select the locale (perhaps by looking at the Accept-Language header), but it works for our tutorial. The putenv() and setlocale() calls set the current locale to that value, and the bindtextdomain() call registers a text domain named “messages” whose messages.mo files are looked up in the locale directory. The textdomain() function selects that domain as the default domain.
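As one possible shape for that “real code”, the sketch below picks a locale from the Accept-Language header and only ever returns a value from a fixed list, so arbitrary user input never reaches putenv() or setlocale(). The pick_locale() function and the $supported map are assumptions made for this example, not part of gettext or PHP.

```php
<?php
// Hypothetical helper: choose the best supported locale for an
// Accept-Language header value. $supported maps lowercase language
// tags to the locale directory names used under ./locale.
function pick_locale(string $header, array $supported, string $default): string
{
    $candidates = [];
    foreach (explode(",", $header) as $position => $part) {
        $pieces = explode(";", trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === "") {
            continue;
        }
        // Quality defaults to 1.0 when no ";q=" parameter is present.
        $q = 1.0;
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (strncasecmp($param, "q=", 2) === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $candidates[] = [$tag, $q, $position];
    }
    // Highest quality first; ties keep the order sent by the browser.
    usort($candidates, function ($a, $b) {
        return ($b[1] <=> $a[1]) ?: ($a[2] <=> $b[2]);
    });
    foreach ($candidates as [$tag, $q]) {
        if ($q <= 0) {
            continue; // q=0 means "not acceptable"
        }
        if (isset($supported[$tag])) {
            return $supported[$tag];
        }
        // Fall back from a regional tag such as "de-at" to "de".
        $primary = explode("-", $tag)[0];
        if (isset($supported[$primary])) {
            return $supported[$primary];
        }
    }
    return $default;
}

$supported = ["de" => "de_DE", "en" => "en_US"];
$header = $_SERVER["HTTP_ACCEPT_LANGUAGE"] ?? "";
$locale = pick_locale($header, $supported, "de_DE");
```

The resulting $locale can then be passed to putenv() and setlocale() in place of the raw $_GET value.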
Include that file in your test.php by adding a require_once("localization.php"); line to the top of the file. Now reload test.php in your browser and it should say “Hallo Welt!”. Then try test.php?locale=en_US and it should say “Hello World!”. You have just created a multilingual PHP script!
Updating your application
Let’s add another string to the script. I’ll add echo _("Welcome to my test page");. When you reload the page with the German locale, you will see that “Hello World!” is localized, while the second string is not. Let’s change that. In poEdit (reopen the .po file if you have closed it), click the update button (that’s the one with the wheel). A new dialog should show one new string. Click OK, translate it (in German it would be “Willkommen auf meiner Testseite”) and save the file. Reload test.php in your browser. Nothing has changed? When using the PHP gettext extension, .mo files are cached by the gettext library. The only way I have found to clear the cache is to restart the web server. Once you have done that, both strings should be translated.
Using plurals
One last thing I want to discuss is gettext’s plural support. This may seem a bit complicated at first, but is actually very powerful. Sometimes you need a string that supports plural forms. A typical example would be “0 comments”, “1 comment”, “2 comments”, “3 comments”… In English you can just add an ‘s’ if the number n != 1. However, in other languages it’s not that easy. gettext supports all these cases. In your scripts you use something like this:
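A two-line sketch that matches the description in this section is shown below. The strings and the value of $n are illustrative, and the code assumes the PHP gettext extension is available.

```php
<?php
// Number of comments to display; change it to see singular and plural forms.
$n = 3;
// ngettext() picks the format string for $n; printf() then fills in the number.
printf(ngettext("%d comment", "%d comments", $n), $n);
```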
If you have never used (s)printf before, have a look at its documentation. The interesting part is of course the ngettext call. ngettext takes three arguments: the first is the singular string, the second is the plural string and the last is the number. While that was quite easy, localizing these strings is a bit harder. Add the two lines to your test.php script.
Before you update your language file in poEdit, open the catalog options window (catalog -> options). Now let’s fill in that last field. For our German locale, set the plural forms field to “nplurals=2; plural=(n != 1);”. Essentially this specifies that there are two plural forms and that the form to use is determined by the expression (n != 1). This evaluates to 0 if n == 1 and to 1 if n != 1. For more information about this field, including samples for various languages, please see the gettext documentation.
Now press the update button again and select the new string. You will notice that the lower left-hand corner of the window has changed. It now displays the original singular and plural forms, and the translation box has been replaced by two tabs (the number depends on how many plural forms the language has). Enter “%d Kommentar” in the first tab and “%d Kommentare” in the second tab. Save the file, restart the web server if needed, and your page should show the correct translation (you may want to change the value of $n to see the effect of the ngettext function).
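To see how the plural expression maps a number to one of the two translated forms, here is a small sketch that mirrors the rule in plain PHP. gettext evaluates the expression itself; the plural_index() function and the $forms array exist only for this illustration.

```php
<?php
// Illustrative only: mirrors "nplurals=2; plural=(n != 1);" in plain PHP.
// plural_index() is a hypothetical name, not a gettext function.
function plural_index(int $n): int
{
    return ($n != 1) ? 1 : 0;
}

// The translated forms, in the order of the two poEdit tabs.
$forms = ["%d Kommentar", "%d Kommentare"];

foreach ([0, 1, 2] as $n) {
    echo sprintf($forms[plural_index($n)], $n), "\n";
}
```

Note that 0 selects the second form here, just as it does in English (“0 comments”).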
Improving localization in your scripts
This concludes my introduction to localizing PHP web sites using gettext. Here are a few ideas to improve localization in your applications:
- During development, you may want to use a PHP library instead of the PHP gettext extension, because such libraries do not require you to restart the web server every time you modify the language files. You can create a wrapper function (e.g. __()) that calls either the library function or the extension, depending on which is available or selected. You can add the name of the wrapper function as a keyword in poEdit, so that it is recognized.
- You may want to select a default locale according to the browser’s Accept-Language header, so that users do not have to select their language first.
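The wrapper idea from the first point could be sketched roughly as follows. The translate_override hook is an assumption made for this example: it stands in for whatever translate function a pure-PHP library provides, since that library’s API is not shown in this post.

```php
<?php
// Hypothetical hook: set this to a callable supplied by a pure-PHP
// library to bypass the extension (and its caching) during development.
$GLOBALS["translate_override"] = null;

// Wrapper around translation lookups; add "__" as a keyword in poEdit
// so that strings passed to it are extracted.
function __(string $text): string
{
    $override = $GLOBALS["translate_override"] ?? null;
    if (is_callable($override)) {
        return $override($text);
    }
    if (function_exists("gettext")) {
        return gettext($text);
    }
    // Neither a library nor the extension is available: return the text as-is.
    return $text;
}

echo __("Hello World!");
```

Because all lookups go through one function, switching between the library and the extension becomes a one-line configuration change.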
The performance of gettext
See my follow-up post “Benchmarking PHP Localization – Is gettext fast enough?” for benchmarks. In general, the gettext extension is faster than using a string array. The pure PHP implementation of gettext is slower and not recommended if you can use the PHP extension.