PHP Internationalization
This page collects PHP internationalization resources. It is intended for developers who want to make their web pages available to people in different countries and languages.
Terminology
- Internationalization (i18n)
- Localization (l10n)
- Locale
- Unicode
- Character sets
- UTF-8
Assumptions
This guide assumes that the intl extension is available, either as bundled with PHP 5.3 or as the PECL module for PHP 5.2. If it is not available for your site, consider upgrading. Internationalization is still possible without it, but that is outside the scope of this article.
This guide also assumes a MySQL 5.5.x database. If you use a different database, you may need to adapt the examples.
Basic resources (starting points)
Articles
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
- Internationalization in PHP 5.3 from Zend Developer Zone
Reference manuals
Identifying the representation
A locale identifies the set of data representation conventions (such as language, number and date formats, and sorting rules) shared by a group of users.[1] The intl extension includes a locale module. A locale is represented by a string such as en_US or fr_CA.
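As a brief sketch of the locale module, the intl Locale class can split a locale string into its parts and produce display names. The variable names are illustrative, and the display names come from the ICU data shipped with your PHP build, so they may differ slightly between installations.

```php
<?php
// Break a locale identifier into its components.
$parts = Locale::parseLocale('fr_CA');
echo $parts['language'], "\n"; // language subtag
echo $parts['region'], "\n";   // region subtag

// Human-readable names; the second argument is the locale to display them in.
$languageName = Locale::getDisplayLanguage('fr_CA', 'en');
$regionName   = Locale::getDisplayRegion('fr_CA', 'en');
echo $languageName, " / ", $regionName, "\n";
```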
Detecting the locale from the browser
You can use the $_SERVER['HTTP_ACCEPT_LANGUAGE'] value to help determine the locale.[2]
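One possible way to do this is with Locale::acceptFromHttp(), which picks the best available locale from an Accept-Language header value. The sketch below wraps it in pickLocale(), a hypothetical helper (not part of intl) that falls back to an assumed default of en_US when the header is missing or cannot be matched.

```php
<?php
// pickLocale() is a hypothetical helper, not part of the intl extension.
// It returns the locale negotiated from the header, or $default on failure.
function pickLocale($acceptLanguage, $default = 'en_US')
{
    if (!is_string($acceptLanguage) || $acceptLanguage === '') {
        return $default;
    }
    $locale = Locale::acceptFromHttp($acceptLanguage);
    if ($locale === false || $locale === null || $locale === '') {
        return $default;
    }
    return $locale;
}

// The header is absent on the command line and for some clients.
$header = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
echo pickLocale($header), "\n";
```

The header only expresses a preference, so it is usually worth letting the user override the detected locale explicitly.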
Translating text
Parameterized strings and grammar
A parameterized string is a phrase or sentence that contains dynamic or replacement values. For example, an error message may be: "First name must be less than %d characters" where %d is dynamically determined by the PHP code. Different languages may require the value to appear at a different position in the sentence for it to be grammatically correct. The intl extension includes a message formatter module for parameterized strings.
Example:
$en_fmt = new MessageFormatter(
    "en_US",
    "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree\n"
);
// Arguments are substituted by position: {0}, {1}, {2}.
echo $en_fmt->format(array(4560, 123, 4560 / 123));
Displays:
4,560 monkeys on 123 trees make 37.073 monkeys per tree
- You can also parse a formatted string back into its values using the same pattern (reversing the effect).
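To illustrate both points, the sketch below reuses the monkey example: a German pattern shows locale-specific separators, a reordered English pattern shows that numbered placeholders can move within the sentence, and MessageFormatter::parse() recovers the values from a formatted string. The patterns and variable names other than those in the example above are illustrative, and the source file is assumed to be saved as UTF-8.

```php
<?php
$args = array(4560, 123, 4560 / 123);

// Same arguments, German pattern: separators follow the de_DE locale.
$de_fmt = new MessageFormatter(
    "de_DE",
    "{0,number,integer} Affen auf {1,number,integer} Bäumen sind {2,number} Affen pro Baum"
);
echo $de_fmt->format($args), "\n";

// Placeholders are numbered, so a translation may reorder (or omit) them.
$reordered = new MessageFormatter(
    "en_US",
    "On {1,number,integer} trees there are {0,number,integer} monkeys"
);
$sentence = $reordered->format($args);
echo $sentence, "\n";

// Parsing reverses formatting: the pattern is used to extract the values.
$en_fmt = new MessageFormatter(
    "en_US",
    "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree"
);
$values = $en_fmt->parse("4,560 monkeys on 123 trees make 37.073 monkeys per tree");
var_export($values);
```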
Breaking text into letters, words, and sentences
A grapheme is the smallest unit of a written language that a reader perceives as a single character; it may consist of several Unicode code points. The intl extension includes a grapheme module to parse strings into graphemes.
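A small sketch of why this matters: in the string below each accented letter is written as "e" followed by a combining accent, so byte-based functions over-count and may cut a character in half. The sample string is illustrative and is written with byte escapes so it does not depend on the file encoding.

```php
<?php
// "résumé" with each "é" stored as "e" + U+0301 COMBINING ACUTE ACCENT.
$text = "re\xCC\x81sume\xCC\x81";

$bytes     = strlen($text);          // counts bytes
$graphemes = grapheme_strlen($text); // counts user-perceived characters

echo $bytes, " bytes, ", $graphemes, " graphemes\n";

// grapheme_substr() keeps the base letter and its accent together.
$firstTwo = grapheme_substr($text, 0, 2);
echo $firstTwo, "\n";
```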
Language string storage
- Files
- Database
Sorting text
The intl extension includes a collator module for comparing and sorting strings.
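As a rough illustration, the sketch below compares PHP's byte-order sort() with Collator::sort() for the en_US locale. The word list is made up for the example, and the source file is assumed to be saved as UTF-8.

```php
<?php
$words = array('zebra', 'Zoo', 'éclair', 'apple', 'eclipse');

// Byte order: uppercase ASCII sorts first, and "é" sorts after "z".
$byteOrder = $words;
sort($byteOrder);
echo implode(', ', $byteOrder), "\n";

// Locale-aware order: accents and case are only tie-breakers.
$collator = new Collator('en_US');
$collated = $words;
$collator->sort($collated);
echo implode(', ', $collated), "\n";

// compare() returns a negative, zero, or positive integer, like strcmp().
$cmp = $collator->compare('éclair', 'zebra');
```

Sorting rules differ between locales, so the Collator should be created with the locale of the user who will read the list.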
Binary representation of text
The intl extension includes a normalizer module that normalizes Unicode strings to one standard representation.
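The following sketch shows the problem normalization solves: the same visible character can be stored as different byte sequences, which breaks naive string comparison. It uses Normalization Form C as an assumed target form; the variable names are illustrative.

```php
<?php
$composed   = "\xC3\xA9";  // U+00E9, precomposed "é"
$decomposed = "e\xCC\x81"; // "e" followed by U+0301 COMBINING ACUTE ACCENT

// The two strings look identical on screen but differ byte for byte.
var_dump($composed === $decomposed);

// After normalizing both to the same form, they compare equal.
$a = Normalizer::normalize($composed, Normalizer::FORM_C);
$b = Normalizer::normalize($decomposed, Normalizer::FORM_C);
var_dump($a === $b);

// isNormalized() reports whether a string is already in the given form.
$alreadyNfc = Normalizer::isNormalized($decomposed, Normalizer::FORM_C);
var_dump($alreadyNfc);
```

Normalizing input once, before storing or comparing it, avoids having to handle both forms everywhere else.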
Setting up the database
Numbers
The intl extension includes a number formatter module.
- Ordinal and cardinal numbers (1st, 2nd, or 1 book, 2 books)
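A minimal sketch of the number formatter follows, covering decimal, ordinal, and spelled-out styles. The plural example uses MessageFormatter rather than NumberFormatter and assumes the underlying ICU library supports the plural argument type (ICU 4.8 or later); exact output can vary slightly between ICU versions.

```php
<?php
$n = 1234567.891;

// Grouping and decimal separators depend on the locale.
$en = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$de = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
$enText = $en->format($n);
$deText = $de->format($n);
echo $enText, "\n", $deText, "\n";

// Ordinal numbers (1st, 2nd, ...).
$ordinal = new NumberFormatter('en_US', NumberFormatter::ORDINAL);
$second = $ordinal->format(2);
echo $second, "\n";

// Spelled-out numbers.
$spellout = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$two = $spellout->format(2);
echo $two, "\n";

// Cardinal plurals (1 book, 2 books) via a plural message pattern.
$books = new MessageFormatter('en_US', '{0,plural,one{# book}other{# books}}');
$oneBook  = $books->format(array(1));
$twoBooks = $books->format(array(2));
echo $oneBook, ", ", $twoBooks, "\n";
```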
Date/time
The intl extension includes a date formatter module.
- Displaying time in the proper timezone
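As a sketch of timezone-aware display, the code below formats one timestamp in several timezones with IntlDateFormatter. formatInZone() is a hypothetical helper (not part of intl), and it uses an explicit ICU pattern so that the result does not depend on the ICU version; the last formatter uses locale-driven styles, whose exact wording does vary between versions.

```php
<?php
// formatInZone() is a hypothetical helper, not part of the intl extension.
function formatInZone($timestamp, $timezone)
{
    $fmt = new IntlDateFormatter(
        'en_US',
        IntlDateFormatter::NONE,
        IntlDateFormatter::NONE,
        $timezone,
        IntlDateFormatter::GREGORIAN,
        'yyyy-MM-dd HH:mm' // explicit ICU pattern overrides the style constants
    );
    return $fmt->format($timestamp);
}

$epoch = 0; // 1970-01-01 00:00 UTC
echo formatInZone($epoch, 'UTC'), "\n";
echo formatInZone($epoch, 'America/Los_Angeles'), "\n";
echo formatInZone($epoch, 'Europe/Berlin'), "\n";

// Locale-driven styles: the locale decides word order, month names, and so on.
$french = new IntlDateFormatter(
    'fr_FR',
    IntlDateFormatter::LONG,
    IntlDateFormatter::SHORT,
    'Europe/Paris'
);
echo $french->format($epoch), "\n";
```

Storing timestamps in UTC and converting only at display time keeps the stored data independent of any user's timezone.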
Rendering
- HTTP Content-type header specification
- XHTML encoding Content-type meta specification
- Browsers, quirks, and auto-detection of language encodings
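The first two items can be sketched as follows, assuming the page is served as UTF-8 (an assumption of this example, not a requirement stated above). Declaring the encoding in both places means the browser does not have to guess it.

```php
<?php
// Assumed encoding for this sketch.
$charset = 'utf-8';

// 1. Declare the encoding in the HTTP Content-Type header.
//    header() must be called before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . $charset);
}

// 2. Repeat the declaration in the document itself (XHTML-style meta element).
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=' . $charset . '" />';
echo $meta, "\n";
```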
Input and form posts
Domains and URLs
The intl extension includes an IDN module that handles internationalized domain names.
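As a short sketch, idn_to_ascii() converts a Unicode domain name to its ASCII (Punycode) form, and idn_to_utf8() converts it back. The domain is an illustrative example written with byte escapes so it does not depend on the file encoding. Note that on PHP 7.2 and 7.3, calling these functions without an explicit variant argument may raise a deprecation notice.

```php
<?php
$unicodeDomain = "t\xC3\xA4st.de"; // "täst.de" in UTF-8

// Unicode to ASCII: the form used in DNS lookups.
$ascii = idn_to_ascii($unicodeDomain);
echo $ascii, "\n";

// ASCII back to Unicode: the form shown to users.
$roundTrip = idn_to_utf8($ascii);
echo $roundTrip, "\n";
```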
See also
References
- [1] http://devzone.zend.com/article/4799#Locale
- [2] http://stackoverflow.com/questions/297542/simplest-way-to-detect-client-locale-in-php