
Internationalization (i18n) and Localization (l10n)

Disclaimer for newcomers: i18n and l10n are numeronyms, a kind of abbreviation where numbers are used to shorten

words - in our case, internationalization becomes i18n and localization, l10n.

First of all, we need to define those two similar concepts and other related things:

Internationalization is when you organize your code so it can be adapted to different languages or regions without refactoring. This is usually done once - preferably at the beginning of the project, or else you’ll probably need some huge changes in the source!

Localization happens when you adapt the interface (mainly) by translating contents, based on the i18n work done

before. It usually is done every time a new language or region needs support and is updated when new interface pieces

are added, as they need to be available in all supported languages.

Pluralization defines the rules that each language requires for strings containing numbers and counters. For instance, in English when you have only one item, it’s singular, and anything different from that is called plural; plural in this language is indicated by adding an S after some words, and sometimes by changing parts of them. In other languages, such as Russian or Serbian, there are two plural forms in addition to the singular - you may even find languages with a total of four, five or six forms, such as Slovenian, Irish or Arabic.

Common ways to implement

The easiest way to internationalize PHP software is by using array files and using those strings in templates, such as <?=$TRANS['title_about_page']?>. This is, however, hardly a recommended way for serious projects, as it poses some maintenance issues along the road - some might appear in the very beginning, such as pluralization. So, please, don’t try this if your project will contain more than a couple of pages.
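As a rough sketch of the array-based approach, assuming a hypothetical $TRANS array that would normally be loaded from a per-language file: only the title_about_page key comes from the text, while the other key, the sample strings and the trans_lookup() helper are made up for illustration.

```php
<?php
// Hypothetical per-language array; in a real project it would usually be
// returned by a file such as lang/en.php (that path is an assumption).
$TRANS = [
    'title_about_page' => 'About us',
    'unread_messages'  => '%d unread messages',
];

/**
 * Looks up a key and falls back to the key itself when it is missing.
 */
function trans_lookup(array $trans, string $key): string
{
    return $trans[$key] ?? $key;
}

echo trans_lookup($TRANS, 'title_about_page'), PHP_EOL;            // About us
echo sprintf(trans_lookup($TRANS, 'unread_messages'), 1), PHP_EOL; // 1 unread messages
```

The second line of output shows the pluralization issue mentioned above: a plain array has no notion of which form fits which number.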


The most classic way, often taken as the reference for i18n and l10n, is a Unix tool called gettext. It dates back to 1995 and is still a complete implementation for translating software. It is pretty easy to get running, while still sporting powerful supporting tools. Gettext is what we will be talking about here. Also, to spare you from getting messy on the command line, we will be presenting a great GUI application that can be used to easily update your l10n source files.

Other tools

There are common libraries that support Gettext and other implementations of i18n. Some of them may seem easier to install, or sport additional features or i18n file formats. In this document, we focus on the tools provided with the PHP core, but here we list others for completeness:

oscarotero/Gettext: Gettext support with an OO interface; includes improved helper functions, powerful

extractors for several file formats (some of them not supported natively by the gettext command), and can also export

to other formats besides .mo/.po files. Can be useful if you need to integrate your translation files into other parts

of the system, like a JavaScript interface.

symfony/translation: supports a lot of different formats, but recommends using verbose XLIFF’s. Doesn’t

include helper functions nor a built-in extractor, but supports placeholders using strtr() internally.

zend/i18n: supports array and INI files, or Gettext formats. Implements a caching layer to save you from

reading the filesystem every time. It also includes view helpers, and locale-aware input filters and validators.

However, it has no message extractor.

Other frameworks also include i18n modules, but those are not available outside of their codebases:

- Laravel supports basic array files, has no automatic extractor but includes a @lang helper for template files.

- Yii supports array, Gettext, and database-based translation, and includes a messages extractor. It is backed by the

Intl extension, available since PHP 5.3, and based on the ICU project; this enables Yii to run powerful

replacements, like spelling out numbers, formatting dates, times, intervals, currency, and ordinals.

If you decide to go for one of the libraries that provide no extractors, you may want to use the gettext formats, so

you can use the original gettext toolchain (including Poedit) as described in the rest of the chapter.

Gettext

Installation

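First, you might need to install Gettext and the related PHP library through your package manager, such as apt-get or yum.

After installing it, enable Gettext by adding extension=gettext.so (Linux/Unix) or extension=php_gettext.dll (Windows) to your php.ini file.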
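A quick way to confirm that the extension was picked up is to ask PHP itself. This is only a minimal sketch: the helper name gettext_is_available() is made up, while extension_loaded() and function_exists() are standard PHP functions.

```php
<?php
/**
 * Reports whether the gettext extension is loaded in this PHP runtime.
 */
function gettext_is_available(): bool
{
    return extension_loaded('gettext') && function_exists('gettext');
}

echo gettext_is_available()
    ? "gettext is enabled\n"
    : "gettext is missing; check the extension line in php.ini\n";
```

Keep in mind that the command-line PHP and the web server may read different php.ini files, so it can be worth running the check in both places.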

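We will also be using Poedit to create translation files. You will probably find it in your system’s package manager; it is available for Unix, Mac and Windows, and can also be downloaded from its official website.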

Structure

Types of files

There are three files you usually deal with while working with gettext. The main ones are PO (Portable Object) and

MO (Machine Object) files, the first being a list of readable “translated objects” and the second, the corresponding

binary to be interpreted by gettext when doing localization. There’s also a POT (Template) file, that simply contains

all existing keys from your source files, and can be used as a guide to generate and update all PO files. Those template

files are not mandatory: depending on the tool you’re using to do l10n, you can go just fine with only PO/MO files.

You’ll always have one pair of PO/MO files per language and region, but only one POT per domain.

Domains

There are some cases, in big projects, where you might need to separate translations when the same words convey

different meaning given a context. In those cases, you split them into different domains. They’re basically named

groups of POT/PO/MO files, where the filename is the said translation domain. Small and medium-sized projects usually,

for simplicity, use only one domain; its name is arbitrary, but we will be using “main” for our code samples.

In Symfony projects, for example, domains are used to separate the translation for validation messages.

Locale code

A locale is simply a code that identifies one version of a language. It’s defined following the ISO 639-1 and ISO 3166-1 alpha-2 specs: two lower-case letters for the language, optionally followed by an underscore and two upper-case letters identifying the country or regional code. For rare languages, three letters are used.

For some speakers, the country part may seem redundant. In fact, some languages have dialects in different countries, such as Austrian German (de_AT) or Brazilian Portuguese (pt_BR). The second part is used to distinguish between those dialects - when it’s not present, it’s taken as a “generic” or “hybrid” version of the language.
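The shape described above can be checked with a small regular expression. The following is just a sketch: parse_locale_code() is a made-up helper, and it deliberately covers only the language and optional country parts, not the extra suffixes (such as an encoding) that some systems append.

```php
<?php
/**
 * Splits a locale code such as "pt_BR" into language and region.
 * Returns null when the code does not match the described shape:
 * two or three lower-case letters, optionally "_" plus two upper-case letters.
 */
function parse_locale_code(string $code): ?array
{
    if (!preg_match('/^([a-z]{2,3})(?:_([A-Z]{2}))?$/', $code, $m)) {
        return null;
    }
    return ['language' => $m[1], 'region' => $m[2] ?? null];
}

var_dump(parse_locale_code('pt_BR')); // language "pt", region "BR"
var_dump(parse_locale_code('de'));    // language "de", region null
var_dump(parse_locale_code('PT-br')); // null: wrong case and separator
```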

Directory structure

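To use Gettext, we need to follow a specific directory structure. First, select an arbitrary directory in your source repository as the root for your l10n files. Inside it, create one directory for each needed locale, and inside each of those a fixed LC_MESSAGES directory that holds the PO/MO files. For example: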

├─ src/
├─ templates/
└─ locales/
   ├─ forum.pot
   ├─ site.pot
   ├─ de/
   │  └─ LC_MESSAGES/
   │     ├─ forum.mo
   │     ├─ forum.po
   │     ├─ site.mo
   │     └─ site.po
   ├─ es_ES/
   │  └─ LC_MESSAGES/
   │     └─ ...
   ├─ fr/
   │  └─ ...
   ├─ pt_BR/
   │  └─ ...
   └─ pt_PT/
      └─ ...
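To make the layout concrete, the path of a compiled catalog can be derived from the root directory, the locale and the domain. This is an illustrative sketch only: catalog_path() is a hypothetical helper, not something Gettext provides.

```php
<?php
/**
 * Builds the path where a compiled catalog is expected for a locale and domain,
 * following the <root>/<locale>/LC_MESSAGES/<domain>.mo layout shown above.
 */
function catalog_path(string $localesDir, string $locale, string $domain): string
{
    return rtrim($localesDir, '/') . "/$locale/LC_MESSAGES/$domain.mo";
}

echo catalog_path('../locales', 'pt_BR', 'forum'), PHP_EOL;
// ../locales/pt_BR/LC_MESSAGES/forum.mo
```

A helper like this can be handy for a deployment check that verifies every supported locale actually ships a .mo file.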

Plural forms

As we said in the introduction, different languages might sport different plural rules. However, gettext saves us from

this trouble once again. When creating a new .po file, you’ll have to declare the plural rules for that

language, and translated pieces that are plural-sensitive will have a different form for each of those rules. When

calling Gettext in code, you’ll have to specify the number related to the sentence, and it will work out the correct

form to use - even using string substitution if needed.

Plural rules include the number of plurals available and a boolean test with n that would define in which rule the

given number falls (starting the count with 0). For example:

Japanese: nplurals=1; plural=0 - only one rule

English: nplurals=2; plural=(n != 1); - two rules, first if N is one, second rule otherwise

Brazilian Portuguese: nplurals=2; plural=(n > 1); - two rules, second if N is bigger than one, first otherwise
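The three rules above can be mirrored in plain PHP to see which form index each one yields. This is only a sketch for building intuition: Gettext evaluates the expression from the .po header itself, and the $pluralIndex array is a made-up name.

```php
<?php
// Each closure mirrors the `plural=` expression of one header and returns
// the zero-based index of the form to use for the count $n.
$pluralIndex = [
    'ja'    => fn (int $n): int => 0,
    'en'    => fn (int $n): int => (int) ($n != 1),
    'pt_BR' => fn (int $n): int => (int) ($n > 1),
];

foreach ([0, 1, 2] as $n) {
    printf(
        "n=%d  ja=%d  en=%d  pt_BR=%d\n",
        $n,
        $pluralIndex['ja']($n),
        $pluralIndex['en']($n),
        $pluralIndex['pt_BR']($n)
    );
}
// n=0  ja=0  en=1  pt_BR=0
// n=1  ja=0  en=0  pt_BR=0
// n=2  ja=0  en=1  pt_BR=1
```

Note how zero is plural in English but singular in Brazilian Portuguese, which is exactly the difference between (n != 1) and (n > 1).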

Now that you understand the basics of how plural rules work (and if you don’t, please look at the deeper explanation in the LingoHub tutorial), you might want to copy the ones you need from a list instead of writing them by hand.

When calling Gettext to localize sentences with counters, you’ll have to give it the related number as well. Gettext will work out which rule should be in effect and use the correct localized version. You will need to include in the .po file a different sentence for each plural rule defined.

Sample implementation

After all that theory, let’s get a little practical. Here’s an excerpt of a .po file - don’t mind its format, but instead the overall content; you’ll learn how to edit it easily later:

msgid ""
msgstr ""
"Language: pt_BR\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"

msgid "We're now translating some strings"
msgstr "Nós estamos traduzindo algumas strings agora"

msgid "Hello %1$s! Your last visit was on %2$s"
msgstr "Olá %1$s! Sua última visita foi em %2$s"

msgid "Only one unread message"
msgid_plural "%d unread messages"
msgstr[0] "Só uma mensagem não lida"
msgstr[1] "%d mensagens não lidas"

The first section works like a header, having the msgid and msgstr especially empty. It describes the file encoding,

plural forms and other things that are less relevant.

The second section translates a simple string from English to

Brazilian Portuguese, and the third does the same, but leveraging string replacement from sprintf so the

translation may contain the user name and visit date.

The last section is a sample of pluralization forms, displaying

the singular and plural version as msgid in English and their corresponding translations as msgstr 0 and 1

(following the number given by the plural rule). There, string replacement is used as well so the number can be seen

directly in the sentence, by using %d. The plural forms always have two msgid (singular and plural), so it’s

advised to not use a complex language as the source of translation.
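To see how the placeholders in those entries behave, here is a small sketch that applies sprintf() directly to the translated strings from the excerpt. The user name and date are made up, and the form index is computed by hand here, whereas ngettext() would normally pick it.

```php
<?php
// Translated string copied from the .po excerpt; single quotes keep PHP
// from treating $s as a variable.
$greeting = 'Olá %1$s! Sua última visita foi em %2$s';
echo sprintf($greeting, 'Ana', '2024-05-01'), PHP_EOL;
// Olá Ana! Sua última visita foi em 2024-05-01

// The two msgstr forms of the plural entry, indexed as in the file.
$forms = ['Só uma mensagem não lida', '%d mensagens não lidas'];
$n     = 3;
$index = (int) ($n > 1); // plural=(n > 1) from the header
echo sprintf($forms[$index], $n), PHP_EOL;
// 3 mensagens não lidas
```

Numbered placeholders such as %1$s let a translator reorder the arguments when the target language needs a different word order.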

Discussion on l10n keys

As you might have noticed, we’re using as source ID the actual sentence in English. That msgid is the same used

throughout all your .po files, meaning other languages will have the same format and the same msgid fields but

translated msgstr lines.

Talking about translation keys, there are two main “schools” here:

msgid as a real sentence.

The main advantages are:

if there are pieces of the software untranslated in any given language, the key displayed will still maintain some

meaning. Example: if you happen to translate by heart from English to Spanish but need help to translate to French,

you might publish the new page with missing French sentences, and parts of the website would be displayed in English

instead;

it’s much easier for the translator to understand what’s going on and make a proper translation based on the

msgid;

it gives you “free” l10n for one language - the source one;

The only disadvantage: if you need to change the actual text, you would need to replace the same msgid

across several language files.

msgid as a unique, structured key.

It would describe the sentence role in the application in a structured way, including the template or part where the

string is located instead of its content.

it’s a great way to have the code organized, separating the text content from the template logic.

however, that could bring problems to the translator that would miss the context. A source language file would be

needed as a basis for other translations. Example: the developer would ideally have an en.po file, that

translators would read to understand what to write in fr.po for instance.

missing translations would display meaningless keys on screen (top_menu.welcome instead of Hello there, User! on the said untranslated French page). That’s good, as it would force translation to be complete before publishing - but bad, as translation issues would be really awful in the interface. Some libraries, though, include an option to specify a given language as “fallback”, giving a behavior similar to the other approach.

The Gettext manual favors the first approach as, in general, it’s easier for translators and users in

case of trouble. That’s how we will be working here as well. However, the Symfony documentation favors

keyword-based translation, to allow for independent changes of all translations without affecting templates as well.

Everyday usage

In a common application, you would use some Gettext functions while writing static text in your pages. Those sentences

would then appear in .po files, get translated, compiled into .mo files and then, used by Gettext when rendering

the actual interface. Given that, let’s tie together what we have discussed so far in a step-by-step example:

1. A sample template file, including some different gettext calls

<?php include 'i18n_setup.php' ?>
<?=gettext('Introduction')?>
<?=gettext('We\'re now translating some strings')?>

gettext() simply translates a msgid into its corresponding msgstr for a given language. There’s also

the shorthand function _() that works the same way;

ngettext() does the same but with plural rules;

there are also dgettext() and dngettext(), which allow you to override the domain for a single call. More on domain configuration in the next example.
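The functions listed above could be exercised as follows. This is a hedged sketch: the $unread counter is hypothetical, and the stand-in definitions exist only so the snippet also runs where the extension is not installed. With no catalog bound, the real functions return the source strings too.

```php
<?php
// Minimal stand-ins so the sketch also runs where the extension is missing;
// they only mimic the untranslated behaviour (return the source string).
if (!function_exists('gettext')) {
    function gettext($msgid) { return $msgid; }
    function ngettext($singular, $plural, $n) { return $n == 1 ? $singular : $plural; }
    function dgettext($domain, $msgid) { return $msgid; }
}

$unread = 3; // hypothetical counter, e.g. loaded from the database

// Simple lookup in the default domain.
echo gettext('Introduction'), PHP_EOL;

// Plural-aware lookup: the number selects the form, sprintf() fills in %d.
echo sprintf(
    ngettext('Only one unread message', '%d unread messages', $unread),
    $unread
), PHP_EOL;

// Lookup in another domain for a single call.
echo dgettext('forum', 'Welcome back!'), PHP_EOL;
```

Note that the number is passed twice on purpose: once to ngettext() to choose the form, and once to sprintf() to print it.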

2. A sample setup file (i18n_setup.php as used above), selecting the correct locale and configuring Gettext

<?php
/**
 * Verifies if the given $locale is supported in the project
 * @param string $locale
 * @return bool
 */
function valid($locale) {
    return in_array($locale, ['en_US', 'en', 'pt_BR', 'pt', 'es_ES', 'es']);
}

// setting the source/default locale, for informational purposes
$lang = 'en_US';

if (isset($_GET['lang']) && valid($_GET['lang'])) {
    // the locale can be changed through the query-string
    $lang = $_GET['lang'];    // you should sanitize this!
    setcookie('lang', $lang); // it's stored in a cookie so it can be reused
} elseif (isset($_COOKIE['lang']) && valid($_COOKIE['lang'])) {
    // if the cookie is present instead, let's just keep it
    $lang = $_COOKIE['lang']; // you should sanitize this!
} elseif (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
    // default: look for the languages the browser says the user accepts
    $langs = explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE']);
    // drop the ";q=..." part, trim stray spaces and turn "pt-BR" into "pt_BR"
    array_walk($langs, function (&$lang) { $lang = strtr(trim(strtok($lang, ';')), ['-' => '_']); });
    foreach ($langs as $browser_lang) {
        if (valid($browser_lang)) {
            $lang = $browser_lang;
            break;
        }
    }
}

// here we define the global system locale given the found language
putenv("LANG=$lang");

// this might be useful for date functions (LC_TIME) or money formatting (LC_MONETARY), for instance
setlocale(LC_ALL, $lang);

// this will make Gettext look for ../locales/<lang>/LC_MESSAGES/main.mo
bindtextdomain('main', '../locales');

// indicates in what encoding the file should be read
bind_textdomain_codeset('main', 'UTF-8');

// if your application has additional domains, as cited before, you should bind them here as well
bindtextdomain('forum', '../locales');
bind_textdomain_codeset('forum', 'UTF-8');

// here we indicate the default domain the gettext() calls will respond to
textdomain('main');

// this would look for the string in forum.mo instead of main.mo
// echo dgettext('forum', 'Welcome back!');
?>
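The browser-language branch of the setup file can also be isolated into a pure function, which makes it easier to test. The sketch below is one possible shape, with pick_locale() being a made-up name. Like the setup file, it takes entries in header order and ignores the q-values instead of sorting by them.

```php
<?php
/**
 * Returns the first supported locale named in an Accept-Language header,
 * or $default when none matches.
 */
function pick_locale(string $header, array $supported, string $default): string
{
    foreach (explode(',', $header) as $entry) {
        // " pt-BR;q=0.8" -> "pt-BR" -> "pt_BR"
        $candidate = strtr(trim(explode(';', $entry)[0]), ['-' => '_']);
        if (in_array($candidate, $supported, true)) {
            return $candidate;
        }
    }
    return $default;
}

$supported = ['en_US', 'en', 'pt_BR', 'pt', 'es_ES', 'es'];

echo pick_locale('fr-FR, pt-BR;q=0.8, en;q=0.5', $supported, 'en_US'), PHP_EOL; // pt_BR
echo pick_locale('de', $supported, 'en_US'), PHP_EOL;                           // en_US
```

Because the result is always either a whitelisted value or the default, this shape also covers the sanitizing concern raised in the comments of the setup file.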

3. Preparing translation for the first run

One of the powerful advantages Gettext has over custom framework i18n packages is its custom file type. “Oh man, that’s quite hard to understand and edit by hand, a simple array would be easier!” Make no mistake, applications like Poedit are here to help - a lot. You can get the program from their website; it’s free and available for all platforms. It’s a pretty easy tool to get used to, and a very powerful one at the same time, using all the features Gettext has available.

In the first run, you should select “File > New Catalog” from the menu. There you’ll have a small screen where we will

set the terrain so everything else runs smoothly. You’ll be able to find those settings later through

“Catalog > Properties”:

Project name and version, Translation Team and email address: useful information that goes in the .po file header;

Language: here you should use that format we mentioned before, such as en_US or pt_BR;

Charsets: UTF-8, preferably;

Source charset: set here the charset used by your PHP files - probably UTF-8 as well, right?

Plural forms: here go those rules we mentioned before - there’s a link in there with samples as well;

Source paths: here you must include all folders from the project where gettext() (and siblings) will happen - this

is usually your templates folder(s)

Source keywords: this last part is filled by default, but you might need to alter it later - and it is one of the powerful points of Gettext. The underlying software knows what the gettext() calls look like in several programming languages, but you might as well create your own translation forms. This will be discussed later in the “Tips” section.

After setting those points you’ll be prompted to save the file - using that directory structure we mentioned as well,

and then it will run a scan through your source files to find the localization calls. They’ll be fed empty into the

translation table, and you’ll start typing in the localized versions of those strings. Save it and a .mo file will be

(re)compiled into the same folder and ta-dah: your project is internationalized.

4. Translating strings

As you may have noticed before, there are two main types of localized strings: simple ones and the ones with plural

forms. The first ones simply have two boxes: the source and the localized string. The source string can’t be modified, as Gettext/Poedit cannot alter your source files - you should change the source itself and rescan the files. Tip: you may right-click a translation line and it will show you the source files and lines where that string is being used.

On the other hand, plural form strings include two boxes to show the two source strings, and tabs so you can configure

the different final forms.

Whenever you change your sources and need to update the translations, just hit Refresh and Poedit will rescan the code,

removing non-existent entries, merging the ones that changed and adding new ones. It may also try to guess some

translations, based on other ones you did. Those guesses and the changed entries will receive a “Fuzzy” marker,

indicating it needs review, being highlighted in the list. It’s also useful if you have a translation team and someone

tries to write something they’re not sure about: just mark Fuzzy and someone else will review later.

Finally, it’s advised to leave “View > Untranslated entries first” marked, as it will help you a lot to not forget

any entry. From that menu, you can also open parts of the UI that allow you to leave contextual information for

translators if needed.

Tips & Tricks

Possible caching issues

If you’re running PHP as a module on Apache (mod_php), you might face issues with the .mo file being cached. It

happens the first time it’s read, and then, to update it, you might need to restart the server. On Nginx and PHP5 it

usually takes only a couple of page refreshes to refresh the translation cache, and on PHP7 it is rarely needed.

Additional helper functions

As preferred by many people, it’s easier to use _() instead of gettext(). Many custom i18n libraries from

frameworks use something similar to t() as well, to make translated code shorter. However, that’s the only function

that sports a shortcut. You might want to add in your project some others, such as __() or _n() for ngettext(),

or maybe a fancy _r() that would join gettext() and sprintf() calls. Other libraries, such as

oscarotero’s Gettext also provide helper functions like these.

In those cases, you’ll need to instruct the Gettext utility on how to extract the strings from those new functions.

Don’t be afraid, it’s very easy. It’s just a field in the .po file, or a Settings screen on Poedit. In the editor,

that option is inside “Catalog > Properties > Source keywords”. You need to include there the specifications of those

new functions, following a specific format:

if you create something like t() that simply returns the translation for a string, you can specify it as t.

Gettext will know the only function argument is the string to be translated;

if the function has more than one argument, you can specify in which one the first string is - and if needed, the

plural form as well. For instance, if we call our function like this: __('one user', '%d users', $number), the

specification would be __:1,2, meaning the first form is the first argument, and the second form is the second

argument. If your number comes as the first argument instead, the spec would be __:2,3, indicating the first form is

the second argument, and so on.

After including those new rules in the .po file, a new scan will bring in your new strings just as easily as before.
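The helpers mentioned in this section could look roughly like the sketch below. The names __() and _r() follow the examples in the text, but the bodies are only one possible implementation, and the stand-in definitions are there solely so the snippet runs without the extension.

```php
<?php
// Stand-ins for environments without the extension (untranslated behaviour only).
if (!function_exists('gettext')) {
    function gettext($msgid) { return $msgid; }
    function ngettext($singular, $plural, $n) { return $n == 1 ? $singular : $plural; }
}

/** Plural-aware helper; matches the __:1,2 keyword spec described above. */
function __(string $singular, string $plural, int $number): string
{
    return sprintf(ngettext($singular, $plural, $number), $number);
}

/** Hypothetical _r(): gettext() plus sprintf()-style replacement in one call. */
function _r(string $msgid, ...$args): string
{
    return vsprintf(gettext($msgid), $args);
}

echo __('one user', '%d users', 5), PHP_EOL;
echo _r('Hello %1$s! Your last visit was on %2$s', 'Ana', '2024-05-01'), PHP_EOL;
```

With these signatures, the source keywords to register would be __:1,2 for the plural helper and plain _r for the other one, since its translatable string is the first argument.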

