
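Tinkering with locale on Linux

There is no shortage of material online about locales on Linux, so there was really no need to add more. I couldn't keep my hands off it, though, so treat this as a set of notes. Everything below was done on CentOS 6.5.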

[chenhj@node1 ~]$ locale

LANG=en_US.UTF-8

LC_CTYPE="zh_CN.UTF8"

LC_NUMERIC="zh_CN.UTF8"

LC_TIME="zh_CN.UTF8"

LC_COLLATE="zh_CN.UTF8"

LC_MONETARY="zh_CN.UTF8"

LC_MESSAGES="zh_CN.UTF8"

LC_PAPER="zh_CN.UTF8"

LC_NAME="zh_CN.UTF8"

LC_ADDRESS="zh_CN.UTF8"

LC_TELEPHONE="zh_CN.UTF8"

LC_MEASUREMENT="zh_CN.UTF8"

LC_IDENTIFICATION="zh_CN.UTF8"

LC_ALL=zh_CN.UTF8

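The output above lists the locale categories the system supports, such as date/time and currency formats, together with their current values.

You can change the locale at any time by setting environment variables. Precedence, highest first:

LC_ALL > LC_* > LANG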

[root@node1 ~]# export LC_ALL=zh_CN.utf8

Or, for a system-wide default, edit /etc/sysconfig/i18n:

[root@node1 ~]# cat /etc/sysconfig/i18n

LANG="en_US.UTF-8"

SYSFONT="latarcyrheb-sun16"
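The LC_ALL > LC_* > LANG precedence can be sketched as a small shell function. This is only an illustration: resolve_locale is a hypothetical helper, not a system command, and the final POSIX fallback is my assumption based on what glibc reports when none of the variables is set.

```shell
# resolve_locale LC_ALL_VALUE CATEGORY_VALUE LANG_VALUE
# Hypothetical helper: prints the first non-empty argument, mirroring
# LC_ALL > LC_* > LANG. Empty values are treated the same as unset ones.
resolve_locale() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # LC_ALL overrides everything
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # then the specific LC_* category
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # then LANG
    else
        printf 'POSIX\n'        # assumed default when nothing is set
    fi
}

# What would apply to LC_TIME in the current shell:
resolve_locale "${LC_ALL:-}" "${LC_TIME:-}" "${LANG:-}"
```

This also explains the locale output at the top of the post: LANG is en_US.UTF-8, but every LC_* category shows zh_CN.UTF8 because LC_ALL is set.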

locale -a lists the locales available on the system. For example, to list all the zh_CN ones:

[chenhj@node1 ~]$ locale -a|grep zh_CN

zh_CN

zh_CN.gb18030

zh_CN.gb2312

zh_CN.gbk

zh_CN.utf8
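When a script needs to know whether a particular locale is installed before exporting it, the same locale -a listing can be wrapped in a check. A minimal sketch: has_locale and norm_locale_name are hypothetical helper names, and the normalisation (lowercase, hyphens removed) is a crude assumption so that zh_CN.UTF-8 and zh_CN.utf8 compare equal.

```shell
# norm_locale_name: lowercase stdin and drop hyphens,
# so "zh_CN.UTF-8" and "zh_CN.utf8" both become "zh_cn.utf8".
norm_locale_name() {
    tr '[:upper:]' '[:lower:]' | tr -d '-'
}

# has_locale NAME: succeed if NAME appears in `locale -a`.
# Fails quietly when the locale command itself is not available.
has_locale() {
    want=$(printf '%s\n' "$1" | norm_locale_name)
    locale -a 2>/dev/null | norm_locale_name | grep -qxF -- "$want"
}

if has_locale zh_CN.UTF-8; then
    echo "zh_CN.UTF-8 is available"
else
    echo "zh_CN.UTF-8 is not installed"
fi
```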

The locale definition files are in /usr/share/i18n/locales, for example zh_CN:

/usr/share/i18n/locales/zh_CN:

...

LC_CTYPE

% This is a copy of the "i18n" LC_CTYPE with the following modifications:

% - Additional classes: hanzi

copy "i18n"

translit_start

include "translit_combining";""

translit_end

class "hanzi"; /

% <U3400>..<U4DBF>;/

        <U4E00>..<U9FA5>;/

        <UF92C>;<UF979>;<UF995>;<UF9E7>;<UF9F1>;<UFA0C>;<UFA0D>;<UFA0E>;/

        <UFA0F>;<UFA11>;<UFA13>;<UFA14>;<UFA18>;<UFA1F>;<UFA20>;<UFA21>;/

        <UFA23>;<UFA24>;<UFA27>;<UFA28>;<UFA29>

END LC_CTYPE

% ISO 14651 collation sequence

LC_COLLATE

copy "iso14651_t1_pinyin"

END LC_COLLATE

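The LC_CTYPE section above defines the character class "hanzi" for Simplified Chinese, and LC_COLLATE specifies pinyin ordering for Chinese characters.

We can also open the file that defines the pinyin ordering: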

/usr/share/i18n/locales/iso14651_t1_pinyin

copy "iso14651_t1_common"

script <HAN>

order_start <HAN>;forward;forward;forward;forward,position

<U5416> <U5416>;IGNORE;IGNORE;IGNORE #吖104

<U814C> <U814C>;IGNORE;IGNORE;IGNORE #腌185

<U9312> <U9312>;IGNORE;IGNORE;IGNORE #錒0

<U9515> <U9515>;IGNORE;IGNORE;IGNORE #锕7

<U963F> <U963F>;IGNORE;IGNORE;IGNORE #阿23237

<U55C4> <U55C4>;IGNORE;IGNORE;IGNORE #嗄60

<U554A> <U554A>;IGNORE;IGNORE;IGNORE #啊16566

<U54C0> <U54C0>;IGNORE;IGNORE;IGNORE #哀4070

<U54CE> <U54CE>;IGNORE;IGNORE;IGNORE #哎2473

The trailing comments make it obvious: the entries are indeed in pinyin order.

Character sets are defined under /usr/share/i18n/charmaps, for example GB2312.
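The effect of LC_COLLATE is easy to see with sort. The sketch below sorts four characters taken from the listing, first in the C locale and then, only if it is installed, in zh_CN.utf8. The sample function is just a hypothetical helper for this demo.

```shell
# Four characters that appear in the listing above, deliberately shuffled.
sample() {
    printf '%s\n' 啊 阿 哎 哀
}

# LC_ALL=C compares raw bytes, which for UTF-8 text means code point order.
sample | LC_ALL=C sort

# Only try the pinyin collation if zh_CN.utf8 is actually installed.
if locale -a 2>/dev/null | grep -qx 'zh_CN\.utf8'; then
    sample | LC_ALL=zh_CN.utf8 sort
fi
```

In the C locale the result is 哀 哎 啊 阿 (U+54C0, U+54CE, U+554A, U+963F). With the pinyin collation on a system like the one in this post, I would expect the order of the listing instead, that is 阿 before 啊 before 哀 before 哎, though other glibc versions may order them differently.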

/usr/share/i18n/charmaps/GB2312.gz

<code_set_name> GB2312

<mb_cur_max> 2

<mb_cur_min> 1

<comment_char> %

<escape_char> /

% Chinese charmap for EUC-CN = GB2312 = union of ASCII and GB_2312-80

% version: 1.0

% Contact: ha_shao

% Email: [email protected]

% Distribution and use is free, even for comercial purpose.

%

CHARMAP

<U0000> /x00 NULL (NUL)

<U0001> /x01 START OF HEADING (SOH)

<U0002> /x02 START OF TEXT (STX)

<U0003> /x03 END OF TEXT (ETX)

<U0004> /x04 END OF TRANSMISSION (EOT)

<U0005> /x05 ENQUIRY (ENQ)

<U0006> /x06 ACKNOWLEDGE (ACK)

<U0007> /x07 BELL (BEL)

<U0008> /x08 BACKSPACE (BS)

<U0009> /x09 CHARACTER TABULATION (HT)
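A charmap is simply a table from Unicode code points to byte sequences, and iconv makes the mapping visible. A small sketch: hex_bytes and gb2312_hex are hypothetical helpers, and the example assumes the script is saved as UTF-8 and that the local iconv has a GB2312 converter.

```shell
# hex_bytes: print stdin as one continuous lowercase hex string.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# gb2312_hex TEXT: re-encode UTF-8 TEXT as GB2312 and print its bytes in hex.
# Returns 1 if iconv is missing, cannot convert, or produced no output.
gb2312_hex() {
    hex=$(printf '%s' "$1" | iconv -f UTF-8 -t GB2312 2>/dev/null | hex_bytes)
    [ -n "$hex" ] || return 1
    printf '%s\n' "$hex"
}

gb2312_hex A  || echo "conversion not available"   # ASCII stays one byte: 41
gb2312_hex 啊 || echo "conversion not available"   # a hanzi takes two bytes: b0a1
```

One byte for ASCII and two for a Chinese character is what the mb_cur_min 1 and mb_cur_max 2 header values describe.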

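The locale definitions and charmap definitions described above are the equivalent of source code. What we actually use is a compiled locale built from a locale definition plus a charmap. Locales are compiled with localedef.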

man localedef

The localedef program reads the indicated charmap and input files, compiles them to a form usable by the locale(7) functions in the C library, and places the six output files in the outputpath directory.

Let's define one and see!

[root@node1 ~]# localedef -f UTF-8 -i zh_CN myzh

[root@node1 ~]# locale -a|grep myzh

myzh

myzh.utf8

The new locale was added to /usr/lib/locale/locale-archive:

[root@node1 ~]# grep myzh /usr/lib/locale/locale-archive

Binary file /usr/lib/locale/locale-archive matches
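For experiments like this one it is also possible to avoid touching the system archive. As far as I understand localedef, an output path containing a slash is written as a directory of LC_* files, and glibc searches the directory named by LOCPATH for compiled locales. The sketch below relies on both assumptions; build_private_locale is a hypothetical helper name.

```shell
# build_private_locale NAME
# Compile zh_CN + UTF-8 into a scratch directory and print that directory.
build_private_locale() {
    name=$1
    dir=$(mktemp -d) || return 1
    # An output path containing "/" makes localedef write a directory of
    # LC_* files instead of updating locale-archive.
    localedef -f UTF-8 -i zh_CN "$dir/$name" 2>/dev/null
    # localedef may exit non-zero even when it wrote output (warnings),
    # so check for the result itself rather than the exit status.
    [ -f "$dir/$name/LC_CTYPE" ] || { rm -rf "$dir"; return 1; }
    printf '%s\n' "$dir"
}

if dir=$(build_private_locale myzh); then
    # LOCPATH tells glibc where to look for compiled locales.
    LOCPATH=$dir LC_ALL=myzh locale charmap
    rm -rf "$dir"
else
    echo "could not build the locale (localedef or zh_CN source missing?)" >&2
fi
```

If it works, locale charmap should report UTF-8, and no root access or later cleanup of the archive is needed.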

[root@node1 ~]# export LC_ALL=myzh.utf8

[root@node1 ~]# ls xx

ls: cannot access xx: No such file or directory

Why is the message still in English?

Let's see what it is doing:

[root@node1 ~]# strace -eopen ls xx

open("/etc/ld.so.cache", O_RDONLY) = 3

open("/lib64/libselinux.so.1", O_RDONLY) = 3

open("/lib64/librt.so.1", O_RDONLY) = 3

open("/lib64/libcap.so.2", O_RDONLY) = 3

open("/lib64/libacl.so.1", O_RDONLY) = 3

open("/lib64/libc.so.6", O_RDONLY) = 3

open("/lib64/libdl.so.2", O_RDONLY) = 3

open("/lib64/libpthread.so.0", O_RDONLY) = 3

open("/lib64/libattr.so.1", O_RDONLY) = 3

open("/proc/filesystems", O_RDONLY) = 3

open("/usr/lib/locale/locale-archive", O_RDONLY) = 3

open("/usr/share/locale/locale.alias", O_RDONLY) = 3

open("/usr/share/locale/myzh.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

open("/usr/share/locale/myzh/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

ls: cannot access xxopen("/usr/share/locale/myzh.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

open("/usr/share/locale/myzh/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

: No such file or directory

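So it cannot find the message catalogs (coreutils.mo and libc.mo) for the new locale. localedef only defines the basic contents of a locale; the localized resources each application uses have to be added separately.

For now, borrow them from somewhere else as a stopgap: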
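The lookup seen in the strace output can be reproduced with a few lines of shell. This is a deliberately simplified sketch that only generates the two candidates visible in the trace (full locale name, then the name without its codeset). The real gettext search tries more variants, and mo_candidates is a hypothetical helper.

```shell
# mo_candidates LOCALE DOMAIN [ROOT]
# Print the .mo paths tried for LOCALE, in the order seen in the trace.
mo_candidates() {
    loc=$1
    domain=$2
    root=${3:-/usr/share/locale}
    # First the locale name exactly as given.
    printf '%s/%s/LC_MESSAGES/%s.mo\n' "$root" "$loc" "$domain"
    # Then the same name with the ".codeset" suffix stripped, if it had one.
    case $loc in
        *.*) printf '%s/%s/LC_MESSAGES/%s.mo\n' "$root" "${loc%%.*}" "$domain" ;;
    esac
}

# Report which candidates exist for the locale used above.
for f in $(mo_candidates myzh.utf8 coreutils) $(mo_candidates myzh.utf8 libc); do
    if [ -e "$f" ]; then echo "found:   $f"; else echo "missing: $f"; fi
done
```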

[root@node1 ~]# ls /usr/share/locale/myzh

ls: cannot access /usr/share/locale/myzh: No such file or directory

[root@node1 ~]# ln -sf /usr/share/locale/zh_CN /usr/share/locale/myzh

Try again, and it works:

ls: 无法访问xx: 没有那个文件或目录

Finally, delete the temporary locale:

[root@node1 ~]# localedef --delete-from-archive myzh

[root@node1 ~]# rm -f /usr/share/locale/myzh

http://wiki.ubuntu.org.cn/Locale

http://www.linuxidc.com/Linux/2009-12/23620.htm

http://sysadmin.blog.51cto.com/83876/223870
