
A Chat About Character Sets and Character Encodings

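Table of contents

    • The ASCII character set
    • The ISO 8859-1 character set
    • The GB2312 character set
    • The GBK character set
    • The Unicode character set
    • The UTF-32 encoding
    • The UTF-16 encoding
    • The UTF-8 encoding
    • A point that may cause confusion
    • How do you choose an encoding?
    • utf8 and utf8mb4 in MySQL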
I was asked about Unicode and UTF-8 in an interview and was stumped on the spot.


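The first thing you need to be able to say is:

  1. UTF-8 is an encoding rule.
  2. Unicode is a character set.

People came up with the concept of a character set to describe the encoding rules for a particular range of characters. As we all know, a computer can only store binary data, so how does it store a string? By establishing a mapping between characters and binary data, of course. Mapping a character to binary data is called encoding; mapping binary data back to a character is called decoding.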
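
To make the two directions concrete, here is a minimal Java sketch. The class name EncodeDecodeSketch and its two helper methods are made up for illustration; the only library calls are String.getBytes(Charset) and the String(byte[], Charset) constructor, and UTF-8 is picked arbitrarily as the mapping.

```java
import java.nio.charset.StandardCharsets;

class EncodeDecodeSketch {
    // Encoding: characters -> bytes under a chosen charset.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding: bytes -> characters, using the same charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encode("abc");
        for (byte b : bytes) {
            System.out.printf("%02X ", b & 0xFF); // one hex pair per byte
        }
        System.out.println();
        System.out.println(decode(bytes)); // round-trips to the original text
    }
}
```

Decoding only gives back the original text when it uses the same mapping that was used for encoding; garbled text is usually the result of decoding with a different one.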

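The ASCII character set

ASCII (American Standard Code for Information Interchange) is a computer encoding system based on the Latin alphabet. It defines a dictionary that represents common characters. Its limitation is that ASCII was designed for English and defines only 128 codes, so it cannot represent other languages; to display text in those languages you need something like Unicode.

It contains 128 characters in total: the space, punctuation marks, digits, upper- and lower-case letters, and some non-printable control characters. Because there are only 128 characters, each one can be encoded in a single byte.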

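The ISO 8859-1 character set

It contains 256 characters in total: it extends the ASCII character set with 128 more characters commonly used in Western Europe (including the letters of German and French). Each character can still be encoded in one byte. This character set also goes by the alias latin1.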
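
As a rough illustration of the two single-byte character sets, the sketch below encodes a letter under US-ASCII and under ISO 8859-1. The class SingleByteSketch and its hex helper are hypothetical names; the relevant library behavior is that String.getBytes(Charset) substitutes the charset's replacement byte for characters it cannot map.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SingleByteSketch {
    // Hex dump of the bytes produced for `text` under `charset`.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // e with acute accent, outside ASCII
        System.out.println(hex("A", StandardCharsets.US_ASCII));      // 41
        System.out.println(hex(eAcute, StandardCharsets.ISO_8859_1)); // E9
        // Not representable in ASCII, so the replacement byte '?' is written.
        System.out.println(hex(eAcute, StandardCharsets.US_ASCII));   // 3F
    }
}
```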

The GB2312 character set

It contains Chinese characters as well as Latin letters, Greek letters, Japanese hiragana and katakana, and Russian Cyrillic letters: 6,763 Chinese characters and 682 other symbols. The character set is also compatible with ASCII, which makes its encoding look a little odd:

  • If the character is in the ASCII character set, it is encoded in 1 byte.
  • Otherwise it is encoded in 2 bytes.

An encoding in which different characters may need different numbers of bytes is called a variable-length encoding. Take the string '爱u': '爱' needs 2 bytes and is encoded as 0xB0AE in hexadecimal, while 'u' needs 1 byte and is encoded as 0x75, so the whole string is encoded as 0xB0AE75.
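
The '爱u' example can be checked with a short Java sketch. It assumes the running JDK ships the GB2312 charset, which is common but not guaranteed, so the code asks Charset.isSupported first. The class name Gb2312Sketch and the helper encodeHex are made up for illustration.

```java
import java.nio.charset.Charset;

class Gb2312Sketch {
    // Encodes `text` under the named charset and returns the bytes as hex.
    static String encodeHex(String text, String charsetName) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(Charset.forName(charsetName))) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        if (!Charset.isSupported("GB2312")) {
            System.out.println("GB2312 is not available in this runtime");
            return;
        }
        String text = "\u7231u"; // the Chinese character U+7231 followed by 'u'
        // Two bytes for the Chinese character, one byte for the ASCII letter.
        System.out.println(encodeHex(text, "GB2312")); // B0AE75
    }
}
```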

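The GBK character set

GBK is a Chinese character encoding standard whose full name is the "Chinese Internal Code Extension Specification". It extends GB2312 and is therefore fully compatible with the GB2312-80 standard. GBK still uses a double-byte scheme. Its code range is 8140-FEFE with the xx7F positions removed, for a total of 23,940 code positions. It contains 21,886 Chinese characters and graphic symbols: 21,003 Chinese characters (including radicals and components) and 883 graphic symbols. GBK supports all the CJK characters in the international standard ISO/IEC 10646-1 and the national standard GB13000-1, and it includes every Chinese character in the BIG5 encoding. The GBK scheme was officially released on 15 December 1995; that version of the specification is GBK 1.0.

What is the difference between GBK and UTF8?
  • UTF8 is very capable and supports the languages of every country. The price of that coverage is size: a Chinese character typically takes 3 bytes in UTF8 against 2 in GBK, so Chinese-heavy pages are larger, which has some effect on how fast a site loads.
  • GBK does less and is essentially limited to Chinese text. In return the same Chinese text takes less space, so pages open somewhat faster.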

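The Unicode character set

If every country encodes its own language the way China did, you end up with all sorts of encodings, and without the matching encoding installed you cannot interpret what the text is trying to say. Eventually ISO (the International Organization for Standardization) could not stand it any longer and defined a universal coded character set, UCS (Universal Multiple-Octet Coded Character Set, ISO/IEC 10646), whose repertoire is kept in step with the Unicode Standard published by the Unicode Consortium. It is huge, big enough to hold every script and symbol in the world. So as long as a computer supports Unicode, a file in any of the world's writing systems can be saved in a Unicode encoding and be read correctly on other computers. In short it has a bit of everything; amusingly, Unicode even includes characters for things well outside ordinary writing. If you are curious, take a look:

Unicode/UTF-8-character table - starting from code position 1F80 (utf8-chartable.de)

It was found that a 2-byte code was enough for everyday use around the world: 2 bytes (16 bits) give 2^16 = 65,536 values.

What counts as everyday use? In China, for instance, only about 3,500 Chinese characters are in everyday use, so 60,000-odd values is plenty.

A char in Java is 2 bytes, which is why it fits Unicode: it holds one 16-bit UTF-16 code unit, so a character beyond U+FFFF takes two chars.

A char in C is 1 byte, which is enough for the ASCII character set, that is, the 128 characters 0 to 127.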
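
The 2-byte limit of Java's char is easy to observe. In this small sketch, Utf16UnitsSketch and charCount are made-up names; Character.toChars is the standard library call that turns a code point into one or two chars.

```java
class Utf16UnitsSketch {
    // Number of Java chars (16-bit units) needed to hold one code point.
    static int charCount(int codePoint) {
        return new String(Character.toChars(codePoint)).length();
    }

    public static void main(String[] args) {
        System.out.println(charCount(0x9A6C));  // 1: fits in a single char
        System.out.println(charCount(0x1F600)); // 2: an emoji needs a pair of chars
    }
}
```

This is why String.length() counts chars rather than user-visible characters; codePointCount is the method that counts code points.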

Unicode assigns every character in the world a unique number. The numbers range from 0x000000 to 0x10FFFF (a little over one million in decimal). The number is usually written in hexadecimal with the prefix U+; for example, the Unicode number of "马" is U+9A6C. Unicode is in effect one big table that links characters to numbers. It is a specification: Unicode itself only says what number each character has, not how that number is stored.

UTF (UCS Transformation Format) is a transfer-oriented standard that appeared to solve the problem of sending Unicode over a network. As the names suggest, UTF-8 handles data in units of 8 bits and UTF-16 in units of 16 bits.

The Unicode character set assigns each character a code point. For example, the code point of "知" is 30693, written U+77E5 (30693 is 0x77E5 in hexadecimal).
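
The U+ notation can be reproduced in a couple of lines of Java. CodePointSketch and notation are hypothetical names; String.codePointAt and String.format are standard.

```java
class CodePointSketch {
    // Formats the first code point of `s` in the conventional U+XXXX form.
    static String notation(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    public static void main(String[] args) {
        String zhi = "\u77E5"; // the character used as the example above
        System.out.println(zhi.codePointAt(0)); // 30693
        System.out.println(notation(zhi));      // U+77E5
    }
}
```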

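The UTF-32 encoding

Convert the character's number directly to binary, stored in four bytes.

For example, "马" is U+9A6C, so converting it directly to binary gives 1001 1010 0110 1100, padded with leading zeros to fill the four bytes.

Issue: big-endian versus little-endian

Computers lay out bytes in memory in one of two ways, big-endian or little-endian. Big-endian puts the most significant byte at the lowest address; little-endian does the opposite. UTF-32 uses four bytes per character and processes four bytes at a time, so without an agreed byte order the data can be misread. Say we read the four bytes 12 34 56 78: do they mean 0x12345678 or 0x78563412? The two interpretations give different values. The meaning depends on where the most significant byte is stored, which is why the encoding comes in two flavors, UTF-32BE and UTF-32LE, for big-endian and little-endian respectively, so that a group of bytes (four here) can be interpreted correctly.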
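
The byte-order difference can be sketched with java.nio.ByteBuffer, which writes a 32-bit integer in whichever order it is told to. This is only an illustration of the idea, not how a charset implementation is written; Utf32EndianSketch, utf32 and hex are made-up names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class Utf32EndianSketch {
    // Writes a code point as one 4-byte unit in the requested byte order.
    static byte[] utf32(int codePoint, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(codePoint).array();
    }

    // Space-separated hex dump, for example "00 00 9A 6C".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(utf32(0x9A6C, ByteOrder.BIG_ENDIAN)));    // 00 00 9A 6C
        System.out.println(hex(utf32(0x9A6C, ByteOrder.LITTLE_ENDIAN))); // 6C 9A 00 00
    }
}
```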

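The UTF-16 encoding

A variable-length representation.

Characters numbered U+0000 to U+FFFF (the commonly used characters, the first 65,536) are represented directly in two bytes. Characters numbered U+10000 to U+10FFFF (65,536 and beyond) need four bytes.

UTF-16 has the same byte-order problem (big-endian versus little-endian), so UTF-16BE denotes big-endian and UTF-16LE denotes little-endian.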
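
The text does not spell out how the four-byte form is built; in UTF-16 it is a pair of 16-bit "surrogate" units. The sketch below follows the standard arithmetic (subtract 0x10000, then split the remaining 20 bits into two halves). Utf16SurrogateSketch and encode are hypothetical names, and the sketch does not validate its input.

```java
class Utf16SurrogateSketch {
    // Returns the UTF-16 code units for one code point (no input validation).
    static char[] encode(int codePoint) {
        if (codePoint < 0x10000) {
            return new char[] {(char) codePoint}; // one 2-byte unit
        }
        int offset = codePoint - 0x10000;              // 20 significant bits
        char high = (char) (0xD800 + (offset >> 10));  // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF)); // bottom 10 bits
        return new char[] {high, low};                 // two 2-byte units
    }

    public static void main(String[] args) {
        for (char unit : encode(0x1F600)) {
            System.out.printf("%04X ", (int) unit); // D83D DE00
        }
        System.out.println();
    }
}
```

The result can be cross-checked against the library's own Character.toChars.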

The UTF-8 encoding

UTF-8 is the most widely used implementation of Unicode on the Internet. It was designed for transmission and makes the encoding borderless, so that characters from every culture in the world can be displayed. Its most distinctive feature is that it is a variable-length encoding: it uses 1 to 4 bytes per symbol, and the length varies with the symbol. A character in the ASCII range takes a single byte, the same byte ASCII uses, so UTF-8 is fully compatible with ASCII (ASCII has 128 characters, and the first 128 Unicode characters correspond to them one to one). Note that a common Chinese character takes 2 bytes in UTF-16, which is often what people loosely call "Unicode", but 3 bytes in UTF-8. Going from a Unicode number to UTF-8 is not a direct copy of the number; it goes through a set of rules.

Since UTF-8 can store so many scripts and symbols, why do so many people in China still use encodings such as GBK?

Because UTF-8 text is comparatively large and takes more space. If the great majority of the intended users are Chinese, an encoding such as GBK also works.

How does UTF-8 implement variable-length encoding? For that you need to know its rules.

The UTF-8 encoding rules:

  • For a single-byte symbol, the first bit of the byte is set to 0 and the remaining 7 bits are the symbol's Unicode number. For English letters, therefore, UTF-8 and ASCII are identical.
  • For an n-byte symbol (n > 1), the first n bits of the first byte are set to 1 and bit n+1 is set to 0; the first two bits of every following byte are set to 10. All the remaining bits hold the symbol's Unicode number.

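The Unicode number ranges and the corresponding UTF-8 binary formats are as follows:

Range (hex)            Range (decimal)    UTF-8 binary format
0x00 ~ 0x7F            0 ~ 127            0XXXXXXX
0x80 ~ 0x7FF           128 ~ 2047         110XXXXX 10XXXXXX
0x800 ~ 0xFFFF         2048 ~ 65535       1110XXXX 10XXXXXX 10XXXXXX
0x10000 ~ 0x10FFFF     65536 and above    11110XXX 10XXXXXX 10XXXXXX 10XXXXXX

(65536 is 2 to the 16th power.)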

So for a specific Unicode number, how exactly is the UTF-8 encoding done?

  • First find the range the Unicode number falls in, which gives the matching binary format. Then convert the number to binary (dropping leading zeros). Finally, fill the binary digits into the X positions of the format from right to left; any X left unfilled is set to 0.

For example, the Unicode number of "马" is 0x9A6C, or 39532 as an integer. That falls in the third range (2048 - 65535), whose format is:

1110XXXX 10XXXXXX 10XXXXXX

39532 in binary is 1001 1010 0110 1100. Filling those bits in gives:

11101001 10101001 10101100

which is the three bytes 0xE9 0xA9 0xAC.
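
The fill-in procedure can be written out as a short Java sketch. Utf8EncodeSketch and encode are made-up names, and this is a teaching sketch rather than a full implementation: it rejects numbers outside 0 to 0x10FFFF but does not reject the surrogate range that a strict encoder would also refuse.

```java
class Utf8EncodeSketch {
    // Encodes one Unicode code point using the four UTF-8 byte formats.
    static byte[] encode(int cp) {
        if (cp < 0 || cp > 0x10FFFF) {
            throw new IllegalArgumentException("not a Unicode code point: " + cp);
        }
        if (cp <= 0x7F) {            // 0XXXXXXX
            return new byte[] {(byte) cp};
        }
        if (cp <= 0x7FF) {           // 110XXXXX 10XXXXXX
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F))};
        }
        if (cp <= 0xFFFF) {          // 1110XXXX 10XXXXXX 10XXXXXX
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F))};
        }
        return new byte[] {          // 11110XXX 10XXXXXX 10XXXXXX 10XXXXXX
            (byte) (0xF0 | (cp >> 18)),
            (byte) (0x80 | ((cp >> 12) & 0x3F)),
            (byte) (0x80 | ((cp >> 6) & 0x3F)),
            (byte) (0x80 | (cp & 0x3F))};
    }

    public static void main(String[] args) {
        for (byte b : encode(0x9A6C)) {
            System.out.printf("%02X ", b & 0xFF); // E9 A9 AC
        }
        System.out.println();
    }
}
```

Each shift-and-mask takes 6 bits at a time from the right, which is the same right-to-left filling described above.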

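A point that may cause confusion

This is only an explanation of a question I had while learning. If you do not share it, feel free to skip this part.

We said at the start that UTF-8 is an encoding rule and Unicode is a character set. But ASCII and GB2312 are coded character sets too, so why do we so often treat them as encodings?

In fact there is nothing wrong with calling ASCII or GB2312 a coded character set. Each of them has only one encoding, though, so calling them the ASCII encoding or the GB2312 encoding is not wrong either, just not very rigorous. That is why, in many places, both the character set setting and the encoding setting can simply be given as gb2312 or ascii, for example: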

Encoding gb2312 = Encoding.GetEncoding("gb2312");
Response.ContentEncoding = gb2312; // encoding
Response.Charset = "gb2312"; // character set
           

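Unicode, on the other hand, is a coded character set with no storage form of its own. It is just a table that matches numbers to characters, so how is it stored in a computer? You might think: just store the number directly as a binary value. That simple scheme does exist; it is the UTF-32 encoding of the Unicode coded character set. Is there a scheme that saves more space? Of course: UTF-8, UTF-16 and so on. The Unicode coded character set has several encoding schemes:

UTF-8: ASCII letters, digits and symbols take 1 byte; Chinese characters typically take 3 bytes.

UTF-16: the first 65,536 characters of the Unicode coded character set take 2 bytes each; the rest take 4 bytes.

UTF-32: every character takes 4 bytes.

UTF-8 is just one of them.

With that, I think the relationship between them is finally clear.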


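How do you choose an encoding?

Advantages of UTF-8:

  1. The code space is large enough. If future Unicode standards add more characters, UTF-8 can accommodate them without trouble, so it will not run into the awkward situation UTF-16 did.
  2. There is no byte-order (endianness) problem, which makes information exchange very convenient.
  3. It is fault tolerant. A local byte error (a byte lost, added or changed) does not cause a chain of errors, because UTF-8 character boundaries are easy to detect. This is a huge advantage, and it is precisely to achieve it that Chinese, Japanese and Korean users have to put up with 3 bytes per character.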
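
The boundary-detection property in point 3 follows from the byte formats: every continuation byte starts with the bits 10, and no leading byte does. A minimal Java sketch, assuming well-formed input, with Utf8BoundarySketch and its methods being made-up names:

```java
import java.nio.charset.StandardCharsets;

class Utf8BoundarySketch {
    // Continuation bytes have the form 10XXXXXX.
    static boolean isContinuation(byte b) {
        return (b & 0xC0) == 0x80;
    }

    // Counts characters by counting the bytes that start a character.
    static int countCodePoints(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if (!isContinuation(b)) {
                count++;
            }
        }
        return count;
    }

    // From any position, skip forward to the next character boundary.
    static int nextBoundary(byte[] utf8, int from) {
        int i = from;
        while (i < utf8.length && isContinuation(utf8[i])) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Two 3-byte characters: E9 A9 AC followed by E7 9F A5.
        byte[] bytes = "\u9A6C\u77E5".getBytes(StandardCharsets.UTF_8);
        System.out.println(countCodePoints(bytes)); // 2
        System.out.println(nextBoundary(bytes, 1)); // 3: resynchronizes mid-character
    }
}
```

A decoder that lands in the middle of a character, for example after a lost byte, loses at most that one character before it finds the next boundary.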

UTF-8 is not perfect either:

  1. Cultural imbalance. For English-speaking countries in Europe and North America, UTF-8 is simply wonderful: just like ASCII, a character takes one byte, with no extra storage burden at all. For China, Japan and Korea, however, UTF-8 is rather redundant. A character takes 3 bytes, so storage and transmission efficiency goes down instead of up. That is why people in the West adopt UTF-8 without a second thought, while we always hesitate for a while.
  2. The efficiency cost of a variable-length representation. A common worry about UTF-8 is that, because it is variable-length, counting characters and indexing by character are both inefficient. A common workaround is to convert UTF-8 to UTF-16 or UTF-32 first, operate on that, and convert back when done. That conversion is clearly a performance cost of its own.

UTF-8 and UTF-16/32 each have their strengths and weaknesses, so the choice should be grounded in the actual use case. Generally speaking, UTF-8 is used for data stored on disk or exchanged over a network, and the data is converted to UTF-16/32 for processing inside the program. For most simple programs this keeps information exchange easy to make compatible, keeps internal processing simple, and performs well enough.

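Let's test it:

import java.nio.charset.StandardCharsets;

public class EncodingLengthTest {
    public static void main(String[] args) {
        // "abc" followed by four Chinese characters
        String param = "abc一二三四";
        int length1 = param.length();                                       // number of chars
        int length2 = param.getBytes(StandardCharsets.UTF_8).length;        // 1 byte per letter, 3 per Chinese character
        int length4 = param.getBytes(StandardCharsets.ISO_8859_1).length;   // unmappable characters become 1 replacement byte
        int length5 = param.getBytes(StandardCharsets.US_ASCII).length;     // same replacement behavior
        int length6 = param.getBytes(StandardCharsets.UTF_16BE).length;     // 2 bytes per char
        int length7 = param.getBytes(StandardCharsets.UTF_16LE).length;     // 2 bytes per char
        System.out.println(length1);
        System.out.println("============");
        System.out.println(length2);
        System.out.println(length4);
        System.out.println(length5);
        System.out.println(length6);
        System.out.println(length7);
    }
}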
           

Result:

7
============
15
7
7
14
14
           

utf8 and utf8mb4 in MySQL

As mentioned above, the utf8 character set needs 1 to 4 bytes to represent a character, but the characters we commonly use fit in 1 to 3 bytes. In MySQL, the maximum number of bytes a character set uses per character affects storage and performance in some respects, so MySQL's designers quietly defined two concepts:

  • utf8mb3: a cut-down utf8 character set that uses only 1 to 3 bytes per character.
  • utf8mb4: the genuine utf8 character set, which uses 1 to 4 bytes per character.

One thing you need to pay close attention to: in MySQL, utf8 is an alias for utf8mb3. So whenever utf8 is mentioned in MySQL, it means 1 to 3 bytes per character. If you need to store characters that take 4 bytes, such as emoji, use utf8mb4.

MySQL supports a great many character sets. To list the character sets your MySQL server supports, use the following statement:

mysql> SHOW CHARSET;
+----------+---------------------------------+---------------------+--------+
| Charset  | Description                     | Default collation   | Maxlen |
+----------+---------------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
...
| latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
...
| ascii    | US ASCII                        | ascii_general_ci    |      1 |
...
| gb2312   | GB2312 Simplified Chinese       | gb2312_chinese_ci   |      2 |
...
| gbk      | GBK Simplified Chinese          | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish              | latin5_turkish_ci   |      1 |
...
| utf8     | UTF-8 Unicode                   | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
...
| latin7   | ISO 8859-13 Baltic              | latin7_general_ci   |      1 |
| utf8mb4  | UTF-8 Unicode                   | utf8mb4_general_ci  |      4 |
| utf16    | UTF-16 Unicode                  | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode                | utf16le_general_ci  |      4 |
...
| utf32    | UTF-32 Unicode                  | utf32_general_ci    |      4 |
| binary   | Binary pseudo charset           | binary              |      1 |
...
| gb18030  | China National Standard GB18030 | gb18030_chinese_ci  |      4 |
+----------+---------------------------------+---------------------+--------+
41 rows in set (0.01 sec)
           

Look at the last column, Maxlen: it is 3 for utf8 and 4 for utf8mb4.
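
On the application side, one rough way to tell whether a string would need utf8mb4 is to check for code points above U+FFFF, since those are exactly the ones UTF-8 encodes in 4 bytes. The helper below is a hypothetical sketch (Utf8mb4Sketch and needsUtf8mb4 are made-up names, not part of any MySQL driver API).

```java
import java.nio.charset.StandardCharsets;

class Utf8mb4Sketch {
    // True if any character lies beyond U+FFFF and so takes 4 bytes in UTF-8.
    static boolean needsUtf8mb4(String s) {
        return s.codePoints().anyMatch(cp -> cp > 0xFFFF);
    }

    public static void main(String[] args) {
        String plain = "abc\u9A6C";                             // letters plus a 3-byte character
        String emoji = new String(Character.toChars(0x1F600)); // a 4-byte character
        System.out.println(needsUtf8mb4(plain)); // false
        System.out.println(needsUtf8mb4(emoji)); // true
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length); // 4
    }
}
```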
