è¿ç¯æç« è¯¦ç»ä»ç»ä¸ä¸MySQLä¸çå符éåå符åºç¸å ³çé®é¢ï¼ééå¤å¤å°äºè§£ä¸ä¸å符éåå符åºçæ¹æ¹é¢é¢ï¼åæ¶éç¹è¯´æä¸ä¸å¼åä¸éè¦æ³¨æçé®é¢ã
æç« åºäºMySQL 8.0ï¼ä¹ä¼æ¶åå°5.7çæ¬ã主è¦åèMySQLæåï¼https://dev.mysql.com/doc/refman/8.0/en/
1. 太é¿ä¸ç±çç
- å符é(Character Set)æ¯å符çç¼ç è§åï¼å符åº(Collation)æ¯å符çæåºè§åï¼
- æ¯ä¸ä¸ªå符éé½å å«ä¸å®èå´çå符ï¼
- æ¯ä¸ä¸ªå符éé½æä¸ä¸ªæå¤ä¸ªå符åºï¼å ¶ä¸ä¸ä¸ªå符åºä¸ºé»è®¤å符åºï¼
- æ¯ä¸ä¸ªå符åºé½æä¸ä¸ªç¸å ³èçå符éï¼
- 两个ä¸åçå符é没æç¸åçå符åºï¼
- ç¸åæ°æ®ä½¿ç¨ä¸åçå符åºæåºç»æå¯è½ä¼ä¸åï¼
- MySQLæ¯ææå¡(server)ãæ°æ®åº(database)ã表(table)ãå段(column)以åå符串åé¢é(string literal)çå¤ä¸ªçº§å«çå符éä¸å符åºç设置ï¼
- åºè¡¨å建以åç¨åºä¸å°½å¯è½ä½¿ç¨
å符éï¼å¯æ¯æemojiï¼ï¼utf8mb4
- MySQLä¸ç
å符éæ¯utf8
å符éçå«åï¼é¿å 使ç¨ï¼utf8mb3
- MySQL 5.7çæå¡é»è®¤å符éæ¯
ï¼è8.0ä¸æ¯latin1
ï¼utf8mb4
- éè¿
æ¥çå½åæå¡å¨ææ¯æçå符éï¼SHOW CHARACTER SET
- éè¿
æ¥çææçå符åºï¼SHOW COLLATION
- MySQLä¸æå¤ä¸ªå ³äºå符éåå符åºçç³»ç»åéï¼å¯ä»¥éå¯¹å ¨å±ä»¥åå½åsessionè¿è¡è®¾ç½®ï¼
- éè¿
æ¥çå½åæå¡ä»¥åæ°æ®åºçå符é设置ï¼SHOW VARIABLES LIKE "character_set%"
- 使ç¨client客æ·ç«¯è¿æ¥æ°æ®åºæ¶ï¼å¦æç¼ç æé®é¢ï¼å¯ä»¥éè¿
æ¥è®¾ç½®åæ°æ®åºä¸è´çå符éï¼SET NAMES charset_name
- åºè¡¨å 容å¨ä¸åå符éä¹å转æ¢å¯è½ä¼åçæ°æ®ä¸¢å¤±é®é¢ã
2. ä»ä¹æ¯å符éä¸å符åº
å¨å¤§å¤æ°æ åµä¸ï¼æ们并ä¸éè¦äºè§£å符éä¸å符åºï¼ä½æ¯å¨æ¶åå°ä¸åå符éç转æ¢æ¶å¯è½ä¼åºç°é®é¢ï¼è¿æ¶äºè§£ä¸ä¸ç¸å ³çç¥è¯è¿æ¯æ帮å©çã
æ们é¦å äºè§£ä¸ä¸å符éçåçã
2.1 å符ä¸å符é
å符æ¯æ们ç»å¸¸æ¥è§¦å°çä¸è¥¿ï¼æ¯å¦aãä¸ãÃï¼ä»¥åðçé½æ¯å符ã
æ们ç¥é计ç®æºæ¯éè¿bitæ¥åå¨æ°æ®çï¼å°äººç±»å¯è¯å«çå符转æ¢æ计ç®æºå¯åå¨çå½¢å¼ï¼è¿ä¸ªè¿ç¨å°±æ¯ç¼ç ï¼å符ç¼ç çç»æï¼å°±æ¯å åç¼ç ã
ä¸ä¸ªå符éè¦ç¨å¤å°ä¸ªbitæ¥åå¨ï¼é£å°±éè¦ç¥éä¸å ±æå¤å°ä¸ªå符ã
ææçå符æ¾å¨ä¸èµ·å°±æ¯å符éã
æ¾ç¶ï¼ç±äºä½¿ç¨èå´ä¸åï¼å°±åºç°äºä¸åçå符éãæ¯å¦ï¼
- æ±è¯ä¸çææå符ææä¸ä¸ªå符éï¼ä¹å æ¬ä¸æ¯æ±åçå符ï¼æ¯å¦æ ç¹ç¬¦å·çï¼ï¼
- è±è¯ä¸çææå符ææä¸ä¸ªå符éï¼
- çç
对äºå符éä¸çæ¯ä¸ªå符æ¥è¯´ï¼é½æ两个å±æ§ï¼
- ä¸ä¸ªæ¯è¿ä¸ªå符å¨æå±å符éä¸çä½ç½®ï¼å¯ä»¥å«åå符åºå·ï¼æç ç¹ï¼code pointï¼ï¼
- 第äºä¸ªå°±æ¯è¿ä¸ªå符å¨è®¡ç®æºä¸çæ°æ®è¡¨ç¤ºï¼å³å åç¼ç ã
æ¯å¦ASCIIç ï¼åªéè¦8个bitå°±å¯ä»¥åå¨ææéè¦ä½¿ç¨çå符äºã
ä½å¯¹äºæ±è¯æ¥è¯´æ¾ç¶æ¯ä¸å¤çï¼å æ¤æ±è¯å符éè¦æ´å¤çbitï¼æ¥å°æ¯ä¸ªå符è¿è¡ç¼ç ã
è¿æ ·å¯¹äºæ¯ä¸ªå½å®¶æ¥è¯´ï¼é½å¯è½éè¦ä¸ä¸ªå°èªå·±ä½¿ç¨çå符éç¼ç æ计ç®æºå åç¼ç çè§åã
é£ä¹åä¸ä¸ªå åç¼ç ï¼å¯¹äºä¸åçå符éæ¥è¯´å°±å¯è½ä»£è¡¨ä¸åçå符ï¼
![](https://img.laitimes.com/img/_0nNw4CM6IyYiwiM6ICdiwiIwIjNx8CX39CXy8CXycXZpZVZnFWbp9zZuBnLhRzMoNGZoFWZ49CX4YTN2YzMz8CXlx2YpRnch9CXzV2Zh1WatIXZw9GblZXZk9CXt92YucWbpRWdvx2Yx5yazF2Lc9CX6MHc0RHaiojIsJye.png)
ç¸åçç¼ç 对åºä¸åçå符éå¯è½æ¯ä¸åçå符
æ¯å¦GB18030å符éä¸çâå°çâ两个å符çå åç¼ç åå«æ¯0xB5D8å0xC7F2ï¼ä½è¿ä¸¤ä¸ªå åç¼ç å¨å符éBIG5ä¸ä»£è¡¨çå符å´æ¯âè¯â©âã
è¿å°å¯¹æ们çç¨åºæå¾å¤§çå½±åã
2.2 Unicodeå符éä¸UTF-8
为äºè§£å³ä¸åè¯è¨ç¼ç ä¹é´ä¸å ¼å®¹çé®é¢ï¼Unicodeåºç°äºã
Unicodeå符éè´åäºä¸ºå ¨ä¸çæ¯ä¸ä¸ªè¯è¨çæ¯ä¸ä¸ªå符é½æç»ä¸ä¸å¯ä¸çç¼ç ï¼
All languages, all characters in Unicode
Unicodeå符åºå·çèå´æ¯0x000000å°0x10FFFFï¼å¯ä»¥å®¹çº³110å¤ä¸ä¸ªå符ã
é£ä¹å¦ä½å°Unicodeä¸çå符æ å°å°å åç¼ç å¢ï¼ä¸»è¦æUTF-8ãUTF-16åUTF-32çï¼å ¶ä¸æ常ç¨çå°±æ¯UTF-8ã
UTF-8使ç¨1å°4个ä¸ççåèæ¥è¡¨ç¤ºææçå符ï¼å ¶ä¸å128个å符ä¸ASCIIä¸è´ã
å ³äºUnicdoeç详ç»ä¿¡æ¯ï¼å¯ä»¥åèhttps://home.unicode.org/
2.3 å符åº
ä¸ä¸ªå符éä¸æå¤ä¸ªå符ï¼é£ä¹å¦ä½å¯¹å ¶ä¸çå符è¿è¡æåºå¢ï¼è¿å°±æ¯å符åºã
ç®åæ¥è¯´ï¼å符åºå°±æ¯å符æåºçè§åéåãæ¯å¦ä¸ä¸ªå符éæä¸é¢å 个å符ï¼ä»¥åå åç¼ç ï¼ï¼
å符 | å åç¼ç |
---|---|
A | 00 |
B | 01 |
a | 10 |
b | 11 |
å½ç¶æ们å¯ä»¥ç´æ¥æç §A>B>a>bçè§åæ¥è¿è¡æåºï¼è¿å°±æ¯è¿ä¸ªç®åå符éçä¸ä¸ªå符åºã
å¦ææ³è®©å°ååæ¯æ¾å¨åé¢ï¼æ¯å¦a>b>A>Bï¼è¿åæ¯ä¸ç§å符åºã
å¦æè¿æ³å ä¸å¤§å°åæ å ³æ大å°åç¸å ³ï¼é£ä¹æåºçè§åéå°±ä¼æç¸åºçç¼ç ï¼è¿å°±äº§çäºä¸åçå符åºã
å符åºä¸»è¦å¯¹å符çæåºæå½±åã
3. MySQLä¸çå符éä¸å符åº
äºè§£äºå符éåå符åºä¹åï¼æ¥ççMySQLä¸çå符éä¸å符åºã
3.1 MySQLä¸çå符é
éè¿ä¸é¢çè¯å¥æ¥æ¥çMySQLä¸æ¯æçå符éï¼
SHOW CHARACTER SET;
å¤å¶
ç»æï¼
+----------+---------------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |
| binary | Binary pseudo charset | binary | 1 |
| cp1250 | Windows Central European | cp1250_general_ci | 1 |
| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |
| cp1256 | Windows Arabic | cp1256_general_ci | 1 |
| cp1257 | Windows Baltic | cp1257_general_ci | 1 |
| cp850 | DOS West European | cp850_general_ci | 1 |
| cp852 | DOS Central European | cp852_general_ci | 1 |
| cp866 | DOS Russian | cp866_general_ci | 1 |
| cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 |
| dec8 | DEC West European | dec8_swedish_ci | 1 |
| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 |
| euckr | EUC-KR Korean | euckr_korean_ci | 2 |
| gb18030 | China National Standard GB18030 | gb18030_chinese_ci | 4 |
| gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 |
| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |
| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |
| greek | ISO 8859-7 Greek | greek_general_ci | 1 |
| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |
| hp8 | HP West European | hp8_english_ci | 1 |
| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |
| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |
| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |
| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |
| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |
| macce | Mac Central European | macce_general_ci | 1 |
| macroman | Mac West European | macroman_general_ci | 1 |
| sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 |
| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |
| tis620 | TIS620 Thai | tis620_thai_ci | 1 |
| ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 |
| ujis | EUC-JP Japanese | ujis_japanese_ci | 3 |
| utf16 | UTF-16 Unicode | utf16_general_ci | 4 |
| utf16le | UTF-16LE Unicode | utf16le_general_ci | 4 |
| utf32 | UTF-32 Unicode | utf32_general_ci | 4 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci | 4 |
+----------+---------------------------------+---------------------+--------+
å¤å¶
å段å«ä¹ï¼
- Charset: å符éçå称ï¼
- Descriptionï¼å符éçç®åæè¿°ï¼
- Default collationï¼è¯¥å符éçé»è®¤å符åºï¼
- Maxlenï¼è¯¥å符éä¸å符æ大åå¨é¿åº¦ã
éè¿å å ¥æ¡ä»¶æ¥è¯¢ä¸é¨åå符éï¼
SHOW CHARATER SET LIKE 'utf%';
å¤å¶
ç»æå±ç¤ºææUnicodeå符éï¼
+---------+------------------+--------------------+--------+
| Charset | Description | Default collation | Maxlen |
+---------+------------------+--------------------+--------+
| utf16 | UTF-16 Unicode | utf16_general_ci | 4 |
| utf16le | UTF-16LE Unicode | utf16le_general_ci | 4 |
| utf32 | UTF-32 Unicode | utf32_general_ci | 4 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci | 4 |
+---------+------------------+--------------------+--------+
å¤å¶
åé¢ä¼æå ³äºUnicodeçæ¯ç§å符éç详ç»ä¿¡æ¯ã
3.2 MySQLä¸çå符åº
æ¯ä¸ªå符éé½æä¸ä¸ªæå¤ä¸ªå符åºï¼å¯ä»¥éè¿ä¸é¢çè¯å¥æ¥çææçå符åºï¼
SHOW COLLATION;
å¤å¶
ç»æï¼åªå±ç¤ºä¸é¨åï¼ï¼
+---------------------+----------+-----+---------+----------+---------+---------------+
| Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+---------------------+----------+-----+---------+----------+---------+---------------+
| armscii8_bin | armscii8 | 64 | | Yes | 1 | PAD SPACE |
| armscii8_general_ci | armscii8 | 32 | Yes | Yes | 1 | PAD SPACE |
| ascii_bin | ascii | 65 | | Yes | 1 | PAD SPACE |
| ascii_general_ci | ascii | 11 | Yes | Yes | 1 | PAD SPACE |
| big5_bin | big5 | 84 | | Yes | 1 | PAD SPACE |
| big5_chinese_ci | big5 | 1 | Yes | Yes | 1 | PAD SPACE |
| binary | binary | 63 | Yes | Yes | 1 | NO PAD |
| cp1250_bin | cp1250 | 66 | | Yes | 1 | PAD SPACE |
| cp1250_croatian_ci | cp1250 | 44 | | Yes | 1 | PAD SPACE |
| cp1250_czech_cs | cp1250 | 34 | | Yes | 2 | PAD SPACE |
| cp1250_general_ci | cp1250 | 26 | Yes | Yes | 1 | PAD SPACE |
| cp1250_polish_ci | cp1250 | 99 | | Yes | 1 | PAD SPACE |
| ...... | ...... | ...| ... | ... | ... | ...... |
+---------------------+----------+-----+---------+----------+---------+---------------+
å¤å¶
å段å«ä¹å¦ä¸ï¼
- Collationï¼å符åºå称ï¼
- Charsetï¼è¯¥å符åºå ³èçå符éï¼
- Idï¼å符åºIDï¼
- Defaultï¼è¯¥å符åºæ¯å¦æ¯æå
³èçå符éçé»è®¤å符åºãæ¯
å°±æ¯armscii8_general_ci
çé»è®¤å符åºï¼èarmscii8
å°±ä¸æ¯ï¼armscii8_bin
- Compiledï¼å符éæ¯å¦å·²ç¼è¯å°æå¡å¨ä¸ï¼
- Sortlenï¼è¿ä¸å¯¹ä»¥å符é表示çå符串è¿è¡æåºæéçå åéæå ³ï¼
- Pad_attributeï¼è¿è¡¨æäºå符åºå¨æ¯è¾å符串æ¶å¯¹æ«å°¾paddingçå¤çã
表æå¨æ¯è¾å符串æ¶ï¼æ«å°¾çpaddingä¹ä¼èèè¿å»ï¼å¦åä¸èèãNO PAD
ä¹å¯ä»¥æå®æ¡ä»¶æ¥è¯¢ï¼
SHOW COLLATION WHERE Charset = 'utf8mb4';
å¤å¶
è¿éæ¥è¯¢çå°±æ¯
utf8mb4
å符éçææå符åºã
æ¯ä¸ªå符åºé½æ¯ä»¥è¯¥å符åºæå ³èçå符é为åç¼çï¼åæ¶è¿æä¸äºæè§å¾çåç¼ã
è¿äºåç¼æï¼
- binï¼äºè¿å¶ï¼
- ciï¼å¤§å°åä¸ææï¼
- csï¼å¤§å°åææï¼
- aiï¼å£é³ï¼Accentï¼ä¸ææï¼
- asï¼å£é³ææï¼
- ksï¼ååï¼Kanatypeï¼ææã
åæ¶æçå符åºæ¯é¢åæç§è¯è¨çï¼ä¹ä¼å¨å符åºååä¸ææä½ç°ï¼æ¯å¦
big5_chinese_ci
ã
3.3 å符éä¸å符åºçå ³ç³»
å符éä¸å符åºçå ³ç³»å¯ä»¥ç¨ä¸é¢çå¾æ¥è¡¨ç¤ºï¼
æ¯ä¸ªå符éé½æè³å°ä¸ä¸ªå符åº
å³ï¼
- æ¯ä¸ªå符éé½æä¸ä¸ªæå¤ä¸ªå符åºï¼
- æ¯ä¸ªå符éé½æä¸ä¸ªé»è®¤çå符åºï¼
- æ¯ä¸ªå符åºé½å ³èä¸ä¸ªä¸åªæä¸ä¸ªå符éï¼
- 两个ä¸åçå符é没æç¸åçå符åºã
3.4 MySQLä¸çç¸å ³åé
MySQLä¸æä¸äºåéç¨äºå符éä¸å符åºç设置ã
3.4.1 å符éåé
éè¿ä¸é¢çè¯å¥æ¥æ¥çä¸å符éç¸å ³çåéï¼
SHOW VARIABLES LIKE 'character_set\_%'; -- å½åä¼è¯
SHOW GLOBAL VARIABLES LIKE 'character_set\_%'; -- å
¨å±
å¤å¶
ç»æï¼
+--------------------------------+---------+
| Variable_name | Value |
+--------------------------------+---------+
| character_set_client | utf8mb4 |
| character_set_client_handshake | ON |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
+--------------------------------+---------+
å¤å¶
åéå«ä¹ï¼
åé | å«ä¹ | ä½ç¨å | é»è®¤å¼ |
---|---|---|---|
| 客æ·ç«¯ååºSQLè¯å¥çå符é | å ¨å±ï¼ä¼è¯ | |
| ä¸å¿½ç¥å®¢æ·ç«¯ååºçå符éä¿¡æ¯ | å ¨å±ï¼ä¼è¯ | ONï¼ä¸å¿½ç¥ï¼ |
| 没ææå®å符éçå符串åé¢å¼æ使ç¨çå符é | å ¨å±ï¼ä¼è¯ | |
| æ°æ®åºé»è®¤ä½¿ç¨çå符é | å ¨å±ï¼ä¼è¯ | |
| æ件系ç»é»è®¤çå符é | å ¨å±ï¼ä¼è¯ | |
| æå¡å¨è¿åç»å®¢æ·ç«¯çç»æ使ç¨çå符é | å ¨å±ï¼ä¼è¯ | |
| æå¡å¨é»è®¤çå符é | å ¨å±ï¼ä¼è¯ | |
| æå¡å¨åå¨å æ°æ®ä½¿ç¨çå符é | å ¨å± | |
3.4.2 character_set_system
è¿ä¸ªåéæ¯MySQLæ°æ®åºå æ°æ®ä½¿ç¨çå符éã
æææè¿°æ°æ®åºçæ°æ®é½æ¯å æ°æ®ï¼æ¯å¦è¡¨åãååççã
对å æ°æ®çåå¨æå¦ä¸å 个è¦æ±ï¼
ææçå æ°æ®å¿ 须使ç¨ç¸åçå符éï¼
è¿ä¸ªå符éå¿ é¡»å å«ææè¯è¨çå符ã
MySQL使ç¨UTF-8ï¼å ·ä½çå°±æ¯
utf8mb3
å符éï¼å¨MySQLä¸
utf8
å°±æ¯
utf8mb3
ï¼ä¸è¿åç»ççæ¬å¯è½ä¼æææ¹åã
3.4.3 å符åºåé
éè¿ä¸é¢çè¯å¥æ¥æ¥çä¸å符åºç¸å ³çåéï¼
SHOW VARIABLES LIKE '%collation%'; -- å½åä¼è¯
SHOW GLOBAL VARIABLES LIKE '%collation%'; -- å
¨å±
å¤å¶
ç»æï¼
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8mb4_general_ci |
| collation_server | utf8mb4_general_ci |
+----------------------+--------------------+
å¤å¶
åéå«ä¹ï¼
åé | å«ä¹ | ä½ç¨å |
---|---|---|
| 没ææå®å符éçå符串åé¢å¼æ使ç¨çåç¬¦åº | å ¨å±ï¼ä¼è¯ |
| æ°æ®åºé»è®¤ä½¿ç¨çåç¬¦åº | å ¨å±ï¼ä¼è¯ |
| æå¡å¨é»è®¤ä½¿ç¨çåç¬¦åº | å ¨å±ï¼ä¼è¯ |
3.4.3 éè¦åé
æ¬æ主è¦å ³æ³¨äºä¸é¢å 个åéï¼
-
,character_set_server
collation_server
-
,character_set_database
collation_database
-
character_set_client
-
,character_set_connection
collation_connection
-
character_set_results
å ¶ä¸åå两ç»æ¶ååºè¡¨è®¾è®¡æ¶å符éä¸å符åºçé ç½®ï¼åä¸ç»æ¶åå°å®¢æ·ç«¯ä¸æå¡å¨è¿æ¥æ¶çå符éä¸å符åºçé ç½®ã
4. 设置å符éä¸å符åº
MySQLä¸æ¯æå¤ç§å符éä¸å符åºï¼å¯¹æ¤ï¼MySQLè½å¤ä¸ºæ们åå°ï¼
- 使ç¨ä¸åå符éåå¨å符串ï¼
- 使ç¨ä¸åçå符åºå¯¹å符串è¿è¡æåºï¼
- å¨åä¸ä¸ªæå¡å¨ä¸ï¼æåä¸ä¸ªæ°æ®åºä¸ï¼çè³åä¸å¼ 表ä¸ä½¿ç¨ä¸åçå符éæå符åºï¼
- 对äºå¤ä¸ªçº§å«çå符éä¸å符åºè¿è¡è®¾ç½®ã
4.1 æå¡å¨ç设置
MySQLä¸ï¼æå¡å¨æä¸ä¸ªé»è®¤çå符éä¸å符åºï¼å ¶ä¸ï¼
- 5.7以åçæ¬é»è®¤å符éæ¯
ï¼utf8mb4
- 5.7çé»è®¤å符åºæ¯
ï¼utf8mb4_general_ci
- 8.0çé»è®¤å符åºæ¯
ãutf8mb4_0900_ai_ci
æå¡å¨çå符éä¸å符åºå¯ä»¥éè¿å¤ç§æ¹å¼è®¾ç½®ï¼
- æå¡å¨å¯å¨æ¶ï¼
mysqld
mysqld --character-set-server=utf8mb4
mysqld --character-set-server=utf8mb4 --collation-server=utf8mb4_0900_ai_ci
å¤å¶
è¿ä¸ç§æ¹å¼ææä¸æ ·ã
- æå¡è¿è¡æ¶ï¼è®¾ç½®
åcharacter_set_server
åéï¼collation_server
- è¿å¯ä»¥ç¼è¯MySQLæå¡æ¶è¿è¡è®¾ç½®ã
æå¡å¨å符éä¸å符åºçå½±åï¼å½å建æ°æ®åºæ¶æ²¡ææå®å符éä¸å符åºï¼å°±æ¯ç¨æå¡å¨çå符éä¸å符åºã
é¤æ¤ä¹å¤æ²¡æå«çå½±åã
4.2 åºè¡¨è®¾è®¡ä¸ç设置
å¨å建åºè¡¨æ¶ï¼éè¦æå®æ°æ®åºã表以åå段æ使ç¨çå符éä¸å符åºã
å¦æ没ææå®ï¼MySQLæä¸ç³»åè§åæ¥ä½¿ç¨å符éä¸å符åºçé»è®¤å¼ã
4.2.1 æ°æ®åºç设置
å¨å建æ°æ®åºçæ¶åå¯ä»¥æå®è¯¥æ°æ®åºæ使ç¨çå符éä¸å符åºï¼
CREATE DATABASE db_name CHARACTER SET latin1 COLLATE latin1_swedish_ci;
å¤å¶
MySQL使ç¨ä¸é¢çè§åæ¥è®¾ç½®æ°æ®åºçå符éä¸å符åºï¼
- å¦æå建æ°æ®åºçæ¶åæå®äºå符éä¸å符åºï¼å°±æ¯ç¨è¿ä¸ªå符éä¸å符åºï¼CREATE DATABASE db_name CHARACTER SET latin1 COLLATE latin1_swedish_ci;è¿éæ°æ®åºçå符éå°±æ¯
ï¼å符åºå°±æ¯latin1
ãlatin1_swedish_ci
- å¦æå建æ°æ®åºæ¶æå®äºå符éè没ææå®å符åºï¼é£å符éå°±æ¯è¿ä¸ªå¼ï¼èå符åºå°±æ¯å符éçé»è®¤å符åºï¼
CREATE DATABASE db_name CHARACTER SET latin1;
å¤å¶
å符éå°±æ¯
latin1
ï¼å符åºå°±æ¯
latin1
çé»è®¤å符åº
latin1_swedish_ci
ã
- å¦æå建æ°æ®åºæ¶æå®äºå符åºè没ææå®å符éï¼é£ä¹å符éå°±æ¯è¯¥å符åºæ对åºçå符éï¼
CREATE DATABASE db_name CHARACTER COLLATE latin1_swedish_ci;
å¤å¶
å符åºå°±æ¯
latin1_swedish_ci
ï¼å符éå°±æ¯è¿ä¸ªå符åºå ³èçå符é
latin1
ã
- å¦æå符éä¸å符åºé½æ²¡ææå®ï¼é£ä¹å°±æ¯ç¨æå¡å¨é»è®¤çå符é(
)ä¸å符åº(character_set_server
)ãcollation_server
对äºå½åæ°æ®åºæ使ç¨çå符éä¸å符åºï¼å¯ä»¥éè¿æ¥çä¸é¢ä¸¤ä¸ªåéçå¼ï¼
USE db_name;
SELECT @@character_set_database, @@collation_database;
å¤å¶
è¿ä¸¤ä¸ªåéçå¼æå¦ä¸çå½±åï¼
- å¦æå建表æ¶æ²¡ææå®è¯¥è¡¨ä½¿ç¨çå符éä¸å符åºï¼å°±æ¯ç¨è¿ä¸¤ä¸ªåéæ对åºçå符éä¸å符åºï¼
-
è¯å¥æ²¡ææå®å符éæ¶ï¼æå¡å¨ä½¿ç¨LOAT DATA
æ¥è§£ææ件ä¸çä¿¡æ¯ãcharacter_set_database
4.2.2 表ç设置
å建表æ¶ä¹å¯ä»¥å¶å®è¯¥è¡¨æ使ç¨çå符éä¸å符åºï¼
CREATE TABLE t1 ( ... )
CHARACTER SET latin1 COLLATE latin1_danish_ci;
å¤å¶
MySQLéæ©è¡¨çå符éä¸å符åºçè§ååæ°æ®åºç±»ä¼¼ï¼ä¸åå¨äºå¦æå建表æ¶æ²¡ææå®å符éä¸å符åºï¼å°±ä¼ä½¿ç¨åé
character_set_database
å
collation_database
ææå®çå符éä¸å符åºã
4.2.3 å段ç设置
表éçæ¯ä¸ªå段ä¹å¯ä»¥æ¥æèªå·±çå符éä¸å符åºï¼
CREATE TABLE t1
(
col1 VARCHAR(5)
CHARACTER SET latin1
COLLATE latin1_german1_ci
);
ALTER TABLE t1 MODIFY
col1 VARCHAR(5)
CHARACTER SET latin1
COLLATE latin1_swedish_ci;
å¤å¶
ä¸æ°æ®åºå表类似ï¼MySQLä¹æ¯æç §å±çº§æ¥å¶å®å符éä¸å符åºçã
å¦æå段没ææå®ï¼é£ä¹å°±æ¯ç¨è¡¨æ使ç¨çå符éä¸å符åºã
4.2.4 å°ç»
ä¸é¢çå 个å°èä¸å ³äºåºè¡¨è®¾è®¡çå符éä¸å符åºè®¾ç½®ï¼å¯ä»¥ç¨ä¸å¾æ¥è¡¨ç¤ºï¼
MySQLæ¯æå¤ä¸ªå±çº§çå符éä¸å符åºè®¾ç½®
ä¸ä¸å±çº§å¦æ没ææå®å符éä¸å符åºï¼å°±æ¯ç¨ä¸ä¸å±çº§çå符éä¸å符åºã
4.3 客æ·ç«¯è¿æ¥ä¸ç设置
å½æ们使ç¨
mysql
è¿ä¸ªå®¢æ·ç«¯ä¸MySQLæå¡å¨è¿æ¥çæ¶åï¼ä¹ä¼æ¶åå°å符éä¸å符åºç设置ã
4.3.1 ä¸è¿æ¥æå ³çåé
æ¯ä¸ä¸ªè¿æ¥å°æå¡å¨ç客æ·ç«¯é½æä¸ä¸ªå¯¹åºçå符éä¸å符åºã
æ¶åå°çåéæï¼
-
character_set_client
-
,character_set_connection
collation_connection
-
character_set_results
è¿ä¸ä¸ªå ³äºå符éçåéæ¯è¿æ ·ä½¿ç¨çï¼
å符转æ¢ä¹æ
å³ï¼
- 客æ·ç«¯çè¯å¥ä»å®¢æ·ç«¯åºåæ¶ï¼ä½¿ç¨çå符éæ¯
ï¼character_set_client
- è¯å¥å°è¾¾æå¡å¨æ¶ï¼æå¡å¨å°è¯å¥è½¬æ¢æ
å符éï¼character_set_connection
- æå¡å¨æ§è¡å®ï¼å°ç»æè¿åç»å®¢æ·ç«¯æ¶ï¼ä½¿ç¨çæ¯
å符éãcharacter_set_results
4.3.2 客æ·ç«¯è¿æ¥å符é设置
å½æå¡å¨ä½¿ç¨çå符éä¸å®¢æ·ç«¯è¿æ¥ä½¿ç¨çå符éä¸åæ¶ï¼å¯è½ä¼æé®é¢ï¼
MySQL [test]> select title from article where id = 2;
+-------+
| title |
+-------+
| ????? |
+-------+
å¤å¶
è¿æ¶éè¦å°å®¢æ·ç«¯è¿æ¥çå符é设置æä¸æå¡å¨ä¿æä¸è´ã
SET NAMES utf8mb4;
å¤å¶
è¿æ ·å°±å¯ä»¥äºï¼
MySQL [test]> select title from article where id = 2;
+-----------------+
| title |
+-----------------+
| æªå½åé¡¹ç® |
+-----------------+
å¤å¶
ä¸é¢çè¯å¥çåäºï¼
SET character_set_client = utf8mb4;
SET character_set_results = utf8mb4;
SET character_set_connection = utf8mb4;
å¤å¶
4.3.3 ä¸äºéå¶
å¨è®¾ç½®
character_set_client
åéæ¶ï¼ä¸é¢çå符éä¸å¯ç¨ï¼
-
ucs2
-
utf16
-
utf16le
-
utf32
å¦åå°±ä¼æ¥éï¼
MySQL [test]> set names utf16;
ERROR 1231 (42000): Variable 'character_set_client' can't be set to the value of 'utf16'
å¤å¶
4.3.4 ç¨åºä¸è¿æ¥ç设置
使ç¨æ°æ®åºçç¨åºä¹æ¯ä¸ä¸ªå®¢æ·ç«¯ï¼å¨è¿æ¥æ°æ®åºçæ¶åä¹éè¦æå®å符éã
Pythonï¼
conn = mysql.connect(host='127.0.0.1',
user='user',
passwd='passwd',
db='db',
charset='utf8')
å¤å¶
Golang:
dsn := `root:root@tcp(127.0.0.1 :3306)/DB_NAME?charset=utf8mb4`
dbConn, _:= sql.Open(`mysql`, dsn)
å¤å¶
5. MySQL对Unicodeçæ¯æ
åé¢æå°è¿ä¸ç§å å«ææè¯è¨ææå符çå符éUnicodeï¼å®çç ç¹å为两个é¨åï¼
BMP(Basic Multilingual Plane): åºæ¬å¤æç§å¹³é¢ï¼èå´æ¯0x0000å°0xFFFFï¼
è¡¥å å¹³é¢ï¼Supplementaryï¼: è¡¥å å¹³é¢ï¼èå´æ¯0x10000å°0x10FFFFã
5.1 BMP
BMPæå¦ä¸çç¹ç¹ï¼
èå´æ¯0x0000å°0xFFFFï¼ä¸å ±65535个å符ï¼
å¯ä»¥ç¼ç æåé¿ç¼ç ï¼ä½¿ç¨1å°3个åèï¼
ä¹å¯ä»¥ç¼ç æå®é¿ç¼ç ï¼ä½¿ç¨2个åèï¼
对äºä¸»è¦è¯è¨ç大å¤æ°å符æ¥è¯´æ¯è¶³å¤äºã
5.2 Supplementary
è¡¥å å¹³é¢æå¦ä¸çç¹ç¹ï¼
èå´æ¯0x10000å°0x10FFFFï¼
ç¼ç æéçåè大äºBMPï¼éè¦4个åèã
5.3 MySQLçæ¯æ
MySQLä¸æå 个å符éæ¯æäºUnicodeï¼
å符é | æ¯æçå符 | æ¯ä¸ªå符æéçåå¨å¤§å° | å¤æ³¨ |
---|---|---|---|
, | BMP | 1ã2æ3个åè | MySQL 8.0ä¸å·²å¼ç¨ |
| BMP | 2个åè | MySQL 8.0ä¸å·²å¼ç¨ |
| BMPåSupplementary | 1ã2ã3æ3个åè | æ¨èä½¿ç¨ |
| BMPåSupplementary | 2æ4个åè | |
| BMPåSupplementary | 2æ4个åè | little-endianï¼å ä¸æ · |
| BMPåSupplementary | 4个åè |