034_The Unicode Standard

1. The Unicode Standard

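1.1. Character sets such as ASCII, the ISO character sets, and GBK each have a limited capacity and are not compatible with one another in multilingual environments, so the Unicode Consortium developed the Unicode Standard.

1.2. The Unicode Standard aims to cover the characters, punctuation marks, and symbols of all the world's writing systems.

1.3. Regardless of platform, program, or language, Unicode allows text data to be processed, stored, and exchanged.

1.4. The Unicode Standard has been widely adopted. It is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML, and it is supported by many operating systems and by all modern browsers.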
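
The capacity limits mentioned in 1.1 can be illustrated with a minimal Python sketch. The helper name `can_encode` and the sample string are assumptions made for this illustration; the codec names `ascii`, `gbk`, and `utf-8` are the ones shipped with standard CPython builds.

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample mixing English, Chinese (U+4F60 U+597D), and Arabic (U+0645 ...).
# Escapes are used so the source stays readable in any editor.
sample = "Hello, \u4f60\u597d, \u0645\u0631\u062d\u0628\u0627"

for name in ("ascii", "gbk", "utf-8"):
    # ascii fails on the Chinese text, gbk fails on the Arabic text,
    # utf-8 (a Unicode encoding) handles all three scripts.
    print(name, can_encode(sample, name))
```

Only the Unicode-based encoding accepts the whole string, which is the multilingual problem that Unicode was designed to solve.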

2. Unicode characters

2.1. Website: https://home.unicode.org/

2.2. Character blocks in the code charts of Unicode Standard version 13

No. Block name Range
1 C0 Controls and Basic Latin 0000–007F
2 C1 Controls and Latin-1 Supplement 0080–00FF
3 Latin Extended-A 0100–017F
4 Latin Extended-B 0180–024F
5 IPA Extensions 0250–02AF
6 Spacing Modifier Letters 02B0–02FF
7 Combining Diacritical Marks 0300–036F
8 Greek and Coptic 0370–03FF
9 Cyrillic 0400–04FF
10 Cyrillic Supplement 0500–052F
11 Armenian 0530–058F
12 Hebrew 0590–05FF
13 Arabic 0600–06FF
14 Syriac 0700–074F
15 Arabic Supplement 0750–077F
16 Thaana 0780–07BF
17 N'Ko 07C0–07FF
18 Samaritan 0800–083F
19 Mandaic 0840–085F
20 Syriac Supplement 0860–086F
21 Arabic Extended-A 08A0–08FF
22 Devanagari 0900–097F
23 Bengali 0980–09FF
24 Gurmukhi 0A00–0A7F
25 Gujarati 0A80–0AFF
26 Oriya 0B00–0B7F
27 Tamil 0B80–0BFF
28 Telugu 0C00–0C7F
29 Kannada 0C80–0CFF
30 Malayalam 0D00–0D7F
31 Sinhala 0D80–0DFF
32 Thai 0E00–0E7F
33 Lao 0E80–0EFF
34 Tibetan 0F00–0FFF
35 Myanmar 1000–109F
36 Georgian 10A0–10FF
37 Hangul Jamo 1100–11FF
38 Ethiopic 1200–137F
39 Ethiopic Supplement 1380–139F
40 Cherokee 13A0–13FF
41 Unified Canadian Aboriginal Syllabics 1400–167F
42 Ogham 1680–169F
43 Runic 16A0–16FF
44 Tagalog 1700–171F
45 Hanunoo 1720–173F
46 Buhid 1740–175F
47 Tagbanwa 1760–177F
48 Khmer 1780–17FF
49 Mongolian 1800–18AF
50 Unified Canadian Aboriginal Syllabics Extended 18B0–18FF
51 Limbu 1900–194F
52 Tai Le 1950–197F
53 New Tai Lue 1980–19DF
54 Khmer Symbols 19E0–19FF
55 Buginese 1A00–1A1F
56 Tai Tham 1A20–1AAF
57 Combining Diacritical Marks Extended 1AB0–1AFF
58 Balinese 1B00–1B7F
59 Sundanese 1B80–1BBF
60 Batak 1BC0–1BFF
61 Lepcha 1C00–1C4F
62 Ol Chiki 1C50–1C7F
63 Cyrillic Extended-C 1C80–1C8F
64 Georgian Extended 1C90–1CBF
65 Sundanese Supplement 1CC0–1CCF
66 Vedic Extensions 1CD0–1CFF
67 Phonetic Extensions 1D00–1D7F
68 Phonetic Extensions Supplement 1D80–1DBF
69 Combining Diacritical Marks Supplement 1DC0–1DFF
70 Latin Extended Additional 1E00–1EFF
71 Greek Extended 1F00–1FFF
72 General Punctuation 2000–206F
73 Superscripts and Subscripts 2070–209F
74 Currency Symbols 20A0–20CF
75 Combining Diacritical Marks for Symbols 20D0–20FF
76 Letterlike Symbols 2100–214F
77 Number Forms 2150–218F
78 Arrows 2190–21FF
79 Mathematical Operators 2200–22FF
80 Miscellaneous Technical 2300–23FF
81 Control Pictures 2400–243F
82 Optical Character Recognition 2440–245F
83 Enclosed Alphanumerics 2460–24FF
84 Box Drawing 2500–257F
85 Block Elements 2580–259F
86 Geometric Shapes 25A0–25FF
87 Miscellaneous Symbols 2600–26FF
88 Dingbats 2700–27BF
89 Miscellaneous Mathematical Symbols-A 27C0–27EF
90 Supplemental Arrows-A 27F0–27FF
91 Braille Patterns 2800–28FF
92 Supplemental Arrows-B 2900–297F
93 Miscellaneous Mathematical Symbols-B 2980–29FF
94 Supplemental Mathematical Operators 2A00–2AFF
95 Miscellaneous Symbols and Arrows 2B00–2BFF
96 Glagolitic 2C00–2C5F
97 Latin Extended-C 2C60–2C7F
98 Coptic 2C80–2CFF
99 Georgian Supplement 2D00–2D2F
100 Tifinagh 2D30–2D7F
101 Ethiopic Extended 2D80–2DDF
102 Cyrillic Extended-A 2DE0–2DFF
103 Supplemental Punctuation 2E00–2E7F
104 CJK Radicals Supplement 2E80–2EFF
105 Kangxi Radicals 2F00–2FDF
106 Ideographic Description Characters 2FF0–2FFF
107 CJK Symbols and Punctuation 3000–303F
108 Hiragana 3040–309F
109 Katakana 30A0–30FF
110 Bopomofo 3100–312F
111 Hangul Compatibility Jamo 3130–318F
112 Kanbun 3190–319F
113 Bopomofo Extended 31A0–31BF
114 CJK Strokes 31C0–31EF
115 Katakana Phonetic Extensions 31F0–31FF
116 Enclosed CJK Letters and Months 3200–32FF
117 CJK Compatibility 3300–33FF
118 CJK Unified Ideographs Extension A 3400–4DBF
119 Yijing Hexagram Symbols 4DC0–4DFF
120 CJK Unified Ideographs 4E00–9FFC
121 Yi Syllables A000–A48F
122 Yi Radicals A490–A4CF
123 Lisu A4D0–A4FF
124 Vai A500–A63F
125 Cyrillic Extended-B A640–A69F
126 Bamum A6A0–A6FF
127 Modifier Tone Letters A700–A71F
128 Latin Extended-D A720–A7FF
129 Syloti Nagri A800–A82F
130 Common Indic Number Forms A830–A83F
131 Phags-pa A840–A87F
132 Saurashtra A880–A8DF
133 Devanagari Extended A8E0–A8FF
134 Kayah Li A900–A92F
135 Rejang A930–A95F
136 Hangul Jamo Extended-A A960–A97F
137 Javanese A980–A9DF
138 Myanmar Extended-B A9E0–A9FF
139 Cham AA00–AA5F
140 Myanmar Extended-A AA60–AA7F
141 Tai Viet AA80–AADF
142 Meetei Mayek Extensions AAE0–AAFF
143 Ethiopic Extended-A AB00–AB2F
144 Latin Extended-E AB30–AB6F
145 Cherokee Supplement AB70–ABBF
146 Meetei Mayek ABC0–ABFF
147 Hangul Syllables AC00–D7AF
148 Hangul Jamo Extended-B D7B0–D7FF
149 High Surrogate Area (UTF-16 high surrogates) D800–DBFF
150 Low Surrogate Area (UTF-16 low surrogates) DC00–DFFF
151 Private Use Area E000–F8FF
152 CJK Compatibility Ideographs F900–FAD9
153 Alphabetic Presentation Forms FB00–FB4F
154 Arabic Presentation Forms-A FB50–FDFF
155 Variation Selectors FE00–FE0F
156 Vertical Forms FE10–FE1F
157 Combining Half Marks FE20–FE2F
158 CJK Compatibility Forms FE30–FE4F
159 Small Form Variants FE50–FE6F
160 Arabic Presentation Forms-B FE70–FEFF
161 Halfwidth and Fullwidth Forms FF00–FFEF
162 Specials FFF0–FFFF
163 Linear B Syllabary 10000–1007F
164 Linear B Ideograms 10080–100FF
165 Aegean Numbers 10100–1013F
166 Ancient Greek Numbers 10140–1018F
167 Ancient Symbols 10190–101CF
168 Phaistos Disc 101D0–101FF
169 Lycian 10280–1029F
170 Carian 102A0–102DF
171 Coptic Epact Numbers 102E0–102FF
172 Old Italic 10300–1032F
173 Gothic 10330–1034F
174 Old Permic 10350–1037F
175 Ugaritic 10380–1039F
176 Old Persian 103A0–103DF
177 Deseret 10400–1044F
178 Shavian 10450–1047F
179 Osmanya 10480–104AF
180 Osage 104B0–104FF
181 Elbasan 10500–1052F
182 Caucasian Albanian 10530–1056F
183 Linear A 10600–1077F
184 Cypriot Syllabary 10800–1083F
185 Imperial Aramaic 10840–1085F
186 Palmyrene 10860–1087F
187 Nabataean 10880–108AF
188 Hatran 108E0–108FF
189 Phoenician 10900–1091F
190 Lydian 10920–1093F
191 Meroitic Hieroglyphs 10980–1099F
192 Meroitic Cursive 109A0–109FF
193 Kharoshthi 10A00–10A5F
194 Old South Arabian 10A60–10A7F
195 Old North Arabian 10A80–10A9F
196 Manichaean 10AC0–10AFF
197 Avestan 10B00–10B3F
198 Inscriptional Parthian 10B40–10B5F
199 Inscriptional Pahlavi 10B60–10B7F
200 Psalter Pahlavi 10B80–10BAF
201 Old Turkic 10C00–10C4F
202 Old Hungarian 10C80–10CFF
203 Hanifi Rohingya 10D00–10D3F
204 Rumi Numeral Symbols 10E60–10E7F
205 Yezidi 10E80–10EBF
206 Old Sogdian 10F00–10F2F
207 Sogdian 10F30–10F6F
208 Chorasmian 10FB0–10FDF
209 Elymaic 10FE0–10FFF
210 Brahmi 11000–1107F
211 Kaithi 11080–110CF
212 Sora Sompeng 110D0–110FF
213 Chakma 11100–1114F
214 Mahajani 11150–1117F
215 Sharada 11180–111DF
216 Sinhala Archaic Numbers 111E0–111FF
217 Khojki 11200–1124F
218 Multani 11280–112AF
219 Khudawadi 112B0–112FF
220 Grantha 11300–1137F
221 Newa 11400–1147F
222 Tirhuta 11480–114DF
223 Siddham 11580–115FF
224 Modi 11600–1165F
225 Mongolian Supplement 11660–1167F
226 Takri 11680–116CF
227 Ahom 11700–1173F
228 Dogra 11800–1184F
229 Warang Citi 118A0–118FF
230 Dives Akuru 11900–1195F
231 Nandinagari 119A0–119FF
232 Zanabazar Square 11A00–11A4F
233 Soyombo 11A50–11AAF
234 Pau Cin Hau 11AC0–11AFF
235 Bhaiksuki 11C00–11C6F
236 Marchen 11C70–11CBF
237 Masaram Gondi 11D00–11D5F
238 Gunjala Gondi 11D60–11DAF
239 Makasar 11EE0–11EFF
240 Lisu Supplement 11FB0–11FBF
241 Tamil Supplement 11FC0–11FFF
242 Cuneiform 12000–123FF
243 Cuneiform Numbers and Punctuation 12400–1247F
244 Early Dynastic Cuneiform 12480–1254F
245 Egyptian Hieroglyphs 13000–1342F
246 Egyptian Hieroglyph Format Controls 13430–1343F
247 Anatolian Hieroglyphs 14400–1467F
248 Bamum Supplement 16800–16A3F
249 Mro 16A40–16A6F
250 Bassa Vah 16AD0–16AFF
251 Pahawh Hmong 16B00–16B8F
252 Medefaidrin 16E40–16E9F
253 Miao 16F00–16F9F
254 Ideographic Symbols and Punctuation 16FE0–16FFF
255 Tangut 17000–187F7
256 Tangut Components 18800–18AFF
257 Khitan Small Script 18B00–18CFF
258 Tangut Supplement 18D00–18D08
259 Kana Supplement 1B000–1B0FF
260 Kana Extended-A 1B100–1B12F
261 Small Kana Extension 1B130–1B16F
262 Nushu 1B170–1B2FF
263 Duployan 1BC00–1BC9F
264 Shorthand Format Controls 1BCA0–1BCAF
265 Byzantine Musical Symbols 1D000–1D0FF
266 Musical Symbols 1D100–1D1FF
267 Ancient Greek Musical Notation 1D200–1D24F
268 Mayan Numerals 1D2E0–1D2FF
269 Tai Xuan Jing Symbols 1D300–1D35F
270 Counting Rod Numerals 1D360–1D37F
271 Mathematical Alphanumeric Symbols 1D400–1D7FF
272 Sutton SignWriting 1D800–1DAAF
273 Glagolitic Supplement 1E000–1E02F
274 Nyiakeng Puachue Hmong 1E100–1E14F
275 Wancho 1E2C0–1E2FF
276 Mende Kikakui 1E800–1E8DF
277 Adlam 1E900–1E95F
278 Indic Siyaq Numbers 1EC70–1ECBF
279 Ottoman Siyaq Numbers 1ED00–1ED4F
280 Arabic Mathematical Alphabetic Symbols 1EE00–1EEFF
281 Mahjong Tiles 1F000–1F02F
282 Domino Tiles 1F030–1F09F
283 Playing Cards 1F0A0–1F0FF
284 Enclosed Alphanumeric Supplement 1F100–1F1FF
285 Enclosed Ideographic Supplement 1F200–1F2FF
286 Miscellaneous Symbols and Pictographs 1F300–1F5FF
287 Emoticons 1F600–1F64F
288 Ornamental Dingbats 1F650–1F67F
289 Transport and Map Symbols 1F680–1F6FF
290 Alchemical Symbols 1F700–1F77F
291 Geometric Shapes Extended 1F780–1F7FF
292 Supplemental Arrows-C 1F800–1F8FF
293 Supplemental Symbols and Pictographs 1F900–1F9FF
294 Chess Symbols 1FA00–1FA6F
295 Symbols and Pictographs Extended-A 1FA70–1FAFF
296 Symbols for Legacy Computing 1FB00–1FBFF
297 Unassigned 1FF80–1FFFF
298 CJK Unified Ideographs Extension B 20000–2A6DD
299 CJK Unified Ideographs Extension C 2A700–2B734
300 CJK Unified Ideographs Extension D 2B740–2B81D
301 CJK Unified Ideographs Extension E 2B820–2CEA1
302 CJK Unified Ideographs Extension F 2CEB0–2EBE0
303 CJK Compatibility Ideographs Supplement 2F800–2FA1D
304 Unassigned 2FF80–2FFFF
305 CJK Unified Ideographs Extension G 30000–3134A
306 Unassigned 3FF80–3FFFF
307 Unassigned 4FF80–4FFFF
308 Unassigned 5FF80–5FFFF
309 Unassigned 6FF80–6FFFF
310 Unassigned 7FF80–7FFFF
311 Unassigned 8FF80–8FFFF
312 Unassigned 9FF80–9FFFF
313 Unassigned AFF80–AFFFF
314 Unassigned BFF80–BFFFF
315 Unassigned CFF80–CFFFF
316 Unassigned DFF80–DFFFF
317 Tags E0000–E007F
318 Variation Selectors Supplement E0100–E01EF
319 Unassigned EFF80–EFFFF
320 Supplementary Private Use Area-A F0000–FFFFF
321 Supplementary Private Use Area-B 100000–10FFFF
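
A table like the one above can be used to find the block a character belongs to. The following is a minimal Python sketch under stated assumptions: `BLOCKS` copies only a handful of rows from the table, and the names `BLOCKS`, `STARTS`, and `block_of` are hypothetical, chosen for this illustration. A complete lookup would need every row.

```python
from bisect import bisect_right

# Excerpt of the table above: (first code point, last code point, block name).
# Rows must stay sorted by their first code point for the binary search to work.
BLOCKS = [
    (0x0000, 0x007F, "C0 Controls and Basic Latin"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x3040, 0x309F, "Hiragana"),
    (0x4E00, 0x9FFC, "CJK Unified Ideographs"),
    (0xAC00, 0xD7AF, "Hangul Syllables"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [first for first, _, _ in BLOCKS]


def block_of(char):
    """Return the block name for a one-character string, or None if the
    code point is not covered by the excerpt."""
    cp = ord(char)
    # Find the last row whose start is <= cp, then check its upper bound.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0:
        first, last, name = BLOCKS[i]
        if first <= cp <= last:
            return name
    return None


for ch in ("A", "\u044f", "\u4e2d", "\U0001F600", "\u00e9"):
    print(f"U+{ord(ch):04X}", block_of(ch))
```

The last sample, U+00E9, falls in the Latin-1 Supplement row, which is not part of the excerpt, so the sketch returns `None` for it.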