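1. The Unicode Standard
1.1. Character sets such as ASCII, the ISO character sets, and GBK each have limited capacity and do not work well in multilingual environments, so the Unicode Consortium developed the Unicode Standard.
1.2. The Unicode Standard aims to cover the characters, punctuation marks, and symbols of all the world's writing systems.
1.3. Regardless of platform, program, or language, Unicode allows text data to be processed, stored, and exchanged.
1.4. The Unicode Standard has been widely adopted: it is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML, and it is supported by many operating systems and by all modern browsers.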
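To make the idea of platform-independent storage and exchange concrete, here is a minimal Python sketch. The sample string and variable names are illustrative choices, not part of the original notes. It shows that each character has a single code point, while UTF-8 and UTF-16 are two different byte encodings of the same code points.

```python
# Illustrative sample: Latin "A", "e" with acute accent, a CJK ideograph,
# and an emoji outside the Basic Multilingual Plane.
text = "A\u00e9\u4e2d\U0001F600"

# Every character is identified by one code point, conventionally written U+XXXX.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)

# The same code points can be serialized with different encodings.
utf8_bytes = text.encode("utf-8")       # 1 to 4 bytes per code point
utf16_bytes = text.encode("utf-16-be")  # 2 or 4 bytes per code point
print(len(utf8_bytes), len(utf16_bytes))

# Decoding with the matching codec recovers the original text.
round_trip_ok = (
    utf8_bytes.decode("utf-8") == text
    and utf16_bytes.decode("utf-16-be") == text
)
print(round_trip_ok)
```

The emoji takes four bytes in both encodings here; in UTF-16 that is because it is stored as a pair of surrogate code units.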
2. Unicode characters
2.1. Website: https://home.unicode.org/
2.2. Archived character blocks of the Unicode Standard, version 13
No. | Block name | Range |
1 | C0 Controls and Basic Latin | 0000–007F |
2 | C1 Controls and Latin-1 Supplement | 0080–00FF |
3 | Latin Extended-A | 0100–017F |
4 | Latin Extended-B | 0180–024F |
5 | IPA Extensions | 0250–02AF |
6 | Spacing Modifier Letters | 02B0–02FF |
7 | Combining Diacritical Marks | 0300–036F |
8 | Greek and Coptic | 0370–03FF |
9 | Cyrillic | 0400–04FF |
10 | Cyrillic Supplement | 0500–052F |
11 | Armenian | 0530–058F |
12 | Hebrew | 0590–05FF |
13 | Arabic | 0600–06FF |
14 | Syriac | 0700–074F |
15 | Arabic Supplement | 0750–077F |
16 | Thaana | 0780–07BF |
17 | N'Ko | 07C0–07FF |
18 | Samaritan | 0800–083F |
19 | Mandaic | 0840–085F |
20 | Syriac Supplement | 0860–086F |
21 | Arabic Extended-A | 08A0–08FF |
22 | Devanagari | 0900–097F |
23 | Bengali | 0980–09FF |
24 | Gurmukhi | 0A00–0A7F |
25 | Gujarati | 0A80–0AFF |
26 | Oriya | 0B00–0B7F |
27 | Tamil | 0B80–0BFF |
28 | Telugu | 0C00–0C7F |
29 | Kannada | 0C80–0CFF |
30 | Malayalam | 0D00–0D7F |
31 | Sinhala | 0D80–0DFF |
32 | Thai | 0E00–0E7F |
33 | Lao | 0E80–0EFF |
34 | Tibetan | 0F00–0FFF |
35 | Myanmar | 1000–109F |
36 | Georgian | 10A0–10FF |
37 | Hangul Jamo | 1100–11FF |
38 | Ethiopic | 1200–137F |
39 | Ethiopic Supplement | 1380–139F |
40 | Cherokee | 13A0–13FF |
41 | Unified Canadian Aboriginal Syllabics | 1400–167F |
42 | Ogham | 1680–169F |
43 | Runic | 16A0–16FF |
44 | Tagalog | 1700–171F |
45 | Hanunoo | 1720–173F |
46 | Buhid | 1740–175F |
47 | Tagbanwa | 1760–177F |
48 | Khmer | 1780–17FF |
49 | Mongolian | 1800–18AF |
50 | Unified Canadian Aboriginal Syllabics Extended | 18B0–18FF |
51 | Limbu | 1900–194F |
52 | Tai Le | 1950–197F |
53 | New Tai Lue | 1980–19DF |
54 | Khmer Symbols | 19E0–19FF |
55 | Buginese | 1A00–1A1F |
56 | Tai Tham | 1A20–1AAF |
57 | Combining Diacritical Marks Extended | 1AB0–1AFF |
58 | Balinese | 1B00–1B7F |
59 | Sundanese | 1B80–1BBF |
60 | Batak | 1BC0–1BFF |
61 | Lepcha | 1C00–1C4F |
62 | Ol Chiki | 1C50–1C7F |
63 | Cyrillic Extended-C | 1C80–1C8F |
64 | Georgian Extended | 1C90–1CBF |
65 | Sundanese Supplement | 1CC0–1CCF |
66 | Vedic Extensions | 1CD0–1CFF |
67 | Phonetic Extensions | 1D00–1D7F |
68 | Phonetic Extensions Supplement | 1D80–1DBF |
69 | Combining Diacritical Marks Supplement | 1DC0–1DFF |
70 | Latin Extended Additional | 1E00–1EFF |
71 | Greek Extended | 1F00–1FFF |
72 | General Punctuation | 2000–206F |
73 | Superscripts and Subscripts | 2070–209F |
74 | Currency Symbols | 20A0–20CF |
75 | Combining Diacritical Marks for Symbols | 20D0–20FF |
76 | Letterlike Symbols | 2100–214F |
77 | Number Forms | 2150–218F |
78 | Arrows | 2190–21FF |
79 | Mathematical Operators | 2200–22FF |
80 | Miscellaneous Technical | 2300–23FF |
81 | Control Pictures | 2400–243F |
82 | Optical Character Recognition | 2440–245F |
83 | Enclosed Alphanumerics | 2460–24FF |
84 | Box Drawing | 2500–257F |
85 | Block Elements | 2580–259F |
86 | Geometric Shapes | 25A0–25FF |
87 | Miscellaneous Symbols | 2600–26FF |
88 | Dingbats | 2700–27BF |
89 | Miscellaneous Mathematical Symbols-A | 27C0–27EF |
90 | Supplemental Arrows-A | 27F0–27FF |
91 | Braille Patterns | 2800–28FF |
92 | Supplemental Arrows-B | 2900–297F |
93 | Miscellaneous Mathematical Symbols-B | 2980–29FF |
94 | Supplemental Mathematical Operators | 2A00–2AFF |
95 | Miscellaneous Symbols and Arrows | 2B00–2BFF |
96 | Glagolitic | 2C00–2C5F |
97 | Latin Extended-C | 2C60–2C7F |
98 | Coptic | 2C80–2CFF |
99 | Georgian Supplement | 2D00–2D2F |
100 | Tifinagh | 2D30–2D7F |
101 | Ethiopic Extended | 2D80–2DDF |
102 | Cyrillic Extended-A | 2DE0–2DFF |
103 | Supplemental Punctuation | 2E00–2E7F |
104 | CJK Radicals Supplement | 2E80–2EFF |
105 | Kangxi Radicals | 2F00–2FDF |
106 | Ideographic Description Characters | 2FF0–2FFF |
107 | CJK Symbols and Punctuation | 3000–303F |
108 | Hiragana | 3040–309F |
109 | Katakana | 30A0–30FF |
110 | Bopomofo | 3100–312F |
111 | Hangul Compatibility Jamo | 3130–318F |
112 | Kanbun | 3190–319F |
113 | Bopomofo Extended | 31A0–31BF |
114 | CJK Strokes | 31C0–31EF |
115 | Katakana Phonetic Extensions | 31F0–31FF |
116 | Enclosed CJK Letters and Months | 3200–32FF |
117 | CJK Compatibility | 3300–33FF |
118 | CJK Unified Ideographs Extension A | 3400–4DBF |
119 | Yijing Hexagram Symbols | 4DC0–4DFF |
120 | CJK Unified Ideographs (basic Han characters) | 4E00–9FFC |
121 | Yi Syllables | A000–A48F |
122 | Yi Radicals | A490–A4CF |
123 | Lisu | A4D0–A4FF |
124 | Vai | A500–A63F |
125 | Cyrillic Extended-B | A640–A69F |
126 | Bamum | A6A0–A6FF |
127 | Modifier Tone Letters | A700–A71F |
128 | Latin Extended-D | A720–A7FF |
129 | Syloti Nagri | A800–A82F |
130 | Common Indic Number Forms | A830–A83F |
131 | Phags-pa | A840–A87F |
132 | Saurashtra | A880–A8DF |
133 | Devanagari Extended | A8E0–A8FF |
134 | Kayah Li | A900–A92F |
135 | Rejang | A930–A95F |
136 | Hangul Jamo Extended-A | A960–A97F |
137 | Javanese | A980–A9DF |
138 | Myanmar Extended-B | A9E0–A9FF |
139 | Cham | AA00–AA5F |
140 | Myanmar Extended-A | AA60–AA7F |
141 | Tai Viet | AA80–AADF |
142 | Meetei Mayek Extensions | AAE0–AAFF |
143 | Ethiopic Extended-A | AB00–AB2F |
144 | Latin Extended-E | AB30–AB6F |
145 | Cherokee Supplement | AB70–ABBF |
146 | Meetei Mayek | ABC0–ABFF |
147 | Hangul Syllables | AC00–D7AF |
148 | Hangul Jamo Extended-B | D7B0–D7FF |
149 | High Surrogate Area (reserved for UTF-16 high surrogates) | D800–DBFF |
150 | Low Surrogate Area (reserved for UTF-16 low surrogates) | DC00–DFFF |
151 | Private Use Area | E000–F8FF |
152 | CJK Compatibility Ideographs | F900–FAD9 |
153 | Alphabetic Presentation Forms | FB00–FB4F |
154 | Arabic Presentation Forms-A | FB50–FDFF |
155 | Variation Selectors | FE00–FE0F |
156 | Vertical Forms | FE10–FE1F |
157 | Combining Half Marks | FE20–FE2F |
158 | CJK Compatibility Forms | FE30–FE4F |
159 | Small Form Variants | FE50–FE6F |
160 | Arabic Presentation Forms-B | FE70–FEFF |
161 | Halfwidth and Fullwidth Forms | FF00–FFEF |
162 | Specials | FFF0–FFFF |
163 | Linear B Syllabary | 10000–1007F |
164 | Linear B Ideograms | 10080–100FF |
165 | Aegean Numbers | 10100–1013F |
166 | Ancient Greek Numbers | 10140–1018F |
167 | Ancient Symbols | 10190–101CF |
168 | Phaistos Disc | 101D0–101FF |
169 | Lycian | 10280–1029F |
170 | Carian | 102A0–102DF |
171 | Coptic Epact Numbers | 102E0–102FF |
172 | Old Italic | 10300–1032F |
173 | Gothic | 10330–1034F |
174 | Old Permic | 10350–1037F |
175 | Ugaritic | 10380–1039F |
176 | Old Persian | 103A0–103DF |
177 | Deseret | 10400–1044F |
178 | Shavian | 10450–1047F |
179 | Osmanya | 10480–104AF |
180 | Osage | 104B0–104FF |
181 | Elbasan | 10500–1052F |
182 | Caucasian Albanian | 10530–1056F |
183 | Linear A | 10600–1077F |
184 | Cypriot Syllabary | 10800–1083F |
185 | Imperial Aramaic | 10840–1085F |
186 | Palmyrene | 10860–1087F |
187 | Nabataean | 10880–108AF |
188 | Hatran | 108E0–108FF |
189 | Phoenician | 10900–1091F |
190 | Lydian | 10920–1093F |
191 | Meroitic Hieroglyphs | 10980–1099F |
192 | Meroitic Cursive | 109A0–109FF |
193 | Kharoshthi | 10A00–10A5F |
194 | Old South Arabian | 10A60–10A7F |
195 | Old North Arabian | 10A80–10A9F |
196 | Manichaean | 10AC0–10AFF |
197 | Avestan | 10B00–10B3F |
198 | Inscriptional Parthian | 10B40–10B5F |
199 | Inscriptional Pahlavi | 10B60–10B7F |
200 | Psalter Pahlavi | 10B80–10BAF |
201 | Old Turkic | 10C00–10C4F |
202 | Old Hungarian | 10C80–10CFF |
203 | Hanifi Rohingya | 10D00–10D3F |
204 | Rumi Numeral Symbols | 10E60–10E7F |
205 | Yezidi | 10E80–10EBF |
206 | Old Sogdian | 10F00–10F2F |
207 | Sogdian | 10F30–10F6F |
208 | Chorasmian | 10FB0–10FDF |
209 | Elymaic | 10FE0–10FFF |
210 | Brahmi | 11000–1107F |
211 | Kaithi | 11080–110CF |
212 | Sora Sompeng | 110D0–110FF |
213 | Chakma | 11100–1114F |
214 | Mahajani | 11150–1117F |
215 | Sharada | 11180–111DF |
216 | Sinhala Archaic Numbers | 111E0–111FF |
217 | Khojki | 11200–1124F |
218 | Multani | 11280–112AF |
219 | Khudawadi | 112B0–112FF |
220 | Grantha | 11300–1137F |
221 | Newa | 11400–1147F |
222 | Tirhuta | 11480–114DF |
223 | Siddham | 11580–115FF |
224 | Modi | 11600–1165F |
225 | Mongolian Supplement | 11660–1167F |
226 | Takri | 11680–116CF |
227 | Ahom | 11700–1173F |
228 | Dogra | 11800–1184F |
229 | Warang Citi | 118A0–118FF |
230 | Dives Akuru | 11900–1195F |
231 | Nandinagari | 119A0–119FF |
232 | Zanabazar Square | 11A00–11A4F |
233 | Soyombo | 11A50–11AAF |
234 | Pau Cin Hau | 11AC0–11AFF |
235 | Bhaiksuki | 11C00–11C6F |
236 | Marchen | 11C70–11CBF |
237 | Masaram Gondi | 11D00–11D5F |
238 | Gunjala Gondi | 11D60–11DAF |
239 | Makasar | 11EE0–11EFF |
240 | Lisu Supplement | 11FB0–11FBF |
241 | Tamil Supplement | 11FC0–11FFF |
242 | Cuneiform | 12000–123FF |
243 | Cuneiform Numbers and Punctuation | 12400–1247F |
244 | Early Dynastic Cuneiform | 12480–1254F |
245 | Egyptian Hieroglyphs | 13000–1342F |
246 | Egyptian Hieroglyph Format Controls | 13430–1343F |
247 | Anatolian Hieroglyphs | 14400–1467F |
248 | Bamum Supplement | 16800–16A3F |
249 | Mro | 16A40–16A6F |
250 | Bassa Vah | 16AD0–16AFF |
251 | Pahawh Hmong | 16B00–16B8F |
252 | Medefaidrin | 16E40–16E9F |
253 | Miao | 16F00–16F9F |
254 | Ideographic Symbols and Punctuation | 16FE0–16FFF |
255 | Tangut | 17000–187F7 |
256 | Tangut Components | 18800–18AFF |
257 | Khitan Small Script | 18B00–18CFF |
258 | Tangut Supplement | 18D00–18D08 |
259 | Kana Supplement | 1B000–1B0FF |
260 | Kana Extended-A | 1B100–1B12F |
261 | Small Kana Extension | 1B130–1B16F |
262 | Nushu | 1B170–1B2FF |
263 | Duployan | 1BC00–1BC9F |
264 | Shorthand Format Controls | 1BCA0–1BCAF |
265 | Byzantine Musical Symbols | 1D000–1D0FF |
266 | Musical Symbols | 1D100–1D1FF |
267 | Ancient Greek Musical Notation | 1D200–1D24F |
268 | Mayan Numerals | 1D2E0–1D2FF |
269 | Tai Xuan Jing Symbols | 1D300–1D35F |
270 | Counting Rod Numerals | 1D360–1D37F |
271 | Mathematical Alphanumeric Symbols | 1D400–1D7FF |
272 | Sutton SignWriting | 1D800–1DAAF |
273 | Glagolitic Supplement | 1E000–1E02F |
274 | Nyiakeng Puachue Hmong | 1E100–1E14F |
275 | Wancho | 1E2C0–1E2FF |
276 | Mende Kikakui | 1E800–1E8DF |
277 | Adlam | 1E900–1E95F |
278 | Indic Siyaq Numbers | 1EC70–1ECBF |
279 | Ottoman Siyaq Numbers | 1ED00–1ED4F |
280 | Arabic Mathematical Alphabetic Symbols | 1EE00–1EEFF |
281 | Mahjong Tiles | 1F000–1F02F |
282 | Domino Tiles | 1F030–1F09F |
283 | Playing Cards | 1F0A0–1F0FF |
284 | Enclosed Alphanumeric Supplement | 1F100–1F1FF |
285 | Enclosed Ideographic Supplement | 1F200–1F2FF |
286 | Miscellaneous Symbols and Pictographs | 1F300–1F5FF |
287 | Emoticons | 1F600–1F64F |
288 | Ornamental Dingbats | 1F650–1F67F |
289 | Transport and Map Symbols | 1F680–1F6FF |
290 | Alchemical Symbols | 1F700–1F77F |
291 | Geometric Shapes Extended | 1F780–1F7FF |
292 | Supplemental Arrows-C | 1F800–1F8FF |
293 | Supplemental Symbols and Pictographs | 1F900–1F9FF |
294 | Chess Symbols | 1FA00–1FA6F |
295 | Symbols and Pictographs Extended-A | 1FA70–1FAFF |
296 | Symbols for Legacy Computing | 1FB00–1FBFF |
297 | Unassigned | 1FF80–1FFFF |
298 | CJK Unified Ideographs Extension B | 20000–2A6DD |
299 | CJK Unified Ideographs Extension C | 2A700–2B734 |
300 | CJK Unified Ideographs Extension D | 2B740–2B81D |
301 | CJK Unified Ideographs Extension E | 2B820–2CEA1 |
302 | CJK Unified Ideographs Extension F | 2CEB0–2EBE0 |
303 | CJK Compatibility Ideographs Supplement | 2F800–2FA1D |
304 | Unassigned | 2FF80–2FFFF |
305 | CJK Unified Ideographs Extension G | 30000–3134A |
306 | Unassigned | 3FF80–3FFFF |
307 | Unassigned | 4FF80–4FFFF |
308 | Unassigned | 5FF80–5FFFF |
309 | Unassigned | 6FF80–6FFFF |
310 | Unassigned | 7FF80–7FFFF |
311 | Unassigned | 8FF80–8FFFF |
312 | Unassigned | 9FF80–9FFFF |
313 | Unassigned | AFF80–AFFFF |
314 | Unassigned | BFF80–BFFFF |
315 | Unassigned | CFF80–CFFFF |
316 | Unassigned | DFF80–DFFFF |
317 | Tags | E0000–E007F |
318 | Variation Selectors Supplement | E0100–E01EF |
319 | Unassigned | EFF80–EFFFF |
320 | Supplementary Private Use Area-A | FFF80–FFFFF |
321 | Supplementary Private Use Area-B | 10FF80–10FFFF |
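
A table like the one above can be used to find which block a character belongs to. The following Python sketch is one possible approach, not an official lookup: it copies only a handful of rows from the table, and the names `BLOCKS`, `STARTS`, and `block_name` are hypothetical helpers introduced for illustration. It uses a binary search over the sorted start values.

```python
from bisect import bisect_right

# A few (start, end, name) rows copied from the table; deliberately not the full list.
# Rows must stay sorted by start value for the binary search to work.
BLOCKS = [
    (0x0000, 0x007F, "C0 Controls and Basic Latin"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x3040, 0x309F, "Hiragana"),
    (0x4E00, 0x9FFC, "CJK Unified Ideographs"),
    (0xAC00, 0xD7AF, "Hangul Syllables"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [start for start, _, _ in BLOCKS]


def block_name(char):
    """Return the block name for a single character, or None if it is not listed."""
    code_point = ord(char)
    # Index of the last row whose start is <= code_point.
    index = bisect_right(STARTS, code_point) - 1
    if index >= 0:
        _, end, name = BLOCKS[index]
        if code_point <= end:
            return name
    return None


print(block_name("A"))           # Basic Latin letter
print(block_name("\u3042"))      # Hiragana letter A
print(block_name("\U0001F600"))  # an emoji in the Emoticons range
print(block_name("\u0100"))      # Latin Extended-A is not in this short list
```

Loading every row of the table into `BLOCKS` would make the lookup complete; the search logic itself would not need to change.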