
Java Chinese text analysis: Java Chinese encoding (character set) analysis, causes of garbled Chinese text and solutions


娉細鏈枃閮ㄥ垎鍐呭鎽樿嚜缃戠粶锛屾憳鎶勫唴瀹圭増鏉冨綊鍘熶綔鑰呮墍鏈夈€?

1. Background

1.1. The HTTP protocol

1.1.1. URL and URI

1.1.2. Media type definition

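HTTP uses Internet media types [17] in the Content-Type (section 14.17) and Accept (section 14.1) header fields in order to provide open and extensible data typing and type negotiation.

media-type = type "/" subtype *( ";" parameter )

type = token

subtype = token

Parameters MAY follow the type/subtype in the form of attribute/value pairs (as defined in section 3.6). The type, subtype and parameter attribute names are case-insensitive. Parameter values may or may not be case-sensitive, depending on the semantics of the parameter name. Linear white space (LWS) MUST NOT be used between the type and subtype, nor between an attribute and its value. The presence or absence of a parameter may be significant to the processing of a media-type, depending on its definition in the media type registry.

Note that some older HTTP applications do not recognize media type parameters. When sending data to older HTTP applications, implementations SHOULD use media type parameters only when they are required by the definition of that type/subtype.

Media-type values are registered with the Internet Assigned Numbers Authority (IANA [19]). The media type registration process is outlined in RFC ...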

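1.1.3. Directives related to character sets

In JSP, servlets and HTML, these directives tell IE which character set to use when parsing the byte stream. In other words, they determine the default page encoding of the HTML page in IE.

In a JSP file, contentType specifies the encoding of the transmitted content.

In a Java servlet, the encoding is set on the response object, as follows: response.setContentType("text/html;charset=GBK");

In HTML, it is set through a meta element.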

2. Encoding of the character "火"

The character "火":

UTF-8 encoding: 0xE781AB

GBK encoding: 0xBBF0

GB2312 encoding: 0xBBF0
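
The byte values above can be checked with a few lines of Java. This is only a minimal sketch: the class name HuoBytes and the helper hex are made up for illustration, and it assumes the JDK in use ships the GBK charset (standard JDK builds do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class HuoBytes {
    /** Formats a byte array as upper-case hex, two digits per byte. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+706B is the character for "fire"; the escape keeps this source file pure ASCII.
        String huo = "\u706B";
        System.out.println("UTF-8: 0x" + hex(huo.getBytes(StandardCharsets.UTF_8))); // 0xE781AB
        System.out.println("GBK:   0x" + hex(huo.getBytes(Charset.forName("GBK")))); // 0xBBF0
    }
}
```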

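3. Entering an address in the address bar and submitting it

3.1. Settings involved

IE Options > Advanced: [screenshot]

3.2. "Send URLs as UTF-8" checked, no Chinese parameter

IE Options > Advanced: [screenshot]

The address contains Chinese, for example: [screenshot]

The browser first escapes the Chinese character "火" according to its UTF-8 encoding, as follows: [screenshot]

It then encodes this address as ISO-8859-1. The byte stream submitted by the browser (binary encoding) is as follows: [screenshot]

Binary encoding of the server's response stream: [screenshot]

3.3. "Send URLs as UTF-8" checked, with a Chinese parameter

IE Options > Advanced: [screenshot]

The address is submitted either through a js call or by typing it into the address bar.

The browser first escapes the "火" in the address part of the URL according to UTF-8; the "火" in the parameter after "?" is encoded according to the page encoding (here GBK): [screenshot]

The process is as follows. The URL address is processed first: Chinese characters are converted to the % form, and the escaped string becomes the new address string. The address and parameters are then turned into a byte stream, with the address encoded as ISO-8859-1 and the parameters encoded as GBK (the page encoding).

When the page character set is UTF-8, the submitted byte stream is as follows: [screenshot]

The parameter is encoded as UTF-8.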

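3.4. "Send URLs as UTF-8" not checked

IE Options > Advanced: [screenshot]

The address contains Chinese, for example: [screenshot]

The submitted string contains the character "火" directly.

Byte stream submitted by the browser (binary encoding): [screenshot]

The encoding bbf0 of "火" is its GBK encoding.

Binary encoding of the server's response stream: [screenshot]

The process in this case is:

The browser sends "火" to the server as the GBK byte stream bbf0. By default the server parses this byte stream as ISO-8859-1 and then processes it. The purpose of the parsing is to convert the data into the internal UTF-16 representation. Note that this conversion works character by character: a raw byte stream has no meaning on its own, and only after the bytes are converted to characters can the corresponding UTF-16 code be found. Put differently, the code value of each ISO-8859-1 character is converted to the corresponding code value in UTF-16. After processing, the server converts the result to a byte stream using UTF-8 (c2bbc3b0) and sends it to the client. Note that the conversion to UTF-16 is lossless. If, however, information is damaged when GBK data is converted to ISO-8859-1, that damage cannot be undone.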

缁撹锛?

涓嶉€変腑鏃訛紝URL鍦闆潃涓殑姹夊瓧锛屾寜鐓ч〉闈㈠瓧绗﹂泦杩涜缂栫爜銆?

3.5.聽 聽 缁撹

鍦闆潃鏍忎腑杈撳叆鐨刄RL鍦闆潃瀛楃涓詫紝鍦ㄦ彁浜ょ粰鏈嶅姟鍣ㄧ鏃訛紝瀵筓RL鍒嗘垚涓ゅ潡鍗曠嫭缂栫爜锛涒€滐紵鈥濆墠鐨勫湴鍧€涓轟竴閮ㄥ垎锛涒€滐紵鈥濆悗鐨勫弬鏁頒負涓€閮ㄥ垎銆?

鍦闆潃閮ㄥ垎锛屽湪鈥滀互UTF-8鈥濋€変腑鏃訛紝鍦闆潃閮ㄥ垎鍑虹幇鐨勫瀛楄妭瀛楃锛屾寜鐓TF-8杞爜锛涘鏋滄湭閫変腑锛屾寜鐓ф搷浣滅郴缁熼粯璁ゅ瓧绗﹂泦杩涜缂栫爜銆?

鍙傛暟閮ㄥ垎锛屽拰鈥滀互UTF-8鈥濆弬鏁版棤鍏籌紝鎸夌収鎿嶄綔绯葷粺榛樿瀛楃闆嗚繘琛岀紪鐮併€?

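4. Submitting an address through js

4.1. Submitting the URL address without escaping

Without escaping, the behaviour is the same as for a GET submission. For example:

Submitted URL. Original URL: [screenshot]

The page encoding is UTF-8.

URL as submitted to the web server: [screenshot]

Looking at the bytes after userid, 0xe781..., the encoding is UTF-8, which matches the page character set.

Conclusion: the behaviour is the same as typing the URL directly into the address bar.

4.2. Submitting the URL address with escaping

To escape the URL address, call encodeURI or encodeURIComponent. These functions encode the non-ASCII characters (see the description of encodeURI) in the whole URL address, including the parameters. The character set used is always UTF-8 (specifying another character set has no effect), and in the URL the result appears as % followed by a hexadecimal character sequence.

Original URL: [screenshot]

URL after escaping: [screenshot]

URL as submitted to the web server (intercepted): [screenshot]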

4.3.聽 聽 杞爜鎻愪氦鍚庤В鏋愯繃绋?

鍦ㄤ笂杩頒袱绉嶆儏鍐典腑锛屽簲鐢ㄦ湇鍔″櫒榛樿缂栫爜涓篒SO-8859-1锛屾湇鍔″櫒涓嶈兘姝g‘瑙f瀽璇RI銆傜粨鏋滃涓嬶細

鏈漿鐮侊細

杞爜锛?

瑙f瀽杩囩▼

搴旂敤鏈嶅姟鍣?Tomcat)瀵瑰湴鍧€涓殑%E7%81%AB杩涜瑙f瀽杩囩▼涓猴紝鍘繪帀%鍙鳳紝灏?鍚庣殑瀛楃浣滀負HEX鏁幫紝鏍規嵁Tomcat璁劇疆鐨勫瓧绗﹂泦缈昏瘧涓哄瓧绗︺€俆omcat榛樿瀛楃闆嗕負ISO-8859-1锛屼嬌鐢↖SO-8859-1瀵笶7锛?1缈昏瘧涓哄瓧绗︼紝鍙戠幇E7锛?1瓒呭嚭ISO-8859-1鐨勪唬鐮佺┖闂達紝涓嶈瘑鍒瀛楃锛岀炕璇戜負鈥滐紵鈥?鍗?x3f)銆?
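
The effect of the configured character set on percent-decoding can be seen with java.net.URLDecoder. This is a sketch of the idea only, not Tomcat's actual decoding code; the class name and the decode wrapper are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class PercentDecodeDemo {
    /** Percent-decodes s, interpreting the decoded bytes in the given charset. */
    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String escaped = "%E7%81%AB";
        // With UTF-8 the three bytes form one character, U+706B.
        System.out.println(decode(escaped, "UTF-8").length());      // 1
        // With ISO-8859-1 each byte becomes its own character: U+00E7, U+0081, U+00AB.
        System.out.println(decode(escaped, "ISO-8859-1").length()); // 3
    }
}
```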

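Solutions: (1) set the application server's character set to UTF-8; (2) handle the decoding of multi-byte sequences in the application code.

Method (1): add the attribute URIEncoding="UTF-8" to the Connector element in server.xml.

Method (2): escape twice on the client with encodeURI(encodeURI(url)). The URL string then becomes: [screenshot]

See the example below. Note that this method only works for the parameter part of the URL string; the address part of the URI cannot be recovered this way.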

4.4.聽 聽 杞爜鎻愪氦瑙f瀽鏍蜂緥

4.4.1.聽 鐜璁劇疆

鐜璁劇疆

鐩爣鏂囦歡鎵€鍦ㄧ洰褰曪細urlencode\鐏玕urlencode-2.jsp

IE椤甸潰瀛楃闆嗭細UTF-8銆?

IE閫夐」锛氶€変腑鈥滀互UTF-8鍙戦€佲€濄€?

request缂栫爜璁劇疆锛?

4.4.2.聽 搴旂敤鏈嶅姟鍣ㄨ缃甎TF-8瀛楃闆?

璁劇疆URIEncoding涓篣TF-8聽 Tomcat閰嶇疆鏂囦歡璁劇疆URI瑙f瀽瀛楃闆喡?聽 聽杩囩▼聽 閫氳繃js璇鋒眰鐨刄RL锛毬?聽 鎻愪氦鍒版湇鍔″櫒绔殑鍦闆潃涓?鎴幏)锛毬?聽 缁撴灉锛氭垚鍔熴€偮?鎴愬姛椤甸潰鏄劇ず鐨勫湴鍧€锛毬?聽 聽 聽 缁撹锛毬?Tomcat搴旂敤鏈嶅姟鍣ㄨ缃殑瀛楃闆嗗URI杩涜瑙g爜銆?

4.4.3.聽 浜屾杞爜鍜屼簩娆¤В鐮?

URL鍦闆潃涓笉鍖呭惈涓枃

璇鋒眰鐨刄RL聽 聽 聽IE缂栫爜鍚庢彁浜ゆ湇鍔″櫒绔殑瀛楃涓猜?聽 聽鐩存帴浠巖equest涓幏鍙栫殑鍊悸?聽 聽閫氳繃URLDecode绫昏В鐮佸悗鐨勫€?

URL鍦闆潃涓寘鍚腑鏂?

璇鋒眰鐨刄RL聽 聽 聽IE缂栫爜鍚庢彁浜ゆ湇鍔″櫒绔殑瀛楃涓猜?聽 聽澶辮觸椤甸潰

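4.5. Conclusion

When a URL string submitted through js is sent to the server, the URL is split into two parts that are encoded separately: the address before "?" is one part, and the parameters after "?" are the other.

Address part: when "Send URLs as UTF-8" is checked, multi-byte characters in the address part are escaped according to UTF-8; when it is not checked, they are encoded according to the page's default character set.

Parameter part: this is unaffected by the "Send URLs as UTF-8" option and is encoded according to the page's default character set.

5. Server-side parsing

Parsing the URI

Parsing the URI covers two parts, the address part and the parameter part. The application server parses both the address part and the parameter part of the URI (that is, the data submitted by GET) using the configured character set.

Note that double encoding and double decoding can only recover the parameters, not the address part of the URI.

Parsing the submitted content

Data submitted by GET is parsed with the application server's character set. Calling request.setCharacterEncoding() in the application has no effect on it.

Data submitted by POST is parsed with the character set given by request.setCharacterEncoding(). The default is Java's default character set, which in turn is the operating system's default character set.

6. Characters allowed in a URI

ASCII letters, digits, reserved characters and mark characters.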

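7. Conclusions

7.1. URL typed into the address bar: encoding and decoding

Address part of the URL (before "?"):

Encoding: when "Send URLs as UTF-8" is checked, characters that are not single-byte are escaped into their UTF-8 byte sequence. For example, a Chinese character is escaped into a form such as %E7%81%AB, where each pair of hex digits is one byte of the character's UTF-8 encoding. The whole URL is then sent to the server encoded as ISO-8859-1, and the server decodes it as ISO-8859-1. When "Send URLs as UTF-8" is not checked, the URL address is encoded according to the operating system's default character set.

Decoding: the server decodes as ISO-8859-1 by default; the application server's URI parsing character set can be changed manually.

Parameters after the URL address (the part after "?"):

Encoding: according to the operating system's default character set.

Decoding: in Tomcat, the server decodes as ISO-8859-1 by default and does not use the character set given by request.setCharacterEncoding().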

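7.2. URL submitted through js: encoding and decoding

Address part of the URL (before "?"):

Encoding: when "Send URLs as UTF-8" is checked, the behaviour is the same as for a URL typed into the address bar. If js has escaped the URL, the URL string already conforms to the URL specification. When "Send URLs as UTF-8" is not checked, the URL address is encoded according to the IE page's default character set.

Decoding: the server decodes as ISO-8859-1 by default; the application server's URI parsing character set can be changed manually. The % byte sequences are converted to characters.

Parameters after the URL address (the part after "?"):

Encoding: according to the IE page's default character set. If js has escaped the URL, the URL string already conforms to the URL specification.

Decoding: in Tomcat, the server decodes as ISO-8859-1 by default and does not use the character set given by request.setCharacterEncoding(). The % byte sequences are converted to characters.

7.3. GET content: encoding and decoding

This refers to the part of the URL after "?".

The submitted content is encoded with the page's character set; see the section "Send URLs as UTF-8" checked, with a Chinese parameter.

Escaping: if encodeURI is called to escape the URL, the server has to perform the matching decoding when it receives the request (see the section on submitting the URL address with escaping). There are two ways to do this. The first is to set the URI character set to UTF-8 on the Tomcat side. The second keeps Tomcat's default character set (ISO-8859-1): the client escapes twice, Tomcat decodes once when it receives the request, and the application code decodes a second time with the URLDecoder class.

Encoding: if the address is submitted directly without escaping, the parameters are encoded with the page character set.

Decoding: the parameters are parsed with the default character set configured in the application server (Tomcat). request.setCharacterEncoding() has no effect.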

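7.4. POST content: encoding and decoding

Encoding: the submitted content is encoded with the page's character set.

Decoding: the server decodes according to the call

request.setCharacterEncoding("GBK");

7.5. The IE "Send URLs as UTF-8" option

The option applies to: 1) URL strings typed directly into the address bar; 2) URL strings submitted from js. It does not apply to the part after "?".

The address part of the URL (the part before "?") is encoded as UTF-8, with the same effect as encodeURI. The parameter part after "?" is not escaped.

7.6. Setting the Tomcat URIEncoding parameter

This setting in the Tomcat configuration file defines the character set used to parse URIs.

It applies to the whole submitted URI (including the parameter part, but not the IP address and port).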

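8. URI escaping functions

8.1. The encodeURI / decodeURI functions

Definition of "escaping"

A program is itself written with a given set of characters. When a program processes characters that also make up the program, or characters it cannot recognize, it will by default treat them as part of the program itself rather than as content to be processed. Escaping is needed in this situation: the character following an escape character no longer has its literal meaning. A familiar example is the newline "\n": "\" is the escape character, and the following "n" no longer means the letter n but a line break.

In the same way, "%" in a URL indicates that what follows is a character code.

encodeURI()

Escapes characters in a URI.

Synopsis: encodeURI(uri)

Argument: uri, a string containing a URI or other text to be encoded.

Return value: a copy of uri in which certain characters have been replaced by hexadecimal escape sequences.

Throws: an error indicating that uri contains malformed Unicode surrogate pairs and cannot be encoded.

Description: encodeURI() is a global function that returns an encoded copy of uri. ASCII letters and digits are not encoded, and neither are the ASCII punctuation marks (ASCII mark characters).

Because the purpose of encodeURI() is to encode a complete URI, the reserved characters that have special meaning in a URI are not escaped either.

Apart from these four kinds of characters (ASCII letters, digits, reserved characters and mark characters), every other character in uri is converted to its UTF-8 encoding and escaped.

decodeURI()

8.2. encodeURIComponent / decodeURIComponent

8.3. Difference between the two

In short: encodeURI does not encode reserved characters such as ":", "/", ";" and "?", whereas encodeURIComponent does.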
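
The difference can be illustrated with a small Java emulation. This is only a sketch, not the browser's implementation: the class UriEscapeSketch and its methods are hypothetical, the sets of unescaped characters follow the ECMAScript definition of the two functions, and unlike the JavaScript functions it does not raise an error for malformed surrogate pairs.

```java
import java.nio.charset.StandardCharsets;

final class UriEscapeSketch {
    private static final String ALNUM =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final String MARKS = "-_.!~*'()";
    private static final String RESERVED = ";/?:@&=+$,#";

    /** Percent-escapes every UTF-8 byte of s that is not an ASCII character listed in keep. */
    private static String escape(String s, String keep) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (b >= 0 && keep.indexOf((char) b) >= 0) {
                sb.append((char) b);
            } else {
                sb.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return sb.toString();
    }

    /** Keeps reserved characters, as encodeURI does. */
    static String likeEncodeURI(String s) {
        return escape(s, ALNUM + MARKS + RESERVED);
    }

    /** Escapes reserved characters too, as encodeURIComponent does. */
    static String likeEncodeURIComponent(String s) {
        return escape(s, ALNUM + MARKS);
    }

    public static void main(String[] args) {
        String input = "a/b?x=\u706B";
        System.out.println(likeEncodeURI(input));          // a/b?x=%E7%81%AB
        System.out.println(likeEncodeURIComponent(input)); // a%2Fb%3Fx%3D%E7%81%AB
    }
}
```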

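9. Example

Environment

js escapes the URL (by calling encodeURI) and submits it.

The character set of the IE page is GBK.

Data submitted in the URL address: key userid, value "火".

Data submitted by the form: key userid1, value "火".

That is:

The URL address is encoded as UTF-8 and submitted.

The GET parameter is encoded as UTF-8 and submitted.

The POST parameter is encoded as GBK and submitted.

1) GET data: the intercepted submission string is: [screenshot]

This shows that the data submitted by GET is escaped into a hexadecimal UTF-8 sequence.

2) POST data: [screenshot]

From the byte values it can be seen that the POST data is GBK encoded.

With Tomcat using its default encoding (ISO-8859-1), the page is not found. The result is as follows: [screenshot]

With Tomcat using UTF-8:

Tomcat parses the URI byte stream as UTF-8.

When the request is set to the GBK character set, the POST data is recognized. The code that sets it is as follows: [screenshot]

The result is as follows: [screenshot]

If the request encoding is not set explicitly, the text is garbled, as follows: [screenshot]

The character set used to decode userid is UTF-8, that is, the character set given by URIEncoding in Tomcat.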

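10. Processing flow

Preparation: the URL to be submitted is http://localhost/urlencode/火/urlencode-2.jsp?userid='火'

Step 1: enter the URL address and obtain the URL string

There are two ways:

Typing the URL string directly into the address bar.

Supplying the URL string through a js call. In this case the URL string can be escaped (by calling encodeURI / encodeURIComponent, which convert characters outside the URL rules into hexadecimal character sequences), and the escaped string becomes the final URL string, as follows: [screenshot]

Step 2: IE encodes the URL string to be submitted

IE first checks whether "Send URLs as UTF-8" is checked. If it is, the address part of the URL string (not the parameters after "?") is escaped. If it is not, the string is left unchanged.

The URL string as a whole is then encoded with the page's character set.

Step 3: the request is submitted to the server over HTTP

Step 4: the server decodes the request

Server-side decoding has two parts:

Part one: the application server parses the URL.

The application server receives the byte sequence of the request and decodes the byte stream with the configured character set (ISO-8859-1 by default in Tomcat). Note: when it meets "%", it treats the characters after "%" as a hexadecimal byte sequence and converts it to the corresponding string using the configured character set.

Note: if the client calls encodeURI to escape the URL string, the application server's character set should be set to UTF-8 so that it matches the client. If it stays at the default ISO-8859-1, the application has to decode the submitted UTF-8 characters itself, and the IE client then needs to escape the URL twice (encodeURI(encodeURI(url))).

Part two: the application server parses the submitted content.

The application server reads the character set given by request.setCharacterEncoding() and uses it to decode the submitted content, which includes POST content and content submitted by GET.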

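11. HTML

The meta element specifies the character set of the page content and is bound to the HTTP response header information. If the JSP also sets it, the JSP setting takes precedence.

12. JSP character set

The page directive indicates that the JSP file is sent to the client as a byte stream in GBK format.

Consider the case where the JSP source file is saved as GBK but the page directive declares utf-8.

Example:

Chinese character: 火.

12.1. The complete JSP file

[The JSP listing is not preserved; only its page title, "New Document", survives.]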

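12.2. Saved as GBK, contentType: GBK, pageEncoding: none

The file is saved in ANSI/GBK format.

IE displays the page correctly.

In the Java file generated from this JSP, the character "火" is displayed correctly, as follows: [screenshot]

Opened in binary, the character is shown as follows: [screenshot]

(the UTF-8 encoding of "火")

According to the character mapping, the Java file is saved in UTF-8.

12.3. Saved as GBK, contentType: UTF-8, pageEncoding: none

In the JSP file, as follows: [screenshot]

Save the JSP file in GBK format, deploy it to Tomcat and open it in IE. The page looks like this: [screenshot]

The text is garbled.

In the Java file generated from this JSP, the character "火" is garbled. Opened in binary, the character is shown as follows: [screenshot]

This result matches what reading a GBK byte stream as UTF-8 produces. In other words, the bytes of "火" were stored on disk in GBK, and the application server parsed those bytes as UTF-8.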
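
What happens in 12.3 can be imitated in plain Java, as a sketch under the assumption that the container simply decodes the file bytes with the declared charset. The class and method names below are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WrongPageEncodingDemo {
    /** Decodes file bytes with the charset the reader assumes the file uses. */
    static String readAs(byte[] fileBytes, Charset assumed) {
        return new String(fileBytes, assumed);
    }

    public static void main(String[] args) {
        byte[] onDisk = {(byte) 0xBB, (byte) 0xF0}; // U+706B saved as GBK

        String wrong = readAs(onDisk, StandardCharsets.UTF_8);
        String right = readAs(onDisk, Charset.forName("GBK"));

        // bb f0 is not valid UTF-8, so the decoder substitutes U+FFFD replacement characters.
        System.out.println(wrong.indexOf('\uFFFD') >= 0); // true
        System.out.println(right.equals("\u706B"));       // true
    }
}
```

Once the replacement character has been substituted, the original bytes are gone, which is why the generated Java file in this case cannot be repaired afterwards.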

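12.4. Saved as GBK, contentType: UTF-8, pageEncoding: GBK

Add the header: [screenshot]

Save the JSP file in GBK format, deploy it to Tomcat and open it in IE. The page is displayed correctly, and IE reports the encoding as UTF-8.

HTTP response header information: [screenshot]

In the Java file generated from this JSP, the character "火" is displayed correctly, as follows: [screenshot]

Opened in binary, the character is shown as follows: [screenshot]

According to the character mapping, the Java file is saved in UTF-8.

In addition, after changing the header to [screenshot], the page is still displayed correctly.

The HTTP header then shows: [screenshot]

In the Java file generated from this JSP, the character "火" is displayed correctly, as follows: [screenshot]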

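12.5. Steps

Loading the JSP file

The JSP file is read from disk, giving the file's binary stream.

The application server decodes the JSP file using the specified character set.

Generating the servlet

The JSP engine is called to generate the servlet file, which is encoded with the specified character set and saved to disk.

The servlet is compiled into a class file.

Generating the response output stream

The servlet's class file is loaded according to the specified character set.

The output stream is generated according to the character set specified by the servlet.

IE receives the output stream

The IE client receives the byte stream and generates the HTML page according to the specified character set.

12.6. Conclusion

After the application server compiles a JSP file into a Java file, the Java file is saved in UTF-8.

When the application server reads the JSP file, the character set used for reading is determined by the JSP pageEncoding directive. If that directive is not set, the value of the contentType attribute is used by default.

In the contentType attribute, the MIME type (text/html) tells the browser in what format to display the content, that is, which application or character set to use; the charset sets the encoding of the response's output byte stream, the byte stream returned to the client.

The application server binds the content of this directive to the HTTP response header, and the browser decodes according to the response header information.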

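13. Character sets on the Java platform

The char type uses the UTF-16 encoding format. Strings are made of chars, and String is the tool the platform provides for working with characters. String is the foundation, and char is the element it is built from.

The encoding of Java source files is determined by the local operating system. When javac reads a source file, it uses the local operating system's character set by default.

javac compiles class files using the Unicode character set, encoded in UTF-8 format.

At run time the JVM uses the Unicode character set in the UTF-16 encoding format.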
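
A short sketch of what "UTF-16 in memory" means in practice follows. The class name is invented; U+20000 is just one example of a character outside the Basic Multilingual Plane.

```java
import java.nio.charset.StandardCharsets;

final class Utf16InMemory {
    /** Formats a byte array as lower-case hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String huo = "\u706B";
        System.out.println(huo.length());                                 // 1 char
        System.out.println(hex(huo.getBytes(StandardCharsets.UTF_16BE))); // 706b

        // A supplementary character needs two chars (a surrogate pair) in UTF-16.
        String supplementary = new String(Character.toChars(0x20000));
        System.out.println(supplementary.length());                                  // 2
        System.out.println(supplementary.codePointCount(0, supplementary.length())); // 1
    }
}
```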

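14. Tomcat's default encoding for processing content

In Tomcat 6.0 the character set is set with URIEncoding="UTF-8".

15. A study of character set encoding problems (excerpted from the web)

15.1. Overview

This article covers the following topics: encoding basics, Java, system software, URLs, tools and so on.

The descriptions below use the two characters "中文" as the example. Looking them up in the tables shows that their GB2312 encoding is "d6d0 cec4", their Unicode encoding is "4e2d 6587", and their UTF encoding is "e4b8ad e69687". Note that these two characters have no iso8859-1 encoding, but they can still be "represented" using iso8859-1.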

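15.2. Encoding basics

The earliest encoding was iso8859-1, which is similar to ASCII. To make it easier to represent all kinds of languages, many standard encodings gradually appeared. The important ones are as follows.

15.2.1. iso8859-1

This is a single-byte encoding that can represent at most the character range 0-255 and is used for English and related languages. For example, the letter 'a' is encoded as 0x61 = 97.

Clearly, the range of characters iso8859-1 can represent is narrow, and it cannot represent Chinese characters. However, because it is a single-byte encoding and matches the computer's most basic unit of representation, iso8859-1 is still often used as the representation, and many protocols use it by default. For example, although the two characters "中文" have no iso8859-1 encoding, take their gb2312 encoding: it should be the two characters "d6d0 cec4", but when iso8859-1 is used they are split into four bytes, "d6 d0 ce c4" (in fact, storage is also handled byte by byte). With UTF encoding they would be six bytes, "e4 b8 ad e6 96 87". Clearly, this way of representing text still relies on another encoding underneath.

15.2.2. GB2312/GBK

These are the Chinese national standard codes, designed specifically for Chinese characters. They are double-byte encodings, while English letters are the same as in iso8859-1 (they are compatible with iso8859-1 for those letters). GBK can represent both traditional and simplified characters, whereas gb2312 can only represent simplified characters. GBK is compatible with gb2312.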

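15.2.3. unicode

This is the most unified encoding and can represent the characters of all languages. It is a fixed-length double-byte encoding (there is also a four-byte form), and that includes English letters. So it can be said that it is not compatible with iso8859-1, nor with any other encoding. However, relative to iso8859-1, unicode simply adds a 0 byte in front; for example, the letter 'a' is "00 61".

It should be noted that fixed-length encodings are convenient for computers to process (note that GB2312/GBK are not fixed-length), and unicode can represent all characters, so much software uses unicode internally, Java for example.

15.2.4. UTF

Unicode is not compatible with iso8859-1 and tends to take more space, because even English letters need two bytes. This makes unicode inconvenient for transmission and storage, and the utf encoding was created as a result. The utf encoding is compatible with ASCII (the 7-bit subset of iso8859-1) and can also represent the characters of all languages. It is, however, a variable-length encoding: in the original design a character takes from 1 to 6 bytes, and the current UTF-8 standard (RFC 3629) limits this to 4 bytes. The utf encoding also has a simple built-in validity check. In general, English letters take one byte and Chinese characters take three bytes.

Note that although utf is said to exist in order to use less space, that is only relative to unicode. If the text is known to be Chinese, GB2312/GBK is undoubtedly the most economical. On the other hand, even though utf uses three bytes per Chinese character, utf still saves space compared with unicode for Chinese web pages, because web pages contain a great many English characters.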
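
The space comparison can be made concrete with a tiny measurement. This is only an illustrative sketch: the class name and the sample markup string are made up, and "unicode" in the text is taken to mean two-byte UTF-16 without a byte order mark.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodedSizes {
    /** Number of bytes needed to store s in the given charset. */
    static int size(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Seven ASCII markup characters around the two Chinese characters U+4E2D U+6587.
        String html = "<p>\u4e2d\u6587</p>";
        System.out.println(size(html, Charset.forName("GBK")));    // 7 + 2*2 = 11
        System.out.println(size(html, StandardCharsets.UTF_8));    // 7 + 2*3 = 13
        System.out.println(size(html, StandardCharsets.UTF_16BE)); // 9 * 2   = 18
    }
}
```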

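15.3. How Java handles characters

In Java applications, character set encoding comes up in many places. Some of them need to be configured correctly, and some need a certain amount of processing.

15.3.1. getBytes(charset)

This is a standard Java string function. It encodes the characters of the string according to charset and returns the result as bytes. Note that strings in Java memory are always stored in unicode. Take "中文": under normal circumstances (that is, when nothing has gone wrong) it is stored as "4e2d 6587". If charset is "gbk", it is encoded as "d6d0 cec4" and the bytes "d6 d0 ce c4" are returned. If charset is "utf8", the result is "e4 b8 ad e6 96 87". If it is "iso8859-1", the characters cannot be encoded and the result is "3f 3f" (two question marks).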

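15.3.2. new String(charset)

This is another standard Java string function and does the opposite of the previous one: it combines and interprets a byte array according to charset and stores the result as unicode. Continuing the getBytes example above, "gbk" and "utf8" both give the correct result "4e2d 6587", but with iso8859-1 the result becomes "003f 003f" (two question marks).

Because utf8 can represent and encode all characters, new String(str.getBytes("utf8"), "utf8") equals str, that is, the conversion is fully reversible.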
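
A quick round-trip check, as a sketch with a hypothetical helper name. Note that Java compares string contents with equals(), not with an operator such as ===.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {
    /** Encodes s in the given charset and decodes the bytes again with the same charset. */
    static String roundTrip(String s, Charset charset) {
        return new String(s.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String zhongwen = "\u4e2d\u6587";
        // UTF-8 can encode every character, so the round trip is lossless.
        System.out.println(roundTrip(zhongwen, StandardCharsets.UTF_8).equals(zhongwen)); // true
        // ISO-8859-1 cannot encode these characters; they come back as two question marks.
        System.out.println(roundTrip(zhongwen, StandardCharsets.ISO_8859_1));             // ??
    }
}
```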

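15.3.3. setCharacterEncoding()

This function sets the encoding of the HTTP request or response.

For a request, it refers to the encoding of the submitted content. Once it is specified, getParameter() returns the correct string directly. If it is not specified, iso8859-1 is used by default and further processing is needed; see "Form input" below. Note that getParameter() must not be called before setCharacterEncoding() is executed. The Java doc states: "This method must be called prior to reading request parameters or reading input using getReader()". Moreover, the setting only takes effect for the POST method and has no effect on GET. The likely reason is that when the first getParameter() is executed, Java parses all of the submitted content with the current encoding, and later getParameter() calls do not parse again, so a later setCharacterEncoding() has no effect. For a form submitted by GET, the submitted content is in the URL and has already been parsed with an encoding at the very start, so setCharacterEncoding() naturally has no effect.

For a response, it specifies the encoding of the output content. The setting is also passed on to the browser, telling it which encoding the output uses.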

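15.3.4. Processing flow

The following two representative examples show how Java deals with encoding-related problems.

15.3.4.1. Form input

User input -> (gbk: d6d0 cec4) -> browser -> (gbk: d6d0 cec4) -> web server -> iso8859-1 (00d6 00d0 00ce 00c4) -> class. The class has to process the value: getBytes("iso8859-1") gives d6 d0 ce c4, new String(..., "gbk") gives d6d0 cec4, which in memory, in unicode, is 4e2d 6587.

- The encoding of the user's input depends on the encoding specified by the page and also on the user's operating system, so it is not fixed. The example above uses gbk.

- From browser to web server, the character set used for the submitted content can be specified in the form; otherwise the encoding specified by the page is used. If parameters are entered directly in the URL after "?", their encoding is usually that of the operating system itself, because the page is not involved. The example above again uses gbk.

- The web server receives a byte stream. By default (getParameter) it handles it as iso8859-1, which gives an incorrect result, so processing is needed. If the encoding has been set in advance (through request.setCharacterEncoding()), the correct result can be obtained directly.

- Specifying the encoding in the page is a good habit; otherwise control may be lost and the correct encoding can no longer be specified.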
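
The recovery step performed in the class can be sketched as follows. The class and method names are invented, and the example assumes, as in the text, that the browser sent GBK bytes and the container decoded them as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class FormInputRecovery {
    /** Undoes an ISO-8859-1 mis-decoding by recovering the raw bytes and decoding them with the real charset. */
    static String recover(String misdecoded, Charset actualCharset) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, actualCharset);
    }

    public static void main(String[] args) {
        byte[] sentByBrowser = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4}; // GBK bytes
        // What the application sees when the container decodes as ISO-8859-1: U+00D6 U+00D0 U+00CE U+00C4.
        String fromGetParameter = new String(sentByBrowser, StandardCharsets.ISO_8859_1);

        String fixed = recover(fromGetParameter, Charset.forName("GBK"));
        System.out.println(fixed.equals("\u4e2d\u6587")); // true
    }
}
```

The trick works because ISO-8859-1 maps every byte value to exactly one character and back, so the original bytes survive the first, wrong decoding.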

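15.3.4.2. File compilation

Suppose the file is saved in gbk, and there are two choices of encoding for compilation: gbk or iso8859-1. The former is the default encoding on Chinese Windows and the latter is the default on Linux; the encoding can of course also be specified at compile time.

Jsp -> (gbk: d6d0 cec4) -> java file -> (gbk: d6d0 cec4) -> compiler read -> unicode (gbk: 4e2d 6587; iso8859-1: 00d6 00d0 00ce 00c4) -> compiler write -> utf (gbk: e4b8ad e69687; iso8859-1: *) -> compiled file -> unicode (gbk: 4e2d 6587; iso8859-1: 00d6 00d0 00ce 00c4) -> class. So saving in gbk and compiling with iso8859-1 gives an incorrect result.

class -> unicode (4e2d 6587) -> System.out / jsp.out -> gbk (d6d0 cec4) -> OS console / browser.

- Files can be saved in many encodings; on Chinese Windows the default is ANSI/gbk.

- When the compiler reads a file it needs to know the file's encoding; if none is specified, the system default is used. Ordinary class source files are usually saved in the system default encoding, so compilation causes no problems. But a JSP file that is edited and saved on Chinese Windows and then deployed to run and compile on English Linux will have problems. This is why the encoding should be specified in the JSP file with pageEncoding.

- When Java compiles, it converts everything to a unified unicode representation for processing and converts to utf when saving.

- When the system outputs characters, it uses the specified encoding. On Chinese Windows, System.out uses gbk. For the response (the browser), the contentType specified in the JSP file header is used, or the encoding can be set on the response directly. The browser is also told the page's encoding. If nothing is specified, iso8859-1 is used. For Chinese, the encoding of the output string should be specified for the browser.

- When the browser displays a page, it first uses the encoding specified in the response (the contentType from the JSP file header also ends up in the response). If that is not specified, it uses the contentType given in the page's meta element.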

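15.3.5. Settings

For web applications, the encoding-related settings and functions are as follows.

15.3.5.1. JSP compilation

This specifies the storage encoding of the file. Clearly, the setting should be placed at the beginning of the file. In addition, for ordinary class files, the encoding can be specified at compile time.

15.3.5.2. JSP output

This specifies the encoding used when the file is output to the browser. This setting should also be placed at the beginning of the file. It is equivalent to response.setCharacterEncoding("GBK").

15.3.5.3. meta setting

This specifies the encoding used by the web page and is especially useful for static pages, because static pages cannot use the JSP settings and cannot execute response.setCharacterEncoding().

If both the JSP output setting and the meta setting are used, the JSP setting takes precedence, because it is reflected directly in the response.

Note that apache has a setting that assigns an encoding to pages that do not specify one. It is equivalent to the JSP way of specifying the encoding, so it overrides the meta setting in static pages. For this reason some people recommend turning it off.

15.3.5.4. form setting

When the browser submits a form, the encoding to use can be specified. This setting is normally unnecessary, because the browser uses the page's encoding directly.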

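15.4. System software

Some related system software is discussed below.

15.4.1. The mysql database

Clearly, to support multiple languages the database encoding should be set to utf or unicode, and utf is better suited to storage. However, if the Chinese data contains very few English letters, unicode is actually more suitable.

The database encoding can be set in the mysql configuration file, for example default-character-set=utf8. It can also be set in the database connection URL, for example: useUnicode=true&characterEncoding=UTF-8. Note that the two should be consistent. In newer versions the setting can be left out of the connection URL, but it must not be set incorrectly.

15.4.2. apache

The apache configuration related to encoding is in httpd.conf, for example AddDefaultCharset UTF-8. As mentioned above, this feature sets the encoding of all static pages to UTF-8, and it is best to turn it off.

In addition, apache has separate modules for handling page response headers, which may also set the encoding.

15.4.3. The default encoding on linux

The linux default encoding referred to here is determined by environment variables at run time. The two important ones are LC_ALL and LANG. The default encoding affects the behaviour of Java's URLEncoder, as described below.

It is recommended to set both to "zh_CN.UTF-8".

15.4.4. Other

To support Chinese file names, linux should be told the character set when mounting a disk, for example: mount /dev/hda5 /mnt/hda5/ -t ntfs -o iocharset=gb2312.

Also, as mentioned above, information submitted by GET does not honour request.setCharacterEncoding(), but the character set can be specified in tomcat's configuration file, server.xml. This method sets all requests uniformly and cannot be applied per page, and it does not necessarily match the encoding the browser uses, so it is sometimes not what is wanted.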

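15.5. URL addresses

Chinese characters in a URL address are troublesome. The case of submitting a form by GET was described earlier; with GET, the parameters are contained in the URL.

15.5.1. URL encoding

The browser automatically encodes certain special characters in a URL. Besides characters such as "/?&", these include unicode characters such as Chinese characters. The encoding in this case is rather special.

IE has an option "Always send URLs as UTF-8". When the option is enabled, IE encodes special characters as UTF-8 and also URL-encodes them. When the option is disabled, the default encoding "GBK" is used and no URL encoding is performed. For the parameters after the URL, however, no URL encoding is ever performed, as if the UTF-8 option were disabled. Take "中文.html?a=中文": with the UTF-8 option enabled, the link sent is "%e4%b8%ad%e6%96%87.html?a=\xd6\xd0\xce\xc4"; with the option disabled, the link sent is "\xd6\xd0\xce\xc4.html?a=\xd6\xd0\xce\xc4". Note that in the latter the leading "中文" takes only 4 bytes, whereas in the former it takes 18 bytes, mainly because of the URL encoding.

When the web server (tomcat) receives the link, it URL-decodes it, that is, removes the "%", and interprets the bytes as ISO8859-1 (as described above, URIEncoding can be used to set a different encoding). The results for the two examples above are "\u00e4\u00b8\u00ad\u00e6\u0096\u0087.html?a=\u00d6\u00d0\u00ce\u00c4" and "\u00d6\u00d0\u00ce\u00c4.html?a=\u00d6\u00d0\u00ce\u00c4" respectively. Note that in the former the leading "中文" has turned back into 6 characters. The "\u" notation indicates unicode.

So, because of differences in client settings, the same link produces different results on the server. Many people have run into this problem, and there is no really good solution. Some sites therefore advise users to try turning off the UTF-8 option. A better approach is described below.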

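15.5.2. rewrite

As those familiar with it know, apache has a powerful rewrite module; its features are not described here. What should be noted is that the module automatically URL-decodes (removes the %), which does part of the work of the web server (tomcat) described above. Some documentation says the [NE] flag can be used to turn this off, but my experiment did not succeed, possibly because of the version (I used apache 2.0.54). In addition, when parameters contain symbols such as "?& ", this feature prevents the system from getting the correct result.

The rewrite module itself seems to work purely on bytes without considering the string's encoding, so it does not introduce encoding problems.

15.5.3. URLEncoder.encode()

This is the URL encoding function provided by Java itself. What it does is similar to what the browser does when the UTF-8 option described above is enabled. It is worth noting that Java has deprecated using this method without specifying an encoding. An encoding should be specified when it is used.

When no encoding is specified, the method uses the system default encoding, which makes the program's results unpredictable. For "中文", for instance, the result is "%D6%D0%CE%C4" when the system default encoding is "gb2312", but "%E4%B8%AD%E6%96%87" when the default is "UTF-8", and later code will have difficulty handling this. The system default encoding referred to here is determined by environment variables such as LC_ALL and LANG at the time tomcat is started. There was once a case where garbled text appeared after tomcat was restarted, and it eventually turned out, frustratingly, that these two environment variables had been changed.

It is recommended to specify "UTF-8" consistently; the related programs may need to be modified.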

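15.5.4. A solution

As mentioned above, because of differences in browser settings the web server receives different content for the same link, and the software has no way of telling the difference, so the protocol is currently flawed in this respect.

For this specific problem, one should neither hope that every client's IE has the UTF-8 option enabled, nor bluntly tell users to change their IE settings; users cannot be expected to remember the settings of every web server. The remaining solution is to make one's own program a little smarter: analyse the content to decide whether the encoding is UTF-8.

Fortunately, UTF-8 encoding is quite regular, so the content of the incoming link can be analysed to determine whether it is valid UTF-8. If it is, it is handled as UTF-8; if not, the client's default encoding (such as "GBK") is used. Below is an example that checks for UTF-8; it is easy to follow once the rules of the encoding are understood.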

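    public static boolean isValidUtf8(byte[] b, int aMaxCount) {
        int lLen = b.length, lCharCount = 0;
        for (int i = 0; i < lLen && lCharCount < aMaxCount; ++lCharCount) {
            byte lByte = b[i++]; // advance now so that i points at the first continuation byte
            if (lByte >= 0) continue; // >= 0 is plain ASCII
            // Java bytes are signed: reject continuation bytes (0x80-0xbf) and 0xfe/0xff as lead bytes
            if (lByte < (byte) 0xc0 || lByte > (byte) 0xfd) return false;
            // number of continuation bytes that must follow this lead byte
            int lCount = lByte >= (byte) 0xfc ? 5 : lByte >= (byte) 0xf8 ? 4
                    : lByte >= (byte) 0xf0 ? 3 : lByte >= (byte) 0xe0 ? 2 : 1;
            if (i + lCount > lLen) return false;
            // each continuation byte must lie in 0x80-0xbf, i.e. below (byte) 0xc0 as a signed value
            for (int j = 0; j < lCount; ++j, ++i) if (b[i] >= (byte) 0xc0) return false;
        }
        return true;
    }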

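Correspondingly, an example that uses the method above is as follows:

    // requires: import java.io.UnsupportedEncodingException;
    public static String getUrlParam(String aStr, String aDefaultCharset)
            throws UnsupportedEncodingException {
        if (aStr == null) return null;
        // recover the raw bytes of a value that the container decoded as ISO-8859-1
        byte[] lBytes = aStr.getBytes("ISO-8859-1");
        // check every byte; fall back to the caller's default charset if the bytes are not UTF-8
        return new String(lBytes,
                StringUtil.isValidUtf8(lBytes, lBytes.length) ? "utf8" : aDefaultCharset);
    }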

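This method also has shortcomings, in the following two respects:

- It does not identify the user's default encoding. This could be inferred from the language of the request, but not reliably, because users sometimes also type Korean or other scripts.

- It may wrongly identify bytes as UTF-8. One example is the two characters "学习", whose GBK encoding is "\xd1\xa7\xcf\xb0"; the isValidUtf8 method above returns true for it. A stricter check could be considered, but it would probably not help much.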

鏈変竴涓緥瀛愬彲浠ヨ瘉鏄巊oogle涔熼亣鍒頒簡涓婅堪闂锛岃€屼笖涔熼噰鐢ㄤ簡鍜屼笂杩扮浉浼肩殑澶勭悊鏂規硶锛屾瘮濡傦紝濡傛灉鍦ㄥ湴鍧€鏍忎腑杈撳叆"http://www.google.com/search?hl=zh-CN&newwindow=1&q=瀛︿範"锛実oogle灏嗘棤娉曟纭瘑鍒紝鑰屽叾浠栨眽瀛椾竴鑸兘澶熸甯歌瘑鍒€?

鏈€鍚庯紝搴旇琛ュ厖璇存槑涓€涓嬶紝濡傛灉涓嶄嬌鐢╮ewrite瑙勫垯锛屾垨鑰呴€氳繃琛ㄥ崟鎻愪氦鏁版嵁锛屽叾瀹炲苟涓嶄竴瀹氫細閬囧埌涓婅堪闂锛屽洜涓鴻繖鏃跺彲浠ュ湪鎻愪氦鏁版嵁鏃舵寚瀹氬笇鏈涚殑缂栫爜銆傚彟澶栵紝涓枃鏂囦歡鍚嶇‘瀹炰細甯︽潵闂锛屽簲璇ヨ皚鎱庝嬌鐢ㄣ€?

15.6.鍏跺畠

涓嬮潰鎻忚堪涓€浜涘拰缂栫爜鏈夊叧鐨勫叾浠栭棶棰樸€?

15.6.1.聽 聽SecureCRT

闄や簡娴忚鍣ㄥ拰鎺у埗鍙頒笌缂栫爜鏈夊叧澶栵紝涓€浜涘鎴風涔熷緢鏈夊叧绯彙€傛瘮濡傚湪浣跨敤SecureCRT杩炴帴linux鏃訛紝搴旇璁㏒ecureCRT鐨勬樉绀虹紪鐮?涓嶅悓鐨剆ession锛屽彲浠ユ湁涓嶅悓鐨勭紪鐮佽缃?鍜宭inux鐨勭紪鐮佺幆澧冨彉閲忎繚鎸佷竴鑷淬€傚惁鍒欑湅鍒扮殑涓€浜涘府鍔╀俊鎭紝灏卞彲鑳芥槸涔辯爜銆?

鍙﹀锛宮ysql鏈夎嚜宸辯殑缂栫爜璁劇疆锛屼篃搴旇淇濇寔鍜孲ecureCRT鐨勬樉绀虹紪鐮佷竴鑷淬€傚惁鍒欓€氳繃SecureCRT鎵цsql璇彞鐨勬椂鍊欙紝鍙兘鏃犳硶澶勭悊涓枃瀛楃锛屾煡璇㈢粨鏋滀篃浼氬嚭鐜頒貢鐮併€?

瀵逛簬Utf-8鏂囦歡锛屽緢澶氱紪杈戝櫒(姣斿璁頒簨鏈?浼氬湪鏂囦歡寮€澶村鍔犱笁涓笉鍙鐨勬爣蹇楀瓧鑺傦紝濡傛灉浣滀負mysql鐨勮緭鍏ユ枃浠訛紝鍒欏繀椤昏鍘繪帀杩欎笁涓瓧绗︺€?鐢╨inux鐨剉i淇濆瓨鍙互鍘繪帀杩欎笁涓瓧绗?銆備竴涓湁瓒g殑鐜拌薄鏄紝鍦ㄤ腑鏂噖indows涓嬶紝鍒涘緩涓€涓柊txt鏂囦歡锛岀敤璁頒簨鏈墦寮€锛岃緭鍏?杩為€?涓や釜瀛楋紝淇濆瓨锛屽啀鎵撳紑锛屼綘浼氬彂鐜頒袱涓瓧娌′簡锛屽彧鐣欎笅涓€涓皬榛戠偣銆?
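
The three marker bytes are the UTF-8 byte order mark, EF BB BF. If a file has to be cleaned up in code rather than in an editor, something along these lines would do; it is only a sketch, and the class and method names are invented.

```java
import java.util.Arrays;

final class BomStripper {
    /** Returns the data without a leading UTF-8 byte order mark (EF BB BF), or unchanged if there is none. */
    static byte[] stripUtf8Bom(byte[] data) {
        boolean hasBom = data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'S', 'Q', 'L'};
        System.out.println(stripUtf8Bom(withBom).length); // 3
    }
}
```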

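15.6.2. Filters

If the encoding needs to be set uniformly, doing so in a filter is a good choice. In the filter class, the encoding can be set for all the relevant requests or responses in one place; see setCharacterEncoding() above. Apache already provides a ready-to-use example of such a class, SetCharacterEncodingFilter.

15.6.3. POST and GET

Clearly, when information is submitted by POST the URL is more readable, and setCharacterEncoding() can conveniently be used to deal with character set problems. But a URL formed by GET expresses the actual content of the page more easily and can also be bookmarked.

From the point of view of consistency, GET is recommended. This requires special handling when the program reads the parameters, without the convenience of setCharacterEncoding(). If rewrite is not involved, the IE UTF-8 problem does not exist, and setting URIEncoding can be considered as a convenient way to read the parameters in the URL.

15.6.4. Converting between simplified and traditional encodings

GBK contains both simplified and traditional characters, which means that the same character, because it has different codes, counts as two different characters in GBK. Sometimes, in order to get complete results, traditional and simplified forms should be unified. One option is to convert all traditional characters in UTF and GBK data to the corresponding simplified characters; BIG5 data should also be converted to the corresponding simplified characters. The result is of course still stored in UTF.

For example, "语言 語言" is represented in UTF as "\xE8\xAF\xAD\xE8\xA8\x80 \xE8\xAA\x9E\xE8\xA8\x80"; after simplified/traditional conversion it should become two identical sequences "\xE8\xAF\xAD\xE8\xA8\x80".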

Manufacturer.com鍒樼鍨?

2006-3-8
