http://blog.csdn.net/eclipseek/article/details/7478208
One fix for the "socket closed" problem when WebClient is used with a proxy from multiple threads [htmlunit]
Category: Technical Topics 2012-04-19 16:53
WebClient's built-in browser can be used to fetch pages, and sometimes a proxy has to be configured for it:
WebClient webClient = new WebClient(BrowserVersion.x); // x: the browser version to emulate
webClient.setProxyConfig(pc); // pc: a ProxyConfig describing the proxy
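In a single-threaded program, a webClient created this way causes no trouble: connections from the client to the proxy server are opened and closed in an orderly fashion.
Now consider multiple threads accessing the same WebClient concurrently. An exception like the following may be reported: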
[Thread-7] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute - Closing connection [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]
........
2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute - Total connections kept alive: 0
2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute - Total issued connections: 0
2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute - Total allocated connection: 0 out of 20
2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute - No free connections [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]
2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute - Available capacity: 2 out of 2 [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]
2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute - Creating new connection [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]]
2012-04-19 10:31:32,926 [Thread-6] DEBUG org.apache.http.impl.client.DefaultHttpClient - socket closed
java.net.SocketException: socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:130)
at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:127)
at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:233)
at org.apache.http.impl.conn.LoggingSessionInputBuffer.readLine(LoggingSessionInputBuffer.java:100)
at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:210)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:271)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:227)
at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:209)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:292)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:126)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:483)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:641)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:597)
at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:134)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1406)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1460)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1325)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:304)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:370)
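The log shows that Thread-6 hit a socket closed exception while using the WebClient. Looking at the entries above it, the connection for the route (http://192.168.5.29:3128->http://58.223.139.151:8080) was closed by Thread-7, and Thread-8 then created a new one. The socket was possibly closed again at some later point, which would explain why Thread-6 reported socket closed.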
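The suspected failure mode can be illustrated without HtmlUnit at all. The following is only a minimal sketch: `FakeConnection` is a hypothetical stand-in for a pooled socket, not a class from HtmlUnit or HttpClient, and the thread ordering is forced with `join()` so that the outcome is deterministic, whereas the real race only shows up occasionally.

```java
import java.io.IOException;

class SharedConnectionDemo {
    /** Hypothetical stand-in for a pooled socket; not an HtmlUnit or HttpClient class. */
    static final class FakeConnection {
        private volatile boolean closed;

        void close() {
            closed = true;
        }

        String readStatusLine() throws IOException {
            if (closed) {
                throw new IOException("socket closed");
            }
            return "HTTP/1.1 200 OK";
        }
    }

    /** Lets one thread close the shared connection, then lets another read from it.
     *  Returns the reader's error message, or null if the read succeeded. */
    static String readAfterOtherThreadCloses() {
        FakeConnection shared = new FakeConnection();
        String[] error = new String[1];
        Thread closer = new Thread(shared::close, "closer");
        Thread reader = new Thread(() -> {
            try {
                shared.readStatusLine();
            } catch (IOException e) {
                error[0] = e.getMessage();
            }
        }, "reader");
        try {
            closer.start();
            closer.join(); // force the ordering that a real race only sometimes produces
            reader.start();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return error[0];
    }

    public static void main(String[] args) {
        System.out.println(readAfterOtherThreadCloses()); // prints "socket closed"
    }
}
```

The reader thread did nothing wrong on its own; it fails only because another thread closed a resource that both of them share.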
Using a ThreadLocal to give each thread its own independent WebClient object avoids the problem described above:
// Keep a separate WebClient per thread so that threads do not interfere with each other by sharing one browser
private final ThreadLocal<WebClient> client = new ThreadLocal<WebClient>() {
    @Override
    protected WebClient initialValue() {
        // version: the BrowserVersion to emulate, defined elsewhere in the enclosing class
        WebClient webClient = new WebClient(version);
        // configure the webClient here (webClient.set...)
        return webClient;
    }
};

public void setWebClient(WebClient wc) {
    client.set(wc);
}

public WebClient getWebClient() {
    return client.get();
}
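To check that the ThreadLocal pattern above really hands out one instance per thread, here is a small sketch that runs on the JDK alone. `FakeClient` is a hypothetical stand-in for WebClient (so no proxy or HtmlUnit dependency is needed), and `distinctClients` and `sameInstanceOnRepeatedGet` are illustrative helper names, not part of any library.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class PerThreadClientDemo {
    /** Hypothetical stand-in for WebClient so the sketch runs without HtmlUnit. */
    static final class FakeClient { }

    // Each thread lazily creates its own FakeClient on its first get().
    private static final ThreadLocal<FakeClient> CLIENT =
            ThreadLocal.withInitial(FakeClient::new);

    /** Within one thread, repeated get() calls return the same object. */
    static boolean sameInstanceOnRepeatedGet() {
        return CLIENT.get() == CLIENT.get();
    }

    /** Starts threadCount threads and returns how many distinct clients they obtained. */
    static int distinctClients(int threadCount) {
        // Identity-based set: counts objects, not equals() classes.
        Set<FakeClient> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<FakeClient, Boolean>()));
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                seen.add(CLIENT.get());
                CLIENT.remove(); // drop this thread's reference once it is done
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctClients(4)); // prints 4
    }
}
```

One thing to keep in mind with this design: a thread-local value lives as long as its thread unless `remove()` is called. With pooled threads it is probably worth closing each per-thread client and removing it once the work is finished.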
Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.