A workaround for the "socket closed" problem when WebClient is used from multiple threads through a proxy [htmlunit]

http://blog.csdn.net/eclipseek/article/details/7478208

Category: Technical Topics, 2012-04-19 16:53

HtmlUnit's WebClient works as a built-in browser that can be used to fetch pages. Sometimes a proxy has to be configured for it:

WebClient webClient = new WebClient(BrowserVersion.x);  // x stands for the browser version to emulate
webClient.setProxyConfig(pc);                           // pc is a ProxyConfig describing the proxy

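In the single-threaded case, a webClient created this way causes no trouble: connections from the client to the proxy server are opened and closed in an orderly way.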

Now consider several threads accessing the same WebClient concurrently. An exception like the following may be reported:

[Thread-7] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Closing connection [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]

........

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Total connections kept alive: 0

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Total issued connections: 0

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Total allocated connection: 0 out of 20

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - No free connections [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Available capacity: 2 out of 2 [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Creating new connection [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]]

2012-04-19 10:31:32,926 [Thread-6] DEBUG org.apache.http.impl.client.DefaultHttpClient  - socket closed

java.net.SocketException: socket closed

at java.net.SocketInputStream.socketRead0(Native Method)

at java.net.SocketInputStream.read(Unknown Source)

at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:130)

at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:127)

at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:233)

at org.apache.http.impl.conn.LoggingSessionInputBuffer.readLine(LoggingSessionInputBuffer.java:100)

at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)

at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:210)

at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:271)

at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:227)

at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:209)

at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:292)

at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:126)

at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:483)

at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:641)

at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:597)

at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:134)

at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1406)

at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1460)

at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1325)

at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:304)

at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:370)

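The log shows that Thread-6 hit a socket closed exception while using the WebClient. The lines before the exception show that a socket for the route (http://192.168.5.29:3128->http://58.223.139.151:8080) was closed by Thread-7, after which Thread-8 created a new socket for the same route. It is likely that, at some later point, a socket was closed again while Thread-6 was still reading from it, which made Thread-6 report socket closed.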
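The failure mode described above can be reproduced in isolation. The sketch below is not HtmlUnit or HttpClient code: it is a minimal loopback example, with a hypothetical class name `SharedSocketDemo`, showing that when one thread closes a socket that another thread is reading from, the reader fails with a `java.net.SocketException`.

```java
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

class SharedSocketDemo {
    /** Returns the exception seen by the reading thread, or null if it saw none. */
    static Exception readWhileAnotherThreadCloses() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket shared = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            // The peer never writes, so a read on the shared socket would block.
            InputStream in = shared.getInputStream();
            AtomicReference<Exception> seen = new AtomicReference<>();

            // This thread plays the role of Thread-6: it reads from the socket.
            Thread reader = new Thread(() -> {
                try {
                    in.read();
                } catch (Exception e) {
                    seen.set(e);
                }
            }, "reader");
            reader.start();

            // Give the reader a moment to block in read(). The read fails
            // either way once the socket has been closed.
            Thread.sleep(200);

            // The calling thread plays the role of Thread-7: it closes the
            // socket while the other thread still depends on it.
            shared.close();
            reader.join(5000);
            return seen.get();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("reader saw: " + readWhileAnotherThreadCloses());
    }
}
```

In the real case the close happens inside the connection pool rather than in user code, but the reading thread sees the same kind of exception.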

Using a ThreadLocal to give each thread its own independent WebClient object avoids the problem:

    // Each thread keeps its own WebClient so that threads sharing one
    // browser instance cannot interfere with each other.
    private final ThreadLocal<WebClient> client = new ThreadLocal<WebClient>() {
        @Override
        protected synchronized WebClient initialValue() {
            // version is a BrowserVersion field defined elsewhere in the enclosing class.
            WebClient webClient = new WebClient(version);
            // Configure the WebClient here (proxy and other settings):
            // webClient.set...;
            return webClient;
        }
    };

    public void setWebClient(WebClient wc) {
        client.set(wc);
    }

    public WebClient getWebClient() {
        return client.get();
    }
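To see the per-thread behaviour without HtmlUnit on the classpath, here is a minimal sketch that uses a hypothetical stand-in class, `FakeClient`, in place of WebClient. The class names, the pool size, and the task count are all assumptions made for illustration. It records which client instance each worker thread receives from the ThreadLocal.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PerThreadClientDemo {
    /** Hypothetical stand-in for WebClient; it only records which instance it is. */
    static final class FakeClient {
        private static final AtomicInteger NEXT_ID = new AtomicInteger();
        final int id = NEXT_ID.incrementAndGet();
    }

    // One FakeClient per thread, created lazily on that thread's first get().
    private static final ThreadLocal<FakeClient> CLIENT =
            ThreadLocal.withInitial(FakeClient::new);

    /** Runs the given number of tasks on a fixed pool; maps thread name to the client ids it saw. */
    static Map<String, Set<Integer>> run(int threads, int tasks) throws InterruptedException {
        Map<String, Set<Integer>> seen = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                FakeClient c = CLIENT.get();
                seen.computeIfAbsent(Thread.currentThread().getName(),
                        k -> ConcurrentHashMap.newKeySet()).add(c.id);
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        run(3, 12).forEach((thread, ids) ->
                System.out.println(thread + " used client ids " + ids));
    }
}
```

Every worker thread should report exactly one client id, and no id should appear under two threads. With a long-lived thread pool the per-thread client lives as long as its worker thread, so real code may also want to release the client and call `ThreadLocal.remove()` when a worker is finished with it.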

Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.