
A workaround for the "socket closed" problem when WebClient is used from multiple threads through a proxy [htmlunit]

http://blog.csdn.net/eclipseek/article/details/7478208


Category: Technical Topics, 2012-04-19 16:53

WebClient's built-in browser can be used to fetch pages. Sometimes it has to be configured to go through a proxy:

WebClient webClient = new WebClient(BrowserVersion.x);

webClient.setProxyConfig(pc); // pc is a ProxyConfig describing the proxy server

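In a single thread, a webClient created this way works without problems: connections from the client to the proxy server are opened and closed in an orderly fashion.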

Now consider several threads accessing the same WebClient concurrently. An exception like the following may then be reported:

[Thread-7] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Closing connection [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]

........

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Total connections kept alive: 0

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Total issued connections: 0

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Total allocated connection: 0 out of 20

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - No free connections [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Available capacity: 2 out of 2 [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]][null]

2012-04-19 10:31:32,926 [Thread-8] DEBUG org.apache.http.impl.conn.tsccm.ConnPoolByRoute  - Creating new connection [HttpRoute[{}->http://192.168.5.29:3128->http://58.223.139.151:8080]]

2012-04-19 10:31:32,926 [Thread-6] DEBUG org.apache.http.impl.client.DefaultHttpClient  - socket closed

java.net.SocketException: socket closed

at java.net.SocketInputStream.socketRead0(Native Method)

at java.net.SocketInputStream.read(Unknown Source)

at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:130)

at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:127)

at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:233)

at org.apache.http.impl.conn.LoggingSessionInputBuffer.readLine(LoggingSessionInputBuffer.java:100)

at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)

at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:210)

at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:271)

at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:227)

at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:209)

at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:292)

at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:126)

at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:483)

at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:641)

at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:597)

at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:134)

at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1406)

at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1460)

at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1325)

at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:304)

at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:370)

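The log shows Thread-6 hitting a "socket closed" exception while using the WebClient. Reading the entries above it, the connection on the route (http://192.168.5.29:3128->http://58.223.139.151:8080) was closed by Thread-7, after which Thread-8 created a new connection on the same route. Presumably the socket was closed again at some later point, and that is what caused Thread-6 to report "socket closed".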
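The failure mode can be reproduced in miniature without HtmlUnit. The sketch below is only a simplified model: FakeConnection is a hypothetical stand-in invented for illustration, not HttpClient's real pooled connection, and the join() calls force the unlucky ordering that happens by chance in the log.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

final class SharedConnectionDemo {
    /** Hypothetical, greatly simplified stand-in for a connection shared between threads. */
    static final class FakeConnection {
        private volatile boolean closed;

        void close() {
            closed = true;
        }

        String read() throws IOException {
            if (closed) {
                throw new IOException("socket closed");
            }
            return "response";
        }
    }

    /** One thread closes the shared connection; a second thread then tries to read from it. */
    static String readAfterForeignClose() {
        FakeConnection shared = new FakeConnection();
        AtomicReference<String> outcome = new AtomicReference<>();
        Thread closer = new Thread(shared::close, "closer");
        Thread reader = new Thread(() -> {
            try {
                outcome.set(shared.read());
            } catch (IOException e) {
                outcome.set(e.getMessage());
            }
        }, "reader");
        try {
            closer.start();
            closer.join(); // force "close before read" so the sketch is deterministic
            reader.start();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(readAfterForeignClose()); // prints "socket closed"
    }
}
```

The reader thread did nothing wrong on its own; it fails only because another thread changed the state of an object both of them hold.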

Giving each thread its own independent WebClient object through a ThreadLocal avoids the problem:


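    // Each thread keeps its own WebClient, so threads no longer share one browser and interfere with each other.
    private final ThreadLocal<WebClient> client = new ThreadLocal<WebClient>() {
        @Override
        protected WebClient initialValue() {
            // Runs on the calling thread the first time that thread calls get(), so it needs no synchronization.
            // "version" is the BrowserVersion held by the enclosing class (its declaration is not shown in the original).
            WebClient webClient = new WebClient(version);
            // Set the webClient's parameters here.
            // webClient.set...;
            return webClient;
        }
    };

    public void setWebClient(WebClient wc) {
        client.set(wc);
    }

    public WebClient getWebClient() {
        return client.get();
    }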
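To see the per-thread behavior in isolation, here is a minimal sketch that runs on the JDK alone. FakeClient is a hypothetical stand-in for WebClient, and ThreadLocal.withInitial (Java 8 and later) replaces the anonymous subclass; neither comes from the original post.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class PerThreadClientDemo {
    /** Hypothetical stand-in for WebClient so the sketch runs without HtmlUnit. */
    static final class FakeClient {
        final String owner = Thread.currentThread().getName();
    }

    // Each thread lazily creates its own FakeClient on its first get().
    private static final ThreadLocal<FakeClient> CLIENT =
            ThreadLocal.withInitial(FakeClient::new);

    /** Starts the given number of threads and returns how many distinct clients they obtained. */
    static int distinctClients(int threads) {
        Set<FakeClient> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                FakeClient first = CLIENT.get();
                FakeClient second = CLIENT.get(); // same thread, so the same instance
                if (first != second) {
                    throw new IllegalStateException("a thread saw two different clients");
                }
                seen.add(first);
            }, "worker-" + i);
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctClients(4)); // prints 4: one client per thread
    }
}
```

Because no instance is ever visible to more than one thread, no thread can close a connection that another thread is still reading from. The cost is one client, with its own connections, per thread.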

Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.