2017-10-22 7 views
0

と私は、このようにHTMLを解析しようとすると:java.net.SocketExceptionが:接続リセットのエラーが発生しましHTML解析Jsoup

String userAgent = "Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.3) Gecko/20040924 Epiphany/1.4.4 (Ubuntu)"; 
Document doc = Jsoup.connect(decodedUrl).userAgent(userAgent).timeout(30 * 1000).get(); 

は時々私はまさにここに、このエラーをキャッチしますが、時にはそれは大丈夫です。たとえば、私はこのエラーをhttp://example.com/abc-defで捕まえましたが、明確に働くのはhttp://example.com/ghi-jklです。同様に、両方とも同様のhtmlコンテンツを持っています。

スタックトレースにある:

java.net.SocketException: Connection reset 
at java.net.SocketInputStream.read(SocketInputStream.java:210) 
at java.net.SocketInputStream.read(SocketInputStream.java:141) 
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) 
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) 
at java.io.BufferedInputStream.read(BufferedInputStream.java:345) 
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) 
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) 
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:675) 
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569) 
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) 
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) 
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:656) 
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:629) 
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:261) 
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:250) 
at com.marktobuy.service.ProductCrawler.productCrawler(ProductCrawler.java:51) 
at com.marktobuy.service.ProductCrawler$$FastClassBySpringCGLIB$$7b7c5adb.invoke(<generated>) 
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) 
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720) 
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) 
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99) 
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281) 
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) 
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) 
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655) 
at com.marktobuy.service.ProductCrawler$$EnhancerBySpringCGLIB$$aba81f3d.productCrawler(<generated>) 
at com.marktobuy.controller.MarkController.mark(MarkController.java:45) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221) 
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136) 
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:114) 
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) 
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) 
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) 
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:963) 
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:897) 
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) 
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) 
at javax.servlet.http.HttpServlet.service(HttpServlet.java:648) 
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) 
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729) 
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292) 
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207) 
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) 
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240) 
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317) 
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127) 
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:115) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:111) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:169) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:200) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:121) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:66) 
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:105) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56) 
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) 
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) 
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:214) 
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:177) 
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346) 
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262) 
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240) 
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207) 
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212) 
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:94) 
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:504) 
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141) 
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79) 
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:620) 
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88) 
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:502) 
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1132) 
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684) 
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1533) 
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1489) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) 
at java.lang.Thread.run(Thread.java:745) 

私はjsoup-1.10.3を使用しています。

どうすればよいですか?

ありがとうございました。

+0

あなたは私たちに一貫して失敗する本当のURLを教えてもらえますか? – luksch

+0

@luksch http://www.hepsiburada.com/momentus-tm239r-04rs-erkek-kol-saati-p-SATM239R-04RS?magaza=Zaman%20At%C3%B6lyesi –

答えて

1

サーバーが接続を終了するためにTCPリセットを送信しました。

おそらく、サービスのAPIに何らかの理由でリセットが発生する可能性がありますか?

これらが本当に説明できない場合、定期的にリセットすると、サービスが信頼できないと受け入れる必要があり、クライアント側のコードでキャッチバックオフと再試行のメカニズムの信頼性に問題が生じる可能性があります。例えば

int count = 0; 
int maxTries = 3; 
boolean continueHttpCall = true; 
while (continueHttpCall) { 
    try { 
     String userAgent = "Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.3) Gecko/20040924 Epiphany/1.4.4 (Ubuntu)"; 
     Document doc = Jsoup.connect(decodedUrl).userAgent(userAgent).timeout(30 * 1000).get(); 

     continueHttpCall = false; 

     // on success do something with the response 
     // ... 
    } catch (SocketException ex) { 
     if (++count == maxTries) { 
      continueHttpCall = false; 

      // do something with the exception e.g. 
      // throw ex; 
     } 
    } 
} 

または代わりにあなたのためのキャッチ&再試行のセマンティクスを実装するために、このようなFailsafeとしてライブラリを使用しています。例:

public void connect() { 
    String userAgent = "Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.3) Gecko/20040924 Epiphany/1.4.4 (Ubuntu)"; 
    Document doc = Jsoup.connect(decodedUrl).userAgent(userAgent).timeout(30 * 1000).get(); 

    // ... 
} 

RetryPolicy retryPolicy = new RetryPolicy() 
    .retryOn(SocketException.class); 

Failsafe.with(retryPolicy) 
    .onRetry((r, f) -> addressConnectionFailure()) 
    .run(() -> connect()); 
関連する問題