Parsing XML with JDOM and dom4j without validating the DTD DOCTYPE (repost)

package dom;

import java.io.File;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;

public class TestJdom {

    public static void main(String[] args) {
        File file = new File("./src/dom/aiwf_aiService.xml");
        if (file.exists()) {
            SAXBuilder builder = new SAXBuilder();
            try {
                Document doc = builder.build(file);
                System.out.println(doc);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            System.out.println("can not find xml file:"
                    + file.getAbsolutePath());
        }
    }
}

2. The XML file

<?xml version="1.0" encoding="GBK"?>
<!DOCTYPE workflow PUBLIC "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN" "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd">
<workflow>
                ...............
</workflow>

3. The error is as follows

java.net.SocketException: Permission denied: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.Socket.connect(Socket.java:507)
    at java.net.Socket.connect(Socket.java:457)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:157)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:365)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:477)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:214)
    at sun.net.www.http.HttpClient.New(HttpClient.java:287)
    at sun.net.www.http.HttpClient.New(HttpClient.java:299)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:792)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:744)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:669)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:913)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:973)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:905)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:872)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:282)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(XMLDocumentScannerImpl.java:1021)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:453)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:810)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:789)
    at dom.TestJdom.main(TestJdom.java:26)

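III. Cause analysis

When build runs, JDOM reaches the declaration

DOCTYPE workflow PUBLIC "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN" "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd"

and the underlying parser tries to fetch the DTD from that URL. As the stack trace shows, the HTTP connection cannot be opened, which raises the SocketException.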

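IV. Solutions

1. At first, browsing the JDOM API, I found this method:

builder.setValidation(false);

This stops JDOM from validating, but the error persisted. Looking into the cause, I found that the parser still downloads the DTD even when validation is off.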
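The claim that the DTD is still requested with validation off can be checked without any network access. The following is a minimal sketch, not the original author's code. It assumes the JDK's built-in JAXP parser (DocumentBuilderFactory) rather than JDOM, so that it runs without extra jars; the class name DtdRequestProbe, the method requestedEntities, and the example.invalid URL are made up for illustration. It records every external entity the parser asks for and answers each request with an empty DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class DtdRequestProbe {

    // Parses xml with validation off and returns the system IDs the parser tried to load.
    static List<String> requestedEntities(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> requested = new ArrayList<>();
        builder.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            // Hand back an empty DTD so that nothing is fetched over the network.
            return new InputSource(new StringReader(""));
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return requested;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document; the URL is a placeholder that is never contacted.
        String xml = "<!DOCTYPE workflow SYSTEM \"http://example.invalid/workflow.dtd\">"
                + "<workflow/>";
        System.out.println(requestedEntities(xml));
    }
}
```

If the returned list contains the DTD URL, the parser asked for the external DTD although validation was disabled, which matches the behaviour seen with JDOM above.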

How do I keep the DTD from loading? Even when I turn off validation the parser tries to load the DTD file.

Even when validation is turned off, an XML parser will by default load the external DTD file in order to parse the DTD for external entity declarations. Xerces has a feature to turn off this behavior named "http://apache.org/xml/features/nonvalidating/load-external-dtd" and if you know you're using Xerces you can set this feature on the builder.

builder.setFeature(
  "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
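To try the Xerces feature in isolation, here is a small sketch under stated assumptions: it uses the JDK's bundled JAXP parser (which is Xerces-based) instead of JDOM, so it needs no extra jars, and the names LoadExternalDtdDemo and rootNameWithoutDtd as well as the example.invalid URL are hypothetical. With JDOM, the equivalent step is the builder.setFeature call quoted above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LoadExternalDtdDemo {

    // Xerces feature name quoted in the FAQ text above.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses xml without validating and without loading the external DTD;
    // returns the name of the root element.
    static String rootNameWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document whose DTD URL cannot be reached.
        String xml = "<!DOCTYPE workflow SYSTEM \"http://example.invalid/workflow.dtd\">"
                + "<workflow/>";
        System.out.println(rootNameWithoutDtd(xml));
    }
}
```

Because the DTD URL in the sample cannot be resolved, a successful parse suggests that the parser never tried to open it. Note that the feature only takes effect while validation is off.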

If you're using another parser like Crimson, your best bet is to set up an EntityResolver that resolves the DTD without actually reading the separate file.

import org.xml.sax.*;
import java.io.*;

public class NoOpEntityResolver implements EntityResolver {
  public InputSource resolveEntity(String publicId, String systemId) {
    return new InputSource(new StringBufferInputStream(""));
  }
}

Then in the builder

builder.setEntityResolver(new NoOpEntityResolver());

There is a downside to this approach. Any entities in the document will be resolved to the empty string, and will effectively disappear. If your document has entities, you need to call setExpandEntities(false) and ensure the EntityResolver only suppresses the DocType.
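One way to limit the resolver to the DocType, as the last sentence suggests, might look like the sketch below. It is an illustration only: it assumes that the DTD can be recognised by a system ID ending in ".dtd", it uses the JDK's JAXP parser instead of JDOM so that it runs standalone, and the names DtdOnlyResolverDemo, dtdOnlyResolver and rootName are invented here. It also uses StringReader in place of the deprecated StringBufferInputStream.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class DtdOnlyResolverDemo {

    // Returns an empty source for system IDs ending in ".dtd" (an assumed heuristic)
    // and null for everything else, which tells the parser to resolve it as usual.
    static EntityResolver dtdOnlyResolver() {
        return (publicId, systemId) ->
                systemId != null && systemId.endsWith(".dtd")
                        ? new InputSource(new StringReader(""))
                        : null;
    }

    // Parses xml with the resolver installed and returns the root element name.
    static String rootName(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(dtdOnlyResolver());
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document; the DTD URL is a placeholder that is never contacted.
        String xml = "<!DOCTYPE workflow SYSTEM \"http://example.invalid/workflow.dtd\">"
                + "<workflow/>";
        System.out.println(rootName(xml));
    }
}
```

With JDOM the same resolver object could be passed to builder.setEntityResolver, since that method takes an org.xml.sax.EntityResolver.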

The passage above has us define a class, NoOpEntityResolver, whose resolveEntity method simply returns an empty source:

             return new InputSource(new StringBufferInputStream(""));

Calling builder.setEntityResolver(new NoOpEntityResolver()) then hides the DTD from the parser, so the error goes away. I tried it and it does work. Still, parsing XML without DTD validation is not ideal. Could we validate against a local DTD instead? For the OSWorkflow file in this article, for example, I copied the validation file workflow_2_8.dtd to local disk. Can the parser use that local copy when validating?

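3. Validating with a local DTD

There are two ways.

Method 1: change the DOCTYPE declaration in the XML. This is generally a bad idea, because the modified file no longer carries the standard declaration.

Method 2: substitute the DTD at validation time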

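import java.io.FileInputStream;
import java.io.IOException;
import org.jdom.Document;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public Document load(String file) throws JDOMException, IOException {
    SAXBuilder sax = new SAXBuilder();
    // Kept as in the original; set to true to validate against the local copy.
    sax.setValidation(false);
    sax.setEntityResolver(new EntityResolver() {
        public InputSource resolveEntity(String publicId, String systemId)
                throws SAXException, IOException {
            // "workflow_2_8.dtd" is a placeholder path: the original does not
            // say where the local copy of the DTD is stored.
            InputSource is = new InputSource(new FileInputStream("workflow_2_8.dtd"));
            is.setPublicId(publicId);
            is.setSystemId(systemId);
            return is;
        }
    });
    // Exceptions propagate to the caller instead of being swallowed.
    return sax.build(file);
}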
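The substitution idea can be tried end to end with the sketch below. It is written under several assumptions: it uses the JDK's JAXP parser rather than JDOM so that it runs without extra jars, the class LocalDtdDemo and the method isValidAgainstLocalDtd are hypothetical names, and the two-line DTD written to a temporary file is a tiny stand-in, not the real workflow_2_8.dtd. The resolver answers every DTD request with the local file, and an ErrorHandler records whether validation reported an error.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class LocalDtdDemo {

    // Validates xml against the DTD stored at localDtd, whatever URL the DOCTYPE names.
    // Returns true when the parser reports no validation error.
    static boolean isValidAgainstLocalDtd(String xml, Path localDtd) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        builder.setEntityResolver((publicId, systemId) -> {
            // Serve the local file but keep the identifiers from the document.
            InputSource local = new InputSource(Files.newInputStream(localDtd));
            local.setPublicId(publicId);
            local.setSystemId(systemId);
            return local;
        });

        boolean[] valid = {true};
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { valid[0] = false; }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in DTD for the demo; a real setup would point at the copied workflow_2_8.dtd.
        Path dtd = Files.createTempFile("workflow", ".dtd");
        Files.write(dtd, "<!ELEMENT workflow (step*)>\n<!ELEMENT step EMPTY>"
                .getBytes(StandardCharsets.UTF_8));
        String doctype = "<!DOCTYPE workflow SYSTEM \"http://example.invalid/workflow.dtd\">";
        System.out.println(isValidAgainstLocalDtd(doctype + "<workflow><step/></workflow>", dtd));
        System.out.println(isValidAgainstLocalDtd(doctype + "<workflow><bogus/></workflow>", dtd));
    }
}
```

Keeping the original system ID on the returned InputSource means the XML file itself does not have to change, which is the advantage of Method 2 over Method 1.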

http://blog.csdn.net/youlianying/article/details/5908335
