[Java] Parsing XML Configuration Files: Parsing XML with SAX (Part 3)

What is SAX

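  1. SAX stands for Simple API for XML. It is not an official W3C standard but a community-driven de facto standard. SAX is conceptually very different from DOM: it provides sequential access (the push model), which makes it a fast way to read XML data. When a SAX parser processes an XML file, it fires a series of events and invokes the corresponding event handler functions; the application accesses the XML document through these handlers.
  2. In Java, the SAX parser is called SAXParser and is created by javax.xml.parsers.SAXParserFactory. Unlike a DOM parser, a SAX parser does not build an in-memory representation of the XML document, so it uses less memory.
  3. SAX is event-driven rather than document-driven. Event-driven means that the program runs on a callback mechanism. As the SAX parser reads an XML file, it walks through the document and raises events in the host application (via callback functions, delegates, or whatever callable mechanism the platform offers) to report its progress.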
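
To make the callback idea concrete, here is a minimal sketch that counts element names using only the startElement callback. The class name SaxElementCounter and the inline XML string are assumptions made for illustration; the parser and handler types come from the JDK.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxElementCounter {

    // Counts how often each element name occurs. No document tree is built:
    // the parser calls startElement once per opening tag and we update the map.
    public static Map<String, Integer> countElements(String xml) throws Exception {
        final Map<String, Integer> counts = new LinkedHashMap<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                counts.merge(qName, 1, Integer::sum);
            }
        });
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input, used only for this sketch
        String xml = "<bookstore><book>A</book><book>B</book></bookstore>";
        System.out.println(countElements(xml)); // {bookstore=1, book=2}
    }
}
```

The application never asks the parser for the next node; it only reacts when the parser calls back.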

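Pros and cons

Pros:

1. It is event-driven, so memory consumption is low.

2. It is well suited to cases where you only need to read the data in an XML file.

Cons:

1. The code is more cumbersome to write.

2. It is hard to access several different parts of the XML file at the same time.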

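SAX vs. DOM

DOM parser: loads the entire XML document into memory and returns a Document object.

Parser DocumentBuilder -> Document document = builder.parse(xmlfile)

SAX parser: processes the XML as it reads it and returns no value.

Parser SAXParser -> the XML document and a document handler (DefaultHandler or a subclass) are both passed to the SAX parser -> the parser calls the handler's event methods to process the document

The core of DOM parsing is building the XML document into a tree model, whereas the core of SAX parsing is writing a handler class that extends DefaultHandler and overrides its five main callback methods.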
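
The difference in return values can be seen side by side. The following is a sketch under assumptions: the class name DomVersusSax, both method names, and the inline XML are made up for illustration. The DOM method queries the Document that parse() returns, while the SAX method has to collect its result inside the handler because parse() returns nothing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DomVersusSax {

    // DOM: parse() returns a Document tree that can be queried afterwards.
    public static int countBooksWithDom(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return document.getElementsByTagName("book").getLength();
    }

    // SAX: parse() returns nothing, so the count is accumulated in the handler.
    public static int countBooksWithSax(String xml) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName, String qName, Attributes attributes) {
                        if ("book".equals(qName)) {
                            count[0]++;
                        }
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input, used only for this sketch
        String xml = "<bookstore><book>A</book><book>B</book></bookstore>";
        System.out.println(countBooksWithDom(xml)); // 2
        System.out.println(countBooksWithSax(xml)); // 2
    }
}
```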

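SAX parsing flow

The SAX parser notifies the client of the XML document structure by invoking callback methods (event-driven), that is, the methods of the org.xml.sax.helpers.DefaultHandler passed to the SAXParser.

The org.xml.sax.helpers.DefaultHandler class implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces.
The most important methods are:
1. startDocument() ---- document start event
2. startElement() ---- element start event
3. characters() ---- character data (text) event
4. endElement() ---- element end event
5. endDocument() ---- document end event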
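
Because DefaultHandler also implements ErrorHandler, the same handler object receives parse problems. The sketch below is an illustration, not part of the original example: the class name SaxErrorReport and the check method are assumptions. It records what the parser reports for malformed input and returns an empty list for well-formed input.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxErrorReport {

    // Returns the problems reported by the parser; an empty list means a clean parse.
    public static List<String> check(String xml) throws Exception {
        final List<String> problems = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Recoverable errors, for example validation failures
                problems.add("error at line " + e.getLineNumber());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                // Well-formedness errors: record the position, then stop parsing
                problems.add("fatal error at line " + e.getLineNumber());
                throw e;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXParseException e) {
            // Already recorded by fatalError above
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(check("<a><b></b></a>")); // well-formed: prints []
        System.out.println(check("<a><b></a>"));     // mismatched tag: one fatal error
    }
}
```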
           

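Why is SAX called push-mode parsing?

The parser controls the parsing of the XML file and decides when to call each event method. Because the calls are driven from inside the parser rather than requested by the application, this is called the push model.

The SAX parser uses an event-based model: as it parses the document it fires a series of events, and each event invokes a callback method.

For example: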

<?xml version="1.0" encoding="utf-8"?>
<bookstore>
	<book>java基础</book>
</bookstore>
           

The events fired, in order, are:

Start document

Start element(bookstore)

Characters(whitespace)

Start element(book)

Characters(java基础)

End element(book)

Characters(whitespace)

End element(bookstore)

End document
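
The event sequence above can be reproduced with a small tracing handler. This is a sketch under assumptions: the class name SaxEventTrace and the trace method are invented for illustration, and consecutive characters() calls are merged into one entry because a parser is allowed to deliver a single text run in several chunks.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {

    // Parses the XML and returns one line per event, in the order the parser fired them.
    public static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // Emits the buffered text as a single Characters(...) entry.
            private void flushText() {
                if (text.length() > 0) {
                    String s = text.toString();
                    events.add("Characters(" + (s.trim().isEmpty() ? "whitespace" : s) + ")");
                    text.setLength(0);
                }
            }

            @Override
            public void startDocument() {
                events.add("Start document");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                flushText();
                events.add("Start element(" + qName + ")");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                events.add("End element(" + qName + ")");
            }

            @Override
            public void endDocument() {
                events.add("End document");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        // The sample document from above; the book title is written with Unicode escapes
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
                + "<bookstore>\n"
                + "\t<book>java\u57fa\u7840</book>\n"
                + "</bookstore>";
        trace(xml).forEach(System.out::println);
    }
}
```

The whitespace entries come from the line breaks and indentation between tags, which the parser reports as ordinary character data when no DTD is present.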

SAX example

Main class LoadXmlSax:

package com.xing.loadxml.sax;

import com.xing.loadxml.Utils.FileUtil;
import org.xml.sax.SAXException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;

public class LoadXmlSax {
    public static void main(String[] args) throws URISyntaxException, ParserConfigurationException, SAXException, IOException {
        File xmlFile = FileUtil.getXmlFile();
        // 1. Create the SAXParser factory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // 2. Create the SAXParser
        SAXParser parser = factory.newSAXParser();
        // 3. Create the custom handler that will process the document
        MySelfSAXParserHandle mySelfSAXParserHandle = new MySelfSAXParserHandle();
        // 4. Parse the document
        parser.parse(xmlFile, mySelfSAXParserHandle);
    }
}

           

Custom handler MySelfSAXParserHandle:

package com.xing.loadxml.sax;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

import java.util.Stack;

public class MySelfSAXParserHandle extends DefaultHandler {

    // A stack tracks the element currently being parsed; a list or another structure would also work
    private Stack<String> stack = new Stack<String>();

    private String category;
    private String id;
    private String lang;
    private String title;
    private String author;
    private String year;
    private String price;

    /**
     * Called when the document starts.
     * @throws SAXException
     */
    @Override
    public void startDocument() throws SAXException {
        System.out.println("startDocument >>>> begin");
    }

    /**
     * Called when an element starts.
     * @param uri
     * @param localName
     * @param qName
     * @param attributes
     * @throws SAXException
     */
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        System.out.println("{");
        System.out.println("startElement >>>> begin");
        System.out.println("uri="+uri+",localName="+localName+",qName="+qName+",attributes="+attributes.toString());
        stack.push(qName);
        for (int i = 0; i < attributes.getLength(); i++) {
            String attrName = attributes.getQName(i);
            String attrValue = attributes.getValue(i);
            if (attrName.equals("category")) {
                category = attrValue;
            }else if (attrName.equals("id")){
                id = attrValue;
            }else if(attrName.equals("lang")){
                lang = attrValue;
            }
        }
        System.out.println("startElement >>>> end");
        System.out.println("}");

    }

    /**
     * Called when character data (text) is encountered.
     * @param ch
     * @param start
     * @param length
     * @throws SAXException
     */
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        System.out.println("{");
        // Look up the name of the enclosing element
        String tag = stack.peek();
        System.out.println("characters >>>> begin");
        String content = new String(ch, start, length);
        System.out.println("characters: " + content);
        System.out.println("characters >>>> end ");
        if (tag.equals("title")){
            title = content;
        }else if (tag.equals("author")){
            author=content;
        }else if (tag.equals("year")){
            year=content;
        }else if (tag.equals("price")){
            price=content;
        }
        System.out.println("}");
    }

    /**
     * Called when an element ends.
     * @param uri
     * @param localName
     * @param qName
     * @throws SAXException
     */
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        stack.pop();
        System.out.println("{");
        System.out.println("endElement >>>> begin");
        System.out.println("uri="+uri+",localName="+localName+",qName="+qName);
        System.out.println("endElement end");
        if (qName.equals("book")){
            System.out.println("Book info: -------");
            System.out.println("    category: " + category);
            System.out.println("    id: " + id);
            System.out.println("    title: " + title);
            System.out.println("    lang: " + lang);
            System.out.println("    author: " + author);
            System.out.println("    year: " + year);
            System.out.println("    price: " + price);
            category=null;
            id=null;
            title=null;
            lang=null;
            author=null;
            year=null;
            price=null;
            System.out.println();
        }
        System.out.println("}");
    }

    /**
     * Called when the document ends.
     * @throws SAXException
     */
    @Override
    public void endDocument() throws SAXException {
        System.out.println("endDocument end");
    }

    /**
     * Called when the parser reports an error.
     * @param e
     * @throws SAXException
     */
    @Override
    public void error(SAXParseException e) throws SAXException {
        e.printStackTrace();
    }
}

           

Output:

startDocument >>>> begin
{
startElement >>>> begin
uri=,localName=,qName=bookstore,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 
    
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=book,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=title,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 呐喊
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=title
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=author,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 鲁迅
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=author
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=year,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 1995
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=year
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=price,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 30.00
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=price
endElement end
}
{
characters >>>> begin
characters: 
    
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=book
endElement end
Book info: -------
    category: 文学作品
    id: 001
    title: 呐喊
    lang: CH
    author: 鲁迅
    year: 1995
    price: 30.00

}
{
characters >>>> begin
characters: 
    
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=book,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=title,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 神曲
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=title
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=author,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 但丁·阿利基埃里
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=author
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=year,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 1265
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=year
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=price,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 55.00
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=price
endElement end
}
{
characters >>>> begin
characters: 
    
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=book
endElement end
Book info: -------
    category: 国外作品
    id: null
    title: 神曲
    lang: EN
    author: 但丁·阿利基埃里
    year: 1265
    price: 55.00

}
{
characters >>>> begin
characters: 
    
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=book,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=title,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: JAVA基础教程
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=title
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=author,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 中邮出版社
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=author
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=year,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 2010
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=year
endElement end
}
{
characters >>>> begin
characters: 
        
characters >>>> end 
}
{
startElement >>>> begin
uri=,localName=,qName=price,attributes=com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy@7f31245a
startElement >>>> end
}
{
characters >>>> begin
characters: 80.00
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=price
endElement end
}
{
characters >>>> begin
characters: 
    
characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=book
endElement end
Book info: -------
    category: 计算机科学与技术
    id: null
    title: JAVA基础教程
    lang: CH
    author: 中邮出版社
    year: 2010
    price: 80.00

}
{
characters >>>> begin
characters: 

characters >>>> end 
}
{
endElement >>>> begin
uri=,localName=,qName=bookstore
endElement end
}
endDocument end

Process finished with exit code 0
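
The handler above assigns each characters() call straight to a field. That works for this input, but SAX allows a parser to deliver one text run in several chunks, in which case only the last chunk would survive. A possible variant, offered as a sketch and not as the author's code, buffers text in a StringBuilder and stores each value when the element ends. The class name BookCollector and the map-based record are assumptions, and the inline XML is a hypothetical reconstruction from the printed output with placeholder values, since the article does not show the real input file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BookCollector extends DefaultHandler {

    private final List<Map<String, String>> books = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> current; // the book being built, or null outside <book>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start a fresh buffer for this element's text
        if ("book".equals(qName)) {
            current = new LinkedHashMap<>();
        }
        if (current != null) {
            // Attributes of <book> and its children are stored under their own names
            for (int i = 0; i < attributes.getLength(); i++) {
                current.put(attributes.getQName(i), attributes.getValue(i));
            }
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per text run
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("book".equals(qName)) {
            books.add(current);
            current = null;
        } else if (current != null) {
            current.put(qName, text.toString().trim());
        }
        text.setLength(0);
    }

    public static List<Map<String, String>> parse(String xml) throws Exception {
        BookCollector handler = new BookCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.books;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input with placeholder values; the real file is not shown in the article
        String xml = "<bookstore>\n"
                + "  <book category=\"fiction\" id=\"001\">\n"
                + "    <title lang=\"CH\">Sample Title</title>\n"
                + "    <author>Sample Author</author>\n"
                + "    <year>1995</year>\n"
                + "    <price>30.00</price>\n"
                + "  </book>\n"
                + "</bookstore>";
        System.out.println(parse(xml));
    }
}
```

Returning the collected records from a static parse method also keeps the printing out of the callbacks, so the handler can be reused by other code.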

           

SAX cannot add, delete, or modify content in an XML file; it is mainly used to traverse and parse XML files.