Avro, Part 2: Getting-Started Demo

1. Using avro-maven-plugin to generate Java classes from .avsc files:

Add the following dependency and plugins to the project's pom.xml:

        <dependency>
            <groupId>org.apache.avro</groupId>
            <artifactId>avro</artifactId>
            <version>1.8.1</version>
        </dependency>

...
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.avro</groupId>
                <artifactId>avro-maven-plugin</artifactId>
                <version>1.8.1</version>
                <executions>
                    <execution>
                        <phase>generate-sources</phase>
                        <goals>
                            <goal>schema</goal>
                        </goals>
                        <configuration>
                            <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                            <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>        

Running mvn install at this point fails with the following error:

[INFO] Final Memory: 16M/217M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.avro:avro-maven-plugin:1.8.1:schema (default) on project study: neither sourceDirectory: D:\fvp-workspace\study\src\main\avro or testSourceDirectory: D:\fvp-workspace\study\src\test\avro are directories -> [Help 1]
[ERROR]       

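Note that the avro folders are not created for you: create one manually under ${project.basedir}/src/main and one under ${project.basedir}/src/test. Judging by the error message, the plugin fails only when neither folder exists, so src/test/avro matters only if you have test schemas. These avro folders are where the Avro schema files (*.avsc) are stored.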
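Creating the folders is normally a one-off step done by hand or in the IDE. As a minimal sketch, assuming you would rather script it, the helper below creates both folders under a project root using only java.io.File. The names AvroDirs and ensureAvroDirs are invented here for illustration and are not part of Avro or Maven.

```java
import java.io.File;

// Hypothetical helper for illustration; not part of Avro or Maven.
class AvroDirs {

    /** Creates src/main/avro and src/test/avro under the given project root. */
    static boolean ensureAvroDirs(File projectRoot) {
        File mainAvro = new File(projectRoot, "src/main/avro");
        File testAvro = new File(projectRoot, "src/test/avro");
        // mkdirs() returns false when the directory already exists,
        // so check isDirectory() afterwards instead of its return value.
        mainAvro.mkdirs();
        testAvro.mkdirs();
        return mainAvro.isDirectory() && testAvro.isDirectory();
    }

    public static void main(String[] args) {
        // Pass the project root as the first argument; defaults to the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        System.out.println("avro directories ready: " + ensureAvroDirs(root));
    }
}
```

Calling the helper a second time is harmless, because existing directories are left as they are.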

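1.1 Defining a schema

  Avro schemas are defined in JSON. A schema is composed of primitive types (null, boolean, int, long, float, double, bytes, and string) and complex types (record, enum, array, map, union, and fixed). For example, to define a schema for a user, create an avro directory under src/main, then create the file user.avsc in it: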

{"namespace": "com.sf.study.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number",  "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}      

[IDE screenshot: user.avsc under src/main/avro]

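1.2 Generating class files from the schema

Because the avro plugin is configured, running the following command is enough for Maven to generate the class files automatically: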

mvn clean install      

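The corresponding class, User.java, is then generated under the output directory configured earlier, in the package com.sf.study.avro:

[IDE screenshot: the generated User class]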

If you do not use the plugin, you can generate the classes with avro-tools instead:

java -jar /path/to/avro-tools-1.8.1.jar compile schema <schema file> <destination>      
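If the avro-tools step has to be triggered from Java code (for example from a small build helper), the same command line can be assembled as an argument list. This is only a sketch: AvroToolsCommand and compileSchemaCommand are hypothetical names, the jar path is the placeholder from the command above, and the schema and destination paths are assumptions based on the layout used in this article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; it only builds the argument list.
class AvroToolsCommand {

    /** Builds: java -jar <toolsJar> compile schema <schemaFile> <destination> */
    static List<String> compileSchemaCommand(String toolsJar, String schemaFile, String destination) {
        return new ArrayList<String>(Arrays.asList(
                "java", "-jar", toolsJar, "compile", "schema", schemaFile, destination));
    }

    public static void main(String[] args) {
        List<String> cmd = compileSchemaCommand(
                "/path/to/avro-tools-1.8.1.jar",   // placeholder path, as in the text
                "src/main/avro/user.avsc",         // assumed schema location
                "src/main/java");                  // assumed destination
        System.out.println(cmd);
        // To actually run it, hand the list to a process launcher, e.g.:
        // new ProcessBuilder(cmd).start().waitFor();
    }
}
```

Keeping each argument as a separate list element avoids quoting problems when a path contains spaces.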

1.3 Using the generated class

With the class file generated, we can now use it to create users:

package com.sf.study.avro;

public class CreateUserTest {

    public static void main(String[] args) {
        User user1 = new User();
        user1.setName("zhangsan");
        user1.setFavoriteNumber(256);
        // Leave favorite color null

        // Alternate constructor
        User user2 = new User("lisi", 7, "red");

        // Construct via builder
        User user3 = User.newBuilder()
                     .setName("wangwu")
                     .setFavoriteColor("blue")
                     .setFavoriteNumber(null)
                     .build();
    }

}      

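1.4 Serialization

Serialize the users created above and store them in a file on disk:

// Imports needed at the top of the file
import java.io.File;
import java.io.IOException;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumWriter;

        // Serialize user1, user2 and user3 to disk
        DatumWriter<User> userDatumWriter = new SpecificDatumWriter<User>(User.class);
        DataFileWriter<User> dataFileWriter = new DataFileWriter<User>(userDatumWriter);
        try {
            // create() writes the schema into the file header; append() adds one record each
            dataFileWriter.create(user1.getSchema(), new File("users.avro"));
            dataFileWriter.append(user1);
            dataFileWriter.append(user2);
            dataFileWriter.append(user3);
            dataFileWriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }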

Here the users are serialized to the file users.avro.


1.5 Deserialization

// Imports needed at the top of the file
import java.io.File;
import java.io.IOException;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.io.DatumReader;
import org.apache.avro.specific.SpecificDatumReader;

public static void unserialize() {
        try {
            // Deserialize Users from disk
            DatumReader<User> userDatumReader = new SpecificDatumReader<User>(User.class);
            DataFileReader<User> dataFileReader;
            dataFileReader = new DataFileReader<User>(new File("users.avro"), userDatumReader);
            User user = null;
            while (dataFileReader.hasNext()) {
                // Reuse user object by passing it to next(). This saves us from
                // allocating and garbage collecting many objects for files with
                // many items.
                user = dataFileReader.next(user);
                System.out.println(user);
            }
            // Release the underlying file handle once all records are read
            dataFileReader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }

Running this prints the three users created in section 1.3:

{"name": "zhangsan", "favorite_number": 256, "favorite_color": null}
{"name": "lisi", "favorite_number": 7, "favorite_color": "red"}
{"name": "wangwu", "favorite_number": null, "favorite_color": "blue"}
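
The comment in the read loop about reusing the user object has one consequence worth spelling out: because the same instance is overwritten on every call to next(), keeping references to it (for example in a list) leaves you with many references to the last record. The sketch below illustrates that pattern without Avro. Rec and Reader are stand-in types invented for this example, not Avro classes, and the behaviour shown is that of the stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in types for illustration only; these are not Avro classes.
class ReuseDemo {

    /** Mutable record, playing the role of the generated User class. */
    static class Rec {
        String name;
        Rec() {}
        Rec(Rec other) { this.name = other.name; } // copy constructor
    }

    /** Reader that fills the caller's object, mimicking the next(reuse) pattern. */
    static class Reader {
        private final Iterator<String> source;
        Reader(List<String> names) { this.source = names.iterator(); }
        boolean hasNext() { return source.hasNext(); }
        Rec next(Rec reuse) {
            Rec target = (reuse != null) ? reuse : new Rec(); // allocate only when nothing is passed in
            target.name = source.next();
            return target;
        }
    }

    /** Pitfall: stores the reused instance itself, so every element ends up identical. */
    static List<String> collectByReference(List<String> names) {
        Reader reader = new Reader(names);
        List<Rec> kept = new ArrayList<Rec>();
        Rec rec = null;
        while (reader.hasNext()) {
            rec = reader.next(rec);
            kept.add(rec);
        }
        return namesOf(kept);
    }

    /** Fix: stores a copy per record, so each element keeps its own value. */
    static List<String> collectByCopy(List<String> names) {
        Reader reader = new Reader(names);
        List<Rec> kept = new ArrayList<Rec>();
        Rec rec = null;
        while (reader.hasNext()) {
            rec = reader.next(rec);
            kept.add(new Rec(rec));
        }
        return namesOf(kept);
    }

    static List<String> namesOf(List<Rec> recs) {
        List<String> out = new ArrayList<String>();
        for (Rec r : recs) {
            out.add(r.name);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("zhangsan", "lisi", "wangwu");
        System.out.println(collectByReference(names)); // [wangwu, wangwu, wangwu]
        System.out.println(collectByCopy(names));      // [zhangsan, lisi, wangwu]
    }
}
```

In the Avro loop above this does not matter, since each user is printed immediately. If the records need to outlive the loop, either pass null to next() so a fresh object is returned each time, or copy each record before storing it, for example with the generated User.newBuilder(user).build().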