
HBase Insert Parameters: Tests and Comparison

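HBase read and write performance depends closely on a few parameters: caching and batch affect reads, while the write buffer affects writes. Beyond the parameters, the way the program performs the inserts also has a large effect on insert performance. For example, is writing a List of Put objects faster than calling put one row at a time? Is most of what is said online actually correct? In this post I read an HBase table from a program, write the rows unchanged into another table, and compare how different parameter combinations affect the inserts.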

HTable htable1 = new HTable(hbaseconf, "test2");
Scan scan1 = new Scan();
scan1.setCaching(300);
ResultScanner scaner = htable.getScanner(scan1);
List<Put> list = new ArrayList<Put>();
htable1.setWriteBufferSize(6 * 1024 * 1024);
htable1.setAutoFlush(false);
put.setWriteToWAL(false);

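| Test | Method | Parameters | Time | Rows inserted | Remarks |
|------|--------|------------|------|---------------|---------|
| 1 | put | setWriteToWAL(false), setCaching(300), setWriteBufferSize(6*1024*1024), setAutoFlush(false) | 1 minute | 105000 | With every parameter at its best setting, the two methods perform the same |
| 1 | List | List<500>, setWriteToWAL(false), setAutoFlush(false) | 1 minute | 105000 | |
| 2 | put | setWriteToWAL(true), setAutoFlush(false) | 1 minute | 95000 | Skipping the WAL seems to make almost no difference; even for put the effect is small |
| 2 | List | setWriteToWAL(true) | | | |
| 3 | put | setWriteBufferSize(1*1024*1024), setAutoFlush(false) | 1 minute | 95000 | The write buffer has a large effect on List, but seems to have little effect on put |
| 3 | List | setAutoFlush(false) | 1 minute | 75000 | |
| 4 | put | setAutoFlush(true) | 1 minute | 20000 | Auto flush has a huge effect on put; the effect on List is much smaller |
| 4 | List | setAutoFlush(true) | 1 minute | 65000 | |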
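To make the counts easier to compare, they can be converted to rows per second and to a percentage of the best run. The following is a small sketch that uses only the figures reported in the tests. The class and method names are illustrative, and the put/List labels for Tests 3 and 4 are inferred from the remarks column rather than stated explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InsertThroughput {

    /** Rows per second for a run that inserted rows in the given number of seconds. */
    static double rowsPerSecond(long rows, long seconds) {
        return (double) rows / seconds;
    }

    /** Fraction of the baseline row count reached by a run of the same duration. */
    static double relativeTo(long rows, long baselineRows) {
        return (double) rows / baselineRows;
    }

    public static void main(String[] args) {
        final long seconds = 60;       // every run lasted 1 minute
        final long baseline = 105000;  // Test 1, all parameters tuned

        // Row counts copied from the test results; insertion order is preserved.
        Map<String, Long> runs = new LinkedHashMap<>();
        runs.put("Test 1 put  (all tuned)", 105000L);
        runs.put("Test 1 List (all tuned)", 105000L);
        runs.put("Test 2 put  (WAL on)", 95000L);
        runs.put("Test 3 put  (1 MB write buffer)", 95000L);
        runs.put("Test 3 List (1 MB write buffer)", 75000L);
        runs.put("Test 4 put  (auto flush on)", 20000L);
        runs.put("Test 4 List (auto flush on)", 65000L);

        for (Map.Entry<String, Long> run : runs.entrySet()) {
            System.out.printf("%-34s %7.1f rows/s  %5.1f%% of baseline%n",
                    run.getKey(),
                    rowsPerSecond(run.getValue(), seconds),
                    100.0 * relativeTo(run.getValue(), baseline));
        }
    }
}
```

For instance, the put run with auto flush enabled works out to about 333 rows per second, roughly 19% of the tuned baseline of 1750 rows per second.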

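Across these tests, setAutoFlush has the largest effect on performance, whether the rows are inserted through a List or with individual put calls. The write buffer size also has a considerable effect on List inserts. List and put themselves show little difference, arguably none at all.

So when inserting into HBase with put, there are two parameters to watch: the write buffer size and setAutoFlush.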
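One way to picture why these two parameters matter is a toy model of a client-side write buffer. The sketch below is not HBase code and makes a simplifying assumption: with auto flush on, data is sent at the end of every put call (single or batched), and with auto flush off, data is sent only once the buffered bytes reach the buffer size. The class name, the 1 KB row size, and the row count are all hypothetical.

```java
final class WriteBufferModel {
    private final boolean autoFlush;
    private final long bufferSizeBytes;
    private long bufferedBytes = 0;
    private long flushCount = 0;

    WriteBufferModel(boolean autoFlush, long bufferSizeBytes) {
        this.autoFlush = autoFlush;
        this.bufferSizeBytes = bufferSizeBytes;
    }

    /** Buffers one row; flushes early if the buffer is full. */
    private void add(long rowBytes) {
        bufferedBytes += rowBytes;
        if (bufferedBytes >= bufferSizeBytes) {
            flush();
        }
    }

    /** Models a single put call. */
    void put(long rowBytes) {
        add(rowBytes);
        if (autoFlush) {
            flush();
        }
    }

    /** Models one put call that carries a whole list of rows. */
    void putBatch(int rows, long rowBytes) {
        for (int i = 0; i < rows; i++) {
            add(rowBytes);
        }
        if (autoFlush) {
            flush();
        }
    }

    /** Sends buffered data, counting one round trip; does nothing if the buffer is empty. */
    void flush() {
        if (bufferedBytes == 0) {
            return;
        }
        bufferedBytes = 0;
        flushCount++;
    }

    static long simulateSingle(boolean autoFlush, long bufferBytes, int rows, long rowBytes) {
        WriteBufferModel model = new WriteBufferModel(autoFlush, bufferBytes);
        for (int i = 0; i < rows; i++) {
            model.put(rowBytes);
        }
        model.flush(); // leftover data, as a close() would send it
        return model.flushCount;
    }

    static long simulateBatched(boolean autoFlush, long bufferBytes, int rows, long rowBytes, int batchSize) {
        WriteBufferModel model = new WriteBufferModel(autoFlush, bufferBytes);
        int remaining = rows;
        while (remaining > 0) {
            int n = Math.min(batchSize, remaining);
            model.putBatch(n, rowBytes);
            remaining -= n;
        }
        model.flush();
        return model.flushCount;
    }

    public static void main(String[] args) {
        int rows = 100_000;
        long rowBytes = 1024;
        long sixMb = 6L * 1024 * 1024;
        long oneMb = 1L * 1024 * 1024;
        System.out.println("single put, auto flush on:   " + simulateSingle(true, sixMb, rows, rowBytes));
        System.out.println("List<500>,  auto flush on:   " + simulateBatched(true, sixMb, rows, rowBytes, 500));
        System.out.println("single put, 6 MB buffer:     " + simulateSingle(false, sixMb, rows, rowBytes));
        System.out.println("single put, 1 MB buffer:     " + simulateSingle(false, oneMb, rows, rowBytes));
    }
}
```

Under these assumptions the model needs 100000 flushes for single puts with auto flush on, 200 for batches of 500, 17 with a 6 MB buffer, and 98 with a 1 MB buffer. It only counts round trips, so it suggests a direction of effect and does not predict the measured throughput.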

The complete test program:

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.util.Bytes;

public class filterTest {

    public static void main(String[] args) throws IOException {
        SimpleDateFormat dateformat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

        Configuration hbaseconf = HBaseConfiguration.create();
        hbaseconf.set("hbase.zookeeper.quorum",
                "datanode01.isesol.com,datanode02.isesol.com,datanode03.isesol.com,datanode04.isesol.com,cmserver.isesol.com");
        hbaseconf.set("hbase.zookeeper.property.clientPort", "2181");
        hbaseconf.set("user", "hdfs");

        HTable htable = new HTable(hbaseconf, "t_ui_all"); // source table
        HTable htable1 = new HTable(hbaseconf, "test2");   // target table

        Scan scan1 = new Scan();
        scan1.setCaching(300); // set before the scanner is opened

        /*Filter rowfilter = new RowFilter(CompareOp.EQUAL,
                new BinaryPrefixComparator(Bytes.toBytes("A131420033-1007-9223370539574828268")));
        Filter rowfilter1 = new RowFilter(CompareOp.EQUAL,
                new BinaryComparator(Bytes.toBytes("A131420033-1007-9223370539574828268"))); */
        // scan1.setRowPrefixFilter(Bytes.toBytes("A131420033-1007-9223370539574828268"));
        // Filter filter = new SingleColumnValueFilter(Bytes.toBytes("cf"),
        //         Bytes.toBytes("fault_level2_name"), CompareOp.EQUAL,
        //         Bytes.toBytes("电气问题"));
        // scan1.setFilter(rowfilter);

        // Write-side parameters under test; enable or change them per test case.
        //htable1.setWriteBufferSize(6 * 1024 * 1024);
        //htable1.setAutoFlush(false);

        ResultScanner scaner = htable.getScanner(scan1);
        List<Put> list = new ArrayList<Put>();
        byte[] cf = Bytes.toBytes("cf");
        int j = 0;

        System.out.println("start to scan original table and put this result into List" + dateformat.format(System.currentTimeMillis()));

        // Iterate the scanner directly instead of mixing iterator().hasNext() with next().
        for (Result result : scaner) {
            Put put = new Put(result.getRow());
            //put.setWriteToWAL(false);

            // Copy every cell of the row into the Put under column family "cf".
            for (Cell cell : result.listCells()) {
                byte[] qualifier = cell.getQualifier();
                put.add(cf, qualifier, result.getValue(cf, qualifier));
            }

            /* Single-put variant:
            j++;
            htable1.put(put);
            System.out.println("total number is " + j + " start to put these data into hbase");*/

            // List variant: write in batches of 500.
            list.add(put);
            j++;
            if (j % 500 == 0) {
                System.out.println("total number is " + j + " start to put these data into hbase" + list.size());
                htable1.put(list);
                list.clear();
            }
        }

        // Write the last, partially filled batch.
        if (!list.isEmpty()) {
            htable1.put(list);
        }

        scaner.close();
        htable1.close();
        htable.close();

        System.out.println("Job finish" + dateformat.format(System.currentTimeMillis()));
    }
}
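The loop above collects Put objects and writes them every 500 rows; any rows left over after the last full batch also need to be written. That pattern can be isolated from HBase and checked on its own. The following is a minimal sketch, assuming a hypothetical BatchingWriter helper that is not part of the HBase client; the sink stands in for a call such as htable1.put(list).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchingWriter<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchingWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and writes a batch as soon as batchSize items are pending. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Writes whatever is pending; call once after the last add. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Hand over a copy so the sink may keep the list after pending is cleared.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        BatchingWriter<String> writer =
                new BatchingWriter<>(500, batch -> batchSizes.add(batch.size()));

        for (int i = 0; i < 1234; i++) {
            writer.add("row-" + i);
        }
        writer.flush(); // without this, the last 234 rows would never be written

        System.out.println(batchSizes); // prints [500, 500, 234]
    }
}
```

Keeping the final flush inside the helper's contract makes it harder to forget than a modulo check in the scan loop.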
