天天看點

PostgreSQL的hash索引是否有用?

根據PostgreSQL的手冊,PostgreSQL中hash索引有很大的缺陷,不推薦使用。

http://58.58.27.50:8079/doc/html/9.3.1_zh/indexes-types.html

-----------------------------------------------------------------------------

Hash 索引操作目前沒有記錄 WAL 日志,是以如果資料庫崩潰有未寫入的改變, 我們可能需要用REINDEX重建 Hash 索引。另外, 對hash索引的改變在初始的基礎備份後不是基于流複制或者基于檔案複制的, 是以對于随後使用它們的查詢會給出錯誤的回複。因為這些原因, 我們并不鼓勵使用 Hash 索引。

并且hash索引隻支援"等于"查詢,而PostgreSQL中作為預設索引類型的btree能支援包括“等于”在内的各種比較操作,功能上覆寫了hash索引。是以,在PostgreSQL中,對正常資料類型,btree索引是不二之選。(相比Oracle中有過于豐富的索引類型可選擇,在PostgreSQL中建索引是不是更輕松一點?)

那麼,既然不推薦使用hash索引,PostgreSQL為什麼還要實作hash索引呢?

PostgreSQL的hash索引使用的線性hash算法,可以動态擴充桶數,理論上的查詢複雜度接近O(1),這是btree無法比拟的。下面做個實測,也證明了在大數量的情況下,hash索引的性能确實明顯優于btree。是以在某些極少的特殊場景下,hash索引也還是可以考慮的。

測試環境在Linux虛拟機上

主控端

OS:Win7 64bit

CPU:AMD Athlon II X4 640 3.0G

記憶體:6G

硬碟:Apacer A S510S 128G SATA

虛拟機

OS:CentOS 6.5

CPU:x4核

MEM:2G

PostgreSQL:9.3(shared_buffers=128M)

使用pgbench的TCP-B和Select Only測試對比性能。

分别建2個測試資料庫db1和db2。db1是原始的pgbench測試資料庫,即btree索引。db2則通過下面的方法修改為hash索引。

[chenhj@hanode1 ~]$ psql db2

psql (9.3.4)

Type "help" for help.

db2=# alter table pgbench_accounts DROP CONSTRAINT pgbench_accounts_pkey;

ALTER TABLE

db2=# \d pgbench_accounts

   Table "public.pgbench_accounts"

  Column | Type | Modifiers

----------+---------------+-----------

 aid | integer | not null

 bid | integer |

 abalance | integer |

 filler | character(84) |

db2=# create index on pgbench_accounts using hash(aid);

CREATE INDEX

sclae=1時,差異不大;sclae=100時,hash幾乎是btree的兩倍性能。可見資料量越大,hash的性能優勢越明顯。

sclae=1時,無太大差異

資料略

sclae=10時,大概快了20%

點選(此處)折疊或打開

[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db1

transaction type: TPC-B (sort of)

scaling factor: 10

query mode: simple

number of clients: 1

number of threads: 1

duration: 3 s

number of transactions actually processed: 641

tps = 213.418248 (including connections establishing)

tps = 213.716676 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db2

number of transactions actually processed: 653

tps = 217.440022 (including connections establishing)

tps = 217.770471 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1

number of clients: 16

number of transactions actually processed: 2064

tps = 683.721499 (including connections establishing)

tps = 701.078622 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2

number of transactions actually processed: 3278

tps = 1085.589348 (including connections establishing)

tps = 1110.521633 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1

transaction type: SELECT only

number of transactions actually processed: 4922

tps = 1640.242391 (including connections establishing)

tps = 1642.450916 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2

number of transactions actually processed: 4771

tps = 1590.028578 (including connections establishing)

tps = 1592.541658 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db1

number of transactions actually processed: 32188

tps = 10724.978992 (including connections establishing)

tps = 10969.036021 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db2

number of transactions actually processed: 33367

tps = 11092.571963 (including connections establishing)

tps = 11339.707975 (excluding connections establishing)

sclae=100時,大概快了1倍

scaling factor: 100

number of transactions actually processed: 1324

tps = 441.307443 (including connections establishing)

tps = 442.905056 (excluding connections establishing)

number of transactions actually processed: 3226

tps = 1075.038414 (including connections establishing)

tps = 1076.599774 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1

number of transactions actually processed: 2194

tps = 725.580208 (including connections establishing)

tps = 741.318899 (excluding connections establishing)

[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2

number of transactions actually processed: 3174

tps = 1049.927110 (including connections establishing)

tps = 1073.139131 (excluding connections establishing)

number of transactions actually processed: 1385

tps = 461.481151 (including connections establishing)

tps = 462.109984 (excluding connections establishing)

number of transactions actually processed: 3527

tps = 1175.124543 (including connections establishing)

tps = 1177.144768 (excluding connections establishing)

number of transactions actually processed: 6313

tps = 2092.075168 (including connections establishing)

tps = 2137.468368 (excluding connections establishing)

number of transactions actually processed: 14414

tps = 4775.127727 (including connections establishing)

tps = 4886.910106 (excluding connections establishing)

2種索引的大小差不多

db1=# select pg_table_size('pgbench_accounts_pkey'::regclass);

 pg_table_size

---------------

     224641024

(1 row)

db2=# select pg_table_size('pgbench_accounts_aid_idx'::regclass);

     268451840