根據PostgreSQL的手冊,PostgreSQL中hash索引有很大的缺陷,不推薦使用。
http://58.58.27.50:8079/doc/html/9.3.1_zh/indexes-types.html
-----------------------------------------------------------------------------
Hash 索引操作目前沒有記錄 WAL 日志,是以如果資料庫崩潰有未寫入的改變, 我們可能需要用REINDEX重建 Hash 索引。另外, 對hash索引的改變在初始的基礎備份後不是基于流複制或者基于檔案複制的, 是以對于随後使用它們的查詢會給出錯誤的回複。因為這些原因, 我們并不鼓勵使用 Hash 索引。
并且hash索引隻支援"等于"查詢,而PostgreSQL中作為預設索引類型的btree能支援包括“等于”在内的各種比較操作,功能上覆寫了hash索引。是以,在PostgreSQL中,對正常資料類型,btree索引是不二之選。(相比Oracle中有過于豐富的索引類型可選擇,在PostgreSQL中建索引是不是更輕松一點?)
那麼,既然不推薦使用hash索引,PostgreSQL為什麼還要實作hash索引呢?
PostgreSQL的hash索引使用的線性hash算法,可以動态擴充桶數,理論上的查詢複雜度接近O(1),這是btree無法比拟的。下面做個實測,也證明了在大數量的情況下,hash索引的性能确實明顯優于btree。是以在某些極少的特殊場景下,hash索引也還是可以考慮的。
測試環境在Linux虛拟機上
主控端
OS:Win7 64bit
CPU:AMD Athlon II X4 640 3.0G
記憶體:6G
硬碟:Apacer A S510S 128G SATA
虛拟機
OS:CentOS 6.5
CPU:x4核
MEM:2G
PostgreSQL:9.3(shared_buffers=128M)
使用pgbench的TCP-B和Select Only測試對比性能。
分别建2個測試資料庫db1和db2。db1是原始的pgbench測試資料庫,即btree索引。db2則通過下面的方法修改為hash索引。
[chenhj@hanode1 ~]$ psql db2
psql (9.3.4)
Type "help" for help.
db2=# alter table pgbench_accounts DROP CONSTRAINT pgbench_accounts_pkey;
ALTER TABLE
db2=# \d pgbench_accounts
Table "public.pgbench_accounts"
Column | Type | Modifiers
----------+---------------+-----------
aid | integer | not null
bid | integer |
abalance | integer |
filler | character(84) |
db2=# create index on pgbench_accounts using hash(aid);
CREATE INDEX
sclae=1時,差異不大;sclae=100時,hash幾乎是btree的兩倍性能。可見資料量越大,hash的性能優勢越明顯。
sclae=1時,無太大差異
資料略
sclae=10時,大概快了20%
點選(此處)折疊或打開
[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db1
transaction type: TPC-B (sort of)
scaling factor: 10
query mode: simple
number of clients: 1
number of threads: 1
duration: 3 s
number of transactions actually processed: 641
tps = 213.418248 (including connections establishing)
tps = 213.716676 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db2
number of transactions actually processed: 653
tps = 217.440022 (including connections establishing)
tps = 217.770471 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1
number of clients: 16
number of transactions actually processed: 2064
tps = 683.721499 (including connections establishing)
tps = 701.078622 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2
number of transactions actually processed: 3278
tps = 1085.589348 (including connections establishing)
tps = 1110.521633 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1
transaction type: SELECT only
number of transactions actually processed: 4922
tps = 1640.242391 (including connections establishing)
tps = 1642.450916 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2
number of transactions actually processed: 4771
tps = 1590.028578 (including connections establishing)
tps = 1592.541658 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db1
number of transactions actually processed: 32188
tps = 10724.978992 (including connections establishing)
tps = 10969.036021 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db2
number of transactions actually processed: 33367
tps = 11092.571963 (including connections establishing)
tps = 11339.707975 (excluding connections establishing)
sclae=100時,大概快了1倍
scaling factor: 100
number of transactions actually processed: 1324
tps = 441.307443 (including connections establishing)
tps = 442.905056 (excluding connections establishing)
number of transactions actually processed: 3226
tps = 1075.038414 (including connections establishing)
tps = 1076.599774 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1
number of transactions actually processed: 2194
tps = 725.580208 (including connections establishing)
tps = 741.318899 (excluding connections establishing)
[chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2
number of transactions actually processed: 3174
tps = 1049.927110 (including connections establishing)
tps = 1073.139131 (excluding connections establishing)
number of transactions actually processed: 1385
tps = 461.481151 (including connections establishing)
tps = 462.109984 (excluding connections establishing)
number of transactions actually processed: 3527
tps = 1175.124543 (including connections establishing)
tps = 1177.144768 (excluding connections establishing)
number of transactions actually processed: 6313
tps = 2092.075168 (including connections establishing)
tps = 2137.468368 (excluding connections establishing)
number of transactions actually processed: 14414
tps = 4775.127727 (including connections establishing)
tps = 4886.910106 (excluding connections establishing)
2種索引的大小差不多
db1=# select pg_table_size('pgbench_accounts_pkey'::regclass);
pg_table_size
---------------
224641024
(1 row)
db2=# select pg_table_size('pgbench_accounts_aid_idx'::regclass);
268451840