【DB吐槽大會】第50期 - PG GiST距離排序操作符和過濾無法同時使用索引

背景

1、産品的問題點

PG GiST距離排序操作符和過濾無法同時使用索引

2、問題點背後涉及的技術原理

PG GiST索引支援空間距離排序, 例如按距離某個經緯度點的距離排序, 傳回表裡面的經緯度點.
PG 支援距離操作符, 支援排序功能, 同時支援傳回距離值.

- 但是距離過濾不能使用索引

例子:

《PostgreSQL GiST Order by 距離 + 距離範圍判定 + limit 驟變優化與背景原因》

create extension btree_gist;  
postgres=# \do  
                                             List of operators  
 Schema | Name |        Left arg type        |       Right arg type        |   Result type    | Description   
--------+------+-----------------------------+-----------------------------+------------------+-------------  
 public | <->  | bigint                      | bigint                      | bigint           |   
 public | <->  | date                        | date                        | integer          |   
 public | <->  | double precision            | double precision            | double precision |   
 public | <->  | integer                     | integer                     | integer          |   
 public | <->  | interval                    | interval                    | interval         |   
 public | <->  | money                       | money                       | money            |   
 public | <->  | oid                         | oid                         | oid              |   
 public | <->  | real                        | real                        | real             |   
 public | <->  | smallint                    | smallint                    | smallint         |   
 public | <->  | time without time zone      | time without time zone      | interval         |   
 public | <->  | timestamp with time zone    | timestamp with time zone    | interval         |   
 public | <->  | timestamp without time zone | timestamp without time zone | interval         |   
  
create table t_age(id int, age int);    
insert into t_age select generate_series(1,10000000), random()*120;    
create index idx_t_age_1 on t_age using gist (age);    
    
select * from t_age     
where     
(age <-> 25) < 1     
order by age <-> 25     
limit 100000;    
  
  
 Limit  (cost=0.42..10245.61 rows=100000 width=12) (actual time=0.161..8126.988 rows=83248 loops=1)  
   Output: id, age, ((age <-> 25))  
   Buffers: shared hit=9523157  
   ->  Index Scan using idx_t_age_1 on public.t_age  (cost=0.42..341506.11 rows=3333326 width=12) (actual time=0.160..8115.150 rows=83248 loops=1)  
         Output: id, age, (age <-> 25)  
         Order By: (t_age.age <-> 25)  
         Filter: ((t_age.age <-> 25) < 1)  
         Rows Removed by Filter: 9916752  
         Buffers: shared hit=9523157  
 Planning Time: 0.077 ms  
 Execution Time: 8133.808 ms  
(11 rows)  
  
postgres=# set enable_seqscan=off;  
SET  
postgres=# explain select * from t_age where (age <-> 25) <1 limit 100000;  
                                     QUERY PLAN                                        
-------------------------------------------------------------------------------------  
 Limit  (cost=10000000000.00..10000005827.44 rows=100000 width=8)  
   ->  Seq Scan on t_age  (cost=10000000000.00..10000194247.66 rows=3333326 width=8)  
         Filter: ((age <-> 25) < 1)  
(3 rows)

3、這個問題将影響哪些行業以及業務場景

影響最大的時基于地理位置的網際網路業務, 例如社交、O2O、出行等

4、會導緻什麼問題?

當需要搜尋附近的N個點, 并且距離M以内的雙重需求時, 不能同時使用1個索引來滿足. (要麼隻能用于排序, 要麼隻能用于where過濾)
需要額外的filter計算, 如果滿足條件(距離M以内)的記錄數不足N條, 則導緻掃描整個索引, 性能急劇下降.

5、業務上應該如何避免這個坑

可以使用function來解決這個問題, 每次filter判定是否已越界.

6、業務上避免這個坑犧牲了什麼, 會引入什麼新的問題

需要自定義函數, 開發成本增加.

7、資料庫未來産品疊代如何修複這個坑

希望能直接在核心層面支援, 同一個index既能支援按距離過濾又能支援按距離排序輸出.

【DB吐槽大會】第50期 - PG GiST距離排序操作符和過濾無法同時使用索引

背景

1、産品的問題點

2、問題點背後涉及的技術原理

3、這個問題将影響哪些行業以及業務場景

4、會導緻什麼問題?

5、業務上應該如何避免這個坑

6、業務上避免這個坑犧牲了什麼, 會引入什麼新的問題

7、資料庫未來産品疊代如何修複這個坑

繼續閱讀

Testlink安裝部署之XAMPP

set define off關閉替代變量功能

報錯：'mysql' 不是内部或外部指令，也不是可運作的程式或批處理檔案。

Linxu常用指令技巧彙總

ERROR 1 (HY000): Can't create/write to file '/tmp/#sql_4188_1.MYI' (Errcode: 28)

艱難安裝LDAP,SSL認證

《Linux指令行與Shell腳本程式設計大全第2版.布盧姆》pdf

MySQL的4種隔離級别？出現問題

XX系統實施過程問題總結

無元件上傳圖檔到資料庫中，最完整解決方案

【MySQL資料庫】資料庫索引事務1.索引2.事務

neo4j之cypher使用文檔

NOSQL安全攻擊

mybatis_入門程式Mybatis入門

登入plsql 報錯 the account is locked --使用者被鎖

SequoiaDB巨杉資料庫C++驅動概述