天天看點

PostgreSQL 11 preview - 增加強制custom plan GUC開關(plancache_mode),對付傾斜

标簽

PostgreSQL , plan cache , generic plan , custom plan , plancache_mode

https://github.com/digoal/blog/blob/master/201803/20180325_06.md#%E8%83%8C%E6%99%AF 背景

對于高并發的小事務,使用綁定變量(prepared statement)來緩存執行計劃,可以降低簡單SQL在sql parser, plan上的開銷。

《PostgreSQL 11 preview - 強制auto prepared statment開關(自動化plan cache)(類似Oracle cursor_sharing force)》

但是對于比較複雜的SQL或者說輸入的參數會導緻執行計劃傾斜的SQL,使用綁定變量會導緻性能抖動。

例子,下面的資料在ID=1上面有非常嚴重的傾斜,其他值比較均勻。

create table test (id int , info text, crt_time timestamp);       insert into test select generate_series(1,1000000);       insert into test select 1 from generate_series(1,10000000);           

是以

select count(*) from test where id=1

時,可能用全表掃描更合适。而當id=其他值時,使用索引更好。

PostgreSQL在使用CACHED PLAN時,依舊會使用傳入參數代入CACHE PLAN進行計算,得到成本,然後對比之前5次custom plan的成本,如果相差較大(有門檻值),則會重新發起custom plan,但是并不一定适合所有場景,有時候這種傾斜會一直下去導緻執行計劃不正确。可以看如下文章中的例子。

《PostgreSQL plan cache 源碼淺析 - 如何確定不會計劃傾斜》 《執行計劃選擇算法 與 綁定變量 - PostgreSQL prepared statement: SPI_prepare, prepare|execute COMMAND, PL/pgsql STYLE: custom & generic plan cache》

https://github.com/digoal/blog/blob/master/201803/20180325_06.md#%E9%82%A3%E4%B9%88%E5%A6%82%E4%BD%95%E8%A7%A3%E5%86%B3%E8%BF%99%E4%B8%AA%E9%97%AE%E9%A2%98 那麼如何解決這個問題?

我們知道資料庫有幾種使用綁定變量的方法:

1、資料庫端prepare, execute。

2、驅動層使用prepare, execute接口。

3、使用UDF函數。

4、設定強制使用prepared statement的開關。

PostgreSQL 11可能引入一個GUC參數,可以讓你強制使用custom plan。因為PATCH還沒有送出,最後送出的時候會怎麼實作現在還不得而知。

https://commitfest.postgresql.org/17/990/
Hi,       this patch is based on discussions related to plpgsql2 project.       Currently we cannot to control plan cache from plpgsql directly. We can use       dynamic SQL if we can enforce oneshot plan - but it means little bit less       readable code (if we enforce dynamic SQL from performance reasons). It       means so the code cannot be checked by plpgsql check too.       The plan cache subsystem allows some control by options       CURSOR_OPT_GENERIC_PLAN and CURSOR_OPT_CUSTOM_PLAN. So we just a interface       how to use these options from PLpgSQL. I used Ada language feature (used in       PL/SQL too) - PRAGMA statement. It allows to set compiler directives. The       syntax of PRAGMA statements allows to set a level where entered compiler       directive should be applied. It can works on function level or block level.       Attached patch introduces PRAGMA plan_cache with options: DEFAULT,       FORCE_CUSTOM_PLAN, FORCE_GENERIC_PLAN. Plan cache is partially used every       time - the parser/analyzer result is cached every time.       Examples:       CREATE OR REPLACE FUNCTION foo(a int)       RETURNS int AS  $$       DECLARE ..       BEGIN          DECLARE            /* block level (local scope) pragma */            PRAGMA plan_cache(FORCE_CUSTOM_PLAN);          BEGIN            SELECT /* slow query - dynamic sql is not necessary */          END;        END;       Benefits:       1. remove one case where dynamic sql is necessary now - security, static       check       2. introduce PRAGMAs - possible usage: autonomous transactions, implicit       namespaces settings (namespace for auto variables, namespace for function       arguments).       Comments, notes?       Regards       Pavel           

PATCH新增參數如下

+static const struct config_enum_entry plancache_mode_options[] = {       +	{"default", PLANCACHE_DEFAULT, false},       +	{"force_generic_plan", PLANCACHE_FORCE_GENERIC_PLAN, false},       +	{"force_custom_plan", PLANCACHE_FORCE_CUSTOM_PLAN, false},       +	{NULL, 0, false}       +};           

參數的使用方法

+--       +-- Test plan cache strategy       +--       +create table test_strategy(a int);       +insert into test_strategy select 1 from generate_series(1,1000) union all select 2;       +create index on test_strategy(a);       +analyze test_strategy;       +prepare test_strategy_pp(int) as select count(*) from test_strategy where a = $1;       +-- without 5 evaluation pg uses custom plan       +explain (costs off) execute test_strategy_pp(2);       +                            QUERY PLAN                                   +------------------------------------------------------------------       + Aggregate       +   ->  Index Only Scan using test_strategy_a_idx on test_strategy       +         Index Cond: (a = 2)       +(3 rows)       +       +-- we can force to generic plan       +set plancache_mode to force_generic_plan;       +explain (costs off) execute test_strategy_pp(2);       +           QUERY PLAN                   +---------------------------------       + Aggregate       +   ->  Seq Scan on test_strategy       +         Filter: (a = $1)       +(3 rows)       +       +-- we can fix generic plan by 5 execution       +set plancache_mode to default;       +execute test_strategy_pp(1); -- 1x       + count        +-------       +  1000       +(1 row)       +       +execute test_strategy_pp(1); -- 2x       + count        +-------       +  1000       +(1 row)       +       +execute test_strategy_pp(1); -- 3x       + count        +-------       +  1000       +(1 row)       +       +execute test_strategy_pp(1); -- 4x       + count        +-------       +  1000       +(1 row)       +       +execute test_strategy_pp(1); -- 5x       + count        +-------       +  1000       +(1 row)       +       +-- we should to get really bad plan       +explain (costs off) execute test_strategy_pp(2);       +           QUERY PLAN                   +---------------------------------       + Aggregate       +   ->  Seq Scan on test_strategy       +         Filter: (a = $1)       +(3 rows)       +       +-- but we can force to custom plan       +set plancache_mode to force_custom_plan;       +explain (costs off) execute test_strategy_pp(2);       +                            QUERY PLAN                                   +------------------------------------------------------------------       + Aggregate       +   ->  Index Only Scan using test_strategy_a_idx on test_strategy       +         Index Cond: (a = 2)       +(3 rows)       +       +drop table test_strategy;           

https://github.com/digoal/blog/blob/master/201803/20180325_06.md#%E5%B0%8F%E7%BB%93 小結

通過設定一個開關,可以讓原本使用generic plan的SQL,強制使用custom plan。

比如我們的SQL如果本身就是分析型(或者說本身就是SLOW SQL)的情況下,實際上generic plan帶來的好處是微乎其微的,反而可能因為generic plan用的是cache plan會帶來不便。因為generic plan的執行計劃被固定,特别不适合那種因為輸入條件的變化而導緻執行計劃變化的SQL(通常是複雜SQL)。這種情況下,我們就可以考慮強制使用custom plan.

那麼有人會說,在UDF中使用動态SQL不就好了嗎(動态SQL每次都需要custom plan),要知道動态SQL在UDF中是不被CHECK的,是以可能出一些文法或者什麼問題,很難檢查.而使用這個開關,我們就不用擔心這個問題了.即能檢查SQL文法,又能強制使用custom plan。

一個反例是動态SQL中使用cache plan:

《PostgreSQL Oracle 相容性之 - DBMS_SQL(存儲過程動态SQL中使用綁定變量)》

https://github.com/digoal/blog/blob/master/201803/20180325_06.md#%E5%8F%82%E8%80%83 參考