今天同僚碰到一個sql的性能問題,主要是merge into的性能問題,執行腳本的時候,居然耗時50多分鐘,汗!簡直讓人抓狂,腳本如下:
merge into eds.tw_dp_b_tdterm_imei_day dm
using t_imei_day_1111 temp
on(
dm.date_cd = temp.date_cd and
dm.city_id = temp.city_id and
dm.imei = temp.imei and
dm.usr_nbr = temp.usr_nbr
)
when matched then update set
dm.gsm_flux = temp.gsm_flux ,
dm.td_flux = temp.td_flux ,
dm.gprs_flux = temp.gprs_flux
when not matched then insert(
dm.date_cd ,
dm.city_id ,
dm.imei ,
dm.buss_city_id ,
dm.type_id ,
dm.usr_nbr ,
dm.gsm_flux ,
dm.td_flux ,
dm.gprs_flux
values(
temp.date_cd ,
temp.city_id ,
temp.imei ,
temp.buss_city_id ,
temp.type_id ,
temp.usr_nbr ,
temp.gsm_flux ,
temp.td_flux ,
temp.gprs_flux
);
這
兩個表的資料量大緻如下:
t_imei_day_1111 三十多萬, eds.tw_dp_b_tdterm_imei_day 三百多萬資料,跑數前都做過表的相關資訊
收集。 eds.tw_dp_b_tdterm_imei_day表的索引有效,碎片很少:
sql> select count(1) from t_imei_day_1111;
count(1)
----------
333575
sql> select count(1) from eds.tw_dp_b_tdterm_imei_day;
3731336
sql>
腳本的執行計劃如下:
sql> merge into eds.tw_dp_b_tdterm_imei_day dm
dm.gsm_flux = temp.gsm_flux ,
dm.td_flux = temp.td_flux ,
);
333575 rows merged.
execution plan
----------------------------------------------------------
plan hash value: 3661285687
--------------------------------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes |tempspc| cost (%cpu)| time | pstart| pstop |
| 0 | merge statement | | 264k| 54m| | 3576 (2)| 00:01:20 | | |
| 1 | merge | tw_dp_b_tdterm_imei_day | | | | | | | |
| 2 | view | | | | | | | | |
|* 3 | hash join outer | | 264k| 44m| 31m| 3576 (2)| 00:01:20 | | |
| 4 | table access full | t_imei_day_1111 | 264k| 28m| | 100 (2)| 00:00:03 | | |
| 5 | partition range all| | 2128k| 125m| | 765 (7)| 00:00:17 | 1 | 365 |
| 6 | table access full | tw_dp_b_tdterm_imei_day | 2128k| 125m| | 765 (7)| 00:00:17 | 1 | 365 |
predicate information (identified by operation id):
---------------------------------------------------
3 - access("dm"."usr_nbr"(+)="temp"."usr_nbr" and "dm"."imei"(+)="temp"."imei" and
"dm"."city_id"(+)="temp"."city_id" and "dm"."date_cd"(+)="temp"."date_cd")
note
-----
- dynamic sampling used for this statement
statistics
9975 recursive calls
1212324 db block gets
111135 consistent gets
2447 physical reads
228686840 redo size
822 bytes sent via sql*net to client
2571 bytes received via sql*net from client
3 sql*net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
333575 rows processed
sql>
在優化腳本期間,我添加了use_merge提示,結果執行腳本的時間一下子縮短到幾十秒。但是讓我很納悶的是, 無論怎麼看,添加use_merge提示的腳本的執行計劃怎麼比較都比上面沒添加提示的腳本的執行計劃查(執行計劃如下所示)
sql> merge /*+ use_merge(dm temp) */ into eds.tw_dp_b_tdterm_imei_day dm
dm.city_id = temp.city_id and
dm.gsm_flux = temp.gsm_flux ,
dm.imei ,
dm.buss_city_id ,
dm.gsm_flux ,
temp.imei ,
temp.buss_city_id ,
;
plan hash value: 2993229602
---------------------------------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes |tempspc| cost (%cpu)| time | pstart| pstop |
| 0 | merge statement | | 1688k| 346m| | 14060 (1)| 00:05:13 | | |
| 1 | merge | tw_dp_b_tdterm_imei_day | | | | | | | |
| 2 | view | | | | | | | | |
| 3 | merge join outer | | 1688k| 283m| | 14060 (1)| 00:05:13 | | |
| 4 | sort join | | 264k| 28m| 68m| 2424 (1)| 00:00:54 | | |
| 5 | table access full | t_imei_day_1111 | 264k| 28m| | 100 (2)| 00:00:03 | | |
|* 6 | sort join | | 2128k| 125m| 586m| 11636 (2)| 00:04:19 | | |
| 7 | partition range all| | 2128k| 125m| | 765 (7)| 00:00:17 | 1 | 365 |
| 8 | table access full | tw_dp_b_tdterm_imei_day | 2128k| 125m| | 765 (7)| 00:00:17 | 1 | 365 |
6 - access("dm"."usr_nbr"(+)="temp"."usr_nbr" and "dm"."imei"(+)="temp"."imei" and
filter("dm"."date_cd"(+)="temp"."date_cd" and "dm"."city_id"(+)="temp"."city_id" and
"dm"."imei"(+)="temp"."imei" and "dm"."usr_nbr"(+)="temp"."usr_nbr")
1884 recursive calls
1217983 db block gets
90427 consistent gets
654 physical reads
259129380 redo size
814 bytes sent via sql*net to client
2667 bytes received via sql*net from client
11 sorts (memory)
sql> exit