1、explode
explode(ARRAY) 清單中的每個元素生成一行
explode(MAP) map中每個key-value對,生成一行,key為一列,value為一列
限制:
1、No other expressions are allowed in SELECT
SELECT pageid, explode(adid_list) AS myCol... is not supported
2、UDTF's can't be nested
SELECT explode(explode(adid_list)) AS myCol... is not supported
3、GROUP BY / CLUSTER BY / DISTRIBUTE BY / SORT BY is not supported
SELECT explode(adid_list) AS myCol ... GROUP BY myCol is not supported
2、lateral view
可使用lateral view解除以上限制,文法:
lateralView: LATERAL VIEW explode(expression) tableAlias AS columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*
案例:
table名稱為pageAds
SELECT pageid, adid
FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;
輸出結果:
3、多個lateral view
from語句後面可以帶多個lateral view語句
表名:baseTable
from後隻有一個lateral view:
SELECT myCol1, col2 FROM baseTable
LATERAL VIEW explode(col1) myTable1 AS myCol1;
結果:
多個lateral view:
SELECT myCol1, myCol2 FROM baseTable
LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(col2) myTable2 AS myCol2;
4、Outer Lateral Views
如果array類型的字段為空,但依然需傳回記錄,可使用outer關鍵詞。
比如:select * from src LATERAL VIEW explode(array()) C AS a limit 10;
這條語句中的array字段是個空清單,這條語句不管src表中是否有記錄,結果都是空的。
而:select * from src LATERAL VIEW OUTER explode(array()) C AS a limit 10;
結果中的記錄數為src表的記錄數,隻是a字段為NULL。
比如:
238 val_238 NULL
86 val_86 NULL
311 val_311 NULL
27 val_27 NULL
165 val_165 NULL
409 val_409 NULL
255 val_255 NULL
278 val_278 NULL
98 val_98 NULL
官方文檔:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-explode