Spark 2 Dataset Aggregation Operations

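import org.apache.spark.sql.functions._
// Outside spark-shell, also run `import spark.implicits._` (with `spark` being your SparkSession) to enable the $"age" column syntax.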

data.groupBy("gender").agg(count($"age"),max($"age").as("maxAge"), avg($"age").as("avgAge")).show 
+------+----------+------+------+                                                
|gender|count(age)|maxAge|avgAge| 
+------+----------+------+------+ 
|female|         |  |  | 
|  male|         |  |  | 
+------+----------+------+------+ 


data.groupBy("gender").agg("age"->"count","age" -> "max", "age" -> "avg").show 
+------+----------+--------+--------+                                            
|gender|count(age)|max(age)|avg(age)| 
+------+----------+--------+--------+ 
|female|         |    |    | 
|  male|         |    |    | 
+------+----------+--------+--------+ 
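
The two calls above differ mainly in how the result columns get named: the Column-based form accepts `.as(...)` aliases, while the (column, function) pair form falls back to default names such as `max(age)`. Below is a minimal self-contained sketch for trying both outside the shell. The object name `GroupByAggSketch`, the helper `columnNames`, the local master setting, and the sample rows are all assumptions made up for illustration; they are not from the original session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, count, max}

object GroupByAggSketch {

  // Runs both aggregation styles on a tiny made-up dataset and
  // returns the column names each style produces.
  def columnNames(spark: SparkSession): (Seq[String], Seq[String]) = {
    import spark.implicits._ // enables toDF and the $"col" syntax

    // Hypothetical sample rows; the original data is not shown in the source.
    val data = Seq(
      ("female", 25),
      ("female", 31),
      ("male", 40),
      ("male", 22)
    ).toDF("gender", "age")

    // Column-based form: each aggregate can be renamed with .as(...).
    val byColumns = data.groupBy("gender")
      .agg(count($"age"), max($"age").as("maxAge"), avg($"age").as("avgAge"))

    // Pair-based form: (column name -> function name), default result names.
    val byPairs = data.groupBy("gender")
      .agg("age" -> "count", "age" -> "max", "age" -> "avg")

    (byColumns.columns.toSeq, byPairs.columns.toSeq)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: run locally for the demo
      .appName("groupby-agg-sketch")
      .getOrCreate()

    val (withAliases, withDefaults) = columnNames(spark)
    println(withAliases.mkString(", "))
    println(withDefaults.mkString(", "))

    spark.stop()
  }
}
```

One thing to watch: the pair form is not the same as `agg(Map("age" -> "count", "age" -> "max"))`. A Scala `Map` keeps a single value per key, so repeating `"age"` as the key would leave only the last aggregation.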
           
