
Spark on k8s

Spark introduction

1. Introduction to Spark

spark-submit can be used directly to submit a Spark application to a Kubernetes cluster. The submission mechanism works as follows:

Spark creates a Spark driver that runs inside a Kubernetes pod.

The driver creates executors, which also run inside Kubernetes pods, connects to them, and executes the application code.

When the application completes, the executor pods terminate and are cleaned up, but the driver pod keeps its logs and remains in the "completed" state in the Kubernetes API until it is eventually garbage collected or manually cleaned up.
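For example, after a run finishes you can still inspect the driver pod and read its logs with kubectl; the pod name below is hypothetical:

//the finished driver shows STATUS Completed
# kubectl get pods
//logs survive until the pod is deleted
# kubectl logs spark-pi-1590000000000-driver
//manual cleanup of the finished driver pod
# kubectl delete pod spark-pi-1590000000000-driver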

2. Prerequisites
A deployed Kubernetes cluster
 Nodes with more than 2 GB of available memory
 A Java environment, JDK >= 8
           
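A quick way to check these prerequisites on each node (a minimal sketch; commands vary slightly by distribution):

//confirm more than 2 GB of memory is available
# free -g
//confirm JDK 8 or newer is installed
# java -version
//confirm the Kubernetes cluster is reachable and the nodes are Ready
# kubectl get nodes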

Documentation: http://spark.apache.org/docs/latest/running-on-kubernetes.html

Installation and deployment

3. Download the package
# wget http://archive.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
# tar xf spark-2.4.3-bin-hadoop2.7.tgz
# mv spark-2.4.3-bin-hadoop2.7 /usr/local/spark-2.4.3
//add the environment variable (note: the Spark executables live under bin/)
# cat /etc/profile
export PATH=/usr/local/spark-2.4.3/bin:$PATH
           
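To confirm the installation, reload the profile and print the Spark version (it should report 2.4.3 if PATH is set correctly):

# source /etc/profile
# spark-submit --version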
4. Build the Docker images
# cd /usr/local/spark-2.4.3
# ./bin/docker-image-tool.sh -r wxtime -t 2.4.3 build
# docker images
REPOSITORY        TAG     IMAGE ID       CREATED      SIZE
wxtime/spark-r    2.4.3   592aff869ffb   4 days ago   756MB
wxtime/spark-py   2.4.3   47e104fe2827   4 days ago   462MB
wxtime/spark      2.4.3   24aab7c864da   4 days ago   371MB
# docker login
# ./bin/docker-image-tool.sh -r wxtime -t 2.4.3 push
# kubectl cluster-info
Kubernetes master is running at https://192.168.1.101:6443
           
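The address reported by kubectl cluster-info is the API server URL that spark-submit needs (prefixed with k8s:// in step 6); it can also be read straight from the kubeconfig:

//extract the API server URL of the current context
# kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
https://192.168.1.101:6443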
5. Test
# cd /usr/local/spark-2.4.3
# ./bin/spark-shell
scala> sc.parallelize(1 to 1000).count()
res1: Long = 1000
# kubectl create serviceaccount spark
# kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
           
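To verify that the role binding took effect, ask the API server whether the new service account is allowed to create pods:

# kubectl auth can-i create pods --as=system:serviceaccount:default:spark
yes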
6. Launch SparkPi in cluster mode
The spark service account created in step 5 is handed to the driver via spark.kubernetes.authenticate.driver.serviceAccountName, and the master URL is the API server address reported by kubectl cluster-info:
bin/spark-submit \
--master k8s://https://192.168.1.101:6443 \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=5 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=wxtime/spark:2.4.3 \
--conf spark.kubernetes.container.image.pullPolicy=Always \
local:///opt/spark/examples/jars/spark-examples_2.11-2.4.3.jar
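Once the job is submitted, you can watch the pods it creates; Spark on Kubernetes labels driver and executor pods with spark-role, and the driver pod name below is hypothetical:

//watch the driver and executor pods start
# kubectl get pods -l spark-role=driver
# kubectl get pods -l spark-role=executor
//follow the driver log; SparkPi prints a line like "Pi is roughly 3.14..."
# kubectl logs -f spark-pi-1590000000000-driver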