Spark on k8s

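Spark introduction

1. Introduction to Spark

spark-submit can be used directly to submit a Spark application to a Kubernetes cluster. The submission mechanism works as follows:

Spark creates a Spark driver that runs inside a Kubernetes pod.

The driver creates executors, which also run in Kubernetes pods, connects to them, and executes the application code.

When the application completes, the executor pods terminate and are cleaned up, but the driver pod keeps its logs and stays in the "completed" state in the Kubernetes API until it is eventually garbage collected or cleaned up manually.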
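The lifecycle described above can be observed with kubectl. The following is a minimal sketch, assuming the pods carry the spark-role label that Spark sets on driver and executor pods; the function names and the default namespace are illustrative and not part of the original setup.

```shell
# List the pods Spark created in a namespace, filtered by role.
# Assumption: pods are labelled spark-role=driver or spark-role=executor.
spark_pods() {
  local role="$1" ns="${2:-default}"
  kubectl get pods -n "$ns" -l "spark-role=$role"
}

# Print the log of a driver pod. The log stays available after the
# application finishes, because the driver pod is kept in a completed state.
spark_driver_log() {
  local pod="$1" ns="${2:-default}"
  kubectl logs -n "$ns" "$pod"
}
```

For example, `spark_pods executor` should return nothing once the application has finished, while `spark_pods driver` should still list the completed driver pod.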

2. Prerequisites

 A deployed k8s cluster
 More than 2 GB of available memory on each node
 A Java environment, JDK 8 or later
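The memory and Java requirements can be checked on a node before installing. This is a rough sketch, assuming a Linux node whose /proc/meminfo exposes MemAvailable; the function names are hypothetical and the thresholds are simply the ones listed above.

```shell
# Print the major Java version for a version string such as
# "1.8.0_212" (legacy scheme) or "11.0.2".
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;    # legacy scheme: 1.8.0_212 -> 8.0_212
  esac
  echo "${v%%[._-]*}"      # keep everything before the first separator
}

# Print the available memory in MiB, read from /proc/meminfo (Linux only).
mem_available_mb() {
  awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo
}

# Return 0 when Java is version 8 or later and more than 2 GiB is available.
check_prereqs() {
  local ver major mem
  ver="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  major="$(java_major "$ver")"
  mem="$(mem_available_mb)"
  [ "${major:-0}" -ge 8 ] && [ "${mem:-0}" -gt 2048 ]
}
```

Running `check_prereqs && echo ok` on each node gives a quick yes/no answer; it does not verify that the k8s cluster itself is healthy.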

Documentation: http://spark.apache.org/docs/latest/running-on-kubernetes.html

Installation and deployment

3. Download the package

# wget http://archive.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
# tar xf spark-2.4.3-bin-hadoop2.7.tgz
# mv spark-2.4.3-bin-hadoop2.7 /usr/local/spark-2.4.3
// Add the Spark bin directory to PATH
# cat /etc/profile
export PATH=/usr/local/spark-2.4.3/bin:$PATH

4. Build the Docker images

# cd /usr/local/spark-2.4.3
# ./bin/docker-image-tool.sh -r wxtime -t 2.4.0 build
# docker images
REPOSITORY          TAG       IMAGE ID        CREATED       SIZE
wxtime/spark-r      2.4.0     592aff869ffb    4 days ago    756MB
wxtime/spark-py     2.4.0     47e104fe2827    4 days ago    462MB
wxtime/spark        2.4.0     24aab7c864da    4 days ago    371MB
# docker login
# ./bin/docker-image-tool.sh -r wxtime -t 2.4.0 push
# kubectl cluster-info
Kubernetes master is running at https://192.168.1.101:6443
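The API server address printed by kubectl cluster-info is what spark-submit later expects after the k8s:// prefix. As a sketch, assuming the current kubeconfig context points at the target cluster, the value can also be derived without parsing that output; the function name is hypothetical.

```shell
# Print the Spark master URL for the cluster in the current kubeconfig context.
# The jsonpath query reads the API server URL of the active cluster entry.
k8s_master_url() {
  local server
  server="$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')"
  printf 'k8s://%s\n' "$server"
}
```

The result can be passed straight to spark-submit, for example `--master "$(k8s_master_url)"`.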

5. Test

# ./bin/spark-shell
scala> sc.parallelize(1 to 1000).count()
res1: Long = 1000
# kubectl create serviceaccount spark
# kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default

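6. Run SparkPi in cluster mode

bin/spark-submit \
--master k8s://https://10.10.0.224:6443 \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=5 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=wxtime/spark:2.4.3 \
--conf spark.kubernetes.container.image.pullPolicy=Always \
local:///opt/spark/examples/jars/spark-examples_2.11-2.4.3.jar

The --master value is k8s:// followed by the API server address reported by kubectl cluster-info. The serviceAccountName setting makes the driver run as the spark account created in step 5. The image must be one that was pushed in step 4 (the listing there shows the tag 2.4.0), and the version in the jar path must match the Spark distribution the image was built from.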
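Once the job has finished, the result is in the driver log. The sketch below assumes the SparkPi example prints a line of the form "Pi is roughly ..." and pulls the number out of log text read from standard input; the function name is hypothetical.

```shell
# Print the value from the first "Pi is roughly ..." line found on stdin.
extract_pi() {
  sed -n 's/.*Pi is roughly \([0-9.]*\).*/\1/p' | head -n 1
}
```

A typical use is `kubectl logs <driver-pod> | extract_pi`, where `<driver-pod>` is the name shown by kubectl get pods for the completed driver.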
