Spark Installation (standalone)

Documentation: http://spark.apache.org/docs/latest/spark-standalone.html

Install Scala

https://www.scala-lang.org/download/

wget -P /opt/downloads https://downloads.lightbend.com/scala/2.13.0/scala-2.13.0.rpm

rpm -ivh /opt/downloads/scala-2.13.0.rpm

Environment variables

vim /etc/profile

export SCALA_HOME=/usr/share/scala

export PATH=$PATH:$SCALA_HOME/bin

Apply the changes

source /etc/profile

Verify

scala -version
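
If this setup is repeated on several machines, the profile edits can be scripted instead of typed into vim. The following is only a sketch: append_once is a hypothetical helper name, not part of Scala or Spark, and it assumes each export line should appear in the file exactly once.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper for this article, not a standard tool.
append_once() {
  line=$1
  file=$2
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root); the SCALA_HOME line can be added the same way:
#   append_once 'export PATH=$PATH:$SCALA_HOME/bin' /etc/profile
```

Running the helper twice leaves a single copy of the line, so the script is safe to re-run.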

Cluster (omitted)

Install Spark

https://www.apache.org/dyn/closer.lua/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz

wget -P /opt/downloads http://mirrors.tuna.tsinghua.edu.cn/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz

wget -P /opt/downloads http://mirror.bit.edu.cn/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz

tar zxvf /opt/downloads/spark-2.4.3-bin-hadoop2.7.tgz -C /opt

mv /opt/spark-2.4.3-bin-hadoop2.7/ /opt/spark
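
The two wget commands above fetch the same archive from different mirrors, so only one of them needs to succeed. A minimal sketch of trying the mirrors in order might look like this; fetch_from_mirrors and the FETCH_CMD override are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Try each mirror URL in turn and stop at the first successful download.
# FETCH_CMD is a hypothetical override hook; the default saves into the
# given directory with wget.
fetch_from_mirrors() {
  dest_dir=$1
  shift
  for url in "$@"; do
    # Intentionally unquoted so the default splits into: wget -q -P
    if ${FETCH_CMD:-wget -q -P} "$dest_dir" "$url"; then
      echo "downloaded from $url"
      return 0
    fi
  done
  echo "all mirrors failed" >&2
  return 1
}

# Usage with the two mirrors from this article:
#   fetch_from_mirrors /opt/downloads \
#     http://mirrors.tuna.tsinghua.edu.cn/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz \
#     http://mirror.bit.edu.cn/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
```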

cp /opt/spark/conf/spark-env.sh.template /opt/spark/conf/spark-env.sh

vim /opt/spark/conf/spark-env.sh

export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64

export SCALA_HOME=/usr/share/scala

export HADOOP_HOME=/opt/hadoop

export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop

export SPARK_MASTER_HOST=0.0.0.0

export SPARK_MASTER_PORT=7077

export SPARK_MASTER_WEBUI_PORT=28080

export SPARK_WORKER_WEBUI_PORT=28081

export SPARK_WORKER_MEMORY=4g

export SPARK_WORKER_CORES=2

export SPARK_WORKER_INSTANCES=1

export SPARK_PID_DIR=/var/run

The default ports 8080 and 8081 conflict with other services too easily, so it is best to change them.
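
Before settling on replacement ports, it can help to check whether a candidate is already taken. The sketch below assumes the usual column layout of `ss -ltn` (local address and port in the fourth column); port_listed is a hypothetical helper name, and it reads the listing from stdin so it can be tried on saved output as well.

```shell
#!/bin/sh
# Succeed if the given port shows up as a local listening port in
# `ss -ltn` output read from stdin. Assumes column 4 is "Local Address:Port".
port_listed() {
  awk -v port="$1" '
    NR > 1 {                       # skip the header row
      n = split($4, parts, ":")    # the port is the last ":"-separated field
      if (parts[n] == port) found = 1
    }
    END { exit !found }
  '
}

# Usage on the master node:
#   ss -ltn | port_listed 28080 && echo "28080 is already taken"
```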

cp /opt/spark/conf/slaves.template /opt/spark/conf/slaves

vim /opt/spark/conf/slaves

Set the IP addresses of the worker nodes, one per line
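
The slaves file is a plain list with one worker host per line, so it can also be written from a script. This is a rough sketch; write_slaves is a hypothetical helper, and the addresses in the usage comment are placeholders rather than values from this article.

```shell
#!/bin/sh
# Write one worker address per line into a slaves file,
# replacing whatever the file contained before.
write_slaves() {
  slaves_file=$1
  shift
  printf '%s\n' "$@" > "$slaves_file"
}

# Usage (placeholder addresses, substitute your own workers):
#   write_slaves /opt/spark/conf/slaves 192.168.1.101 192.168.1.102
```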

Start

Start Hadoop first

/opt/spark/sbin/start-all.sh

Stop

/opt/spark/sbin/stop-all.sh

View the start-all.sh file

cat /opt/spark/sbin/start-all.sh

The script loads the spark-config.sh configuration file, then starts the cluster's master node and the slave (worker) nodes.
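
That sequence can be restated as a small function. It is a simplified sketch of the same three steps, not a copy of the real script; start_cluster is a hypothetical name, and the early returns on failure are an addition made here for illustration.

```shell
#!/bin/sh
# Simplified restatement of the steps start-all.sh performs.
# The Spark directory is passed as an argument (default: /opt/spark).
start_cluster() {
  spark_home=${1:-/opt/spark}
  export SPARK_HOME=$spark_home
  # Load the shared settings into the current shell
  . "$spark_home/sbin/spark-config.sh" || return 1
  # Start the master on this machine
  "$spark_home/sbin/start-master.sh" || return 1
  # Start a worker on every host listed in conf/slaves
  "$spark_home/sbin/start-slaves.sh"
}

# Usage:
#   start_cluster /opt/spark
```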

After startup, the web UI is available at http://192.168.1.xxx:28080