The Spark driver is essentially the driver program of a Spark cluster: to invoke the cluster's compute capability, you must go through it!
from pyspark import SparkConf, SparkContext
conf = SparkConf().setMaster("local").setAppName("My test App")
sc = SparkContext(conf=conf)
lines = sc.textFile("/tmp/tmp.txt")
print(lines.count())
print(lines.first())
Then place a file named tmp.txt under /tmp and run:
./bin/spark-submit my_example/test.py