- Operating system: Ubuntu 20.04
- Required software: Java 8+, Python 3.5+, Scala 2.11.12+
- Install Java 8
sudo apt-get update
sudo apt-get install openjdk-8-jdk
java -version
- Install Scala
sudo wget https://downloads.lightbend.com/scala/2.12.8/scala-2.12.8.deb
sudo dpkg -i scala-2.12.8.deb
scala -version
- Install pip
sudo apt-get install python3-pip
- Install py4j
sudo pip3 install py4j
- Install Python 3.6
Ubuntu 20.04 ships with Python 3.8 by default. If you want Python 3.6 instead, you can use the "deadsnakes" team PPA, which packages other Python versions for Ubuntu:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.6
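Installing Python 3.6 does not by itself make PySpark use it. PySpark reads the `PYSPARK_PYTHON` environment variable, when set, to choose its Python interpreter. A minimal sketch, assuming `python3.6` from the PPA is on `PATH` (the warning text is only illustrative):

```shell
# Assumption: python3.6 was installed from the deadsnakes PPA and is on PATH.
export PYSPARK_PYTHON=python3.6

# Optional sanity check: prints the resolved path, or a warning if not found.
command -v "$PYSPARK_PYTHON" || echo "warning: $PYSPARK_PYTHON not found on PATH"
```

To make the setting permanent, the `export` line can go into `~/.bashrc` next to the other variables.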
- Install Spark
sudo wget https://mirrors.tuna.tsinghua.edu.cn/apache/spark/spark-2.4.8/spark-2.4.8-bin-hadoop2.7.tgz
tar -zxvf spark-2.4.8-bin-hadoop2.7.tgz
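- Edit the environment variables
Set SPARK_HOME to match the directory where you actually extracted Spark. Open ~/.bashrc:
vim ~/.bashrc
Add the following lines:
export SPARK_HOME=/opt/module/spark-2.4.8-bin-hadoop2.7
export PATH=${SPARK_HOME}/bin:$PATH
Then reload the file:
source ~/.bashrc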
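A wrong `SPARK_HOME` is a common reason for `pyspark` not being found after this step. The following sketch checks that the variable points at a directory containing an executable `bin/pyspark`. It assumes a POSIX shell, and `check_spark_home` is a hypothetical helper name, not something Spark provides:

```shell
# Hypothetical helper: succeed only if the given directory looks like a Spark install.
check_spark_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "SPARK_HOME is not set"
        return 1
    fi
    if [ ! -x "$dir/bin/pyspark" ]; then
        echo "no executable bin/pyspark under $dir"
        return 1
    fi
    echo "SPARK_HOME looks valid: $dir"
}

# Report on the current shell; '|| true' keeps going if the check fails.
check_spark_home "${SPARK_HOME:-}" || true
```

If the check reports a problem, compare the path in `~/.bashrc` with the directory that `tar` actually created.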
- Start pyspark
pyspark
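If the shell does not start, it can help to confirm that each tool from the steps above is actually on `PATH`. A minimal sketch, assuming a POSIX shell; `check_tool` is a hypothetical helper name, and the tool list simply mirrors the installs described here:

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok:      $1 -> $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# Tools installed in the steps above; keep going even if one is missing.
for tool in java scala python3.6 pyspark; do
    check_tool "$tool" || true
done
```

Any line marked `missing` points back to the install step that needs to be repeated or to a `PATH` entry that was not picked up.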
Reposted from: link
Author: Congqing He