PySpark with Jupyter

First, generate the Jupyter config file.

jupyter-notebook --generate-config

Then edit the generated Jupyter config file (by default ~/.jupyter/jupyter_notebook_config.py) and set:

c.NotebookApp.port = 18888
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.allow_root = True
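
If the config has to be applied on more than one machine, the edit can be scripted instead of done by hand. The following is only a minimal sketch: the helper name append_settings and the SETTINGS list are made up for illustration and are not part of Jupyter. It appends whichever of the three lines above are not yet present in the file.

```python
from pathlib import Path

# The three settings from the text above, exactly as they appear in the config file.
# SETTINGS and append_settings are hypothetical names used only for this sketch.
SETTINGS = [
    "c.NotebookApp.port = 18888",
    "c.NotebookApp.ip = '0.0.0.0'",
    "c.NotebookApp.allow_root = True",
]


def append_settings(config_path):
    """Append any of SETTINGS missing from the file; return the lines that were added."""
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    missing = [line for line in SETTINGS if line not in text.splitlines()]
    if missing:
        # Start on a fresh line if the existing file does not end with a newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with path.open("a") as fh:
            fh.write(prefix + "\n".join(missing) + "\n")
    return missing
```

Assuming the config file was generated in the default location, it would be called as append_settings("~/.jupyter/jupyter_notebook_config.py"). Running it a second time adds nothing, since lines already present are skipped.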

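Spark itself must of course be set up correctly; on EMR, Spark is already fully configured. Next, set the PySpark driver environment variables so that the driver runs inside a Jupyter notebook: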

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

Then simply start pyspark.

pyspark --master yarn
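
The two exports and the launch command can also be wrapped in a small Python launcher, so the variables are set only for the pyspark process and not for the whole shell. This is a rough sketch, not the only way to do it: the function names pyspark_notebook_env and launch_pyspark_notebook are invented for illustration, and it assumes pyspark is on the PATH.

```python
import os
import subprocess


def pyspark_notebook_env(base_env=None):
    """Return a copy of the environment with the two PySpark driver variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_DRIVER_PYTHON"] = "jupyter"
    env["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook"
    return env


def launch_pyspark_notebook(master="yarn"):
    """Run `pyspark --master <master>` with the notebook as the driver front end.

    Blocks until pyspark exits and returns its exit code.
    """
    return subprocess.call(["pyspark", "--master", master], env=pyspark_notebook_env())
```

Calling launch_pyspark_notebook() is then equivalent to the two export lines followed by pyspark --master yarn.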