Error when running PySpark?

Everything is set up: running pyspark from cmd works fine, but running the following script from the Python IDE always fails.

from pyspark import SparkConf, SparkContext

# Initialize Spark
conf = SparkConf().setMaster("local").setAppName("My App")
print("-----------------1-------------------")
sc = SparkContext(conf=conf)
print("-----------------2-------------------")
lines = sc.textFile("E:///JinXiejie/spark-2.2.0-bin-hadoop2.7/README.md")
pythonLines = lines.filter(lambda line: "Python" in line)
print("---------------3-----------------")
print(pythonLines.first())


Reported error:

C:\Python34\python.exe E:/JinXiejie/PythonCases/PyDemo/Pydemo.py
-----------------1-------------------
Traceback (most recent call last):
  File "E:/JinXiejie/PythonCases/PyDemo/Pydemo.py", line 10, in <module>
    sc = SparkContext(conf=conf)
  File "C:\Python34\lib\site-packages\pyspark\context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\Python34\lib\site-packages\pyspark\context.py", line 283, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\Python34\lib\site-packages\pyspark\java_gateway.py", line 80, in launch_gateway
    proc = Popen(command, stdin=PIPE, env=env)
  File "C:\Python34\lib\subprocess.py", line 859, in __init__
    restore_signals, start_new_session)
  File "C:\Python34\lib\subprocess.py", line 1112, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the specified file.

Process finished with exit code 1
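The last frames of the traceback show what actually failed: launch_gateway hands Popen a spark-submit command derived from SPARK_HOME, and when that variable is not set Windows cannot resolve the executable. The same FileNotFoundError can be reproduced with any nonexistent command (a minimal sketch; the command name here is deliberately made up):

```python
import subprocess

# Popen raises FileNotFoundError when the executable cannot be found --
# the same failure mode pyspark's launch_gateway hits when SPARK_HOME
# is missing. "no-such-spark-submit" is a deliberately nonexistent command.
try:
    subprocess.Popen(["no-such-spark-submit"])
except FileNotFoundError as exc:
    print("reproduced:", exc)
```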


Solution:
In PyCharm, open Run -> Edit Configurations -> Environment variables.

Add both PYTHONPATH and SPARK_HOME there (the exact values depend on where Spark is installed on your machine). PySpark launches a spark-submit subprocess when creating the SparkContext; without SPARK_HOME it cannot locate that script, which is what raises the FileNotFoundError above.



Problem solved!
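If you prefer not to touch the PyCharm dialog, the same two variables can be set in-process before pyspark is imported. A sketch, assuming the Spark path from the script above; adjust both paths to your own installation:

```python
import os

# Point SPARK_HOME at the unpacked Spark directory (path taken from the
# script above -- change it to match your machine).
spark_home = r"E:\JinXiejie\spark-2.2.0-bin-hadoop2.7"
os.environ["SPARK_HOME"] = spark_home

# Make Spark's bundled Python bindings importable as well.
os.environ["PYTHONPATH"] = os.path.join(spark_home, "python")
```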
