Spark Cluster Setup and Testing

Note: port 4040 shows job information for applications that are currently running, while port 18080 shows job information for applications that have finished.

17. Test the Spark cluster by submitting an application with spark-submit:

a. Submit the SparkPi example that ships with Spark (a fuller submission example with resource flags follows after this step):

spark-submit --class org.apache.spark.examples.SparkPi --master spark://master:7077 $SPARK_HOME/lib/spark-examples-1.6.0-hadoop2.6.0.jar

b. Check the job in the web UI:

Note: for more details on spark-submit, see the official documentation:

http://spark.apache.org/docs/latest/submitting-applications.html
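
For reference, a submission that sets resources explicitly might look like the following sketch; the memory, core, and slice values here are illustrative choices, not from the original setup, while the flags themselves are standard spark-submit options:

spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master:7077 \
  --deploy-mode client \
  --executor-memory 1g \
  --total-executor-cores 2 \
  $SPARK_HOME/lib/spark-examples-1.6.0-hadoop2.6.0.jar 100

The trailing 100 is SparkPi's optional argument for the number of slices (partitions); raising it spreads the Pi estimation over more tasks.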

18. Appendix: common errors and solutions:

a. Error: during Ubuntu installation: "This kernel requires an X86-64 CPU, but only detected an i686 CPU". Solution: installing a 64-bit system requires a 64-bit CPU with hardware virtualization enabled. First check whether the CPU is 64-bit; if it is, enter the BIOS, find Virtualization under CPU Configuration, and set it to Enabled.
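
If the machine is already running some Linux, both prerequisites can be checked from /proc/cpuinfo before reinstalling (a quick sketch; lm, vmx, and svm are standard CPU capability flags reported by the kernel):

grep -c ' lm ' /proc/cpuinfo        # non-zero: the CPU supports 64-bit long mode
egrep -c '(vmx|svm)' /proc/cpuinfo  # non-zero: Intel VT-x or AMD-V is present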

b. Error: E212: can't open file for writing. Solution: insufficient permissions; switch to root.

c. Error: sudo fails with: unable to change to root gid: Operation not permitted. Solution: this happens when logged in with the guest account; log in as a regular user and sudo will work.
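
For the E212 case, a common vim idiom writes the buffer through sudo without re-opening the file as root (a well-known trick, offered here as an aside rather than part of the original text):

:w !sudo tee % > /dev/null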

d. Error: executing the following command fails:

root@worker1:~# scp id_rsa.pub root@master:/root/.ssh/id_rsa.pub.slave1
root@worker1's password: Permission denied, please try again.
root@worker1's password: Permission denied, please try again.
root@worker1's password: Permission denied (publickey,password).
lost connection

Solution: log in to the scp target machine, edit /etc/ssh/sshd_config to comment out the line #PermitRootLogin without-password and add PermitRootLogin yes, then restart the ssh service (sudo service ssh restart).
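
The same sshd_config edit can be scripted on the target machine; a sketch, assuming a stock config with a single PermitRootLogin line:

sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config
grep PermitRootLogin /etc/ssh/sshd_config    # verify the change
sudo service ssh restart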

e. Error: starting the Spark history server fails with:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:235)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.IllegalArgumentException: Log directory specified does not exist: file:/tmp/spark-events. Did you configure the correct one through spark.history.fs.logDirectory?

Solution: when starting the Spark history server, either specify the log directory on the command line or configure the relevant parameters in advance. Properties whose names start with spark.history belong in SPARK_HISTORY_OPTS in spark-env.sh; properties starting with spark.eventLog belong in spark-defaults.conf. For example, add the following to spark-defaults.conf:

spark.eventLog.enabled true
spark.eventLog.dir hdfs://spark/jobhistory
spark.history.fs.logDirectory hdfs://spark/jobhistory
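
On the history-server side, the matching spark.history settings can be supplied through SPARK_HISTORY_OPTS in spark-env.sh; a sketch, where the UI port repeats the default 18080 mentioned earlier and the retained-application count is an illustrative value:

export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://spark/jobhistory -Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30"

Also note that the event-log directory must exist before the history server starts (this is exactly what the exception above complains about), so create it first:

hadoop fs -mkdir -p hdfs://spark/jobhistory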