@tsing1226 2016-05-04

spark

Writing Spark Cluster Launch Scripts



  1. In the Spark configuration directory conf, create a file named slaves; it must list the hostnames of all Spark workers, one per line.

    $ cd $SPARK_HOME
    $ cp conf/slaves.template conf/slaves
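
For example, with three hypothetical worker hosts (the names below are placeholders), conf/slaves might look like this:

    # conf/slaves -- one Spark worker hostname per line
    worker01
    worker02
    worker03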

  2. The master machine must be able to log in to every worker over SSH without a password (key-based access). If password-less SSH is not set up, you can set the SPARK_SSH_FOREGROUND environment variable and enter a password for each worker in turn when the scripts run.
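
A minimal sketch of setting up password-less SSH from the master, assuming the same user account exists on every machine (hostnames are placeholders):

    # Run on the master node
    $ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa    # generate a key pair with an empty passphrase
    $ ssh-copy-id worker01                        # copy the public key to each worker
    $ ssh-copy-id worker02
    $ ssh-copy-id worker03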

  3. Start and stop the Spark cluster with the scripts under $SPARK_HOME/sbin:

    • sbin/start-master.sh - Starts a master instance on the machine the script is executed on.
    • sbin/start-slaves.sh - Starts a slave instance on each machine specified in the conf/slaves file.
    • sbin/start-all.sh - Starts both a master and a number of slaves as described above.
    • sbin/stop-master.sh - Stops the master that was started via the sbin/start-master.sh script.
    • sbin/stop-slaves.sh - Stops all slave instances on the machines specified in the conf/slaves file.
    • sbin/stop-all.sh - Stops both the master and the slaves as described above.

Note: these scripts must be executed on the machine you want the Spark master to run on, not on your local machine. A typical sequence is sketched below.
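
A typical start/stop sequence, run on the master node (a sketch; commands assume the current directory is $SPARK_HOME):

    $ ./sbin/start-all.sh    # starts the master plus one worker per entry in conf/slaves
    $ ./sbin/stop-all.sh     # later, stops the master and all workers
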
  4. (Optional) Configure environment variables by editing the configuration file conf/spark-env.sh.
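
A sketch of common entries in conf/spark-env.sh (the values are illustrative only; SPARK_MASTER_IP is the Spark 1.x name, renamed SPARK_MASTER_HOST in 2.x):

    # conf/spark-env.sh -- illustrative values, adjust for your cluster
    export JAVA_HOME=/usr/java/default     # placeholder path
    export SPARK_MASTER_IP=master01        # host the master binds to (placeholder)
    export SPARK_MASTER_PORT=7077          # default master port
    export SPARK_WORKER_CORES=2            # cores each worker offers
    export SPARK_WORKER_MEMORY=2g          # memory each worker offers
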
  5. Start the cluster. Once started, the master prints a spark://HOST:PORT URL (also shown on its web UI, port 8080 by default), which applications use to connect in step 6.

    $ ./sbin/start-master.sh
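
A quick way to check that the daemons came up is to list the running JVM processes:

    $ jps    # the standalone master runs as a process named "Master"; each worker as "Worker"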

  6. Deploy applications to the cluster by passing them the master URL from step 5:

    $ ./bin/spark-shell --master spark://IP:PORT
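
A packaged application can be submitted the same way with spark-submit; a sketch with a placeholder master address, class name, and jar path:

    $ ./bin/spark-submit \
        --master spark://master01:7077 \
        --class com.example.MyApp \
        /path/to/my-app.jar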