
Spark: Enabling Multi-Node Slaves, and an Error Note

늘근이 · 2017-01-08 22:51

To install Spark, first follow the guide below:

http://data-flair.training/blogs/install-deploy-run-spark-2-x-multi-node-cluster-step-by-step-guide/

The installation is not that difficult. In brief:


1) Install Java, Spark, and Scala on every node

2) Share SSH keys so that no password is needed

3) Edit the hosts file (/etc/hosts)

4) Edit the conf files (spark-env.sh, slaves; a sketch of steps 2 through 4 follows this list)

5) sbin/start-all.sh
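
As a rough sketch of steps 2) through 4), assuming the master is rakyunui-MacBook-Pro.local and the workers are datanode1 through datanode3 (the hostnames from the error note below; the IP addresses and the rakyunkoh user are illustrative guesses, not values from this cluster):

$ # 2) passwordless SSH: generate a key on the master, push it to each worker
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
$ ssh-copy-id rakyunkoh@datanode1    # repeat for datanode2 and datanode3

$ # 3) map hostnames to IPs in /etc/hosts on every node (example addresses)
$ sudo tee -a /etc/hosts <<'EOF'
192.168.0.10  rakyunui-MacBook-Pro.local
192.168.0.11  datanode1
192.168.0.12  datanode2
192.168.0.13  datanode3
EOF

$ # 4) tell the workers where the master is, and list the workers
$ echo 'export SPARK_MASTER_HOST=rakyunui-MacBook-Pro.local' >> /usr/local/spark/conf/spark-env.sh
$ printf 'datanode1\ndatanode2\ndatanode3\n' >> /usr/local/spark/conf/slaves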


Below is the error note:

starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.master.Master-1-rakyunui-MacBook-Pro.local.out

localhost: ssh: connect to host localhost port 22: Connection refused

DataNode1: ssh: connect to host datanode1 port 22: Connection refused

DataNode3: Warning: Permanently added 'datanode3' (ECDSA) to the list of known hosts.

DataNode3: mkdir: cannot create directory '/usr/local/spark/logs': Permission denied

DataNode3: chown: cannot access '/usr/local/spark/logs': No such file or directory

DataNode3: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out

DataNode3: /usr/local/spark/sbin/spark-daemon.sh: line 128: /usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out: No such file or directory

DataNode2: mkdir: cannot create directory '/usr/local/spark/logs': Permission denied

DataNode2: chown: cannot access '/usr/local/spark/logs': No such file or directory

DataNode2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out

DataNode2: /usr/local/spark/sbin/spark-daemon.sh: line 128: /usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out: No such file or directory

DataNode3: failed to launch: nice -n 0 /usr/local/spark/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://rakyunui-MacBook-Pro.local:7077

DataNode3: tail: cannot open '/usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out' for reading: No such file or directory

DataNode3: full log in /usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out

DataNode2: failed to launch: nice -n 0 /usr/local/spark/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://rakyunui-MacBook-Pro.local:7077

DataNode2: tail: cannot open '/usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out' for reading: No such file or directory

DataNode2: full log in /usr/local/spark/logs/spark-rakyunkoh-org.apache.spark.deploy.worker.Worker-1-raspberrypi.out


The workers failed to start properly on any of the data nodes.

One node has sshd listening on port 9999 instead of 22, which explains the "Connection refused" errors,

and the other nodes look like they lack write permission on the Spark log directory.


1) Port

$ sudo vim /etc/ssh/sshd_config

Change the port number back to 22, then restart sshd:

$ sudo service ssh restart
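
Concretely, on the misbehaving node the Port directive in sshd_config was set to 9999 and needs to read:

Port 22

After the restart, a quick check from the master confirms that port 22 answers again:

$ ssh datanode1 echo ok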


2) Permissions

$ sudo chmod XXX <directory-name>
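
In this case the "Permission denied" was on /usr/local/spark/logs, so concretely (assuming, per the log file names above, that the Spark processes run as the user rakyunkoh) something like this on each worker should do it:

$ sudo mkdir -p /usr/local/spark/logs
$ sudo chown -R rakyunkoh /usr/local/spark/logs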


To stop the nodes, run stop-all.sh; to start them, run start-all.sh (both live under the sbin directory).

This launches every connected slave in one go.

Checking with jps (the Java process status tool) confirms that a Worker is actually running.
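
For example, on a worker node the jps output should look roughly like this (the PIDs are arbitrary):

$ jps
2312 Worker
2398 Jps

On the master node, a Master process shows up instead of a Worker.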

Two of the workers come up fine, but the one netbook does not. To see whether it actually works, I increased the cores.

Before launching the Spark shell, explicitly specifying the master as shown below seems to make it run properly. Be warned, though: if you allocate cores and memory oddly, things get strange.


./spark-shell --master spark://rakyunui-MacBook-Pro.local:7077
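
To make the core and memory allocation explicit as well, spark-shell accepts the standard spark-submit resource flags; a sketch with placeholder numbers (tune them to your nodes, and note the Raspberry Pi workers here have little memory to spare):

./spark-shell --master spark://rakyunui-MacBook-Pro.local:7077 \
  --total-executor-cores 4 \
  --executor-memory 512m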