Deploying Spark on a Hadoop Cluster

Pick the Spark build that matches your Hadoop version from here and download it:

wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.6.tgz

tar -xvf spark-2.1.0-bin-hadoop2.6.tgz -C ./
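
Once extracted, Spark needs no further installation; it only has to find the cluster's Hadoop configuration. A minimal smoke test on YARN, a sketch assuming the /opt/hadoop/hadoop layout used later in this post (adjust the paths to your install):

# Tell Spark where the Hadoop/YARN configuration lives (path is an assumption)
export HADOOP_CONF_DIR=/opt/hadoop/hadoop/etc/hadoop
export SPARK_HOME=$(pwd)/spark-2.1.0-bin-hadoop2.6

# Run the bundled SparkPi example against YARN to verify the deployment
$SPARK_HOME/bin/spark-submit --master yarn --deploy-mode client \
    --class org.apache.spark.examples.SparkPi \
    $SPARK_HOME/examples/jars/spark-examples_2.11-2.1.0.jar 10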

(to be continued)

References:

Install spark on yarn cluster

Installing Hive on the GatewayNode

Use rpi3 as the GatewayNode (removing it from the slaves list), leaving rpi1 and rpi2 as DataNodes.
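
Dropping rpi3 from the worker list is just an edit to Hadoop's slaves file on the NameNode, followed by an HDFS restart. A sketch, assuming Hadoop lives under /opt/hadoop/hadoop:

# On rpi0: remove rpi3 so it no longer runs a DataNode (path is an assumption)
nano /opt/hadoop/hadoop/etc/hadoop/slaves
# contents afterwards should read:
#   rpi1
#   rpi2

# Restart HDFS so the change takes effect
/opt/hadoop/hadoop/sbin/stop-dfs.sh
/opt/hadoop/hadoop/sbin/start-dfs.sh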

Install Hive

mkdir /opt/hive

cd /opt/hive

wget https://www.apache.org/dist/hive/stable-2/apache-hive-2.1.1-bin.tar.gz

tar -xzf apache-hive-2.1.1-bin.tar.gz

mv apache-hive-2.1.1-bin/* ./
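
With the tarball unpacked into /opt/hive, export HIVE_HOME before touching the configuration so the hive and schematool binaries are on the PATH (a sketch of the assumed environment):

export HIVE_HOME=/opt/hive
export PATH=$PATH:$HIVE_HOME/bin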

Edit conf/hive-env.sh:

HADOOP_HOME=/op......
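
The preview cuts off above. For orientation, a minimal hive-env.sh consistent with the Hadoop layout used later in this post (both paths are assumptions), plus the metastore schema initialization that Hive 2.x requires before first use:

# conf/hive-env.sh (paths are assumptions matching the /opt layout)
HADOOP_HOME=/opt/hadoop/hadoop
export HIVE_CONF_DIR=/opt/hive/conf

# Hive 2.x refuses to start until a metastore schema exists;
# initialize an embedded Derby metastore for a single-user setup
/opt/hive/bin/schematool -dbType derby -initSchema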

Installing Hadoop on a Raspberry Pi Cluster

Prerequisites

Four Raspberry Pis that can reach one another over the network (see the sketch after this list), where:

rpi0 is the master node (NameNode)

rpi1, rpi2, and rpi3 are worker nodes (DataNodes)
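
"Reach one another" in practice means consistent hostname resolution and passwordless SSH from the master. A sketch (the IP addresses and the pi user are made-up assumptions; adjust to your LAN):

# /etc/hosts on every node
192.168.1.100   rpi0
192.168.1.101   rpi1
192.168.1.102   rpi2
192.168.1.103   rpi3

# On rpi0: generate a key and push it to each worker for passwordless SSH
ssh-keygen -t rsa -b 4096
for h in rpi1 rpi2 rpi3; do ssh-copy-id pi@$h; done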

Installation Steps

Create the installation and HDFS directories

sudo mkdir /opt/hadoop/

sudo mkdir /opt/hadoop_tmp/
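
If Hadoop will run as the default pi user (an assumption), hand both directories over to it so the daemons can write without sudo:

# Let the (assumed) pi user own the install and HDFS scratch directories
sudo chown -R pi:pi /opt/hadoop /opt/hadoop_tmp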

Install Java

sudo apt-get update

sudo apt-get install oracle-java8-jdk
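
Verify the JDK landed before moving on:

# Should report something like: java version "1.8.0_xx"
java -version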

Configure environment variables

export HADOOP_HOME=/opt/hadoop/hadoop

export PATH=$PATH:$HADOOP_HOME/bin
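
These exports only last for the current shell; to make them permanent, append them to ~/.bashrc (assuming the pi user) and verify once the Hadoop tarball is unpacked under $HADOOP_HOME:

echo 'export HADOOP_HOME=/opt/hadoop/hadoop' >> ~/.bashrc
echo 'export PATH=$PATH:$HADOOP_HOME/bin' >> ~/.bashrc
source ~/.bashrc

# Sanity check: should print the installed Hadoop version
hadoop version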

......

Hadoop 1.0 vs Hadoop 2.0