1. Environment and versions
The installation packages are uploaded to Baidu Cloud; download them yourself. Link: https://pan.baidu.com/s/1evVp5Zk0_X7VdjKlHGkYCw Extraction code: ypti
(I had previously installed Apache Hadoop 2.6.4, but Hive reported errors on startup, so I switched everything over to the CDH releases.)
2. Pre-installation setup
2.1 Install the JDK
(1) Download the JDK
(2) Unpack it, then configure the environment variables in /etc/profile:
export JAVA_HOME=/home/jdk1.8.0_131
export PATH=${JAVA_HOME}/bin:${PATH}
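After sourcing /etc/profile, a quick check that the JDK resolves (the version string assumes the jdk1.8.0_131 build used above):
source /etc/profile
java -version    # should print java version "1.8.0_131"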
2.2 Passwordless SSH login
ssh-keygen
Adjust the path to match where your key was generated:
cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
Test with:
ssh localhost
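For start-all.sh to log into the data server without a password, the master's public key must also be authorized there. A minimal sketch, assuming the hostnames master and data from section 2.4 and a root login (ssh-copy-id is the standard OpenSSH helper):
# run on master: push the public key to the data server
ssh-copy-id root@data
# confirm that no password prompt appears
ssh root@data hostname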
2.3 Install MySQL (needed by the Hive environment)
See the runoob tutorial: https://www.runoob.com/linux/mysql-install-setup.html
My database lives on a remote host, so MySQL must be configured to accept remote connections, as in the sketch below.
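A minimal sketch of enabling remote access, assuming MySQL 5.x and the root/root credentials used in hive-site.xml later (tighten the host pattern and password for anything beyond a test box):
# run on the MySQL server; '%' allows connections from any host
mysql -u root -p -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root'; FLUSH PRIVILEGES;"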
2.4 Configure IPs
Edit /etc/hosts on both servers. Mine are two machines, one master and one data node; the parenthesized labels below are annotations only and are not written into the file.
IP address hostname (master)
IP address hostname (data)
3. Install Hadoop
(1) Download the tarball
(2) Unpack it on each server and set the environment variables
Environment variable configuration:
export HADOOP_HOME=/home/hadoop-2.6.0-cdh5.15.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
#export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
Remember to run source /etc/profile so the changes take effect!
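A quick sanity check that the variables resolve (the exact version string is an assumption based on the release installed above):
hadoop version    # should report Hadoop 2.6.0-cdh5.15.1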
(3) Configuration files
- Configure the master server
Edit etc/hadoop/core-site.xml under the Hadoop directory:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
Edit etc/hadoop/hdfs-site.xml under the Hadoop directory:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!--
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop-2.6.0-cdh5.15.1/hadoop_data/hdfs/namenode</value>
</property>
-->
</configuration>
Edit etc/hadoop/mapred-site.xml under the Hadoop directory:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
</property>
</configuration>
Edit etc/hadoop/yarn-site.xml under the Hadoop directory:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8050</value>
</property>
</configuration>
Create a masters file under etc/hadoop/ in the Hadoop directory and put master in it.
Create a slaves file there as well and put data in it (with multiple data servers, write each one, e.g. data1, data2, data3, one per line); a shell sketch follows below.
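A minimal sketch of creating both files from the shell, using the install path from this post:
cd /home/hadoop-2.6.0-cdh5.15.1/etc/hadoop
echo "master" > masters
echo "data" > slaves    # with several data servers, append one hostname per line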
- Configure the data server
Edit etc/hadoop/core-site.xml under the Hadoop directory:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
Edit etc/hadoop/hdfs-site.xml under the Hadoop directory:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop-2.6.0-cdh5.15.1/hadoop_data/hdfs/datanode</value>
</property>
</configuration>
Edit etc/hadoop/mapred-site.xml under the Hadoop directory:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
</property>
</configuration>
Edit etc/hadoop/yarn-site.xml under the Hadoop directory:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8050</value>
</property>
</configuration>
(4) Start
Go into sbin under the Hadoop directory and run start-all.sh; you can also run start-dfs.sh and start-yarn.sh separately. (For the very first start, see the note below.)
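On a brand-new cluster the NameNode must be formatted before the first start, otherwise it fails to come up. A sketch, run on master (formatting erases HDFS metadata, so do it only on a fresh install):
hdfs namenode -format
start-all.sh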
(5) Verify
- The master server, with the NameNode started:
- The data server, with the DataNode started:
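The usual way to make this check is jps, which ships with the JDK; the expected process lists below are assumptions based on the roles configured above:
# on master
jps    # expect NameNode, SecondaryNameNode, ResourceManager
# on data
jps    # expect DataNode, NodeManager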
4. Install HBase
(1) Download and unpack HBase
(2) Configure the environment variables
export HBASE_HOME=/home/hbase-1.2.0-cdh5.15.1
export PATH=$PATH:$HBASE_HOME/bin
(3) Configuration files
Edit conf/hbase-env.sh under the HBase install directory and change it as needed (typically JAVA_HOME).
Edit conf/hbase-site.xml under the HBase install directory and change:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:/home/hbase-1.2.0-cdh5.15.1/hbase_data</value>
</property>
</configuration>
(4) Start
Run hbase shell to enter the shell (if the HBase daemons are not up yet, start them first with start-hbase.sh).
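A small smoke test, piped into the shell non-interactively (the table name smoke is made up for the test):
hbase shell <<'EOF'
create 'smoke', 'cf'
put 'smoke', 'row1', 'cf:a', 'hello'
scan 'smoke'
disable 'smoke'
drop 'smoke'
EOF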
5. Install Hive
(1) Download and unpack Hive
(2) Configure the environment variables
export HIVE_HOME=/home/hive-1.1.0-cdh5.15.1
export PATH=:$JAVA_HOME/bin:$MAVEN_HOME/bin:$FINDBUGS_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SQOOP_HOME/bin:$HIVE_HOME/bin:$PATH
(3) Configuration files
Edit conf/hive-env.sh under the Hive install directory and change:
export HADOOP_HOME=/home/hadoop-2.6.0-cdh5.15.1/
export HBASE_HOME=/home/hbase-1.2.0-cdh5.15.1
Edit conf/hive-site.xml under the Hive install directory and change:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://IP_address:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=utf8&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
</configuration>
This sets up the remote MySQL connection; I keep the default database name hive, and it must be created in MySQL ahead of time (see the sketch below).
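A sketch of the two preparation steps this implies: create the metastore database, and put the MySQL JDBC driver on Hive's classpath (the connector jar filename is an example; use whichever version you downloaded):
# on the MySQL server
mysql -u root -p -e "CREATE DATABASE hive;"
# Hive needs the MySQL driver jar in its lib directory
cp mysql-connector-java-5.1.47.jar /home/hive-1.1.0-cdh5.15.1/lib/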
(4) Start
Type hive to launch it.
If the terminal reports a JLine error, the jline jar under share/hadoop/yarn/lib/ in the Hadoop directory and the one under lib/ in the Hive install directory must be the same version (see the sketch below)!
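The usual fix is to replace YARN's old jline with the newer one Hive ships; a sketch with the version numbers as assumptions (check what the two directories actually contain):
cd /home/hadoop-2.6.0-cdh5.15.1/share/hadoop/yarn/lib
mv jline-0.9.94.jar jline-0.9.94.jar.bak    # keep the old jar as a backup
cp /home/hive-1.1.0-cdh5.15.1/lib/jline-2.12.jar .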
That wraps up the installation for now; more to follow!