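Articles in this series so far:
Building a Big Data Platform (0): Preparing the machines
Building a Big Data Platform (1): Setting up Hadoop [hdfs, yarn, mapreduce]
Building a Big Data Platform (2): Setting up ZooKeeper
Building a Big Data Platform (3): Setting up HBase
Building a Big Data Platform (4): Setting up Hive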
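0. Preparation
Hive runs on top of Hadoop, so unlike Hadoop or Spark it does not have to be installed on every node; installing it once on the Hadoop master node is enough. Before installing Hive, a working Hadoop environment and a MySQL server (used here as the metastore database) are required.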
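The prerequisites above can be checked before starting the installation. The following is a minimal sketch, assuming that `java` and `hadoop` are expected on the `PATH` of the master node; `check_commands` is a hypothetical helper name, not part of Hive or Hadoop. MySQL is not checked because it may run on a different host.

```shell
# check_commands is a hypothetical helper: it reports whether each named
# command is on PATH and returns non-zero if any of them is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: Hive needs Java and the Hadoop client on this node.
check_commands java hadoop || echo "Install the missing prerequisites first."
```

The final line only prints a hint instead of exiting, so the snippet can be pasted into an interactive shell without closing it.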
1. Installation
1.1 Download and extract the package
# Download hive-1.1.0-cdh5.5.0.tar.gz into the ~/bigdataspace directory on the master machine
# Extract the package:
[hadoop@master ~]$ cd ~/bigdataspace
[hadoop@master bigdataspace]$ tar -zxvf hive-1.1.0-cdh5.5.0.tar.gz
# Delete the archive once extraction has finished:
[hadoop@master bigdataspace]$ rm hive-1.1.0-cdh5.5.0.tar.gz
# Configure the HIVE_HOME environment variable
[hadoop@master ~]$ sudo vi /etc/profile
(Add the following lines; the HIVE_HOME line and the $HIVE_HOME/bin entry in PATH are the new parts)
export HIVE_HOME=/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0
export PATH=$JAVA_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH
# Apply the environment variables
[hadoop@master ~]$ source /etc/profile
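Because `/etc/profile` may be sourced more than once in the same session, the `PATH` entry can be guarded against duplicates. This is a minimal sketch, assuming the installation path used in this article; `add_to_path` is a hypothetical helper name.

```shell
# add_to_path is a hypothetical helper: prepend a directory to PATH only if
# it is not already present, so sourcing the profile twice adds no duplicates.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Assumption: Hive was extracted to the directory used in this article.
export HIVE_HOME="${HIVE_HOME:-/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0}"
add_to_path "$HIVE_HOME/bin"
export PATH
```

Afterwards, `command -v hive` should print a path under `$HIVE_HOME/bin` if the variables took effect.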
1.2 Edit the hive-env.sh configuration file
[hadoop@master ~]$ cd /home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/conf
[hadoop@master conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@master conf]$ vi hive-env.sh
# Append the following to the end of hive-env.sh:
export HADOOP_HOME=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0
export HIVE_CONF_DIR=/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/conf
1.3 Create the hive-site.xml configuration file
[hadoop@master conf]$ vi hive-site.xml
## The main settings are as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/data/hive-1.1.0-cdh5.5.0/hive-db/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/data/hive-1.1.0-cdh5.5.0/tmp/hive-${user.name}</value>
<description>Scratch space for Hive jobs</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/data/hive-1.1.0-cdh5.5.0/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/data/hive-1.1.0-cdh5.5.0/downloaded</value>
<description>
Temporary local directory for added resources in the remote file system.
</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/data/hive-1.1.0-cdh5.5.0/queryLogs/${user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://slave1:3306/hive?useUnicode=true&amp;characterEncoding=utf8</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
</configuration>
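The JDBC settings above assume that a database named `hive` and a MySQL account `hive` with password `hive` already exist on `slave1`; the article does not show how they were created. One possible way is sketched below, assuming a MySQL 5.x server and an administrative account named `root`. `metastore_sql` is a hypothetical helper that only prints the statements so they can be reviewed first.

```shell
# metastore_sql is a hypothetical helper that prints the SQL needed to create
# the metastore database and account referenced in hive-site.xml.
# Arguments: database name, user name, password.
metastore_sql() {
    cat <<EOF
CREATE DATABASE IF NOT EXISTS \`$1\`;
CREATE USER '$2'@'%' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON \`$1\`.* TO '$2'@'%';
EOF
}

# Print the statements for review; the values match the hive-site.xml above.
metastore_sql hive hive hive

# To apply them (assumption: root may log in to MySQL on slave1):
#   metastore_sql hive hive hive | mysql -h slave1 -u root -p
```

The `'%'` host part lets the account connect from any machine, which is convenient on a test cluster; a stricter host pattern is advisable elsewhere.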
1.4 Add the mysql-connector jar to the lib directory of the Hive installation
# $HIVE_HOME is the Hive installation directory set earlier: /home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0
[hadoop@master ~]$ mv mysql-connector-java-5.1.33.jar $HIVE_HOME/lib
1.5 Start the metastore service
[hadoop@master ~]$ cd ~/bigdataspace/hive-1.1.0-cdh5.5.0
[hadoop@master hive-1.1.0-cdh5.5.0]$ ./bin/hive --service metastore &
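Started with a bare `&`, the metastore writes its output to the terminal and may be terminated when the login session ends. A hedged alternative is to start it with `nohup`, redirect its output to a log file and record its PID. `start_bg` and the `~/hive-run` directory are hypothetical names chosen for this sketch.

```shell
# start_bg is a hypothetical helper: run a command detached from the terminal,
# send its output to <dir>/<name>.log and store its PID in <dir>/<name>.pid.
# Arguments: name, directory, then the command and its arguments.
start_bg() {
    name="$1"; dir="$2"; shift 2
    mkdir -p "$dir" || return 1
    nohup "$@" >"$dir/$name.log" 2>&1 &
    echo "$!" >"$dir/$name.pid"
    echo "started $name with PID $(cat "$dir/$name.pid")"
}

# Usage (assumption: hive is on PATH as configured in section 1.1):
#   start_bg metastore "$HOME/hive-run" hive --service metastore
#   tail -f "$HOME/hive-run/metastore.log"
```

Keeping the PID in a file makes it easy to stop the service later without searching the process list.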
1.6 Start/stop the hive command line (CLI)
# Because $HIVE_HOME/bin was added to PATH earlier, the hive command can be run from any directory to enter the Hive console
[hadoop@master bigdataspace]$ hive
Logging initialized using configuration in jar:file:/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/lib/hive-common-1.1.0-cdh5.5.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive (default)>
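The "Logging initialized using configuration in jar:file…" line above is not an error: there is no log configuration file in the conf directory, so Hive falls back to the one bundled in the jar. To use configuration files from conf instead, copy the templates:
$ cd ~/bigdataspace/hive-1.1.0-cdh5.5.0/conf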
$ cp beeline-log4j.properties.template beeline-log4j.properties
$ cp hive-log4j.properties.template hive-log4j.properties
$ cp hive-exec-log4j.properties.template hive-exec-log4j.properties
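The three `cp` commands above can also be written as a loop that skips files that already exist, so local edits are not overwritten if the step is repeated. This is a minimal sketch; `install_log4j_configs` is a hypothetical helper name.

```shell
# install_log4j_configs is a hypothetical helper: for every
# *-log4j.properties.template in the given directory, create the matching
# .properties file unless one is already there.
install_log4j_configs() {
    conf_dir="$1"
    for tpl in "$conf_dir"/*-log4j.properties.template; do
        [ -e "$tpl" ] || continue          # the glob matched nothing
        target="${tpl%.template}"
        if [ -e "$target" ]; then
            echo "kept existing $target"
        else
            cp "$tpl" "$target" && echo "created $target"
        fi
    done
}

# Usage (assumption: HIVE_HOME is set as in section 1.1):
#   install_log4j_configs "$HIVE_HOME/conf"
```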
[hadoop@master bigdataspace]$ hive
Logging initialized using configuration in file:/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/conf/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive (default)>
hive (default)> quit; # exit hive; "exit;" works as well
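For scripts, the CLI can also run a statement non-interactively with `hive -e`. The sketch below uses that to check that the `default` database is visible, assuming the output lists one database name per line. `hive_smoke_test` is a hypothetical helper, and the command to run is a parameter so the check can be tried without a cluster.

```shell
# hive_smoke_test is a hypothetical helper: run SHOW DATABASES through the
# given command (default: hive) and succeed only if "default" is listed.
hive_smoke_test() {
    hive_cmd="${1:-hive}"
    if "$hive_cmd" -e 'SHOW DATABASES;' 2>/dev/null | grep -qx 'default'; then
        echo "hive OK"
    else
        echo "hive check failed"
        return 1
    fi
}

# Usage on the master node:
#   hive_smoke_test
```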
1.7 Start/stop the beeline command line (CLI)
# Start:
[hadoop@master bigdataspace]$ beeline
# Stop:
beeline> !q
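1.8 Using HiveServer2
[hadoop@master ~]$ cd ~/bigdataspace/hive-1.1.0-cdh5.5.0/bin/
[hadoop@master bin]$ ./hiveserver2 & # the trailing & runs the command in the background
(If the terminal does not return to the prompt after running the command above, press Ctrl+C; because of the &, hiveserver2 keeps running in the background)
# Check the HiveServer2 process (if none is listed, hiveserver2 failed to start or has stopped):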
[hadoop@master bin]$ ps -ef |grep HiveServer2
hadoop 25545 14762 3 17:02 pts/1 00:00:21 /home/hadoop/bigdataspace/jdk1.8.0_60/bin/java -Xmx256m -Djava.library.path=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0/lib/native/ -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/lib/hive-service-1.1.0-cdh5.5.0.jar org.apache.hive.service.server.HiveServer2
hadoop 26038 14762 0 17:14 pts/1 00:00:00 grep HiveServer2
(The hiveserver2 background service can be stopped with "kill -9 PID")
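`kill -9` ends the JVM immediately without giving it a chance to run its shutdown hooks. A gentler pattern is to send a plain `kill` (SIGTERM) first and use `kill -9` only if the process is still there after a grace period. The sketch below shows this; `stop_pid` is a hypothetical helper and the 10-second default is an arbitrary assumption.

```shell
# stop_pid is a hypothetical helper: send SIGTERM, wait up to <timeout>
# seconds for the process to exit, then fall back to SIGKILL.
# Arguments: PID, optional timeout in seconds (default 10).
stop_pid() {
    pid="$1"; timeout="${2:-10}"
    kill "$pid" 2>/dev/null || { echo "no process $pid"; return 1; }
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || { echo "stopped $pid"; return 0; }
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -9 "$pid" 2>/dev/null
    echo "force-killed $pid"
}

# Usage with the PID shown by ps above:
#   stop_pid 25545
```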
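Test the connection to hiveserver2 with beeline:
(
jdbc:hive2: the connection targets hiveserver2
master: the host/IP of the machine where hiveserver2 is installed
10001: the port hiveserver2 listens on (set with hive.server2.thrift.port in hive-site.xml; the default is 10000)
)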
[hadoop@master hive-1.1.0-cdh5.5.0]$ beeline -u jdbc:hive2://master:10001
### SLF4J may warn here that the classpath contains multiple bindings; these are warnings, not errors, for example:
SLF4J: Class path contains multiple SLF4J bindings
SLF4J: Found binding in [jar:file:/home/hadoop/bigdataspace/had…)
Connecting to jdbc:hive2://master:10001
Connected to: Apache Hive (version 1.1.0-cdh5.5.0)
Driver: Hive JDBC (version 1.1.0-cdh5.5.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.5.0 by Apache Hive
0: jdbc:hive2://master:10001>
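The same connection can be used non-interactively, which is handy for a quick check after a restart. The sketch below builds the JDBC URL from its parts as described above; `hs2_url` is a hypothetical helper, and the user name `hadoop` in the usage line is an assumption based on the login account used in this article.

```shell
# hs2_url is a hypothetical helper that assembles a HiveServer2 JDBC URL.
# Arguments: host, optional port (default 10000), optional database (default "default").
hs2_url() {
    printf 'jdbc:hive2://%s:%s/%s\n' "$1" "${2:-10000}" "${3:-default}"
}

# Usage (assumption: HiveServer2 listens on master:10001 as above):
#   beeline -u "$(hs2_url master 10001)" -n hadoop -e 'SHOW DATABASES;'
```

With `-e`, beeline runs the statement and exits, so the command can be placed in a script.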
This completes the basic installation and configuration of Hive.