I. Software to prepare
1. Ubuntu 14.04
     Three hosts:
     192.168.71.136  cloud01
     192.168.71.135  cloud02
     192.168.71.137  cloud03
2. jdk-7u51-linux-i586.tar.gz
3. hadoop-2.2.0.tar.gz
Baidu cloud drive link: pan.baidu.com/s/1pKADKNL
II. Steps
Standalone setup
1. Change the hostnames of the three machines to cloud01, cloud02 and cloud03 respectively
sudo gedit /etc/hostname (reboot afterwards)
2. Add the address entries to hosts
sudo gedit /etc/hosts
192.168.71.134 cloud01
192.168.71.129 cloud02
192.168.71.130 cloud03
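Editing hosts by hand on three machines is easy to get wrong. As a sketch only, the helper below appends a mapping only when the hostname is not already listed. The name add_host_entry is hypothetical and not part of any tool. It takes the file path as an argument so it can be tried on a copy first; writing to the real /etc/hosts needs root.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME   (hypothetical helper, for illustration)
# Appends "IP NAME" to FILE unless a non-comment line already lists NAME.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field after the address column.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

One possible use: copy /etc/hosts to a scratch file, call add_host_entry on the copy for each of the three hosts, review the result, then copy it back with sudo.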
3. Install Java (on each machine)
Create a folder and copy the Java archive into it
sudo mkdir /usr/java
Extract
sudo tar -zxvf <archive file name>
Edit the profile
sudo gedit /etc/profile
Add the following:
export JAVA_HOME=/usr/java/jdk1.7.0_51
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
Run
source /etc/profile
Check that the installation succeeded
java -version
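The same three exports can also be kept in a separate file rather than typed into /etc/profile. A minimal sketch, assuming the JDK was unpacked to /usr/java/jdk1.7.0_51 as above; write_java_env and the target file name are hypothetical:

```shell
#!/bin/sh
# write_java_env FILE JAVA_HOME_DIR   (hypothetical helper)
# Writes the JAVA_HOME, CLASSPATH and PATH exports to FILE.
# "\$" keeps the variable reference literal in the generated file;
# "$2" is expanded now, to the JDK directory passed in.
write_java_env() {
    cat > "$1" <<EOF
export JAVA_HOME=$2
export CLASSPATH=.:\$JAVA_HOME/lib:\$JAVA_HOME/lib/tools.jar
export PATH=\$JAVA_HOME/bin:\$PATH
EOF
}
```

On Ubuntu, /etc/profile reads files named *.sh in /etc/profile.d at login, so generating the file there (as root) is one option; sourcing the generated file in the current shell works as well.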
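4. Install Hadoop
Copy the archive to the home directory and extract it
sudo tar -zxvf <archive file name>
After extracting, run chmod -R 777 <extracted folder name> to grant execute permission
At this point the standalone installation is complete. To verify it,
run the following in that directory:
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 10 20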
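Pseudo-distributed setup (continued from above)
5. Install ssh
Run sudo apt-get install ssh
Create a .ssh folder in the home directory: mkdir .ssh
Enter the folder: cd .ssh
ssh-keygen -t rsa (press Enter at every prompt)
cat id_rsa.pub >> authorized_keys
sudo service ssh restart
Test with ssh cloud01 (it should log in without asking for a password)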
6. Configure the Hadoop environment
First create these folders in the home directory
~/hddata/dfs/name
~/hddata/dfs/data
~/hddata/tmp
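The three folders can be created in one command with mkdir -p, which also creates missing parent directories and does not fail if a folder already exists. A small sketch; make_hadoop_dirs is a hypothetical name, and the default base of ~/hddata follows the layout above:

```shell
#!/bin/sh
# make_hadoop_dirs [BASE]   (hypothetical helper)
# Creates BASE/dfs/name, BASE/dfs/data and BASE/tmp.
# BASE defaults to ~/hddata, the location used in this guide.
make_hadoop_dirs() {
    base=${1:-$HOME/hddata}
    mkdir -p "$base/dfs/name" "$base/dfs/data" "$base/tmp"
}
```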
Then, in the hadoop-2.2.0 folder, edit the following configuration files
gedit etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
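gedit etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/hddata/tmp</value>
  </property>
</configuration>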
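gedit etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hduser/hddata/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hduser/hddata/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>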
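cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
gedit etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>10</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
</configuration>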
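Hadoop's *-site.xml files store each setting as a property element with name and value children inside a configuration element. As an illustration of that layout, the sketch below generates such a file from name/value pairs. write_hadoop_conf is a hypothetical helper, not a Hadoop tool, and it does not escape XML special characters, so it only suits simple values like the ones in this guide.

```shell
#!/bin/sh
# write_hadoop_conf FILE NAME VALUE [NAME VALUE ...]   (hypothetical helper)
# Writes a Hadoop-style XML configuration file with one <property>
# per NAME/VALUE pair. Values are written as-is (no XML escaping).
write_hadoop_conf() {
    out=$1
    shift
    {
        printf '<?xml version="1.0"?>\n<configuration>\n'
        while [ "$#" -ge 2 ]; do
            printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
            shift 2
        done
        printf '</configuration>\n'
    } > "$out"
}
```

For instance, calling it with the arguments dfs.replication 1 produces a file containing a single property with that name and value.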
Format the NameNode (the HDFS file system)
./bin/hdfs namenode -format
Start all daemons
./sbin/start-all.sh
Check the running daemons
jps
3776 ResourceManager
3354 NameNode
3645 SecondaryNameNode
3467 DataNode
3895 NodeManager
4382 Jps
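All five daemons listed above should appear in the jps output. A rough sketch for checking this automatically: the function reads jps output on standard input and prints the name of every expected daemon that is missing. missing_daemons is a hypothetical name, and the expected list is taken from the output shown above.

```shell
#!/bin/sh
# missing_daemons   (hypothetical helper)
# Reads `jps` output ("<pid> <name>" per line) on stdin and prints
# each expected Hadoop daemon that does not appear in it.
missing_daemons() {
    out=$(cat)
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare the whole second field, so "NameNode" does not
        # match "SecondaryNameNode".
        printf '%s\n' "$out" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$d"
    done
}
```

Typical use would be jps | missing_daemons; no output means every expected daemon was found.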
To test, open localhost:50070 in a browser
The pseudo-distributed setup is now complete
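Cluster setup
1. Extract the cluster configuration files and start the three machines in the virtual machine software
2. Set a static IP address on each machine, taking note of the gateway and DNS
3. Edit hosts on each machine
sudo gedit /etc/hosts
192.168.71.136 cloud01
192.168.71.135 cloud02
192.168.71.137 cloud03
Note: delete the 127.0.1.1 entry on the second line of the original file; do this on every machine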
4. Set up a public/private key pair on each machine
sudo apt-get install ssh
mkdir .ssh
cd .ssh
ssh-keygen -t rsa
cat id_rsa.pub >> authorized_keys
sudo service ssh restart
ssh localhost
If a .ssh folder already exists, delete it first (rm -rf .ssh)
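5. Send the master's public key to the other machines and add it to each machine's authorized keys file
cd .ssh
scp authorized_keys hduser@cloud02:~/.ssh/authorized_keys_from_cloud01
scp authorized_keys hduser@cloud03:~/.ssh/authorized_keys_from_cloud01
Log in to cloud02 and cloud03 in turn and run the following
cd .ssh
cat authorized_keys_from_cloud01 >> authorized_keys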
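Appending with cat adds the key again every time the step is repeated, and sshd may ignore an authorized_keys file whose permissions are too open. The sketch below is one way to handle both: it adds each key line only once and tightens the permissions. authorize_key is a hypothetical helper name; the second argument exists only so it can be tried on a scratch directory.

```shell
#!/bin/sh
# authorize_key PUBKEY_FILE [SSH_DIR]   (hypothetical helper)
# Adds every key line from PUBKEY_FILE to SSH_DIR/authorized_keys
# unless that exact line is already there. SSH_DIR defaults to ~/.ssh.
authorize_key() {
    pub=$1
    dir=${2:-$HOME/.ssh}
    mkdir -p "$dir" && chmod 700 "$dir"
    touch "$dir/authorized_keys" && chmod 600 "$dir/authorized_keys"
    # "|| [ -n "$key" ]" also handles a last line without a newline.
    while IFS= read -r key || [ -n "$key" ]; do
        [ -n "$key" ] || continue
        # -x: match the whole line, -F: treat the key as a fixed string
        grep -qxF -- "$key" "$dir/authorized_keys" ||
            printf '%s\n' "$key" >> "$dir/authorized_keys"
    done < "$pub"
}
```

On cloud02 and cloud03 this could be called as authorize_key ~/.ssh/authorized_keys_from_cloud01.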
6. Install the JDK on each machine
7. Install hadoop-2.2.0 on the master (tar -zxvf hadoop-2.2.0.tar.gz)
8. Create the following three folders under the home folder on each machine
~/hddata/dfs/name
~/hddata/dfs/data
~/hddata/tmp
scp -r ~/hddata hduser@cloud02:~/
scp -r ~/hddata hduser@cloud03:~/
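The same scp is repeated once per worker here. As a sketch, a loop can generate the commands for any list of hosts; this version only prints them, as a dry run, so they can be checked before being executed. print_scp_commands is a hypothetical name, and hduser is the account used throughout this guide.

```shell
#!/bin/sh
# print_scp_commands SRC USER HOST...   (hypothetical helper, dry run)
# Prints one "scp -r" command per HOST that would copy SRC into
# USER's home directory on that host. Nothing is copied.
print_scp_commands() {
    src=$1
    user=$2
    shift 2
    for host in "$@"; do
        printf 'scp -r %s %s@%s:~/\n' "$src" "$user" "$host"
    done
}
```

Once the printed lines look right, they can be pasted into the shell or piped to sh.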
9. Edit 7 configuration files on the master
cd hadoop-2.2.0
(1) gedit etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
(2) gedit etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
(3) gedit etc/hadoop/slaves
cloud01
cloud02
cloud03
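(4) gedit etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cloud01:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/hddata/tmp</value>
  </property>
</configuration>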
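(5) gedit etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>cloud01:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hduser/hddata/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hduser/hddata/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>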
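(6) cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
gedit etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>cloud01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>cloud01:19888</value>
  </property>
</configuration>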
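(7) gedit etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>cloud01:8132</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>cloud01:8130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>cloud01:8131</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>cloud01:8133</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>cloud01:8188</value>
  </property>
</configuration>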
10. Send the hadoop-2.2.0 folder on the master to the other two machines
scp -r hadoop-2.2.0 hduser@cloud02:~/
scp -r hadoop-2.2.0 hduser@cloud03:~/
11. Format the NameNode
cd hadoop-2.2.0
./bin/hdfs namenode -format
12. Start Hadoop
./sbin/start-all.sh
Inspect the file block layout
./bin/hdfs fsck / -files -blocks
./bin/hdfs dfsadmin -report
Web interfaces
http://192.168.71.136:50070
http://192.168.71.136:8188
Start the job history server
./sbin/mr-jobhistory-daemon.sh start historyserver
13. Run pi
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 10 20