Fully Distributed Setup of Hadoop and HBase
2020-11-09 16:14:25 Editor: 小采

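I. Architectural changes from Hadoop 1.0 to 2.0

1. Hadoop 2.0 consists of three branches: HDFS, MapReduce, and YARN.

2. HDFS: NameNode (NN) Federation and HA.

3. MapReduce: MR running on top of YARN.

4. YARN: the resource management system.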


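II. HDFS 2.0

1. HDFS 2.0 addresses two problems of HDFS 1.0: the single point of failure and the memory limit of a single NameNode.

2. Addressing the single point of failure

HDFS HA solves this with an active and a standby NameNode.

If the active NameNode fails, service switches over to the standby NameNode.

3. Addressing the memory limit

HDFS Federation

Scales horizontally by supporting multiple NameNodes.

Each NameNode manages a portion of the directory tree.

All NameNodes share the storage resources of all DataNodes.

4. Only the architecture changed; the way HDFS is used stays the same.

The change is transparent to HDFS users.

Commands and APIs from HDFS 1.0 still work:
$ hadoop fs -ls /user/hadoop/
$ hadoop fs -mkdir /user/hadoop/data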


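III. HDFS 2.0 HA

1. Active and standby NameNodes

2. Addressing the single point of failure

The active NameNode serves clients, while the standby NameNode keeps its metadata in sync with the active one so that it is ready to take over.

All DataNodes report block information to both NameNodes.

3. Two failover options

Manual failover: switch between active and standby with a command; useful on occasions such as HDFS upgrades.

Automatic failover: implemented on top of ZooKeeper.

4. Automatic failover based on ZooKeeper

The ZooKeeper Failover Controller (ZKFC) monitors the health of its NameNode and registers the NameNode with ZooKeeper.

When the active NameNode goes down, the ZKFCs compete for a lock in ZooKeeper on behalf of their NameNodes; the NameNode whose ZKFC obtains the lock becomes active.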
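
The manual failover option could look roughly like the sketch below, which wraps the `hdfs haadmin` subcommands `-getServiceState` and `-failover`. The NameNode IDs `nn1` and `nn2` are hypothetical placeholders for whatever `dfs.ha.namenodes.<nameservice>` defines, and the function name `manual_failover` is made up for illustration. Note that the cluster built later in this article uses a SecondaryNameNode and does not enable HA, so this only applies to an HA-enabled configuration.

```shell
# Sketch only: hand the active role from one HA NameNode to another.
# $1 and $2 are NameNode IDs (hypothetical here, e.g. nn1 and nn2).
manual_failover() {
    local from="$1" to="$2" state
    # Ask the source NameNode whether it is "active" or "standby".
    state="$(hdfs haadmin -getServiceState "$from")" || return 1
    if [ "$state" != "active" ]; then
        echo "$from is $state, nothing to do"
        return 0
    fi
    # Initiate the failover from the active NameNode to the other one.
    hdfs haadmin -failover "$from" "$to"
}
```

A call such as `manual_failover nn1 nn2` would leave the cluster untouched when `nn1` is already standby.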


IV. Environment setup

192.168.1.2 master

192.168.1.3 slave1

192.168.1.4 slave2

Hadoop version: hadoop-2.2.0.tar.gz

HBase version: hbase-0.98.11-hadoop2-bin.tar.gz

ZooKeeper version: zookeeper-3.4.5.tar.gz

JDK version: jdk-7u25-linux-x.gz


1. Configure the hosts file on every node

[root@master ~]# cat /etc/hosts
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
[root@slave1 ~]# cat /etc/hosts
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
[root@slave2 ~]# cat /etc/hosts
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
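
Rather than editing /etc/hosts by hand on three machines, the entries could be appended by a small helper that skips names already present. This is a minimal sketch; the function name `add_host_entry` is an assumption made for illustration, and it would need to run as root when pointed at /etc/hosts.

```shell
# Append "ip name" to a hosts file unless an entry for that name already exists.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # Match the host name as a whole word after whitespace.
    if grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

For example, `add_host_entry /etc/hosts 192.168.1.3 slave1` can be run repeatedly without producing duplicate lines.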


2. Set up passwordless SSH between the nodes

[root@master ~]# useradd hadoop
[root@slave1 ~]# useradd hadoop
[root@slave2 ~]# useradd hadoop
[root@master ~]# passwd hadoop
[root@slave1 ~]# passwd hadoop
[root@slave2 ~]# passwd hadoop
[root@master ~]# su - hadoop
# Generate a key pair first if ~/.ssh/id_rsa.pub does not exist yet
[hadoop@master ~]$ ssh-keygen -t rsa
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub master



3. Configure the JDK

[root@master ~]# tar zxvf jdk-7u25-linux-x.gz
[root@master ~]# mkdir /usr/java
[root@master ~]# mv jdk1.7.0_25 /usr/java
[root@master ~]# cd /usr/java/
[root@master java]# ln -s jdk1.7.0_25 jdk
# Edit /etc/profile and add:
export JAVA_HOME=/usr/java/jdk
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=/usr/java/jdk/bin:$PATH
[root@master ~]# source /etc/profile
[root@master ~]# java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) -Bit Server VM (build 23.25-b01, mixed mode)

# Repeat the same steps on slave1 and slave2


4. Install Hadoop

[root@master ~]# tar zxvf hadoop-2.2.0.tar.gz
[root@master ~]# mv hadoop-2.2.0 /home/hadoop/
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ln -s hadoop-2.2.0 hadoop
[root@master hadoop]# chown -R hadoop:hadoop /home/hadoop/
[root@master hadoop]# cd /home/hadoop/hadoop/etc/hadoop
# Edit hadoop-env.sh
export JAVA_HOME=/usr/java/jdk
export HADOOP_HEAPSIZE=200

# Edit mapred-env.sh
export JAVA_HOME=/usr/java/jdk
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=1000

# Edit yarn-env.sh
export JAVA_HOME=/usr/java/jdk
JAVA_HEAP_MAX=-Xmx300m
YARN_HEAPSIZE=100


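# Edit core-site.xml
<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://master:9000</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/home/hadoop/tmp</value>
	</property>
	<property>
		<name>hadoop.proxyuser.hadoop.hosts</name>
		<value>*</value>
	</property>
	<property>
		<name>hadoop.proxyuser.hadoop.groups</name>
		<value>*</value>
	</property>
</configuration>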

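# Edit hdfs-site.xml
<configuration>
	<property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>master:9001</value>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>/home/hadoop/dfs/name</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>/home/hadoop/dfs/data</value>
	</property>
	<property>
		<name>dfs.replication</name>
		<value>2</value>
	</property>
	<property>
		<name>dfs.webhdfs.enabled</name>
		<value>true</value>
	</property>
</configuration>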


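# Edit mapred-site.xml
<configuration>
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>master:10020</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>master:19888</value>
	</property>
	<property>
		<name>mapreduce.map.memory.mb</name>
		<value>512</value>
	</property>
	<property>
		<name>mapreduce.map.cpu.vcores</name>
		<value>1</value>
	</property>
	<property>
		<name>mapreduce.reduce.memory.mb</name>
		<value>512</value>
	</property>
</configuration>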

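# Edit yarn-site.xml
<configuration>
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<property>
		<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
		<value>org.apache.hadoop.mapred.ShuffleHandler</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address</name>
		<value>master:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address</name>
		<value>master:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address</name>
		<value>master:8031</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address</name>
		<value>master:8033</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address</name>
		<value>master:8088</value>
	</property>
	<property>
		<name>yarn.scheduler.minimum-allocation-mb</name>
		<value>100</value>
	</property>
	<property>
		<name>yarn.scheduler.maximum-allocation-mb</name>
		<value>200</value>
	</property>
	<property>
		<name>yarn.scheduler.minimum-allocation-vcores</name>
		<value>1</value>
	</property>
	<property>
		<name>yarn.scheduler.maximum-allocation-vcores</name>
		<value>2</value>
	</property>
</configuration>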
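
One thing worth checking in these values: to my understanding, the ResourceManager rejects any container request larger than yarn.scheduler.maximum-allocation-mb, and mapred-site.xml above asks for 512 MB per map and reduce task while the maximum here is 200 MB. The sketch below only does the arithmetic comparison; the function names are made up for illustration and nothing here reads the actual XML files.

```shell
# Succeeds when a container request (MB) fits within the scheduler maximum (MB).
fits_in_max_allocation() {
    local request_mb="$1" max_mb="$2"
    [ "$request_mb" -le "$max_mb" ]
}

# Usage: report_memory_settings MAX_MB REQUEST_MB...
# Prints one line per request saying whether it fits under the maximum.
report_memory_settings() {
    local max_mb="$1" request_mb
    shift
    for request_mb in "$@"; do
        if fits_in_max_allocation "$request_mb" "$max_mb"; then
            echo "${request_mb} MB fits within ${max_mb} MB"
        else
            echo "${request_mb} MB exceeds ${max_mb} MB"
        fi
    done
}
```

Running `report_memory_settings 200 512 512` with the numbers from this article flags both task sizes, which suggests either raising the maximum allocation or lowering the task memory before submitting MapReduce jobs.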
	


# Edit the slaves file
slave1
slave2

# Edit /home/hadoop/.bashrc

export HADOOP_DEV_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_DEV_HOME/bin
export PATH=$PATH:$HADOOP_DEV_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

# Copy all of the files modified above to slave1 and slave2
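
The copy step is not spelled out in the article. One possible way to do it is an scp loop like the sketch below, which assumes Hadoop is already unpacked at the same path on each slave and that the passwordless SSH from step 2 works. The function name `distribute` and the `DRY_RUN` switch are assumptions made for illustration; by default the function only prints the commands.

```shell
# Copy one file or directory to the same parent directory on every listed host.
# With DRY_RUN=1 (the default) the scp commands are printed, not executed.
distribute() {
    local path="$1" host parent
    shift
    parent="$(dirname "$path")"
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp -r ${path} hadoop@${host}:${parent}/"
        else
            scp -r "$path" "hadoop@${host}:${parent}/"
        fi
    done
}
```

For example, `DRY_RUN=0 distribute /home/hadoop/hadoop/etc/hadoop slave1 slave2` would push the configuration directory, and `DRY_RUN=0 distribute /home/hadoop/.bashrc slave1 slave2` the environment file.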



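5. Start HDFS on the master node

# On a brand-new cluster, format the NameNode once before the first start
[hadoop@master ~]$ hdfs namenode -format
[hadoop@master ~]$ cd /home/hadoop/hadoop/sbin/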
[hadoop@master sbin]$ ./start-dfs.sh 
15/03/21 00:49:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
master: starting namenode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-datanode-slave1.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-secondarynamenode-master.out

# Check the running processes
[hadoop@master ~]$ jps
39093 Jps
317 SecondaryNameNode
38767 NameNode

[root@slave1 ~]# jps
2463 Jps
2379 DataNode

[root@slave2 ~]# jps
2463 Jps
2379 DataNode

# Start the JobHistory server

[hadoop@master sbin]$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/hadoop/hadoop-2.2.0/logs/mapred-hadoop-historyserver-master.out



6. Start YARN

[hadoop@master ~]$ cd /home/hadoop/hadoop/sbin/
[hadoop@master sbin]$ ./start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-hadoop-resourcemanager-master.out
slave2: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-hadoop-nodemanager-slave2.out
slave1: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-hadoop-nodemanager-slave1.out

# Check the running processes
[hadoop@master sbin]$ jps
39390 Jps
317 SecondaryNameNode
39147 ResourceManager
38767 NameNode
[hadoop@slave1 ~]$ jps
26 Jps
2535 NodeManager
2379 DataNode

[hadoop@slave2 ~]$ jps
8261 Jps
8150 NodeManager
8004 DataNode
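
Reading jps listings by eye gets tedious across three nodes. The check could be scripted along the lines of the sketch below, which reads jps output on standard input and reports any expected daemon that is absent. The function name `missing_daemons` is an assumption made for illustration.

```shell
# Read `jps` output on stdin; print the expected daemons that are missing.
# Returns 0 only when every daemon named in the arguments is present.
missing_daemons() {
    local jps_output daemon missing=""
    jps_output="$(cat)"
    for daemon in "$@"; do
        # The second column of jps output is the main class name.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
            missing="${missing}${missing:+ }${daemon}"
        fi
    done
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}
```

On master this might be used as `jps | missing_daemons NameNode SecondaryNameNode ResourceManager`, and for a slave as `ssh slave1 jps | missing_daemons DataNode NodeManager`, assuming jps is on the PATH of the remote shell.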


7. Browse the HDFS file system

[hadoop@master sbin]$ hadoop fs -ls /
15/03/21 15:56:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2015-03-20 17:46 /hbase
drwxrwx--- - hadoop supergroup 0 2015-03-20 16:56 /tmp



8. Install ZooKeeper

[root@master ~]# tar zxvf zookeeper-3.4.5.tar.gz -C /home/hadoop/
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ln -s zookeeper-3.4.5 zookeeper
[root@master hadoop]# chown -R hadoop:hadoop /home/hadoop/zookeeper-3.4.5
[root@master hadoop]# cd zookeeper/conf/
[root@master conf]# cp zoo_sample.cfg zoo.cfg
# Edit zoo.cfg
dataDir=/home/hadoop/zookeeper/data
dataLogDir=/home/hadoop/zookeeper/logs
server.1=192.168.1.2:7000:7001
server.2=192.168.1.3:7000:7001
server.3=192.168.1.4:7000:7001
# Repeat the same steps on slave1 and slave2

# Create the data and log directories on every node, then write each node's id
[hadoop@master conf]$ mkdir -p /home/hadoop/zookeeper/data /home/hadoop/zookeeper/logs
[hadoop@master conf]$ cd /home/hadoop/zookeeper/data/
[hadoop@master data]$ echo 1 > myid
[hadoop@slave1 data]$ echo 2 > myid
[hadoop@slave2 data]$ echo 3 > myid
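
The number written to myid has to match the N of that node's server.N line in zoo.cfg, which is easy to get wrong when typing it by hand on each machine. A small lookup could derive it from the file instead. This is a sketch; the function name `myid_for` is an assumption made for illustration.

```shell
# Print the id that zoo.cfg assigns to an address, i.e. the N of the line
# "server.N=address:port:port". Fails without output if no line matches.
myid_for() {
    local cfg="$1" address="$2"
    awk -v addr="$address" '
        /^server\.[0-9]+=/ {
            split($0, kv, "=")        # kv[1] = "server.N", kv[2] = "host:port:port"
            split(kv[2], hp, ":")     # hp[1] = host part
            if (hp[1] == addr) {
                sub(/^server\./, "", kv[1])
                print kv[1]
                found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$cfg"
}
```

On slave1, for instance, `myid_for /home/hadoop/zookeeper/conf/zoo.cfg 192.168.1.3 > /home/hadoop/zookeeper/data/myid` would write the id taken from the configuration shown above.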

# Start ZooKeeper
[hadoop@master ~]$ cd zookeeper/bin/
[hadoop@master bin]$ ./zkServer.sh start
[hadoop@slave1 ~]$ cd zookeeper/bin/
[hadoop@slave1 bin]$ ./zkServer.sh start
[hadoop@slave2 ~]$ cd zookeeper/bin/
[hadoop@slave2 bin]$ ./zkServer.sh start



9. Install HBase

[root@master ~]# tar zxvf hbase-0.98.11-hadoop2-bin.tar.gz -C /home/hadoop/
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ln -s hbase-0.98.11-hadoop2 hbase
[root@master hadoop]# chown -R hadoop:hadoop /home/hadoop/hbase-0.98.11-hadoop2
[root@master hadoop]# cd /home/hadoop/hbase/conf/
# Edit hbase-env.sh
export JAVA_HOME=/usr/java/jdk
export HBASE_HEAPSIZE=50

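# Edit hbase-site.xml
<configuration>
	<property>
		<name>hbase.rootdir</name>
		<value>hdfs://master:9000/hbase</value>
	</property>
	<property>
		<name>hbase.cluster.distributed</name>
		<value>true</value>
	</property>
	<property>
		<name>hbase.zookeeper.property.clientPort</name>
		<value>2181</value>
	</property>
	<property>
		<name>hbase.zookeeper.quorum</name>
		<value>master,slave1,slave2</value>
	</property>
</configuration>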

# Edit the regionservers file
slave1
slave2

# Copy the files modified above to slave1 and slave2



10. Start HBase on master

[hadoop@master ~]$ cd hbase/bin/
[hadoop@master bin]$ ./start-hbase.sh 
master: starting zookeeper, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-zookeeper-master.out
slave1: starting zookeeper, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-zookeeper-slave1.out
slave2: starting zookeeper, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-zookeeper-slave2.out
starting master, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-master-master.out
slave1: starting regionserver, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-slave1.out
slave2: starting regionserver, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-slave2.out

# Check the running processes
[hadoop@master bin]$ jps
39532 QuorumPeerMain
317 SecondaryNameNode
39147 ResourceManager
39918 HMaster
38767 NameNode
40027 Jps

[hadoop@slave1 data]$ jps
3021 HRegionServer
3133 Jps
2535 NodeManager
2379 DataNode
2942 HQuorumPeer

[hadoop@slave2 ~]$ jps
8430 HRegionServer
8351 HQuorumPeer
8150 NodeManager
8558 Jps
8004 DataNode

# Verify

[hadoop@master bin]$ ./hbase shell
2015-03-21 16:11:44,534 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.98.11-hadoop2, r6e6cf74c1161035545d95921816121eb3a516fe0, Tue Mar 3 00:23:49 PST 2015

hbase(main):001:0> list
TABLE 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.98.11-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-03-21 16:11:56,499 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0 row(s) in 1.9010 seconds

=> []
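
An empty `list` result shows that the shell can reach the master, but it does not exercise the region servers. A slightly stronger check could write and read one cell in a throwaway table. The sketch below only generates the HBase shell commands; the table name `smoke_test`, the column family `cf`, and the function name `smoke_test_script` are assumptions made for illustration.

```shell
# Emit HBase shell commands that create a throwaway table, write one cell,
# read it back, and drop the table again. $1 overrides the table name.
smoke_test_script() {
    local table="${1:-smoke_test}"
    cat <<EOF
create '${table}', 'cf'
put '${table}', 'row1', 'cf:greeting', 'hello'
get '${table}', 'row1'
disable '${table}'
drop '${table}'
exit
EOF
}
```

The generated commands can be piped into the shell from the HBase bin directory, for example `smoke_test_script | ./hbase shell`.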




11. View the cluster status

HDFS UI: http://192.168.1.2:50070/dfshealth.jsp


YARN UI: http://192.168.1.2:8088/cluster


JobHistory UI: http://192.168.1.2:19888/jobhistory



HBase UI: http://192.168.1.2:60010/master-status
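
The four web UIs can also be probed from the command line, which is handy on a headless machine. The sketch below uses curl to fetch only the HTTP status code and treats any 2xx or 3xx answer as reachable. The function names `http_status` and `check_ui` are assumptions made for illustration.

```shell
# Print the HTTP status code curl receives for a URL ("000" if unreachable).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Usage: check_ui NAME URL
check_ui() {
    local name="$1" url="$2" code
    code="$(http_status "$url" || true)"
    case "$code" in
        2*|3*) echo "${name}: reachable (HTTP ${code})" ;;
        *)     echo "${name}: not reachable (HTTP ${code})" ;;
    esac
}
```

For example, `check_ui HDFS http://192.168.1.2:50070/dfshealth.jsp` and `check_ui HBase http://192.168.1.2:60010/master-status` cover the first and last entries of the list above.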
