Hadoop (Part 1), Lab 1 – Configuring a Hadoop System on CentOS 7: Configuring CentOS and Downloading the Installation Packages
Hadoop (Part 2), Lab 1 – Configuring a Hadoop System on CentOS 7: Installing ZooKeeper 3.4.14
Hadoop (Part 3), Lab 1 – Configuring a Hadoop System on CentOS 7: Installing Hadoop 3.1.2
Hadoop (Part 4), Lab 1 – Configuring a Hadoop System on CentOS 7: Starting Hadoop


7. Starting Hadoop

(1) Start the JournalNode cluster

Before starting anything, copy the configured Hadoop directory to the other machines.
【On c0 only】

for N in $(seq 1 3); do scp -r /home/work/_app/hadoop-3.1.2 c$N:/home/work/_app/; done;

The standby NameNode and the active NameNode are kept in sync through a group of separate daemons called JournalNodes. The active NameNode writes every edit to the JournalNodes, and an edit counts as committed once a majority of them have acknowledged it; the standby NameNode continuously reads those edits back to keep its own namespace up to date. Because only a majority of JournalNodes needs to be reachable, the shared edit log stays available even if some of them fail.
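
For reference, the quorum journal is wired up in hdfs-site.xml. The lines below are a minimal sketch, assuming the properties were set in the earlier Hadoop configuration article; the shared-edits URI matches the qjournal://c0:8485;c1:8485/mshkcluster value that appears in the format log later in this section (which also warns that an even number of JournalNodes is not recommended; three or five is the usual choice), while the local edits directory is an assumption.

<!-- hdfs-site.xml (sketch): where the NameNodes write and read the shared edit log -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://c0:8485;c1:8485/mshkcluster</value>
</property>
<!-- local directory each JournalNode keeps its copy of the edits in (path assumed) -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/work/_data/hadoop-3.1.2/journalnode</value>
</property>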

Start the JournalNode on every machine with hdfs --daemon start journalnode. After running jps you should see a JournalNode daemon on each node.
【On c0 only】

for N in $(seq 0 3); do ssh c$N hdfs --daemon start journalnode;jps; done;
[root@c0 _src]# for N in $(seq 0 3); do ssh c$N hdfs --daemon start journalnode;jps; done;
5440 JournalNode
4197 QuorumPeerMain
5466 Jps
5440 JournalNode
5507 Jps
4197 QuorumPeerMain
5440 JournalNode
4197 QuorumPeerMain
5518 Jps
5440 JournalNode
4197 QuorumPeerMain
5529 Jps

To stop it: hdfs --daemon stop journalnode

(2) Format the NameNode

Once the JournalNodes are running, the on-disk metadata of the two HA NameNodes must first be synchronized.

For a fresh HDFS cluster, first run the format command on one of the NameNodes. There are two ways to initialize a NameNode; either one works. In this walkthrough we use the first, hdfs namenode -format, on c0.

【On c0 only】

hdfs namenode -format
[root@c0 ~]# hdfs namenode -format
2019-11-04 21:09:56,002 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = c0/192.168.157.11
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.1.2
STARTUP_MSG:   classpath = 
...
STARTUP_MSG:   build = https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a; compiled by 'sunilg' on 2019-01-29T01:39Z
STARTUP_MSG:   java = 1.8.0_221
************************************************************/
2019-11-04 21:09:56,033 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2019-11-04 21:09:56,441 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-3691ff4d-1b50-4ea0-bca7-9b2f20f314e7
2019-11-04 21:09:58,030 INFO namenode.FSEditLog: Edit logging is async:true
2019-11-04 21:09:58,066 INFO namenode.FSNamesystem: KeyProvider: null
2019-11-04 21:09:58,067 INFO namenode.FSNamesystem: fsLock is fair: true
2019-11-04 21:09:58,068 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2019-11-04 21:09:58,131 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
2019-11-04 21:09:58,131 INFO namenode.FSNamesystem: supergroup          = supergroup
2019-11-04 21:09:58,132 INFO namenode.FSNamesystem: isPermissionEnabled = false
2019-11-04 21:09:58,132 INFO namenode.FSNamesystem: Determined nameservice ID: mshkcluster
2019-11-04 21:09:58,132 INFO namenode.FSNamesystem: HA Enabled: true
2019-11-04 21:09:58,231 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-11-04 21:09:58,253 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2019-11-04 21:09:58,298 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2019-11-04 21:09:58,318 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2019-11-04 21:09:58,319 INFO blockmanagement.BlockManager: The block deletion will start around 2019 十一月 04 21:09:58
2019-11-04 21:09:58,322 INFO util.GSet: Computing capacity for map BlocksMap
2019-11-04 21:09:58,322 INFO util.GSet: VM type       = 64-bit
2019-11-04 21:09:58,380 INFO util.GSet: 2.0% max memory 235.9 MB = 4.7 MB
2019-11-04 21:09:58,380 INFO util.GSet: capacity      = 2^19 = 524288 entries
2019-11-04 21:09:58,413 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2019-11-04 21:09:58,432 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManager: defaultReplication         = 3
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManager: maxReplication             = 512
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManager: minReplication             = 1
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
2019-11-04 21:09:58,433 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
2019-11-04 21:09:58,578 INFO namenode.FSDirectory: GLOBAL serial map: bits=24 maxEntries=16777215
2019-11-04 21:09:58,610 INFO util.GSet: Computing capacity for map INodeMap
2019-11-04 21:09:58,610 INFO util.GSet: VM type       = 64-bit
2019-11-04 21:09:58,610 INFO util.GSet: 1.0% max memory 235.9 MB = 2.4 MB
2019-11-04 21:09:58,611 INFO util.GSet: capacity      = 2^18 = 262144 entries
2019-11-04 21:09:58,611 INFO namenode.FSDirectory: ACLs enabled? false
2019-11-04 21:09:58,611 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2019-11-04 21:09:58,611 INFO namenode.FSDirectory: XAttrs enabled? true
2019-11-04 21:09:58,611 INFO namenode.NameNode: Caching file names occurring more than 10 times
2019-11-04 21:09:58,643 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2019-11-04 21:09:58,650 INFO snapshot.SnapshotManager: SkipList is disabled
2019-11-04 21:09:58,664 INFO util.GSet: Computing capacity for map cachedBlocks
2019-11-04 21:09:58,664 INFO util.GSet: VM type       = 64-bit
2019-11-04 21:09:58,664 INFO util.GSet: 0.25% max memory 235.9 MB = 603.8 KB
2019-11-04 21:09:58,664 INFO util.GSet: capacity      = 2^16 = 65536 entries
2019-11-04 21:09:58,741 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2019-11-04 21:09:58,742 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2019-11-04 21:09:58,742 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2019-11-04 21:09:58,749 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2019-11-04 21:09:58,749 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2019-11-04 21:09:58,752 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2019-11-04 21:09:58,752 INFO util.GSet: VM type       = 64-bit
2019-11-04 21:09:58,752 INFO util.GSet: 0.029999999329447746% max memory 235.9 MB = 72.5 KB
2019-11-04 21:09:58,752 INFO util.GSet: capacity      = 2^13 = 8192 entries
2019-11-04 21:09:58,840 WARN client.QuorumJournalManager: Quorum journal URI 'qjournal://c0:8485;c1:8485/mshkcluster' has an even number of Journal Nodes specified. This is not recommended!
2019-11-04 21:10:00,961 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1125323158-192.168.157.11-1572873000961
2019-11-04 21:10:01,032 INFO common.Storage: Storage directory /home/work/_data/hadoop-3.1.2/namenode has been successfully formatted.
2019-11-04 21:10:01,388 INFO namenode.FSImageFormatProtobuf: Saving image file /home/work/_data/hadoop-3.1.2/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
2019-11-04 21:10:01,615 INFO namenode.FSImageFormatProtobuf: Image file /home/work/_data/hadoop-3.1.2/namenode/current/fsimage.ckpt_0000000000000000000 of size 388 bytes saved in 0 seconds .
2019-11-04 21:10:01,679 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2019-11-04 21:10:01,800 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at c0/192.168.157.11
************************************************************/

(3) The ZooKeeper failover controller (ZKFC)

Apache ZooKeeper is a highly available service for maintaining small amounts of coordination data, notifying clients of changes to that data, and monitoring clients for failures.

The implementation of automatic HDFS failover relies on ZooKeeper for the following:

  • Failure detection – each NameNode machine in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the ZooKeeper session expires, notifying the other NameNode that a failover should be triggered.
  • Active NameNode election – ZooKeeper provides a simple mechanism for exclusively electing a node as active. If the currently active NameNode crashes, another node can take a special exclusive lock in ZooKeeper indicating that it should become the next active.

The ZKFailoverController (ZKFC) is a new component: a ZooKeeper client that also monitors and manages the state of the NameNode. Every machine that runs a NameNode also runs a ZKFC.

The ZKFC is responsible for:

  • Health monitoring – the ZKFC periodically pings its local NameNode with a health-check command. As long as the NameNode responds promptly with a healthy status, the ZKFC considers the node healthy. If the node has crashed, frozen, or otherwise entered an unhealthy state, the health monitor marks it as unhealthy.
  • ZooKeeper session management – when the local NameNode is healthy, the ZKFC keeps a session open in ZooKeeper. If the local NameNode is active, it also holds a special "lock" znode. This lock uses ZooKeeper's support for "ephemeral" nodes; if the session expires, the lock node is automatically deleted.
  • ZooKeeper-based election – if the local NameNode is healthy and the ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has "won the election" and is responsible for running a failover to make its local NameNode active. The failover process is similar to a manual failover: the previous active node is fenced if necessary, and then the local NameNode transitions to the active state.
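
As an illustration only: once the whole cluster is up (after step (7) below) and one NameNode has become active, you can watch the ZKFC's lock directly in ZooKeeper. The parent path is the one created by hdfs zkfc -formatZK in the next step; the child znode names below are the defaults used by Hadoop's ActiveStandbyElector, so treat them as a sketch rather than a captured transcript.

zkCli.sh
ls /hadoop-ha/mshkcluster
# expected children (sketch): ActiveStandbyElectorLock is the ephemeral lock held by the
# ZKFC of the active NameNode, ActiveBreadCrumb records the most recent active NameNode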

(4) Format ZooKeeper

On one of the NameNode machines, c0, run hdfs zkfc -formatZK to initialize the failover controller's state in ZooKeeper.
【On c0 only】

hdfs zkfc -formatZK
[root@c0 _src]# hdfs zkfc -formatZK
2019-11-04 21:12:46,786 INFO tools.DFSZKFailoverController: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DFSZKFailoverController
STARTUP_MSG:   host = c0/192.168.157.11
STARTUP_MSG:   args = [-formatZK]
STARTUP_MSG:   version = 3.1.2
STARTUP_MSG:   classpath = ...
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/work/_app/hadoop-3.1.2/lib/native
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:os.version=3.10.0-1062.el7.x86_64
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:user.name=root
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
2019-11-04 21:12:47,915 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/work/_src
2019-11-04 21:12:47,916 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=c0:2181,c1:2181,c2:2181,c3:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@61df66b6
2019-11-04 21:12:47,968 INFO zookeeper.ClientCnxn: Opening socket connection to server c2/192.168.157.13:2181. Will not attempt to authenticate using SASL (unknown error)
2019-11-04 21:12:47,985 INFO zookeeper.ClientCnxn: Socket connection established to c2/192.168.157.13:2181, initiating session
2019-11-04 21:12:48,014 INFO zookeeper.ClientCnxn: Session establishment complete on server c2/192.168.157.13:2181, sessionid = 0x300006b55f20000, negotiated timeout = 4000
2019-11-04 21:12:48,117 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mshkcluster in ZK.
2019-11-04 21:12:48,121 INFO zookeeper.ZooKeeper: Session: 0x300006b55f20000 closed
2019-11-04 21:12:48,125 WARN ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x300006b55f20000
2019-11-04 21:12:48,130 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x300006b55f20000
2019-11-04 21:12:48,140 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DFSZKFailoverController at c0/192.168.157.11
************************************************************/

To verify that the ZKFC format succeeded, open a ZooKeeper shell; if a new hadoop-ha znode now exists under the root, the format worked.

zkCli.sh
[root@c0 _src]# zkCli.sh
Connecting to localhost:2181
2019-11-04 21:16:46,562 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf, built on 03/06/2019 16:18 GMT
2019-11-04 21:16:46,574 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=c0
2019-11-04 21:16:46,574 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_221
2019-11-04 21:16:46,594 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/opt/jdk1.8.0_221/jre
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/home/work/_app/zookeeper-3.4.14/bin/../zookeeper-server/target/classes:/home/work/_app/zookeeper-3.4.14/bin/../build/classes:/home/work/_app/zookeeper-3.4.14/bin/../zookeeper-server/target/lib/*.jar:/home/work/_app/zookeeper-3.4.14/bin/../build/lib/*.jar:/home/work/_app/zookeeper-3.4.14/bin/../lib/slf4j-log4j12-1.7.25.jar:/home/work/_app/zookeeper-3.4.14/bin/../lib/slf4j-api-1.7.25.jar:/home/work/_app/zookeeper-3.4.14/bin/../lib/netty-3.10.6.Final.jar:/home/work/_app/zookeeper-3.4.14/bin/../lib/log4j-1.2.17.jar:/home/work/_app/zookeeper-3.4.14/bin/../lib/jline-0.9.94.jar:/home/work/_app/zookeeper-3.4.14/bin/../lib/audience-annotations-0.5.0.jar:/home/work/_app/zookeeper-3.4.14/bin/../zookeeper-3.4.14.jar:/home/work/_app/zookeeper-3.4.14/bin/../zookeeper-server/src/main/resources/lib/*.jar:/home/work/_app/zookeeper-3.4.14/bin/../conf:
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.10.0-1062.el7.x86_64
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2019-11-04 21:16:46,595 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2019-11-04 21:16:46,596 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/home/work/_src
2019-11-04 21:16:46,597 [myid:] - INFO  [main:ZooKeeper@442] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@446cdf90
2019-11-04 21:16:46,687 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1025] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
Welcome to ZooKeeper!
JLine support is enabled
2019-11-04 21:16:46,873 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@879] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2019-11-04 21:16:46,954 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x100008a65630000, negotiated timeout = 4000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] quit
Quitting...
2019-11-04 21:17:04,246 [myid:] - INFO  [main:ZooKeeper@693] - Session: 0x100008a65630000 closed
2019-11-04 21:17:04,248 [myid:] - INFO  [main-EventThread:ClientCnxn$EventThread@522] - EventThread shut down for session: 0x100008a65630000

(5) Start the NameNode

On node c0, start the HDFS NameNode with hdfs --daemon start namenode.

【On c0 only】

hdfs --daemon start namenode
jps
[root@c0 _src]# hdfs --daemon start namenode
[root@c0 _src]# jps
5440 JournalNode
4197 QuorumPeerMain
5802 NameNode
5835 Jps

To stop the NameNode: hdfs --daemon stop namenode
Browse to http://c0:50070/ and you should see the NameNode web UI (enter this URL in a browser inside the virtual machine, not in a browser on the host):
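
If there is no browser available inside the virtual machine, you can confirm the web UI from the shell on c0 instead; this is just a convenience check (port 50070 follows this cluster's configuration), and the JMX query shows the HA state of the NameNode:

# 200 means the NameNode web UI is answering
curl -s -o /dev/null -w "%{http_code}\n" http://c0:50070/
# the NameNodeStatus bean reports whether this NameNode is active or standby
curl -s "http://c0:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"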

(6) Copy the NameNode data to the standby NameNode

On the other NameNode machine, c1, run hdfs namenode -bootstrapStandby to copy the metadata from the active NameNode to the standby NameNode.

【Note: on c1】

hdfs namenode -bootstrapStandby
[root@c1 _app]# hdfs namenode -bootstrapStandby
2019-11-04 21:25:54,938 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = c1/192.168.157.12
STARTUP_MSG:   args = [-bootstrapStandby]
STARTUP_MSG:   version = 3.1.2
STARTUP_MSG:   classpath = ...
STARTUP_MSG:   build = https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a; compiled by 'sunilg' on 2019-01-29T01:39Z
STARTUP_MSG:   java = 1.8.0_221
************************************************************/
2019-11-04 21:25:54,966 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2019-11-04 21:25:55,463 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
2019-11-04 21:25:55,872 INFO ha.BootstrapStandby: Found nn: c0, ipc: c0/192.168.157.11:8020
=====================================================
About to bootstrap Standby ID c1 from:
           Nameservice ID: mshkcluster
        Other Namenode ID: c0
  Other NN's HTTP address: http://c0:50070
  Other NN's IPC  address: c0/192.168.157.11:8020
             Namespace ID: 116041773
            Block pool ID: BP-1125323158-192.168.157.11-1572873000961
               Cluster ID: CID-3691ff4d-1b50-4ea0-bca7-9b2f20f314e7
           Layout version: -64
       isUpgradeFinalized: true
=====================================================
2019-11-04 21:25:58,011 INFO common.Storage: Storage directory /home/work/_data/hadoop-3.1.2/namenode has been successfully formatted.
2019-11-04 21:25:58,127 INFO namenode.FSEditLog: Edit logging is async:true
2019-11-04 21:25:58,207 WARN client.QuorumJournalManager: Quorum journal URI 'qjournal://c0:8485;c1:8485/mshkcluster' has an even number of Journal Nodes specified. This is not recommended!
2019-11-04 21:25:58,403 INFO namenode.TransferFsImage: Opening connection to http://c0:50070/imagetransfer?getimage=1&txid=0&storageInfo=-64:116041773:1572873000961:CID-3691ff4d-1b50-4ea0-bca7-9b2f20f314e7&bootstrapstandby=true
2019-11-04 21:25:58,532 INFO common.Util: Combined time for file download and fsync to all disks took 0.00s. The file download took 0.00s at 0.00 KB/s. Synchronous (fsync) write to disk of /home/work/_data/hadoop-3.1.2/namenode/current/fsimage.ckpt_0000000000000000000 took 0.00s.
2019-11-04 21:25:58,532 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 388 bytes.
2019-11-04 21:25:58,613 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at c1/192.168.157.12
************************************************************/

Then start the HDFS NameNode on c1 with hdfs --daemon start namenode.
【Note: on c1】

hdfs --daemon start namenode
jps
[root@c1 _app]# hdfs --daemon start namenode
[root@c1 _app]# jps
3827 QuorumPeerMain
5574 NameNode
5591 Jps
5230 JournalNode

Browse to port 50070 on c1, http://c1:50070/, and you should see the same web UI.

At this point both web UIs show c0 and c1 in the standby state.
  
You can also check the NameNode states with the commands below.
【On c0 only】

hdfs haadmin -getServiceState c0
hdfs haadmin -getServiceState c1
[root@c0 ~]# hdfs haadmin -getServiceState c0
standby
[root@c0 ~]# hdfs haadmin -getServiceState c1
standby

You can also use hdfs haadmin -getAllServiceState to see the state of all NameNodes at once.

[root@c0 _src]# hdfs haadmin -getAllServiceState
c0:8020                                            standby   
c1:8020                                            standby   

(7) Start the HDFS processes

Because automatic failover is enabled in the configuration, the start-dfs.sh script will now automatically start a ZKFC daemon on every machine that runs a NameNode, along with the DataNodes. When the ZKFCs start, they automatically elect one of the NameNodes to become active.
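
Automatic failover itself is controlled by two settings. The sketch below shows them as they would look for this cluster, assuming they were added in the earlier configuration article; the ZooKeeper quorum string matches the connectString that appeared in the hdfs zkfc -formatZK log above.

<!-- hdfs-site.xml (sketch): let the ZKFCs manage failover automatically -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<!-- core-site.xml (sketch): the ZooKeeper ensemble the ZKFCs connect to -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>c0:2181,c1:2181,c2:2181,c3:2181</value>
</property>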

On c0, start all HDFS processes with start-dfs.sh.
【On c0 only】

start-dfs.sh
[root@c0 ~]# start-dfs.sh
Starting namenodes on [c0 c1]
Last login: Mon Mar  4 22:14:22 CST 2019 from lionde17nianmbp on pts/3
c0: namenode is running as process 7768.  Stop it first.
c1: namenode is running as process 17984.  Stop it first.
Starting datanodes
Last login: Sun Mar 10 19:40:52 CST 2019 on pts/3
Starting ZK Failover Controllers on NN hosts [c0 c1]
Last login: Sun Mar 10 19:40:52 CST 2019 on pts/3

To stop them all: stop-dfs.sh

Check the NameNode states again with hdfs haadmin -getAllServiceState; you will find that c0 is now standby and c1 is active.
【On c0 only】

hdfs haadmin -getAllServiceState
[root@c0 ~]# hdfs haadmin -getAllServiceState
c0:8020                                            standby
c1:8020                                            active

(8) Test that HDFS works

Create the test file /home/work/_data/test.mshk.top.txt, enter the following content, and save it:
【On c0 only】

gedit /home/work/_data/test.mshk.top.txt
hello hadoop
hello mshk.top
welcome mshk.top
hello world
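
If the node has no desktop environment for gedit, the same file can be written from the shell; this is an equivalent alternative, not a different test file:

cat > /home/work/_data/test.mshk.top.txt <<'EOF'
hello hadoop
hello mshk.top
welcome mshk.top
hello world
EOF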

Now create an mshk.top directory on HDFS and put /home/work/_data/test.mshk.top.txt into it.
【On c0 only】

  1. Start DFS first (if you already ran step (7), skip ahead to the next set of commands):
start-dfs.sh
hdfs haadmin -getAllServiceState
  2. HDFS file operations:
hdfs dfs -ls /
hdfs dfs -mkdir /mshk.top
hdfs dfs -ls /
hdfs dfs -put /home/work/_data/test.mshk.top.txt /mshk.top
hdfs dfs -ls /mshk.top

Example:

[root@c0 _src]# start-dfs.sh
Starting namenodes on [c0 c1]
上一次登录:一 11月  4 21:55:01 CST 2019pts/0 上
Starting datanodes
上一次登录:一 11月  4 21:58:33 CST 2019pts/0 上
Starting journal nodes [c0 c1]
上一次登录:一 11月  4 21:58:37 CST 2019pts/0 上
Starting ZK Failover Controllers on NN hosts [c0 c1]
上一次登录:一 11月  4 21:58:44 CST 2019pts/0 上
[root@c0 _src]# hdfs haadmin -getAllServiceState
c0:8020                                            standby   
c1:8020                                            active    
[root@c0 _src]# hdfs dfs -ls /
[root@c0 _src]# hdfs dfs -mkdir /mshk.top
[root@c0 _src]# hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-11-04 21:59 /mshk.top
[root@c0 _src]# hdfs dfs -put /home/work/_data/test.mshk.top.txt /mshk.top
[root@c0 _src]# hdfs dfs -ls /mshk.top
Found 1 items
-rw-r--r--   3 root supergroup         58 2019-11-04 21:59 /mshk.top/test.mshk.top.txt
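
To double-check the upload, you can also read the file back directly from HDFS; the output should be the same four lines written earlier:

hdfs dfs -cat /mshk.top/test.mshk.top.txt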

Open the management UI at http://c1:50070 and you can see the file we just added.


(9) Start YARN

Run the start-yarn.sh script to start YARN. Based on the configuration files, start-yarn.sh automatically starts a ResourceManager daemon on every configured master and a NodeManager daemon on the other nodes.
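
Which machines count as masters comes from the ResourceManager HA settings in yarn-site.xml. Below is a minimal sketch assuming the IDs rm1 and rm2 map to c0 and c1; the "Starting resourcemanagers on [ c0 c1]" and "Failing over to rm2" lines in the output further down are consistent with this, but the exact IDs are an assumption about the earlier configuration article.

<!-- yarn-site.xml (sketch): two ResourceManagers, rm1 on c0 and rm2 on c1 -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>c0</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>c1</value>
</property>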

  • 【c0】:
start-yarn.sh
jps
  • 【c1、c2、c3】:
jps
# c0
[root@c0 _src]# start-yarn.sh
Starting resourcemanagers on [ c0 c1]
上一次登录:一 11月  4 21:58:49 CST 2019pts/0 上
Starting nodemanagers
上一次登录:一 11月  4 22:04:36 CST 2019pts/0 上
[root@c0 _src]# jps
4197 QuorumPeerMain
9301 NameNode
9594 JournalNode
10922 Jps
9804 DFSZKFailoverController
10735 ResourceManager

# c1
[root@c1 _app]# jps
3827 QuorumPeerMain
7461 NameNode
7558 JournalNode
7990 ResourceManager
8246 Jps
7658 DFSZKFailoverController

# c2
[root@c2 _src]# jps
6208 NodeManager
5169 JournalNode
6321 Jps
3718 QuorumPeerMain
5993 DataNode

# c3
[root@c3 _src]# jps
6080 NodeManager
5044 JournalNode
6198 Jps
3639 QuorumPeerMain
5864 DataNode

To stop YARN: stop-yarn.sh

On c0, you can view the resource management UI at http://c0:8088.
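
You can also query the ResourceManager HA state from the command line, assuming the rm1/rm2 IDs sketched earlier in this step:

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2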


(10) Test that YARN works

To test that YARN is working, run the classic example: count the word frequencies in the test.mshk.top.txt file we just put into the /mshk.top directory on HDFS.

【On c0 only】

yarn jar /home/work/_app/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar wordcount /mshk.top/test.mshk.top.txt /output
[root@c0 _src]# yarn jar /home/work/_app/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar wordcount /mshk.top/test.mshk.top.txt /output
2019-11-04 22:10:01,926 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
2019-11-04 22:10:03,186 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1572876293270_0001
2019-11-04 22:10:04,104 INFO input.FileInputFormat: Total input files to process : 1
2019-11-04 22:10:04,382 INFO mapreduce.JobSubmitter: number of splits:1
2019-11-04 22:10:04,940 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1572876293270_0001
2019-11-04 22:10:04,943 INFO mapreduce.JobSubmitter: Executing with tokens: []
2019-11-04 22:10:05,400 INFO conf.Configuration: resource-types.xml not found
2019-11-04 22:10:05,400 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2019-11-04 22:10:06,059 INFO impl.YarnClientImpl: Submitted application application_1572876293270_0001
2019-11-04 22:10:06,163 INFO mapreduce.Job: The url to track the job: http://c1:8088/proxy/application_1572876293270_0001/
2019-11-04 22:10:06,164 INFO mapreduce.Job: Running job: job_1572876293270_0001
2019-11-04 22:10:24,666 INFO mapreduce.Job: Job job_1572876293270_0001 running in uber mode : false
2019-11-04 22:10:24,673 INFO mapreduce.Job:  map 0% reduce 0%
2019-11-04 22:10:38,009 INFO mapreduce.Job:  map 100% reduce 0%
2019-11-04 22:10:50,266 INFO mapreduce.Job:  map 100% reduce 100%
2019-11-04 22:10:51,295 INFO mapreduce.Job: Job job_1572876293270_0001 completed successfully
2019-11-04 22:10:51,536 INFO mapreduce.Job: Counters: 53
	File System Counters
		FILE: Number of bytes read=72
		FILE: Number of bytes written=438749
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=168
		HDFS: Number of bytes written=46
		HDFS: Number of read operations=8
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=10089
		Total time spent by all reduces in occupied slots (ms)=8627
		Total time spent by all map tasks (ms)=10089
		Total time spent by all reduce tasks (ms)=8627
		Total vcore-milliseconds taken by all map tasks=10089
		Total vcore-milliseconds taken by all reduce tasks=8627
		Total megabyte-milliseconds taken by all map tasks=5165568
		Total megabyte-milliseconds taken by all reduce tasks=4417024
	Map-Reduce Framework
		Map input records=5
		Map output records=8
		Map output bytes=89
		Map output materialized bytes=72
		Input split bytes=110
		Combine input records=8
		Combine output records=5
		Reduce input groups=5
		Reduce shuffle bytes=72
		Reduce input records=5
		Reduce output records=5
		Spilled Records=10
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=231
		CPU time spent (ms)=1960
		Physical memory (bytes) snapshot=277495808
		Virtual memory (bytes) snapshot=4609847296
		Total committed heap usage (bytes)=142237696
		Peak Map Physical memory (bytes)=187273216
		Peak Map Virtual memory (bytes)=2301186048
		Peak Reduce Physical memory (bytes)=90222592
		Peak Reduce Virtual memory (bytes)=2308661248
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=58
	File Output Format Counters 
		Bytes Written=46

View the word-count results:
【On c0 only】

hadoop fs -cat /output/part-*
[root@c0 _src]# hadoop fs -cat /output/part-*
hadoop	1
hello	3
mshk.top	2
welcome	1
world	1
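
One thing to keep in mind if you rerun the job: MapReduce refuses to write to an output directory that already exists, so remove /output first (or pick a new output path):

hdfs dfs -rm -r /output
yarn jar /home/work/_app/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar wordcount /mshk.top/test.mshk.top.txt /output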