This article walks through an example of setting up a highly available Hadoop cluster. The editor found it quite practical and shares it here for reference; follow along below.
Experiment Environment

master: 192.168.10.131
slave1: 192.168.10.129
slave2: 192.168.10.130
OS: ubuntu-16.04.3, hadoop-2.7.1, zookeeper-3.4.8

Installation Steps

1. Install the JDK
2. Change the hostnames
3. Edit the hosts mapping and configure passwordless SSH
4. Set up time synchronization
5. Install Hadoop under /opt/Data
6. Edit the Hadoop configuration files
7. Install and configure the ZooKeeper cluster
8. Start the cluster
Install the JDK under /opt
tar -zxvf jdk-8u221-linux-x64.tar.gz -C /opt
Configure the environment variables
vim /etc/profile

# jdk
export JAVA_HOME=/opt/jdk1.8.0_221
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

source /etc/profile
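A quick way to confirm the JDK and the environment variables are in place (a minimal check, assuming the paths above):

java -version      # should report java version "1.8.0_221"
echo $JAVA_HOME    # should print /opt/jdk1.8.0_221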
Set the hostnames of the three virtual machines to master, slave1 and slave2 respectively
vim /etc/hostname
Edit the hosts file; this must be done on every host
vim /etc/hosts
192.168.10.131 master
192.168.10.129 slave1
192.168.10.130 slave2
Configure passwordless SSH
First, disable the firewall
# 1. Check the firewall status
sudo ufw status
# 2. Allow a specific port, e.g. 8381
sudo ufw allow 8381
# 3. Enable the firewall
sudo ufw enable
# 4. Disable the firewall
sudo ufw disable
# 5. Reload the firewall
sudo ufw reload
# 6. Remove the allow rule for a port, e.g. 80
sudo ufw delete allow 80
# 7. Check the listening ports
netstat -ltn
During startup the cluster needs to SSH into the other hosts. To avoid typing the remote host's password every time, configure passwordless login (press Enter at every prompt):
ssh-keygen -t rsa
Copy each host's public key to itself and to the other hosts
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2
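Before continuing, it is worth confirming that every host can be reached without a password prompt; a minimal check from each machine:

ssh root@master hostname   # should print "master" without asking for a password
ssh root@slave1 hostname   # should print "slave1"
ssh root@slave2 hostname   # should print "slave2"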
Install the ntp and ntpdate packages (the ntp daemon serves the time, ntpdate lets the slaves sync against master)

apt-get install ntp ntpdate
Edit the NTP configuration file
vim /etc/ntp.conf

# /etc/ntp.conf, configuration for ntpd; see ntp.conf(5) for help
driftfile /var/lib/ntp/ntp.drift

# Enable this if you want statistics to be logged.
#statsdir /var/log/ntpstats/

statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable

# The default Ubuntu pool servers stay commented out so that the cluster only
# uses the time sources defined further below.
#pool 0.ubuntu.pool.ntp.org iburst
#pool 1.ubuntu.pool.ntp.org iburst
#pool 2.ubuntu.pool.ntp.org iburst
#pool 3.ubuntu.pool.ntp.org iburst

# Use Ubuntu's ntp server as a fallback.
#pool ntp.ubuntu.com

# Access control configuration; see /usr/share/doc/ntp-doc/html/accopt.html for
# details. Note that "restrict" applies to both servers and clients.
# By default, exchange time with everybody, but don't allow configuration.
restrict -4 default kod notrap nomodify nopeer noquery limited
restrict -6 default kod notrap nomodify nopeer noquery limited

# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1

# Needed for adding pool entries
restrict source notrap nomodify noquery

# Allow hosts on the local network to synchronize time with this server,
# but do not let them modify the time on this server.
#restrict 192.168.10.131 mask 255.255.255.0 nomodify notrust
restrict 192.168.10.129 mask 255.255.255.0 nomodify notrust
restrict 192.168.10.130 mask 255.255.255.0 nomodify notrust

# Allow an upstream time server to adjust the local clock.
#restrict times.aliyun.com nomodify
#restrict ntp.aliyun.com nomodify
#restrict cn.pool.ntp.org nomodify

# Time servers to synchronize with ("prefer" marks the preferred source).
server 192.168.10.131 prefer
#server times.aliyun.com iburst prefer
#server ntp.aliyun.com iburst
#server cn.pool.ntp.org iburst

#logfile /var/log/ntpstats/ntpd.log   # ntp log file
#pidfile /var/run/ntp.pid             # pid file path

# If you want to provide time to your local subnet, change the next line.
#broadcast 192.168.123.255

# If you want to listen to time broadcasts on your local subnet, de-comment the
# next lines. Please do this only if you trust everybody on the network!
#disable auth
#broadcastclient

# Changes required to use pps synchronisation as explained in documentation:
# http://www.ntp.org/ntpfaq/NTP-s-config-adv.htm#AEN3918
#server 127.127.8.1 mode 135 prefer   # Meinberg GPS167 with PPS
#fudge 127.127.8.1 time1 0.0042       # relative to PPS for my hardware
#server 127.127.22.1                  # ATOM(PPS)
#fudge 127.127.22.1 flag3 1           # enable PPS API

# Fall back to the local clock if no other source is reachable.
server 127.127.1.0
fudge 127.127.1.0 stratum 10
Start the NTP service and check the synchronization status
service ntp start   # start the NTP service (on Ubuntu the service is named "ntp", not "ntpd")
ntpq -p             # show the peers and their synchronization state
ntpstat             # show the synchronization result
On the slave hosts, restart the service and synchronize the time with master
/etc/init.d/ntp restart
ntpdate 192.168.10.131
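Optionally, the slaves can re-synchronize against master on a schedule so the clocks do not drift between manual runs; a minimal sketch using cron (the 30-minute interval is an arbitrary choice, and ntpdate is assumed to live at /usr/sbin/ntpdate):

crontab -e
# add the following line to sync with master every 30 minutes
*/30 * * * * /usr/sbin/ntpdate 192.168.10.131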
Create a Data directory under /opt
cd /opt
mkdir Data
Download and extract Hadoop into /opt/Data
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
tar -zxvf hadoop-2.7.1.tar.gz -C /opt/Data
Configure the environment variables (add to /etc/profile as before)
# HADOOP
export HADOOP_HOME=/opt/Data/hadoop-2.7.1
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export HADOOP_YARN_HOME=$HADOOP_HOME
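After reloading the profile, the Hadoop installation can be sanity-checked from any directory (a minimal check, assuming the paths above):

source /etc/profile
hadoop version    # should report Hadoop 2.7.1
which hdfs        # should resolve to /opt/Data/hadoop-2.7.1/bin/hdfs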
The following configuration files are under hadoop-2.7.1/etc/hadoop
Edit hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_221
Edit core-site.xml
<configuration> <!-- 指定hdfs的nameservice為ns1 --> <property> <name>fs.defaultFS</name> <value>hdfs://ns1/</value> </property> <!-- 指定hadoop臨時目錄 --> <property> <name>hadoop.tmp.dir</name> <value>/opt/Data/hadoop-2.7.1/tmp</value> </property> <!-- 指定zookeeper地址 --> <property> <name>ha.zookeeper.quorum</name> <value>slave1:2181,slave2:2181</value> </property> <!--修改core-site.xml中的ipc參數,防止出現連接journalnode服務ConnectException--> <property> <name>ipc.client.connect.max.retries</name> <value>100</value> <description>Indicates the number of retries a client will make to establish a server connection.</description> </property> </configuration>
Edit hdfs-site.xml
<configuration> <!--指定hdfs的nameservice為ns1,需要和core-site.xml中的保持一致 --> <property> <name>dfs.nameservices</name> <value>ns1</value> </property> <!-- ns1下面有兩個NameNode,分別是nn1,nn2 --> <property> <name>dfs.ha.namenodes.ns1</name> <value>nn1,nn2</value> </property> <!-- nn1的RPC通信地址 --> <property> <name>dfs.namenode.rpc-address.ns1.nn1</name> <value>master:9820</value> </property> <!-- nn1的http通信地址 --> <property> <name>dfs.namenode.http-address.ns1.nn1</name> <value>master:9870</value> </property> <!-- nn2的RPC通信地址 --> <property> <name>dfs.namenode.rpc-address.ns1.nn2</name> <value>slave1:9820</value> </property> <!-- nn2的http通信地址 --> <property> <name>dfs.namenode.http-address.ns1.nn2</name> <value>slave1:9870</value> </property> <!-- 指定NameNode的日志在JournalNode上的存放位置 --> <property> <name>dfs.namenode.shared.edits.dir</name> <value>qjournal://master:8485;slave1:8485;slave2:8485/ns1</value> </property> <!-- 指定JournalNode在本地磁盤存放數據的位置 --> <property> <name>dfs.journalnode.edits.dir</name> <value>/opt/Data/hadoop-2.7.1/journal</value> </property> <!-- 開啟NameNode失敗自動切換 --> <property> <name>dfs.ha.automatic-failover.enabled</name> <value>true</value> </property> <!-- 配置失敗自動切換實現方式 --> <property> <name>dfs.client.failover.proxy.provider.ns1</name> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> </property> <!-- 配置隔離機制方法,多個機制用換行分割,即每個機制暫用一行--> <property> <name>dfs.ha.fencing.methods</name> <value> sshfence shell(/bin/true) </value> </property> <!-- 使用sshfence隔離機制時需要ssh免登陸 --> <property> <name>dfs.ha.fencing.ssh.private-key-files</name> <value>/root/.ssh/id_rsa</value> </property> <!-- 配置sshfence隔離機制超時時間 --> <property> <name>dfs.ha.fencing.ssh.connect-timeout</name> <value>30000</value> </property> <!--配置namenode存放元數據的目錄,可以不配置,如果不配置則默認放到hadoop.tmp.dir下--> <property> <name>dfs.namenode.name.dir</name> <value>/opt/Data/hadoop-2.7.1/data/name</value> </property> <!--配置datanode存放元數據的目錄,可以不配置,如果不配置則默認放到hadoop.tmp.dir下--> <property> <name>dfs.datanode.data.dir</name> <value>/opt/Data/hadoop-2.7.1/data/data</value> </property> <!--配置復本數量--> <property> <name>dfs.replication</name> <value>2</value> </property> <!--設置用戶的操作權限,false表示關閉權限驗證,任何用戶都可以操作--> <property> <name>dfs.webhdfs.enabled</name> <value>true</value> </property> </configuration>
Edit mapred-site.xml
Create the file from the template first:

cp mapred-site.xml.template mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Edit yarn-site.xml
<configuration> <!-- 指定nodemanager啟動時加載server的方式為shuffle server --> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <!--配置yarn的高可用--> <property> <name>yarn.resourcemanager.ha.enabled</name> <value>true</value> </property> <!--執行yarn集群的別名--> <property> <name>yarn.resourcemanager.cluster-id</name> <value>cluster1</value> </property> <!--指定兩個resourcemaneger的名稱--> <property> <name>yarn.resourcemanager.ha.rm-ids</name> <value>rm1,rm2</value> </property> <!--配置rm1的主機--> <property> <name>yarn.resourcemanager.hostname.rm1</name> <value>master</value> </property> <!--配置rm2的主機--> <property> <name>yarn.resourcemanager.hostname.rm2</name> <value>slave1</value> </property> <!--配置2個resourcemanager節點--> <property> <name>yarn.resourcemanager.zk-address</name> <value>slave1:2181,slave2:2181</value> </property> <!--zookeeper集群地址--> <property> <name>yarn.nodemanager.vmem-check-enabled</name> <value>false</value> <description>Whether virtual memory limits will be enforced for containers</description> </property> <!--物理內存8G--> <property> <name>yarn.nodemanager.vmem-pmem-ratio</name> <value>8</value> <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description> </property> </configuration>
Edit the slaves file
master
slave1
slave2
Download and extract zookeeper-3.4.8.tar.gz
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.8/zookeeper-3.4.8.tar.gz
tar -zxvf zookeeper-3.4.8.tar.gz -C /opt/Data
Add ZooKeeper to the environment variables in /etc/profile

# zookeeper
export ZOOKEEPER_HOME=/opt/Data/zookeeper-3.4.8
export PATH=$PATH:$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin
Enter the conf directory and copy zoo_sample.cfg to zoo.cfg

cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg
dataDir=/opt/Data/zookeeper-3.4.8/tmp   # create the tmp directory under zookeeper-3.4.8 first
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
Create a myid file in the tmp directory
vim myid
1    # the other hosts use a different number: 2 on slave1, 3 on slave2
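Equivalently, the tmp directory and the myid file can be created non-interactively; a minimal sketch using the paths above (run on master; after the Data directory is copied to the slaves in the scp step below, the same file needs the values 2 and 3 there):

mkdir -p /opt/Data/zookeeper-3.4.8/tmp
echo 1 > /opt/Data/zookeeper-3.4.8/tmp/myid
cat /opt/Data/zookeeper-3.4.8/tmp/myid    # should print 1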
Format the NameNode on master. Note that with quorum-journal-based HA the JournalNodes (started further below) must already be running when the NameNode is formatted, because the format writes to the shared edits directory:

hdfs namenode -format
Copy the Data directory to the other two hosts
scp -r /opt/Data root@slave1:/opt
scp -r /opt/Data root@slave2:/opt
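Note that the copied Data directory still contains master's myid (1), so the value has to be corrected on each slave; a minimal sketch assuming the directory layout above:

ssh root@slave1 "echo 2 > /opt/Data/zookeeper-3.4.8/tmp/myid"
ssh root@slave2 "echo 3 > /opt/Data/zookeeper-3.4.8/tmp/myid"

The JDK, the /etc/profile entries and the hosts file also need to be present on the slaves. Alternatively, instead of copying the NameNode metadata via scp, the standby NameNode on slave1 can be initialized with hdfs namenode -bootstrapStandby once the active NameNode is running.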
Start ZooKeeper; run this on every node

zkServer.sh start
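Each node's ZooKeeper role can then be checked; with all three nodes running, one should report leader and the others follower:

zkServer.sh status   # run on every node; expect "Mode: leader" on one node and "Mode: follower" on the others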
Format the HA state in ZooKeeper (this only needs to be run once, on one of the NameNode hosts, e.g. master)
hdfs zkfc -formatZK
Start the JournalNode; repeat on every host listed in dfs.namenode.shared.edits.dir (master, slave1 and slave2)
hadoop-daemon.sh start journalnode
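A quick way to verify the JournalNodes are up on each host before bringing up HDFS (8485 is the JournalNode RPC port from dfs.namenode.shared.edits.dir):

jps                          # the list should include JournalNode
netstat -ntlp | grep 8485    # the JournalNode should be listening here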
Start the cluster
start-all.sh
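In Hadoop 2.x, start-yarn.sh only starts the ResourceManager on the host where it is invoked, so the second ResourceManager on slave1 usually has to be started by hand. Afterwards the HA state of the NameNodes and ResourceManagers can be queried; a sketch using the nn1/nn2 and rm1/rm2 ids configured above:

# on slave1
yarn-daemon.sh start resourcemanager

# on any node
hdfs haadmin -getServiceState nn1    # expect "active" or "standby"
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2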
Check the ports
netstat -ntlup   # shows the ports occupied by the services
View the running Java processes with jps
Thanks for reading! That is all for this example walkthrough of setting up a highly available Hadoop cluster. Hopefully the content above is helpful and teaches you something new. If you found the article useful, feel free to share it so more people can see it!