Configuring HDFS high availability (HA) on CentOS involves several steps, including setting up the NameNodes, JournalNodes, DataNodes and ZooKeeper, and editing the related configuration files. (In an HA cluster the standby NameNode takes over the checkpointing role, so no SecondaryNameNode is run.) The following is a basic guide to configuring HDFS high availability on a CentOS system.
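The guide assumes a three-node cluster; the hostnames are examples, and the DataNode placement shown here is just one possible layout, so adjust both to your environment:
node1: NameNode (nn1), JournalNode, ZooKeeper, DataNode
node2: NameNode (nn2), JournalNode, ZooKeeper, DataNode
node3: JournalNode, ZooKeeper, DataNode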
Download and extract ZooKeeper:
wget https://downloads.apache.org/zookeeper/zookeeper-3.8.0/apache-zookeeper-3.8.0-bin.tar.gz
tar -xzf apache-zookeeper-3.8.0-bin.tar.gz
mv apache-zookeeper-3.8.0-bin /opt/zookeeper
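Both ZooKeeper and Hadoop require a Java runtime on every node. If a JDK is not installed yet, on CentOS 7 it can typically be installed with the following (the package name may vary by release):
yum install -y java-1.8.0-openjdk-devel
java -version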
Configure ZooKeeper:
The binary distribution ships only a zoo_sample.cfg, so copy it to /opt/zookeeper/conf/zoo.cfg if needed, then edit /opt/zookeeper/conf/zoo.cfg and add or modify the following:
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
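The dataDir configured above must exist before ZooKeeper starts (and before the myid files below are created), so create it on every node:
mkdir -p /var/lib/zookeeper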
Create the myid file: on each node, create a myid file containing that node's ID.
echo 1 > /var/lib/zookeeper/myid # on node1
echo 2 > /var/lib/zookeeper/myid # on node2
echo 3 > /var/lib/zookeeper/myid # on node3
Start the ZooKeeper service (run this on every ZooKeeper node):
/opt/zookeeper/bin/zkServer.sh start
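After all three instances are up, you can verify that the ensemble formed correctly; each node should report Mode: leader or Mode: follower:
/opt/zookeeper/bin/zkServer.sh status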
Download and extract Hadoop:
wget https://downloads.apache.org/hadoop/core/hadoop-3.2.0/hadoop-3.2.0.tar.gz
tar -xzf hadoop-3.2.0.tar.gz
mv hadoop-3.2.0 /opt/hadoop
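The same Hadoop installation (and, later, the same configuration files) must exist on every node. One simple approach is to copy the directory from node1 to the other hosts, for example:
scp -r /opt/hadoop node2:/opt/
scp -r /opt/hadoop node3:/opt/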
Configure the environment variables:
Edit the /etc/profile file and add the following:
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
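Reload the profile and confirm the hadoop command is found. Hadoop's scripts also need JAVA_HOME; a common place to set it is $HADOOP_HOME/etc/hadoop/hadoop-env.sh (the JDK path below is only an example, point it at your actual installation):
source /etc/profile
hadoop version
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh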
Configure core-site.xml:
Edit the /opt/hadoop/etc/hadoop/core-site.xml file and add the following:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>node1:2181,node2:2181,node3:2181</value>
</property>
</configuration>
Configure hdfs-site.xml:
Edit the /opt/hadoop/etc/hadoop/hdfs-site.xml file and add the following:
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>node1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>node2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>node1:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>node2:9870</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/var/lib/hadoop/journalnode</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
</configuration>
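The sshfence method together with dfs.ha.fencing.ssh.private-key-files assumes passwordless SSH between the two NameNode hosts using that key. If it is not already set up, a minimal sketch (run on node1, then repeat in the other direction on node2):
ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
ssh-copy-id root@node2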
Still in /opt/hadoop/etc/hadoop/hdfs-site.xml, also specify the local storage directory for the DataNodes:
<property>
<name>dfs.datanode.data.dir</name>
<value>/var/lib/hadoop/datanode</value>
</property>
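Because dfs.namenode.shared.edits.dir points at a quorum journal (qjournal://...), the JournalNodes must be running before the first NameNode can be formatted. Start one on each of node1, node2 and node3:
hdfs --daemon start journalnode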
Format the NameNode: on the first NameNode (node1/nn1), run:
hdfs namenode -format
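After formatting, start the first NameNode so the second one can copy its metadata, initialize the standby with -bootstrapStandby, and initialize the failover state in ZooKeeper once with -formatZK. A typical sequence:
hdfs --daemon start namenode        # on node1
hdfs namenode -bootstrapStandby     # on node2
hdfs zkfc -formatZK                 # on node1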
Start the Hadoop services (make sure $HADOOP_HOME/etc/hadoop/workers lists the DataNode hosts; start-dfs.sh then brings up the NameNodes, DataNodes, JournalNodes and ZKFC daemons):
start-dfs.sh
Check which NameNode is active and which is standby:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
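As a quick smoke test, write a file through the nameservice and confirm the cluster reports live DataNodes (the paths used here are only examples):
hdfs dfs -mkdir -p /tmp/hatest
hdfs dfs -put /etc/hosts /tmp/hatest/
hdfs dfsadmin -report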
With the steps above you can configure a highly available HDFS cluster on CentOS. Adjust the configuration to your own requirements and environment.