Disabling and Redeploying HDFS HA
Disabling and Redeploying HDFS HA Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
- Go to the HDFS service.
- Select Actions > Disable High Availability.
- Select the hosts for the NameNode and the SecondaryNameNode and click Continue.
- Select the HDFS checkpoint directory and click Continue.
- Confirm that you want to take this action.
- Update the Hive Metastore NameNode.
Cloudera Manager ensures that one NameNode is active, and saves the namespace. Then it stops the standby NameNode, creates a SecondaryNameNode, removes the standby NameNode role, and restarts all the HDFS services.
Disabling and Redeploying HDFS HA Using the Command Line
- If you use Cloudera Manager, do not use these command-line instructions.
- This information applies specifically to CDH 5.8.x. If you use a lower version of CDH, see the documentation for that version located at Cloudera Documentation.
Step 1: Shut Down the Cluster
- Shut down Hadoop services across your entire cluster. Do this from Cloudera Manager; or, if you are not using Cloudera Manager, run the following command on every host in your cluster:
$ for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done
- Check each host to make sure that there are no processes running as the hdfs, yarn, mapred, or httpfs users. As root, run:
# ps -aef | grep java
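As a more targeted check, you can also list any surviving processes owned by those service accounts directly; no output means the host is clean. A minimal sketch, assuming a Linux ps that accepts a comma-separated user list:
# ps -u hdfs,yarn,mapred,httpfs -o pid,user,args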
Step 2: Unconfigure HA
- Disable the software configuration:
- If you are using Quorum-based storage and want to unconfigure it, unconfigure the HA properties described under Enabling HDFS HA Using the Command Line. If you intend to redeploy HDFS HA later, comment out the HA properties rather than deleting them.
- If you were using NFS shared storage in CDH 4, you must unconfigure the properties described below before upgrading to CDH 5.
- Move the NameNode metadata directories on the standby NameNode to a backup location, as shown in the sketch below. The location of these directories is configured by dfs.namenode.name.dir and dfs.namenode.edits.dir.
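For example, a minimal sketch of backing up the metadata on the standby NameNode; /data/1/dfs/nn is a placeholder, so substitute the directories from your own dfs.namenode.name.dir and dfs.namenode.edits.dir settings:
# mv /data/1/dfs/nn /data/1/dfs/nn.ha-backup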
Step 3: Restart the Cluster
$ for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x start ; done
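Once the services are back up, you can sanity-check the now non-HA NameNode, for example by requesting a filesystem report as the hdfs user (a sketch):
$ sudo -u hdfs hdfs dfsadmin -report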
Properties to unconfigure to disable an HDFS HA configuration using NFS shared storage
- In your core-site.xml file:
fs.defaultFS (formerly fs.default.name)
Optionally, you may have configured the default path for Hadoop clients to use the HA-enabled logical URI. For example, if you used mycluster as the nameservice ID as shown below, this will be the value of the authority portion of all of your HDFS paths.
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
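After editing the file, you can confirm the value that clients actually resolve with the standard getconf tool (a sketch):
$ hdfs getconf -confKey fs.defaultFS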
- In your hdfs-site.xml configuration file:
dfs.nameservices
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
Note: If you are also using HDFS federation, this configuration setting will include the list of other nameservices, HA or otherwise, as a comma-separated list.
dfs.ha.namenodes.[nameservice ID]
A list of comma-separated NameNode IDs used by DataNodes to determine all the NameNodes in the cluster. For example, if you used mycluster as the nameservice ID, and you used nn1 and nn2 as the individual IDs of the NameNodes, you would have configured this as follows:
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
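Before disabling HA, these NameNode IDs are also what you pass to haadmin to check which NameNode is currently active (a sketch, assuming the example IDs above):
$ sudo -u hdfs hdfs haadmin -getServiceState nn1
active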
dfs.namenode.rpc-address.[nameservice ID]
For both of the previously-configured NameNode IDs, the full address and RPC port of the NameNode process. For example:
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>machine1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>machine2.example.com:8020</value>
</property>
Note: You may have similarly configured the servicerpc-address setting.
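You can list the NameNode RPC addresses that the current configuration resolves with getconf (a sketch):
$ hdfs getconf -nnRpcAddresses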
dfs.namenode.http-address.[nameservice ID]
The addresses for both NameNodes' HTTP servers to listen on. For example:
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>machine1.example.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>machine2.example.com:50070</value>
</property>
Note: If you have Hadoop's Kerberos security features enabled, and you use HSFTP, you will have set the https-address similarly for each NameNode.
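A quick way to confirm that each HTTP address is reachable is to probe the NameNode web UI; a sketch, assuming an unsecured cluster (adjust for Kerberos or TLS as needed):
$ curl -s -o /dev/null -w '%{http_code}\n' http://machine1.example.com:50070/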
dfs.namenode.shared.edits.dir
The path to the remote shared edits directory which the standby NameNode uses to stay up-to-date with all the file system changes the active NameNode makes. You should have configured only one of these directories, mounted read/write on both NameNode machines. The value of this setting should be the absolute path to this directory on the NameNode machines. For example:
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>file:///mnt/filer1/dfs/ha-name-dir-shared</value>
</property>
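Before unconfiguring this property, you can verify that the shared edits directory is still mounted and writable on a NameNode host; a minimal sketch, assuming the example mount above:
$ mount | grep /mnt/filer1
$ sudo -u hdfs touch /mnt/filer1/dfs/ha-name-dir-shared/.rw-test
$ sudo -u hdfs rm /mnt/filer1/dfs/ha-name-dir-shared/.rw-test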
dfs.client.failover.proxy.provider.[nameservice ID]
The name of the Java class that the DFS client uses to determine which NameNode is the current active, and therefore which NameNode is currently serving client requests. The only implementation which shipped with Hadoop is the ConfiguredFailoverProxyProvider. For example:
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
dfs.ha.fencing.methods
A list of scripts or Java classes used to fence the active NameNode during a failover.
Note: If you implemented your own custom fencing method, see the org.apache.hadoop.ha.NodeFencer class.
- The sshfence fencing method
sshfence - SSH to the active NameNode and kill the process
For example:
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>
Optionally, you may have configured a non-standard username or port for the SSH connection, as well as a timeout, in milliseconds:
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence([[username][:port]])</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
  <description>
    SSH connection timeout, in milliseconds, to use with the builtin sshfence fencer.
  </description>
</property>
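For sshfence to have worked, the standby NameNode must be able to reach the active NameNode with that key non-interactively; you can test this by hand with a sketch like the following, assuming the example key and host above:
$ ssh -i /home/exampleuser/.ssh/id_rsa -o BatchMode=yes machine1.example.com true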
- The shell fencing method
shell - run an arbitrary shell command to fence the active NameNode
The shell fencing method runs an arbitrary shell command, which you may have configured as shown below:
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
</property>
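For reference, a hypothetical fencing script might look like the sketch below. The host argument and the pkill strategy are illustrative only, not a script shipped with CDH; a real fencer must exit 0 only when it is certain the NameNode has been fenced:
#!/bin/bash
# Hypothetical fencer sketch: SSH to the given host and kill the NameNode.
# The script's exit status is the exit status of the remote pkill, which
# is non-zero if no matching process was found.
target_host="$1"
ssh "$target_host" "pkill -9 -f org.apache.hadoop.hdfs.server.namenode.NameNode"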
- If you configured automatic failover, in your hdfs-site.xml file:
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
- In your core-site.xml file:
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
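You can confirm that each ZooKeeper server in the quorum is responding with the standard ruok four-letter command; a sketch, assuming nc is installed:
$ echo ruok | nc zk1.example.com 2181
imok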
Other properties: There are several other configuration parameters that you may have set to control the behavior of automatic failover, although they are not necessary for most installations. See the configuration section of the Hadoop documentation for details.
Redeploying HDFS High Availability
If you need to redeploy HA using Quorum-based storage after temporarily disabling it, proceed as follows:
- Shut down the cluster as described in Step 1: Shut Down the Cluster.
- Uncomment the properties you commented out in Step 2: Unconfigure HA.
- Deploy HDFS HA, following the instructions under Deploying HDFS High Availability.