This is the documentation for Cloudera Enterprise 5.8.x. Documentation for other versions is available at Cloudera Documentation.

Upgrading HBase

  Note: To see which version of HBase is shipping in CDH 5, check the Version and Packaging Information. For important information on new and changed components, see the CDH 5 Release Notes.
  Important: Before you start, make sure you have read and understood the previous section, New Features and Changes for HBase in CDH 5, and check the Known Issues in CDH 5 and Incompatible Changes and Limitations for HBase.

Coprocessors and Custom JARs

When upgrading HBase from one major version to another (such as upgrading from CDH 4 to CDH 5), you must recompile coprocessors and custom JARs after the upgrade.

Never rely on HBase directory layout on disk.

The HBase directory layout is an implementation detail and is subject to change. Do not rely on the directory layout for client or administration functionality. Instead, access HBase using the supported APIs.

Upgrading HBase from CDH 4 to CDH 5

CDH 5.0 HBase is based on Apache HBase 0.96.1.1. Remember that once a cluster has been upgraded to CDH 5, it cannot be reverted to CDH 4. To ensure a smooth upgrade, this section guides you through the steps involved in upgrading HBase from the older CDH 4.x releases to CDH 5.

These instructions also apply to upgrading HBase from CDH 4.x directly to CDH 5.1.0, which is a supported path.

When upgrading from CDH 4.x to CDH 5.5.1, extra steps are required. See Extra steps must be taken when upgrading from CDH 4.x to CDH 5.5.1.

Prerequisites

HDFS and ZooKeeper should be available while upgrading HBase.

Overview of Upgrade Procedure

Before you can upgrade HBase from CDH 4 to CDH 5, your HFiles must be upgraded from HFile v1 format to HFile v2, because CDH 5 no longer supports HFile v1. The procedure differs depending on whether you use Cloudera Manager or the command line, but the result is the same. The first step is to check the HFiles for instances of HFile v1 and mark them to be upgraded to HFile v2. The same check reports corrupted files and files with unknown versions, which you must remove manually. The next step is to rewrite the HFiles during the next major compaction. After the HFiles are upgraded, you can continue the upgrade. After the upgrade is complete, you must recompile custom coprocessors and JARs.

Upgrade HBase Using the Command Line

CDH 5 comes with an upgrade script for HBase. You can run /usr/lib/hbase/bin/hbase upgrade without options to see its Help section. The script runs in two modes: -check and -execute.

Step 1: Check for HFile v1 files and compact if necessary

  1. Run the upgrade command in -check mode, and examine the output.
    $ /usr/lib/hbase/bin/hbase upgrade -check
    Your output should be similar to the following:
    Tables Processed:
    hdfs://localhost:41020/myHBase/.META.
    hdfs://localhost:41020/myHBase/usertable
    hdfs://localhost:41020/myHBase/TestTable
    hdfs://localhost:41020/myHBase/t
    
    Count of HFileV1: 2
    HFileV1:
    hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/249450144068442524
    hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af/family/249450144068442512
    
    Count of corrupted files: 1
    Corrupted Files:
    hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/1
    Count of Regions with HFileV1: 2
    Regions to Major Compact:
    hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812
    hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af
    In the example above, the script has detected two HFile v1 files, one corrupted file, and two regions that need a major compaction.

    By default, the script scans the root directory, as defined by hbase.rootdir. To scan a specific directory, use the -dir option. For example, the following command scans the /myHBase/testTable directory.

    $ /usr/lib/hbase/bin/hbase upgrade -check -dir /myHBase/testTable
  2. Trigger a major compaction on each of the reported regions. This major compaction rewrites the files from HFile v1 to HFile v2 format. To run the major compaction, start HBase Shell and issue the major_compact command.
    $ /usr/lib/hbase/bin/hbase shell
    hbase> major_compact 'usertable'
    You can also do this in a single step by using the echo shell built-in command.
    $ echo "major_compact 'usertable'" | /usr/lib/hbase/bin/hbase shell
  3. Once all the HFileV1 files have been rewritten, running the upgrade script with the -check option again will return a "No HFile v1 found" message. It is then safe to proceed with the upgrade.
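
The check-and-compact cycle in this step can be scripted. The following is a minimal sketch, not part of the CDH tooling: the function names are hypothetical, and it assumes that the -check output has the layout shown in the example above, with region paths of the form <rootdir>/<table>/<region>.

```shell
#!/bin/sh
# Hypothetical helpers. They only parse text and never modify HBase.

# Print the region paths listed under "Regions to Major Compact:" in
# "hbase upgrade -check" output read from stdin.
regions_to_compact() {
  awk '
    /Regions to Major Compact:/ { in_list = 1; next }
    in_list && /hdfs:\/\// { gsub(/^[ \t]+|[ \t]+$/, ""); print; next }
    in_list { in_list = 0 }
  '
}

# Print each affected table once. Assumes the table name is the
# second-to-last component of a region path.
tables_to_compact() {
  regions_to_compact | awk -F/ 'NF >= 2 { print $(NF - 1) }' | sort -u
}

# Succeed if the check output reports that no HFile v1 files remain.
no_hfile_v1_left() {
  grep -q 'No HFile v1 found'
}
```

For example, you might save the check output to a file, feed that file to tables_to_compact to list the tables to pass to major_compact, and rerun the check until no_hfile_v1_left succeeds on the new output. Major compactions requested from HBase Shell run asynchronously, so allow time for them to finish before you check again.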

Step 2: Gracefully shut down CDH 4 HBase cluster

Shut down your CDH 4 HBase cluster before you run the upgrade script in -execute mode.

To shut down HBase gracefully:

  1. Stop the REST and Thrift server and clients, then stop the cluster.
    1. Stop the Thrift server and clients:
      sudo service hbase-thrift stop
      Stop the REST server:
      sudo service hbase-rest stop
    2. Stop the cluster by shutting down the master and the RegionServers:
      1. Use the following command on the master node:
        sudo service hbase-master stop
      2. Use the following command on each node hosting a RegionServer:
        sudo service hbase-regionserver stop
  2. Stop the ZooKeeper Server:
    $ sudo service zookeeper-server stop
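
The shutdown order above can be captured in a small per-host helper. This is a sketch under stated assumptions: stop_hbase_roles and the RUN variable are hypothetical names, and the function only orders the services on a single host, so you must still coordinate the order across hosts yourself (master host first, then RegionServer hosts, then ZooKeeper).

```shell
#!/bin/sh
# Stop the roles named in "$@" in the order documented above:
# Thrift, REST, Master, RegionServer, then ZooKeeper.
# RUN is a hypothetical hook: it defaults to sudo; set RUN=echo to
# print the commands instead of running them.
stop_hbase_roles() {
  for svc in hbase-thrift hbase-rest hbase-master \
             hbase-regionserver zookeeper-server; do
    for role in "$@"; do
      if [ "$role" = "$svc" ]; then
        ${RUN:-sudo} service "$svc" stop
      fi
    done
  done
}
```

With RUN set to echo, calling stop_hbase_roles zookeeper-server hbase-master prints the two stop commands with hbase-master first, regardless of the order of the arguments.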

Step 3: Uninstall the old version of HBase and replace it with the new version.

  1. To remove HBase on Red-Hat-compatible systems:
    $ sudo yum remove hadoop-hbase

    To remove HBase on SLES systems:

    $ sudo zypper remove hadoop-hbase

    To remove HBase on Ubuntu and Debian systems:

    $ sudo apt-get purge hadoop-hbase
    Warning: If you are upgrading an Ubuntu or Debian system from CDH3u3 or lower, you must use apt-get purge (rather than apt-get remove) to make sure the re-install succeeds, but be aware that apt-get purge removes all your configuration data. If you have modified any configuration files, DO NOT PROCEED before backing them up.

    Caution: On Ubuntu systems, make sure you remove HBase before removing ZooKeeper; otherwise your HBase configuration will be deleted. This is because hadoop-hbase depends on hadoop-zookeeper, and so purging hadoop-zookeeper will purge hadoop-hbase.
  2. Follow the instructions for installing the new version of HBase at HBase Installation.
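
Because apt-get purge deletes configuration data, it can help to back up the configuration directory before removing the package. The sketch below is illustrative only: both function names and the family labels (rhel, sles, debian) are hypothetical, and the location of your HBase configuration directory is something you need to confirm on your own hosts.

```shell
#!/bin/sh
# Print the removal command documented above for a package-manager
# family. The family labels are names chosen for this sketch.
hbase_remove_command() {
  case "$1" in
    rhel)   echo "sudo yum remove hadoop-hbase" ;;
    sles)   echo "sudo zypper remove hadoop-hbase" ;;
    debian) echo "sudo apt-get purge hadoop-hbase" ;;
    *)      echo "unknown family: $1" >&2; return 1 ;;
  esac
}

# Copy a configuration directory ($1) into a timestamped backup under
# a parent directory ($2, default /tmp) and print the backup path.
# -L follows symbolic links so that the real files are copied.
backup_conf_dir() {
  src=$1
  dest="${2:-/tmp}/hbase-conf-backup-$(date +%Y%m%d%H%M%S)"
  cp -RL "$src" "$dest" && echo "$dest"
}
```

A typical sequence would be to call backup_conf_dir on your HBase configuration directory, confirm that the printed backup path contains your files, and only then run the command printed by hbase_remove_command.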

Step 4: Run the HBase upgrade script in -execute mode

  Important: Before you proceed with Step 4, upgrade your CDH 4 cluster to CDH 5. See Upgrading to CDH 5 for instructions.

This step executes the actual upgrade process. It includes a verification step that checks whether the Master, RegionServer, and backup Master znodes have expired. If any of them have not expired, the upgrade is aborted, which ensures that no upgrade occurs while an HBase process is still running. If your upgrade is aborted even after you shut down the HBase cluster, wait for the znodes to expire and then retry. The default znode expiry time is 300 seconds.
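
Because an aborted upgrade can be retried once the znodes expire, a small retry wrapper is one way to automate the wait. The sketch below is generic shell, not part of the upgrade script. The name retry_with_wait is hypothetical, and the approach assumes that the wrapped command exits with a nonzero status when it aborts, which you should verify for your version.

```shell
#!/bin/sh
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, or 1 if every attempt fails.
retry_with_wait() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    "$@" && return 0
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, retry_with_wait 3 300 /usr/lib/hbase/bin/hbase upgrade -execute would try up to three times, waiting the default znode expiry time of 300 seconds between attempts.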

As mentioned earlier, ZooKeeper and HDFS should be available. If ZooKeeper is managed by HBase, then use the following command to start ZooKeeper.

/usr/lib/hbase/bin/hbase-daemon.sh start zookeeper

The upgrade involves three steps:

  • Upgrade Namespace: This step upgrades the directory layout of HBase files.
  • Upgrade Znodes: This step upgrades /hbase/replication (znodes corresponding to peers, log queues and so on) and table znodes (keep table enable/disable information). It deletes other znodes.
  • Log Splitting: If the shutdown was not clean, there might be some Write Ahead Logs (WALs) to split. This step splits those WAL files. It runs in non-distributed mode, which can make the upgrade take longer if there are many logs to split. To expedite the upgrade, ensure you have completed a clean shutdown.

Run the upgrade command in -execute mode.
$ /usr/lib/hbase/bin/hbase upgrade -execute

Your output should be similar to the following:

Starting Namespace upgrade
Created version file at hdfs://localhost:41020/myHBase with version=7
Migrating table testTable to hdfs://localhost:41020/myHBase/.data/default/testTable
…..
Created version file at hdfs://localhost:41020/myHBase with version=8
Successfully completed NameSpace upgrade.
Starting Znode upgrade
….
Successfully completed Znode upgrade
Starting Log splitting
…
Successfully completed Log splitting

The -execute command either returns a success message, as in the example above, or, after a clean shutdown where no log splitting is required, returns a "No log directories to split, returning" message. Either message indicates that your upgrade was successful.
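
If you capture the output of the -execute run, you can test for either message mechanically. This is a minimal sketch: upgrade_succeeded is a hypothetical helper, and the two strings are copied from the messages quoted in this section, so adjust them if your version words them differently.

```shell
#!/bin/sh
# Succeed if the -execute output read from stdin contains either of
# the two messages that this section treats as a successful upgrade.
upgrade_succeeded() {
  grep -q -e 'Successfully completed Log splitting' \
          -e 'No log directories to split, returning'
}
```

For example, if you saved the output to a file named execute.out (a name chosen here for illustration), upgrade_succeeded < execute.out returns a zero exit status only when one of the messages is present.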

  Important: Configuration files
  • If you install a newer version of a package that is already on the system, configuration files that you have modified will remain intact.
  • If you uninstall a package, the package manager renames any configuration files you have modified from <file> to <file>.rpmsave. If you then re-install the package (probably to install a new version) the package manager creates a new <file> with applicable defaults. You are responsible for applying any changes captured in the original configuration file to the new configuration file. In the case of Ubuntu and Debian upgrades, you will be prompted if you have made changes to a file for which there is a new version. For details, see Automatic handling of configuration files by dpkg.
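
To find the changes you need to carry over, you can compare each saved .rpmsave file with the newly installed file of the same name. The following is a sketch rather than a supported tool: diff_rpmsave is a hypothetical name, and the directory you pass to it is an assumption you must confirm for your installation.

```shell
#!/bin/sh
# For every <file>.rpmsave under directory $1, show a unified diff from
# the newly installed <file> to the saved copy. Lines prefixed with "+"
# are settings from your old configuration.
diff_rpmsave() {
  dir=$1
  find "$dir" -type f -name '*.rpmsave' | while IFS= read -r saved; do
    current=${saved%.rpmsave}
    if [ -f "$current" ]; then
      echo "== $current"
      # diff exits nonzero when files differ; that is expected here.
      diff -u "$current" "$saved" || true
    else
      echo "== $current (no new file installed)"
    fi
  done
}
```

Review the output and apply the relevant changes to the new files by hand. The function only reports differences and does not modify anything.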

Step 5 (Optional): Move Tables to Namespaces

CDH 5 introduces namespaces for HBase tables. As a result of the upgrade, all tables are automatically assigned to namespaces. The root, meta, and acl tables are added to the hbase system namespace. All other tables are assigned to the default namespace.

To move a table to a different namespace, take a snapshot of the table and clone it to the new namespace. After the upgrade, do the snapshot and clone operations before turning the modified application back on.

  Warning: Do not move datafiles manually, as this can cause data corruption that requires manual intervention to fix.
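
The snapshot-and-clone approach can be expressed as a short sequence of HBase Shell commands. The helper below only prints that sequence. It is a sketch: move_table_commands and the snapshot naming scheme are hypothetical, create_namespace fails if the namespace already exists, and removing the original table afterward is deliberately left to you.

```shell
#!/bin/sh
# Print HBase Shell commands that snapshot table $1 and clone it into
# namespace $2. The snapshot name is derived from the table name.
move_table_commands() {
  table=$1
  ns=$2
  snap="${table}_move_snapshot"
  cat <<EOF
create_namespace '${ns}'
snapshot '${table}', '${snap}'
clone_snapshot '${snap}', '${ns}:${table}'
EOF
}
```

For example, move_table_commands usertable prod prints three commands that you can review and then pipe to /usr/lib/hbase/bin/hbase shell.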

Step 6: Recompile coprocessors and custom JARs.

Recompile any coprocessors and custom JARs, so that they will work with the new version of HBase.

FAQ

In order to prevent upgrade failures because of unexpired znodes, is there a way to check/force this before an upgrade?

The upgrade script "executes" the upgrade when it is run with the -execute option. As part of the first step, it checks for any live HBase processes (RegionServer, Master and backup Master), by looking at their znodes. If any such znode is still up, it aborts the upgrade and prompts the user to stop such processes, and wait until their znodes have expired. This can be considered an inbuilt check.

The -check option has a different use case: to check for HFile v1 files. Run it on a live CDH 4 cluster to detect HFile v1 files, and then major compact any regions that contain them.

What are the steps for Cloudera Manager to do the upgrade?

See Upgrade to CDH 5 for instructions on upgrading HBase within a Cloudera Manager deployment.

Upgrading HBase from a Lower CDH 5 Release

  Important: Rolling upgrade is not supported between a CDH 5 Beta release and a CDH 5 GA release. Cloudera recommends using Cloudera Manager if you need to do rolling upgrades.

To upgrade HBase from a lower CDH 5 release, proceed as follows.

The instructions that follow assume that you are upgrading HBase as part of an upgrade to the latest CDH 5 release, and have already performed the steps under Upgrading from an Earlier CDH 5 Release to the Latest Release.

During a rolling upgrade from CDH 5.0.x to CDH 5.4.x, the HBase Master UI displays the URLs to the old HBase RegionServers using an incorrect info port number. Once the rolling upgrade completes, the HBase Master UI uses the correct port number.

Step 1: Perform a Graceful Cluster Shutdown

  Note: Upgrading using rolling restart is not supported.

To shut HBase down gracefully:

  1. Stop the Thrift server and clients, then stop the cluster.
    1. Stop the Thrift server and clients:
      sudo service hbase-thrift stop
    2. Stop the cluster by shutting down the master and the RegionServers:
      • Use the following command on the master node:
        sudo service hbase-master stop
      • Use the following command on each node hosting a RegionServer:
        sudo service hbase-regionserver stop
  2. Stop the ZooKeeper Server:
    $ sudo service zookeeper-server stop

Step 2: Install the new version of HBase

  Note: You may want to take this opportunity to upgrade ZooKeeper, but you do not have to upgrade ZooKeeper before upgrading HBase; the new version of HBase will run with the older version of ZooKeeper. For instructions on upgrading ZooKeeper, see Upgrading ZooKeeper from an Earlier CDH 5 Release.

To install the new version of HBase, follow directions in the next section, HBase Installation.

  Important: Configuration files
  • If you install a newer version of a package that is already on the system, configuration files that you have modified will remain intact.
  • If you uninstall a package, the package manager renames any configuration files you have modified from <file> to <file>.rpmsave. If you then re-install the package (probably to install a new version) the package manager creates a new <file> with applicable defaults. You are responsible for applying any changes captured in the original configuration file to the new configuration file. In the case of Ubuntu and Debian upgrades, you will be prompted if you have made changes to a file for which there is a new version. For details, see Automatic handling of configuration files by dpkg.
Page generated July 8, 2016.