Backing Up Before Upgrading from CDH 4 to CDH 5
Back up CDH components before you upgrade Cloudera Manager and CDH so that you can roll back the upgrade if necessary. This topic describes how to back up your cluster so that you can restore it to its pre-upgrade state.
Backup Steps
- Preparing to Back Up
- Stopping the Cluster
- Backing Up CDH 4 and Cloudera Manager Repository Files
- Backing Up ZooKeeper
- Backing Up HDFS (With High Availability)
- Backing Up HDFS (Without High Availability)
- Backing Up HBase
- Backing Up Hive
- Backing Up Oozie
- Backing Up Search
- Backing Up Sqoop 2
- Backing Up Hue
- Backing Up Cloudera Manager
- Backing Up Other CDH Components
Preparing to Back Up
To look up the value of a configuration parameter that a backup procedure refers to:
- Open the Cloudera Manager Admin Console.
- Go to the service where you need to look up a parameter (for example, HDFS, HBase, or ZooKeeper).
- Click the Configuration tab.
- Enter the name of the parameter in the search box.
The parameter and its value are displayed on the right.
For some services, you back up data stored in relational databases such as Oracle, MariaDB, MySQL, or PostgreSQL. See the documentation for those products to learn how to back up and restore the databases.
- As you back up the required files described in this topic, record which hosts the backups come from (or back up the files to the same host).
- After you start your backups, do not add components or change any configurations until the upgrade has either completed successfully or been rolled back.
- Complete all backup steps before starting the upgrade to CDH 5.
Stopping the Cluster
- Go to the Home page.
- In the drop-down list next to your cluster, select Stop.
Backing Up CDH 4 and Cloudera Manager Repository Files
Back up the Cloudera repository files from the directory for your operating system:
Operating System | Path |
---|---|
RHEL | /etc/yum.repos.d |
SLES | /etc/zypp/repos.d |
Ubuntu or Debian | /etc/apt/sources.list.d |
For example, on a RHEL or similar system, back up the files in /etc/yum.repos.d that have cloudera as part of their name.
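The RHEL example above can be sketched as a small shell function. This is a minimal sketch, not an official Cloudera procedure: the function name backup_cloudera_repos and the destination directory are hypothetical, and it assumes the repository files sit directly in the repository directory.

```shell
# Copy every file whose name contains "cloudera" from a repository
# directory into a backup directory, preserving mode and timestamps.
# Usage: backup_cloudera_repos <repo_dir> <backup_dir>
# (Hypothetical helper; both arguments are chosen by the caller.)
backup_cloudera_repos() {
  local repo_dir="$1"
  local backup_dir="$2"
  mkdir -p "$backup_dir" || return 1
  # -maxdepth 1 limits the search to files directly inside repo_dir;
  # -iname matches "cloudera" case-insensitively.
  find "$repo_dir" -maxdepth 1 -type f -iname '*cloudera*' \
    -exec cp -p {} "$backup_dir"/ \;
}

# Example on RHEL (the destination is a placeholder path):
#   backup_cloudera_repos /etc/yum.repos.d /root/cdh4-backup/yum.repos.d
```

On SLES or Ubuntu/Debian, pass the corresponding directory from the table instead of /etc/yum.repos.d.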
Backing Up ZooKeeper
On all ZooKeeper hosts, back up the ZooKeeper data directory specified with the dataDir property in the ZooKeeper configuration. The default location is /var/lib/zookeeper.
Record the permissions of the files and directories; you will need these to roll back ZooKeeper.
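One way to capture both the data and the permissions in a single pass is to archive the directory with tar and save a long listing next to the archive. The following is a sketch under assumptions: the function name backup_dir_with_perms, the label, and the backup location are hypothetical, and the backup directory is assumed to be outside the data directory.

```shell
# Archive a data directory and record the permissions of everything in it.
# Usage: backup_dir_with_perms <data_dir> <backup_dir> <label>
# (Hypothetical helper; <label> should identify the source host.)
backup_dir_with_perms() {
  local data_dir="$1"
  local backup_dir="$2"
  local label="$3"
  mkdir -p "$backup_dir" || return 1
  # Archive the directory contents with paths relative to data_dir.
  tar -czf "$backup_dir/$label.tar.gz" -C "$data_dir" . || return 1
  # Save mode, owner, and group for every file and directory.
  ( cd "$data_dir" && find . -exec ls -ld {} + ) > "$backup_dir/$label.perms"
}

# Example for the default ZooKeeper location (run on each ZooKeeper host;
# the backup path is a placeholder):
#   backup_dir_with_perms /var/lib/zookeeper /root/cdh4-backup "zookeeper-$(hostname -s)"
```

Including the host name in the label keeps track of which host each backup came from, as recommended in Preparing to Back Up.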
Backing Up HDFS (With High Availability)
Follow this procedure to back up an HDFS deployment that has been configured for high availability.
- On both NameNode hosts, back up one of the NameNode data directories specified with the dfs.namenode.name.dir property.
- On each JournalNode, back up the JournalNode edits directory specified by the dfs.journalnode.edits.dir property. Note which JournalNode host the backup comes from.
- Back up the VERSION file for each DataNode, noting which DataNode each copy comes from. A DataNode can have multiple data directories, but you need to back up the VERSION file from only one of them on each DataNode. The data directories are specified with the dfs.datanode.data.dir property, and the VERSION file is located in the current subdirectory of each one; for example, using the default path: /data/dfs/dn/current/VERSION. You use the VERSION file to get the storageID when you roll back the DataNodes. The storageID is the only value you need; copying the whole VERSION file is suggested as a convenience.
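The directory properties named in these steps can hold a comma-separated list of directories, sometimes written with a file:// prefix. Because only one directory needs to be backed up, a helper that extracts the first entry can be handy. This is a minimal sketch; the function name first_dir and the example values are hypothetical.

```shell
# Print the first directory from a comma-separated Hadoop directory
# property value, without surrounding whitespace or a file:// prefix.
# Usage: first_dir "<property value>"
first_dir() {
  local value="$1"
  local first="${value%%,*}"                     # text before the first comma
  first="${first#"${first%%[![:space:]]*}"}"     # trim leading whitespace
  first="${first%"${first##*[![:space:]]}"}"     # trim trailing whitespace
  first="${first#file://}"                       # drop an optional file:// scheme
  printf '%s\n' "$first"
}

# Example (the property value and backup path are placeholders):
#   NAME_DIRS="file:///dfs/nn,file:///data2/dfs/nn"
#   tar -czf "/root/cdh4-backup/nn-$(hostname -s).tar.gz" -C "$(first_dir "$NAME_DIRS")" .
```

The same helper applies to the dfs.journalnode.edits.dir and dfs.datanode.data.dir values looked up in Cloudera Manager.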
Backing Up HDFS (Without High Availability)
Use this procedure to back up an HDFS deployment that has not been configured for high availability.
- On the NameNode host, back up one of the NameNode data directories specified with the dfs.namenode.name.dir property.
- Back up the VERSION file for each DataNode, noting which DataNode each copy comes from. A DataNode can have multiple data directories, but you need to back up the VERSION file from only one of them on each DataNode. The data directories are specified with the dfs.datanode.data.dir property, and the VERSION file is located in the current subdirectory of each one; for example, using the default path: /data/dfs/dn/current/VERSION. You use the VERSION file to get the storageID when you roll back the DataNodes. The storageID is the only value you need; copying the whole VERSION file is suggested as a convenience.
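Because the storageID is the only value the rollback needs from a DataNode, it can be useful to print it while making the copy. The sketch below assumes the VERSION file stores the value on a line of the form storageID=<value>; the function name get_storage_id and the backup path are hypothetical.

```shell
# Print the storageID recorded in a DataNode VERSION file.
# Assumes a line of the form "storageID=<value>".
# Usage: get_storage_id <path to VERSION file>
get_storage_id() {
  # Print only the text that follows "storageID=" on the matching line.
  sed -n 's/^storageID=//p' "$1"
}

# Example using the default path (the backup path is a placeholder):
#   cp -p /data/dfs/dn/current/VERSION "/root/cdh4-backup/VERSION-$(hostname -s)"
#   get_storage_id /data/dfs/dn/current/VERSION
```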
Backing Up HBase
Because the rollback procedure also rolls back HDFS, the data in HBase is also rolled back. In addition, HBase metadata stored in ZooKeeper is recovered as part of the ZooKeeper rollback procedure.
If your cluster is configured to use HBase replication, Cloudera recommends that you document all replication peers. If necessary (for example, because the HBase znode has been deleted), you can roll back HBase as part of the HDFS rollback without the ZooKeeper metadata. This metadata can be reconstructed in a fresh ZooKeeper installation, with the exception of the replication peers, which you must add back. For information on enabling HBase replication, listing peers, and adding a peer, see HBase Replication in the CDH 4 documentation.
Backing Up Hive
Back up the database that backs the Hive metastore. See Backing up Databases.
Backing Up Oozie
Back up the Oozie database. See Backing up Databases.
Backing Up Search
On each Solr node, back up the contents of the Solr Data directory and record the permissions for the directory. This location is specified with the Solr Data Directory property. The default location is:
/var/lib/solr
Search data on ZooKeeper is restored as part of the ZooKeeper rollback.
Backing Up Sqoop 2
If you are not using the default embedded Derby database for Sqoop 2, back up the database you have configured for Sqoop 2. Otherwise, back up the repository subdirectory of the Sqoop 2 metastore directory. This location is specified with the Sqoop 2 Server Metastore Directory property. The default location is: /var/lib/sqoop2. For this default location, Derby database files are located in /var/lib/sqoop2/repository.
Backing Up Hue
- Back up the Hue database. See Backing up Databases.
- Back up the app registry file, <HUE_HOME>/app.reg, where HUE_HOME is the location of your Hue installation. For package installs, this is usually /usr/lib/hue; for parcel installs, this is usually /opt/cloudera/parcels/<parcel version>/lib/hue/.
Backing Up Cloudera Manager
- Stop Cloudera Management Services using the Cloudera Manager Admin Console.
- Stop Cloudera Manager Server by running the following command on the Cloudera Manager Server host:
sudo service cloudera-scm-server stop
- On the host where Cloudera Manager Server is running, back up the /etc/cloudera-scm-server/db.properties file.
- On the host where the Event Server role is configured to run, back up the contents of the directory specified with the Event Server Index Directory property (the default value is /var/lib/cloudera-scm-eventserver).
- Back up the /etc/cloudera-scm-agent/config.ini file on each host in the cluster.
- Back up the following Cloudera Manager-related databases; see Backing up Databases:
- Cloudera Manager Server
- Activity Monitor (depending on your deployment, this role may not be installed)
- Reports Manager
- Service Monitor
- Host Monitor
- Navigator Audit Server
- Navigator Metadata Server
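The db.properties file backed up in the steps above also identifies the database that Cloudera Manager Server uses, which helps when deciding what to dump. A minimal sketch, assuming the file uses key=value lines with no spaces around the equals sign and keys such as com.cloudera.cmf.db.type; the function name cm_db_property and the backup path are hypothetical.

```shell
# Print the value of one key from a key=value properties file.
# Usage: cm_db_property <properties file> <key>
cm_db_property() {
  # Compare the key exactly (no regex), then strip "key=" from the line.
  awk -F= -v key="$2" '$1 == key { sub(/^[^=]*=/, ""); print; exit }' "$1"
}

# Example on the Cloudera Manager Server host (backup path is a placeholder):
#   cp -p /etc/cloudera-scm-server/db.properties /root/cdh4-backup/
#   cm_db_property /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.type
#   cm_db_property /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.name
```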
Backing Up Other CDH Components
No backups are required for the following components:
- MapReduce
- YARN
- Spark
- Pig
- Sqoop
- Impala
Backing up Databases
Several steps in the backup procedures require you to back up databases used in a CDH cluster. The steps for backing up and restoring a database depend on the database vendor and version you selected for your cluster and are beyond the scope of this document. See the following vendor resources for more information:
- MariaDB 5.5: http://mariadb.com/kb/en/mariadb/backup-and-restore-overview/
- MySQL 5.5: http://dev.mysql.com/doc/refman/5.5/en/backup-and-recovery.html
- MySQL 5.6: http://dev.mysql.com/doc/refman/5.6/en/backup-and-recovery.html
- PostgreSQL 8.4: https://www.postgresql.org/docs/8.4/static/backup.html
- PostgreSQL 9.2: https://www.postgresql.org/docs/9.2/static/backup.html
- PostgreSQL 9.3: https://www.postgresql.org/docs/9.3/static/backup.html
- Oracle 11gR2: http://docs.oracle.com/cd/E11882_01/backup.112/e10642/toc.htm
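As a rough illustration only, a logical dump with the vendors' standard tools (mysqldump for MySQL and MariaDB, pg_dump for PostgreSQL) might be assembled as shown below. This sketch only prints the command so that it can be reviewed first; the function name dump_command and the host, database, user, and file names are hypothetical, and the vendor documentation above remains the authority on options for your version.

```shell
# Print (without running) a logical-dump command for one database.
# Usage: dump_command <mysql|mariadb|postgresql> <host> <db_name> <db_user> <out_file>
dump_command() {
  local type="$1"
  local host="$2"
  local name="$3"
  local user="$4"
  local out="$5"
  case "$type" in
    mysql|mariadb)
      # -p with no value prompts for the password;
      # --single-transaction takes a consistent snapshot of InnoDB tables.
      printf 'mysqldump -h %s -u %s -p --single-transaction %s > %s\n' \
        "$host" "$user" "$name" "$out"
      ;;
    postgresql)
      # -W forces a password prompt; -f names the output file.
      printf 'pg_dump -h %s -U %s -W -f %s %s\n' \
        "$host" "$user" "$out" "$name"
      ;;
    *)
      echo "unsupported database type: $type" >&2
      return 1
      ;;
  esac
}

# Example (all values are placeholders):
#   dump_command mysql db01.example.com scm scm_user /root/cdh4-backup/scm.sql
```

Oracle is deliberately left out of the sketch; use the Oracle documentation linked above.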
©2016 Cloudera, Inc. All rights reserved