This is the documentation for Cloudera Enterprise 5.8.x. Documentation for other versions is available at Cloudera Documentation.

Managing Hive Using Cloudera Manager

Continue reading:

How Hive Configurations are Propagated to Hive Clients

Because the Hive service does not have worker roles, another mechanism is needed to enable the propagation of client configurations to the other hosts in your cluster. In Cloudera Manager gateway roles fulfill this function. Whether you add a Hive service at installation time or at a later time, ensure that you assign the gateway roles to hosts in the cluster. If you do not have gateway roles, client configurations are not deployed.

Using a Load Balancer with HiveServer2

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

Considerations When Upgrading CDH

  1. Go to the Hive service.
  2. Click the Configuration tab.
  3. Select Scope > HiveServer2.
  4. Select Category > Main.
  5. Locate the HiveServer2 Load Balancer property or search for it by typing its name in the Search box.
  6. Enter values for hostname:port number.
      Note: When you set the HiveServer2 Load Balancer property, Cloudera Manager regenerates the keytabs for HiveServer2 roles. The principal in these keytabs contains the load balancer hostname. If there is a Hue service that depends on this Hive service, it also uses the load balancer to communicate with Hive.
  7. Click Save Changes to commit the changes.
  8. Restart the Hive service.

Hive has undergone major version changes from CDH 4.0 to 4.1 and between CDH 4.1 and 4.2. (CDH 4.0 had Hive 0.8.0, CDH 4.1 used Hive 0.9.0, and CDH 4.2 and higher has 0.10.0). This requires that you manually back up and upgrade the Hive metastore database when upgrading between major Hive versions.

You should follow the steps in the appropriate in the Cloudera Manager procedure for upgrading CDH to upgrade the metastore before you restart the Hive service. This applies whether you are upgrading to packages or parcels. The procedure for upgrading CDH using packages is at Upgrading CDH 4 Using Packages. The procedure for upgrading with parcels is at Upgrading CDH 4 Using Parcels.

Considerations When Upgrading Cloudera Manager

Cloudera Manager 4.5 added support for Hive, which includes the Hive Metastore Server role type. This role manages the metastore process when Hive is configured with a remote metastore.

When upgrading from Cloudera Manager versions before 4.5, Cloudera Manager automatically creates new Hive services to capture the previous implicit Hive dependency from Hue and Impala. Your previous services continue to function without impact. If Hue was using a Hive metastore backed by a Derby database, the newly created Hive Metastore Server also uses Derby. Because Derby does not allow concurrent connections, Hue continues to work, but the new Hive Metastore Server does not run. The failure is harmless (because nothing uses this new Hive Metastore Server at this point) and intentional, to preserve the set of cluster functionality as it was before upgrade. Cloudera discourages the use of a Derby-backed Hive metastore due to its limitations and recommends switching to a different supported database.

Cloudera Manager provides a Hive configuration option to bypass the Hive Metastore Server. When this configuration is enabled, Hive clients, Hue, and Impala connect directly to the Hive metastore database. Prior to Cloudera Manager 4.5, Hue and Impala connected directly to the Hive metastore database, so the bypass mode is enabled by default when upgrading to Cloudera Manager 4.5 and higher. This ensures that the upgrade does not disrupt your existing setup. You should plan to disable the bypass mode, especially when using CDH 4.2 and higher. Using the Hive Metastore Server is the recommended configuration, and the WebHCat Server role requires the Hive Metastore Server to not be bypassed. To disable bypass mode, see Disabling Bypass Mode .

Cloudera Manager 4.5 and higher also supports HiveServer2 with CDH 4.2. In CDH 4, HiveServer2 is not added by default, but can be added as a new role under the Hive service (see Role Instances). In CDH 5, HiveServer2 is a mandatory role.

Disabling Bypass Mode

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

In bypass mode Hive clients directly access the metastore database instead of using the Hive Metastore Server for metastore information.
  1. Go to the Hive service.
  2. Click the Configuration tab.
  3. Select Scope > HIVE service_name (Service-Wide)
  4. Select Category > Advanced.
  5. Clear the Bypass Hive Metastore Server property.
  6. Click Save Changes to commit the changes.
  7. Re-deploy Hive client configurations.
  8. Restart Hive and any Hue or Impala services configured to use that Hive service.
Page generated July 8, 2016.