Managing Encryption Keys and Zones
Interacting with the KMS and creating encryption zones requires the use of two new CLI commands: hadoop key and hdfs crypto. The following sections will help you get started with creating encryption keys and setting up encryption zones.
Validating Hadoop Key Operations
Create a test key, then list the keys to confirm that it exists:
$ sudo -u <key_admin> hadoop key create keytrustee_test
$ hadoop key list
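To automate the check above, one option is to filter the output of hadoop key list for the test key. The helper below is only a sketch: key_listed is a hypothetical function name, and it assumes that hadoop key list prints each key name on a line of its own.

```shell
# Sketch: succeed if the key name appears as a whole line in the
# `hadoop key list` output supplied on stdin.
# Assumption: key names are printed one per line.
key_listed() {
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -Fxq -- "$1"
}

# Example (requires a configured KMS):
#   hadoop key list | key_listed keytrustee_test && echo "key found"
```

Matching the whole line keeps a key such as keytrustee_test2 from being mistaken for keytrustee_test.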
Creating Encryption Zones
Once a KMS has been set up and the NameNode and HDFS clients have been correctly configured, use the hadoop key and hdfs crypto command-line tools to create encryption keys and set up new encryption zones.
- Create an encryption key for your zone as the application user that will be using the key. For example, if you are creating an encryption zone for HBase, create the key as the hbase user as follows:
$ sudo -u hbase hadoop key create <key_name>
- Create a new empty directory and make it an encryption zone using the key created above.
$ sudo -u hdfs hadoop fs -mkdir /encryption_zone
$ sudo -u hdfs hdfs crypto -createZone -keyName <key_name> -path /encryption_zone
You can verify creation of the new encryption zone by running the -listZones command. You should see the encryption zone along with its key listed as follows:
$ sudo -u hdfs hdfs crypto -listZones
/encryption_zone <key_name>
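The steps above can be combined into a small script. This is a sketch rather than a supported tool: the function names run and create_zone and the DRY_RUN variable are illustrative, and the user, key name, and path in the example are placeholders.

```shell
# Sketch: create a key and an encryption zone, then list the zones.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_zone() {
  local key_user="$1" key_name="$2" zone_path="$3"

  # 1. Create the key as the application user that will use it.
  run sudo -u "$key_user" hadoop key create "$key_name" || return 1

  # 2. Create an empty directory and make it an encryption zone.
  run sudo -u hdfs hadoop fs -mkdir "$zone_path" || return 1
  run sudo -u hdfs hdfs crypto -createZone -keyName "$key_name" -path "$zone_path" || return 1

  # 3. List the zones so the new zone and its key can be checked.
  run sudo -u hdfs hdfs crypto -listZones
}

# Example (prints the four commands without running them):
#   DRY_RUN=1 create_zone hbase <key_name> /encryption_zone
```

Running with DRY_RUN=1 first makes it easy to review the exact commands before they touch the cluster.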
Warning: Do not delete an encryption key while it is still in use by an encryption zone. Deleting the key results in loss of access to the data in that zone.
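Before deleting a key, it may help to check whether any encryption zone still references it. The sketch below filters -listZones output in the format shown earlier (zone path followed by key name). zones_using_key is a hypothetical helper name, and the filter assumes zone paths contain no spaces.

```shell
# Sketch: print every zone path whose key matches the given name.
# Input on stdin: `hdfs crypto -listZones` output, one
# "<zone_path> <key_name>" pair per line.
zones_using_key() {
  awk -v key="$1" '$2 == key { print $1 }'
}

# Example (requires HDFS superuser access):
#   sudo -u hdfs hdfs crypto -listZones | zones_using_key <key_name>
```

Empty output suggests that no zone uses the key. Any path printed is a zone whose data would become unreadable if the key were deleted.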
For more information and recommendations on creating encryption zones for each CDH component, see Configuring CDH Services for HDFS Encryption.
Adding Files to an Encryption Zone
Existing data can be encrypted by copying it into the new encryption zones using tools like DistCp.
$ sudo -u hdfs hadoop distcp /user/dir /encryption_zone
DistCp Considerations
A common use case for DistCp is to replicate data between clusters for backup and disaster recovery purposes. This is typically performed by the cluster administrator, who is an HDFS superuser. To retain this workflow when using HDFS encryption, a new virtual path prefix, /.reserved/raw/, has been introduced that gives superusers direct access to the underlying block data in the filesystem. This allows superusers to distcp data without requiring access to encryption keys, and avoids the overhead of decrypting and re-encrypting data. It also means the source and destination data will be byte-for-byte identical, which would not be true if the data were re-encrypted with a new EDEK.
When using /.reserved/raw/ to distcp encrypted data, make sure you preserve extended attributes with the -px flag. This is because encryption attributes such as the EDEK are exposed through extended attributes and must be preserved to be able to decrypt the file.
This means that if the distcp is initiated at or above the encryption zone root, it will automatically create a new encryption zone at the destination if it does not already exist. Hence, Cloudera recommends you first create identical encryption zones on the destination cluster to avoid any potential mishaps.
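As an illustration of the raw copy described above, the sketch below assembles a distcp command that reads and writes through /.reserved/raw/ and preserves extended attributes with -px. raw_distcp_cmd is a hypothetical helper, and the NameNode URIs in the example are placeholders, not values from this guide.

```shell
# Sketch: build a distcp command that copies an encryption zone through
# /.reserved/raw/ on both clusters while preserving extended attributes.
# Arguments: source NameNode URI, destination NameNode URI, zone path.
raw_distcp_cmd() {
  local src_nn="$1" dst_nn="$2" zone_path="$3"
  printf '%s\n' "hadoop distcp -px ${src_nn}/.reserved/raw${zone_path} ${dst_nn}/.reserved/raw${zone_path}"
}

# Example (prints the command; review it, then run it as an HDFS superuser):
#   raw_distcp_cmd hdfs://source-nn:8020 hdfs://dest-nn:8020 /encryption_zone
```

The helper only prints the command, so it can be inspected before being run against either cluster.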
Copying Between Encrypted and Unencrypted Locations
By default, distcp compares checksums provided by the filesystem to verify that data was successfully copied to the destination. When copying between an unencrypted and encrypted location, the filesystem checksums will not match since the underlying block data is different.
In this case, you can specify the -skipcrccheck and -update flags to avoid verifying checksums.
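A copy from an unencrypted directory into an encryption zone with both flags might look like the sketch below. unverified_copy_cmd is a hypothetical helper name and the paths in the example are placeholders. Note that with -update, DistCp copies the contents of the source directory into the target rather than creating the source directory underneath it.

```shell
# Sketch: build a distcp command that skips checksum comparison when the
# source and destination differ in encryption.
# Arguments: source path, destination path.
unverified_copy_cmd() {
  printf '%s\n' "hadoop distcp -update -skipcrccheck $1 $2"
}

# Example (prints the command; run it as a user with access to both paths):
#   unverified_copy_cmd /user/dir /encryption_zone
```

Because checksums are not compared in this mode, it may be worth spot-checking a few copied files by other means.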
Deleting Encryption Zones
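To remove an encryption zone, delete the encrypted directory. Because the command uses -skipTrash, the directory and all of its contents are deleted permanently instead of being moved to the trash:
$ sudo -u hdfs hadoop fs -rm -r -skipTrash /encryption_zone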
Backing Up Encryption Keys
If you are using the Java KeyStore KMS, make sure you regularly back up the Java KeyStore that stores the encryption keys. If you are using the Key Trustee KMS and Key Trustee Server, see Backing Up and Restoring Key Trustee Server and Clients for instructions on backing up Key Trustee Server and Key Trustee KMS.
©2016 Cloudera, Inc. All rights reserved