Encryption in ODP
ODP enforces encryption at multiple layers: in-transit TLS for all service communication, at-rest transparent encryption for HDFS, and wire encryption for RPC protocols. This document covers each layer and how to configure it through Ambari.
Encryption in Transit (TLS)
All inter-service communication in a secured ODP cluster runs over TLS. This includes:
- HDFS block transfers (DataNode ↔ client, DataNode ↔ DataNode)
- YARN Resource Manager ↔ Node Manager communication
- HBase master ↔ RegionServer communication
- ZooKeeper client connections
- Ambari server ↔ agent communication
- Knox gateway (client-facing TLS)
- Ranger Admin, Ranger KMS, and all web UIs
Auto-TLS (Recommended)
Ambari's Auto-TLS feature automates certificate lifecycle management across the entire cluster. It:
- Generates a cluster-internal CA or integrates with FreeIPA's Dogtag CA.
- Issues signed certificates for every Ambari agent (one per host).
- Distributes certificates and configures all services automatically.
- Handles certificate renewal.
Enable Auto-TLS during initial cluster setup:
# Run on the Ambari server host
/var/lib/ambari-server/resources/scripts/configs.py \
--action=set \
--cluster=<cluster-name> \
--config-type=cluster-env \
--key=security_enabled \
--value=true
In practice, Auto-TLS is enabled through the Ambari setup wizard before deploying services. Once enabled, all services are configured with TLS automatically.
Manual Certificate Management
For environments where Auto-TLS is not used, certificates must be generated and distributed manually. Key locations by service:
| Service | Keystore / Truststore path |
|---|---|
| HDFS (NameNode, DataNode) | /etc/security/serverKeys/hdfs.jks |
| YARN | /etc/security/serverKeys/yarn.jks |
| HBase | /etc/security/serverKeys/hbase.jks |
| Ranger Admin | /etc/ranger/admin/conf/ranger-admin-keystore.jks |
| Knox | /usr/odp/current/knox-server/data/security/keystores/gateway.jks |
Generate a signed certificate for a service:
# Generate key and CSR
keytool -genkeypair -alias hdfs-nn \
-keyalg RSA -keysize 2048 \
-validity 730 \
-keystore /etc/security/serverKeys/hdfs.jks \
-storepass <keystore-password> \
-dname "CN=master01.dev01.hadoop.clemlab.com, OU=ODP, O=Clemlab, C=FR"
keytool -certreq -alias hdfs-nn \
-keystore /etc/security/serverKeys/hdfs.jks \
-storepass <keystore-password> \
-file hdfs-nn.csr
# Sign with your CA, then import:
keytool -importcert -alias ca-root \
-keystore /etc/security/serverKeys/hdfs.jks \
-storepass <keystore-password> \
-file ca.crt
keytool -importcert -alias hdfs-nn \
-keystore /etc/security/serverKeys/hdfs.jks \
-storepass <keystore-password> \
-file hdfs-nn.crt
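With manual management, nothing renews certificates for you, so it helps to check how long an imported certificate remains valid. The sketch below is one possible check, assuming openssl is installed next to keytool. The function names are hypothetical helpers, not part of ODP or Ambari.

```shell
# Succeeds if the PEM certificate in $1 stays valid for at least $2 more days.
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# Same check for a certificate held in a Java keystore.
# $1 = keystore path, $2 = alias, $3 = keystore password, $4 = days
keystore_cert_valid_for_days() {
  keytool -exportcert -rfc -alias "$2" -keystore "$1" -storepass "$3" \
    | openssl x509 -noout -checkend "$(( $4 * 86400 ))" >/dev/null
}

# Example (run on the service host):
#   keystore_cert_valid_for_days /etc/security/serverKeys/hdfs.jks hdfs-nn '<keystore-password>' 30 \
#     || echo "hdfs-nn certificate expires within 30 days"
```

Running the keystore check from cron on each host is one way to get an early warning before a service certificate expires.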
Encryption at Rest — HDFS Transparent Data Encryption (TDE)
HDFS TDE encrypts file data end to end without requiring any changes to applications. Encryption and decryption are performed inside the HDFS client library and are transparent to applications: the application reads and writes plaintext, while DataNodes only ever receive and store ciphertext.
Architecture
HDFS Client
│ plaintext read/write
▼
NameNode (manages encryption zones + DEK metadata)
│ DEK wrapped with zone key
▼
Ranger KMS (Key Management Server)
│ stores master keys (zone keys)
▼
DataNode (stores encrypted blocks)
- Zone key: Master key stored in Ranger KMS. Never leaves KMS.
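- Data Encryption Key (DEK): Per-file key, generated by the KMS at the NameNode's request and used by the HDFS client to encrypt and decrypt file data. It is never persisted in plaintext.
- Encrypted Data Encryption Key (EDEK): The DEK wrapped (encrypted) with the zone key. This is the form stored in the NameNode's metadata, and therefore in the fsimage.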
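The relationship between zone key, DEK, and EDEK can be tried out locally. The sketch below is only an analogy: openssl stands in for both the HDFS client and the KMS, and the file names and variables are made up for illustration. It does not reproduce Hadoop's actual wire format or key handling.

```shell
# Local illustration of envelope encryption; no Hadoop components are involved.
work=$(mktemp -d)

zone_key=$(openssl rand -hex 32)   # stands in for the zone key, which never leaves the KMS
dek=$(openssl rand -hex 32)        # stands in for the per-file DEK
file_iv=$(openssl rand -hex 16)
wrap_iv=$(openssl rand -hex 16)

printf 'quarterly figures\n' > "$work/plain.txt"

# Write path: the file content is encrypted with the DEK.
openssl enc -aes-256-ctr -K "$dek" -iv "$file_iv" \
  -in "$work/plain.txt" -out "$work/block.enc"

# The DEK is wrapped with the zone key; only this wrapped form (the EDEK) is stored.
printf '%s' "$dek" | openssl enc -aes-256-ctr -K "$zone_key" -iv "$wrap_iv" \
  -out "$work/edek.bin"

# Read path: unwrap the EDEK with the zone key, then decrypt the data with the DEK.
unwrapped_dek=$(openssl enc -d -aes-256-ctr -K "$zone_key" -iv "$wrap_iv" -in "$work/edek.bin")
recovered=$(openssl enc -d -aes-256-ctr -K "$unwrapped_dek" -iv "$file_iv" -in "$work/block.enc")

echo "$recovered"
rm -rf "$work"
```

The point of the pattern is that the stored data and the stored EDEK are both useless without the zone key, so access control on the KMS decides who can read the file.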
Installing and Configuring Ranger KMS
Ranger KMS is a separate service deployed through Ambari (Add Service → Ranger KMS).
After installation, configure the KMS backing store in Ambari under Ranger KMS → Configs:
# Database for key store (MySQL / PostgreSQL)
ranger.ks.jpa.jdbc.url=jdbc:mysql://db-host:3306/rangerkms
ranger.ks.jpa.jdbc.user=rangerkms
ranger.ks.jpa.jdbc.password=<password>
Ranger KMS listens on port 9292 (HTTP) or 9293 (HTTPS).
Creating Encryption Zones
An encryption zone is an HDFS directory where all files are automatically encrypted with a specific zone key.
# Step 1: Authenticate as HDFS admin
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@DEV01.HADOOP.CLEMLAB.COM
# Step 2: Create a key in Ranger KMS
hadoop key create my-zone-key \
-size 256 \
-provider kms://https@ranger-kms-host:9293/kms
# Step 3: Create the HDFS directory
hdfs dfs -mkdir -p /encrypted/finance
# Step 4: Create the encryption zone
hdfs crypto -createZone -keyName my-zone-key -path /encrypted/finance
Verify the zone:
hdfs crypto -listZones
# Output:
# /encrypted/finance my-zone-key
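All files written to /encrypted/finance are automatically encrypted. Existing files are not encrypted retroactively: they must be copied into the zone (for example with hdfs dfs -cp or DistCp), because HDFS does not allow renaming files or directories across an encryption zone boundary.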
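For scripted checks, the listing can be parsed rather than read by eye. The sketch below assumes the two-column output shown above (zone path, then key name). zone_key_for and check_zone are hypothetical helper names.

```shell
# Reads `hdfs crypto -listZones` output on stdin and prints the key name of the
# zone rooted at exactly the path given in $1 (prints nothing if there is none).
zone_key_for() {
  awk -v zone="$1" '$1 == zone { print $2 }'
}

# Succeeds only if zone $1 exists and uses key $2.
# Needs HDFS superuser credentials, like `hdfs crypto -listZones` itself.
check_zone() {
  actual=$(hdfs crypto -listZones | zone_key_for "$1")
  [ "$actual" = "$2" ]
}

# Example:
#   check_zone /encrypted/finance my-zone-key && echo "zone OK"
```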
Key Management
Create, rotate, and delete keys through the Ranger KMS REST API or the hadoop key CLI:
# List all keys
hadoop key list -provider kms://https@ranger-kms-host:9293/kms
# Roll (rotate) a key — new files use the new key version
hadoop key roll my-zone-key \
-provider kms://https@ranger-kms-host:9293/kms
# Delete a key (only if no encryption zone uses it)
hadoop key delete my-zone-key \
-provider kms://https@ranger-kms-host:9293/kms
Access to keys is controlled by Ranger KMS policies. Only HDFS and authorised users can retrieve DEKs.
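The provider argument is the KMS endpoint URL rewritten into the kms:// scheme, with the transport protocol moved in front of the host. A small helper avoids repeating it by hand. This is a sketch: kms_provider_uri and KMS_PROVIDER are hypothetical names, and the host and port are the example values used in this document.

```shell
# Convert an HTTP(S) KMS endpoint into the kms:// provider URI form.
kms_provider_uri() {
  case "$1" in
    https://*) printf 'kms://https@%s\n' "${1#https://}" ;;
    http://*)  printf 'kms://http@%s\n'  "${1#http://}" ;;
    *) echo "unsupported KMS URL: $1" >&2; return 1 ;;
  esac
}

KMS_PROVIDER=$(kms_provider_uri "https://ranger-kms-host:9293/kms")
echo "$KMS_PROVIDER"

# Example use on a cluster node:
#   hadoop key list -provider "$KMS_PROVIDER" -metadata
```

If hadoop.security.key.provider.path is already set in core-site.xml on the node, the hadoop key commands can usually be run without an explicit provider argument.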
Accessing Encrypted Data
Applications do not need code changes to read encrypted data. However:
- The application's Kerberos principal must have DECRYPT_EEK permission in the Ranger KMS policy for the relevant key.
- HDFS handles decryption transparently during reads.
Grant decrypt access in Ranger KMS UI under Service → KMS → Policies:
Resource: my-zone-key
User: hive — Permissions: GET, GET_KEYS, GET_METADATA, GENERATE_EEK, DECRYPT_EEK
User: spark — Permissions: GET, GET_METADATA, GENERATE_EEK, DECRYPT_EEK
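One way to confirm that a policy works is to read the same file twice: once through the normal path as an authorised user, and once through the /.reserved/raw view, which returns the stored ciphertext and is only available to the HDFS superuser. The sketch below builds the raw path. raw_path is a hypothetical helper and report.csv is a made-up file name.

```shell
# Map an absolute HDFS path to its /.reserved/raw equivalent.
raw_path() {
  case "$1" in
    /.reserved/raw/*) printf '%s\n' "$1" ;;
    /*)               printf '/.reserved/raw%s\n' "$1" ;;
    *) echo "path must be absolute: $1" >&2; return 1 ;;
  esac
}

# As an authorised user (for example hive), this should print plaintext:
#   hdfs dfs -cat /encrypted/finance/report.csv | head -c 64
# As the HDFS superuser, this should print unreadable ciphertext:
#   hdfs dfs -cat "$(raw_path /encrypted/finance/report.csv)" | head -c 64 | od -c
```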
Wire Encryption for RPC
Beyond TLS for HTTP/REST endpoints, ODP also supports in-flight encryption for Hadoop RPC (binary protocol used between HDFS clients and NameNode/DataNode, between MapReduce tasks, etc.).
Hadoop RPC Encryption
Configure in core-site.xml via Ambari:
<property>
<name>hadoop.rpc.protection</name>
<!-- Options: authentication (authentication only), integrity (authentication + integrity checks), privacy (authentication + integrity + encryption) -->
<value>privacy</value>
</property>
<property>
<name>dfs.encrypt.data.transfer</name>
<value>true</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.algorithm</name>
<!-- Options: 3des, rc4 -->
<value>3des</value>
</property>
Setting hadoop.rpc.protection=privacy enables full encryption of all RPC traffic. integrity provides HMAC-based message authentication without encryption. authentication provides only Kerberos authentication without protection of the data stream.
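The three levels correspond to the standard SASL quality-of-protection values. The sketch below spells out that mapping and shows how the effective setting could be read back on a node. sasl_qop_for is a hypothetical helper that handles a single value only, not a comma-separated list.

```shell
# Map a hadoop.rpc.protection value to the SASL QOP it selects.
sasl_qop_for() {
  case "$1" in
    authentication) echo "auth" ;;       # authentication only
    integrity)      echo "auth-int" ;;   # authentication + integrity checks
    privacy)        echo "auth-conf" ;;  # authentication + integrity + encryption
    *) echo "unknown protection level: $1" >&2; return 1 ;;
  esac
}

# Example, on a node with the cluster client configuration deployed:
#   sasl_qop_for "$(hdfs getconf -confKey hadoop.rpc.protection)"
```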
HBase RPC Encryption
Configure in hbase-site.xml:
<property>
<name>hbase.rpc.protection</name>
<value>privacy</value>
</property>
ZooKeeper TLS
ZooKeeper client-server and quorum TLS is configured in zoo.cfg:
sslQuorum=true
client.secure=true
ssl.keyStore.location=/etc/security/serverKeys/zookeeper.jks
ssl.keyStore.password=<password>
ssl.trustStore.location=/etc/security/serverKeys/zookeeper.truststore.jks
ssl.trustStore.password=<password>
Ambari manages these settings when Auto-TLS is enabled.
Ozone Encryption
Apache Ozone 2.0.0 supports encryption at the object level through the same Ranger KMS infrastructure used for HDFS TDE.
Encrypting an Ozone Bucket
# Authenticate
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@DEV01.HADOOP.CLEMLAB.COM
# Create an encrypted bucket using a KMS key
ozone sh bucket create \
--replication=THREE \
--type=RATIS \
-k my-zone-key \
o3://ozone-host:9862/myvolume/encrypted-bucket
# Verify
ozone sh bucket info o3://ozone-host:9862/myvolume/encrypted-bucket
# Look for: encryptionKeyName: my-zone-key
All objects (keys) written to an encrypted Ozone bucket are encrypted with a per-object DEK that is wrapped by the bucket's encryption key in Ranger KMS, following the same KMS delegation pattern as HDFS TDE.
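The verification step can be scripted by extracting the key name from the bucket info output. The sketch below assumes the command prints JSON containing an encryptionKeyName field, as suggested above, and uses sed so that no JSON tooling is needed. bucket_encryption_key is a hypothetical helper name.

```shell
# Reads `ozone sh bucket info` JSON on stdin and prints the value of
# encryptionKeyName (prints nothing if the bucket is not encrypted).
bucket_encryption_key() {
  sed -n 's/.*"encryptionKeyName"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example:
#   ozone sh bucket info o3://ozone-host:9862/myvolume/encrypted-bucket | bucket_encryption_key
```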
Ozone in-transit encryption
Ozone uses TLS for all gRPC-based communication between OzoneManager, SCM (Storage Container Manager), and DataNodes. Configure through Ambari under Ozone → Configs → Advanced ozone-site:
ozone.security.enabled=true
hdds.grpc.tls.enabled=true
When Ambari Auto-TLS is active, these are set automatically.