Version: 1.3.1.0

ODP Security Overview

Security in ODP is built on a defense-in-depth model: multiple independent layers each enforce controls, so that a failure or bypass at one layer does not by itself expose data. ODP integrates Kerberos, Apache Ranger, Apache Knox, and TLS encryption into a cohesive security architecture whose setup Ambari automates.

Defense-in-Depth Architecture

Layers between external clients and stored data, from the perimeter inward:

  • Apache Knox (perimeter gateway, TLS termination, SSO)
  • Apache Ranger (authorization: RBAC, ABAC, tag-based, row/column filtering)
  • Kerberos (authentication: every service, every RPC call)
  • TLS / wire encryption (in-transit encryption between all nodes)
  • HDFS encryption zones (at-rest encryption for sensitive datasets)

Each layer addresses a different threat vector. Together they support the security requirements of regulated environments (GDPR, NIS2, internal compliance frameworks).

Kerberos Authentication

Kerberos is the authentication backbone of ODP. When enabled (the default in production deployments), every service-to-service and client-to-service communication is authenticated using Kerberos tickets. No component accepts unauthenticated connections.

How It Works in ODP

  • Each Hadoop service (NameNode, DataNode, ResourceManager, HiveServer2, Kafka brokers, HBase, etc.) obtains a service principal from the Key Distribution Center (KDC).
  • Clients authenticate to the KDC with their user principal (via kinit or a keytab) and receive a Ticket Granting Ticket (TGT).
  • When a client connects to a service, it uses the TGT to request a service ticket from the KDC and presents that ticket to the service, so no passwords travel over the network.
  • Service impersonation (acting on behalf of a user) is restricted through Hadoop's proxy-user settings, which limit the users, groups, and hosts a service may impersonate and so limit lateral movement if a service is compromised.
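
The ticket flow above can be sketched as a toy model. This is not real Kerberos: tickets are only signed with HMAC rather than encrypted, the password check behind kinit is omitted, and ToyKDC, seal, verify, and the example principals are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import json
import secrets
import time


def seal(key: bytes, payload: dict) -> dict:
    """Attach an HMAC so that only a holder of `key` can issue or check the ticket."""
    body = json.dumps(payload, sort_keys=True).encode()
    return {"body": payload, "mac": hmac.new(key, body, hashlib.sha256).hexdigest()}


def verify(key: bytes, ticket: dict) -> dict:
    """Return the ticket body if the MAC matches and the ticket has not expired."""
    body = json.dumps(ticket["body"], sort_keys=True).encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ticket["mac"]):
        raise PermissionError("ticket was not issued by the KDC")
    if ticket["body"]["expires"] < time.time():
        raise PermissionError("ticket expired")
    return ticket["body"]


class ToyKDC:
    """Hypothetical stand-in for a KDC (illustration only)."""

    def __init__(self):
        self.tgs_key = secrets.token_bytes(32)  # known only to the KDC
        self.service_keys = {}                  # service principal -> shared key

    def register_service(self, principal: str) -> bytes:
        # The returned key plays the role of the service's keytab.
        key = secrets.token_bytes(32)
        self.service_keys[principal] = key
        return key

    def issue_tgt(self, user: str, lifetime: int = 3600) -> dict:
        # Corresponds to kinit; credential checking is omitted in this sketch.
        return seal(self.tgs_key, {"user": user, "expires": time.time() + lifetime})

    def issue_service_ticket(self, tgt: dict, service: str, lifetime: int = 300) -> dict:
        # The TGT proves identity, so no password is needed at this step.
        user = verify(self.tgs_key, tgt)["user"]
        payload = {"user": user, "service": service, "expires": time.time() + lifetime}
        return seal(self.service_keys[service], payload)


kdc = ToyKDC()
service = "nn/master1.example.com@EXAMPLE.COM"
service_key = kdc.register_service(service)

tgt = kdc.issue_tgt("alice@EXAMPLE.COM")
ticket = kdc.issue_service_ticket(tgt, service)

# The service checks the ticket with its own key and never sees a password.
authenticated_user = verify(service_key, ticket)["user"]
```

The service trusts the ticket because only the KDC and the service share the service key, which is the property that keytab distribution establishes in a real cluster.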

FreeIPA Integration

ODP integrates with FreeIPA as the KDC and LDAP identity provider. FreeIPA provides:

  • Centralized user and group management with a web UI
  • Automatic principal provisioning for Hadoop services
  • DNS-based service discovery
  • Certificate management for TLS

Ambari's Kerberos wizard supports FreeIPA natively: it automates principal creation, keytab generation and distribution to all nodes, and service configuration updates for all ODP components.

MIT KDC and Active Directory

For environments without FreeIPA, ODP also supports MIT KDC (standalone) and Active Directory (via cross-realm trusts or direct AD integration) as Kerberos providers.

Apache Ranger — Authorization

Apache Ranger provides centralized, fine-grained authorization for all ODP services. Ranger policies are defined once in the Ranger Admin UI and enforced by lightweight plugins running inside each service.

Policy Models

Role-Based Access Control (RBAC): Policies grant or deny access to resources (HDFS paths, Hive databases/tables/columns, Kafka topics, HBase tables) based on users and groups.

Attribute-Based Access Control (ABAC): Policies can include conditions on request attributes (time of day, IP address range, client application) to implement context-sensitive access rules.
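
As a rough illustration of how attribute conditions narrow a group-based grant, the sketch below evaluates a request against one policy. The Policy and Request classes and their field names are hypothetical and do not mirror Ranger's policy model or API.

```python
from dataclasses import dataclass
from datetime import time
from ipaddress import ip_address, ip_network


@dataclass
class Request:
    user_groups: set
    resource: str
    client_ip: str
    at: time


@dataclass
class Policy:
    resource_prefix: str
    allowed_groups: set
    allowed_network: str   # CIDR range the request must come from
    window_start: time     # start of the permitted time-of-day window
    window_end: time       # end of the permitted time-of-day window

    def permits(self, req: Request) -> bool:
        # RBAC part: the caller must belong to an allowed group.
        group_ok = bool(self.allowed_groups & req.user_groups)
        resource_ok = req.resource.startswith(self.resource_prefix)
        # ABAC part: conditions on the request context.
        ip_ok = ip_address(req.client_ip) in ip_network(self.allowed_network)
        time_ok = self.window_start <= req.at <= self.window_end
        return group_ok and resource_ok and ip_ok and time_ok


policy = Policy("/data/finance", {"analysts"}, "10.0.0.0/8", time(8, 0), time(18, 0))

in_hours = Request({"analysts"}, "/data/finance/q1", "10.1.2.3", time(9, 30))
off_hours = Request({"analysts"}, "/data/finance/q1", "10.1.2.3", time(23, 0))
off_network = Request({"analysts"}, "/data/finance/q1", "192.168.1.5", time(9, 30))

decisions = [policy.permits(r) for r in (in_hours, off_hours, off_network)]
```

The same user and resource produce different decisions depending on when and from where the request arrives.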

Tag-Based Policies: Ranger integrates with Apache Atlas. When Atlas classifies a column with a tag such as PII or SENSITIVE, Ranger automatically applies masking or denial policies to that column across all engines — without requiring the policy author to know which tables contain sensitive data.
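
The idea that a policy attaches to a tag rather than to a table can be sketched as follows. The catalog, the policy table, and the decide function are hypothetical; they stand in for Atlas classifications and Ranger tag policies without reproducing either system's data model.

```python
# Hypothetical catalog: column -> tags, as a classifier might record them.
column_tags = {
    "sales.customers.email": {"PII"},
    "sales.customers.country": set(),
    "hr.salaries.amount": {"SENSITIVE"},
}

# Hypothetical tag policies: tag -> (action, groups exempt from that action).
tag_policies = {
    "PII": ("mask", {"privacy_officers"}),
    "SENSITIVE": ("deny", {"hr_admins"}),
}

# When several tags apply, the most restrictive action wins.
SEVERITY = {"allow": 0, "mask": 1, "deny": 2}


def decide(column: str, groups: set) -> str:
    """Return 'allow', 'mask', or 'deny' for a column, based only on its tags."""
    decision = "allow"
    for tag in column_tags.get(column, set()):
        action, exempt = tag_policies.get(tag, ("allow", set()))
        if not (exempt & groups) and SEVERITY[action] > SEVERITY[decision]:
            decision = action
    return decision


email_for_analyst = decide("sales.customers.email", {"analysts"})
email_for_officer = decide("sales.customers.email", {"privacy_officers"})
salary_for_analyst = decide("hr.salaries.amount", {"analysts"})
```

Tagging a new column as PII in the catalog changes the outcome of decide for that column without any change to tag_policies, which is the property the paragraph above describes.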

Row-Level and Column-Level Filtering

Ranger supports:

  • Column masking: Replace actual values with masked representations (nullify, hash, partial mask) for unauthorized users — the column is still visible in query results but the data is hidden.
  • Row-level filtering: Append a WHERE clause to queries transparently so that users only see rows matching their authorization context.

These capabilities enable self-service access to production datasets without exposing sensitive records.
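
The effect of the two mechanisms on a result set can be sketched as below. Ranger applies them inside the query engine; this sketch post-processes rows in Python only to show the outcome. The function names and the sample data are hypothetical.

```python
import hashlib


def mask_nullify(value):
    """Replace the value with NULL."""
    return None


def mask_hash(value):
    """Replace the value with its SHA-256 digest."""
    return hashlib.sha256(str(value).encode()).hexdigest()


def mask_partial(value, visible=4):
    """Keep only the last `visible` characters and replace the rest with 'x'."""
    text = str(value)
    return "x" * max(len(text) - visible, 0) + text[-visible:]


def apply_policies(rows, row_filter, column_masks):
    """Drop rows the user may not see, then mask the listed columns."""
    result = []
    for row in rows:
        if not row_filter(row):
            continue  # row-level filtering: behaves like an appended WHERE clause
        result.append(
            {col: column_masks.get(col, lambda v: v)(val) for col, val in row.items()}
        )
    return result


rows = [
    {"region": "EU", "name": "Ana", "card": "4111111111111111"},
    {"region": "US", "name": "Bob", "card": "5500000000000004"},
]

visible_rows = apply_policies(
    rows,
    row_filter=lambda r: r["region"] == "EU",
    column_masks={"name": mask_nullify, "card": mask_partial},
)
```

The masked columns remain present in every returned row, while the filtered rows are absent entirely.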

Ranger Audit

Every access attempt (successful or denied) is logged to the Ranger audit trail, which can be written to HDFS, Solr, or an external SIEM. The audit log records user identity, resource accessed, action taken, policy matched, and result — providing the evidence required for compliance reporting.
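
An audit event carrying the fields listed above might be serialized as one JSON line per access attempt. The field names here are hypothetical and are not Ranger's audit schema.

```python
import json
from datetime import datetime, timezone


def audit_record(user, resource, action, policy_id, allowed):
    """Build one audit event with the fields named in the text (hypothetical names)."""
    return {
        "event_time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "resource": resource,
        "action": action,
        "policy_id": policy_id,
        "result": "ALLOWED" if allowed else "DENIED",
    }


# Denied attempts are recorded as well as successful ones.
line = json.dumps(audit_record("alice", "/data/finance/q1", "read", 42, allowed=False))
```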

Apache Knox — Perimeter Gateway

Apache Knox acts as the single entry point for external access to ODP cluster services. No internal service ports need to be exposed to clients outside the cluster network.

REST Proxy

Knox translates HTTPS requests from external clients into Kerberos-authenticated requests to internal services:

  • HDFS WebHDFS
  • YARN ResourceManager REST API
  • HiveServer2 (via JDBC-over-HTTPS)
  • HBase REST gateway
  • Ambari REST API
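
From the client's point of view, a proxied service is reached through a single gateway URL. The sketch below builds a WebHDFS URL in the form Knox commonly uses; the host name is hypothetical, and the port 8443, the gateway path "gateway", and the topology name "default" are assumptions that depend on the deployment.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, hdfs_path, op,
                     topology="default", port=8443, gateway_path="gateway", **params):
    """Build the external URL for a WebHDFS operation routed through the gateway."""
    query = urlencode({"op": op, **params})
    return (
        f"https://{gateway_host}:{port}/{gateway_path}/{topology}"
        f"/webhdfs/v1{quote(hdfs_path)}?{query}"
    )


# Only the gateway host and port appear; no NameNode address is exposed to the client.
url = knox_webhdfs_url("knox.example.com", "/data/raw", "LISTSTATUS")
```

The client authenticates to the gateway over HTTPS, and the gateway is responsible for the Kerberos-authenticated call to the internal service.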

SSO and Identity Federation

Knox integrates with external identity providers via SAML 2.0 and OAuth 2.0/OIDC, enabling single sign-on for web UIs (Ambari, Zeppelin, Ranger Admin). Users authenticate once with their corporate IdP and receive a Knox token that grants access to all authorized cluster services.

OIDC support is being expanded in ODP 1.3.2.0 to cover additional services and token-based machine-to-machine authentication flows.

Encryption

In-Transit Encryption

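Inter-node communication in ODP is encrypted in transit, using TLS 1.2+ or, for Hadoop RPC and the HDFS data transfer protocol, SASL-based wire encryption: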

  • HDFS data transfer protocol (DataNode-to-DataNode, client-to-DataNode)
  • YARN RPC
  • HiveServer2 JDBC connections
  • Kafka broker-to-broker and client-to-broker communication
  • HBase, ZooKeeper, and all Ambari-managed services

TLS certificates are issued and managed through FreeIPA's integrated certificate authority when FreeIPA is the KDC.
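
On the client side, the TLS 1.2 floor and trust in the cluster CA can be enforced explicitly. The sketch below uses Python's standard ssl module; the function name and the idea of passing the cluster CA as a PEM file are assumptions for illustration.

```python
import ssl


def strict_client_context(ca_file=None):
    """Client-side TLS context that refuses anything older than TLS 1.2.

    ca_file: optional path to a PEM bundle, for example an exported cluster CA
    certificate. When None, the system trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = strict_client_context()
```

A context created this way also keeps certificate and host name verification enabled, so a connection fails if the server certificate does not chain to a trusted CA.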

At-Rest Encryption

HDFS Transparent Data Encryption (TDE) encrypts file contents within encryption zones. An encryption zone is an HDFS directory subtree in which every file is automatically encrypted with its own data encryption key, and that key is in turn encrypted with a zone key managed by the Hadoop Key Management Server (KMS). Applications read and write data transparently: encryption and decryption happen in the HDFS client before data is sent to or received from DataNodes.
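
The key hierarchy behind an encryption zone follows the envelope-encryption pattern, sketched below. The cipher here is a toy built from SHA-256 and is not secure; it merely stands in for a real cipher so that the example needs only the standard library. ToyKMS and all names are hypothetical.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode, no IV). NOT secure; illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class ToyKMS:
    """Hypothetical key server: zone keys never leave this object."""

    def __init__(self):
        self._zone_keys = {}

    def create_zone_key(self, name: str) -> None:
        self._zone_keys[name] = secrets.token_bytes(32)

    def generate_edek(self, zone_key_name: str) -> bytes:
        # New per-file data key, returned only in encrypted form (EDEK).
        dek = secrets.token_bytes(32)
        return keystream_xor(self._zone_keys[zone_key_name], dek)

    def decrypt_edek(self, zone_key_name: str, edek: bytes) -> bytes:
        # An authorized client exchanges the EDEK for the plaintext data key.
        return keystream_xor(self._zone_keys[zone_key_name], edek)


kms = ToyKMS()
kms.create_zone_key("finance_zone_key")

# Write path: the EDEK is stored with the file metadata, the data is encrypted client-side.
edek = kms.generate_edek("finance_zone_key")
plaintext = b"account=123;balance=1000"
ciphertext = keystream_xor(kms.decrypt_edek("finance_zone_key", edek), plaintext)

# Read path: fetch the EDEK, ask the KMS to decrypt it, then decrypt the data client-side.
recovered = keystream_xor(kms.decrypt_edek("finance_zone_key", edek), ciphertext)
```

Storage nodes in this model only ever hold ciphertext and the encrypted data key, so reading the data requires both access to the file and permission to use the zone key.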

Ambari Security Automation

Securing a Hadoop cluster manually is complex and error-prone. ODP's Ambari automates the full security setup:

  • Kerberos Wizard: A guided workflow in Ambari that collects KDC credentials and automates principal creation, keytab distribution, and service reconfiguration across all cluster nodes for every ODP component.
  • Ranger Plugin Auto-Configuration: When Ranger is enabled in Ambari, it automatically installs and configures the Ranger plugin for every supported service (HDFS, Hive, HBase, Kafka, Knox, Atlas, YARN) and creates default deny-all policies that administrators then refine.
  • TLS Auto-Configuration: Ambari can generate and distribute TLS certificates for all services using the cluster's CA, enabling encrypted communications without manual certificate management.

OIDC Support (Coming in ODP 1.3.2.0)

ODP 1.3.2.0 will extend OIDC (OpenID Connect) support to enable token-based authentication flows for:

  • Modern web UIs and REST API clients that prefer bearer tokens over Kerberos tickets
  • Machine-to-machine authentication for automated pipelines
  • Integration with corporate IdPs (Keycloak, Azure AD, Okta) without requiring a Kerberos cross-realm trust

OIDC tokens will be validated by Knox and exchanged for Kerberos service tickets internally, maintaining backward compatibility with Kerberos-secured backend services.