Version: 1.3.1.0

Master Node Sizing

Master nodes in an ODP cluster host the coordination, metadata, and management services that must remain highly available. Poor sizing of master nodes is a common cause of cluster instability — particularly NameNode GC pauses, slow YARN scheduling, or Ambari UI timeouts.

This page provides detailed sizing guidance for each master service, including JVM heap recommendations.

NameNode (HA Pair)

The HDFS NameNode holds the entire filesystem namespace in memory. Memory requirements scale with the number of files, blocks, and directories in HDFS.

| Metric | Guidance |
| --- | --- |
| Heap per NameNode | 2 GB per million blocks (minimum 8 GB, typical 16–64 GB) |
| CPU | 8–16 cores |
| RAM (total node) | 64–128 GB |
| OS + metadata disk | 2x SSD RAID 1 (for edit logs and fsimage) |

JVM heap example (hadoop-env):

HADOOP_NAMENODE_OPTS="-Xms32g -Xmx32g -XX:+UseG1GC"

Notes:

  • Always run NameNode in HA mode with a shared edit log via JournalNodes (minimum 3 JournalNodes, typically co-located on master nodes)
  • The NameNode fsimage and edit log directories (dfs.namenode.name.dir, dfs.namenode.edits.dir) must be on fast, reliable storage. SSD with RAID 1 mirroring is strongly recommended
  • Monitor NameNode heap usage carefully; NameNode OOM crashes can take the cluster offline
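
The 2 GB-per-million-blocks rule above can be turned into a quick estimate. The block count below is a placeholder; substitute the real value from `hdfs dfsadmin -report` or the NameNode UI:

```shell
#!/bin/sh
# Heap estimate from the sizing rule above: 2 GB per million blocks,
# with an 8 GB floor. block_count is a placeholder example value.
block_count=40000000                           # e.g. 40 million blocks
heap_gb=$(( block_count / 1000000 * 2 ))       # 2 GB per million blocks
if [ "$heap_gb" -lt 8 ]; then heap_gb=8; fi    # enforce the 8 GB minimum
echo "Suggested NameNode heap: -Xms${heap_gb}g -Xmx${heap_gb}g"
# Suggested NameNode heap: -Xms80g -Xmx80g
```

Remember to set `-Xms` equal to `-Xmx`, as in the `HADOOP_NAMENODE_OPTS` example above, to avoid heap-resize pauses.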

ResourceManager (HA Pair)

The YARN ResourceManager schedules and tracks all application containers across the cluster.

| Metric | Guidance |
| --- | --- |
| Heap | 4–16 GB (scales with cluster size and application count) |
| CPU | 4–8 cores |
| RAM (total node) | 32–64 GB |

JVM heap example (yarn-env):

YARN_RESOURCEMANAGER_HEAPSIZE=8192

Notes:

  • ResourceManager HA uses ZooKeeper for active/standby state tracking. Ensure ZooKeeper is healthy before starting ResourceManager
  • In large clusters (500+ nodes), consider running the ResourceManager on its own dedicated node rather than co-locating it with the NameNode
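
For reference, ResourceManager HA against ZooKeeper is wired up in `yarn-site.xml`. This is a sketch of the properties involved, not a complete configuration; the `rm1`/`rm2` IDs and ZooKeeper hostnames are placeholders (on older Hadoop releases the ZooKeeper address key is `yarn.resourcemanager.zk-address` instead of `hadoop.zk.address`):

```xml
<!-- Sketch only; host names and rm-ids are placeholders. -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>hadoop.zk.address</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
```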

HBase Master

The HBase Master coordinates region assignment and handles administrative operations. It is not in the data path for reads/writes (unlike RegionServers), so its sizing is more modest.

| Metric | Guidance |
| --- | --- |
| Heap | 2–4 GB |
| CPU | 4 cores |
| RAM (total node) | 16–32 GB (on shared master node) |

JVM heap example (hbase-env):

HBASE_MASTER_OPTS="-Xms4g -Xmx4g -XX:+UseG1GC"

Notes:

  • Deploy HBase Master in HA mode (primary + backup master on different nodes)
  • HBase Master shares ZooKeeper with HDFS/YARN; ensure ZooKeeper ensemble is appropriately sized

ZooKeeper Quorum

ZooKeeper provides distributed coordination for HDFS NameNode HA, YARN ResourceManager HA, HBase, Kafka, and other ODP components.

| Metric | Guidance |
| --- | --- |
| Heap | 1–4 GB |
| CPU | 2–4 cores |
| Data disk | SSD strongly recommended (ZooKeeper transaction log is latency-sensitive) |
| Quorum size | 3 nodes (minimum), 5 for larger clusters |

JVM heap example (zoo.cfg / zookeeper-env):

ZK_SERVER_HEAP=2048

Notes:

  • Always deploy an odd number of ZooKeeper nodes (3 or 5) to maintain quorum
  • ZooKeeper transaction logs (dataLogDir) must be on a dedicated disk with consistent low latency. Shared disks cause ZooKeeper election timeouts under load
  • Co-locating ZooKeeper on master nodes is standard practice for small to medium clusters
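
The odd-number rule follows from quorum arithmetic: an ensemble of n servers stays available only while a majority is up, so it tolerates (n − 1) / 2 failures (integer division). A fourth node therefore adds no fault tolerance over three:

```shell
#!/bin/sh
# Failure tolerance of a ZooKeeper ensemble: majority must survive,
# so an n-node ensemble tolerates (n - 1) / 2 failures.
for n in 3 4 5; do
  echo "ensemble of $n tolerates $(( (n - 1) / 2 )) failure(s)"
done
# ensemble of 3 tolerates 1 failure(s)
# ensemble of 4 tolerates 1 failure(s)   <- 4th node buys nothing
# ensemble of 5 tolerates 2 failure(s)
```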

Ambari Server

The Ambari Server manages cluster configuration, deployment, and monitoring. It embeds a Tomcat servlet container and a metrics collection stack.

| Metric | Guidance |
| --- | --- |
| Heap | 2–4 GB |
| CPU | 4 cores |
| RAM (total node) | 16–32 GB (on shared master node) |
| Database | External PostgreSQL or MySQL on SSD |

JVM heap example (ambari-env.sh):

AMBARI_JVM_ARGS="-Xms2g -Xmx4g"

Notes:

  • Ambari Server should connect to an external database (PostgreSQL recommended) rather than the embedded Derby database, which is unsuitable for production
  • For clusters with the Ambari Metrics Service (AMS), the Metrics Collector requires additional RAM (4–8 GB heap) and fast disk for its HBase-backed time-series store
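
After pointing Ambari at an external PostgreSQL via `ambari-server setup`, the resulting `ambari.properties` contains entries along these lines. This is an illustrative fragment, not a template; the host and database names are placeholders:

```properties
# Illustrative ambari.properties entries for an external PostgreSQL;
# hostname and database name below are placeholders.
server.jdbc.database=postgres
server.jdbc.database_name=ambari
server.jdbc.hostname=db-host.example.com
server.jdbc.user.name=ambari
```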

Ranger

Apache Ranger provides centralized security policy management, audit logging, and row/column level access control for Hive, HDFS, HBase, Kafka, Knox, NiFi, and other services.

| Metric | Guidance |
| --- | --- |
| Ranger Admin heap | 2–4 GB |
| Ranger UserSync heap | 512 MB – 1 GB |
| CPU | 4 cores |
| External DB (policies) | PostgreSQL or MySQL on SSD |
| Audit store | Solr (4–8 GB heap) or HDFS |

JVM heap example (ranger-env):

ranger_admin_max_heap_size=4096

Notes:

  • The Ranger Solr audit store (ranger_audit_solr) should have a dedicated heap of 4–8 GB and fast disk for its index
  • For high-audit-volume clusters, consider an external Solr or Elasticsearch cluster instead of the embedded Solr managed by Ambari
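
To make the policy-DB and audit-store split above concrete, a Ranger Admin `install.properties` typically carries entries of this shape. This is a hedged sketch; the hostnames and Solr URL are placeholders, and Ambari-managed installs set the equivalent values through the UI instead:

```properties
# Illustrative Ranger Admin install.properties values; hostnames
# and the Solr URL are placeholders, not defaults.
DB_FLAVOR=POSTGRES
db_host=db-host.example.com
audit_store=solr
audit_solr_urls=http://solr-host:8983/solr/ranger_audits
```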

Atlas

Apache Atlas provides metadata governance, lineage tracking, and data classification.

| Metric | Guidance |
| --- | --- |
| Atlas Server heap | 4–8 GB |
| CPU | 4–8 cores |
| Embedded HBase (for graph) | Shared with cluster HBase or dedicated |
| Embedded Solr | 4–8 GB heap |

JVM heap example (atlas-env):

export ATLAS_SERVER_OPTS="-Xms4g -Xmx8g -XX:+UseG1GC"

Notes:

  • Atlas uses JanusGraph backed by HBase for its metadata graph store. It also uses Solr for full-text search. Both are resource-intensive
  • For large environments (millions of entities), consider dedicating a full master node to Atlas
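
The HBase-backed graph store and Solr index noted above are selected in `atlas-application.properties`. The following is an illustrative fragment only; the ZooKeeper hostnames are placeholders, and the storage backend key may be `hbase` rather than `hbase2` depending on the Atlas/HBase versions shipped with your ODP release:

```properties
# Illustrative atlas-application.properties; ZooKeeper hosts are
# placeholders, and the backend name depends on the Atlas version.
atlas.graph.storage.backend=hbase2
atlas.graph.storage.hostname=zk1,zk2,zk3
atlas.graph.index.search.backend=solr
atlas.graph.index.search.solr.zookeeper-url=zk1:2181,zk2:2181,zk3:2181/solr
```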

Knox

Apache Knox provides a single entry point for all REST API and UI access to ODP services through an API gateway and SSO proxy.

| Metric | Guidance |
| --- | --- |
| Knox Server heap | 1–2 GB |
| CPU | 4–8 cores (TLS termination is CPU-bound) |
| RAM (total node) | 16–32 GB (on edge or master node) |

JVM heap example (knox-env):

KNOX_MAX_MEM=2048

Notes:

  • Knox handles TLS/SSL for all incoming connections. A modern CPU with AES-NI hardware acceleration significantly reduces TLS overhead for high-concurrency deployments
  • Knox is typically deployed on the edge node, not on master nodes, to isolate external-facing traffic from internal coordination services