The heart and the brains of a cluster run on servers (in the HPCC world people call them nodes). In general, there are two major types of nodes: master/login nodes and compute nodes. Let’s go over the basic functions of HPCC nodes and what general options are available.

Master/Login Node

The general function of the master node is to be the brains behind the cluster. Its fundamental mission is to run the Cluster Management Tools (CMT), which are covered in the section on HPCC Software. In addition, this node can run other cluster functions such as:

  • Job scheduling
  • Cluster monitoring
  • Cluster reporting
  • User account management

For small to moderately sized systems, the master node can also serve as a login node for users. In that case the master node needs a network interface for the private cluster network (typically eth1) and a network interface to the outside world (typically eth0).
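
As a rough illustration of this two-interface requirement, the Python sketch below checks that a node has one interface inside the private cluster network and one outside it. The interface names follow the text; the addresses, the 10.1.0.0/16 cluster network, and the function names are assumptions made up for the example.

```python
import ipaddress

# Hypothetical private cluster network; substitute your own range.
CLUSTER_NET = ipaddress.ip_network("10.1.0.0/16")


def classify_interfaces(interfaces, cluster_net=CLUSTER_NET):
    """Label each interface 'cluster' or 'outside' based on its IP address."""
    labels = {}
    for name, addr in interfaces.items():
        inside = ipaddress.ip_address(addr) in cluster_net
        labels[name] = "cluster" if inside else "outside"
    return labels


def can_serve_as_login_node(interfaces, cluster_net=CLUSTER_NET):
    """True if the node reaches both the cluster network and the outside."""
    found = set(classify_interfaces(interfaces, cluster_net).values())
    return found == {"cluster", "outside"}


# Made-up addresses (192.0.2.0/24 is reserved for documentation examples).
master = {"eth0": "192.0.2.10", "eth1": "10.1.0.1"}
print(classify_interfaces(master))
print(can_serve_as_login_node(master))
```

A node with only the cluster-side interface would fail the second check, which is the situation of a typical compute node.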

In addition, users will sometimes run what are called pre-processing and post-processing applications on the master/login node. The pre-processing applications create the input data for jobs that will run on the cluster (a job is a generic term for an application launched by a job scheduler). The post-processing applications take the output of the cluster job and analyze it. At times, users will also run interactive codes on the node for visualization or data manipulation.

On smaller clusters, the master node often has attached storage that is exported to the compute nodes using NFS. On larger clusters, the master node can instead be attached to the storage network, where it can monitor and administer the storage.
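
For the small-cluster case, and assuming the master node is a Linux NFS server, the export is typically declared as one line in /etc/exports. The Python sketch below only formats such a line. The exported path, the client network, and the option set are assumptions for illustration, not recommendations.

```python
def exports_line(path, client_net, options=("rw", "sync", "no_subtree_check")):
    """Format one /etc/exports entry: <path> <clients>(<opt>,<opt>,...)."""
    return f"{path} {client_net}({','.join(options)})"


# Hypothetical exported directory and cluster network.
print(exports_line("/home", "10.1.0.0/16"))
# prints: /home 10.1.0.0/16(rw,sync,no_subtree_check)
```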

Because the master node performs all these functions, it needs to be fairly robust. Consequently, it is usually designed with features such as hot-swappable power supplies and RAID-1 (mirroring) of the drives that contain the OS. The master/login node also attaches to multiple networks, including a management network (TCP based), the internal computational network (if there is one), and the outside network, so you will need room for several network connections. You can even go to the next level and make the master node a complete HA solution with failover of critical system functions between nodes. This may not be needed, however, since most job schedulers and CMTs can handle multiple failover servers without making the master node itself an HA solution.

For larger systems, splitting the tasks of the master node from those of the login nodes is recommended. In this split, the master node controls the cluster with the CMT package. It can also be the main point for job scheduling and resource management. The login node(s) then become the systems that users log into, run their pre-processing and post-processing applications on, and submit jobs to the cluster from. From a physical configuration perspective, master nodes and login nodes are usually configured the same. As the cluster or its number of users grows, you can add more login nodes. There are many ways to configure multiple login nodes, but that topic is beyond the scope of this introduction to HPCC nodes.

Compute Nodes

The compute nodes really do one thing: compute. There is not much else to say, except that the form factor (chassis) of a compute node can vary with its requirements, as discussed in the next section.

What U are U?

One of the critical options for HPCC nodes is the packaging. Sometimes people will call this a “form factor.” Basically, it’s the chassis for each node. Here are some possibilities for HPCC nodes:


  • Do you use a 1U chassis for the master/login nodes or the compute nodes?

[Image: 1U chassis, front view]

  • Or do you use a 2U?

[Image: 2U chassis, front view]

  • Or 4U?

[Image: 4U chassis, front view]

  • Or blades?

[Image: blades]

Well, as always, the answer to the question of which form factor is appropriate is, “it depends.” So let’s talk about situations where one option is perhaps better than others, and what the pluses and minuses are for each option. I like to call this discussion, “What U are U?” Here are some general rules of thumb from my experience for choosing a form factor:

  • The 1U form factor is typically used when:
    1. 1 to 2 sockets are needed in a node
    2. PCI-e card(s) are needed because certain functionality is not on the motherboard (for example, an IB card or a PCI-e card that connects to an Nvidia Tesla 1U system)
    3. Density is not an issue
    4. The cluster is small (blades are usually more expensive than 1U nodes for smaller clusters)
    5. The nodes need additional or larger hard drives with hot-swap capability that blades can’t provide (blades typically have a small number of drives that are small in form factor and capacity and may or may not be hot-swappable)
  • 2U form factor nodes are typically used when:
    1. 2 to 4 sockets (typically 2 sockets) are needed in a node
    2. More hard drives are needed than you can put in a 1U form factor
    3. PCI-e cards may be added later for functionality that is not on the motherboard (for example, a new IB card or a PCI-e card that connects to an Nvidia Tesla 1U system)
    4. Multiple network connections are needed (such as on a master/login node)
    5. Redundant power supplies are needed (although many 1U nodes also have redundant power supplies)
    6. The nodes need to use less power than a 1U (a 2U node has larger fans than a 1U and typically uses less power)
  • The 4U form factor nodes are typically used when:
    1. A large number of sockets is needed in a single node (typically 4 sockets)
    2. A large number of DIMM slots (lots of memory) is needed in a single node
    3. A large number of hard drives are needed in the node
    4. A large number of PCI-e cards are needed in a single node
    5. The nodes need to use less power than a 2U (a 4U node has larger fans than a 2U and typically uses less power)
  • Blades are typically used in situations when:
    1. High density is critical
    2. Power and cooling are absolutely critical (blades typically have better power and cooling efficiency than rack-mount nodes)
    3. The applications don’t require much local storage capacity
    4. PCI-e expansion cards are not required (typically blades have built-in IB or GbE)
    5. The cluster is large (blades can be cheaper than 1U nodes for larger systems)
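
One way to see how these rules of thumb interact is to write a few of them down as a small decision function. The Python sketch below is only a starting point under stated assumptions: the `NodeNeeds` fields, the function name, the order in which the rules are checked, and the numeric thresholds (for example, treating more than two PCI-e cards as "a large number") are my own illustrative choices, not fixed rules. Cost and power, which the lists above also mention, are left out.

```python
from dataclasses import dataclass


@dataclass
class NodeNeeds:
    """Hypothetical summary of what one node has to provide."""
    sockets: int = 2
    pcie_cards: int = 0
    many_drives: bool = False    # more drives than fit in a 1U chassis
    high_density: bool = False   # rack space is at a premium


def suggest_form_factor(needs):
    """Return a starting-point form factor from a subset of the rules above."""
    # Blades: density is critical, no expansion cards, little local storage.
    if needs.high_density and needs.pcie_cards == 0 and not needs.many_drives:
        return "blade"
    # 4U: many sockets or many PCI-e cards (thresholds are assumptions).
    if needs.sockets >= 4 or needs.pcie_cards > 2:
        return "4U"
    # 2U: more drives than a 1U holds, or more than two sockets.
    if needs.many_drives or needs.sockets > 2:
        return "2U"
    # 1U: one or two sockets and at most a card or two.
    return "1U"


print(suggest_form_factor(NodeNeeds(sockets=2)))                    # 1U
print(suggest_form_factor(NodeNeeds(sockets=2, many_drives=True)))  # 2U
print(suggest_form_factor(NodeNeeds(sockets=4)))                    # 4U
print(suggest_form_factor(NodeNeeds(high_density=True)))            # blade
```

The function returns a single answer, whereas in practice several form factors often fit, so treat the result as a first guess to check against your applications.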

Given these rules of thumb, HPCC nodes usually follow these recommendations:

  • Master/login nodes are either a 2U or a 4U form factor.
    • The larger space is used for multiple networks
    • The nodes definitely need redundant power supplies
    • Sometimes the storage for the system is put in the master/login node (usually for smaller systems)
  • Compute nodes can be built from almost any form factor:
    • 1U
      • Most common compute node form factor (1 to 2 sockets)
    • 2U
      • Needed when power usage is extremely important
      • Needed when the application needs more local I/O than can be put in a 1U node
    • 4U
      • Four or more sockets are needed in a single node
      • A great deal of local I/O is needed
      • Needed when power usage has to be lower than a 2U node
    • Blades
      • High density is needed
      • You don’t need any PCI-e cards in the node
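
To put a rough number on "density," the Python sketch below counts how many nodes fit in one rack for each form factor. It assumes a 42U rack filled entirely with nodes and a hypothetical blade chassis that is 10U tall and holds 16 blades. Neither figure comes from the discussion above, and real racks also lose space to switches and power distribution.

```python
RACK_UNITS = 42  # assumed full-height rack with no space reserved for other gear


def nodes_per_rack(node_height_u, rack_units=RACK_UNITS):
    """Number of rack-mount nodes of the given height that fit in one rack."""
    return rack_units // node_height_u


def blades_per_rack(chassis_height_u, blades_per_chassis, rack_units=RACK_UNITS):
    """Number of blades in one rack, given the chassis height and capacity."""
    return (rack_units // chassis_height_u) * blades_per_chassis


for height in (1, 2, 4):
    print(f"{height}U nodes per rack: {nodes_per_rack(height)}")

# Hypothetical blade chassis: 10U tall, 16 blades per chassis.
print(f"blades per rack: {blades_per_rack(10, 16)}")
```

Under these assumptions the counts are 42, 21, and 10 nodes for 1U, 2U, and 4U, and 64 blades, which is why blades come up whenever density is the main constraint.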


Picking a good or appropriate (I won’t say optimal) form factor begins with knowing your applications. Unfortunately, a discussion of this nature is a bit beyond the scope of this page, so use the previously discussed rules of thumb to get an idea of what kind of form factor suits your needs.
