In order to effectively discuss HPC cooling, we first have to discuss the components that consume large amounts of power. Power in HPC environments is generally consumed by the same components and infrastructure you would find in enterprise environments, with one very large caveat: you have to scale up! The major power consumers in HPC environments are well-known system components such as processors and memory, and to some degree disk storage and interconnect hardware. The ratio of compute nodes to other components is usually high, so you will likely find that most of your power is consumed by your computational resources. Take a walk around any HPC environment and find the components that are generating heat; those hot spots will be your major power consumers.
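To make that concrete, here is a minimal back-of-the-envelope sketch of where the power goes in a hypothetical cluster. Every node count and per-component wattage below is an illustrative assumption, not a measurement from any particular system:

```python
# Rough power-budget sketch for a hypothetical cluster.
# All counts and wattages are illustrative assumptions.

NODE_COUNT = 512
WATTS_PER_CPU = 130          # assumed per-socket draw
CPUS_PER_NODE = 2
WATTS_PER_DIMM = 5           # assumed draw per memory DIMM
DIMMS_PER_NODE = 16
WATTS_MISC_PER_NODE = 60     # assumed disks, NIC, fans, board overhead

compute_watts = NODE_COUNT * (
    WATTS_PER_CPU * CPUS_PER_NODE
    + WATTS_PER_DIMM * DIMMS_PER_NODE
    + WATTS_MISC_PER_NODE
)
interconnect_watts = 36 * 150   # assumed: 36 switches at ~150 W each
storage_watts = 20 * 500        # assumed: 20 storage servers at ~500 W each

total = compute_watts + interconnect_watts + storage_watts
print(f"Compute nodes: {compute_watts/1000:.1f} kW "
      f"({100 * compute_watts / total:.0f}% of total IT load)")
print(f"Total IT load: {total/1000:.1f} kW")
```

Even with the numbers swapped for your own, the pattern tends to hold: the compute nodes dominate simply because there are so many of them.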

Now that we know our hot spots, we can discuss cooling. For every watt a component consumes, additional power must be spent to remove the heat that watt becomes. Cooling infrastructure includes items such as chillers, air handlers, and fans. When you add a few more servers in an enterprise environment, it is easy to overlook the power consumed by the incremental cooling they require. Not so with HPC: careful planning of power and cooling resources is as critical as the design of the computational components themselves.
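A quick sketch of the arithmetic shows why the incremental cooling cannot be ignored at HPC scale. The unit conversions are standard (1 W of dissipation is 3.412 BTU/hr; 1 ton of refrigeration is 12,000 BTU/hr); the node count and per-node draw are illustrative assumptions:

```python
# Back-of-the-envelope cooling load for an added IT load.
WATTS_TO_BTU_HR = 3.412      # 1 W dissipated = 3.412 BTU/hr of heat
BTU_HR_PER_TON = 12_000      # 1 ton of refrigeration = 12,000 BTU/hr

def cooling_tons(it_watts: float) -> float:
    """Refrigeration tonnage needed to remove a given IT heat load."""
    return it_watts * WATTS_TO_BTU_HR / BTU_HR_PER_TON

added_nodes = 64             # assumed expansion
watts_per_node = 450         # assumed per-node draw
load = added_nodes * watts_per_node
print(f"{load/1000:.1f} kW of new compute needs "
      f"~{cooling_tons(load):.1f} tons of additional cooling")
```

A few dozen enterprise servers might disappear into a room's existing cooling margin; a few dozen racks of HPC nodes will not.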

It is very important to cool your HPC components effectively; otherwise they tend to fail at much higher rates. To reduce the risk of overheating, most CPUs now include a mechanism that powers down a node when its temperature reaches a critical threshold. In recent years, water-cooled rack doors have made a comeback in the industry, especially as systems have become very dense with the use of blades. These doors either chill the air before it enters the nodes or chill the warm exhaust air as it exits.
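You generally want to catch a hot node well before the hardware's critical trip point fires. Here is a minimal monitoring sketch, assuming a Linux host that exposes temperatures through the kernel's /sys/class/thermal interface; the 85 °C warning threshold is an illustrative assumption, and the hardware's own critical-trip shutdown happens independently of anything user space does:

```python
# Minimal sketch: read thermal zone temperatures and flag hot ones.
from pathlib import Path

WARN_CELSIUS = 85.0  # assumed site-specific warning threshold

for zone in sorted(Path("/sys/class/thermal").glob("thermal_zone*")):
    try:
        millideg = int((zone / "temp").read_text().strip())  # millidegrees C
    except OSError:
        continue  # some zones may not expose a readable temperature
    celsius = millideg / 1000.0
    status = "WARN" if celsius >= WARN_CELSIUS else "ok"
    print(f"{zone.name}: {celsius:.1f} C [{status}]")
```

In practice a cluster monitoring stack would collect these readings centrally, but the principle is the same: watch the hot spots so throttling and shutdown mechanisms stay a last resort.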

In large HPC and enterprise environments, it is advisable to use hot/cold aisle concepts to control the flow of chilled and exhaust air. In a cold aisle, rack fronts face each other, so you can direct chilled air to the cold aisles where hot exhaust will not interfere. If you were to place rack fronts across the aisle from rack rears, cooling would be less effective because some of the exhaust would be drawn into the intakes across the aisle. In a hot aisle, the rack rears face each other to consolidate the exhaust. In many environments it is advisable to contain either the hot or the cold aisles so that the two air masses do not mix; for instance, you could contain the hot-aisle air and direct it immediately out of the data center.
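A short calculation makes the recirculation penalty concrete. The airflow a rack needs grows as the temperature rise available across it shrinks, per the standard air-side relation CFM ≈ 3.16 × watts / ΔT(°F); the rack wattage and temperatures below are illustrative assumptions:

```python
# Why recirculated exhaust hurts: warmer intake air shrinks the usable
# delta-T, so the same heat load demands much more airflow.

def required_cfm(watts: float, intake_f: float, exhaust_f: float) -> float:
    """Airflow (CFM) needed to carry away a heat load at a given delta-T."""
    return 3.16 * watts / (exhaust_f - intake_f)

rack_watts = 20_000  # assumed dense rack
clean = required_cfm(rack_watts, intake_f=65, exhaust_f=95)
mixed = required_cfm(rack_watts, intake_f=80, exhaust_f=95)  # exhaust bleed-over

print(f"Cold-aisle intake at 65 F: {clean:,.0f} CFM")
print(f"Intake warmed to 80 F by recirculation: {mixed:,.0f} CFM")
```

Halve the delta-T and you double the airflow the fans must move, which is exactly why keeping the two air masses separated pays off.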

Next Up… MEMORY

-- Blake Gonzales, Dell HPC Scientist

See other posts from this Blog series:
INTRO
SMP
CLUSTERED SYSTEMS
CLUSTERED SYSTEM INFRASTRUCTURE
POWER DISTRIBUTION