General-purpose Graphics Processing Units (GPUs) have proven their acceleration capacity across several HPC application classes. They are well suited to compute-intensive applications such as Computational Fluid Dynamics (CFD), Molecular Dynamics (MD), Quantum Chemistry (QC), Computational Finance (CF), and Oil & Gas applications. Among these areas, Molecular Dynamics has benefitted tremendously from GPU acceleration. This is partly because its core algorithms suit the hybrid CPU-GPU computing model and, equally important, because sophisticated GPU-enabled molecular dynamics simulators are freely available. NAMD is one such GPU-enabled simulator. For more detailed information about NAMD and GPUs, please visit http://www.ks.uiuc.edu/Research/namd/ and http://www.nvidia.com/TeslaApps.
In this blog we evaluate how much GPU-accelerated compute nodes improve NAMD performance. Two benchmarks, F1ATPASE and STMV, which consist of 327K and 1066K atoms respectively, were chosen for their relatively large problem size. The performance measure is “days/ns”, the number of days of wall-clock time required to simulate 1 nanosecond of simulated time, so lower values are better.
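Because days/ns is a time-per-work measure, relative performance is the CPU-only value divided by the GPU value. The sketch below shows that arithmetic; the function names are illustrative and the numbers in the usage example are hypothetical, not measurements from this study.

```python
def ns_per_day(days_per_ns: float) -> float:
    """Convert a days/ns figure into throughput (ns of simulated time per day)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


def speedup(cpu_days_per_ns: float, gpu_days_per_ns: float) -> float:
    """Relative performance of the GPU run: how many times faster it is."""
    if cpu_days_per_ns <= 0 or gpu_days_per_ns <= 0:
        raise ValueError("days/ns values must be positive")
    # Lower days/ns is better, so the CPU-only value goes in the numerator.
    return cpu_days_per_ns / gpu_days_per_ns


# Hypothetical example values, chosen only to illustrate the calculation.
example_cpu = 0.50  # days/ns without GPUs
example_gpu = 0.25  # days/ns with GPUs
print(f"throughput with GPUs: {ns_per_day(example_gpu):.1f} ns/day")
print(f"relative performance: {speedup(example_cpu, example_gpu):.1f}X")
```

Reporting relative performance this way keeps "higher is better" in the charts even though the underlying metric is "lower is better".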
Figure 1: Relative performance of two NAMD benchmarks on the 8 node R720 cluster. F1ATPASE is accelerated about 1.1X and STMV about 2.8X.
Figure 1 illustrates the relative performance of the two NAMD benchmarks on the 8 node R720 cluster, keeping the number of GPUs fixed at 16. Both benchmarks run faster with GPUs, but the acceleration is very sensitive to problem size: F1ATPASE shows a modest 1.1X acceleration, while STMV shows 2.8X. As expected, the acceleration improves with problem size. The F1ATPASE result suggests that roughly 300K atoms is the minimum problem size at which GPUs begin to pay off. Figure 2 shows the additional power required for GPUs; total power consumption increases by a factor of 1.6X. From the power efficiency point of view, running STMV with dual internal GPUs is beneficial, as the performance gain is 2.8X for a 1.6X increase in power.
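The power efficiency argument comes down to dividing the speedup by the relative power consumption. A minimal sketch of that calculation, using only the ratios quoted in this post (the function name is illustrative):

```python
def perf_per_watt_gain(speedup: float, power_ratio: float) -> float:
    """Relative performance per watt of the GPU run versus the CPU-only run.

    Both arguments are ratios (GPU configuration / CPU-only configuration).
    A result above 1.0 means the GPU run does more work per unit of power.
    """
    if speedup <= 0 or power_ratio <= 0:
        raise ValueError("ratios must be positive")
    return speedup / power_ratio


# Ratios quoted in this post: (speedup, relative power consumption).
benchmarks = {"F1ATPASE": (1.1, 1.6), "STMV": (2.8, 1.6)}

for name, (speedup, power_ratio) in benchmarks.items():
    gain = perf_per_watt_gain(speedup, power_ratio)
    print(f"{name}: {gain:.2f}X performance per watt")
```

With these ratios STMV comes out at about 1.75X performance per watt, while F1ATPASE falls below 1X, meaning that at that problem size the GPUs cost more in power than they return in speed.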
In summary, GPUs can accelerate NAMD simulations, and problem size is a key factor in determining how much a particular simulation is accelerated. In a previous study (http://en.community.dell.com/techcenter/high-performance-computing/w/wiki/namd-performance-on-pe-c6100-and-c410x.aspx) we found a similar sensitivity to problem/simulation size (number of atoms). STMV, the largest simulation we have, is about 1 million atoms and accelerates much better than the smaller simulations. We expect that even larger simulations would accelerate further, although we have not tested this here.
Figure 2: Relative Power Consumption of two NAMD benchmarks on the 8 node R720 cluster. In both cases there is about 1.6X increase in power consumption due to GPUs.
The cluster consists of one master node and eight PowerEdge R720 compute nodes, as shown in Figure 3. The compute nodes can be configured with one or two internal Tesla M2090 GPUs; each node in our cluster has two. The details of the hardware, software and NAMD parameter setup are given below:
Figure 3: The 8 node PowerEdge R720 Cluster. Each compute node has 2 internal GPUs, for a total of 16 Tesla M2090 GPUs.
The Mellanox SX6025 is a 36-port Non-blocking Unmanaged 56Gb/s InfiniBand Switch System.
What NICs are used in each R720, and in which slot?
The IB cards used are Mellanox Model No CX354A (ConnectX-3 FDR + 40GigE). The cards are in slot 1 in each R720.