By Xin Chen, Garima Kochhar and Mario Gallegos. August 2012.

Almost all clusters, no matter the size, need an NFS-based solution. In some clusters this storage is used only for applications and home directories. In others, depending on the size of the cluster and the I/O requirements, it is also used for processing temporary files. Every system admin knows NFS, but in our experience, tuning NFS is non-trivial. You already have the servers, the storage and the software. How do you get the best performance and reliability out of your configuration? And how do you get up to 4000 MB/s throughput from NFS? Keep reading!

NSS-HA is a line of optimized NFS-based storage solutions for HPC configurations that also provide High Availability. Three versions of NSS-HA have been released since 2011, including the latest one described in this article. In this latest version, called NSS4-HA, the NFS servers have been upgraded to take advantage of several new technologies that promise to improve I/O performance.

We’re going to throw some model numbers at you now, simply to make clear which new technologies we’re talking about. If you’re already familiar with them, skip ahead to the next paragraph. These technologies have been released with the Dell PowerEdge R620 server. It features the Intel Xeon E5-2600 series processors (based on the Intel micro-architecture codenamed Sandy Bridge-EP) and provides enhanced systems management features and lower power consumption when compared to the previous 11th generation Dell PowerEdge servers. The integrated PCIe Gen-3 I/O capabilities of the latest Intel Xeon processors allow for a faster interconnect using 56 Gb/sec fourteen data rate (FDR) InfiniBand adapters. For 10 Gb Ethernet solutions, an onboard Network Daughter Card that does not consume a PCIe slot is now an option. The 1U PowerEdge R620 has enough PCIe slots to satisfy the requirements of the NSS-HA solution and allows for a denser solution. Additionally, the PowerEdge R620 provides increased memory capacity and bandwidth. All these factors combine to provide better I/O performance, as described below. The storage subsystem of this release of NSS-HA remains unchanged.

For readers familiar with the NSS-HA solutions, Table 1 summarizes what’s new and can help you decide whether you need an upgrade at all. Note that there are significant changes in configuration steps between the NSS2-HA and NSS3-HA releases, while there are few configuration changes between the NSS3-HA and NSS4-HA releases. With the help of the PowerEdge R620 and the FDR InfiniBand network connection, the NSS4-HA solution now achieves peak sequential read performance of up to 4058 MB/sec! The Performance Improvement section at the end of this article gives more detail on the I/O performance of this configuration compared to the previous generation.

Table 1. Comparison of NSS-HA solutions

Releases compared
  • NSS2-HA Release (April 2011)
  • NSS3-HA Release (February 2012), “Large capacity configuration”
  • NSS4-HA Release (July 2012), “PowerEdge R620 based solution”

Release Purpose
  • NSS2-HA: Initial release.
  • NSS3-HA: Add the ability to support greater than 100 TB storage capacity.
  • NSS4-HA: Move to latest server technology. Take advantage of the performance improvement with Dell PowerEdge 12th generation servers.

Storage Capacity
  • NSS2-HA: The maximum supported size is 96 TB in a standard configuration. The XL configuration supports 2x96 TB (two file systems).
  • NSS3-HA and NSS4-HA: The maximum supported size is 288 TB in a standard configuration. The XL configuration supports 2x288 TB (two file systems).

Sequential Performance (standard configuration)
  • NSS2-HA: Peak write: 1275 MB/sec. Peak read: 2430 MB/sec.
  • NSS3-HA: Peak write: 1495 MB/sec. Peak read: 2127 MB/sec.
  • NSS4-HA: Peak write: 1535 MB/sec. Peak read: 4058 MB/sec.

Configuration Details
  • NSS2-HA: The complete configuration steps can be found at Dell HPC NFS Storage Solution High Availability Configurations, Version 1.1.
  • NSS3-HA: Compared to the NSS2-HA release, there are significant changes in configuration steps. The complete configuration steps can be found at Dell HPC NFS Storage Solution – High availability with large capacities, Version 2.1.
  • NSS4-HA: Compared to the NSS3-HA release, there are only a few changes in NSS4-HA configuration steps. The complete configuration steps will be published in July 2012.

HA Functionalities
All three releases use the same mechanisms to tolerate or recover from the following failures:
  • Single local disk failure on a server
  • Single server failure
  • Power supply or power bus failure
  • Fence device failure
  • SAS cable/port failure
  • Dual SAS cable/card failure
  • InfiniBand/10GbE link failure
  • Private switch failure
  • Heartbeat network interface failure
  • RAID controller failure on Dell PowerVault MD3200 storage array
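
As a quick sanity check on Table 1, the relative change in peak sequential throughput between releases can be worked out directly from the figures listed there. The short Python sketch below does just that; the names PEAKS, percent_change and compare are ours for illustration and are not part of any NSS-HA tooling. Keep in mind that these are peak-to-peak changes, which are not the same thing as the averaged improvements quoted in the Performance Improvement section.

```python
# Peak sequential throughput in MB/sec (standard configuration), from Table 1.
PEAKS = {
    "NSS2-HA": {"write": 1275, "read": 2430},
    "NSS3-HA": {"write": 1495, "read": 2127},
    "NSS4-HA": {"write": 1535, "read": 4058},
}


def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


def compare(old_release, new_release):
    """Return {operation: percent change in peak throughput}, rounded to 0.1."""
    old, new = PEAKS[old_release], PEAKS[new_release]
    return {op: round(percent_change(old[op], new[op]), 1) for op in old}


if __name__ == "__main__":
    for old, new in [("NSS2-HA", "NSS3-HA"), ("NSS3-HA", "NSS4-HA")]:
        print(old, "->", new, compare(old, new))
```

Going from NSS3-HA to NSS4-HA, this gives roughly a 91 percent gain in peak read and under 3 percent in peak write, which lines up with the observation that most of the sequential improvement is on the read side.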

Why did we use Dell PowerEdge R620 servers?

The biggest change in the NSS4-HA release is the NFS server itself: it is now the new Dell 12th generation PowerEdge R620, while the two previous releases used the Dell 11th generation PowerEdge R710.

The Dell NSS-HA solution is designed to provide storage service to HPC clusters. Besides providing high availability and reliability, a storage solution for HPC clusters must also deliver excellent I/O performance. Compared to the PowerEdge R710, the PowerEdge R620 uses current state-of-the-art technologies to enhance network and disk I/O processing capabilities. Its key features are listed below. They make the PowerEdge R620 a better performing platform, and a better performing NFS server in NSS-HA, than the PowerEdge R710:

  • Faster processor: The PowerEdge R620 is equipped with the new Intel Xeon E5-2680 processor, which provides faster processing speed and more cores than the Xeon E5630 used in the PowerEdge R710.
  • Larger capacity and faster memory: With this release of the NSS-HA solution, the NFS server is equipped with 128 GB of memory running at 1600 MT/s versus 96 GB of 1333 MT/s memory in the previous solution. Larger memory size and higher frequency are critical to server performance.  
  • Faster internal connection: The Intel QuickPath Interconnect (QPI) runs at 8.0 GT/s, compared to the 5.86 GT/s supported with the Intel Xeon E5630 in the PowerEdge R710.
  • Faster InfiniBand link: The PowerEdge R620 supports a PCIe Gen 3 based fourteen data rate (FDR) card, which can provide a bandwidth of up to 56 Gb/sec. The PowerEdge R710 supports only PCIe Gen 2 speeds and uses quad data rate (QDR) links, which have a maximum bandwidth of 40 Gb/sec.
  • Smaller form factor: The PowerEdge R620 is a 1U rack server, while the PowerEdge R710 is a 2U rack server. That translates into a denser solution with this release of the NSS-HA solution.
  • Onboard 10 GbE option: For clusters that require 10 GbE connectivity, the PowerEdge R620 can support an onboard 10 Gb Ethernet network daughter card, which frees a PCIe slot in the NFS server.
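
To put the link speeds in the list above in perspective, the Python sketch below converts nominal signaling rates into usable data rates. The 56 Gb/sec and 40 Gb/sec figures come from this article; the line-encoding overheads (8b/10b for QDR InfiniBand and PCIe Gen 2, 64b/66b for FDR InfiniBand, 128b/130b for PCIe Gen 3), the per-lane PCIe rates and the x8 slot width are our assumptions based on the published standards, not numbers from the NSS-HA documentation. Protocol overhead is ignored, so real NFS throughput will be lower than these ceilings.

```python
def data_rate_gbps(signal_gbps, payload_bits, coded_bits):
    """Usable data rate in Gb/s after line-encoding overhead."""
    return signal_gbps * payload_bits / coded_bits


def to_mb_per_sec(gbps):
    """Convert Gb/s to MB/s (decimal units, 8 bits per byte)."""
    return gbps * 1000 / 8


# name: (nominal signaling rate in Gb/s, payload bits, coded bits)
# Encodings and the x8 PCIe width are assumptions, see the text above.
LINKS = {
    "QDR InfiniBand": (40.0, 8, 10),
    "FDR InfiniBand": (56.0, 64, 66),
    "PCIe Gen 2 x8": (5.0 * 8, 8, 10),
    "PCIe Gen 3 x8": (8.0 * 8, 128, 130),
}


def usable_mb_per_sec(name):
    """Theoretical data-rate ceiling of a link in MB/s, rounded."""
    signal, payload, coded = LINKS[name]
    return round(to_mb_per_sec(data_rate_gbps(signal, payload, coded)))


if __name__ == "__main__":
    for name in LINKS:
        print(f"{name}: about {usable_mb_per_sec(name)} MB/s")
```

Under these assumptions a QDR link tops out at about 4000 MB/s of data and an FDR link at roughly 6800 MB/s, so the 4058 MB/sec peak read reported for NSS4-HA would not fit on a single QDR link.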

Performance Improvement

Thanks to the many powerful features of the PowerEdge R620, the current NSS-HA release provides a significant I/O performance improvement over NSS3-HA:

  • Sequential read/write performance: about 75 percent increase on average; most of the improvement is with sequential reads. The write performance does not change much between the current and previous release, because the disk drives in the storage subsystem are configured with RAID 6 and RAID 6 write performance is largely determined by the storage system itself.
  • Random read/write performance: about 17 percent increase for random writes and 23 percent increase for random reads on average.
  • Metadata operation performance: more than 20 percent increase on average for file create, stat, and remove operations.

The following figures compare NSS3-HA and NSS4-HA. Note: NSS3-HA and NSS4-HA have exactly the same storage subsystem.

Figure 1. IPoIB large sequential write performance: NSS4-HA vs. NSS3-HA

Figure 2. IPoIB large sequential read performance: NSS4-HA vs. NSS3-HA

Figure 3. IPoIB random write performance: NSS4-HA vs. NSS3-HA

Figure 4. IPoIB random read performance: NSS4-HA vs. NSS3-HA

Figure 5. IPoIB file create performance: NSS4-HA vs. NSS3-HA

Figure 6. IPoIB file stat performance: NSS4-HA vs. NSS3-HA

Figure 7. IPoIB file remove performance: NSS4-HA vs. NSS3-HA

For detailed information about the Dell NSS4-HA solution, please refer to “Dell HPC NFS Storage Solution High Availability (NSS-HA) Configurations with Dell PowerEdge 12th Generation Servers.” For the detailed Dell NSS4-HA configuration guide, please refer to the attachment to this blog post.