Scott Collier
I’m going to be writing a series of blog posts about the new Dell | Terascala HPC Storage Solution. In Part I, I’ll give a high-level overview of the solution and how we have it configured in our lab. I’ll also cover some of the benchmark data we acquired as part of our validation process. Later posts in the series will discuss the benchmarks and performance in more depth.
In the world of HPC (High Performance Computing), storage I/O is the bottleneck for many applications. Dell has partnered with Terascala in an attempt to alleviate this bottleneck. The solution we will be talking about today is the Dell | Terascala HPC Storage Solution.


The Dell | Terascala HPC Storage Solution is an appliance-oriented, scalable solution with several key features:

  • Pre-configured, highly available OSSs (Object Storage Servers) and MDSs (Metadata Servers)
  • Pre-configured Lustre file system
  • Pre-configured MD3000 storage array
  • GUI for centralized administration, including:
      • File system status
      • File system space usage
      • File system throughput
      • Failover capabilities
      • File system mount / unmount capabilities
      • And more
  • SNMP alerts for hardware failures
  • Boot-from-SAN capabilities

In our lab, the Dell | Terascala HPC Storage Solution is configured as follows:
Compute Node Software / Hardware:

  • Platform PCM 1.2a
  • Platform OFED 1.4.2
  • RHEL 5.3
  • Kernel 2.6.18-128.7.1.el5
  • Lustre 1.8.2
  • Mellanox ConnectX-{1,2} QDR HCAs

Object and Metadata Server Software / Hardware:

  • Lustre 1.6.7.1
  • Mellanox InfiniHost III DDR
  • Full remote management

The diagram below is a visual representation of our cluster setup. Here’s the hardware configuration:

  • 64 Dell R410 compute nodes, each with 24GB of memory
  • 1 Dell R710 front-end node
  • 2 Dell PC6248 Ethernet leaf switches
  • 1 Dell PC8024F 10Gb/s Ethernet core switch
  • 1 QLogic 12800 QDR IB switch
  • 1 QLogic 9024 DDR IB switch (uplinked to the QDR switch via 5 QDR-DDR cables)
  • 2 Terascala MDS servers (connected to 1 MD3000)
      • Connected to the DDR IB fabric
      • 8GB of memory per server
  • 2 Terascala OSS servers (connected to 2 MD3000s)
      • Connected to the DDR IB fabric
      • 8GB of memory per server
Terascala Lustre cluster set-up

The first set of benchmarks I am going to cover in this post is IOzone N-to-N sequential writes and reads. You can get IOzone here:
http://www.iozone.org/
For this test we run IOzone in “distributed” mode: a “master” IOzone process on the front-end node launches and communicates with IOzone processes on all of the compute nodes via SSH. Each compute node runs its test and reports results back to the master, which then calculates the aggregate throughput and reports it.
I created an IOzone client (host) file listing the compute nodes; the master can reach every client because our cluster middleware has already distributed SSH keys and provides DNS. A sketch of what such a client file looks like is shown below.
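
For reference, each line of the file passed to IOzone’s -+m option lists a client hostname, the directory on that client to run the test in, and the path to the IOzone binary on that client. A minimal example might look like the following (the hostnames and the /mnt/lustre path are placeholders, not our actual node names or mount point):

compute-00   /mnt/lustre/iozone   /usr/sbin/iozone
compute-01   /mnt/lustre/iozone   /usr/sbin/iozone
compute-02   /mnt/lustre/iozone   /usr/sbin/iozone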
One example IOzone command used to generate the following charts is:

/usr/sbin/iozone -i 0 -i 1 -+n -c -e -r 64k -s 48g -Rb /tmp/iozone_$$.wks \
-l 28 -u 28 -+m clientlist/clients >> /tmp/iozone_custom.log

These runs were scripted to automate generating the data shown below; a rough sketch of that kind of wrapper follows.
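Purely as an illustration (the client counts, file size, and paths here are assumptions for the sketch, not necessarily the exact values from our runs), a wrapper that sweeps the number of clients could look something like this:

#!/bin/bash
# Hypothetical wrapper: run the IOzone write (-i 0) and read (-i 1) tests
# for an increasing number of clients and collect the aggregate results.
#   -+n : no retests           -c / -e : include close() and fsync() in timing
#   -r  : record (block) size  -s      : file size per client
#   -Rb : Excel-style report   -l / -u : min / max number of clients
#   -+m : client list file (host, working dir, path to iozone per line)
CLIENTS=clientlist/clients
LOG=/tmp/iozone_custom.log

for n in 1 2 4 8 16 24 28; do
    /usr/sbin/iozone -i 0 -i 1 -+n -c -e -r 64k -s 48g \
        -Rb /tmp/iozone_${n}clients_$$.wks \
        -l $n -u $n -+m $CLIENTS >> $LOG
done

Each run appends its aggregate numbers to the log, which makes it easy to pull out the throughput-versus-client-count series plotted below.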

Here are the results of the IOzone sequential writes:
IOzone Sequential Writes - No Striping

Some thoughts on the above results:
1. The Dell | Terascala HPC Storage Solution achieves a peak write performance of ~1300MB/s at 8 clients.
2. The block sizes we use do not introduce much variance in write performance.
3. Performance drops off beyond 24 clients because the aggregate access pattern at the storage becomes more random than sequential as more clients are added.

Here are the results of the IOzone sequential reads:
IOzone Sequential Reads - No Striping

Some thoughts on the above results:
1. The Dell | Terascala HPC Storage Solution achieves a peak read performance of ~1400MB/s.
2. The block sizes we use do introduce a bit of variance in read performance.
3. The saturation point is the same for all block sizes.
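
A quick note on the “No Striping” label in the chart titles: the files were not striped across multiple OSTs, i.e. a Lustre stripe count of 1. If you want to verify or set that layout on a Lustre client, the lfs utility does it; the directory path here is just a placeholder:

lfs getstripe /mnt/lustre/iozone        # show the current stripe count and size
lfs setstripe -c 1 /mnt/lustre/iozone   # stripe count of 1 = the "no striping" case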
This concludes Part I of the “Dell | Terascala HPC Storage Solution” series. Next time I will discuss metadata testing with mdtest. Please send me a note if you are interested in seeing any particular type of performance analysis of the Dell | Terascala HPC Storage Solution.

Read Part 2 Here

-- Scott Collier