December 2016 – HPC Innovation Lab
Building a balanced cluster ecosystem without bottlenecks requires powerful, dense server nodes that support parallel computing. The challenge is to deliver maximum compute power together with an efficient I/O subsystem, including memory and networking. Demand for this level of compute comes from traditional workloads as well as emerging ones, such as advanced parallel algorithms in research, life sciences and financial applications.
Dell PowerEdge C6320p
The Dell EMC PowerEdge C6320p, one of the densest platforms in Dell EMC’s HPC solutions and the one with the highest maximum core count, is a leap in this direction.
The PowerEdge C6320p is Dell EMC’s first self-bootable Intel Xeon Phi platform. Earlier versions of Intel Xeon Phi were PCIe adapters that had to be plugged into a host system. The processor supports up to 72 cores, each with two vector processing units capable of AVX-512 instructions. This raises floating-point throughput for code that can use the wider vectors; Intel Xeon® v4 processors, by comparison, support up to AVX2. The Intel Xeon Phi in the C6320p also features 16GB of fast on-package MCDRAM. The MCDRAM helps keep the out-of-order cores fed in applications that are sensitive to memory bandwidth, and it comes in addition to the six channels of DDR4 memory hosted on the server. As a single-socket server, the C6320p also offers a lower-power compute node than the traditional two-socket nodes used in HPC.
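As a rough illustration of what the core count and the two AVX-512 units per core mean for peak throughput, the sketch below multiplies out the figures quoted in this post. It assumes each vector lane retires one fused multiply-add (two floating-point operations) per cycle and uses the nominal 1.5GHz clock of the Xeon Phi 7290; sustained clocks under heavy vector load are typically lower, so treat the result as an upper bound rather than a measured number. The function name `peak_dp_gflops` is made up for this example.

```python
# Back-of-the-envelope peak double-precision throughput for one Xeon Phi 7290.
# Assumptions (not stated in the post): one FMA per lane per cycle, nominal clock.

def peak_dp_gflops(cores, ghz, vpus_per_core, vector_bits, flops_per_lane=2):
    """Peak DP GFLOPS = cores * GHz * VPUs * DP lanes * flops per lane per cycle."""
    dp_lanes = vector_bits // 64  # number of 64-bit doubles in one vector register
    return cores * ghz * vpus_per_core * dp_lanes * flops_per_lane

# 72 cores, 1.5 GHz, two 512-bit vector units per core (figures from the post)
phi_7290 = peak_dp_gflops(cores=72, ghz=1.5, vpus_per_core=2, vector_bits=512)
print(f"Xeon Phi 7290 nominal peak: {phi_7290:.0f} GFLOPS")
```

Under these assumptions the result is 3456 GFLOPS, or roughly 3.5 TFLOPS per single-socket sled.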
The following table compares the current Dell EMC PowerEdge C6320 and Dell EMC PowerEdge C6320p server offerings for HPC.
Server Form Factor
2U Chassis with four sleds
Intel® Xeon Phi
Max cores in a sled
Up to 44 physical cores, 88 logical cores
(with 2 x Intel® Xeon E5-2699 v4, 2.2GHz, 55MB, 22 cores, 145W)
Up to 72 physical cores, 288 logical cores
(with the Intel® Xeon Phi Processor 7290, 16GB, 1.5GHz, 72 cores, 245W)
Theoretical DP Flops per sled
16 DDR4 DIMM slots
6 DDR4 DIMM slots +
on-package 16GB MCDRAM
MCDRAM BW (Memory mode)
~ 475-490 GB/s
~ 135 GB/s
Dual port 1Gb/10GbE
Single port 1GbE
Intel Omni-Path Fabric (100Gbps)
Mellanox Infiniband (100Gbps)
Intel Omni-Path Fabric (100Gbps)
On-board Mellanox Infiniband (100Gbps)
Up to 24 x 2.5” or 12 x 3.5” HD
6 x 2.5” HD per node +
Internal 1.8” SSD option for boot
Integrated Dell EMC Remote Access Controller
Dedicated and shared iDRAC8
Table 1: Comparing the C6320 and C6320p offerings in HPC
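To make the density comparison in Table 1 more concrete, here is a small sketch that derives a few per-sled and per-chassis figures from the table values. It assumes the "2U chassis with four sleds" form factor applies to both platforms, and it uses processor TDP only, which is not the same as measured node power. The `Sled` class and its field names are hypothetical helpers for this example.

```python
from dataclasses import dataclass

SLEDS_PER_CHASSIS = 4  # assumption: four sleds per 2U chassis on both platforms


@dataclass
class Sled:
    name: str
    sockets: int
    cores_per_socket: int
    logical_cores: int
    tdp_watts_per_socket: int  # processor TDP only, not whole-node power

    @property
    def physical_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def threads_per_core(self) -> int:
        return self.logical_cores // self.physical_cores

    @property
    def cpu_watts_per_core(self) -> float:
        return self.sockets * self.tdp_watts_per_socket / self.physical_cores


# Values taken from the "Max cores in a sled" row of Table 1
c6320 = Sled("C6320 (2 x Xeon E5-2699 v4)", 2, 22, 88, 145)
c6320p = Sled("C6320p (Xeon Phi 7290)", 1, 72, 288, 245)

for sled in (c6320, c6320p):
    print(
        f"{sled.name}: {sled.physical_cores} cores/sled, "
        f"{sled.physical_cores * SLEDS_PER_CHASSIS} cores/chassis, "
        f"{sled.threads_per_core} threads/core, "
        f"{sled.cpu_watts_per_core:.2f} W TDP per core"
    )
```

On these inputs the C6320p packs 288 physical cores into a 2U chassis against 176 for the C6320, at roughly half the processor TDP per core. Per-core performance differs considerably between the two architectures, so this is a density comparison, not a performance one.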
Dell EMC Supported HPC Solution:
Dell EMC offers a complete, tested and validated solution on the C6320p servers. It is based on Bright Cluster Manager 7.3 with RHEL 7.2, including specific, highly recommended kernel and security updates, and it will also support the upcoming RHEL 7.3 operating system. The solution provides automated deployment, configuration, management and monitoring of the cluster. It also integrates recommended Intel performance tweaks, along with the required software drivers and development toolkits that support the Intel Xeon Phi programming model.
The solution provides the latest networking support for both InfiniBand and Intel Omni-Path Fabric. It also bundles Dell EMC-supported systems management tools that make cluster management on Dell EMC hardware easier for customers.
*Note: As a continuation of this blog, follow-on micro-level benchmarking and application studies on the C6320p will be published.