The Dell OpenManage team has published iDRAC Service Module version 2.1 for Ubuntu and Debian. Builds are available for Ubuntu 12.04 and 14.04 as well as Debian Wheezy and Jessie. It is available via the apt repositories at <http://linux.dell.com/repo/community/ubuntu/>. Please refer to the whitepaper at <http://en.community.dell.com/techcenter/extras/m/white_papers/20441098> for details.
by Ashish Kumar Singh
This blog explores performance analysis of the WRF (Weather Research and Forecasting) model on a cluster of PowerEdge R730 servers with Intel Xeon Phi 7120P coprocessors. All the runs were carried out with Hyper-Threading (logical processors) disabled.
The WRF (Weather Research and Forecasting) model is a next-generation mesoscale numerical weather prediction system designed to serve both atmospheric research and operational forecasting needs. The model serves a wide range of meteorological applications across scales from tens of meters to thousands of kilometers. WRF can generate atmospheric simulations based on real data (observations, analyses) or idealized conditions.
Test Cluster Configuration:
The test cluster consisted of four PowerEdge R730 servers with two Intel Xeon Phi 7120P coprocessors each. Each PowerEdge R730 had two Intel Xeon E5-2695 v3 CPUs at 2.3GHz and eight 16GB DIMMs at 2133MHz, for a total of 128GB of memory. Each PowerEdge R730 had one Mellanox FDR InfiniBand HCA card in the low-profile x8 PCIe Gen3 slot (linked to CPU2).
Compute node configuration
The BIOS options selected for this blog were as below:
WRF performance analysis was run with the Conus-2.5km data set. Conus-2.5km is a single, large domain covering the continental US at 2.5 km resolution. The benchmark covers the final 3 hours of the simulation, hours 3-6, starting from a provided restart file. It may also be run for the full 6 hours starting from a cold start.
All the runs on the CPUs with Intel Xeon Phi configuration were performed in symmetric mode. For the single-node CPUs-only configuration, the average time was 7.425 seconds. With CPUs and two Intel Xeon Phi coprocessors, the average time was 6.093 seconds, an improvement of 1.2 times. With a two-node cluster of CPUs and Intel Xeon Phi, the average time was 2.309 seconds, an improvement of 3.2 times. For a four-node cluster of CPUs and Intel Xeon Phi, the performance improvement increased to 5.7 times.
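The improvement factors quoted above are ratios of average times against the single-node CPUs-only run. A minimal sketch of that arithmetic, using only the timings given in this post (the function and variable names are illustrative and not part of WRF):

```python
# Average times in seconds, as quoted in this post.
cpu_only_1node = 7.425   # one node, CPUs only
cpu_phi_1node = 6.093    # one node, CPUs + two Intel Xeon Phi
cpu_phi_2node = 2.309    # two nodes, CPUs + Intel Xeon Phi


def speedup(baseline_s, measured_s):
    """Speedup relative to a baseline: a lower time means a higher speedup."""
    if baseline_s <= 0 or measured_s <= 0:
        raise ValueError("times must be positive")
    return baseline_s / measured_s


wrf_speedups = {
    "1 node, CPUs + 2 Phi": speedup(cpu_only_1node, cpu_phi_1node),
    "2 nodes, CPUs + Phi": speedup(cpu_only_1node, cpu_phi_2node),
}

for label, value in wrf_speedups.items():
    print(f"{label}: {value:.1f}x")
```

The four-node average time is not quoted in the post, so it is left out of the sketch rather than back-computed from the 5.7 times figure.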
The power consumption analysis for WRF with the Conus-2.5KM benchmark is shown below. On a single node with the CPUs-only configuration, the power consumption was 395.4 watts. With CPUs and one Intel Xeon Phi, power consumption was 526.3 watts, while with CPUs and two Intel Xeon Phi coprocessors it was 688.2 watts.
The results showed that power consumption increased as Intel Xeon Phi coprocessors were added. However, they also showed an increase in performance per watt on the order of 2.6 times for the configuration with CPUs and two Intel Xeon Phi coprocessors.
The configuration of CPUs with Intel Xeon Phi 7120P showed sustained performance and power-efficiency gains in comparison to the CPUs-only configuration. With two Intel Xeon Phi 7120Ps, WRF with the Conus-2.5KM benchmark showed a 1.2-fold performance increase, and performance per watt improved by more than 2.6 times, resulting in a powerful, easy-to-use and energy-efficient HPC platform.
This blog explores the application performance analysis of NAMD (NAnoscale Molecular Dynamics) for large data sets on a cluster of PowerEdge R730 servers with Intel Xeon Phi 7120Ps. All the runs were carried out with Hyper-Threading (logical processors) disabled. The IB verbs version of NAMD was used for all the runs.
The test cluster consisted of four PowerEdge R730 servers with two Intel Xeon Phi 7120P coprocessors each. Each PowerEdge R730 had two Intel Xeon E5-2695 v3 CPUs at 2.3GHz and eight 16GB DIMMs at 2133MHz, for a total of 128GB of memory per server. Each PowerEdge R730 had one Mellanox FDR InfiniBand HCA card in the low-profile x8 PCIe Gen3 slot (linked to CPU2).
Compute node configuration
The BIOS options selected for this blog are as below:
NAMD (NAnoscale Molecular Dynamics) is a parallel, object-oriented simulation package written using the Charm++ parallel programming model and designed for high-performance simulation of large biomolecular systems. Charm++ is designed to simplify parallel programming and also provides automatic load balancing, which is crucial to the performance of NAMD.
All the runs with the STMV (virus) benchmark used the ibverbs version of NAMD. The performance analysis with the STMV benchmark is shown below. STMV (Satellite Tobacco Mosaic Virus) is a small, icosahedral plant virus. On a single node, we observed a performance improvement of 2.5 times with the CPUs and Intel Xeon Phi configuration in comparison to the CPUs-only configuration.
STMV showed performance of 0.2ns/day with the CPUs-only configuration. With CPUs and two Intel Xeon Phi coprocessors, performance was 0.5ns/day, an increase of 2.5 times. On a four-node cluster with CPUs and Intel Xeon Phi 7120Ps, the performance increase was 8.5 times. Scaling from one node to four nodes resulted in almost 3.5 times scale-up.
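The scale-up figure can be cross-checked from the other numbers in this paragraph. The sketch below assumes that the 8.5 times figure is measured against the single-node CPUs-only run; the variable names and the parallel-efficiency calculation are illustrative additions, not results reported in the post.

```python
# Figures quoted in this post for the STMV benchmark.
cpu_only_ns_per_day = 0.2   # one node, CPUs only
cpu_phi_ns_per_day = 0.5    # one node, CPUs + two Intel Xeon Phi
four_node_speedup = 8.5     # assumption: relative to one CPUs-only node
nodes = 4

# ns/day is a throughput metric (higher is better), so speedup = new / old.
single_node_speedup = cpu_phi_ns_per_day / cpu_only_ns_per_day

# Scale-up going from one accelerated node to four accelerated nodes.
scale_up = four_node_speedup / single_node_speedup

# Parallel efficiency: share of ideal linear scaling that was achieved.
parallel_efficiency = scale_up / nodes

print(f"single-node speedup: {single_node_speedup:.1f}x")
print(f"scale-up, 1 -> 4 nodes: {scale_up:.1f}x")
print(f"parallel efficiency: {parallel_efficiency:.0%}")
```

Under that assumption the scale-up works out to about 3.4 times, which matches the "almost 3.5 times" reported.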
The power analysis was done on a single node for three configurations: CPUs only, CPUs with one Intel Xeon Phi 7120P, and CPUs with two Intel Xeon Phi 7120Ps. With CPUs and two Intel Xeon Phi coprocessors, power consumption increased, but so did performance per watt, which was 2.4 times that of the CPUs-only configuration. The power-efficiency increase is shown in the picture below.
With CPUs and two Intel Xeon Phi 7120Ps, the STMV benchmark demonstrated an increase of 2.5 times in performance and 2.4 times in power efficiency when compared to the CPUs-only configuration, resulting in a powerful and energy-efficient HPC platform.
This blog explores HPL (High Performance LINPACK) performance and power analysis on an Intel Xeon Phi 7120P cluster with current-generation PowerEdge R730 servers. All the runs were carried out with Hyper-Threading (logical processors) disabled.
Compute node configuration
High Performance LINPACK is a benchmark that solves a (random) dense linear system in double-precision (64-bit) arithmetic on distributed-memory systems. HPL was run with a block size of NB=192 for the CPU-only configuration and NB=1280 for the Intel Xeon Phi (offload) configuration. The problem sizes were N=118272 for single-node, N=172032 for two-node and N=215040 for four-node cluster runs (each with NB=1280).
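HPL problem sizes are normally chosen so that the N x N double-precision matrix fills most of the available memory. The sketch below estimates how much memory each quoted problem size needs. It assumes 128GB of memory per node, as described for the R730 test cluster in the companion posts, and treats that as 128 GiB; the helper names are illustrative.

```python
BYTES_PER_DOUBLE = 8     # HPL works in 64-bit floating point
GIB = 1024 ** 3
NODE_MEMORY_GIB = 128    # assumption: eight 16GB DIMMs per node, counted as GiB

# Problem sizes quoted in this post, keyed by node count.
problem_sizes = {1: 118272, 2: 172032, 4: 215040}


def matrix_gib(n):
    """Size in GiB of an n x n matrix of 64-bit floating-point values."""
    return n * n * BYTES_PER_DOUBLE / GIB


def memory_fraction(n, nodes):
    """Share of the cluster's total memory occupied by the matrix."""
    return matrix_gib(n) / (nodes * NODE_MEMORY_GIB)


for nodes, n in problem_sizes.items():
    print(f"{nodes} node(s): N={n}, matrix {matrix_gib(n):.1f} GiB, "
          f"{memory_fraction(n, nodes):.0%} of memory")
```

Under these assumptions the matrix occupies roughly 81%, 86% and 67% of total memory for the one-, two- and four-node runs.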
Compared to the CPUs-only configuration, the acceleration was about 3 times with Intel Xeon Phi 7120Ps.
On a single node with CPUs only, the PowerEdge R730 achieved 802.09 GFLOPS, while with two 7120Ps it achieved 2.553 TFLOPS, so the 7120Ps provided a 3.26X performance increase. Similarly, the two-node and four-node runs demonstrated a performance increase of 3.25X.
The HPL power consumption analysis compares three configurations: CPUs only, CPUs with one Intel Xeon Phi, and CPUs with two Intel Xeon Phi coprocessors.
The power consumption of a single node with CPUs only was about 398.72 watts. With CPUs and two 7120Ps, it increased to 983.5 watts. The CPUs-only configuration therefore drew less power than the system with Intel Xeon Phi, but the performance per watt of the configuration with Intel Xeon Phi was 1.31 times that of the CPUs-only configuration.
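Performance per watt here is sustained GFLOPS divided by the power drawn. A minimal sketch using the rounded single-node figures quoted in this post (the dictionary layout and function name are illustrative):

```python
# Single-node HPL figures quoted in this post.
cpu_only = {"gflops": 802.09, "watts": 398.72}
cpu_two_phi = {"gflops": 2553.0, "watts": 983.5}   # 2.553 TFLOPS


def gflops_per_watt(run):
    """Energy efficiency of a run: sustained GFLOPS per watt drawn."""
    return run["gflops"] / run["watts"]


baseline_eff = gflops_per_watt(cpu_only)
phi_eff = gflops_per_watt(cpu_two_phi)
efficiency_gain = phi_eff / baseline_eff

print(f"CPUs only:      {baseline_eff:.2f} GFLOPS/W")
print(f"CPUs + two Phi: {phi_eff:.2f} GFLOPS/W")
print(f"gain:           {efficiency_gain:.2f}x")
```

With these rounded inputs the gain comes out at roughly 1.3 times, in line with the figure reported above.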
The Intel Xeon Phi 7120P showed sustained performance and power-efficiency gains in comparison to CPUs only. With two Intel Xeon Phi 7120Ps, the HPL benchmark showed a three-fold performance increase over CPUs only, and performance per watt improved to about 1.3 times that of CPUs only, resulting in a powerful and energy-efficient HPC platform.
This blog explores the application performance analysis of LAMMPS on a cluster of PowerEdge R730 servers with Intel Xeon Phi 7120Ps. All the runs were carried out with Hyper Threading (logical processors) disabled.
LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a classical molecular dynamics code capable of simulating solid-state materials (metals, semiconductors), soft matter (biomolecules, polymers) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso or continuum scale.
Compute node configuration
LAMMPS was run with the Rhodopsin benchmark, which simulates the movement of a protein in the retina that plays an important role in the perception of light. The protein sits in a solvated lipid bilayer and is modeled using the CHARMM force field with particle-particle particle-mesh long-range electrostatics and SHAKE constraints. The simulation was performed with 2,048,000 atoms at a temperature of 300K and a pressure of 1 atm. The results for one node, two nodes and four nodes are shown below. On one node with the CPUs-only configuration, the loop time was 66.5 seconds, while the configuration with CPUs and two Intel Xeon Phi 7120Ps had a loop time of 34.8 seconds, a performance increase of 1.9X. In comparison to CPUs only, CPUs plus coprocessors scaled from one node to four nodes showed a performance increase of 5.2X.
The LAMMPS power consumption analysis with the Rhodopsin benchmark is shown below. On a single node, the CPUs-only configuration consumed 442.4 watts, the configuration with CPUs and one coprocessor consumed around 423 watts, and the configuration with CPUs and two coprocessors consumed 450.8 watts.
All the LAMMPS runs on coprocessors used the auto-balance mode. Performance per watt was about two times higher with CPUs plus two coprocessors than with CPUs only.
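For a fixed workload, performance per watt is inversely proportional to the energy spent to finish the run, so the efficiency gain can be checked from the loop times and power figures quoted in this post. The sketch below assumes the quoted power is a steady average over the whole loop time; the names are illustrative.

```python
# Single-node Rhodopsin figures quoted in this post.
runs = {
    "CPUs only": {"loop_time_s": 66.5, "power_w": 442.4},
    "CPUs + two Phi": {"loop_time_s": 34.8, "power_w": 450.8},
}


def energy_joules(run):
    """Energy to solution, assuming the quoted power is a steady average."""
    return run["power_w"] * run["loop_time_s"]


baseline = runs["CPUs only"]
accelerated = runs["CPUs + two Phi"]

# Lower loop time is better, so speedup = baseline time / accelerated time.
time_speedup = baseline["loop_time_s"] / accelerated["loop_time_s"]

# Ratio of energy to solution, which equals the performance-per-watt gain.
energy_ratio = energy_joules(baseline) / energy_joules(accelerated)

print(f"speedup: {time_speedup:.1f}x")
print(f"energy to solution, CPUs only vs CPUs + two Phi: {energy_ratio:.1f}x")
```

Both ratios come out at about 1.9, consistent with the roughly two-fold gains reported.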
The Intel Xeon Phi 7120P cluster with Dell PowerEdge R730 servers showed a sustained performance increase of about two-fold. Power efficiency also increased by 2X with two Intel Xeon Phi 7120Ps in comparison to CPUs only, resulting in a powerful, energy-efficient HPC platform.
One of the highlights of the Supercomputing Conference is always the Student Cluster Competition. It's an opportunity to see some of the future superstars of our industry demonstrate their skill under some pretty intense (but fun) pressure!
The student cluster competition also provides the competitors with a variety of opportunities. For some undergrads it is their first chance to focus exclusively on HPC. For other students the competition affords them the opportunity to make important networking and mentoring connections.
College teams from around the world are already gearing up for the Student Cluster Competition at SC 15 this November in Austin. The hometown favorites are no doubt feeling the pressure to fourpeat! It's an exciting time for everyone involved - especially the students.
Threepeat winners from SC14 in New Orleans.
You can learn more about the Student Cluster Competition and its positive impact on the lives of participants in this video.
Many businesses pursue analytics projects with the idea that they can respond ever more effectively to dynamic customer, market and business demands. Thompson describes the critical factors of analytics agility necessary to make this happen.
Applied to the preponderance of connected monitoring, social psychology research suggests that increased public awareness of the Internet of Things around us will influence our daily choices for the better. Davis ruminates on this effect.
In her latest CMSWire article, Schloss notes that the very term, "Shadow IT," portends rogue employees and covert operations, but such negative connotations are unwarranted. Shadow IT will only grow larger in a world of advanced analytics, so she examines the common myths to explain why businesses should want Shadow IT to thrive.
One way to think of "advanced analytics" might be as "complex analytics," the kind that go beyond the traditional analytics employed to produce business intelligence. However, there is another meaning that is certainly apropos in the context of anticipating circumstances, predicting trends, and prescribing actions. In such future-facing scenarios, "advanced analytics" might suitably be thought of as "analytics in advance." This is where the analytics maturity model starts to make sense.
David Sweenor, Statistica product marketing manager in Dell Software's Information Management Group, provides a practical summary of this maturity model in his recent post, "5 ways to boost your business IQ." It is worth noting that the model he touts is layered like a pyramid, with each layer built upon the solid foundation of previous layers. There is no skipping ahead when it comes to maturity: every level of analytical maturity must be earned and learned in sequence.
Accordingly, Sweenor helpfully provides a quick overview of five advanced analytics techniques that should be evaluated by any organization seeking to build up its maturity: segmentation, decision trees, predictive models, text analytics, and optimization/simulation. And he describes some helpful Statistica case studies that prove the value of advanced analytics in real-world scenarios. Read David's post and see where you are in the model. You can also find more Statistica case studies under the "Resources" tab here.
I'm happy to announce we have an open beta for a brand new cartridge for Foglight for Virtualization, Enterprise Edition, built from the ground up for Citrix XenDesktop and XenApp 7.0 and higher running on VMware. We developed this cartridge to meet the needs of not only the virtual administrator who supports the Citrix VDI environment, but also the Citrix administrator who is responsible for the performance and availability of Citrix XenDesktop and XenServer 7.0 and higher running on VMware ESXi.
Whether the issue is a XenDesktop boot storm or XenApp screen rendering, the new Citrix cartridge for Foglight for Virtualization gives the Citrix administrator the power to analyze all aspects of the user experience, ranging from network latency to the impact of storage I/O. With this holistic approach to Citrix performance analysis, administrators can diagnose and resolve performance issues in seconds or minutes instead of hours or days. Whether users are logging in from a Windows client or a tablet, Foglight for Virtualization can show latency and NetScaler performance from a LAN and WAN perspective, as well as the performance of the virtual machines on which the desktops and applications are running.
Lastly, Foglight for Virtualization shows the end-to-end topology of the Citrix environment that adjusts dynamically as VDI sessions go online/offline or as VMs move within the data center.
If you are running Citrix XenDesktop and/or XenApp 7.0 and higher on VMware ESXi and want to participate in the beta, please contact Hassan Fahimi directly at firstname.lastname@example.org. Hassan can also be reached at 949 754 8415.
About John Maxwell
John Maxwell leads the Product Management team for Foglight at Dell Software. Outside of work he likes to hike, bike, and try new restaurants.
We recently teamed up with a leading provider of government market research to gain a deeper understanding of the VM landscape across federal agencies, and the results underscore the need to address common challenges they’re facing.
It turns out, the majority of the 150 respondents who participated in this study were most concerned with the security of their systems. In fact, 65 percent claimed security as their biggest challenge. Operational issues ― including performance monitoring, optimization, wasted resources, capacity management and the ability to efficiently manage both virtual and physical platforms ― were also cited as problematic, though not nearly as top of mind as security.
Yet another concern respondents noted was just how long it takes to detect and remediate problems. More than half of respondents admitted they generally need a day or more to address security concerns. Given the sensitive nature of much of the government’s data, that’s a chilling statistic.
The good news? Overcoming all these challenges doesn’t have to be nearly as hard or overwhelming as it often seems. With a solid virtual monitoring and management solution in place, it’s possible to quickly address and resolve every single one of these issues ― proactively and from a single interface.
So if you can relate to the norm among federal agencies, consider taking the lead on making a change. It’s a whole lot easier than dealing with problems after the fact. And acting early to get the right solution in place just might make you the silent hero who kept his organization out of the news. Because sometimes, flying under the radar is a reward in itself.
To see all the results from our study, check out the infographic.
And if you’re interested in rapidly and proactively taking charge of your virtual environment, get yourself a free trial of this solution.