by David Detweiler
Congratulations to Team South Africa on their second-place finish in the Student Cluster Competition at the International Supercomputing Conference (ISC) in Frankfurt, Germany, earlier this month. The students, who hail from the University of the Witwatersrand, narrowly missed three-peating as champions.
The team consisted of Ari Croock, James Allingham, Sasha Naidoo, Robert Clucas, Paul Osei Sekyere, and Jenalea Miller, with reserve team members Vyacheslav Schevchenko and Nabeel Rajab. Together, they represented the Centre for High Performance Computing (CHPC) at the competition.
The South African students competed against teams from seven other nations over a sleep-depriving three days. During the competition, the teams were tasked with designing and building their own small cluster computers and running a series of HPC benchmarks and applications. In addition, the students were asked to optimize four science applications, three of which were announced before the competition, with the fourth introduced during the event.
The competition was sponsored by ISC and the HPC Advisory Council. Each team was scored based on three criteria:
With young people like Team South Africa entering the field, the future of HPC looks brighter than ever. Congratulations on a job well done!
Scientific research has reaped the rewards offered by big data technologies. New insights have been discovered in a wide range of disciplines thanks to the collection, analysis and visualization of large data sets. In a recent series of articles, insideBigData examined some of the noteworthy benefits researchers are realizing when adopting big data technologies.
The results of big data adoption have been impressive. Just a few examples of how big data is being employed across a variety of disciplines include:
Data has afforded researchers tremendous opportunities. Researchers are collaborating across companies, disciplines and even continents in manners never before available to them. As with any rapid adoption of technology, some difficulties are to be expected. However, the benefits offered by big data far outweigh any potential issues.
Big data has already ushered in exciting scientific advancements across a myriad of disciplines. The benefits for researchers – and society – are just beginning.
You can learn more about Big Data and scientific research in this white paper.
The alignment between high performance computing (HPC) and big data has steadily gained traction over the past few years. As analytics and big data continue to be top of mind for organizations of all sizes and industries, traditional IT departments are considering HPC solutions to help provide rapid and reliable information to business owners so they can make more informed decisions.
This alignment is clearly seen in the increasing sales of hyper-converged systems. IDC predicts sales of these systems will increase 116% this year compared to 2014, reaching an impressive $807 million. This significant growth is expected to continue over the next few years. Indeed, the market is expected to experience a compound annual growth rate (CAGR) of nearly 60% from 2014 to 2019, at which time it will generate more than $3.9 billion in total sales.
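As a rough check on how these figures fit together, the sketch below works backward from the 2015 projection to an implied 2014 base, then compounds that base forward to 2019. It assumes the CAGR is exactly 60% (the text says "nearly 60%") and that a 116% increase means 2015 sales are 2.16 times the 2014 figure. The variable and function names are illustrative only.

```python
# Figures quoted in the text (IDC forecast); names here are illustrative.
sales_2015_musd = 807.0   # projected 2015 sales, in millions of USD
growth_2015 = 1.16        # a 116% increase over 2014
cagr = 0.60               # assumption: "nearly 60%" treated as exactly 60%
years = 2019 - 2014       # five compounding periods


def compound(base, rate, periods):
    """Value of `base` after growing at `rate` per period for `periods` periods."""
    return base * (1.0 + rate) ** periods


# A 116% increase means 2015 sales are 2.16 times the 2014 figure.
sales_2014_musd = sales_2015_musd / (1.0 + growth_2015)

# Compound the implied 2014 base forward five years at the assumed CAGR.
sales_2019_musd = compound(sales_2014_musd, cagr, years)

print(f"Implied 2014 sales: ${sales_2014_musd:.0f} million")
print(f"Implied 2019 sales: ${sales_2019_musd / 1000:.2f} billion")
```

Under these assumptions, the implied 2019 figure lands at roughly $3.9 billion, in line with the forecast quoted above.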
To meet this growing customer demand, more hyper-converged systems are being offered. For example, the latest offering in the 13th generation Dell PowerEdge server portfolio, the PowerEdge C6320, is now available. These types of solutions help organizations meet their increasingly demanding workloads by offering improved performance, power improvements and cost-efficient compute and storage. This allows customers to optimize application performance and productivity while conserving energy use and saving traditional datacenter space.
Among the top research organizations and enterprises utilizing the marriage between HPC and big data is the San Diego Supercomputer Center (SDSC). Comet, its new, recently deployed petascale supercomputer, leverages 27 racks of PowerEdge C6320 servers, totaling 1,944 nodes or 46,656 cores. This represents a five-fold increase in compute capacity versus the center's previous system. In turn, this affords SDSC the ability to provide HPC to a much larger research community. You can read more about Comet and how it is being used in this Q&A with Rick Wagner, SDSC’s high-performance computing systems manager. (LINK)
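The rack, node, and core counts quoted for Comet imply a regular layout. The short sketch below derives the per-rack and per-node figures from those three numbers alone. It assumes the nodes are spread evenly across racks and the cores evenly across nodes, which the text does not state explicitly.

```python
# Counts quoted in the text for Comet.
racks = 27
nodes = 1944
cores = 46656

# divmod returns (quotient, remainder); a zero remainder means an even split.
nodes_per_rack, leftover_nodes = divmod(nodes, racks)
cores_per_node, leftover_cores = divmod(cores, nodes)

print(f"Nodes per rack: {nodes_per_rack} (remainder {leftover_nodes})")
print(f"Cores per node: {cores_per_node} (remainder {leftover_cores})")
```

Both divisions come out even, giving 72 nodes per rack and 24 cores per node.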
Learn more about the PowerEdge C6320 here.
The democratization of HPC is under way. Removing the complexities traditionally associated with HPC, and focusing on making insightful data more easily accessible to a company’s users are the lynchpins to greater adoption of high performance computing for organizations beyond the more traditional groups.
HPC is no longer simply about crunching information. The science has evolved to include predicting and developing actionable insights. That is where the smaller, newer adopters uncover the true value of HPC.
However, these organizations can become overwhelmed by the amount, size, and types of information they’re collecting, storing, and analyzing. Increasingly, these enterprises are identifying HPC as an efficient and cost-effective solution to quickly glean valuable insights from their big data applications.
That cost-effective efficiency can yield impressive, measurable results. In just one example, Onur Celebioglu, Dell’s director of HPC & SAP HANA solutions, Engineered Solutions and Cloud, cited how HPC has allowed life sciences organizations using big data to slash genetic sequencing time from four days to just four hours per patient. That reduction has provided an untold improvement in treatment plans, which has bettered the lives of patients and their families.
Greater democratization also occurs when companies realize it is possible to leverage HPC, cloud, and big data to benefit their business without abandoning their existing systems. Having the ability to build onto an existing system as business needs warrant allows more organizations that otherwise couldn’t reap the benefits of HPC to do so.
You can read more about the democratization of HPC at EnterpriseTech.
Promising students in South Africa will now have an exciting new opportunity to gain greater, more in-depth experience in high performance computing (HPC). A partnership between the South African Department of Trade and Industry (DTI), the Centre for High Performance Computing (CHPC), and Dell Computers has resulted in a new IT academy.
The Khulisa IT Academy is slated to open in January 2016. Each year, it will play host to promising students from economically disadvantaged areas throughout the country. "Khulisa" translates as "nurturing" in the isiZulu language.
The purpose of the academy is to grow the skill set and experience of young South Africans pursuing careers in HPC. During their two-year terms at the academy, students will be able to marry the theoretical aspects of HPC they have learned in the classroom with real-life, practical experiences offered through various industry internships.
To allow the students to concentrate on their education and future professions, each will receive a stipend for the duration of their time at the academy. Upon graduation, these rising HPC stars will be ready to enter into careers in any number of industries.
Dell is honored to be able to play a small role in helping these worthy students. The company is investing financially in the academy, as well as offering startup funding for the ventures of students with proven entrepreneurial skills.
by Seth Feder
Genomics is no longer solely the domain of university research labs and clinical trials. Commercial entities such as tertiary care hospitals, cancer centers, and large diagnostics labs are now sequencing genomes. Perhaps ahead of the science, consumers are seeing direct marketing messages about genomic tumor assessments on TV. Not surprisingly, venture capitalists are looking for their slice of the pie, investing approximately $248 million in personalized medicine startups last year.
So how can health IT professionals get involved? As in the past, technology coupled with innovation (and the right use-case) can drive new initiatives to widespread adoption. In this case, genomic medicine has the right use-case and IT innovation is driving adoption.
While the actual DNA and RNA sequencing takes place inside very sophisticated instrumentation, sequencing is just one step in the process. The raw data has to be processed, analyzed, interpreted, reported, shared, and then stored for later use. Sound familiar? It should, because we have seen this before in fields such as digital imaging, which drove the widespread deployment of Picture Archiving and Communication Systems (PACS) in just about every hospital and imaging clinic around the world.
As with PACS, those in clinical IT must implement, operationalize, and support the workflow. The processing and analysis of genomic data is essentially a big data problem, solved by immense amounts of computing power. In the past, these resources were housed inside large, exotic supercomputers available only to elite institutions. But today, HPC built on scale-out x86 architectures with multi-core processors has made this power attainable to the masses, and thus democratized it. Parallel file systems that support HPC are much easier to implement and support, as are standard high-bandwidth InfiniBand and Ethernet networks. Further, public cloud is emerging as a supplement to on-premises computing power. Some organizations are exploring off-loading part of the work beyond their own firewall, either for added compute resources or as a location for long-term data storage.
For example, in 2012 others at Dell and I worked with the Translational Genomics Research Institute (TGen) to tune its system for genomics input/output demands by scaling its existing HPC cluster to include more servers, storage and networking bandwidth. This allowed researchers to get the IT resources they needed faster without having to depend on shared systems. TGen worked with the Neuroblastoma and Medulloblastoma Translational Research Consortium (NMTRC) to develop a methodology for fast sequencing of childhood cancer tumors, allowing NMTRC doctors to quickly identify appropriate treatments for young patients.
You can now get HPC systems pre-configured to work with genomic software toolsets, which enable clinical and translational research centers like TGen to do large-scale sequencing projects. The ROI and price per performance are compelling for anyone running heavy genomic workloads. Essentially, with one rack of gear, any clinical lab now has all the compute power needed to process and analyze multiple genome sequences per day, which is a clinically relevant pace.
Genomic medicine is here, and within a few years sequencing will become standard care for many diseases in order to determine proper treatment. As the science advances, the HPC community will be ready to contribute to making this a reality. You can learn more here.
by Tom Raisor
The San Diego Supercomputer Center (SDSC) at the University of California, San Diego has transitioned into the early operations stages of its new Comet supercomputer. When it is fully functioning, the new cluster will have an overall peak performance approaching two petaflops.
Comet has been designed as a solution for the "long tail" of science, which refers to the significant amount of research that is computationally-based, but modest-sized. Together, these projects represent a great amount of research and potential scientific impact. Much of this research is being conducted in disciplines that are new to high performance computing such as economics, genomics and social sciences.
The Comet cluster includes:
You can learn more about Comet and its mission to serve the long tail of science here.
Having the ability to quickly and effectively react to customer needs and market demands is invaluable to a business. Yet too many decision makers are stymied by a lack of useful insight into their data. However, agility and efficacy in analytics are possible. With the right mindset, tools and technologies, organizations can become much more adroit about how they use the power of analytics to improve decision making.
A recent survey indicated that an impressive 61% of organizations around the globe have data waiting to be processed. Unfortunately, a mere 39% felt they understood how to extrapolate the value from that data.
In order to unlock the value found in data, organizations must have:
The analytics tools needed to drive fast and flexible business decisions are available. However, it also takes the right mindset for the power of analytics to improve decision making.
You can read more about what IT decision makers are thinking about a variety of data-related topics here.
When it comes to processing big data, Hadoop has become the go-to platform. It allows vast amounts of data, especially unstructured or very diverse data, to be quickly processed. As the de facto open source parallel file system for HPC environments, Lustre provides compute clusters with efficient storage and fast access to large data sets. Together these technologies help to solve big data problems. However, they also present some disadvantages, including a need for HTTP calls, added overhead, reduced efficiency, slower speed, and a requirement for fairly large local storage on each Hadoop node.
There is, however, a way to overcome those obstacles. As a Hadoop software adaptor, Intel Enterprise Edition for Lustre (IEEL) provides direct access to Lustre during MapReduce computations, improving performance.
A presentation by J. Mario Gallegos at the recent LUG 15 conference highlighted some of the advantages gained and some of the best practices to follow when adding IEEL.
Among the advantages observed:
You can read about Mario's other findings and see his LUG presentation here.
by Ashish Kumar Singh
This blog explores the performance of the WRF (Weather Research and Forecasting) model on a cluster of PowerEdge R730 servers with Intel Xeon Phi 7120P coprocessors. All the runs were carried out with Hyper-Threading (logical processors) disabled.
The WRF (Weather Research and Forecasting) model is a next-generation mesoscale numerical weather prediction system designed to serve both atmospheric research and operational forecasting needs. The model serves a wide range of meteorological applications across scales from tens of meters to thousands of kilometers. WRF allows atmospheric simulations to be generated from real data (observations, analysis) or idealized conditions.
Test Cluster Configuration:
The test cluster consisted of four PowerEdge R730 servers with two Intel Xeon Phi 7120P coprocessors each. Each PowerEdge R730 had two Intel Xeon E5-2695v3 CPUs @ 2.3GHz and eight 16GB DIMMs at 2133MHz, for a total of 128GB of memory. Each PowerEdge R730 also had one Mellanox FDR InfiniBand HCA card in the low-profile x8 PCIe Gen3 slot (linked with CPU2).
Compute node configuration
The BIOS options selected for this blog were as below:
The WRF performance analysis was run on the Conus-2.5km data set. Conus-2.5km is a single, large domain covering the continental US at 2.5 km resolution. The benchmark runs the final 3 hours of the simulation (hours 3-6), starting from a provided restart file. It may also be run for the full 6 hours, starting from a cold start.
All the runs on configurations with CPUs and Intel Xeon Phi coprocessors were performed in symmetric mode. For the single-node, CPUs-only configuration, the average time was 7.425 seconds. With CPUs and two Intel Xeon Phi coprocessors on a single node, the average time was 6.093 seconds, an improvement of 1.2 times. With a two-node cluster of CPUs and Intel Xeon Phi coprocessors, the average time was 2.309 seconds, an improvement of 3.2 times. For a four-node cluster of CPUs and Intel Xeon Phi coprocessors, the performance improvement increased to 5.7 times.
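The improvement factors above are ratios of the single-node, CPUs-only average time to each configuration's average time. The sketch below recomputes them from the timings quoted in the text. The dictionary labels and function name are illustrative. The four-node time was not reported, so the value shown for it is only what a 5.7x improvement would imply, not a measurement.

```python
# Average times quoted in the text, in seconds.
baseline_s = 7.425  # single node, CPUs only

timings_s = {
    "1 node, CPUs + 2 Xeon Phi": 6.093,
    "2 nodes, CPUs + Xeon Phi": 2.309,
}


def speedup(baseline, measured):
    """Improvement factor: how many times faster than the baseline."""
    return baseline / measured


speedups = {name: speedup(baseline_s, t) for name, t in timings_s.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")

# The four-node time is not given in the text; this is the value a
# 5.7x improvement would imply, derived here for illustration only.
implied_four_node_s = baseline_s / 5.7
print(f"4 nodes (implied): {implied_four_node_s:.3f} s")
```

Rounded to one decimal place, the computed factors match the 1.2 and 3.2 figures reported above.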
The power consumption analysis for WRF with the Conus-2.5km benchmark is shown below. On a single node with the CPUs-only configuration, the power consumption was 395.4 watts. With CPUs and one Intel Xeon Phi coprocessor, power consumption was 526.3 watts, while with CPUs and two Intel Xeon Phi coprocessors, it was 688.2 watts.
The results showed that power consumption increased with the addition of Intel Xeon Phi coprocessors. However, they also showed an increase in performance per watt on the order of 2.6 times for the configuration with CPUs and two Intel Xeon Phi coprocessors.
The configuration of CPUs with Intel Xeon Phi 7120P coprocessors showed sustained performance and power-efficiency gains in comparison to the CPUs-only configuration. With two Intel Xeon Phi 7120Ps, WRF with the Conus-2.5km benchmark showed a 1.2-fold performance increase, and performance per watt improved by more than 2.6 times, resulting in a powerful, easy-to-use and energy-efficient HPC platform.