Written by J Tamilarasan
DellEMC has been building and researching certified and tested Oracle platform solutions for over 20 years, with leadership in x86 technologies and clustered implementations. Over the years, our own internal usage of Oracle database and application software has translated that relationship into significant benefits for our customers. Technical certification is a foregone conclusion, with Dell servers listed on the Oracle Hardware Certification Lists (HCL) and on the Oracle Validated Configurations list (VC) that are all qualified to run Oracle Linux and Oracle VM. But certification goes far beyond these base levels. Dell offers Tested and Validated solutions for other operating environments, with complete guidance for deployment and configuration. These qualifications serve as a reassurance to customers that their configuration has been tested and is enterprise-ready.
Oracle’s latest server virtualization product delivers many important new features and enhancements to enable rapid enterprise application deployment throughout public and private cloud infrastructure. The new release continues expanding support for both Oracle and non-Oracle workloads - providing customers and partners with additional choices and interoperability - including the capability to enable OpenStack support.
The links below point to the DellEMC hardware specifications that are certified and published by Oracle.
OVM 3.4.4 support for DellEMC PowerEdge 14G server.
Dell EMC’s new PowerEdge 14G servers are highly scalable and performance optimized so they are a good fit for both traditional and cloud-native workloads. The new PowerEdge servers feature automation to increase productivity and simplify lifecycle management. PowerEdge users can use Quick Sync 2 to manage the servers through mobile devices (Android and iOS).
OVM 3.4.4 on PowerEdge 14G servers will be validated and certified against the hardware specifications by the DellEMC Custom Solutions Engineering team, which validates and certifies OVM and OVS.
OVM 3.4.4 supports PowerEdge RAID Controller 9 (PERC 9) on PowerEdge 14G servers, but not PERC 10. Installing OVM 3.4.4 on DellEMC PowerEdge 14G requires updating to the most recent kernel, after which you may encounter the following errors:
1. The storage devices cannot be found during installation because the PERC 10 driver is not associated with the OVM 3.4.4 kernel.
2. Upgrading the OVM kernel with a kernel from another OL version, such as OL5, results in a kernel panic.
Dell’s Custom Solutions Engineering can install the drivers and certify the Operating System. Please contact your account team to start the process.
To learn more about Dell Custom Solutions Engineering visit www.dell.com/customsolutions
We require customers to sign a disclaimer form acknowledging that this is not supported by Dell and that there are associated risks the customer must assume.
This is a mandatory hotfix for:
The following is a list of issues resolved in this release.
The user is not prompted for authentication when their credentials are not saved
The Password Manager messages are not displayed in the client’s language
The session window is not resized if a device has been rotated during session launching
Custom resolution more than 1152*2048 cannot be applied to the connector
Copied text is pasted into the remote session from the local app without the last character
Text is copied with additional characters from Microsoft Notepad to Microsoft Paint in the remote session
The remote session constantly moves upward if the user switches between the keyboard and the extended keyboard
There is empty space between the keyboard with F-panel and the session
The F-panel is displayed without on-screen keyboard
Numeric characters are displayed instead of special ones after device rotation with the remote session opened
This hotfix is available for installation directly from the Google Play store and can also be downloaded from: https://support.quest.com/vworkspace/kb/234044
This wiki describes how to run the Dell iDRAC iSM module as a container in any Linux-based environment.
Dell iDRAC Service Module (iSM) is a small OS-resident process that expands Dell iDRAC management into supported host operating systems. Specifically, Dell iSM adds the following services:
This page will help you download the iSM Docker image and run it as a container, regardless of which Linux operating system you are using, provided that you have a Docker Engine in place.
So far, the Dell iSM module is built for Red Hat and SUSE systems. If you would like to run the Dell iSM module on any other host operating system, this Docker image will help you do so.
If you have Dell hardware and are using an operating system for which no Dell iDRAC iSM module installer is available, this method lets you use the Dell iSM module features through Docker.
Please contact your Sales/Account Team for the custom Dell iDRAC iSM Module container image.
For Dell Sales/Accounts Teams only: Please Open a Support Request in salesforce.com to engage CSE team. The process for entering the request is mentioned here:
I built the container image with Docker version 17.03 CE, so please make sure you have a recent version of Docker installed.
If you do not have a recent version of Docker available, the steps below will help you install one.
Uninstalling old Docker binaries
Older versions of Docker were called docker or docker-engine. If these are installed, uninstall them:
$ sudo apt-get remove docker docker-engine docker.io
It’s OK if apt-get reports that none of these packages are installed.
The contents of /var/lib/docker/, including images, containers, volumes, and networks, are preserved.
Recommended extra packages for Ubuntu Trusty 14.04
Unless you have a strong reason not to, install the linux-image-extra-* packages, which allow Docker to use the aufs storage drivers.
$ sudo apt-get update
$ sudo apt-get install \
    linux-image-extra-$(uname -r) \
    linux-image-extra-virtual
Below are the steps to download and run the Dell iDRAC iSM Module image of version 2.4.
1. Download the zipped format of docker Dell iDRAC iSM Module and import it using the below command.
2. Make sure the image is now listed under the docker images command.
3. Start the container with the --privileged option, because systemd throws an error when the container is started without it.
$ docker run --privileged -ti -d -e "container=docker" -v /sys/fs/cgroup:/sys/fs/cgroup --net=host path/to/containerImage /usr/sbin/init
Keep the window open so that the instance does not go down.
The Dell iDRAC iSM module is now up and running. You may use its features just as if it were installed on the physical server.
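Steps 1 and 2 above might look like the following sketch. It assumes the archive was created with docker save, in which case docker load applies (an archive produced by docker export would need docker import instead). The archive name and the run helper are hypothetical placeholders, not part of the Dell package. By default the script only prints the commands so that you can review them first.

```shell
#!/bin/sh
# Hypothetical archive name; use the file supplied by your account team.
ISM_ARCHIVE="${ISM_ARCHIVE:-dell-ism-2.4.tar.gz}"

# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: import the image (assumes an archive created with "docker save").
run docker load -i "$ISM_ARCHIVE"

# Step 2: confirm the image is now listed.
run docker images
```

Run the script with DRY_RUN=0 once the printed commands match your environment.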
Once the Dell iDRAC iSM module is running, you will be able to see the changes in the iDRAC GUI; below are some examples.
Note: In this example, the base system is Ubuntu and the image was built on Red Hat 7.2. You may use any base system/Docker host to run this image.
Disclaimer: This Dell iDRAC iSM image is not officially supported on any Dell servers.
An interview with Ajeet Singh Raina, Dell EMC's Docker Captain, covering the ecosystem, tips, and the difference between a container and a virtual machine.
Written by Marty Glaser
The Dell EMC SC5020 array is a next-generation storage array that expands on the capabilities of our most popular SC Series model ever, the SC4020. Now we’re expanding on the number of technical resources to help you get the most out of SC5020 solutions. But first, here's an overview on this new member of the SC Series family.
As detailed in a recent SC5020 blog post by Craig Bernero, the SC5020 boasts impressive increases with up to 45% more IOPS*, 4x more memory, 2x greater capacity* (2 PB) and up to 3x bandwidth* (12Gb SAS). And with a 3U chassis instead of 2U, it offers much denser storage capacity with up to 30 drives in the controller itself. The SC5020 is also optimized for efficiency and economics, offering the lowest price entry point per gigabyte of any storage array in the industry, even when configured as an all-flash array.**
The SC5020 offers flexibility with its expansion options as well as flash configurations. You can easily expand with additional enclosures through 12Gb SAS. And with support for 0% to 100% flash configurations, you can start with spinning disks and add flash later — without any downtime or reconfiguration required. Or start with all flash and add a tier 3 of spinning drives (7.2K, 10K or 15K rpm) for cheap-and-deep storage needs. Simply customize the array at purchase and add flash or spinning media as your storage needs evolve.
Adding to its robust iSCSI and Fibre Channel transport options, the SC5020 also offers direct-attached SAS for up to 4 hosts — a straightforward and cost-effective way of placing enterprise-featured SC Series storage at a branch office without the cost and complexity of iSCSI/FC switches, networks and fabrics. The SC5020 can participate in Live Volume automatic failover configurations as the primary location, secondary location or managed DR location. In addition to Live Volume, the SC5020 supports the full range of SC Series options from federation to data reduction (compression and deduplication) in 0% to 100% flash configurations.
To help you take full advantage of SC5020 solutions, we have released several new papers such as best practices guides and reference architectures. I’d like to call attention to a few titles that relate specifically to the capabilities of the SC5020.
Microsoft Exchange Solution Reviewed Program (ESRP)
These two papers detail SC5020 storage solutions for Microsoft Exchange Server, based on the Microsoft Exchange Solution Reviewed Program (ESRP) Storage program. They describe the performance characteristics of fully hardware-redundant Exchange 2016 solutions housing 9,000 or 15,000 user mailboxes in two 3U SC5020 arrays containing 7.2K or 10K drives.
Dell EMC SC Series SC5020 9,000 Mailbox Exchange 2016 Resiliency Storage Solution using 7.2K Drives
Dell EMC SC Series SC5020 15,000 Mailbox Exchange 2016 Resiliency Storage Solution using 10K Drives
Data Warehouse Fast Track (DWFT) with Microsoft SQL Server
These companion documents describe the design principles and guidelines used to achieve an optimally balanced 60TB Data Warehouse Fast Track reference architecture for SQL Server 2016 using Dell EMC PowerEdge R730 servers and SC5020 arrays. The first paper is a reference architecture that shows the configuration along with performance results, and the second paper is a deployment guide with step-by-step instructions for building the configuration.
60TB Data Warehouse Fast Track Reference Architecture for Microsoft SQL Server 2016 using Dell EMC PowerEdge R730 and SC5020
Deploying the 60TB Data Warehouse Fast Track Reference Architecture for Microsoft SQL Server 2016 using Dell EMC PowerEdge R730 and SC5020
Oracle OLAP processing
Highlighting the SC5020 array, this paper discusses the advanced features of Dell EMC SC Series storage and provides guidance on how they can be leveraged to deliver a cost-effective solution for Oracle OLAP and DSS deployments.
Optimizing Dell EMC SC Series Storage for Oracle OLAP Processing
Virtual desktop infrastructure (VDI) with VMware View
This paper is a storage template for a VDI environment based on the SC5020 array, simulating a small- to mid-sized company with 3,500 end-user VMs running Microsoft Windows 7 and a workload typical of knowledge workers.
3,500 Persistent VMware View VDI Users on Dell EMC SC5020 Storage
Heterogeneous virtualized workloads with Hyper-V
This paper provides best practices for configuring Microsoft Windows Server 2016 Hyper-V and Dell EMC SC5020 storage with heterogeneous application workloads.
Dell EMC SC5020 with Heterogeneous Virtualized Workloads on Microsoft Windows Server 2016 with Hyper-V Storage
SAS front-end support for Microsoft Hyper-V and VMware
The two papers describe how to configure VMware vSphere and Microsoft Hyper-V hosts equipped with supported SAS HBAs to access SAN storage on select SC Series arrays with SAS front-end ports. This includes the SC5020, which is showcased in these two documents.
Dell EMC SC Series with SAS Front-end Support for VMware vSphere
Dell EMC SC Series with SAS Front-end Support for Microsoft Hyper-V
The Midrange Storage Technical Solutions team at Dell EMC provides focused and informative documents and videos, such as best practices and reference architectures for specific applications, operating systems and configurations. We continually refresh this content as new SC Series models and features are added, so make sure to bookmark our SC Series Technical Documents page on Dell TechCenter. More product information about SC Series arrays can be found at DellEMC.com/SCSeries.
We always appreciate feedback on how we can improve our knowledge-base resources. Please send feedback or recommendations to StorageSolutionsFeedback@dell.com. Thank you!
*Based on April 2017 internal Dell EMC testing, compared to previous-generation SC4020. Actual performance will vary depending upon application and configuration.
**Dell internal analysis, April 2017. Estimated street price, net effective capacity after 5:1 data reduction including 5 years of 7x24x4-hour on-site support. Comparison vs. other midrange storage from major vendors. Customer’s price may vary based on a variety of circumstances and data should be used for comparison purposes.
This blog was originally written by Teixeira Fabiano
Have you heard about NVDIMM-N? NVDIMM-N is a very nice feature available on the Dell EMC 14G platform. NVDIMM-N (also known as Persistent Memory [PMEM]) is a Storage Class Memory (SCM) technology that combines flash storage and DRAM on the same memory module, enabling extraordinary performance improvement over other storage technologies such as SAS/SATA drives, SSD and NVMe. NVDIMM-N modules (288-pin DDR4 2666MHz) are connected to a standard memory slot, taking full advantage of the high bandwidth and low latency of the memory bus.
NVDIMM-N can operate in two different modes on Windows Server 2016 RTM and Windows Server version 1709:
• DAX: Direct Access
• Block: Regular block device
DAX pretty much bypasses the whole storage stack, delivering very low latency access to the application (the application must be DAX-aware; SQL is a great example). NVDIMM can still deliver low latency access in Block mode; however, the IO still needs to go through the whole storage stack.
Dell EMC presented a great demo at Microsoft Ignite 2017 showing the power of NVDIMM: a SQL Server instance running on a 14G server with DAX mode enabled, delivering a fast, low-latency configuration, and NVDIMMs configured as a cache device on Storage Spaces Direct (S2D), which also delivered a great low-latency configuration. Hope you had a chance to stop by to check out the demo.
Microsoft released the new Windows Server version 1709 on October 17th, 2017 and added a very cool feature to Hyper-V (Compute): Storage-Class Memory support for VMs and Virtualized Persistent Memory (vPMEM). How about having this low latency technology inside a VM (Windows and Linux; not all OSes support NVDIMM), fully utilizing your new 14G server? It would be great, right?
So, how about configuring that? Let’s do it!

------------------------------
Lab Configuration
------------------------------
Server: R740xd
• 2 x CPUs (Intel(R) Xeon(R) Gold 6126T CPU @ 2.60GHz)
• 2 x NVDIMM-N (16GB DDR4 2666MHz) + 12 x regular RDIMMS (also 16GB DDR4 2666MHz)
• BOSS (Boot Optimization Storage Solution) – 2xM.2 SATA SSD in HW RAID used for the OS Installation
• OS: Windows Server, version 1709

----------------------------------------------------------------------------
Storage-Class Memory support for VMs - Limitations
----------------------------------------------------------------------------
• No Migrations
• No runtime resizing
• No Thin-Provisioning or Snapshots
• Incompatible with old Hypervisors (2016 RTM or below)
• Implemented through PowerShell

---------------------------------------------------------------------
Configuring NVDIMM/PMEM in a Windows VM
---------------------------------------------------------------------
1. Enable NVDIMM in the R740xd BIOS. You can enable node interleaving on Windows Server 1709, something not possible on Windows 2016 RTM. Instead of seeing multiple SCM devices in Windows, you will see only one device if Node Interleaving is configured.
2. Install Windows Server version 1709.
3. Install the Hyper-V role.
4. Verify if SCM disk has been detected by the Hypervisor.
Follow the procedure below if OS cannot identify the SCM disk:
A) Verify Pnp Device Status (Storage Class Memory Bus).
B) Run the following PowerShell cmdlet to find out the PnP Device ID, if Device Status is equal to “Error”:
C) Create the following registry keys:
Path: HKLM\System\CurrentControlSet\Enum\ACPI\ACPI0012\<PnP Device ID>\Device Parameters\
Name: ScmBus
Path: HKLM\System\CurrentControlSet\Enum\ACPI\ACPI0012\<PnP Device ID>\Device Parameters\ScmBus\
Name: NfitValidateAllowUnalignedSPAs, Type: DWORD, Value: 1
D) Reboot the server.
E) Check PnP Device again in order to make sure Storage Class Memory Bus device is working properly.
F) Verify if SCM disk has been detected
5. Initialize SCM disk.
6. Create New Volume, then format it. Use the parameter -IsDAX $True in order to properly enable SCM for Hyper-V utilization. You won’t be able to present NVDIMM to VMs if the -IsDAX option is not present.
7. Confirm that DAX is enabled.
8. Create a new Gen2 Virtual Machine. Install Windows 2016 RTM or Windows Server, version 1709.
9. Shut down the VM.
10. Add PMEM Controller to the VM.
11. Create .vhdpmem file (new file extension). You will need to specify the -Fixed parameter. The vhdpmem disk won’t work with dynamic VHD configuration.
12. Attach VHDPMEM to the VM.
13. Start the VM
14. Connect to the VM (You can do everything from PS!) in order to check the SCM disk is there.
15. Initialize and format the SCM disk. For the SQL guys (SQL 2016 or above), if you want to take advantage of NVDIMMs, format the volume as DAX inside the VM (use the -IsDAX $True parameter).

----------------------------------------------------------------
Configuring NVDIMM/PMEM in a Linux VM
----------------------------------------------------------------
1. Create a Gen2 VM and install the OS (I’m running RHEL 7.4 in my lab).
2. Repeat steps 9-13.
3. After starting the VM, connect to it via the console, Bash in Windows 10 (which needs to be installed first) or your preferred SSH tool in order to verify that the “pmem” device has been detected.
Sincerely, Fabiano Teixeira
Have any comments, questions or suggestions? Please contact us on WinServerBlogs@dell.com
Stephen Rousset – Distinguished Eng., ESI Director of Architecture, Dell EMC
Designed for OPEX Savings: DSS 9000
Design is a funny word. Some people think design means how it looks. But of course, if you dig deeper, it’s really how it works. – Steve Jobs
When we talk about how we designed the DSS 9000 we are not just talking about how the physical parts go together, but also how it was designed to improve operations in large scale environments. Savings on capital expense (CAPEX) is naturally a top level concern when purchasing new infrastructure, especially in scale-out environments where the savings is multiplied by the hundreds or even thousands of servers. But CAPEX savings is a one-time savings. Operational expense (OPEX) is an equally, if not more important, consideration when upgrading scale-out infrastructure, because the OPEX savings is a recurring savings. It is multiplied not just by the scale of the infrastructure – but also over time.
Because we designed the recently launched DSS 9000 specifically as a rack scale infrastructure, we designed in important operational efficiencies. An earlier blog has discussed the ways in which rack scale management of the DSS 9000 delivers efficiencies. In this blog I will expand on how the DSS 9000 brings more ongoing OPEX savings by being faster and easier to deploy, scale and service.
Easy to Deploy
Since the DSS 9000 is targeted for large scale infrastructure installations with 100’s or 1000’s of racks, we specifically designed it so deployment and service in those environments are as easy as possible. To that end, it is sold pre-integrated as a complete rack scale hardware solution, customized to the specifications of the customer. Once a customer (working with ESI Solution Architects) has specified the configuration that best suits their workload needs - documented in a System Requirements Document (SRD) - we assemble the entire rack, including open networking choices they may have made and third-party components they may want to integrate. This saves enormous amounts of time for the customer in terms of racking and cabling equipment – and has the additional positive environmental effect of greatly reducing wasteful packaging. (Again, multiplied by tens, hundreds and sometimes even thousands of racks).
Beyond the expected server level validation which ensures all specified components and commodities (NICs, drives, RAID cards, DIMMs, etc.) are present and functional and all of the BIOS and BMC settings are checked against the customer specifications, we perform a complete rack level validation. This validation examines cabling, correct server placement in the rack, switch configuration and infrastructure firmware revisions. At this point we make the corrections or adjustments necessary for individual servers or rack infrastructure to ensure optimal performance.
So, when DSS 9000 racks arrive at a customer’s loading dock, they are ready to be rolled into place and connected into the existing infrastructure. We also can provide a “manifest” that lists MAC addresses, server commodities data and Dell EMC IDs prior to shipping so network preparation can be done ahead of time to ensure the process of incorporating the new racks into the existing infrastructure is as smooth and fast as possible. Once positioned in the data center the systems should be ready for application software installation and to be deployed as part of the customer’s operation.
Easy to Scale
Scaling the DSS 9000 is easier at both the rack level and within the rack. As described above, scaling your infrastructure by the rack is as simple as ordering the configuration you want and rolling it from the loading dock to its place in the datacenter.
But the modular design of the DSS 9000 also makes it easy to scale at a more granular level. In some high growth environments, a rack may be purchased but configured as only partially populated - to leave room for growth. Adding more capacity becomes as simple as sliding new sleds into the vacant blocks and connecting the resident network. This capability allows organizations to scale out deployments with speed and confidence in response to dynamic business needs.
The innovative rack management features of the DSS 9000 also simplify scaling. When the need to scale arises and preconfigured nodes are added to the rack, the Rack Manager interface makes management of the new nodes consistent… and instantaneous. The newly introduced nodes are automatically inventoried and immediately accessible for management commands. And with dynamic pooling of disaggregated resources, available through the combination of the Redfish management APIs and Intel™ Rack Scale Design, compute and storage resources can be readily scaled for workloads running on the rack.
Easy to Service
One example of a design feature derived from our long history of data center experience is cold aisle serviceability - an efficiency feature that the DSS 9000 shares with the infrastructure of most hyperscale cloud companies. Cold aisle serviceability simplifies the physical servicing of the rack when all the cabling is located on the front (cold aisle) and when hot swap components are all accessible from that side as well. (See photos.) Productivity is increased and no service person is unnecessarily working in the elevated temperatures of the hot aisle or having to move back and forth for front and rear service needs. As an added benefit, the delta T temperature in the unoccupied hot aisle can be kept a few degrees higher, allowing for savings on cooling costs.
Customers that do business internationally should also be aware they can rely on the Extreme Scale Infrastructure organization’s worldwide supply chain, manufacturing, deployment and support to deliver a consistent, seamless experience that accelerates time to value and minimizes disruption to their operations.
Conclusion
Easy to deploy. Easy to scale. Easy to service. The holistic design of the DSS 9000 rack scale infrastructure is squarely based on the hyperscale principles that Dell EMC helped pioneer over the last decade while working with the largest players in the hyperscale arena. Now, with the DSS 9000, other large scale data centers can benefit from the same levels of operational efficiency and savings that they enjoy. Inquiries about the DSS 9000 and other ESI rack scale solutions can be made at ESI@dell.com.
Authors: Rengan Xu, Frank Han, Nishanth Dandapanthula.
HPC Innovation Lab. October 2017
In this blog, we will give an introduction to Singularity containers and how they should be used to containerize HPC applications. We run different deep learning frameworks with and without Singularity containers and show that there is no performance loss with Singularity containers. We also show that Singularity can be easily used to run MPI applications.
Introduction to Singularity
Singularity is a container system developed by Lawrence Berkeley Lab to provide container technology like Docker for High Performance Computing (HPC). It wraps applications into an isolated virtual environment to simplify application deployment. Unlike a virtual machine, a container has neither a virtual hardware layer nor its own Linux kernel inside the host OS. It only sandboxes the environment; therefore, the overhead and the performance loss are minimal. The goal of the container is reproducibility. The container has the whole environment and all the libraries an application needs to run, and it can be deployed anywhere so that anyone can reproduce the results the container creator generated for that application.
Besides Singularity, another popular container system is Docker, which has been widely used for many applications. However, Docker is not well suited to an HPC environment. We chose Singularity rather than Docker for the following reasons:
Security concern. The Docker daemon has root privileges and this is a security concern for several high performance computing centers. In contrast, Singularity solves this by running the container with the user’s credentials. The access permissions of a user are the same both inside the container and outside the container. Thus, a non-root user cannot change anything outside of his/her permission.
HPC Scheduler. Docker does not support any HPC job scheduler, but Singularity integrates seamlessly with all job schedulers including SLURM, Torque, SGE, etc.
GPU support. Docker does not support GPU natively. Singularity is able to support GPUs natively. Users can install whatever CUDA version and software they want on the host which can be transparently passed to Singularity.
MPI support. Docker does not support MPI natively, so a user who wants to use MPI with Docker needs an MPI-enabled Docker to be developed. Even if an MPI-enabled Docker is available, the network stacks such as TCP and those needed by MPI are private to the container, which makes Docker containers unsuitable for more complicated networks like InfiniBand. In Singularity, the user’s environment is shared with the container seamlessly.
Challenges with Singularity in HPC and Workaround
Many HPC applications, especially deep learning applications, have deep library dependencies, and it is time consuming to figure out these dependencies and debug build issues. Most deep learning frameworks are developed on Ubuntu but need to be deployed on Red Hat Enterprise Linux, so it is beneficial to build those applications once in a container and then deploy them anywhere. The most important goal of Singularity is portability, which means that once a Singularity container is created, it can be run on any system. However, so far it is still not easy to achieve this goal. Usually we build a container on our own laptop, a server, a cluster or a cloud, and then deploy that container on a server, a cluster or a cloud. One challenge when building a container arises on GPU-based systems, which have a GPU driver installed. If we choose to install the GPU driver inside the container and its version does not match the host GPU driver, an error will occur, so the container should always use the host GPU driver. The next option is to bind the paths of the GPU driver binaries and libraries into the container so that these paths are visible to the container. However, if the container OS is different from the host OS, such binding may cause problems. For instance, assume the container OS is Ubuntu while the host OS is RHEL, and on the host the GPU driver binaries are installed in /usr/bin and the driver libraries are installed in /usr/lib64. The container OS also has /usr/bin and /usr/lib64; therefore, if we bind those paths from the host to the container, the other binaries and libraries inside the container may stop working because they may not be compatible across Linux distributions. One workaround is to move all the driver-related files to a central place that does not exist in the container and then bind that central place.
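The bind workaround can be sketched as follows. This is only an illustration: it assumes the driver files have already been copied to a central host directory, and the directory, image name and helper function are hypothetical. The helper prints the launch command, built on Singularity's --bind option, rather than running it.

```shell
#!/bin/sh
# Print the launch command for a container that binds a central driver
# directory from the host. Arguments: driver dir, image, then the command.
# All names used below are hypothetical placeholders.
bind_launch_cmd() {
  driver_dir="$1"
  image="$2"
  shift 2
  echo "singularity exec --bind $driver_dir:$driver_dir $image $*"
}

# Example: driver files were copied to /opt/nvidia-host on the host, a path
# that does not exist inside the Ubuntu container image.
bind_launch_cmd /opt/nvidia-host framework.img nvidia-smi
```

For the bound files to be found, the container would also need that directory on its PATH and LD_LIBRARY_PATH.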
The second solution is to implement the above workaround inside the container so that the container can use the driver-related files automatically. This feature has already been implemented in the development branch of the Singularity repository; a user just needs to use the option “--nv” when launching the container. However, based on our experience, a cluster usually installs the GPU driver in a shared file system instead of the default local path on all nodes, and Singularity is not able to find the GPU driver if it is not installed in the default or common paths (e.g. /usr/bin, /usr/sbin, /bin, etc.). Even if the container is able to find the GPU driver and the corresponding driver libraries and we build the container successfully, an error will occur if the host driver on the deployment system is not new enough to support the GPU libraries that were linked to the application when the container was built. Because GPU drivers are backward compatible, the deployment system should keep its GPU driver up to date so that its libraries are equal to or newer than the GPU libraries that were used for building the container.
Another challenge is using InfiniBand with the container, because the InfiniBand driver is kernel dependent. There is no issue if the container OS and host OS are the same or compatible. For instance, RHEL and CentOS are compatible, and Debian and Ubuntu are compatible. But if the two OSs are not compatible, there will be library compatibility issues if we let the container use the host InfiniBand driver and libraries. If we instead install the InfiniBand driver inside the container, then the drivers in the container and on the host are not compatible. The Singularity community is still working hard to solve this InfiniBand issue. Our current solution is to make the container OS and host OS compatible and let the container reuse the InfiniBand driver and libraries on the host.
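The driver version rule described above can be checked before a container is launched. The following is a minimal sketch, assuming GNU sort is available for its -V version ordering. The function name and both version numbers are hypothetical. In practice the host value could come from a query such as nvidia-smi --query-gpu=driver_version --format=csv,noheader.

```shell
#!/bin/sh
# Succeeds when version $1 is equal to or newer than version $2.
# Relies on GNU sort -V (version sort); this is an assumption about the host.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical values: the driver on the deployment host, and the minimum
# driver required by the GPU libraries linked when the container was built.
HOST_DRIVER="384.81"
REQUIRED_DRIVER="375.26"

if version_ge "$HOST_DRIVER" "$REQUIRED_DRIVER"; then
  echo "host driver $HOST_DRIVER is new enough"
else
  echo "host driver $HOST_DRIVER is older than $REQUIRED_DRIVER; update it first"
fi
```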
Singularity on Single Node
To measure the performance impact of using a Singularity container, we ran the neural network Inception-V3 with three deep learning frameworks: NV-Caffe, MXNet and TensorFlow. The test server is a Dell PowerEdge C4130 configuration G. We compared the training speed in images/sec with a Singularity container and on bare metal (without a container). The performance comparison is shown in Figure 1. As we can see, there is no overhead or performance penalty when using a Singularity container.
Figure 1: Performance comparison with and without Singularity
Singularity at Scale
We ran HPL across multiple nodes and compared the performance with and without a container. All nodes are Dell PowerEdge C4130 configuration G with four P100-PCIe GPUs, and they are connected via Mellanox EDR InfiniBand. The result comparison is shown in Figure 2. As we can see, the percent performance difference is within ± 0.5%. This is within the normal variation range, since the HPL performance is slightly different in each run. This indicates that MPI applications such as HPL can be run at scale without performance loss with Singularity.
Figure 2: HPL performance on multiple nodes
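The percent difference used in this comparison can be computed as in the sketch below. The function names and the sample numbers are hypothetical and are not taken from the measured results. The sketch only shows the arithmetic: (container - bare metal) / bare metal * 100, checked against a tolerance.

```shell
#!/bin/sh
# Percent difference of a container result ($2) relative to bare metal ($1).
pct_diff() {
  awk -v b="$1" -v c="$2" 'BEGIN { printf "%.2f\n", (c - b) / b * 100 }'
}

# Succeeds when the absolute percent difference is within tolerance $3 (in %).
within_tolerance() {
  awk -v b="$1" -v c="$2" -v t="$3" \
    'BEGIN { d = (c - b) / b * 100; if (d < 0) d = -d; exit !(d <= t) }'
}

# Hypothetical sample values, for illustration only.
pct_diff 100 99.6
within_tolerance 100 99.6 0.5 && echo "within normal run-to-run variation"
```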
In this blog, we introduced what Singularity is and how it can be used to containerize HPC applications. We discussed the benefits of using Singularity over Docker. We also mentioned the challenges of using Singularity in a cluster environment and the workarounds available. We compared the performance of bare metal vs Singularity container and the results indicated that there is no performance loss when using Singularity. We also showed that MPI applications can be run at scale without performance penalty with Singularity.
It is possible to create two connections for the same user if the user name is entered with different letter case (test, Test etc.)
Configuration is not updated automatically if the user target has a higher priority during the first connection to the configuration
The notification message should be shown after each attempt to launch the connector when it is already launched in the DI mode
Connector may close if the user cancels the connection action when connector is in the DI mode
Microphone redirection is not displayed in the connection setting policies
Add the Microphone redirection support for the Mac connector
The application will not launch until the tutorial window is closed
Session may close during launching on the 10.10.5 system
Connector may close after clicking on the Connection Settings button on the 10.9.5 system
The Change Password checkbox is not displayed on the Welcome screen if the Require Authentication checkbox is unchecked and PM is configured
User's credentials aren’t filled in if the user cancels the process of finding the configuration
The Details button is displayed in the Adding Manual Connection window
The Next button is disabled when using clipboard to paste into the User Name field on the Auto-Configure screen
There is an empty space in the extended settings if the Detect connection quality automatically connection speed is selected in the configuration properties
Mac connector doesn’t display the non-login broker errors
There are alignment issues in the connection settings if user saves credentials after the connection setting was opened at least once
This hotfix is available for download at: https://support.quest.com/vworkspace/kb/233661
Corruptions appear in Microsoft Excel 2010 if the session is launched from 2008R2 server with Windows 7
The cursor is not displayed in Microsoft Word 2010 if the session is launched from 2008R2 server with Windows 7
The Connection bar is not resized in the Split View mode
The remote session is not resized properly in the Split View mode
The width of the vWs website\email address field changes when the user enters data into it
This hotfix is available for download from the Apple Store at: https://itunes.apple.com/us/app/vworkspace/id406043462