Dell Community

Latest Blog Posts
  • Dell TechCenter

    Extreme Scale Infrastructure (ESI) Architects' POV Series


    Stephen Rousset – Distinguished Eng., ESI Director of Architecture, Dell EMC

    For more than a decade, Dell EMC’s Extreme Scale Infrastructure (ESI) group has been working to help customers push at the boundaries of large scale data center computing. In that time, it has always been our mission to understand customers’ unique business requirements, provide deep industry expertise and develop solutions to match their specific needs.

    We understand that large-scale organizations face an unprecedented challenge today: providing competitive advantage and customer value in an increasingly dynamic environment. They need to stay abreast of overwhelming technology innovation while quickly implementing the IT infrastructure capabilities their customers demand.

    Dell EMC ESI is uniquely positioned to help these organizations.  With over a decade of experience working with hyperscale cloud vendors, ESI understands the unique needs of carriers and service providers and knows that it is never a one-size-fits-all model when it comes to large-scale infrastructure. ESI provides tailored solutions designed using hyperscale best practices that combine the latest technologies and practical, experience-based rack scale efficiencies.

    Critical to our mission is the ability to stay at the forefront of industry trends and standards, and in some cases to lead the effort.  The members of the ESI architecture team are dedicated to ensuring Dell EMC and its customers gain the advantages of the latest technologies as they become available.  Some examples of our leadership are: co-chair of the Distributed Management Task Force’s (DMTF) Redfish standards, co-chair of the Open Compute Project’s (OCP) server group, and active participation in the Infrastructure Masons organization.

    We also have ESI architects working across a wide span of new technologies that will impact large scale computing environments in the near and long-term future, both investigating and influencing the latest trends, developments and standards.  This blog series is intended to help keep you up-to-date on our latest work and give you our in-depth point-of-view, so that you can have a better understanding of where the industry is moving to and how you might better plan for your large scale computing needs.

    It is our intent to deliver one blog each month on a relevant topic of interest. The first two will be focused on the benefits of rack scale solutions and the current state of the Redfish standards efforts. Inquiries about ESI can be made at


  • General HPC

    Deep Learning Inference on P40 vs P4 with Skylake

    Authors: Rengan Xu, Frank Han and Nishanth Dandapanthula. Dell EMC HPC Innovation Lab. July 2017

    This blog evaluates the performance, scalability and efficiency of deep learning inference on P40 and P4 GPUs on Dell EMC’s PowerEdge R740 server. The purpose is to compare P40 versus P4 in terms of performance and efficiency. It also measures the accuracy differences between high precision and reduced precision floating point in deep learning inference.

    Introduction to R740 Server

    The PowerEdge™ R740 is Dell EMC’s latest generation 2-socket, 2U rack server designed to run complex workloads using highly scalable memory, I/O, and network options. The system features the Intel Xeon Processor Scalable Family (architecture codenamed Skylake-SP), up to 24 DIMMs, PCI Express (PCIe) 3.0 enabled expansion slots, and a choice of network interface technologies to cover NIC and rNDC. The PowerEdge R740 is a general-purpose platform capable of handling demanding workloads and applications, such as data warehouses, ecommerce, databases, and high performance computing (HPC). It supports up to 3 Tesla P40 GPUs or 4 Tesla P4 GPUs.

    Introduction to P40 and P4 GPUs

    NVIDIA® launched the Tesla® P40 and P4 GPUs for the inference phase of deep learning. Both GPU models are powered by the NVIDIA Pascal™ architecture and designed for deep learning deployment, but they have different purposes. P40 is designed to deliver maximum throughput, while P4 is aimed at better energy efficiency. Aside from high floating point throughput and efficiency, both GPU models introduce two new optimized instructions designed specifically for inference computations: the 8-bit integer (INT8) 4-element vector dot product (DP4A) and the 16-bit 2-element vector dot product (DP2A). Although many HPC applications require high precision computation with FP32 (32-bit floating point) or FP64 (64-bit floating point), deep learning researchers have found that FP16 (16-bit floating point) can achieve the same inference accuracy as FP32, and many applications only require INT8 (8-bit integer) or lower precision to keep an acceptable inference accuracy. Tesla P4 delivers a peak of 21.8 INT8 TIOP/s (Tera Integer Operations per Second), while P40 delivers a peak of 47.0 INT8 TIOP/s. Other differences between these two GPU models are shown in Table 1. This blog uses both types of GPUs in the benchmarking.
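    The DP4A operation described above can be emulated in a few lines of Python. This is only a minimal sketch of the arithmetic (four INT8 products summed into a 32-bit accumulator); the function names `to_int8` and `dp4a` are hypothetical, and the wrap-around behavior of the accumulator is an assumption made for illustration, not a statement about the hardware.

```python
def to_int8(value: int) -> int:
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    value &= 0xFF
    return value - 256 if value >= 128 else value


def dp4a(a, b, acc: int = 0) -> int:
    """Emulate a 4-element INT8 dot product added to a 32-bit accumulator."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("each operand must hold exactly four 8-bit elements")
    total = acc + sum(to_int8(x) * to_int8(y) for x, y in zip(a, b))
    # Assumption for illustration: wrap the result to a signed 32-bit value.
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total >= (1 << 31) else total


# 1*5 + 2*6 + 3*7 + 4*8 = 70, plus the accumulator value 10
print(dp4a([1, 2, 3, 4], [5, 6, 7, 8], acc=10))  # 80
```

    Packing four multiply-accumulate steps into one operation is what allows the INT8 peak rate to be roughly four times the FP32 rate listed in Table 1.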

    Table 1: Comparison between Tesla P40 and P4

                        Tesla P40      Tesla P4
    CUDA Cores
    Core Clock          1531 MHz       1063 MHz
    Memory Bandwidth    346 GB/s       192 GB/s
    Memory Size         24 GB GDDR5    8 GB GDDR5
    FP32 Compute        12.0 TFLOPS    5.5 TFLOPS
    INT8 Compute        47 TIOPS       22 TIOPS



    Introduction to NVIDIA TensorRT

    NVIDIA TensorRT™, previously called GIE (GPU Inference Engine), is a high performance deep learning inference engine for production deployment of deep learning applications that maximizes inference throughput and efficiency. TensorRT provides users the ability to take advantage of fast reduced precision instructions provided in the Pascal GPUs. TensorRT v2 supports the new INT8 operations that are available on both P40 and P4 GPUs, and to the best of our knowledge it is the only library that supports INT8 to date.

    Testing Methodology

    This blog quantifies the performance of deep learning inference using NVIDIA TensorRT on one PowerEdge R740 server, which supports up to 3 Tesla P40 GPUs or 4 Tesla P4 GPUs. Table 2 shows the hardware and software details. The inference benchmark we used was giexec, from the TensorRT sample code. This sample uses synthetic images, filled with random non-zero numbers to simulate real images. Two classic neural networks were tested: AlexNet (2012 ImageNet winner) and GoogLeNet (2014 ImageNet winner), which is much deeper and more complicated than AlexNet.

    We measured the inference performance in images/sec which means the number of images that can be processed per second.

    Table 2: Hardware configuration and software details

    Platform                   PowerEdge R740
    CPU                        2 x Intel Xeon Gold 6150
    Memory                     192GB DDR4 @ 2667MHz
    Disk                       400GB SSD
    Shared storage             9TB NFS through IPoIB on EDR Infiniband
    GPU                        3x Tesla P40 with 24GB GPU memory, or
                               4x Tesla P4 with 8 GB GPU memory

    Software and Firmware
    Operating System           RHEL 7.2
    BIOS                       0.58 (beta version)
    CUDA and driver version    8.0.44 (375.20)
    NVIDIA TensorRT Version    2.0 EA and 2.1 GA

    Performance Evaluation


    In this section, we will present the inference performance with NVIDIA TensorRT on GoogLeNet and AlexNet. We also implemented the benchmark with MPI so that it can be run on multiple GPUs within a server. Figure 1 and Figure 2 show the inference performance with AlexNet and GoogLeNet on up to three P40s and four P4s in one R740 server. In these two figures, batch size 128 was used. The power consumption of each configuration was also measured and the energy efficiency of the configurations is plotted as a “performance per watt” metric. The power consumption was measured by subtracting the power when the system was idle from the power when running the inference. Both the images/sec and images/sec/watt metrics numbers are relative to the numbers on one P40. Figure 3 shows the performance with different batch sizes with 1 GPU, and both metrics numbers are relative to the numbers on P40 with batch size 1. In all figures, INT8 operations were used. The following conclusions can be observed:

    • Performance: with the same number of GPUs, the inference performance on P4 is around half of that on P40. This is consistent with the theoretical INT8 performance of the two GPU types: 22 TIOPS on a single P4 versus 47 TIOPS on a single P40. Also, since inference with larger batch sizes gives higher overall throughput but consumes more memory, and P4 has only 8GB of memory compared to 24GB on P40, P4 could not complete the inference with batch size 2048 or larger.
    • Scalability: the performance scales linearly on both P40s and P4s when multiple GPUs are used, because no communication happens between the GPUs used in the test.
    • Efficiency (performance/watt): the performance/watt on P4 is ~1.5x that on P40. This is also consistent with the theoretical efficiency difference: the theoretical performance of P4 is about 1/2 that of P40 and its TDP is around 1/3 of P40's (75W vs 250W), so its performance/watt is ~1.5x that of P40.
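    The theoretical ratios quoted in these conclusions can be reproduced from the figures given in this blog (47 and 22 INT8 TIOPS, 250W and 75W TDP). The sketch below is illustrative only: the helper names are hypothetical, and `measured_perf_per_watt` simply restates the measurement method described above (idle power subtracted from power under load) rather than reproducing the benchmark itself.

```python
# Peak INT8 throughput and TDP as quoted in this blog.
P40 = {"int8_tiops": 47.0, "tdp_watts": 250.0}
P4 = {"int8_tiops": 22.0, "tdp_watts": 75.0}


def theoretical_perf_per_watt(gpu: dict) -> float:
    """Peak INT8 TIOPS divided by TDP."""
    return gpu["int8_tiops"] / gpu["tdp_watts"]


def measured_perf_per_watt(images_per_sec: float, load_watts: float,
                           idle_watts: float) -> float:
    """images/sec/watt, using power under load minus idle power."""
    net_watts = load_watts - idle_watts
    if net_watts <= 0:
        raise ValueError("load power must exceed idle power")
    return images_per_sec / net_watts


perf_ratio = P4["int8_tiops"] / P40["int8_tiops"]
tdp_ratio = P4["tdp_watts"] / P40["tdp_watts"]
efficiency_ratio = theoretical_perf_per_watt(P4) / theoretical_perf_per_watt(P40)

print(f"P4/P40 performance: {perf_ratio:.2f}")        # 0.47
print(f"P4/P40 TDP:         {tdp_ratio:.2f}")         # 0.30
print(f"P4/P40 perf/watt:   {efficiency_ratio:.2f}")  # 1.56
```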

    Figure 1: The inference performance with AlexNet on P40 and P4

    Figure 2: The performance of inference with GoogLeNet on P40 and P4

    Figure 3: P40 vs P4 for AlexNet with different batch sizes

    In our previous blog, we compared the inference performance using both FP32 and INT8 and concluded that INT8 is ~3x faster than FP32. In this study, we also compare the accuracy of the two to verify that INT8 can achieve accuracy comparable to FP32. We used the latest TensorRT 2.1 GA version for this benchmarking. To make INT8 data encode the same information as FP32 data, TensorRT applies a calibration method that converts FP32 to INT8 in a way that minimizes the loss of information. More details of this calibration method can be found in the presentation “8-bit Inference with TensorRT” from GTC 2017. We used the ILSVRC2012 validation dataset for both calibration and benchmarking. The validation dataset has 50,000 images and was divided into batches of 25 images each. The first 50 batches were used for calibration and the rest of the images were used for accuracy measurement. Several pre-trained neural network models were used in our experiments, including ResNet-50, ResNet-101, ResNet-152, VGG-16, VGG-19, GoogLeNet and AlexNet. Both top-1 and top-5 accuracies were recorded using FP32 and INT8, and the accuracy difference between FP32 and INT8 was calculated. The result is shown in Table 3. From this table, we can see that the accuracy difference between FP32 and INT8 is between 0.02% and 0.18%, which means the accuracy loss is minimal while a 3x speedup is achieved.
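    The dataset split and the top-1/top-5 metrics described above can be sketched as follows. This is a minimal illustration using only the numbers stated in the text (50,000 images, batches of 25, first 50 batches for calibration); the function names and the toy scores are hypothetical and do not come from the benchmark.

```python
TOTAL_IMAGES = 50_000
BATCH_SIZE = 25
CALIBRATION_BATCHES = 50


def split_counts(total: int = TOTAL_IMAGES, batch: int = BATCH_SIZE,
                 calib_batches: int = CALIBRATION_BATCHES):
    """Return (number of batches, calibration images, evaluation images)."""
    num_batches = total // batch
    calib_images = calib_batches * batch
    return num_batches, calib_images, total - calib_images


def top_k_accuracy(scores, labels, k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


print(split_counts())  # (2000, 1250, 48750)

# Toy per-class scores for three images (illustrative values only).
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_labels = [1, 1, 0]
print(top_k_accuracy(toy_scores, toy_labels, k=1))
print(top_k_accuracy(toy_scores, toy_labels, k=2))
```

    The accuracy difference reported in Table 3 is then the FP32 accuracy minus the INT8 accuracy for the same model and the same evaluation images.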

    Table 3: The accuracy comparison between FP32 and INT8

    In this blog, we compared the inference performance on both P40 and P4 GPUs in the latest Dell EMC PowerEdge R740 server and concluded that P40 has ~2x higher inference performance than P4. P4, however, is more power efficient: its performance/watt is ~1.5x that of P40. Also, with the NVIDIA TensorRT library, INT8 can achieve accuracy comparable to FP32 while outperforming it by 3x in terms of performance.

  • General HPC

    Dell EMC HPC Systems - SKY is the limit

    Munira Hussain, HPC Innovation Lab, July 2017

    This is an announcement about the Dell EMC HPC refresh that introduces support for 14th Generation servers based on the new Intel® Xeon® Processor Scalable Family (micro-architecture also known as “Skylake”). This includes the addition of PowerEdge R740, R740xd, R640, R940 and C6420 servers to the portfolio. The portfolio consists of fully tested, validated, and integrated solution offerings. These provide high speed interconnects, storage, an option for both hardware and cluster level system management and monitoring software.


    At a high level, the new generation Dell EMC Skylake servers for HPC provide greater computation power, with support for up to 28 cores and memory speeds up to 2667 MT/s; the architecture extends the AVX instructions to AVX512. The AVX512 instructions can execute up to 32 DP FLOP per cycle, which is twice the capability of the previous 13th generation servers that used Intel Xeon E5-2600 v4 processors (“Broadwell”). Additionally, the maximum core count per socket rises from 22 in the previous generation to 28, an increase of roughly 27%. Each socket has six memory channels; therefore, a minimum of 12 DIMMs is needed for a dual socket server to provide full memory bandwidth. The chipset also has 48 PCI-E lanes per socket, up from 40 lanes in the previous generation.
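    The generation-over-generation figures in this paragraph can be checked with a short calculation. The sketch below uses only numbers stated in the text (the Broadwell value of 16 DP FLOP per cycle follows from "twice the capability"); the class and function names are hypothetical, and no clock frequencies are assumed, so the result is FLOP per cycle rather than FLOPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketSpec:
    name: str
    max_cores: int
    dp_flop_per_cycle_per_core: int
    pcie_lanes: int


SKYLAKE = SocketSpec("Skylake (14G)", 28, 32, 48)
BROADWELL = SocketSpec("Broadwell (13G)", 22, 16, 40)
MEMORY_CHANNELS_PER_SOCKET = 6  # Skylake, as stated in the text


def dp_flop_per_cycle(spec: SocketSpec) -> int:
    """Peak double-precision FLOP per clock cycle for one fully populated socket."""
    return spec.max_cores * spec.dp_flop_per_cycle_per_core


def percent_increase(new: float, old: float) -> float:
    return 100.0 * (new - old) / old


def min_dimms_for_full_bandwidth(sockets: int,
                                 channels: int = MEMORY_CHANNELS_PER_SOCKET) -> int:
    """One DIMM per memory channel on every socket."""
    return sockets * channels


print(dp_flop_per_cycle(SKYLAKE), dp_flop_per_cycle(BROADWELL))  # 896 352
print(round(percent_increase(SKYLAKE.max_cores, BROADWELL.max_cores), 1))    # 27.3
print(round(percent_increase(SKYLAKE.pcie_lanes, BROADWELL.pcie_lanes), 1))  # 20.0
print(min_dimms_for_full_bandwidth(sockets=2))  # 12
```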


    The table below notes the enhancements in the latest PowerEdge servers over the previous generations:


    High Level Comparison of the Dell EMC Server Generations for HPC Offering:



    The HPC release supporting Dell EMC 14G servers is based on the Red Hat Enterprise Linux 7.3 operating system. It is based on the 3.10.0-514.el7.x86_64 kernel. The release also supports the new version of Bright Cluster Manager 8.0. Bright Cluster Manager (BCM) is integrated with Dell EMC supported tools, drivers, and third-party software components for the ease of deployment, configuration, and management of the cluster. It includes Dell EMC System Management tools based on OpenManage 9.0.1 and Dell EMC Deployment ToolKit 6.0.1 that help manage, monitor, and administer Dell EMC hardware. Additionally, updated third party drivers and development tools from Mellanox OFED for InfiniBand, Intel IFS for Omni-Path, NVIDIA CUDA for latest Accelerators, and other packages for Machine Learning are also included. Details of the components are as below:

    • Based on Red Hat Enterprise Linux 7.3 (Kernel 3.10.0-514.el7.x86_64)
    • Dell EMC System Management tools from Open Manage 9.0.1 and DTK 6.0.1 for 14G and Open Manage 8.5 and DTK 5.5 for up to 13G Dell EMC servers
    • Updated Dell EMC supported drivers for network and storage deployed during install

      • megaraid_sas = 7.700.50
      • igb =
      • ixgbe = 4.6.3
      • i40e = 1.6.44
      • tg3 = 1.137q
      • bnx2 = 2.2.5r
      • bnx2x = 1.714.2
    • Mellanox OFED 3.4 and 4.0 for InfiniBand
    • Intel IFS 10.3.1 drivers for Omni-Path
    • CUDA 8.0 drivers for NVIDIA accelerators
    • Intel XPPSL 1.5.1 for Intel Xeon Phi processors
    • Additional Machine Learning packages such as TensorFlow, Caffe, cuDNN, DIGITS and required dependencies are also supported and available for download
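    One way to use the driver list above is to compare it against what a node actually reports. The sketch below is a hypothetical helper, not part of the release: the expected versions are copied from the list (igb is omitted because no version is given for it), and the `installed` mapping is assumed to be gathered separately, for example from the output of `modinfo -F version <module>` on each node.

```python
# Driver versions as listed for this release (igb omitted: no version given).
EXPECTED_DRIVERS = {
    "megaraid_sas": "7.700.50",
    "ixgbe": "4.6.3",
    "i40e": "1.6.44",
    "tg3": "1.137q",
    "bnx2": "2.2.5r",
    "bnx2x": "1.714.2",
}


def find_mismatches(installed: dict, expected: dict = EXPECTED_DRIVERS) -> dict:
    """Map each driver that differs to (expected, installed); missing drivers report None."""
    return {
        name: (version, installed.get(name))
        for name, version in expected.items()
        if installed.get(name) != version
    }


# Example input: one outdated driver and one missing driver (illustrative values).
node_report = dict(EXPECTED_DRIVERS, ixgbe="4.4.0")
del node_report["tg3"]
print(find_mismatches(node_report))
# {'ixgbe': ('4.6.3', '4.4.0'), 'tg3': ('1.137q', None)}
```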


    Below are some images of the Bright Cluster Manager 8.0 BrightView:

    Figure 1: This shows the overview of the Cluster. It displays the total capacity, usage, and job status.


    Figure 2: Displays the cascading view of Cluster configuration and respective settings within a group. The settings can be modified and applied from the console.


    Figure 3: Dell EMC Settings Tab shows the parsed info on hardware configuration and the required BIOS level settings.


    Dell EMC HPC Systems based on the 14th Generation servers expand HPC computation capacity to meet growing demands. They are fully balanced and architected solutions that are validated and verified for customers, and the configurations are scalable. Please stay tuned, as follow-on blogs will cover performance and application studies; these will be posted here:


  • Dell TechCenter

    UEFI Secure Boot with Dell PowerEdge 14G with VMware ESXi 6.5

    Dell EMC PowerEdge 14G servers are the next generation of servers in the Dell EMC server portfolio, and they come with many innovative features. This white paper provides useful information for users who plan to use UEFI Secure Boot with Dell PowerEdge 14G servers.

    UEFI Secure Boot is a technology where the system firmware checks that the system boot loader is signed with a cryptographic key authorized by a database contained in the system firmware. Signature verification then continues in each subsequent stage (boot loader, kernel, and user space), which prevents the execution of unsigned code.
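    The chain of checks described above can be illustrated with a deliberately simplified sketch. Real Secure Boot verifies cryptographic signatures against keys held in the firmware's database; the version below is an assumption made for illustration and only compares SHA-256 digests against an allow-list and a deny-list. All names (`may_execute`, `first_rejected_stage`, the stage images) are hypothetical.

```python
import hashlib


def image_digest(image: bytes) -> str:
    return hashlib.sha256(image).hexdigest()


def may_execute(image: bytes, allowed: set, forbidden: set) -> bool:
    """An image runs only if it is authorized and not explicitly revoked."""
    digest = image_digest(image)
    if digest in forbidden:
        return False
    return digest in allowed


def first_rejected_stage(stages, allowed: set, forbidden: set):
    """Walk the boot chain in order; return the first stage that fails, else None."""
    for name, image in stages:
        if not may_execute(image, allowed, forbidden):
            return name
    return None


# Illustrative boot chain with placeholder image contents.
bootloader, kernel, tampered = b"bootloader-v1", b"kernel-v1", b"kernel-modified"
authorized = {image_digest(bootloader), image_digest(kernel)}

print(first_rejected_stage([("bootloader", bootloader), ("kernel", kernel)],
                           authorized, set()))   # None
print(first_rejected_stage([("bootloader", bootloader), ("kernel", tampered)],
                           authorized, set()))   # kernel
```

    The point of the sketch is the ordering: each stage is checked before it runs, so an unauthorized component stops the chain at that stage.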

    Please refer to the white paper located at Dell TechCenter.

  • Dell TechCenter

    Rollup Health status in iDRAC on the Dell EMC 14th generation

    In a large heterogeneous data center, a management application helps you manage the data center assets by maintaining inventory, periodically collecting health statistics, and providing incident management methods. One of the major functions of one-to-many (1xN) management applications is collecting health statistics from multiple servers in the data center. The management applications use SNMP, WS-Man, or REST APIs to collect data from multiple devices of the server. Commonly monitored devices include sensors, storage devices, power supply units (PSUs), temperature indicators, and cooling fans. iDRAC provides a component-level health status and a cumulative health status called Rollup status. The Rollup status provides an overview of the subsystem and the overall system, indicated by the following infographics:


    While the cumulative health status or aggregation of the individual component Rollup statistics of all the devices of a server is represented as Global Rollup status, in the 14th generation PowerEdge servers, Dell EMC has introduced new methods for health monitoring and reporting. These methods report the individual statuses of devices, their aggregated health, and the reasons for failure. For any health change, iDRAC logs the Lifecycle Controller event and error messages.  For more information about using event and error messages, see the Event and Error Message Reference Guide for Dell EMC PowerEdge Servers.


    The rollup status of a server is derived from the health statuses of its components. The most severe status among the components is assigned as the overall health status of the server. For example, suppose a server has a PSU in Warning state and a fan in Critical state. The Rollup health status of the server then takes the most severe of these states, which is Critical.
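    The "most severe status wins" rule can be expressed in a few lines. This is a minimal sketch under stated assumptions: only the three states named in this post (OK, Warning, Critical) are modeled, the names `Health` and `rollup` are hypothetical, and treating an empty component list as OK is an assumption rather than documented iDRAC behavior.

```python
from enum import IntEnum


class Health(IntEnum):
    """Severity levels ordered so that a larger value is more severe."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def rollup(statuses) -> Health:
    """Return the most severe status; an empty input is treated as OK (assumption)."""
    return max(statuses, default=Health.OK)


# The example from the text: a PSU in Warning state and a fan in Critical state.
components = {"PSU": Health.WARNING, "Fan": Health.CRITICAL, "Storage": Health.OK}
print(rollup(components.values()).name)  # CRITICAL

# Subsystem rollups can be combined the same way to form a global rollup.
subsystems = {
    "Power": rollup([Health.WARNING, Health.OK]),
    "Cooling": rollup([Health.OK, Health.OK]),
}
print(rollup(subsystems.values()).name)  # WARNING
```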


    Global Rollup tree structure 




    In iDRAC that is factory-installed on the 14th generation PowerEdge servers, the GUI displays the rollup health with symbols on the System Summary page.


    Figure 1:  iDRAC System Health overview

  • Dell TechCenter

    DellEMC factory install changes for VMware ESXi

    This blog post was written by the Dell Hypervisor Engineering team.

    DellEMC PowerEdge 14G servers factory-installed with VMware ESXi (from June 2017 until June 2018) have the root password set to the server's Service Tag. To locate the Service Tag, refer to the <Locating Service Tag of your system> section of the Getting Started Guide.

    From June 2018 onwards, DellEMC PowerEdge 14G servers factory-installed with VMware ESXi have the root password set to the server's Service Tag followed by an exclamation mark. Refer to "Custom DCUI screen for DellEMC customized VMware ESXi", which details the custom DCUI screen Dell EMC introduced to announce this change. The password changed because of the revised password complexity policies in vSphere 6.7. Refer to the vSphere Security document for details of the password complexity requirements.


    • Dell EMC PowerEdge 13G servers factory installed with VMware ESXi continue to have no password set for root user.
    • Dell EMC PowerEdge 14G servers shipped from the factory with VMware ESXi installed on a BOSS-S1 device will not have a VMFS datastore enabled by default. From the VMware ESXi point of view, Dell EMC recommends using the BOSS-S1 device only as the OS boot device and as the "vSphere ESXi Logging device". Refer to VMware KB 2145210. Refer to the white paper to know more about the BOSS-S1 device.
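    The factory-default rules above can be summarized as a small lookup. This is only an illustrative sketch: the function name, the generation labels, and the sample Service Tag are hypothetical, and the June 2018 boundary is passed in as a flag because the post does not give an exact cutover date.

```python
def factory_default_root_password(service_tag: str, generation: str,
                                  shipped_june_2018_or_later: bool) -> str:
    """Return the factory-default ESXi root password described in this post."""
    if generation == "13G":
        return ""  # 13G servers ship with no root password set
    if generation == "14G":
        if shipped_june_2018_or_later:
            return service_tag + "!"  # Service Tag followed by an exclamation mark
        return service_tag
    raise ValueError(f"generation not covered by this post: {generation}")


# "ABC1234" is a made-up Service Tag used purely for illustration.
print(factory_default_root_password("ABC1234", "14G", False))  # ABC1234
print(factory_default_root_password("ABC1234", "14G", True))   # ABC1234!
```

    These are initial defaults only; changing the root password after first login remains the administrator's responsibility.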
  • vWorkspace - Blog

    What's new for vWorkspace - June 2017

    Updated monthly, this publication provides you with new and recently revised information and is organized in the following categories: Documentation, Notifications, Patches, Product Life Cycle, Release, Knowledge Base Articles.



    Knowledgebase Articles


    230053 - After making changes to vWorkspace server template, the system stuck in shutting down phase

    After some modification to the Gold Image, there might be situation where there might be corruption. Would this impact the Production...?

    Created: June 2, 2017


    230206 - Are MS XML and Java 7 compatible with vWorkspace?

    Are MS XML 4.0 and Java 7 compatible with vWorkspace? Is there any potential conflicts?

    Created: June 8, 2017



    105537 - Remote Control Session RDP 8.0: This function is not supported on this System (120)

    Revised: June 1, 2017


    53244 - VWorkspace console may show VDI Computers Initializing or Re-Initializing periodically.

    Looking in the vWorkspace Management Console may show VDI Computers initializing or re-initializing periodically.

    Revised: June 2, 2017


    50674 - How to create a Published Application?

    Revised: June 5, 2017


    70481 - HOW TO: Secure Access Certificate Configuration

    When the Secure Access Certificate is incorrectly configured, any of the below symptoms may be experienced: When trying to import a Certificate...

    Revised: June 6, 2017


    224064 - Quest Data Collector service crashing and will not restart on a session host

    The Quest Data Collector service (pndcsvc.exe) has crashed and will not restart on a session host. Users are not able to connect.

    Revised: June 13, 2017


    Product Life Cycle - vWorkspace

    Revised: June 2017

  • Dell TechCenter

    Dell EMC OpenManage HPE Operations Manager i (OMi) Operations Connector V1.0 now available!


    We are excited to bring you the latest addition to OpenManage Connection portfolio with Dell EMC OpenManage HPE Operations Manager i (OMi) Operations Connector.

    The new Dell EMC OpenManage HPE OMi Operations Connector Version 1.0 provides a relatively easy way for monitoring Dell EMC devices in a heterogeneous environment monitored by HPE OMi. So, you will not only protect your IT organization’s existing investment in HPE OMi but also get the single, consistent and holistic view of your Dell EMC infrastructure directly from HPE OMi – talk about “Single-pane-of-glass”.  

    The OpenManage HPE OMi Operations Connector integrates Dell EMC OpenManage Essentials (OME) with HPE OMi. The Operations Connector for OME periodically collects systems management data about Events and Topology from OME and transfers it to HPE OMi, helping system administrators and business users monitor their heterogeneous infrastructure and business services closely. You can also launch the OME web console directly from the event perspective in HPE OMi to perform further troubleshooting, configuration, upgrade and other lifecycle management activities.

    Key Features

    Feature: Topology Collection and Forwarding
    Description: Policy-based automatic and periodic collection and forwarding of Topology information of all the Dell EMC devices from OME into HPE OMi
    Customer Benefits: Single-Pane-of-Glass: Single, consistent and holistic view of Dell EMC infrastructure in environments monitored by HPE OMi

    Feature: Device Relationship hierarchy
    Description: Creating the relationship between the topology devices and their group which is exactly similar to OME device hierarchy
    Customer Benefits: Uniform view of data center infrastructure across multiple consoles

    Feature: Event Collections and Forwarding
    Description: Policy-based automatic and periodic collection and forwarding of Event / Alert information of all the Dell EMC devices from OME into HPE OMi
    Customer Benefits: Increased effectiveness with Bottom-up visibility into issues and problems affecting the infrastructure that in turn impact critical business services

    Feature: Web URL Console Launch
    Description: Link and launch for OME web console directly from a node or event perspective in HPE OMi
    Customer Benefits: Reduced effort and time to remediate the events and resolve problems by launching OME directly from within HPE OMi

    Feature: Notification of node count
    Description: Notifying the user about the number of nodes being collected and forwarded by Operations Connector from OME into HPE OMi
    Customer Benefits: Keep track of the Dell EMC infrastructure managed by OME and efficiently manage the licenses – buy as per your need

    Once again we’re excited to bring this latest release to you. For more information, download links, and product documentation please visit the Dell Tech Center Wiki for OpenManage HPE OMi Operations Connector.