Our community is talking about the new Dell Technologies. Join the discussion in the Dell EMC Community Network:
Now updated quarterly, this publication provides you with new and recently revised information, organized into the following categories: Documentation, Notifications, Patches, Product Life Cycle, Release, and Knowledge Base Articles.
Subscribe to the RSS (Use IE only)
254153 - Mandatory Hotfix 655084 for 8.6 MR3 Connection Broker
This is a Mandatory hotfix and can be installed on the following vWorkspace roles: Connection Broker Please see the full Release Notes...
Created: April 16, 2018
254155 - Mandatory Hotfix 655082 for 8.6 MR3 Management Console
This is a Mandatory hotfix and can be installed on the following vWorkspace roles: Management Console Please see the full Release Notes...
254156 - Mandatory Cumulative Hotfix 655081 for Web Access
This is a Mandatory hotfix and can be installed on the following vWorkspace roles: Web Access Please see the full Release Notes...
254158 - Mandatory Cumulative Hotfix 655085 for 8.6 MR3 Windows connector
This is a Mandatory hotfix and can be installed on the following vWorkspace roles: Windows Connector Please see the full Release Notes...
254164 - Mandatory Hotfix 655086 for PNTools / RDSH role
This is a Mandatory hotfix and can be installed on the following vWorkspace roles: Remote Desktop Session Host (RDSH) PNTools (VDI...
254191 - Mandatory Hotfix 655071 for 8.6 MR3 Android Connector
This mandatory hotfix addresses the following issues: Two-Factor Authentication stabilization BYOD stabilization Password Manager...
Created: April 17, 2018
255135 - Mandatory Hotfix 655127 for 8.6 MR3 Mac connector
This cumulative mandatory hotfix addresses the following issues: The macOS Connector might stop working if the macOS device is an Active...
Created: May 4, 2018
255201 - Mandatory Hotfix 655128 for 8.6 MR3 iOS Connector
This mandatory hotfix addresses the following issues: Fixes and improvements of the BYOD feature Fixes and improvements of the...
Created: May 7, 2018
255659 - An unexpected error has occurred in the program. Failed to activate control VB.UserControl
When launching the vWorkspace Management console you receive the following error: An unexpected error has occurred in the program. Failed to...
Created: May 17, 2018
178358 - Windows 10 Support
Microsoft Windows 10 is supported as of vWorkspace 8.6.2; support for the Creators Update (1703) was added with vWorkspace 8.6.3
Revised: April 2, 2018
225565 - Is VMware 6.7 currently supported
Is VMware vSphere 6.7 supported in any of the current versions of vWorkspace?
Revised: April 18, 2018
255659 - An unexpected error has occurred in the program. Failed to activate control VB.UserControl
Revised: June 18, 2018
58105 - Troubleshooting connectivity issues with vWorkspace and VDI/TS systems.
Various errors can occur if communications or connectivity issues exist between the vWorkspace Connection Broker(s) and the Virtual Desktop...
Revised: June 25, 2018
56803 - How to Enable Connection Broker Logging
Steps to enable Connection Broker logging in vWorkspace.
Revised: June 29, 2018
88861 - How to repair or fully rebuild Windows WMI Repository
How to repair or fully rebuild Windows WMI Repository
105373 - Video: How to enable the Diagnostics Tool
How to use the diagnostic tool in vWorkspace 8.x.
Product Life Cycle - vWorkspace
Although this is the most common question, it is hard to answer because the amount of data needed depends mainly on how complex the learning concept is. In Machine Learning (ML), learning complexity can be broken down into informational and computational complexity. Informational complexity in turn covers two aspects: how many training examples are needed (sample complexity) and how fast a learner/model’s estimate can converge to the true population parameters (rate of convergence). Computational complexity refers to the types of algorithms and the computational resources needed to extract the learner/model’s prediction within a reasonable time. As you can guess by now, this blog will cover informational complexity to answer the question.
Let’s try to learn what a banana is. In this example, banana is the learning concept (one hypothesis, that is, ‘to be’ or ‘not to be’ a banana), and the various descriptions associated with a banana can be features such as color and shape. Unlike humans, who do not require non-banana information to classify a banana, a typical machine learning algorithm requires counter-examples. Although One Class Classification (OCC) has been widely used for outlier and anomaly detection, it is a harder problem than conventional binary/multi-class classification.
Let’s place another concept, ‘apple’, into this example and turn the exercise into a binary-class classification. By doing this, we just made the learning concept simpler: ‘to be banana = not apple’ and ‘not to be banana = apple’. It is a little counter-intuitive that adding an additional learning concept to a model makes the model simpler; however, OCC essentially means one class versus all others, and the number of all other cases is pretty much infinite. This is where we are in terms of ML: one of the simplest learning activities for a human is among the most difficult problems to solve in ML. Before generating some data for banana, we need to define some terms.
Ideally, we want the training sample set S to contain instances covering all the possible combinations of features with respect to t, as you can see in Table 1. There are three possible values for f1 and two possible values for f2, and there are two classes in this example. Therefore, the number of all possible instances is |X| = |f1| x |f2| x |C| = 3 x 2 x 2 = 12. However, f2 is a lucky feature[iii] that is mutually exclusive between banana and apple, so |f2| counts as 1 in this case. In addition, we can subtract one case because there is no red banana. For this example, only 5 instances exhaust the entire sample space H. In general, the required number of training samples (|S| = n, where n is the number of unique samples in the set) grows exponentially with the number of features (columns) in a data set. If we assume that all features are binary, with simple yes/no values, then |X| = 2 x 2 x 2 = 2^3: two to the power of the number of columns is the minimum n in the simplest case. This reasoning only works when all feature values are discrete. If we use gradient color values for Color (the RGB 256 color palette ranges from 0 to 16777215 in decimal), the required number of training samples increases dramatically, because now you need to multiply by 16777216 for f1 if all possible colors exist in H.
It is worth noting that the number of instances we calculated here does not always guarantee that a learner/model can converge properly. If the amount of data you have is at or below this number, it is simply too small for most algorithms, except for ML algorithms that evaluate one feature at a time, such as a decision tree. As a rough rule of thumb, many statisticians say that a sample size of 30 is large enough; this rule can be applied to a regression-based ML algorithm that assumes one smooth linear decision boundary. Although the optimal n differs on a case-by-case basis, it is not a bad idea to start from a total number of samples of N = |X| x 30.
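The counting above can be sketched in a few lines of code. This is a minimal illustration of the arithmetic only; the function name and structure are ours, not from the post.

```python
# A minimal sketch of the sample-space counting described above.
def sample_space_size(feature_cardinalities, n_classes):
    """|X| = |f1| * |f2| * ... * |C|: product of feature cardinalities
    and the number of classes."""
    size = n_classes
    for cardinality in feature_cardinalities:
        size *= cardinality
    return size

# Banana/apple example: 3 colors x 2 shapes x 2 classes = 12 possible instances.
full_space = sample_space_size([3, 2], 2)

# Treating the mutually exclusive shape feature as |f2| = 1 and dropping the
# impossible red-banana case leaves the 5 exhaustive instances of Table 1.
exhaustive = sample_space_size([3, 1], 2) - 1

# Rough starting point from the rule of thumb above: N = |X| * 30.
n_start = full_space * 30
```

With three binary features, the same function gives `sample_space_size([2, 2, 2], 1) == 8`, i.e. the 2^k growth discussed above.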
Table 1 Training sample set S
In ML, a learning curve is a plot of classification accuracy against training set size. This is not an estimation method; it requires building a classification model multiple times with different sizes of the training data set. It is a good sanity-check technique for a model (underfitting indicates high bias; overfitting indicates high variance), and it can also be used to optimize and improve performance.
Back in the day, not a single ML paper was accepted without a learning curve; without this simple plot, an entire performance claim would be unverifiable.
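The learning-curve procedure described above can be sketched as follows. This is a self-contained toy, assuming synthetic two-blob data and a trivial nearest-centroid classifier of our own; it is not the method from any particular paper.

```python
import numpy as np

# Learning-curve sketch: train on growing subsets of the training data and
# record held-out accuracy at each size. All data and names here are ours.
rng = np.random.default_rng(0)

def make_data(n):
    # Two Gaussian blobs in 2D, one per class, shuffled together.
    x0 = rng.normal(loc=-2.0, size=(n // 2, 2))
    x1 = rng.normal(loc=+2.0, size=(n - n // 2, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * (n // 2) + [1] * (n - n // 2))
    idx = rng.permutation(n)
    return X[idx], y[idx]

def nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te):
    # Classify each test point by the nearer class centroid.
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    pred = (np.linalg.norm(X_te - c1, axis=1) <
            np.linalg.norm(X_te - c0, axis=1)).astype(int)
    return float((pred == y_te).mean())

X_train, y_train = make_data(400)
X_test, y_test = make_data(200)

sizes = [10, 50, 100, 200, 400]
curve = [nearest_centroid_accuracy(X_train[:n], y_train[:n], X_test, y_test)
         for n in sizes]
```

Plotting `sizes` against `curve` gives the learning curve; accuracy should stabilize as the training size grows, and the gap between training and test accuracy is what reveals high variance (overfitting) or high bias (underfitting).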
Internal web page
External web page
Contacts Americas Kihoon Yoon Sr. Principal Systems Dev Eng Kihoon.Yoon@dell.com +1 512 728 4191
[i] Or attributes. A feature is an individual measurable property or characteristic of a phenomenon being observed.
[ii] An instance is also referred to as an example or sample.
[iii] If a feature like f2 exists in a data set, we could make 100% accurate predictions simply by looking at f2.
[iv] Is there a yellow apple? Yes, Golden Delicious is yellow.
This blog post is written by Vijay Kumar from Hypervisor Engineering, Dell EMC.
VMware vSphere 6.7 is the next major release after the vSphere 6.5 Update 1 update release. vSphere 6.7 was released by VMware on April 17, 2018. It is significant for the number of bug fixes contained in the software as well as a number of feature updates. vSphere 6.7 is supported on Dell EMC's 13th and 14th generations of PowerEdge servers.
This version of ESXi is now factory installed and shipped from Dell. Customers who choose to buy Dell EMC's 14th generation of servers can opt for factory installation on either the BOSS-S1 (Boot Optimized Storage Solution) M.2 drives or the dual IDSDM SD cards. The Dell-customized version of ESXi 6.7 is posted on the Dell support page.
VMware ESXi 6.7 New Features
Refer to VMware's white paper on what's new in vSphere 6.7. Some of the highlights are below.
Dell 6.7 documents are posted on the Dell support page.
Yet another amazing DockerCon!
I attended DockerCon 2018 last week in San Francisco, the most geographically blessed city of Northern California, at one of the largest convention and exhibition complexes, the Moscone Center. With over 5,000 attendees from around the globe, 100+ sponsors, hallway tracks, workshops, and hands-on labs, DockerCon brought developers, sysadmins, product managers, and industry evangelists together to share their wealth of experience around container technology. This time I was lucky enough to get a chance to visit Docker HQ on Townsend Street for the first time. It was an emotional as well as a proud feeling to be part of such a vibrant community's home.
This DockerCon brought a couple of exciting announcements. Three of the new features were targeted at Docker EE, while two were for Docker Desktop. Here's a rundown of what I think are the five most exciting announcements made last week:
In this blog post, I will go through each of the announcements in detail.
With an estimated 85% of today’s enterprise IT organizations employing a multi-cloud strategy, it has become ever more critical that customers have a ‘single pane of glass’ for managing their entire application portfolio. Most enterprise organisations have a hybrid and multi-cloud strategy. Containers have helped make applications portable, but let us accept the fact that even though containers are portable today, the management of containers is still a nightmare. The reason being –
This time Docker introduced new application management capabilities for Docker Enterprise Edition that will allow organisations to federate applications across Docker Enterprise Edition environments deployed on-premises and in the cloud, as well as across cloud-hosted Kubernetes. This includes Azure Kubernetes Service (AKS), AWS Elastic Container Service for Kubernetes (EKS), and Google Kubernetes Engine (GKE). The federated application management feature will automate the management and security of container applications on premises and across Kubernetes-based cloud services. It will provide a single management platform so that enterprises can centrally control and secure the software supply chain for all their containerized applications.
With this announcement, undoubtedly Docker Enterprise Edition is the only enterprise-ready container platform that can deliver federated application management with a secure supply chain. Not only does Docker give you your choice of Linux distribution or Windows Server, the choice of running in a virtual machine or on bare metal, running traditional or microservices applications with either Swarm or Kubernetes orchestration, it also gives you the flexibility to choose the right cloud for your needs.
If you want to read more about it, please refer to the official blog.
The partnership between Docker and Microsoft is not new. They have been working together since 2014 to bring containers to Windows and .NET applications. This DockerCon, Docker & Microsoft both shared the next step in this partnership with the preview and demonstration of Kubernetes support on Windows Server with Docker Enterprise Edition.
With this announcement, Docker is the only platform to support production-grade Linux and Windows containers, as well as dual orchestration options with Swarm and Kubernetes.
There has been a rapid rise of Windows containers as organizations recognize the benefits of containerisation and want to apply them across their entire application portfolio and not just their Linux-based applications.
Docker and Microsoft brought container technology into Windows Server 2016, ensuring consistency for the same Docker Compose file and CLI commands across both Linux and Windows. Windows Server ships with a Docker Enterprise Edition engine, meaning all Windows containers today are based on Docker. Recognizing that most enterprise organizations have both Windows and Linux applications in their environment, we followed that up in 2017 with the ability to manage mixed Windows and Linux clusters in the same Docker Enterprise Edition environment, enabling support for hybrid applications and driving higher efficiencies and lower overhead for organizations. Using Swarm orchestration, operations teams could support different application teams with secure isolation between them, while also allowing Windows and Linux containers to communicate over a common overlay network.
If you want to know further details, refer to the official blog.
DockerCon 2018 was NOT just for enterprise customers, but also for developers. Talking about the new capabilities for Docker Desktop: it is getting new template-based workflows which will enable developers to build new containerized applications without having to learn Docker commands or write Dockerfiles. These template-based workflows will also help development teams share their own practices within the organisation.
On the first day of DockerCon, the Docker team previewed an upcoming Docker Desktop feature that will make it easier than ever to design your own container-based applications. For a certain set of developers, the current iteration of Docker Desktop has everything one might need to containerize an application, but it does require an understanding of the Dockerfile and Compose file specifications to get started, and of the Docker CLI to build and run your applications.
In the upcoming Docker Desktop release, you can expect the following features –
If you’re interested in getting early access to the new app design feature in Docker Desktop then please sign up at beta.docker.com.
Soon after DockerCon, one of the most promising tools announced for developers was Docker Application Packages (docker-app). The “docker-app” is an experimental utility that helps make Compose files more reusable and shareable.
Compose files do a great job of describing a set of related services. Not only are Compose files easy to write, they are generally easy to read as well. However, a couple of problems often emerge:
Fundamentally, Compose files are not easy to share between concerns. Docker Application Packages aim to solve these problems and make Compose more useful for development and production.
In my next blog post, I will talk more about this tool. If you want to try your hands, head over to https://github.com/docker/app
Recently, the Function as a Service (FaaS) programming paradigm has gained a lot of traction in the cloud community. At first, only large cloud providers offered such services with a pay-per-invocation model, for example AWS Lambda, Google Cloud Functions, or Azure Functions, but since then interest has grown among developers and enterprises in building their own solutions on an open source model.
This DockerCon, Docker identified at least 9 different frameworks, of which the following six were already confirmed to be supported on the upcoming Docker Enterprise Edition: OpenFaaS, nuclio, Gestalt, Riff, Fn, and OpenWhisk. Docker, Inc. started an open source repository to document how to install all these frameworks on Kubernetes on Docker EE, with the goal of providing a benchmark of these frameworks: the docker serverless benchmark GitHub repository. Pull requests are welcome to document how to install other serverless frameworks on Docker EE.
Did you find this blog helpful? I am really excited about the upcoming Docker days and feel that these upcoming features will really excite the community. If you have any questions, join me this July 7th at the Docker Bangalore Meetup Group, Nutanix office, where I am going to go deeper into the DockerCon 2018 announcements. See you there.
GPUs are useful for accelerating large matrix operations, analytics, deep learning workloads, and several other use cases. NVIDIA introduced the Pascal line of its Tesla GPUs in 2016 and the Volta line in 2017, and recently announced its latest Tesla GPU based on the Volta architecture with 32GB of GPU memory. The V100 GPU is available in both PCIe and NVLink versions, allowing GPU-to-GPU communication over PCIe or over NVLink; the NVLink version of the GPU is also called an SXM2 module.
This blog will give an introduction to the new Volta V100-32GB GPUs and compare the HPL performance between different V100 models. Tests were performed using a Dell EMC PowerEdge C4140 with both PCIe and SXM2 configurations. There are several other platforms which support GPUs: PowerEdge R740, PowerEdge R740XD, PowerEdge R840, and PowerEdge R940xa. A similar study was conducted in the past comparing the performance of the P100 and V100 GPUs with the HPL, HPCG, AMBER, and LAMMPS applications.
Table 1 below provides an overview of Volta device specifications.
The PowerEdge C4140 Server is an accelerator-optimized server with support for two Intel Xeon Scalable processors and four NVIDIA Tesla GPUs (PCIe or NVLink) in a 1U form factor. The PCIe version of the GPUs is supported with standard PCIe Gen3 connections between GPU and CPU. The NVLink configuration allows GPU-to-GPU communication over the NVLink interconnect. Applications that can take advantage of the higher NVLink bandwidth and the higher clock rate of the V100-SXM2 module can benefit from this option. The PowerEdge C4140 platform is available in four different configurations: B, C, K, and G. The configurations are distinct in their PCIe lane layout and NVLink capability and are shown in Figure 1 through Figure 4.
In Configuration B, GPU-to-GPU communication is through a PCIe switch, and the PCIe switch is connected to a single CPU. In Configurations C and G, two GPUs are connected to each CPU; however, in Configuration C the two GPUs are directly connected to each CPU, whereas in Configuration G the GPUs are connected to the CPU via a PCIe switch. The PCIe switch in Configuration G is logically divided into two virtual switches, mapping two GPUs to each CPU. In Configuration K, GPU-to-GPU communication is over NVLink, with all GPUs connected to a single CPU. As seen in the figures below, all the configurations have additional x16 slots available apart from the GPU slots.
Figure 1: PowerEdge C4140 Configuration B Figure 2: PowerEdge C4140 Configuration C
Figure 3: PowerEdge C4140 Configuration G Figure 4: PowerEdge C4140 Configuration K
The PowerEdge C4140 platform can support a variety of Intel Xeon CPU models, up to 1.5 TB of memory with 24 DIMM slots, multiple network adapters and provides several local storage options. For more information on this server click here.
To evaluate the performance difference between the V100-16GB and the V100-32GB GPUs, a series of tests were conducted. These tests were run on a single PowerEdge C4140 server with the configurations detailed below in Tables 2-4.
Table 2: Tested Configurations Details
Table 3: Hardware Configuration
Table 4: Software/Firmware Configuration:
High Performance Linpack (HPL) is a standard HPC benchmark used to measure computing power. It is also used as a reference benchmark by the Top500 list to rank supercomputers worldwide. This benchmark provides a measurement of the peak computational performance of the entire system. There are a few parameters that are significant in this benchmark:
The resultant performance of HPL is reported in GFLOPS.
N is the problem size provided as input to the benchmark and determines the size of the dense linear matrix that is solved by HPL. HPL performance tends to increase with increasing N (problem size) until limits of system memory, CPU, or data communication bandwidth begin to limit the performance. For GPU systems, the highest HPL performance commonly occurs when the problem size is close to the size of the GPUs' memory, and performance is higher when a larger problem size fits in that memory.
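The relationship between GPU memory and the largest usable N can be sketched with simple arithmetic: a double-precision HPL matrix of order N occupies N x N x 8 bytes, so the largest N that fits is roughly sqrt(mem_bytes / 8). The helper below is our own back-of-the-envelope sizing sketch, not part of the benchmark, and the 0.9 fill fraction is an assumed safety margin.

```python
import math

# Rough HPL problem-size estimate from available memory (double precision).
def hpl_problem_size(total_mem_gib, fill_fraction=0.9):
    # An N x N matrix of 8-byte doubles takes N*N*8 bytes.
    mem_bytes = total_mem_gib * (1024 ** 3) * fill_fraction
    return int(math.sqrt(mem_bytes / 8))

n_16gb = hpl_problem_size(4 * 16)  # four V100-16GB GPUs per server
n_32gb = hpl_problem_size(4 * 32)  # four V100-32GB GPUs per server
```

Doubling GPU memory allows an N about sqrt(2) (~41%) larger, which is why the 32GB cards can keep scaling to problem sizes where the 16GB cards must spill to host memory.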
In this section of the blog, the HPL performance of the NVIDIA V100-16GB and the V100-32GB GPUs is compared using PowerEdge C4140 configurations B and K (refer to Table 2). Recall that configuration B uses PCIe V100s with a 250W power limit and configuration K uses SXM2 V100s with higher clocks and a 300W power limit. Figure 5 shows the maximum performance that can be achieved with the different configurations. We measured a 14% improvement when running HPL on the V100-32GB with PCIe versus the V100-16GB with PCIe, and a 16% improvement between the V100-16GB SXM2 and the V100-32GB SXM2. The size of the GPU memory makes a big difference in performance, as a larger-memory GPU can accommodate a larger problem size, a larger N.
As seen in Table 1, the V100-16GB and V100-32GB PCIe cards, and the V100-16GB and V100-32GB SXM2 modules, have the same number of cores, double precision performance, and GPU bandwidth; they differ only in GPU memory. We also measured a ~6% HPL performance improvement from PCIe to SXM2 GPUs, which is a small delta for HPL, but deep learning frameworks like TensorFlow and Caffe show a much larger performance improvement.
Running HPL using only CPUs yields ~2.3 TFLOPS with the Xeon Gold 6148; therefore, one PowerEdge C4140 system with four GPUs provides floating point capability equal to about nine two-socket Intel Xeon 6148 servers.
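A quick arithmetic check of that comparison, using only the figures stated in the text:

```python
# "About nine two-socket servers" worth of the CPU-only HPL result.
cpu_only_tflops = 2.3                 # dual Xeon Gold 6148, CPU-only HPL
equivalent_servers = 9
c4140_estimate = cpu_only_tflops * equivalent_servers  # roughly 20.7 TFLOPS
```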
Figure 5: HPL Performance on different C4140 configurations.
Figures 6 and 7 show the performance of the V100 16GB vs 32GB GPUs for different values of N. Table 2 shows the configurations used for this test. These graphs help us visualize how the GPU cards perform with different problem sizes. As explained above, the problem size is calculated based on the size of the GPU memory; the 32GB GPU can accommodate a larger problem size than the 16GB GPU. When a problem size larger than what fits in GPU memory is executed on a GPU system, the system memory attached to the CPU is used, and this leads to a drop in performance as data must move between system memory and GPU memory. For ease of understanding, the test data is split into two different graphs.
Figure 6: HPL performance with different problem sizes (N)
In Figure 6 we notice that HPL performance for both cards is similar until the problem size (N) approximately fills the V100-16GB memory; the same problem size fills only about half the memory of the V100-32GB GPUs. In the second graph, in Figure 7, we notice that the performance of the V100 16GB GPU drops because it cannot fit larger problem sizes in GPU memory and must start to use system host memory. The 32GB GPU continues to perform better with larger and larger N until the problem size reaches the maximum capacity of the V100 32GB memory.
Figure 7: HPL performance with different problem sizes (N)
Conclusion and Future work:
The PowerEdge C4140 is one of the most prominent GPU-based server options for HPC solutions. We measured a 14-16% improvement in HPL performance when moving from the smaller-memory V100-16GB GPU to the larger-memory V100-32GB GPU. For memory-bound applications, the new Volta 32GB cards would be the preferred option.
For future work, we will run molecular dynamics applications and deep learning workloads and compare the performance of different Volta cards and C4140 configurations.
Please contact the HPC Innovation Lab if you’d like to evaluate the performance of your application on PowerEdge servers.
The Stampede2 system is the result of collaboration between the Texas Advanced Computing Center (TACC), Dell EMC, and Intel. Stampede2 consists of 1,736 Dell EMC PowerEdge C6420 nodes with dual-socket Intel Skylake processors and 4,204 Dell EMC PowerEdge C6320p nodes with bootable Intel Knights Landing processors, for a total of 5,940 compute nodes, plus 24 additional login and management servers and Dell EMC Networking H-series switches, all interconnected by an Intel Omni-Path Architecture (OPA) fabric.
Two technical white papers were recently published through the joint efforts of TACC, Dell EMC, and Intel. One white paper describes the network integration and testing best practices on the Stampede2 cluster. The other discusses the application performance of Intel Skylake and Intel Knights Landing processors on Stampede2 and highlights the significant performance advantage of the Intel Skylake processor at multi-node scale in four commonly used applications: NAMD, LAMMPS, Gromacs, and WRF. For build details, please contact your Dell EMC representative. If you have a VASP license, we are happy to share VASP benchmark results as well.
Deploying Intel Omni-Path Architecture Fabric in Stampede2 at the Texas Advanced Computing Center–Network Integration and Testing Best Practices (H17245)
Application Performance of Intel Skylake and Intel Knights Landing Processors on Stampede2 (H17212)
This blog is written by the Dell Hypervisor Engineering team.
Persistent Memory (also known as Non-Volatile Memory, NVM) is a type of random access memory that retains its contents even when system power goes down, whether due to an unexpected power loss, a user-initiated shutdown, or a system crash. Dell EMC introduced support for NVDIMM-N in its 14th generation of PowerEdge servers, and VMware announced support for NVDIMM-N from vSphere ESXi 6.7 onwards. The NVDIMM-N resides in a standard CPU memory slot, placing data closer to the processor to reduce latency and maximize performance. This document details the support stance for NVDIMM-N and VMware ESXi on Dell EMC PowerEdge servers, and provides insight into the use cases where NVDIMM is involved and their behavior caveats.
Dell EMC support for Persistent Memory (PMem) and VMware ESXi
This blog helps to explain why the transition happened from 512-byte sector disks to 4096-byte sector disks, and why a 4096-byte (4K) sector disk should be chosen for OS installation. The blog first explains the sector layout needed to understand the migration, then gives the reasoning behind the migration, and finally covers the benefits of a 4K sector drive over a 512-byte sector drive.
A sector is the minimum storage unit of a hard disk drive; it is a subdivision of a track. The sector size is an important factor in operating system design because it represents the atomic unit of I/O operations on a hard disk drive. In Linux, you can check the sector size of a disk using the "fdisk -l" command.
Figure-1: The disk sector size in Linux
As shown in Figure-1, both the logical and physical sectors are 512 bytes long on this Linux system.
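The same numbers that `fdisk -l` prints are also exposed per device in sysfs on Linux. The snippet below is a hedged sketch: it reads the two attribute files from a queue directory such as /sys/block/sda/queue, where the device name is system-specific and the function name is our own.

```python
import os

# Read the logical and physical block sizes from a block device's sysfs
# queue directory, e.g. /sys/block/sda/queue (device name varies per system).
def read_sector_sizes(queue_dir):
    with open(os.path.join(queue_dir, "logical_block_size")) as f:
        logical = int(f.read())
    with open(os.path.join(queue_dir, "physical_block_size")) as f:
        physical = int(f.read())
    return logical, physical
```

On a 512e drive this would return (512, 4096): 512-byte logical sectors emulated on top of 4096-byte physical sectors, while a legacy drive reports (512, 512).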
The sector layout is structured as follows:
1) Gap section: Each sector on a drive is separated by a gap section.
2) Sync section: It indicates the beginning of the sector.
3) Address Mark section: It contains information identifying the sector, e.g. the sector's number and location.
4) Data section: It contains the actual user data.
5) ECC section: It contains error correction codes that are used to repair and recover data that might be damaged during the disk read/write process.
Each sector stores a fixed amount of user data, traditionally 512 bytes for hard disk drives. But to achieve better data integrity at higher densities and more robust error correction, newer HDDs now store 4096 bytes (4K) in each sector.
Need for large sector
The number of bits stored on a given length of track is termed areal density. Increasing areal density is a trend in the disk drive industry, not only because it allows greater volumes of data to be stored in the same physical space, but also because it improves the transfer speed at which the medium can operate. As areal density increases, each sector consumes a smaller and smaller amount of space on the hard drive surface. This creates a problem: the physical size of the sectors has shrunk, but media defects have not. If the data in a sector occupies a smaller area, error correction becomes challenging, because a media defect of the same size damages a higher percentage of the data in a disk with small sectors than in a disk with large sectors.
There are two approaches to solving this problem. The first is to invest more disk space in ECC bytes to assure continued data reliability. But investing more disk space in ECC bytes leads to lower disk format efficiency, where format efficiency is defined as (number of user data bytes x 100) / total number of bytes on disk. Another disadvantage is that the more ECC bits are included, the more processing power the disk controller requires to process the ECC algorithm.
The second approach is to increase the size of the data block and only slightly increase the ECC bytes for each block. As the data block size increases, the overhead each sector requires for control information (gap, sync, address mark sections, and so on) shrinks relative to the data. The ECC bytes per sector increase, but the overall ECC bytes required for the disk decrease because of the larger sectors. Reducing the overall amount of space used for error correction code improves format efficiency, and the increased ECC bytes per sector make it possible to use more efficient and powerful error-correction algorithms. Thus, the transition to a larger sector size has two benefits: improved reliability and greater disk capacity.
Why 4K only?
From a throughput perspective, the ideal block size should be roughly equal to the characteristic size of a typical data transaction. We have to acknowledge that the average file size today is well over 512 bytes; applications in modern systems use data in blocks much larger than the traditional 512-byte sector size. Block sizes that are too small cause too much transaction overhead, while overly large block sizes force each transaction to transfer a large amount of unnecessary data.
The size of a standard transaction in relational database systems is 4K. The consensus in the hard disk drive industry has been that a physical block size of 4K provides a good compromise. It also corresponds to the paging size used by operating systems and processors.
Figure-2: 512 bytes block vs 4096 bytes block
Figure-3: Format Efficiency improvement in 4K disk
Table-1: Format Efficiency improvement in 4K disk
As we see in Figure-2, 4K sectors are eight times as large as traditional 512-byte ones. Hence, for the same data payload, one needs eight times fewer gap, sync, and address mark sections and four times less error correction code. Reducing the amount of space used for error correction code and other non-data sections improves format efficiency for the 4K format. The format efficiency improvement is shown in Figure-3 and Table-1: there is a gain of 8.6% in format efficiency for a 4K sector disk over a 512-byte sector disk.
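The efficiency comparison can be reconstructed with the format-efficiency definition given earlier. The per-sector overhead figures below (about 15 bytes of gap/sync/address mark plus 50 or 100 bytes of ECC) are commonly cited industry values assumed for illustration, not numbers taken from Table-1, so the result is approximate.

```python
# Format efficiency = user data bytes * 100 / total bytes, per the definition
# above. Overhead figures here are assumed, commonly cited values.
def format_efficiency(data_bytes, overhead_bytes):
    return 100.0 * data_bytes / (data_bytes + overhead_bytes)

eff_512 = format_efficiency(512, 15 + 50)    # 512-byte sector, 50-byte ECC
eff_4k = format_efficiency(4096, 15 + 100)   # 4K sector, 100-byte ECC
gain = eff_4k - eff_512
```

With these assumed overheads, the 512-byte format comes out near 89% and the 4K format near 97%, a gain of roughly 8.5 percentage points, in line with the ~8.6% reported in Table-1.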
Figure-4: Effect of media defect on disk density
As shown in Figure-4, a media defect affects a disk with higher areal density more than one with lower areal density. As areal density increases, more ECC bytes are needed to retain the same level of error correction capability. The 4K format provides enough space to expand the ECC field from 50 to 100 bytes and accommodate new ECC algorithms. The enhanced ECC coverage improves the ability to detect and correct data errors beyond the 50-byte defect length associated with the 512-byte sector format.
4K drive Support on OS & Dell PowerEdge Servers
4K data disks are supported on Windows Server 2012, but 4K boot disks are supported only in UEFI mode. On Linux, 4K hard drives require a minimum of RHEL 6.1 or SLES 11 SP2, 4K boot drives are likewise supported only in UEFI mode, and kernel support for 4K drives is available in kernel versions 2.6.31 and above. PERC H330, H730, H730P, H830, FD33xS, and FD33xD cards support 4K block size disk drives, which enables you to use the storage space efficiently. 4K disks can be used on the Dell PowerEdge servers that support these PERC cards.
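On Linux, the sector sizes a drive reports can be read from sysfs, which makes it easy to tell a 4K-native drive (4096/4096) from a 512-emulation drive (512 logical / 4096 physical). The small helper below is a sketch of that check; the device name is an example:

```python
from pathlib import Path

def sector_sizes(device: str, sysfs: str = "/sys/block") -> tuple:
    """Return (logical, physical) block sizes in bytes for a Linux block device.

    A 4K-native drive reports (4096, 4096); a 512e drive reports (512, 4096);
    a legacy drive reports (512, 512).
    """
    queue = Path(sysfs) / device / "queue"
    logical = int((queue / "logical_block_size").read_text())
    physical = int((queue / "physical_block_size").read_text())
    return logical, physical

# On a real system: sector_sizes("sda")
```

The same values are exposed by `lsblk -o NAME,LOG-SEC,PHY-SEC`; the function simply reads the sysfs files that utility consults.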
The physical size of each sector on the disk has become smaller as a result of increasing areal densities in disk drives. If the number of disk defects does not scale at the same rate, more sectors will be corrupted, so each sector needs stronger error correction capability. Disk drives with larger physical sectors and more ECC bytes per sector provide enhanced data protection and correction algorithms. The 4K format achieves better format efficiency and improves reliability and error correction capability. This transition will result in a better user experience; hence, a 4K drive should be chosen for OS installation.
We published the whitepaper, “Dell EMC PowerEdge R940 makes De Novo Assembly easier”, last year to study the behavior of SOAPdenovo2. However, that whitepaper is limited to one de novo assembly application, so we want to expand our application coverage a little further. We decided to test SPAdes (2012) since it is a relatively new application and is reported to improve on the Euler-Velvet-SC assembler (2011) and SOAPdenovo. Like most assemblers targeting Next Generation Sequencing (NGS) data, SPAdes is based on the de Bruijn graph algorithm. De Bruijn graph-based assemblers are more appropriate for larger datasets with more than a hundred million short reads.
As shown in Figure 1, the Greedy-Extension and overlap-layout-consensus (OLC) approaches were used in the very early next-generation assemblers. Greedy-Extension's heuristic is to extend the highest-scoring alignment with the read that has the next highest score. However, this approach is vulnerable to imperfect overlaps and multiple matches among the reads, which can lead to an incomplete or arrested assembly. Because of its minimum overlap threshold, the OLC approach works better for long reads, such as those from Sanger or other technologies generating more than 100bp (454, Ion Torrent, PacBio, and so on). De Bruijn graph-based assemblers are more suitable for short-read sequencing technologies such as Illumina. The approach breaks the sequencing reads into successive k-mers and maps the k-mers onto a graph: each k-mer forms a node, and edges are drawn between consecutive k-mers in a read.
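The k-mer decomposition just described can be sketched in a few lines. This toy example (the tiny read set and function name are illustrative, not from any assembler) makes each k-mer a node and draws an edge to the next overlapping k-mer in the same read:

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Toy de Bruijn-style graph: nodes are k-mers; edges connect
    consecutive (overlapping) k-mers within each read."""
    graph = defaultdict(set)
    for read in reads:
        # successive k-mers: positions 0..len(read)-k
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for a, b in zip(kmers, kmers[1:]):
            graph[a].add(b)
    return graph

g = de_bruijn(["ATGGCGT", "GGCGTGC"], k=3)
# "ATGGCGT" yields the 3-mers ATG, TGG, GGC, GCG, CGT
print(sorted(g["GGC"]))  # ['GCG']
```

Reads that share k-mers converge on the same nodes, which is how the graph merges overlapping reads without all-pairs alignment; production assemblers add error correction, coverage statistics, and graph simplification on top of this core idea.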
Figure 1 Overview of de novo short reads assemblers. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3056720/
SPAdes is a relatively recent application based on the de Bruijn graph for both single-cell and multicell data. It improves on the recently released Euler-Velvet-SC (E+V-SC) assembler (specialized for single-cell data) and on the popular assemblers Velvet and SOAPdenovo (for multicell data).
All tests were performed on Dell EMC PowerEdge R940 configured as shown in Table 1. The total number of cores available in the system is 96, and the total amount of memory is 1.5TB.
The data used for the tests is a paired-end read set, ERR318658, which can be downloaded from the European Nucleotide Archive (ENA). The reads were generated from a blood sample used as a control to identify somatic alterations in primary and metastatic colorectal tumors. This dataset contains 3.2 Billion Reads (BR) with a read length of 101 nucleotides.
SPAdes runs three sets of de Bruijn graphs with 21-mers, 33-mers, and 55-mers consecutively. This is the main difference from SOAPdenovo2, which runs with a single k-mer, either 63-mer or 127-mer.
In Figure 2, the runtimes (wall-clock times) are plotted in days (blue bars) for various core counts: 28, 46, and 92 cores. Since we did not want to use every core of each socket, 92 cores was chosen as the maximum for the system; one core per socket was reserved for the OS and other maintenance processes. Subsequent tests were run with reduced core counts. Peak memory consumption for each case is plotted as a line graph. SPAdes runs significantly longer than SOAPdenovo2 due to the multiple iterations over three different k-mers.
The peak memory consumption is very similar to SOAPdenovo2. Both applications require slightly less than 800GB memory to process 3.2 BR.
Utilizing more cores helps reduce the runtime of SPAdes significantly, as shown in Figure 2. For SPAdes, it is recommended to use the highest core-count CPUs, such as the Intel Xeon Platinum 8180 processor with 28 cores and 3.80GHz, to bring the runtime down further.
This refers to an earlier version of SOAPdenovo, not SOAPdenovo2.
I am excited to announce the availability of Quick Start on the newly launched Wyse 5070 WIE10 thin client. Quick Start runs on first boot and can be launched manually as required. It provides the end user with an enhanced first-time out-of-box experience (OOBE) and informs the user about product details, both hardware and software. After walking through the screens, the end user is prompted to configure the thin client if they choose to, or simply proceed with using their brand-new Dell Wyse 5070 thin client.
Here are some screenshots: