By Ashish Kumar Singh. January 2017 (HPC Innovation Lab)
This blog presents an in-depth analysis of the High Performance Conjugate Gradient (HPCG) benchmark on the Intel Xeon Phi processor, which is based on the Intel Xeon Phi architecture codenamed “Knights Landing” (KNL). The analysis was performed on the PowerEdge C6320p platform with the new Intel Xeon Phi 7230 processor.
Introduction to HPCG and Intel Xeon Phi 7230 processor
The HPCG benchmark constructs a logically global, physically distributed sparse linear system using a 27-point stencil at each grid point in a 3D domain, such that the equation at the point (i, j, k) depends on its own value and the values of its 26 surrounding neighbors. The global domain computed by the benchmark is (NRx*Nx) x (NRy*Ny) x (NRz*Nz), where Nx, Ny and Nz are the dimensions of the local subgrid assigned to each MPI process, and the number of MPI ranks is NR = NRx x NRy x NRz. These values can be defined in the hpcg.dat file or passed as command line arguments.
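As a rough illustration of the domain arithmetic described above, the following Python sketch computes the global grid from a local subgrid and a rank layout, and enumerates the 27-point stencil around one grid point. The function names and the 2 x 2 x 1 rank layout are assumptions made for this example; they are not taken from the HPCG source code.

```python
def global_domain(local_dims, rank_grid):
    """Return the global grid (NRx*Nx, NRy*Ny, NRz*Nz)."""
    nx, ny, nz = local_dims
    nrx, nry, nrz = rank_grid
    return (nrx * nx, nry * ny, nrz * nz)


def stencil_points(i, j, k, dims):
    """Return the in-bounds points of the 27-point stencil centered on (i, j, k).

    The result includes (i, j, k) itself, so an interior point yields 27
    entries (itself plus 26 neighbors); points on the boundary yield fewer.
    """
    gx, gy, gz = dims
    points = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                ii, jj, kk = i + di, j + dj, k + dk
                if 0 <= ii < gx and 0 <= jj < gy and 0 <= kk < gz:
                    points.append((ii, jj, kk))
    return points


# Hypothetical layout: 4 MPI ranks arranged as 2 x 2 x 1, each with a 160^3 subgrid.
print(global_domain((160, 160, 160), (2, 2, 1)))  # (320, 320, 160)
```

Each stencil point corresponds to one potential nonzero in the matrix row for (i, j, k), which is why the resulting linear system is sparse.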
The HPCG benchmark is based on a conjugate gradient solver, where the pre-conditioner is a three-level hierarchical multi-grid (MG) method with Gauss-Seidel. The algorithm starts with MG, which contains Symmetric Gauss-Seidel (SymGS) and Sparse Matrix-Vector multiplication (SPMV) routines at each level. Because the data is distributed across nodes, both SymGS and SPMV require data from neighboring processes; this data is provided by the Exchange Halos routine that precedes them. The residual, which should be lower than 10^-6, is computed locally by the Dot Product (DDOT) routine, and an MPI_Allreduce follows the DDOT to complete the global operation. WAXPBY updates a vector with the sum of two scaled vectors: it scales each input vector by a constant and adds the values at the same index. So, HPCG has four computational blocks (SPMV, SymGS, WAXPBY and DDOT) and two communication blocks (MPI_Allreduce and Exchange Halos).
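The two simplest kernels named above can be sketched in a few lines of Python. This is a minimal illustration only: the function names are chosen for this example, and the global reduction is simulated by summing per-rank partial results in a plain loop, standing in for what MPI_Allreduce does across real MPI processes.

```python
def waxpby(alpha, x, beta, y):
    """Return w = alpha*x + beta*y, computed element by element."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + beta * yi for xi, yi in zip(x, y)]


def local_ddot(x, y):
    """Dot product of the vector slices owned by one rank."""
    return sum(xi * yi for xi, yi in zip(x, y))


def global_ddot(per_rank_slices):
    """Sum the local dot products of every rank.

    Stand-in for MPI_Allreduce with a sum operation: each entry of
    per_rank_slices is the (x, y) pair owned by one simulated rank.
    """
    return sum(local_ddot(x, y) for x, y in per_rank_slices)


print(waxpby(2.0, [1.0, 2.0], 3.0, [4.0, 5.0]))  # [14.0, 19.0]
```

Both kernels touch each vector element once and do very little arithmetic per byte, which is consistent with HPCG being sensitive to memory bandwidth rather than peak floating point rate.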
The Intel Xeon Phi processor is a new generation of processors from the Intel Xeon Phi family. Previous generations of Intel Xeon Phi were available as coprocessors in a PCIe card form factor and required an Intel Xeon host processor. The Intel Xeon Phi 7230 contains 64 cores at a 1.3GHz core frequency, with a turbo speed of 1.5GHz and 32MB of L2 cache. It supports up to 384GB of DDR4-2400 memory and the AVX512 instruction set. The Intel Xeon Phi processor also includes 16GB of on-package MCDRAM memory with a sustained memory bandwidth of up to ~480GB/s as measured by the STREAM benchmark. The Intel Xeon Phi 7230 delivers up to ~1.8TFLOPS of double precision HPL performance.
This blog showcases the performance of the HPCG benchmark on the Intel KNL processor and compares it to the performance on the Intel Broadwell E5-2697 v4 processor. The Intel Xeon Phi cluster comprises one PowerEdge R630 head node and 12 PowerEdge C6320p compute nodes, while the Intel Xeon processor cluster includes one PowerEdge R720 head node and 12 PowerEdge R630 compute nodes. All compute nodes are connected by Intel Omni-Path at 100Gb/s. Each cluster shares the storage of its head node over NFS. Detailed information about the clusters is listed in Table 1. All HPCG tests on Intel Xeon Phi were performed with the BIOS set to “quadrant” cluster mode and “Memory” memory mode.
Table 1: Cluster hardware and software details
HPCG Performance analysis with Intel KNL
Choosing the right problem size for HPCG should follow two rules. The problem size should be large enough that it does not fit in the cache of the device, and it should occupy a significant fraction of main memory, at least one quarter of the total. For HPCG performance characterization, we chose local domain dimensions of 128^3, 160^3 and 192^3 with an execution time of t=30 seconds. The local domain dimension defines the global domain dimension as (NRx*Nx) x (NRy*Ny) x (NRz*Nz), where Nx=Ny=Nz is the local dimension (for example, 160) and NRx, NRy and NRz are the numbers of MPI processes along each axis.
Figure 1: HPCG Performance comparison with multiple local dimension grid size
As shown in Figure 1, the local grid size of 160^3 gives the best performance, 48.83GFLOPS. A problem size larger than 128^3 allows for more parallelism, and 160^3 still fits well inside the MCDRAM, while 192^3 does not. All of these tests were carried out with 4 MPI processes and 32 OpenMP threads per MPI process on a single Intel KNL server.
Figure 2: HPCG performance comparison with multiple execution time.
Figure 2 shows HPCG performance with multiple execution times for a grid size of 160^3 on a single Intel KNL server. As the graph shows, HPCG performance does not change when the execution time changes, so execution time does not appear to be a factor in HPCG performance. This means we may not need to spend hours or days benchmarking large clusters, which saves both time and power. Note, however, that the output file reports that the official execution time should be >=1800 seconds. If you decide to submit your results to the TOP500 ranking list, the execution time should be no less than 1800 seconds.
Figure 3: Time consumed by HPCG computational routines.
Figure 3 shows the time consumed by each computational routine on 1 to 12 KNL nodes. The time spent in each routine is reported in the HPCG output file, as shown in Figure 4. As the graph shows, HPCG spends most of its time in the compute-intensive SymGS pre-conditioning and in sparse matrix-vector multiplication (SPMV). The vector update phase (WAXPBY) consumes far less time than SymGS, and the residual calculation (DDOT) consumes the least time of the four computational routines. Because the local grid size is the same across all multi-node runs, the time spent in each of the four compute kernels is approximately the same for every run. The output file shown in Figure 4 reports the performance of all four computational routines; in it, MG consists of both SymGS and SPMV.
Figure 4: A slice of HPCG output file
Here is the HPCG multi-node performance comparison between the Intel Xeon E5-2697 v4 @2.3GHz (Broadwell) processor and the Intel Xeon Phi 7230 (KNL) processor, both with the Intel Omni-Path interconnect.
Figure 5: HPCG performance comparison between Intel Xeon Broadwell processor and Intel Xeon Phi processor
Figure 5 shows the HPCG performance comparison between servers with two 18-core Intel Broadwell processors and servers with one 64-core Intel Xeon Phi processor. The dots in Figure 5 show the performance acceleration of KNL servers over dual-socket Broadwell servers. On a single node, HPCG performs 2.23X better on KNL than on Broadwell. In the multi-node runs, HPCG also shows a performance increase of more than 100% over the Broadwell nodes. With 12 Intel KNL nodes, HPCG performance scales out well and reaches ~520 GFLOPS.
Overall, HPCG shows ~2X higher performance with the Intel KNL processor on the PowerEdge C6320p than on the Intel Broadwell processor server, and HPCG performance scales out well as nodes are added. So, the PowerEdge C6320p platform is a strong choice for HPC applications like HPCG.
The auto discovery request from Dell PowerEdge servers to Dell Lifecycle Controller Integration (DLCI) version 3.3 for Microsoft System Center Configuration Manager (Configuration Manager) might fail when Dell Provisioning Server (DPS) is installed on Windows Server 2016. The failure is reported with the "No credential returned" error message.
The auto discovery request fails when a Dell PowerEdge server sends an auto discovery handshake request over TLS 1.2 to DPS installed on Windows Server 2016. In this scenario, Windows Server 2016 rejects the TLS 1.2 handshake request. Although the handshake request is sent with the auto-negotiate flag, Windows Server 2016 returns an application error instead of the TLS error alert returned by earlier Windows versions such as Windows Server 2012 R2. Hence, DPS cannot renegotiate with TLS 1.1.
Note: This behavioral change is observed in Windows Server 2016.
To perform auto discovery with DLCI version 3.3 for Configuration Manager with DPS, following are the two workarounds:
To ensure successful TLS 1.1 renegotiation, set "SSLAlwaysNegoClientCert" to "true" by updating the SSLBindings of DPS.
See https://msdn.microsoft.com/en-us/library/ms525641(v=vs.90).aspx in Microsoft product documentation for more information about SSLAlwaysNegoClientCert.
See https://msdn.microsoft.com/en-us/library/ms689452(v=vs.90).aspx in Microsoft product documentation for more information about SSLBinding Class [IIS 7 and higher].
See the Dell Lifecycle Controller Integration Version 3.3 for Microsoft System Center Configuration Manager Installation Guide at Dell.com/support/home for more information about installing DPS and the supported OS configurations.
By Garima Kochhar. HPC Innovation Lab. January 2016.
The Intel Xeon Phi bootable processor (architecture codenamed “Knights Landing” – KNL) is ready for prime time. The HPC Innovation Lab has had access to a few engineering test units, and this blog presents the results of our initial benchmarking study. [We also published our results with Cryo-EM workloads on these systems, and that study is available here.]
The KNL processor is from the Intel Xeon Phi product line but is a bootable processor, i.e., the system does not need another processor in it to power on, just the KNL. Unlike the Xeon Phi coprocessors or the NVIDIA K80 and P100 GPU cards that are housed in a system that has a Xeon processor as well, the KNL is the only processor in the server. This necessitates a new server board design and the PowerEdge C6320p is the Dell EMC platform that supports the KNL line of processors. A C6320p server includes support for one KNL processor and six DDR4 memory DIMMs. The network choices include Mellanox InfiniBand EDR, Intel Omni-Path, or choices of add-in 10GbE Ethernet adapters. The platform has the other standard components you’d expect from the PowerEdge line including a 1GbE LOM, iDRAC and systems management capabilities. Further information on C6320p is available here.
The KNL processor models include 16GB of on-package memory called MCDRAM. The MCDRAM can be used in three modes – memory mode, cache mode or hybrid mode. The 16GB of MCDRAM is visible to the OS as addressable memory and must be addressed explicitly by the application when used in memory mode. In cache mode, the MCDRAM is used as the last level cache of the processor. And in hybrid mode, a portion of the MCDRAM is available as memory and the other portion is used as cache. The default setting is cache mode as this is expected to benefit most applications. This setting is configurable in the server BIOS.
The architecture of the KNL processor allows the processor cores + cache and home agent directory + memory to be organized into different clustering modes. These modes are called all2all, quadrant, hemisphere, Sub-NUMA Clustering-2 and Sub-NUMA Clustering-4. They are described in this Intel article. The default setting is quadrant mode, and it can be changed in the Dell EMC BIOS. All tests below use quadrant mode.
The configuration of the systems used in this study is described in Table 1.
Table 1 - Test configuration
Server: 12 * Dell EMC PowerEdge C6320p
Processor: Intel Xeon Phi 7230. 64 cores @ 1.3 GHz, AVX base 1.1 GHz.
Memory: 96 GB at 2400 MT/s [16 GB * 6 DIMMs]
Interconnect: Intel Omni-Path and Mellanox EDR
Operating system: Red Hat Enterprise Linux 7.2
Compiler: Intel 2017, 17.0.0.098 Build 20160721
MPI: Intel MPI 5.1.3
Software: Intel XPPSL 1.4.1
The first check was to measure the memory bandwidth on the KNL system. To measure memory bandwidth to the MCDRAM, the system must be in “memory” mode. A snippet of the OS view when the system is in quadrant + memory mode is in Figure 1.
Note that the system presents two NUMA nodes. One NUMA node contains all the cores (64 cores * 4 logical siblings per physical core) and the 96 GB of DDR4 memory. The second NUMA node, node1, contains the 16GB of MCDRAM.
Figure 1 – NUMA layout in quadrant+memory mode
On this system, the dmidecode command shows six DDR4 memory DIMMs, and eight 2GB MCDRAM memory chips that make up the 16GB MCDRAM.
STREAM Triad results to the MCDRAM on the Intel Xeon Phi 7230 measured between 474-487 GB/s across 16 servers. The memory bandwidth to the DDR4 memory is between 83-85 GB/s. This is expected performance for this processor model. This link includes information on running stream on KNL.
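For readers unfamiliar with what the Triad number represents, here is a minimal Python sketch of the kernel and of the bandwidth arithmetic. It assumes the usual STREAM convention of counting three 8-byte arrays per iteration (two reads and one write). The function names are made up for this example, and pure Python lists will report figures far below what the hardware can deliver, so this only illustrates the calculation and is not a substitute for the real benchmark.

```python
import time


def triad(b, c, scalar):
    """STREAM Triad kernel: a[i] = b[i] + scalar * c[i]."""
    return [bi + scalar * ci for bi, ci in zip(b, c)]


def triad_bytes(n, word_bytes=8):
    """Bytes counted per Triad pass: read b, read c, write a."""
    return 3 * word_bytes * n


def measure_triad_gbs(n=1_000_000, scalar=3.0):
    """Time one Triad pass and return the counted bandwidth in GB/s."""
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    triad(b, c, scalar)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return triad_bytes(n) / elapsed / 1e9


print(f"Triad bandwidth (pure Python, illustrative only): {measure_triad_gbs():.3f} GB/s")
```

The real STREAM benchmark runs the kernel over arrays much larger than the caches and reports the best of several timed iterations.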
When the system has MCDRAM in cache mode, the STREAM binary used for DDR4 performance above reports memory bandwidth of 330-345 GB/s.
XPPSL includes a micprun utility that makes it easy to run this micro-benchmark on the MCDRAM. “micprun -k stream -p <num cores>” is the command to run a quick stream test and this will pick the MCDRAM (NUMA node1) automatically if available.
The KNL processor architecture supports AVX512 instructions. With two vector units per core, this allows the processor to execute 32 DP floating point operations per cycle. For the same core count and processor speed, this doubles the floating point capabilities of KNL when compared to Xeon v4 or v3 processors (Broadwell or Haswell) that can do only 16 FLOPS/cycle.
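The per-cycle figures above translate into a theoretical peak as sketched below. The breakdown assumes that each fused multiply-add counts as two floating point operations and that a 64-bit double occupies one vector lane; the function names are chosen for this example. The 1.3 GHz and 1.1 GHz (AVX base) frequencies are the ones listed for the Xeon Phi 7230 in Table 1.

```python
def dp_flops_per_cycle(vector_units=2, vector_bits=512, fma=True):
    """Double precision FLOPs per core per cycle.

    lanes = vector width / 64-bit doubles; an FMA counts as two operations.
    """
    lanes = vector_bits // 64
    return vector_units * lanes * (2 if fma else 1)


def peak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS = cores * GHz * FLOPs per cycle."""
    return cores * ghz * flops_per_cycle


knl = dp_flops_per_cycle(vector_bits=512)   # AVX512: 32 per cycle
xeon = dp_flops_per_cycle(vector_bits=256)  # AVX2:   16 per cycle

print(knl, xeon)
print(peak_gflops(64, 1.3, knl))  # at the nominal 1.3 GHz
print(peak_gflops(64, 1.1, knl))  # at the 1.1 GHz AVX base frequency
```

Measured HPL results are expected to fall below these theoretical values, since vector-heavy code runs closer to the AVX base frequency and no benchmark reaches 100% efficiency.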
HPL performance on KNL is slightly better (up to 5%) with the MCDRAM in memory mode when compared to cache mode and when using the HPL binary packaged with Intel MKL. Therefore the tests below are with the system in quadrant+memory mode.
On our test systems, we measured between 1.7 – 1.9 TFLOP/s HPL performance per server across 16 test servers. The micprun utility mentioned above is an easy way to run single server HPL tests. “micprun -k hplinpack -p <problem size>” is the command to run a quick HPL test. However, for cluster-level tests, the Intel MKL HPL binary is best.
HPL cluster level performance over the Intel Omni-Path interconnect is plotted in Figure 2. These tests were run using the HPL binary that is packaged with Intel MKL. The results with InfiniBand EDR are similar.
Figure 2 - HPL performance over Intel Omni-Path
The KNL-based system is a good platform for highly parallel vector applications. The on-package MCDRAM helps balance the enhanced processing capability with additional memory bandwidth to the applications. KNL introduces the AVX512 instruction set which further improves the performance of vector operations. The PowerEdge C6320p provides a complete HPC server with multiple network choices, disk configurations and systems management.
This blog presents initial system benchmark results. Look for upcoming studies with HPCG and applications like NAMD and Quantum Espresso. We have already published our results with Cryo-EM workloads on KNL and that study is available here.
Written by Chuck Armstrong
Are you running your VMware virtual machines on 1Gb PS Series storage? Are you considering moving specific VM workloads to 10Gb SC Series storage? Are you looking for the best way to build your environment to support VM workloads on both platforms together?
If yes is your answer to these questions, this is the paper you want to read:
Deploying ESXi 6.0 with Dell PS and SC Series Storage on Separate iSCSI Networks
This blog post covers some of the high-level pieces from the paper.
To level-set, if you’re running a PS Series environment now, you should already be familiar with the best practices regarding the iSCSI SAN infrastructure. Even if you’re not, the figure below shows the first half of this solution: best practices design for a 1Gb PS Series iSCSI SAN architecture.
1Gb PS Series iSCSI best practices design
The key requirements of the PS Series iSCSI SAN infrastructure include:
If you’re new to the SC Series storage platform, you may not be familiar with the best practices regarding the 10Gb iSCSI SAN infrastructure. There are some differences in how this infrastructure should be configured as compared to a PS Series infrastructure. The figure below shows the second half of this solution: best practices design for a 10Gb SC Series iSCSI SAN architecture.
10Gb SC Series iSCSI best practices design
The key recommendations of the SC Series iSCSI SAN infrastructure include:
The challenge is configuring VMware hosts to connect to storage on both infrastructures while maintaining best practices for both, as shown in the figure below.
1Gb PS Series and 10Gb SC Series infrastructure
The first step is to select adapters to be used for dedicated iSCSI traffic on the host. The 1Gb PS Series environment can use any supported NIC for dedicated iSCSI traffic. Configuring the 1Gb adapters is addressed in the white paper referenced in this blog. Additional detail can be found in the Dell Storage PS Series Configuration Guide.
The 10Gb adapters you use are identified as Dependent Hardware iSCSI Adapters. See VMware Documentation for a definition of Dependent Hardware iSCSI Adapter and models supported.
With the proper adapters identified, installed, and dedicated for respective 1Gb and 10Gb iSCSI traffic, and both 1Gb and 10Gb switches properly configured to support the iSCSI traffic, the rest is all in the host configuration. This is covered from start to finish in the white paper as well.
As always, check Dell TechCenter for additional storage solutions technical content.
December 2016 – HPC Innovation Lab
Building a balanced cluster ecosystem and eliminating bottlenecks requires powerful, dense server node configurations that support parallel computing. The challenge is to provide maximum compute power with efficient I/O subsystem performance, including memory and networking. Alongside traditional computing, the emerging workloads that demand this intense compute power include advanced parallel algorithms in research, life sciences and financial applications.
Dell PowerEdge C6320p
The introduction of the Dell EMC C6320p platform, one of the densest platforms with one of the highest maximum core counts among HPC solution offerings, provides a leap in this direction.
The PowerEdge C6320p platform is Dell EMC’s first self-bootable Intel Xeon Phi platform. The previously available versions of Intel Xeon Phi were PCIe adapters that had to be plugged into a host system. The processor supports up to 72 cores, with each core supporting two vector processing units capable of AVX-512 instructions. This increases floating point throughput for code that can use the longer vector instructions, unlike Intel Xeon® v4 processors, which support up to AVX2 instructions. The Intel Xeon Phi in the Dell EMC C6320p also features 16GB of fast on-package MCDRAM that is stacked on the processor. The availability of MCDRAM helps out-of-order execution in applications that are sensitive to memory bandwidth. This is in addition to the six channels of DDR4 memory hosted on the server. Being a single-socket server, the C6320p provides a lower power consumption compute node compared to traditional two-socket nodes in HPC.
The following table shows platform differences as we compare the current Dell EMC PowerEdge C6320 and Dell EMC PowerEdge C6320p server offerings in HPC.
Server Form Factor
2U Chassis with four sleds
Intel ® Xeon Phi
Max cores in a sled
Up to 44 physical cores, 88 logical cores
(with two * Intel ® Xeon E5-2699 v4, 2.2 GHz, 55MB, 22 cores, 145W)
Up to 72 physical cores, 288 logical cores
(with the Intel ® Xeon Phi Processor 7290, 16GB, 1.5GHz, 72 cores, 245W)
Theoretical DP Flops per sled
16 DDR4 DIMM slots
6 DDR4 DIMM slots +
on-package 16GB MCDRAM
MCDRAM BW (Memory mode)
~ 475-490 GB/s
~ 135 GB/s
Dual port 1Gb/10GbE
Single port 1GbE
Intel Omni-Path Fabric (100Gbps)
Mellanox Infiniband (100Gbps)
Intel Omni-Path Fabric (100Gbps)
On-board Mellanox Infiniband (100Gbps)
Up to 24 x 2.5” or 12 x 3.5” HD
6 x 2.5” HD per node +
Internal 1.8” SSD option for boot
Integrated Dell EMC Remote Access Controller
Dedicated and shared iDRAC8
Table 1: Comparing the C6320 and C6320p offering in HPC
Dell EMC Supported HPC Solution:
Dell EMC offers a complete, tested, verified and validated solution on the C6320p servers. It is based on Bright Cluster Manager 7.3 with RHEL 7.2, including specific, highly recommended kernel and security updates. It will also provide support for the upcoming RHEL 7.3 operating system. The solution provides automated deployment, configuration, management and monitoring of the cluster. It also integrates recommended Intel performance tweaks, as well as required software drivers and other development toolkits to support the Intel Xeon Phi programming model.
The solution provides the latest networking support for both InfiniBand and Intel Omni-Path Fabric. It also includes Dell EMC-supported System Management tools that are bundled to provide customers with the ease of cluster management on Dell EMC hardware.
*Note: As a continuation to this blog, there will be follow-on micro-level benchmarking and application study published on C6320p.
In the previous blog post, ARM Templates and Source Control in Azure Stack - Part 1 of 2 (http://en.community.dell.com/techcenter/cloud/b/dell-cloud-blog/archive/2016/12/05/arm-templates-and-source-control-in-azure-stack), we described a process developed by the solution architects at the Dell Technologies Customer Solution Centers for the creation and sharing of ARM templates in our labs. As a review, these are the steps that we took so far:
1. We created a new Azure Resource Group project in Microsoft Visual Studio 2015 named chihostsp, which was based on the sharepoint-2013-non-ha template found in the AzureStack-QuickStart-Templates repository in GitHub.
2. We successfully performed a test deployment of chihostsp to both Azure and Azure Stack TP2.
3. Since the deployment itself made changes to the azuredeploy.parameters.json file in the project, we committed those changes to the local Git repository on the author's machine.
It was then time to share the chihostsp project with the rest of the team so they could make improvements on the base sharepoint-2013-non-ha template. We reviewed the GitHub Guides site (link provided in the previous blog post) and discussed various methods of collaborating on development projects. There were some strong opinions about whether to use forking or branching to keep track of bug and feature changes. After much spirited conversation, we determined that we needed to keep this part of the process extremely simple until we gain more experience. Otherwise, we knew that some of our colleagues would bypass the process due to perceived complexity. We also considered that many of the predefined ARM templates found in the AzureStack-QuickStart-Templates GitHub repository would probably not require complex modifications - at least for our educational purposes.
Here are the key principles we defined for sharing ARM templates and collaborating among our team members:
The first step in the iterative process of collaborating as a team on our chihostsp project was for the author to publish the project to a GitHub repository. We used Visual Studio to accomplish this by browsing to the home page of Team Explorer with the project open and clicking Sync.
Then, we needed to make sure the correct information appeared under Publish to GitHub as in the following screen print:
A remote could then be seen for the project under Team Explorer --> Repository Settings.
The author then browsed to GitHub to invite other solution architects to begin collaborating on the chihostsp project. This was accomplished by browsing to Settings --> Collaborators in the chihostsp repository and adding the collaborators' names and email addresses. Once saved, an automatic email was sent to all the collaborators.
The solution architect assuming the role of a collaborator automatically received an email that provided a link to the chihostsp repository in GitHub. After logging in to GitHub using their own account credentials, the collaborator accepted the invitation sent by the author in the last step. Then, the collaborator cloned the repository to their local machine to begin making modifications by clicking on Clone or Download, and selecting Open in Visual Studio.
Visual Studio automatically opened to the place where the solutions architect collaborator chose to clone the repository on their local machine. Then, they opened the newly cloned repository.
For simplicity, we decided to only make a couple small cosmetic changes to the azuredeploy.json file in the chihostsp repository to practice our newly developed source control process. The status bar was invaluable in keeping track of the number of local changes to our project’s files, showing how many committed changes still needed to be synced to the local master branch, and indicating which branch was currently being modified. In the case of the process outlined in this blog, only the master branch is ever modified. We referred to Team Explorer and the Status Bar often to track our work.
We only made a change to the defaultValue value under the adminUsername parameter from lcladmin to masadmin. Also, the description under the domainName parameter was modified to read "The Active Directory domain name." Then, these changes were committed to the local master branch. Since concepts like forking and branching were not employed in our simple process and pushes were allowed directly to the remote master branch, we all agreed that the descriptions of all commits need to be very clear and verbose.
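To make the two edits concrete, the Python sketch below applies them to a heavily trimmed, hypothetical stand-in for azuredeploy.json. Only the adminUsername and domainName parameter names and the new values come from the description above; the placeholder descriptions and the empty resources list are assumptions, and the real sharepoint-2013-non-ha template contains many more parameters and resources.

```python
import json

# Hypothetical, trimmed stand-in for azuredeploy.json (not the real template).
TEMPLATE_TEXT = """
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "adminUsername": {
      "type": "string",
      "defaultValue": "lcladmin",
      "metadata": {"description": "placeholder description"}
    },
    "domainName": {
      "type": "string",
      "metadata": {"description": "placeholder description"}
    }
  },
  "resources": []
}
"""


def apply_cosmetic_changes(template):
    """Apply the two cosmetic edits described in the text, in place."""
    params = template["parameters"]
    params["adminUsername"]["defaultValue"] = "masadmin"
    params["domainName"]["metadata"]["description"] = (
        "The Active Directory domain name."
    )
    return template


template = apply_cosmetic_changes(json.loads(TEMPLATE_TEXT))
print(json.dumps(template["parameters"], indent=2))
```

Because the template is plain JSON, changes like these show up as small, readable line diffs in Git, which is what makes clear commit descriptions practical.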
After making these changes to the azuredeploy.json file, the modifications needed to be shared with the team. The collaborator did this by initiating a Push in Team Explorer’s Synchronization view.
Browsing to the GitHub website, we verified that the commit had successfully made it to our central repository.
The author and all contributors to the project then needed to perform a Pull Synchronization so that the changes were reflected in their local master branches. We now felt confident that we all could continue to push and pull changes to and from the remote master branch at a regular cadence throughout the duration of the project.
The final step in our process requires the author of our chihostsp project to identify the version of code that is guaranteed to deploy successfully. This is accomplished using a concept known as tagging. The author used their local instance of Visual Studio to tag versions of the project that have deployed successfully to Azure and Azure Stack during testing. We set the tags using the Local History view in Visual Studio, which can be launched from Solution Explorer or Team Explorer.
At the time of this writing, Visual Studio 2015 did not have the ability to push or pull tags through its interface to the remote GitHub repository. To ensure that tags were pushed and pulled successfully, the project author and all collaborators needed to run Git commands at the command line. All project participants changed to the local chihostsp directory on their local machines and, using a Git command line environment (posh-git was used here), executed the appropriate commands to push and/or pull the tags.
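The post does not list the exact commands that were run. As a hedged sketch, the standard Git commands for this task are "git push <remote> --tags" and "git fetch <remote> --tags". The Python helper below builds and optionally runs them; the remote name origin and the helper names are assumptions made for this example.

```python
import subprocess


def push_tags_command(remote="origin"):
    """Command that publishes all local tags to the remote (assumed: origin)."""
    return ["git", "push", remote, "--tags"]


def fetch_tags_command(remote="origin"):
    """Command that downloads all tags from the remote (assumed: origin)."""
    return ["git", "fetch", remote, "--tags"]


def run_in_repo(command, repo_dir):
    """Run a Git command inside repo_dir; raises CalledProcessError on failure."""
    return subprocess.run(
        command, cwd=repo_dir, check=True, capture_output=True, text=True
    )


# Show the commands without executing them.
print(" ".join(push_tags_command()))
print(" ".join(fetch_tags_command()))
```

A project author would call run_in_repo(push_tags_command(), "<path to chihostsp>") after tagging, and collaborators would call it with fetch_tags_command() to receive the tags.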
When a push was performed, the tag would appear in the remote repository in GitHub.
We hope these two blog posts provided some valuable insight into where IT infrastructure professionals might begin their Microsoft Azure Stack learning journey. The creation and version controlling of Azure Resource Manager templates is a cornerstone element of anyone looking to leverage Azure and Azure Stack in their organizations. We hope this information is helpful as you undertake your digital transformation journey to a hybrid cloud model, and we look forward to sharing more discoveries and insights as Microsoft Azure Stack matures and becomes available for public release.
Since Microsoft announced Azure Stack, excitement and interest have continued to grow among customers seeking to implement or augment a cloud computing operating model and business strategy.
To meet the demand for consultancy expertise, design sessions and proofs of concept at the Dell Technologies Customer Solution Centers have helped facilitate and accelerate the testing of Technical Preview 2. One of the first questions we always receive is “Where do we start so we can make informed decisions in the future?”
Based on our in-depth experience with Microsoft Azure Stack and our rich heritage with Fast Track Reference Architectures, CPS-Premium, and CPS-Standard based on the Windows Azure Pack and System Center, we believe that Azure Resource Manager (ARM) is the perfect place to start. The ARM model is essential to service deployment and management for both Azure and Azure Stack. Using the ARM model, resources are organized into resource groups and deployed with JSON templates. Since ARM templates are declarative files written in JSON, they are best created and shared within the context of a well-defined source control process. We solution architects at the Solution Centers decided to define what this ARM template creation and source control process might look like within our globally distributed ecosystem of labs and data centers. We have shared some of our learning journey in two blog posts to help facilitate knowledge transfer and preparation.
Hopefully, these blogs will prove to be thought provoking, especially for IT professionals with limited exposure to infrastructure as code or source control processes. Though we have depicted a manual source control process, we hope to evolve to a fully automated model using DevOps principles in the near future.
Our first step was to choose the right tools for the job. We agreed upon the following elements for creating and sharing our ARM templates:
We used the Dell EMC recommended server for an Azure Stack TP2 1-Node POC environment which is the PowerEdge R630. With 2U performance packed into a compact 1U chassis, the PowerEdge R630 two-socket rack server delivers uncompromising density and productivity. The recommended BOM can be found here:
After a few lively discussions about the high level process for creating and sharing our ARM templates, we felt that the following guideposts were a good place to start:
Throughout the blog posts, original creators of templates will be referred to as the authors, and the contributors making enhancements to the templates will be referred to as collaborators.
One of the most powerful aspects of Microsoft Azure Stack is that it runs the same APIs as Microsoft Azure but on a customer's premises. Because of this, service providers are able to use the same ARM templates to deploy a service to both Azure Stack and Azure without modifications to the template. Only templates contained in the AzureStack-QuickStart-Templates GitHub repository have been created to deploy successfully to both Azure Stack TP2 and Azure. At this time, the templates in the Azure-QuickStart-Templates GitHub repository won't necessarily work with Azure Stack TP2 because not all Resource Providers (RP) from Azure are currently available on Azure Stack.
For our first ARM template deployment in the Solution Center, we decided to keep it simple and select the non-HA Sharepoint template from AzureStack-QuickStart-Templates. The diagram that follows depicts the end state of our test Sharepoint farm. Sharepoint is a great example of an application that can be hosted by service providers and made available under a SaaS business model using Azure Stack. This gives customers a way to immediately consume all the collaboration features of Sharepoint from their trusted service provider without investing a great deal of time and effort to deploy it themselves. Other applications that service providers have hosted for customers include Microsoft Exchange and Microsoft Dynamics.
The specific template we selected was sharepoint-2013-non-ha.
Once we selected which ARM template from GitHub we wanted to use for our first deployment, we needed to create a new Azure Resource Group project in Visual Studio on the author's laptop as depicted in the following screen print.
When creating the Resource Group, we checked the “Create directory for solution” and “Create new Git repository” options. By creating a Git repository, authors are able to use version control from the moment they begin working with the new ARM template on their local machine. There are many great references online for getting started with Git in Visual Studio. A few of us thought the following article was a good starting point:
We named the repository and associated directory chihostsp (short for "Chicago Hosted SharePoint", since the author was in the Chicago Solution Center), which is also the name that was used for the Resource Group when it was deployed. Then, we selected Azure Stack QuickStart from the "Show templates from this location" drop-down. This exposed all the individual ARM templates in the AzureStack-QuickStart-Templates repository on GitHub. We selected sharepoint-2013-non-ha.
The content of the ARM template then appeared in Visual Studio.
We learned that it is important to refrain from modifying the structure of this template within the downloaded folder when just getting started – especially the Deploy-AzureResourceGroup PowerShell Script. This script deploys the newly designed Resource Group regardless of whether an Azure or Azure Stack Subscription is used. To ensure the success of a Resource Group deployment using this PowerShell script in Visual Studio, we added the necessary accounts and subscriptions into Visual Studio Cloud Explorer. When properly configured, Cloud Explorer should look something like the following (we've blacked out part of our domain suffix in some of the screen prints going forward):
We wanted to test this new template from the author's machine before making any changes. To do this, we deployed the chihostsp Resource Group through Visual Studio to an Azure Stack TP2 Tenant Subscription. In our lab environment, the Azure Stack TP2 Tenant Subscription was named FirstSubscription. We supplied the appropriate parameters when prompted. We found that many deployment failures could be traced to parameter values that violated resource naming conventions, so we made sure to understand the legal naming rules before deploying.
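To make the naming point concrete, here is a minimal sketch that checks two kinds of names before a deployment is attempted. It assumes the publicly documented Azure rules as we understand them: storage account names are 3 to 24 lowercase letters and digits, and resource group names are 1 to 90 characters that cannot end in a period. The resource group pattern is a conservative ASCII subset, and the rules for a given Azure Stack build should be confirmed against its documentation. The function names are hypothetical.

```python
import re

# Assumed rule: 3-24 characters, lowercase letters and digits only.
STORAGE_ACCOUNT = re.compile(r"[a-z0-9]{3,24}")

# Assumed rule (conservative ASCII subset): 1-90 characters drawn from
# letters, digits, '_', '(', ')', '-', '.', and not ending with '.'.
RESOURCE_GROUP = re.compile(r"[A-Za-z0-9_().-]{0,89}[A-Za-z0-9_()-]")


def is_valid_storage_account_name(name):
    """True if name matches the assumed storage account rule."""
    return STORAGE_ACCOUNT.fullmatch(name) is not None


def is_valid_resource_group_name(name):
    """True if name matches the assumed resource group rule."""
    return RESOURCE_GROUP.fullmatch(name) is not None


for candidate in ["chihostsp", "ChiHostSP", "sp"]:
    print(candidate, is_valid_storage_account_name(candidate))
```

With these assumed rules, chihostsp passes as a storage account name, while ChiHostSP (uppercase letters) and sp (too short) do not.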
The sharepoint-2013-non-ha template included logic for working with the Azure Stack Key Vault. For an introduction to the Azure Stack Key Vault, please see the following in the Azure Stack documentation:
When editing the following parameters, we clicked on the key vault icon for adminPassword in order to select the appropriate Key Vault for the FirstSubscription Tenant Subscription and supplied the Key Vault Secret.
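Selecting a Key Vault secret for a parameter typically results in a Key Vault reference in the parameters file rather than a plain-text value. The sketch below builds such a file in Python to show the general shape. The subscription ID, resource group, vault name, and secret name are placeholders, the helper function is hypothetical, and the exact content Visual Studio writes for this template may differ.

```python
import json


def key_vault_parameter(subscription_id, resource_group, vault_name, secret_name):
    """Build an ARM parameter entry that references a Key Vault secret."""
    vault_id = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.KeyVault/vaults/{vault_name}"
    )
    return {"reference": {"keyVault": {"id": vault_id}, "secretName": secret_name}}


# All identifiers below are placeholders, not values from our deployment.
parameters_file = {
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "adminPassword": key_vault_parameter(
            subscription_id="00000000-0000-0000-0000-000000000000",
            resource_group="example-vault-rg",
            vault_name="example-vault",
            secret_name="example-admin-password",
        )
    },
}

print(json.dumps(parameters_file, indent=2))
```

Because the file holds only a reference to the secret, it can be committed to version control without exposing the password itself.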
After a successful deployment of the chihostsp Resource Group, the following was displayed in the Azure Stack Tenant Portal. (Note: Not all the resources deployed into the Resource Group are visible here).
We also made sure that the chihostsp Resource Group could be successfully deployed to Azure. Here is the final result of that deployment in the Azure Portal:
Since the azuredeploy.parameters.json file changed as part of deploying the chihostsp Resource Group, the author needed to commit the changes in Visual Studio to Git for proper version control. We definitely learned the importance of committing often to the local Git repository with any new project being developed in Visual Studio. The next few screen prints from Visual Studio illustrate the process.
In the next blog post, we will show how the author shared the new chihostsp project using GitHub and proceeded to work with other solution architect collaborators. Here is a link to Blog post 2 for convenience:
This post was originally written by K, Narendra
Dell EMC launched the PowerEdge C6320p system, based on the Intel® Xeon Phi™ processor 72xx product family.
More details on the C6320p can be found here.
Kernel-3.10.0-327.36.1.el7 is the minimum kernel version that supports the Intel® Xeon Phi™ processor 72xx product family. More details are available here.
You can find kernel-3.10.0-327.36.1.el7 here.
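A quick way to check whether a running system meets this minimum is to compare the numeric part of its kernel release string against 3.10.0-327.36.1.el7. The sketch below is a simplified comparison, not a full RPM version comparison: it reads the leading integer fields and stops at the first non-numeric token such as el7. The function names are hypothetical.

```python
import platform
import re

MINIMUM_RELEASE = "3.10.0-327.36.1.el7"


def kernel_key(release):
    """Convert '3.10.0-327.36.1.el7.x86_64' into (3, 10, 0, 327, 36, 1)."""
    numbers = []
    for token in re.split(r"[.-]", release):
        if not token.isdigit():
            break  # stop at the first non-numeric field, e.g. 'el7'
        numbers.append(int(token))
    return tuple(numbers)


def meets_minimum(release, minimum=MINIMUM_RELEASE):
    """True if release is at least the minimum, comparing numeric fields only."""
    return kernel_key(release) >= kernel_key(minimum)


print(meets_minimum("3.10.0-327.el7.x86_64"))       # prints False
print(meets_minimum("3.10.0-327.36.1.el7.x86_64"))  # prints True
print(meets_minimum(platform.release()))            # result depends on the host
```

The first example shows that a 3.10.0-327 kernel without the 36.1 update falls below the minimum.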
This is an optional hotfix for:
The following is a list of issues resolved in this release.
Client proxy settings not working for 64-bit Connector on upgrade
This hotfix is available for download at: