This blog explains how CPU and memory choices affect memory performance on Dell PowerEdge servers with the new Skylake processors.
Let’s start with the Skylake processor architecture. Each Skylake processor has 6 memory channels, controlled by 2 internal memory controllers that each handle 3 channels. Figures 1 and 5 below show how this is implemented. These memory channels must be populated in specific ways and paired with the right processor to achieve the best memory performance that Dell PowerEdge servers can deliver.
Figure 1. Skylake memory controller layout.
Another consideration is that memory controller speed differs across the classes of Skylake CPUs (Platinum, Gold, Silver and Bronze). Below is a memory speed table for the different classes of Skylake processors. Use this table when choosing both the CPU and the memory to match the workload requirements of a Dell PowerEdge system.
Figure 2. Memory Controller Speed Vs. CPU selection
The CPU vs. memory controller speed table above shows a difference of up to 20% in memory controller speed between the 81xx and 61xx Skylake processors and the 31xx Skylake processors, and a 10% difference between the 81xx and 61xx and the 51xx and 41xx Skylake processors. A memory-intensive server workload is therefore better served by an 81xx or 61xx Skylake processor than by the other classes.
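The percentages above follow directly from the memory speeds, and the same numbers give a rough theoretical ceiling on bandwidth per CPU. The sketch below is illustrative only: the 2667 and 2133 values are the ones quoted in this post, the 2400 value for the 51xx and 41xx classes is an assumption inferred from the 10% gap, and the function names are hypothetical. The peak figure assumes a 64-bit (8-byte) DDR4 data bus per channel and ignores every real-world overhead, so measured bandwidth will be lower.

```python
# Memory speeds in MT/s by Skylake class. 2667 and 2133 are quoted in this
# post; 2400 for the 51xx/41xx classes is an assumption inferred from the
# 10% gap described above.
SPEED_MTS = {
    "81xx/61xx": 2667,
    "51xx/41xx": 2400,  # assumed value
    "31xx": 2133,
}

CHANNELS_PER_CPU = 6      # 2 memory controllers x 3 channels each
BYTES_PER_TRANSFER = 8    # a DDR4 channel has a 64-bit data bus


def percent_slower(fast, slow):
    """Return how much slower `slow` is than `fast`, in percent."""
    return (fast - slow) / fast * 100.0


def peak_bandwidth_gbs(speed_mts, channels=CHANNELS_PER_CPU):
    """Theoretical peak bandwidth per CPU in GB/s (1 GB = 10**9 bytes)."""
    return speed_mts * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


if __name__ == "__main__":
    top = SPEED_MTS["81xx/61xx"]
    for cpu_class, speed in SPEED_MTS.items():
        print(f"{cpu_class:>10}: {speed} MT/s, "
              f"{percent_slower(top, speed):4.1f}% slower, "
              f"{peak_bandwidth_gbs(speed):6.1f} GB/s theoretical peak")
```

These are upper bounds from arithmetic, not benchmark results. The measured differences in the figures in this post also depend on DIMM rank, data width, and population.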
Figure 3. DIMM types vs. Ranks and Data Width
Memory ranks and data width can also affect memory performance on Skylake processors. The table above helps explain why the 16GB RDIMMs give the best memory performance on Skylake processors, as seen in figure 4. The 16GB RDIMMs have 2 ranks and a data width of 8, whereas the other dual-rank RDIMMs and LRDIMMs have a data width of only 4. Combined with the Skylake internal memory controller, the 16GB RDIMMs give up to 5% better memory performance than the other DIMM sizes. When memory type and memory controller speed are combined, there can be up to a 48% difference in memory performance between the highest Skylake CPU, the 8180 (2667 MHz), and the lowest, the 3106 (2133 MHz). The table below shows the memory speed differences between the Skylake classes of CPUs, as well as the differences due to DIMM rank and data width. 8GB RDIMMs are single rank only and are generally 2 to 5% slower than the dual-rank 16GB and 32GB DIMMs. 32GB RDIMMs have a data width of only 4 vs. 8 for the 16GB RDIMMs, so the 16GB RDIMMs are generally 2 to 3% faster than the 32GB RDIMMs.
Figure 4. Memory speed difference by CPU class and memory size
In this section of the blog, we will discuss the difference between balanced and unbalanced memory configurations and how each affects memory performance.
Figure 5. Balanced vs. Unbalanced Memory vs. DIMM Count
Let’s start with keeping the memory balanced while adding DIMMs to a Skylake memory configuration. As the figure above shows, memory must be added to Skylake processors in a specific way to keep the memory controllers and channels balanced. Sometimes it is better to keep the memory on just one controller, especially when the configuration has only 1 or 3 DIMMs. With 2, 4, or 6 DIMMs, it is better to place 1, 2, or 3 DIMMs on each integrated memory controller. The table also shows that there is no way to balance a configuration of 5 or 7 thru 11 DIMMs: a 7 thru 11 DIMM configuration puts 2 DIMMs on some memory channels but not on others, and a 5 DIMM configuration leaves a memory channel without a DIMM. The same reasoning explains why a Dell PowerEdge system with 5, 7 thru 11, or 13 thru 23 DIMMs cannot be balanced. Keeping the memory balanced is important because an unbalanced configuration can reduce memory performance by up to 65%. See figure 8 for more details.
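The per-CPU rules above can be written down as a small lookup. This is a minimal sketch that only encodes the classification described in this post; the function name is hypothetical, and treating 12 DIMMs (2 DIMMs on every channel) as balanced is inferred from the text rather than stated outright.

```python
def classify_dimm_count(dimms_per_cpu):
    """Classify a per-CPU DIMM count using the rules described in this post.

    Returns a (balanced, reason) tuple.
    """
    n = dimms_per_cpu
    if not 1 <= n <= 12:
        raise ValueError("expected 1 to 12 DIMMs per CPU (6 channels x 2 DPC)")
    if n in (1, 3):
        return True, "all DIMMs on one memory controller"
    if n in (2, 4, 6):
        return True, f"{n // 2} DIMM(s) on each memory controller"
    if n == 5:
        return False, "one memory channel is left without a DIMM"
    if n == 12:
        # Inferred: every channel is in the same 2 DIMMs per channel mode.
        return True, "2 DIMMs on every memory channel"
    # Remaining counts are 7 thru 11.
    return False, "some channels have 2 DIMMs while others have 1"


if __name__ == "__main__":
    for count in range(1, 13):
        balanced, reason = classify_dimm_count(count)
        label = "balanced" if balanced else "UNBALANCED"
        print(f"{count:2d} DIMMs: {label:10s} ({reason})")
```

A check like this is useful as a quick sanity test on a planned configuration, but the population table in figure 5 remains the reference.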
Now that we understand how the memory controllers work on a Skylake processor, let’s put it into practice on Dell systems that have more than 6 memory slots per CPU. The Dell modular systems have 8 memory slots per CPU. Because of how the Skylake memory controllers work, populating all 8 slots achieves the maximum memory size of the Dell PowerEdge modular system, but the memory will be unbalanced, resulting in lower memory performance as seen in figure 8. The figure below shows the 8 DIMM slots on the Dell PowerEdge modular systems. Populating the black slots on memory channels 0 and 3 places those channels in 2 DIMMs per channel (DPC) mode while leaving memory channels 1, 2, 4, and 5 in single DPC mode, which unbalances the configuration and affects memory performance, as explained in the section above.
Figure 6. Skylake memory controller implementation on C6420, FC\M 640
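The 8-slot layout can be modeled to see why filling the black slots mixes DPC modes. The sketch below is an illustration under stated assumptions: the slot names (`white0` thru `white5`, `black0`, `black3`) are hypothetical labels, not the silkscreen names on the board, and the only facts taken from this post are that each CPU has six white slots, one per channel, plus two black slots on channels 0 and 3.

```python
from collections import Counter

# Hypothetical slot-to-channel map for one CPU on an 8-slot modular system:
# six white slots, one per channel, plus two black slots on channels 0 and 3.
WHITE_SLOTS = {f"white{ch}": ch for ch in range(6)}
BLACK_SLOTS = {"black0": 0, "black3": 3}
SLOT_TO_CHANNEL = {**WHITE_SLOTS, **BLACK_SLOTS}


def dimms_per_channel(populated_slots):
    """Return the DIMM count on each of the 6 channels, indexed by channel."""
    counts = Counter(SLOT_TO_CHANNEL[slot] for slot in populated_slots)
    return [counts[ch] for ch in range(6)]


def has_mixed_dpc(populated_slots):
    """True when the populated channels do not all carry the same DIMM count."""
    modes = {count for count in dimms_per_channel(populated_slots) if count > 0}
    return len(modes) > 1


if __name__ == "__main__":
    white_only = list(WHITE_SLOTS)
    all_slots = list(SLOT_TO_CHANNEL)
    print("white only:", dimms_per_channel(white_only), has_mixed_dpc(white_only))
    print("all 8     :", dimms_per_channel(all_slots), has_mixed_dpc(all_slots))
```

With only the white slots populated every channel holds one DIMM. Adding the two black slots raises channels 0 and 3 to two DIMMs while the other four stay at one, which is the mixed mode described above.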
Below is a table of the maximum memory supported in a balanced configuration on the C6420 & FC\M 640 PowerEdge servers. A good rule on these Dell modular systems is that as the memory size requirement increases, the DIMM size must increase to keep the memory balanced and avoid impacting memory performance.
Figure 7. Max Performance Balanced Memory DIMM configurations
The black slots on the C6420 & FC\M640 can be used to increase the overall memory size of the system, but at a cost of up to 65% of the memory bandwidth. Keep this in mind when choosing memory for the server workload. To achieve maximum memory performance on the Dell PowerEdge C6420 & FC\M640, use only the 12 white DIMM slots in a dual-CPU configuration, and increase the DIMM size rather than the DIMM count, as shown in the table above, when more memory is needed.
The table below shows that using all 16 DIMM slots to reach the maximum configurable memory size of the Dell modular systems unbalances the memory and reduces memory performance by up to 65%.
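The "bigger DIMMs, not more DIMMs" rule is simple arithmetic, sketched below for a dual-CPU system. The DIMM size list contains only the RDIMM sizes named in this post and is an assumption about what is available; the function and constant names are hypothetical. Check the supported configurations in figure 7 before ordering.

```python
WHITE_SLOTS_DUAL_CPU = 12     # one DIMM per channel on both CPUs (balanced)
ALL_SLOTS_DUAL_CPU = 16       # white plus black slots (unbalanced)
RDIMM_SIZES_GB = (8, 16, 32)  # sizes named in this post; extend as needed


def smallest_balanced_dimm(target_gb, sizes=RDIMM_SIZES_GB):
    """Return the smallest DIMM size in GB that reaches target_gb using only
    the 12 white slots, or None if no listed size is large enough."""
    for size in sorted(sizes):
        if size * WHITE_SLOTS_DUAL_CPU >= target_gb:
            return size
    return None


if __name__ == "__main__":
    for target in (96, 192, 256, 384):
        size = smallest_balanced_dimm(target)
        total = size * WHITE_SLOTS_DUAL_CPU
        print(f"need {target} GB -> 12 x {size} GB = {total} GB, balanced")
```

For example, a 256 GB requirement could be met with 16 x 16 GB DIMMs in an unbalanced layout, but 12 x 32 GB DIMMs provide 384 GB while keeping every channel at one DIMM.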
Figure 8. CPU choice (memory controller speed) vs. DIMM count vs. DIMM size.
The figures below of the Dell PowerEdge C6420 and FC\M 640 show that populating memory slots A7, B7, A8, and B8 places memory controller channels 0 and 3 in 2 DIMMs per channel mode. Because the other 4 memory controller channels remain in single DIMM per channel mode, memory performance drops, as reflected in the performance tables above.
Figure 9. FC\M640 Memory Configuration table
Figure 10. C6400\C6420 Memory Configuration table
In conclusion, processor and memory configurations must be considered together when configuring 14G Dell PowerEdge modular systems. Choosing the wrong processor or memory configuration can have a dramatic impact on the performance of Dell PowerEdge modular servers.