If you need a tool that can change BIOS settings while you are still booted into the operating system, look no further. syscfg is a powerful tool provided with DTK (the Dell OpenManage Deployment Toolkit) that allows you to do just that. This article covers using syscfg to change the number of CPU cores that are presented to the operating system.

Obtain DTK
1. Click on "Drivers and Downloads"
2. Click on "Change Your Product" and select the server you are using.
3. Change your "Operating System" to "Red Hat Enterprise Linux"
4. Select "Systems Management" from the bottom of the page to expand that category
5. Proceed to download the "OpenManage Deployment Toolkit"
Environment for this exercise
Process
1. Copy DTK to root's home directory on the installer node (for our example)
2. Mount the .iso as a loopback device
# mkdir 1; mount -o loop dtk_3.1.1_165_Linux.iso 1
3. Copy the dell-toolkit.rpm package from the mounted image to an NFS share
# mkdir /home/apps; cp /root/1/tools/dell-toolkit.rpm /home/apps/.
4. Install DTK. You can install it on a single server for testing or on every node in the cluster. We will demonstrate installing on the entire cluster:
# pdsh -a rpm -ivh /home/apps/dell-toolkit.rpm
If you want to test it on a few nodes use the following:
# pdsh -w compute-00-0[1-5] rpm -ivh /home/apps/dell-toolkit.rpm
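Before moving on, it may be worth confirming that the install actually put syscfg in place on each node. The following is a minimal sketch, not part of DTK: the function name check_syscfg is hypothetical, and the default path is simply the one used in the syscfg commands later in this article.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DTK): report whether the syscfg
# binary is present and executable at the given path.
# The default path is the one used by the syscfg commands in this article.
check_syscfg() {
    bin="${1:-/opt/dell/toolkit/bin/syscfg}"
    if [ -x "$bin" ]; then
        echo "installed: $bin"
    else
        echo "missing: $bin"
        return 1
    fi
}

# Example of the same check run across the cluster with pdsh (assumes pdsh
# is configured as in the steps above):
#   pdsh -a 'test -x /opt/dell/toolkit/bin/syscfg && echo installed || echo missing'
```

Any node that reports "missing" would need the rpm installed again before the syscfg steps will work on it.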
5. Get the current state of the cluster by viewing the contents of /proc/cpuinfo
# pdsh -a "cat /proc/cpuinfo | grep processor | wc -l && dmidecode | grep -A2 -m 1 'Product Name:' | grep -v Ver"
For a Rocks+ cluster, use the "tentakel" command instead of "pdsh -a"
# tentakel "cat /proc/cpuinfo | grep processor | wc -l && dmidecode | grep -A2 -m 1 'Product Name:' | grep -v Ver"
You should see some output like this:
<snip>
compute-00-11-eth0: 16
compute-00-11-eth0: Product Name: PowerEdge R410
compute-00-11-eth0: Serial Number: 6X5K0L1
compute-00-37-eth0: 4
compute-00-37-eth0: Product Name: PowerEdge R410
compute-00-37-eth0: Serial Number: HW10ML1
<snip>
This shows how many processors each node sees in /proc/cpuinfo.
6. Get the current state of the cluster by using syscfg
# pdsh -a "/opt/dell/toolkit/bin/syscfg --logicproc && /opt/dell/toolkit/bin/syscfg --cpucore"
You will see output like this:
<snip>
compute-00-18-eth0: logicproc=enable
compute-00-35-eth0: logicproc=enable
compute-00-33-eth0: logicproc=enable
compute-00-32-eth0: logicproc=disable
compute-00-31-eth0: logicproc=disable
compute-00-34-eth0: logicproc=enable
compute-00-03-eth0: cpucore=2
compute-00-17-eth0: cpucore=4
compute-00-08-eth0: cpucore=4
compute-00-01-eth0: cpucore=2
<snip>
7. Disable logical processing and set the cpucore to 4 on all nodes
# pdsh -a "/opt/dell/toolkit/bin/syscfg --logicproc=disable && /opt/dell/toolkit/bin/syscfg --cpucore=4"
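On a large cluster the raw syscfg output is long, and it can be easier to list only the nodes whose settings differ from the target. The sketch below assumes the "node: key=value" output format shown in this article. The function name find_mismatches is hypothetical and is not part of DTK or pdsh.

```shell
#!/bin/sh
# Hypothetical helper (not part of DTK or pdsh): read pdsh-style
# "node: key=value" lines on stdin and print every node whose logicproc
# or cpucore value differs from the requested target.
# Usage: find_mismatches <logicproc-target> <cpucore-target>
find_mismatches() {
    awk -v lp="$1" -v cc="$2" '
        {
            node = $1
            sub(/:$/, "", node)      # strip the trailing colon pdsh appends
            split($2, kv, "=")       # kv[1] = setting name, kv[2] = value
            if (kv[1] == "logicproc" && kv[2] != lp) print node, $2
            if (kv[1] == "cpucore"   && kv[2] != cc) print node, $2
        }
    '
}

# Example (assumes pdsh and syscfg are set up as in the steps above):
#   pdsh -a "/opt/dell/toolkit/bin/syscfg --logicproc && /opt/dell/toolkit/bin/syscfg --cpucore" \
#       | find_mismatches disable 4
```

Run before the change, this lists the nodes that will actually be modified. Run again after the reboot, it should print nothing if every node picked up the new settings.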
8. Reboot the nodes for the changes to take effect

Confirm the Changes
1. Get the new state of the cluster by viewing the contents of /proc/cpuinfo, using the same command as in step 5 of the Process section
You should see some output like this:
<snip>
compute-00-11-eth0: 8
compute-00-11-eth0: Product Name: PowerEdge R410
compute-00-11-eth0: Serial Number: 6X5K0L1
compute-00-37-eth0: 8
compute-00-37-eth0: Product Name: PowerEdge R410
compute-00-37-eth0: Serial Number: HW10ML1
<snip>
2. Get the new state of the cluster by using syscfg
# pdsh -a "/opt/dell/toolkit/bin/syscfg --logicproc && /opt/dell/toolkit/bin/syscfg --cpucore"
You will see output like:
<snip>
compute-00-18-eth0: logicproc=disable
compute-00-35-eth0: logicproc=disable
compute-00-33-eth0: logicproc=disable
compute-00-32-eth0: logicproc=disable
compute-00-31-eth0: logicproc=disable
compute-00-34-eth0: logicproc=disable
compute-00-03-eth0: cpucore=4
compute-00-17-eth0: cpucore=4
compute-00-08-eth0: cpucore=4
compute-00-01-eth0: cpucore=4
<snip>

Scheduler Impact
Clustercorp Rocks+
For correct information in the job schedulers with Rocks+, the compute nodes need to be removed and then re-added. A simple re-install is not enough. The frontend's database needs to be cleaned up first and then the servers re-added, since Rocks looks at the core count in /proc/cpuinfo for its database and then provides that info to the queuing system. For every node in the cluster:
-- Scott Collier