If you need to change BIOS settings while still booted into the operating system, syscfg can do it. syscfg is a command-line tool included in DTK, the Dell OpenManage Deployment Toolkit. This article covers using syscfg to change the number of CPU cores that are presented to the operating system.


Obtain DTK

  • If you are using PCM 1.2a, follow these instructions. If you are using PCM 1.2b, DTK is already packaged.

1. Click on "Drivers and Downloads"
2. Click on "Change Your Product" and select the server you are using.
3. Change your "Operating System" to "Red Hat Enterprise Linux"
4. Select "Systems Management" from the bottom of the page to expand that category
5. Proceed to download the "OpenManage Deployment Toolkit"


Environment for this exercise

  • Hardware
    • Frontend: PowerEdge R710 (PER710)
    • Compute Nodes: PowerEdge R410 (PER410)
  • Cluster Middleware (you can test with either)
    • PCM 1.2a (with PCM 1.2b, DTK is already packaged)
    • Clustercorp Rocks+
      • DTK is already bundled with the solution and is installed on every node in the cluster at /opt/dell/toolkit/bin.
      • Follow the steps below starting at step 5, and use the "tentakel" command instead of "pdsh -a".

Process

1. Copy the DTK .iso to root's home directory on the installer node (/root in this example)

2. Mount the .iso as a loopback device

# mkdir 1; mount -o loop dtk_3.1.1_165_Linux.iso 1


3. Copy the dell-toolkit.rpm package from the mounted .iso to an NFS share that every node can reach

# mkdir /home/apps; cp /root/1/tools/dell-toolkit.rpm /home/apps/.


4. Install DTK. You can install it on a few nodes for testing or on every node in the cluster. This example installs it on the entire cluster:

# pdsh -a rpm -ivh /home/apps/dell-toolkit.rpm


To test on only a few nodes first, use the following instead:

# pdsh -w compute-00-0[1-5] rpm -ivh /home/apps/dell-toolkit.rpm


5. Get the current state of the cluster: the processor count from /proc/cpuinfo and the model and serial number from dmidecode

# pdsh -a "cat /proc/cpuinfo | grep processor | wc -l && dmidecode | grep -A2 -m 1 'Product Name:' | grep -v Ver"


For a Rocks+ cluster, use the "tentakel" command instead of "pdsh -a"

# tentakel "cat /proc/cpuinfo | grep processor | wc -l && dmidecode | grep -A2 -m 1 'Product Name:' | grep -v Ver"


You should see some output like this:

<snip>
compute-00-11-eth0: 16
compute-00-11-eth0: Product Name: PowerEdge R410
compute-00-11-eth0: Serial Number: 6X5K0L1
compute-00-37-eth0: 4
compute-00-37-eth0: Product Name: PowerEdge R410
compute-00-37-eth0: Serial Number: HW10ML1
<snip>

This shows how many logical processors each node sees in /proc/cpuinfo, along with its model and serial number.
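
Because pdsh interleaves lines from different nodes, it can help to collapse the three lines per node into one. The following is a minimal sketch, not part of DTK or pdsh: summarize_nodes is a hypothetical helper name, and it assumes the "host: value" line format shown in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read "host: value" lines (pdsh or tentakel style)
# on stdin and print one line per host: host, CPU count, model, serial.
summarize_nodes() {
    awk -F': ' '
        # A bare integer is the processor count from wc -l.
        $2 ~ /^[0-9]+$/     { cpus[$1] = $2 }
        # dmidecode lines; match on the label so leading whitespace is tolerated.
        $2 ~ /Product Name$/  { model[$1] = $3 }
        $2 ~ /Serial Number$/ { serial[$1] = $3 }
        END {
            for (h in cpus) print h, cpus[h], model[h], serial[h]
        }
    ' | LC_ALL=C sort
}
```

To use it, pipe the step 5 command into the function, for example by appending "| summarize_nodes" to the pdsh command line.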

6. Get the current BIOS settings of the cluster by using syscfg

# pdsh -a "/opt/dell/toolkit/bin/syscfg --logicproc && /opt/dell/toolkit/bin/syscfg --cpucore"


You will see output like this:

<snip>
compute-00-18-eth0: logicproc=enable
compute-00-35-eth0: logicproc=enable
compute-00-33-eth0: logicproc=enable
compute-00-32-eth0: logicproc=disable
compute-00-31-eth0: logicproc=disable
compute-00-34-eth0: logicproc=enable
compute-00-03-eth0: cpucore=2
compute-00-17-eth0: cpucore=4
compute-00-08-eth0: cpucore=4
compute-00-01-eth0: cpucore=2
<snip>
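
On a large cluster it is easier to see how many nodes share each setting than to read every line. The sketch below tallies the syscfg output by value. count_settings is a hypothetical helper name, and the "host: key=value" format is assumed from the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read "host: key=value" lines on stdin and print
# how many hosts reported each key=value pair.
count_settings() {
    awk -F': ' '
        $2 ~ /=/ { tally[$2]++ }
        END { for (s in tally) print tally[s], s }
    ' | LC_ALL=C sort -k2
}
```

Piping the step 6 command into count_settings gives one line per distinct setting, which makes a mixed cluster obvious at a glance.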

7. Disable logical processors (logicproc) and set cpucore to 4 on all nodes

# pdsh -a "/opt/dell/toolkit/bin/syscfg --logicproc=disable && /opt/dell/toolkit/bin/syscfg --cpucore=4"


8. Reboot the nodes for the changes to take effect
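
Steps 7 and 8 can be wrapped in a small script with a dry-run guard, so that the commands can be reviewed before anything is changed or rebooted. This is only a sketch under stated assumptions: run and apply_and_reboot are hypothetical names, DRY_RUN is a variable invented for this example, and rebooting through "pdsh ... reboot" is an assumption (the article does not say how the nodes are rebooted).

```shell
#!/bin/sh
# Path taken from the article; adjust if DTK is installed elsewhere.
SYSCFG=/opt/dell/toolkit/bin/syscfg

# Hypothetical guard: print the command when DRY_RUN is 1 (the default),
# run it otherwise. Printed commands lose their original quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: pdsh host list, for example compute-00-0[1-5]
# $2: value to pass to --cpucore
apply_and_reboot() {
    nodes="$1"
    cores="$2"
    # The reboot is issued only if the first pdsh invocation itself succeeds;
    # that alone does not prove every node accepted the change.
    run pdsh -w "$nodes" "$SYSCFG --logicproc=disable && $SYSCFG --cpucore=$cores" &&
    run pdsh -w "$nodes" reboot
}
```

Calling apply_and_reboot 'compute-00-0[1-5]' 4 with DRY_RUN unset only prints the two pdsh commands. Setting DRY_RUN=0 runs them.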


Confirm the Changes

1. Get the new state of the cluster by viewing the contents of /proc/cpuinfo

# pdsh -a "cat /proc/cpuinfo | grep processor | wc -l && dmidecode | grep -A2 -m 1 'Product Name:' | grep -v Ver"


You should see some output like this:

<snip>
compute-00-11-eth0: 8
compute-00-11-eth0: Product Name: PowerEdge R410
compute-00-11-eth0: Serial Number: 6X5K0L1
compute-00-37-eth0: 8
compute-00-37-eth0: Product Name: PowerEdge R410
compute-00-37-eth0: Serial Number: HW10ML1
<snip>
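
The counts can be sanity-checked with simple arithmetic: logical processors = sockets x cores per socket x threads per core. The sketch below assumes each R410 has two populated sockets and that cpucore is a per-socket value. Neither is stated in this article, but both are consistent with the numbers shown. expected_cpus is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: sockets x cores per socket x threads per core.
expected_cpus() {
    echo $(( $1 * $2 * $3 ))
}

expected_cpus 2 4 1   # cpucore=4, logicproc=disable: prints 8
expected_cpus 2 4 2   # cpucore=4, logicproc=enable:  prints 16
```

Under the same assumptions, a node that reported 4 before the change would be consistent with cpucore=2 and logical processors disabled.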

2. Get the new state of the cluster by using syscfg

# pdsh -a "/opt/dell/toolkit/bin/syscfg --logicproc && /opt/dell/toolkit/bin/syscfg --cpucore"


You will see output like:

<snip>
compute-00-18-eth0: logicproc=disable
compute-00-35-eth0: logicproc=disable
compute-00-33-eth0: logicproc=disable
compute-00-32-eth0: logicproc=disable
compute-00-31-eth0: logicproc=disable
compute-00-34-eth0: logicproc=disable
compute-00-03-eth0: cpucore=4
compute-00-17-eth0: cpucore=4
compute-00-08-eth0: cpucore=4
compute-00-01-eth0: cpucore=4
<snip>
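
Rather than reading the output by eye, the check can be automated. The following sketch exits non-zero and lists the offending lines if any node reports a value other than the one requested. verify_settings is a hypothetical helper name, and the "host: key=value" format is assumed from the sample output above.

```shell
#!/bin/sh
# Hypothetical check: read "host: key=value" lines on stdin.
# $1: wanted logicproc value (for example disable)
# $2: wanted cpucore value   (for example 4)
# Prints each mismatching line and returns 1 if any were found.
verify_settings() {
    awk -F': ' -v lp="$1" -v cc="$2" '
        $2 ~ /^logicproc=/ && $2 != ("logicproc=" lp) { print "MISMATCH " $0; bad = 1 }
        $2 ~ /^cpucore=/   && $2 != ("cpucore=" cc)   { print "MISMATCH " $0; bad = 1 }
        END { exit bad + 0 }
    '
}
```

For the settings used in this article, the step 2 command would be piped into verify_settings disable 4.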


Scheduler Impact

Clustercorp Rocks+
For the job schedulers to report correct information with Rocks+, the compute nodes need to be removed and then re-added. A simple re-install is not enough. Rocks takes the core count from /proc/cpuinfo for its database and passes that value to the queuing system, so the frontend's database has to be cleaned up first and the servers re-added.

For every node in the cluster:

  1. rocks remove host compute-x-y
  2. insert-ethers --rack=x --rank=y, and select compute
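
For more than a handful of nodes, the two commands can be generated from the host names. The sketch below only prints the commands and does not run them, because insert-ethers is interactive and you still select compute by hand. rocks_readd_commands is a hypothetical helper name, and it assumes host names of the form compute-x-y. The rack and rank values are passed through exactly as they appear in the name.

```shell
#!/bin/sh
# Hypothetical helper: for each compute-<rack>-<rank> host name given as an
# argument, print the remove and re-add commands from the steps above.
rocks_readd_commands() {
    for host in "$@"; do
        rack=${host#compute-}    # strip the "compute-" prefix
        rack=${rack%%-*}         # keep everything before the next "-"
        rank=${host#compute-*-}  # strip "compute-<rack>-"
        echo "rocks remove host $host"
        echo "insert-ethers --rack=$rack --rank=$rank"
    done
}
```

For example, rocks_readd_commands compute-0-3 prints "rocks remove host compute-0-3" followed by "insert-ethers --rack=0 --rank=3".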

-- Scott Collier