This post was written by Sathish D of the Dell OpenManage Connections team.
OVERVIEW
HP Operations Manager (HPOM) supports scheduled task policies, which you can use to invoke external applications from the HPOM console. The Dell Smart Plug-in (SPI) includes a scheduled task policy that retrieves device health status and generates the corresponding status messages in the HPOM console.
This post explains periodic health monitoring of Dell servers, Dell Remote Access Controllers (DRACs), and Dell Chassis using the Dell SPI Scheduled Status Poll policies with HP Operations Manager for Windows.
Periodic Health Monitoring Scheduled Status Poll policies for Dell servers, DRACs, and Dell Chassis in the HPOM console:
PREREQUISITES:
Before scheduling status poll policies, complete these prerequisites
MONITORING DELL SERVERS IN HPOM CONSOLE:
Once the Dell Hardware Auto-grouping Policy has completed, it identifies the Dell servers that were discovered (in-band, or out-of-band through iDRAC7 with available licenses) and creates hierarchies under both Nodes and Services in the HPOM console.
Hierarchical representation of classified Dell Servers in Service Hierarchy:
(Both in-band and out-of-band management through iDRAC7 devices)
DEPLOYING THE DELL SERVER SCHEDULED STATUS POLL POLICY:
The scheduled task policy “Dell Server scheduled Status Poll” monitors the overall system health status of Dell servers in the HPOM console. It covers servers managed in-band, servers managed out-of-band through iDRAC7 devices, and bare-metal systems reached through iDRAC7 devices. By default, the policy runs every day at 2:00 A.M. You can change the interval to a custom value as required. Policies are deployed via the Management Server.
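To make the default schedule concrete, here is a minimal Python sketch that computes when a daily 2:00 A.M. poll would next fire. It is only an illustration of the timing described above: `next_poll` and `DEFAULT_POLL_TIME` are hypothetical names, not part of HPOM or the Dell SPI, and in practice the schedule is set in the policy itself.

```python
from datetime import datetime, time, timedelta

# Default taken from this post: the poll runs every day at 2:00 A.M.
# (hypothetical constant; HPOM stores the schedule in the policy itself)
DEFAULT_POLL_TIME = time(hour=2, minute=0)


def next_poll(now: datetime, poll_time: time = DEFAULT_POLL_TIME) -> datetime:
    """Return the first daily run time strictly after `now` (naive datetimes)."""
    candidate = datetime.combine(now.date(), poll_time)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_poll(datetime.now()))                      # default 2:00 A.M. schedule
    print(next_poll(datetime.now(), time(18, 30)))        # a custom daily time
```

Passing a different `poll_time` mirrors changing the interval to a custom value.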
After the policy runs, the device's health status is queried through the communication protocol, and the corresponding status message with its associated severity is displayed in the HPOM console. The health status message is also associated with the device, which is classified under the Dell groups in both the Node and Service hierarchies.
The scheduled task policy acknowledges the existing health message and posts the current health status message for the classified servers and Integrated Dell Remote Access Controller 7 (iDRAC7) devices in the Active Message Browser of the HPOM console. As a result, the HPOM console always displays the latest health status of the servers and the iDRAC7 devices.
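The acknowledge-then-post behaviour can be pictured with a small Python model. This is a sketch under stated assumptions, not how HPOM is implemented: the `Message` and `MessageBrowser` classes, the `kind` labels, and the node name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    node: str
    kind: str       # "health" or "trap" (labels are assumptions for this sketch)
    severity: str   # "Normal", "Warning", or "Critical"


@dataclass
class MessageBrowser:
    """Toy stand-in for the Active Message Browser (hypothetical model)."""
    active: List[Message] = field(default_factory=list)
    acknowledged: List[Message] = field(default_factory=list)

    def post_health(self, node: str, severity: str) -> None:
        # Acknowledge any health message already active for this node...
        stale = [m for m in self.active if m.node == node and m.kind == "health"]
        self.acknowledged.extend(stale)
        self.active = [m for m in self.active if m not in stale]
        # ...then post the current health status as the only active one.
        self.active.append(Message(node, "health", severity))


if __name__ == "__main__":
    browser = MessageBrowser()
    browser.post_health("server01", "Normal")    # first poll
    browser.post_health("server01", "Critical")  # next poll replaces it
    print(browser.active)
    print(browser.acknowledged)
```

Only health messages are replaced in this model; other messages for the same node stay active, so each node carries exactly one current health message.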
The Scheduled Status Poll policy generates health messages with one of three severities in the HPOM console: Normal, Warning, or Critical.
Message Association in Service Hierarchy:
After the Dell Servers Scheduled Status Poll policy runs, it retrieves the overall health status of the classified Dell servers and iDRAC7 devices. The retrieved health status is mapped to its corresponding health message (Normal, Warning, or Critical). The health message is associated with the servers and iDRAC7 devices (child node: Global System Status) and appears in the Active Message Browser of the HPOM console. The message severity of the child node is propagated to the parent node in both the Node and Service hierarchies.
Service Hierarchy: Only the Global System Status message, with its corresponding health severity, is associated and updated for the node. The health severity propagates to the device's parent group.
Node Hierarchy: In the Node hierarchy, both SNMP trap messages and health messages are associated with the server node; the worst-case message severity is propagated to its parent node group.
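The worst-case propagation described above can be sketched in a few lines of Python. This is an illustrative model only: the ranking table, the `rollup` function, the group and node names, and the choice to treat an empty group as Normal are all assumptions, not HPOM internals.

```python
from typing import Dict, Iterable, List, Union

# Ordering of the three severities named in this post (ranking is an assumption).
SEVERITY_RANK = {"Normal": 0, "Warning": 1, "Critical": 2}

# A leaf is the list of message severities on a node; a group maps names to subtrees.
Tree = Union[List[str], Dict[str, "Tree"]]


def worst(severities: Iterable[str]) -> str:
    """Return the most severe entry; an empty group is treated as Normal (assumption)."""
    return max(severities, key=SEVERITY_RANK.__getitem__, default="Normal")


def rollup(tree: Tree) -> str:
    """Propagate the worst-case severity from child nodes up to the parent group."""
    if isinstance(tree, list):
        return worst(tree)
    return worst(rollup(child) for child in tree.values())


if __name__ == "__main__":
    # Hypothetical hierarchy: group and node names are made up for illustration.
    hierarchy = {
        "Dell Servers": {"server01": ["Normal", "Warning"], "server02": ["Normal"]},
        "Dell Chassis": {"cmc01": ["Critical"]},
    }
    print(rollup(hierarchy["Dell Servers"]))
    print(rollup(hierarchy))
```

In the Service hierarchy each leaf would hold only the Global System Status severity, while in the Node hierarchy a leaf may also carry SNMP trap severities; the roll-up logic is the same in both cases.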
Health Status Message Association for Server Node in Service Hierarchy:
TROUBLESHOOTING STEPS:
When the device's global health status is Warning or Critical, follow these steps to troubleshoot the issues:
- Review the outstanding messages for the device in the Active Message Browser. If any issues exist, resolve them as per the instructions in the message browser.
- Launch the 1:1 console for further troubleshooting.
- Launch Dell tools for further troubleshooting and to take corrective action:
The Warranty report page retrieves warranty-related information using the service tag associated with the system. You can review the warranty details of the system and also renew the warranty.
The OpenManage Essentials (OME) console can be used to verify device and component health. It also provides rich device inventory information. You can launch the OME console to troubleshoot device-specific issues further.
MONITORING DELL DRACs AND CHASSIS IN HPOM CONSOLE:
Once the Dell Hardware Auto-grouping Policy has completed, it identifies the discovered Dell DRACs and Chassis.
Hierarchical Representation of Dell DRACs and Chassis in Services:
DEPLOYING THE DELL DRACs AND CHASSIS STATUS POLL POLICY:
The scheduled task policy “Dell DRAC’s and Chassis scheduled Status Poll” monitors the overall system health status of Dell DRACs (DRAC5 and iDRAC6 devices, both monolithic and modular) and Dell Chassis (Chassis Management Controller (CMC) and DRAC/MC devices) in the HPOM console. By default, the policy runs every day at 2:00 A.M. You can change the interval to a custom value as required. Policies are deployed via the Management Server.
After the policy runs, the device's health status is queried over SNMP, and a corresponding message with the associated severity is shown in the HPOM console. The health status message is also associated with the device, which is classified under the Dell groups in both the Node and Service hierarchies.
The scheduled task policy acknowledges the existing health message and posts the current health status message for the classified DRACs and Chassis in the Active Message Browser of the HPOM console. As a result, the HPOM console always displays the latest health status of the DRACs and Chassis.
After the Dell DRACs and Chassis Scheduled Status Poll policy runs, it retrieves the overall health status of the classified Dell DRACs and Chassis. The retrieved health status is mapped to the appropriate health message (Normal, Warning, or Critical). The health message is associated with the DRAC or Chassis child node (child node: Global System Status) and appears in the Active Message Browser of the HPOM console. The message severity for the node (Normal, Warning, or Critical) is propagated to the parent node in both the Node and Service hierarchies.
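The mapping step can be sketched as a simple lookup. Everything below is an assumption for illustration: the raw status labels (`ok`, `nonCritical`, `critical`) stand in for whatever value the device actually reports over SNMP, which depends on the device MIB, and `health_message` is a hypothetical helper rather than part of the Dell SPI.

```python
# Hypothetical raw status labels mapped to the three severities named in this post.
# The real values returned by a DRAC or CMC depend on its MIB.
STATUS_TO_SEVERITY = {
    "ok": "Normal",
    "nonCritical": "Warning",
    "critical": "Critical",
}


def health_message(node: str, raw_status: str) -> dict:
    """Build a health message for the node's Global System Status child node."""
    severity = STATUS_TO_SEVERITY.get(raw_status)
    if severity is None:
        # How unrecognised statuses are handled is not covered in this post;
        # this sketch simply refuses to guess.
        raise ValueError(f"unmapped status {raw_status!r} for {node}")
    return {"node": node, "object": "Global System Status", "severity": severity}


if __name__ == "__main__":
    print(health_message("cmc01", "nonCritical"))
```

The resulting message is what would then be posted to the Active Message Browser and rolled up to the parent group.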
Health Status Message Association for Chassis in Service Hierarchy:
When the device Global health status is “Warning” or “Critical”, follow these steps to troubleshoot the issues:
- Review the outstanding messages for the device in the Active Message Browser. If any issues exist, resolve them as per the instructions in the message browser.
APPENDIX:
Refer to the following links: