by Munira Hussain and Xin Chen

This blog previews how Bright Cluster Manager 7.0 manages and monitors the Dell HPC NFS Storage Solution (NSS). From a system administrator's perspective, one of the challenges of a storage solution is configuring, managing and monitoring it. Bright gathers metrics and data from the Dell systems, the operating system and the storage management utilities, and makes all of this information available through a powerful and intuitive graphical user interface.

With NSS integrated into Bright Cluster Manager, an administrator defines the NSS server (or both servers in a failover pair). The Bright Cluster Manager GUI or command-line shell then monitors and manages the NSS solution, including displaying the status of the HA failover. The main dashboard also reports the total I/O capacity, the utilized capacity and the percentage of file system space available, along with the directory mount points. The event viewer reports any existing issues, alerts or exceeded thresholds. To enable NSS functionality, the administrator assigns a storage role to specific servers and then defines those servers as part of an NSS configuration. For each NSS node, the administrator can also set the number of NFS threads to tune performance. The recommended number of threads depends on the specific configuration and its I/O capacity, as given in the NSS recipe.
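To make the dashboard capacity figures concrete, here is a minimal Python sketch of how the utilized capacity and the percentage of space available can be derived for a mount point. It is not Bright Cluster Manager code and does not use any Bright API; the names `CapacitySummary`, `summarize_capacity` and `summarize_mount` are hypothetical, and the sketch simply assumes the standard library's `shutil.disk_usage` is a reasonable stand-in for whatever Bright collects.

```python
import shutil
from typing import NamedTuple


class CapacitySummary(NamedTuple):
    """Hypothetical container for the three figures a dashboard might show."""
    total_bytes: int
    used_bytes: int
    percent_available: float


def summarize_capacity(total_bytes: int, used_bytes: int, free_bytes: int) -> CapacitySummary:
    """Derive the percentage of space available from raw byte counts."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent_available = 100.0 * free_bytes / total_bytes
    return CapacitySummary(total_bytes, used_bytes, percent_available)


def summarize_mount(mount_point: str) -> CapacitySummary:
    """Read the byte counts for a mounted file system and summarize them."""
    usage = shutil.disk_usage(mount_point)  # named tuple: total, used, free
    return summarize_capacity(usage.total, usage.used, usage.free)


if __name__ == "__main__":
    # "/" is only a placeholder; an NSS export would be mounted elsewhere.
    print(summarize_mount("/"))
```

Keeping the arithmetic in `summarize_capacity` separate from the file system call makes the calculation easy to check with known numbers.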

Additionally, the Bright-NSS integration provides details on the storage configuration: it captures the type of storage and controllers and lists their firmware and operating system details. It also monitors and displays information on virtual disks, RAID configurations and physical disks, and reports any errors or bad disks or sectors.

System administrators rely on many I/O metrics and parameters to check file system and I/O performance. Bright Cluster Manager exposes the kernel-level parameters gathered from iostat and nfsstat and graphs them for easy monitoring. Multiple metrics and nodes can be selected together and added for comparison. The table below compares the IOtime across two different partitions on an NSS node. Another graph compares the deviation in memory utilization and in sectors read and written across the active and passive NSS nodes. Administrators can use these metrics to define thresholds for alerting or automation; for example, if an NSS file system is 85% full, an email alert can be sent to administrators.
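The threshold idea can be sketched in a few lines of Python. This is an illustration only, not how Bright Cluster Manager implements its thresholds: the `Threshold` class, the `check_thresholds` function and the metric name `percent_used` are all hypothetical, and actually sending the email is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    """Hypothetical threshold: alert when `metric` reaches `limit`."""
    metric: str
    limit: float
    message: str


def check_thresholds(sample: Dict[str, float], thresholds: List[Threshold]) -> List[str]:
    """Return one alert string for every threshold the sample has reached."""
    alerts = []
    for t in thresholds:
        value = sample.get(t.metric)
        # Metrics missing from the sample are skipped rather than treated as errors.
        if value is not None and value >= t.limit:
            alerts.append(f"{t.metric}={value:g} reached limit {t.limit:g}: {t.message}")
    return alerts


if __name__ == "__main__":
    # The 85% limit mirrors the example in the text; the sample value is made up.
    rules = [Threshold("percent_used", 85.0, "NSS file system is nearly full")]
    for alert in check_thresholds({"percent_used": 91.0}, rules):
        print(alert)
```

In a real deployment the returned alert strings would be handed to a mailer or an automation hook instead of being printed.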