How do I rebuild a corrupt drive on the Perc S100?

I have a new T110 II with a PERC S100 and 2 drives in a mirrored array. This morning the server wouldn't boot up - the dreaded 'Windows unable to find boot drive'. I unplugged one of the drives (HDD 0) and the system booted up okay. If I plug that drive back in, the system fails to boot.

In this case, how do I rebuild the mirror onto the failed drive? Just so you are aware the failed drive is physically okay and passes all hardware tests - it appears that I have a software corruption that makes the boot fail.

Dell have suggested that I format the drive in another machine and put it back into the array - that seems a bit clumsy. Is there no way I can force a rebuild on the machine itself?

Thanks for any help.

Verified Answer
  • Here is a link for the S100 documentation for further reference: http://dell.to/JrRG4E

    You have a couple of choices in the Ctrl+R BIOS utility:

    1. Rescan the drives

    To perform a rescan, select Rescan Disks from the Main Menu field and press <Enter>. (The activity indicator, in the information field at the top of the window, spins while the physical disks are being polled).

    The Rescan Disks option rescans all the channels, searches for new or removed physical disks, and re-reads the configuration information from each physical disk.

     NOTE: Sometimes when a physical disk has failed, it can be brought online through a rescan.  

    2. Set it as a hot spare and let it rebuild

    3. Swap virtual disks. According to the manual, the boot disk has to be first in the boot order.

    Regards,

    Geoff P
    Dell | Social Outreach Services - Enterprise


    Download the Dell Quick Resource Locator app today to access PowerEdge support content on your mobile device!
    (iOS, Android, Windows)

All Replies
  • Thanks Geoff. I will try it tomorrow and let you know how I get on.

  • Geoff - would I be better off wiping the drive on another machine first? I don't want the corruption to get duplicated to the other drive and I figure a blank drive might be better.

  • Your call. I would go ahead and clear it, then add it back in as a hot spare. Let us know how you fare.

    Regards,

    Geoff P
    Dell | Social Outreach Services - Enterprise



  • Ok - thanks Geoff. Will replace this evening and let you know how I got on.

  • I placed the drive back in without wiping it and did a rescan (thought I would try this first). Everything came back up fine! :) Looking at the OpenManage screen, I could see that the drive appeared offline to the controller, so I set it up as a global hot spare (the only option available to me) and it is now rebuilding the mirror. Will check it again this evening.

    Thanks for all your help Geoff.

  • Unfortunately, every 3-6 months I have to go through the whole process again. If I swap the drives over, it comes back up and rebuilds the corrupt drive. It looks like there is some issue with the PERC S100 controller. Has anyone else had this problem? Is it just me?

  • "Just so you are aware the failed drive is physically okay and passes all hardware tests"

    If it is the same drive which fails, I would replace the drive. Hardware tests do not account for infrequent, intermittent disk controller issues, such as a component going out of spec or a power anomaly passing through the system, which could possibly be detected only if the diagnostics were run for weeks. In the past 20 years I have had approximately 15 disk arrays pass diagnostics with flying colors, but only stabilize after one or more of the disks were replaced.

  • Thanks for the reply. I think that this is one of those cases. I have noticed that there are quite a few firmware updates for certain drives in my PowerEdge. As my drives are attached to the PERC S100, how do I update the firmware, given that Windows doesn't 'see' the drive directly - only the controller?

  • Hi Wedgels2 - Did you ever get this resolved? I seem to be having a very similar problem with a new T110 II and PERC S100 with RAID 1, where every 3 or so months either disk 0 or disk 1 will 'fail' according to OpenManage and the array will be degraded. Rebooting the server brings the 'failed' disk back to a 'ready' state, where I can then add it as a global hot spare and rebuild the array.

    I have had the motherboard and RAID controller replaced, reinstalled the OS and replaced one of the disks. Since replacing one of the disks, I have had both disk 0 and disk 1 reported as failed, but both times I have been able to rebuild the array after rebooting.

    I'm not convinced it really is a disk issue and wonder if this is a known problem or if anyone has experienced the same issue and resolved it?
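
One rough way to apply the rule of thumb from earlier in the thread ("if it is the same drive which fails, I would replace the drive") is to note which disk is reported as failed each time the array degrades, then tally the results. The Python sketch below is only an illustration: the function name `summarize_failures`, the log format, and the sample entries are all hypothetical and are not taken from this thread or from any Dell tool.

```python
from collections import Counter


def summarize_failures(events):
    """Tally hand-recorded failure events per disk slot.

    `events` is an iterable of (date_string, disk_slot) tuples, one per
    occasion on which the array was reported as degraded.
    Returns (counts, verdict).
    """
    counts = Counter(slot for _, slot in events)
    if not counts:
        return counts, "no failures recorded"
    if len(counts) == 1:
        # Every failure landed on one slot: the disk itself is the first suspect.
        verdict = "same disk every time: suspect that disk"
    else:
        # Failures hop between slots: look beyond the disks themselves.
        verdict = "failures move between disks: suspect controller, cabling, power or driver"
    return counts, verdict


# Hypothetical log entries, for illustration only.
events = [("2013-01-10", 0), ("2013-04-02", 1), ("2013-07-15", 0)]
counts, verdict = summarize_failures(events)
print(dict(counts), verdict)
```

A tally like this proves nothing on its own, but failures that alternate between disk 0 and disk 1, as described in the last post, point away from a single bad drive.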