Mirroring and Drive Rebuilding

  • Say you have an array of 1 TB drives with a hot-spare configured. If one drive in the array fails and the hot-spare kicks in and starts rebuilding, can you replace the failed drive before the hot-spare has finished rebuilding without causing issues for the array's volumes?

    Thanks
  • You didn't say which server model, RAID controller and OS you're talking about, but most likely (I can't say for sure for 100% of the possible setups) the answer is yes.

    If your RAID controller is a PERC, there won't be any failback (where the RAID controller automatically rebuilds onto the replacement drive). The replacement drive either becomes the new hotspare, or you have to make it the hotspare yourself in OpenManage.


  • It's a PERC 6/E attached to an MD1000 through a 2950. I was under the impression that with this RAID card fail back happened automatically.

    So, for instance, I have the MD1000 array configured like this: disks 6-14 are the RAID volume and disk 0 is the hot spare. Disk 6 fails and starts rebuilding onto disk 0, and I replace disk 6 before the rebuild finishes. Are you saying that disk 6 now becomes the hot-spare? Or shouldn't disk 6 start rebuilding as part of the array volume, with disk 0 falling back into hot-spare mode, as I thought was the case? This is what happens on PERC 5s, if memory serves me correctly.

    Thanks

  • There's no mention of failback (or "fail back") in the feature list of the PERC 6.

    The MD3000(i), Dell|EMC SANs and Dell|EqualLogic SANs do fail back, but I don't think I've ever seen a Dell PERC offer fail back. Maybe someone else remembers a PERC with this capability (it's rare to find on host-based RAID controllers in the first place).


  • Hmmm. I'm obviously confused, then. Okay, so what is best practice in such a scenario? I guess there is really no way to tell which drive is the actual hot-spare just by looking at an enclosure, since the hot-spare shifts position once a failed disk has been replaced, as in the case above.

    Will the replacement disk automatically become the new hot-spare, or do you need to reconfigure the volume and specify it?
  • Correct, you can't tell which disk is the hot-spare by physically looking at the system. You need to check either Server Administrator or the RAID BIOS.

    The replacement disk needs to be manually configured as the hot-spare.

    If you have a preference for the hot-spare location, your best option would probably be to configure the replacement drive as a hot-spare, let the original hot-spare finish rebuilding, then take the original hot-spare disk offline manually so that the new hot-spare (the replacement for the failed drive) starts to rebuild. Once that rebuild starts, force the original hot-spare disk back online and reconfigure it as the hot-spare.
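
    The relocation steps above can be traced with a small toy model. This is only a sketch: the `Enclosure` class and its methods are made up for illustration and are not a PERC or OpenManage API, and the slot numbers (volume on disks 6-14, hot-spare in slot 0) are borrowed from the MD1000 example earlier in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Enclosure:
    """Hypothetical model of slot roles; not a real controller interface."""
    members: set                                   # slots in the RAID volume
    spares: set                                    # slots assigned as hot-spares
    rebuilding: set = field(default_factory=set)   # members still rebuilding

    def fail(self, slot):
        """A member drops out; a hot-spare (if any) takes over and rebuilds."""
        self.members.discard(slot)
        if self.spares:
            spare = min(self.spares)
            self.spares.remove(spare)
            self.members.add(spare)
            self.rebuilding.add(spare)

    def finish_rebuild(self):
        self.rebuilding.clear()

    def assign_spare(self, slot):
        """Manual step: mark a disk as hot-spare."""
        self.spares.add(slot)

    def degraded(self):
        return bool(self.rebuilding)


enc = Enclosure(members=set(range(6, 15)), spares={0})

enc.fail(6)                  # disk 6 fails, slot 0 starts rebuilding
enc.assign_spare(6)          # replacement in slot 6 is made the hot-spare
enc.finish_rebuild()         # let the original hot-spare finish
enc.fail(0)                  # take slot 0 offline; slot 6 starts rebuilding
second_window = enc.degraded()   # the volume is degraded again here
enc.finish_rebuild()
enc.assign_spare(0)          # slot 0 goes back to being the hot-spare

print(sorted(enc.members), sorted(enc.spares), second_window)
```

    The model ends with the volume back on slots 6-14 and the hot-spare in slot 0, at the cost of a second period in which the volume is rebuilding.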

  • Thanks for the info.

  • snapohead wrote:

    Correct, you can't tell the hot-spare by physically looking at the system. You need to check either Server Administrator or the RAID bios.

    The replacement disk needs to be manually configured as the hot-spare.

    If you have a preference to the hot-spare location, your best option would probably be to configure the replacement drive as a hot-spare, let the original hot-spare finish rebuilding, then take the original hot-spare disk offline manually so the new hot-spare (the replacement of the failed drive) starts to rebuild. Once the rebuild starts, force the hot-spare back online and reconfigure as the hot-spare.


    The method Snapohead outlines does work for keeping the hot spare in a designated location. However, I would not recommend it for host-based RAID controllers.

    The reason is that if you pull a disk, you force the LUN into a degraded state. RAID 5 can only tolerate one failed disk at a time, so if another disk fails during this rebuild, your data will be lost. I would recommend just configuring the new disk as the hot spare. For array-based RAID controllers, the array copies the data from the hot spare to the disk that replaced the failed one, then reactivates the original hot spare. In this case, the LUN is not degraded during the copy.
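
    To make the comparison concrete, here is a rough sketch that counts how many times each approach leaves the volume degraded. The phase lists simply restate the descriptions in this thread; the names `STRATEGIES` and `degraded_windows` are made up for illustration and do not come from any Dell tool.

```python
# Each strategy is a list of (phase, volume_is_degraded) pairs,
# following the behaviour described in the posts above.
STRATEGIES = {
    "replacement becomes the new hot spare": [
        ("rebuild onto original hot spare", True),
        ("assign replacement as hot spare", False),
    ],
    "manual relocation (host-based controller)": [
        ("rebuild onto original hot spare", True),
        ("assign replacement as hot spare", False),
        ("take original hot spare offline, rebuild onto replacement", True),
        ("reassign original hot spare", False),
    ],
    "copy-back (array-based controller)": [
        ("rebuild onto original hot spare", True),
        ("copy data from hot spare to replacement", False),
        ("reactivate original hot spare", False),
    ],
}


def degraded_windows(phases):
    """Count the phases during which a second disk failure would lose RAID 5 data."""
    return sum(1 for _, degraded in phases if degraded)


for name, phases in STRATEGIES.items():
    print(f"{name}: {degraded_windows(phases)} degraded window(s)")
```

    Under these assumptions, only the manual relocation exposes the volume twice; the other two approaches have a single degraded window, the one caused by the original failure.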

  • Sorry to hijack an old thread, but my question relates to one of the posts.

    I have an EQL PS4000X that had a disk stop working. It did fail over to one of the hot spares I had set up, but when I removed and replaced the disk, it didn't fail back as you describe it should.

    Short of ejecting other disks, I have not found a way to get the disk active again, so I can't tell whether it is now OK or whether it is actually dead and needs replacing.

    Can you advise whether there is a manual way of bringing the disk back online?