Hard disk failed on PowerEdge 2900


Hello,

 

One of the hard disks in a RAID 5 array on our Dell PowerEdge 2900 has failed. The server has a “Dell PERC 5/i Integrated Controller”. The array was built from 4 hard disks of 300 GB each, but the logical disk it presents is only about 560 GB. By the RAID 5 calculation, the logical disk should be about 900 GB, since it is built from 4 x 300 GB hard disks. It seems to me that one of the hard disks is being used as a “hot spare”. Is that correct?
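
The RAID 5 arithmetic in the question can be sketched as follows. This is only an illustration, not a Dell tool: the function name `raid5_usable_gb` and its parameters are made up for this example, and it assumes equal-size disks and ignores any formatting or unit-conversion overhead.

```python
def raid5_usable_gb(total_disks: int, disk_gb: float, hot_spares: int = 0) -> float:
    """Usable capacity of a RAID 5 array built from equal-size disks.

    RAID 5 spends one disk's worth of space on parity; hot spares sit idle
    and contribute no capacity. (Hypothetical helper for illustration.)
    """
    members = total_disks - hot_spares
    if members < 3:
        raise ValueError("RAID 5 needs at least 3 member disks")
    return (members - 1) * disk_gb


# Four 300 GB disks, all in the array: the 900 GB expected in the question.
print(raid5_usable_gb(4, 300))                # 900
# Three disks in the array plus one hot spare.
print(raid5_usable_gb(4, 300, hot_spares=1))  # 600
```

With one hot spare the result is 600 GB, which is much closer to the reported 560 GB than 900 GB is. Other layouts can produce the same figure, so the number alone does not prove that a hot spare is present.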

 

We have “Dell Server Administrator version 4.5.0” installed on that server, but it can’t display any information such as the RAID configuration, hot spare, etc. I am going to install “Dell Server Administrator version 5.2.0, A00”, which will hopefully give more information.


I’d like to ask the following questions.

 

  1. If one of the RAID hard disks is a hot spare, is there a step-by-step guide for replacing the failed hard disk with a new one?
  2. If there is no hot spare, is there a step-by-step guide for replacing the failed hard disk with a new one?

Many thanks.

Aung

All Replies
  • 001.png

    Hello,

    With “Dell Server Administrator version 4.5.0”, it seems to me that there is an option to check the RAID configuration and setup. However, even after I did a "Global Rescan", nothing appeared. Please see the picture below. Can anybody suggest anything?

    Many thanks.

    Aung

  • Hi Aung,

    It sounds from the information that you have given that one of the hard drives was set up as a hot spare. If one of the hard drives was set up as a hot spare, the hot spare disk would have automatically started to rebuild and you would only have to replace the failed drive and assign that as the new hot spare.

    To do this, you would follow the steps outlined here: support.dell.com/.../chapterh.htm (under Creating Global Hot Spares)

    If there is no hot spare, you would remove the failed hard drive and then follow the steps outlined here: support.dell.com/.../chapterh.htm (under Performing a Manual Rebuild of an Individual Physical Disk)

    I would try the above first, before we look into what's happening with Dell OpenManage Server Administrator, as we want to verify that the virtual disk is back up and running in a fully working state.

    Please let us know how you get on and if you need any further help, please don't hesitate to get back in touch!


    John C
    Dell | Social Outreach Services


    Download the Dell Quick Resource Locator app today to access PowerEdge support content on your mobile device! (iOS, Android, Windows)

  • I don't believe 4.5 is supported on a 2900, so install the latest version for your OS (what OS is it??).  There may be minimum Service Packs or firmware required.

  • MA350004.JPG

    Hi John,

    Thank you very much for your kind information and help. In fact, it is a mail server. Today, I restarted the server just to check the status of the RAID and the failed hard disk via the PERC BIOS. My previous assumption of RAID 5 with a hot spare was wrong. Please see the attached pictures, which are screenshots of the status in the PERC BIOS.

     

    According to the information in the PERC BIOS, it is RAID 10 with no hot spare. Please correct me if I am wrong.
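
A RAID 10 layout would also account for the roughly 560 GB logical disk. The sketch below assumes the drives are 300 GB in decimal units (10^9 bytes) and that the controller reports capacity in binary units (2^30 bytes); both are assumptions, and the function name is invented for this example.

```python
GB = 10**9   # decimal gigabyte, as used on drive labels
GIB = 2**30  # binary gigabyte, assumed here to be what the controller reports


def raid10_usable_bytes(total_disks: int, disk_bytes: int) -> int:
    """RAID 10 mirrors pairs of disks, so half the raw capacity is usable."""
    if total_disks < 4 or total_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return (total_disks // 2) * disk_bytes


usable = raid10_usable_bytes(4, 300 * GB)
print(usable / GB)             # 600.0 decimal GB
print(round(usable / GIB, 1))  # 558.8 binary GB
```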

     

    The following steps would have to be done to replace the failed hard disk.

    1. Shut down the server.
    2. Replace the failed hard disk with a new one.
    3. Restart the server and press Ctrl+R to enter the PERC BIOS.
    4. Go to the new hard disk and rebuild it by following the guide at this link: http://support.dell.com/support/edocs/storage/RAID/PERC5/en/UG/HTML/chapterh.htm#wp1069357

     

    If there is anything wrong in the above steps, kindly correct me. Would you know how long (an estimated time) the rebuild takes for a new 300 GB hard disk in RAID 10? Since it is a mail server, downtime would be critical for us.

     

    Thank you in advance.

    Aung

  • MA350008.JPG

    Sorry, this is another screenshot.

  • MA350009.JPG

    This is another one. Many thanks.

    Aung

  • Hi Aung,

    I've discussed this with one of my colleagues, and you can also simply remove the failed hard drive and replace it with a working one while the server is running; the rebuild should then start automatically. This should result in no downtime at all, as the rebuild can take place in the background, but it will have a significant impact on server performance.

    If you would rather take the server offline, you can also do the same rebuild by following the steps you outlined above.

    It's very tricky to estimate how long it will take to rebuild the hard drive. A conservative estimate would be 3 to 4 hours, but it could take longer than this. If you rebuild the hard drive without shutting down the server, the rebuild will take longer.

    John C
    Dell | Social Outreach Services
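
To put the 3 to 4 hour estimate in perspective, the sketch below works out the average rebuild rate it implies for one 300 GB disk. It assumes the whole disk is rewritten at a constant rate and that 300 GB means 300 x 10^9 bytes; real rebuilds vary with server load and controller settings. The function name is invented for this example.

```python
def implied_rebuild_rate_mb_s(disk_gb: float, hours: float) -> float:
    """Average rate (decimal MB/s) needed to rewrite a whole disk in `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return disk_gb * 1000 / (hours * 3600)


for hours in (3, 4):
    rate = implied_rebuild_rate_mb_s(300, hours)
    print(f"{hours} h -> about {rate:.1f} MB/s")
# 3 h -> about 27.8 MB/s
# 4 h -> about 20.8 MB/s
```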



  • Just remember ... never shut down the system to replace a hot-swappable drive - if the drives are hot-swappable, replace them "hot".  If you can access the drive (pull it out) from the outside of the machine, it is hot-swappable ... and even cabled SAS/SATA drives are hot-swappable, but this can be trickier to do without disturbing anything inside.

  • Hi John,

     

    Thanks a lot for your kind answer and help. I successfully replaced the failed hard disk and it is being rebuilt into the RAID.

    I would like to know how I can tell whether the new hard disk has been fully rebuilt and integrated into the RAID.

     

    Many thanks.

    Aung

  • Hi Aung,

    I'm glad to hear that you've managed to replace the hard drive okay. As for identifying whether the new hard disk has been fully built, OpenManage Server Administrator is the tool that would tell you this.

    I appreciate that you were having issues with OpenManage Server Administrator 4.5.0, so I would advise you to install the latest version, 6.5.0, which can be found in the Drivers and Download section of the Support Site here: support.dell.com/.../driverslist.aspx

    If you need any help installing OpenManage Server Administrator or have any other problems, please let me know!

    John C
    Dell | Social Outreach Services
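
Once a working Server Administrator is installed, the physical disk state can also be read from its command line. The sketch below assumes the `omreport storage pdisk controller=0` command is available and prints `Name : Value` lines that include `ID`, `State` and `Progress` fields. The sample text is made up for illustration, real output may differ between versions, and the function name is invented.

```python
def parse_pdisk_report(text: str) -> list[dict[str, str]]:
    """Split 'Key : Value' lines into one dict per disk.

    Assumes each disk's block starts with an 'ID' line (an assumption
    about the report layout, not a documented guarantee).
    """
    disks: list[dict[str, str]] = []
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # skip headers and blank lines
        key, value = key.strip(), value.strip()
        if key == "ID" or not disks:
            disks.append({})
        disks[-1][key] = value
    return disks


# Made-up sample; on a real server the text would come from something like
# subprocess.run(["omreport", "storage", "pdisk", "controller=0"], ...).
SAMPLE = """\
ID       : 0:0:0
State    : Online
Progress : Not Applicable

ID       : 0:0:1
State    : Rebuilding
Progress : 42% complete
"""

for disk in parse_pdisk_report(SAMPLE):
    print(disk["ID"], disk["State"], disk["Progress"])
```

Under these assumptions, a replaced disk whose state has returned to Online and no longer shows a progress figure would indicate that the rebuild has finished.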



  • Hi John,

     

    Thank you very much for your great help. Since installing the latest version, 6.5.0, requires uninstalling the previous version (OpenManage Server Administrator 4.5.0) and might need a reboot, I will do it when I next apply the critical MS patches to the server. Of course, I can check from the PERC BIOS at that time as well.

    For the time being, it seems to me that everything is fine with the server after replacing the failed hard disk.

     

    All the best,

    Aung

  • Hi Aung,

    You're very welcome, I'm glad that the information I provided was put to good use! If there's anything else we can do for you, please don't hesitate to get back in touch with us again!

    John C
    Dell | Social Outreach Services

