PowerEdge R510 Troubles


This question is answered

Server: PowerEdge R510

BIOS: v1.9.0

OS: Windows 2008 R2 (Release and SP1)

Hard drive configuration: 2 SSDs for the OS, 6 spinning disks in RAID 5. (I have tried removing and recreating all RAID arrays, doing a long initialize each time. I have also tried removing all disks but one SSD for the install, and tried both SSDs in RAID 0 and in RAID 1, with similar results every time.)

We were previously running Windows 2008 R2 on this server and needed to wipe and reinstall. I attempted to use the Dell SBUU, which would not install because it could not find the OS Driver Pack. I used the SBUU to update the firmware. Since then I have had the following issues:

Attempted to install OS using Dell SMTD - server stalls at "Expanding Windows files". Left here for over an hour, did not proceed, hard reset server.

Attempted to install OS using USC - server stalls at creating partitions. Left here for an hour, did not proceed, hard reset server.

Attempted to install OS using USC (retrieving drivers from FTP) - after clicking Next at the Insert OS disc screen, server stalled for 2 hours. Could not proceed or go back, hard reset server.

Attempted to install OS using OS DVD - server stalled at the "setting up your computer for the first time" screen for over 2 hours, hard reset server. After that I was able to get to the logon screen, but after attempting to log on, the server stalled at the "Please wait for the Local Session Manager" screen. Left at this screen for 23 hours, hard reset server.

Called Dell support, who led me through a bunch of troubleshooting steps (safe mode, turning off all services, etc.). After every boot, I still encountered issues with the server. These included a non-responsive mouse/keyboard and the server stalling (the mouse appeared to move, but the server would not do anything). Usually this would occur within 10-15 seconds of booting up. They had me try ALT+E then F while in the BIOS settings screen to clear the BIOS. This appeared to make things better, but the improvement would not last more than 2 minutes after booting up. The call was closed while the server was in this state.

Booted with Linux Mint LiveCD and ran benchmarks on drives, all appears to work fine.

Booted with CentOS Live CD and reflashed BIOS to 1.9.0. Went into BIOS and performed BIOS clear (ALT+E and F). Booted to Windows, same problems as before, server worked for a minute or so, then appeared to stall.

I've been working on this server for over a week now, and Dell support doesn't seem to want to help. Any ideas?

(I have some other things I'm going to try, but getting frustrated here and soon to run out of ideas.)

Verified Answer
  • Hello Jason

    It sounds like all of your problems were caused by the system BIOS update. BIOS updates will commonly change voltages throughout the system. I am curious as to what the original problem was that prompted the OS reinstall.

    The first thing I would do is drain flea power and clear NVRAM. It is not uncommon for old settings or process threads to remain in volatile or non-volatile memory. With the sweeping changes a BIOS update can make, this data left in memory can cause any number of odd issues. Disconnect power from the server for about 1 minute, move the NVRAM jumper to the reset position, reattach the power cables, power up until it stops POSTing, shut the server back down, disconnect power for about 1 minute again, move the jumper back, and then power up normally.

    If that does not do the trick, then I would rule out issues with add-on cards or external devices. The BIOS update may cause compatibility issues with components in the system. If disconnecting or removing an add-on device corrects the problem, then I would recommend updating the firmware on that device to see if that resolves the issue.

    There is also the possibility of bad media. All of the lockups you mentioned occur when the system is attempting to read from the OS DVD. There are no lockups mentioned when booting to alternate media and running tests. I would recommend trying another DVD.

    Let me know how it goes.

    Thanks

    Daniel Mysinger

    Download the Dell Quick Resource Locator app today to access PowerEdge support content on your mobile device! (iOS, Android, Windows)
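
One way to rule out bad media before burning another DVD is to compare the checksum of the ISO image against a known-good value. The sketch below is only an illustration: the file name and the expected hash are placeholders (assumptions, not values from this thread), and it presumes a trusted SHA-256 value for the image is available from wherever the ISO was obtained.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a multi-gigabyte ISO never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def media_matches(path, expected_hex):
    """Compare a file's digest with a known-good value (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical file name and hash; substitute real values.
    iso = Path("windows_2008_r2.iso")
    expected = "0" * 64
    if iso.exists():
        print("match" if media_matches(iso, expected) else "MISMATCH")
    else:
        print(f"{iso} not found")
```

A matching hash only shows the image is intact. A bad burn or a failing optical drive can still cause read errors, so trying a second disc or a different install source remains worthwhile.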

All Replies
  • Tried the NVRAM clearing, following exactly what you specified. Reinstalled Windows after that, just in case (since the original installs didn't go cleanly). The install completed successfully, and I was able to sign into Windows without any lockups, which is further than before, so I'm hoping that it's good.

    As for the original 'problem' that led us to reinstall the OS: this was our SCCM 2007 server, and we are updating to SCCM 2012. Since that is a complete reinstall, we thought an OS reinstall would be the cleanest way to do it. We never thought that it would take over a week to get the OS reinstalled.

    If I might suggest, could the NVRAM reset be added to the server team's 'script'? I talked to two different techs and neither of them suggested this as something to try (they both had me try reflashing the BIOS again and resetting BIOS settings with ALT+E, F, but not the NVRAM reset).
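
For anyone who wants to keep the NVRAM reset sequence from the verified answer on hand, it can be written out as a simple printable checklist. This is a minimal sketch: the step wording is paraphrased from that answer, and the names `NVRAM_RESET_STEPS` and `format_checklist` are made up for illustration. The location of the jumper is not covered here and should be taken from the server's hardware manual.

```python
# Steps paraphrased from the verified answer in this thread.
NVRAM_RESET_STEPS = [
    "Disconnect power from the server for about 1 minute.",
    "Move the NVRAM jumper to the reset position.",
    "Reattach the power cables.",
    "Power up and wait until the server stops POSTing.",
    "Shut the server back down.",
    "Disconnect power for about 1 minute again.",
    "Move the NVRAM jumper back to its original position.",
    "Power up normally.",
]


def format_checklist(steps):
    """Return the steps as numbered lines with an empty tick box each."""
    return [f"{number}. [ ] {step}" for number, step in enumerate(steps, start=1)]


if __name__ == "__main__":
    print("\n".join(format_checklist(NVRAM_RESET_STEPS)))
```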

  • Good to hear

    Let's keep our fingers crossed that the issue is resolved. We don't have scripting that I can have updated, but I will pass along the information to see if we can't get this information covered in vitality training.

    Thanks

    Daniel Mysinger


  • This could be a BIOS bug! The motherboard in our R510 failed recently, with BIOS versions 1.9.0 and 10.2.0. Take a look at:

    en.community.dell.com/.../20130283.aspx

  • We had some problems yesterday after a clean factory OS install on an R510. For example:

    - stuttering mouse / keyboard, both at the local screen AND remotely over iDRAC

    - 25% load on an idle server (interrupts)

    This was caused by outdated drivers. We did a full upgrade using package deployment, which contains the full driver package (~140 MB in the case of the R510). It seems to be solved now.

    Also, BACS3 was replaced by BACS4.

  • Thanks for the update!

    Daniel Mysinger


  • Update.

    The update method for Windows is called Lightscripts; it creates a folder named "System Bundle (Windows) PER510 v430".

    Running the batch script "apply_components" applies all patches at once. Run it as Administrator to avoid being asked for admin rights on every file execution. It also creates a diagnostic log.
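
If you want to start the bundle from a script rather than by hand, something along these lines might work. It is a sketch under assumptions: the thread does not give the script's file extension or the bundle's location, so the helper looks for any file named "apply_components" in the folder you pass in, and the function names (`find_apply_script`, `is_admin`, `run_bundle`) are made up for illustration. The elevation check uses the Windows `IsUserAnAdmin` shell call and simply refuses to run when not elevated.

```python
import os
import subprocess
from pathlib import Path


def find_apply_script(bundle_dir):
    """Return the first file in bundle_dir whose stem is 'apply_components'."""
    for entry in sorted(Path(bundle_dir).iterdir()):
        if entry.is_file() and entry.stem.lower() == "apply_components":
            return entry
    return None


def is_admin():
    """Best-effort elevation check; only meaningful on Windows."""
    if os.name != "nt":
        return False
    import ctypes
    return bool(ctypes.windll.shell32.IsUserAnAdmin())


def run_bundle(bundle_dir):
    """Run the bundle's apply script from its own folder and return the exit code."""
    script = find_apply_script(bundle_dir)
    if script is None:
        raise FileNotFoundError(f"no apply_components script in {bundle_dir}")
    if not is_admin():
        raise PermissionError("run this from an elevated (Administrator) prompt")
    # cmd /c runs the batch file; cwd is the bundle folder so relative paths resolve.
    return subprocess.run(["cmd", "/c", script.name], cwd=script.parent).returncode


if __name__ == "__main__":
    # Folder name as quoted in the post; its location on disk is an assumption.
    print(run_bundle("System Bundle (Windows) PER510 v430"))
```

Failing early when the prompt is not elevated avoids a run that stops partway through on a permission prompt.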

    The file with OS drivers for the R510 is called "Drivers-for-OS-Deployment_Application_YVX7C_WN32_7.1.0.9_A00.exe".

    Be aware: you will lose your RDP and iDRAC connections, and all fans will spin at full speed. DO NOT stop the update; wait until it finishes. Allow about 2 hours for maintenance, since this is a lot of firmware files (1 043 648 512 bytes in the case of the R510).

    Of course there are other methods; I use NFS and CIFS shares and cdu_2.1_core_339.iso for booting.
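
For a sense of scale when staging the bundle on a share or USB stick, the byte count quoted above converts as follows. This is plain unit arithmetic on the figure from the post; the helper names are made up for illustration.

```python
BUNDLE_BYTES = 1_043_648_512  # size quoted in the post for the R510 bundle


def to_mib(num_bytes):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / 1024 ** 2


def to_gib(num_bytes):
    """Convert bytes to gibibytes (1 GiB = 1024 ** 3 bytes)."""
    return num_bytes / 1024 ** 3


def to_gb(num_bytes):
    """Convert bytes to decimal gigabytes (1 GB = 10 ** 9 bytes)."""
    return num_bytes / 10 ** 9


if __name__ == "__main__":
    print(f"{to_mib(BUNDLE_BYTES):.1f} MiB")  # 995.3 MiB
    print(f"{to_gib(BUNDLE_BYTES):.2f} GiB")  # 0.97 GiB
    print(f"{to_gb(BUNDLE_BYTES):.2f} GB")    # 1.04 GB
```

So the bundle is just under 1 GiB, which is worth knowing when picking the media or share to stage it on.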