Perc 4e/di RAID Kit - PE 2850


We have a Dell PowerEdge 2850 that is crashing and rebooting sporadically due to RAID controller issues.  Upon booting, the message "Memory/battery problems were detected" appears and states that the controller has recovered but cache has been lost.  This message is displayed on every boot.  I have found in several places that this indicates either a problem with the RAID battery or the memory module on the riser card.  Is this kit available specifically for the Perc 4e/di card?  Will the other kits out there such as this one http://stores.velocitytechsolutions.com/-strse-361/H1813-Dell-Poweredge-perc/Detail.bok work?  It indicates that it is for the perc 4i, but looks very similar.  My only concern with these is that the memory module may not match exactly.  Any advice would be appreciated.

All Replies
  • The PERC 4/Di is an older controller, found on 6th and 7th generation servers.  8th generation servers introduced the PERC 4e/Di.  They require different size and speed of memory, so you will need something specific to the 4e/Di.  If the LCD is blue, then start with replacing the memory - part number 4D554.  If the LCD is amber scrolling a message about the ROMB battery voltage, then replace the battery.

  • Thanks for the reply!  The LCD is blue so I will start with the memory.  I have not seen anything amber about battery voltage.  I see "256MB 1Rx8 PC2-3200R-333-10-A1" on the current memory module.  I do not see 4D554 anywhere.  Did they make this module in different speeds?  Mine seems to be 333MHz, and a quick Google search shows the 4D554 as 400MHz.  Just want to make sure I get the right one.  If you're fairly confident that 4D554 will work on this controller, then it's worth a shot to me.

    Thanks again.

  • If it is on the memory chip, it may be embedded in a longer number - maybe something like 04d554-JW01Y66 ... if you have a longer number like that, the part number may actually be different - just post the whole string.  Alternatively, you could look up your Service Tag on support.dell.com, then click the link on the left-hand side for Warranty Status, then change the tab to Original Configuration ... should list a part number there.

    http://www.impactcomputers.com/f6928.html

    http://www.impactcomputers.com/4d554.html

    4D554 is the main part number, although there are a few others that will work, like F6928.  PERCs can be very picky about the memory they will accept, so I would start there.  The RAID memory has (mostly) the same specs as the system memory: PC2-3200/DDR2 400MHz, Registered ECC, single-rank, and 256MB in size.  The module you posted is PC2-3200, all of which runs at 400MHz (3200/8=400).  The 333 in the description is, I believe, the latency, usually written as 3-3-3.
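    The arithmetic above can be checked with a minimal sketch.  It assumes the label follows the usual DDR2 layout (size, ranks x width, PC2 bandwidth, module type letter, three timing digits).  The names `data_rate_mts` and `parse_label` are hypothetical helpers for illustration, not part of any Dell tool.

```python
import re

# Assumed label layout: "<size>MB <ranks>Rx<width> PC2-<MB/s><type>-<CL><tRCD><tRP>-..."
LABEL_RE = re.compile(
    r"(?P<size_mb>\d+)MB\s+(?P<ranks>\d)Rx(?P<width>\d+)\s+"
    r"PC2-(?P<bandwidth>\d+)(?P<module_type>[A-Z])-(?P<timings>\d{3})"
)


def data_rate_mts(pc2_bandwidth_mb_s: int) -> int:
    """A 64-bit DIMM moves 8 bytes per transfer, so MB/s divided by 8 gives MT/s."""
    return pc2_bandwidth_mb_s // 8


def parse_label(label: str) -> dict:
    """Pull the fields discussed in the thread out of a DIMM label string."""
    m = LABEL_RE.search(label)
    if m is None:
        raise ValueError(f"unrecognized label: {label!r}")
    bandwidth = int(m.group("bandwidth"))
    return {
        "size_mb": int(m.group("size_mb")),
        "ranks": int(m.group("ranks")),
        "width": int(m.group("width")),
        "registered": m.group("module_type") == "R",  # "R" suffix read as registered
        "data_rate_mts": data_rate_mts(bandwidth),
        # The three digits are read as latencies 3-3-3, not as a clock speed.
        "timings": tuple(int(d) for d in m.group("timings")),
    }


if __name__ == "__main__":
    print(parse_label("256MB 1Rx8 PC2-3200R-333-10-A1"))
```

    Read this way, the label from the existing module describes a 256MB, single-rank, registered PC2-3200 part, so the "333" is not evidence of a slower 333MHz module.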

  • Looks like the original configuration shows 4D554, although I can't be certain that the module was never swapped in the past.  Here is the full number listed on the chip:

    KR   M393T3253FG0-CCCDS   0449

    Looks like I'll be trying the 4D554.

  • 4D554 did not work.  After installing a new replacement module, the BIOS reports that the DIMM is not compatible.  Any other suggestions on what module may work?  Maybe the controller is bad.

    I will probably also try replacing the battery.  Is there a way to isolate the memory as the problem?  This appears to be a pretty difficult part to find.  If replacing the battery and installing a correct DIMM do not fix the problem, I guess replacing the entire controller would be the next move.

  • Unless the replacement DIMM is bad, it should be compatible with any version of the PERC 4e/Di.  Given what we know now, it is likely (although unlikely in general) that the memory slot or controller is bad.  A bad battery is not out of the realm of possibility, but again, not very likely.

    The controller is on the large riser card, so if you are looking to replace either the controller or the memory slot, the riser will need to be replaced.  With riser cards, you should be careful to get a match for the one you have: if your riser has PCI-X slots, then there is a chance that your motherboard will not like a riser with PCIe slots.  Only the later versions of the 2850 motherboard will accept either type of riser, as the 2850 did not support PCIe on its initial release.

  • The RAID kits for the PE2850 are PERC 4e/Di and they all use the same RAID memory. The Dell part number is 4D554 and the specs are: 256MB 1Rx8 PC2-3200R-333. You may have received a memory stick that did not meet the specs I listed above, which would explain the incompatible error. I had the same issue with one of my PE2850s. I searched the web and found www.velocitytechsolutions.com and they were very knowledgeable as to what I needed. They overnighted my part to me and within minutes my server was running flawlessly, my RAID was intact and I have had no issues since. They took time to explain it all to me.

  • Velocity Tech Solutions was actually the vendor I got my replacement part from.  I'm sending it back because it was wrong and not compatible.  I may try another vendor and try swapping the DIMM again.  If this does not work, my next move will be replacing the entire riser card that the controller resides on.

  • Hello ski_guy.

    My name is Tom.  I work for a company that has Dell PowerEdge servers, and we too got the "Battery Module is Present..." error like you did.  I was wondering whether you ever resolved this issue and, if so, how.  The error came up on our server this morning for the first time.  I'm hoping it's nothing major.

    Thanks,

    Tom

  • Tom,

    First, you should edit your post to remove your phone number, as those types of personal details are not allowed on the forums.

    Second, if your 2850 is telling you that the "battery module is present", that is normal.  This thread is regarding a "memory/battery problems were detected" issue. 

    If you are having an issue ... we need to know exactly what YOUR problem is - exact error messages, error codes, where you are getting them, what happened previous to the problem, etc., etc.  Simply saying "me too" does not help us understand what is wrong with your computer, especially if the messages you are receiving are different (memory battery problems were detected vs. battery module is [not] present).

  • Thanks theflash1932.  I edited my phone number out of the post. 

    I also spoke with a Dell Support Rep, and read to him what the error message was:

    "Battery Module is present on the adaptor memory/battery problems were detected.  The adaptor has recovered, but cached data was lost.  Press any key to continue..."

    We then pressed any key and the server came right back up.  He said that, as a pre-emptive measure, I should replace the memory/battery and RAID controller.  He's sending me the parts and I'll be putting them in soon.  I'm crossing my fingers that the server stays up until I get the parts put in.

    I've replaced the RAID card on another PE2850 before, because drives kept dropping offline (orange light).  Dell kept sending drives (four), but the drives only quit going bad once the RAID card was replaced.

    Tom

  • This is one of the most misunderstood messages for this machine among Dell's Technical Support. 

    • Do you get that message every time the system boots?  Or did you just get it once, then panic? 
    • If you saw it just once upon booting up, did the system crash previously?  Power outage? 
    • Was this system unplugged/disconnected from the outlet for some length of time (moved, rewired rack, etc.)? 
    • Is the LCD blue or amber?  If amber, what is the error message it is scrolling?

    This will happen anytime the battery has drained while the system was off.  So, if it was left unplugged, that can happen.  If the riser card was reseated and/or the battery cable unplugged from the riser, this can also happen.  Also, during a learning cycle where the battery drains completely, if the system is rebooted when it is drained, you will get that message.  Under these circumstances, the message should not return on subsequent reboots once the battery has enough charge (an hour or two).

    Aside from the many "normal" (non-failing hardware) reasons for this message ...  if this returns on all subsequent reboots, it can be for a few reasons:

    1. RAID Memory is bad.  This is the most likely cause for this message.  This will usually be accompanied by a blue LCD, and oftentimes the OS will not boot properly.  It will seem as though Windows is corrupted (missing/corrupted files), but you will not be able to repair it in Recovery Console.  The system usually will not work until the memory is replaced.
    2. RAID Battery is bad.  As the RAID battery is what keeps the cache stored on the memory chip, the battery can be at fault.  This will usually be accompanied by an error message on the LCD panel, and/or a message during POST or in OpenManage about the battery being missing or having some voltage problem.  The OS will usually function fine during this, but the controller's cache feature is obviously turned off.
    3. RAID Controller is bad.  The RAID controller is what manages the battery's charge state and the cached data, so if the controller is not working properly, it can cause the memory and/or battery not to store the cache correctly, either by force or neglect.  The RAID controller is located on the PCI riser.
    4. Riser Card is bad.  This is the SAME part as above.  The riser card contains the RAID controller, so while the controller part might be functioning correctly, if the riser is damaged, it can cause errors similar to above.
    5. Motherboard is bad.  This is the least likely problem.  As the motherboard has little to do with the RAID controller, any fault would lie in the connector for the riser card. 

    The plan to replace the memory, battery, and riser (controller) is a sound one, in that you will have everything you need to replace the most common causes of the problem, but I would do a couple of things to make sure you actually have bad hardware.  Make sure the error comes back on every subsequent reboot.  If it does not, AND the system was not unplugged for some length of time prior to the error message, then update your system's firmware (BIOS, ESM, RAID firmware (driver first)).  If the system was unplugged, then give it an hour or so for the battery to charge.
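    The checks and likely causes above can be summarized as a rough triage sketch.  It only encodes the rules of thumb from this reply; the `Observation` fields and the `triage` function are hypothetical names, and the ordering is a heuristic rather than a Dell diagnostic procedure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    recurs_every_boot: bool               # does the message return on every reboot?
    lcd_color: str                        # "blue" or "amber"
    was_unplugged: bool = False           # unplugged, riser reseated, battery cable pulled
    battery_error_reported: bool = False  # battery message on LCD, in POST or OpenManage


def triage(obs: Observation) -> List[str]:
    """Return the next steps or suspect parts, most likely first."""
    if not obs.recurs_every_boot:
        if obs.was_unplugged:
            # A drained battery is expected here; it needs time to recharge.
            return ["let the battery charge for an hour or two, then reboot and re-check"]
        return ["update firmware: BIOS, ESM, then RAID driver before RAID firmware"]

    # The message returns on every boot, so suspect hardware.
    if obs.lcd_color == "amber" or obs.battery_error_reported:
        return ["RAID battery", "RAID memory", "riser card (controller)", "motherboard"]
    return ["RAID memory", "RAID battery", "riser card (controller)", "motherboard"]


if __name__ == "__main__":
    # The original poster's case: message on every boot, blue LCD.
    print(triage(Observation(recurs_every_boot=True, lcd_color="blue")))
```

    The riser card and the controller appear as one entry because they are the same physical part on the 2850.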

     

  • I have the exact same problem on my PowerEdge 2850.

    "Battery Module is present on the adaptor memory/battery problems were detected.  The adaptor has recovered, but cached data was lost.  Press any key to continue..."

    Then I get an error in Windows Server 2003 saying "...active directory is rebuilding indices" and then the server reboots.

    Any ideas?

    I have a battery and riser card on order, but I can't find any memory in the UK.

     Thanks!

    Nathan

    http://www.RascalsCastles.co.uk

  • We are having a very similar error as well.  We use a PE2950 and the error simply states "Memory/battery problems were detected. The adapter has recovered, but cached data was lost."

    This started happening on Friday at 4:23pm.  The server then ran for 4 days before it happened again on Tuesday at 4:14pm and again at 6:30pm.  Dell is sending us a battery and memory next day, so I will try to post back tomorrow with what we find out.

    Marc