T7500 memory failure at DIMM3

Hi

I've decided to upgrade the memory on my T7500 from 8GB (two 4GB sticks) to 24GB (three 8GB sticks). They're ECC registered DDR3-1333 (previously installed in a T610 server), tested and known to be working.
As the documentation says, I've populated DIMM1, DIMM2 and DIMM3, but every time I boot I see the message:

Alert! Memory failure detected at DIMM3!

If I hit F1 the system boots fine and all is good.

I've run the system diagnostics, Windows Memory Test, etc. - all seems fine.
I have 4 of those sticks and just in case moved all of them around in the slots - no change - always the message about DIMM3. And yet the memory there seems to work just fine.

I even put the 4th stick there - currently with 32GB - everything appears to be working fine, but I still get that message.

So my question - is this due to a BIOS upgrade/downgrade that I need to do?
My current BIOS version is A14 (which is the latest shown on the downloads) and processor is X5570 (the old 5500 series and not the new 5600).

I'm really hoping for a software solution, as the system is out of warranty and I'd very much like to avoid ending up like the cases in these reviews: http://reviews.dell.com/2341n/precision-t5500/dell-precision-t5500-reviews/reviews.htm?sort=rank

Any help or suggestion would be appreciated!
Thanks

p.s. I'd be happy even with a workaround where I do get the message but it doesn't stop the system from booting up.

All Replies
  • Forgot to add the memory specs (in case that's the crucial piece):

    It's Samsung M393B1K70CH0-CH9 , CAS9, 2 Rank Double-sided module

    www.newegg.com/.../Product.aspx

  • Hi vstrinski,

    I suggest that you remove the Memory stick from the DIMM3 slot and then try starting the system. I suspect that it might be an issue with the slot. Also, I would request you to check the memory available on your system with all the Memory sticks attached.

    I hope this helps.

    Please let me know in case of any queries.

    Thanks and regards
    Harish R
    #iworkfordell

    Thanks and Regards,
    Harish R
    Dell Social Media and Community Professional
    Order Status : http://dell.to/1fgKSTr
    Download Drivers : http://dell.to/1hcxG98

  • Hi Harish,

    Thanks a lot for the suggestions!

    I've already tried that actually - the issue seems to be isolated specifically to DIMM3 (so far).
    I tried all 4 sticks - in slots 1,2,3 and 4 - the system complains but does see all of the memory - 32GB. If I populate slots 1,2 and 3 - complains but all 24GB are there.
    I even tried slots 1,2 and 4,5 - system complained that configuration is not optimal but regardless of that did boot up and all memory was there.
    I also tried putting any of the 4 modules that I have in slot-3 and they all behave the same way - error in that slot.

    I'll see if I can find some other memory to test with - I think I have 3x1GB left over from another machine. Do you know if there are any specific requirements for the memory? (for example like timing, CAS delay, speed, etc that is or isn't okay and could be the issue?). I've read the manual several times and it doesn't list anything specific...

    What's bizarre is that regardless of the error message everything works just fine - all memory passes all tests, no system issues, etc.
    This makes me think that it could be some software issue or a setting in the BIOS that I may have missed.

    I doubt that it is the processor that's causing the issue - I have another X5502 sitting around and could test as a last option (I'd really prefer not to deal with heatsinks and thermal compound though).

    So the message itself doesn't bother me too much, but the fact that it makes the system halt and waiting for me to press F1 is a problem.  Pretty much if I can tweak the system so that it doesn't halt and wait for F1 - that would do.


    Thanks again!

  • Hi vstrinski,

    I understand from your previous post that when you tried all memory sticks - in slots 1,2,3 and 4 - the system complains "Alert! Memory failure detected at DIMM3!" but if you leave the DIMM3 slot and tried slots 1,2 and 4,5 then the system complains that 'Configuration is not optimal'. Please let me know if I have understood correctly.

    To answer your query about the specific requirements for the memory, the Precision T7500 supports 1066 MHz and 1333 MHz DDR3 memory modules. It does not have any specific requirements/rules except that DIMM slots 1, 2 and 3 must be populated before DIMM slots 4, 5 and 6. In addition, when populating a Quad-rank DIMM with a Single- or Dual-rank DIMM in the same channel, the Quad-rank DIMM must be populated farthest from the CPU.
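The population-order rule above can be sketched as a small check. This is only an illustration of the rule as stated in this reply (slots 1, 2 and 3 before slots 4, 5 and 6); the function name `check_population` is hypothetical and the check does not model the Quad-rank placement rule or whatever logic the BIOS actually applies.

```python
def check_population(populated):
    """Return a list of rule violations for a single-CPU slot layout.

    `populated` is a set of slot numbers (1-6) that hold a DIMM.
    Only the stated rule is checked: slots 1, 2 and 3 must be
    populated before slots 4, 5 and 6.
    """
    first_bank = {1, 2, 3}
    second_bank = {4, 5, 6}
    used_second = populated & second_bank
    missing_first = first_bank - populated
    problems = []
    if used_second and missing_first:
        problems.append(
            "slots %s are empty but slots %s are populated"
            % (sorted(missing_first), sorted(used_second))
        )
    return problems


# Layouts tried in this thread:
print(check_population({1, 2, 3}))      # follows the rule
print(check_population({1, 2, 3, 4}))   # follows the rule
print(check_population({1, 2, 4, 5}))   # slot 3 skipped
```

Under this sketch the 1,2,4,5 layout is the only one of the three that breaks the stated rule, which is consistent with the "not optimal" warning reported for it earlier in the thread.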

    Your system has the A14 version of the BIOS installed, which is the latest we have available. I would also suggest that you reset the BIOS by loading its defaults.

    I hope this helps.

    Please let me know in case of any queries.

    Thanks and regards
    Harish R
    #iworkfordell

  • Hi Harish,

    Yes, you have understood me correctly. With slots 1,2,3,4 populated I get the error at DIMM3 message, but when populating 1,2,4,5 I don't; instead I get the non-optimal message.

    As per your suggestion I reset the BIOS several times - using the "Load Defaults" through the menu and also by moving the jumper from password-reset to BIOS-reset (as per the documentation). This did not make the message go away.

    I also found the 3x1GB memory modules that I have left from another Dell server (they're single-rank ECC DDR3-1333). When I installed those, to my surprise, everything worked just fine and I did not get any error message!

    This makes me think that there is either something with the ranking (the new 8GB modules are dual-rank) or maybe I'm hitting some other limitation? According to the specs the 8GB are ECC Registered and the 1GB are ECC Unbuffered - could that be the issue?

    Thanks

    p.s. While resetting the BIOS I stumbled onto the setting to report keyboard errors - turning it off made the system report the error but not stop waiting for me to press F1. So if there is no solution to my DIMM3 error then at least I've found a way to mitigate it and live with it.

    For the record - the 1GB memory modules are M391B2873EH1-CH9.

  • Hi vstrinski,

    The Precision T7500 uses 1066 MHz and 1333 MHz DDR3 unbuffered or registered ECC SDRAM memory. I am sure that the error on the system has nothing to do with it.

    I checked the single-CPU memory configurations and I see that the system is capable of handling 24GB of Dual Rank memory (3 8GB sticks) or 48GB of Dual Rank memory (6 8GB sticks), and no other combinations are compatible (e.g. 2 sticks, 4 sticks, etc.). I would like to know if you have used only 3 memory sticks and checked whether you see the errors.

    Please let me know in case of any queries.

    Thanks and regards
    Harish R
    #iworkfordell

  • Hi Harish,

    Yes, I did use only 3 sticks - each of them 8GB, for a total of 24GB - and this is when I saw the error. The sticks were populated in DIMM1, DIMM2 and DIMM3.

    I have a total of 4 of those sticks, and the 4th one was not installed. I posted the exact model and specs in one of the previous messages.

    I also have 3 other sticks that are of 1GB each (specs are also in the previous messages) and I did a separate test with just those installed in exactly the same way - in DIMM1, DIMM2 and DIMM3. When I used them I did not see any error message and the system correctly reported a total of 3GB RAM.

    So my observations so far are that when using 3 dual-rank sticks I do get an error, and when I use 3 single-rank sticks I do not.

    Regards

  • Hi vstrinski,

    Thank you for your patience. I have escalated the case to our Engineering team. I will get back to you as soon as possible.

    Please let me know in case of any queries.

    Thanks and regards
    Harish R
    #iworkfordell

  • Wow, I am having almost the identical problem, except I get the error on DIMM2.

    "Alert! Memory failure detected at DIMM 2"

    Always DIMM2 no matter which DIMM is installed there.

    I am setting up this almost new Precision T7500 (built July, 2012) for a client. The machine had (2) 8 GB DIMMs, in Slot1 & Slot2, both Black Diamond DIMMs. Everything worked fine, except I had a dead SATA port #1. I swapped out the Intel Xeon E5502 (1.6 GHz) for an X5560 (2.8 GHz, Step-Code SLBF4), and restarted only to get the memory failure in DIMM2 error. On 2/18/2013 the motherboard 6FW8P (A02) [came with machine] was replaced by P/N M1GJ6 (A00), with Dell field service doing the work. After the motherboard replacement, the "memory failure detected at DIMM 2" error continues. I've tried all BIOS versions A10-A14 with no effect.

    Memory tests always run without failures.

    The DIMMs are marked "PJ [brand] GS3G085124-R  8 GB PC3-10600   DDRIII 1333".

    The Dell "About Memory" page  support.dell.com/.../a_mem.htm  

    recommended installing memory in identical triples, so I bought a 3rd 8 GB Black Diamond DIMM described on Amazon.com as:

    "8GB Memory RAM for Dell Precision Workstation T5500, T7500, R5500 240pin PC3-10600 1333MHz DDR3 RDIMM Black Diamond Memory Module Upgrade". Note the word "Registered" does not appear in the description. See:

    www.amazon.com/.../B0018RCY58

    I installed the new 8 GB DIMM into Slot2 and got the same failure message. I rotated all the memory through slots 1, 2 & 3 and got the same memory error at DIMM2. Windows7 Prof-64 reports 24 GB of memory installed, but only 16 GB available. According to Dell's T7500 documentation "About Memory", each of slots 1, 2 & 3 is on its own memory channel; slots 4, 5 & 6 share the corresponding channels: 1-4, 2-5, 3-6. I installed the DIMMs in slots 1, 3 and 4, restarted and did not get the memory failure error. Instead, I got different errors:

    "Alert! Memory population error for DIMM 5"

    "Alert! Non-optimal memory population detected. To maximize performance, populate DIMMs as specified in the service manual"

    In this 1-3-4 slot memory configuration, the BIOS still stops at boot time, but all 24 GB are now available to Win7.
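The slot-to-channel pairing described in this post (1-4, 2-5, 3-6) can be written out as a short sketch. The helper names `channel_of` and `dimms_per_channel` are hypothetical, and the mapping is taken only from the poster's reading of Dell's "About Memory" page for a single-CPU layout.

```python
def channel_of(slot):
    """Map a DIMM slot number (1-6) to its memory channel (1-3).

    Pairing as described in the post: slots 1 and 4 share channel 1,
    slots 2 and 5 share channel 2, slots 3 and 6 share channel 3.
    """
    if slot not in range(1, 7):
        raise ValueError("slot must be between 1 and 6")
    return (slot - 1) % 3 + 1


def dimms_per_channel(populated):
    """Count how many populated slots land on each channel."""
    counts = {1: 0, 2: 0, 3: 0}
    for slot in populated:
        counts[channel_of(slot)] += 1
    return counts


# The 1-3-4 layout from the post: channel 2 carries no DIMM at all.
print(dimms_per_channel({1, 3, 4}))  # {1: 2, 2: 0, 3: 1}
```

Seen this way, the 1-3-4 layout leaves the channel behind Slot2 and Slot5 empty, which would explain why the DIMM2 failure message disappears while the population warnings take its place.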

    I did find this configuration advice on the website of a company called Memory4less.com that goes beyond what the Dell documentation has to say --- see:

    www.memory4less.com/confitems.aspx   Quote:

    Maximum configurations require a 64-bit operating system. DDR3 1333MHz speed requires Intel Xeon X5550 or higher processors.

    1066MHz requires Xeon E5520 or higher, Xeon E5506 or lower support 800MHz only. Memory speed will be reduced as the number of modules installed increases. 1333MHz is supported with one DIMM per channel only using Xeon X55xx series. Please refer to the system manual for proper installation of DIMMs for Single, Dual, and Triple Channel configurations. Using Registered ECC DIMMs: Any configuration up to 192GB can be reached using 1GB, 2GB, 4GB and 8GB modules and kits. ECC Unbuffered DIMMs: any configuration up to 48GB can be reached using 1GB, 2GB, and 4GB modules and kits.
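The speed rules in the quote above can be restated as a small lookup. This is a sketch of the vendor's text only, not verified against Intel or Dell documentation; the tier labels and the names `MAX_SPEED_MHZ` and `supports_1333` are made up for illustration. The quote does not say what speed applies once 1333 MHz is no longer available, so the sketch answers only the yes/no question.

```python
# Speed ceilings per processor tier, as stated in the quoted vendor text.
# The tier labels are hypothetical names for the three groups it describes.
MAX_SPEED_MHZ = {
    "X5550_or_higher": 1333,
    "E5520_or_higher": 1066,
    "E5506_or_lower": 800,
}


def supports_1333(cpu_tier, max_dimms_per_channel):
    """Return True if the quoted rules allow DDR3-1333 operation.

    Per the quote, 1333 MHz needs an X5550-or-higher processor and
    no more than one DIMM on any channel.
    """
    return MAX_SPEED_MHZ[cpu_tier] == 1333 and max_dimms_per_channel == 1


print(supports_1333("X5550_or_higher", 1))  # three DIMMs, one per channel
print(supports_1333("X5550_or_higher", 2))  # e.g. slots 1 and 4 both used
print(supports_1333("E5506_or_lower", 1))   # processor tier caps the speed
```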

  • Which Precision T7500 motherboard do you have installed (D881F, 6FW8P or M1GJ6)? I have the same problem with both 6FW8P and M1GJ6, which I believe are the older of the three design levels.

  • I also have a T7500 that stalls on reboot with the message:

    "Alert!  Memory failure detected in DIMM2"

    Running diagnostics finds nothing. Not sure if performance is affected.

    Stalls on remote reboot and auto-update reboots.

    Machine has 48GB of RAM and was built 2011.01.28

    Don't have time to play with it. Machine is under warranty. Time to call in the IT experts.

    Just wanted to point out that at least 3 machines have the problem....

  • First,  thank you "vstrinski" for taking the time to write up a clear description of the failure.

    After consulting with several colleagues and reading more about the Xeon X5500 family of processors on the Intel website, the best cause to which I can attribute the error message "Alert!  Memory failure detected in DIMM n" is a failure of the memory controller, which is part of the processor chip in the Xeon X5500 family. The consensus is that my Xeon processor is damaged. The X5500s have a three-channel memory controller; I believe one of the channels, in my case the channel controlling Slot2 and Slot5, is damaged.

    To test this hypothesis partially, I reinstalled the Intel Xeon E5603 (1.6 GHz) that came new with the T7500 from Dell in July of 2012 [in an earlier post I mis-identified the original processor as a Xeon E5502]. The problem is gone. So I ordered another used X5560 (2.8 GHz, Step-Code SLBF4) on eBay and am returning the used part I bought a few weeks ago when the problem started. When I installed it, I thought the trouble was caused by driving the motherboard and DIMMs up to 1333 MHz from the 800 MHz clock speed used by the E5503 processor, but I now believe that was wrong. The new X5560 should arrive this coming Saturday, so I should have a definitive answer sometime this weekend.

  • JP - that's an interesting idea but I doubt that this is the case. At the time when I originally tested that I had two Xeon processors - both x55 series - one was a 1.8GHz (I think quad-core) and the other was a 2.93GHz quad-core. At that time I repeated all tests with both processors - 3x8GB and 3x1GB with each of them and they both behaved the same - 3x1GB had no errors and 3x8GB reported errors. (I suppose it is possible that both of my processors were faulty.)

    I suspect the more likely cause is some hardware issue related to the 5520 chipset, one that has since been resolved in newer versions of the board. (How do I tell which revision my board is?) I suspect that there is also some limitation with the memory ranks - when I used the simplest single-rank modules all worked fine, but when I moved up to dual-rank ones it didn't work. I don't have any 2GB or 4GB single-rank modules to test with, but I suspect they'll work just fine.

    Someone else mentioned earlier that they had a problem with remotely rebooting the machine and it stopping during boot with the error message - I resolved that by modifying the BIOS settings. (I forgot the exact name of the setting but it is in essence something like "stop on errors" which I set to No)

    I ended up replacing the motherboard and that solved all of my issues.

  • Hello vstrinski,

    Sorry about the time lag in getting back to you. You were right about the processor chip not being the problem. My machine was built just last summer, 2012-7-12, yet it came supplied with a relatively old motherboard, P/N 6FW8P (A02). SATA port 1 died, so Dell warranty service came out and replaced the motherboard with an ancient refurbished (aka "used") P/N M1GJ6 (A00); I had expected the replacement to be P/N D881F (A05), so this was disappointing. After the motherboard swap, the Memory Failure error message persisted, so after trying anything and everything I could think of, I decided to pull the motherboard and inspect the CPU socket with a magnifying visor, since the replacement motherboard was a used part. Lo and behold, there were three groups of bent or slightly displaced pins in the LGA1366 socket. Using a hypodermic needle and an X-Acto knife, I gently moved the pins back to their correct positions, replaced the CPU, and for the first time, the machine booted without the "memory failure on DIMM2" BIOS message.

    What level motherboard is installed on your T7500? The easiest way to find out is to download the Passmark Performance Benchmark R8 program and run the program called PerformanceTest in trial mode. There is a Dell service program that also displays the motherboard's part number and service level, but I can't remember which one.

    See: www.passmark.com/.../pt_download.htm

    This is a very useful tool for measuring and managing the T7500. Run PerformanceTest, then look under the heading System and read down the column This Computer for Motherboard Model and Motherboard Version.

    I just installed the 2nd CPU riser kit with a second Xeon X5560 processor, and now the machine won't complete POST - I am getting an amber power button and blinking 2-3-4 lights, which the Service Manual describes as Pb7. The debugging never ends, it would seem.

  • JP -

    that's pretty much exactly what happened in my case as well - I installed a second processor and got the exact same error message. Until then I was happily living with (and ignoring) the memory error. But the CPU one was a bit too much and that's what prompted me to replace the mobo.

    All got resolved with the new motherboard.