RAID-0 write performance issues with PERC-6i

Dell Storage Community

  • Todd,

    I am testing a Dell 2900 server hosting Windows Server 2003 Enterprise Edition (x86 32-bit) SP2, with a PERC-6i and 10 x 146 GB SAS 15 KRPM drives configured as 4 RAID-0 virtual disks (1 x 1, 2 x 2, 1 x 5). An application was tested on this configuration, and was found to perform worse than expected (given understandings and assumptions of RAID-0 performance).

    IOMeter was then run on the same server for each disk size. The findings from these tests were:

    - Sequential and random read throughput rates were about as expected for all disk sizes

    - Sequential and random write throughput rates were significantly lower than expected for 2- and 5-disk RAID-0 virtual disks (yet were as expected for the 1-disk virtual disk)

    - Write performance for 2- and 5-disk RAID-0 virtual disks was worse than for a 1-disk virtual disk

    - Write performance for a 5-disk RAID-0 virtual disk was worse than a 5-disk RAID-5 virtual disk: up to 3 times worse for random writes and up to 11 times worse for sequential writes.

    - No significant changes in the RAID-0 write performance resulted from the use of 64 KB NTFS allocation unit size or from 1 MB volume alignment

    These findings lead me to suspect a possible defect in the PERC-6i firmware for RAID-0 support.

    Has anyone else experienced these issues?


    (To be continued in the next post due to posting size limits.)


    Scott R.
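
The 1 MB volume alignment mentioned in the findings above can be illustrated with some simple arithmetic. This is only a minimal sketch: the 64 KB stripe element size is an assumption (the thread does not state the value configured on the PERC-6i), and the 63-sector starting offset is the common Windows Server 2003 partitioning default, included for comparison. The function names are hypothetical.

```python
def is_aligned(offset_bytes: int, boundary_bytes: int) -> bool:
    """Return True when offset_bytes is a whole multiple of boundary_bytes."""
    return offset_bytes % boundary_bytes == 0


def stripe_elements_touched(offset_bytes: int, io_bytes: int, element_bytes: int) -> int:
    """Count how many stripe elements a single I/O of io_bytes spans."""
    first = offset_bytes // element_bytes
    last = (offset_bytes + io_bytes - 1) // element_bytes
    return last - first + 1


KB = 1024
ELEMENT = 64 * KB            # assumed stripe element size (not given in the thread)
LEGACY_OFFSET = 63 * 512     # 63-sector default partition start, for comparison
ALIGNED_OFFSET = 1024 * KB   # the 1 MB alignment used in the tests

for name, start in (("63-sector start", LEGACY_OFFSET), ("1 MB start", ALIGNED_OFFSET)):
    aligned = is_aligned(start, ELEMENT)
    touched = stripe_elements_touched(start, 64 * KB, ELEMENT)
    print(name, "aligned:", aligned, "elements touched by one 64 KB I/O:", touched)
```

Under these assumptions a 64 KB I/O at the unaligned start straddles two stripe elements, while the same I/O at the 1 MB start stays within one. Since alignment made no measurable difference in the tests, misalignment does not look like the cause here.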
  • Todd,

    (continued from earlier post)

    I posted these issues recently on a SQL Server blog thread (see link: http://sqlblog.com/blogs/joe_chang/archive/2008/09/04/io-cost-structure-preparing-for-ssd-arrays.aspx). The thread author also tested and found poor write performance with RAID-0 virtual disks on the PERC-6i but not with the PERC-5, as did another poster to this thread. These posts suggest a broader issue rather than an isolated case involving just our system.

    A side concern is that RAID-10, RAID-50, and RAID-60 use striping (RAID-0) on top of their respective base protection models (RAID-1, RAID-5, and RAID-6). Is it possible that RAID-10, RAID-50, and RAID-60 protection models may also be impacted by this RAID-0 write performance issue?

    The PERC-6i firmware version is 6.0.2-0002 and the driver version is 2.14.00.32 on my test server. No similar problems were found on the Dell support site from a number of searches.

    In contrast, the published findings from the PERC-6 RAID controller analysis white paper (the paper you mentioned in your post) for RAID-0 write workloads (web server log, SQL Server log) yielded much better performance – about what would be expected for RAID-0.

    I am reporting this issue to Dell support for investigation and follow-up.

    Any suggestions or insights from your side?

    Thanks!

    Scott R.
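
The side concern about RAID-10 in the post above rests on the fact that it stripes across mirrored pairs. The sketch below shows a textbook address mapping for RAID-0 and for RAID-10 built on top of it. It is an illustration only: the function names, the 64 KB element size, and the disk numbering are assumptions, and nothing here describes how the PERC-6i firmware actually lays out data or whether its striping code is shared between RAID levels.

```python
def raid0_locate(lba_bytes, element_bytes, n_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    element = lba_bytes // element_bytes   # which stripe element overall
    disk = element % n_disks               # elements rotate across the disks
    row = element // n_disks               # full stripes that precede this one
    return disk, row * element_bytes + lba_bytes % element_bytes


def raid10_locate(lba_bytes, element_bytes, n_pairs):
    """RAID-10 as RAID-0 across mirrored pairs: a write lands on both pair members."""
    pair, offset = raid0_locate(lba_bytes, element_bytes, n_pairs)
    return [(2 * pair, offset), (2 * pair + 1, offset)]


ELEMENT = 64 * 1024  # assumed stripe element size

# Sixth element of a 5-disk RAID-0 wraps around to the first disk, second row.
print(raid0_locate(5 * ELEMENT, ELEMENT, 5))
# Same striping step on a 4-disk RAID-10 (2 mirrored pairs), then mirrored.
print(raid10_locate(3 * ELEMENT, ELEMENT, 2))
```

In this model the striping step is identical in both layouts, which is why the question is a reasonable one to ask. Only testing on the controller can show whether the nested levels are affected.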
  • I sent an email over to the author of the PERC 6 performance study that you referenced to see if he has any guidance. I believe that there was a firmware update for the PERC 6. Can you tell me what version you are using? I am sure this is one of the questions that he will ask me :)

    Todd
  • Todd,

    The PERC-6i firmware version is 6.0.2-0002 and the driver version is 2.14.00.32 (as gathered by OMSA). I posted it in the second part of the message above - sorry if it was confusing.

    My review of outstanding PERC-6 firmware versions on the Dell support site showed only one newer version - 6.0.3-0002, A05, dated 06/23/2008. The description of this update (problems and fixes) did not mention any RAID-0 write performance issues.

    Thanks for your help.

    Scott R.
  • Todd,

    An update: I reported the issue to Dell support. After collecting and submitting a DSET report for the test server (as well as my IOMeter test results), I was told that nothing is "broke" - the server (including the PERC-6 RAID controller) reports that it is "working" (nothing is failing or causing explicit errors). The reported information on poor PERC-6 RAID-0 write performance will be "passed up the chain", but there is no formal feedback process on such support-related requests, and no time line estimate / approximation / etc. on resolution. I was told to create a technical update subscription for the server, so that I can be notified when a PERC-6 firmware update is available (which I have done). I asked if problem case records could be searched for similar problems, and was told they could not be searched. The case is closed, and that's all I have to show for it.

    Any feedback from the staff member who authored the PERC-6 performance study?

    As I mentioned in the second post above (continuation of the original post), I am aware of at least two other persons who say they have recreated the same PERC-6 RAID-0 poor write performance issue I experienced. Follow the blog link given above in that post for further details.


    Scott
  • Scott,

    Sorry for the delayed response. I'm working on getting some help on the PERC performance specifics; the author of the paper you referenced is currently out of the office.

    I agree with you that it looks like some type of strange bug with RAID 0, but I haven't been able to get to the right people here yet to find out.

    One question that I have after reading through everything - What type of storage are you attaching to? Is it an MD1000?

    Thanks,
    Todd
  • Todd,

    The 10 x 146 GB SAS disks were all located in the internal drive bays of the PE 2900 - 4 drives on connector #1 and 6 drives on connector #2 of the PERC-6i. No MD1000s were used in this configuration for the tests. The 5-disk RAID-0 virtual disk used physical disks across both connectors (3 drives on one connector and 2 drives on the other connector) - a good practice to balance I/O bandwidth across the connectors.


    Scott R.
  • Scott - Here is the response that I got from our performance team:

    We have never seen RAID 5 be better than RAID 0, especially for writes. However, using a file system, the writes go to the buffer cache, not the RAID controller. They are then written out by the OS lazy writer. You can setup the OS to force writes (flush), but that depends on how the system was setup. Here are a few issues:

    1. He mentions the file allocation size, so he is using a file system on the LUN. That is not how we run the tests, since buffer caching in the operating system can then interfere.
    2. He mentions that this is a PERC6i, but doesn’t mention how he has set up the card (write through or write back).
    3. He mentions 1x1, 2x2, 1x5 which I interpret as 1 disk without RAID, 2 sets of RAID 0 with 2 disks each and a single RAID 0 over five disks.

    So, I would look at two things…
    1. Look at the settings on the RAID controller… I would bet it is write through (WT) for RAID 0 and write back (WB) for RAID 5. WT gives better throughput with a reasonable queue depth. So with a single I/O queued and WT on for RAID 0 versus WB for RAID 5… it could run slower. The general rule is that WB is best (lowest latency) unless you have a large queue depth, and then WT could give more throughput.
    2. Look at the filesystem buffer cache effects… are the two arrays treated differently?

    I think that you already answered some of the questions with your config. Can you take a look at the cache settings and try IOmeter without filesystem?

    Thanks,
    Todd
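
The buffer cache effect described by the performance team can be observed with a small timing probe. The sketch below is not a replacement for IOMeter against a raw volume; it only compares writes left to the OS lazy writer with writes that are flushed after every block. The function name, file name, and sizes are hypothetical, and how quickly a flush is acknowledged still depends on the controller's write policy (WT or WB), which this code cannot see.

```python
import os
import tempfile
import time


def write_throughput_mb_s(path, total_bytes, block_bytes, sync_each_write):
    """Write total_bytes sequentially to path and return throughput in MB/s."""
    block = b"\0" * block_bytes
    written = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            written += f.write(block)
            if sync_each_write:
                os.fsync(f.fileno())  # ask the OS to push this block out of its cache
        os.fsync(f.fileno())          # final flush is timed in both modes
    elapsed = time.perf_counter() - start
    return written / (1024 * 1024) / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "probe.bin")  # place this on the volume under test
        for sync in (False, True):
            rate = write_throughput_mb_s(target, 16 * 1024 * 1024, 64 * 1024, sync)
            label = "flush per write" if sync else "buffered"
            print(label, round(rate, 1), "MB/s")
```

A large gap between the two figures suggests that the file system cache, rather than the array, is shaping the buffered result.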
  • Todd,

    Thanks for your reply. I had planned to send you more information to your questions, but have since learned that the problem is likely resolved with a recent firmware upgrade for the PERC-6 RAID controllers (6.1.1-0045, A07 – my tests were done on 6.0.2-0002).

    See link: http://support.dell.com/support/downloads/download.aspx?c=us&l=en&s=gen&releaseid=R196813&SystemID=PWE_2900&servicetag=CP0YRG1&os=WNET&osl=en&deviceid=13514&devlib=0&typecnt=0&vercnt=4&catid=-1&impid=-1&formatcnt=5&libid=46&fileid=272112

    See the following separate blog entries for further details on the original problem and on another person’s test results of the resolution:

    http://sqlblog.com/blogs/joe_chang/archive/2008/10/01/dell-perc6-raid-controller-performance.aspx

    http://sqlblog.com/blogs/joe_chang/archive/2008/09/04/io-cost-structure-preparing-for-ssd-arrays.aspx

    I plan to test the PERC-6 firmware upgrade when my test server is available again from its current assignment.

    Have the folks you know in Dell’s test labs tested the issue – before and after fix?

    Thanks for your interest in these issues. I look forward to your feedback.


    Scott R.
  • Todd,

    On an issue related to the previous post (continued here due to wiki post size limitations):

    Under the section titled “Fixes and Enhancements”, the PERC-6 firmware upgrade documentation notes two specific changes of interest:

    2. Improved performance for RAID 0 virtual disks.

    13. Improved background initialization (BGI) performance.

    Item #2 is the issue we have been discussing.

    Item #13 is an issue I have noticed but have not yet discussed here. I have followed the documentation for requesting a fast initialize on both RAID-5 and RAID-10 virtual disks (on both PERC-5 and PERC-6 RAID controllers), only to have the process take many hours – running what appears to be a regular (non-fast) background initialize and ignoring the request for a “fast initialize”. I can understand a RAID-5 initialize taking a while, since the RAID-5 initialize process has to read all disks, and compute / rewrite parity. I would expect that RAID-10 initialize should be very fast – a data copy versus a multi-disk read / compute parity / write process for RAID-5. Yet I consistently experience slower initialize times (using regular or fast initialize) for RAID-10 virtual disks than RAID-5 virtual disks. I have even seen slower initialize times on a RAID-10 virtual disk with 4 disks total than a RAID-5 virtual disk with 6 disks total (same size / speed / interface disks in all cases). This behavior is completely counter-intuitive to what I would expect. I am hopeful that this firmware change may improve the situation, but I will reserve judgment until I can test it.

    Have the folks you know in Dell’s test labs experienced any similar issues regarding virtual disk background initialize taking longer than expected?

    Thanks for your interest in these issues. I look forward to your feedback.


    Scott R.
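
The expectation in the post above, that a RAID-5 initialize involves more work per stripe than a RAID-10 initialize, follows from how parity is formed. The sketch below shows standard XOR parity over equally sized blocks next to a plain mirror copy. It illustrates the general technique only; the function names are hypothetical, and it does not describe how the PERC firmware implements initialization or explain the timings reported.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized data blocks (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover one lost block by XOR-ing the parity with the surviving blocks."""
    return xor_parity(list(surviving_blocks) + [parity])


# RAID-5 style: every data block in the stripe is read and combined into parity.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)
assert rebuild_missing([data[0], data[2]], parity) == data[1]

# RAID-1/10 style: the second copy is simply the data itself, no computation.
mirror_copy = bytes(data[0])
assert mirror_copy == data[0]
```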
  • For RAID 0 (not 10/50/60) there is a performance fix in the latest firmware posted to support.dell.com for the PERC 6 family (as mentioned above):
    6.1.1-0047

    This fixes performance for RAID 0 arrays in WB mode.

    BGI taking an excessive amount of time has also been fixed. There is a difference in how fast init/BGI and full init work on PERC 6. With SAS drives in particular, Full Init will do copy functions that will speed up the overall process, but the downside is that the array is locked while this is ongoing (and if you reboot you will fall back to a BGI).
    There are also enhancements for SATA performance.
  • Hi Scott,
    Did you have a chance to confirm that RAID 0 is now performing as expected? We are just about to get a couple of 2950s, which we will be using with the MD1000, but I'm wondering whether to get the older PERC 5e with that to avoid the RAID 0 issues.
    Also, would the new firmware fix performance in RAID 10 mode as well?

    We are planning to use the new Intel X25-E SSD drives, but since they are only 64GB max, we definitely need RAID 0 for testing and eventually RAID 10 when we go into production.

    Wolfgang.