We have an MD3000i presenting LUNs to Red Hat 5.1, kernel 2.6.18-92.el5PAE.
Everything is looking good apart from the following error messages.
Messages from the Red Hat server (R900):
sde: Current: sense key: Recovered Error <<vendor>> ASC=0x95 ASCQ=0x1ASC=0x95 ASCQ=0x1
sde: Current: sense key: Recovered Error <<vendor>> ASC=0xd0 ASCQ=0x6ASC=0xd0 ASCQ=0x6
At the same time we get the following info from the MD event log:
Description: Timeout on drive side of controller [physical disk]
Description: Media scan (scrub) completed
Description: Immediate availability initialization (IAF) completed on volume
Description: VDD logged an error
Description: VDD repair started
These go on and on for a few hours, pointing to both RAID module controllers.
Are these real errors or just informational messages?
The R900 has stopped logging the errors so I assume that the MD has fixed itself, and the MD has also stopped giving these error messages.
I would like confirmation that the MD3000i is in a sane state if possible.
This is output from the SCSI layer in the kernel. Without knowing what sde is actually attached to, it's hard to say. After installing the MD3000i RDAC multipath failover driver, did you reboot? You can perform a quick check with "lsmod | grep mpp". The mpp driver should be listed as a kernel module.
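If you want that check to give a clear yes/no, something along these lines might do. It is only a rough sketch: the function name check_mpp_modules is made up for illustration, and the two module names (mppUpper and mppVhba) are the ones mentioned elsewhere in this thread, so adjust them if your driver version differs.

```shell
# Sketch only: reads `lsmod` output on stdin and reports whether the two
# RDAC modules named in this thread are loaded.
# check_mpp_modules is a hypothetical helper name, not part of the driver.
check_mpp_modules() {
    # The first column of lsmod output is the module name.
    loaded=$(awk '{ print $1 }')
    rc=0
    for mod in mppUpper mppVhba; do
        if printf '%s\n' "$loaded" | grep -qx "$mod"; then
            echo "$mod: loaded"
        else
            echo "$mod: MISSING"
            rc=1
        fi
    done
    return $rc
}

# Usage on the host:
#   lsmod | check_mpp_modules
```

A non-zero exit status means at least one of the two modules was not found.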
If you are still seeing those in /var/log/messages, I would contact support for assistance.
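To see whether the messages are still arriving, and how often, you could tally them per day and per device. This is a minimal sketch that assumes the usual syslog layout (month, day, time, host, "kernel:", then the device name); count_recovered_errors is a hypothetical name, and the field positions may need adjusting if your log format differs.

```shell
# Sketch only: counts "Recovered Error" sense-key lines per day and device.
# Assumes classic syslog lines such as:
#   Mon DD HH:MM:SS host kernel: sdX: Current: sense key: Recovered Error
count_recovered_errors() {
    grep 'sense key: Recovered Error' |
        awk '{
                 dev = $6               # e.g. "sde:" (assumed field position)
                 sub(/:$/, "", dev)     # strip the trailing colon
                 count[$1 " " $2 " " dev]++
             }
             END { for (k in count) print k, count[k] }' |
        sort
}

# Usage on the host:
#   count_recovered_errors < /var/log/messages
```

Each output line gives the month, the day, the device and the number of matching messages, which makes it easier to show support whether the rate is falling or steady.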
"Timeout on drive side of controller": This message says that a drive did not respond within a specified timeout period. It is displayed in the log as informational.
"Media scan (scrub) completed": The MD3000i periodically scans your virtual disks for errors and repairs any errors that it finds. This is a periodic process that is performed in the background. This is a normal message.
"Immediate availability initialization (IAF) completed on volume": This message says that a virtual disk that you have just created is ready to be used. This is a normal message.
"VDD logged an error": The MD3000i detected an error on a VDD (virtual disk drive).
"VDD repair started": The MD3000i is repairing the error it detected earlier.
mppUpper and mppVhba are loaded and the versions from modinfo are correct as specified in Dell documentation.
/dev/sde is a LUN presented from the MD3000i.
I've logged a call with Dell about this, but I'm getting a bit of a runaround. They are saying the error is not a problem and is to be expected. I find this hard to believe, as the repairs are happening 10 to 20 times a day.
I suspect we have a disk on the way out, but I would like confirmation of this, which I'm failing to get from Dell.