Unsolved

1 Rookie

 • 

16 Posts

March 7th, 2024 21:45

Cannot force HDD in Ready status to go offline

I need to replace a disk in a RAID array because of the alert "Predictive failure reported for Disk 3 on Integrated RAID Controller 1".

/admin1-> racadm raid get pdisks -o -p state
Disk.Direct.3:RAID.Integrated.1-1
   State                            = Ready
Disk.Bay.0:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Online
Disk.Bay.1:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Online
Disk.Bay.2:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Online
Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Ready

When I try to take it offline in order to replace it, I get the following error:

/admin1-> racadm storage forceoffline:Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1

ERROR: STOR013 : One or more Storage device(s) are not in a state where the
 operation can be completed.
Make sure the specified storage devices are in a state appropriate for
the requested operation and retry the operation.

What is the problem, and how can I force the disk offline so that I can replace it?

Thanks for your help

Moderator

 • 

3.7K Posts

March 8th, 2024 06:28

Hello, thanks for choosing Dell, and welcome to our community.

Could you try this through OMSA (OpenManage Server Administrator)?

 

Respectfully,

1 Rookie

 • 

16 Posts

March 8th, 2024 08:09

Hello,

Thanks for your reply.

Unfortunately, the OS is Debian 10, and OMSA is not available for it.

Is there another way to proceed?

Moderator

 • 

3.2K Posts

March 8th, 2024 13:55

Hi,

The force-offline request fails because the controller reports Disk 3 as "Ready" rather than "Online". A "Ready" state typically means the disk is recognized and available for use from the controller's perspective, even though it carries a predictive failure warning.

Here's why you can't force it offline and how to proceed:

Why forcing offline might not be possible:

  • Protection mechanism: RAID controllers prioritize data integrity. Forcing a potentially healthy disk offline could risk data loss during rebuild.
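To make the state check concrete, here is a minimal Python sketch that parses the `racadm raid get pdisks -o -p state` output quoted earlier in this thread and lists the disks that are `Online`. It assumes, as the STOR013 message suggests, that a force-offline request only makes sense for a disk that is currently `Online`. The helper names `parse_states` and `force_offline_candidates` are hypothetical and are not part of racadm.

```python
# Output copied from the first post (prompt line omitted).
SAMPLE = """\
Disk.Direct.3:RAID.Integrated.1-1
   State                            = Ready
Disk.Bay.0:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Online
Disk.Bay.1:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Online
Disk.Bay.2:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Online
Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1
   State                            = Ready
"""


def parse_states(text):
    """Map each disk FQDD (unindented line) to its State (indented line)."""
    states = {}
    fqdd = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            key, _, value = line.partition("=")
            if fqdd is not None and key.strip() == "State":
                states[fqdd] = value.strip()
        else:
            fqdd = line.strip()
    return states


def force_offline_candidates(states):
    # Assumption (not confirmed by Dell documentation in this thread):
    # only disks that are currently Online can be forced offline.
    return [fqdd for fqdd, state in states.items() if state == "Online"]


states = parse_states(SAMPLE)
candidates = force_offline_candidates(states)
for fqdd in candidates:
    print(fqdd)
```

With the output from the first post, Disk.Bay.3 is not among the candidates, which is consistent with the STOR013 error.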

How to proceed with replacing the disk:

  1. Manual Hot Swap (if supported):
     • Some RAID controllers allow hot swapping drives without manually taking them offline. Check your system's documentation to see if your controller supports this. If so, this is the recommended approach as it minimizes downtime.
  2. Put the RAID in maintenance mode (if available):
     • Certain RAID controllers offer a maintenance mode that allows putting the array in a degraded state for service. This might enable taking Disk 3 offline for replacement. Refer to your controller's manual for specific instructions.
  3. Initiate a controlled online replacement:
     • This is the most common approach. The process involves:
       • Issuing a command to the RAID controller to start the replacement process.
       • The controller will degrade the RAID array (meaning data redundancy is reduced).
       • You can then physically replace the failing disk (Disk 3 in your case).
       • The controller will automatically rebuild the data onto the new disk, restoring redundancy.

Here's how to find the specific commands for your RAID controller:

  • Consult your system's documentation for the exact commands for putting the RAID in maintenance mode or initiating a controlled online replacement. These commands might vary depending on your RAID controller model.
  • You can also search the internet for "[Your RAID controller model] online replacement guide" or similar keywords.

Additional Tips:

  • Make sure your system has a recent backup before proceeding with any RAID operations.
  • If you're unsure about any steps, consider contacting your system administrator or the manufacturer's support for guidance.

By following these steps, you should be able to replace the failing disk safely without needing to force it offline.

 

1 Rookie

 • 

16 Posts

March 8th, 2024 16:04

Thanks for the clarification.

Disk 3 is a dedicated hot spare, and I think it doesn't hold any RAID 5 data:

/admin1-> racadm raid get pdisks -o -p usedraiddiskspace,availableraiddiskspace
...
Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1
   UsedRaidDiskSpace                = 0.00 GB
   AvailableRaidDiskSpace           = 1117.25 GB

The RAID controller of our PowerEdge R420 is a "PERC H310 Mini (Embedded)", and I guess its documentation is what you mean by "Consult your system's documentation". In it I see:

Does this mean I can proceed with option 1, Manual Hot Swap, and simply replace the disk physically?

For option 2, I can't find any information about a "maintenance mode", so I guess there is none.

For option 3, I can't find a controller command to start a replacement process. But since it's a hot spare, it doesn't need to be "extracted" from the RAID array, does it?

Thanks

Moderator

 • 

8.5K Posts

March 8th, 2024 16:32

Tdevred,

 

If the drive is indeed assigned as a hot spare, as you say, then there is no reason to force it offline, because (1) it isn't online in the Virtual Disk and (2) it has no data on it. A Ready drive is simply a drive that the controller sees as ready to be used but that holds no data or configuration data.

 

Before you do anything, though, would you confirm whether the Virtual Disk is in an Optimal state? If so, you can proceed by removing the hot spare with the predicted failure, waiting a couple of minutes, and then inserting a replacement drive.

 

If it isn't Optimal then let us know.
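As a rough illustration of this pre-check, the Python sketch below parses text in the layout produced by `racadm storage get vdisks -o` and reports any virtual disk that does not look healthy. It assumes that `Status = Ok`, `State = Online` and `RemainingRedundancy` of at least 1 together correspond to what the controller BIOS would call "Optimal"; that mapping and the helper names `parse_objects` and `is_healthy` are assumptions made for this example.

```python
# Trimmed sample in the racadm "-o" layout: an FQDD line followed by
# indented "Key = Value" lines. The values are illustrative.
SAMPLE = """\
Disk.Virtual.0:RAID.Integrated.1-1
   Status                           = Ok
   State                            = Online
   Layout                           = Raid-5
   RemainingRedundancy              = 1
Disk.Virtual.1:RAID.Integrated.1-1
   Status                           = Ok
   State                            = Online
   Layout                           = Raid-5
   RemainingRedundancy              = 1
"""


def parse_objects(text):
    """Return {FQDD: {property: value}} from racadm-style output."""
    objects = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            if current is not None and "=" in line:
                key, _, value = line.partition("=")
                objects[current][key.strip()] = value.strip()
        else:
            current = line.strip()
            objects[current] = {}
    return objects


def is_healthy(props):
    # Assumed health criteria; adjust to your own policy.
    return (
        props.get("Status") == "Ok"
        and props.get("State") == "Online"
        and int(props.get("RemainingRedundancy", "0")) >= 1
    )


vdisks = parse_objects(SAMPLE)
unhealthy = [name for name, props in vdisks.items() if not is_healthy(props)]
if unhealthy:
    print("Check these virtual disks first:", unhealthy)
else:
    print("No unhealthy virtual disks found")
```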

 

 

1 Rookie

 • 

16 Posts

March 8th, 2024 17:06

What do you mean by "Optimal state"?

Here is all the information I have about Disk 3:

Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1
   Status                           = Unknown
   DeviceDescription                = Disk 3 in Backplane 1 of Integrated RAID Controller 1
   RollupStatus                     = Unknown
   Name                             = Physical Disk 0:1:3
   State                            = Ready
   OperationState                   = Not Applicable
   PowerStatus                      = Spun-Up
   Size                             = 1117.25 GB
   FailurePredicted                 = NO
   RemainingRatedWriteEndurance     = Not Applicable
   SecurityStatus                   = Not Capable
   BusProtocol                      = SAS
   MediaType                        = HDD
   UsedRaidDiskSpace                = 0.00 GB
   AvailableRaidDiskSpace           = 1117.25 GB
   Hotspare                         = Dedicated
   Manufacturer                     = SEAGATE
   ProductId                        = ST1200MM0088
   Revision                         = TT31
   SerialNumber                     = Z400DHQN
   PartNumber                       = TH0WXPCX2123365C00GFA00
   NegotiatedSpeed                  = 12.0 Gb/s
   ManufacturedDay                  = 6
   ManufacturedWeek                 = 19
   ManufacturedYear                 = 2016
   ForeignKeyIdentifier             = null
   SasAddress                       = 0x5000C500856E1ACD
   FormFactor                       = 2.5 Inch
   RaidNominalMediumRotationRate    = 10000
   T10PICapability                  = Not Capable
   BlockSizeInBytes                 = 512
   MaxCapableSpeed                  = Unknown
   SelfEncryptingDriveCapability    = Not Capable

PS: I don't understand why the disk appears twice in racadm raid get pdisks:

/admin1-> racadm raid get pdisks
Disk.Direct.3:RAID.Integrated.1-1
Disk.Bay.0:Enclosure.Internal.0-1:RAID.Integrated.1-1
Disk.Bay.1:Enclosure.Internal.0-1:RAID.Integrated.1-1
Disk.Bay.2:Enclosure.Internal.0-1:RAID.Integrated.1-1
Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1

Disk.Direct.3:RAID.Integrated.1-1 and Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1 seem to be the same disk (same SerialNumber):

Disk.Direct.3:RAID.Integrated.1-1
   Status                           = Warning
   DeviceDescription                = Disk 3 on Integrated RAID Controller 1
   RollupStatus                     = Warning
   Name                             = Physical Disk 0:3
   State                            = Ready
   OperationState                   = Not Applicable
   PowerStatus                      = Spun-Up
   Size                             = 1117.25 GB
   FailurePredicted                 = YES
   RemainingRatedWriteEndurance     = Not Applicable
   SecurityStatus                   = Not Capable
   BusProtocol                      = SAS
   MediaType                        = HDD
   UsedRaidDiskSpace                = 0.00 GB
   AvailableRaidDiskSpace           = 1117.25 GB
   Hotspare                         = Dedicated
   Manufacturer                     = SEAGATE
   ProductId                        = ST1200MM0088
   Revision                         = TT31
   SerialNumber                     = Z400DHQN
   PartNumber                       = TH0WXPCX2123365C00GFA00
   NegotiatedSpeed                  = 6.0 Gb/s
   ManufacturedDay                  = 6
   ManufacturedWeek                 = 19
   ManufacturedYear                 = 2016
   ForeignKeyIdentifier             = null
   SasAddress                       = 0x5000C500856E1ACD
   FormFactor                       = 2.5 Inch
   RaidNominalMediumRotationRate    = 10000
   T10PICapability                  = Not Capable
   BlockSizeInBytes                 = 512
   MaxCapableSpeed                  = Unknown
   SelfEncryptingDriveCapability    = Not Capable
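The comparison between the two entries can be sketched in a few lines of Python. The example below uses a trimmed copy of the two records above, confirms that they share a serial number, and lists the properties that differ. The helper names `parse_objects` and `diff_records` are hypothetical and exist only for this illustration.

```python
# Trimmed copies of the two records shown above.
SAMPLE = """\
Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1
   Status                           = Unknown
   Name                             = Physical Disk 0:1:3
   State                            = Ready
   FailurePredicted                 = NO
   SerialNumber                     = Z400DHQN
   NegotiatedSpeed                  = 12.0 Gb/s
Disk.Direct.3:RAID.Integrated.1-1
   Status                           = Warning
   Name                             = Physical Disk 0:3
   State                            = Ready
   FailurePredicted                 = YES
   SerialNumber                     = Z400DHQN
   NegotiatedSpeed                  = 6.0 Gb/s
"""


def parse_objects(text):
    """Return {FQDD: {property: value}} from racadm-style output."""
    objects = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            if current is not None and "=" in line:
                key, _, value = line.partition("=")
                objects[current][key.strip()] = value.strip()
        else:
            current = line.strip()
            objects[current] = {}
    return objects


def diff_records(a, b):
    """Return {property: (value_in_a, value_in_b)} for differing values."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


disks = parse_objects(SAMPLE)
bay = disks["Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1"]
direct = disks["Disk.Direct.3:RAID.Integrated.1-1"]
same_serial = bay["SerialNumber"] == direct["SerialNumber"]
differences = diff_records(bay, direct)
print("same serial:", same_serial)
for key, (left, right) in differences.items():
    print(f"{key}: {left!r} vs {right!r}")
```

In this trimmed sample, only the Status, Name, FailurePredicted and NegotiatedSpeed values differ between the two entries.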

What is this "Disk.Direct.3"?

1 Rookie

 • 

16 Posts

March 8th, 2024 17:10

Oh, sorry, you were talking about the virtual disks. Yes, they look good:

/admin1-> racadm storage get vdisks -o
Disk.Virtual.0:RAID.Integrated.1-1
   Status                           = Ok
   DeviceDescription                = Virtual Disk 0 on Integrated RAID Controller 1
   Name                             = system
   RollupStatus                     = Ok
   State                            = Online
   OperationalState                 = Not applicable
   Layout                           = Raid-5
   Size                             = 1000.00 GB
   SpanDepth                        = 1
   AvailableProtocols               = SAS
   MediaType                        = HDD
   ReadPolicy                       = No Read Ahead
   WritePolicy                      = Write Through
   StripeSize                       = 64K
   DiskCachePolicy                  = Default
   BadBlocksFound                   = NO
   Secured                          = NO
   RemainingRedundancy              = 1
   EnhancedCache                    = Not Applicable
   T10PIStatus                      = Disabled
   BlockSizeInBytes                 = 512
Disk.Virtual.1:RAID.Integrated.1-1
   Status                           = Ok
   DeviceDescription                = Virtual Disk 1 on Integrated RAID Controller 1
   Name                             = data
   RollupStatus                     = Ok
   State                            = Online
   OperationalState                 = Not applicable
   Layout                           = Raid-5
   Size                             = 1234.50 GB
   SpanDepth                        = 1
   AvailableProtocols               = SAS
   MediaType                        = HDD
   ReadPolicy                       = No Read Ahead
   WritePolicy                      = Write Through
   StripeSize                       = 64K
   DiskCachePolicy                  = Default
   BadBlocksFound                   = NO
   Secured                          = NO
   RemainingRedundancy              = 1
   EnhancedCache                    = Not Applicable
   T10PIStatus                      = Disabled
   BlockSizeInBytes                 = 512

Moderator

 • 

8.5K Posts

March 8th, 2024 17:23

What I was referring to was the Virtual Disk status, not the Physical Disk status. On the second image you provided, it looks like the Virtual Disks are called System and Data. If you click on those, what does it show for their status?

As for Disk.Direct, I believe it appears because the drive is being presented as a standalone disk.

 

 

1 Rookie

 • 

16 Posts

March 8th, 2024 17:38

The statuses are just "Ok".

Data :

System :

Moderator

 • 

8.5K Posts

March 8th, 2024 17:53

Thank you. The virtual disks are fine. That is just how the status is shown in that display; if you were to access the controller BIOS, it would display as Optimal.
So you should be able to remove the predicted-failure drive, wait a couple of minutes, and then insert the replacement. Once it is installed, you can reconfigure it as a hot spare.
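For the hot spare step, the following Python sketch only builds the command strings and does not run anything. The `racadm storage hotspare` and `racadm jobqueue create` syntax is written from memory of the RACADM command line reference and should be verified against the guide for your iDRAC firmware before use. The choice of Disk.Virtual.0 as the target virtual disk and the helper name `hotspare_commands` are assumptions made for this example.

```python
def hotspare_commands(pdisk_fqdd, vdisk_fqdd):
    """Build racadm command strings for a dedicated hot spare assignment.

    Assumed syntax (verify against your RACADM reference guide):
      racadm storage hotspare:<PD FQDD> -assign yes -type dhs -vdkey:<VD FQDD>
      racadm jobqueue create <controller FQDD>
    """
    # The controller FQDD is the last colon-separated part of the disk FQDD.
    controller = pdisk_fqdd.rsplit(":", 1)[-1]
    return [
        f"racadm storage hotspare:{pdisk_fqdd} "
        f"-assign yes -type dhs -vdkey:{vdisk_fqdd}",
        f"racadm jobqueue create {controller}",
    ]


# Illustrative FQDDs; substitute the values reported by your own system.
commands = hotspare_commands(
    "Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1",
    "Disk.Virtual.0:RAID.Integrated.1-1",
)
for command in commands:
    print(command)
```

Printing the commands first, rather than executing them, leaves room to review the FQDDs before anything is queued on the controller.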

1 Rookie

 • 

16 Posts

March 8th, 2024 17:57

Thanks, Chris. I will proceed on Monday, as I won't have access to the server before then, and I will let you know.

1 Rookie

 • 

16 Posts

March 11th, 2024 11:06

Hello,

I have replaced Disk 3.

The output of "racadm raid get pdisks" has now changed:

/admin1-> racadm raid get pdisks
Disk.Bay.0:Enclosure.Internal.0-1:RAID.Integrated.1-1
Disk.Bay.1:Enclosure.Internal.0-1:RAID.Integrated.1-1
Disk.Bay.2:Enclosure.Internal.0-1:RAID.Integrated.1-1
Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1

Disk.Direct.3:RAID.Integrated.1-1 disappeared and Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1 became Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1
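To see exactly which part of the identifier changed, the FQDD can be split on its colons. The Python sketch below is only an illustration; the component labels `disk`, `enclosure` and `controller` and the helper name `split_fqdd` are my own naming and not official Dell terminology.

```python
def split_fqdd(fqdd):
    """Split a physical disk FQDD into its colon-separated components."""
    parts = fqdd.split(":")
    return {
        "disk": parts[0],
        # Direct-attached entries such as Disk.Direct.3 have no enclosure part.
        "enclosure": parts[1] if len(parts) == 3 else None,
        "controller": parts[-1],
    }


old = split_fqdd("Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1")
new = split_fqdd("Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1")
direct = split_fqdd("Disk.Direct.3:RAID.Integrated.1-1")

changed = {key for key in old if old[key] != new[key]}
print("changed components:", changed)
print("old enclosure:", old["enclosure"], "new enclosure:", new["enclosure"])
```

Only the enclosure component differs between the old and new identifiers; the bay and the controller are unchanged.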

I have no control over the new Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1. If I try to act on it, I get this output:

ERROR: STOR099 : Unable to find the FQDD Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1
because an invalid FQDD is entered or an operation is pending on the specified FQDD.
Do the following and then retry the operation:
 1) Make sure the FQDD entered is valid.
 2) Make sure an operation is not pending on the FQDD identified in the message.
For more information refer to the RACADM Command Line Reference Guide.

When I check for pending operations on the physical disks, I get:

/admin1-> racadm storage get pdisks -pending
Disk.Direct.3:RAID.Integrated.1-1
Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1

What am I supposed to do?

Moderator

 • 

3.4K Posts

March 11th, 2024 15:57

Hello, 

It looks like you've successfully replaced Disk 3, but now you're encountering an issue with the system not recognizing the new disk properly, indicated by the change in the disk bay's enclosure number and the errors you're encountering when trying to control the new disk.

Can you run the racadm storage get pdisks command to check the status of all physical disks?

Also, try rescanning the disks to see whether the RAID controller recognizes the new disk correctly.

If the disk is still not recognized correctly, you may need to manually assign it as a hot spare or integrate it into your RAID array through the RAID controller's BIOS or management software.

If these steps do not resolve the issue, it may be beneficial to contact Dell's technical support for more in-depth troubleshooting.

Thanks

3 Apprentice

 • 

406 Posts

March 11th, 2024 16:48

I hope you are using a tested disk. Also check that it does not carry any previous configuration. If it does not, then:

1. Remove the drive and reinsert it, and let me know how it goes.

1 Rookie

 • 

16 Posts

March 11th, 2024 17:40

Thanks. The status of the physical disks looks OK, although some information is missing for the new Disk 3:

/admin1-> racadm storage get pdisks -o
Disk.Bay.0:Enclosure.Internal.0-1:RAID.Integrated.1-1
   Status                           = Unknown
   DeviceDescription                = Disk 0 in Backplane 1 of Integrated RAID Controller 1
   RollupStatus                     = Unknown
   Name                             = Physical Disk 0:1:0
   State                            = Online
   OperationState                   = Not Applicable
   PowerStatus                      = Spun-Up
   Size                             = 1117.25 GB
   FailurePredicted                 = NO
   RemainingRatedWriteEndurance     = Not Applicable
   SecurityStatus                   = Not Capable
   BusProtocol                      = SAS
   MediaType                        = HDD
   UsedRaidDiskSpace                = 1117.25 GB
   AvailableRaidDiskSpace           = 0.00 GB
   Hotspare                         = NO
   Manufacturer                     = SEAGATE
   ProductId                        = ST1200MM0088
   Revision                         = TT31
   SerialNumber                     = Z400DPW5
   PartNumber                       = TH0WXPCX2123365C015EA00
   NegotiatedSpeed                  = 12.0 Gb/s
   ManufacturedDay                  = 6
   ManufacturedWeek                 = 19
   ManufacturedYear                 = 2016
   ForeignKeyIdentifier             = null
   SasAddress                       = 0x5000C500856EE4DD
   FormFactor                       = 2.5 Inch
   RaidNominalMediumRotationRate    = 10000
   T10PICapability                  = Not Capable
   BlockSizeInBytes                 = 512
   MaxCapableSpeed                  = Unknown
   SelfEncryptingDriveCapability    = Not Capable
Disk.Bay.1:Enclosure.Internal.0-1:RAID.Integrated.1-1
   Status                           = Unknown
   DeviceDescription                = Disk 1 in Backplane 1 of Integrated RAID Controller 1
   RollupStatus                     = Unknown
   Name                             = Physical Disk 0:1:1
   State                            = Online
   OperationState                   = Not Applicable
   PowerStatus                      = Spun-Up
   Size                             = 1117.25 GB
   FailurePredicted                 = NO
   RemainingRatedWriteEndurance     = Not Applicable
   SecurityStatus                   = Not Capable
   BusProtocol                      = SAS
   MediaType                        = HDD
   UsedRaidDiskSpace                = 1117.25 GB
   AvailableRaidDiskSpace           = 0.00 GB
   Hotspare                         = NO
   Manufacturer                     = SEAGATE
   ProductId                        = ST1200MM0088
   Revision                         = TT31
   SerialNumber                     = Z400DGLW
   PartNumber                       = TH0WXPCX2123365C017HA00
   NegotiatedSpeed                  = 12.0 Gb/s
   ManufacturedDay                  = 6
   ManufacturedWeek                 = 19
   ManufacturedYear                 = 2016
   ForeignKeyIdentifier             = null
   SasAddress                       = 0x5000C500856E58A9
   FormFactor                       = 2.5 Inch
   RaidNominalMediumRotationRate    = 10000
   T10PICapability                  = Not Capable
   BlockSizeInBytes                 = 512
   MaxCapableSpeed                  = Unknown
   SelfEncryptingDriveCapability    = Not Capable
Disk.Bay.2:Enclosure.Internal.0-1:RAID.Integrated.1-1
   Status                           = Unknown
   DeviceDescription                = Disk 2 in Backplane 1 of Integrated RAID Controller 1
   RollupStatus                     = Unknown
   Name                             = Physical Disk 0:1:2
   State                            = Online
   OperationState                   = Not Applicable
   PowerStatus                      = Spun-Up
   Size                             = 1117.25 GB
   FailurePredicted                 = NO
   RemainingRatedWriteEndurance     = Not Applicable
   SecurityStatus                   = Not Capable
   BusProtocol                      = SAS
   MediaType                        = HDD
   UsedRaidDiskSpace                = 1117.25 GB
   AvailableRaidDiskSpace           = 0.00 GB
   Hotspare                         = NO
   Manufacturer                     = SEAGATE
   ProductId                        = ST1200MM0088
   Revision                         = TT31
   SerialNumber                     = Z400C5NG
   PartNumber                       = TH0WXPCX2123365C0158A00
   NegotiatedSpeed                  = 12.0 Gb/s
   ManufacturedDay                  = 6
   ManufacturedWeek                 = 19
   ManufacturedYear                 = 2016
   ForeignKeyIdentifier             = null
   SasAddress                       = 0x5000C500856D5F39
   FormFactor                       = 2.5 Inch
   RaidNominalMediumRotationRate    = 10000
   T10PICapability                  = Not Capable
   BlockSizeInBytes                 = 512
   MaxCapableSpeed                  = Unknown
   SelfEncryptingDriveCapability    = Not Capable
Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1
   Status                           = Ok
   DeviceDescription                = Disk 3 in Backplane 0 of Integrated RAID Controller 1
   RollupStatus                     = Ok
   Name                             = Physical Disk 1:0:3
   State                            = Non-Raid
   OperationState                   = Not Applicable
   PowerStatus                      = Spun-Up
   Size                             = 1117.25 GB
   FailurePredicted                 = NO
   RemainingRatedWriteEndurance     = Not Applicable
   SecurityStatus                   = Not Capable
   BusProtocol                      = SAS
   MediaType                        = HDD
   UsedRaidDiskSpace                = 0.00 GB
   AvailableRaidDiskSpace           = 1117.25 GB
   Hotspare                         = NO
   Manufacturer                     = IBM 207x
   ProductId                        = ST1200MM0088
   Revision                         = B587
   SerialNumber                     =
   PartNumber                       =
   NegotiatedSpeed                  = 6.0 Gb/s
   ManufacturedDay                  = 0
   ManufacturedWeek                 = 0
   ManufacturedYear                 = 0
   ForeignKeyIdentifier             = null
   SasAddress                       = 0x5000C50093B4CD59
   FormFactor                       = Unknown
   RaidNominalMediumRotationRate    = 0
   T10PICapability                  = Not Capable
   BlockSizeInBytes                 = 512
   MaxCapableSpeed                  = Unknown
   SelfEncryptingDriveCapability    = Not Capable
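The gaps in the new disk's record can be listed mechanically. The Python sketch below takes a trimmed copy of the Disk.Bay.3 record above and flags properties whose value is empty, `0` or `Unknown`, and it also reports whether the disk is in the `Non-Raid` state. The set of "suspicious" values and the helper names are assumptions chosen for this example, not a Dell-defined check.

```python
# Trimmed copy of the new Disk.Bay.3 record shown above.
SAMPLE = """\
Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1
   Status                           = Ok
   State                            = Non-Raid
   Hotspare                         = NO
   Manufacturer                     = IBM 207x
   ProductId                        = ST1200MM0088
   SerialNumber                     =
   PartNumber                       =
   ManufacturedYear                 = 0
   FormFactor                       = Unknown
"""

# Assumed markers of missing information.
SUSPICIOUS_VALUES = {"", "0", "Unknown"}


def parse_objects(text):
    """Return {FQDD: {property: value}} from racadm-style output."""
    objects = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            if current is not None and "=" in line:
                key, _, value = line.partition("=")
                objects[current][key.strip()] = value.strip()
        else:
            current = line.strip()
            objects[current] = {}
    return objects


def missing_fields(props):
    """List properties whose value looks empty or unpopulated."""
    return [key for key, value in props.items() if value in SUSPICIOUS_VALUES]


disks = parse_objects(SAMPLE)
new_disk = disks["Disk.Bay.3:Enclosure.Internal.1-0:RAID.Integrated.1-1"]
missing = missing_fields(new_disk)
is_non_raid = new_disk.get("State") == "Non-Raid"
print("unpopulated fields:", missing)
print("non-RAID state:", is_non_raid)
```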

I'm not sure what "rescanning disks" means. Is it the same as performing a "Patrol Read" operation?
