Which Hot Spare will be used for a failed drive? (EMC Clariion / VNX)

How does an EMC Clariion or VNX decide which Hot Spare will be used for any failed drive?

First of all, the array does not rebuild the entire failed drive, only the LUNs that reside on it. All LUNs on the failed drive are rebuilt to the same Hot Spare, so a single failed drive is always replaced by a single Hot Spare. If, for example, a 600GB drive fails with only 100GB worth of LUNs on it, in theory a 146GB drive could be invoked to rebuild the data. What actually determines how large the Hot Spare needs to be is the location of the last LUN block on the failed drive, not the amount of space in use. If on a 600GB drive the last block of the last LUN sits on “location 350GB”, while the LUNs on that drive use only 100GB in total, the 146GB and 300GB Hot Spares aren’t valid choices, since the last block address (350GB) lies beyond the 300GB mark. Valid Hot Spares would be 400GB or larger.
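The sizing rule above can be illustrated with a minimal sketch. The function names and the way LUN extents are represented here as (start, length) pairs in GB are assumptions made for illustration; they are not how the array exposes this information.

```python
def required_spare_gb(lun_extents):
    """Return the minimum Hot Spare size in GB.

    lun_extents is a hypothetical list of (start_gb, length_gb) pairs,
    one per LUN slice on the failed drive. The highest end address
    decides the size, not the total space in use.
    """
    return max(start + length for start, length in lun_extents)


def valid_spares(lun_extents, spare_sizes_gb):
    """Filter the available Hot Spare sizes down to those large enough."""
    needed = required_spare_gb(lun_extents)
    return [size for size in spare_sizes_gb if size >= needed]


# 100GB in use on a 600GB drive, but the last LUN block ends at 350GB.
extents = [(0, 50), (300, 50)]
print(valid_spares(extents, [146, 300, 400, 600]))
```

With the same 100GB stored contiguously at the start of the drive, `[(0, 100)]`, every spare in the list would qualify, including the 146GB one.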

To find a Hot Spare, the storage array first scans the bus on which the failed drive resides. If no Hot Spare is found there, all other buses are scanned, starting with bus 0. Rotational speed is NOT a consideration for the array, so it will not stop a slower drive from being invoked. The best practice is therefore to use only Hot Spares that are as fast as or faster than the drives you are protecting.
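The scan order described above could be sketched as follows. This is only an illustration: the function names and the `spares_by_bus` dictionary are hypothetical, size and drive-type checks are left out, and taking the first listed spare on a bus is an assumption, since nothing above says how the array chooses among several spares on the same bus.

```python
def bus_scan_order(failed_bus, bus_count):
    """Failed drive's own bus first, then the remaining buses from bus 0 up."""
    return [failed_bus] + [b for b in range(bus_count) if b != failed_bus]


def pick_spare(failed_bus, bus_count, spares_by_bus):
    """Return (bus, spare) for the first bus in scan order that has a spare.

    spares_by_bus is a hypothetical mapping of bus number to a list of
    spare identifiers. Assumption: the first spare listed on a bus is taken.
    """
    for bus in bus_scan_order(failed_bus, bus_count):
        spares = spares_by_bus.get(bus, [])
        if spares:
            return bus, spares[0]
    return None


# A drive fails on bus 2 of a four-bus array; spares sit on buses 1 and 3.
print(bus_scan_order(2, 4))
print(pick_spare(2, 4, {1: ["1_0_14"], 3: ["3_0_14"]}))
```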

Which drives a Hot Spare can replace depends on the drive type. The following rules apply:

  • Fibre Channel and SATA-II drives can hot spare for other Fibre Channel and SATA-II drives;
  • SAS drives can only hot spare for other SAS drives;
  • NL-SAS drives can only hot spare for other NL-SAS drives;
  • An EFD (=SSD) can only replace failed EFDs (=SSDs), so it will not be invoked if a spinning disk fails;
  • ATA drives can only replace other ATA drives (nowadays ATA drives aren’t used anymore);
  • Fibre Channel (or SAS) and SATA-II (or NL-SAS) drives cannot replace EFDs (=SSDs) or ATA drives, and vice versa;
  • In a Clariion or VNX you cannot dedicate a specific Hot Spare to any specific RAID Group, so all Hot Spares are global;
  • Spread your Hot Spares across all available buses as much as possible.

 

Logical Unit Type    Hot Spare Type       Can the Hot Spare swap in for the Logical Unit?
-----------------    -----------------    -----------------------------------------------
EFD                  EFD                  Yes
EFD                  FC / SAS             No
EFD                  SATA II / NL-SAS     No
EFD                  ATA                  No
FC                   EFD                  No
FC                   FC                   Yes
FC                   SATA II              Yes
FC                   ATA                  No
SAS                  EFD                  No
SAS                  SAS                  Yes
SAS                  NL-SAS               No
SATA II / NL-SAS     EFD                  No
SATA II              FC                   Yes
SATA II / NL-SAS     SATA II / NL-SAS     Yes
SATA II              ATA                  No
ATA                  FC                   No
ATA                  SATA II              No
ATA                  ATA                  Yes
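The drive-type rules from the list and table above can be captured in a small lookup. This is a sketch, not anything the array exposes: the group layout and function name are made up for illustration, and the combined "SATA II / NL-SAS" rows are read as like-for-like (SATA II for SATA II, NL-SAS for NL-SAS), in line with the bullet stating that NL-SAS drives can only hot spare for other NL-SAS drives.

```python
# Each set holds drive types that can hot spare for one another.
COMPATIBILITY_GROUPS = [
    {"FC", "SATA II"},  # Fibre Channel and SATA-II are interchangeable
    {"SAS"},
    {"NL-SAS"},
    {"EFD"},
    {"ATA"},
]


def can_hot_spare(failed_type, spare_type):
    """True if a spare of spare_type may swap in for a failed failed_type drive."""
    return any(
        failed_type in group and spare_type in group
        for group in COMPATIBILITY_GROUPS
    )


print(can_hot_spare("FC", "SATA II"))
print(can_hot_spare("SAS", "NL-SAS"))
```

A full selection check would combine this type test with the size requirement and the bus scan order described earlier.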

Data inside the Clariion or VNX Vault (the first 5 drives on a Clariion, the first 4 on a VNX) will NOT be rebuilt to a Hot Spare unless a user LUN resides on the failed drive, and then only the user LUN will be rebuilt. The information within the Vault is protected by its own redundancy mechanisms. The storage array’s configuration, for example, is protected by a triple mirror, and the OS of each SP is mirrored on one of the first 4 drives. Once a replacement drive is inserted, all data is rebuilt of course: user LUNs as well as the CX/VNX data on the first 4 drives!
