
Disk Replacement

Replace Disk

Warning

In this lab environment we have not physically removed or replaced a disk, so we will simulate the replacement with enableDisk. Remember that in a real disk replacement scenario, you would choose replaceDisk instead.


Instructions

  1. Log in as Admin
  2. Click Cluster Tab
  3. Click Nodes Sub-Tab
  4. Click Advanced Tab
  5. Select Disk Management
  6. Select disableDisk
  7. Choose a target Node
  8. Type /cloudian1 in the Mount Point field
  9. Click Execute

Ensure Result shows disableDisk completed.
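You can also sanity-check the result from the operating system on the target node. The following is a hedged sketch, not part of the official lab steps: it simply checks whether the lab mount point still appears in /proc/mounts (depending on your HyperStore version, disableDisk may unmount the filesystem).

```shell
# Hedged sketch (not an official step): check from the OS whether the
# lab mount point is still mounted on the target node.
MP="${MP:-/cloudian1}"   # the mount point used in this lab

if grep -qs " $MP " /proc/mounts; then
  echo "$MP is currently mounted"
else
  echo "$MP is not listed in /proc/mounts"
fi
```

Run this on the node where you disabled the disk; the CMC remains the authoritative view of disk status.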


Instructions

  1. You may need to wait a few minutes for the CMC to register the disabled disk.
  2. Click on Cluster Tab
  3. Click on Nodes Sub-Tab
  4. Select the node you disabled the disk on under the Host dropdown
  5. Ensure that the disk is showing as NotAvail

(You may need to refresh the screen.)


To Activate a Replacement Disk and Restore Data to It

Important

After you’ve physically installed the replacement disk, follow these steps:

Instructions

  1. Log into the CMC
  2. Select Cluster tab
  3. Select Nodes sub-tab
  4. Select Advanced.
  5. For the Command Type select "Disk Management"
  6. For the Command select "enableDisk". (In a true disk failure scenario you would use the "replaceDisk" option.)
  7. Choose the Target Node (the node on which the disk resides)
  8. Enter the Mount Point of the replacement disk. This must be the same as the mount point of the disk that you disabled (/cloudian1).
  9. Click Execute.

Allow some time for the operation to complete. The disk replacement operation automatically invokes a repair on the mount point (to recreate on the new disk the same set of S3 object data that was on the disk you replaced). The duration of this repair depends on how much data is involved.
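If you prefer to watch the repair from the command line, hsstool provides an opstatus command that reports the status of recent operations on a node. The sketch below is an assumption-laden illustration: it assumes hsstool is on your PATH on the node you are logged into, and studentXn5 is the lab node name from this exercise.

```shell
# Hedged sketch: check progress of the automatic repair started by
# enableDisk. Assumes hsstool is on PATH; studentXn5 is the lab node name.
NODE="${NODE:-studentXn5}"
CMD="hsstool -h $NODE opstatus"

if command -v hsstool >/dev/null 2>&1; then
  $CMD
else
  echo "hsstool not found; would run: $CMD"
fi
```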

Instructions

  1. Select Cluster tab
  2. Select Nodes sub-tab

Info

In the "Disk Detail Info" section, the replacement disk should now have a green status icon (indicating that its status is OK), and the operation is complete. In the lab it may be necessary to click "Clear Error History" to clear the red error indicator.


Repair System

Info

In line with best practice, we must now perform a manual repair of the system so that any reduction in redundancy is restored as quickly as possible. Since we know that studentXn5 had an issue, we will concentrate on that node (modify the commands to match your own lab, where X is your lab ID).

Instructions

  1. We will start with Cassandra. Log into your system using the sa_admin account
  2. Start a Cassandra repair of all keyspaces
    hsstool -h studentXn5 repair allkeyspaces
    
  3. The repair in the labs will only take a few seconds to complete. In a real-world scenario it will likely take much longer.

Instructions

  1. Next, let us repair our erasure-coded fragments. If you have logged out, log back into your system using the sa_admin account
  2. Start the repair process
    hsstool -h studentXn5 repairec
    
  3. Again, the repair in the labs will only take a few seconds to complete.

Instructions

  1. Finally, let us repair our replicas. If you have logged out, log back into your system using the sa_admin account
  2. Start the repair process
    hsstool -h studentXn5 repair
    
  3. On completion fully log out of the sa_admin session using the exit command
    exit
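The three repair steps above can be collected into a single sketch. This is an illustration rather than an official script: the run wrapper and the DRY_RUN flag are our own additions, and NODE must be changed to match your lab.

```shell
#!/bin/sh
# Hedged sketch: the three repairs from this lab, in the order given.
# NODE must match your lab (studentXn5, where X is your lab ID).
NODE="${NODE:-studentXn5}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually invoke hsstool

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"   # print the command instead of running it
  else
    "$@"
  fi
}

run hsstool -h "$NODE" repair allkeyspaces   # 1. Cassandra keyspace repair
run hsstool -h "$NODE" repairec              # 2. erasure-coded fragment repair
run hsstool -h "$NODE" repair                # 3. replica repair
```

With the default DRY_RUN=1 the script only prints the commands, which is a safe way to review them before running the real repairs one at a time as described in the instructions above.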