Disk Replacement

Replace Disk

Warning

Because no disk has been physically removed or replaced in this lab environment, we will simulate the replacement process with enableDisk. In a real disk replacement scenario, you would choose the replaceDisk option instead.


Instructions

  1. Login as Admin
  2. Click Cluster Tab
  3. Click Nodes Sub-Tab
  4. Click Advanced Tab
  5. Select Disk Management
  6. Select disableDisk
  7. Choose a target Node
  8. Type /cloudian1 in Mount Point
  9. Click Execute

Ensure Result shows disableDisk completed.


Instructions

  1. You may need to wait a few minutes for the CMC to register the disabled disk.
  2. Click on Cluster Tab
  3. Click on Nodes Sub-Tab
  4. Select the node you disabled the disk on under the Host dropdown
  5. Ensure that the disk is showing as NotAvail

(You may need to refresh the screen.)


To Activate a Replacement Disk and Restore Data to It

Important

After you’ve physically installed the replacement disk, follow these steps:

Instructions

  1. Log into the CMC
  2. Select Cluster tab
  3. Select Nodes sub-tab
  4. Select Advanced.
  5. For the Command Type select "Disk Management"
  6. For the Command select "enableDisk". (In a true disk failure scenario you would need to use the "replaceDisk" option.)
  7. Choose the Target Node (the node on which the disk resides)
  8. Enter the Mount Point of the replacement disk. This must be the same as the mount point of the disk that you disabled (/cloudian1).
  9. Click Execute.

Wait some time for the operation to complete. The disk replacement operation automatically invokes a repair on the mount point (to recreate on the new disk the same set of S3 object data that was on the disk you replaced). The duration of this repair operation will depend on how much data is involved.

Instructions

  1. Select Cluster tab
  2. Select Nodes sub-tab

Info

In the "Disk Detail Info" section, the replacement disk should have a green status icon, indicating that its status is OK and that the operation is done. In the lab it may be necessary to click "Clear Error History" to clear the red error indicator.


Repair System

Info

When you physically install a new disk and then execute the HyperStore replaceDisk function, the system automatically does the following:
  1. Creates a primary partition and an ext4 file system on the new disk
  2. Establishes appropriate permissions on the mount
  3. Remounts the new disk (using the same mount point that the prior disk had), uncomments its entry in /etc/fstab, and marks the disk as available for HyperStore reads and writes
  4. Moves back to the new disk the same set of storage tokens that were automatically moved away from the prior disk when it was disabled
  5. Performs a data repair for the new disk (populating the new disk with its correct inventory of object replicas and/or erasure coded object fragments)
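
The remount and /etc/fstab steps in the list above can be spot-checked from a shell on the node once replaceDisk has finished. The following is a minimal sketch using standard Linux tools, not a HyperStore utility: the function name `fstab_entry_active` is hypothetical, and `/cloudian1` is simply the lab mount point used earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of HyperStore): succeeds if the given
# fstab-style file has an uncommented entry for the given mount point.
fstab_entry_active() {
    fstab_file=$1
    mount_point=$2
    # Field 1 is the device, field 2 is the mount point; skip commented lines.
    awk -v mp="$mount_point" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$fstab_file"
}

# Example usage on the node whose disk was replaced (left commented out):
#   fstab_entry_active /etc/fstab /cloudian1 && echo "fstab entry is active"
#   mountpoint -q /cloudian1 && echo "/cloudian1 is mounted"
```

This only confirms what the operating system sees. The CMC "Disk Detail Info" section remains the place to confirm that HyperStore itself considers the disk available.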

Warning

Although the HyperStore system has automatically repaired the disk and associated data, we should never rely solely on automated processes and must understand how to trigger a repair manually. Whenever we know an issue has occurred and been corrected, we should consider manually running a repair process.

Instructions

  1. We will start with Cassandra. Log into your system using the sa_admin account.
  2. Start a repair of all keyspaces. This repairs all the Cassandra keyspaces first and then repairs the replicated objects in the HSFS. Remember to modify the command to match your own system.
    hsstool -h studentXn5 repair allkeyspaces
    
  3. The repair in the labs will only take a few seconds to complete. In a real-world scenario it will likely take much longer.


Instructions

  1. Next, let us repair our erasure coded fragments. If you have logged out, log into your system using the sa_admin account.
  2. Start the repair process
    hsstool -h studentXn5 repairec
    
  3. Again, the repair in the labs will only take a few seconds to complete.

Instructions

  1. Finally, let us repair our replicas. This was already done during our first manual repair of allkeyspaces, but we will run it again to test the command. If you have logged out, log into your system using the sa_admin account.
  2. Start the repair process
    hsstool -h studentXn5 repair
    
  3. On completion, fully log out of the sa_admin session using the exit command
    exit
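
The three manual repairs in this section can be collected into one small script so that the host name only has to be changed in one place. This is a sketch only: the `repair_commands` function and the `HS_HOST` variable are hypothetical names, and the script just prints the commands from this lab so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical wrapper: prints the manual repair commands used in this lab
# for a given host, in the order the lab runs them. It does not run hsstool.
repair_commands() {
    host=$1
    if [ -z "$host" ]; then
        echo "usage: repair_commands <host>" >&2
        return 1
    fi
    printf 'hsstool -h %s repair allkeyspaces\n' "$host"
    printf 'hsstool -h %s repairec\n' "$host"
    printf 'hsstool -h %s repair\n' "$host"
}

# HS_HOST is an assumed variable name; studentXn5 is the lab placeholder
# and must be replaced with the host name of your own node.
repair_commands "${HS_HOST:-studentXn5}"
```

Printing rather than executing is a deliberate choice here: a repair on a production cluster can run for a long time, so it is worth confirming the target host before starting one.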