Introduction
RAID arrays provide increased performance and redundancy by combining
individual disks into virtual storage devices in specific
configurations. In Linux, the mdadm
utility makes it easy to create and manage software RAID arrays.
In a previous guide, we covered how to create RAID arrays with mdadm
on Ubuntu 16.04.
In this guide, we will demonstrate how to manage RAID arrays on an
Ubuntu 16.04 server. Managing RAID arrays is quite straightforward in
most cases.
Info: Due to the inefficiency of RAID setups on
virtual private servers, we don’t recommend deploying a RAID setup on
DigitalOcean droplets. The efficiency of datacenter disk replication
makes the benefits of a RAID negligible relative to a setup on
bare-metal hardware. This tutorial aims to be a reference for a
conventional RAID setup.
Prerequisites
To complete this guide, you will need access to a non-root sudo
user. You can follow our Ubuntu 16.04 initial server setup guide to set up an appropriate user.
As mentioned above, this guide will cover RAID array management. Follow our guide on how to create RAID arrays with mdadm on Ubuntu 16.04
to create one or more arrays before starting on this guide. This guide
will assume that you have one or more arrays to operate on.
One of the most essential requirements for proper management is the
ability to find information about the structure, component devices, and
current state of the array.
To get detailed information about a RAID device, pass the RAID device with the -D or --detail option to mdadm:
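A minimal sketch of that invocation, assuming the array is /dev/md0 as in the sample output (adjust the name for your system). The existence check is only there so that the snippet does nothing on a machine without that array:

```shell
ARRAY=/dev/md0   # assumed array name; adjust for your system

if [ -b "$ARRAY" ]; then
    # -D is the short form of --detail
    sudo mdadm --detail "$ARRAY"
else
    echo "$ARRAY is not present on this system" >&2
fi
```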
The important information about the array will be displayed:
Output
/dev/md0:
Version : 1.2
Creation Time : Mon Aug 8 21:19:06 2016
Raid Level : raid10
Array Size : 209584128 (199.88 GiB 214.61 GB)
Used Dev Size : 104792064 (99.94 GiB 107.31 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Update Time : Mon Aug 8 21:36:36 2016
State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Name : mdadmwrite:0 (local to host mdadmwrite)
UUID : 0dc2e687:1dfe70ac:d440b2ac:5828d61d
Events : 18
Number Major Minor RaidDevice State
0 8 0 0 active sync set-A /dev/sda
1 8 16 1 active sync set-B /dev/sdb
2 8 32 2 active sync set-A /dev/sdc
3 8 48 3 active sync set-B /dev/sdd
From this view you can see the RAID level, the array size, the health
of the individual pieces, the UUID of the array, and the component
devices and their roles. The information provided in this view is all
fairly well labeled.
To get the shortened details for an array, appropriate for adding to the /etc/mdadm/mdadm.conf file, you can pass in the --brief or -b flag with the detail view:
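As a sketch, again assuming the array is /dev/md0:

```shell
ARRAY=/dev/md0   # assumed array name

if [ -b "$ARRAY" ]; then
    # Equivalent short form: sudo mdadm -Db "$ARRAY"
    sudo mdadm --detail --brief "$ARRAY"
fi
```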
Output
ARRAY /dev/md0 metadata=1.2 name=mdadmwrite:0 UUID=0dc2e687:1dfe70ac:d440b2ac:5828d61d
To get a quick human-readable summary of a RAID device, use the -Q option to query it:
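For example, assuming the array is /dev/md0:

```shell
ARRAY=/dev/md0   # assumed array name

if [ -b "$ARRAY" ]; then
    # -Q is the short form of --query
    sudo mdadm -Q "$ARRAY"
fi
```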
Output
/dev/md0: 199.88GiB raid10 4 devices, 0 spares. Use mdadm --detail for more detail.
This can be used to find the key info about a RAID device at a glance.
You can also use mdadm to query individual component devices.
The -Q option, when used with a component device, will tell you the array it is a part of and its role:
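A sketch of the query, assuming /dev/sdc is one of the component devices (as in the sample output):

```shell
DEVICE=/dev/sdc   # assumed component device

if [ -b "$DEVICE" ]; then
    sudo mdadm -Q "$DEVICE"
fi
```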
Output
/dev/sdc: is not an md array
/dev/sdc: device 2 in 4 device active raid10 /dev/md0. Use mdadm --examine for more detail.
You can get more detailed information by using the -E or --examine options:
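For instance, still assuming /dev/sdc is a component device:

```shell
DEVICE=/dev/sdc   # assumed component device

if [ -b "$DEVICE" ]; then
    # -E is the short form of --examine; it reads the md superblock on the device
    sudo mdadm --examine "$DEVICE"
fi
```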
Output
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 0dc2e687:1dfe70ac:d440b2ac:5828d61d
Name : mdadmwrite:0 (local to host mdadmwrite)
Creation Time : Mon Aug 8 21:19:06 2016
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 209584128 (99.94 GiB 107.31 GB)
Array Size : 209584128 (199.88 GiB 214.61 GB)
Data Offset : 131072 sectors
Super Offset : 8 sectors
Unused Space : before=130984 sectors, after=0 sectors
State : active
Device UUID : b0676ef0:73046e93:9d9c7bde:c80352bb
Update Time : Mon Aug 8 21:36:36 2016
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 8be1be96 - correct
Events : 18
Layout : near=2
Chunk Size : 512K
Device Role : Active device 2
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
This information is similar to that displayed when using the -D option with the array device, but focused on the component device’s relationship to the array.
To get detailed information about each of the assembled arrays on your server, check the /proc/mdstat file. This is often the best way to find the current status of the active arrays on your system:
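The file can be read without root privileges. A small sketch that also handles systems where the md driver is not loaded:

```shell
MDSTAT=/proc/mdstat

if [ -r "$MDSTAT" ]; then
    cat "$MDSTAT"
else
    echo "$MDSTAT is not available (md driver not loaded?)" >&2
fi
```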
Output
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid10 sdd[3] sdc[2] sdb[1] sda[0]
209584128 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
unused devices: <none>
The output here is quite dense, providing a lot of information in a small amount of space.
/proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
. . .
The Personalities line describes the different RAID levels and configurations that the kernel currently supports.
The line beginning with md0 marks the start of a RAID device description. The indented lines that follow also describe this device.
/proc/mdstat
. . .
md0 : active raid10 sdd[3] sdc[2] sdb[1] sda[0]
. . .
The first line states that the array is active (not faulty) and
configured as RAID 10. After that, the component devices that were used
to build the array are listed. The numbers in brackets describe
the current “role” of each device in the array (this affects which copies
of data the device is given).
/proc/mdstat
. . .
209584128 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
. . .
The second line displayed in this example gives the number of blocks
the virtual device provides, the metadata version (1.2 in this
example), and the chunk size of the array. Since this is a RAID 10
array, it also includes information about the layout of the array (this
example has been configured to store two copies of each chunk of data in
the “near” layout).
The last items in square brackets both represent currently available
devices out of a healthy set. The first number in the numeric brackets
indicates the size of a healthy array while the second number represents
the currently available number of devices. The other brackets are a
visual indication of the array health, with “U” representing healthy
devices and “_” representing faulty devices.
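To make the bracket notation concrete, here is a small sketch that pulls both bracketed fields out of the sample status line and compares the two counters. The variable names are our own, and the line is hard-coded from the example rather than read from a live system:

```shell
# Status line copied from the /proc/mdstat example above.
line='      209584128 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]'

# First number: devices in a healthy array. Second number: devices currently available.
expected=$(printf '%s\n' "$line" | sed -n 's|.*\[\([0-9]*\)/[0-9]*].*|\1|p')
available=$(printf '%s\n' "$line" | sed -n 's|.*\[[0-9]*/\([0-9]*\)].*|\1|p')

# Health map: one "U" per healthy device, "_" for a faulty or missing one.
health=$(printf '%s\n' "$line" | sed -n 's|.*\[\([U_]*\)]$|\1|p')

if [ "$expected" -eq "$available" ]; then
    echo "healthy: $available of $expected devices ($health)"
else
    echo "degraded: $available of $expected devices ($health)"
fi
```

For the sample line this reports a healthy array with 4 of 4 devices.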
If your array is currently assembling or recovering, you might have
another line that shows the progress. It would look something like
this:
/proc/mdstat
. . .
[>....................] resync = 0.9% (2032768/209584128) finish=15.3min speed=225863K/sec
. . .
This describes the operation currently being applied and the
current progress in a number of different ways. It also provides the
current speed and an estimated time until completion.
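If you want these values in a script, the progress line can be split into its fields. The following is only a sketch against the sample line above, with variable names of our own choosing:

```shell
# Progress line copied from the example above.
line='      [>....................]  resync =  0.9% (2032768/209584128) finish=15.3min speed=225863K/sec'

operation=$(printf '%s\n' "$line" | sed -n 's/.*] *\([a-z]*\) *=.*/\1/p')   # e.g. resync
percent=$(printf '%s\n' "$line" | sed -n 's/.*= *\([0-9.]*\)%.*/\1/p')      # e.g. 0.9
finish=$(printf '%s\n' "$line" | sed -n 's/.*finish=\([^ ]*\).*/\1/p')      # e.g. 15.3min

echo "$operation at $percent%, estimated time remaining: $finish"
```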
After you have a good idea of what arrays are currently running on your system, there are a number of actions you can take.
Stopping an Array
To stop an array, the first step is to unmount it.
Step outside of the mounted directory and unmount it by typing:
You can stop all active arrays by typing:
If you want to stop a specific array, pass it to the mdadm --stop command:
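The unmount and stop commands described above can be sketched as follows. The array name /dev/md0 and the mount point /mnt/md0 are assumptions, and the small run helper is our own addition: it only prints each command unless RUN=1 is set, so you can review the commands before executing them.

```shell
ARRAY=/dev/md0         # assumed array name
MOUNT_POINT=/mnt/md0   # assumed mount point

# Print each command instead of running it unless RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi; }

cd /                              # any directory outside the mount point works
run sudo umount "$MOUNT_POINT"

# Either stop every active array ...
run sudo mdadm --stop --scan
# ... or stop only one specific array.
run sudo mdadm --stop "$ARRAY"
```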
This will stop the array. You will have to reassemble the array to access it again.
Starting an Array
To start all arrays defined in the configuration files or /proc/mdstat, type:
To start a specific array, you can pass it in as an argument to mdadm --assemble:
This works if the array is defined in the configuration file.
If the correct definition for the array is missing from the
configuration file, the array can still be started by passing in the
component devices:
Once the array is assembled, it can be mounted as usual:
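The assembly variants and the mount step from this section can be sketched as follows. The device names match the RAID 10 example earlier in this guide and /mnt/md0 is an assumed mount point. The run helper is our own and only prints each command unless RUN=1 is set:

```shell
ARRAY=/dev/md0         # assumed array name
MOUNT_POINT=/mnt/md0   # assumed mount point

# Print each command instead of running it unless RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi; }

# Option 1: assemble every array that mdadm can find a definition for.
run sudo mdadm --assemble --scan

# Option 2: assemble one array that is defined in the configuration file.
run sudo mdadm --assemble "$ARRAY"

# Option 3: no definition available, so name the component devices explicitly.
run sudo mdadm --assemble "$ARRAY" /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Once assembled, mount the array.
run sudo mount "$ARRAY" "$MOUNT_POINT"
```

Only one of the three assembly options is needed for a given array.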
The array should now be accessible at the mount point.
Adding a Spare Device to an Array
Spare devices can be added to any arrays that offer redundancy (such
as RAID 1, 5, 6, or 10). The spare will not be actively used by the
array unless an active device fails. When this happens, the array will
resync the data to the spare drive to repair the array to full health.
Spares cannot be added to non-redundant arrays (RAID 0) because the array will not survive the failure of a drive.
To add a spare, pass in the array and the new device to the mdadm --add command:
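As a sketch, with /dev/sde standing in as a hypothetical name for the new disk:

```shell
ARRAY=/dev/md0        # assumed array name
NEW_DEVICE=/dev/sde   # hypothetical name of the new disk

if [ -b "$ARRAY" ] && [ -b "$NEW_DEVICE" ]; then
    sudo mdadm "$ARRAY" --add "$NEW_DEVICE"
else
    echo "$ARRAY or $NEW_DEVICE is not present on this system" >&2
fi
```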
If the array is not in a degraded state, the new device will be added
as a spare. If the array is currently degraded, the resync operation
will immediately begin using the spare to replace the faulty drive.
After you add a spare, update the configuration file to reflect your new device orientation:
Remove or comment out the current line that corresponds to your array definition:
/etc/mdadm/mdadm.conf
. . .
# ARRAY /dev/md0 metadata=1.2 name=mdadmwrite:0 UUID=d81c843b:4d96d9fc:5f3f499c:6ee99294
Afterwards, append your current configuration:
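The two edits can be sketched as follows. To keep the example harmless, it works on a scratch copy seeded with the sample definition; on a real server the file would be /etc/mdadm/mdadm.conf and the writes would need sudo. The sed expression is our own shortcut for commenting out the line by hand in an editor:

```shell
ARRAY=/dev/md0   # assumed array name

# Scratch copy standing in for /etc/mdadm/mdadm.conf.
CONF=$(mktemp)
printf '%s\n' "ARRAY $ARRAY metadata=1.2 name=mdadmwrite:0 UUID=d81c843b:4d96d9fc:5f3f499c:6ee99294" > "$CONF"

# Comment out the existing definition for this array.
sed -i "s|^ARRAY $ARRAY |# &|" "$CONF"

# Append the current definition (only possible where the array exists).
if [ -b "$ARRAY" ]; then
    sudo mdadm --detail --brief "$ARRAY" >> "$CONF"
fi

cat "$CONF"
```

Against the real file, the append step is typically written as sudo mdadm --detail --brief /dev/md0 | sudo tee -a /etc/mdadm/mdadm.conf, since the redirection itself needs root.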
The new information will be used by the mdadm utility to assemble the array.
Increasing the Number of Active Devices in an Array
It is possible to grow an array by increasing the number of active
devices within the assembly. The exact procedure depends slightly on
the RAID level you are using.
With RAID 1 or 10
Begin by adding the new device as a spare, just as demonstrated in the last section:
Find out the current number of RAID devices in the array:
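Both steps can be sketched as follows, with /dev/sdc as a hypothetical name for the new disk:

```shell
ARRAY=/dev/md0        # assumed array name
NEW_DEVICE=/dev/sdc   # hypothetical name of the new disk

if [ -b "$ARRAY" ] && [ -b "$NEW_DEVICE" ]; then
    sudo mdadm "$ARRAY" --add "$NEW_DEVICE"   # add the disk as a spare
    sudo mdadm --detail "$ARRAY"              # check "Raid Devices" and "Total Devices"
fi
```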
Output
/dev/md0:
Version : 1.2
Creation Time : Wed Aug 10 15:29:26 2016
Raid Level : raid1
Array Size : 104792064 (99.94 GiB 107.31 GB)
Used Dev Size : 104792064 (99.94 GiB 107.31 GB)
Raid Devices : 2
Total Devices : 3
Persistence : Superblock is persistent
. . .
We can see that in this example, the array is configured to actively
use two devices, and that the total number of devices available to the
array is three (because we added a spare).
Now, reconfigure the array to have an additional active device. The spare will be used to satisfy the extra drive requirement:
The array will begin to reconfigure with an additional active disk. To view the progress of syncing the data, type:
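A sketch of the grow operation and the progress check, assuming the two-device RAID 1 array from the sample output is being grown to three active devices:

```shell
ARRAY=/dev/md0   # assumed array name
NEW_COUNT=3      # two active devices in the sample output, plus the former spare

if [ -b "$ARRAY" ]; then
    sudo mdadm --grow --raid-devices="$NEW_COUNT" "$ARRAY"
    cat /proc/mdstat   # shows the progress of the resulting sync
fi
```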
You can continue to use the device as the process completes.
With RAID 5 or 6
Begin by adding the new device as a spare, just as demonstrated in the last section:
Find out the current number of RAID devices in the array:
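The sketch below adds the disk, prints the array details, and then shows how the two counters of interest relate. The device name /dev/sdd is hypothetical, and the counter values are copied from the sample output rather than read from a live array:

```shell
ARRAY=/dev/md0        # assumed array name
NEW_DEVICE=/dev/sdd   # hypothetical name of the new disk

if [ -b "$ARRAY" ] && [ -b "$NEW_DEVICE" ]; then
    sudo mdadm "$ARRAY" --add "$NEW_DEVICE"
    sudo mdadm --detail "$ARRAY"
fi

# The two counters of interest, as they appear in the sample output.
detail='   Raid Devices : 3
  Total Devices : 4'

raid_devices=$(printf '%s\n' "$detail" | awk -F' : ' '/Raid Devices/ {print $2}')
total_devices=$(printf '%s\n' "$detail" | awk -F' : ' '/Total Devices/ {print $2}')

# With no failed devices, the difference is the number of spares.
echo "active: $raid_devices, spares: $((total_devices - raid_devices))"
```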
Output
/dev/md0:
Version : 1.2
Creation Time : Wed Aug 10 18:38:51 2016
Raid Level : raid5
Array Size : 209584128 (199.88 GiB 214.61 GB)
Used Dev Size : 104792064 (99.94 GiB 107.31 GB)
Raid Devices : 3
Total Devices : 4
Persistence : Superblock is persistent
. . .
We can see that in this example, the array is configured to actively
use three devices, and that the total number of devices available to the
array is four (because we added a spare).
Now, reconfigure the array to have an additional active device. The
spare will be used to satisfy the extra drive requirement. When growing
a RAID 5 or RAID 6 array, it is important to include an additional
option called --backup-file. This should point to a location off the array where a backup file containing critical information will be stored.
Note
The backup file is only used for a very short but critical time
during this process, after which it will be deleted automatically.
Because the time when this is needed is very brief, you will likely
never see the file on disk, but in the event that something goes wrong,
it can be used to rebuild the array. This post has some additional information if you would like to know more.
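A sketch of the grow command for the three-device RAID 5 example. The backup file path is a hypothetical choice; any location that is not on the array itself should work:

```shell
ARRAY=/dev/md0                   # assumed array name
NEW_COUNT=4                      # three active devices in the sample output, plus the former spare
BACKUP_FILE=/root/md0_grow.bak   # hypothetical path; must not be stored on the array

if [ -b "$ARRAY" ]; then
    sudo mdadm --grow --raid-devices="$NEW_COUNT" --backup-file="$BACKUP_FILE" "$ARRAY"
fi
```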
The following output indicates that the critical section will be backed up:
Output
mdadm: Need to backup 3072K of critical section..
The array will begin to reconfigure with an additional active disk. To view the progress of syncing the data, type:
You can continue to use the device as this process completes.
After the reshape is complete, you will need to expand the filesystem on the array to utilize the additional space:
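The progress check and the filesystem resize can be sketched as follows. This assumes an ext4 filesystem created directly on /dev/md0, which is what resize2fs handles; other filesystems have their own resize tools:

```shell
ARRAY=/dev/md0   # assumed array name, holding an ext4 filesystem

if [ -b "$ARRAY" ]; then
    cat /proc/mdstat
    if grep -q 'reshape' /proc/mdstat; then
        echo "Reshape still running; resize the filesystem once it finishes." >&2
    else
        sudo resize2fs "$ARRAY"
    fi
fi
```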
Your array should now have a filesystem that matches its capacity.
With RAID 0
Because RAID 0 arrays cannot have spare drives (there is no chance
for a spare to rebuild a damaged RAID 0 array), we must add the new
device at the same time that we grow the array.
First, find out the current number of RAID devices in the array:
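As before, this is the detail view of the array (assumed to be /dev/md0):

```shell
ARRAY=/dev/md0   # assumed array name

if [ -b "$ARRAY" ]; then
    sudo mdadm --detail "$ARRAY"
fi
```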
Output
/dev/md0:
Version : 1.2
Creation Time : Wed Aug 10 19:17:14 2016
Raid Level : raid0
Array Size : 209584128 (199.88 GiB 214.61 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
. . .
We can now increment the number of RAID devices in the same operation as the new drive addition:
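A sketch of the combined command, growing the two-device RAID 0 array from the sample output to three devices. The new disk is /dev/sdc, matching the output shown next:

```shell
ARRAY=/dev/md0        # assumed array name
NEW_DEVICE=/dev/sdc   # new disk, as named in the sample output
NEW_COUNT=3           # two devices in the sample output, plus the new one

if [ -b "$ARRAY" ] && [ -b "$NEW_DEVICE" ]; then
    sudo mdadm --grow "$ARRAY" --raid-devices="$NEW_COUNT" --add "$NEW_DEVICE"
fi
```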
You will see output indicating that the array has been changed to RAID 4:
Output
mdadm: level of /dev/md0 changed to raid4
mdadm: added /dev/sdc
This is normal and expected. The array will transition back into
RAID 0 when the data has been redistributed to all existing disks.
You can check the progress of the action by typing:
Once the sync is complete, resize the filesystem to use the additional space:
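A sketch of both steps, again assuming an ext4 filesystem directly on /dev/md0:

```shell
ARRAY=/dev/md0   # assumed array name, holding an ext4 filesystem

if [ -b "$ARRAY" ]; then
    cat /proc/mdstat
    if grep -q 'reshape' /proc/mdstat; then
        echo "Reshape still running; resize the filesystem once it finishes." >&2
    else
        sudo resize2fs "$ARRAY"
    fi
fi
```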
Your array should now have a filesystem that matches its capacity.
Removing a Device from an Array
Removing a drive from a RAID array is sometimes necessary if there is a fault or if you need to switch out the disk.
For a device to be removed, it must first be marked as “failed”
within the array. You can check if there is a failed device by using mdadm --detail:
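For example, assuming the array is /dev/md0:

```shell
ARRAY=/dev/md0   # assumed array name

if [ -b "$ARRAY" ]; then
    # Look at "State", "Failed Devices", and the device table at the bottom.
    sudo mdadm --detail "$ARRAY"
fi
```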
Output
/dev/md0:
Version : 1.2
Creation Time : Wed Aug 10 21:42:12 2016
Raid Level : raid5
Array Size : 209584128 (199.88 GiB 214.61 GB)
Used Dev Size : 104792064 (99.94 GiB 107.31 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Update Time : Thu Aug 11 14:10:43 2016
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : mdadmwrite:0 (local to host mdadmwrite)
UUID : bf7a711b:b3aa9440:40d2c12e:79824706
Events : 144
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 0 1 active sync /dev/sda
2 8 16 2 active sync /dev/sdb
0 8 32 - faulty /dev/sdc
The State and Failed Devices lines, together with the faulty entry in the device list, all indicate that a drive is no longer functioning (/dev/sdc in this example).
If you need to remove a drive that does not have a problem, you can manually mark it as failed with the --fail option:
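A sketch of the command, using the device names from the sample output:

```shell
ARRAY=/dev/md0    # assumed array name
DEVICE=/dev/sdc   # device to mark as failed

if [ -b "$ARRAY" ] && [ -b "$DEVICE" ]; then
    sudo mdadm "$ARRAY" --fail "$DEVICE"
fi
```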
Output
mdadm: set /dev/sdc faulty in /dev/md0
If you look at the output of mdadm --detail, you should see that the device is now marked faulty.
Once the device is failed, you can remove it from the array with mdadm --remove:
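Continuing with the same device names:

```shell
ARRAY=/dev/md0    # assumed array name
DEVICE=/dev/sdc   # device already marked as failed

if [ -b "$ARRAY" ] && [ -b "$DEVICE" ]; then
    sudo mdadm "$ARRAY" --remove "$DEVICE"
fi
```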
Output
mdadm: hot removed /dev/sdc from /dev/md0
You can then replace it with a new drive, using the same mdadm --add command that you use to add a spare:
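Here the replacement disk is /dev/sdd, matching the output shown next:

```shell
ARRAY=/dev/md0        # assumed array name
NEW_DEVICE=/dev/sdd   # replacement disk

if [ -b "$ARRAY" ] && [ -b "$NEW_DEVICE" ]; then
    sudo mdadm "$ARRAY" --add "$NEW_DEVICE"
fi
```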
Output
mdadm: added /dev/sdd
The array will begin to recover by copying data to the new drive.
Deleting an Array
To destroy an array, including all data contained within, begin by following the process we used to stop an array.
First, unmount the filesystem:
Next, stop the array:
Afterwards, delete the array itself with the --remove command targeting the RAID device:
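The three steps so far can be sketched as follows, assuming the array is /dev/md0 mounted at /mnt/md0. Because these commands are destructive, the run helper (our own addition) only prints them unless RUN=1 is set. On some systems the device node already disappears after --stop, in which case the --remove step may report that the device does not exist and can be skipped:

```shell
ARRAY=/dev/md0         # assumed array name
MOUNT_POINT=/mnt/md0   # assumed mount point

# Print each command instead of running it unless RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi; }

run sudo umount "$MOUNT_POINT"     # 1. unmount the filesystem
run sudo mdadm --stop "$ARRAY"     # 2. stop the array
run sudo mdadm --remove "$ARRAY"   # 3. remove the array device
```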
Once the array itself is removed, you should use mdadm --zero-superblock on each of the component devices. This will erase the md superblock, a header used by mdadm
to assemble and manage the component devices as part of an array. If
this is still present, it may cause problems when trying to reuse the
disk for other purposes.
You can see that the superblock is still present on the former component devices by checking the FSTYPE column in the lsblk --fs output:
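A minimal sketch, guarded so that it does nothing where lsblk is not installed:

```shell
if command -v lsblk >/dev/null 2>&1; then
    # Former array members show "linux_raid_member" in the FSTYPE column.
    lsblk --fs
else
    echo "lsblk is not installed" >&2
fi
```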
Output
NAME FSTYPE LABEL UUID MOUNTPOINT
sda linux_raid_member mdadmwrite:0 bf7a711b-b3aa-9440-40d2-c12e79824706
sdb linux_raid_member mdadmwrite:0 bf7a711b-b3aa-9440-40d2-c12e79824706
sdc linux_raid_member mdadmwrite:0 bf7a711b-b3aa-9440-40d2-c12e79824706
sdd
vda
├─vda1 ext4 DOROOT 4f8b85db-8c11-422b-83c4-c74195f67b91 /
└─vda15
In this example, /dev/sda, /dev/sdb, and /dev/sdc were all part of the array and are still labeled as such.
Remove the labels by typing:
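A sketch using the three device names from the example. This erases RAID metadata, so the run helper (our own addition) only prints the command unless RUN=1 is set. Double-check the device names before running it for real:

```shell
# Print each command instead of running it unless RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi; }

# Former component devices from the example; adjust for your system.
for device in /dev/sda /dev/sdb /dev/sdc; do
    run sudo mdadm --zero-superblock "$device"
done
```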
Next, make sure you remove or comment out any references to the array in the /etc/fstab file:
/etc/fstab
. . .
# /dev/md0 /mnt/md0 ext4 defaults,nofail,discard 0 0
Save and close the file when you are finished.
Remove or comment out any references to the array from the /etc/mdadm/mdadm.conf file as well:
/etc/mdadm/mdadm.conf
# ARRAY /dev/md0 metadata=1.2 name=mdadmwrite:0 UUID=bf7a711b:b3aa9440:40d2c12e:79824706
Save and close the file when you are finished.
Update the initramfs by typing:
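On Ubuntu this is done with update-initramfs. As a sketch, using the same print-unless-RUN=1 helper (our own addition) so the command can be reviewed first:

```shell
# Print each command instead of running it unless RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi; }

# -u updates the existing initramfs for the current kernel.
run sudo update-initramfs -u
```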
This should remove the device from the early boot environment.
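You can also trigger other actions on an md device by writing to its sync_action file (md2 in this example):
echo resync | sudo tee /sys/block/md2/md/sync_action # resume an interrupted resync
echo repair | sudo tee /sys/block/md2/md/sync_action # check the array and repair any mismatches
echo check | sudo tee /sys/block/md2/md/sync_action # check the array without repairing
The initial status of /sys/block/md2/md/sync_action is idle.
To list disks and partitions with their sizes, filesystem types, and mount points:
lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT
To view SMART details for a disk, install smartmontools and query the device:
sudo apt-get install smartmontools
sudo smartctl -a /dev/sda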
Conclusion
Linux’s mdadm utility makes it fairly easy to manage
arrays once you understand the conventions it uses and the places where
you can look for information. This guide is by no means exhaustive, but
serves to introduce some of the management tasks that you might need to
perform on a day-to-day basis.
Once you’re comfortable creating and managing RAID arrays with mdadm,
there are a number of different directions you can explore next.
Volume management layers like LVM integrate tightly with RAID and allow
you to flexibly partition space into logical volumes. Similarly, LUKS
and dm-crypt encryption is commonly used to encrypt the RAID devices
prior to writing the filesystem. Linux allows all of these technologies
to be used together to enhance your storage capabilities.