mdadm (Linux Software RAID) Quick Reference

Last updated on June 8, 2010. Please send any corrections to Bryan Smith <bryanesmith at gmail.com>.

Description

A printable quick reference for using the mdadm utility to create and manage software RAID on Linux.

Warning

This is a reference guide, not an introduction to mdadm or any of the other utilities mentioned in this document.

Note that you should not attempt the commands in this guide unless you are permitted to do so and are already familiar with them. Read tutorials on these tools first to ensure you understand them.

As always, back up all system files before editing them.

Disclaimer

The author assumes no responsibility for the accuracy of the guide or any consequences of using it.

Navigation

  1. Create array
  2. Start and stop array
  3. Troubleshooting
  4. Replacing disk
  5. Adding spare disk
  6. Monitoring
  7. Destroying an array
  8. Miscellaneous
  9. Topics not covered

1. Create array

This assumes that no other users are logged in to the system and that the disks used in the RAID are not mounted.
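
For example, a quick pre-check might look like the following (a minimal sketch; the device names are placeholders for your own disks):

who                     # confirm no other users are logged in
mount | grep sda        # see whether any partition on the disk is mounted
umount /dev/sda1        # unmount it if necessary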

Step 1: Create Linux raid autodetect partitions on disks

For each disk, create a partition with system type "Linux raid autodetect". Assuming the first disk is /dev/sda:

# fdisk /dev/sda
(... intro message from fdisk)
Command (m for help): n
(... create partition. Assume it is partition number 1)
Command (m for help): t
Partition number (1-5): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): w
(... commits changes and exits)

Repeat for each participating disk in the array.
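
If the disks are identical in size, one common shortcut (a hedged sketch, not part of the original steps; adjust device names for your system) is to copy the partition table from the first disk to each of the others with sfdisk:

# sfdisk -d /dev/sda | sfdisk /dev/sdb
# sfdisk -d /dev/sda | sfdisk /dev/sdc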

Step 2: Create array

Choose your RAID level (0, 1, 5, 6, etc.) and the participating partitions, and decide whether you want to start with a spare disk. (Spares can be added to the array later; see Section 5: Adding spare disk.) For example, to create a level 6 array on device /dev/md0 using five partitions {/dev/sda1, /dev/sdb1, /dev/sdc1, /dev/sdd1, /dev/sde1} as RAID devices:

# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

Step 3: Wait for RAID array to synchronize

Note that RAID arrays without redundancy (i.e., RAID 0, or striping) do not need to synchronize. The size and type of the array, along with system load and hardware, will affect how long synchronization takes.

Look at /proc/mdstat to determine whether the array is still synchronizing. E.g., here's an example with two RAID 6 arrays: md0 is still synchronizing while md1 has completed:

root@wolverine:/etc/mdadm# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sde1[4] sdh1[8] sdc1[2] sdb1[1] sdd1[3] sdq1[9](S) sdg1[6] sdf1[5] sda1[0]
      5860559616 blocks level 6, 64k chunk, algorithm 2 [8/7] [UUUUUUU_]
      [===================>.]  recovery = 99.5% (972091272/976759936) finish=3.1min speed=24989K/sec
md1 : active raid6 sdp1[6] sdm1[4] sdk1[2] sdi1[0] sdl1[3] sdn1[5] sdj1[1] sdr1[7]
      5860559616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>
root@wolverine:/etc/mdadm#

Here's the same two arrays after both have completed:

root@wolverine:/etc/mdadm# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sde1[4] sdh1[7] sdc1[2] sdb1[1] sdd1[3] sdq1[8](S) sdg1[6] sdf1[5] sda1[0]
      5860559616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
md1 : active raid6 sdp1[6] sdm1[4] sdk1[2] sdi1[0] sdl1[3] sdn1[5] sdj1[1] sdr1[7]
      5860559616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>
root@wolverine:/etc/mdadm#

You should be able to use the array before synchronization completes, but with the additional CPU load of synchronization and without full redundancy until it finishes.
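
If the synchronization load interferes with normal use of the system, the kernel's md sync speed limits can be inspected and adjusted (a hedged sketch; these proc files are standard for Linux md, but verify on your kernel; values are in KB/s per device):

cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
echo 10000 > /proc/sys/dev/raid/speed_limit_max    # cap resync speed at roughly 10 MB/s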

Step 4: Create filesystem and mount

Add a filesystem to the array. E.g., to create an ext3 filesystem on array md0:

# mkfs.ext3 /dev/md0
[... output while creating filesystem ...]

Create a mount point and mount the disk:

# mkdir /mnt/raid0
# mount /dev/md0 /mnt/raid0

Back up fstab (usually at /etc/fstab) and add the new array(s) so that they mount at startup. E.g., to automount the previously mounted array /dev/md0 at /mnt/raid0, add the following line to fstab:

/dev/md0 /mnt/raid0 ext3 defaults 1 2
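
To verify the new entry without rebooting, one option (a small sketch, assuming the array is not otherwise in use) is to unmount the array and remount it via fstab:

# umount /mnt/raid0
# mount /mnt/raid0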

Step 5: Store configuration in mdadm.conf

If the system is properly configured (which should be the default), it will scan partitions for RAID information at startup. For maintenance purposes, it is a good idea to store the configuration in a configuration file.

Before storing the configuration, locate the configuration file mdadm.conf. It might be at /etc/mdadm.conf or /etc/mdadm/mdadm.conf, or elsewhere. (Its location can also be changed.)

Back up the file before making any changes.
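
For example, a quick way to find and back up the file (a minimal sketch; the paths shown are the common defaults, not guaranteed):

ls /etc/mdadm.conf /etc/mdadm/mdadm.conf 2>/dev/null
cp /etc/mdadm.conf /etc/mdadm.conf.bak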

To output configuration of all arrays to file (assuming at /etc/mdadm.conf):

# mdadm --detail --scan --verbose >> /etc/mdadm.conf

Note that mdadm.conf can be used to limit startup to certain RAID devices, limit scanning to certain partitions, configure email notifications, and specify programs/scripts to invoke on RAID events (such as disk failures). Although it is not strictly necessary, it makes a useful reference and is worthwhile when monitoring an array. (See Section 6: Monitoring.)
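
For reference, a hypothetical mdadm.conf might look something like the following (the UUID is a placeholder, and the mail address and script path are the examples used later in this guide):

DEVICE partitions
ARRAY /dev/md0 level=raid6 num-devices=5 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
MAILADDR pooh@hundredacres.com
PROGRAM /path/to/your/script.sh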

2. Start and stop array

To stop the array at /dev/md0 (unmount any filesystem on it first):

mdadm -S /dev/md0

To start (assemble) a stopped array at /dev/md0 using specified disks /dev/sda1 /dev/sdb1 /dev/sdc1:

mdadm -A /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1

If you wish to start (assemble) all stopped arrays in mdadm.conf:

mdadm -As

3. Troubleshooting

The first place to check is /proc/mdstat. If every disk is online and okay, you will see something like [8/8], indicating that all 8 disks are available, along with [UUUUUUUU]. (The number depends on the total number of disks.) For example, the following contents of /proc/mdstat indicate that the md0 and md1 arrays are fine:

root@wolverine:/etc/mdadm# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sde1[4] sdh1[7] sdc1[2] sdb1[1] sdd1[3] sdq1[8](S) sdg1[6] sdf1[5] sda1[0]
      5860559616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
md1 : active raid6 sdp1[6] sdm1[4] sdk1[2] sdi1[0] sdl1[3] sdn1[5] sdj1[1] sdr1[7]
      5860559616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>
root@wolverine:/etc/mdadm#

However, if you see something like [8/7], then a disk is offline. Additionally, you should notice a _ representing a failed disk, e.g., [UUUUUUU_]. If a disk has actually failed, you should see an (F) next to the device name (e.g., sdb1[3](F)), as opposed to an (S) (e.g., sdb1[3](S)), which indicates a spare disk.

Before removing a failed disk, make sure your array isn't resynchronizing, as the disk might become available after resynchronization is complete. See Section 4: Replacing disk for proper removal and replacement of a disk.
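
To keep an eye on resynchronization progress, one convenient option (a small sketch) is to refresh /proc/mdstat automatically every few seconds:

watch -n 5 cat /proc/mdstat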

To get more information on an array /dev/md0, including the state of the array and its member partitions:

mdadm --detail /dev/md0

To get information on a particular partition /dev/sdc1:

mdadm --examine /dev/sdc1

You may want to check dmesg for I/O errors if you are still having issues.
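
For example (a minimal sketch; adjust or extend the pattern to match your disk names):

dmesg | grep -i error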

4. Replacing disk

After identifying the failed disk (see Section 3: Troubleshooting to identify the failed disk using /proc/mdstat), you need to flag the device as failed. E.g., if /dev/sdb1 on array /dev/md0:

mdadm --manage /dev/md0 --fail /dev/sdb1

After flagging it as failed, then remove it from the array:

mdadm --manage /dev/md0 --remove /dev/sdb1

Replace the disk (with the machine powered off), then partition it and set the appropriate system type. (See Section 1: Create array for more information.) Then add the partition to the array:

mdadm --manage /dev/md0 --add /dev/sdb1

If the RAID includes redundancy, you will need to wait for the array to resynchronize. Check /proc/mdstat for progress on resynchronization.

5. Adding spare disk

You can either specify a spare disk when creating an array or add one later.

To add a spare when creating an array, just specify the number of spare devices (--spare-devices), and make sure you list enough devices to cover the RAID devices plus the spare devices. For example, here we'll create a RAID 6 array with six disks plus one spare, for a total of seven devices:

# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=6 --spare-devices=1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1

You can add a spare to an array, even if it is running:

mdadm /dev/md0 --add /dev/sdg1

You can verify that the device is a spare in /proc/mdstat.

6. Monitoring

In mdadm.conf, you can add a line to specify a script to execute on an event (e.g., a disk failure):

PROGRAM /path/to/your/script.sh

The first parameter is the event and the second is the device. Here's a simple shell script that echoes the event, which could be redirected to a log or sent as an email:

#!/bin/sh
echo "mdadm event $1 occurred on device $2"

However, you must daemonize mdadm to monitor for failures. Run:

mdadm --monitor --scan --daemonize

If you want mdadm to daemonize automatically at startup so monitoring is always running, do the following (the full sequence is sketched after this list):

  1. Save the command in a shell script:
     #!/bin/sh
     mdadm --monitor --scan --daemonize
  2. Put that shell script in the /etc/init.d/ directory
  3. Say it is called mdadm-daemon.sh. Then run: update-rc.d mdadm-daemon.sh defaults
  4. Make sure it is executable: chmod +x /etc/init.d/mdadm-daemon.sh
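
For reference, here is a minimal sketch of the whole sequence (assuming a Debian-style system where update-rc.d is available; mdadm-daemon.sh is just the example name used above):

cat > /etc/init.d/mdadm-daemon.sh << 'EOF'
#!/bin/sh
mdadm --monitor --scan --daemonize
EOF
chmod +x /etc/init.d/mdadm-daemon.sh
update-rc.d mdadm-daemon.sh defaults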

If you have an email application like sendmail properly configured, you can add a line to mdadm.conf to send email notices on events:

MAILADDR pooh@hundredacres.com

You should test your machines regularly to make sure that you are getting email when events occur, much like you should regularly test a smoke detector. To test, run:

mdadm --monitor --scan --test

7. Destroying an array

To destroy an array, you first need to stop the array if it is running. (See Section 2: Start and stop array.)

After the array is stopped, you will want to remove the array entry from mdadm.conf. (See Section 1, Step 5: Store configuration in mdadm.conf.)

Next, remove the array. Assuming it is /dev/md0:

mdadm --remove /dev/md0

(I am not sure what this last command does, and the mdadm documentation says this is for removing member devices from an array. I included this step from: http://www.ducea.com/2009/03/08/mdadm-cheat-sheet/.)

Additionally, you should overwrite the superblocks on the disks, which are used to identify arrays at startup (if mdadm is configured to do so). For example, here we overwrite the superblocks on the first partition of five disks:

mdadm --zero-superblock /dev/sd{a,b,c,d,e}1
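
To confirm that a superblock is gone, you can re-run --examine on a member partition; it should report that no md superblock was found (a quick sketch):

mdadm --examine /dev/sda1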

8. Miscellaneous

Recommended resources

9. Topics not covered