md_monitor(8)                                                    md_monitor(8)
NAME
md_monitor - MD device monitor
SYNOPSIS
md_monitor [-d|--daemonize] [-f file|--logfile=file] [-s|--syslog]
[-e num|--expires=num] [-l num|--process-limit=num] [-m|--fail-mirror]
[-o|--fail-disk] [-r num|--retries=num] [-p prio|--log-priority=prio]
[-t secs|--check-timeout=secs] [-v|--verbose] [-y|--check-in-sync]
[-c cmd|--command=cmd] [-V|--version] [-h|--help]
DESCRIPTION
md_monitor monitors the component devices of each MD array for I/O
issues. On each status change it updates the monitored MD arrays,
setting failed devices to 'faulty' or re-integrating working devices.
OPTIONS
md_monitor recognizes the following command-line options:
-d, --daemonize
Start md_monitor in the background.
-f file, --logfile=file
Write logging information into file instead of stdout
-s, --syslog
Write logging information to syslog.
-e num, --expires=num
Set failfast_expires to num.
-l num, --process-limit=num
Set maximum number of processes (RLIMIT_NPROC, see getrlimit(2))
to num.
-m, --fail-mirror
Fail and reset the entire mirror half when one device fails.
This is the default.
-o, --fail-disk
Only fail the affected disk when one device fails. This is the
opposite of --fail-mirror.
-r num, --retries=num
Set failfast_retries to num.
-p prio, --log-priority=prio
Set logging priority to prio.
-t secs, --check-timeout=secs
Run path checker every secs seconds. Default is 1.
-v, --verbose
Increase logging priority.
-y, --check-in-sync
Run path checkers for 'in_sync' devices. Without this option
path checkers will be stopped whenever a device is detected to
be 'in_sync'. They will be re-started once a device has been
marked as 'faulty' or 'timeout'.
-c cmd, --command=cmd
Send command cmd to daemon.
-h, --help
Display md_monitor usage information.
-V, --version
Display md_monitor version information.
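A minimal sketch of starting the daemon with the options described
above. The numeric values and the function name start_md_monitor are
illustrative assumptions, not recommended settings; only the option
names come from this page.

```shell
# Illustrative values only; tune them for the storage in use.
EXPIRES=5   # assumed value for failfast_expires (seconds)
RETRIES=3   # assumed value for failfast_retries
CHECK=1     # path checker interval in seconds (the documented default)

# start_md_monitor: hypothetical wrapper that starts md_monitor in the
# background, logging to syslog and failing whole mirror halves.
start_md_monitor() {
    md_monitor --daemonize --syslog \
        --expires="$EXPIRES" --retries="$RETRIES" \
        --check-timeout="$CHECK" --fail-mirror
}
```

Calling start_md_monitor runs the daemon; defining the function alone
has no side effects.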
MD_MONITOR COMMAND MODE
When --command is specified, the md_monitor program connects to an
already running md_monitor program and sends a pre-defined command. The
command has the following syntax:
cmd:md@dev
The following values for cmd are recognised. Unless specified
otherwise, md must be the device node of an existing MD array.
Shutdown
Shut down md_monitor; the md argument should be /dev/console.
RebuildStarted
Rebuild has started on array md.
RebuildFinished
Rebuild has finished on array md.
DeviceDisappeared
MD array has been stopped; md_monitor will stop monitoring the
component devices for that array.
Fail
MD detected a failure on the component device dev of array md.
md_monitor will re-check the device every failfast_expires
seconds.
Remove
The component device dev has been removed from the MD array md.
md_monitor will stop monitoring this device.
SpareActive
MD has integrated the device dev into array md. md_monitor will
restart monitoring of this device, checking it every
failfast_expires seconds. The check interval is increased after
each successful check, up to a maximum of failfast_expires *
failfast_retries seconds.
ArrayStatus
Return the current internal status of the monitored devices.
MirrorStatus
Return the status of the MD component devices in abbreviated
form. Each character represents the status of the MD component
device at that position. For the possible states see DEVICE
STATUS DISPLAY below.
MonitorStatus
Return the current internal status of the monitored devices in
abbreviated form. Each character represents the internal status
of the monitored device at that position.
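The cmd:md@dev syntax can be assembled with a small helper. This is a
sketch only: the function names md_command and send_md_command are
hypothetical. When no component device is given, the part after '@' is
left empty, which is the string the md_script shown under MDADM
INTEGRATION produces when its third argument is missing.

```shell
# md_command CMD MD [DEV]: print the command string in cmd:md@dev form.
# DEV is optional; without it the part after '@' stays empty.
md_command() {
    printf '%s:%s@%s\n' "$1" "$2" "${3:-}"
}

# send_md_command CMD MD [DEV]: pass the string to a running daemon.
send_md_command() {
    md_monitor --command="$(md_command "$@")"
}
```

For example, md_command Fail /dev/md0 /dev/sda1 prints
Fail:/dev/md0@/dev/sda1, where /dev/md0 and /dev/sda1 are placeholder
device names.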
DEVICE STATUS DISPLAY
md_monitor displays state information about the monitored devices when
the CLI command MirrorStatus or MonitorStatus is sent. Each character
of the returned string represents the state of the device at that
position.
The possible states for MirrorStatus are:
. Unknown
A In_Sync
W Faulty
U Ready
S Spare
- Removed
R Recovery pending
P Removal pending
R and P are intermediate states, which are set by md_monitor whenever a
command has been sent to mdadm, but no notification has been received
yet. U is set when the device is found working, but --fail-mirror is
set and not all devices within the mirror side are found to be working.
The possible states for MonitorStatus are:
. Unknown
X Internal error
A I/O ok
W I/O failed
R I/O pending
T I/O timeout
R and T describe the same condition, i.e. the I/O has stalled. The
state switches from R to T once the timeout of failfast_expires *
failfast_retries seconds has expired.
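As an illustration, a MirrorStatus string can be expanded into state
names using the table above. The function names are hypothetical, the
zero-based position numbering is an assumption, and the 'Unrecognised'
fallback is not a state reported by md_monitor.

```shell
# mirror_state CHAR: print the MirrorStatus state name for one character.
mirror_state() {
    case "$1" in
        .) echo "Unknown" ;;
        A) echo "In_Sync" ;;
        W) echo "Faulty" ;;
        U) echo "Ready" ;;
        S) echo "Spare" ;;
        -) echo "Removed" ;;
        R) echo "Recovery pending" ;;
        P) echo "Removal pending" ;;
        *) echo "Unrecognised" ;;   # fallback added for this sketch
    esac
}

# decode_mirror_status STRING: print one "position: state" line per
# character. The body runs in a subshell so its variables stay local.
decode_mirror_status() (
    status=$1
    pos=0
    while [ -n "$status" ]; do
        rest=${status#?}            # everything after the first character
        char=${status%"$rest"}      # the first character itself
        printf '%s: %s\n' "$pos" "$(mirror_state "$char")"
        status=$rest
        pos=$((pos + 1))
    done
)
```

A MonitorStatus decoder would have the same shape with the second
table substituted in the case statement.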
THEORY OF OPERATION
md_monitor sets up a path checker thread for each MD component device.
Every check-timeout seconds this path checker issues an asynchronous
I/O request to the device. It then waits up to failfast_expires *
failfast_retries seconds for this I/O to complete. If no response has
been received during that time, the monitor status for this path is set
to 'I/O timeout'. If the I/O completed, the monitor status for this path
is set to 'I/O ok' or 'I/O failed', depending on whether the I/O
completed without error. If the path checker was interrupted while
waiting, the monitor status for this path is set to 'I/O pending'.
After the monitor status has been updated, the path checker thread
updates the MD status for this device and invokes an action depending
on these two states. If --check-in-sync has been specified, the path
checker continues to run even for 'in_sync' paths. Otherwise the path
checker is stopped when a path is marked as 'in_sync'. Path checkers
are restarted whenever a device is marked as 'faulty' or 'timeout'.
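The status decision described above can be sketched as a small
function. This is one reading of the text, not the daemon's actual
code: the function name, the outcome keywords (ok, error, interrupted,
none) and the mapping of unrecognised input to '.' are assumptions made
for illustration.

```shell
# monitor_status OUTCOME WAITED EXPIRES RETRIES
# Print the MonitorStatus character for one path check.
#   OUTCOME  ok | error | interrupted | none (no response yet)
#   WAITED   seconds spent waiting for the I/O so far
# The body runs in a subshell so its variables stay local.
monitor_status() (
    outcome=$1 waited=$2 expires=$3 retries=$4
    timeout=$((expires * retries))   # maximum wait in seconds
    case "$outcome" in
        ok)          echo A ;;       # I/O ok
        error)       echo W ;;       # I/O failed
        interrupted) echo R ;;       # I/O pending
        none)
            if [ "$waited" -ge "$timeout" ]; then
                echo T               # I/O timeout
            else
                echo R               # still within the timeout
            fi ;;
        *)           echo . ;;       # Unknown (assumed fallback)
    esac
)
```

With the assumed values failfast_expires=5 and failfast_retries=3, a
request that has had no response after 15 seconds is reported as T,
while one that has waited only 4 seconds is still R.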
MDADM INTEGRATION
md_monitor listens to udev events for any device changes. It is
designed to integrate into MD via the --monitor functionality of mdadm.
To use this function mdadm needs to be started with
mdadm --monitor --scan --program=md_script
where md_script is a bash script containing, e.g.:
#!/bin/bash
# MD monitor script
#
EVENT=$1
MD=$2
DEV=$3
/sbin/md_monitor -c "${EVENT}:${MD}@${DEV}"
A default md_script is installed at
/usr/share/misc/md_notify_device.sh.
It is recommended to use an /etc/mdadm.conf configuration file
when using md_monitor to monitor MD arrays.
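A possible /etc/mdadm.conf fragment is sketched below. The array name
and UUID are placeholders that must be replaced with real values.
DEVICE, ARRAY and PROGRAM are standard mdadm.conf keywords; with a
PROGRAM line present, mdadm --monitor --scan can take the script from
the configuration file instead of the --program option.

```
# /etc/mdadm.conf (fragment, placeholder values)
DEVICE partitions
# One ARRAY line per monitored array; replace the placeholder UUID.
ARRAY /dev/md0 UUID=<array-uuid>
# Script run by mdadm --monitor for each event.
PROGRAM /usr/share/misc/md_notify_device.sh
```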
FILES
/usr/share/misc/md_notify_device.sh
Default md_monitor script.
/etc/mdadm.conf
MD configuration file.
SEE ALSO
mdadm(8), mdadm.conf(5)
Wed Mar 14 2012 md_monitor(8)