CCISS_VOL_STATUS(8)                                        CCISS_VOL_STATUS(8)

NAME
cciss_vol_status - show status of logical drives attached to HP Smart
Array controllers

SYNOPSIS
cciss_vol_status [OPTION] [DEVICE]...

DESCRIPTION
Shows the status of logical drives configured on HP Smart Array
controllers.

OPTIONS
-p, --persnickety
       Without this option, device nodes which can't be opened, or
       which are not found to be of the correct device type, are
       silently ignored. This lets you use wildcards, e.g.:
       cciss_vol_status /dev/sg* /dev/cciss/c*d0, and the program will
       not complain as long as all devices which are found to be of the
       correct type are found to be ok. However, you may wish to
       explicitly list the devices you expect to be there, and be
       notified if they are not there (e.g. perhaps a PCI slot has died,
       and the system has rebooted, so that what was once
       /dev/cciss/c1d0 is no longer there at all). This option will
       cause the program to complain about any device node listed which
       does not appear to be the right device type, or is not openable.

-C, --copyright
       If stderr is a terminal, print out a copyright message, and
       exit.

-q, --quiet
       This option doesn't do anything. Previously, without this
       option and if stderr was a terminal, a copyright message
       preceded the normal program output. Now, the copyright message
       is only printed via the -C option.

-s     Query each physical drive for S.M.A.R.T. data and report any
       drives in "predictive failure" state.

-u, --try-unknown-devices
       If a device has an unrecognized board ID, normally the program
       will not attempt to communicate with it. In case you have some
       Smart Array controller which is newer than this program, the
       program may not recognize it. This option permits the program
       to attempt to interrogate the board even if it is unrecognized
       on the assumption that it is in fact a Smart Array of some kind.

-v, --version
       Print the version number and exit.

-V, --verbose
       Print out more information about the controllers and physical
       drives. For each controller, the board ID, number of logical
       drives, currently running firmware revision and ROM firmware
       revision are printed. For each physical drive, the location,
       vendor, model, serial number, and firmware revision are printed.

-x, --exhaustive
       Deprecated. Previously, it "exhaustively" searched for logical
       drives because, under some circumstances, some logical drives
       might otherwise be missed. This option no longer does anything,
       as the algorithm for finding logical drives was changed to
       obviate the need for it.

DEVICE
The DEVICE argument indicates which RAID controller is to be queried.
Note that it indicates which RAID controller, not which logical drive.

For the cciss driver, the "d0" nodes matching "/dev/cciss/c*d0" are the
nodes which correspond to the RAID controllers. (See note 1, below.)
It is not necessary to invoke cciss_vol_status on each logical drive
individually, though if you do this, each time it will report the
status of ALL logical drives on the controller.

For the hpsa driver, or for fibre attached MSA1000 family devices, or
for the hpahcisr software RAID driver which emulates Smart Arrays, the
RAID controller is accessed via the scsi generic driver, and the device
nodes will match "/dev/sg*". Some variants of the "lsscsi" tool will
easily identify which device node corresponds to the RAID controller.
Some variants may only report the SCSI nexus (controller/bus/target/lun
tuple). Some distros may not have the lsscsi tool.

Executing the following query to the /sys filesystem and correlating
this with the contents of /proc/scsi/scsi or output of lsscsi can help
in finding the right /dev/sg node to use with cciss_vol_status:

wumpus:/home/scameron # ls -l /sys/class/scsi_generic/*
lrwxrwxrwx 1 root root 0 2009-11-18 12:31 /sys/class/scsi_generic/sg0 -> ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/0000:03:03.0/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
lrwxrwxrwx 1 root root 0 2009-11-18 12:31 /sys/class/scsi_generic/sg1 -> ../../devices/pci0000:00/0000:00:1f.1/host2/target2:0:0/2:0:0:0/scsi_generic/sg1
lrwxrwxrwx 1 root root 0 2009-11-19 07:47 /sys/class/scsi_generic/sg2 -> ../../devices/pci0000:00/0000:00:05.0/0000:0e:00.0/host4/target4:3:0/4:3:0:0/scsi_generic/sg2
wumpus:/home/scameron # cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: COMPAQ Model: BD03685A24 Rev: HPB6
  Type: Direct-Access ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: SAMSUNG Model: CD-ROM SC-148A Rev: B408
  Type: CD-ROM ANSI SCSI revision: 05
Host: scsi4 Channel: 03 Id: 00 Lun: 00
  Vendor: HP Model: P800 Rev: 6.82
  Type: RAID ANSI SCSI revision: 00
wumpus:/home/scameron # lsscsi
[0:0:0:0] disk COMPAQ BD03685A24 HPB6 /dev/sda
[2:0:0:0] cd/dvd SAMSUNG CD-ROM SC-148A B408 /dev/sr0
[4:3:0:0] storage HP P800 6.82 -

From the above you can see that /dev/sg2 corresponds to SCSI nexus
4:3:0:0, which corresponds to the HP P800 RAID controller listed in
/proc/scsi/scsi.

EXAMPLE
[root@somehost]# cciss_vol_status -q /dev/cciss/c*d0
/dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.
/dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 1 status: OK.
/dev/cciss/c0d0: (Smart Array P800) RAID 1 Volume 2 status: OK.
/dev/cciss/c0d0: (Smart Array P800) RAID 5 Volume 4 status: OK.
/dev/cciss/c0d0: (Smart Array P800) RAID 5 Volume 5 status: OK.
/dev/cciss/c0d0: (Smart Array P800) Enclosure MSA60 (S/N: USP6340B3F) on Bus 2, Physical Port 1E status: Power Supply Unit failed
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 0 status: OK.
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 1 status: OK.
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 2 status: OK.
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 3 status: OK.
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 4 status: OK.
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 5 status: OK.
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 6 status: OK.
/dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 7 status: OK.

[root@someotherhost]# cciss_vol_status -q /dev/sg0 /dev/cciss/c*d0
/dev/sg0: (MSA1000) RAID 1 Volume 0 status: OK. At least one spare drive.
/dev/sg0: (MSA1000) RAID 5 Volume 1 status: OK.
/dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.

[root@localhost]# ./cciss_vol_status -s /dev/sg1
/dev/sda: (Smart Array P410i) RAID 0 Volume 0 status: OK.
connector 1I box 1 bay 1 HP DG072A9BB7 B365P6803PCP0633 HPD0 S.M.A.R.T. predictive failure.
[root@localhost]# echo $?
1

[root@localhost]# ./cciss_vol_status -s /dev/cciss/c0d0
/dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.
connector 2E box 1 bay 8 HP DF300BB6C3 3LM08AP700009713RXUT HPD3 S.M.A.R.T. predictive failure.
/dev/cciss/c0d0: (Smart Array P800) Enclosure MSA60 (S/N: USP6340B3F) on Bus 2, Physical Port 2E status: OK.

[root@localhost cciss_vol_status]# ./cciss_vol_status --verbose /dev/sg0
Controller: Smart Array P420i
Board ID: 0x3354103c
Logical drives: 1
Running firmware: 3.42
ROM firmware: 3.42
/dev/sda: (Smart Array P420i) RAID 1 Volume 0 status: OK.
Physical drives: 2
connector 1I box 2 bay 1 HP EG1200FCVBQ KZG21NVD HPD1 OK
connector 2I box 2 bay 5 HP EG1200FCVBQ KZG20X7D HPD1 OK
/dev/sg0(Smart Array P420i:0): Non-Volatile Cache status:
Cache configured: Yes
Read cache memory: 81 MiB
Write cache memory: 735 MiB
Write cache enabled: Yes
Flash backed cache present

DIAGNOSTICS
Normally, a logical drive in good working order should report a status
of "OK." Possible status values are:

"OK." (0) - The logical drive is in good working order.

"FAILED." (1) - The logical drive has failed, and no i/o to it is
       possible.
       Additionally, failed drives will be identified by connector, box
       and bay, as well as vendor, model, serial number, and firmware
       revision.

"Using interim recovery mode." (3) - One or more drives have failed,
       but not so many that the logical drive can no longer operate.
       The failed drives should be replaced as soon as possible.

"Ready for recovery operation." (4) - Failed drive(s) have been
       replaced, and the controller is about to begin rebuilding
       redundant parity data.

"Currently recovering." (5) - Failed drive(s) have been replaced,
       and the controller is currently rebuilding redundant parity
       information.

"Wrong physical drive was replaced." (6) - A drive has failed, and
       another (working) drive was replaced.

"A physical drive is not properly connected." (7) - There is some
       cabling or backplane problem in the drive enclosure.

       (From fwspecwww.doc, see cpqarray project on sourceforge.net):
       Note: If the unit_status value is 6 (Wrong physical drive was
       replaced) or 7 (A physical drive is not properly connected), the
       unit_status of all other configured logical drives will be
       marked as 1 (Logical drive failed). This is to force the user to
       correct the problem and to insure that once the problem is
       corrected, the data will not have been corrupted by any user
       action.

"Hardware is overheating." (8) - Hardware is too hot.

"Hardware was overheated." (9) - At some point in the past,
       the hardware got too hot.

"Currently expannding." (10) - The controller is currently in the
       process of expanding a logical drive.

"Not yet available." (11) - The logical drive is not yet finished
       being configured.

"Queued for expansion." (12) - The logical drive will be expanded
       when the controller is able to begin working on it.

Additionally, the following messages may appear regarding spare drive
status:

"At least one spare drive designated"
"At least one spare drive activated and currently rebuilding"
"At least one activated on-line spare drive is completely rebuilt on this logical drive"
"At least one spare drive has failed"
"At least one spare drive activated"
"At least one spare drive remains available"
Active spares will be identified by connector, box and bay, as well
as by vendor, model, serial number, and firmware revision.

For each logical drive, the total number of failed physical drives, if
more than zero, will be reported as:

"Total of n failed physical drives detected on this logical drive."

with "n" replaced by the actual number, of course.

"Replacement" drives -- newly inserted drives that replace a previously
failed drive but are not yet finished rebuilding -- are also identified
by connector, box and bay, as well as by vendor, model, serial number,
and firmware revision.

If the -s option is specified, each physical drive will be queried for
S.M.A.R.T. data, and any drives in predictive failure state will be
reported, identified by connector, box and bay, as well as vendor,
model, serial number, and firmware revision.

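To act on these reports from a script, the drive location can be picked
out of each line. The sketch below is an illustration only: the function
name predictive_failure_bays is made up, and it assumes the line layout
seen in the -s examples in this page ("connector C box B bay N ...
S.M.A.R.T. predictive failure."), which other versions of the program
may not follow.

```shell
#!/bin/sh
# Read "cciss_vol_status -s" output on stdin and print the connector,
# box and bay of each drive reported in predictive failure state.
predictive_failure_bays() {
    awk '/S\.M\.A\.R\.T\. predictive failure/ && $1 == "connector" {
        # Fields: connector <C> box <B> bay <N> vendor model ...
        print $2, $4, $6
    }'
}
```

For example, "cciss_vol_status -s /dev/cciss/c0d0 | predictive_failure_bays"
would print one "connector box bay" triple per affected drive.
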
Additionally, failure conditions of disk enclosure fans, power supplies,
and temperature are reported as follows:

"Fan failed"
"Temperature problem"
"Door alert"
"Power Supply Unit failed"

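Since every logical drive and enclosure report carries a "status:"
field, the lines that need attention can be filtered out of a long
listing. This is a rough sketch under that assumption; the function name
not_ok_lines is made up for the example. It does not catch spare drive
messages appended to an "OK." status, nor the S.M.A.R.T. lines printed
by -s.

```shell
#!/bin/sh
# Read cciss_vol_status output on stdin and print only the lines whose
# status field is something other than "OK."
not_ok_lines() {
    awk '/status: / && !/status: OK\./'
}
```

Used as "cciss_vol_status /dev/cciss/c*d0 | not_ok_lines", this would
leave only the failed power supply line from the first example in this
page.
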
FILES
/dev/cciss/c*d0 (Smart Array PCI controllers using the cciss driver)
/dev/sg* (Fibre attached MSA1000 controllers and Smart Array
       controllers using the hpsa driver or hpahcisr software RAID
       driver)

EXIT CODES
0 - All configured logical drives queried have status of "OK."

1 - One or more configured logical drives queried have status other
       than "OK."

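The exit code makes the program easy to call from cron or a monitoring
script. The wrapper below is a sketch, not something shipped with
cciss_vol_status; the name report_if_not_ok is made up. It runs whatever
command it is given, stays silent when the command exits with 0, and
prints the captured output when the exit code is non-zero.

```shell
#!/bin/sh
# Run the command given as arguments. Print its output only if it exits
# with a non-zero status, and return that status to the caller.
report_if_not_ok() {
    status=0
    output=$("$@" 2>&1) || status=$?
    if [ "$status" -ne 0 ]; then
        printf '%s\n' "$output"
    fi
    return "$status"
}
```

A cron job could then call, for instance,
"report_if_not_ok cciss_vol_status /dev/cciss/c*d0 /dev/sg*" so that
mail is generated only when something is wrong.
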
BUGS
MSA500 G1 logical drive numbers may not be reported correctly.

I've seen enclosure serial numbers contain garbage.

Some Smart Arrays support more than 128 physical drives on a single
RAID controller. cciss_vol_status does not.

AUTHOR
Written by Stephen M. Cameron

REPORTING BUGS
Report bugs to <scameron@beardog.cce.hp.com>

COPYRIGHT
Copyright © 2007 Hewlett-Packard Development Company, L.P.
This is free software; see the source for copying conditions. There is
NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.

SEE ALSO
http://cciss.sourceforge.net

NOTE 1
The /dev/cciss/c*d0 device nodes of the cciss driver do double duty.
They serve as an access point to both the RAID controllers, and to the
first logical drive of each RAID controller. Notice that a
/dev/cciss/c*d0 node will be present for each controller even if no
logical drives are configured on that controller. It might be cleaner
if the driver had a special device node just for the controller,
instead of making these device nodes do double duty. It has been like
that since the 2.2 Linux kernel timeframe. At that time, device major
and minor nodes were statically allocated at compile time, and were in
short supply. Changing this behavior at this point would break lots of
userland programs.

cciss_vol_status (ccissutils)        May 2013        CCISS_VOL_STATUS(8)