th_define(1M)           System Administration Commands           th_define(1M)

NAME

       th_define - create fault injection test harness error specifications

SYNOPSIS

       th_define [-n name -i instance | -P path] [-a acc_types]
            [-r reg_number] [-l offset [length]]
            [-c count [failcount]] [-o operator [operand]]
            [-f acc_chk] [-w max_wait_period [report_interval]]

       or

       th_define [-n name -i instance | -P path]
            [-a log [acc_types] [-r reg_number] [-l offset [length]]]
            [-c count [failcount]] [-s collect_time] [-p policy]
            [-x flags] [-C comment_string]
            [-e fixup_script [args]]

       or

       th_define [-h]

DESCRIPTION

       The th_define utility provides an interface to the bus_ops fault injec‐
       tion bofi device driver for defining error injection specifications
       (referred to as errdefs). An errdef corresponds to a specification of
       how to corrupt a device driver's accesses to its hardware. The command
       line arguments determine the precise nature of the fault to be
       injected. If the supplied arguments define a consistent errdef, the
       th_define process will store the errdef with the bofi driver and sus‐
       pend itself until the criteria given by the errdef become satisfied (in
       practice, this will occur when the access counts go to zero).

       You use the th_manage(1M) command with the start option to activate the
       resulting errdef. Once the errdef is started, the bofi driver counts
       the hardware accesses that match it: accesses of the type specified in
       acc_types, made by instance number instance of the driver whose name is
       name (or by the driver instance specified by path), to the register set
       (or DMA handle) specified by reg_number, that lie within the range
       offset to offset + length from the beginning of the register set or DMA
       handle. After count matching accesses, the driver applies operator and
       operand to the next failcount matching accesses.

       If acc_types includes log, th_define runs in automatic test script gen‐
       eration mode, and a set of test scripts (written in the Korn shell) is
       created and placed in a sub-directory of the current directory with the
       name <driver>.test.<id> (for example, glm.test.978177106). A separate,
       executable script is generated for each access handle that matches the
       logging criteria. The log of accesses is placed at the top of each
       script as a record of the session. If the current directory is not
       writable, file output is written to standard output. The base name of
       each test file is the driver name, and the extension is a number that
       discriminates between different access handles. A control script (with
       the same name as the created test directory) is generated that will run
       all the test scripts sequentially.

       Executing the scripts will install, and then activate, the resulting
       error definitions. Error definitions are activated sequentially and the
       driver instance under test is taken offline and brought back online
       before each test (refer to the -e option for more information). By
       default, logging applies to all PIO accesses, all interrupts, and all
       DMA accesses to and from areas mapped for both reading and writing. You
       can constrain logging by specifying additional acc_types, reg_number,
       offset and length. Logging will continue for count matching accesses,
       with an optional time limit of collect_time seconds.

       Either the -n or -P option must be provided. All other options are
       optional. If an option (other than -a) is specified multiple times,
       only the final value for the option is used. If an option is not speci‐
       fied, its associated value is set to an appropriate default, which will
       provide maximal error coverage as described below.

OPTIONS

       The following options are available:

       -n name

           Specify the name of the driver to test. (String)

       -i instance

           Test only the specified driver instance (-1 matches all instances
           of driver). (Numeric)

       -P path

           Specify the full device path of the driver to test. (String)

       -r reg_number

           Test only the given register set or DMA handle (-1 matches all reg‐
           ister sets and DMA handles). (Numeric)

       -a acc_types

           Only the specified access types will be matched. Valid values for
           the acc_types argument are log, pio, pio_r, pio_w, dma, dma_r,
           dma_w and intr. Multiple access types, separated by spaces, can be
           specified. The default is to match all hardware accesses.

           If acc_types is set to log, logging will match all PIO accesses,
           interrupts and DMA accesses to and from areas mapped for both read‐
           ing and writing. log can be combined with other acc_types, in which
           case the matching condition for logging will be restricted to the
           specified additional acc_types. Note that dma_r will match only DMA
           handles mapped for reading only; dma_w will match only DMA handles
           mapped for writing only; dma will match only DMA handles mapped for
           both reading and writing.

       -l offset [length]

           Constrain the range of qualifying accesses. The offset and length
           arguments indicate that any access of the type specified with the
           -a option, to the register set or DMA handle specified with the -r
           option, must lie at least offset bytes into the register set or DMA
           handle and at most offset + length bytes into it. The default for
           offset is 0. The default for length is the maximum value that can be
           placed in an offset_t C data type (see types.h). Negative values
           are converted into unsigned quantities. Thus, th_define -l 0 -1 is
           maximal.
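
           The range test described for -l can be sketched as a small shell
           function. This is only an illustration of the arithmetic, not the
           bofi driver's code: the function name in_range is hypothetical,
           both bounds are assumed to be inclusive, and the conversion of
           negative values to unsigned quantities is not modelled.

```shell
#!/bin/sh
# in_range OFFSET LENGTH ADDR
# Succeeds when ADDR lies at least OFFSET bytes and at most
# OFFSET + LENGTH bytes into the register set or DMA handle.
# Treating both bounds as inclusive is an assumption of this sketch.
in_range() {
    offset=$(($1))
    length=$(($2))
    addr=$(($3))
    [ "$addr" -ge "$offset" ] && [ "$addr" -le $((offset + length)) ]
}

if in_range 0x8100 0x10 0x8108; then
    echo "0x8108 qualifies for -l 0x8100 0x10"
fi
```

           Decimal and hexadecimal arguments are both accepted because the
           shell evaluates them as arithmetic constants.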

       -c count [failcount]

           Wait for count number of matching accesses, then apply an operator
           and operand (see the -o option) to the next failcount number of
           matching accesses. If the access type (see the -a option) includes
           logging, the number of logged accesses is given by count + fail‐
           count - 1. The -1 is required because the last access coincides
           with the first faulting access.

           Note that access logging may be combined with error injection if
           failcount and operator are nonzero and if the access type includes
           logging and any of the other access types (pio, dma and intr). See
           the description of access types in the definition of the -a option,
           above.

           When the count and failcount fields reach zero, the status of the
           errdef is reported to standard output. When all active errdefs cre‐
           ated by the th_define process complete, the process exits. If
           acc_types includes log, count determines how many accesses to log.
           If count is not specified, a default value is used. If failcount is
           set in this mode, it will simply increase the number of accesses
           logged by a further failcount - 1.
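
           As a quick check of the count + failcount - 1 rule, the following
           sketch computes the number of logged accesses. The function name
           logged_accesses is hypothetical and both arguments are assumed to
           be supplied.

```shell
#!/bin/sh
# logged_accesses COUNT FAILCOUNT
# Prints count + failcount - 1, the number of accesses logged when
# acc_types includes log and both values are given.
logged_accesses() {
    echo $(($1 + $2 - 1))
}

logged_accesses 100 3    # prints 102
logged_accesses 100 1    # prints 100: a failcount of 1 adds nothing
```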

       -o operator [operand]

           For qualifying PIO read and write accesses, the value read from or
           written to the hardware is corrupted according to the value of
           operator:

           EQ     operand is returned to the driver.

           OR     operand is bitwise ORed with the real value.

           AND    operand is bitwise ANDed with the real value.

           XOR    operand is bitwise XORed with the real value.

           For PIO write accesses, the following operator is also allowed:

           NO     Simply ignore the driver's attempt to write to the hardware.

           Note that a driver performs PIO via the ddi_getX(), ddi_putX(),
           ddi_rep_getX() and ddi_rep_putX() routines (where X is 8, 16, 32 or
           64). Accesses made using ddi_getX() and ddi_putX() are treated as a
           single access, whereas an access made using the ddi_rep_*(9F) rou‐
           tines is broken down into its respective number of accesses, as
           given by the repcount parameter to these DDI calls. If the access
           is performed via a DMA handle, operator and operand are applied to
           every access that comprises the DMA request.

           If interference with interrupts has been requested, the operator
           may take any of the following values:

           DELAY    After count accesses (see the -c option), delay delivery
                    of the next failcount number of interrupts for operand
                    number of microseconds.

           LOSE     After count number of interrupts, fail to deliver the next
                    failcount number of real interrupts to the driver.

           EXTRA    After count number of interrupts, start delivering operand
                    number of extra interrupts for the next failcount number
                    of real interrupts.

           The default value for operand and operator is to corrupt the data
           access by flipping each bit (XOR with -1).
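
           The effect of the data corruption operators can be previewed with
           ordinary shell arithmetic. The sketch below is illustrative only:
           the function name corrupt is hypothetical, and the result is
           masked to 32 bits on the assumption that the access is a 32-bit
           one.

```shell
#!/bin/sh
# corrupt OPERATOR OPERAND VALUE
# Prints the value that results from applying OPERATOR and OPERAND to
# the real VALUE. Masking to 32 bits models a 32-bit access; that
# width is an assumption of this sketch.
corrupt() {
    op=$1
    operand=$(($2))
    value=$(($3))
    case $op in
        EQ)  result=$operand ;;
        OR)  result=$((value | operand)) ;;
        AND) result=$((value & operand)) ;;
        XOR) result=$((value ^ operand)) ;;
        *)   echo "unsupported operator: $op" >&2; return 1 ;;
    esac
    printf '0x%x\n' $((result & 0xffffffff))
}

corrupt OR 0x4 0x1           # prints 0x5
corrupt XOR -1 0x12345678    # default corruption, prints 0xedcba987
```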

       -f acc_chk

           If the acc_chk parameter is set to 1 or pio, then the driver's
           calls to ddi_check_acc_handle(9F) return DDI_FAILURE when the
           access count goes to 1. If the acc_chk parameter is set to 2 or
           dma, then the driver's calls to ddi_check_dma_handle(9F) return
           DDI_FAILURE when the access count goes to 1.

       -w max_wait_period [report_interval]

           Constrain the period for which an error definition will remain
           active. The option applies only to non-logging errdefs. If an error
           definition remains active for max_wait_period seconds, the test
           will be aborted. If report_interval is set to a nonzero value, the
           current status of the error definition is reported to standard out‐
           put every report_interval seconds. The default value of
           report_interval is zero. The status of the errdef is reported in a
           parsable format of eight fields separated by colon (:) characters.
           The first seven fields are integers and the last is a string
           enclosed in double quotes:

           ft:mt:ac:fc:chk:ec:s:"message"

           The fields are defined as follows:

           ft           The UTC time when the fault was injected.

           mt           The UTC time when the driver reported the fault.

           ac           The number of remaining non-faulting accesses.

           fc           The number of remaining faulting accesses.

           chk          The value of the acc_chk field of the errdef.

           ec           The number of fault reports issued by the driver
                        against this errdef (mt holds the time of the initial
                        report).

           s            The severity level reported by the driver.

           "message"    Textual reason why the driver has reported a fault.
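
           Because the status report is colon-separated, it can be split
           with the shell's read built-in. The sketch below is a minimal
           illustration: the function name parse_status and the sample
           report line are hypothetical, and the field values are invented.

```shell
#!/bin/sh
# parse_status LINE
# Splits one status report into its eight fields and prints a summary.
# The message is the last field, so colons inside it are preserved.
parse_status() {
    IFS=: read -r ft mt ac fc chk ec s msg <<EOF
$1
EOF
    msg=${msg#\"}    # strip the enclosing double quotes
    msg=${msg%\"}
    echo "remaining=$ac faulting=$fc reports=$ec severity=$s message=$msg"
}

# Hypothetical report line; the values are invented for illustration.
parse_status '978177106:978177107:0:0:1:1:2:"PIO check failed: bad status"'
```

           The same split could be applied to each line that th_define
           writes to standard output when -w is given a report_interval.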

       -h

           Display the command usage string.

       -s collect_time

           If acc_types is given with the -a option and includes log, the
           errdef will log accesses for collect_time seconds (the default is
           to log until the log becomes full). Note that, if the errdef speci‐
           fication matches multiple driver handles, multiple logging errdefs
           are registered with the bofi driver and logging terminates when all
           logs become full or when collect_time expires or when the associ‐
           ated errdefs are cleared. The current state of the log can be
           checked with the th_manage(1M) command, using the broadcast parame‐
           ter. A log can be terminated by running th_manage(1M) with the
           clear_errdefs option or by sending a SIGALRM signal to the
           th_define process. See alarm(2) for the semantics of SIGALRM.
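
           Sending SIGALRM with kill(1) might look like the sketch below. So
           that it runs on a system without the test harness, a background
           sleep process stands in for the logging th_define process; that
           stand-in and the function name stop_logging are assumptions of
           the example.

```shell
#!/bin/sh
# stop_logging PID
# Sends SIGALRM to the given process, as would be done for a th_define
# process that is collecting a log.
stop_logging() {
    kill -s ALRM "$1"
}

# Stand-in for a logging th_define process (an assumption of this
# sketch; substitute the real process ID when using the harness).
sleep 30 &
pid=$!
stop_logging "$pid"

status=0
wait "$pid" || status=$?
echo "process $pid ended with status $status"
```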

       -p policy

           Applicable when the acc_types option includes log. The parameter
           modifies the policy used for converting from logged accesses to
           errdefs. All policies are inclusive:

               o      Use rare to bias error definitions toward rare accesses
                      (default).

               o      Use operator to produce a separate error definition for
                      each operator type (default).

               o      Use common to bias error definitions toward common
                      accesses.

               o      Use median to bias error definitions toward median
                      accesses.

               o      Use maximal to produce multiple error definitions for
                      duplicate accesses.

               o      Use unbiased to create unbiased error definitions.

               o      Use onebyte, twobyte, fourbyte, or eightbyte to select
                      errdefs corresponding to 1, 2, 4 or 8 byte accesses (if
                      chosen, the -xr option is enforced in order to ensure
                      that ddi_rep_*() calls are decomposed into multiple sin‐
                      gle accesses).

               o      Use multibyte to create error definitions for multibyte
                      accesses performed using ddi_rep_get*() and
                      ddi_rep_put*().

           Policies can be combined by specifying several of these options
           together. See the NOTES section for further information.

       -x flags

           Applicable when the acc_types option includes log. The flags param‐
           eter modifies the way in which the bofi driver logs accesses. It is
           specified as a string containing any combination of the following
           letters:

           w    Continuous logging (that is, the log will wrap when full).

           t    Timestamp each log entry (access times are in seconds).

           r    Log repeated I/O as individual accesses. For example, a
                ddi_rep_get16(9F) call which has a repcount of N is logged N
                times, with each transaction logged as size 2 bytes. Without
                this option, the default logging behavior is to log this
                access once only, with a transaction size of twice the rep‐
                count.
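
           The difference the r flag makes to a repeated access can be
           worked through with a little arithmetic. In this sketch the
           function name rep_log_entries and its argument order are
           hypothetical.

```shell
#!/bin/sh
# rep_log_entries WIDTH_BYTES REPCOUNT R_FLAG
# Prints "<log entries> <bytes per entry>" for one ddi_rep_* call.
# R_FLAG is 1 when the r flag is given to -x, and 0 otherwise.
rep_log_entries() {
    width=$1
    repcount=$2
    if [ "$3" -eq 1 ]; then
        echo "$repcount $width"           # one entry per access
    else
        echo "1 $((width * repcount))"    # one entry for the whole call
    fi
}

rep_log_entries 2 8 1    # ddi_rep_get16, repcount 8, with r: prints "8 2"
rep_log_entries 2 8 0    # same call without r: prints "1 16"
```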

       -C comment_string

           Applicable when the acc_types option includes log. It provides a
           comment string to be placed in any generated test scripts. The
           string must be enclosed in double quotes.

       -e fixup_script [args]

           Applicable when the acc_types option includes log. A logging
           errdef generates a test script for each driver access handle. Use
           this option to embed a command in the resulting script before the
           errors are injected. The generated test scripts will take an
           instance offline and bring it back online before injecting errors
           in order to bring the instance into a known fault-free state. The
           executable fixup_script will be called twice with the set of
           optional args: once just before the instance is taken offline and
           again after the instance has been brought online. The following
           variables are passed into the environment of the called exe‐
           cutable:

           DRIVER_PATH           Identifies the device path of the instance.

           DRIVER_INSTANCE       Identifies the instance number of the device.

           DRIVER_UNCONFIGURE    Has the value 1 when the instance is about to
                                 be taken offline.

           DRIVER_CONFIGURE      Has the value 1 when the instance has just
                                 been brought online.

           Typically, the executable ensures that the device under test is in
           a suitable state to be taken offline (unconfigured) or in a suit‐
           able state for error injection (for example, configured, error-free
           and servicing a workload). A minimal script for a network driver
           could be:

             #!/bin/ksh
             # Fixup script: called once before the instance goes offline
             # and once after it has been brought back online.

             driver=xyznetdriver
             ifnum=$driver$DRIVER_INSTANCE

             if [[ $DRIVER_CONFIGURE = 1 ]]; then
                  # Instance is online again: configure it, start a workload.
                  ifconfig "$ifnum" plumb
                  ifconfig "$ifnum" ...
                  ifworkload start "$ifnum"
             elif [[ $DRIVER_UNCONFIGURE = 1 ]]; then
                  # Instance is about to go offline: stop the workload first.
                  ifworkload stop "$ifnum"
                  ifconfig "$ifnum" down
                  ifconfig "$ifnum" unplumb
             fi
             exit $?

           The -e option must be the last option on the command line.

       If the -a log option is selected but the -e option is not given, a
       default script is used. This script repeatedly attempts to detach and
       then re-attach the device instance under test.

EXAMPLES

   Examples of Error Definitions
       th_define -n foo -i 1 -a log

       Logs all accesses to all handles used by instance 1 of the foo driver
       while running the default workload (attaching and detaching the
       instance). Then generates a set of test scripts to inject appropriate
       errdefs while running that default workload.

       th_define -n foo -i 1 -a log pio

       Logs PIO accesses to each PIO handle used by instance 1 of the foo
       driver while running the default workload (attaching and detaching the
       instance). Then generates a set of test scripts to inject appropriate
       errdefs while running that default workload.

       th_define -n foo -i 1 -p onebyte median -e fixup arg -now

       Logs all accesses to all handles used by instance 1 of the foo driver
       while running the workload defined in the fixup script fixup with argu‐
       ments arg and -now. Then generates a set of test scripts to inject
       appropriate errdefs while running that workload. The resulting error
       definitions are requested to focus upon single byte accesses to loca‐
       tions that are accessed a median number of times with respect to fre‐
       quency of access to I/O addresses.

       th_define -n se -l 0x20 1 -a pio_r -o OR 0x4 -c 10 1000

       Simulates a stuck serial chip command. After ten matching accesses,
       0x4 is ORed into the next 1000 consecutive read accesses made by any
       instance of the se driver to its command status register, so that the
       register returns status busy.

       th_define -n foo -i 3 -r 1 -a pio_r -c 0 1 -f 1 -o OR 0x100

       Causes 0x100 to be ORed into the next physical I/O read access from any
       register in register set 1 of instance 3 of the foo driver. Subsequent
       calls in the driver to ddi_check_acc_handle() return DDI_FAILURE.

       th_define -n foo -i 3 -r 1 -a pio_r -c 0 1 -o OR 0x0

       Causes 0x0 to be ORed into the next physical I/O read access from any
       register in register set 1 of instance 3 of the foo driver. This is of
       course a no-op.

       th_define -n foo -i 3 -r 1 -l 0x8100 1 -a pio_r -c 0 10 -o EQ 0x70003

       Causes the next ten physical I/O reads from the register at offset
       0x8100 in register set 1 of instance 3 of the foo driver to return
       0x70003.

       th_define -n foo -i 3 -r 1 -l 0x8100 1 -a pio_w -c 100 3 -o AND 0xffffffffffffefff

       The next 100 physical I/O writes to the register at offset 0x8100 in
       register set 1 of instance 3 of the foo driver take place as normal.
       However, on each of the three subsequent accesses, the 0x1000 bit will
       be cleared.

       th_define -n foo -i 3 -r 1 -l 0x8100 0x10 -a pio_r -c 0 1 -f 1 -o XOR 7

       Causes the bottom three bits to have their values toggled for the next
       physical I/O read access to registers with offsets in the range 0x8100
       to 0x8110 in register set 1 of instance 3 of the foo driver. Subsequent
       calls in the driver to ddi_check_acc_handle() return DDI_FAILURE.

       th_define -n foo -i 3 -a pio_w -c 0 1 -o NO 0

       Prevents the next physical I/O write access to any register in any reg‐
       ister set of instance 3 of the foo driver from going out on the bus.

       th_define -n foo -i 3 -l 0 8192 -a dma_r -c 0 1 -o OR 7

       Causes 0x7 to be ORed into each long long in the first 8192 bytes of
       the next DMA read, using any DMA handle for instance 3 of the foo
       driver.

       th_define -n foo -i 3 -r 2 -l 0 8 -a dma_r -c 0 1 -o OR 0x7070707070707070

       Causes 0x70 to be ORed into each byte of the first long long of the
       next DMA read, using the DMA handle with sequential allocation number 2
       for instance 3 of the foo driver.

       th_define -n foo -i 3 -l 256 256 -a dma_w -c 0 1 -f 2 -o OR 7

       Causes 0x7 to be ORed into each long long in the range from offset 256
       to offset 512 of the next DMA write, using any DMA handle for instance
       3 of the foo driver. Subsequent calls in the driver to
       ddi_check_dma_handle() return DDI_FAILURE.

       th_define -n foo -i 3 -r 0 -l 0 8 -a dma_w -c 100 3 -o AND 0xffffffffffffefff

       The next 100 DMA writes using the DMA handle with sequential allocation
       number 0 for instance 3 of the foo driver take place as normal. How‐
       ever, on each of the three subsequent accesses, the 0x1000 bit will be
       cleared in the first long long of the transfer.

       th_define -n foo -i 3 -a intr -c 0 6 -o LOSE 0

       Causes the next six interrupts for instance 3 of the foo driver to be
       lost.

       th_define -n foo -i 3 -a intr -c 30 1 -o EXTRA 10

       When the thirty-first subsequent interrupt for instance 3 of the foo
       driver occurs, a further ten interrupts are also generated.

       th_define -n foo -i 3 -a intr -c 0 1 -o DELAY 1024

       Causes the next interrupt for instance 3 of the foo driver to be
       delayed by 1024 microseconds.

NOTES

       The policy option in the th_define -p syntax determines how a set of
       logged accesses will be converted into the set of error definitions.
       Each logged access will be matched against the chosen policies to
       determine whether an error definition should be created based on the
       access.

       Any number of policy options can be combined to modify the generated
       error definitions.

   Bytewise Policies
       These select particular I/O transfer sizes. Specifying a byte policy
       will exclude other byte policies that have not been chosen. If none of
       the byte type policies is selected, all transfer sizes are treated
       equally. Otherwise, only those specified transfer sizes will be
       selected.

       onebyte      Create errdefs for one byte accesses (ddi_get8())

       twobyte      Create errdefs for two byte accesses (ddi_get16())

       fourbyte     Create errdefs for four byte accesses (ddi_get32())

       eightbyte    Create errdefs for eight byte accesses (ddi_get64())

       multibyte    Create errdefs for repeated byte accesses (ddi_rep_get*())

   Frequency of Access Policies
       The frequency of access to a location is determined according to the
       access type, location and transfer size (for example, a two-byte read
       access to address A is considered distinct from a four-byte read access
       to address A). The algorithm is to count the number of accesses (of a
       given type and size) to a given location, and find the locations that
       were most and least accessed (let maxa and mina be the number of times
       these locations were accessed, and mean the total number of accesses
       divided by total number of locations that were accessed). Then a rare
       access is a location that was accessed fewer than

       (mean - mina) / 3 + mina

       times. Similarly, a common access is a location that was accessed more
       than

       maxa - (maxa - mean) / 3

       times. A location whose access pattern lies within these cutoffs is
       regarded as a location that is accessed with median frequency.

       rare      Create errdefs for locations that are rarely accessed.

       common    Create errdefs for locations that are commonly accessed.

       median    Create errdefs for locations that are accessed with median
                 frequency.
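
       The rare and common cutoffs can be worked through with a short
       shell function that calls awk. This is a sketch of the arithmetic
       only, not the th_define implementation: the function name
       classify_accesses is hypothetical, its input is one access count
       per location, and it assumes that a common location is one accessed
       more than the upper cutoff.

```shell
#!/bin/sh
# classify_accesses COUNT...
# Takes one access count per location and prints each count with its
# class (rare, median or common), using the two cutoff formulas.
classify_accesses() {
    printf '%s\n' "$@" | awk '
        {
            v = $1 + 0
            n[NR] = v
            total += v
            if (NR == 1) { mina = v; maxa = v }
            if (v < mina) mina = v
            if (v > maxa) maxa = v
        }
        END {
            mean = total / NR
            rare_cut = (mean - mina) / 3 + mina
            common_cut = maxa - (maxa - mean) / 3
            for (i = 1; i <= NR; i++) {
                if (n[i] < rare_cut) class = "rare"
                else if (n[i] > common_cut) class = "common"
                else class = "median"
                print n[i], class
            }
        }'
}

# Four locations accessed 1, 2, 6 and 15 times: mean is 6, the rare
# cutoff is about 2.67 and the common cutoff is 12.
classify_accesses 1 2 6 15
```

       For these counts the first two locations are classed as rare, the
       third as median and the fourth as common.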

   Policies for Minimizing errdefs
       If a transaction is duplicated, either a single or multiple errdefs
       will be written to the test scripts, depending upon the following two
       policies:

       maximal      Create multiple errdefs for locations that are repeatedly
                    accessed.

       unbiased     Create a single errdef for locations that are repeatedly
                    accessed.

       operators    For each location, a default operator and operand is typi‐
                    cally applied. For maximal test coverage, this default may
                    be modified using the operators policy so that a separate
                    errdef is created for each of the possible corruption
                    operators.

SEE ALSO

       kill(1), th_manage(1M), alarm(2), ddi_check_acc_handle(9F),
       ddi_check_dma_handle(9F)

SunOS 5.11                        11 Apr 2001                    th_define(1M)