fio(1) - f38

1fio(1)                      General Commands Manual                     fio(1)
2
3
4

NAME

6       fio - flexible I/O tester
7

SYNOPSIS

9       fio [options] [jobfile]...
10

DESCRIPTION

12       fio  is a tool that will spawn a number of threads or processes doing a
13       particular type of I/O action as specified by the  user.   The  typical
14       use  of  fio  is to write a job file matching the I/O load one wants to
15       simulate.
16

OPTIONS

18       --debug=type
19              Enable verbose tracing type of various fio actions. May be `all'
20              for  all  types  or  individual types separated by a comma (e.g.
21              `--debug=file,mem'  will  enable  file  and  memory  debugging).
22              `help' will list all available tracing options.
23
24       --parse-only
25              Parse options only, don't start any I/O.
26
27       --merge-blktrace-only
28              Merge blktraces only, don't start any I/O.
29
30       --output=filename
31              Write output to filename.
32
33       --output-format=format
34              Set  the  reporting  format  to  `normal',  `terse',  `json', or
35              `json+'. Multiple formats can be selected, separated by commas.
36              `terse' is a CSV based format. `json+' is like `json', except it
37              adds a full dump of the latency buckets.
38
39       --bandwidth-log
40              Generate aggregate bandwidth logs.
41
42       --minimal
43              Print statistics in a terse, semicolon-delimited format.
44
45       --append-terse
46              Print statistics in selected mode AND terse, semicolon-delimited
47              format.   Deprecated, use --output-format instead to select mul‐
48              tiple formats.
49
50       --terse-version=version
51              Set terse version output format (default `3', or `2', `4', `5').
52
53       --version
54              Print version information and exit.
55
56       --help Print a summary of the command line options and exit.
57
58       --cpuclock-test
59              Perform test and validation of internal CPU clock.
60
61       --crctest=[test]
62              Test the speed of the built-in checksumming functions. If no ar‐
63              gument  is given, all of them are tested. Alternatively, a comma
64              separated list can be passed, in which case the given  ones  are
65              tested.
66
67       --cmdhelp=command
68              Print  help  information  for command. May be `all' for all com‐
69              mands.
70
71       --enghelp=[ioengine[,command]]
72              List all commands defined by ioengine, or print help for command
73              defined by ioengine. If no ioengine is given, list all available
74              ioengines.
75
76       --showcmd
77              Convert given jobfiles to a set of command-line options.
78
79       --readonly
80              Turn on safety read-only checks, preventing  writes  and  trims.
81              The  --readonly option is an extra safety guard to prevent users
82              from accidentally starting a write or trim workload when that is
83              not  desired.  Fio  will  only  modify  the device under test if
84              `rw=write/randwrite/rw/randrw/trim/randtrim/trimwrite' is given.
85              This safety net can be used as an extra precaution.
86
87       --eta=when
88              Specifies  when  real-time  ETA estimate should be printed. when
89              may be `always', `never' or `auto'. `auto' is  the  default;  it
90              prints  ETA when requested if the output is a TTY. `always' dis‐
91              regards the output type, and prints ETA when requested.  `never'
92              never prints ETA.
93
94       --eta-interval=time
95              By default, fio requests client ETA status roughly every second.
96              With this option, the interval is configurable.  Fio  imposes  a
97              minimum  allowed  time  to avoid flooding the console; less than
98              250 msec is not supported.
99
100       --eta-newline=time
101              Force a new line for every time period passed. When the unit  is
102              omitted, the value is interpreted in seconds.
103
104       --status-interval=time
105              Force  a  full status dump of cumulative (from job start) values
106              at time intervals. This option  does  *not*  provide  per-period
107              measurements.  So values such as bandwidth are running averages.
108              When the time unit is omitted, time is interpreted  in  seconds.
109              Note  that  using  this  option with `--output-format=json' will
110              yield output that technically isn't valid json, since the output
111              will  be  collated  sets of valid json. It will need to be split
112              into valid sets of json after the run.
113
114       --section=name
115              Only run specified section name in job file.  Multiple  sections
116              can  be  specified.   The --section option allows one to combine
117              related jobs into one file.  E.g.  one  job  file  could  define
118              light,  moderate,  and  heavy sections. Tell fio to run only the
119              "heavy" section by giving `--section=heavy' command line option.
120              One  can  also specify the "write" operations in one section and
121              "verify" operation in another section. The --section option only
122              applies to job sections. The reserved *global* section is always
123              parsed and used.
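
              As a minimal sketch of the light/heavy layout described above,
              the job file below assumes a hypothetical file name `loads.fio'
              and illustrative values for rw, size and numjobs:

```ini
; loads.fio (hypothetical file name; all values are illustrative)
[global]
rw=randread
size=64m
runtime=30

[light]
numjobs=1

[heavy]
numjobs=8

; Run only the "heavy" job. The [global] section is still parsed and used:
;   fio --section=heavy loads.fio
```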
124
125       --alloc-size=kb
126              Allocate additional internal smalloc pools of size  kb  in  KiB.
127              The  --alloc-size  option  increases shared memory set aside for
128              use by fio.  If running large jobs with randommap  enabled,  fio
129              can  run  out  of  memory.  Smalloc is an internal allocator for
130              shared structures from a fixed size memory pool and can grow  to
131              16  pools. The pool size defaults to 16MiB.  NOTE: While running
132              `.fio_smalloc.*' backing store files are visible in `/tmp'.
133
134       --warnings-fatal
135              All fio parser warnings are fatal, causing fio to exit  with  an
136              error.
137
138       --max-jobs=nr
139              Set  the  maximum  number of threads/processes to support to nr.
140              NOTE: On Linux, it may be necessary to increase the  shared-mem‐
141              ory  limit  (`/proc/sys/kernel/shmmax')  if fio runs into errors
142              while creating jobs.
143
144       --server=args
145              Start a backend server, with args specifying what to listen  to.
146              See CLIENT/SERVER section.
147
148       --daemonize=pidfile
149              Background  a  fio  server, writing the pid to the given pidfile
150              file.
151
152       --client=hostname
153              Instead of running the jobs locally, send and run  them  on  the
154              given hostname or set of hostnames. See CLIENT/SERVER section.
155
156       --remote-config=file
157              Tell fio server to load this local file.
158
159       --idle-prof=option
160              Report CPU idleness. option is one of the following:
161
162                     calibrate
163                            Run unit work calibration only and exit.
164
165                     system Show aggregate system idleness and unit work.
166
167                     percpu As system but also show per CPU idleness.
168
169       --inflate-log=log
170              Inflate and output compressed log.
171
172       --trigger-file=file
173              Execute trigger command when file exists.
174
175       --trigger-timeout=time
176              Execute trigger at this time.
177
178       --trigger=command
179              Set this command as local trigger.
180
181       --trigger-remote=command
182              Set this command as remote trigger.
183
184       --aux-path=path
185              Use  the  directory  specified by path for generated state files
186              instead of the current working directory.
187

JOB FILE FORMAT

189       Any parameters following the options will be assumed to be  job  files,
190       unless  they  match  a  job  file  parameter. Multiple job files can be
191       listed and each job file will be regarded as a separate group. Fio will
192       stonewall execution between each group.
193
194       Fio accepts one or more job files describing what it is supposed to do.
195       The job file format is the classic ini file, where the  names  enclosed
196       in  [] brackets define the job name. You are free to use any ASCII name
197       you want, except *global* which has special meaning. Following the  job
198       name  is  a sequence of zero or more parameters, one per line, that de‐
199       fine the behavior of the job. If the first character in a line is a ';'
200       or a '#', the entire line is discarded as a comment.
201
202       A *global* section sets defaults for the jobs described in that file. A
203       job may override a *global* section parameter, and a job file may  even
204       have several *global* sections if so desired. A job is only affected by
205       a *global* section residing above it.
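
       The rules above can be sketched with a small job file. The job names
       and values are illustrative, and the comment on the last job reflects
       one reading of "a job is only affected by a *global* section residing
       above it", not a statement verified against the fio sources:

```ini
; Lines starting with ';' or '#' are comments.
[global]
rw=read
size=64m

[job-a]
; No parameters of its own: uses rw=read and size=64m from [global].

[job-b]
; Overrides rw; size=64m still comes from the [global] section above.
rw=randread

[global]
; A second [global] section: only jobs below this point see it.
size=16m

[job-c]
; Assumed to see rw=read from the first [global] and size=16m from the second.
```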
206
207       The --cmdhelp option also lists all options. If used  with  a  command
208       argument, --cmdhelp will detail the given command.
209
210       See  the  `examples/'  directory  for  inspiration  on how to write job
211       files. Note the copyright and license requirements currently  apply  to
212       `examples/' files.
213
214       Note that the maximum length of a line in the job file is 8192 bytes.
215

JOB FILE PARAMETERS

217       Some parameters take an option of a given type, such as an integer or a
218       string. Anywhere a numeric value is required, an arithmetic  expression
219       may be used, provided it is surrounded by parentheses. Supported opera‐
220       tors are:
221
222              addition (+)
223
224              subtraction (-)
225
226              multiplication (*)
227
228              division (/)
229
230              modulus (%)
231
232              exponentiation (^)
233
234       For time values in expressions, units are microseconds by default. This
235       is  different  than for time values not in expressions (not enclosed in
236       parentheses).
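
       A short sketch of parenthesized expressions follows. The job name and
       values are illustrative; bs and size are fio parameters that are not
       described in this part of the page:

```ini
[expr-demo]
rw=randread
; 4 * 1024 = 4096 bytes
bs=(4*1024)
; 2^26 = 67108864 bytes (64 MiB)
size=(2^26)
; Inside an expression, time defaults to microseconds (per the text above):
; 10 * 1000000 us = 10 seconds
runtime=(10*1000000)
```

       Outside parentheses, `runtime=10' would mean 10 seconds, which is why
       the expression form needs the explicit factor.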
237

PARAMETER TYPES

239       The following parameter types are used.
240
241       str    String. A sequence of alphanumeric characters.
242
243       time   Integer with possible time suffix. Without a unit, the value is
244              interpreted as seconds unless otherwise specified. Accepts a
245              suffix of 'd' for days, 'h' for hours, 'm' for minutes, 's' for
246              seconds,  'ms' (or 'msec') for milliseconds and 'us' (or 'usec')
247              for microseconds. For example, use 10m for 10 minutes.
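
              A minimal sketch of time suffixes, using time parameters
              described later on this page (the job name and values are
              illustrative):

```ini
[time-demo]
rw=read
size=64m
; 10 minutes
runtime=10m
; no unit given: interpreted as seconds
ramp_time=15
; 500 milliseconds
startdelay=500ms
```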
248
249       int    Integer. A whole number value, which may contain an integer pre‐
250              fix and an integer suffix.
251
252                     [*integer prefix*] **number** [*integer suffix*]
253
254              The  optional  *integer prefix* specifies the number's base. The
255              default is decimal. *0x* specifies hexadecimal.
256
257              The optional *integer suffix* specifies the number's units,  and
258              includes an optional unit prefix and an optional unit. For quan‐
259              tities of data, the default unit is  bytes.  For  quantities  of
260              time, the default unit is seconds unless otherwise specified.
261
262              With  `kb_base=1000',  fio  follows  international standards for
263              unit prefixes. To specify power-of-10 decimal values defined  in
264              the International System of Units (SI):
265
266                     K means kilo (K) or 1000
267                     M means mega (M) or 1000**2
268                     G means giga (G) or 1000**3
269                     T means tera (T) or 1000**4
270                     P means peta (P) or 1000**5
271
272              To specify power-of-2 binary values defined in IEC 80000-13:
273
274                     Ki means kibi (Ki) or 1024
275                     Mi means mebi (Mi) or 1024**2
276                     Gi means gibi (Gi) or 1024**3
277                     Ti means tebi (Ti) or 1024**4
278                     Pi means pebi (Pi) or 1024**5
279
280              For Zone Block Device Mode:
281
282                     z means Zone
283              With  `kb_base=1024'  (the default), the unit prefixes are oppo‐
284              site from those specified in the SI and IEC  80000-13  standards
285              to provide compatibility with old scripts. For example, 4k means
286              4096.
287
288              For quantities of data, an optional unit of 'B' may be  included
289              (e.g., 'kB' is the same as 'k').
290
291              The  *integer  suffix*  is  not  case sensitive (e.g., m/mi mean
292              mebi/mega, not milli). 'b' and 'B' both mean byte, not bit.
293
294              Examples with `kb_base=1000':
295
296                     4 KiB: 4096, 4096b, 4096B, 4ki, 4kib, 4kiB, 4Ki, 4KiB
297                     1 MiB: 1048576, 1mi, 1024ki
298                     1 MB: 1000000, 1m, 1000k
299                     1 TiB: 1099511627776, 1ti, 1024gi, 1048576mi
300                     1 TB: 1000000000000, 1t, 1000g, 1000000m
301
302              Examples with `kb_base=1024' (default):
303
304                     4 KiB: 4096, 4096b, 4096B, 4k, 4kb, 4kB, 4K, 4KB
305                     1 MiB: 1048576, 1m, 1024k
306                     1 MB: 1000000, 1mi, 1000ki
307                     1 TiB: 1099511627776, 1t, 1024g, 1048576m
308                     1 TB: 1000000000000, 1ti, 1000gi, 1000000mi
309
310              To specify times (units are not case sensitive):
311
312                     D means days
313                     H means hours
314                     M means minutes
315                     s or sec means seconds (default)
316                     ms or msec means milliseconds
317                     us or usec means microseconds
318
319              `z' suffix specifies that the value is measured in zones.  Value
320              is recalculated once block device's zone size becomes known.
321
322              If  the option accepts an upper and lower range, use a colon ':'
323              or minus '-' to separate such values. See irange parameter type.
324              If the lower value specified happens to be larger than the upper
325              value the two values are swapped.
326
327       bool   Boolean. Usually parsed as an integer, however only defined  for
328              true and false (1 and 0).
329
330       irange Integer  range with suffix. Allows value range to be given, such
331              as 1024-4096. A colon may also be used as  the  separator,  e.g.
332              1k:4k.  If  the  option  allows  two sets of ranges, they can be
333              specified with a ',' or '/' delimiter:  1k-4k/8k-32k.  Also  see
334              int parameter type.
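
              A sketch of range syntax follows. It uses startdelay (described
              later on this page) and bsrange, an fio parameter that is not
              described in this part of the page; the values are illustrative:

```ini
[range-demo]
rw=randread
size=64m
numjobs=4
; Each clone picks a start delay between 1 and 5 seconds.
startdelay=1-5
; Block sizes between 1k and 4k; a colon works as the separator too (1k:4k).
bsrange=1k-4k
```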
335
336       float_list
337              A list of floating point numbers, separated by a ':' character.
338

JOB PARAMETERS

340       With  the  above in mind, here follows the complete list of fio job pa‐
341       rameters.
342
343   Units
344       kb_base=int
345              Select the interpretation of unit prefixes in input parameters.
346
347                     1000   Inputs comply with IEC 80000-13 and  the  Interna‐
348                            tional System of Units (SI). Use:
349
350                            - power-of-2 values with IEC prefixes (e.g., KiB)
351                            - power-of-10 values with SI prefixes (e.g., kB)
352
353                     1024   Compatibility  mode  (default).  To avoid breaking
354                            old scripts:
355
356                            - power-of-2 values with SI prefixes
357                            - power-of-10 values with IEC prefixes
358
359              See bs for more details on input parameters.
360
361              Outputs always use correct prefixes. Most outputs  include  both
362              side-by-side, like:
363
364                     bw=2383.3kB/s (2327.4KiB/s)
365
366              If  only  one value is reported, then kb_base selects the one to
367              use:
368
369                     1000 -- SI prefixes
370                     1024 -- IEC prefixes
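
              A minimal sketch of the standards-compliant mode, with an
              illustrative job name and values:

```ini
[global]
kb_base=1000

[si-units]
rw=read
; IEC prefix: 4ki = 4 * 1024 = 4096 bytes
bs=4ki
; SI prefix: 1g = 1000**3 = 1000000000 bytes
size=1g
```

              With the default `kb_base=1024' the meanings are swapped, so
              the same file would need `bs=4k' to request 4096 bytes.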
371
372       unit_base=int
373              Base unit for reporting. Allowed values are:
374
375                     0      Use auto-detection (default).
376
377                     8      Byte based.
378
379                     1      Bit based.
380
381   Job description
382       name=str
383              ASCII name of the job. This may be used  to  override  the  name
384              printed  by fio for this job. Otherwise the job name is used. On
385              the command line this parameter has the special purpose of  also
386              signaling the start of a new job.
387
388       description=str
389              Text  description  of  the  job. Doesn't do anything except dump
390              this text description when this job is run. It's not parsed.
391
392       loops=int
393              Run the specified number of iterations of this job. Used to  re‐
394              peat the same workload a given number of times. Defaults to 1.
395
396       numjobs=int
397              Create the specified number of clones of this job. Each clone of
398              the job is spawned as an independent thread or process. May be used
399              to set up a larger number of threads/processes doing the same
400              thing. Each thread is reported separately; to see statistics for
401              all  clones  as a whole, use group_reporting in conjunction with
402              new_group.  See --max-jobs. Default: 1.
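
              As a sketch, the job below starts four clones and asks for one
              combined report. The job name and values are illustrative:

```ini
[workers]
rw=randread
size=32m
numjobs=4
; Report the four clones as one aggregate instead of four separate entries.
group_reporting
```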
403
404   Time related parameters
405       runtime=time
406              Tell fio to terminate processing after the specified  period  of
407              time. It can be quite hard to determine for how long a specified
408              job will run, so this parameter is handy to cap the  total  run‐
409              time to a given time. When the unit is omitted, the value is in‐
410              terpreted in seconds.
411
412       time_based
413              If set, fio will run for the duration of the  runtime  specified
414              even if the file(s) are completely read or written. It will sim‐
415              ply loop over the same workload as many times as the runtime al‐
416              lows.
417
418       startdelay=irange(int)
419              Delay  the start of job for the specified amount of time. Can be
420              a single value or a range. When given as a  range,  each  thread
421              will  choose a value randomly from within the range. Value is in
422              seconds if a unit is omitted.
423
424       ramp_time=time
425              If set, fio will run the specified workload for this  amount  of
426              time  before logging any performance numbers. Useful for letting
427              performance settle before logging results, thus  minimizing  the
428              runtime  required for stable results. Note that the ramp_time is
429              considered lead-in time for a job, so it increases the total
430              runtime if a special timeout or runtime is specified. When
431              the unit is omitted, the value is given in seconds.
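
              The interaction of runtime, time_based and ramp_time can be
              sketched as follows (illustrative job name and values):

```ini
[steady]
rw=randread
size=64m
; Keep looping over the 64m of data until the runtime expires.
time_based
runtime=60
; The first 10 seconds are not logged; per the text above, this lead-in
; time is added on top of the 60 second runtime.
ramp_time=10
```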
432
433       clocksource=str
434              Use the given clocksource as the base of timing.  The  supported
435              options are:
436
437                     gettimeofday
438                            gettimeofday(2)
439
440                     clock_gettime
441                            clock_gettime(2)
442
443                     cpu    Internal CPU clock source
444
445              cpu  is  the  preferred  clocksource if it is reliable, as it is
446              very fast (and fio is heavy on time calls). Fio  will  automati‐
447              cally  use this clocksource if it's supported and considered re‐
448              liable on the system it is running  on,  unless  another  clock‐
449              source is specifically set. For x86/x86-64 CPUs, this means sup‐
450              porting TSC Invariant.
451
452       gtod_reduce=bool
453              Enable  all  of  the  gettimeofday(2)  reducing  options   (dis‐
454              able_clat,  disable_slat,  disable_bw_measurement)  plus  reduce
455              precision of the timeout somewhat to really shrink the  gettime‐
456              ofday(2)  call count. With this option enabled, we only do about
457              0.4% of the gettimeofday(2) calls we would have done if all time
458              keeping was enabled.
459
460       gtod_cpu=int
461              Sometimes  it's cheaper to dedicate a single thread of execution
462              to just getting the current time. Fio (and  databases,  for  in‐
463              stance)  are  very intensive on gettimeofday(2) calls. With this
464              option, you can set one CPU aside for doing nothing but  logging
465              current  time  to  a  shared  memory  location.  Then  the other
466              threads/processes that run I/O workloads  need  only  copy  that
467              segment,  instead  of entering the kernel with a gettimeofday(2)
468              call. The CPU set aside for doing these time calls will  be  ex‐
469              cluded  from other uses. Fio will manually clear it from the CPU
470              mask of other jobs.
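
              A hedged sketch of the clock options: the choice of CPU 0 and
              the workload values are assumptions made for illustration only.

```ini
[global]
; Use clock_gettime(2) rather than the automatically selected clock source.
clocksource=clock_gettime
; Dedicate CPU 0 (an arbitrary choice) to updating the shared timestamp.
gtod_cpu=0

[io]
rw=randread
size=64m
numjobs=4
```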
471
472   Target file/device
473       directory=str
474              Prefix filenames with this directory. Used to place files  in  a
475              different location than `./'. You can specify a number of
476              directories by separating the names with a ':' character. These
477              directories are distributed equally among the job clones
478              created by numjobs as long as they are using generated
479              filenames. If specific filename(s) are set, fio uses the first
480              listed directory, thereby matching the filename semantics
481              (which generate a file for each clone if not specified, but
482              let all clones use the same file if set).
483
484              See the filename option for information on  how  to  escape  ':'
485              characters within the directory path itself.
486
487              Note:  To  control the directory fio will use for internal state
488              files use --aux-path.
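
              As a sketch, two clones with generated filenames can be spread
              over two directories. The mount points are hypothetical:

```ini
[spread]
; Hypothetical mount points, separated by ':'.
directory=/mnt/disk1:/mnt/disk2
rw=write
size=64m
numjobs=2
```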
489
490       filename=str
491              Fio normally makes up a filename based on the job  name,  thread
492              number,  and  file  number (see filename_format). If you want to
493              share files between threads in a job or several jobs with  fixed
494              file  paths, specify a filename for each of them to override the
495              default. If the ioengine is file based, you can specify a number
496              of  files  by  separating  the names with a ':' colon. So if you
497              wanted a job to open `/dev/sda' and `/dev/sdb' as the two  work‐
498              ing files, you would use `filename=/dev/sda:/dev/sdb'. This also
499              means that whenever this option is  specified,  nrfiles  is  ig‐
500              nored.  The  size of regular files specified by this option will
501              be size divided by number of files unless an  explicit  size  is
502              specified by filesize.
503
504              Each colon in the wanted path must be escaped with a '\' charac‐
505              ter. For instance, if the path is `/dev/dsk/foo@3,0:c' then  you
506              would  use  `filename=/dev/dsk/foo@3,0\:c'  and  if  the path is
507              `F:\filename' then you would use `filename=F\:\filename'.
508
509              On Windows, disk devices are  accessed  as  `\\.\PhysicalDrive0'
510              for  the  first device, `\\.\PhysicalDrive1' for the second etc.
511              Note: Windows and FreeBSD prevent write access to areas  of  the
512              disk containing in-use data (e.g. filesystems).
513
514              The  filename  `-'  is a reserved name, meaning *stdin* or *std‐
515              out*. Which of the two depends on the read/write direction set.
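
              The two filename cases above can be sketched as independent
              jobs. Both are read-only on purpose, and the device paths are
              the ones used in the text, not recommendations:

```ini
[two-devices]
; Two working files, separated by ':'.
filename=/dev/sda:/dev/sdb
rw=read

[escaped-colon]
; The real path is /dev/dsk/foo@3,0:c, so the colon is escaped with '\'.
filename=/dev/dsk/foo@3,0\:c
rw=read
```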
516
517       filename_format=str
518              If sharing multiple files between jobs, it is usually  necessary
519              to  have fio generate the exact names that you want. By default,
520              fio will name a file based on the default file format specifica‐
521              tion  of  `jobname.jobnumber.filenumber'. With this option, that
522              can be customized. Fio will recognize and replace the  following
523              keywords in this string:
524
525                     $jobname
526                            The name of the worker thread or process.
527
528                     $clientuid
529                            IP  of  the  fio  process when using client/server
530                            mode.
531
532                     $jobnum
533                            The incremental number of  the  worker  thread  or
534                            process.
535
536                     $filenum
537                            The incremental number of the file for that worker
538                            thread or process.
539
540              To have dependent jobs share a set of files, this option can  be
541              set  to  have fio generate filenames that are shared between the
542              two. For instance, if `testfiles.$filenum'  is  specified,  file
543              number 4 for any job will be named `testfiles.4'. The default of
544              `$jobname.$jobnum.$filenum' will be  used  if  no  other  format
545              specifier is given.
546
547              If you specify a path then the directories will be created up to
548              the main directory for the file.  So for example if you  specify
549              `a/b/c/$jobnum' then the directories a/b/c will be created
550              before the file setup part of the job. If you specify directory
551              then the path will be relative to that directory; otherwise it is
552              treated as the absolute path.
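
              A sketch of two dependent jobs sharing one file set follows.
              The job names and values are illustrative, and stonewall is an
              fio parameter that is not described in this part of the page:

```ini
[global]
nrfiles=4
size=64m
; Both jobs resolve the same names because $jobname and $jobnum are not used.
filename_format=testfiles.$filenum

[writer]
rw=write

[reader]
; Wait for the writer to finish before starting.
stonewall
rw=read
```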
553
554       unique_filename=bool
555              To avoid collisions between networked clients, fio  defaults  to
556              prefixing  any  generated filenames (with a directory specified)
557              with the source of the client connecting. To disable this behav‐
558              ior, set this option to 0.
559
560       opendir=str
561              Recursively open any files below directory str.
562
563       lockfile=str
564              Fio  defaults  to  not  locking  any files before it does I/O to
565              them. If a file or file descriptor is shared, fio can  serialize
566              I/O  to  that  file  to  make the end result consistent. This is
567              useful for emulating real workloads that share  files.  The  lock
568              modes are:
569
570                     none   No locking. The default.
571
572                     exclusive
573                            Only  one  thread or process may do I/O at a time,
574                            excluding all others.
575
576                     readwrite
577                            Read-write locking on the file. Many  readers  may
578                            access  the  file at the same time, but writes get
579                            exclusive access.
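
              A minimal sketch of several clones sharing one locked file. The
              path is hypothetical and the values are illustrative:

```ini
[shared]
; Hypothetical path; every clone uses this same file.
filename=/tmp/fio-shared.dat
size=64m
rw=randrw
numjobs=4
; Many readers at once, but each write gets exclusive access.
lockfile=readwrite
```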
580
581       nrfiles=int
582              Number of files to use for this job. Defaults to 1. The size  of
583              files will be size divided by this unless explicit size is spec‐
584              ified by filesize. Files are created for each thread separately,
585              and each file will have a file number within its name by
586              default, as explained in the filename section.
587
588       openfiles=int
589              Number of files to keep open at the same time. Defaults  to  the
590              same as nrfiles; can be set smaller to limit the number of
591              simultaneous opens.
592
593       file_service_type=str
594              Defines how fio decides which file from a job to  service  next.
595              The following types are defined:
596
597                     random Choose a file at random.
598
599                     roundrobin
600                            Round  robin  over  opened  files. This is the de‐
601                            fault.
602
603                     sequential
604                            Finish one file before moving on to the next. Mul‐
605                            tiple  files  can still be open depending on open‐
606                            files.
607
608                     zipf   Use a Zipf distribution to decide what file to ac‐
609                            cess.
610
611                     pareto Use  a  Pareto distribution to decide what file to
612                            access.
613
614                     normal Use a Gaussian  (normal)  distribution  to  decide
615                            what file to access.
616
617                     gauss  Alias for normal.
618
619              For  random,  roundrobin,  and  sequential, a postfix can be ap‐
620              pended to tell fio how many I/Os to issue before switching to  a
621              new  file.  For example, specifying `file_service_type=random:8'
622              would cause fio to issue 8 I/Os before selecting a new  file  at
623              random.  For  the  non-uniform  distributions,  a floating point
624              postfix can be  given  to  influence  how  the  distribution  is
625              skewed.  See  random_distribution  for a description of how that
626              would work.
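
              The file selection options can be combined as in this sketch
              (illustrative job name and values):

```ini
[many-files]
rw=randread
; 64m split across 8 files gives 8m per file.
size=64m
nrfiles=8
; Keep at most 4 of the 8 files open at once.
openfiles=4
; Issue 8 I/Os to a file, then pick another file at random.
file_service_type=random:8
```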
627
628       ioscheduler=str
629              Attempt to switch the device hosting the file to  the  specified
630              I/O scheduler before running. If the file is a pipe, a character
631              device file or if device hosting the file could  not  be  deter‐
632              mined, this option is ignored.
633
634       create_serialize=bool
635              If  true,  serialize the file creation for the jobs. This may be
636              handy to avoid interleaving of data files, which may greatly de‐
637              pend on the filesystem used and even the number of processors in
638              the system. Default: true.
639
640       create_fsync=bool
641              fsync(2) the data file after creation. This is the default.
642
643       create_on_open=bool
644              If true, don't pre-create files but allow the  job's  open()  to
645              create  a  file when it's time to do I/O. Default: false -- pre-
646              create all necessary files when the job starts.
647
648       create_only=bool
649              If true, fio will only run the setup phase of the job. If  files
650              need  to  be laid out or updated on disk, only that will be done
651              -- the actual job contents are not executed. Default: false.
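
              As a sketch, a job that only lays out its files could look
              like this (illustrative job name and values):

```ini
[layout]
rw=read
size=256m
nrfiles=4
; Create the four files, then stop without running the read workload.
create_only=1
```

              A later run of the same job without create_only can then reuse
              the files that were laid out.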
652
653       allow_file_create=bool
654              If true, fio is permitted to create files as part of  its  work‐
655              load.  If  this  option is false, then fio will error out if the
656              files it needs to use don't already exist. Default: true.
657
658       allow_mounted_write=bool
659              If this isn't set, fio will abort jobs that are destructive
660              (e.g. that write) to what appears to be a mounted device or
661              partition. This helps catch inadvertently destructive tests,
662              where the user does not realize that the test will destroy data
663              on the mounted file system. Note that some platforms don't allow
664              writing against a mounted device regardless of this option.
665              Default: false.
666
667       pre_read=bool
668              If this is given, files will  be  pre-read  into  memory  before
669              starting  the  given I/O operation. This will also clear the in‐
670              validate flag, since it is pointless to pre-read and  then  drop
671              the  cache.  This  will only work for I/O engines that are seek-
672              able, since they allow you to read the same data multiple times.
673              Thus it will not work on non-seekable I/O engines (e.g. network,
674              splice). Default: false.
675
676       unlink=bool
677              Unlink the job files when done. Not  the  default,  as  repeated
678              runs  of  that job would then waste time recreating the file set
679              again and again. Default: false.
680
681       unlink_each_loop=bool
682              Unlink job files after each iteration or loop. Default: false.
683
684       zonemode=str
685              Accepted values are:
686
687                     none   The zonerange, zonesize, zonecapacity and zoneskip
688                            parameters are ignored.
689
690                     strided
691                            I/O  happens in a single zone until zonesize bytes
692                            have been transferred.  After that number of bytes
693                            has  been  transferred processing of the next zone
694                            starts. The zonecapacity parameter is ignored.
695
696                     zbd    Zoned block device mode. I/O happens  sequentially
697                            in  each  zone,  even  if  random I/O has been se‐
698                            lected. Random I/O happens across  all  zones  in‐
699                            stead  of being restricted to a single zone.  Trim
700                            is handled using a zone reset operation. Trim only
701                            considers  non-empty sequential write required and
702                            sequential write preferred zones.
703
704       zonerange=int
705              For zonemode=strided, this is the size of  a  single  zone.  See
706              also zonesize and zoneskip.
707
708              For zonemode=zbd, this parameter is ignored.
709
710       zonesize=int
711              For  zonemode=strided,  this  is the number of bytes to transfer
712              before skipping zoneskip bytes. If  this  parameter  is  smaller
713              than  zonerange then only a fraction of each zone with zonerange
714              bytes will be accessed.  If this parameter is larger  than  zon‐
715              erange  then  each  zone  will be accessed multiple times before
716              skipping to the next zone.
717
718              For zonemode=zbd, this is the size of a single  zone.  The  zon‐
719              erange  parameter is ignored in this mode. For a job accessing a
720              zoned block device, the specified zonesize must be 0 or equal to
721              the  device  zone  size. For a regular block device or file, the
722              specified zonesize must be at least 512B.
723
724       zonecapacity=int
725              For zonemode=zbd, this defines the capacity of  a  single  zone,
726              which  is  the  accessible area starting from the zone start ad‐
727              dress. This parameter only applies when  using  zonemode=zbd  in
728              combination with regular block devices.  If not specified it de‐
729              faults to the zone size. If the target device is a  zoned  block
730              device,  the  zone capacity is obtained from the device informa‐
731              tion and this option is ignored.
732
733       zoneskip=int[z]
734              For zonemode=strided, the number of bytes to skip after zonesize
735              bytes of data have been transferred.
736
737              For  zonemode=zbd,  the zonesize aligned number of bytes to skip
738              once a zone is fully written (write workloads)  or  all  written
739              data in the zone have been read (read workloads). This parameter
740              is valid only for sequential workloads and  ignored  for  random
741              workloads. For read workloads, see also read_beyond_wp.
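
              As a rough sketch of how the strided zone options combine, the
              job below should read about the first 16 MiB of each 64 MiB zone
              before moving on to the next zone. The job name, file path and
              sizes are illustrative assumptions, not recommended values:

                     [strided-read]
                     ; hypothetical scratch file
                     filename=/tmp/fio-zones.dat
                     size=1g
                     rw=read
                     bs=64k
                     zonemode=strided
                     zonerange=64m
                     zonesize=16m
                     zoneskip=0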
742
743
744       read_beyond_wp=bool
745              This parameter applies to zonemode=zbd only.
746
747              Zoned  block  devices are block devices that consist of multiple
748              zones. Each zone has a type, e.g. conventional or sequential.  A
749              conventional  zone can be written at any offset that is a multi‐
750              ple of the block size. Sequential zones must be written  sequen‐
751              tially.  The  position at which a write must occur is called the
752              write pointer. A zoned block device can be either  host  managed
753              or  host  aware.  For  host managed devices the host must ensure
754              that writes happen sequentially. Fio recognizes host managed de‐
755              vices  and  serializes  writes to sequential zones for these de‐
756              vices.
757
758              If a read occurs in a sequential zone beyond the  write  pointer
759              then the zoned block device will complete the read without read‐
760              ing any data from the storage medium. Since such reads  lead  to
761              unrealistically  high  bandwidth and IOPS numbers fio only reads
762              beyond the write pointer if explicitly told to do  so.  Default:
763              false.
764
765       max_open_zones=int
766              When running a random write test across an entire drive, many
767              more zones will be open than in a typical application workload.
768              This option makes it possible to limit the number of open
769              zones. The number of open zones is defined as the number of
770              zones to which write commands are issued by all
771              threads/processes.
772
773       job_max_open_zones=int
774              Limit on the number of simultaneously opened  zones  per  single
775              thread/process.
776
777       ignore_zone_limits=bool
778              If  this  option  is used, fio will ignore the maximum number of
779              open zones limit of the zoned block device in use, thus allowing
780              the option max_open_zones value to be larger than the device re‐
781              ported limit. Default: false.
782
783       zone_reset_threshold=float
784              A number between zero and one that indicates the ratio of  logi‐
785              cal  blocks  with  data to the total number of logical blocks in
786              the test above which zones should be reset periodically.
787
788       zone_reset_frequency=float
789              A number between zero and one that indicates how  often  a  zone
790              reset  should be issued if the zone reset threshold has been ex‐
791              ceeded. A zone reset is  submitted  after  each  (1  /  zone_re‐
792              set_frequency)  write  requests. This and the previous parameter
793              can be used to simulate garbage collection activity.
794
795
796   I/O type
797       direct=bool
798              If value is true, use non-buffered I/O. This  is  usually  O_DI‐
799              RECT.  Note that OpenBSD and ZFS on Solaris don't support direct
800              I/O. On Windows the synchronous ioengines don't  support  direct
801              I/O. Default: false.
802
803       atomic=bool
804              If  value  is  true,  attempt  to  use atomic direct I/O. Atomic
805              writes are guaranteed to be stable once acknowledged by the  op‐
806              erating system. Only Linux supports O_ATOMIC right now.
807
808       buffered=bool
809              If  value is true, use buffered I/O. This is the opposite of the
810              direct option. Defaults to true.
811
812       readwrite=str, rw=str
813              Type of I/O pattern. Accepted values are:
814
815                     read   Sequential reads.
816
817                     write  Sequential writes.
818
819                     trim   Sequential trims (Linux  block  devices  and  SCSI
820                            character devices only).
821
822                     randread
823                            Random reads.
824
825                     randwrite
826                            Random writes.
827
828                     randtrim
829                            Random trims (Linux block devices and SCSI charac‐
830                            ter devices only).
831
832                     rw,readwrite
833                            Sequential mixed reads and writes.
834
835                     randrw Random mixed reads and writes.
836
837                     trimwrite
838                            Sequential trim+write sequences.  Blocks  will  be
839                            trimmed  first, then the same blocks will be writ‐
840                            ten to. So if `io_size=64K' is specified, Fio will
841                            trim a total of 64K bytes and also write 64K bytes
842                            on the same trimmed blocks. This behaviour will be
843                            consistent  with `number_ios' or other Fio options
844                            limiting the total bytes or number of I/O's.
845
846                     randtrimwrite
847                            Like trimwrite, but uses random offsets rather
848                            than sequential offsets.
849
850              Fio  defaults  to  read  if the option is not specified. For the
851              mixed I/O types, the default is to split them 50/50. For certain
852              types  of  I/O  the  result may still be skewed a bit, since the
853              speed may be different.
854
855              It is possible to specify the number of I/Os to do  before  get‐
856              ting  a new offset by appending `:<nr>' to the end of the string
857              given. For a random read, it would look like `rw=randread:8' for
858              passing  in  an offset modifier with a value of 8. If the suffix
859              is used with a sequential I/O pattern,  then  the  `<nr>'  value
860              specified  will  be  added  to the generated offset for each I/O
861              turning sequential I/O into sequential I/O with holes.  For  in‐
862              stance,  using  `rw=write:4k' will skip 4k for every write. Also
863              see the rw_sequencer option.
864
865       rw_sequencer=str
866              If an offset modifier is given by  appending  a  number  to  the
867              `rw=str'  line,  then this option controls how that number modi‐
868              fies the I/O offset being generated. Accepted values are:
869
870                     sequential
871                            Generate sequential offset.
872
873                     identical
874                            Generate the same offset.
875
876              sequential is only useful for random I/O, where fio  would  nor‐
877              mally  generate a new random offset for every I/O. If you append
878              e.g. 8 to randread, you would get a new random offset for  every
879              8  I/Os.  The  result would be a seek for only every 8 I/Os, in‐
880              stead of for every I/O. Use `rw=randread:8' to specify that.  As
881              sequential  I/O  is  already  sequential, setting sequential for
882              that would not result in any differences. identical behaves in a
883              similar fashion, except it sends the same offset 8 times
884              before generating a new offset.
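
              A minimal job sketch of the offset modifier, assuming a
              hypothetical scratch file: fio picks a new random offset only
              once every 8 I/Os, and with rw_sequencer=sequential the I/Os in
              between continue sequentially from that offset:

                     [seq-bursts]
                     ; hypothetical scratch file
                     filename=/tmp/fio-test.dat
                     size=256m
                     bs=4k
                     rw=randread:8
                     rw_sequencer=sequential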
885
886       unified_rw_reporting=str
887              Fio normally reports statistics on a per data  direction  basis,
888              meaning that reads, writes, and trims are accounted and reported
889              separately. This option determines whether fio reports  the  re‐
890              sults  normally,  summed together, or as both options.  Accepted
891              values are:
892
893              none   Normal statistics reporting.
894
895              mixed  Statistics are summed per data direction and reported to‐
896                     gether.
897
898              both   Statistics  are  reported normally, followed by the mixed
899                     statistics.
900
901              0      Backward-compatible alias for none.
902
903              1      Backward-compatible alias for mixed.
904
905              2      Alias for both.
906
907       randrepeat=bool
908              Seed the random number generator used for random I/O patterns in
909              a  predictable way so the pattern is repeatable across runs. De‐
910              fault: true.
911
912       allrandrepeat=bool
913              Seed all random number generators in a predictable  way  so  re‐
914              sults are repeatable across runs. Default: false.
915
916       randseed=int
917              Seed  the  random number generators based on this seed value, to
918              be able to control what sequence of output is  being  generated.
919              If  not  set, the random sequence depends on the randrepeat set‐
920              ting.
921
922       fallocate=str
923              Whether pre-allocation is performed when laying down files.  Ac‐
924              cepted values are:
925
926                     none   Do not pre-allocate space.
927
928                     native Use  a  platform's  native pre-allocation call but
929                            fall back to none behavior if it fails/is not  im‐
930                            plemented.
931
932                     posix  Pre-allocate via posix_fallocate(3).
933
934                     keep   Pre-allocate    via    fallocate(2)    with   FAL‐
935                            LOC_FL_KEEP_SIZE set.
936
937                     truncate
938                            Extend file to final size using ftruncate(2)
939                            instead of allocating.
940
941                     0      Backward-compatible alias for none.
942
943                     1      Backward-compatible alias for posix.
944
945              May  not  be  available on all supported platforms. keep is only
946              available on Linux. If using ZFS on Solaris this cannot  be  set
947              to  posix  because  ZFS doesn't support pre-allocation. Default:
948              native if any pre-allocation methods except truncate are  avail‐
949              able, none if not.
950
951              Note  that  using truncate on Windows will interact surprisingly
952              with non-sequential write patterns. When writing to a file  that
953              has  been  extended by setting the end-of-file information, Win‐
954              dows will backfill the unwritten portion of the file up to  that
955              offset with zeroes before issuing the new write. This means that
956              a single small write to the end of an extended file  will  stall
957              until the entire file has been filled with zeroes.
958
959       fadvise_hint=str
960              Use  posix_fadvise(2)  or  posix_madvise(2) to advise the kernel
961              what I/O patterns are likely to be issued. Accepted values are:
962
963                     0      Backwards compatible hint for "no hint".
964
965                     1      Backwards compatible hint  for  "advise  with  fio
966                            workload type". This uses FADV_RANDOM for a random
967                            workload, and  FADV_SEQUENTIAL  for  a  sequential
968                            workload.
969
970                     sequential
971                            Advise using FADV_SEQUENTIAL.
972
973                     random Advise using FADV_RANDOM.
974
975       write_hint=str
976              Use  fcntl(2) to advise the kernel what life time to expect from
977              a write. Only supported on Linux, as of version  4.13.  Accepted
978              values are:
979
980                     none   No particular life time associated with this file.
981
982                     short  Data written to this file has a short life time.
983
984                     medium Data written to this file has a medium life time.
985
986                     long   Data written to this file has a long life time.
987
988                     extreme
989                            Data  written  to  this  file has a very long life
990                            time.
991
992              The values are all relative to each other, and no absolute mean‐
993              ing should be associated with them.
994
995       offset=int[%|z]
996              Start  I/O at the provided offset in the file, given as either a
997              fixed size in bytes, zones or a percentage. If a  percentage  is
998              given,  the  generated  offset  will  be  aligned to the minimum
999              blocksize or to the value of offset_align if provided. Data  be‐
1000              fore the given offset will not be touched. This effectively caps
1001              the file size at `real_size - offset'. Can be combined with size
1002              to  constrain  the  start  and end range of the I/O workload.  A
1003              percentage can be specified by a number between 1 and  100  fol‐
1004              lowed  by  '%', for example, `offset=20%' to specify 20%. In ZBD
1005              mode, the value can be set as a number of zones using 'z'.
1006
1007       offset_align=int
1008              If set to non-zero value, the byte offset generated  by  a  per‐
1009              centage  offset  is aligned upwards to this value. Defaults to 0
1010              meaning that a percentage offset is aligned to the minimum block
1011              size.
1012
1013       offset_increment=int[%|z]
1014              If this is provided, then the real offset becomes `offset + off‐
1015              set_increment * thread_number', where the  thread  number  is  a
1016              counter  that  starts  at  0 and is incremented for each sub-job
1017              (i.e. when numjobs option is specified). This option  is  useful
1018              if  there  are  several  jobs which are intended to operate on a
1019              file in parallel disjoint segments, with  even  spacing  between
1020              the  starting  points.  Percentages can be used for this option.
1021              If a percentage is given, the generated offset will be aligned
1022              to the minimum blocksize or to the value of offset_align if
1023              provided. In ZBD mode, the value can be set as a number of
1024              zones using 'z'.
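
              To illustrate the formula above, the sketch below (file path and
              sizes are assumptions) should start four sub-jobs at offsets 0,
              256m, 512m and 768m, since the thread number runs from 0 to 3
              and each start is `0 + 256m * thread_number'. Each sub-job then
              covers its own 256m segment of the shared file:

                     [disjoint-segments]
                     ; hypothetical scratch file shared by all sub-jobs
                     filename=/tmp/fio-shared.dat
                     numjobs=4
                     offset=0
                     offset_increment=256m
                     size=256m
                     rw=read
                     bs=128k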
1025
1026       number_ios=int
1027              Fio  will  normally perform I/Os until it has exhausted the size
1028              of the region set by size, or until it exhausts the allocated time
1029              (or  hits an error condition). With this setting, the range/size
1030              can be set independently of the number of I/Os to perform.  When
1031              fio  reaches  this number, it will exit normally and report sta‐
1032              tus. Note that this does not extend the amount of I/O that  will
1033              be  done,  it will only stop fio if this condition is met before
1034              other end-of-job criteria.
1035
1036       fsync=int
1037              If writing to a file, issue an fsync(2) (or its  equivalent)  of
1038              the dirty data for every number of blocks given. For example, if
1039              you give 32 as a parameter, fio will sync the file  after  every
1040              32  writes  issued. If fio is using non-buffered I/O, we may not
1041              sync the file. The exception is the sg I/O  engine,  which  syn‐
1042              chronizes  the disk cache anyway. Defaults to 0, which means fio
1043              does not periodically issue and wait for  a  sync  to  complete.
1044              Also see end_fsync and fsync_on_close.
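
              For example, a buffered write job along these lines (file path
              and sizes are assumptions) would sync the file after every 32
              writes, and once more when the write stage completes:

                     [write-fsync]
                     ; hypothetical scratch file
                     filename=/tmp/fio-test.dat
                     size=256m
                     bs=4k
                     rw=write
                     fsync=32
                     end_fsync=1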
1045
1046       fdatasync=int
1047              Like fsync but uses fdatasync(2) to only sync data and not meta‐
1048              data blocks. In Windows, DragonFlyBSD or OSX there is no  fdata‐
1049              sync(2)  so  this  falls back to using fsync(2).  Defaults to 0,
1050              which means fio does not periodically issue and wait for a data-
1051              only sync to complete.
1052
1053       write_barrier=int
1054              Make every N-th write a barrier write.
1055
1056       sync_file_range=str:int
1057              Use sync_file_range(2) for every int number of write operations.
1058              Fio will track range of writes that have happened since the last
1059              sync_file_range(2) call. str can currently be one or more of:
1060
1061                     wait_before
1062                            SYNC_FILE_RANGE_WAIT_BEFORE
1063
1064                     write  SYNC_FILE_RANGE_WRITE
1065
1066                     wait_after
1067                            SYNC_FILE_RANGE_WAIT_AFTER
1068
1069              So  if  you  do `sync_file_range=wait_before,write:8', fio would
1070              use `SYNC_FILE_RANGE_WAIT_BEFORE  |  SYNC_FILE_RANGE_WRITE'  for
1071              every  8  writes. Also see the sync_file_range(2) man page. This
1072              option is Linux specific.
1073
1074       overwrite=bool
1075              If true, writes to a file will always overwrite  existing  data.
1076              If the file doesn't already exist, it will be created before the
1077              write phase begins. If the file exists and is large  enough  for
1078              the specified write phase, nothing will be done. Default: false.
1079
1080       end_fsync=bool
1081              If  true,  fsync(2)  file  contents  when a write stage has com‐
1082              pleted.  Default: false.
1083
1084       fsync_on_close=bool
1085              If true, fio will fsync(2) a dirty file on close.  This  differs
1086              from  end_fsync  in that it will happen on every file close, not
1087              just at the end of the job. Default: false.
1088
1089       rwmixread=int
1090              Percentage of a mixed workload that should  be  reads.  Default:
1091              50.
1092
1093       rwmixwrite=int
1094              Percentage  of  a  mixed workload that should be writes. If both
1095              rwmixread and rwmixwrite are given and the values do not add up
1096              to  100%,  the  latter  of  the two will be used to override the
1097              first. This may interfere with a given rate setting, if  fio  is
1098              asked to limit reads or writes to a certain rate. If that is the
1099              case, then the distribution may be skewed. Default: 50.
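
              As a small illustration, assuming a hypothetical scratch file,
              the job below requests a random mixed workload with roughly 70%
              reads and 30% writes. Only rwmixread is set, so the two
              percentages cannot conflict:

                     [mixed-70-30]
                     ; hypothetical scratch file
                     filename=/tmp/fio-test.dat
                     size=256m
                     bs=4k
                     rw=randrw
                     rwmixread=70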
1100
1101       random_distribution=str:float[:float][,str:float][,str:float]
1102              By default, fio will use a completely uniform  random  distribu‐
1103              tion when asked to perform random I/O. Sometimes it is useful to
1104              skew the distribution in specific ways, ensuring that some parts
1105              of the data are hotter than others. Fio includes the following
1106              distribution models:
1107
1108                     random Uniform random distribution
1109
1110                     zipf   Zipf distribution
1111
1112                     pareto Pareto distribution
1113
1114                     normal Normal (Gaussian) distribution
1115
1116                     zoned  Zoned random distribution
1117                     zoned_abs Zoned absolute random distribution
1118
1119              When using a zipf or pareto distribution, an input value is also
1120              needed to define the access pattern. For zipf, this is the `Zipf
1121              theta'.   For  pareto,  it's  the `Pareto power'. Fio includes a
1122              test program, fio-genzipf, that can be used to visualize what the
1123              given  input  values  will  yield  in terms of hit rates. If you
1124              wanted to use zipf with a `theta' of 1.2, you  would  use  `ran‐
1125              dom_distribution=zipf:1.2' as the option. If a non-uniform model
1126              is used, fio will disable use of the random map. For the  normal
1127              distribution,  a  normal  (Gaussian)  deviation is supplied as a
1128              value between 0 and 100.
1129
1130              The second, optional float is allowed for pareto, zipf and
1131              normal distributions. It sets the base of the distribution to a
1132              non-default place, giving more control over the most probable
1133              outcome. This value is in the range [0-1], which maps linearly
1134              to the range of possible random values. Defaults are: random for
1135              pareto and zipf, and 0.5 for normal. If you wanted to use zipf
1136              with a `theta' of 1.2 centered on 1/4 of the allowed value
1137              range, you would use `random_distribution=zipf:1.2:0.25'.
1138
1139              For a zoned distribution, fio supports specifying percentages of
1140              I/O access that should fall within what range of the file or de‐
1141              vice. For example, given a criteria of:
1142
1143                     60% of accesses should be to the first 10%
1144                     30% of accesses should be to the next 20%
1145                     8% of accesses should be to the next 30%
1146                     2% of accesses should be to the next 40%
1147
1148              we  can  define  that through zoning of the random accesses. For
1149              the above example, the user would do:
1150
1151                     random_distribution=zoned:60/10:30/20:8/30:2/40
1152
1153              A zoned_abs distribution works exactly like zoned, except
1154              that  it takes absolute sizes. For example, let's say you wanted
1155              to define access according to the following criteria:
1156
1157                     60% of accesses should be to the first 20G
1158                     30% of accesses should be to the next 100G
1159                     10% of accesses should be to the next 500G
1160
1161              we can define an absolute zoning distribution with:
1162
1163                     random_distribution=zoned_abs:60/20G:30/100G:10/500G
1164
1165              For both zoned and zoned_abs, fio supports defining  up  to  256
1166              separate zones.
1167
1168              This works similarly to how bssplit sets ranges and percentages
1169              of block sizes. Like bssplit, it's possible to specify separate
1170              zones for reads, writes, and trims. If just one set is given,
1171              it'll apply to all of them.
1172
1173       percentage_random=int[,int][,int]
1174              For a random workload, set how big a percentage should  be  ran‐
1175              dom.  This defaults to 100%, in which case the workload is fully
1176              random. It can be set anywhere from 0 to 100. Setting it to
1177              0  would  make the workload fully sequential. Any setting in be‐
1178              tween will result in a random mix of sequential and random  I/O,
1179              at  the  given percentages. Comma-separated values may be speci‐
1180              fied for reads, writes, and trims as described in blocksize.
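
              A minimal sketch, assuming a hypothetical scratch file: the job
              below asks for a read workload in which about 75% of the I/Os
              use random offsets and the remaining 25% are sequential:

                     [mostly-random]
                     ; hypothetical scratch file
                     filename=/tmp/fio-test.dat
                     size=256m
                     bs=4k
                     rw=randread
                     percentage_random=75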
1181
1182       norandommap
1183              Normally fio will cover every block of the file when doing  ran‐
1184              dom I/O. If this option is given, fio will just get a new random
1185              offset without looking at past I/O history. This means that some
1186              blocks  may  not be read or written, and that some blocks may be
1187              read/written more than once. If this option is used with  verify
1188              and  multiple  blocksizes  (via bsrange), only intact blocks are
1189              verified, i.e., partially-overwritten blocks are ignored.   With
1190              an async I/O engine and an I/O depth > 1, it is possible for the
1191              same block to be overwritten, which can cause  verification  er‐
1192              rors.   Either  do not use norandommap in this case, or also use
1193              the lfsr random generator.
1194
1195       softrandommap=bool
1196              See norandommap. If fio runs with the random block  map  enabled
1197              and  it fails to allocate the map, if this option is set it will
1198              continue without a random block map. As coverage will not be  as
1199              complete  as  with  random  maps, this option is disabled by de‐
1200              fault.
1201
1202       random_generator=str
1203              Fio supports the following engines for  generating  I/O  offsets
1204              for random I/O:
1205
1206                     tausworthe
1207                            Strong 2^88 cycle random number generator.
1208
1209                     lfsr   Linear feedback shift register generator.
1210
1211                     tausworthe64
1212                            Strong 64-bit 2^258 cycle random number generator.
1213
1214              tausworthe  is a strong random number generator, but it requires
1215              tracking on the side if we want to ensure that blocks  are  only
1216              read or written once. lfsr guarantees that we never generate the
1217              same offset twice, and it's also less computationally expensive.
1218              It's not a true random generator, though for I/O purposes
1219              it's typically good enough. lfsr only works with single
1220              block  sizes,  not with workloads that use multiple block sizes.
1221              If used with such a workload, fio may read or write some  blocks
1222              multiple  times. The default value is tausworthe, unless the re‐
1223              quired space exceeds 2^32 blocks. If it does, then  tausworthe64
1224              is selected automatically.
1225
1226   Block size
1227       blocksize=int[,int][,int], bs=int[,int][,int]
1228              The  block  size  in  bytes used for I/O units. Default: 4096. A
1229              single value applies to reads, writes,  and  trims.  Comma-sepa‐
1230              rated  values  may  be specified for reads, writes, and trims. A
1231              value not terminated in a comma applies to subsequent types. Ex‐
1232              amples:
1233
1234                     bs=256k        means 256k for reads, writes and trims.
1235                     bs=8k,32k      means 8k for reads, 32k for writes and
1236                                    trims.
1237                     bs=8k,32k,     means 8k for reads, 32k for writes, and
1238                                    default for trims.
1239                     bs=,8k         means default for reads, 8k for writes and
1240                                    trims.
1241                     bs=,8k,        means default for reads, 8k for writes,
1242                                    and default for trims.
1243
1244       blocksize_range=irange[,irange][,irange],
1245       bsrange=irange[,irange][,irange]
1246              A range of block sizes in bytes for I/O units.  The  issued  I/O
1247              unit  will  always  be  a  multiple  of the minimum size, unless
1248              blocksize_unaligned is set.  Comma-separated ranges may be spec‐
1249              ified  for  reads,  writes, and trims as described in blocksize.
1250              Example:
1251
1252                     bsrange=1k-4k,2k-8k
1253
1254       bssplit=str[,str][,str]
1255              Sometimes you want even finer grained control of the block sizes
1256              issued,  not just an even split between them. This option allows
1257              you to weight various block sizes, so that you are able  to  de‐
1258              fine  a  specific  amount  of block sizes issued. The format for
1259              this option is:
1260
1261                     bssplit=blocksize/percentage:blocksize/percentage
1262
1263              for as many block sizes as needed. So if you want  to  define  a
1264              workload  that  has  50%  64k blocks, 10% 4k blocks, and 40% 32k
1265              blocks, you would write:
1266
1267                     bssplit=4k/10:64k/50:32k/40
1268
1269              Ordering does not matter. If the percentage is left  blank,  fio
1270              will  fill  in  the remaining values evenly. So a bssplit option
1271              like this one:
1272
1273                     bssplit=4k/50:1k/:32k/
1274
1275              would have 50% 4k I/Os and 25% each of 1k and 32k I/Os. The
1276              percentages always add up to 100; if bssplit is given values that
1277              add up to more, fio will error out.
1278
1279              Comma-separated values may be specified for reads,  writes,  and
1280              trims as described in blocksize.
1281
1282              If  you  want a workload that has 50% 2k reads and 50% 4k reads,
1283              while having 90% 4k writes and 10% 8k writes, you would specify:
1284
1285                     bssplit=2k/50:4k/50,4k/90:8k/10
1286
1287              Fio supports defining up to 64 different weights for  each  data
1288              direction.
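
                  How blank percentages are filled in can be sketched as
                  follows. This is a minimal illustration of the rule described
                  for bssplit, not fio's own code; the function name
                  fill_bssplit is hypothetical, and it handles the string for a
                  single data direction (split on commas first for separate
                  read, write, and trim settings).

```python
def fill_bssplit(spec):
    """Return [(blocksize, percent), ...] for one data direction (illustrative)."""
    entries = []
    for item in spec.split(":"):
        size, _, percent = item.partition("/")
        entries.append((size, float(percent) if percent else None))

    given = sum(p for _, p in entries if p is not None)
    if given > 100:
        raise ValueError("bssplit percentages add up to more than 100")

    blanks = sum(1 for _, p in entries if p is None)
    # Blank percentages share whatever is left over in equal parts.
    share = (100 - given) / blanks if blanks else 0.0
    return [(size, share if p is None else p) for size, p in entries]


if __name__ == "__main__":
    print(fill_bssplit("4k/50:1k/:32k/"))
```

                  For the example `bssplit=4k/50:1k/:32k/', the two blank
                  entries each receive half of the remaining 50 percent.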
1289
1290       blocksize_unaligned, bs_unaligned
1291              If  set,  fio  will  issue I/O units with any size within block‐
1292              size_range, not just multiples of the minimum size.  This  typi‐
1293              cally won't work with direct I/O, as that normally requires sec‐
1294              tor alignment.
1295
1296       bs_is_seq_rand=bool
1297              If this option is set, fio will use the normal read,write block‐
1298              size  settings  as sequential,random blocksize settings instead.
1299              Any random read or write will use the WRITE blocksize  settings,
1300              and  any  sequential  read  or write will use the READ blocksize
1301              settings.
1302
1303       blockalign=int[,int][,int], ba=int[,int][,int]
1304              Boundary to which fio will  align  random  I/O  units.  Default:
1305              blocksize.  Minimum alignment is typically 512b for using direct
1306              I/O, though it usually depends on the hardware block size.  This
1307              option  is mutually exclusive with using a random map for files,
1308              so it will turn off that option. Comma-separated values  may  be
1309              specified  for  reads,  writes, and trims as described in block‐
1310              size.
1311
1312   Buffers and memory
1313       zero_buffers
1314              Initialize buffers with all zeros. Default:  fill  buffers  with
1315              random data.
1316
1317       refill_buffers
1318              If  this option is given, fio will refill the I/O buffers on ev‐
1319              ery submit. The default is to only fill it at init time and  re‐
1320              use that data. Only makes sense if zero_buffers isn't specified,
1321              naturally. If data verification is  enabled,  refill_buffers  is
1322              also automatically enabled.
1323
1324       scramble_buffers=bool
1325              If  refill_buffers  is  too  costly and the target is using data
1326              deduplication, then setting this option will slightly modify the
1327              I/O  buffer  contents to defeat normal de-dupe attempts. This is
1328              not enough to defeat more clever block compression attempts, but
1329              it will stop naive dedupe of blocks. Default: true.
1330
1331       buffer_compress_percentage=int
1332              If this is set, then fio will attempt to provide I/O buffer con‐
1333              tent (on WRITEs) that compresses to  the  specified  level.  Fio
1334              does  this  by  providing a mix of random data followed by fixed
1335              pattern data. The fixed pattern is either zeros, or the  pattern
1336              specified  by  buffer_pattern.  If  the buffer_pattern option is
1337              used, it might skew the compression ratio slightly. Setting buf‐
1338              fer_compress_percentage  to a value other than 100 will also en‐
1339              able refill_buffers in order to reduce the likelihood that adja‐
1340              cent blocks are so similar that they over compress when seen to‐
1341              gether. See buffer_compress_chunk for how  to  set  a  finer  or
1342              coarser  granularity  of the random/fixed data regions. Defaults
1343              to unset i.e., buffer data will not adhere  to  any  compression
1344              level.
1345
1346       buffer_compress_chunk=int
1347              This  setting allows fio to manage how big the random/fixed data
1348              region  is  when  using  buffer_compress_percentage.  When  buf‐
1349              fer_compress_chunk  is  set  to some non-zero value smaller than
1350              the block size, fio can repeat the random/fixed region throughout
1351              the I/O buffer at the specified interval (which is particularly
1352              useful when bigger block sizes are used for a job). When set  to
1353              0, fio will use a chunk size that matches the block size result‐
1354              ing in a single random/fixed region within the I/O  buffer.  De‐
1355              faults  to  512.  When  the unit is omitted, the value is inter‐
1356              preted in bytes.
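
                  The interaction between buffer_compress_percentage and
                  buffer_compress_chunk can be pictured with a short Python
                  sketch. It is an assumption-laden illustration, not fio's
                  implementation: the function name build_buffer is
                  hypothetical, zeros stand in for the fixed pattern, and the
                  percentage is taken to be the share of each chunk filled with
                  the fixed pattern.

```python
import os


def build_buffer(block_size, compress_percentage, chunk=512):
    """Build one I/O buffer of random data followed by zeros in each chunk."""
    if chunk == 0 or chunk > block_size:
        chunk = block_size  # a single random/fixed region
    buf = bytearray()
    while len(buf) < block_size:
        this_chunk = min(chunk, block_size - len(buf))
        fixed_len = this_chunk * compress_percentage // 100
        buf += os.urandom(this_chunk - fixed_len)  # incompressible part
        buf += bytes(fixed_len)                    # fixed pattern: zeros
    return bytes(buf)


if __name__ == "__main__":
    data = build_buffer(4096, 50, chunk=512)
    print(len(data), "bytes,", data.count(0), "zero bytes")
```

                  With a 4096-byte block and a 512-byte chunk, the
                  random/fixed layout repeats eight times; with chunk set to
                  0 it appears once across the whole buffer.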
1357
1358       buffer_pattern=str
1359              If set, fio will fill the I/O buffers with this pattern or  with
1360              the  contents of a file. If not set, the contents of I/O buffers
1361              are defined by the other options related to buffer contents. The
1362              setting can be any pattern of bytes, and can be prefixed with 0x
1363              for hex values. It may also be a string, where the  string  must
1364              then be wrapped with "". Or it may also be a filename, where the
1365              filename must be wrapped with ''  in  which  case  the  file  is
1366              opened  and  read.  Note  that not all the file contents will be
1367              read if that would cause the buffers to overflow. So, for  exam‐
1368              ple:
1369
1370                     buffer_pattern='filename'
1371                     or:
1372                     buffer_pattern="abcd"
1373                     or:
1374                     buffer_pattern=-12
1375                     or:
1376                     buffer_pattern=0xdeadface
1377
1378              Also you can combine everything together in any order:
1379
1380                     buffer_pattern=0xdeadface"abcd"-12'filename'
1381
1382       dedupe_percentage=int
1383              If  set,  fio will generate this percentage of identical buffers
1384              when writing. These buffers will  be  naturally  dedupable.  The
1385              contents  of the buffers depend on what other buffer compression
1386              settings have been set. It's possible  to  have  the  individual
1387              buffers  either fully compressible, or not at all -- this option
1388              only controls the distribution of unique buffers.  Setting  this
1389              option  will  also enable refill_buffers to prevent every buffer
1390              being identical.
1391
1392       dedupe_mode=str
1393              If dedupe_percentage is given, then this option controls how fio
1394              generates the dedupe buffers.
1395
1396                     repeat
1397
1398                            Generate  dedupe  buffers  by  repeating  previous
1399                            writes
1400
1401                     working_set
1402
1403                            Generate dedupe buffers from working set
1404
1405              repeat is the default option for fio. Dedupe buffers are generated
1406              by repeating previous unique writes.
1407
1408              working_set  is  a  more  realistic workload.  With working_set,
1409              dedupe_working_set_percentage should be provided.   Given  that,
1410              fio  will  use  the  initial unique write buffers as its working
1411              set.  Upon deciding to dedupe, fio will randomly choose a buffer
1412              from the working set.  Note that with working_set the dedupe
1413              percentage converges to the desired value over time, whereas repeat
1414              maintains the desired percentage throughout the job.
1415
1416       dedupe_working_set_percentage=int
1417              If  dedupe_mode  is  set  to working_set, then this controls the
1418              percentage of the size of the file or device used as the buffers
1419              from which fio will choose when generating the dedupe buffers.
1420
1421              Note  that  size needs to be explicitly provided and only 1 file
1422              per job is supported.
1423
1424       dedupe_global=bool
1425              This controls whether the deduplication buffers will  be  shared
1426              amongst  all  jobs  that  have  this option set. The buffers are
1427              spread evenly between participating jobs.
1428
1429              Note that dedupe_mode must be set to  working_set  for  this  to
1430              work. Can be used in combination with compression.
1431
1432              invalidate=bool
1433                     Invalidate the buffer/page cache parts of the files to be
1434                     used prior to starting I/O if the platform and file  type
1435                     support  it.  Defaults  to true.  This will be ignored if
1436                     pre_read is also specified for the same job.
1437
1438              sync=str
1439                     Whether, and what type, of synchronous  I/O  to  use  for
1440                     writes.  The allowed values are:
1441
1442                            none   Do not use synchronous IO, the default.
1443
1444                            0      Same as none.
1445
1446                            sync   Use  synchronous  file IO. For the majority
1447                                   of I/O engines, this means using O_SYNC.
1448
1449                            1      Same as sync.
1450
1451                            dsync  Use synchronous data IO. For  the  majority
1452                                   of I/O engines, this means using O_DSYNC.
1453
1454              iomem=str, mem=str
1455                     Fio  can use various types of memory as the I/O unit buf‐
1456                     fer. The allowed values are:
1457
1458                            malloc Use memory from malloc(3) as  the  buffers.
1459                                   Default memory type.
1460
1461                            shm    Use shared memory as the buffers. Allocated
1462                                   through shmget(2).
1463
1464                            shmhuge
1465                                   Same as shm, but use huge pages as backing.
1466
1467                            mmap   Use mmap(2) to allocate buffers. May either
1468                                   be  anonymous memory, or can be file backed
1469                                   if a filename is given  after  the  option.
1470                                   The format is `mem=mmap:/path/to/file'.
1471
1472                            mmaphuge
1473                                   Use a memory mapped huge file as the buffer
1474                                   backing. Append  filename  after  mmaphuge,
1475                                   ala `mem=mmaphuge:/hugetlbfs/file'.
1476
1477                            mmapshared
1478                                   Same  as  mmap,  but  use  a  MAP_SHARED
1479                                   mapping.
1480
1481                            cudamalloc
1482                                   Use GPU memory as the buffers for GPUDirect
1483                                   RDMA benchmark.  The ioengine must be rdma.
1484
1485                     The  area  allocated is a function of the maximum allowed
1486                     bs size for the job, multiplied by the I/O  depth  given.
1487                     Note  that  for  shmhuge and mmaphuge to work, the system
1488                     must have free huge pages allocated. This can normally be
1489                     checked       and       set       by      reading/writing
1490                     `/proc/sys/vm/nr_hugepages' on a Linux  system.  Fio  as‐
1491                     sumes  a  huge page is 2 or 4MiB in size depending on the
1492                     platform. So to calculate the number of  huge  pages  you
1493                     need  for  a  given job file, add up the I/O depth of all
1494                     jobs (normally one unless iodepth is used)  and  multiply
1495                     by  the  maximum  bs set.  Then divide that number by the
1496                     huge page size. You can see the size of the huge pages in
1497                     `/proc/meminfo'. If no huge pages are allocated by having
1498                     a non-zero number in `nr_hugepages',  using  mmaphuge  or
1499                     shmhuge will fail. Also see hugepage-size.
1500
1501                     mmaphuge  also  needs  to  have hugetlbfs mounted and the
1502                     file location should point there. So if it's  mounted  in
1503                     `/huge', you would use `mem=mmaphuge:/huge/somefile'.
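
                         The huge page calculation described for shmhuge and
                         mmaphuge can be written out as a small Python helper.
                         This is a sketch under stated assumptions: the name
                         hugepages_needed is hypothetical, the default page
                         size of 2MiB is only one of the two sizes mentioned,
                         and rounding up to a whole page is an assumption.

```python
def hugepages_needed(iodepths, max_bs, hugepage_size=2 * 1024 * 1024):
    """Estimate the number of huge pages a job file needs (illustrative).

    iodepths: I/O depth of each job (1 for jobs that do not set iodepth)
    max_bs: largest block size set, in bytes
    hugepage_size: huge page size in bytes (see /proc/meminfo)
    """
    buffer_bytes = sum(iodepths) * max_bs
    # Rounding up is an assumption: a partially used page is still a page.
    return (buffer_bytes + hugepage_size - 1) // hugepage_size


if __name__ == "__main__":
    # Two jobs with iodepth=16 each and a maximum bs of 1MiB.
    print(hugepages_needed([16, 16], 1024 * 1024))
```

                         The result is the value to compare against
                         `/proc/sys/vm/nr_hugepages' before running the job.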
1504
1505              iomem_align=int, mem_align=int
1506                     This  indicates  the  memory  alignment of the I/O memory
1507                     buffers. Note that the given alignment is applied to the
1508                     first I/O unit buffer; if using iodepth, the alignment of
1509                     the following buffers is given by the bs used. In other
1510                     words, if using a bs that is a multiple of the page size
1511                     in the system, all buffers will be aligned to this value.
1512                     If  using a bs that is not page aligned, the alignment of
1513                     subsequent  I/O  memory  buffers  is  the  sum   of   the
1514                     iomem_align and bs used.
1515
1516              hugepage-size=int
1517                     Defines  the  size of a huge page. Must at least be equal
1518                     to the system setting, see `/proc/meminfo' and `/sys/ker‐
1519                     nel/mm/hugepages/'.  Defaults  to  2 or 4MiB depending on
1520                     the platform. Should probably always be a multiple of
1521                     megabytes, so using `hugepage-size=Xm' is the preferred
1522                     way to avoid setting a bad value that is not a power of 2.
1523
1524              lockmem=int
1525                     Pin the specified amount of memory with mlock(2). Can  be
1526                     used  to  simulate a smaller amount of memory. The amount
1527                     specified is per worker.
1528
1529   I/O size
1530       size=int[%|z]
1531              The total size of file I/O for each thread of this job. Fio will
1532              run  until  this many bytes have been transferred, unless runtime
1533              is limited by other options (such as runtime, for  instance,  or
1534              increased/decreased  by io_size).  Fio will divide this size be‐
1535              tween the available files determined by options such as nrfiles,
1536              filename, unless filesize is specified by the job. If the result
1537              of division happens to be 0, the size is  set  to  the  physical
1538              size  of  the given files or devices if they exist.  If this op‐
1539              tion is not specified, fio will use the full size of  the  given
1540              files or devices. If the files do not exist, size must be given.
1541              It is also possible to give size as a percentage between  1  and
1542              100.  If  `size=20%' is given, fio will use 20% of the full size
1543              of the given files or devices. In ZBD mode, size can be given in
1544              units  of number of zones using 'z'. Can be combined with offset
1545              to constrain the start and end  range  that  I/O  will  be  done
1546              within.
1547
1548       io_size=int[%|z], io_limit=int[%|z]
1549              Normally fio operates within the region set by size, which means
1550              that the size option sets both the region and size of I/O to  be
1551              performed.  Sometimes  that  is not what you want. With this op‐
1552              tion, it is possible to define just the amount of I/O  that  fio
1553              should  do. For instance, if size is set to 20GiB and io_size is
1554              set to 5GiB, fio will perform I/O within  the  first  20GiB  but
1555              exit  when 5GiB have been done. The opposite is also possible --
1556              if size is set to 20GiB, and io_size is set to 40GiB,  then  fio
1557              will  do  40GiB  of I/O within the 0..20GiB region. Value can be
1558              set as percentage: io_size=N%.  In this case io_size  multiplies
1559              size=  value.  In  ZBD  mode, value can also be set as number of
1560              zones using 'z'.
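
                  The relationship between size and io_size can be summarized
                  in a few lines of Python. This is only a sketch of the
                  arithmetic described for these two options; the function
                  name io_plan and its parameters are hypothetical.

```python
GIB = 1024 ** 3


def io_plan(size, io_size=None, io_size_percent=None):
    """Return (region_bytes, io_bytes) for one job (illustrative)."""
    if io_size is not None and io_size_percent is not None:
        raise ValueError("give io_size either in bytes or as a percentage")
    if io_size_percent is not None:
        # io_size=N% is taken as a percentage of the size= value.
        io_bytes = size * io_size_percent // 100
    elif io_size is not None:
        io_bytes = io_size
    else:
        # Without io_size, size sets both the region and the amount of I/O.
        io_bytes = size
    return size, io_bytes


if __name__ == "__main__":
    region, amount = io_plan(20 * GIB, io_size=40 * GIB)
    print(f"{amount // GIB} GiB of I/O within a {region // GIB} GiB region")
```

                  The region stays fixed by size in every case; only the
                  amount of I/O performed within it changes.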
1561
1562       filesize=irange(int)
1563              Individual file sizes. May be a range, in which  case  fio  will
1564              select  sizes for files at random within the given range. If not
1565              given, each created file is the same size. This option overrides
1566              size in terms of file size, i.e. size becomes merely the default
1567              for io_size (and has no effect at all if io_size is set
1568              explicitly).
1569
1570       file_append=bool
1571              Perform I/O after the end of the file. Normally fio will operate
1572              within the size of a file. If this option is set, then fio  will
1573              append  to the file instead. This has identical behavior to set‐
1574              ting offset to the size of a file. This  option  is  ignored  on
1575              non-regular files.
1576
1577       fill_device=bool, fill_fs=bool
1578              Sets  size  to  something  really large and waits for ENOSPC (no
1579              space left on device) or EDQUOT (disk  quota  exceeded)  as  the
1580              terminating  condition.  Only makes sense with sequential write.
1581              For a read workload, the mount point will be filled  first  then
1582              I/O started on the result.
1583
1584   I/O engine
1585       ioengine=str
1586              Defines  how the job issues I/O to the file. The following types
1587              are defined:
1588
1589                     sync   Basic read(2) or write(2) I/O. lseek(2) is used to
1590                            position  the  I/O location.  See fsync and fdata‐
1591                            sync for syncing write I/Os.
1592
1593                     psync  Basic pread(2) or pwrite(2) I/O.  Default  on  all
1594                            supported operating systems except for Windows.
1595
1596                     vsync  Basic  readv(2)  or  writev(2)  I/O.  Will emulate
1597                            queuing by coalescing adjacent I/Os into a  single
1598                            submission.
1599
1600                     pvsync Basic preadv(2) or pwritev(2) I/O.
1601
1602                     pvsync2
1603                            Basic preadv2(2) or pwritev2(2) I/O.
1604
1605                     io_uring
1606                            Fast Linux native asynchronous I/O. Supports async
1607                            IO for both direct and buffered IO.   This  engine
1608                            defines engine specific options.
1609
1610                     io_uring_cmd
1611                            Fast Linux native asynchronous I/O for passthrough
1612                            commands.  This engine defines engine specific op‐
1613                            tions.
1614
1615                     libaio Linux native asynchronous I/O. Note that Linux may
1616                            only support queued behavior with non-buffered I/O
1617                            (set `direct=1' or `buffered=0').  This engine de‐
1618                            fines engine specific options.
1619
1620                     posixaio
1621                            POSIX  asynchronous  I/O  using  aio_read(3)   and
1622                            aio_write(3).
1623
1624                     solarisaio
1625                            Solaris native asynchronous I/O.
1626
1627                     windowsaio
1628                            Windows  native  asynchronous I/O. Default on Win‐
1629                            dows.
1630
1631                     mmap   File is memory mapped with mmap(2) and data copied
1632                            to/from using memcpy(3).
1633
1634                     splice splice(2)  is  used  to  transfer the data and vm‐
1635                            splice(2) to transfer data from user space to  the
1636                            kernel.
1637
1638                     sg     SCSI  generic sg v3 I/O. May either be synchronous
1639                            using the SG_IO ioctl, or if the target is  an  sg
1640                            character  device  we use read(2) and write(2) for
1641                            asynchronous  I/O.  Requires  filename  option  to
1642                            specify  either  block  or character devices. This
1643                            engine supports trim operations. The sg engine in‐
1644                            cludes engine specific options.
1645
1646                     libzbc Read,  write,  trim  and  ZBC/ZAC  operations to a
1647                            zoned block device using libzbc library. The  tar‐
1648                            get  can  be  either  an  SG character device or a
1649                            block device file.
1650
1651                     null   Doesn't transfer any data, just pretends to.  This
1652                            is  mainly used to exercise fio itself and for de‐
1653                            bugging/testing purposes.
1654
1655                     net    Transfer over the network  to  given  `host:port'.
1656                            Depending  on  the  protocol  used,  the hostname,
1657                            port, listen and  filename  options  are  used  to
1658                            specify what sort of connection to make, while the
1659                            protocol option determines which protocol will  be
1660                            used. This engine defines engine specific options.
1661
1662                     netsplice
1663                            Like  net,  but  uses splice(2) and vmsplice(2) to
1664                            map data and send/receive.   This  engine  defines
1665                            engine specific options.
1666
1667                     cpuio  Doesn't  transfer  any  data, but burns CPU cycles
1668                            according to the cpuload,  cpuchunks  and  cpumode
1669                            options.   A job never finishes unless there is at
1670                            least one non-cpuio job.
1671
1672                            cpuload=85 will cause that job to do  nothing  but
1673                            burn 85% of the CPU.  In case of SMP machines, use
1674                            numjobs=<nr_of_cpu> to get desired CPU  usage,  as
1675                            the cpuload only loads a single CPU at the desired
1676                            rate.
1677
1678                            cpumode=qsort replaces the default noop instruction
1679                            loop with a qsort algorithm to consume more
1680                            energy.
1681
1682                     rdma   The RDMA I/O engine supports both RDMA memory  se‐
1683                            mantics  (RDMA_WRITE/RDMA_READ) and channel seman‐
1684                            tics (Send/Recv)  for  the  InfiniBand,  RoCE  and
1685                            iWARP  protocols.  This engine defines engine spe‐
1686                            cific options.
1687
1688                     falloc I/O engine that does regular fallocate to simulate
1689                            data transfer as fio ioengine.
1690                            DDIR_READ  does fallocate(,mode = FALLOC_FL_KEEP_SIZE,).
1691                            DDIR_WRITE does fallocate(,mode = 0).
1692                            DDIR_TRIM  does fallocate(,mode =
1693                            FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE).
1694
1695                     ftruncate
1696                            I/O  engine  that sends ftruncate(2) operations in
1697                            response to write (DDIR_WRITE) events. Each ftrun‐
1698                            cate  issued  sets  the file's size to the current
1699                            block offset. blocksize is ignored.
1700
1701                     e4defrag
1702                            I/O engine  that  does  regular  EXT4_IOC_MOVE_EXT
1703                            ioctls to simulate defragmentation activity in response
1704                            to DDIR_WRITE events.
1705
1706                     rados  I/O engine supporting direct access to Ceph  Reli‐
1707                            able  Autonomic  Distributed  Object Store (RADOS)
1708                            via librados. This ioengine  defines  engine  spe‐
1709                            cific options.
1710
1711                     rbd    I/O  engine supporting direct access to Ceph Rados
1712                            Block Devices (RBD) via librbd without the need to
1713                            use  the  kernel rbd driver. This ioengine defines
1714                            engine specific options.
1715
1716                     http   I/O  engine  supporting  GET/PUT   requests   over
1717                            HTTP(S)  with  libcurl to a WebDAV or S3 endpoint.
1718                            This ioengine defines engine specific options.
1719
1720                            This engine only supports direct IO of  iodepth=1;
1721                            you  need to scale this via numjobs. blocksize de‐
1722                            fines the size of the objects to be created.
1723
1724                            TRIM is translated to object deletion.
1725
1726                     gfapi  Uses the GlusterFS libgfapi sync interface for direct
1727                            access  to  GlusterFS volumes without having to go
1728                            through FUSE. This ioengine defines engine specific
1729                            options.
1730
1731                     gfapi_async
1732                            Uses the GlusterFS libgfapi async interface for direct
1733                            access to GlusterFS volumes without having  to  go
1734                            through FUSE. This ioengine defines engine specific
1735                            options.
1736
1737                     libhdfs
1738                            Read and write through Hadoop (HDFS). The filename
1739                            option  is  used  to specify host,port of the hdfs
1740                            name-node to connect. This engine interprets  off‐
1741                            sets  a  little  differently.  In HDFS, files once
1742                            created cannot be modified so  random  writes  are
1743                            not  possible.  To imitate this the libhdfs engine
1744                            expects a bunch of small files to be created  over
1745                            HDFS and will randomly pick a file from them based
1746                            on the offset generated by fio  backend  (see  the
1747                            example   job  file  to  create  such  files,  use
1748                            `rw=write' option). Please note, it may be  neces‐
1749                            sary  to  set  environment  variables to work with
1750                            HDFS/libhdfs properly. Each job uses its own  con‐
1751                            nection to HDFS.
1752
1753                     mtd    Read,  write  and  erase  an  MTD character device
1754                            (e.g.,  `/dev/mtd0').  Discards  are  treated   as
1755                            erases.  Depending  on the underlying device type,
1756                            the I/O may have to go in a certain pattern, e.g.,
1757                            on  NAND, writing sequentially to erase blocks and
1758                            discarding before overwriting. The trimwrite  mode
1759                            works well for this constraint.
1760
1761                     pmemblk
1762                            Read and write using filesystem DAX to a file on a
1763                            filesystem mounted with DAX on a persistent memory
1764                            device through the PMDK libpmemblk library.
1765
1766                     dev-dax
1767                            Read  and  write  using device DAX to a persistent
1768                            memory device (e.g., /dev/dax0.0) through the PMDK
1769                            libpmem library.
1770
1771                     external
1772                            Prefix  to  specify loading an external I/O engine
1773                            object file. Append the engine filename, e.g. `io‐
1774                            engine=external:/tmp/foo.o'   to   load   ioengine
1775                            `foo.o' in `/tmp'. The path can be either absolute
1776                            or  relative. See `engines/skeleton_external.c' in
1777                            the fio source for details of writing an  external
1778                            I/O engine.
1779
1780                     filecreate
1781                            Simply  create  the  files  and do no I/O to them.
1782                            You still need to set filesize so that all the ac‐
1783                            counting  still  occurs, but no actual I/O will be
1784                            done other than creating the file.
1785
1786                     filestat
1787                            Simply do stat() and do no I/O to  the  file.  You
1788                            need  to  set  'filesize'  and  'nrfiles', so that
1789                            files will be created.  This engine is to  measure
1790                            file lookup and metadata access.
1791
1792                     filedelete
1793                            Simply  delete  files by unlink() and do no I/O to
1794                            the file. You need  to  set  'filesize'  and  'nr‐
1795                            files',  so  that files will be created.  This en‐
1796                            gine is to measure file delete.
1797
1798                     libpmem
1799                            Read and write using mmap  I/O  to  a  file  on  a
1800                            filesystem mounted with DAX on a persistent memory
1801                            device through the PMDK libpmem library.
1802
1803                     ime_psync
1804                            Synchronous read and write  using  DDN's  Infinite
1805                            Memory Engine (IME). This engine is very basic and
1806                            issues calls to IME whenever an IO is queued.
1807
1808                     ime_psyncv
1809                            Synchronous read and write  using  DDN's  Infinite
1810                            Memory  Engine  (IME). This engine uses iovecs and
1811                            will try to stack as many IOs as possible (if  the
1812                            IOs  are  "contiguous" and the IO depth is not ex‐
1813                            ceeded) before issuing a call to IME.
1814
1815                     ime_aio
1816                            Asynchronous read and write using  DDN's  Infinite
1817                            Memory Engine (IME). This engine will try to stack
1818                            as many IOs as possible by creating  requests  for
1819                            IME.   FIO  will  then decide when to commit these
1820                            requests.
1821
1822                     libiscsi
1823                            Read and write an iSCSI LUN with libiscsi.
1824
1825                     nbd    Synchronous read and write a Network Block  Device
1826                            (NBD).
1827
1828                     libcufile
1829                            I/O engine supporting libcufile synchronous access
1830                            to nvidia-fs  and  a  GPUDirect  Storage-supported
1831                            filesystem.   This  engine  performs  I/O  without
1832                            transferring buffers between  user-space  and  the
1833                            kernel,  unless verify is set or cuda_io is posix.
1834                            iomem must not be cudamalloc.  This  ioengine  de‐
1835                            fines engine specific options.
1836
1837                     dfs    I/O  engine supporting asynchronous read and write
1838                            operations to  the  DAOS  File  System  (DFS)  via
1839                            libdfs.
1840
1841                     nfs    I/O  engine supporting asynchronous read and write
1842                            operations to NFS filesystems from  userspace  via
1843                            libnfs.  This  is useful for achieving higher con‐
1844                            currency and thus throughput than is possible  via
1845                            kernel NFS.
1846
1847                     exec   Execute third-party tools. Could be used to perform
1848                            monitoring during a job's runtime.
1849
1850                     xnvme  I/O engine using the xNVMe C  API,  for  NVMe  de‐
1851                            vices.  The  xnvme  engine provides flexibility to
1852                            access GNU/Linux Kernel NVMe  driver  via  libaio,
1853                            IOCTLs,  io_uring,  the  SPDK NVMe driver, or your
1854                            own custom NVMe driver. The xnvme engine  includes
1855                            engine specific options. (See https://xnvme.io/).
1856
1857   I/O engine specific parameters
1858       In addition, there are some parameters which are only valid when a spe‐
1859       cific ioengine is in use. These are used identically to normal  parame‐
1860       ters,  with  the  caveat  that when used on the command line, they must
1861       come after the ioengine that defines them is selected.
1862
1863       (io_uring,libaio)cmdprio_percentage=int[,int]
1864              Set the percentage of I/O that will be issued with  the  highest
1865              priority.   Default:  0.  A  single  value  applies to reads and
1866              writes. Comma-separated values may be specified  for  reads  and
1867              writes.  For  this  option to be effective, NCQ priority must be
1868              supported and enabled, and `direct=1' option must be  used.  fio
1869              must  also  be run as the root user. Unlike slat/clat/lat stats,
1870              which can be tracked and reported  independently,  per  priority
1871              stats only track and report a single type of latency. By
1872              default, completion latency (clat) is reported; if
1873              lat_percentiles is set, total latency (lat) is reported instead.
1874
1875       (io_uring,libaio)cmdprio_class=int[,int]
1876              Set  the  I/O priority class to use for I/Os that must be issued
1877              with a priority when cmdprio_percentage  or  cmdprio_bssplit  is
1878              set.   If  not specified when cmdprio_percentage or cmdprio_bss‐
1879              plit is set, this defaults to the highest priority class. A sin‐
1880              gle  value  applies  to reads and writes. Comma-separated values
1881              may be specified for reads and writes. See  man  ionice(1).  See
1882              also the prioclass option.
1883
1884       (io_uring,libaio)cmdprio=int[,int]
1885              Set  the  I/O priority value to use for I/Os that must be issued
1886              with a priority when cmdprio_percentage  or  cmdprio_bssplit  is
1887              set.   If  not specified when cmdprio_percentage or cmdprio_bss‐
1888              plit is set, this defaults to 0. Linux limits us to  a  positive
1889              value  between 0 and 7, with 0 being the highest. A single value
1890              applies to reads and  writes.   Comma-separated  values  may  be
1891              specified  for  reads and writes. See man ionice(1). Refer to an
1892              appropriate manpage for other operating systems since the  mean‐
1893              ing of priority may differ. See also the prio option.
1894
1895       (io_uring,libaio)cmdprio_bssplit=str[,str]
1896              To  get  a  finer  control over I/O priority, this option allows
1897              specifying the percentage of IOs that must have a  priority  set
1898              depending  on  the  block  size of the IO. This option is useful
1899              only when used together with the option bssplit, that is, multi‐
1900              ple different block sizes are used for reads and writes.
1901
1902              The  first  accepted  format  for this option is the same as the
1903              format of the bssplit option:
1904
1905                     cmdprio_bssplit=blocksize/percentage:blocksize/percentage
1906
1907              In this case, each entry will use the priority class and  prior‐
1908              ity  level  defined by the options cmdprio_class and cmdprio re‐
1909              spectively.
1910
1911              The second accepted format for this option is:
1912
1913                     cmdprio_bssplit=blocksize/percentage/class/level:block‐
1914                     size/percentage/class/level
1915
1916              In this case, the priority class and priority level are defined
1917              inside each entry. In comparison with the first accepted format,
1918              the second accepted format does not restrict all entries to have
1919              the same priority class and priority level.
1920
1921              For both formats, only the read and write data directions are
1922              supported; values for trim IOs are ignored. This option is
1923              mutually exclusive with the cmdprio_percentage option.
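
                  As an illustration of the two formats, a minimal job file
                  sketch might look as follows. The device path /dev/sdX,
                  the job names, and all sizes and percentages are
                  placeholders chosen for this example. The sketch assumes
                  a device with NCQ priority enabled and fio run as root.

                  ```ini
                  ; Sketch only: /dev/sdX and all values are placeholders.
                  [global]
                  ioengine=libaio
                  direct=1
                  filename=/dev/sdX
                  rw=randread
                  iodepth=16
                  time_based
                  runtime=30
                  ; 60% of I/Os use 4k blocks, 40% use 64k blocks
                  bssplit=4k/60:64k/40

                  [first-format]
                  ; class and level come from cmdprio_class and cmdprio
                  ; (class 1 is the realtime class, see ionice(1))
                  cmdprio_class=1
                  cmdprio=0
                  ; every 4k I/O gets a priority, 64k I/Os get none
                  cmdprio_bssplit=4k/100:64k/0

                  [second-format]
                  stonewall
                  ; class and level are given inside each entry
                  cmdprio_bssplit=4k/50/1/0:64k/20/2/4
                  ```

                  In the second job, half of the 4k I/Os are issued with
                  class 1, level 0, and a fifth of the 64k I/Os with
                  class 2, level 4.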
1924
1925       (io_uring,io_uring_cmd)fixedbufs
1926              If fio is asked to do direct IO, then Linux will map  pages  for
1927              each  IO  call, and release them when IO is done. If this option
1928              is set, the pages are pre-mapped  before  IO  is  started.  This
1929              eliminates  the  need  to  map and release for each IO.  This is
1930              more efficient, and reduces the IO latency as well.
1931
1932       (io_uring,io_uring_cmd)nonvectored=int
1933              With this option, fio will use non-vectored read/write commands,
1934              where address must contain the address directly. Default is -1.
1935
1936       (io_uring,io_uring_cmd)force_async
1937              Normal operation for io_uring is to try to issue an sqe as
1938              non-blocking first and, if that fails, execute it in an async
1939              manner. With this option set to N, fio will ask for every Nth
1940              sqe to be issued in an async manner. Default is 0.
1941
1942       (io_uring,io_uring_cmd,xnvme)hipri
1943              If this option is set, fio will attempt to use polled IO
1944              completions. Normal IO completions generate interrupts to signal
1945              the completion of IO; polled completions do not, so they
1946              require active reaping by the application. The benefits are more
1947              efficient IO for high IOPS scenarios, and  lower  latencies  for
1948              low queue depth IO.
1949
1950       (io_uring,io_uring_cmd)registerfiles
1951              With this option, fio registers the set of files being used with
1952              the kernel.  This avoids the overhead of managing file counts in
1953              the  kernel,  making  the  submission  and  completion part more
1954              lightweight. Required for the below sqthread_poll option.
1955
1956       (io_uring,io_uring_cmd,xnvme)sqthread_poll
1957              Normally fio will submit IO by issuing a system call  to  notify
1958              the  kernel of available items in the SQ ring. If this option is
1959              set, the act of submitting IO will be done by a  polling  thread
1960              in  the kernel. This frees up cycles for fio, at the cost of us‐
1961              ing more CPU in the system.
1962
1963       (io_uring,io_uring_cmd)sqthread_poll_cpu=int
1964              When `sqthread_poll` is set, this option provides a way  to  de‐
1965              fine which CPU should be used for the polling thread.
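
                  The io_uring options above can be combined in one job
                  file. The following is a minimal sketch, not a tuned
                  configuration: the file path, the CPU number, and the
                  size and queue depth are assumptions made for
                  illustration.

                  ```ini
                  ; Sketch only: path, CPU number and sizes are placeholders.
                  [global]
                  ioengine=io_uring
                  direct=1
                  ; pre-map the I/O buffers once instead of per I/O
                  fixedbufs=1
                  ; register the file set (required for sqthread_poll)
                  registerfiles=1
                  ; let a kernel thread submit I/O, pinned to CPU 2
                  sqthread_poll=1
                  sqthread_poll_cpu=2

                  [uring-randread]
                  filename=/path/to/testfile
                  size=1g
                  rw=randread
                  bs=4k
                  iodepth=32
                  ```

                  The engine-specific options are placed after ioengine so
                  that the same ordering also works when the options are
                  given on the command line.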
1966
1967       (io_uring_cmd)cmd_type=str
1968              Specifies the type of uring passthrough command to be used. Sup‐
1969              ported value is nvme. Default is nvme.
1970
1971       (libaio)userspace_reap
1972              Normally, with the libaio  engine  in  use,  fio  will  use  the
1973              io_getevents(3)  system call to reap newly returned events. With
1974              this flag turned on, the AIO ring will  be  read  directly  from
1975              user-space to reap events. The reaping mode is only enabled when
1976              polling for a minimum of 0 events (e.g. when `iodepth_batch_com‐
1977              plete=0').
1978
1979       (pvsync2)hipri
1980              Set  RWF_HIPRI  on  I/O,  indicating  to the kernel that it's of
1981              higher priority than normal.
1982
1983       (pvsync2)hipri_percentage
1984              When hipri is set this determines the probability of  a  pvsync2
1985              I/O being high priority. The default is 100%.
1986
1987       (pvsync2,libaio,io_uring,io_uring_cmd)nowait=bool
1988              By default if a request cannot be executed immediately (e.g. re‐
1989              source starvation, waiting on locks) it is queued and the initi‐
1990              ating  process  will  be blocked until the required resource be‐
1991              comes free.  This option sets  the  RWF_NOWAIT  flag  (supported
1992              from  the  4.14 Linux kernel) and the call will return instantly
1993              with EAGAIN or a partial result rather than waiting.
1994
1995              It is useful to also use ignore_error=EAGAIN with this option.
1996              Note: glibc 2.27 and 2.28 have a bug in the preadv2 and pwritev2
1997              syscall wrappers: they return EOPNOTSUPP instead of EAGAIN.
1998
1999              For cached I/O, using this option usually means a request
2000              operates only with cached data. Currently the RWF_NOWAIT flag is
2001              not supported for cached writes. For direct I/O, requests will
2002              only  succeed  if cache invalidation isn't required, file blocks
2003              are fully allocated and the disk request could be issued immedi‐
2004              ately.
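
                  A minimal sketch of a job using this option together
                  with ignore_error, as suggested above. The file path,
                  job name, and sizes are placeholders for illustration.

                  ```ini
                  ; Sketch only: path and sizes are placeholders.
                  [nowait-read]
                  ioengine=pvsync2
                  filename=/path/to/testfile
                  size=256m
                  rw=randread
                  bs=4k
                  ; return EAGAIN instead of blocking
                  nowait=1
                  ; do not abort the job when a read returns EAGAIN
                  ignore_error=EAGAIN
                  ```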
2005
2006       (cpuio)cpuload=int
2007              Attempt to use the specified percentage of CPU cycles. This is a
2008              mandatory option when using cpuio I/O engine.
2009
2010       (cpuio)cpuchunks=int
2011              Split the load into cycles of the given time. In microseconds.
2012
2013       (cpuio)cpumode=str
2014              Specify how to stress the CPU. It can take these two values:
2015
2016                     noop   This is the default and directs the CPU to execute
2017                            noop instructions.
2018
2019                     qsort  Replace the default noop instructions with a qsort
2020                            algorithm to consume more energy.
2021
2022       (cpuio)exit_on_io_done=bool
2023              Detect when I/O threads are done, then exit.
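
                  The cpuio options can be put together as in the
                  following sketch. The load percentage, chunk length,
                  and runtime are arbitrary values chosen for
                  illustration.

                  ```ini
                  ; Sketch only: all values are illustrative.
                  [burn-cpu]
                  ioengine=cpuio
                  ; aim for roughly half of the CPU cycles
                  cpuload=50
                  ; work in cycles of 50000 microseconds
                  cpuchunks=50000
                  ; use the qsort workload instead of noop instructions
                  cpumode=qsort
                  time_based
                  runtime=10
                  ```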
2024
2025       (libhdfs)namenode=str
2026              The hostname or IP address of a HDFS cluster  namenode  to  con‐
2027              tact.
2028
2029       (libhdfs)port=int
2030              The listening port of the HDFS cluster namenode.
2031
2032       (netsplice,net)port=int
2033              The  TCP  or  UDP port to bind to or connect to. If this is used
2034              with numjobs to spawn multiple instances of the same  job  type,
2035              then  this will be the starting port number since fio will use a
2036              range of ports.
2037
2038       (rdma,librpma_*)port=int
2039              The port to use for RDMA-CM communication. This  should  be  the
2040              same value on the client and the server side.
2041
2042       (netsplice,net,rdma)hostname=str
2043              The  hostname or IP address to use for TCP, UDP or RDMA-CM based
2044              I/O.  If the job is a TCP listener or UDP reader,  the  hostname
2045              is  not used and must be omitted unless it is a valid UDP multi‐
2046              cast address.
2047
2048       (librpma_*)serverip=str
2049              The IP address to be used for RDMA-CM based I/O.
2050
2051       (librpma_*_server)direct_write_to_pmem=bool
2052              Set to 1 only when Direct Write to PMem from the remote host  is
2053              possible. Otherwise, set to 0.
2054
2055       (librpma_*_server)busy_wait_polling=bool
2056              Set  to  0  to  wait for completion instead of busy-wait polling
2057              completion.  Default: 1.
2058
2059       (netsplice,net)interface=str
2060              The IP address of the network interface used to send or  receive
2061              UDP multicast.
2062
2063       (netsplice,net)ttl=int
2064              Time-to-live  value for outgoing UDP multicast packets. Default:
2065              1.
2066
2067       (netsplice,net)nodelay=bool
2068              Set TCP_NODELAY on TCP connections.
2069
2070       (netsplice,net)protocol=str, proto=str
2071              The network protocol to use. Accepted values are:
2072
2073                     tcp    Transmission control protocol.
2074
2075                     tcpv6  Transmission control protocol V6.
2076
2077                     udp    User datagram protocol.
2078
2079                     udpv6  User datagram protocol V6.
2080
2081                     unix   UNIX domain socket.
2082
2083              When the protocol is TCP or UDP, the port must also be given, as
2084              must the hostname unless the job is a TCP listener or UDP reader.
2085              For unix sockets, the normal filename option should be used  and
2086              the port is invalid.
2087
2088       (netsplice,net)listen
2089              For  TCP  network  connections,  tell fio to listen for incoming
2090              connections rather than initiating an outgoing  connection.  The
2091              hostname must be omitted if this option is used.
2092
2093       (netsplice,net)pingpong
2094              Normally a network writer will just continue writing data, and a
2095              network reader will just consume packets.  If  `pingpong=1'  is
2096              set,  a  writer will send its normal payload to the reader, then
2097              wait for the reader to send the same payload back.  This  allows
2098              fio  to measure network latencies. The submission and completion
2099              latencies then measure local time spent  sending  or  receiving,
2100              and  the  completion  latency  measures how long it took for the
2101              other end to receive and send back. For  UDP  multicast  traffic
2102              `pingpong=1'  should only be set for a single reader when multi‐
2103              ple readers are listening to the same address.
2104
2105       (netsplice,net)window_size=int
2106              Set the desired socket buffer size for the connection.
2107
2108       (netsplice,net)mss=int
2109              Set the TCP maximum segment size (TCP_MAXSEG).
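
                  The network options above might be combined as in this
                  sketch of a TCP receiver and sender running on one
                  machine. The port number, transfer size, and job names
                  are assumptions made for illustration.

                  ```ini
                  ; Sketch only: port, size and job names are placeholders.
                  [global]
                  ioengine=net
                  protocol=tcp
                  port=8888
                  bs=4k
                  size=100m
                  ; echo each payload back so that latency can be measured
                  pingpong=1

                  [receiver]
                  ; listen for a connection; no hostname is given here
                  listen
                  rw=read

                  [sender]
                  hostname=localhost
                  ; give the receiver time to start listening
                  startdelay=1
                  rw=write
                  ```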
2110
2111       (e4defrag)donorname=str
2112              File will be used as a block donor (swap extents between files).
2113
2114       (e4defrag)inplace=int
2115              Configure donor file blocks allocation strategy:
2116
2117                     0      Default. Preallocate donor's file on init.
2118
2119                     1      Allocate  space  immediately   inside   defragment
2120                            event, and free right after event.
2121
2122       (rbd,rados)clustername=str
2123              Specifies the name of the Ceph cluster.
2124
2125       (rbd)rbdname=str
2126              Specifies the name of the RBD.
2127
2128       (rbd,rados)pool=str
2129              Specifies  the  name  of  the  Ceph pool containing RBD or RADOS
2130              data.
2131
2132       (rbd,rados)clientname=str
2133              Specifies the username (without the 'client.'  prefix)  used  to
2134              access  the  Ceph  cluster. If the clustername is specified, the
2135              clientname shall be the full *type.id* string. If no type.  pre‐
2136              fix is given, fio will add 'client.'  by default.
2137
2138       (rados)conf=str
2139              Specifies the path of the Ceph cluster configuration file, so it
2140              does not have to be /etc/ceph/ceph.conf.
2141
2142       (rbd,rados)busy_poll=bool
2143              Poll the store instead of waiting for completion. This usually
2144              provides better throughput at the cost of higher (up to 100%)
2145              CPU utilization.
2146
2147       (rados)touch_objects=bool
2148              During initialization, touch (create if they do not exist) all
2149              objects (files). Touching all objects affects Ceph caches and
2150              likely impacts test results. Enabled by default.
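
                  A minimal sketch of an rbd job using the options above.
                  The client, pool, and image names are placeholders, and
                  the sketch assumes that the RBD image already exists in
                  the pool.

                  ```ini
                  ; Sketch only: client, pool and image names are placeholders.
                  [global]
                  ioengine=rbd
                  ; user name without the 'client.' prefix
                  clientname=admin
                  pool=rbd
                  rbdname=fio_test
                  rw=randwrite
                  bs=4k

                  [rbd-iodepth32]
                  iodepth=32
                  ```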
2151
2152       (http)http_host=str
2153              Hostname to connect to. For S3, this could be the  bucket  name.
2154              Default is localhost.
2155
2156       (http)http_user=str
2157              Username for HTTP authentication.
2158
2159       (http)http_pass=str
2160              Password for HTTP authentication.
2161
2162       (http)https=str
2163              Whether  to  use  HTTPS instead of plain HTTP. on enables HTTPS;
2164              insecure will enable HTTPS, but disable  SSL  peer  verification
2165              (use with caution!).  Default is off.
2166
2167       (http)http_mode=str
2168              Which  HTTP access mode to use: webdav, swift, or s3. Default is
2169              webdav.
2170
2171       (http)http_s3_region=str
2172              The S3 region/zone to include in the  request.  Default  is  us-
2173              east-1.
2174
2175       (http)http_s3_key=str
2176              The S3 secret key.
2177
2178       (http)http_s3_keyid=str
2179              The S3 key/access id.
2180
2181       (http)http_s3_sse_customer_key=str
2182              The customer-provided key for server-side encryption (SSE).
2183
2184       (http)http_s3_sse_customer_algorithm=str
2185              The algorithm for server-side encryption (SSE) with a
2186              customer-provided key. Default is AES256.
2187
2188       (http)http_s3_storage_class=str
2189              Which storage class to access. User-customizable setting.
2190              Default is STANDARD.
2191
2192       (http)http_swift_auth_token=str
2193              The  Swift auth token. See the example configuration file on how
2194              to retrieve this.
2195
2196       (http)http_verbose=int
2197              Enable verbose requests from libcurl. Useful  for  debugging.  1
2198              turns  on  verbose  logging from libcurl, 2 additionally enables
2199              HTTP IO tracing.  Default is 0.
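
                  For S3 access, the http options might be combined as in
                  the following sketch. The host, region, bucket and
                  object path, and the environment variable names S3_KEY
                  and S3_ID are assumptions made for illustration.

                  ```ini
                  ; Sketch only: host, region, object path and the
                  ; environment variable names are placeholders.
                  [global]
                  ioengine=http
                  https=on
                  http_mode=s3
                  http_host=s3.eu-central-1.amazonaws.com
                  http_s3_region=eu-central-1
                  ; credentials are taken from the environment
                  http_s3_key=${S3_KEY}
                  http_s3_keyid=${S3_ID}
                  http_verbose=0
                  filename=/my-test-bucket/object
                  bs=1m
                  size=16m

                  [s3-write]
                  rw=write
                  ```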
2200
2201       (mtd)skip_bad=bool
2202              Skip operations against known bad blocks.
2203
2204       (libhdfs)hdfsdirectory
2205              libhdfs will create chunks in this HDFS directory.
2206
2207       (libhdfs)chunk_size
2208              The size of the chunk to use for each file.
2209
2210       (rdma)verb=str
2211              The RDMA verb to use on this side of the RDMA  ioengine  connec‐
2212              tion.  Valid values are write, read, send and recv. These corre‐
2213              spond to the equivalent RDMA  verbs  (e.g.  write  =  rdma_write
2214              etc.).  Note  that this only needs to be specified on the client
2215              side of the connection. See the examples folder.
2216
2217       (rdma)bindname=str
2218              The name to use to bind the local RDMA-CM connection to a  local
2219              RDMA device. This could be a hostname or an IPv4 or IPv6
2220              address. On the server side this will be passed into the
2221              rdma_bind_addr() function, and on the client side it will be
2222              used in the rdma_resolve_addr() function. This can be useful
2223              when multiple paths exist between the client and the server or
2224              in certain loopback configurations.
2225
2226       (filestat)stat_type=str
2227              Specify stat system call type to measure lookup/getattr  perfor‐
2228              mance.  Default is stat for stat(2).
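
                  A minimal sketch of a filestat job. The directory, file
                  count, and file size are placeholders chosen for
                  illustration.

                  ```ini
                  ; Sketch only: directory and counts are placeholders.
                  [stat-files]
                  ioengine=filestat
                  directory=/path/to/testdir
                  ; filesize and nrfiles are needed so that files are created
                  filesize=4k
                  nrfiles=200
                  ; measure stat(2), which is also the default
                  stat_type=stat
                  ```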
2229
2230       (sg)hipri
2231              If this option is set, fio will attempt to use polled IO comple‐
2232              tions. This will have a similar effect as (io_uring)hipri.  Only
2233              SCSI  READ  and WRITE commands will have the SGV4_FLAG_HIPRI set
2234              (not UNMAP (trim) nor VERIFY).  Older versions of the  Linux  sg
2235              driver  that  do  not support hipri will simply ignore this flag
2236              and do normal IO. The Linux SCSI Low  Level  Driver  (LLD)  that
2237              "owns"  the  device  also  needs to support hipri (also known as
2238              iopoll and mq_poll). The MegaRAID driver is an example of a SCSI
2239              LLD. Default: clear (0), which does normal (interrupt-based)
2240              IO.
2241
2242       (sg)readfua=bool
2243              With readfua option set to 1, read operations include the  force
2244              unit access (fua) flag. Default: 0.
2245
2246       (sg)writefua=bool
2247              With  writefua  option  set  to  1, write operations include the
2248              force unit access (fua) flag. Default: 0.
2249
2250       (sg)sg_write_mode=str
2251              Specify the type of write commands to  issue.  This  option  can
2252              take multiple values:
2253
2254                     write (default)
2255                            Write opcodes are issued as usual
2256
2257                     write_and_verify
2258                            Issue WRITE AND VERIFY commands. The BYTCHK bit is
2259                            set to 00b. This directs the device to carry out a
2260                            medium  verification  with  no data comparison for
2261                            the data that was written. The writefua option  is
2262                            ignored with this selection.
2263
2264                     verify This  option  is  deprecated. Use write_and_verify
2265                            instead.
2266
2267                     write_same
2268                            Issue WRITE SAME commands. This transfers a single
2269                            block  to the device and writes this same block of
2270                            data to a contiguous sequence of LBAs beginning at
2271                            the  specified  offset. fio's block size parameter
2272                            specifies the amount of  data  written  with  each
2273                            command.  However,  the  amount  of  data actually
2274                            transferred to the device is equal to the device's
2275                            block  (sector)  size.  For a device with 512 byte
2276                            sectors, blocksize=8k will write 16  sectors  with
2277                            each  command.  fio will still generate 8k of data
2278                            for each command but only the first 512 bytes will
2279                            be  used and transferred to the device. The write‐
2280                            fua option is ignored with this selection.
2281
2282                     same   This option is deprecated. Use write_same instead.
2283
2284                     write_same_ndob
2285                            Issue WRITE SAME(16) commands as  above  but  with
2286                            the  No Data Output Buffer (NDOB) bit set. No data
2287                            will be transferred to the device  with  this  bit
2288                            set. Data written will be a pre-determined pattern
2289                            such as all zeroes.
2290
2291                     write_stream
2292                            Issue WRITE STREAM(16) commands. Use the stream_id
2293                            option to specify the stream identifier.
2294
2295                     verify_bytchk_00
2296                            Issue  VERIFY commands with BYTCHK set to 00. This
2297                            directs the device to carry out a medium verifica‐
2298                            tion with no data comparison.
2299
2300                     verify_bytchk_01
2301                            Issue  VERIFY commands with BYTCHK set to 01. This
2302                            directs the device to compare the data on the  de‐
2303                            vice with the data transferred to the device.
2304
2305                     verify_bytchk_11
2306                            Issue  VERIFY commands with BYTCHK set to 11. This
2307                            transfers a single block to the  device  and  com‐
2308                            pares  the contents of this block with the data on
2309                            the device  beginning  at  the  specified  offset.
2310                            fio's  block  size  parameter  specifies the total
2311                            amount of data compared with  this  command.  How‐
2312                            ever,  only  one  block  (sector) worth of data is
2313                            transferred to the device. This is similar to  the
2314                            WRITE  SAME  command  except that data is compared
2315                            instead of written.
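
                  As an illustration of sg_write_mode, the following
                  sketch issues WRITE SAME commands. The device path
                  /dev/sgX and the sizes are placeholders. Running such
                  a job overwrites data on the device.

                  ```ini
                  ; Sketch only: /dev/sgX and sizes are placeholders.
                  ; This job destroys data on the target device.
                  [write-same]
                  ioengine=sg
                  filename=/dev/sgX
                  rw=write
                  ; with 512 byte sectors, each command covers 16 sectors
                  bs=8k
                  size=64m
                  sg_write_mode=write_same
                  ```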
2316
2317       (sg)stream_id=int
2318              Set the stream identifier for WRITE STREAM commands. If this  is
2319              set  to 0 (which is not a valid stream identifier) fio will open
2320              a stream and then close it when done. Default is 0.
2321
2322       (nbd)uri=str
2323              Specify the NBD URI of the server to  test.   The  string  is  a
2324              standard   NBD   URI   (see   https://github.com/NetworkBlockDe‐
2325              vice/nbd/tree/master/doc).  Example URIs:
2326
2327                     nbd://localhost:10809
2328
2329                     nbd+unix:///?socket=/tmp/socket
2330
2331                     nbds://tlshost/exportname
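
                  A minimal sketch of an nbd job using the first example
                  URI. It assumes that an NBD server is listening on
                  localhost port 10809; the remaining values are
                  illustrative.

                  ```ini
                  ; Sketch only: assumes an NBD server on localhost:10809.
                  [global]
                  ioengine=nbd
                  uri=nbd://localhost:10809
                  rw=randrw
                  bs=4k
                  iodepth=64
                  time_based
                  runtime=30

                  [nbd-job]
                  ```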
2332
2333       (libcufile)gpu_dev_ids=str
2334              Specify the GPU IDs to use with CUDA. This is a  colon-separated
2335              list of integers. GPUs are assigned to workers round-robin. Default
2336              is 0.
2337
2338       (libcufile)cuda_io=str
2339              Specify the type of I/O to use with CUDA. This option takes  the
2340              following values:
2341
2342                     cufile (default)
2343                            Use  libcufile and nvidia-fs. This option performs
2344                            I/O directly between a GPUDirect Storage  filesys‐
2345                            tem and GPU buffers, avoiding use of a bounce buf‐
2346                            fer. If verify is set, cudaMemcpy is used to  copy
2347                            verification data between RAM and GPU(s).  Verifi‐
2348                            cation data is copied from RAM  to  GPU  before  a
2349                            write  and  from  GPU to RAM after a read.  direct
2350                            must be 1.
2351
2352                     posix  Use POSIX to perform I/O with a  RAM  buffer,  and
2353                            use  cudaMemcpy  to  transfer data between RAM and
2354                            the GPU(s).  Data is copied from GPU to RAM before
2355                            a  write  and copied from RAM to GPU after a read.
2356                            verify does not affect the use of cudaMemcpy.
2357
2358       (dfs)pool
2359              Specify the label or UUID of the DAOS pool to connect to.
2360
2361       (dfs)cont
2362              Specify the label or UUID of the DAOS container to open.
2363
2364       (dfs)chunk_size
2365              Specify a different chunk size (in bytes) for the dfs file.  Use
2366              DAOS container's chunk size by default.
2367
2368       (dfs)object_class
2369              Specify  a  different  object  class for the dfs file.  Use DAOS
2370              container's object class by default.
2371
2372       (nfs)nfs_url
2373              URL in libnfs format, e.g.
2374              nfs://<server|ipv4|ipv6>/path[?arg=val[&arg=val]*] (refer to the
2375              libnfs README for more details).
2376
2377       (exec)program=str
2378              Specify the program to execute.  Note the program will receive a
2379              SIGTERM  when  the job is reaching the time limit.  A SIGKILL is
2380              sent once the job is over. The delay between the two signals  is
2381              defined by grace_time option.
2382
2383       (exec)arguments=str
2384              Specify  arguments  to  pass to program.  Some special variables
2385              can be expanded to pass fio's job details to the program:
2386
2387                     %r     replaced by the duration of the job in seconds
2388
2389                     %n     replaced by the name of the job
2390
2391       (exec)grace_time=int
2392              Defines the time between the SIGTERM and  SIGKILL  signals.  De‐
2393              fault is 1 second.
2394
2395       (exec)std_redirect=bool
2396              If  set, stdout and stderr streams are redirected to files named
2397              from the job name. Default is true.
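
                  The exec options might be used as in the following
                  sketch, which runs sleep for the duration of the job.
                  The choice of /bin/sleep, the job name, and the time
                  values are assumptions made for illustration.

                  ```ini
                  ; Sketch only: program and time values are placeholders.
                  [global]
                  time_based
                  runtime=10

                  [run-sleep]
                  ioengine=exec
                  program=/bin/sleep
                  ; %r expands to the job duration in seconds
                  arguments=%r
                  ; wait 2 seconds between SIGTERM and SIGKILL
                  grace_time=2
                  ; redirect stdout and stderr to files named after the job
                  std_redirect=1
                  ```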
2398
2399       (xnvme)xnvme_async=str
2400              Select the xnvme async command interface. This  can  take  these
2401              values.
2402
2403                     emu    This is the default. It emulates asynchronous I/O
2404                            by using a single thread to create a queue pair on
2405                            top of a synchronous I/O interface using the NVMe
2406                            driver IOCTL.
2407
2408                     thrpool
2409                            Emulate an asynchronous I/O interface with a  pool
2410                            of  userspace  threads on top of a synchronous I/O
2411                            interface using the NVMe driver IOCTL. By  default
2412                            four threads are used.
2413
2414                     io_uring
2415                            Linux native asynchronous I/O interface which sup‐
2416                            ports both direct and buffered I/O.
2417
2418                     libaio Use Linux aio for asynchronous I/O.
2419
2420                     posix  Use the posix asynchronous I/O interface  to  per‐
2421                            form one or more I/O operations asynchronously.
2422
2423                     nil    Do not transfer any data; just pretend to. This is
2424                            mainly used for introspective performance  evalua‐
2425                            tion.
2426
2427       (xnvme)xnvme_sync=str
2428              Select  the  xnvme  synchronous command interface. This can take
2429              these values.
2430
2431                     nvme   This is the default; it uses Linux NVMe Driver ioctl()
2432                            for synchronous I/O.
2433
2434                     psync  This  supports regular as well as vectored pread()
2435                            and pwrite() commands.
2436
2437                     block  This is the same as psync except that it also sup‐
2438                            ports  zone  management commands using Linux block
2439                            layer IOCTLs.
2440
2441       (xnvme)xnvme_admin=str
2442              Select the xnvme admin command interface. This  can  take  these
2443              values.
2444
2445                     nvme   This is the default; it uses Linux NVMe Driver ioctl()
2446                            for admin commands.
2447
2448                     block  Use Linux Block Layer ioctl() and sysfs for  admin
2449                            commands.
2450
2451       (xnvme)xnvme_dev_nsid=int
2452              xnvme  namespace  identifier  for  userspace NVMe driver such as
2453              SPDK.
2454
2455       (xnvme)xnvme_iovec
2456              If this option is set, xnvme will use vectored  read/write  com‐
2457              mands.
2458
2459   I/O depth
2460       iodepth=int
2461              Number  of  I/O  units  to keep in flight against the file. Note
2462              that increasing iodepth beyond 1 will not affect synchronous io‐
2463              engines  (except for small degrees when verify_async is in use).
2464              Even async engines may impose OS restrictions  causing  the  de‐
2465              sired  depth  not  to be achieved. This may happen on Linux when
2466              using libaio and not setting `direct=1', since buffered  I/O  is
2467              not  async on that OS. Keep an eye on the I/O depth distribution
2468              in the fio output to verify that the achieved depth  is  as  ex‐
2469              pected. Default: 1.
2470
2471       iodepth_batch_submit=int, iodepth_batch=int
2472              This  defines  how  many pieces of I/O to submit at once. It de‐
2473              faults to 1 which means that we submit each I/O as soon as it is
2474              available,  but can be raised to submit bigger batches of I/O at
2475              a time. If it is set to 0, the iodepth value will be used.
2476
2477       iodepth_batch_complete_min=int, iodepth_batch_complete=int
2478              This defines how many pieces of I/O to retrieve at once. It  de‐
2479              faults to 1 which means that we'll ask for a minimum of 1 I/O in
2480              the retrieval process from the kernel. The I/O retrieval will go
2481              on  until  we hit the limit set by iodepth_low. If this variable
2482              is set to 0, then fio will always check for completed events be‐
2483              fore  queuing  more  I/O.  This helps reduce I/O latency, at the
2484              cost of more retrieval system calls.
2485
2486       iodepth_batch_complete_max=int
2487              This defines maximum pieces of I/O to  retrieve  at  once.  This
2488              variable   should   be   used   along   with  iodepth_batch_com‐
2489              plete_min=int variable, specifying the  range  of  min  and  max
2490              amount  of I/O which should be retrieved. By default it is equal
2491              to iodepth_batch_complete_min value. Example #1:
2492
2493                     iodepth_batch_complete_min=1
2494                     iodepth_batch_complete_max=<iodepth>
2495
2496              which means that we will retrieve at least 1 I/O and up  to  the
2497              whole submitted queue depth. If no I/O has been completed
2498              yet, we will wait.  Example #2:
2499
2500                     iodepth_batch_complete_min=0
2501                     iodepth_batch_complete_max=<iodepth>
2502
2503              which means that we can retrieve up to the whole submitted queue
2504              depth, but if no I/O has been completed yet, we will NOT
2505              wait and immediately exit the system call. In  this  example  we
2506              simply do polling.
2507
2508       iodepth_low=int
2509              The  low  water  mark indicating when to start filling the queue
2510              again. Defaults to the same as iodepth, meaning  that  fio  will
2511              attempt  to  keep the queue full at all times. If iodepth is set
2512              to e.g. 16 and iodepth_low is set  to  4,  then  after  fio  has
2513              filled  the  queue  of  16 requests, it will let the depth drain
2514              down to 4 before starting to fill it again.
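
              A minimal job-file sketch of the depth options above. The job
              name, file name, size, and the choice of libaio with direct=1
              are assumptions made for illustration only:

              ```ini
              ; Hypothetical job: keep up to 16 I/Os in flight and refill the
              ; queue only after it has drained down to 4.
              [depth-demo]
              ioengine=libaio
              direct=1
              rw=randread
              bs=4k
              size=1g
              filename=depth-demo.dat
              iodepth=16
              iodepth_low=4
              ; submit up to 8 I/Os per submission call
              iodepth_batch_submit=8
              ; reap at least 1 and at most 16 completions per retrieval
              iodepth_batch_complete_min=1
              iodepth_batch_complete_max=16
              ```

              The I/O depth distribution in the fio output shows whether
              the requested depth was actually reached.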
2515
2516       serialize_overlap=bool
2517              Serialize in-flight I/Os that might otherwise  cause  or  suffer
2518              from data races.  When two or more I/Os are submitted simultane‐
2519              ously, there is no guarantee that the I/Os will be processed  or
2520              completed  in  the  submitted  order. Further, if two or more of
2521              those I/Os are writes, any overlapping region between  them  can
2522              become  indeterminate/undefined on certain storage. These issues
2523              can cause verification to fail erratically when at least one  of
2524              the  racing I/Os is changing data and the overlapping region has
2525              a non-zero size. Setting serialize_overlap tells  fio  to  avoid
2526              provoking this behavior by explicitly serializing in-flight I/Os
2527              that have a non-zero overlap. Note that setting this option  can
2528              reduce both performance and the iodepth achieved.
2529
2530              This  option only applies to I/Os issued for a single job except
2531              when it is enabled along with io_submit_mode=offload. In offload
2532              mode,  fio  will  check  for overlap among all I/Os submitted by
2533              offload jobs with serialize_overlap enabled.
2534
2535              Default: false.
2536
2537       io_submit_mode=str
2538              This option controls how fio submits the I/O to the I/O  engine.
2539              The  default  is  `inline', which means that the fio job threads
2540              submit and reap I/O directly.  If  set  to  `offload',  the  job
2541              threads  will  offload I/O submission to a dedicated pool of I/O
2542              threads. This requires some coordination and thus has a  bit  of
2543              extra  overhead,  especially  for lower queue depth I/O where it
2544              can increase latencies. The benefit is that fio can manage  sub‐
2545              mission rates independently of the device completion rates. This
2546              avoids skewed latency reporting if I/O gets backed up on the de‐
2547              vice side (the coordinated omission problem). Note that this op‐
2548              tion cannot reliably be used with async IO engines.
2549
2550   I/O rate
2551       thinktime=time
2552              Stall the job for the specified period of time after an I/O  has
2553              completed  before issuing the next. May be used to simulate pro‐
2554              cessing being done by an application.  When the unit is omitted,
2555              the  value is interpreted in microseconds. See thinktime_blocks,
2556              thinktime_iotime and thinktime_spin.
2557
2558       thinktime_spin=time
2559              Only valid if thinktime is set - pretend to spend CPU time doing
2560              something  with the data received, before falling back to sleep‐
2561              ing for the rest of the period specified by thinktime. When  the
2562              unit is omitted, the value is interpreted in microseconds.
2563
2564       thinktime_blocks=int
2565              Only  valid if thinktime is set - control how many blocks to is‐
2566              sue, before waiting thinktime usecs. If not set, defaults  to  1
2567              which will make fio wait thinktime usecs after every block. This
2568              effectively makes any queue depth setting  redundant,  since  no
2569              more than 1 I/O will be queued before we have to complete it and
2570              do our thinktime. In other words, this setting effectively  caps
2571              the queue depth if the latter is larger.
2572
2573       thinktime_blocks_type=str
2574              Only  valid  if  thinktime is set - control how thinktime_blocks
2575              triggers.  The default is `complete', which  triggers  thinktime
2576              when  fio  completes  thinktime_blocks blocks. If this is set to
2577              `issue', then the trigger happens at the issue side.
2578
2579       thinktime_iotime=time
2580              Only valid if thinktime is set - control thinktime  interval  by
2581              time.   The  thinktime  stall is repeated after IOs are executed
2582              for  thinktime_iotime.   For   example,   `--thinktime_iotime=9s
2583              --thinktime=1s' repeats a 10-second cycle with IOs for 9 seconds
2584              and stall for 1 second. When the unit is omitted,  thinktime_io‐
2585              time  is  interpreted as a number of seconds.  If this option is
2586              used together with thinktime_blocks, the thinktime stall is  re‐
2587              peated  after  thinktime_iotime  or  after thinktime_blocks IOs,
2588              whichever happens first.
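
              The thinktime options above could be combined as in the
              following sketch. The job name, file name, and sizes are
              hypothetical; the time values are given without units, so
              they are in microseconds as described above:

              ```ini
              ; Hypothetical job: after every 4 completed blocks, stall for
              ; 1000 us, spending the first 200 us of the stall spinning.
              [think-demo]
              rw=read
              bs=4k
              size=64m
              filename=think-demo.dat
              thinktime=1000
              thinktime_spin=200
              thinktime_blocks=4
              thinktime_blocks_type=complete
              ```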
2589
2590
2591       rate=int[,int][,int]
2592              Cap the bandwidth used by this job. The number is in  bytes/sec,
2593              the  normal  suffix  rules  apply. Comma-separated values may be
2594              specified for reads, writes, and trims as  described  in  block‐
2595              size.
2596
2597              For  example, using `rate=1m,500k' would limit reads to 1MiB/sec
2598              and writes to 500KiB/sec. Capping only reads or  writes  can  be
2599              done  with  `rate=,500k'  or  `rate=500k,' where the former will
2600              only limit writes (to 500KiB/sec) and the latter will only limit
2601              reads.
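
              As a sketch, the first example above could appear in a job
              file as follows (the job name, file name, and sizes are
              assumptions for illustration):

              ```ini
              ; Hypothetical mixed job: reads capped at 1MiB/sec and
              ; writes capped at 500KiB/sec.
              [rate-demo]
              rw=randrw
              bs=4k
              size=256m
              filename=rate-demo.dat
              rate=1m,500k
              ```

              Replacing the last line with `rate=,500k' would cap only the
              writes, as described above.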
2602
2603       rate_min=int[,int][,int]
2604              Tell  fio  to do whatever it can to maintain at least this band‐
2605              width. Failing to meet this requirement will cause  the  job  to
2606              exit. Comma-separated values may be specified for reads, writes,
2607              and trims as described in blocksize.
2608
2609       rate_iops=int[,int][,int]
2610              Cap the bandwidth to this number of IOPS. Basically the same  as
2611              rate,  just  specified independently of bandwidth. If the job is
2612              given a block size range instead of a fixed value, the  smallest
2613              block  size is used as the metric. Comma-separated values may be
2614              specified for reads, writes, and trims as  described  in  block‐
2615              size.
2616
2617       rate_iops_min=int[,int][,int]
2618              If  fio  doesn't meet this rate of I/O, it will cause the job to
2619              exit.   Comma-separated  values  may  be  specified  for  reads,
2620              writes, and trims as described in blocksize.
2621
2622       rate_process=str
2623              This  option controls how fio manages rated I/O submissions. The
2624              default is `linear', which submits I/O in a linear fashion  with
2625              fixed delays between I/Os that get adjusted based on I/O com‐
2626              pletion rates. If this is set to `poisson', fio will submit I/O
2627              based  on  a  more  real world random request flow, known as the
2628              Poisson       process       (https://en.wikipedia.org/wiki/Pois‐
2629              son_point_process). The lambda will be 10^6 / IOPS for the given
2630              workload.
2631
2632       rate_ignore_thinktime=bool
2633              By default, fio will attempt to catch up to the  specified  rate
2634              setting,  if any kind of thinktime setting was used. If this op‐
2635              tion is set, then fio will ignore the thinktime and continue do‐
2636              ing  IO  at  the  specified rate, instead of entering a catch-up
2637              mode after thinktime is done.
2638
2639   I/O latency
2640       latency_target=time
2641              If set, fio will attempt to find the max performance point  that
2642              the given workload will run at while maintaining a latency below
2643              this target. When the unit is omitted, the value is  interpreted
2644              in microseconds. See latency_window and latency_percentile.
2645
2646       latency_window=time
2647              Used  with  latency_target to specify the sample window that the
2648              job is run at varying queue depths to test the performance. When
2649              the unit is omitted, the value is interpreted in microseconds.
2650
2651       latency_percentile=float
2652              The percentage of I/Os that must fall within the criteria speci‐
2653              fied by latency_target and latency_window. If not set, this de‐
2654              faults to 100.0, meaning that all I/Os must be equal to or below
2655              the value set by latency_target.
2656
2657       latency_run=bool
2658              Used with latency_target. If false (default), fio will find  the
2659              highest queue depth that meets latency_target and exit. If true,
2660              fio will continue running and try to meet latency_target by  ad‐
2661              justing queue depth.
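
              A sketch of a latency-targeted job using the options above.
              The job name, file name, engine, and the specific numbers are
              illustrative assumptions; time values are unitless and thus
              in microseconds:

              ```ini
              ; Hypothetical job: look for the best performance point at
              ; which 99.9% of I/Os complete within 10000 us (10 ms),
              ; evaluated over 5-second sample windows.
              [latency-demo]
              ioengine=libaio
              direct=1
              rw=randread
              bs=4k
              size=1g
              filename=latency-demo.dat
              iodepth=32
              latency_target=10000
              latency_window=5000000
              latency_percentile=99.9
              ```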
2662
2663       max_latency=time[,time][,time]
2664              If  set, fio will exit the job with an ETIMEDOUT error if it ex‐
2665              ceeds this maximum latency. When the unit is omitted, the  value
2666              is  interpreted  in  microseconds. Comma-separated values may be
2667              specified for reads, writes, and trims as  described  in  block‐
2668              size.
2669
2670       rate_cycle=int
2671              Average bandwidth for rate and rate_min over this number of mil‐
2672              liseconds. Defaults to 1000.
2673
2674   I/O replay
2675       write_iolog=str
2676              Write the  issued  I/O  patterns  to  the  specified  file.  See
2677              read_iolog.  Specify a separate file for each job, otherwise the
2678              iologs will be interspersed and the file may  be  corrupt.  This
2679              file will be opened in append mode.
2680
2681       read_iolog=str
2682              Open  an  iolog  with  the specified filename and replay the I/O
2683              patterns it contains. This can be used to store a  workload  and
2684              replay it sometime later. The iolog given may also be a blktrace
2685              binary file, which allows fio to replay a workload  captured  by
2686              blktrace.  See blktrace(8) for how to capture such logging data.
2687              For blktrace replay, the file needs to be turned into a blkparse
2688              binary  data  file  first  (`blkparse  <device>  -o /dev/null -d
2689              file_for_fio.bin').  You can specify a number of files by  sepa‐
2690              rating  the names with a ':' character.  See the filename option
2691              for information on how to escape ':' characters within the  file
2692              names.  These  files will be sequentially assigned to job clones
2693              created by numjobs. '-' is a reserved name meaning read from
2694              stdin. If filename is also set to '-' (which likewise means
2695              stdin), this option cannot be set to '-'.
2696
2697       read_iolog_chunked=bool
2698              Determines how the iolog is read. If false (default), the entire
2699              read_iolog is read at once. If true, input from the iolog is
2700              read gradually. Useful when the iolog is very large or is being
2701              generated.
2702
2703       merge_blktrace_file=str
2704              When  specified,  rather  than  replaying  the  logs  passed  to
2705              read_iolog, the logs go through a merge phase  which  aggregates
2706              them  into a single blktrace.  The resulting file is then passed
2707              on as the read_iolog parameter. The intention here  is  to  make
2708              the order of events consistent. This limits the influence of the
2709              scheduler compared to replaying multiple blktraces  via  concur‐
2710              rent jobs.
2711
2712       merge_blktrace_scalars=float_list
2713              This  is a percentage based option that is index paired with the
2714              list of files passed to read_iolog. When merging  is  performed,
2715              scale  the  time  of each event by the corresponding amount. For
2716              example,  `--merge_blktrace_scalars="50:100"'  runs  the   first
2717              trace in halftime and the second trace in realtime. This knob is
2718              separately tunable from replay_time_scale which scales the trace
2719              during  runtime  and will not change the output of the merge un‐
2720              like this option.
2721
2722       merge_blktrace_iters=float_list
2723              This is a whole number option that is index paired with the list
2724              of  files  passed  to read_iolog. When merging is performed, run
2725              each trace for the specified number of iterations. For  example,
2726              `--merge_blktrace_iters="2:1"'  runs the first trace for two it‐
2727              erations and the second trace for one iteration.
2728
2729       replay_no_stall=bool
2730              When replaying I/O with read_iolog the default  behavior  is  to
2731              attempt to respect the timestamps within the log and replay them
2732              with the appropriate delay between IOPS. By setting  this  vari‐
2733              able  fio  will not respect the timestamps and attempt to replay
2734              them as fast as possible while still  respecting  ordering.  The
2735              result  is the same I/O pattern to a given device, but different
2736              timings.
2737
2738       replay_time_scale=int
2739              When replaying I/O with read_iolog, fio will honor the  original
2740              timing  in  the  trace. With this option, it's possible to scale
2741              the time. It's a percentage option, if set to 50 it means run at
2742              50%  the  original  IO  rate in the trace. If set to 200, run at
2743              twice the original IO rate. Defaults to 100.
2744
2745       replay_redirect=str
2746              While replaying I/O patterns using read_iolog the default behav‐
2747              ior  is to replay the IOPS onto the major/minor device that each
2748              IOP was recorded from. This is sometimes undesirable because  on
2749              a  different machine those major/minor numbers can map to a dif‐
2750              ferent device. Changing hardware on the same system can also re‐
2751              sult in a different major/minor mapping.  replay_redirect causes
2752              all I/Os to be replayed onto the single specified device regard‐
2753              less  of  the  device  it  was recorded from. i.e. `replay_redi‐
2754              rect=/dev/sdc' would cause all I/O in the blktrace or  iolog  to
2755              be replayed onto `/dev/sdc'. This means multiple devices will be
2756              replayed onto a single device, if the  trace  contains  multiple
2757              devices.  If  you  want  multiple devices to be replayed concur‐
2758              rently to multiple redirected devices  you  must  blkparse  your
2759              trace  into separate traces and replay them with independent fio
2760              invocations.  Unfortunately this also breaks the strict time or‐
2761              dering between multiple device accesses.
2762
2763       replay_align=int
2764              Force  alignment  of  the byte offsets in a trace to this value.
2765              The value must be a power of 2.
2766
2767       replay_scale=int
2768              Scale byte offsets down by this factor when replaying traces.
2769              Should most likely use replay_align as well.
2770
2771       replay_skip=str
2772              Sometimes it's useful to skip certain IO types in a replay
2773              trace. This could be, for instance, eliminating the writes in
2774              the trace, or not replaying the trims/discards if you are redi‐
2775              recting to a device that doesn't support them. This option
2776              takes a comma-separated list of read, write, trim, sync.
2777
2778   Threads, processes and job synchronization
2779       thread Fio defaults to creating jobs by using fork, however if this op‐
2780              tion is given, fio will create  jobs  by  using  POSIX  Threads'
2781              function pthread_create(3) to create threads instead.
2782
2783       wait_for=str
2784              If  set,  the  current job won't be started until all workers of
2785              the specified waitee job are done.  wait_for operates on the job
2786              name  basis,  so  there are a few limitations. First, the waitee
2787              must be defined prior to the waiter job (meaning no forward ref‐
2788              erences).  Second,  if a job is being referenced as a waitee, it
2789              must have a unique name (no duplicate waitees).
2790
2791       nice=int
2792              Run the job with the given nice value. See man nice(2).  On Win‐
2793              dows,  values  less than -15 set the process class to "High"; -1
2794              through -15 set "Above Normal"; 1 through 15 "Below Normal"; and
2795              above 15 "Idle" priority class.
2796
2797       prio=int
2798              Set  the  I/O  priority  value of this job. Linux limits us to a
2799              positive value between 0 and 7, with 0 being  the  highest.  See
2800              man ionice(1). Refer to an appropriate manpage for other operat‐
2801              ing systems since meaning of priority may differ.  For  per-com‐
2802              mand priority setting, see the I/O engine specific `cmdprio_per‐
2803              centage` and `cmdprio` options.
2804
2805       prioclass=int
2806              Set the I/O priority class. See man ionice(1).  For  per-command
2807              priority  setting, see the I/O engine specific `cmdprio_percent‐
2808              age` and `cmdprio_class` options.
2809
2810       cpus_allowed=str
2811              Controls the same options as  cpumask,  but  accepts  a  textual
2812              specification of the permitted CPUs instead and CPUs are indexed
2813              from 0. So to use CPUs 0  and  5  you  would  specify  `cpus_al‐
2814              lowed=0,5'. This option also allows a range of CPUs to be speci‐
2815              fied -- say you wanted a binding to CPUs 0, 5, and 8 to 15,  you
2816              would set `cpus_allowed=0,5,8-15'.
2817
2818              On  Windows,  when  `cpus_allowed' is unset only CPUs from fio's
2819              current processor group will be used and affinity  settings  are
2820              inherited  from  the  system.  An fio build configured to target
2821              Windows 7 makes options that set CPUs processor group aware  and
2822              values  will  set both the processor group and a CPU from within
2823              that group. For example, on a system where processor group 0 has
2824              40 CPUs and processor group 1 has 32 CPUs, `cpus_allowed' values
2825              between 0 and 39 will bind  CPUs  from  processor  group  0  and
2826              `cpus_allowed' values between 40 and 71 will bind CPUs from pro‐
2827              cessor group 1. When using `cpus_allowed_policy=shared' all CPUs
2828              specified  by  a  single  `cpus_allowed' option must be from the
2829              same processor group. For Windows fio builds not built for  Win‐
2830              dows  7,  CPUs  will  only be selected from (and be relative to)
2831              whatever processor group fio happens to be running in  and  CPUs
2832              from other processor groups cannot be used.
2833
2834       cpus_allowed_policy=str
2835              Set  the  policy  of  how  fio distributes the CPUs specified by
2836              cpus_allowed or cpumask. Two policies are supported:
2837
2838                     shared All jobs will share the CPU set specified.
2839
2840                     split  Each job will get a unique CPU from the CPU set.
2841
2842              shared is the default behavior, if the option  isn't  specified.
2843              If  split is specified, then fio will assign one cpu per job. If
2844              not enough CPUs are given for the jobs  listed,  then  fio  will
2845              roundrobin the CPUs in the set.
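
              A possible job-file sketch of the split policy (the job name,
              sizes, and the CPU range are assumptions for illustration):

              ```ini
              ; Hypothetical job: four clones, each bound to its own CPU
              ; taken from the set 0-3.
              [pinned]
              rw=randread
              bs=4k
              size=64m
              numjobs=4
              cpus_allowed=0-3
              cpus_allowed_policy=split
              ```

              With `cpus_allowed_policy=shared' instead, all four clones
              would be allowed to run on any of CPUs 0 through 3.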
2846
2847       cpumask=int
2848              Set  the  CPU affinity of this job. The parameter given is a bit
2849              mask of allowed CPUs the job may run on. So if you want the  al‐
2850              lowed CPUs to be 1 and 5, you would pass the decimal value of (1
2851              << 1 | 1 << 5), or 34. See man  sched_setaffinity(2).  This  may
2852              not  work on all supported operating systems or kernel versions.
2853              This option doesn't work well for a higher CPU count  than  what
2854              you  can  store  in an integer mask, so it can only control cpus
2855              1-32. For boxes with larger CPU counts, use cpus_allowed.
2856
2857       numa_cpu_nodes=str
2858              Set this job running on specified NUMA nodes'  CPUs.  The  argu‐
2859              ments  allow comma delimited list of cpu numbers, A-B ranges, or
2860              `all'. Note, to enable NUMA options support, fio must  be  built
2861              on a system with libnuma-dev(el) installed.
2862
2863       numa_mem_policy=str
2864              Set  this job's memory policy and corresponding NUMA nodes. For‐
2865              mat of the arguments:
2866
2867                     <mode>[:<nodelist>]
2868
2869              `mode' is one of the following memory policies: `default', `pre‐
2870              fer', `bind', `interleave' or `local'. For `default' and `local'
2871              memory policies, no node needs to be  specified.  For  `prefer',
2872              only one node is allowed. For `bind' and `interleave' the `node‐
2873              list' may be as follows: a comma delimited list of numbers,  A-B
2874              ranges, or `all'.
2875
2876       cgroup=str
2877              Add  job  to this control group. If it doesn't exist, it will be
2878              created. The system must have a mounted cgroup blkio mount point
2879              for  this  to  work. If your system doesn't have it mounted, you
2880              can do so with:
2881
2882                     # mount -t cgroup -o blkio none /cgroup
2883
2884       cgroup_weight=int
2885              Set the weight of the cgroup to this value. See  the  documenta‐
2886              tion that comes with the kernel, allowed values are in the range
2887              of 100..1000.
2888
2889       cgroup_nodelete=bool
2890              Normally fio will delete the cgroups it has  created  after  the
2891              job  completion.  To override this behavior and to leave cgroups
2892              around after the job completion, set  `cgroup_nodelete=1'.  This
2893              can be useful if one wants to inspect various cgroup files after
2894              job completion. Default: false.
2895
2896       flow_id=int
2897              The ID of the flow. If not specified, it  defaults  to  being  a
2898              global flow. See flow.
2899
2900       flow=int
2901              Weight  in token-based flow control. If this value is used, then
2902              fio regulates the activity between two or more jobs sharing  the
2903              same  flow_id.   Fio  attempts to keep each job activity propor‐
2904              tional to other jobs' activities in the same flow_id group, with
2905              respect  to  requested  weight per job.  That is, if one job has
2906              `flow=3', another job has `flow=2' and  another  with  `flow=1`,
2907              then there will be a roughly 3:2:1 ratio in how much one runs vs
2908              the others.
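
              The 3:2:1 example above might be written as the following
              sketch. The job names, file names, and sizes are hypothetical:

              ```ini
              ; Hypothetical jobs sharing one flow group with weights 3:2:1.
              [global]
              rw=read
              bs=4k
              size=128m
              flow_id=1

              [heavy]
              filename=flow-heavy.dat
              flow=3

              [medium]
              filename=flow-medium.dat
              flow=2

              [light]
              filename=flow-light.dat
              flow=1
              ```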
2909
2910       flow_sleep=int
2911              The period of time, in microseconds,  to  wait  after  the  flow
2912              counter has exceeded its proportion before retrying operations.
2913
2914       stonewall, wait_for_previous
2915              Wait for preceding jobs in the job file to exit, before starting
2916              this one. Can be used to insert serialization points in the  job
2917              file.  A stone wall also implies starting a new reporting group,
2918              see group_reporting. Optionally you  can  use  `stonewall=0`  to
2919              disable or `stonewall=1` to enable it.
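
              For instance, a write phase followed by a read phase could be
              sketched as below (the job names, file name, and sizes are
              assumptions for illustration):

              ```ini
              [global]
              bs=4k
              size=128m
              filename=stonewall-demo.dat

              [write-phase]
              rw=write

              [read-phase]
              ; starts only after write-phase has exited
              stonewall
              rw=read
              ```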
2920
2921       exitall
2922              By  default,  fio  will continue running all other jobs when one
2923              job finishes.  Sometimes this is not the desired action. Setting
2924              exitall  will  instead  make  fio terminate all jobs in the same
2925              group, as soon as one job of that group finishes.
2926
2927       exit_what=str
2928              By default, fio will continue running all other  jobs  when  one
2929              job finishes.  Sometimes this is not the desired action. Setting
2930              exitall will instead make fio terminate all  jobs  in  the  same
2931              group. The option exit_what allows you to control which jobs get
2932              terminated when exitall is enabled.  The default value is group.
2933              The allowed values are:
2934
2935                     all    terminates all jobs.
2936
2937                     group  is  the  default and does not change the behaviour
2938                            of exitall.
2939
2940                     stonewall
2941                            terminates all currently running jobs  across  all
2942                            groups  and  continues  execution  with  the  next
2943                            stonewalled group.
2944
2945       exec_prerun=str
2946              Before running this job, issue  the  command  specified  through
2947              system(3). Output is redirected to a file called
2948              `jobname.prerun.txt'.
2949
2950       exec_postrun=str
2951              After the job completes, issue the command specified through
2952              system(3). Output is redirected to a file called
2953              `jobname.postrun.txt'.
2954
2955       uid=int
2956              Instead of running as the invoking user, set the user ID to this
2957              value before the thread/process does any work.
2958
2959       gid=int
2960              Set group ID, see uid.
2961
2962   Verification
2963       verify_only
2964              Do  not  perform  specified  workload,  only  verify  data still
2965              matches previous invocation of this workload. This option allows
2966              one  to  check data multiple times at a later date without over‐
2967              writing it. This option makes  sense  only  for  workloads  that
2968              write  data,  and does not support workloads with the time_based
2969              option set.
2970
2971       do_verify=bool
2972              Run the verify phase after a write phase. Only valid  if  verify
2973              is set. Default: true.
2974
2975       verify=str
2976              If  writing  to  a  file, fio can verify the file contents after
2977              each iteration of the job. Each verification method also implies
2978              verification  of  special header, which is written to the begin‐
2979              ning of each block. This header also includes meta  information,
2980              like offset of the block, block number, timestamp when block was
2981              written, etc. verify can be combined with verify_pattern option.
2982              The allowed values are:
2983
2984                     md5    Use  an  md5  sum of the data area and store it in
2985                            the header of each block.
2986
2987                     crc64  Use an experimental crc64 sum of the data area and
2988                            store it in the header of each block.
2989
2990                     crc32c Use  a crc32c sum of the data area and store it in
2991                            the header of each block. This will  automatically
2992                            use  hardware  acceleration (e.g. SSE4.2 on an x86
2993                            or CRC crypto extensions on ARM64) but  will  fall
2994                            back  to  software crc32c if none is found. Gener‐
2995                            ally the fastest checksum fio supports when  hard‐
2996                            ware accelerated.
2997
2998                     crc32c-intel
2999                            Synonym for crc32c.
3000
3001                     crc32  Use  a  crc32 sum of the data area and store it in
3002                            the header of each block.
3003
3004                     crc16  Use a crc16 sum of the data area and store  it  in
3005                            the header of each block.
3006
3007                     crc7   Use  a  crc7  sum of the data area and store it in
3008                            the header of each block.
3009
3010                     xxhash Use xxhash as the checksum function. Generally the
3011                            fastest software checksum that fio supports.
3012
3013                     sha512 Use sha512 as the checksum function.
3014
3015                     sha256 Use sha256 as the checksum function.
3016
3017                     sha1   Use optimized sha1 as the checksum function.
3018
3019                     sha3-224
3020                            Use optimized sha3-224 as the checksum function.
3021
3022                     sha3-256
3023                            Use optimized sha3-256 as the checksum function.
3024
3025                     sha3-384
3026                            Use optimized sha3-384 as the checksum function.
3027
3028                     sha3-512
3029                            Use optimized sha3-512 as the checksum function.
3030
3031                     meta   This option is deprecated: meta information is now
3032                            included in the generic verification header, and
3033                            meta verification happens by default. For detailed
3034                            information see the description of the verify
3035                            setting. This option is kept only for
3036                            compatibility with old configurations. Do not use
3037                            it.
3038
3039                     pattern
3040                            Verify  a  strict pattern. Normally fio includes a
3041                            header with some basic information  and  checksum‐
3042                            ming, but if this option is set, only the specific
3043                            pattern set with verify_pattern is verified.
3044
3045                     null   Only pretend to verify. Useful for testing  inter‐
3046                            nals with `ioengine=null', not for much else.
3047
3048              This  option  can be used for repeated burn-in tests of a system
3049              to make sure that the written data is also correctly read  back.
3050              If  the  data direction given is a read or random read, fio will
3051              assume that it should verify a previously written file.  If  the
3052              data direction includes any form of write, the verify will be of
3053              the newly written data.
3054
3055              To avoid false verification errors, do not use  the  norandommap
3056              option when verifying data with async I/O engines and I/O depths
3057              > 1.  Or use the norandommap and the lfsr random  generator  to‐
3058              gether  to  avoid  writing to the same offset with multiple out‐
3059              standing I/Os.
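
              A minimal burn-in sketch, assuming a scratch file under /tmp
              and a synchronous I/O engine (so the norandommap caveat above
              does not apply). The job name, path, and size are illustrative.

              ```ini
              ; Illustrative job file; path, size and job name are assumptions.
              [write-and-verify]
              filename=/tmp/fio-verify.dat
              size=32m
              bs=4k
              rw=randwrite
              ioengine=psync
              ; Store a crc32c checksum in each block header and check it
              ; when the data is read back after the write phase.
              verify=crc32c
              do_verify=1
              ; Stop the job at the first verification failure.
              verify_fatal=1
              ```

              Running a similar job with rw=randread and the same verify
              setting would verify the previously written file instead of
              writing new data.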
3060
3061       verify_offset=int
3062              Swap the verification header with data  somewhere  else  in  the
3063              block before writing. It is swapped back before verifying.
3064
3065       verify_interval=int
3066              Write  the  verification  header at a finer granularity than the
3067              blocksize. It will be written for chunks the size of
3068              verify_interval. verify_interval should divide blocksize evenly.
3069
3070       verify_pattern=str
3071              If set, fio will fill the I/O buffers with this pattern. Fio de‐
3072              faults to filling with totally random bytes, but sometimes  it's
3073              interesting  to  fill  with a known pattern for I/O verification
3074              purposes. Depending on the width of the pattern, fio  will  fill
3075              1/2/3/4 bytes of the buffer at a time (the pattern can be either
3076              a decimal or a hex number). A verify_pattern larger than a
3077              32-bit quantity has to be a hex number that starts with either
3078              "0x" or "0X". Use with verify. verify_pattern also supports the
3079              %o format, which means that the block offset is written to each
3080              block and then verified back, e.g.:
3081
3082                     verify_pattern=%o
3083
3084              Or use combination of everything:
3085
3086                     verify_pattern=0xff%o"abcd"-12
3087
3088       verify_fatal=bool
3089              Normally fio will keep checking the entire contents before quit‐
3090              ting on a block verification failure. If this option is set, fio
3091              will exit the job on the first observed failure. Default: false.
3092
3093       verify_dump=bool
3094              If set, dump the contents of both the original  data  block  and
3095              the  data  block  we  read  off disk to files. This allows later
3096              analysis to inspect just what kind of data corruption  occurred.
3097              Off by default.
3098
3099       verify_async=int
3100              Fio  will normally verify I/O inline from the submitting thread.
3101              This option takes an integer describing how many  async  offload
3102              threads  to  create for I/O verification instead, causing fio to
3103              offload the duty of verifying I/O contents to one or more  sepa‐
3104              rate  threads.  If  using this offload option, even sync I/O en‐
3105              gines can benefit from using an iodepth setting higher  than  1,
3106              as  it allows them to have I/O in flight while verifies are run‐
3107              ning.  Defaults to 0 async threads,  i.e.  verification  is  not
3108              asynchronous.
3109
3110       verify_async_cpus=str
3111              Tell  fio to set the given CPU affinity on the async I/O verifi‐
3112              cation threads. See cpus_allowed for the format used.
3113
3114       verify_backlog=int
3115              Fio will normally verify the written contents of a job that uti‐
3116              lizes verify once that job has completed. In other words, every‐
3117              thing is written then everything is read back and verified.  You
3118              may want to verify continually instead for a variety of reasons.
3119              Fio stores the meta data associated with an I/O block in memory,
3120              so  for  large  verify workloads, quite a bit of memory would be
3121              used up holding this meta data. If this option is  enabled,  fio
3122              will write only N blocks before verifying these blocks.
3123
3124       verify_backlog_batch=int
3125              Control  how  many  blocks  fio will verify if verify_backlog is
3126              set. If not set, will default to  the  value  of  verify_backlog
3127              (meaning the entire queue is read back and verified). If
3128              verify_backlog_batch is less than verify_backlog, not all
3129              blocks will be verified; if verify_backlog_batch is larger than
3130              verify_backlog, some blocks will be verified more than once.
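
              As a sketch of continual verification, the hypothetical job
              below verifies after every 1024 written blocks rather than at
              the end of the job. The path, size, and backlog value are
              assumptions.

              ```ini
              ; Illustrative job file; path, size and backlog are assumptions.
              [backlog-verify]
              filename=/tmp/fio-backlog.dat
              size=256m
              bs=4k
              rw=write
              verify=xxhash
              ; Write 1024 blocks, then read them back and verify them.
              verify_backlog=1024
              ; Same as the default: verify the entire backlog each time.
              verify_backlog_batch=1024
              ```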
3131
3132       verify_state_save=bool
3133              When a job exits during the write phase of  a  verify  workload,
3134              save  its current state. This allows fio to replay up until that
3135              point, if the verify state is loaded for the verify read  phase.
3136              The format of the filename is, roughly:
3137
3138                     <type>-<jobname>-<jobindex>-verify.state.
3139
3140              <type>  is  "local"  for a local run, "sock" for a client/server
3141              socket connection, and "ip" (192.168.0.1, for  instance)  for  a
3142              networked client/server connection. Defaults to true.
3143
3144       verify_state_load=bool
3145              If a verify termination trigger was used, fio stores the current
3146              write state of each thread. This can  be  used  at  verification
3147              time  so  that  fio knows how far it should verify. Without this
3148              information, fio will run a full verification pass, according to
3149              the settings in the job file used. Default: false.
3150
3151       trim_percentage=int
3152              Number of verify blocks to discard/trim.
3153
3154       trim_verify_zero=bool
3155              Verify that trim/discarded blocks are returned as zeros.
3156
3157       trim_backlog=int
3158              Trim after this number of blocks are written.
3159
3160       trim_backlog_batch=int
3161              Trim this number of I/O blocks.
3162
3163       experimental_verify=bool
3164              Enable experimental verification.
3165
3166   Steady state
3167       steadystate=str:float, ss=str:float
3168              Define  the  criterion and limit for assessing steady state per‐
3169              formance. The first parameter designates the  criterion  whereas
3170              the  second  parameter  sets  the  threshold. When the criterion
3171              falls below the threshold for the specified  duration,  the  job
3172              will  stop.  For  example,  `iops_slope:0.1%' will direct fio to
3173              terminate the job when the least squares regression slope  falls
3174              below  0.1% of the mean IOPS. If group_reporting is enabled this
3175              will apply to all jobs in the group. Below is the list of avail‐
3176              able  steady state assessment criteria. All assessments are car‐
3177              ried out using only data from  the  rolling  collection  window.
3178              Threshold  limits can be expressed as a fixed value or as a per‐
3179              centage of the mean in the collection window.
3180
3181              When using this feature, most jobs should include the time_based
3182              and  runtime  options  or  the loops option so that fio does not
3183              stop running after it has covered the full size of the specified
3184              file(s) or device(s).
3185
3186                            iops   Collect  IOPS data. Stop the job if all in‐
3187                                   dividual IOPS measurements are  within  the
3188                                   specified  limit  of  the  mean IOPS (e.g.,
3189                                   `iops:2' means  that  all  individual  IOPS
3190                                   values  must  be  within  2  of  the  mean,
3191                                   whereas `iops:0.2%' means that all individ‐
3192                                   ual  IOPS values must be within 0.2% of the
3193                                   mean IOPS to terminate the job).
3194
3195                            iops_slope
3196                                   Collect IOPS data and calculate  the  least
3197                                   squares  regression  slope. Stop the job if
3198                                   the slope falls below the specified limit.
3199
3200                            bw     Collect bandwidth data. Stop the job if all
3201                                   individual   bandwidth   measurements   are
3202                                   within the  specified  limit  of  the  mean
3203                                   bandwidth.
3204
3205                            bw_slope
3206                                   Collect  bandwidth  data  and calculate the
3207                                   least squares regression  slope.  Stop  the
3208                                   job  if the slope falls below the specified
3209                                   limit.
3210
3211              steadystate_duration=time, ss_dur=time
3212                     A rolling window of this duration will be used  to  judge
3213                     whether  steady state has been reached. Data will be col‐
3214                     lected once per second. The default is 0  which  disables
3215                     steady  state  detection.  When  the unit is omitted, the
3216                     value is interpreted in seconds.
3217
3218              steadystate_ramp_time=time, ss_ramp=time
3219                     Allow the job to run for the  specified  duration  before
3220                     beginning  data  collection for checking the steady state
3221                     job termination criterion. The default  is  0.  When  the
3222                     unit is omitted, the value is interpreted in seconds.
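
              The options above might be combined as in the following
              sketch. The path, size, and durations are assumptions; the
              criterion is the `iops_slope:0.1%' example from the text.

              ```ini
              ; Illustrative job file; path, size and durations are assumptions.
              [ss-randread]
              filename=/tmp/fio-ss.dat
              size=1g
              rw=randread
              bs=4k
              ; Keep running past the end of the file until runtime expires
              ; or steady state is detected, whichever comes first.
              time_based
              runtime=30m
              ; Stop once the IOPS slope falls below 0.1% of the mean IOPS
              ; over a 5 minute rolling window, after a 1 minute ramp.
              steadystate=iops_slope:0.1%
              steadystate_duration=5m
              steadystate_ramp_time=1m
              ```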
3223
3224   Measurements and reporting
3225       per_job_logs=bool
3226              If set, fio generates bw/clat/iops logs with per-job unique
3227              filenames. If not set, jobs with identical names will share the
3228              log filename. Default: true.
3229
3230       group_reporting
3231              It may sometimes be interesting to display statistics for groups
3232              of jobs as a whole instead of for each individual job.  This  is
3233              especially  true  if  numjobs  is  used;  looking  at individual
3234              thread/process output quickly becomes unwieldy. To see the final
3235              report  per-group  instead of per-job, use group_reporting. Jobs
3236              in a file will be part of the same reporting group unless
3237              separated by a stonewall or by new_group.
3238
3239       new_group
3240              Start a new reporting group. See: group_reporting. If not given,
3241              all jobs in a file will be part of the same reporting group, un‐
3242              less separated by a stonewall.
3243
3244       stats=bool
3245              By  default, fio collects and shows final output results for all
3246              jobs that run. If this option is set to 0, then fio will omit
3247              this job from the final stat output.
3248
3249       write_bw_log=str
3250              If  given,  write  a  bandwidth log for this job. Can be used to
3251              store data of the bandwidth of the jobs in their lifetime.
3252
3253              If no str argument is  given,  the  default  filename  of  `job‐
3254              name_type.x.log'  is  used. Even when the argument is given, fio
3255              will still append the type of log. So if one specifies:
3256
3257                     write_bw_log=foo
3258
3259              The actual log name will be `foo_bw.x.log' where `x' is the  in‐
3260              dex  of  the  job  (1..N,  where  N  is  the number of jobs). If
3261              per_job_logs is false, then the filename will  not  include  the
3262              `.x` job index.
3263
3264              The  included  fio_generate_plots  script  uses  gnuplot to turn
3265              these text files into nice graphs. See the LOG FILE FORMATS sec‐
3266              tion for how data is structured within the file.
3267
3268       write_lat_log=str
3269              Same  as write_bw_log, except this option creates I/O submission
3270              (e.g., `name_slat.x.log'), completion (e.g., `name_clat.x.log'),
3271              and  total  (e.g.,  `name_lat.x.log') latency files instead. See
3272              write_bw_log for details about the filename format and  the  LOG
3273              FILE  FORMATS  section  for  how  data  is structured within the
3274              files.
3275
3276       write_hist_log=str
3277              Same as write_bw_log but writes an I/O completion  latency  his‐
3278              togram  file  (e.g.,  `name_hist.x.log') instead. Note that this
3279              file will be empty unless log_hist_msec has also been set.   See
3280              write_bw_log  for  details about the filename format and the LOG
3281              FILE FORMATS section for how data is structured within the file.
3282
3283       write_iops_log=str
3284              Same  as  write_bw_log,  but   writes   an   IOPS   file   (e.g.
3285              `name_iops.x.log`)  instead.  Because fio defaults to individual
3286              I/O logging, the value entry in the IOPS log will  be  1  unless
3287              windowed  logging  (see  log_avg_msec)  has  been  enabled.  See
3288              write_bw_log for details about the filename format and LOG  FILE
3289              FORMATS for how data is structured within the file.
3290
3291       log_entries=int
3292              By  default,  fio  will log an entry in the iops, latency, or bw
3293              log for every I/O that completes. The initial number of I/O  log
3294              entries is 1024.  When the log entries are all used, new log en‐
3295              tries are dynamically allocated.  This dynamic log entry alloca‐
3296              tion  may  negatively impact time-related statistics such as I/O
3297              tail latencies (e.g. 99.9th percentile completion latency). This
3298              option  allows specifying a larger initial number of log entries
3299              to avoid run-time allocation of new log  entries,  resulting  in
3300              more precise time-related I/O statistics. See also log_avg_msec.
3301              Defaults to 1024.
3302
3303       log_avg_msec=int
3304              By default, fio will log an entry in the iops,  latency,  or  bw
3305              log  for every I/O that completes. When writing to the disk log,
3306              that can quickly grow to a very large size. Setting this  option
3307              makes fio average each log entry over the specified period
3308              of time, reducing the resolution of the log.  See  log_max_value
3309              as well. Defaults to 0, logging all entries. Also see the LOG
3310              FILE FORMATS section.
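
              A sketch of windowed logging, assuming a hypothetical job and
              the log prefix `foo' used in the write_bw_log example. The
              path, size, and one second window are illustrative.

              ```ini
              ; Illustrative job file; path, size and window are assumptions.
              [logged]
              filename=/tmp/fio-log.dat
              size=64m
              rw=randwrite
              bs=4k
              write_bw_log=foo
              write_lat_log=foo
              write_iops_log=foo
              ; Average log entries over 1000 ms windows instead of logging
              ; one entry per completed I/O.
              log_avg_msec=1000
              ```

              Following the naming rules described for write_bw_log, the
              first job would be expected to produce files such as
              `foo_bw.1.log', `foo_clat.1.log', and `foo_iops.1.log'.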
3311
3312       log_hist_msec=int
3313              Same as log_avg_msec, but logs entries  for  completion  latency
3314              histograms.  Computing  latency percentiles from averages of in‐
3315              tervals using log_avg_msec is inaccurate.  Setting  this  option
3316              makes  fio  log  histogram  entries over the specified period of
3317              time, reducing log sizes for high IOPS devices  while  retaining
3318              percentile  accuracy. See log_hist_coarseness and write_hist_log
3319              as well.  Defaults to 0, meaning histogram logging is disabled.
3320
3321       log_hist_coarseness=int
3322              Integer ranging from 0 to 6, defining the coarseness of the res‐
3323              olution  of  the  histogram logs enabled with log_hist_msec. For
3324              each increment in coarseness, fio outputs half as many bins. De‐
3325              faults to 0, for which histogram logs contain 1216 latency bins.
3326              See LOG FILE FORMATS section.
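
              A sketch of histogram logging. The path, size, interval, and
              coarseness are assumptions picked for illustration.

              ```ini
              ; Illustrative job file; path, size and values are assumptions.
              [hist]
              filename=/tmp/fio-hist.dat
              size=64m
              rw=randread
              bs=4k
              write_hist_log=foo
              ; Without log_hist_msec the histogram log stays empty.
              log_hist_msec=1000
              ; Two halvings of the default 1216 bins.
              log_hist_coarseness=2
              ```

              By the halving rule above, a coarseness of 2 would give
              1216 / 2 / 2 = 304 latency bins per histogram entry.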
3327
3328       log_max_value=bool
3329              If log_avg_msec is set, fio logs the average over  that  window.
3330              If you instead want to log the maximum value, set this option to
3331              1. Defaults to 0, meaning that averaged values are logged.
3332
3333       log_offset=bool
3334              If this is set, the iolog options will include the  byte  offset
3335              for  the I/O entry as well as the other data values. Defaults to
3336              0 meaning that offsets are not present in  logs.  Also  see  LOG
3337              FILE FORMATS section.
3338
3339       log_prio=bool
3340              If  this is set, the iolog options will include the I/O priority
3341              for the I/O entry as well as the other data values. Defaults  to
3342              0  meaning that I/O priorities are not present in logs. Also see
3343              LOG FILE FORMATS section.
3344
3345       log_compression=int
3346              If this is set, fio will compress the I/O logs as  it  goes,  to
3347              keep  the  memory footprint lower. When a log reaches the speci‐
3348              fied size, that chunk is removed and  compressed  in  the  back‐
3349              ground. Given that I/O logs are fairly highly compressible, this
3350              yields a nice memory savings for longer runs.  The  downside  is
3351              that the compression will consume some background CPU cycles, so
3352              it may impact the run. This, however, is also true if  the  log‐
3353              ging  ends  up consuming most of the system memory. So pick your
3354              poison. The I/O logs are saved normally at the end of a run,  by
3355              decompressing  the  chunks and storing them in the specified log
3356              file. This feature depends on the availability of zlib.
3357
3358       log_compression_cpus=str
3359              Define the set of CPUs that are allowed  to  handle  online  log
3360              compression  for the I/O jobs. This can provide better isolation
3361              between performance sensitive jobs, and  background  compression
3362              work. See cpus_allowed for the format used.
3363
3364       log_store_compressed=bool
3365              If  set,  fio  will  store the log files in a compressed format.
3366              They can be decompressed with fio, using the --inflate-log  com‐
3367              mand  line parameter. The files will be stored with a `.fz' suf‐
3368              fix.
3369
3370       log_unix_epoch=bool
3371              If set, fio will log Unix timestamps to the log  files  produced
3372              by enabling write_type_log for each log type, instead of the de‐
3373              fault zero-based timestamps.
3374
3375       log_alternate_epoch=bool
3376              If set, fio will log timestamps based on the epoch used  by  the
3377              clock  specified  in the log_alternate_epoch_clock_id option, to
3378              the log files produced by enabling write_type_log for  each  log
3379              type, instead of the default zero-based timestamps.
3380
3381       log_alternate_epoch_clock_id=int
3382              Specifies the clock_id to be used by clock_gettime to obtain the
3383              alternate epoch if either log_unix_epoch or log_alternate_epoch
3384              is true. Otherwise it has no effect. The default value is 0, or
3385              CLOCK_REALTIME.
3386
3387       block_error_percentiles=bool
3388              If set, record errors in trim block-sized units from writes  and
3389              trims and output a histogram of how many trims it took to get to
3390              errors, and what kind of error was encountered.
3391
3392       bwavgtime=int
3393              Average the calculated bandwidth over the given time.  Value  is
3394              specified  in  milliseconds. If the job also does bandwidth log‐
3395              ging through write_bw_log, then the minimum of this  option  and
3396              log_avg_msec will be used. Default: 500ms.
3397
3398       iopsavgtime=int
3399              Average the calculated IOPS over the given time. Value is speci‐
3400              fied in milliseconds. If the job also does IOPS logging  through
3401              write_iops_log, then the minimum of this option and log_avg_msec
3402              will be used. Default: 500ms.
3403
3404       disk_util=bool
3405              Generate disk utilization statistics, if the  platform  supports
3406              it.  Default: true.
3407
3408       disable_lat=bool
3409              Disable  measurements  of total latency numbers. Useful only for
3410              cutting back the number of calls  to  gettimeofday(2),  as  that
3411              does  impact performance at really high IOPS rates. Note that to
3412              really get rid of a large amount of  these  calls,  this  option
3413              must  be  used  with  disable_slat and disable_bw_measurement as
3414              well.
3415
3416       disable_clat=bool
3417              Disable measurements of completion  latency  numbers.  See  dis‐
3418              able_lat.
3419
3420       disable_slat=bool
3421              Disable  measurements  of  submission  latency numbers. See dis‐
3422              able_lat.
3423
3424       disable_bw_measurement=bool, disable_bw=bool
3425              Disable measurements of throughput/bandwidth numbers.  See  dis‐
3426              able_lat.
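
              A sketch of a job that trades statistics for fewer timing
              calls, as suggested under disable_lat. The path and size are
              assumptions, and the corresponding latency and bandwidth
              figures will not be measured.

              ```ini
              ; Illustrative job file; path and size are assumptions.
              [max-iops]
              filename=/tmp/fio-iops.dat
              size=64m
              rw=randread
              bs=4k
              ; Disable all latency and bandwidth measurements together.
              disable_lat=1
              disable_clat=1
              disable_slat=1
              disable_bw_measurement=1
              ```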
3427
3428       slat_percentiles=bool
3429              Report submission latency percentiles. Submission latency is not
3430              recorded for synchronous ioengines.
3431
3432       clat_percentiles=bool
3433              Report completion latency percentiles.
3434
3435       lat_percentiles=bool
3436              Report total latency percentiles. Total latency is  the  sum  of
3437              submission latency and completion latency.
3438
3439       percentile_list=float_list
3440              Overwrite  the default list of percentiles for latencies and the
3441              block error histogram. Each number is a floating point number in
3442              the range (0,100], and the maximum length of the list is 20. Use
3443              ':'   to   separate   the   numbers.   For   example,    `--per‐
3444              centile_list=99.5:99.9' will cause fio to report the latency du‐
3445              rations below which 99.5% and 99.9% of  the  observed  latencies
3446              fell, respectively.
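
              The same setting can be given in a job file. In this sketch
              the path, size, and chosen percentiles are assumptions.

              ```ini
              ; Illustrative job file; path, size and percentiles are assumptions.
              [tail-latency]
              filename=/tmp/fio-tail.dat
              size=64m
              rw=randread
              bs=4k
              clat_percentiles=1
              ; Report only these three completion latency percentiles.
              percentile_list=99:99.9:99.99
              ```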
3447
3448       significant_figures=int
3449              If  using  --output-format of `normal', set the significant fig‐
3450              ures to this value. Higher values will yield more  precise  IOPS
3451              and  throughput units, while lower values will round. Requires a
3452              minimum value of 1 and a maximum value of 10. Defaults to 4.
3453
3454   Error handling
3455       exitall_on_error
3456              When one job finishes in error, terminate the rest. The  default
3457              is to wait for each job to finish.
3458
3459       continue_on_error=str
3460              Normally fio will exit the job on the first observed failure. If
3461              this option is set, fio will continue the job when  there  is  a
3462              'non-fatal  error' (EIO or EILSEQ) until the runtime is exceeded
3463              or the I/O size specified is completed. If this option is  used,
3464              there  are  two  more  stats  that are appended, the total error
3465              count and the first error. The error field given in the stats is
3466              the first error that was hit during the run.
3467
3468              Note: a write error from the device may go unnoticed by fio when
3469              using buffered IO, as  the  write()  (or  similar)  system  call
3470              merely  dirties  the  kernel pages, unless `sync' or `direct' is
3471              used. Device IO errors occur when the  dirty  data  is  actually
3472              written  out  to  disk.  If  fully sync writes aren't desirable,
3473              `fsync' or `fdatasync' can be used as well. This is specific  to
3474              writes, as reads are always synchronous.
3475
3476                     The allowed values are:
3477
3478                                   none   Exit on any I/O or verify errors.
3479
3480                                   read   Continue on read errors, exit on all
3481                                          others.
3482
3483                                   write  Continue on write  errors,  exit  on
3484                                          all others.
3485
3486                                   io     Continue  on  any I/O error, exit on
3487                                          all others.
3488
3489                                   verify Continue on verify errors,  exit  on
3490                                          all others.
3491
3492                                   all    Continue on all errors.
3493
3494                                   0      Backward-compatible     alias    for
3495                                          'none'.
3496
3497                                   1      Backward-compatible alias for 'all'.
3498
3499                     ignore_error=str
3500                            Sometimes you want to ignore specific errors during
3501                            a test. In that case you can specify an error list
3502                            for each error type, instead of only being able to
3503                            ignore the default 'non-fatal error' using
3504                            continue_on_error. The format is
3505                            `ignore_error=READ_ERR_LIST,WRITE_ERR_LIST,VERIFY_ERR_LIST';
3506                            errors for a given error type are separated with ':'.
3507                            An error may be a symbol ('ENOSPC', 'ENOMEM') or an
3508                            integer. Example:
3509
3510                                   ignore_error=EAGAIN,ENOSPC:122
3511
3512                            This example ignores EAGAIN from READ, and ENOSPC
3513                            and 122 (EDQUOT) from WRITE. The option works by
3514                            overriding continue_on_error with the list of
3515                            errors given for each error type, if any.
3516
3517                     error_dump=bool
3518                            If set, dump every error even if it is non-fatal;
3519                            true by default. If disabled, only fatal errors
3520                            are dumped.
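
           The ignore_error syntax described above can be illustrated with a
           small Python sketch. It only demonstrates how the value splits into
           per-type lists; it is not fio's own parser, and the function name
           and the handling of empty lists are assumptions made for this
           example.

```python
def parse_ignore_error(value):
    """Split an ignore_error value into read/write/verify error lists.

    Illustration of the documented syntax only, not fio's parser.
    """
    kinds = ("read", "write", "verify")
    lists = value.split(",")
    if len(lists) > len(kinds):
        raise ValueError("expected at most three comma-separated lists")
    result = {kind: [] for kind in kinds}
    for kind, errors in zip(kinds, lists):
        # Errors inside one list are separated by ':'.
        # Assumption: an empty list simply contributes no errors.
        for err in filter(None, errors.split(":")):
            # Integers are kept as int, symbolic names such as 'ENOSPC' as str.
            result[kind].append(int(err) if err.isdigit() else err)
    return result


print(parse_ignore_error("EAGAIN,ENOSPC:122"))
```

           For the example value, EAGAIN lands in the read list, ENOSPC and
           122 in the write list, and the verify list stays empty.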
3521
3522   Running predefined workloads
3523       Fio includes predefined profiles that mimic the I/O workloads generated
3524       by other tools.
3525
3526       profile=str
3527              The predefined workload to run. Current profiles are:
3528
3529                     tiobench
3530                            Threaded I/O bench (tiotest/tiobench)  like  work‐
3531                            load.
3532
3533                     act    Aerospike Certification Tool (ACT) like workload.
3534
3535       To  view  a profile's additional options use --cmdhelp after specifying
3536       the profile. For example:
3537
3538              $ fio --profile=act --cmdhelp
3539
3540   Act profile options
3541       device-names=str
3542              Devices to use.
3543
3544       load=int
3545              ACT load multiplier. Default: 1.
3546
3547       test-duration=time
3548              How long the entire test takes to run. When the unit is omitted,
3549              the value is given in seconds. Default: 24h.
3550
3551       threads-per-queue=int
3552              Number of read I/O threads per device. Default: 8.
3553
3554       read-req-num-512-blocks=int
3555              Number of 512B blocks to read at a time. Default: 3.
3556
3557       large-block-op-kbytes=int
3558              Size of large block ops in KiB (writes). Default: 131072.
3559
3560       prep   Set to run ACT prep phase.
3561
3562   Tiobench profile options
3563       size=str
3564              Size in MiB.
3565
3566       block=int
3567              Block size in bytes. Default: 4096.
3568
3569       numruns=int
3570              Number of runs.
3571
3572       dir=str
3573              Test directory.
3574
3575       threads=int
3576              Number of threads.
3577

OUTPUT

3579       Fio spits out a lot of output. While running, fio will display the sta‐
3580       tus of the jobs created. An example of that would be:
3581
3582                 Jobs: 1 (f=1): [_(1),M(1)][24.8%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 01m:31s]
3583
3584       The characters inside the first set of square brackets denote the  cur‐
3585       rent  status  of  each thread. The first character is the first job de‐
3586       fined in the job file, and so forth. The possible  values  (in  typical
3587       life cycle order) are:
3588
3589              P      Thread setup, but not started.
3590              C      Thread created.
3591              I      Thread initialized, waiting or generating necessary data.
3592              p      Thread running pre-reading file(s).
3593              /      Thread is in ramp period.
3594              R      Running, doing sequential reads.
3595              r      Running, doing random reads.
3596              W      Running, doing sequential writes.
3597              w      Running, doing random writes.
3598              M      Running, doing mixed sequential reads/writes.
3599              m      Running, doing mixed random reads/writes.
3600              D      Running, doing sequential trims.
3601              d      Running, doing random trims.
3602              F      Running, currently waiting for fsync(2).
3603              V      Running, doing verification of written data.
3604              f      Thread finishing.
3605              E      Thread exited, not reaped by main thread yet.
3606              -      Thread reaped.
3607              X      Thread reaped, exited with an error.
3608              K      Thread reaped, exited due to signal.
3609
3610       Fio will condense the thread string so as not to take up more space on
3611       the command line than needed. For instance, if you have 10 readers and
3612       10 writers running, the output would look like this:
3613
3614                 Jobs: 20 (f=20): [R(10),W(10)][4.0%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 57m:36s]
3615
3616       Note  that the status string is displayed in order, so it's possible to
3617       tell which of the jobs are currently doing what. In the  example  above
3618       this means that jobs 1-10 are readers and 11-20 are writers.
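
           As a rough illustration of reading the condensed status string, the
           following Python sketch expands it back into one status character
           per job. The function name is hypothetical, and the sketch assumes
           every item has the `X(n)' form shown in the examples above.

```python
import re


def expand_status(condensed):
    """Expand a condensed status string such as 'R(10),W(10)'
    into a list with one status character per job, in job order."""
    states = []
    for item in condensed.split(","):
        match = re.fullmatch(r"(.)\((\d+)\)", item)
        if match is None:
            raise ValueError(f"unexpected status item: {item!r}")
        # Repeat the status character once per job it covers.
        states.extend(match.group(1) * int(match.group(2)))
    return states


states = expand_status("R(10),W(10)")
# Job numbers are 1-based, matching the description above.
readers = [i + 1 for i, s in enumerate(states) if s == "R"]
writers = [i + 1 for i, s in enumerate(states) if s == "W"]
print(readers[0], readers[-1], writers[0], writers[-1])  # 1 10 11 20
```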
3619
3620       The other values are fairly self-explanatory: number of threads  cur‐
3621       rently running and doing I/O, the number of currently open files  (f=),
3622       the  estimated  completion percentage, the rate of I/O since last check
3623       (read speed listed first, then write speed and optionally  trim  speed)
3624       in  terms of bandwidth and IOPS, and time to completion for the current
3625       running group. It's impossible to estimate  runtime  of  the  following
3626       groups (if any).
3627
3628       When  fio is done (or interrupted by Ctrl-C), it will show the data for
3629       each thread, group of threads, and disks in that order. For each  over‐
3630       all thread (or group) the output looks like:
3631
3632                 Client1: (groupid=0, jobs=1): err= 0: pid=16109: Sat Jun 24 12:07:54 2017
3633                   write: IOPS=88, BW=623KiB/s (638kB/s)(30.4MiB/50032msec)
3634                     slat (nsec): min=500, max=145500, avg=8318.00, stdev=4781.50
3635                     clat (usec): min=170, max=78367, avg=4019.02, stdev=8293.31
3636                      lat (usec): min=174, max=78375, avg=4027.34, stdev=8291.79
3637                     clat percentiles (usec):
3638                      |  1.00th=[  302],  5.00th=[  326], 10.00th=[  343], 20.00th=[  363],
3639                      | 30.00th=[  392], 40.00th=[  404], 50.00th=[  416], 60.00th=[  445],
3640                      | 70.00th=[  816], 80.00th=[ 6718], 90.00th=[12911], 95.00th=[21627],
3641                      | 99.00th=[43779], 99.50th=[51643], 99.90th=[68682], 99.95th=[72877],
3642                      | 99.99th=[78119]
3643                    bw (  KiB/s): min=  532, max=  686, per=0.10%, avg=622.87, stdev=24.82, samples=  100
3644                    iops        : min=   76, max=   98, avg=88.98, stdev= 3.54, samples=  100
3645                   lat (usec)   : 250=0.04%, 500=64.11%, 750=4.81%, 1000=2.79%
3646                   lat (msec)   : 2=4.16%, 4=1.84%, 10=4.90%, 20=11.33%, 50=5.37%
3647                   lat (msec)   : 100=0.65%
3648                   cpu          : usr=0.27%, sys=0.18%, ctx=12072, majf=0, minf=21
3649                   IO depths    : 1=85.0%, 2=13.1%, 4=1.8%, 8=0.1%, 16=0.0%, 32=0.0%, >=64=0.0%
3650                      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
3651                      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
3652                      issued rwt: total=0,4450,0, short=0,0,0, dropped=0,0,0
3653                      latency   : target=0, window=0, percentile=100.00%, depth=8
3654
3655       The  job  name  (or  first  job's  name  when using group_reporting) is
3656       printed, along with the group id, count of jobs being aggregated,  last
3657       error  id  seen  (which is 0 when there are no errors), pid/tid of that
3658       thread and the time the job/group completed. Below are the I/O  statis‐
3659       tics  for  each data direction performed (showing writes in the example
3660       above). In the order listed, they denote:
3661
3662              read/write/trim
3663                     The string before the colon shows the I/O  direction  the
3664                     statistics  are  for.  IOPS is the average I/Os performed
3665                     per second. BW is the average bandwidth  rate  shown  as:
3666                     value in power of 2 format (value in power of 10 format).
3667                     The last two values show: (total I/O performed  in  power
3668                     of 2 format / runtime of that thread).
3669
3670              slat   Submission  latency (min being the minimum, max being the
3671                     maximum, avg being the average, stdev being the  standard
3672                     deviation).  This  is the time it took to submit the I/O.
3673                     For sync I/O this row is not displayed as the slat is re‐
3674                     ally  the completion latency (since queue/complete is one
3675                     operation there). This value can be in nanoseconds,  mi‐
3676                     croseconds or milliseconds; fio will choose  the  most
3677                     appropriate base and print that  (in  the  example  above
3678                     nanoseconds  was the best scale). Note: in --minimal mode
3679                     latencies are always expressed in microseconds.
3680
3681              clat   Completion latency. Same names as slat, this denotes  the
3682                     time from submission to completion of the I/O pieces. For
3683                     sync I/O, clat will usually be equal (or very  close)  to
3684                     0,  as the time from submit to complete is basically just
3685                     CPU time (I/O has already been done,  see  slat  explana‐
3686                     tion).
3687
3688              lat    Total  latency. Same names as slat and clat, this denotes
3689                     the time from when fio created the I/O unit to completion
3690                     of the I/O operation.
3691
3692              bw     Bandwidth  statistics based on samples. Same names as the
3693                     xlat stats, but also includes the number of samples taken
3694                     (samples)  and  an approximate percentage of total aggre‐
3695                     gate bandwidth this thread received in its  group  (per).
3696                     This  last  value is only really useful if the threads in
3697                     this group are on the same disk, since they are then com‐
3698                     peting for disk access.
3699
3700              iops   IOPS statistics based on samples. Same names as bw.
3701
3702              lat (nsec/usec/msec)
3703                     The distribution of I/O completion latencies. This is the
3704                     time from when I/O leaves fio to when it gets completed.
3705                     Unlike  the  separate read/write/trim sections above, the
3706                     data here and in the remaining sections apply to all I/Os
3707                     for  the  reporting  group. 250=0.04% means that 0.04% of
3708                     the I/Os completed in under 250us. 500=64.11% means  that
3709                     64.11% of the I/Os required 250 to 499us for completion.
3710
3711              cpu    CPU usage. User and system time, along with the number of
3712                     context switches this thread went through, and finally
3713                     the number of major and minor page faults. The CPU
3714                     utilization numbers are averages for the jobs in that
3715                     reporting group, while the context and fault counters
3716                     are summed.
3717
3718              IO depths
3719                     The distribution of I/O depths over the job lifetime. The
3720                     numbers  are divided into powers of 2 and each entry cov‐
3721                     ers depths from that value up to  those  that  are  lower
3722                     than the next entry -- e.g., 16= covers depths from 16 to
3723                     31. Note that the range covered by a  depth  distribution
3724                     entry can be different to the range covered by the equiv‐
3725                     alent submit/complete distribution entry.
3726
3727              IO submit
3728                     How many pieces of I/O were submitted in a single  submit
3729                     call. Each entry denotes that amount and below, until the
3730                     previous entry -- e.g., 16=100% means that  we  submitted
3731                     anywhere between 9 and 16 I/Os per submit call. Note that
3732                     the range covered by a submit distribution entry  can  be
3733                     different  to  the  range covered by the equivalent depth
3734                     distribution entry.
3735
3736              IO complete
3737                     Like the above submit number,  but  for  completions  in‐
3738                     stead.
3739
3740              IO issued rwt
3741                     The  number  of  read/write/trim requests issued, and how
3742                     many of them were short or dropped.
3743
3744              IO latency
3745                     These values are for latency_target and related  options.
3746                     When  these  options  are engaged, this section describes
3747                     the I/O depth required to meet the specified latency tar‐
3748                     get.
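
           The two bandwidth figures on the read/write/trim line can be
           related with simple arithmetic. The Python sketch below reproduces
           the numbers from the example output above (623KiB/s, 638kB/s,
           30.4MiB in 50032msec); the helper names are made up for this
           illustration.

```python
KIB = 1024
MIB = 1024 * 1024


def kib_to_kb_per_s(kib_per_s):
    """Convert a power-of-2 rate (KiB/s) to the power-of-10 rate (kB/s)."""
    return kib_per_s * KIB / 1000


def average_bw_kib(total_mib, runtime_msec):
    """Average bandwidth in KiB/s from total I/O (MiB) and runtime (msec)."""
    return total_mib * MIB / KIB / (runtime_msec / 1000)


# BW=623KiB/s is shown as (638kB/s) in the example output.
print(round(kib_to_kb_per_s(623)))  # 638

# 30.4MiB over 50032msec gives roughly the reported 623KiB/s; the small
# difference comes from 30.4MiB itself being a rounded value.
print(average_bw_kib(30.4, 50032))
```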
3749
3750       After  each  client  has been listed, the group statistics are printed.
3751       They will look like this:
3752
3753                 Run status group 0 (all jobs):
3754                    READ: bw=20.9MiB/s (21.9MB/s), 10.4MiB/s-10.8MiB/s (10.9MB/s-11.3MB/s), io=64.0MiB (67.1MB), run=2973-3069msec
3755                   WRITE: bw=1231KiB/s (1261kB/s), 616KiB/s-621KiB/s (630kB/s-636kB/s), io=64.0MiB (67.1MB), run=52747-53223msec
3756
3757       For each data direction it prints:
3758
3759              bw     Aggregate bandwidth of threads in this group followed  by
3760                     the  minimum  and maximum bandwidth of all the threads in
3761                     this group.  Values outside of  brackets  are  power-of-2
3762                     format  and  those  within  are the equivalent value in a
3763                     power-of-10 format.
3764
3765              io     Aggregate I/O performed of all threads in this group. The
3766                     format is the same as bw.
3767
3768              run    The  smallest and longest runtimes of the threads in this
3769                     group.
3770
3771       And finally, the disk statistics are printed. This is  Linux  specific.
3772       They will look like this:
3773
3774                   Disk stats (read/write):
3775                     sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
3776
3777       Each  value is printed for both reads and writes, with reads first. The
3778       numbers denote:
3779
3780              ios    Number of I/Os performed by all groups.
3781
3782              merge  Number of merges performed by the I/O scheduler.
3783
3784              ticks  Number of ticks we kept the disk busy.
3785
3786              in_queue
3787                     Total time spent in the disk queue.
3788
3789              util   The disk utilization. A value of 100% means we  kept  the
3790                     disk  busy constantly, 50% would be a disk idling half of
3791                     the time.
3792
3793       It is also possible to get fio to dump the current output while  it  is
3794       running,  without  terminating  the  job. To do that, send fio the USR1
3795       signal. You can also get regularly timed dumps by using the
3796       --status-interval parameter, or by creating a file in `/tmp' named
3797       `fio-dump-status'. If fio sees this file, it will unlink  it  and  dump
3798       the current output status.
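
           A minimal Python sketch of the two on-demand mechanisms described
           above (the USR1 signal and the `fio-dump-status' file) might look
           as follows. The function name and its parameters are hypothetical,
           and the process id of the running fio has to be supplied by the
           caller.

```python
import os
import signal
from pathlib import Path


def request_status_dump(pid=None, tmp_dir="/tmp"):
    """Ask a running fio to dump its current output status.

    With a pid, send the USR1 signal to that process; otherwise create
    the `fio-dump-status' file that fio looks for in /tmp.
    """
    if pid is not None:
        os.kill(pid, signal.SIGUSR1)
        return None
    marker = Path(tmp_dir) / "fio-dump-status"
    marker.touch()
    return marker
```

           Calling `request_status_dump()' without arguments uses the file
           method; passing `pid=' uses the signal instead.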
3799

TERSE OUTPUT

3801       For  scripted  usage  where  you  typically  want to generate tables or
3802       graphs of the results, fio can output the results in a semicolon  sepa‐
3803       rated format. The format is one long line of values, such as:
3804
3805                 2;card0;0;0;7139336;121836;60004;1;10109;27.932460;116.933948;220;126861;3495.446807;1085.368601;226;126864;3523.635629;1089.012448;24063;99944;50.275485%;59818.274627;5540.657370;7155060;122104;60004;1;8338;29.086342;117.839068;388;128077;5032.488518;1234.785715;391;128085;5061.839412;1236.909129;23436;100928;50.287926%;59964.832030;5644.844189;14.595833%;19.394167%;123706;0;7313;0.1%;0.1%;0.1%;0.1%;0.1%;0.1%;100.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.02%;0.05%;0.16%;6.04%;40.40%;52.68%;0.64%;0.01%;0.00%;0.01%;0.00%;0.00%;0.00%;0.00%;0.00%
3806                 A description of this job goes here.
3807
3808       The  job  description  (if provided) follows on a second line for terse
3809       v2.  It appears on the same line for other terse versions.
3810
3811       To enable terse output, use the  --minimal  or  `--output-format=terse'
3812       command  line options. The first value is the version of the terse out‐
3813       put format. If the output has to be changed for some reason, this  num‐
3814       ber will be incremented by 1 to signify that change.
3815
3816       Split  up, the format is as follows (comments in brackets denote when a
3817       field was introduced or whether it's specific to some terse version):
3818
3819                      terse version, fio version [v3], jobname, groupid, error
3820
3821              READ status:
3822
3823                      Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
3824                      Submission latency: min, max, mean, stdev (usec)
3825                      Completion latency: min, max, mean, stdev (usec)
3826                      Completion latency percentiles: 20 fields (see below)
3827                      Total latency: min, max, mean, stdev (usec)
3828                      Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
3829                      IOPS [v5]: min, max, mean, stdev, number of samples
3830
3831              WRITE status:
3832
3833                      Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
3834                      Submission latency: min, max, mean, stdev (usec)
3835                      Completion latency: min, max, mean, stdev (usec)
3836                      Completion latency percentiles: 20 fields (see below)
3837                      Total latency: min, max, mean, stdev (usec)
3838                      Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
3839                      IOPS [v5]: min, max, mean, stdev, number of samples
3840
3841              TRIM status [all but version 3]:
3842
3843                      Fields are similar to READ/WRITE status.
3844
3845              CPU usage:
3846
3847                      user, system, context switches, major faults, minor faults
3848
3849              I/O depths:
3850
3851                      <=1, 2, 4, 8, 16, 32, >=64
3852
3853              I/O latencies microseconds:
3854
3855                      <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000
3856
3857              I/O latencies milliseconds:
3858
3859                      <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000, >=2000
3860
3861              Disk utilization [v3]:
3862
3863                      disk name, read ios, write ios, read merges, write merges, read ticks, write ticks, time spent in queue, disk utilization percentage
3864
3865              Additional Info (dependent on continue_on_error, default off):
3866
3867                      total # errors, first error code
3868
3869              Additional Info (dependent on description being set):
3870
3871                      Text description
3872
3873       Completion latency percentiles can be a grouping of up to 20  sets,  so
3874       for  the terse output fio writes all of them. Each field will look like
3875       this:
3876
3877                 1.00%=6112
3878
3879       which is the Xth percentile, and the `usec' latency associated with it.
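
           A single percentile field of this form can be split into its two
           parts with a few lines of Python. This is only a sketch with a
           hypothetical function name.

```python
def parse_percentile(field):
    """Split a terse percentile field such as '1.00%=6112' into
    (percentile, latency in usec)."""
    pct, _, usec = field.partition("=")
    return float(pct.rstrip("%")), int(usec)


print(parse_percentile("1.00%=6112"))  # (1.0, 6112)
```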
3880
3881       For Disk utilization, all disks used by fio are shown. So for each disk
3882       there will be a disk utilization section.
3883
3884       Below is a single line containing short names for each of the fields in
3885       the minimal output v3, separated by semicolons:
3886
3887                 terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth_kb;read_iops;read_runtime_ms;read_slat_min_us;read_slat_max_us;read_slat_mean_us;read_slat_dev_us;read_clat_min_us;read_clat_max_us;read_clat_mean_us;read_clat_dev_us;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min_us;read_lat_max_us;read_lat_mean_us;read_lat_dev_us;read_bw_min_kb;read_bw_max_kb;read_bw_agg_pct;read_bw_mean_kb;read_bw_dev_kb;write_kb;write_bandwidth_kb;write_iops;write_runtime_ms;write_slat_min_us;write_slat_max_us;write_slat_mean_us;write_slat_dev_us;write_clat_min_us;write_clat_max_us;write_clat_mean_us;write_clat_dev_us;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min_us;write_lat_max_us;write_lat_mean_us;write_lat_dev_us;write_bw_min_kb;write_bw_max_kb;write_bw_agg_pct;write_bw_mean_kb;write_bw_dev_kb;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
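
           One way to use the short names above in a script is to pair them
           with the values of a terse output line. The Python sketch below
           assumes both lines are passed in as strings; the function name, the
           `_extra' key and the sample values are made up for illustration.
           Values beyond the named fields (for example the sections of
           additional disks) are kept separately rather than dropped.

```python
def terse_to_dict(header_line, data_line):
    """Pair semicolon-separated field names with terse output values."""
    names = header_line.strip().split(";")
    values = data_line.strip().split(";")
    record = dict(zip(names, values))
    # Anything after the named fields is preserved under a separate key.
    record["_extra"] = values[len(names):]
    return record


# Shortened, made-up sample covering only the first five fields.
header = "terse_version_3;fio_version;jobname;groupid;error"
data = "3;fio-3.x;job1;0;0"
print(terse_to_dict(header, data)["jobname"])  # job1
```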
3888
3889       In client/server mode terse output differs from what appears when  jobs
3890       are  run  locally.  Disk  utilization data is omitted from the standard
3891       terse output and for v3 and later appears on its own separate  line  at
3892       the end of each terse reporting cycle.
3893

JSON OUTPUT

3895       The json output format is intended to be both human readable and conve‐
3896       nient for automated parsing. For the  most  part  its  sections  mirror
3897       those  of  the normal output. The runtime value is reported in msec and
3898       the bw value is reported in 1024 bytes per second units.
3899

JSON+ OUTPUT

3901       The json+ output format is identical to the json output  format  except
3902       that  it adds a full dump of the completion latency bins. Each bins ob‐
3903       ject contains a set of (key, value) pairs where keys are latency  dura‐
3904       tions  and  values  count how many I/Os had completion latencies of the
3905       corresponding duration. For example, consider:
3906
3907              "bins" : { "87552" : 1, "89600" : 1, "94720" : 1, "96768"  :  1,
3908              "97792" : 1, "99840" : 1, "100864" : 2, "103936" : 6, "104960" :
3909              534, "105984" : 5995, "107008" : 7529, ... }
3910
3911       This data indicates that one I/O required  87,552ns  to  complete,  two
3912       I/Os  required  100,864ns to complete, and 7529 I/Os required 107,008ns
3913       to complete.
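
           To show how such a bins object might be consumed, the Python sketch
           below computes the total I/O count, the count-weighted mean and a
           percentile from the entries listed in the example above (the
           elided entries are left out). The function name is hypothetical,
           and the durations are used as-is.

```python
def summarize_bins(bins, percentile=50.0):
    """Return (total I/Os, mean duration, duration at the given percentile)."""
    items = sorted((int(duration), count) for duration, count in bins.items())
    total = sum(count for _, count in items)
    if total == 0:
        raise ValueError("no I/Os recorded in bins")
    mean = sum(duration * count for duration, count in items) / total
    threshold = total * percentile / 100.0
    seen = 0
    for duration, count in items:
        # Walk the bins in ascending order until the threshold is reached.
        seen += count
        if seen >= threshold:
            return total, mean, duration
    return total, mean, items[-1][0]


bins = {"87552": 1, "89600": 1, "94720": 1, "96768": 1, "97792": 1,
        "99840": 1, "100864": 2, "103936": 6, "104960": 534,
        "105984": 5995, "107008": 7529}
total, mean, p50 = summarize_bins(bins)
print(total, p50)  # 14072 107008
```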
3914
3915       Also included with fio is a Python  script  fio_jsonplus_clat2csv  that
3916       takes  json+  output  and generates CSV-formatted latency data suitable
3917       for plotting.
3918
3919       The latency durations actually represent the midpoints of  latency  in‐
3920       tervals.  For details refer to `stat.h' in the fio source.
3921

TRACE FILE FORMAT

3923       There are three trace file formats that you can encounter.  The  oldest
3924       (v1) format is unsupported since version 1.20-rc3 (March 2008).  It  is
3925       still described below in case you get an  old  trace  and  want  to
3926       understand it.
3927
3928       In any case the trace is a simple text file with a  single  action  per
3929       line.
3930
3931       Trace file format v1
3932              Each  line  represents a single I/O action in the following for‐
3933              mat:
3934
3935                     rw, offset, length
3936
3937              where `rw=0/1' for read/write, and the `offset' and `length' en‐
3938              tries being in bytes.
3939
3940              This format is not supported in fio versions >= 1.20-rc3.
3941
3942       Trace file format v2
3943              The  second  version  of  the trace file format was added in fio
3944              version 1.17. It allows access to more than one file  per  trace
3945              and has a bigger set of possible file actions.
3946
3947              The first line of the trace file has to be:
3948
3949                     "fio version 2 iolog"
3950
3951              Following  this can be lines in two different formats, which are
3952              described below.
3953
3954              The file management format:
3955                     filename action
3956
3957                     The `filename' is given as an absolute path. The `action'
3958                     can be one of these:
3959
3960                            add    Add the given `filename' to the trace.
3961
3962                            open   Open  the  file  with the given `filename'.
3963                                   The `filename' has to have been added  with
3964                                   the add action before.
3965
3966                            close  Close  the  file with the given `filename'.
3967                                   The file has to have been opened before.
3968
3969              The file I/O action format:
3970                     filename action offset length
3971
3972                     The `filename' is given as an absolute path, and  has  to
3973                     have  been  added  and  opened before it can be used with
3974                     this format. The  `offset'  and  `length'  are  given  in
3975                     bytes. The `action' can be one of these:
3976
3977                            wait   Wait  for `offset' microseconds. Everything
3978                                   below 100 is discarded.  The time is  rela‐
3979                                   tive to the previous `wait' statement. Note
3980                                   that the `wait' action is not  allowed  as
3981                                   of trace format v3, as the same behavior can be
3982                                   achieved using timestamps.
3983
3984                            read   Read `length' bytes  beginning  from  `off‐
3985                                   set'.
3986
3987                            write  Write  `length'  bytes beginning from `off‐
3988                                   set'.
3989
3990                            sync   fsync(2) the file.
3991
3992                            datasync
3993                                   fdatasync(2) the file.
3994
3995                            trim   Trim the given file from the given `offset'
3996                                   for `length' bytes.
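
           The v2 layout described above can be sketched as a small Python
           generator for a single file. This is an illustration only: the
           function name and the example path are hypothetical, and it assumes
           the header line is written without the surrounding quotation
           marks.

```python
V2_IO_ACTIONS = {"wait", "read", "write", "sync", "datasync", "trim"}


def build_v2_trace(path, io_actions):
    """Build the text of a v2 trace for one file.

    io_actions is a list of (action, offset, length) tuples.
    """
    lines = [
        "fio version 2 iolog",  # mandatory first line
        f"{path} add",          # a file must be added before it is opened
        f"{path} open",
    ]
    for action, offset, length in io_actions:
        if action not in V2_IO_ACTIONS:
            raise ValueError(f"unknown v2 action: {action!r}")
        lines.append(f"{path} {action} {offset} {length}")
    lines.append(f"{path} close")
    return "\n".join(lines) + "\n"


# Hypothetical example: write 4096 bytes at offset 0, then read them back.
print(build_v2_trace("/tmp/testfile", [("write", 0, 4096), ("read", 0, 4096)]), end="")
```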
3997
3998       Trace file format v3
3999              The third version of the trace file format was added in fio ver‐
4000              sion 3.31. It forces each action to have a timestamp  associated
4001              with it.
4002
4003              The first line of the trace file has to be:
4004
4005                     "fio version 3 iolog"
4006
4007              Following  this can be lines in two different formats, which are
4008              described below.
4009
4010              The file management format:
4011                     timestamp filename action
4012
4013              The file I/O action format:
4014                     timestamp filename action offset length
4015
4016                     The `timestamp' is relative to the beginning of  the  run
4017                     (i.e. it starts at 0). The `filename', `action', `offset'
4018                     and `length' are identical to version 2, except that
4019                     version 3 does not allow the `wait' action.
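
           A v3 I/O line differs from v2 only by the leading timestamp and the
           missing `wait' action, which the following Python sketch makes
           explicit. The function name is hypothetical, and the timestamp is
           passed through unchanged since its unit is not specified here.

```python
# The v2 I/O actions without `wait', which v3 does not allow.
V3_IO_ACTIONS = {"read", "write", "sync", "datasync", "trim"}


def v3_io_line(timestamp, filename, action, offset, length):
    """Format one v3 file I/O action line."""
    if action not in V3_IO_ACTIONS:
        raise ValueError(f"action not allowed in v3: {action!r}")
    return f"{timestamp} {filename} {action} {offset} {length}"


print(v3_io_line(0, "/tmp/testfile", "read", 0, 4096))
```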
4020

I/O REPLAY - MERGING TRACES

4022       Colocation  is a common practice used to get the most out of a machine.
4023       Knowing which workloads play nicely with  each  other  and  which  ones
4024       don't  is  a  much  harder task. While fio can replay workloads concur‐
4025       rently via multiple jobs, it leaves some variability up to  the  sched‐
4026       uler  making  results harder to reproduce. Merging is a way to make the
4027       order of events consistent.
4028
4029       Merging is integrated into  I/O  replay  and  done  when  a  merge_blk‐
4030       trace_file  is  specified.  The  list  of files passed to read_iolog go
4031       through the merge process and output a single file stored to the speci‐
4032       fied  file.  The  output  file is passed on as if it were the only file
4033       passed to read_iolog. An example would look like:
4034
4035              $ fio --read_iolog="<file1>:<file2>" \
4036                  --merge_blktrace_file="<output_file>"
4037
4038       Creating  only  the merged file can be done by passing the command line
4039       argument --merge-blktrace-only.
4040
4041       Scaling traces can be done to see the relative impact of any particular
4042       trace  being  slowed down or sped up. merge_blktrace_scalars takes in a
4043       colon separated list of percentage scalars. It is index paired with the
4044       files passed to read_iolog.
4045
4046       With  scaling,  it  may  be  desirable to match the running time of all
4047       traces.  This can be done with merge_blktrace_iters. It is index paired
4048       with read_iolog just like merge_blktrace_scalars.
4049
4050       As an example, consider two traces, A and B, each 60s long. If we  want
4051       to see the impact of trace A issuing I/Os twice as fast and repeat trace
4052       A over the runtime of trace B, the following can be done:
4053
4054              $ fio --read_iolog="<trace_a>:<trace_b>" \
4055                  --merge_blktrace_file="<output_file>" \
4056                  --merge_blktrace_scalars="50:100" --merge_blktrace_iters="2:1"
4057
4058       This runs trace A at 2x the speed twice for approximately the same run‐
4059       time as a single run of trace B.
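
           The arithmetic behind this example can be checked with a short
           Python sketch. It uses a simplified model in which a scalar is the
           percentage of the original trace duration and iterations repeat
           the scaled trace back to back; both helper names are hypothetical,
           and only the option names come from the text above.

```python
def merged_runtime(duration_s, scalar_pct, iters):
    """Approximate runtime of one trace after scaling and repeating it."""
    return duration_s * scalar_pct / 100 * iters


def merge_args(traces, output, scalars, iters):
    """Assemble the fio arguments for a merge; lists are index paired."""
    if not (len(traces) == len(scalars) == len(iters)):
        raise ValueError("traces, scalars and iters must have equal length")
    return [
        "fio",
        "--read_iolog=" + ":".join(traces),
        "--merge_blktrace_file=" + output,
        "--merge_blktrace_scalars=" + ":".join(str(s) for s in scalars),
        "--merge_blktrace_iters=" + ":".join(str(i) for i in iters),
    ]


# Trace A: 60s at scalar 50 (2x speed), run twice. Trace B: unchanged.
print(merged_runtime(60, 50, 2), merged_runtime(60, 100, 1))  # 60.0 60.0
print(merge_args(["<trace_a>", "<trace_b>"], "<output_file>", [50, 100], [2, 1]))
```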
4060

CPU IDLENESS PROFILING

4062       In some cases, we want to understand CPU overhead in a test. For  exam‐
4063       ple, we test patches specifically for whether they reduce CPU usage.
4064       Fio implements a balloon approach: it creates a thread per CPU that
4065       runs at idle priority, meaning that it only runs when nobody else
4066       needs the CPU. By measuring the amount of work completed by the
4067       thread, the idleness of each CPU can be derived accordingly.
4068
4069       A unit of work is defined as touching a full page of unsigned
4070       characters. The mean and standard deviation of the time to complete a
4071       unit of work are reported in the "unit work" section. Options can be
4072       chosen to report detailed per-CPU idleness or overall system idleness
4073       by aggregating per-CPU stats.
4073

VERIFICATION AND TRIGGERS

4075       Fio is usually run in one of two ways when data verification  is  done.
4076       The  first is a normal write job of some sort with verify enabled. When
4077       the write phase has completed, fio switches to reads and  verifies  ev‐
4078       erything  it  wrote.  The second model is running just the write phase,
4079       and then later on running the same  job  (but  with  reads  instead  of
4080       writes)  to  repeat the same I/O patterns and verify the contents. Both
4081       of these methods depend on the write phase being completed, as fio oth‐
4082       erwise has no idea how much data was written.
4083
4084       With  verification  triggers,  fio  supports  dumping the current write
4085       state to local files. Then a subsequent read verify workload  can  load
4086       this  state  and know exactly where to stop. This is useful for testing
4087       cases where power is cut to a server in  a  managed  fashion,  for  in‐
4088       stance.
4089
4090       A verification trigger consists of two things:
4091
4092              1) Storing the write state of each job.
4093
4094              2) Executing a trigger command.
4095
4096       The  write state is relatively small, on the order of hundreds of bytes
4097       to single kilobytes. It contains information on the number  of  comple‐
4098       tions done, the last X completions, etc.
4099
4100       A  trigger  is invoked either through creation ('touch') of a specified
4101       file in the system, or through a timeout setting. If fio  is  run  with
4102       `--trigger-file=/tmp/trigger-file',  then it will continually check for
4103       the existence of `/tmp/trigger-file'. When it sees this file,  it  will
4104       fire off the trigger (thus saving state, and executing the trigger com‐
4105       mand).
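
           The two ways a trigger can fire can be pictured with a small
           polling loop. This is only a sketch of the behaviour described
           above, not fio's implementation; the function name, the polling
           interval, and the return values are hypothetical.

```python
import os
import tempfile
import time


def wait_for_trigger(trigger_path, trigger_timeout=None, poll_interval=0.05):
    """Block until the trigger file exists or the timeout elapses.

    Returns "file" or "timeout" to say which condition fired. At that point
    fio would save the write state and then execute the trigger command.
    """
    start = time.monotonic()
    while True:
        if os.path.exists(trigger_path):
            return "file"
        if trigger_timeout is not None and time.monotonic() - start >= trigger_timeout:
            return "timeout"
        time.sleep(poll_interval)


with tempfile.TemporaryDirectory() as tmp:
    trigger = os.path.join(tmp, "my-trigger")
    open(trigger, "w").close()  # same effect as `touch`
    fired_by = wait_for_trigger(trigger)
    print(fired_by)  # file
```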
4106
4107       For client/server runs, there's both a local and remote trigger. If fio
4108       is running as a server backend, it will send the job states back to the
4109       client for safe storage, then execute the remote trigger, if specified.
4110       If  a  local  trigger is specified, the server will still send back the
4111       write state, but the client will then execute the trigger.
4112
4113       Verification trigger example
4114              Let's say we want to run a powercut test on the remote Linux
4115              machine 'server'. Our write workload is in `write-test.fio'. We
4116              want to cut power to 'server' at some point during the run, and
4117              we'll run this test from the safety of our local machine,
4118              'localbox'. On the server, we'll start the fio backend normally:
4119
4120                     server# fio --server
4121
4122              and on the client, we'll fire off the workload:
4123
4124                     localbox$ fio --client=server \
4125                               --trigger-file=/tmp/my-trigger \
4126                               --trigger-remote="bash -c 'echo b > /proc/sysrq-trigger'"
4127
4128              We set `/tmp/my-trigger' as the trigger file, and we tell fio to
4129              execute:
4130
4131                     echo b > /proc/sysrq-trigger
4132
4133              on  the  server once it has received the trigger and sent us the
4134              write state. This will work, but it's not really  cutting  power
4135              to  the  server, it's merely abruptly rebooting it. If we have a
4136              remote way of cutting power to the server through IPMI or  simi‐
4137              lar,  we  could do that through a local trigger command instead.
4138              Let's assume we have a script that does IPMI reboot of  a  given
4139              hostname,  ipmi-reboot.  On localbox, we could then have run fio
4140              with a local trigger instead:
4141
4142                     localbox$ fio --client=server --trigger-file=/tmp/my-trigger \
4143                               --trigger="ipmi-reboot server"
4144
4145              For  this  case,  fio  would  wait for the server to send us the
4146              write state, then execute `ipmi-reboot server'  when  that  hap‐
4147              pened.
4148
4149       Loading verify state
4150              To  load  stored  write state, a read verification job file must
4151              contain the verify_state_load option. If that is set,  fio  will
4152              load  the  previously  stored state. For a local fio run this is
4153              done by loading the files directly, and on a client/server  run,
4154              the  server  backend  will ask the client to send the files over
4155              and load them from there.
4156

LOG FILE FORMATS

4158       Fio supports a variety of log  file  formats,  for  logging  latencies,
4159       bandwidth,  and  IOPS. The logs share a common format, which looks like
4160       this:
4161
4162              time (msec), value, data direction, block size  (bytes),  offset
4163              (bytes), command priority
4164
4165       `Time' for the log entry is always in milliseconds. The `value' logged
4166       depends on the type of log; it is one of the following:
4167
4168              Latency log
4169                     Value is latency in nsecs
4170
4171              Bandwidth log
4172                     Value is in KiB/sec
4173
4174              IOPS log
4175                     Value is IOPS
4176
4177       `Data direction' is one of the following:
4178
4179              0      I/O is a READ
4180
4181              1      I/O is a WRITE
4182
4183              2      I/O is a TRIM
4184
4185       The entry's `block size' is always in bytes. The `offset' is the  posi‐
4186       tion  in  bytes from the start of the file for that particular I/O. The
4187       logging of the offset can be toggled with log_offset.
4188
4189       If log_prio is not set, the entry's `Command priority' is 1 for an IO
4190       executed with the highest RT priority class (prioclass=1 or
4191       cmdprio_class=1) and 0 otherwise. This is controlled by the prioclass
4192       option and the ioengine-specific cmdprio_percentage and cmdprio_class
4193       options. If log_prio is set, the entry's `Command priority' is the
4194       priority set for the IO, as a 16-bit hexadecimal number with the lowest
4195       13 bits indicating the priority value (prio and cmdprio options) and
4196       the highest 3 bits indicating the IO priority class (prioclass and
4197       cmdprio_class options).
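
           A log entry in this format can be decoded with a few lines of
           Python. The sketch below assumes that all six fields are present
           (log_offset and log_prio set) and that the priority is written as
           a hexadecimal string; the sample line and all names are made up
           for illustration.

```python
from typing import NamedTuple

DIRECTIONS = {0: "READ", 1: "WRITE", 2: "TRIM"}


class LogEntry(NamedTuple):
    time_msec: int
    value: int
    direction: str
    block_size: int
    offset: int
    prio_class: int   # highest 3 bits of the 16-bit priority
    prio_level: int   # lowest 13 bits of the 16-bit priority


def parse_log_line(line):
    """Split one comma-separated log line and decode its priority field."""
    fields = [field.strip() for field in line.split(",")]
    time_msec, value, ddir, block_size, offset = (int(f) for f in fields[:5])
    prio = int(fields[5], 16)  # accepts "4002" as well as "0x4002"
    return LogEntry(time_msec, value, DIRECTIONS[ddir], block_size, offset,
                    prio >> 13, prio & 0x1FFF)


# Hypothetical sample line, not taken from a real fio run.
entry = parse_log_line("1500, 48211, 1, 4096, 1048576, 0x4002")
print(entry.direction, entry.prio_class, entry.prio_level)  # WRITE 2 2
```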
4198
4199       Fio defaults to logging every individual I/O, but when windowed logging
4200       is set through log_avg_msec, either the average (by default) or the
4201       maximum (if log_max_value is set) `value' seen over the specified
4202       period of time is recorded. Each `data direction' seen within the
4203       window period will aggregate its values in a separate row. Further,
4204       when using windowed logging, the `block size' and `offset' entries
4205       will always contain 0.
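
           The effect of windowed logging can be approximated as follows.
           This is a sketch under stated assumptions, not fio's code: the
           function name is hypothetical, averages are truncated to
           integers, and each row is stamped with the end of its window.

```python
from collections import defaultdict


def window_log(entries, avg_msec, use_max=False):
    """Aggregate (time_msec, value, direction) samples per window and direction.

    Returns rows of (time_msec, value, direction, block_size, offset), with
    block size and offset fixed at 0 as described above.
    """
    buckets = defaultdict(list)
    for time_msec, value, ddir in entries:
        buckets[(time_msec // avg_msec, ddir)].append(value)

    rows = []
    for (window, ddir), values in sorted(buckets.items()):
        aggregate = max(values) if use_max else sum(values) // len(values)
        rows.append(((window + 1) * avg_msec, aggregate, ddir, 0, 0))
    return rows


# Hypothetical samples: two reads and one write in the first second,
# one read in the second.
samples = [(100, 10, 0), (400, 30, 0), (600, 50, 1), (1200, 70, 0)]
print(window_log(samples, 1000))
# [(1000, 20, 0, 0, 0), (1000, 50, 1, 0, 0), (2000, 70, 0, 0, 0)]
```

           Passing use_max=True mirrors the log_max_value case: the first
           read row would then carry 30 instead of 20.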
4206

CLIENT / SERVER

4208       Normally fio is invoked as a stand-alone  application  on  the  machine
4209       where  the  I/O  workload should be generated. However, the backend and
4210       frontend of fio can be run separately, i.e., the fio server can generate
4211       an  I/O workload on the "Device Under Test" while being controlled by a
4212       client on another machine.
4213
4214       Start the server on the machine which has access to the storage DUT:
4215
4216              $ fio --server=args
4217
4218       where `args' defines what fio listens on. The arguments are of the form
4219       `type:hostname,port' or `type:IP,port'; as the examples below show, parts
4220       may be omitted. `type' is either `ip' (or ip4) for TCP/IP v4, `ip6' for
4221       TCP/IP v6, or `sock' for a local unix domain socket. `hostname' is either
4222       a hostname or IP address, and `port' is the port to listen on (only valid
4223       for TCP/IP, not a local socket). Some examples:
4224
4225              1) fio --server
4226                     Start  a  fio  server, listening on all interfaces on the
4227                     default port (8765).
4228
4229              2) fio --server=ip:hostname,4444
4230                     Start a fio server, listening on IP belonging to hostname
4231                     and on port 4444.
4232
4233              3) fio --server=ip6:::1,4444
4234                     Start  a  fio server, listening on IPv6 localhost ::1 and
4235                     on port 4444.
4236
4237              4) fio --server=,4444
4238                     Start a fio server, listening on all interfaces  on  port
4239                     4444.
4240
4241              5) fio --server=1.2.3.4
4242                     Start  a  fio  server, listening on IP 1.2.3.4 on the de‐
4243                     fault port.
4244
4245              6) fio --server=sock:/tmp/fio.sock
4246                     Start  a  fio  server,  listening  on  the  local  socket
4247                     `/tmp/fio.sock'.
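
           To make the argument forms concrete, the six examples above can be
           split into their parts with a small parser. This is an
           illustrative sketch derived only from those examples; the function
           name and the returned dictionary layout are hypothetical, and
           fio's own parsing may differ in edge cases.

```python
DEFAULT_PORT = 8765


def parse_server_arg(arg=""):
    """Split a --server argument into type, host (or socket path) and port."""
    kind, rest = "ip", arg
    for prefix in ("ip4", "ip6", "ip", "sock"):
        if arg.startswith(prefix + ":"):
            kind, rest = prefix, arg[len(prefix) + 1:]
            break

    if kind == "sock":
        return {"type": "sock", "path": rest}

    # The port, if any, follows the last comma; IPv6 addresses contain colons
    # but no commas, so splitting on the comma is safe for these examples.
    host, sep, port = rest.rpartition(",")
    if not sep:
        host, port = rest, ""
    return {
        "type": "ip" if kind == "ip4" else kind,
        "host": host or None,  # None means "all interfaces"
        "port": int(port) if port else DEFAULT_PORT,
    }


for example in ("", "ip:hostname,4444", "ip6:::1,4444", ",4444",
                "1.2.3.4", "sock:/tmp/fio.sock"):
    print(repr(example), "->", parse_server_arg(example))
```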
4248
4249       Once  a  server  is  running,  a "client" can connect to the fio server
4250       with:
4251
4252              $ fio <local-args> --client=<server> <remote-args> <job file(s)>
4253
4254       where `local-args' are arguments for the client where  it  is  running,
4255       `server' is the connect string, and `remote-args' and `job file(s)' are
4256       sent to the server. The `server' string follows the same format  as  it
4257       does on the server side, to allow IP/hostname/socket and port strings.
4258
4259       Fio can connect to multiple servers this way:
4260
4261              $  fio  --client=<server1> <job file(s)> --client=<server2> <job
4262              file(s)>
4263
4264       If the job file is located on the fio server, then you can tell the
4265       server to load a local file as well. This is done by using
4266       --remote-config:
4267
4268              $ fio --client=server --remote-config /path/to/file.fio
4269
4270       Then fio will open this local (to the server) job file instead of being
4271       passed one from the client.
4272
4273       If you have many servers (example: 100 VMs/containers), you can pass the
4274       pathname of a file containing host IPs/names as the parameter value for
4275       the --client option. For example, here is a `host.list' file
4276       containing 2 hostnames:
4277
4278              host1.your.dns.domain
4279              host2.your.dns.domain
4280
4281       The fio command would then be:
4282
4283              $ fio --client=host.list <job file(s)>
4284
4285       In this mode, you cannot input server-specific parameters or job  files
4286       -- all servers receive the same job file.
4287
4288       In order to let `fio --client' runs use a shared filesystem from multi‐
4289       ple hosts, `fio --client' now prepends the IP address of the server  to
4290       the filename. For example, if fio is using the directory `/mnt/nfs/fio'
4291       and is writing filename `fileio.tmp', with a --client  `hostfile'  con‐
4292       taining  two  hostnames  `h1' and `h2' with IP addresses 192.168.10.120
4293       and 192.168.10.121, then fio will create two files:
4294
4295              /mnt/nfs/fio/192.168.10.120.fileio.tmp
4296              /mnt/nfs/fio/192.168.10.121.fileio.tmp
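
           The naming rule can be written out as a one-line helper. The
           function name is hypothetical; the sketch only reproduces the two
           file names listed above and is not fio's implementation.

```python
import posixpath


def client_filename(directory, filename, server_ip):
    """Build the per-server file name: <directory>/<server IP>.<filename>."""
    return posixpath.join(directory, f"{server_ip}.{filename}")


for ip in ("192.168.10.120", "192.168.10.121"):
    print(client_filename("/mnt/nfs/fio", "fileio.tmp", ip))
# /mnt/nfs/fio/192.168.10.120.fileio.tmp
# /mnt/nfs/fio/192.168.10.121.fileio.tmp
```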
4297
4298       Terse output in client/server mode will differ slightly  from  what  is
4299       produced when fio is run in stand-alone mode. See the terse output sec‐
4300       tion for details.
4301

AUTHORS

4303       fio was written by Jens Axboe <axboe@kernel.dk>.
4304       This man page was written  by  Aaron  Carroll  <aaronc@cse.unsw.edu.au>
4305       based on documentation by Jens Axboe.
4306       This  man  page  was  rewritten by Tomohiro Kusumi <tkusumi@tuxera.com>
4307       based on documentation by Jens Axboe.
4308

REPORTING BUGS

4310       Report bugs to the fio mailing list <fio@vger.kernel.org>.
4311       See REPORTING-BUGS.
4312
4313       REPORTING-BUGS: http://git.kernel.dk/cgit/fio/plain/REPORTING-BUGS
4314