fio(1) General Commands Manual fio(1)

fio - flexible I/O tester

fio [options] [jobfile]...

fio is a tool that will spawn a number of threads or processes doing a
particular type of I/O action as specified by the user. The typical
use of fio is to write a job file matching the I/O load one wants to
simulate.

18 --debug=type
19 Enable verbose tracing type of various fio actions. May be `all'
20 for all types or individual types separated by a comma (e.g.
21 `--debug=file,mem' will enable file and memory debugging).
22 `help' will list all available tracing options.
23
24 --parse-only
25 Parse options only, don't start any I/O.
26
27 --output=filename
28 Write output to filename.
29
30 --output-format=format
31 Set the reporting format to `normal', `terse', `json', or
`json+'. Multiple formats can be selected, separated by commas.
33 `terse' is a CSV based format. `json+' is like `json', except it
34 adds a full dump of the latency buckets.
35
36 --bandwidth-log
37 Generate aggregate bandwidth logs.
38
39 --minimal
40 Print statistics in a terse, semicolon-delimited format.
41
42 --append-terse
43 Print statistics in selected mode AND terse, semicolon-delimited
44 format. Deprecated, use --output-format instead to select mul‐
45 tiple formats.
46
47 --terse-version=version
48 Set terse version output format (default `3', or `2', `4', `5').
49
50 --version
51 Print version information and exit.
52
53 --help Print a summary of the command line options and exit.
54
55 --cpuclock-test
56 Perform test and validation of internal CPU clock.
57
58 --crctest=[test]
59 Test the speed of the built-in checksumming functions. If no
60 argument is given, all of them are tested. Alternatively, a
61 comma separated list can be passed, in which case the given ones
62 are tested.
63
64 --cmdhelp=command
65 Print help information for command. May be `all' for all com‐
66 mands.
67
68 --enghelp=[ioengine[,command]]
69 List all commands defined by ioengine, or print help for command
70 defined by ioengine. If no ioengine is given, list all available
71 ioengines.
72
73 --showcmd=jobfile
74 Convert jobfile to a set of command-line options.
75
76 --readonly
77 Turn on safety read-only checks, preventing writes. The --read‐
78 only option is an extra safety guard to prevent users from acci‐
79 dentally starting a write workload when that is not desired. Fio
80 will only write if `rw=write/randwrite/rw/randrw' is given. This
81 extra safety net can be used as an extra precaution as --read‐
82 only will also enable a write check in the I/O engine core to
83 prevent writes due to unknown user space bug(s).
84
85 --eta=when
Specifies when the real-time ETA estimate should be printed. when
may be `always', `never' or `auto'. `auto', the default, prints
ETA when requested if the output is a TTY. `always' disregards
the output type and prints ETA when requested. `never' never
prints ETA.
91
92 --eta-interval=time
93 By default, fio requests client ETA status roughly every second.
94 With this option, the interval is configurable. Fio imposes a
95 minimum allowed time to avoid flooding the console, less than
96 250 msec is not supported.
97
98 --eta-newline=time
99 Force a new line for every time period passed. When the unit is
100 omitted, the value is interpreted in seconds.
101
102 --status-interval=time
103 Force a full status dump of cumulative (from job start) values
104 at time intervals. This option does *not* provide per-period
105 measurements. So values such as bandwidth are running averages.
106 When the time unit is omitted, time is interpreted in seconds.
107
108 --section=name
109 Only run specified section name in job file. Multiple sections
110 can be specified. The --section option allows one to combine
111 related jobs into one file. E.g. one job file could define
112 light, moderate, and heavy sections. Tell fio to run only the
113 "heavy" section by giving `--section=heavy' command line option.
114 One can also specify the "write" operations in one section and
115 "verify" operation in another section. The --section option only
116 applies to job sections. The reserved *global* section is always
117 parsed and used.
118
119 --alloc-size=kb
120 Set the internal smalloc pool size to kb in KiB. The
121 --alloc-size switch allows one to use a larger pool size for
122 smalloc. If running large jobs with randommap enabled, fio can
123 run out of memory. Smalloc is an internal allocator for shared
124 structures from a fixed size memory pool and can grow to 16
125 pools. The pool size defaults to 16MiB. NOTE: While running
126 `.fio_smalloc.*' backing store files are visible in `/tmp'.
127
128 --warnings-fatal
129 All fio parser warnings are fatal, causing fio to exit with an
130 error.
131
132 --max-jobs=nr
133 Set the maximum number of threads/processes to support to nr.
134 NOTE: On Linux, it may be necessary to increase the shared-mem‐
135 ory limit (`/proc/sys/kernel/shmmax') if fio runs into errors
136 while creating jobs.
137
138 --server=args
139 Start a backend server, with args specifying what to listen to.
140 See CLIENT/SERVER section.
141
142 --daemonize=pidfile
Background a fio server, writing the PID to the given pidfile.
145
146 --client=hostname
147 Instead of running the jobs locally, send and run them on the
148 given hostname or set of hostnames. See CLIENT/SERVER section.
149
150 --remote-config=file
151 Tell fio server to load this local file.
152
153 --idle-prof=option
154 Report CPU idleness. option is one of the following:
155
156 calibrate
157 Run unit work calibration only and exit.
158
159 system Show aggregate system idleness and unit work.
160
161 percpu As system but also show per CPU idleness.
162
163 --inflate-log=log
164 Inflate and output compressed log.
165
166 --trigger-file=file
167 Execute trigger command when file exists.
168
169 --trigger-timeout=time
170 Execute trigger at this time.
171
172 --trigger=command
173 Set this command as local trigger.
174
175 --trigger-remote=command
176 Set this command as remote trigger.
177
178 --aux-path=path
179 Use this path for fio state generated files.
180
182 Any parameters following the options will be assumed to be job files,
183 unless they match a job file parameter. Multiple job files can be
184 listed and each job file will be regarded as a separate group. Fio will
185 stonewall execution between each group.
186
187 Fio accepts one or more job files describing what it is supposed to do.
188 The job file format is the classic ini file, where the names enclosed
189 in [] brackets define the job name. You are free to use any ASCII name
190 you want, except *global* which has special meaning. Following the job
191 name is a sequence of zero or more parameters, one per line, that
192 define the behavior of the job. If the first character in a line is a
193 ';' or a '#', the entire line is discarded as a comment.
194
195 A *global* section sets defaults for the jobs described in that file. A
196 job may override a *global* section parameter, and a job file may even
197 have several *global* sections if so desired. A job is only affected by
198 a *global* section residing above it.
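
The section rules above can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not fio's parser: the function name `parse_job_file` and the sample job names are hypothetical, and leading whitespace is stripped before the comment check for simplicity.

```python
def parse_job_file(text):
    """Return {job_name: {parameter: value}} with [global] defaults applied."""
    defaults = {}   # accumulated from every [global] section seen so far
    jobs = {}
    current = None  # dict that receives the parameters currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in ";#":
            continue  # blank line or comment line
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            if name == "global":
                current = defaults
            else:
                # A job starts from a snapshot of the defaults above it,
                # so later [global] sections cannot affect it.
                current = jobs[name] = dict(defaults)
            continue
        if current is None:
            continue  # parameter before any section; ignored in this sketch
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()
    return jobs


SAMPLE = """
[global]
rw=read
bs=4k

[first]
rw=write   
; this comment line is discarded

[global]
bs=8k

[second]
"""

for name, params in parse_job_file(SAMPLE).items():
    print(name, params)
```

Here `first` overrides `rw` but keeps `bs=4k`, while `second` sees the later `bs=8k` because that *global* section resides above it.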
199
The --cmdhelp option also lists all options. If used with a command
argument, --cmdhelp will detail the given command.
202
203 See the `examples/' directory for inspiration on how to write job
204 files. Note the copyright and license requirements currently apply to
205 `examples/' files.
206
208 Some parameters take an option of a given type, such as an integer or a
209 string. Anywhere a numeric value is required, an arithmetic expression
210 may be used, provided it is surrounded by parentheses. Supported opera‐
211 tors are:
212
213 addition (+)
214
215 subtraction (-)
216
217 multiplication (*)
218
219 division (/)
220
221 modulus (%)
222
223 exponentiation (^)
224
225 For time values in expressions, units are microseconds by default. This
226 is different than for time values not in expressions (not enclosed in
227 parentheses).
228
230 The following parameter types are used.
231
232 str String. A sequence of alphanumeric characters.
233
time Integer with possible time suffix. Without a unit, the value is
interpreted as seconds unless otherwise specified. Accepts a
236 suffix of 'd' for days, 'h' for hours, 'm' for minutes, 's' for
237 seconds, 'ms' (or 'msec') for milliseconds and 'us' (or 'usec')
238 for microseconds. For example, use 10m for 10 minutes.
239
240 int Integer. A whole number value, which may contain an integer pre‐
241 fix and an integer suffix.
242
243 [*integer prefix*] **number** [*integer suffix*]
244
245 The optional *integer prefix* specifies the number's base. The
246 default is decimal. *0x* specifies hexadecimal.
247
248 The optional *integer suffix* specifies the number's units, and
249 includes an optional unit prefix and an optional unit. For quan‐
250 tities of data, the default unit is bytes. For quantities of
251 time, the default unit is seconds unless otherwise specified.
252
253 With `kb_base=1000', fio follows international standards for
254 unit prefixes. To specify power-of-10 decimal values defined in
255 the International System of Units (SI):
256
257 K means kilo (K) or 1000
258 M means mega (M) or 1000**2
259 G means giga (G) or 1000**3
260 T means tera (T) or 1000**4
261 P means peta (P) or 1000**5
262
263 To specify power-of-2 binary values defined in IEC 80000-13:
264
265 Ki means kibi (Ki) or 1024
266 Mi means mebi (Mi) or 1024**2
267 Gi means gibi (Gi) or 1024**3
268 Ti means tebi (Ti) or 1024**4
269 Pi means pebi (Pi) or 1024**5
270
271 With `kb_base=1024' (the default), the unit prefixes are oppo‐
272 site from those specified in the SI and IEC 80000-13 standards
273 to provide compatibility with old scripts. For example, 4k means
274 4096.
275
276 For quantities of data, an optional unit of 'B' may be included
277 (e.g., 'kB' is the same as 'k').
278
279 The *integer suffix* is not case sensitive (e.g., m/mi mean
280 mebi/mega, not milli). 'b' and 'B' both mean byte, not bit.
281
Examples with `kb_base=1000':

4 KiB: 4096, 4096b, 4096B, 4ki, 4kib, 4kiB, 4Ki, 4KiB
1 MiB: 1048576, 1mi, 1024ki
1 MB: 1000000, 1m, 1000k
1 TiB: 1099511627776, 1ti, 1024gi, 1048576mi
1 TB: 1000000000000, 1t, 1000g, 1000000m

Examples with `kb_base=1024' (default):

4 KiB: 4096, 4096b, 4096B, 4k, 4kb, 4kB, 4K, 4KB
1 MiB: 1048576, 1m, 1024k
1 MB: 1000000, 1mi, 1000ki
1 TiB: 1099511627776, 1t, 1024g, 1048576m
1 TB: 1000000000000, 1ti, 1000gi, 1000000mi
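
The prefix rules above can be checked with a small Python model. This is a sketch under stated assumptions, not fio's own parser: `parse_size` is a hypothetical helper that handles only a decimal or `0x` number followed by an optional prefix (k, m, g, t, p), an optional `i`, and an optional `b`.

```python
import re

# Exponent applied to the base (1000 or 1024) for each unit prefix.
EXPONENT = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5}


def parse_size(text, kb_base=1024):
    """Convert a size such as '4k', '1mi' or '0x1000' to a byte count."""
    match = re.fullmatch(r"(0x[0-9a-f]+|\d+)([kmgtp]?)(i?)(b?)",
                         text.strip().lower())  # suffix is case insensitive
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    digits, prefix, iec, _byte_unit = match.groups()
    number = int(digits, 16) if digits.startswith("0x") else int(digits)
    if not prefix:
        if iec:
            raise ValueError(f"'i' without a unit prefix: {text!r}")
        return number  # plain byte count
    binary = bool(iec)      # standards-compliant reading: Ki = 1024, K = 1000
    if kb_base == 1024:
        binary = not binary  # compatibility mode swaps the two meanings
    return number * (1024 if binary else 1000) ** EXPONENT[prefix]


print(parse_size("4k"))                  # default kb_base=1024
print(parse_size("1mi"))                 # power of 10 in compatibility mode
print(parse_size("1mi", kb_base=1000))   # power of 2 in standards mode
```

With the default `kb_base=1024` the call `parse_size("4k")` yields 4096, matching the "4k means 4096" statement above.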
297
298 To specify times (units are not case sensitive):
299
300 D means days
301 H means hours
M means minutes
303 s or sec means seconds (default)
304 ms or msec means milliseconds
305 us or usec means microseconds
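
As an illustration of the time suffixes listed above, the following Python sketch converts a time value to whole microseconds. `parse_time_us` is a hypothetical helper, not part of fio, and it assumes the default unit is seconds (some options specify a different default).

```python
import re

# Microseconds per unit; suffixes are matched case-insensitively.
US_PER_UNIT = {
    "d": 86_400_000_000,
    "h": 3_600_000_000,
    "m": 60_000_000,
    "s": 1_000_000, "sec": 1_000_000,
    "ms": 1_000, "msec": 1_000,
    "us": 1, "usec": 1,
}


def parse_time_us(text, default_unit="s"):
    """Convert a value such as '10m' or '250ms' to microseconds."""
    match = re.fullmatch(r"(\d+)([a-z]*)", text.strip().lower())
    if match is None:
        raise ValueError(f"unsupported time value: {text!r}")
    number, unit = match.groups()
    unit = unit or default_unit  # no suffix: fall back to the default unit
    if unit not in US_PER_UNIT:
        raise ValueError(f"unknown time unit: {unit!r}")
    return int(number) * US_PER_UNIT[unit]


print(parse_time_us("10m"))    # ten minutes
print(parse_time_us("250ms"))  # the minimum --eta-interval mentioned earlier
```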
306
307 If the option accepts an upper and lower range, use a colon ':'
308 or minus '-' to separate such values. See irange parameter type.
309 If the lower value specified happens to be larger than the upper
310 value the two values are swapped.
311
312 bool Boolean. Usually parsed as an integer, however only defined for
313 true and false (1 and 0).
314
315 irange Integer range with suffix. Allows value range to be given, such
316 as 1024-4096. A colon may also be used as the separator, e.g.
317 1k:4k. If the option allows two sets of ranges, they can be
318 specified with a ',' or '/' delimiter: 1k-4k/8k-32k. Also see
319 int parameter type.
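
The range syntax can be sketched in Python as follows. The helpers `parse_irange` and `parse_two_ranges` are hypothetical and accept plain integers only; unit suffixes such as `1k` are left out to keep the example short.

```python
def parse_irange(text):
    """Split 'low-high' or 'low:high' into an ordered (low, high) tuple."""
    for separator in (":", "-"):
        if separator in text:
            first, second = (int(part) for part in text.split(separator, 1))
            break
    else:
        raise ValueError(f"no range separator in {text!r}")
    # If the lower value is larger than the upper value, swap the two.
    return (first, second) if first <= second else (second, first)


def parse_two_ranges(text):
    """Split two ranges delimited by ',' or '/' into a list of tuples."""
    return [parse_irange(part) for part in text.replace("/", ",").split(",")]


print(parse_irange("1024-4096"))
print(parse_irange("4096:1024"))
print(parse_two_ranges("1024-4096/8192-32768"))
```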
320
321 float_list
322 A list of floating point numbers, separated by a ':' character.
323
325 With the above in mind, here follows the complete list of fio job
326 parameters.
327
328 Units
329 kb_base=int
330 Select the interpretation of unit prefixes in input parameters.
331
332 1000 Inputs comply with IEC 80000-13 and the Interna‐
333 tional System of Units (SI). Use:
334
335 - power-of-2 values with IEC prefixes (e.g., KiB)
336 - power-of-10 values with SI prefixes (e.g., kB)
337
338 1024 Compatibility mode (default). To avoid breaking
339 old scripts:
340
341 - power-of-2 values with SI prefixes
342 - power-of-10 values with IEC prefixes
343
344 See bs for more details on input parameters.
345
346 Outputs always use correct prefixes. Most outputs include both
347 side-by-side, like:
348
349 bw=2383.3kB/s (2327.4KiB/s)
350
351 If only one value is reported, then kb_base selects the one to
352 use:
353
354 1000 -- SI prefixes
355 1024 -- IEC prefixes
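
The two figures in the sample output line above describe the same rate in different units. A short Python check of that arithmetic (the helper name `kb_to_kib` is made up for this illustration):

```python
def kb_to_kib(kb_per_second):
    """Convert a rate in kB/s (1000 bytes) to KiB/s (1024 bytes)."""
    return kb_per_second * 1000 / 1024


# 2383.3 kB/s expressed in KiB/s, rounded to one decimal place.
print(f"{kb_to_kib(2383.3):.1f}KiB/s")  # prints 2327.4KiB/s
```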
356
357 unit_base=int
358 Base unit for reporting. Allowed values are:
359
360 0 Use auto-detection (default).
361
362 8 Byte based.
363
364 1 Bit based.
365
366 Job description
367 name=str
368 ASCII name of the job. This may be used to override the name
369 printed by fio for this job. Otherwise the job name is used. On
370 the command line this parameter has the special purpose of also
371 signaling the start of a new job.
372
373 description=str
374 Text description of the job. Doesn't do anything except dump
375 this text description when this job is run. It's not parsed.
376
377 loops=int
378 Run the specified number of iterations of this job. Used to
379 repeat the same workload a given number of times. Defaults to 1.
380
381 numjobs=int
Create the specified number of clones of this job. Each clone of
the job is spawned as an independent thread or process. May be used
384 to setup a larger number of threads/processes doing the same
385 thing. Each thread is reported separately; to see statistics for
386 all clones as a whole, use group_reporting in conjunction with
387 new_group. See --max-jobs. Default: 1.
388
389 Time related parameters
390 runtime=time
391 Tell fio to terminate processing after the specified period of
392 time. It can be quite hard to determine for how long a specified
393 job will run, so this parameter is handy to cap the total run‐
394 time to a given time. When the unit is omitted, the value is
395 interpreted in seconds.
396
397 time_based
398 If set, fio will run for the duration of the runtime specified
399 even if the file(s) are completely read or written. It will sim‐
400 ply loop over the same workload as many times as the runtime
401 allows.
402
403 startdelay=irange(int)
404 Delay the start of job for the specified amount of time. Can be
405 a single value or a range. When given as a range, each thread
406 will choose a value randomly from within the range. Value is in
407 seconds if a unit is omitted.
408
409 ramp_time=time
410 If set, fio will run the specified workload for this amount of
411 time before logging any performance numbers. Useful for letting
412 performance settle before logging results, thus minimizing the
413 runtime required for stable results. Note that the ramp_time is
414 considered lead in time for a job, thus it will increase the
415 total runtime if a special timeout or runtime is specified. When
416 the unit is omitted, the value is given in seconds.
417
418 clocksource=str
419 Use the given clocksource as the base of timing. The supported
420 options are:
421
422 gettimeofday
423 gettimeofday(2)
424
425 clock_gettime
426 clock_gettime(2)
427
428 cpu Internal CPU clock source
429
430 cpu is the preferred clocksource if it is reliable, as it is
431 very fast (and fio is heavy on time calls). Fio will automati‐
432 cally use this clocksource if it's supported and considered
433 reliable on the system it is running on, unless another clock‐
434 source is specifically set. For x86/x86-64 CPUs, this means sup‐
435 porting TSC Invariant.
436
437 gtod_reduce=bool
438 Enable all of the gettimeofday(2) reducing options (dis‐
439 able_clat, disable_slat, disable_bw_measurement) plus reduce
440 precision of the timeout somewhat to really shrink the gettime‐
441 ofday(2) call count. With this option enabled, we only do about
442 0.4% of the gettimeofday(2) calls we would have done if all time
443 keeping was enabled.
444
445 gtod_cpu=int
446 Sometimes it's cheaper to dedicate a single thread of execution
447 to just getting the current time. Fio (and databases, for
448 instance) are very intensive on gettimeofday(2) calls. With this
449 option, you can set one CPU aside for doing nothing but logging
450 current time to a shared memory location. Then the other
451 threads/processes that run I/O workloads need only copy that
452 segment, instead of entering the kernel with a gettimeofday(2)
453 call. The CPU set aside for doing these time calls will be
454 excluded from other uses. Fio will manually clear it from the
455 CPU mask of other jobs.
456
457 Target file/device
458 directory=str
Prefix filenames with this directory. Used to place files in a
different location than `./'. You can specify a number of
directories by separating the names with a ':' character. These
directories are distributed evenly among the job clones created
by numjobs, as long as the clones use generated filenames. If
specific filename(s) are set, fio uses only the first listed
directory. This matches the filename semantics: each clone gets
its own generated file if no filename is specified, but all
clones share the same file(s) if one is set.
468
See the filename option for information on how to escape ':' and
'\' characters within the directory path itself.
471
472 filename=str
473 Fio normally makes up a filename based on the job name, thread
474 number, and file number (see filename_format). If you want to
475 share files between threads in a job or several jobs with fixed
476 file paths, specify a filename for each of them to override the
477 default. If the ioengine is file based, you can specify a number
478 of files by separating the names with a ':' colon. So if you
479 wanted a job to open `/dev/sda' and `/dev/sdb' as the two work‐
480 ing files, you would use `filename=/dev/sda:/dev/sdb'. This also
481 means that whenever this option is specified, nrfiles is
482 ignored. The size of regular files specified by this option will
483 be size divided by number of files unless an explicit size is
484 specified by filesize.
485
Each colon and backslash in the wanted path must be escaped with
a '\' character. For instance, if the path is
`/dev/dsk/foo@3,0:c' then you would use
`filename=/dev/dsk/foo@3,0\:c' and if the path is `F:\filename' then
you would use `filename=F\:\\filename'.
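
When job files are generated by a script, the escaping rule can be applied mechanically. The Python sketch below shows one way to do it; `escape_fio_path` and `join_filenames` are hypothetical helper names, not part of fio.

```python
def escape_fio_path(path):
    """Escape each backslash and colon in a path with a backslash."""
    escaped = path.replace("\\", "\\\\")  # backslashes first, so the ones
    return escaped.replace(":", "\\:")    # added for colons are not doubled


def join_filenames(paths):
    """Build the value for filename= from several paths."""
    return ":".join(escape_fio_path(path) for path in paths)


print("filename=" + escape_fio_path("/dev/dsk/foo@3,0:c"))
print("filename=" + join_filenames(["/dev/sda", "/dev/sdb"]))
```

Escaping backslashes before colons matters: doing it in the other order would double the backslash that was just inserted in front of each colon.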
491
On Windows, disk devices are accessed as `\\.\PhysicalDrive0'
for the first device, `\\.\PhysicalDrive1' for the second,
etc. Note: Windows and FreeBSD prevent write access to areas of
the disk containing in-use data (e.g. filesystems).
496
497 The filename `-' is a reserved name, meaning *stdin* or *std‐
498 out*. Which of the two depends on the read/write direction set.
499
500 filename_format=str
501 If sharing multiple files between jobs, it is usually necessary
502 to have fio generate the exact names that you want. By default,
503 fio will name a file based on the default file format specifica‐
504 tion of `jobname.jobnumber.filenumber'. With this option, that
505 can be customized. Fio will recognize and replace the following
506 keywords in this string:
507
508 $jobname
509 The name of the worker thread or process.
510
511 $jobnum
512 The incremental number of the worker thread or
513 process.
514
515 $filenum
516 The incremental number of the file for that worker
517 thread or process.
518
519 To have dependent jobs share a set of files, this option can be
520 set to have fio generate filenames that are shared between the
521 two. For instance, if `testfiles.$filenum' is specified, file
522 number 4 for any job will be named `testfiles.4'. The default of
523 `$jobname.$jobnum.$filenum' will be used if no other format
524 specifier is given.
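
The keyword substitution can be pictured with a short Python sketch. `expand_filename_format` is a hypothetical stand-in for what fio does internally, and the job name `writer` is made up for the example.

```python
def expand_filename_format(fmt, jobname, jobnum, filenum):
    """Replace the $jobname, $jobnum and $filenum keywords in fmt."""
    values = {
        "$jobname": jobname,
        "$jobnum": str(jobnum),
        "$filenum": str(filenum),
    }
    for keyword, value in values.items():
        fmt = fmt.replace(keyword, value)
    return fmt


# Default format: every job and file number gets its own name.
print(expand_filename_format("$jobname.$jobnum.$filenum", "writer", 2, 0))
# Shared format: file number 4 has the same name for any job.
print(expand_filename_format("testfiles.$filenum", "writer", 2, 4))
```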
525
If you specify a path then the directories will be created up to
the main directory for the file. So for example if you specify
`a/b/c/$jobnum' then the directories a/b/c will be created
before the file setup part of the job. If you specify directory
then the path will be relative to that directory, otherwise it is
treated as the absolute path.
532
533 unique_filename=bool
534 To avoid collisions between networked clients, fio defaults to
535 prefixing any generated filenames (with a directory specified)
536 with the source of the client connecting. To disable this behav‐
537 ior, set this option to 0.
538
539 opendir=str
540 Recursively open any files below directory str.
541
542 lockfile=str
543 Fio defaults to not locking any files before it does I/O to
544 them. If a file or file descriptor is shared, fio can serialize
I/O to that file to make the end result consistent. This is
useful for emulating real workloads that share files. The lock
547 modes are:
548
549 none No locking. The default.
550
551 exclusive
552 Only one thread or process may do I/O at a time,
553 excluding all others.
554
555 readwrite
556 Read-write locking on the file. Many readers may
557 access the file at the same time, but writes get
558 exclusive access.
559
560 nrfiles=int
561 Number of files to use for this job. Defaults to 1. The size of
562 files will be size divided by this unless explicit size is spec‐
563 ified by filesize. Files are created for each thread separately,
564 and each file will have a file number within its name by
565 default, as explained in filename section.
566
567 openfiles=int
Number of files to keep open at the same time. Defaults to the
same as nrfiles, can be set smaller to limit the number of
simultaneous opens.
571
572 file_service_type=str
573 Defines how fio decides which file from a job to service next.
574 The following types are defined:
575
576 random Choose a file at random.
577
578 roundrobin
579 Round robin over opened files. This is the
580 default.
581
582 sequential
583 Finish one file before moving on to the next. Mul‐
584 tiple files can still be open depending on open‐
585 files.
586
587 zipf Use a Zipf distribution to decide what file to
588 access.
589
590 pareto Use a Pareto distribution to decide what file to
591 access.
592
593 normal Use a Gaussian (normal) distribution to decide
594 what file to access.
595
596 gauss Alias for normal.
597
598 For random, roundrobin, and sequential, a postfix can be
599 appended to tell fio how many I/Os to issue before switching to
600 a new file. For example, specifying `file_service_type=random:8'
601 would cause fio to issue 8 I/Os before selecting a new file at
602 random. For the non-uniform distributions, a floating point
603 postfix can be given to influence how the distribution is
604 skewed. See random_distribution for a description of how that
605 would work.
606
607 ioscheduler=str
608 Attempt to switch the device hosting the file to the specified
609 I/O scheduler before running.
610
611 create_serialize=bool
612 If true, serialize the file creation for the jobs. This may be
613 handy to avoid interleaving of data files, which may greatly
614 depend on the filesystem used and even the number of processors
615 in the system. Default: true.
616
617 create_fsync=bool
618 fsync(2) the data file after creation. This is the default.
619
620 create_on_open=bool
621 If true, don't pre-create files but allow the job's open() to
622 create a file when it's time to do I/O. Default: false --
623 pre-create all necessary files when the job starts.
624
625 create_only=bool
626 If true, fio will only run the setup phase of the job. If files
627 need to be laid out or updated on disk, only that will be done
628 -- the actual job contents are not executed. Default: false.
629
630 allow_file_create=bool
631 If true, fio is permitted to create files as part of its work‐
632 load. If this option is false, then fio will error out if the
633 files it needs to use don't already exist. Default: true.
634
635 allow_mounted_write=bool
636 If this isn't set, fio will abort jobs that are destructive
637 (e.g. that write) to what appears to be a mounted device or par‐
638 tition. This should help catch creating inadvertently destruc‐
639 tive tests, not realizing that the test will destroy data on the
640 mounted file system. Note that some platforms don't allow writ‐
641 ing against a mounted device regardless of this option. Default:
642 false.
643
644 pre_read=bool
645 If this is given, files will be pre-read into memory before
646 starting the given I/O operation. This will also clear the
647 invalidate flag, since it is pointless to pre-read and then drop
648 the cache. This will only work for I/O engines that are
649 seek-able, since they allow you to read the same data multiple
650 times. Thus it will not work on non-seekable I/O engines (e.g.
651 network, splice). Default: false.
652
653 unlink=bool
654 Unlink the job files when done. Not the default, as repeated
655 runs of that job would then waste time recreating the file set
656 again and again. Default: false.
657
658 unlink_each_loop=bool
659 Unlink job files after each iteration or loop. Default: false.
660
661 zonesize=int
662 Divide a file into zones of the specified size. See zoneskip.
663
664 zonerange=int
665 Give size of an I/O zone. See zoneskip.
666
667 zoneskip=int
668 Skip the specified number of bytes when zonesize data has been
669 read. The two zone options can be used to only do I/O on zones
670 of a file.
671
672 I/O type
673 direct=bool
674 If value is true, use non-buffered I/O. This is usually
675 O_DIRECT. Note that OpenBSD and ZFS on Solaris don't support
676 direct I/O. On Windows the synchronous ioengines don't support
677 direct I/O. Default: false.
678
679 atomic=bool
680 If value is true, attempt to use atomic direct I/O. Atomic
681 writes are guaranteed to be stable once acknowledged by the
682 operating system. Only Linux supports O_ATOMIC right now.
683
684 buffered=bool
685 If value is true, use buffered I/O. This is the opposite of the
686 direct option. Defaults to true.
687
688 readwrite=str, rw=str
689 Type of I/O pattern. Accepted values are:
690
691 read Sequential reads.
692
693 write Sequential writes.
694
695 trim Sequential trims (Linux block devices only).
696
697 randread
698 Random reads.
699
700 randwrite
701 Random writes.
702
703 randtrim
704 Random trims (Linux block devices only).
705
706 rw,readwrite
707 Sequential mixed reads and writes.
708
709 randrw Random mixed reads and writes.
710
711 trimwrite
712 Sequential trim+write sequences. Blocks will be
713 trimmed first, then the same blocks will be writ‐
714 ten to.
715
716 Fio defaults to read if the option is not specified. For the
717 mixed I/O types, the default is to split them 50/50. For certain
718 types of I/O the result may still be skewed a bit, since the
719 speed may be different.
720
721 It is possible to specify the number of I/Os to do before get‐
722 ting a new offset by appending `:<nr>' to the end of the string
723 given. For a random read, it would look like `rw=randread:8' for
724 passing in an offset modifier with a value of 8. If the suffix
725 is used with a sequential I/O pattern, then the `<nr>' value
726 specified will be added to the generated offset for each I/O
727 turning sequential I/O into sequential I/O with holes. For
728 instance, using `rw=write:4k' will skip 4k for every write. Also
729 see the rw_sequencer option.
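
To picture "sequential I/O with holes", the Python sketch below lists the offsets a sequential job would touch. It assumes that the `<nr>` value is a gap left after each block, which is one reading of the description above; `sequential_offsets` is a hypothetical helper.

```python
def sequential_offsets(count, block_size, skip=0, start=0):
    """Offsets of `count` sequential I/Os with `skip` bytes left after each."""
    return [start + i * (block_size + skip) for i in range(count)]


# Modeled on `rw=write:4k` with a 4 KiB block size:
# write 4 KiB, skip 4 KiB, write 4 KiB, and so on.
print(sequential_offsets(4, block_size=4096, skip=4096))
# Without a modifier the writes are back to back.
print(sequential_offsets(4, block_size=4096))
```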
730
731 rw_sequencer=str
732 If an offset modifier is given by appending a number to the
733 `rw=str' line, then this option controls how that number modi‐
734 fies the I/O offset being generated. Accepted values are:
735
736 sequential
737 Generate sequential offset.
738
739 identical
740 Generate the same offset.
741
742 sequential is only useful for random I/O, where fio would nor‐
743 mally generate a new random offset for every I/O. If you append
744 e.g. 8 to randread, you would get a new random offset for every
745 8 I/Os. The result would be a seek for only every 8 I/Os,
746 instead of for every I/O. Use `rw=randread:8' to specify that.
747 As sequential I/O is already sequential, setting sequential for
that would not result in any differences. identical behaves in a
similar fashion, except it sends the same offset 8 times before
generating a new offset.
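
The difference between the two modes can be modeled in Python. This is a simplified simulation of the behavior described above, not fio's offset generator: `random_offsets` and its parameters are hypothetical, and it ignores what happens when a sequential run reaches the end of the file.

```python
import random


def random_offsets(count, nr, mode, block_size=4096, blocks=1024, seed=0):
    """Model the offsets issued for `rw=randread:<nr>` under rw_sequencer."""
    if mode not in ("sequential", "identical"):
        raise ValueError(f"unknown rw_sequencer mode: {mode!r}")
    rng = random.Random(seed)
    offsets = []
    for i in range(count):
        if i % nr == 0:
            # Every nr-th I/O picks a fresh random block (one "seek").
            offsets.append(rng.randrange(blocks) * block_size)
        elif mode == "sequential":
            # Continue right after the previous I/O.
            offsets.append(offsets[-1] + block_size)
        else:
            # identical: repeat the previous offset.
            offsets.append(offsets[-1])
    return offsets


print(random_offsets(16, nr=8, mode="sequential"))
print(random_offsets(16, nr=8, mode="identical"))
```

In both runs a new random offset is drawn only at I/O 0 and I/O 8; the modes differ in what the seven I/Os in between do.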
751
752 unified_rw_reporting=bool
753 Fio normally reports statistics on a per data direction basis,
754 meaning that reads, writes, and trims are accounted and reported
separately. If this option is set, fio sums the results and
reports them as "mixed" instead.
757
758 randrepeat=bool
759 Seed the random number generator used for random I/O patterns in
760 a predictable way so the pattern is repeatable across runs.
761 Default: true.
762
763 allrandrepeat=bool
764 Seed all random number generators in a predictable way so
765 results are repeatable across runs. Default: false.
766
767 randseed=int
768 Seed the random number generators based on this seed value, to
769 be able to control what sequence of output is being generated.
770 If not set, the random sequence depends on the randrepeat set‐
771 ting.
772
773 fallocate=str
774 Whether pre-allocation is performed when laying down files.
775 Accepted values are:
776
777 none Do not pre-allocate space.
778
779 native Use a platform's native pre-allocation call but
780 fall back to none behavior if it fails/is not
781 implemented.
782
783 posix Pre-allocate via posix_fallocate(3).
784
785 keep Pre-allocate via fallocate(2) with FAL‐
786 LOC_FL_KEEP_SIZE set.
787
788 0 Backward-compatible alias for none.
789
790 1 Backward-compatible alias for posix.
791
792 May not be available on all supported platforms. keep is only
793 available on Linux. If using ZFS on Solaris this cannot be set
794 to posix because ZFS doesn't support pre-allocation. Default:
795 native if any pre-allocation methods are available, none if not.
796
797 fadvise_hint=str
798 Use posix_fadvise(2) or posix_madvise(2) to advise the kernel
799 what I/O patterns are likely to be issued. Accepted values are:
800
801 0 Backwards compatible hint for "no hint".
802
803 1 Backwards compatible hint for "advise with fio
804 workload type". This uses FADV_RANDOM for a random
805 workload, and FADV_SEQUENTIAL for a sequential
806 workload.
807
808 sequential
809 Advise using FADV_SEQUENTIAL.
810
811 random Advise using FADV_RANDOM.
812
813 write_hint=str
814 Use fcntl(2) to advise the kernel what life time to expect from
815 a write. Only supported on Linux, as of version 4.13. Accepted
816 values are:
817
818 none No particular life time associated with this file.
819
820 short Data written to this file has a short life time.
821
822 medium Data written to this file has a medium life time.
823
824 long Data written to this file has a long life time.
825
826 extreme
827 Data written to this file has a very long life
828 time.
829
830 The values are all relative to each other, and no absolute mean‐
831 ing should be associated with them.
832
833 offset=int
834 Start I/O at the provided offset in the file, given as either a
835 fixed size in bytes or a percentage. If a percentage is given,
836 the generated offset will be aligned to the minimum blocksize or
837 to the value of offset_align if provided. Data before the given
838 offset will not be touched. This effectively caps the file size
839 at `real_size - offset'. Can be combined with size to constrain
840 the start and end range of the I/O workload. A percentage can
841 be specified by a number between 1 and 100 followed by '%', for
842 example, `offset=20%' to specify 20%.
843
844 offset_align=int
845 If set to non-zero value, the byte offset generated by a per‐
846 centage offset is aligned upwards to this value. Defaults to 0
847 meaning that a percentage offset is aligned to the minimum block
848 size.
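
A percentage offset is turned into a byte offset and then aligned. The Python sketch below shows the arithmetic under one assumption that the text does not spell out: rounding is upward in both cases, whether the alignment comes from offset_align or from the minimum block size. `percentage_offset` is a hypothetical helper.

```python
def percentage_offset(file_size, percent, align):
    """Byte offset for `offset=<percent>%`, rounded up to a multiple of align.

    `align` stands for offset_align if it is non-zero, otherwise for the
    minimum block size of the job.
    """
    raw = file_size * percent // 100
    return -(-raw // align) * align  # ceiling division, then scale back


# 20% of a 1,000,000-byte file with 4 KiB alignment.
print(percentage_offset(1_000_000, 20, 4096))
```

Here the raw offset is 200000 bytes, which is not a multiple of 4096, so the result moves up to the next multiple.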
849
850 offset_increment=int
851 If this is provided, then the real offset becomes `offset + off‐
852 set_increment * thread_number', where the thread number is a
853 counter that starts at 0 and is incremented for each sub-job
854 (i.e. when numjobs option is specified). This option is useful
855 if there are several jobs which are intended to operate on a
856 file in parallel disjoint segments, with even spacing between
857 the starting points.
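The formula above can be worked through for a few sub-jobs. This is only an illustrative sketch; the function name is hypothetical, and the thread number is taken to be the sub-job counter starting at 0.

```python
def sub_job_offsets(offset, offset_increment, numjobs):
    """Starting offset of each sub-job: offset + offset_increment * thread_number."""
    return [offset + offset_increment * n for n in range(numjobs)]


GIB = 1024 ** 3
# Four sub-jobs, each starting 1GiB after the previous one.
print(sub_job_offsets(0, GIB, 4))
```

Combined with a per-job size equal to offset_increment, each sub-job would work on its own disjoint segment of the file.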
858
859 number_ios=int
860 Fio will normally perform I/Os until it has exhausted the size
861 of the region set by size, or until it exhausts the allocated time
862 (or hits an error condition). With this setting, the range/size
863 can be set independently of the number of I/Os to perform. When
864 fio reaches this number, it will exit normally and report sta‐
865 tus. Note that this does not extend the amount of I/O that will
866 be done, it will only stop fio if this condition is met before
867 other end-of-job criteria.
868
869 fsync=int
870 If writing to a file, issue an fsync(2) (or its equivalent) of
871 the dirty data for every number of blocks given. For example, if
872 you give 32 as a parameter, fio will sync the file after every
873 32 writes issued. If fio is using non-buffered I/O, we may not
874 sync the file. The exception is the sg I/O engine, which syn‐
875 chronizes the disk cache anyway. Defaults to 0, which means fio
876 does not periodically issue and wait for a sync to complete.
877 Also see end_fsync and fsync_on_close.
878
879 fdatasync=int
880 Like fsync but uses fdatasync(2) to only sync data and not meta‐
881 data blocks. In Windows, FreeBSD, and DragonFlyBSD there is no
882 fdatasync(2) so this falls back to using fsync(2). Defaults to
883 0, which means fio does not periodically issue and wait for a
884 data-only sync to complete.
885
886 write_barrier=int
887 Make every N-th write a barrier write.
888
889 sync_file_range=str:int
890 Use sync_file_range(2) for every int number of write operations.
891 Fio will track range of writes that have happened since the last
892 sync_file_range(2) call. str can currently be one or more of:
893
894 wait_before
895 SYNC_FILE_RANGE_WAIT_BEFORE
896
897 write SYNC_FILE_RANGE_WRITE
898
899 wait_after
900 SYNC_FILE_RANGE_WAIT_AFTER
901
902 So if you do `sync_file_range=wait_before,write:8', fio would
903 use `SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE' for
904 every 8 writes. Also see the sync_file_range(2) man page. This
905 option is Linux specific.
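How a `str:int` value maps onto a flag mask can be sketched as below. The numeric flag values are the ones defined by the Linux kernel headers; the parsing function itself is a hypothetical illustration, not fio's parser.

```python
# Flag values as defined in the Linux kernel headers.
SYNC_FILE_RANGE_WAIT_BEFORE = 1
SYNC_FILE_RANGE_WRITE = 2
SYNC_FILE_RANGE_WAIT_AFTER = 4

FLAGS = {
    "wait_before": SYNC_FILE_RANGE_WAIT_BEFORE,
    "write": SYNC_FILE_RANGE_WRITE,
    "wait_after": SYNC_FILE_RANGE_WAIT_AFTER,
}


def parse_sync_file_range(value):
    """Split 'name[,name...]:count' into (flag mask, number of writes)."""
    names, _, count = value.partition(":")
    mask = 0
    for name in names.split(","):
        mask |= FLAGS[name]
    return mask, int(count)


print(parse_sync_file_range("wait_before,write:8"))
```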
906
907 overwrite=bool
908 If true, writes to a file will always overwrite existing data.
909 If the file doesn't already exist, it will be created before the
910 write phase begins. If the file exists and is large enough for
911 the specified write phase, nothing will be done. Default: false.
912
913 end_fsync=bool
914 If true, fsync(2) file contents when a write stage has com‐
915 pleted. Default: false.
916
917 fsync_on_close=bool
918 If true, fio will fsync(2) a dirty file on close. This differs
919 from end_fsync in that it will happen on every file close, not
920 just at the end of the job. Default: false.
921
922 rwmixread=int
923 Percentage of a mixed workload that should be reads. Default:
924 50.
925
926 rwmixwrite=int
927 Percentage of a mixed workload that should be writes. If both
928 rwmixread and rwmixwrite are given and the values do not add up
929 to 100%, the latter of the two will be used to override the
930 first. This may interfere with a given rate setting, if fio is
931 asked to limit reads or writes to a certain rate. If that is the
932 case, then the distribution may be skewed. Default: 50.
933
934 random_distribution=str:float[,str:float][,str:float]
935 By default, fio will use a completely uniform random distribu‐
936 tion when asked to perform random I/O. Sometimes it is useful to
937 skew the distribution in specific ways, ensuring that some parts
938 of the data are hotter than others. Fio includes the following
939 distribution models:
940
941 random Uniform random distribution
942
943 zipf Zipf distribution
944
945 pareto Pareto distribution
946
947 normal Normal (Gaussian) distribution
948
949 zoned Zoned random distribution
950 zoned_abs Zoned absolute random distribution
951
952 When using a zipf or pareto distribution, an input value is also
953 needed to define the access pattern. For zipf, this is the `Zipf
954 theta'. For pareto, it's the `Pareto power'. Fio includes a
955 test program, fio-genzipf, that can be used to visualize what the
956 given input values will yield in terms of hit rates. If you
957 wanted to use zipf with a `theta' of 1.2, you would use `ran‐
958 dom_distribution=zipf:1.2' as the option. If a non-uniform model
959 is used, fio will disable use of the random map. For the normal
960 distribution, a normal (Gaussian) deviation is supplied as a
961 value between 0 and 100.
962
963 For a zoned distribution, fio supports specifying percentages of
964 I/O access that should fall within what range of the file or
965 device. For example, given the following criteria:
966
967 60% of accesses should be to the first 10%
968 30% of accesses should be to the next 20%
969 8% of accesses should be to the next 30%
970 2% of accesses should be to the next 40%
971
972 we can define that through zoning of the random accesses. For
973 the above example, the user would do:
974
975 random_distribution=zoned:60/10:30/20:8/30:2/40
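To see which byte ranges such a specification covers, the zones can be expanded as in the sketch below. The function name and the 1000-byte file size are hypothetical and chosen only to keep the arithmetic readable; this is not how fio itself is implemented.

```python
def zone_ranges(spec, file_size):
    """Expand 'access%/size%:...' into (access_percent, start_byte, end_byte)."""
    ranges = []
    start = 0
    for zone in spec.split(":"):
        access, size = (int(x) for x in zone.split("/"))
        end = start + file_size * size // 100
        ranges.append((access, start, end))
        start = end  # each zone begins where the previous one ended
    return ranges


for access, start, end in zone_ranges("60/10:30/20:8/30:2/40", 1000):
    print(f"{access}% of accesses go to bytes {start}..{end}")
```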
976
977 A zoned_abs distribution works exactly like zoned, except
978 that it takes absolute sizes. For example, let's say you wanted
979 to define access according to the following criteria:
980
981 60% of accesses should be to the first 20G
982 30% of accesses should be to the next 100G
983 10% of accesses should be to the next 500G
984
985 we can define an absolute zoning distribution with:
986
987 random_distribution=zoned_abs:60/20G:30/100G:10/500G
988
989 For both zoned and zoned_abs, fio supports defining up to 256
990 separate zones.
991
992 This works similarly to how bssplit sets ranges and percentages
993 of block sizes. Like bssplit, it's possible to specify separate
994 zones for reads, writes, and trims. If just one set is given,
995 it'll apply to all of them.
996
997 percentage_random=int[,int][,int]
998 For a random workload, set how big a percentage should be ran‐
999 dom. This defaults to 100%, in which case the workload is fully
1000 random. It can be set from anywhere from 0 to 100. Setting it to
1001 0 would make the workload fully sequential. Any setting in
1002 between will result in a random mix of sequential and random
1003 I/O, at the given percentages. Comma-separated values may be
1004 specified for reads, writes, and trims as described in block‐
1005 size.
1006
1007 norandommap
1008 Normally fio will cover every block of the file when doing ran‐
1009 dom I/O. If this option is given, fio will just get a new random
1010 offset without looking at past I/O history. This means that some
1011 blocks may not be read or written, and that some blocks may be
1012 read/written more than once. If this option is used with verify
1013 and multiple blocksizes (via bsrange), only intact blocks are
1014 verified, i.e., partially-overwritten blocks are ignored.
1015
1016 softrandommap=bool
1017 See norandommap. If fio runs with the random block map enabled
1018 and it fails to allocate the map, if this option is set it will
1019 continue without a random block map. As coverage will not be as
1020 complete as with random maps, this option is disabled by
1021 default.
1022
1023 random_generator=str
1024 Fio supports the following engines for generating I/O offsets
1025 for random I/O:
1026
1027 tausworthe
1028 Strong 2^88 cycle random number generator.
1029
1030 lfsr Linear feedback shift register generator.
1031
1032 tausworthe64
1033 Strong 64-bit 2^258 cycle random number generator.
1034
1035 tausworthe is a strong random number generator, but it requires
1036 tracking on the side if we want to ensure that blocks are only
1037 read or written once. lfsr guarantees that we never generate the
1038 same offset twice, and it's also less computationally expensive.
1039 It's not a true random generator; for I/O purposes, however,
1040 it's typically good enough. lfsr only works with single
1041 block sizes, not with workloads that use multiple block sizes.
1042 If used with such a workload, fio may read or write some blocks
1043 multiple times. The default value is tausworthe, unless the
1044 required space exceeds 2^32 blocks. If it does, then taus‐
1045 worthe64 is selected automatically.
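The "never the same offset twice" property of a linear feedback shift register can be seen in a toy example. The sketch below uses a 4-bit Galois LFSR with a tap mask chosen for this illustration; it is not fio's generator, and the mapping of states to block numbers is an assumption made for the example.

```python
def lfsr_sequence(seed=1, taps=0b1100, nbits=4):
    """Yield each state of a maximal-length Galois LFSR exactly once.

    The tap mask 0b1100 is specific to a 4-bit register; other widths
    need their own maximal-length taps.
    """
    state = seed
    for _ in range((1 << nbits) - 1):
        yield state
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= taps


# States run over 1..15, so subtract 1 to get block numbers 0..14.
blocks = [s - 1 for s in lfsr_sequence()]
print(blocks)
```

Every block number appears once before the sequence repeats, without any side table recording which blocks have already been visited.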
1046
1047 Block size
1048 blocksize=int[,int][,int], bs=int[,int][,int]
1049 The block size in bytes used for I/O units. Default: 4096. A
1050 single value applies to reads, writes, and trims. Comma-sepa‐
1051 rated values may be specified for reads, writes, and trims. A
1052 value not terminated in a comma applies to subsequent types.
1053 Examples:
1054
1055 bs=256k means 256k for reads, writes and trims.
1056 bs=8k,32k means 8k for reads, 32k for writes and
1057 trims.
1058 bs=8k,32k, means 8k for reads, 32k for writes, and
1059 default for trims.
1060 bs=,8k means default for reads, 8k for writes and
1061 trims.
1062 bs=,8k, means default for reads, 8k for writes,
1063 and default for trims.
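The comma rules in the examples above can be expressed as a short sketch. The function is hypothetical and returns the size strings unconverted; it only illustrates how a value carries over to later types and how an empty field falls back to the default.

```python
def parse_bs(value, default="4096"):
    """Resolve a bs= value into read, write and trim block sizes."""
    parts = value.split(",")
    sizes = []
    for i in range(3):
        if i < len(parts):
            # An empty field means "use the default" for that type.
            sizes.append(parts[i] or default)
        else:
            # No field given: the last value applies to subsequent types.
            sizes.append(sizes[-1])
    return dict(zip(("read", "write", "trim"), sizes))


for example in ("256k", "8k,32k", "8k,32k,", ",8k", ",8k,"):
    print(f"bs={example:8} -> {parse_bs(example)}")
```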
1064
1065 blocksize_range=irange[,irange][,irange],
1066 bsrange=irange[,irange][,irange]
1067 A range of block sizes in bytes for I/O units. The issued I/O
1068 unit will always be a multiple of the minimum size, unless
1069 blocksize_unaligned is set. Comma-separated ranges may be spec‐
1070 ified for reads, writes, and trims as described in blocksize.
1071 Example:
1072
1073 bsrange=1k-4k,2k-8k
1074
1075 bssplit=str[,str][,str]
1076 Sometimes you want even finer grained control of the block sizes
1077 issued, not just an even split between them. This option allows
1078 you to weight various block sizes, so that you are able to
1079 define a specific amount of block sizes issued. The format for
1080 this option is:
1081
1082 bssplit=blocksize/percentage:blocksize/percentage
1083
1084 for as many block sizes as needed. So if you want to define a
1085 workload that has 50% 64k blocks, 10% 4k blocks, and 40% 32k
1086 blocks, you would write:
1087
1088 bssplit=4k/10:64k/50:32k/40
1089
1090 Ordering does not matter. If the percentage is left blank, fio
1091 will fill in the remaining values evenly. So a bssplit option
1092 like this one:
1093
1094 bssplit=4k/50:1k/:32k/
1095
1096 would have 50% 4k ios, and 25% each of 1k and 32k ios. The
1097 percentages must add up to 100; if bssplit is given a range that
1098 adds up to more, it will error out.
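The even fill-in of blank percentages can be sketched as follows. This hypothetical helper handles a single data direction only (no comma-separated sets) and is an illustration of the rule described above rather than fio's own parser.

```python
def parse_bssplit(spec):
    """Map each block size in 'size/pct:size/pct' to its percentage."""
    entries = []
    for item in spec.split(":"):
        size, _, pct = item.partition("/")
        entries.append((size, int(pct) if pct else None))

    given = sum(p for _, p in entries if p is not None)
    if given > 100:
        raise ValueError("percentages add up to more than 100")

    # Blank entries share whatever is left over, evenly.
    blanks = sum(1 for _, p in entries if p is None)
    share = (100 - given) / blanks if blanks else 0
    return {size: (share if p is None else p) for size, p in entries}


print(parse_bssplit("4k/50:1k/:32k/"))
print(parse_bssplit("4k/10:64k/50:32k/40"))
```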
1099
1100 Comma-separated values may be specified for reads, writes, and
1101 trims as described in blocksize.
1102
1103 If you want a workload that has 50% 2k reads and 50% 4k reads,
1104 while having 90% 4k writes and 10% 8k writes, you would specify:
1105
1106 bssplit=2k/50:4k/50,4k/90:8k/10
1107
1108 Fio supports defining up to 64 different weights for each data
1109 direction.
1110
1111 blocksize_unaligned, bs_unaligned
1112 If set, fio will issue I/O units with any size within block‐
1113 size_range, not just multiples of the minimum size. This typi‐
1114 cally won't work with direct I/O, as that normally requires sec‐
1115 tor alignment.
1116
1117 bs_is_seq_rand=bool
1118 If this option is set, fio will use the normal read,write block‐
1119 size settings as sequential,random blocksize settings instead.
1120 Any random read or write will use the WRITE blocksize settings,
1121 and any sequential read or write will use the READ blocksize
1122 settings.
1123
1124 blockalign=int[,int][,int], ba=int[,int][,int]
1125 Boundary to which fio will align random I/O units. Default:
1126 blocksize. Minimum alignment is typically 512b for using direct
1127 I/O, though it usually depends on the hardware block size. This
1128 option is mutually exclusive with using a random map for files,
1129 so it will turn off that option. Comma-separated values may be
1130 specified for reads, writes, and trims as described in block‐
1131 size.
1132
1133 Buffers and memory
1134 zero_buffers
1135 Initialize buffers with all zeros. Default: fill buffers with
1136 random data.
1137
1138 refill_buffers
1139 If this option is given, fio will refill the I/O buffers on
1140 every submit. The default is to only fill it at init time and
1141 reuse that data. Only makes sense if zero_buffers isn't speci‐
1142 fied, naturally. If data verification is enabled, refill_buffers
1143 is also automatically enabled.
1144
1145 scramble_buffers=bool
1146 If refill_buffers is too costly and the target is using data
1147 deduplication, then setting this option will slightly modify the
1148 I/O buffer contents to defeat normal de-dupe attempts. This is
1149 not enough to defeat more clever block compression attempts, but
1150 it will stop naive dedupe of blocks. Default: true.
1151
1152 buffer_compress_percentage=int
1153 If this is set, then fio will attempt to provide I/O buffer con‐
1154 tent (on WRITEs) that compresses to the specified level. Fio
1155 does this by providing a mix of random data followed by fixed
1156 pattern data. The fixed pattern is either zeros, or the pattern
1157 specified by buffer_pattern. If the buffer_pattern option is
1158 used, it might skew the compression ratio slightly. Setting buf‐
1159 fer_compress_percentage to a value other than 100 will also
1160 enable refill_buffers in order to reduce the likelihood that
1161 adjacent blocks are so similar that they over compress when seen
1162 together. See buffer_compress_chunk for how to set a finer or
1163 coarser granularity of the random/fixed data regions. Defaults
1164 to unset i.e., buffer data will not adhere to any compression
1165 level.
1166
1167 buffer_compress_chunk=int
1168 This setting allows fio to manage how big the random/fixed data
1169 region is when using buffer_compress_percentage. When buf‐
1170 fer_compress_chunk is set to some non-zero value smaller than
1171 the block size, fio can repeat the random/fixed region through‐
1172 out the I/O buffer at the specified interval (which is particularly
1173 useful when bigger block sizes are used for a job). When set to
1174 0, fio will use a chunk size that matches the block size result‐
1175 ing in a single random/fixed region within the I/O buffer.
1176 Defaults to 512. When the unit is omitted, the value is inter‐
1177 preted in bytes.
1178
1179 buffer_pattern=str
1180 If set, fio will fill the I/O buffers with this pattern or with
1181 the contents of a file. If not set, the contents of I/O buffers
1182 are defined by the other options related to buffer contents. The
1183 setting can be any pattern of bytes, and can be prefixed with 0x
1184 for hex values. It may also be a string, where the string must
1185 then be wrapped with "". Or it may also be a filename, where the
1186 filename must be wrapped with '' in which case the file is
1187 opened and read. Note that not all the file contents will be
1188 read if that would cause the buffers to overflow. So, for exam‐
1189 ple:
1190
1191 buffer_pattern='filename'
1192 or:
1193 buffer_pattern="abcd"
1194 or:
1195 buffer_pattern=-12
1196 or:
1197 buffer_pattern=0xdeadface
1198
1199 Also you can combine everything together in any order:
1200
1201 buffer_pattern=0xdeadface"abcd"-12'filename'
1202
1203 dedupe_percentage=int
1204 If set, fio will generate this percentage of identical buffers
1205 when writing. These buffers will be naturally dedupable. The
1206 contents of the buffers depend on what other buffer compression
1207 settings have been set. It's possible to have the individual
1208 buffers either fully compressible, or not at all -- this option
1209 only controls the distribution of unique buffers. Setting this
1210 option will also enable refill_buffers to prevent every buffer
1211 being identical.
1212
1213 invalidate=bool
1214 Invalidate the buffer/page cache parts of the files to be used
1215 prior to starting I/O if the platform and file type support it.
1216 Defaults to true. This will be ignored if pre_read is also
1217 specified for the same job.
1218
1219 sync=bool
1220 Use synchronous I/O for buffered writes. For the majority of I/O
1221 engines, this means using O_SYNC. Default: false.
1222
1223 iomem=str, mem=str
1224 Fio can use various types of memory as the I/O unit buffer. The
1225 allowed values are:
1226
1227 malloc Use memory from malloc(3) as the buffers. Default
1228 memory type.
1229
1230 shm Use shared memory as the buffers. Allocated
1231 through shmget(2).
1232
1233 shmhuge
1234 Same as shm, but use huge pages as backing.
1235
1236 mmap Use mmap(2) to allocate buffers. May either be
1237 anonymous memory, or can be file backed if a file‐
1238 name is given after the option. The format is
1239 `mem=mmap:/path/to/file'.
1240
1241 mmaphuge
1242 Use a memory mapped huge file as the buffer back‐
1243 ing. Append filename after mmaphuge, ala `mem=mma‐
1244 phuge:/hugetlbfs/file'.
1245
1246 mmapshared
1247 Same as mmap, but use a MAP_SHARED mapping.
1248
1249 cudamalloc
1250 Use GPU memory as the buffers for GPUDirect RDMA
1251 benchmark. The ioengine must be rdma.
1252
1253 The area allocated is a function of the maximum allowed bs size
1254 for the job, multiplied by the I/O depth given. Note that for
1255 shmhuge and mmaphuge to work, the system must have free huge
1256 pages allocated. This can normally be checked and set by read‐
1257 ing/writing `/proc/sys/vm/nr_hugepages' on a Linux system. Fio
1258 assumes a huge page is 4MiB in size. So to calculate the number
1259 of huge pages you need for a given job file, add up the I/O
1260 depth of all jobs (normally one unless iodepth is used) and mul‐
1261 tiply by the maximum bs set. Then divide that number by the huge
1262 page size. You can see the size of the huge pages in `/proc/mem‐
1263 info'. If no huge pages are allocated by having a non-zero num‐
1264 ber in `nr_hugepages', using mmaphuge or shmhuge will fail. Also
1265 see hugepage-size.
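The huge page calculation described above can be worked through in a few lines. The sketch assumes the 4MiB huge page size mentioned in the text and rounds up to whole pages (the rounding is an assumption); the function name and the example jobs are hypothetical.

```python
def huge_pages_needed(iodepths, max_bs, hugepage_size=4 * 1024 * 1024):
    """Sum the I/O depths, multiply by the largest bs, divide by the page size."""
    total_bytes = sum(iodepths) * max_bs
    # Round up so a partially used huge page is still counted.
    return -(-total_bytes // hugepage_size)


MIB = 1024 * 1024
# Two jobs with iodepth 32 and 16, where the largest bs in use is 1MiB.
print(huge_pages_needed([32, 16], MIB))
```

The result is the minimum value that `nr_hugepages` would need to hold for the buffers alone, under the stated assumptions.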
1266
1267 mmaphuge also needs to have hugetlbfs mounted and the file loca‐
1268 tion should point there. So if it's mounted in `/huge', you
1269 would use `mem=mmaphuge:/huge/somefile'.
1270
1271 iomem_align=int, mem_align=int
1272 This indicates the memory alignment of the I/O memory buffers.
1273 Note that the given alignment is applied to the first I/O unit
1274 buffer; if using iodepth, the alignment of the following buffers
1275 is given by the bs used. In other words, if using a bs that is
1276 a multiple of the page size in the system, all buffers will be
1277 aligned to this value. If using a bs that is not page aligned,
1278 the alignment of subsequent I/O memory buffers is the sum of the
1279 iomem_align and bs used.
1280
1281 hugepage-size=int
1282 Defines the size of a huge page. Must at least be equal to the
1283 system setting, see `/proc/meminfo'. Defaults to 4MiB. Should
1284 probably always be a multiple of megabytes, so using
1285 `hugepage-size=Xm' is the preferred way to set this to avoid
1286 setting a non-pow-2 bad value.
1287
1288 lockmem=int
1289 Pin the specified amount of memory with mlock(2). Can be used to
1290 simulate a smaller amount of memory. The amount specified is per
1291 worker.
1292
1293 I/O size
1294 size=int
1295 The total size of file I/O for each thread of this job. Fio will
1296 run until this many bytes have been transferred, unless runtime
1297 is limited by other options (such as runtime, for instance, or
1298 increased/decreased by io_size). Fio will divide this size
1299 between the available files determined by options such as
1300 nrfiles, filename, unless filesize is specified by the job. If
1301 the result of division happens to be 0, the size is set to the
1302 physical size of the given files or devices if they exist. If
1303 this option is not specified, fio will use the full size of the
1304 given files or devices. If the files do not exist, size must be
1305 given. It is also possible to give size as a percentage between
1306 1 and 100. If `size=20%' is given, fio will use 20% of the full
1307 size of the given files or devices. Can be combined with offset
1308 to constrain the start and end range that I/O will be done
1309 within.
1310
1311 io_size=int, io_limit=int
1312 Normally fio operates within the region set by size, which means
1313 that the size option sets both the region and size of I/O to be
1314 performed. Sometimes that is not what you want. With this
1315 option, it is possible to define just the amount of I/O that fio
1316 should do. For instance, if size is set to 20GiB and io_size is
1317 set to 5GiB, fio will perform I/O within the first 20GiB but
1318 exit when 5GiB have been done. The opposite is also possible --
1319 if size is set to 20GiB, and io_size is set to 40GiB, then fio
1320 will do 40GiB of I/O within the 0..20GiB region.
1321
1322 filesize=irange(int)
1323 Individual file sizes. May be a range, in which case fio will
1324 select sizes for files at random within the given range and lim‐
1325 ited to size in total (if that is given). If not given, each
1326 created file is the same size. This option overrides size in
1327 terms of file size, which means this value is used as a fixed
1328 size or possible range of each file.
1329
1330 file_append=bool
1331 Perform I/O after the end of the file. Normally fio will operate
1332 within the size of a file. If this option is set, then fio will
1333 append to the file instead. This has identical behavior to set‐
1334 ting offset to the size of a file. This option is ignored on
1335 non-regular files.
1336
1337 fill_device=bool, fill_fs=bool
1338 Sets size to something really large and waits for ENOSPC (no
1339 space left on device) as the terminating condition. Only makes
1340 sense with sequential write. For a read workload, the mount
1341 point will be filled first then I/O started on the result. This
1342 option doesn't make sense if operating on a raw device node,
1343 since the size of that is already known by the file system.
1344 Additionally, writing beyond end-of-device will not return
1345 ENOSPC there.
1346
1347 I/O engine
1348 ioengine=str
1349 Defines how the job issues I/O to the file. The following types
1350 are defined:
1351
1352 sync Basic read(2) or write(2) I/O. lseek(2) is used to
1353 position the I/O location. See fsync and fdata‐
1354 sync for syncing write I/Os.
1355
1356 psync Basic pread(2) or pwrite(2) I/O. Default on all
1357 supported operating systems except for Windows.
1358
1359 vsync Basic readv(2) or writev(2) I/O. Will emulate
1360 queuing by coalescing adjacent I/Os into a single
1361 submission.
1362
1363 pvsync Basic preadv(2) or pwritev(2) I/O.
1364
1365 pvsync2
1366 Basic preadv2(2) or pwritev2(2) I/O.
1367
1368 libaio Linux native asynchronous I/O. Note that Linux may
1369 only support queued behavior with non-buffered I/O
1370 (set `direct=1' or `buffered=0'). This engine
1371 defines engine specific options.
1372
1373 posixaio
1374 POSIX asynchronous I/O using aio_read(3) and
1375 aio_write(3).
1376
1377 solarisaio
1378 Solaris native asynchronous I/O.
1379
1380 windowsaio
1381 Windows native asynchronous I/O. Default on Win‐
1382 dows.
1383
1384 mmap File is memory mapped with mmap(2) and data copied
1385 to/from using memcpy(3).
1386
1387 splice splice(2) is used to transfer the data and
1388 vmsplice(2) to transfer data from user space to
1389 the kernel.
1390
1391 sg SCSI generic sg v3 I/O. May either be synchronous
1392 using the SG_IO ioctl, or if the target is an sg
1393 character device we use read(2) and write(2) for
1394 asynchronous I/O. Requires filename option to
1395 specify either block or character devices. The sg
1396 engine includes engine specific options.
1397
1398 null Doesn't transfer any data, just pretends to. This
1399 is mainly used to exercise fio itself and for
1400 debugging/testing purposes.
1401
1402 net Transfer over the network to given `host:port'.
1403 Depending on the protocol used, the hostname,
1404 port, listen and filename options are used to
1405 specify what sort of connection to make, while the
1406 protocol option determines which protocol will be
1407 used. This engine defines engine specific options.
1408
1409 netsplice
1410 Like net, but uses splice(2) and vmsplice(2) to
1411 map data and send/receive. This engine defines
1412 engine specific options.
1413
1414 cpuio Doesn't transfer any data, but burns CPU cycles
1415 according to the cpuload and cpuchunks options.
1416 Setting cpuload=85 will cause that job to do noth‐
1417 ing but burn 85% of the CPU. In case of SMP
1418 machines, use `numjobs=<nr_of_cpu>' to get desired
1419 CPU usage, as the cpuload only loads a single CPU
1420 at the desired rate. A job never finishes unless
1421 there is at least one non-cpuio job.
1422
1423 guasi The GUASI I/O engine is the Generic Userspace
1424 Asynchronous Syscall Interface approach to async
1425 I/O. See http://www.xmailserver.org/guasi-lib.html
1426 for more info on GUASI.
1427
1428 rdma The RDMA I/O engine supports both RDMA memory
1429 semantics (RDMA_WRITE/RDMA_READ) and channel
1430 semantics (Send/Recv) for the InfiniBand, RoCE and
1431 iWARP protocols. This engine defines engine spe‐
1432 cific options.
1433
1434 falloc I/O engine that does regular fallocate to simulate
1435 data transfer as fio ioengine.
1436
1437 DDIR_READ does fallocate(,mode = FAL‐
1438 LOC_FL_KEEP_SIZE,).
1439 DDIR_WRITE does fallocate(,mode = 0).
1440 DDIR_TRIM does fallocate(,mode = FAL‐
1441 LOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE).
1442
1443 ftruncate
1444 I/O engine that sends ftruncate(2) operations in
1445 response to write (DDIR_WRITE) events. Each ftrun‐
1446 cate issued sets the file's size to the current
1447 block offset. blocksize is ignored.
1448
1449 e4defrag
1450 I/O engine that does regular EXT4_IOC_MOVE_EXT
1451 ioctls to simulate defragment activity in response
1452 to DDIR_WRITE events.
1453
1454 rados I/O engine supporting direct access to Ceph Reli‐
1455 able Autonomic Distributed Object Store (RADOS)
1456 via librados. This ioengine defines engine spe‐
1457 cific options.
1458
1459 rbd I/O engine supporting direct access to Ceph Rados
1460 Block Devices (RBD) via librbd without the need to
1461 use the kernel rbd driver. This ioengine defines
1462 engine specific options.
1463
1464 gfapi Uses the GlusterFS libgfapi sync interface for direct
1465 access to GlusterFS volumes without having to go
1466 through FUSE. This ioengine defines engine spe‐
1467 cific options.
1468
1469 gfapi_async
1470 Uses the GlusterFS libgfapi async interface for direct
1471 access to GlusterFS volumes without having to go
1472 through FUSE. This ioengine defines engine spe‐
1473 cific options.
1474
1475 libhdfs
1476 Read and write through Hadoop (HDFS). The filename
1477 option is used to specify host,port of the hdfs
1478 name-node to connect. This engine interprets off‐
1479 sets a little differently. In HDFS, files once
1480 created cannot be modified so random writes are
1481 not possible. To imitate this the libhdfs engine
1482 expects a bunch of small files to be created over
1483 HDFS and will randomly pick a file from them based
1484 on the offset generated by fio backend (see the
1485 example job file to create such files, use
1486 `rw=write' option). Please note, it may be neces‐
1487 sary to set environment variables to work with
1488 HDFS/libhdfs properly. Each job uses its own con‐
1489 nection to HDFS.
1490
1491 mtd Read, write and erase an MTD character device
1492 (e.g., `/dev/mtd0'). Discards are treated as
1493 erases. Depending on the underlying device type,
1494 the I/O may have to go in a certain pattern, e.g.,
1495 on NAND, writing sequentially to erase blocks and
1496 discarding before overwriting. The trimwrite mode
1497 works well for this constraint.
1498
1499 pmemblk
1500 Read and write using filesystem DAX to a file on a
1501 filesystem mounted with DAX on a persistent memory
1502 device through the PMDK libpmemblk library.
1503
1504 dev-dax
1505 Read and write using device DAX to a persistent
1506 memory device (e.g., /dev/dax0.0) through the PMDK
1507 libpmem library.
1508
1509 external
1510 Prefix to specify loading an external I/O engine
1511 object file. Append the engine filename, e.g.
1512 `ioengine=external:/tmp/foo.o' to load ioengine
1513 `foo.o' in `/tmp'. The path can be either absolute
1514 or relative. See `engines/skeleton_external.c' in
1515 the fio source for details of writing an external
1516 I/O engine.
1517
1518 filecreate
1519 Simply create the files and do no I/O to them.
1520 You still need to set filesize so that all the
1521 accounting still occurs, but no actual I/O will be
1522 done other than creating the file.
1523
1524 libpmem
1525 Read and write using mmap I/O to a file on a
1526 filesystem mounted with DAX on a persistent memory
1527 device through the PMDK libpmem library.
1528
1529 I/O engine specific parameters
1530 In addition, there are some parameters which are only valid when a spe‐
1531 cific ioengine is in use. These are used identically to normal parame‐
1532 ters, with the caveat that when used on the command line, they must
1533 come after the ioengine that defines them is selected.
1534
1535 (libaio)userspace_reap
1536 Normally, with the libaio engine in use, fio will use the
1537 io_getevents(3) system call to reap newly returned events. With
1538 this flag turned on, the AIO ring will be read directly from
1539 user-space to reap events. The reaping mode is only enabled when
1540 polling for a minimum of 0 events (e.g. when `iodepth_batch_com‐
1541 plete=0').
1542
1543 (pvsync2)hipri
1544 Set RWF_HIPRI on I/O, indicating to the kernel that it's of
1545 higher priority than normal.
1546
1547 (pvsync2)hipri_percentage
1548 When hipri is set this determines the probability of a pvsync2
1549 I/O being high priority. The default is 100%.
1550
1551 (cpuio)cpuload=int
1552 Attempt to use the specified percentage of CPU cycles. This is a
1553 mandatory option when using cpuio I/O engine.
1554
1555 (cpuio)cpuchunks=int
1556 Split the load into cycles of the given time. In microseconds.
1557
1558 (cpuio)exit_on_io_done=bool
1559 Detect when I/O threads are done, then exit.
1560
1561 (libhdfs)namenode=str
1562 The hostname or IP address of a HDFS cluster namenode to con‐
1563 tact.
1564
1565 (libhdfs)port
1566 The listening port of the HDFS cluster namenode.
1567
1568 (netsplice,net)port
1569 The TCP or UDP port to bind to or connect to. If this is used
1570 with numjobs to spawn multiple instances of the same job type,
1571 then this will be the starting port number since fio will use a
1572 range of ports.
1573
1574 (rdma)port
1575 The port to use for RDMA-CM communication. This should be the
1576 same value on the client and the server side.
1577
1578 (netsplice,net,rdma)hostname=str
1579 The hostname or IP address to use for TCP, UDP or RDMA-CM based
1580 I/O. If the job is a TCP listener or UDP reader, the hostname
1581 is not used and must be omitted unless it is a valid UDP multi‐
1582 cast address.
1583
1584 (netsplice,net)interface=str
1585 The IP address of the network interface used to send or receive
1586 UDP multicast.
1587
1588 (netsplice,net)ttl=int
1589 Time-to-live value for outgoing UDP multicast packets. Default:
1590 1.
1591
1592 (netsplice,net)nodelay=bool
1593 Set TCP_NODELAY on TCP connections.
1594
1595 (netsplice,net)protocol=str, proto=str
1596 The network protocol to use. Accepted values are:
1597
1598 tcp Transmission control protocol.
1599
1600 tcpv6 Transmission control protocol V6.
1601
1602 udp User datagram protocol.
1603
1604 udpv6 User datagram protocol V6.
1605
1606 unix UNIX domain socket.
1607
1608 When the protocol is TCP or UDP, the port must also be given, as
1609 well as the hostname unless the job is a TCP listener or UDP reader.
1610 For unix sockets, the normal filename option should be used and
1611 the port is invalid.
1612
1613 (netsplice,net)listen
1614 For TCP network connections, tell fio to listen for incoming
1615 connections rather than initiating an outgoing connection. The
1616 hostname must be omitted if this option is used.
1617
1618 (netsplice,net)pingpong
1619 Normally a network writer will just continue writing data, and a
1620 network reader will just consume packets. If `pingpong=1' is
1621 set, a writer will send its normal payload to the reader, then
1622 wait for the reader to send the same payload back. This allows
1623 fio to measure network latencies. The submission and completion
1624 latencies then measure local time spent sending or receiving,
1625 and the completion latency measures how long it took for the
1626 other end to receive and send back. For UDP multicast traffic
1627 `pingpong=1' should only be set for a single reader when multi‐
1628 ple readers are listening to the same address.
1629
1630 (netsplice,net)window_size=int
1631 Set the desired socket buffer size for the connection.
1632
1633 (netsplice,net)mss=int
1634 Set the TCP maximum segment size (TCP_MAXSEG).
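As an illustration of how the net engine options above fit together, here is a minimal job file sketch with one TCP listener and one TCP sender on the same host. The port number, the loopback address, the sizes and the job names are assumptions chosen for the example; `startdelay` is a general fio option used here only to give the listener time to start.

```ini
; Sketch only: loopback TCP transfer between two jobs in one fio invocation.
[global]
ioengine=net
; port 8888 is an arbitrary example value
port=8888
protocol=tcp
bs=4k
size=64m
; uncomment to have the reader echo each payload back (latency measurement)
;pingpong=1

[receiver]
; TCP listener: hostname is omitted, as the listen option requires
listen
rw=read

[sender]
; connects to the listener; 127.0.0.1 is an assumed address
hostname=127.0.0.1
startdelay=1
rw=write
```

With `protocol=unix` the `port` line would be dropped and `filename` used for the socket path instead.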
1635
1636 (e4defrag)donorname=str
1637 File will be used as a block donor (swap extents between files).
1638
1639 (e4defrag)inplace=int
1640 Configure donor file blocks allocation strategy:
1641
1642 0 Default. Preallocate donor's file on init.
1643
1644 1 Allocate space immediately inside defragment
1645 event, and free right after event.
1646
1647 (rbd,rados)clustername=str
1648 Specifies the name of the Ceph cluster.
1649
1650 (rbd)rbdname=str
1651 Specifies the name of the RBD.
1652
1653 (rbd,rados)pool=str
1654 Specifies the name of the Ceph pool containing RBD or RADOS
1655 data.
1656
1657 (rbd,rados)clientname=str
1658 Specifies the username (without the 'client.' prefix) used to
1659 access the Ceph cluster. If the clustername is specified, the
1660 clientname shall be the full *type.id* string. If no type. pre‐
1661 fix is given, fio will add 'client.' by default.
1662
1663 (rbd,rados)busy_poll=bool
1664 Poll store instead of waiting for completion. Usually this pro‐
1665 vides better throughput at the cost of higher (up to 100%) CPU uti‐
1666 lization.
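The Ceph options above might be combined as in the following sketch. The pool name, image name and client name are hypothetical and must match an existing pool, RBD image and Ceph user on the target cluster.

```ini
; Sketch only: random writes against an existing RBD image.
[global]
ioengine=rbd
; username without the 'client.' prefix (assumed user: client.admin)
clientname=admin
; assumed pool and image names
pool=rbd
rbdname=fio_test
rw=randwrite
bs=4k
; set to 1 to poll for completions at the cost of CPU time
busy_poll=0

[rbd_iodepth32]
iodepth=32
```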
1667
1668 (mtd)skip_bad=bool
1669 Skip operations against known bad blocks.
1670
1671 (libhdfs)hdfsdirectory
1672 libhdfs will create chunks in this HDFS directory.
1673
1674 (libhdfs)chunk_size
1675 The size of the chunk to use for each file.
1676
1677 (rdma)verb=str
1678 The RDMA verb to use on this side of the RDMA ioengine connec‐
1679 tion. Valid values are write, read, send and recv. These corre‐
1680 spond to the equivalent RDMA verbs (e.g. write = rdma_write
1681 etc.). Note that this only needs to be specified on the client
1682 side of the connection. See the examples folder.
1683
1684 (rdma)bindname=str
1685 The name to use to bind the local RDMA-CM connection to a local
1686 RDMA device. This could be a hostname or an IPv4 or IPv6
1687 address. On the server side this will be passed into the
1688 rdma_bind_addr() function and on the client side it will be used
1689 in the rdma_resolve_addr() function. This can be useful when mul‐
1690 tiple paths exist between the client and the server or in cer‐
1691 tain loopback configurations.
1692
1693 (sg)readfua=bool
1694 With readfua option set to 1, read operations include the force
1695 unit access (fua) flag. Default: 0.
1696
1697 (sg)writefua=bool
1698 With writefua option set to 1, write operations include the
1699 force unit access (fua) flag. Default: 0.
1700
1701 (sg)sg_write_mode=str
1702 Specify the type of write commands to issue. This option can
1703 take three values:
1704
1705 write (default)
1706 Write opcodes are issued as usual.
1707
1708 verify Issue WRITE AND VERIFY commands. The BYTCHK bit is
1709 set to 0. This directs the device to carry out a
1710 medium verification with no data comparison. The
1711 writefua option is ignored with this selection.
1712
1713 same Issue WRITE SAME commands. This transfers a single
1714 block to the device and writes this same block of
1715 data to a contiguous sequence of LBAs beginning at
1716 the specified offset. fio's block size parameter
1717 specifies the amount of data written with each
1718 command. However, the amount of data actually
1719 transferred to the device is equal to the device's
1720 block (sector) size. For a device with 512 byte
1721 sectors, blocksize=8k will write 16 sectors with
1722 each command. fio will still generate 8k of data
1723 for each command but only the first 512 bytes will
1724 be used and transferred to the device. The write‐
1725 fua option is ignored with this selection.
1726
1727
1728 I/O depth
1729 iodepth=int
1730 Number of I/O units to keep in flight against the file. Note
1731 that increasing iodepth beyond 1 will not affect synchronous
1732 ioengines (except for small degrees when verify_async is in
1733 use). Even async engines may impose OS restrictions causing the
1734 desired depth not to be achieved. This may happen on Linux when
1735 using libaio and not setting `direct=1', since buffered I/O is
1736 not async on that OS. Keep an eye on the I/O depth distribution
1737 in the fio output to verify that the achieved depth is as
1738 expected. Default: 1.
1739
1740 iodepth_batch_submit=int, iodepth_batch=int
1741 This defines how many pieces of I/O to submit at once. It
1742 defaults to 1 which means that we submit each I/O as soon as it
1743 is available, but can be raised to submit bigger batches of I/O
1744 at a time. If it is set to 0 the iodepth value will be used.
1745
1746 iodepth_batch_complete_min=int, iodepth_batch_complete=int
1747 This defines how many pieces of I/O to retrieve at once. It
1748 defaults to 1 which means that we'll ask for a minimum of 1 I/O
1749 in the retrieval process from the kernel. The I/O retrieval will
1750 go on until we hit the limit set by iodepth_low. If this vari‐
1751 able is set to 0, then fio will always check for completed
1752 events before queuing more I/O. This helps reduce I/O latency,
1753 at the cost of more retrieval system calls.
1754
1755 iodepth_batch_complete_max=int
1756 This defines the maximum number of pieces of I/O to retrieve at once. This
1757 variable should be used along with iodepth_batch_com‐
1758 plete_min=int variable, specifying the range of min and max
1759 amount of I/O which should be retrieved. By default it is equal
1760 to iodepth_batch_complete_min value. Example #1:
1761
1762 iodepth_batch_complete_min=1
1763 iodepth_batch_complete_max=<iodepth>
1764
1765 which means that we will retrieve at least 1 I/O and up to the
1766 whole submitted queue depth. If no I/O has completed
1767 yet, we will wait. Example #2:
1768
1769 iodepth_batch_complete_min=0
1770 iodepth_batch_complete_max=<iodepth>
1771
1772 which means that we can retrieve up to the whole submitted queue
1773 depth, but if no I/O has completed yet, we will NOT
1774 wait and immediately exit the system call. In this example we
1775 simply do polling.
1776
1777 iodepth_low=int
1778 The low water mark indicating when to start filling the queue
1779 again. Defaults to the same as iodepth, meaning that fio will
1780 attempt to keep the queue full at all times. If iodepth is set
1781 to e.g. 16 and iodepth_low is set to 4, then after fio has
1782 filled the queue of 16 requests, it will let the depth drain
1783 down to 4 before starting to fill it again.
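The queue depth options above could be combined as in this sketch, which mirrors the iodepth=16 / iodepth_low=4 case from the text. The file name and size are assumptions, and `direct=1` is set because buffered I/O with libaio on Linux is not asynchronous.

```ini
; Sketch only: fill the queue to 16, let it drain to 4, then refill.
[queue-depth-demo]
ioengine=libaio
direct=1
rw=randread
bs=4k
size=256m
; assumed test file path
filename=/tmp/fio-qd.test
iodepth=16
iodepth_low=4
; submit up to 8 I/Os per submission call
iodepth_batch_submit=8
; reap at least 1 and at most 16 completions per retrieval
iodepth_batch_complete_min=1
iodepth_batch_complete_max=16
```

The I/O depth distribution in the fio output shows whether the requested depth was actually reached.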
1784
1785 serialize_overlap=bool
1786 Serialize in-flight I/Os that might otherwise cause or suffer
1787 from data races. When two or more I/Os are submitted simultane‐
1788 ously, there is no guarantee that the I/Os will be processed or
1789 completed in the submitted order. Further, if two or more of
1790 those I/Os are writes, any overlapping region between them can
1791 become indeterminate/undefined on certain storage. These issues
1792 can cause verification to fail erratically when at least one of
1793 the racing I/Os is changing data and the overlapping region has
1794 a non-zero size. Setting serialize_overlap tells fio to avoid
1795 provoking this behavior by explicitly serializing in-flight I/Os
1796 that have a non-zero overlap. Note that setting this option can
1797 reduce both performance and the iodepth achieved. Additionally
1798 this option does not work when io_submit_mode is set to offload.
1799 Default: false.
1800
1801 io_submit_mode=str
1802 This option controls how fio submits the I/O to the I/O engine.
1803 The default is `inline', which means that the fio job threads
1804 submit and reap I/O directly. If set to `offload', the job
1805 threads will offload I/O submission to a dedicated pool of I/O
1806 threads. This requires some coordination and thus has a bit of
1807 extra overhead, especially for lower queue depth I/O where it
1808 can increase latencies. The benefit is that fio can manage sub‐
1809 mission rates independently of the device completion rates. This
1810 avoids skewed latency reporting if I/O gets backed up on the
1811 device side (the coordinated omission problem).
1812
1813 I/O rate
1814 thinktime=time
1815 Stall the job for the specified period of time after an I/O has
1816 completed before issuing the next. May be used to simulate pro‐
1817 cessing being done by an application. When the unit is omitted,
1818 the value is interpreted in microseconds. See thinktime_blocks
1819 and thinktime_spin.
1820
1821 thinktime_spin=time
1822 Only valid if thinktime is set - pretend to spend CPU time doing
1823 something with the data received, before falling back to sleep‐
1824 ing for the rest of the period specified by thinktime. When the
1825 unit is omitted, the value is interpreted in microseconds.
1826
1827 thinktime_blocks=int
1828 Only valid if thinktime is set - control how many blocks to
1829 issue, before waiting thinktime usecs. If not set, defaults to 1
1830 which will make fio wait thinktime usecs after every block. This
1831 effectively makes any queue depth setting redundant, since no
1832 more than 1 I/O will be queued before we have to complete it and
1833 do our thinktime. In other words, this setting effectively caps
1834 the queue depth if the latter is larger.
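A short sketch of the three thinktime options used together follows. The values, file name and job name are illustrative assumptions; all times are in microseconds because no unit is given.

```ini
; Sketch only: issue 8 blocks, then pause 1000 usec
; (200 usec of it spinning on the CPU, the rest sleeping).
[thinktime-demo]
rw=randread
bs=4k
size=64m
filename=/tmp/fio-think.test
thinktime=1000
thinktime_spin=200
thinktime_blocks=8
```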
1835
1836 rate=int[,int][,int]
1837 Cap the bandwidth used by this job. The number is in bytes/sec,
1838 the normal suffix rules apply. Comma-separated values may be
1839 specified for reads, writes, and trims as described in block‐
1840 size.
1841
1842 For example, using `rate=1m,500k' would limit reads to 1MiB/sec
1843 and writes to 500KiB/sec. Capping only reads or writes can be
1844 done with `rate=,500k' or `rate=500k,' where the former will
1845 only limit writes (to 500KiB/sec) and the latter will only limit
1846 reads.
1847
1848 rate_min=int[,int][,int]
1849 Tell fio to do whatever it can to maintain at least this band‐
1850 width. Failing to meet this requirement will cause the job to
1851 exit. Comma-separated values may be specified for reads, writes,
1852 and trims as described in blocksize.
1853
1854 rate_iops=int[,int][,int]
1855 Cap the bandwidth to this number of IOPS. Basically the same as
1856 rate, just specified independently of bandwidth. If the job is
1857 given a block size range instead of a fixed value, the smallest
1858 block size is used as the metric. Comma-separated values may be
1859 specified for reads, writes, and trims as described in block‐
1860 size.
1861
1862 rate_iops_min=int[,int][,int]
1863 If fio doesn't meet this rate of I/O, it will cause the job to
1864 exit. Comma-separated values may be specified for reads,
1865 writes, and trims as described in blocksize.
1866
1867 rate_process=str
1868 This option controls how fio manages rated I/O submissions. The
1869 default is `linear', which submits I/O in a linear fashion with
1870 fixed delays between I/Os that get adjusted based on I/O com‐
1871 pletion rates. If this is set to `poisson', fio will submit I/O
1872 based on a more real world random request flow, known as the
1873 Poisson process (https://en.wikipedia.org/wiki/Pois‐
1874 son_point_process). The lambda will be 10^6 / IOPS for the given
1875 workload.
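The rate options above might be used as in the following sketch. The job names, file names and limits are assumptions. With `rate_iops=1000` and the Poisson process, the lambda given above works out to 10^6 / 1000 = 1000.

```ini
; Sketch only: two independently rate-limited jobs.
[bandwidth-cap]
rw=randrw
bs=4k
size=128m
filename=/tmp/fio-rate-bw.test
; reads capped at 1MiB/sec, writes at 500KiB/sec
rate=1m,500k

[iops-cap]
rw=randread
bs=4k
size=128m
filename=/tmp/fio-rate-iops.test
; cap at 1000 IOPS with Poisson-distributed submission times
rate_iops=1000
rate_process=poisson
```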
1876
1877 rate_ignore_thinktime=bool
1878 By default, fio will attempt to catch up to the specified rate
1879 setting, if any kind of thinktime setting was used. If this
1880 option is set, then fio will ignore the thinktime and continue
1881 doing IO at the specified rate, instead of entering a catch-up
1882 mode after thinktime is done.
1883
1884 I/O latency
1885 latency_target=time
1886 If set, fio will attempt to find the max performance point that
1887 the given workload will run at while maintaining a latency below
1888 this target. When the unit is omitted, the value is interpreted
1889 in microseconds. See latency_window and latency_percentile.
1890
1891 latency_window=time
1892 Used with latency_target to specify the sample window that the
1893 job is run at varying queue depths to test the performance. When
1894 the unit is omitted, the value is interpreted in microseconds.
1895
1896 latency_percentile=float
1897 The percentage of I/Os that must fall within the criteria speci‐
1898 fied by latency_target and latency_window. If not set, this
1899 defaults to 100.0, meaning that all I/Os must be equal to or below
1900 the value set by latency_target.
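The three latency profiling options could be combined as in this sketch. The target, window, percentile and file name are assumed example values; times are written in microseconds.

```ini
; Sketch only: find the highest load at which 99.9% of I/Os
; complete within 500 msec, judged over 5 second windows.
[latency-profile]
ioengine=libaio
direct=1
rw=randread
bs=4k
size=256m
filename=/tmp/fio-latency.test
; upper bound on the queue depth fio may try
iodepth=128
; 500 msec, in microseconds
latency_target=500000
; 5 sec, in microseconds
latency_window=5000000
latency_percentile=99.9
```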
1901
1902 max_latency=time
1903 If set, fio will exit the job with an ETIMEDOUT error if it
1904 exceeds this maximum latency. When the unit is omitted, the
1905 value is interpreted in microseconds.
1906
1907 rate_cycle=int
1908 Average bandwidth for rate and rate_min over this number of mil‐
1909 liseconds. Defaults to 1000.
1910
1911 I/O replay
1912 write_iolog=str
1913 Write the issued I/O patterns to the specified file. See
1914 read_iolog. Specify a separate file for each job, otherwise the
1915 iologs will be interspersed and the file may be corrupt.
1916
1917 read_iolog=str
1918 Open an iolog with the specified filename and replay the I/O
1919 patterns it contains. This can be used to store a workload and
1920 replay it sometime later. The iolog given may also be a blktrace
1921 binary file, which allows fio to replay a workload captured by
1922 blktrace. See blktrace(8) for how to capture such logging data.
1923 For blktrace replay, the file needs to be turned into a blkparse
1924 binary data file first (`blkparse <device> -o /dev/null -d
1925 file_for_fio.bin').
1926
1927 replay_no_stall=bool
1928 When replaying I/O with read_iolog the default behavior is to
1929 attempt to respect the timestamps within the log and replay them
1930 with the appropriate delay between IOPS. By setting this vari‐
1931 able fio will not respect the timestamps and attempt to replay
1932 them as fast as possible while still respecting ordering. The
1933 result is the same I/O pattern to a given device, but different
1934 timings.
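Recording and replaying an I/O log might look like the sketch below. It shows two separate job files in one listing; the file names, log path and job names are assumptions, and each job file would be run with its own fio invocation.

```ini
; Sketch only. First job file (e.g. record.fio): record the issued I/O.
[record]
rw=randwrite
bs=4k
size=64m
filename=/tmp/fio-replay.test
write_iolog=/tmp/fio-record.iolog

; Second job file (e.g. replay.fio): replay the recorded pattern,
; ignoring the logged timestamps but keeping the ordering.
[replay]
filename=/tmp/fio-replay.test
read_iolog=/tmp/fio-record.iolog
replay_no_stall=1
```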
1935
1936 replay_time_scale=int
1937 When replaying I/O with read_iolog, fio will honor the original
1938 timing in the trace. With this option, it's possible to scale
1939 the time. It's a percentage option: if set to 50, it means run at
1940 50% of the original I/O rate in the trace. If set to 200, run at
1941 twice the original I/O rate. Defaults to 100.
1942
1943 replay_redirect=str
1944 While replaying I/O patterns using read_iolog the default behav‐
1945 ior is to replay the IOPS onto the major/minor device that each
1946 IOP was recorded from. This is sometimes undesirable because on
1947 a different machine those major/minor numbers can map to a dif‐
1948 ferent device. Changing hardware on the same system can also
1949 result in a different major/minor mapping. replay_redirect
1950 causes all I/Os to be replayed onto the single specified device
1951 regardless of the device it was recorded from. E.g. `replay_re‐
1952 direct=/dev/sdc' would cause all I/O in the blktrace or iolog to
1953 be replayed onto `/dev/sdc'. This means multiple devices will be
1954 replayed onto a single device, if the trace contains multiple
1955 devices. If you want multiple devices to be replayed concur‐
1956 rently to multiple redirected devices you must blkparse your
1957 trace into separate traces and replay them with independent fio
1958 invocations. Unfortunately this also breaks the strict time
1959 ordering between multiple device accesses.
1960
1961 replay_align=int
1962 Force alignment of I/O offsets and lengths in a trace to this
1963 power of 2 value.
1964
1965 replay_scale=int
1966 Scale sector offsets down by this factor when replaying traces.
1967
1968 replay_skip=str
1969 Sometimes it's useful to skip certain I/O types in a replay
1970 trace, for instance to eliminate the writes in the trace, or
1971 to avoid replaying trims/discards when redirecting to a device
1972 that doesn't support them. This option takes a comma separated
1973 list of read, write, trim, sync.
1974
1975 Threads, processes and job synchronization
1976 thread Fio defaults to creating jobs by using fork, however if this
1977 option is given, fio will create jobs by using POSIX Threads'
1978 function pthread_create(3) to create threads instead.
1979
1980 wait_for=str
1981 If set, the current job won't be started until all workers of
1982 the specified waitee job are done. wait_for operates on a job
1983 name basis, so there are a few limitations. First, the waitee
1984 must be defined prior to the waiter job (meaning no forward ref‐
1985 erences). Second, if a job is being referenced as a waitee, it
1986 must have a unique name (no duplicate waitees).
1987
1988 nice=int
1989 Run the job with the given nice value. See man nice(2). On Win‐
1990 dows, values less than -15 set the process class to "High"; -1
1991 through -15 set "Above Normal"; 1 through 15 "Below Normal"; and
1992 above 15 "Idle" priority class.
1993
1994 prio=int
1995 Set the I/O priority value of this job. Linux limits us to a
1996 positive value between 0 and 7, with 0 being the highest. See
1997 man ionice(1). Refer to an appropriate manpage for other operat‐
1998 ing systems since meaning of priority may differ.
1999
2000 prioclass=int
2001 Set the I/O priority class. See man ionice(1).
2002
2003 cpus_allowed=str
2004 Controls the same options as cpumask, but accepts a textual
2005 specification of the permitted CPUs instead and CPUs are indexed
2006 from 0. So to use CPUs 0 and 5 you would specify
2007 `cpus_allowed=0,5'. This option also allows a range of CPUs to
2008 be specified -- say you wanted a binding to CPUs 0, 5, and 8 to
2009 15, you would set `cpus_allowed=0,5,8-15'.
2010
2011 On Windows, when `cpus_allowed' is unset only CPUs from fio's
2012 current processor group will be used and affinity settings are
2013 inherited from the system. An fio build configured to target
2014 Windows 7 makes options that set CPUs processor group aware and
2015 values will set both the processor group and a CPU from within
2016 that group. For example, on a system where processor group 0 has
2017 40 CPUs and processor group 1 has 32 CPUs, `cpus_allowed' values
2018 between 0 and 39 will bind CPUs from processor group 0 and
2019 `cpus_allowed' values between 40 and 71 will bind CPUs from pro‐
2020 cessor group 1. When using `cpus_allowed_policy=shared' all CPUs
2021 specified by a single `cpus_allowed' option must be from the
2022 same processor group. For Windows fio builds not built for Win‐
2023 dows 7, CPUs will only be selected from (and be relative to)
2024 whatever processor group fio happens to be running in and CPUs
2025 from other processor groups cannot be used.
2026
2027 cpus_allowed_policy=str
2028 Set the policy of how fio distributes the CPUs specified by
2029 cpus_allowed or cpumask. Two policies are supported:
2030
2031 shared All jobs will share the CPU set specified.
2032
2033 split Each job will get a unique CPU from the CPU set.
2034
2035 shared is the default behavior, if the option isn't specified.
2036 If split is specified, then fio will assign one CPU per
2037 job. If not enough CPUs are given for the jobs listed, then fio
2038 will round-robin the CPUs in the set.
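As a sketch of the CPU binding options above, the following job file pins four clones of one job to four distinct CPUs. The CPU range assumes a machine with at least four CPUs; the file name and job name are also assumptions.

```ini
; Sketch only: four job clones, each bound to its own CPU out of 0-3.
[pinned-readers]
rw=randread
bs=4k
size=64m
filename=/tmp/fio-cpu.test
numjobs=4
cpus_allowed=0-3
; split: one unique CPU per job instead of sharing the whole set
cpus_allowed_policy=split
```

With the default `shared` policy, all four clones would instead be allowed to run on any of CPUs 0 through 3.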
2039
2040 cpumask=int
2041 Set the CPU affinity of this job. The parameter given is a bit
2042 mask of allowed CPUs the job may run on. So if you want the
2043 allowed CPUs to be 1 and 5, you would pass the decimal value of
2044 (1 << 1 | 1 << 5), or 34. See man sched_setaffinity(2). This may
2045 not work on all supported operating systems or kernel versions.
2046 This option doesn't work well for a higher CPU count than what
2047 you can store in an integer mask, so it can only control cpus
2048 1-32. For boxes with larger CPU counts, use cpus_allowed.
2049
2050 numa_cpu_nodes=str
2051 Set this job running on specified NUMA nodes' CPUs. The argu‐
2052 ments allow a comma delimited list of cpu numbers, A-B ranges, or
2053 `all'. Note, to enable NUMA options support, fio must be built
2054 on a system with libnuma-dev(el) installed.
2055
2056 numa_mem_policy=str
2057 Set this job's memory policy and corresponding NUMA nodes. For‐
2058 mat of the arguments:
2059
2060 <mode>[:<nodelist>]
2061
2062 `mode' is one of the following memory policies: `default', `pre‐
2063 fer', `bind', `interleave' or `local'. For `default' and `local'
2064 memory policies, no node needs to be specified. For `prefer',
2065 only one node is allowed. For `bind' and `interleave' the
2066 `nodelist' may be as follows: a comma delimited list of numbers,
2067 A-B ranges, or `all'.
2068
2069 cgroup=str
2070 Add job to this control group. If it doesn't exist, it will be
2071 created. The system must have a mounted cgroup blkio mount point
2072 for this to work. If your system doesn't have it mounted, you
2073 can do so with:
2074
2075 # mount -t cgroup -o blkio none /cgroup
2076
2077 cgroup_weight=int
2078 Set the weight of the cgroup to this value. See the documenta‐
2079 tion that comes with the kernel; allowed values are in the range
2080 of 100..1000.
2081
2082 cgroup_nodelete=bool
2083 Normally fio will delete the cgroups it has created after the
2084 job completion. To override this behavior and to leave cgroups
2085 around after the job completion, set `cgroup_nodelete=1'. This
2086 can be useful if one wants to inspect various cgroup files after
2087 job completion. Default: false.
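The cgroup options might be used as in this sketch, which places two competing jobs in groups of different weight. The group names, weights and file names are assumptions, and the example presumes a mounted blkio cgroup hierarchy as described above.

```ini
; Sketch only: two readers competing with different blkio weights.
[global]
rw=randread
bs=4k
size=128m
direct=1
time_based
runtime=30
; keep the groups around afterwards for inspection
cgroup_nodelete=1

[heavy]
filename=/tmp/fio-cgroup-heavy.test
cgroup=fio_heavy
cgroup_weight=900

[light]
filename=/tmp/fio-cgroup-light.test
cgroup=fio_light
cgroup_weight=100
```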
2088
2089 flow_id=int
2090 The ID of the flow. If not specified, it defaults to being a
2091 global flow. See flow.
2092
2093 flow=int
2094 Weight in token-based flow control. If this value is used, then
2095 there is a 'flow counter' which is used to regulate the propor‐
2096 tion of activity between two or more jobs. Fio attempts to keep
2097 this flow counter near zero. The flow parameter stands for how
2098 much should be added or subtracted to the flow counter on each
2099 iteration of the main I/O loop. That is, if one job has `flow=8'
2100 and another job has `flow=-1', then there will be a roughly 1:8
2101 ratio in how much one runs vs the other.
2102
2103 flow_watermark=int
2104 The maximum value that the absolute value of the flow counter is
2105 allowed to reach before the job must wait for a lower value of
2106 the counter.
2107
2108 flow_sleep=int
2109 The period of time, in microseconds, to wait after the flow
2110 watermark has been exceeded before retrying operations.
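The flow options could be combined as in the following sketch, which reuses the `flow=8` and `flow=-1` weights from the text. Both jobs share the default global flow; the watermark, sleep time, file name and job names are assumed values.

```ini
; Sketch only: regulate the proportion of activity between two jobs.
[global]
thread
time_based
runtime=30
bs=8k
size=256m
filename=/tmp/fio-flow.test
; stall a job once the flow counter magnitude exceeds 100,
; then retry after 1000 usec
flow_watermark=100
flow_sleep=1000

[writer]
rw=write
flow=8

[reader]
rw=randread
flow=-1
```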
2111
2112 stonewall, wait_for_previous
2113 Wait for preceding jobs in the job file to exit, before starting
2114 this one. Can be used to insert serialization points in the job
2115 file. A stone wall also implies starting a new reporting group,
2116 see group_reporting.
2117
2118 exitall
2119 By default, fio will continue running all other jobs when one
2120 job finishes but sometimes this is not the desired action. Set‐
2121 ting exitall will instead make fio terminate all other jobs when
2122 one job finishes.
2123
2124 exec_prerun=str
2125 Before running this job, issue the command specified through
2126 system(3). Output is redirected to a file called `jobname.pre‐
2127 run.txt'.
2128
2129 exec_postrun=str
2130 After the job completes, issue the command specified through sys‐
2131 tem(3). Output is redirected to a file called `job‐
2132 name.postrun.txt'.
2133
2134 uid=int
2135 Instead of running as the invoking user, set the user ID to this
2136 value before the thread/process does any work.
2137
2138 gid=int
2139 Set group ID, see uid.
2140
2141 Verification
2142 verify_only
2143 Do not perform specified workload, only verify data still
2144 matches previous invocation of this workload. This option allows
2145 one to check data multiple times at a later date without over‐
2146 writing it. This option makes sense only for workloads that
2147 write data, and does not support workloads with the time_based
2148 option set.
2149
2150 do_verify=bool
2151 Run the verify phase after a write phase. Only valid if verify
2152 is set. Default: true.
2153
2154 verify=str
2155 If writing to a file, fio can verify the file contents after
2156 each iteration of the job. Each verification method also implies
2157 verification of a special header, which is written to the begin‐
2158 ning of each block. This header also includes meta information,
2159 like offset of the block, block number, timestamp when block was
2160 written, etc. verify can be combined with the verify_pattern option.
2161 The allowed values are:
2162
2163 md5 Use an md5 sum of the data area and store it in
2164 the header of each block.
2165
2166 crc64 Use an experimental crc64 sum of the data area and
2167 store it in the header of each block.
2168
2169 crc32c Use a crc32c sum of the data area and store it in
2170 the header of each block. This will automatically
2171 use hardware acceleration (e.g. SSE4.2 on an x86
2172 or CRC crypto extensions on ARM64) but will fall
2173 back to software crc32c if none is found. Gener‐
2174 ally the fastest checksum fio supports when hard‐
2175 ware accelerated.
2176
2177 crc32c-intel
2178 Synonym for crc32c.
2179
2180 crc32 Use a crc32 sum of the data area and store it in
2181 the header of each block.
2182
2183 crc16 Use a crc16 sum of the data area and store it in
2184 the header of each block.
2185
2186 crc7 Use a crc7 sum of the data area and store it in
2187 the header of each block.
2188
2189 xxhash Use xxhash as the checksum function. Generally the
2190 fastest software checksum that fio supports.
2191
2192 sha512 Use sha512 as the checksum function.
2193
2194 sha256 Use sha256 as the checksum function.
2195
2196 sha1 Use optimized sha1 as the checksum function.
2197
2198 sha3-224
2199 Use optimized sha3-224 as the checksum function.
2200
2201 sha3-256
2202 Use optimized sha3-256 as the checksum function.
2203
2204 sha3-384
2205 Use optimized sha3-384 as the checksum function.
2206
2207 sha3-512
2208 Use optimized sha3-512 as the checksum function.
2209
2210 meta This option is deprecated, since now meta informa‐
2211 tion is included in generic verification header
2212 and meta verification happens by default. For
2213 detailed information see the description of the
2214 verify setting. This option is kept only for
2215 compatibility with old configurations. Do
2216 not use it.
2217
2218 pattern
2219 Verify a strict pattern. Normally fio includes a
2220 header with some basic information and checksum‐
2221 ming, but if this option is set, only the specific
2222 pattern set with verify_pattern is verified.
2223
2224 null Only pretend to verify. Useful for testing inter‐
2225 nals with `ioengine=null', not for much else.
2226
2227 This option can be used for repeated burn-in tests of a system
2228 to make sure that the written data is also correctly read back.
2229 If the data direction given is a read or random read, fio will
2230 assume that it should verify a previously written file. If the
2231 data direction includes any form of write, the verify will be of
2232 the newly written data.
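A minimal write-then-verify job might look like the sketch below. The checksum choice, file name and size are assumptions; because the data direction is a write, fio verifies the newly written data after the write phase.

```ini
; Sketch only: write a file, then read it back and check crc32c checksums.
[write-and-verify]
rw=write
bs=4k
size=64m
filename=/tmp/fio-verify.test
verify=crc32c
; run the verify phase after the write phase (this is the default)
do_verify=1
; stop at the first verification failure
verify_fatal=1
```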
2233
2234 verify_offset=int
2235 Swap the verification header with data somewhere else in the
2236 block before writing. It is swapped back before verifying.
2237
2238 verify_interval=int
2239 Write the verification header at a finer granularity than the
2240 blocksize. It will be written for chunks the size of ver‐
2241 ify_interval. This value should divide blocksize evenly.
2242
2243 verify_pattern=str
2244 If set, fio will fill the I/O buffers with this pattern. Fio
2245 defaults to filling with totally random bytes, but sometimes
2246 it's interesting to fill with a known pattern for I/O verifica‐
2247 tion purposes. Depending on the width of the pattern, fio will
2248 fill 1/2/3/4 bytes of the buffer at a time (it can be either a
2249 decimal or a hex number). The verify_pattern if larger than a
2250 32-bit quantity has to be a hex number that starts with either
2251 "0x" or "0X". Use with verify. Also, verify_pattern supports %o
2252 format, which means that for each block the offset will be written
2253 and then verified back, e.g.:
2254
2255 verify_pattern=%o
2256
2257 Or use a combination of everything:
2258
2259 verify_pattern=0xff%o"abcd"-12
2260
2261 verify_fatal=bool
2262 Normally fio will keep checking the entire contents before quit‐
2263 ting on a block verification failure. If this option is set, fio
2264 will exit the job on the first observed failure. Default: false.
2265
2266 verify_dump=bool
2267 If set, dump the contents of both the original data block and
2268 the data block we read off disk to files. This allows later
2269 analysis to inspect just what kind of data corruption occurred.
2270 Off by default.
2271
2272 verify_async=int
2273 Fio will normally verify I/O inline from the submitting thread.
2274 This option takes an integer describing how many async offload
2275 threads to create for I/O verification instead, causing fio to
2276 offload the duty of verifying I/O contents to one or more sepa‐
2277 rate threads. If using this offload option, even sync I/O
2278 engines can benefit from using an iodepth setting higher than 1,
2279 as it allows them to have I/O in flight while verifies are run‐
2280 ning. Defaults to 0 async threads, i.e. verification is not
2281 asynchronous.
2282
2283 verify_async_cpus=str
2284 Tell fio to set the given CPU affinity on the async I/O verifi‐
2285 cation threads. See cpus_allowed for the format used.
2286
2287 verify_backlog=int
2288 Fio will normally verify the written contents of a job that uti‐
2289 lizes verify once that job has completed. In other words, every‐
2290 thing is written then everything is read back and verified. You
2291 may want to verify continually instead for a variety of reasons.
2292 Fio stores the meta data associated with an I/O block in memory,
2293 so for large verify workloads, quite a bit of memory would be
2294 used up holding this meta data. If this option is enabled, fio
2295 will write only N blocks before verifying these blocks.
2296
2297 verify_backlog_batch=int
2298 Control how many blocks fio will verify if verify_backlog is
2299 set. If not set, will default to the value of verify_backlog
2300 (meaning the entire queue is read back and verified). If ver‐
2301 ify_backlog_batch is less than verify_backlog then not all
2302 blocks will be verified, if verify_backlog_batch is larger than
2303 verify_backlog, some blocks will be verified more than once.
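Continuous verification with a bounded backlog could be set up as in this sketch. The backlog size, file name and job name are assumptions; `verify_backlog_batch` is left unset so it defaults to the backlog size and every written block is verified once.

```ini
; Sketch only: verify after every 1024 written blocks instead of
; keeping metadata for the whole job in memory.
[backlog-verify]
rw=randwrite
bs=4k
size=256m
filename=/tmp/fio-backlog.test
verify=md5
verify_backlog=1024
```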
2304
2305 verify_state_save=bool
2306 When a job exits during the write phase of a verify workload,
2307 save its current state. This allows fio to replay up until that
2308 point, if the verify state is loaded for the verify read phase.
2309 The format of the filename is, roughly:
2310
2311 <type>-<jobname>-<jobindex>-verify.state.
2312
2313 <type> is "local" for a local run, "sock" for a client/server
2314 socket connection, and "ip" (192.168.0.1, for instance) for a
2315 networked client/server connection. Defaults to true.
2316
2317 verify_state_load=bool
2318 If a verify termination trigger was used, fio stores the current
2319 write state of each thread. This can be used at verification
2320 time so that fio knows how far it should verify. Without this
2321 information, fio will run a full verification pass, according to
2322 the settings in the job file used. Default: false.
2323
2324 trim_percentage=int
2325 Number of verify blocks to discard/trim.
2326
2327 trim_verify_zero=bool
2328 Verify that trim/discarded blocks are returned as zeros.
2329
2330 trim_backlog=int
Trim after this number of blocks are written.
2332
2333 trim_backlog_batch=int
2334 Trim this number of I/O blocks.
2335
2336 experimental_verify=bool
2337 Enable experimental verification.
2338
2339 Steady state
2340 steadystate=str:float, ss=str:float
2341 Define the criterion and limit for assessing steady state per‐
2342 formance. The first parameter designates the criterion whereas
2343 the second parameter sets the threshold. When the criterion
2344 falls below the threshold for the specified duration, the job
2345 will stop. For example, `iops_slope:0.1%' will direct fio to
2346 terminate the job when the least squares regression slope falls
2347 below 0.1% of the mean IOPS. If group_reporting is enabled this
2348 will apply to all jobs in the group. Below is the list of avail‐
2349 able steady state assessment criteria. All assessments are car‐
2350 ried out using only data from the rolling collection window.
2351 Threshold limits can be expressed as a fixed value or as a per‐
2352 centage of the mean in the collection window.
2353
2354 iops Collect IOPS data. Stop the job if all individual
2355 IOPS measurements are within the specified limit
2356 of the mean IOPS (e.g., `iops:2' means that all
2357 individual IOPS values must be within 2 of the
2358 mean, whereas `iops:0.2%' means that all individ‐
2359 ual IOPS values must be within 0.2% of the mean
2360 IOPS to terminate the job).
2361
2362 iops_slope
2363 Collect IOPS data and calculate the least squares
2364 regression slope. Stop the job if the slope falls
2365 below the specified limit.
2366
2367 bw Collect bandwidth data. Stop the job if all indi‐
2368 vidual bandwidth measurements are within the spec‐
2369 ified limit of the mean bandwidth.
2370
2371 bw_slope
2372 Collect bandwidth data and calculate the least
2373 squares regression slope. Stop the job if the
2374 slope falls below the specified limit.
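
The arithmetic behind the two families of criteria can be sketched in Python
as follows. This illustrates the description above and is not fio's own
implementation; the function names, the inclusive comparison for the mean
criterion, and the use of the absolute slope are assumptions made for the
example. The samples list stands for the per-second values in the rolling
window and must hold at least two entries for the slope.

```python
from statistics import mean

def within_mean_limit(samples, limit, percent=False):
    """iops/bw criterion: every sample lies within `limit` of the window mean."""
    avg = mean(samples)
    # A percentage limit is taken relative to the mean of the window.
    bound = avg * limit / 100.0 if percent else limit
    return all(abs(s - avg) <= bound for s in samples)

def least_squares_slope(samples):
    """Slope of the least squares line through (0, s0), (1, s1), ..."""
    n = len(samples)
    x_avg = (n - 1) / 2.0
    y_avg = mean(samples)
    num = sum((x - x_avg) * (y - y_avg) for x, y in enumerate(samples))
    den = sum((x - x_avg) ** 2 for x in range(n))
    return num / den

def slope_below_limit(samples, limit, percent=False):
    """iops_slope/bw_slope criterion: the fitted slope is below `limit`."""
    slope = abs(least_squares_slope(samples))  # absolute value: an assumption
    bound = mean(samples) * limit / 100.0 if percent else limit
    return slope < bound
```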
2375
2376 steadystate_duration=time, ss_dur=time
2377 A rolling window of this duration will be used to judge whether
2378 steady state has been reached. Data will be collected once per
2379 second. The default is 0 which disables steady state detection.
2380 When the unit is omitted, the value is interpreted in seconds.
2381
2382 steadystate_ramp_time=time, ss_ramp=time
2383 Allow the job to run for the specified duration before beginning
2384 data collection for checking the steady state job termination
2385 criterion. The default is 0. When the unit is omitted, the value
2386 is interpreted in seconds.
2387
2388 Measurements and reporting
2389 per_job_logs=bool
If set, this generates bw/clat/iops logs with private, per-job
filenames. If not set, jobs with identical names will share the
log filename. Default: true.
2393
2394 group_reporting
2395 It may sometimes be interesting to display statistics for groups
2396 of jobs as a whole instead of for each individual job. This is
2397 especially true if numjobs is used; looking at individual
2398 thread/process output quickly becomes unwieldy. To see the final
2399 report per-group instead of per-job, use group_reporting. Jobs
in a file will be part of the same reporting group, unless
separated by a stonewall, or by using new_group.
2402
2403 new_group
2404 Start a new reporting group. See: group_reporting. If not given,
2405 all jobs in a file will be part of the same reporting group,
2406 unless separated by a stonewall.
2407
2408 stats=bool
By default, fio collects and shows final output results for all
jobs that run. If this option is set to 0, then fio will ignore
this job in the final stat output.
2412
2413 write_bw_log=str
If given, write a bandwidth log for this job. Can be used to
record the bandwidth of the job over its lifetime.
2416
2417 If no str argument is given, the default filename of `job‐
2418 name_type.x.log' is used. Even when the argument is given, fio
2419 will still append the type of log. So if one specifies:
2420
2421 write_bw_log=foo
2422
2423 The actual log name will be `foo_bw.x.log' where `x' is the
2424 index of the job (1..N, where N is the number of jobs). If
2425 per_job_logs is false, then the filename will not include the
`.x' job index.
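
The naming rule can be summarised with a small Python sketch. The helper
log_filename is hypothetical; it only encodes the pattern described here
(prefix, log type, optional job index) and applies equally to the latency,
histogram, and IOPS logs described below.

```python
def log_filename(prefix, log_type, job_index, per_job_logs=True):
    """Return the log name fio is described as using for a given prefix.

    log_type is the suffix fio appends, e.g. "bw", "slat", "clat", "lat",
    "hist" or "iops". job_index counts from 1.
    """
    if per_job_logs:
        return f"{prefix}_{log_type}.{job_index}.log"
    # With per_job_logs disabled the job index is left out.
    return f"{prefix}_{log_type}.log"
```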
2427
2428 The included fio_generate_plots script uses gnuplot to turn
2429 these text files into nice graphs. See the LOG FILE FORMATS sec‐
2430 tion for how data is structured within the file.
2431
2432 write_lat_log=str
2433 Same as write_bw_log, except this option creates I/O submission
2434 (e.g., `name_slat.x.log'), completion (e.g., `name_clat.x.log'),
2435 and total (e.g., `name_lat.x.log') latency files instead. See
2436 write_bw_log for details about the filename format and the LOG
2437 FILE FORMATS section for how data is structured within the
2438 files.
2439
2440 write_hist_log=str
2441 Same as write_bw_log but writes an I/O completion latency his‐
2442 togram file (e.g., `name_hist.x.log') instead. Note that this
2443 file will be empty unless log_hist_msec has also been set. See
2444 write_bw_log for details about the filename format and the LOG
2445 FILE FORMATS section for how data is structured within the file.
2446
2447 write_iops_log=str
2448 Same as write_bw_log, but writes an IOPS file (e.g.
2449 `name_iops.x.log') instead. See write_bw_log for details about
2450 the filename format and the LOG FILE FORMATS section for how
2451 data is structured within the file.
2452
2453 log_avg_msec=int
2454 By default, fio will log an entry in the iops, latency, or bw
log for every I/O that completes. When the log is written to disk,
it can quickly grow to a very large size. Setting this option
makes fio average each log entry over the specified period
of time, reducing the resolution of the log. See log_max_value
2459 as well. Defaults to 0, logging all entries. Also see LOG FILE
2460 FORMATS section.
2461
2462 log_hist_msec=int
2463 Same as log_avg_msec, but logs entries for completion latency
2464 histograms. Computing latency percentiles from averages of
2465 intervals using log_avg_msec is inaccurate. Setting this option
2466 makes fio log histogram entries over the specified period of
2467 time, reducing log sizes for high IOPS devices while retaining
2468 percentile accuracy. See log_hist_coarseness and write_hist_log
2469 as well. Defaults to 0, meaning histogram logging is disabled.
2470
2471 log_hist_coarseness=int
2472 Integer ranging from 0 to 6, defining the coarseness of the res‐
2473 olution of the histogram logs enabled with log_hist_msec. For
2474 each increment in coarseness, fio outputs half as many bins.
2475 Defaults to 0, for which histogram logs contain 1216 latency
2476 bins. See LOG FILE FORMATS section.
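
Since each coarseness step halves the bin count, the number of bins per
histogram entry follows directly from the default of 1216. A minimal Python
sketch (the function name histogram_bins is illustrative only):

```python
def histogram_bins(coarseness):
    """Number of latency bins for a given log_hist_coarseness (0 to 6)."""
    if not 0 <= coarseness <= 6:
        raise ValueError("log_hist_coarseness must be between 0 and 6")
    # 1216 bins at coarseness 0, halved for every increment.
    return 1216 >> coarseness
```

For example, a coarseness of 6 leaves 19 bins.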
2477
2478 log_max_value=bool
2479 If log_avg_msec is set, fio logs the average over that window.
2480 If you instead want to log the maximum value, set this option to
2481 1. Defaults to 0, meaning that averaged values are logged.
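
The effect of log_avg_msec together with log_max_value can be pictured with
the Python sketch below. It is a simplified model, not fio's logging code:
the function name reduce_log, the (time_msec, value) tuple layout, and the
choice to stamp each reduced entry with the end of its window are all
assumptions made for the example.

```python
def reduce_log(entries, window_msec, use_max=False):
    """Collapse (time_msec, value) entries into one entry per window."""
    if window_msec <= 0:
        return list(entries)  # 0 means every entry is logged as-is
    windows = {}
    for time_msec, value in entries:
        windows.setdefault(time_msec // window_msec, []).append(value)
    reduced = []
    for index in sorted(windows):
        values = windows[index]
        # log_max_value picks the maximum instead of the average.
        result = max(values) if use_max else sum(values) / len(values)
        reduced.append(((index + 1) * window_msec, result))
    return reduced
```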
2482
2483 log_offset=bool
If this is set, the I/O log entries will include the byte offset
of the I/O as well as the other data values. Defaults to
2486 0 meaning that offsets are not present in logs. Also see LOG
2487 FILE FORMATS section.
2488
2489 log_compression=int
2490 If this is set, fio will compress the I/O logs as it goes, to
2491 keep the memory footprint lower. When a log reaches the speci‐
2492 fied size, that chunk is removed and compressed in the back‐
2493 ground. Given that I/O logs are fairly highly compressible, this
2494 yields a nice memory savings for longer runs. The downside is
2495 that the compression will consume some background CPU cycles, so
2496 it may impact the run. This, however, is also true if the log‐
2497 ging ends up consuming most of the system memory. So pick your
2498 poison. The I/O logs are saved normally at the end of a run, by
2499 decompressing the chunks and storing them in the specified log
2500 file. This feature depends on the availability of zlib.
2501
2502 log_compression_cpus=str
2503 Define the set of CPUs that are allowed to handle online log
2504 compression for the I/O jobs. This can provide better isolation
2505 between performance sensitive jobs, and background compression
2506 work. See cpus_allowed for the format used.
2507
2508 log_store_compressed=bool
2509 If set, fio will store the log files in a compressed format.
2510 They can be decompressed with fio, using the --inflate-log com‐
2511 mand line parameter. The files will be stored with a `.fz' suf‐
2512 fix.
2513
2514 log_unix_epoch=bool
2515 If set, fio will log Unix timestamps to the log files produced
2516 by enabling write_type_log for each log type, instead of the
2517 default zero-based timestamps.
2518
2519 block_error_percentiles=bool
2520 If set, record errors in trim block-sized units from writes and
2521 trims and output a histogram of how many trims it took to get to
2522 errors, and what kind of error was encountered.
2523
2524 bwavgtime=int
2525 Average the calculated bandwidth over the given time. Value is
2526 specified in milliseconds. If the job also does bandwidth log‐
2527 ging through write_bw_log, then the minimum of this option and
2528 log_avg_msec will be used. Default: 500ms.
2529
2530 iopsavgtime=int
2531 Average the calculated IOPS over the given time. Value is speci‐
2532 fied in milliseconds. If the job also does IOPS logging through
2533 write_iops_log, then the minimum of this option and log_avg_msec
2534 will be used. Default: 500ms.
2535
2536 disk_util=bool
2537 Generate disk utilization statistics, if the platform supports
2538 it. Default: true.
2539
2540 disable_lat=bool
2541 Disable measurements of total latency numbers. Useful only for
2542 cutting back the number of calls to gettimeofday(2), as that
2543 does impact performance at really high IOPS rates. Note that to
2544 really get rid of a large amount of these calls, this option
2545 must be used with disable_slat and disable_bw_measurement as
2546 well.
2547
2548 disable_clat=bool
2549 Disable measurements of completion latency numbers. See dis‐
2550 able_lat.
2551
2552 disable_slat=bool
2553 Disable measurements of submission latency numbers. See dis‐
2554 able_lat.
2555
2556 disable_bw_measurement=bool, disable_bw=bool
2557 Disable measurements of throughput/bandwidth numbers. See dis‐
2558 able_lat.
2559
2560 clat_percentiles=bool
2561 Enable the reporting of percentiles of completion latencies.
2562 This option is mutually exclusive with lat_percentiles.
2563
2564 lat_percentiles=bool
2565 Enable the reporting of percentiles of I/O latencies. This is
2566 similar to clat_percentiles, except that this includes the sub‐
2567 mission latency. This option is mutually exclusive with
2568 clat_percentiles.
2569
2570 percentile_list=float_list
Override the default list of percentiles for completion
latencies and the block error histogram. Each number is a
floating-point number in the range (0,100], and the maximum length of the list
2574 is 20. Use ':' to separate the numbers, and list the numbers in
2575 ascending order. For example, `--percentile_list=99.5:99.9' will
2576 cause fio to report the values of completion latency below which
2577 99.5% and 99.9% of the observed latencies fell, respectively.
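
The constraints on the list (':' separators, values in (0,100], at most 20
entries, ascending order) can be checked before a job is launched. The Python
sketch below is one way to do that; parse_percentile_list is a hypothetical
helper and fio's own parser may report errors differently.

```python
def parse_percentile_list(text):
    """Parse a percentile_list value such as '99.5:99.9' into floats."""
    values = [float(item) for item in text.split(":")]
    if len(values) > 20:
        raise ValueError("at most 20 percentiles are allowed")
    for value in values:
        # The range is open at 0 and closed at 100.
        if not 0 < value <= 100:
            raise ValueError(f"percentile out of range (0,100]: {value}")
    if values != sorted(values):
        raise ValueError("percentiles must be listed in ascending order")
    return values
```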
2578
2579 significant_figures=int
2580 If using --output-format of `normal', set the significant fig‐
2581 ures to this value. Higher values will yield more precise IOPS
2582 and throughput units, while lower values will round. Requires a
2583 minimum value of 1 and a maximum value of 10. Defaults to 4.
2584
2585 Error handling
2586 exitall_on_error
2587 When one job finishes in error, terminate the rest. The default
2588 is to wait for each job to finish.
2589
2590 continue_on_error=str
2591 Normally fio will exit the job on the first observed failure. If
2592 this option is set, fio will continue the job when there is a
2593 'non-fatal error' (EIO or EILSEQ) until the runtime is exceeded
2594 or the I/O size specified is completed. If this option is used,
2595 there are two more stats that are appended, the total error
2596 count and the first error. The error field given in the stats is
2597 the first error that was hit during the run. The allowed values
2598 are:
2599
2600 none Exit on any I/O or verify errors.
2601
2602 read Continue on read errors, exit on all others.
2603
2604 write Continue on write errors, exit on all others.
2605
2606 io Continue on any I/O error, exit on all others.
2607
2608 verify Continue on verify errors, exit on all others.
2609
2610 all Continue on all errors.
2611
2612 0 Backward-compatible alias for 'none'.
2613
2614 1 Backward-compatible alias for 'all'.
2615
2616 ignore_error=str
Sometimes you want to ignore some errors during a test. In that
case you can specify an error list for each error type, instead
of only being able to ignore the default 'non-fatal error' using
continue_on_error. The format is:
`ignore_error=READ_ERR_LIST,WRITE_ERR_LIST,VERIFY_ERR_LIST'
Errors for a given error type are separated with ':'. An error
may be a symbol ('ENOSPC', 'ENOMEM') or an integer. Example:
2624
2625 ignore_error=EAGAIN,ENOSPC:122
2626
2627 This option will ignore EAGAIN from READ, and ENOSPC and
2628 122(EDQUOT) from WRITE. This option works by overriding con‐
2629 tinue_on_error with the list of errors for each error type if
2630 any.
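
To make the layout of the value concrete, the Python sketch below splits an
ignore_error string into its read, write, and verify lists. The function name
parse_ignore_error is hypothetical and the code only mirrors the syntax
described above; it does not resolve symbols to errno numbers.

```python
def parse_ignore_error(value):
    """Split 'READ_LIST,WRITE_LIST,VERIFY_LIST' into per-type error lists."""
    kinds = ("read", "write", "verify")
    parsed = {kind: [] for kind in kinds}
    # Lists are comma separated; errors inside a list are ':' separated.
    for kind, group in zip(kinds, value.split(",")):
        for token in group.split(":"):
            if not token:
                continue
            # Errors may be symbolic names or plain integers.
            parsed[kind].append(int(token) if token.isdigit() else token)
    return parsed
```

With the example value, the read list holds EAGAIN, the write list holds
ENOSPC and 122, and the verify list stays empty.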
2631
2632 error_dump=bool
If set, dump every error even if it is non-fatal. If disabled,
only fatal errors will be dumped. Default: true.
2635
2636 Running predefined workloads
2637 Fio includes predefined profiles that mimic the I/O workloads generated
2638 by other tools.
2639
2640 profile=str
2641 The predefined workload to run. Current profiles are:
2642
2643 tiobench
2644 Threaded I/O bench (tiotest/tiobench) like work‐
2645 load.
2646
2647 act Aerospike Certification Tool (ACT) like workload.
2648
2649 To view a profile's additional options use --cmdhelp after specifying
2650 the profile. For example:
2651
2652 $ fio --profile=act --cmdhelp
2653
2654 Act profile options
2655 device-names=str
2656 Devices to use.
2657
2658 load=int
2659 ACT load multiplier. Default: 1.
2660
2661 test-duration=time
2662 How long the entire test takes to run. When the unit is omitted,
2663 the value is given in seconds. Default: 24h.
2664
2665 threads-per-queue=int
2666 Number of read I/O threads per device. Default: 8.
2667
2668 read-req-num-512-blocks=int
Number of 512B blocks to read at a time. Default: 3.
2670
2671 large-block-op-kbytes=int
2672 Size of large block ops in KiB (writes). Default: 131072.
2673
2674 prep Set to run ACT prep phase.
2675
2676 Tiobench profile options
2677 size=str
2678 Size in MiB.
2679
2680 block=int
2681 Block size in bytes. Default: 4096.
2682
2683 numruns=int
2684 Number of runs.
2685
2686 dir=str
2687 Test directory.
2688
2689 threads=int
2690 Number of threads.
2691
2693 Fio spits out a lot of output. While running, fio will display the sta‐
2694 tus of the jobs created. An example of that would be:
2695
2696 Jobs: 1 (f=1): [_(1),M(1)][24.8%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 01m:31s]
2697
2698 The characters inside the first set of square brackets denote the cur‐
2699 rent status of each thread. The first character is the first job
2700 defined in the job file, and so forth. The possible values (in typical
2701 life cycle order) are:
2702
2703 P Thread setup, but not started.
2704 C Thread created.
2705 I Thread initialized, waiting or generating necessary data.
2706 p Thread running pre-reading file(s).
2707 / Thread is in ramp period.
2708 R Running, doing sequential reads.
2709 r Running, doing random reads.
2710 W Running, doing sequential writes.
2711 w Running, doing random writes.
2712 M Running, doing mixed sequential reads/writes.
2713 m Running, doing mixed random reads/writes.
2714 D Running, doing sequential trims.
2715 d Running, doing random trims.
2716 F Running, currently waiting for fsync(2).
2717 V Running, doing verification of written data.
2718 f Thread finishing.
2719 E Thread exited, not reaped by main thread yet.
2720 - Thread reaped.
2721 X Thread reaped, exited with an error.
2722 K Thread reaped, exited due to signal.
2723
Fio will condense the thread string so as not to take up more space on the
2725 command line than needed. For instance, if you have 10 readers and 10
2726 writers running, the output would look like this:
2727
2728 Jobs: 20 (f=20): [R(10),W(10)][4.0%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 57m:36s]
2729
2730 Note that the status string is displayed in order, so it's possible to
2731 tell which of the jobs are currently doing what. In the example above
this means that jobs 1-10 are readers and 11-20 are writers.
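
When post-processing status lines it can help to undo the condensing, so that
position N in the string corresponds to job N. The Python sketch below assumes
every group is written as a status character followed by a count in
parentheses, as in the examples shown; expand_thread_string is a hypothetical
helper.

```python
import re

def expand_thread_string(condensed):
    """Expand 'R(10),W(10)' into one status character per job."""
    expanded = []
    # Each group is a single status character followed by '(count)'.
    for state, count in re.findall(r"(.)\((\d+)\)", condensed):
        expanded.append(state * int(count))
    return "".join(expanded)
```

For the example above, index 0 to 9 of the result are `R' and index 10 to 19
are `W'.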
2733
2734 The other values are fairly self explanatory -- number of threads cur‐
2735 rently running and doing I/O, the number of currently open files (f=),
2736 the estimated completion percentage, the rate of I/O since last check
2737 (read speed listed first, then write speed and optionally trim speed)
2738 in terms of bandwidth and IOPS, and time to completion for the current
2739 running group. It's impossible to estimate runtime of the following
2740 groups (if any).
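
For scripts that watch a running job, the status line can be split into its
parts with a regular expression. The Python sketch below is written against
the two example lines shown in this section only; the pattern and the name
parse_status_line are assumptions, and lines that include trim rates or other
variations may need a looser pattern.

```python
import re

STATUS_RE = re.compile(
    r"Jobs: (?P<running>\d+) \(f=(?P<open_files>\d+)\): "
    r"\[(?P<threads>[^\]]*)\]\[(?P<percent>[\d.]+)%\]"
    r"\[(?P<bandwidth>[^\]]*)\]\[(?P<iops>[^\]]*) IOPS\]"
    r"\[eta (?P<eta>[^\]]*)\]"
)

def parse_status_line(line):
    """Return the fields of a status line as a dict, or None if no match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; the rest stay as strings.
    fields["running"] = int(fields["running"])
    fields["open_files"] = int(fields["open_files"])
    fields["percent"] = float(fields["percent"])
    return fields
```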
2741
2742 When fio is done (or interrupted by Ctrl-C), it will show the data for
2743 each thread, group of threads, and disks in that order. For each over‐
2744 all thread (or group) the output looks like:
2745
2746 Client1: (groupid=0, jobs=1): err= 0: pid=16109: Sat Jun 24 12:07:54 2017
2747 write: IOPS=88, BW=623KiB/s (638kB/s)(30.4MiB/50032msec)
2748 slat (nsec): min=500, max=145500, avg=8318.00, stdev=4781.50
2749 clat (usec): min=170, max=78367, avg=4019.02, stdev=8293.31
2750 lat (usec): min=174, max=78375, avg=4027.34, stdev=8291.79
2751 clat percentiles (usec):
2752 | 1.00th=[ 302], 5.00th=[ 326], 10.00th=[ 343], 20.00th=[ 363],
2753 | 30.00th=[ 392], 40.00th=[ 404], 50.00th=[ 416], 60.00th=[ 445],
2754 | 70.00th=[ 816], 80.00th=[ 6718], 90.00th=[12911], 95.00th=[21627],
2755 | 99.00th=[43779], 99.50th=[51643], 99.90th=[68682], 99.95th=[72877],
2756 | 99.99th=[78119]
2757 bw ( KiB/s): min= 532, max= 686, per=0.10%, avg=622.87, stdev=24.82, samples= 100
2758 iops : min= 76, max= 98, avg=88.98, stdev= 3.54, samples= 100
2759 lat (usec) : 250=0.04%, 500=64.11%, 750=4.81%, 1000=2.79%
2760 lat (msec) : 2=4.16%, 4=1.84%, 10=4.90%, 20=11.33%, 50=5.37%
2761 lat (msec) : 100=0.65%
2762 cpu : usr=0.27%, sys=0.18%, ctx=12072, majf=0, minf=21
2763 IO depths : 1=85.0%, 2=13.1%, 4=1.8%, 8=0.1%, 16=0.0%, 32=0.0%, >=64=0.0%
2764 submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
2765 complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
2766 issued rwt: total=0,4450,0, short=0,0,0, dropped=0,0,0
2767 latency : target=0, window=0, percentile=100.00%, depth=8
2768
2769 The job name (or first job's name when using group_reporting) is
2770 printed, along with the group id, count of jobs being aggregated, last
2771 error id seen (which is 0 when there are no errors), pid/tid of that
2772 thread and the time the job/group completed. Below are the I/O statis‐
2773 tics for each data direction performed (showing writes in the example
2774 above). In the order listed, they denote:
2775
2776 read/write/trim
2777 The string before the colon shows the I/O direction the
2778 statistics are for. IOPS is the average I/Os performed
2779 per second. BW is the average bandwidth rate shown as:
2780 value in power of 2 format (value in power of 10 format).
2781 The last two values show: (total I/O performed in power
2782 of 2 format / runtime of that thread).
2783
2784 slat Submission latency (min being the minimum, max being the
2785 maximum, avg being the average, stdev being the standard
2786 deviation). This is the time it took to submit the I/O.
2787 For sync I/O this row is not displayed as the slat is
2788 really the completion latency (since queue/complete is
2789 one operation there). This value can be in nanoseconds,
microseconds or milliseconds; fio will choose the most
2791 appropriate base and print that (in the example above
2792 nanoseconds was the best scale). Note: in --minimal mode
2793 latencies are always expressed in microseconds.
2794
2795 clat Completion latency. Same names as slat, this denotes the
2796 time from submission to completion of the I/O pieces. For
2797 sync I/O, clat will usually be equal (or very close) to
2798 0, as the time from submit to complete is basically just
2799 CPU time (I/O has already been done, see slat explana‐
2800 tion).
2801
2802 lat Total latency. Same names as slat and clat, this denotes
2803 the time from when fio created the I/O unit to completion
2804 of the I/O operation.
2805
2806 bw Bandwidth statistics based on samples. Same names as the
2807 xlat stats, but also includes the number of samples taken
2808 (samples) and an approximate percentage of total aggre‐
2809 gate bandwidth this thread received in its group (per).
2810 This last value is only really useful if the threads in
2811 this group are on the same disk, since they are then com‐
2812 peting for disk access.
2813
2814 iops IOPS statistics based on samples. Same names as bw.
2815
2816 lat (nsec/usec/msec)
The distribution of I/O completion latencies. This is the
time from when I/O leaves fio to when it gets completed.
2819 Unlike the separate read/write/trim sections above, the
2820 data here and in the remaining sections apply to all I/Os
2821 for the reporting group. 250=0.04% means that 0.04% of
2822 the I/Os completed in under 250us. 500=64.11% means that
2823 64.11% of the I/Os required 250 to 499us for completion.
2824
cpu CPU usage. User and system time, along with the number of
context switches this thread went through, and finally the
number of major and minor page faults. The CPU utilization
numbers are averages for the jobs in that reporting group,
while the context and fault counters are summed.
2831
2832 IO depths
2833 The distribution of I/O depths over the job lifetime. The
2834 numbers are divided into powers of 2 and each entry cov‐
2835 ers depths from that value up to those that are lower
2836 than the next entry -- e.g., 16= covers depths from 16 to
2837 31. Note that the range covered by a depth distribution
2838 entry can be different to the range covered by the equiv‐
2839 alent submit/complete distribution entry.
2840
2841 IO submit
How many pieces of I/O were submitted in a single submit
call. Each entry denotes that amount and below, until the
previous entry -- e.g., 16=100% means that we submitted
anywhere between 9 and 16 I/Os per submit call. Note that
2846 the range covered by a submit distribution entry can be
2847 different to the range covered by the equivalent depth
2848 distribution entry.
2849
2850 IO complete
2851 Like the above submit number, but for completions
2852 instead.
2853
2854 IO issued rwt
2855 The number of read/write/trim requests issued, and how
2856 many of them were short or dropped.
2857
2858 IO latency
2859 These values are for latency_target and related options.
2860 When these options are engaged, this section describes
2861 the I/O depth required to meet the specified latency tar‐
2862 get.
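
Two pieces of arithmetic from the field descriptions above can be checked
with a short Python sketch: the power-of-2 to power-of-10 bandwidth
conversion, and the differing ranges covered by depth and submit/complete
entries. The helper names are illustrative, and the entry lists are copied
from the sample output rather than from fio's source.

```python
DEPTH_ENTRIES = [1, 2, 4, 8, 16, 32, 64]   # "IO depths" row; 64 means >=64
SUBMIT_ENTRIES = [0, 4, 8, 16, 32, 64]     # "submit"/"complete" rows

def kib_to_kb(kib_per_sec):
    """Convert KiB/s (power of 2) to kB/s (power of 10)."""
    return kib_per_sec * 1024 / 1000

def depth_range(entry):
    """A depth entry covers from its value up to just below the next entry."""
    index = DEPTH_ENTRIES.index(entry)
    if index == len(DEPTH_ENTRIES) - 1:
        return (entry, None)  # the last entry is open-ended
    return (entry, DEPTH_ENTRIES[index + 1] - 1)

def submit_range(entry):
    """A submit entry covers from just above the previous entry to its value."""
    index = SUBMIT_ENTRIES.index(entry)
    if index == 0:
        return (0, 0)
    return (SUBMIT_ENTRIES[index - 1] + 1, entry)
```

In the sample output, 623KiB/s converts to roughly 638kB/s, the depth entry
16 covers 16 to 31, and the submit entry 16 covers 9 to 16.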
2863
2864 After each client has been listed, the group statistics are printed.
2865 They will look like this:
2866
2867 Run status group 0 (all jobs):
2868 READ: bw=20.9MiB/s (21.9MB/s), 10.4MiB/s-10.8MiB/s (10.9MB/s-11.3MB/s), io=64.0MiB (67.1MB), run=2973-3069msec
2869 WRITE: bw=1231KiB/s (1261kB/s), 616KiB/s-621KiB/s (630kB/s-636kB/s), io=64.0MiB (67.1MB), run=52747-53223msec
2870
2871 For each data direction it prints:
2872
2873 bw Aggregate bandwidth of threads in this group followed by
2874 the minimum and maximum bandwidth of all the threads in
2875 this group. Values outside of brackets are power-of-2
2876 format and those within are the equivalent value in a
2877 power-of-10 format.
2878
2879 io Aggregate I/O performed of all threads in this group. The
2880 format is the same as bw.
2881
run The shortest and longest runtimes of the threads in this
2883 group.
2884
2885 And finally, the disk statistics are printed. This is Linux specific.
2886 They will look like this:
2887
2888 Disk stats (read/write):
2889 sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
2890
2891 Each value is printed for both reads and writes, with reads first. The
2892 numbers denote:
2893
2894 ios Number of I/Os performed by all groups.
2895
2896 merge Number of merges performed by the I/O scheduler.
2897
2898 ticks Number of ticks we kept the disk busy.
2899
2900 in_queue
2901 Total time spent in the disk queue.
2902
2903 util The disk utilization. A value of 100% means we kept the
2904 disk busy constantly, 50% would be a disk idling half of
2905 the time.
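
A disk statistics line of the form shown above can be turned into a
dictionary with a few lines of Python. This is a sketch based on the single
example line; parse_disk_stats is a hypothetical helper and the key names are
taken verbatim from the output.

```python
import re

def parse_disk_stats(line):
    """Parse 'sda: ios=R/W, merge=R/W, ticks=R/W, in_queue=N, util=P%'."""
    name, _, rest = line.partition(":")
    stats = {"disk": name.strip()}
    for key, value in re.findall(r"(\w+)=([^,\s]+)", rest):
        if "/" in value:
            # Paired values list reads first, then writes.
            read, write = value.split("/")
            stats[key] = {"read": int(read), "write": int(write)}
        elif value.endswith("%"):
            stats[key] = float(value[:-1])
        else:
            stats[key] = int(value)
    return stats
```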
2906
2907 It is also possible to get fio to dump the current output while it is
2908 running, without terminating the job. To do that, send fio the USR1
2909 signal. You can also get regularly timed dumps by using the --sta‐
2910 tus-interval parameter, or by creating a file in `/tmp' named
2911 `fio-dump-status'. If fio sees this file, it will unlink it and dump
2912 the current output status.
2913
2915 For scripted usage where you typically want to generate tables or
2916 graphs of the results, fio can output the results in a semicolon sepa‐
2917 rated format. The format is one long line of values, such as:
2918
2919 2;card0;0;0;7139336;121836;60004;1;10109;27.932460;116.933948;220;126861;3495.446807;1085.368601;226;126864;3523.635629;1089.012448;24063;99944;50.275485%;59818.274627;5540.657370;7155060;122104;60004;1;8338;29.086342;117.839068;388;128077;5032.488518;1234.785715;391;128085;5061.839412;1236.909129;23436;100928;50.287926%;59964.832030;5644.844189;14.595833%;19.394167%;123706;0;7313;0.1%;0.1%;0.1%;0.1%;0.1%;0.1%;100.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.02%;0.05%;0.16%;6.04%;40.40%;52.68%;0.64%;0.01%;0.00%;0.01%;0.00%;0.00%;0.00%;0.00%;0.00%
2920 A description of this job goes here.
2921
2922 The job description (if provided) follows on a second line.
2923
2924 To enable terse output, use the --minimal or `--output-format=terse'
2925 command line options. The first value is the version of the terse out‐
2926 put format. If the output has to be changed for some reason, this num‐
2927 ber will be incremented by 1 to signify that change.
2928
2929 Split up, the format is as follows (comments in brackets denote when a
2930 field was introduced or whether it's specific to some terse version):
2931
2932 terse version, fio version [v3], jobname, groupid, error
2933
2934 READ status:
2935
2936 Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
2937 Submission latency: min, max, mean, stdev (usec)
2938 Completion latency: min, max, mean, stdev (usec)
2939 Completion latency percentiles: 20 fields (see below)
2940 Total latency: min, max, mean, stdev (usec)
2941 Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
2942 IOPS [v5]: min, max, mean, stdev, number of samples
2943
2944 WRITE status:
2945
2946 Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
2947 Submission latency: min, max, mean, stdev (usec)
2948 Completion latency: min, max, mean, stdev (usec)
2949 Completion latency percentiles: 20 fields (see below)
2950 Total latency: min, max, mean, stdev (usec)
2951 Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
2952 IOPS [v5]: min, max, mean, stdev, number of samples
2953
2954 TRIM status [all but version 3]:
2955
2956 Fields are similar to READ/WRITE status.
2957
2958 CPU usage:
2959
2960 user, system, context switches, major faults, minor faults
2961
2962 I/O depths:
2963
2964 <=1, 2, 4, 8, 16, 32, >=64
2965
2966 I/O latencies microseconds:
2967
2968 <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000
2969
2970 I/O latencies milliseconds:
2971
2972 <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000, >=2000
2973
2974 Disk utilization [v3]:
2975
2976 disk name, read ios, write ios, read merges, write merges, read ticks, write ticks, time spent in queue, disk utilization percentage
2977
2978 Additional Info (dependent on continue_on_error, default off):
2979
2980 total # errors, first error code
2981
2982 Additional Info (dependent on description being set):
2983
2984 Text description
2985
2986 Completion latency percentiles can be a grouping of up to 20 sets, so
2987 for the terse output fio writes all of them. Each field will look like
2988 this:
2989
2990 1.00%=6112
2991
2992 which is the Xth percentile, and the `usec' latency associated with it.
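
A percentile field of this shape can be split into its two parts as sketched
below in Python. The helper name parse_percentile_field is hypothetical.

```python
def parse_percentile_field(field):
    """Split '1.00%=6112' into (percentile, latency in usec)."""
    percentile, _, latency = field.partition("=")
    # Drop the trailing '%' before converting the percentile.
    return float(percentile.rstrip("%")), int(latency)
```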
2993
2994 For Disk utilization, all disks used by fio are shown. So for each disk
2995 there will be a disk utilization section.
2996
2997 Below is a single line containing short names for each of the fields in
2998 the minimal output v3, separated by semicolons:
2999
3000 terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
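
The header line above can be paired with a line of terse output to label each
value. The Python sketch below assumes the output was produced in terse
version 3 so that the positions line up; label_terse_fields is a hypothetical
helper. Because zip stops at the shorter input, any trailing fields that have
no header name (further disk sections, error information, or the
description) are dropped.

```python
def label_terse_fields(header, line):
    """Map semicolon-separated header names onto the values of one line."""
    names = header.strip().split(";")
    values = line.strip().split(";")
    # Values stay as strings; convert them as needed by the caller.
    return dict(zip(names, values))
```

A caller would pass the full header line as the first argument and then look
up entries such as `read_iops' or `write_clat_mean' by name.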
3001
3003 The json output format is intended to be both human readable and conve‐
3004 nient for automated parsing. For the most part its sections mirror
3005 those of the normal output. The runtime value is reported in msec and
3006 the bw value is reported in 1024 bytes per second units.
3007
3009 The json+ output format is identical to the json output format except
3010 that it adds a full dump of the completion latency bins. Each bins
3011 object contains a set of (key, value) pairs where keys are latency
3012 durations and values count how many I/Os had completion latencies of
3013 the corresponding duration. For example, consider:
3014
3015 "bins" : { "87552" : 1, "89600" : 1, "94720" : 1, "96768" : 1,
3016 "97792" : 1, "99840" : 1, "100864" : 2, "103936" : 6, "104960" :
3017 534, "105984" : 5995, "107008" : 7529, ... }
3018
3019 This data indicates that one I/O required 87,552ns to complete, two
3020 I/Os required 100,864ns to complete, and 7529 I/Os required 107,008ns
3021 to complete.
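
A bins object can be reduced to an approximate percentile by walking the
durations in ascending order and accumulating counts. The Python sketch below
shows the idea; percentile_from_bins is a hypothetical helper, and the result
is only as precise as the bin widths allow.

```python
def percentile_from_bins(bins, percentile):
    """Return the bin duration (ns) at or below which `percentile`% of I/Os fall."""
    # Keys are strings in the JSON output, so convert them before sorting.
    pairs = sorted((int(duration), count) for duration, count in bins.items())
    if not pairs:
        raise ValueError("bins object is empty")
    total = sum(count for _, count in pairs)
    threshold = total * percentile / 100.0
    running = 0
    for duration, count in pairs:
        running += count
        if running >= threshold:
            return duration
    return pairs[-1][0]
```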
3022
3023 Also included with fio is a Python script fio_jsonplus_clat2csv that
3024 takes json+ output and generates CSV-formatted latency data suitable
3025 for plotting.
3026
3027 The latency durations actually represent the midpoints of latency
3028 intervals. For details refer to `stat.h' in the fio source.
3029
3031 There are two trace file formats that you may encounter. The older (v1)
3032 format has been unsupported since version 1.20-rc3 (March 2008). It is
3033 still described below in case you get an old trace and want to
3034 understand it.
3035
3036 In any case the trace is a simple text file with a single action per
3037 line.
3038
3039 Trace file format v1
3040 Each line represents a single I/O action in the following for‐
3041 mat:
3042
3043 rw, offset, length
3044
3045 where `rw=0/1' for read/write, and the `offset' and `length'
3046 entries being in bytes.
3047
3048 This format is not supported in fio versions >= 1.20-rc3.
3049
3050 Trace file format v2
3051 The second version of the trace file format was added in fio
3052 version 1.17. It allows access to more than one file per trace
3053 and has a larger set of possible file actions.
3054
3055 The first line of the trace file has to be:
3056
3057 "fio version 2 iolog"
3058
3059 Following this can be lines in two different formats, which are
3060 described below.
3061
3062 The file management format:
3063 filename action
3064
3065 The `filename' is given as an absolute path. The `action'
3066 can be one of these:
3067
3068 add Add the given `filename' to the trace.
3069
3070 open Open the file with the given `filename'.
3071 The `filename' has to have been added with
3072 the add action before.
3073
3074 close Close the file with the given `filename'.
3075 The file has to have been opened before.
3076
3077 The file I/O action format:
3078 filename action offset length
3079
3080 The `filename' is given as an absolute path, and has to
3081 have been added and opened before it can be used with
3082 this format. The `offset' and `length' are given in
3083 bytes. The `action' can be one of these:
3084
3085 wait Wait for `offset' microseconds. Everything
3086 below 100 is discarded. The time is rela‐
3087 tive to the previous `wait' statement.
3088
3089 read Read `length' bytes beginning from `off‐
3090 set'.
3091
3092 write Write `length' bytes beginning from `off‐
3093 set'.
3094
3095 sync fsync(2) the file.
3096
3097 datasync
3098 fdatasync(2) the file.
3099
3100 trim Trim the given file from the given `offset'
3101 for `length' bytes.
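
The v2 layout described above can be sketched with a small Python generator that emits the header line, the file management lines, and the file I/O action lines for a single file. The function name make_iolog_v2 and the path /tmp/data.bin are hypothetical and chosen only for illustration.

```python
# Sketch only: build the text of a v2 trace for a single file,
# following the layout described above.
IO_ACTIONS = {"wait", "read", "write", "sync", "datasync", "trim"}


def make_iolog_v2(path, ops):
    """Return v2 iolog text for one file.

    path: absolute file name (hypothetical in the example below).
    ops:  iterable of (action, offset, length) tuples.
    make_iolog_v2 is an illustrative helper, not part of fio.
    """
    lines = ["fio version 2 iolog", f"{path} add", f"{path} open"]
    for action, offset, length in ops:
        if action not in IO_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        lines.append(f"{path} {action} {offset} {length}")
    lines.append(f"{path} close")
    return "\n".join(lines) + "\n"


print(make_iolog_v2("/tmp/data.bin",
                    [("write", 0, 4096), ("read", 0, 4096)]), end="")
```

The file is added and opened before any I/O action and closed afterwards, matching the ordering rules stated for the file management format.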
3102
3104 In some cases, we want to understand the CPU overhead of a test. For
3105 example, we may test patches specifically to see whether they reduce
3106 CPU usage. Fio implements a balloon approach to create a thread per
3107 CPU that runs at idle priority, meaning that it only runs when nobody
3108 else needs the CPU. By measuring the amount of work completed by the
3109 thread, the idleness of each CPU can be derived accordingly.
3110
3111 A unit of work is defined as touching a full page of unsigned characters.
3112 The mean and standard deviation of the time to complete a unit of work are
3113 reported in the "unit work" section. Options can be chosen to report
3114 detailed per-CPU idleness or overall system idleness by aggregating
3115 per-CPU stats.
3116
3118 When data verification is done, fio is usually run in one of two ways.
3119 The first is a normal write job of some sort with verify enabled. When
3120 the write phase has completed, fio switches to reads and verifies
3121 everything it wrote. The second model is running just the write phase,
3122 and then later on running the same job (but with reads instead of
3123 writes) to repeat the same I/O patterns and verify the contents. Both
3124 of these methods depend on the write phase being completed, as fio oth‐
3125 erwise has no idea how much data was written.
3126
3127 With verification triggers, fio supports dumping the current write
3128 state to local files. Then a subsequent read verify workload can load
3129 this state and know exactly where to stop. This is useful for testing
3130 cases where power is cut to a server in a managed fashion, for
3131 instance.
3132
3133 A verification trigger consists of two things:
3134
3135 1) Storing the write state of each job.
3136
3137 2) Executing a trigger command.
3138
3139 The write state is relatively small, on the order of hundreds of bytes
3140 to single kilobytes. It contains information on the number of comple‐
3141 tions done, the last X completions, etc.
3142
3143 A trigger is invoked either through creation ('touch') of a specified
3144 file in the system, or through a timeout setting. If fio is run with
3145 `--trigger-file=/tmp/trigger-file', then it will continually check for
3146 the existence of `/tmp/trigger-file'. When it sees this file, it will
3147 fire off the trigger (thus saving state, and executing the trigger com‐
3148 mand).
3149
3150 For client/server runs, there's both a local and remote trigger. If fio
3151 is running as a server backend, it will send the job states back to the
3152 client for safe storage, then execute the remote trigger, if specified.
3153 If a local trigger is specified, the server will still send back the
3154 write state, but the client will then execute the trigger.
3155
3156 Verification trigger example
3157 Let's say we want to run a powercut test on the remote Linux
3158 machine 'server'. Our write workload is in `write-test.fio'. We
3159 want to cut power to 'server' at some point during the run, and
3160 we'll run this test from the safety of our local machine,
3161 'localbox'. On the server, we'll start the fio backend normally:
3162
3163 server# fio --server
3164
3165 and on the client, we'll fire off the workload:
3166
3167 localbox$ fio --client=server \
3168 --trigger-file=/tmp/my-trigger \
3169 --trigger-remote="bash -c \"echo b > /proc/sysrq-trigger\""
3170
3171 We set `/tmp/my-trigger' as the trigger file, and we tell fio to
3172 execute:
3173
3174 echo b > /proc/sysrq-trigger
3175
3176 on the server once it has received the trigger and sent us the
3177 write state. This will work, but it's not really cutting power
3178 to the server, it's merely abruptly rebooting it. If we have a
3179 remote way of cutting power to the server through IPMI or simi‐
3180 lar, we could do that through a local trigger command instead.
3181 Let's assume we have a script that does IPMI reboot of a given
3182 hostname, ipmi-reboot. On localbox, we could then have run fio
3183 with a local trigger instead:
3184
3185 localbox$ fio --client=server --trigger-file=/tmp/my-trigger \
3186 --trigger="ipmi-reboot server"
3187
3188 For this case, fio would wait for the server to send us the
3189 write state, then execute `ipmi-reboot server' when that hap‐
3190 pened.
3191
3192 Loading verify state
3193 To load stored write state, a read verification job file must
3194 contain the verify_state_load option. If that is set, fio will
3195 load the previously stored state. For a local fio run this is
3196 done by loading the files directly, and on a client/server run,
3197 the server backend will ask the client to send the files over
3198 and load them from there.
3199
3201 Fio supports a variety of log file formats, for logging latencies,
3202 bandwidth, and IOPS. The logs share a common format, which looks like
3203 this:
3204
3205 time (msec), value, data direction, block size (bytes), offset
3206 (bytes)
3207
3208 `Time' for the log entry is always in milliseconds. The `value' logged
3209 depends on the type of log; it will be one of the following:
3210
3211 Latency log
3212 Value is latency in nsecs
3213
3214 Bandwidth log
3215 Value is in KiB/sec
3216
3217 IOPS log
3218 Value is IOPS
3219
3220 `Data direction' is one of the following:
3221
3222 0 I/O is a READ
3223
3224 1 I/O is a WRITE
3225
3226 2 I/O is a TRIM
3227
3228 The entry's `block size' is always in bytes. The `offset' is the off‐
3229 set, in bytes, from the start of the file, for that particular I/O. The
3230 logging of the offset can be toggled with log_offset.
3231
3232 Fio defaults to logging every individual I/O. When IOPS are logged for
3233 individual I/Os the `value' entry will always be 1. If windowed logging
3234 is enabled through log_avg_msec, fio logs the average values over the
3235 specified period of time. If windowed logging is enabled and
3236 log_max_value is set, then fio logs maximum values in that window
3237 instead of averages. Since `data direction', `block size' and `offset'
3238 are per-I/O values, if windowed logging is enabled they aren't applica‐
3239 ble and will be 0.
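
To make the common log format concrete, here is a minimal Python sketch that parses one log line into named fields. The sample values used with it are invented for illustration, and parse_log_line is a hypothetical helper. The offset is treated as optional because its logging can be toggled with log_offset.

```python
# Sketch only: parse one line of the common log format described above.
from collections import namedtuple

LogEntry = namedtuple("LogEntry", "time_ms value direction block_size offset")

# Data direction codes as listed in the text.
DIRECTIONS = {0: "read", 1: "write", 2: "trim"}


def parse_log_line(line):
    """Parse 'time, value, data direction, block size[, offset]'.

    parse_log_line is an illustrative helper, not part of fio.
    The offset is None when the line does not carry that column.
    """
    fields = [int(field) for field in line.split(",")]
    time_ms, value, ddir, block_size = fields[:4]
    offset = fields[4] if len(fields) > 4 else None
    return LogEntry(time_ms, value, DIRECTIONS[ddir], block_size, offset)


# Invented sample line: a 4096-byte write at offset 8192, logged at 1000 msec.
print(parse_log_line("1000, 250000, 1, 4096, 8192"))
```

How `value' should be interpreted (nsecs, KiB/sec, or IOPS) depends on which log the line came from, so the sketch leaves it as a plain integer.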
3240
3242 Normally fio is invoked as a stand-alone application on the machine
3243 where the I/O workload should be generated. However, the backend and
3244 frontend of fio can be run separately, i.e., the fio server can generate
3245 an I/O workload on the "Device Under Test" while being controlled by a
3246 client on another machine.
3247
3248 Start the server on the machine which has access to the storage DUT:
3249
3250 $ fio --server=args
3251
3252 where `args' defines what fio listens to. The arguments are of the form
3253 `type:hostname-or-IP,port'. `type' is either `ip' (or `ip4') for TCP/IP
3254 v4, `ip6' for TCP/IP v6, or `sock' for a local unix domain socket.
3255 `hostname' is either a hostname or IP address, and `port' is the port
3256 to listen to (only valid for TCP/IP, not a local socket). Some
3257 examples:
3258
3259 1) fio --server
3260 Start a fio server, listening on all interfaces on the
3261 default port (8765).
3262
3263 2) fio --server=ip:hostname,4444
3264 Start a fio server, listening on IP belonging to hostname
3265 and on port 4444.
3266
3267 3) fio --server=ip6:::1,4444
3268 Start a fio server, listening on IPv6 localhost ::1 and
3269 on port 4444.
3270
3271 4) fio --server=,4444
3272 Start a fio server, listening on all interfaces on port
3273 4444.
3274
3275 5) fio --server=1.2.3.4
3276 Start a fio server, listening on IP 1.2.3.4 on the
3277 default port.
3278
3279 6) fio --server=sock:/tmp/fio.sock
3280 Start a fio server, listening on the local socket
3281 `/tmp/fio.sock'.
3282
3283 Once a server is running, a "client" can connect to the fio server
3284 with:
3285
3286 $ fio <local-args> --client=<server> <remote-args> <job file(s)>
3287
3288 where `local-args' are arguments for the client where it is running,
3289 `server' is the connect string, and `remote-args' and `job file(s)' are
3290 sent to the server. The `server' string follows the same format as it
3291 does on the server side, to allow IP/hostname/socket and port strings.
3292
3293 Fio can connect to multiple servers this way:
3294
3295 $ fio --client=<server1> <job file(s)> --client=<server2> <job
3296 file(s)>
3297
3298 If the job file is located on the fio server, then you can tell the
3299 server to load a local file as well. This is done by using
3300 --remote-config:
3301
3302 $ fio --client=server --remote-config /path/to/file.fio
3303
3304 Then fio will open this local (to the server) job file instead of being
3305 passed one from the client.
3306
3307 If you have many servers (example: 100 VMs/containers), you can input a
3308 pathname of a file containing host IPs/names as the parameter value for
3309 the --client option. Here is an example `host.list' file
3310 containing 2 hostnames:
3311
3312 host1.your.dns.domain
3313 host2.your.dns.domain
3314
3315 The fio command would then be:
3316
3317 $ fio --client=host.list <job file(s)>
3318
3319 In this mode, you cannot input server-specific parameters or job files
3320 -- all servers receive the same job file.
3321
3322 In order to let `fio --client' runs use a shared filesystem from multi‐
3323 ple hosts, `fio --client' now prepends the IP address of the server to
3324 the filename. For example, if fio is using the directory `/mnt/nfs/fio'
3325 and is writing filename `fileio.tmp', with a --client `hostfile' con‐
3326 taining two hostnames `h1' and `h2' with IP addresses 192.168.10.120
3327 and 192.168.10.121, then fio will create two files:
3328
3329 /mnt/nfs/fio/192.168.10.120.fileio.tmp
3330 /mnt/nfs/fio/192.168.10.121.fileio.tmp
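
The naming scheme in this example can be reproduced with a short Python sketch. It assumes the pattern visible in the two file names above (server IP address, a dot, then the original file name), and client_filename is a hypothetical helper rather than anything fio exposes.

```python
# Sketch only: reproduce the per-server file names from the example above.
import posixpath


def client_filename(directory, server_ip, filename):
    """Return the path with the server IP prepended to the file name.

    client_filename is an illustrative helper, not part of fio.
    """
    return posixpath.join(directory, f"{server_ip}.{filename}")


for ip in ("192.168.10.120", "192.168.10.121"):
    print(client_filename("/mnt/nfs/fio", ip, "fileio.tmp"))
```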
3331
3333 fio was written by Jens Axboe <axboe@kernel.dk>.
3334 This man page was written by Aaron Carroll <aaronc@cse.unsw.edu.au>
3335 based on documentation by Jens Axboe.
3336 This man page was rewritten by Tomohiro Kusumi <tkusumi@tuxera.com>
3337 based on documentation by Jens Axboe.
3338
3340 Report bugs to the fio mailing list <fio@vger.kernel.org>.
3341 See REPORTING-BUGS.
3342
3343 REPORTING-BUGS: http://git.kernel.dk/cgit/fio/plain/REPORTING-BUGS
3344
3346 For further documentation see HOWTO and README.
3347 Sample jobfiles are available in the `examples/' directory.
3348 These are typically located under `/usr/share/doc/fio'.
3349
3350 HOWTO: http://git.kernel.dk/cgit/fio/plain/HOWTO
3351 README: http://git.kernel.dk/cgit/fio/plain/README
3352
3353
3354
3355User Manual August 2017 fio(1)