1 fio(1) General Commands Manual fio(1)
2
3
4
6 fio - flexible I/O tester
7
9 fio [options] [jobfile]...
10
12 fio is a tool that will spawn a number of threads or processes doing a
13 particular type of I/O action as specified by the user. The typical
14 use of fio is to write a job file matching the I/O load one wants to
15 simulate.
16
18 --debug=type
19 Enable verbose tracing of the given type for various fio actions. May be `all'
20 for all types or individual types separated by a comma (e.g.
21 `--debug=file,mem' will enable file and memory debugging).
22 `help' will list all available tracing options.
23
24 --parse-only
25 Parse options only, don't start any I/O.
26
27 --merge-blktrace-only
28 Merge blktraces only, don't start any I/O.
29
30 --output=filename
31 Write output to filename.
32
33 --output-format=format
34 Set the reporting format to `normal', `terse', `json', or
35 `json+'. Multiple formats can be selected, separated by commas.
36 `terse' is a CSV based format. `json+' is like `json', except it
37 adds a full dump of the latency buckets.
38
39 --bandwidth-log
40 Generate aggregate bandwidth logs.
41
42 --minimal
43 Print statistics in a terse, semicolon-delimited format.
44
45 --append-terse
46 Print statistics in selected mode AND terse, semicolon-delimited
47 format. Deprecated, use --output-format instead to select mul‐
48 tiple formats.
49
50 --terse-version=version
51 Set terse version output format (default `3', or `2', `4', `5').
52
53 --version
54 Print version information and exit.
55
56 --help Print a summary of the command line options and exit.
57
58 --cpuclock-test
59 Perform test and validation of internal CPU clock.
60
61 --crctest=[test]
62 Test the speed of the built-in checksumming functions. If no
63 argument is given, all of them are tested. Alternatively, a
64 comma separated list can be passed, in which case the given ones
65 are tested.
66
67 --cmdhelp=command
68 Print help information for command. May be `all' for all com‐
69 mands.
70
71 --enghelp=[ioengine[,command]]
72 List all commands defined by ioengine, or print help for command
73 defined by ioengine. If no ioengine is given, list all available
74 ioengines.
75
76 --showcmd=jobfile
77 Convert jobfile to a set of command-line options.
78
79 --readonly
80 Turn on safety read-only checks, preventing writes and trims.
81 The --readonly option is an extra safety guard to prevent users
82 from accidentally starting a write or trim workload when that is
83 not desired. Fio will only modify the device under test if
84 `rw=write/randwrite/rw/randrw/trim/randtrim/trimwrite' is given.
85 This safety net can be used as an extra precaution.
86
87 --eta=when
88 Specifies when real-time ETA estimate should be printed. when
89 may be `always', `never' or `auto'. `auto' is the default; it
90 prints ETA when requested if the output is a TTY. `always' dis‐
91 regards the output type, and prints ETA when requested. `never'
92 never prints ETA.
93
94 --eta-interval=time
95 By default, fio requests client ETA status roughly every second.
96 With this option, the interval is configurable. To avoid flooding
97 the console, fio imposes a minimum interval: values below 250
98 msec are not supported.
99
100 --eta-newline=time
101 Force a new line for every time period passed. When the unit is
102 omitted, the value is interpreted in seconds.
103
104 --status-interval=time
105 Force a full status dump of cumulative (from job start) values
106 at time intervals. This option does *not* provide per-period
107 measurements. So values such as bandwidth are running averages.
108 When the time unit is omitted, time is interpreted in seconds.
109 Note that using this option with `--output-format=json' will
110 yield output that technically isn't valid json, since the output
111 will be collated sets of valid json. It will need to be split
112 into valid sets of json after the run.
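
The splitting step mentioned above can be sketched in Python. This is an illustrative helper, not part of fio; the function name is hypothetical, and it assumes the captured output contains nothing but JSON documents separated by whitespace.

```python
import json

def split_json_documents(text):
    """Parse a string holding several JSON documents written back to back."""
    decoder = json.JSONDecoder()
    documents = []
    pos = 0
    while pos < len(text):
        # raw_decode() does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode() returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        documents.append(obj)
    return documents

# Two made-up status dumps, concatenated the way collated output would be.
sample = '{"elapsed": 1}\n{"elapsed": 2}\n'
for doc in split_json_documents(sample):
    print(doc)
```

Each element of the returned list is one complete status dump and can be processed on its own.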
113
114 --section=name
115 Only run specified section name in job file. Multiple sections
116 can be specified. The --section option allows one to combine
117 related jobs into one file. E.g. one job file could define
118 light, moderate, and heavy sections. Tell fio to run only the
119 "heavy" section by giving `--section=heavy' command line option.
120 One can also specify the "write" operations in one section and
121 "verify" operation in another section. The --section option only
122 applies to job sections. The reserved *global* section is always
123 parsed and used.
124
125 --alloc-size=kb
126 Set the internal smalloc pool size to kb in KiB. The
127 --alloc-size switch allows one to use a larger pool size for
128 smalloc. If running large jobs with randommap enabled, fio can
129 run out of memory. Smalloc is an internal allocator for shared
130 structures from a fixed size memory pool and can grow to 16
131 pools. The pool size defaults to 16MiB. NOTE: While running
132 `.fio_smalloc.*' backing store files are visible in `/tmp'.
133
134 --warnings-fatal
135 All fio parser warnings are fatal, causing fio to exit with an
136 error.
137
138 --max-jobs=nr
139 Set the maximum number of threads/processes to support to nr.
140 NOTE: On Linux, it may be necessary to increase the shared-mem‐
141 ory limit (`/proc/sys/kernel/shmmax') if fio runs into errors
142 while creating jobs.
143
144 --server=args
145 Start a backend server, with args specifying what to listen to.
146 See CLIENT/SERVER section.
147
148 --daemonize=pidfile
149 Run a fio server in the background, writing its PID to the file
150 pidfile.
151
152 --client=hostname
153 Instead of running the jobs locally, send and run them on the
154 given hostname or set of hostnames. See CLIENT/SERVER section.
155
156 --remote-config=file
157 Tell fio server to load this local file.
158
159 --idle-prof=option
160 Report CPU idleness. option is one of the following:
161
162 calibrate
163 Run unit work calibration only and exit.
164
165 system Show aggregate system idleness and unit work.
166
167 percpu As system but also show per CPU idleness.
168
169 --inflate-log=log
170 Inflate and output compressed log.
171
172 --trigger-file=file
173 Execute trigger command when file exists.
174
175 --trigger-timeout=time
176 Execute trigger at this time.
177
178 --trigger=command
179 Set this command as local trigger.
180
181 --trigger-remote=command
182 Set this command as remote trigger.
183
184 --aux-path=path
185 Use the directory specified by path for generated state files
186 instead of the current working directory.
187
189 Any parameters following the options will be assumed to be job files,
190 unless they match a job file parameter. Multiple job files can be
191 listed and each job file will be regarded as a separate group. Fio will
192 stonewall execution between each group.
193
194 Fio accepts one or more job files describing what it is supposed to do.
195 The job file format is the classic ini file, where the names enclosed
196 in [] brackets define the job name. You are free to use any ASCII name
197 you want, except *global* which has special meaning. Following the job
198 name is a sequence of zero or more parameters, one per line, that
199 define the behavior of the job. If the first character in a line is a
200 ';' or a '#', the entire line is discarded as a comment.
201
202 A *global* section sets defaults for the jobs described in that file. A
203 job may override a *global* section parameter, and a job file may even
204 have several *global* sections if so desired. A job is only affected by
205 a *global* section residing above it.
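
The *global* rules above can be illustrated with a small Python sketch. It is a simplified model of the described behavior, not fio's parser; the function name and the sample job names are hypothetical, and include handling, environment variables, and command-line overrides are left out.

```python
def parse_job_file(text):
    """Return {job name: {parameter: value}}, applying *global* defaults
    in file order as described above."""
    defaults = {}   # settings from every global section seen so far
    jobs = {}
    current = None  # dictionary that the current section writes into
    for raw in text.splitlines():
        if raw[:1] in (";", "#"):
            continue  # comment: first character is ';' or '#'
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            if name == "global":
                current = defaults
            else:
                # A job starts from the defaults defined above it only.
                current = dict(defaults)
                jobs[name] = current
            continue
        if current is None:
            continue  # parameter outside any section; ignored in this sketch
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()
    return jobs

SAMPLE = """\
; defaults for the jobs below
[global]
rw=randread
size=128m

[job1]

[global]
rw=write

[job2]
size=64m
"""

for name, params in parse_job_file(SAMPLE).items():
    print(name, params)
```

In this sample, job1 keeps `rw=randread` because the second *global* section appears below it, while job2 picks up `rw=write` and overrides `size`.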
206
207 The --cmdhelp option also lists all options. If used with a command
208 argument, --cmdhelp will detail the given command.
209
210 See the `examples/' directory for inspiration on how to write job
211 files. Note the copyright and license requirements currently apply to
212 `examples/' files.
213
215 Some parameters take an option of a given type, such as an integer or a
216 string. Anywhere a numeric value is required, an arithmetic expression
217 may be used, provided it is surrounded by parentheses. Supported opera‐
218 tors are:
219
220 addition (+)
221
222 subtraction (-)
223
224 multiplication (*)
225
226 division (/)
227
228 modulus (%)
229
230 exponentiation (^)
231
232 For time values in expressions, units are microseconds by default. This
233 differs from time values outside expressions (not enclosed in
234 parentheses), which default to seconds unless otherwise specified.
235
237 The following parameter types are used.
238
239 str String. A sequence of alphanumeric characters.
240
241 time Integer with possible time suffix. Without a unit, the value is
242 interpreted as seconds unless otherwise specified. Accepts a
243 suffix of 'd' for days, 'h' for hours, 'm' for minutes, 's' for
244 seconds, 'ms' (or 'msec') for milliseconds and 'us' (or 'usec')
245 for microseconds. For example, use 10m for 10 minutes.
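
As an illustration of the suffix rules above, the following Python sketch converts a time value to microseconds. It is not fio's parser; the function and table names are hypothetical, and it covers only plain integers with the suffixes listed here (arithmetic expressions are not handled).

```python
import re

# Microseconds per unit, keyed by the suffixes listed above.
_US_PER_UNIT = {
    "d": 86_400_000_000,
    "h": 3_600_000_000,
    "m": 60_000_000,
    "s": 1_000_000,
    "ms": 1_000, "msec": 1_000,
    "us": 1, "usec": 1,
}

def parse_time_us(text, default_unit="s"):
    """Convert a value such as '10m' or '250ms' to microseconds."""
    match = re.fullmatch(r"(\d+)([a-z]*)", text.strip().lower())
    if match is None:
        raise ValueError(f"not a time value: {text!r}")
    number, unit = match.groups()
    unit = unit or default_unit  # no suffix: fall back to the default unit
    if unit not in _US_PER_UNIT:
        raise ValueError(f"unknown time unit: {unit!r}")
    return int(number) * _US_PER_UNIT[unit]

print(parse_time_us("10m"))    # 10 minutes
print(parse_time_us("250ms"))  # 250 milliseconds
print(parse_time_us("30"))     # no suffix, read as seconds
```

The `default_unit` argument models the "unless otherwise specified" clause: an option that documents a different default unit would pass it explicitly.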
246
247 int Integer. A whole number value, which may contain an integer pre‐
248 fix and an integer suffix.
249
250 [*integer prefix*] **number** [*integer suffix*]
251
252 The optional *integer prefix* specifies the number's base. The
253 default is decimal. *0x* specifies hexadecimal.
254
255 The optional *integer suffix* specifies the number's units, and
256 includes an optional unit prefix and an optional unit. For quan‐
257 tities of data, the default unit is bytes. For quantities of
258 time, the default unit is seconds unless otherwise specified.
259
260 With `kb_base=1000', fio follows international standards for
261 unit prefixes. To specify power-of-10 decimal values defined in
262 the International System of Units (SI):
263
264 K means kilo (K) or 1000
265 M means mega (M) or 1000**2
266 G means giga (G) or 1000**3
267 T means tera (T) or 1000**4
268 P means peta (P) or 1000**5
269
270 To specify power-of-2 binary values defined in IEC 80000-13:
271
272 Ki means kibi (Ki) or 1024
273 Mi means mebi (Mi) or 1024**2
274 Gi means gibi (Gi) or 1024**3
275 Ti means tebi (Ti) or 1024**4
276 Pi means pebi (Pi) or 1024**5
277
278 With `kb_base=1024' (the default), the unit prefixes are oppo‐
279 site from those specified in the SI and IEC 80000-13 standards
280 to provide compatibility with old scripts. For example, 4k means
281 4096.
282
283 For quantities of data, an optional unit of 'B' may be included
284 (e.g., 'kB' is the same as 'k').
285
286 The *integer suffix* is not case sensitive (e.g., m/mi mean
287 mebi/mega, not milli). 'b' and 'B' both mean byte, not bit.
288
289 Examples with `kb_base=1000':
290
291 4 KiB: 4096, 4096b, 4096B, 4ki, 4kib, 4kiB, 4Ki, 4KiB
292 1 MiB: 1048576, 1mi, 1024ki
293 1 MB: 1000000, 1m, 1000k
294 1 TiB: 1099511627776, 1ti, 1024gi, 1048576mi
295 1 TB: 1000000000000, 1t, 1000g, 1000000m
296
297 Examples with `kb_base=1024' (default):
298
299 4 KiB: 4096, 4096b, 4096B, 4k, 4kb, 4kB, 4K, 4KB
300 1 MiB: 1048576, 1m, 1024k
301 1 MB: 1000000, 1mi, 1000ki
302 1 TiB: 1099511627776, 1t, 1024g, 1048576m
303 1 TB: 1000000000000, 1ti, 1000gi, 1000000mi
304
305 To specify times (units are not case sensitive):
306
307 D means days
308 H means hours
309 M means minutes
310 s or sec means seconds (default)
311 ms or msec means milliseconds
312 us or usec means microseconds
313
314 If the option accepts an upper and lower range, use a colon ':'
315 or minus '-' to separate such values. See irange parameter type.
316 If the lower value specified happens to be larger than the upper
317 value the two values are swapped.
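
The interaction between the unit prefix and kb_base can be sketched in Python as follows. This is an illustrative model of the rules above, not fio's implementation; the function name is hypothetical, and only data quantities (not times or expressions) are handled.

```python
import re

_POWER = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5}

def parse_size(text, kb_base=1024):
    """Convert a data quantity such as '4k' or '1mi' to bytes."""
    match = re.fullmatch(r"(0x[0-9a-f]+|\d+)(?:([kmgtp])(i?))?b?",
                         text.strip().lower())
    if match is None:
        raise ValueError(f"not a size value: {text!r}")
    number, prefix, binary = match.groups()
    value = int(number, 16) if number.startswith("0x") else int(number)
    if prefix is None:
        return value  # plain number: already in bytes
    if kb_base == 1000:
        # Standards-compliant: 'k' is 1000, 'ki' is 1024.
        base = 1024 if binary else 1000
    else:
        # Compatibility mode (default): the meanings are swapped.
        base = 1000 if binary else 1024
    return value * base ** _POWER[prefix]

print(parse_size("4k"))                # 4096
print(parse_size("4k", kb_base=1000))  # 4000
print(parse_size("1mi"))               # 1000000
```

The optional trailing 'b' is accepted and ignored, matching the statement that 'kB' is the same as 'k'.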
318
319 bool Boolean. Usually parsed as an integer, however only defined for
320 true and false (1 and 0).
321
322 irange Integer range with suffix. Allows value range to be given, such
323 as 1024-4096. A colon may also be used as the separator, e.g.
324 1k:4k. If the option allows two sets of ranges, they can be
325 specified with a ',' or '/' delimiter: 1k-4k/8k-32k. Also see
326 int parameter type.
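
A minimal Python sketch of the range syntax follows. It is illustrative only: the function name is hypothetical, unit suffixes are omitted for brevity (only plain integers are accepted), and the swap of reversed bounds follows the rule given under the int type.

```python
import re

def parse_irange(text):
    """Parse '1024-4096', '1024:4096' or '1024-4096/8192-32768'
    into a list of (low, high) tuples."""
    ranges = []
    for part in re.split(r"[,/]", text):
        # Either ':' or '-' separates the two bounds of one range.
        low, high = (int(v) for v in re.split(r"[:-]", part))
        if low > high:
            low, high = high, low  # reversed bounds are swapped
        ranges.append((low, high))
    return ranges

print(parse_irange("1024-4096"))
print(parse_irange("4096:1024"))
print(parse_irange("1024-4096/8192-32768"))
```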
327
328 float_list
329 A list of floating point numbers, separated by a ':' character.
330
332 With the above in mind, here follows the complete list of fio job
333 parameters.
334
335 Units
336 kb_base=int
337 Select the interpretation of unit prefixes in input parameters.
338
339 1000 Inputs comply with IEC 80000-13 and the Interna‐
340 tional System of Units (SI). Use:
341
342 - power-of-2 values with IEC prefixes (e.g., KiB)
343 - power-of-10 values with SI prefixes (e.g., kB)
344
345 1024 Compatibility mode (default). To avoid breaking
346 old scripts:
347
348 - power-of-2 values with SI prefixes
349 - power-of-10 values with IEC prefixes
350
351 See bs for more details on input parameters.
352
353 Outputs always use correct prefixes. Most outputs include both
354 side-by-side, like:
355
356 bw=2383.3kB/s (2327.4KiB/s)
357
358 If only one value is reported, then kb_base selects the one to
359 use:
360
361 1000 -- SI prefixes
362 1024 -- IEC prefixes
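
The two figures in the sample output line describe the same rate in different units, which a short Python check makes explicit. The helper name is hypothetical and the snippet is only a worked example of the arithmetic.

```python
def kb_per_s_to_kib_per_s(kb_per_s):
    """Convert a rate in kB/s (1000 bytes) to KiB/s (1024 bytes)."""
    return kb_per_s * 1000 / 1024

# The sample line reports 2383.3kB/s alongside 2327.4KiB/s.
print(round(kb_per_s_to_kib_per_s(2383.3), 1))  # prints 2327.4
```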
363
364 unit_base=int
365 Base unit for reporting. Allowed values are:
366
367 0 Use auto-detection (default).
368
369 8 Byte based.
370
371 1 Bit based.
372
373 Job description
374 name=str
375 ASCII name of the job. This may be used to override the name
376 printed by fio for this job. Otherwise the job name is used. On
377 the command line this parameter has the special purpose of also
378 signaling the start of a new job.
379
380 description=str
381 Text description of the job. Doesn't do anything except dump
382 this text description when this job is run. It's not parsed.
383
384 loops=int
385 Run the specified number of iterations of this job. Used to
386 repeat the same workload a given number of times. Defaults to 1.
387
388 numjobs=int
389 Create the specified number of clones of this job. Each clone of
390 job is spawned as an independent thread or process. May be used
391 to setup a larger number of threads/processes doing the same
392 thing. Each thread is reported separately; to see statistics for
393 all clones as a whole, use group_reporting in conjunction with
394 new_group. See --max-jobs. Default: 1.
395
396 Time related parameters
397 runtime=time
398 Tell fio to terminate processing after the specified period of
399 time. It can be quite hard to determine for how long a specified
400 job will run, so this parameter is handy to cap the total run‐
401 time to a given time. When the unit is omitted, the value is
402 interpreted in seconds.
403
404 time_based
405 If set, fio will run for the duration of the runtime specified
406 even if the file(s) are completely read or written. It will sim‐
407 ply loop over the same workload as many times as the runtime
408 allows.
409
410 startdelay=irange(int)
411 Delay the start of job for the specified amount of time. Can be
412 a single value or a range. When given as a range, each thread
413 will choose a value randomly from within the range. Value is in
414 seconds if a unit is omitted.
415
416 ramp_time=time
417 If set, fio will run the specified workload for this amount of
418 time before logging any performance numbers. Useful for letting
419 performance settle before logging results, thus minimizing the
420 runtime required for stable results. Note that the ramp_time is
421 considered lead in time for a job, thus it will increase the
422 total runtime if a special timeout or runtime is specified. When
423 the unit is omitted, the value is given in seconds.
424
425 clocksource=str
426 Use the given clocksource as the base of timing. The supported
427 options are:
428
429 gettimeofday
430 gettimeofday(2)
431
432 clock_gettime
433 clock_gettime(2)
434
435 cpu Internal CPU clock source
436
437 cpu is the preferred clocksource if it is reliable, as it is
438 very fast (and fio is heavy on time calls). Fio will automati‐
439 cally use this clocksource if it's supported and considered
440 reliable on the system it is running on, unless another clock‐
441 source is specifically set. For x86/x86-64 CPUs, this means sup‐
442 porting TSC Invariant.
443
444 gtod_reduce=bool
445 Enable all of the gettimeofday(2) reducing options (dis‐
446 able_clat, disable_slat, disable_bw_measurement) plus reduce
447 precision of the timeout somewhat to really shrink the gettime‐
448 ofday(2) call count. With this option enabled, we only do about
449 0.4% of the gettimeofday(2) calls we would have done if all time
450 keeping was enabled.
451
452 gtod_cpu=int
453 Sometimes it's cheaper to dedicate a single thread of execution
454 to just getting the current time. Fio (and databases, for
455 instance) are very intensive on gettimeofday(2) calls. With this
456 option, you can set one CPU aside for doing nothing but logging
457 current time to a shared memory location. Then the other
458 threads/processes that run I/O workloads need only copy that
459 segment, instead of entering the kernel with a gettimeofday(2)
460 call. The CPU set aside for doing these time calls will be
461 excluded from other uses. Fio will manually clear it from the
462 CPU mask of other jobs.
463
464 Target file/device
465 directory=str
466 Prefix filenames with this directory. Used to place files in a
467 different location than `./'. You can specify a number of direc‐
468 tories by separating the names with a ':' character. These
469 directories are distributed equally among the job clones created
470 by numjobs, as long as the clones use generated filenames. If
471 specific filename(s) are set, fio uses the first listed directory,
472 thereby matching the filename semantics (which generate a file
473 for each clone if not specified, but let all clones use the same
474 file if set).
475
476 See the filename option for information on how to escape ':' and
477 '\' characters within the directory path itself.
478
479 Note: To control the directory fio will use for internal state
480 files use --aux-path.
481
482 filename=str
483 Fio normally makes up a filename based on the job name, thread
484 number, and file number (see filename_format). If you want to
485 share files between threads in a job or several jobs with fixed
486 file paths, specify a filename for each of them to override the
487 default. If the ioengine is file based, you can specify a number
488 of files by separating the names with a ':' character. So if you
489 wanted a job to open `/dev/sda' and `/dev/sdb' as the two work‐
490 ing files, you would use `filename=/dev/sda:/dev/sdb'. This also
491 means that whenever this option is specified, nrfiles is
492 ignored. The size of regular files specified by this option will
493 be size divided by number of files unless an explicit size is
494 specified by filesize.
495
496 Each colon and backslash in the wanted path must be escaped with
497 a '\' character. For instance, if the path is
498 `/dev/dsk/foo@3,0:c' then you would use `file‐
499 name=/dev/dsk/foo@3,0\:c' and if the path is `F:\filename' then
500 you would use `filename=F\:\\filename'.
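
When job files are generated by a script, the escaping rule above is easy to apply mechanically. The following Python sketch shows one way to do it; the helper names are hypothetical and not part of fio.

```python
def escape_fio_path(path):
    """Escape each backslash and colon in a path for use with filename=."""
    # Escape backslashes first so the ones added for colons are not doubled.
    return path.replace("\\", "\\\\").replace(":", "\\:")

def join_filenames(paths):
    """Build a filename= value from several paths, separated by ':'."""
    return ":".join(escape_fio_path(p) for p in paths)

print("filename=" + escape_fio_path("/dev/dsk/foo@3,0:c"))
print("filename=" + escape_fio_path("F:\\filename"))
print("filename=" + join_filenames(["/dev/sda", "/dev/sdb"]))
```

The first two lines printed reproduce the escaped forms given in the text above.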
501
502 On Windows, disk devices are accessed as `\\.\PhysicalDrive0'
503 for the first device, `\\.\PhysicalDrive1' for the second etc.
504 Note: Windows and FreeBSD prevent write access to areas of the
505 disk containing in-use data (e.g. filesystems).
506
507 The filename `-' is a reserved name, meaning *stdin* or *std‐
508 out*. Which of the two depends on the read/write direction set.
509
510 filename_format=str
511 If sharing multiple files between jobs, it is usually necessary
512 to have fio generate the exact names that you want. By default,
513 fio will name a file based on the default file format specifica‐
514 tion of `jobname.jobnumber.filenumber'. With this option, that
515 can be customized. Fio will recognize and replace the following
516 keywords in this string:
517
518 $jobname
519 The name of the worker thread or process.
520
521 $jobnum
522 The incremental number of the worker thread or
523 process.
524
525 $filenum
526 The incremental number of the file for that worker
527 thread or process.
528
529 To have dependent jobs share a set of files, this option can be
530 set to have fio generate filenames that are shared between the
531 two. For instance, if `testfiles.$filenum' is specified, file
532 number 4 for any job will be named `testfiles.4'. The default of
533 `$jobname.$jobnum.$filenum' will be used if no other format
534 specifier is given.
535
536 If you specify a path then the directories will be created up to
537 the main directory for the file. So for example if you specify
538 `a/b/c/$jobnum' then the directories a/b/c will be created
539 before the file setup part of the job. If you specify directory
540 then the path will be relative to that directory, otherwise it is
541 treated as the absolute path.
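
The keyword substitution described above can be modeled with a few lines of Python. This is a sketch for illustration, not fio's code; the function name and the job name used in the example are hypothetical.

```python
DEFAULT_FORMAT = "$jobname.$jobnum.$filenum"

def expand_filename_format(fmt, jobname, jobnum, filenum):
    """Replace the $jobname, $jobnum and $filenum keywords in fmt."""
    return (fmt.replace("$jobname", jobname)
               .replace("$jobnum", str(jobnum))
               .replace("$filenum", str(filenum)))

# With a format that omits the job keywords, every job maps file
# number 4 to the same name, so dependent jobs share the file.
print(expand_filename_format("testfiles.$filenum", "writer", 0, 4))
print(expand_filename_format("testfiles.$filenum", "reader", 1, 4))
print(expand_filename_format(DEFAULT_FORMAT, "writer", 0, 4))
```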
542
543 unique_filename=bool
544 To avoid collisions between networked clients, fio defaults to
545 prefixing any generated filenames (with a directory specified)
546 with the source of the client connecting. To disable this behav‐
547 ior, set this option to 0.
548
549 opendir=str
550 Recursively open any files below directory str.
551
552 lockfile=str
553 Fio defaults to not locking any files before it does I/O to
554 them. If a file or file descriptor is shared, fio can serialize
555 I/O to that file to make the end result consistent. This is
556 useful for emulating real workloads that share files. The lock
557 modes are:
558
559 none No locking. The default.
560
561 exclusive
562 Only one thread or process may do I/O at a time,
563 excluding all others.
564
565 readwrite
566 Read-write locking on the file. Many readers may
567 access the file at the same time, but writes get
568 exclusive access.
569
570 nrfiles=int
571 Number of files to use for this job. Defaults to 1. The size of
572 files will be size divided by this unless explicit size is spec‐
573 ified by filesize. Files are created for each thread separately,
574 and each file will have a file number within its name by
575 default, as explained in the filename section.
576
577 openfiles=int
578 Number of files to keep open at the same time. Defaults to the
579 same as nrfiles; can be set smaller to limit the number of
580 simultaneous opens.
581
582 file_service_type=str
583 Defines how fio decides which file from a job to service next.
584 The following types are defined:
585
586 random Choose a file at random.
587
588 roundrobin
589 Round robin over opened files. This is the
590 default.
591
592 sequential
593 Finish one file before moving on to the next. Mul‐
594 tiple files can still be open depending on open‐
595 files.
596
597 zipf Use a Zipf distribution to decide what file to
598 access.
599
600 pareto Use a Pareto distribution to decide what file to
601 access.
602
603 normal Use a Gaussian (normal) distribution to decide
604 what file to access.
605
606 gauss Alias for normal.
607
608 For random, roundrobin, and sequential, a postfix can be
609 appended to tell fio how many I/Os to issue before switching to
610 a new file. For example, specifying `file_service_type=random:8'
611 would cause fio to issue 8 I/Os before selecting a new file at
612 random. For the non-uniform distributions, a floating point
613 postfix can be given to influence how the distribution is
614 skewed. See random_distribution for a description of how that
615 would work.
616
617 ioscheduler=str
618 Attempt to switch the device hosting the file to the specified
619 I/O scheduler before running.
620
621 create_serialize=bool
622 If true, serialize the file creation for the jobs. This may be
623 handy to avoid interleaving of data files, which may greatly
624 depend on the filesystem used and even the number of processors
625 in the system. Default: true.
626
627 create_fsync=bool
628 fsync(2) the data file after creation. This is the default.
629
630 create_on_open=bool
631 If true, don't pre-create files but allow the job's open() to
632 create a file when it's time to do I/O. Default: false --
633 pre-create all necessary files when the job starts.
634
635 create_only=bool
636 If true, fio will only run the setup phase of the job. If files
637 need to be laid out or updated on disk, only that will be done
638 -- the actual job contents are not executed. Default: false.
639
640 allow_file_create=bool
641 If true, fio is permitted to create files as part of its work‐
642 load. If this option is false, then fio will error out if the
643 files it needs to use don't already exist. Default: true.
644
645 allow_mounted_write=bool
646 If this isn't set, fio will abort jobs that are destructive
647 (e.g. that write) to what appears to be a mounted device or
648 partition. This should help catch inadvertently destructive
649 tests created without realizing that they will destroy data on the
650 mounted file system. Note that some platforms don't allow
651 writing against a mounted device regardless of this option.
652 Default: false.
653
654 pre_read=bool
655 If this is given, files will be pre-read into memory before
656 starting the given I/O operation. This will also clear the
657 invalidate flag, since it is pointless to pre-read and then drop
658 the cache. This will only work for I/O engines that are
659 seek-able, since they allow you to read the same data multiple
660 times. Thus it will not work on non-seekable I/O engines (e.g.
661 network, splice). Default: false.
662
663 unlink=bool
664 Unlink the job files when done. Not the default, as repeated
665 runs of that job would then waste time recreating the file set
666 again and again. Default: false.
667
668 unlink_each_loop=bool
669 Unlink job files after each iteration or loop. Default: false.
670
671 zonemode=str
672 Accepted values are:
673
674 none The zonerange, zonesize and zoneskip parameters
675 are ignored.
676
677 strided
678 I/O happens in a single zone until zonesize bytes
679 have been transferred. After that number of bytes
680 has been transferred processing of the next zone
681 starts.
682
683 zbd Zoned block device mode. I/O happens sequentially
684 in each zone, even if random I/O has been
685 selected. Random I/O happens across all zones
686 instead of being restricted to a single zone.
687
688 zonerange=int
689 Size of a single zone. See also zonesize and zoneskip.
690
691 zonesize=int
692 For zonemode=strided, this is the number of bytes to transfer
693 before skipping zoneskip bytes. If this parameter is smaller
694 than zonerange then only a fraction of each zone with zonerange
695 bytes will be accessed. If this parameter is larger than zon‐
696 erange then each zone will be accessed multiple times before
697 skipping to the next zone.
698
699 For zonemode=zbd, this is the size of a single zone. The zon‐
700 erange parameter is ignored in this mode.
701
702 zoneskip=int
703 For zonemode=strided, the number of bytes to skip after zonesize
704 bytes of data have been transferred. This parameter must be zero
705 for zonemode=zbd.
706
707
708 read_beyond_wp=bool
709 This parameter applies to zonemode=zbd only.
710
711 Zoned block devices are block devices that consist of multiple
712 zones. Each zone has a type, e.g. conventional or sequential. A
713 conventional zone can be written at any offset that is a multi‐
714 ple of the block size. Sequential zones must be written sequen‐
715 tially. The position at which a write must occur is called the
716 write pointer. A zoned block device can be either drive managed,
717 host managed or host aware. For host managed devices the host
718 must ensure that writes happen sequentially. Fio recognizes host
719 managed devices and serializes writes to sequential zones for
720 these devices.
721
722 If a read occurs in a sequential zone beyond the write pointer
723 then the zoned block device will complete the read without read‐
724 ing any data from the storage medium. Since such reads lead to
725 unrealistically high bandwidth and IOPS numbers fio only reads
726 beyond the write pointer if explicitly told to do so. Default:
727 false.
728
729 max_open_zones=int
730 When running a random write test across an entire drive, many
731 more zones will be open than in a typical application workload.
732 This option therefore allows limiting the number of open zones.
733 The number of open zones is defined as the number of zones to
734 which write commands are issued.
735
736 zone_reset_threshold=float
737 A number between zero and one that indicates the ratio of logi‐
738 cal blocks with data to the total number of logical blocks in
739 the test above which zones should be reset periodically.
740
741 zone_reset_frequency=float
742 A number between zero and one that indicates how often a zone
743 reset should be issued if the zone reset threshold has been
744 exceeded. A zone reset is submitted after each (1 /
745 zone_reset_frequency) write requests. This and the previous
746 parameter can be used to simulate garbage collection activity.
747
748
749 I/O type
750 direct=bool
751 If value is true, use non-buffered I/O. This is usually
752 O_DIRECT. Note that OpenBSD and ZFS on Solaris don't support
753 direct I/O. On Windows the synchronous ioengines don't support
754 direct I/O. Default: false.
755
756 atomic=bool
757 If value is true, attempt to use atomic direct I/O. Atomic
758 writes are guaranteed to be stable once acknowledged by the
759 operating system. Only Linux supports O_ATOMIC right now.
760
761 buffered=bool
762 If value is true, use buffered I/O. This is the opposite of the
763 direct option. Defaults to true.
764
765 readwrite=str, rw=str
766 Type of I/O pattern. Accepted values are:
767
768 read Sequential reads.
769
770 write Sequential writes.
771
772 trim Sequential trims (Linux block devices and SCSI
773 character devices only).
774
775 randread
776 Random reads.
777
778 randwrite
779 Random writes.
780
781 randtrim
782 Random trims (Linux block devices and SCSI charac‐
783 ter devices only).
784
785 rw,readwrite
786 Sequential mixed reads and writes.
787
788 randrw Random mixed reads and writes.
789
790 trimwrite
791 Sequential trim+write sequences. Blocks will be
792 trimmed first, then the same blocks will be writ‐
793 ten to.
794
795 Fio defaults to read if the option is not specified. For the
796 mixed I/O types, the default is to split them 50/50. For certain
797 types of I/O the result may still be skewed a bit, since the
798 speed may be different.
799
800 It is possible to specify the number of I/Os to do before get‐
801 ting a new offset by appending `:<nr>' to the end of the string
802 given. For a random read, it would look like `rw=randread:8' for
803 passing in an offset modifier with a value of 8. If the suffix
804 is used with a sequential I/O pattern, then the `<nr>' value
805 specified will be added to the generated offset for each I/O
806 turning sequential I/O into sequential I/O with holes. For
807 instance, using `rw=write:4k' will skip 4k for every write. Also
808 see the rw_sequencer option.
809
810 rw_sequencer=str
811 If an offset modifier is given by appending a number to the
812 `rw=str' line, then this option controls how that number modi‐
813 fies the I/O offset being generated. Accepted values are:
814
815 sequential
816 Generate sequential offset.
817
818 identical
819 Generate the same offset.
820
821 sequential is only useful for random I/O, where fio would nor‐
822 mally generate a new random offset for every I/O. If you append
823 e.g. 8 to randread, you would get a new random offset for every
824 8 I/Os. The result would be a seek for only every 8 I/Os,
825 instead of for every I/O. Use `rw=randread:8' to specify that.
826 As sequential I/O is already sequential, setting sequential for
827 that would not result in any differences. identical behaves in a
828 similar fashion, except it sends the same offset 8
829 times before generating a new offset.
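
The difference between the two modes can be visualized with a small Python simulation. It is a rough model of the behavior described above, not fio's offset generator: the function name, block size, file size, seed, and the wrap-around at end of file are all assumptions made for the example.

```python
import random

def simulate_offsets(total_ios, nr, mode, block_size=4096,
                     file_size=1 << 20, seed=0):
    """Model `rw=randread:<nr>' with rw_sequencer=sequential or identical."""
    rng = random.Random(seed)
    blocks = file_size // block_size
    offsets = []
    for i in range(total_ios):
        if i % nr == 0:
            # Every nr I/Os, pick a fresh random block-aligned offset.
            offsets.append(rng.randrange(blocks) * block_size)
        elif mode == "identical":
            # Repeat the offset chosen at the start of the group.
            offsets.append(offsets[-1])
        else:
            # sequential: continue from the previous I/O
            # (wrapping at end of file is an assumption of this sketch).
            offsets.append((offsets[-1] + block_size) % file_size)
    return offsets

print(simulate_offsets(16, 8, "sequential"))
print(simulate_offsets(16, 8, "identical"))
```

With `nr` set to 8, the sequential run shows two groups of eight consecutive blocks, while the identical run shows two groups of eight repeated offsets.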
830
831 unified_rw_reporting=bool
832 Fio normally reports statistics on a per data direction basis,
833 meaning that reads, writes, and trims are accounted and reported
834 separately. If this option is set, fio sums the results and
835 reports them as "mixed" instead.
836
837 randrepeat=bool
838 Seed the random number generator used for random I/O patterns in
839 a predictable way so the pattern is repeatable across runs.
840 Default: true.
841
842 allrandrepeat=bool
843 Seed all random number generators in a predictable way so
844 results are repeatable across runs. Default: false.
845
846 randseed=int
847 Seed the random number generators based on this seed value, to
848 be able to control what sequence of output is being generated.
849 If not set, the random sequence depends on the randrepeat set‐
850 ting.
851
852 fallocate=str
853 Whether pre-allocation is performed when laying down files.
854 Accepted values are:
855
856 none Do not pre-allocate space.
857
858 native Use a platform's native pre-allocation call but
859 fall back to none behavior if it fails/is not
860 implemented.
861
862 posix Pre-allocate via posix_fallocate(3).
863
864 keep Pre-allocate via fallocate(2) with FAL‐
865 LOC_FL_KEEP_SIZE set.
866
867 0 Backward-compatible alias for none.
868
869 1 Backward-compatible alias for posix.
870
871 May not be available on all supported platforms. keep is only
872 available on Linux. If using ZFS on Solaris this cannot be set
873 to posix because ZFS doesn't support pre-allocation. Default:
874 native if any pre-allocation methods are available, none if not.
875
876 fadvise_hint=str
877 Use posix_fadvise(2) or posix_madvise(2) to advise the kernel
878 what I/O patterns are likely to be issued. Accepted values are:
879
880 0 Backwards compatible hint for "no hint".
881
882 1 Backwards compatible hint for "advise with fio
883 workload type". This uses FADV_RANDOM for a random
884 workload, and FADV_SEQUENTIAL for a sequential
885 workload.
886
887 sequential
888 Advise using FADV_SEQUENTIAL.
889
890 random Advise using FADV_RANDOM.
891
892 write_hint=str
893 Use fcntl(2) to advise the kernel what life time to expect from
894 a write. Only supported on Linux, as of version 4.13. Accepted
895 values are:
896
897 none No particular life time associated with this file.
898
899 short Data written to this file has a short life time.
900
901 medium Data written to this file has a medium life time.
902
903 long Data written to this file has a long life time.
904
905 extreme
906 Data written to this file has a very long life
907 time.
908
909 The values are all relative to each other, and no absolute mean‐
910 ing should be associated with them.
911
912 offset=int
913 Start I/O at the provided offset in the file, given as either a
914 fixed size in bytes or a percentage. If a percentage is given,
915 the generated offset will be aligned to the minimum blocksize or
916 to the value of offset_align if provided. Data before the given
917 offset will not be touched. This effectively caps the file size
918 at `real_size - offset'. Can be combined with size to constrain
919 the start and end range of the I/O workload. A percentage can
920 be specified by a number between 1 and 100 followed by '%', for
921 example, `offset=20%' to specify 20%.
922
923 offset_align=int
924 If set to non-zero value, the byte offset generated by a per‐
925 centage offset is aligned upwards to this value. Defaults to 0
926 meaning that a percentage offset is aligned to the minimum block
927 size.
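The interaction between a percentage offset and offset_align can be illustrated with a small calculation. This is a sketch of the rounding as described above, assuming the percentage is taken of the file size and the result is rounded upwards. The function name percentage_offset() and its parameters are hypothetical and do not come from fio.

```python
def percentage_offset(file_size, percent, min_block_size, offset_align=0):
    """Return the byte offset for offset=<percent>%, aligned upwards."""
    # offset_align=0 means: align to the minimum block size instead.
    align = offset_align if offset_align else min_block_size
    raw = file_size * percent // 100
    # Round up to the next multiple of the alignment.
    return -(-raw // align) * align


if __name__ == "__main__":
    # offset=20% of a 1,000,000 byte file with 4 KiB blocks.
    print(percentage_offset(1000000, 20, 4096))
    # The same offset with offset_align=64k.
    print(percentage_offset(1000000, 20, 4096, offset_align=65536))
```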
928
929 offset_increment=int
930 If this is provided, then the real offset becomes `offset + off‐
931 set_increment * thread_number', where the thread number is a
932 counter that starts at 0 and is incremented for each sub-job
933 (i.e. when numjobs option is specified). This option is useful
934 if there are several jobs which are intended to operate on a
935 file in parallel disjoint segments, with even spacing between
936 the starting points.
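As a worked example of the formula above, the following Python sketch computes the starting offsets for four sub-jobs (as with `numjobs=4') and an offset_increment of 1GiB. The helper name real_offset() is an assumption for illustration only.

```python
GIB = 1024 ** 3


def real_offset(offset, offset_increment, thread_number):
    """offset + offset_increment * thread_number, as stated in the text."""
    return offset + offset_increment * thread_number


if __name__ == "__main__":
    # Thread numbers start at 0 and increase by one per sub-job.
    for n in range(4):
        print(n, real_offset(0, GIB, n))
```

Each sub-job then starts 1GiB after the previous one. Combining this with a size of at most 1GiB keeps the segments disjoint.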
937
938 number_ios=int
939 Fio will normally perform I/Os until it has exhausted the size
940 of the region set by size, or if it exhausts the allocated time
941 (or hits an error condition). With this setting, the range/size
942 can be set independently of the number of I/Os to perform. When
943 fio reaches this number, it will exit normally and report sta‐
944 tus. Note that this does not extend the amount of I/O that will
945 be done, it will only stop fio if this condition is met before
946 other end-of-job criteria.
947
948 fsync=int
949 If writing to a file, issue an fsync(2) (or its equivalent) of
950 the dirty data for every number of blocks given. For example, if
951 you give 32 as a parameter, fio will sync the file after every
952 32 writes issued. If fio is using non-buffered I/O, we may not
953 sync the file. The exception is the sg I/O engine, which syn‐
954 chronizes the disk cache anyway. Defaults to 0, which means fio
955 does not periodically issue and wait for a sync to complete.
956 Also see end_fsync and fsync_on_close.
957
958 fdatasync=int
959 Like fsync but uses fdatasync(2) to only sync data and not meta‐
960 data blocks. In Windows, FreeBSD, and DragonFlyBSD there is no
961 fdatasync(2) so this falls back to using fsync(2). Defaults to
962 0, which means fio does not periodically issue and wait for a
963 data-only sync to complete.
964
965 write_barrier=int
966 Make every N-th write a barrier write.
967
968 sync_file_range=str:int
969 Use sync_file_range(2) for every int number of write operations.
970 Fio will track range of writes that have happened since the last
971 sync_file_range(2) call. str can currently be one or more of:
972
973 wait_before
974 SYNC_FILE_RANGE_WAIT_BEFORE
975
976 write SYNC_FILE_RANGE_WRITE
977
978 wait_after
979 SYNC_FILE_RANGE_WAIT_AFTER
980
981 So if you do `sync_file_range=wait_before,write:8', fio would
982 use `SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE' for
983 every 8 writes. Also see the sync_file_range(2) man page. This
984 option is Linux specific.
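How an option value such as `wait_before,write:8' maps to a flag mask and a write count can be sketched as follows. The numeric flag values are those defined by the Linux kernel headers for sync_file_range(2). The function parse_sync_file_range() is a hypothetical helper and is not fio's own parser.

```python
# Flag values as defined in the Linux UAPI headers.
SYNC_FILE_RANGE_WAIT_BEFORE = 1
SYNC_FILE_RANGE_WRITE = 2
SYNC_FILE_RANGE_WAIT_AFTER = 4

FLAGS = {
    "wait_before": SYNC_FILE_RANGE_WAIT_BEFORE,
    "write": SYNC_FILE_RANGE_WRITE,
    "wait_after": SYNC_FILE_RANGE_WAIT_AFTER,
}


def parse_sync_file_range(value):
    """Split 'str[,str]:int' into (flag mask, number of writes)."""
    names, _, count = value.rpartition(":")
    if not names:
        raise ValueError("expected '<flags>:<nr>', got %r" % value)
    mask = 0
    for name in names.split(","):
        mask |= FLAGS[name.strip()]
    return mask, int(count)


if __name__ == "__main__":
    print(parse_sync_file_range("wait_before,write:8"))
```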
985
986 overwrite=bool
987 If true, writes to a file will always overwrite existing data.
988 If the file doesn't already exist, it will be created before the
989 write phase begins. If the file exists and is large enough for
990 the specified write phase, nothing will be done. Default: false.
991
992 end_fsync=bool
993 If true, fsync(2) file contents when a write stage has com‐
994 pleted. Default: false.
995
996 fsync_on_close=bool
997 If true, fio will fsync(2) a dirty file on close. This differs
998 from end_fsync in that it will happen on every file close, not
999 just at the end of the job. Default: false.
1000
1001 rwmixread=int
1002 Percentage of a mixed workload that should be reads. Default:
1003 50.
1004
1005 rwmixwrite=int
1006 Percentage of a mixed workload that should be writes. If both
1007 rwmixread and rwmixwrite are given and the values do not add up
1008 to 100%, the latter of the two will be used to override the
1009 first. This may interfere with a given rate setting, if fio is
1010 asked to limit reads or writes to a certain rate. If that is the
1011 case, then the distribution may be skewed. Default: 50.
1012
1013 random_distribution=str:float[,str:float][,str:float]
1014 By default, fio will use a completely uniform random distribu‐
1015 tion when asked to perform random I/O. Sometimes it is useful to
1016 skew the distribution in specific ways, ensuring that some parts
1017 of the data are hotter than others. Fio includes the following
1018 distribution models:
1019
1020 random Uniform random distribution
1021
1022 zipf Zipf distribution
1023
1024 pareto Pareto distribution
1025
1026 normal Normal (Gaussian) distribution
1027
1028 zoned Zoned random distribution
1029 zoned_abs Zoned absolute random distribution
1030
1031 When using a zipf or pareto distribution, an input value is also
1032 needed to define the access pattern. For zipf, this is the `Zipf
1033 theta'. For pareto, it's the `Pareto power'. Fio includes a
1034 test program, fio-genzipf, that can be used to visualize what the
1035 given input values will yield in terms of hit rates. If you
1036 wanted to use zipf with a `theta' of 1.2, you would use `ran‐
1037 dom_distribution=zipf:1.2' as the option. If a non-uniform model
1038 is used, fio will disable use of the random map. For the normal
1039 distribution, a normal (Gaussian) deviation is supplied as a
1040 value between 0 and 100.
1041
1042 For a zoned distribution, fio supports specifying percentages of
1043 I/O access that should fall within what range of the file or
1044 device. For example, given criteria of:
1045
1046 60% of accesses should be to the first 10%
1047 30% of accesses should be to the next 20%
1048 8% of accesses should be to the next 30%
1049 2% of accesses should be to the next 40%
1050
1051 we can define that through zoning of the random accesses. For
1052 the above example, the user would do:
1053
1054 random_distribution=zoned:60/10:30/20:8/30:2/40
1055
1056 A zoned_abs distribution works exactly like zoned, except
1057 that it takes absolute sizes. For example, let's say you wanted
1058 to define access according to the following criteria:
1059
1060 60% of accesses should be to the first 20G
1061 30% of accesses should be to the next 100G
1062 10% of accesses should be to the next 500G
1063
1064 we can define an absolute zoning distribution with:
1065
1066 random_distribution=zoned_abs:60/20G:30/100G:10/500G
1067
1068 For both zoned and zoned_abs, fio supports defining up to 256
1069 separate zones.
1070
1071 This works similarly to how bssplit sets ranges and percentages
1072 of block sizes. Like bssplit, it's possible to specify separate
1073 zones for reads, writes, and trims. If just one set is given,
1074 it'll apply to all of them.
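To make the zoned syntax concrete, the sketch below turns the string `zoned:60/10:30/20:8/30:2/40' into a list of zones, each with its share of accesses and the percentage range of the file it covers. It models a single set of zones only (no separate read/write/trim sets). The name parse_zoned() is an assumption for this example and is not a fio function.

```python
MAX_ZONES = 256  # upper limit stated in the text


def parse_zoned(spec):
    """Return [(access_pct, range_start_pct, range_end_pct), ...]."""
    prefix, _, body = spec.partition(":")
    if prefix != "zoned" or not body:
        raise ValueError("expected 'zoned:<access>/<size>[:...]'")
    zones = []
    start = 0
    for item in body.split(":"):
        access, size = (int(part) for part in item.split("/"))
        # Each zone begins where the previous one ended.
        zones.append((access, start, start + size))
        start += size
    if len(zones) > MAX_ZONES:
        raise ValueError("too many zones")
    return zones


if __name__ == "__main__":
    for access, lo, hi in parse_zoned("zoned:60/10:30/20:8/30:2/40"):
        print("%d%% of accesses go to %d%%..%d%% of the file" % (access, lo, hi))
```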
1075
1076 percentage_random=int[,int][,int]
1077 For a random workload, set how big a percentage should be ran‐
1078 dom. This defaults to 100%, in which case the workload is fully
1079 random. It can be set from anywhere from 0 to 100. Setting it to
1080 0 would make the workload fully sequential. Any setting in
1081 between will result in a random mix of sequential and random
1082 I/O, at the given percentages. Comma-separated values may be
1083 specified for reads, writes, and trims as described in block‐
1084 size.
1085
1086 norandommap
1087 Normally fio will cover every block of the file when doing ran‐
1088 dom I/O. If this option is given, fio will just get a new random
1089 offset without looking at past I/O history. This means that some
1090 blocks may not be read or written, and that some blocks may be
1091 read/written more than once. If this option is used with verify
1092 and multiple blocksizes (via bsrange), only intact blocks are
1093 verified, i.e., partially-overwritten blocks are ignored. With
1094 an async I/O engine and an I/O depth > 1, it is possible for the
1095 same block to be overwritten, which can cause verification
1096 errors. Either do not use norandommap in this case, or also use
1097 the lfsr random generator.
1098
1099 softrandommap=bool
1100 See norandommap. If fio runs with the random block map enabled
1101 and it fails to allocate the map, if this option is set it will
1102 continue without a random block map. As coverage will not be as
1103 complete as with random maps, this option is disabled by
1104 default.
1105
1106 random_generator=str
1107 Fio supports the following engines for generating I/O offsets
1108 for random I/O:
1109
1110 tausworthe
1111 Strong 2^88 cycle random number generator.
1112
1113 lfsr Linear feedback shift register generator.
1114
1115 tausworthe64
1116 Strong 64-bit 2^258 cycle random number generator.
1117
1118 tausworthe is a strong random number generator, but it requires
1119 tracking on the side if we want to ensure that blocks are only
1120 read or written once. lfsr guarantees that we never generate the
1121 same offset twice, and it's also less computationally expensive.
1122 It's not a true random generator, however, though for I/O pur‐
1123 poses it's typically good enough. lfsr only works with single
1124 block sizes, not with workloads that use multiple block sizes.
1125 If used with such a workload, fio may read or write some blocks
1126 multiple times. The default value is tausworthe, unless the
1127 required space exceeds 2^32 blocks. If it does, then taus‐
1128 worthe64 is selected automatically.
1129
1130 Block size
1131 blocksize=int[,int][,int], bs=int[,int][,int]
1132 The block size in bytes used for I/O units. Default: 4096. A
1133 single value applies to reads, writes, and trims. Comma-sepa‐
1134 rated values may be specified for reads, writes, and trims. A
1135 value not terminated in a comma applies to subsequent types.
1136 Examples:
1137
1138 bs=256k means 256k for reads, writes and trims.
1139 bs=8k,32k means 8k for reads, 32k for writes and
1140 trims.
1141 bs=8k,32k, means 8k for reads, 32k for writes, and
1142 default for trims.
1143 bs=,8k means default for reads, 8k for writes and
1144 trims.
1145 bs=,8k, means default for reads, 8k for writes,
1146 and default for trims.
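The comma rules in the examples above can be captured in a short Python sketch. It is an illustrative reading of those five examples, not fio's parser. The names parse_size() and parse_bs() are hypothetical, and the `k', `m' and `g' suffixes are assumed to be powers of 1024.

```python
DEFAULT_BS = 4096
SUFFIX = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert '8k' style values to bytes (suffixes assumed to be 1024-based)."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIX:
        return int(text[:-1]) * SUFFIX[text[-1]]
    return int(text)


def parse_bs(value, default=DEFAULT_BS):
    """Return (read, write, trim) block sizes in bytes."""
    parts = value.split(",")
    if len(parts) > 3:
        raise ValueError("at most three comma-separated values")
    sizes = []
    carry = default
    for i in range(3):
        if i < len(parts):
            # An empty field selects the default for that direction.
            carry = parse_size(parts[i]) if parts[i] else default
        # Past the last field, the last value carries over, so a value
        # not terminated by a comma applies to the subsequent types.
        sizes.append(carry)
    return tuple(sizes)


if __name__ == "__main__":
    for example in ("256k", "8k,32k", "8k,32k,", ",8k", ",8k,"):
        print("bs=%-8s -> %s" % (example, parse_bs(example)))
```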
1147
1148 blocksize_range=irange[,irange][,irange],
1149 bsrange=irange[,irange][,irange]
1150 A range of block sizes in bytes for I/O units. The issued I/O
1151 unit will always be a multiple of the minimum size, unless
1152 blocksize_unaligned is set. Comma-separated ranges may be spec‐
1153 ified for reads, writes, and trims as described in blocksize.
1154 Example:
1155
1156 bsrange=1k-4k,2k-8k
1157
1158 bssplit=str[,str][,str]
1159 Sometimes you want even finer grained control of the block sizes
1160 issued, not just an even split between them. This option allows
1161 you to weight various block sizes, so that you are able to
1162 define a specific amount of block sizes issued. The format for
1163 this option is:
1164
1165 bssplit=blocksize/percentage:blocksize/percentage
1166
1167 for as many block sizes as needed. So if you want to define a
1168 workload that has 50% 64k blocks, 10% 4k blocks, and 40% 32k
1169 blocks, you would write:
1170
1171 bssplit=4k/10:64k/50:32k/40
1172
1173 Ordering does not matter. If the percentage is left blank, fio
1174 will fill in the remaining values evenly. So a bssplit option
1175 like this one:
1176
1177 bssplit=4k/50:1k/:32k/
1178
1179 would have 50% 4k I/Os, and 25% each of 1k and 32k I/Os. The
1180 percentages always add up to 100; if bssplit is given a range
1181 that adds up to more, it will error out.
1182
1183 Comma-separated values may be specified for reads, writes, and
1184 trims as described in blocksize.
1185
1186 If you want a workload that has 50% 2k reads and 50% 4k reads,
1187 while having 90% 4k writes and 10% 8k writes, you would specify:
1188
1189 bssplit=2k/50:4k/50,4k/90:8k/10
1190
1191 Fio supports defining up to 64 different weights for each data
1192 direction.
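The rule for blank percentages can be illustrated with a small Python sketch. It handles one data direction only (no comma-separated sets) and is a model of the description above, not fio's parsing code. The function name parse_bssplit() is an assumption for the example.

```python
def parse_bssplit(spec):
    """Return {blocksize: percentage}, filling blank percentages evenly."""
    entries = []
    for item in spec.split(":"):
        size, _, pct = item.partition("/")
        entries.append((size, int(pct) if pct else None))
    given = sum(pct for _, pct in entries if pct is not None)
    if given > 100:
        raise ValueError("percentages add up to more than 100")
    blanks = sum(1 for _, pct in entries if pct is None)
    # The remainder is shared equally between entries left blank.
    share = (100 - given) / blanks if blanks else 0
    return {size: (share if pct is None else pct) for size, pct in entries}


if __name__ == "__main__":
    print(parse_bssplit("4k/10:64k/50:32k/40"))
    print(parse_bssplit("4k/50:1k/:32k/"))
```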
1193
1194 blocksize_unaligned, bs_unaligned
1195 If set, fio will issue I/O units with any size within block‐
1196 size_range, not just multiples of the minimum size. This typi‐
1197 cally won't work with direct I/O, as that normally requires sec‐
1198 tor alignment.
1199
1200 bs_is_seq_rand=bool
1201 If this option is set, fio will use the normal read,write block‐
1202 size settings as sequential,random blocksize settings instead.
1203 Any random read or write will use the WRITE blocksize settings,
1204 and any sequential read or write will use the READ blocksize
1205 settings.
1206
1207 blockalign=int[,int][,int], ba=int[,int][,int]
1208 Boundary to which fio will align random I/O units. Default:
1209 blocksize. Minimum alignment is typically 512b for using direct
1210 I/O, though it usually depends on the hardware block size. This
1211 option is mutually exclusive with using a random map for files,
1212 so it will turn off that option. Comma-separated values may be
1213 specified for reads, writes, and trims as described in block‐
1214 size.
1215
1216 Buffers and memory
1217 zero_buffers
1218 Initialize buffers with all zeros. Default: fill buffers with
1219 random data.
1220
1221 refill_buffers
1222 If this option is given, fio will refill the I/O buffers on
1223 every submit. The default is to only fill it at init time and
1224 reuse that data. Only makes sense if zero_buffers isn't speci‐
1225 fied, naturally. If data verification is enabled, refill_buffers
1226 is also automatically enabled.
1227
1228 scramble_buffers=bool
1229 If refill_buffers is too costly and the target is using data
1230 deduplication, then setting this option will slightly modify the
1231 I/O buffer contents to defeat normal de-dupe attempts. This is
1232 not enough to defeat more clever block compression attempts, but
1233 it will stop naive dedupe of blocks. Default: true.
1234
1235 buffer_compress_percentage=int
1236 If this is set, then fio will attempt to provide I/O buffer con‐
1237 tent (on WRITEs) that compresses to the specified level. Fio
1238 does this by providing a mix of random data followed by fixed
1239 pattern data. The fixed pattern is either zeros, or the pattern
1240 specified by buffer_pattern. If the buffer_pattern option is
1241 used, it might skew the compression ratio slightly. Setting buf‐
1242 fer_compress_percentage to a value other than 100 will also
1243 enable refill_buffers in order to reduce the likelihood that
1244 adjacent blocks are so similar that they over compress when seen
1245 together. See buffer_compress_chunk for how to set a finer or
1246 coarser granularity of the random/fixed data regions. Defaults
1247 to unset i.e., buffer data will not adhere to any compression
1248 level.
1249
1250 buffer_compress_chunk=int
1251 This setting allows fio to manage how big the random/fixed data
1252 region is when using buffer_compress_percentage. When buf‐
1253 fer_compress_chunk is set to some non-zero value smaller than
1254 the block size, fio can repeat the random/fixed region throughout
1255 the I/O buffer at the specified interval (which is particularly
1256 useful when bigger block sizes are used for a job). When set to
1257 0, fio will use a chunk size that matches the block size result‐
1258 ing in a single random/fixed region within the I/O buffer.
1259 Defaults to 512. When the unit is omitted, the value is inter‐
1260 preted in bytes.
1261
1262 buffer_pattern=str
1263 If set, fio will fill the I/O buffers with this pattern or with
1264 the contents of a file. If not set, the contents of I/O buffers
1265 are defined by the other options related to buffer contents. The
1266 setting can be any pattern of bytes, and can be prefixed with 0x
1267 for hex values. It may also be a string, where the string must
1268 then be wrapped with "". Or it may also be a filename, where the
1269 filename must be wrapped with '' in which case the file is
1270 opened and read. Note that not all the file contents will be
1271 read if that would cause the buffers to overflow. So, for exam‐
1272 ple:
1273
1274 buffer_pattern='filename'
1275 or:
1276 buffer_pattern="abcd"
1277 or:
1278 buffer_pattern=-12
1279 or:
1280 buffer_pattern=0xdeadface
1281
1282 Also you can combine everything together in any order:
1283
1284 buffer_pattern=0xdeadface"abcd"-12'filename'
1285
1286 dedupe_percentage=int
1287 If set, fio will generate this percentage of identical buffers
1288 when writing. These buffers will be naturally dedupable. The
1289 contents of the buffers depend on what other buffer compression
1290 settings have been set. It's possible to have the individual
1291 buffers either fully compressible, or not at all -- this option
1292 only controls the distribution of unique buffers. Setting this
1293 option will also enable refill_buffers to prevent every buffer
1294 being identical.
1295
1296 invalidate=bool
1297 Invalidate the buffer/page cache parts of the files to be used
1298 prior to starting I/O if the platform and file type support it.
1299 Defaults to true. This will be ignored if pre_read is also
1300 specified for the same job.
1301
1302 sync=bool
1303 Use synchronous I/O for buffered writes. For the majority of I/O
1304 engines, this means using O_SYNC. Default: false.
1305
1306 iomem=str, mem=str
1307 Fio can use various types of memory as the I/O unit buffer. The
1308 allowed values are:
1309
1310 malloc Use memory from malloc(3) as the buffers. Default
1311 memory type.
1312
1313 shm Use shared memory as the buffers. Allocated
1314 through shmget(2).
1315
1316 shmhuge
1317 Same as shm, but use huge pages as backing.
1318
1319 mmap Use mmap(2) to allocate buffers. May either be
1320 anonymous memory, or can be file backed if a file‐
1321 name is given after the option. The format is
1322 `mem=mmap:/path/to/file'.
1323
1324 mmaphuge
1325 Use a memory mapped huge file as the buffer back‐
1326 ing. Append filename after mmaphuge, ala `mem=mma‐
1327 phuge:/hugetlbfs/file'.
1328
1329 mmapshared
1330 Same as mmap, but use a MAP_SHARED mapping.
1331
1332 cudamalloc
1333 Use GPU memory as the buffers for GPUDirect RDMA
1334 benchmark. The ioengine must be rdma.
1335
1336 The area allocated is a function of the maximum allowed bs size
1337 for the job, multiplied by the I/O depth given. Note that for
1338 shmhuge and mmaphuge to work, the system must have free huge
1339 pages allocated. This can normally be checked and set by read‐
1340 ing/writing `/proc/sys/vm/nr_hugepages' on a Linux system. Fio
1341 assumes a huge page is 4MiB in size. So to calculate the number
1342 of huge pages you need for a given job file, add up the I/O
1343 depth of all jobs (normally one unless iodepth is used) and mul‐
1344 tiply by the maximum bs set. Then divide that number by the huge
1345 page size. You can see the size of the huge pages in `/proc/mem‐
1346 info'. If no huge pages are allocated by having a non-zero num‐
1347 ber in `nr_hugepages', using mmaphuge or shmhuge will fail. Also
1348 see hugepage-size.
1349
1350 mmaphuge also needs to have hugetlbfs mounted and the file loca‐
1351 tion should point there. So if it's mounted in `/huge', you
1352 would use `mem=mmaphuge:/huge/somefile'.
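The huge page estimate described above for shmhuge and mmaphuge (sum of I/O depths, times the maximum bs, divided by the huge page size) can be written out as a short calculation. This is a sketch of that arithmetic only. The function name huge_pages_needed() is hypothetical, the result is rounded up to whole pages, and the 4MiB default mirrors the size the text says fio assumes.

```python
MIB = 1024 ** 2


def huge_pages_needed(iodepths, max_bs, hugepage_size=4 * MIB):
    """Estimate the number of huge pages for a job file.

    iodepths: the I/O depth of every job (1 where iodepth is not set).
    max_bs:   the largest block size used, in bytes.
    """
    total_bytes = sum(iodepths) * max_bs
    # Round up: a partially used huge page still has to be allocated.
    return -(-total_bytes // hugepage_size)


if __name__ == "__main__":
    # Two jobs with iodepth=32 each and a maximum bs of 1MiB.
    print(huge_pages_needed([32, 32], 1 * MIB))
```

The resulting number is the value that would need to be available in `/proc/sys/vm/nr_hugepages' before starting the job.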
1353
1354 iomem_align=int, mem_align=int
1355 This indicates the memory alignment of the I/O memory buffers.
1356 Note that the given alignment is applied to the first I/O unit
1357 buffer; if using iodepth, the alignment of the following buffers
1358 is given by the bs used. In other words, if using a bs that is
1359 a multiple of the page size in the system, all buffers will be
1360 aligned to this value. If using a bs that is not page aligned,
1361 the alignment of subsequent I/O memory buffers is the sum of the
1362 iomem_align and bs used.
1363
1364 hugepage-size=int
1365 Defines the size of a huge page. Must at least be equal to the
1366 system setting, see `/proc/meminfo'. Defaults to 4MiB. Should
1367 probably always be a multiple of megabytes, so using
1368 `hugepage-size=Xm' is the preferred way to set this to avoid
1369 setting a non-pow-2 bad value.
1370
1371 lockmem=int
1372 Pin the specified amount of memory with mlock(2). Can be used to
1373 simulate a smaller amount of memory. The amount specified is per
1374 worker.
1375
1376 I/O size
1377 size=int
1378 The total size of file I/O for each thread of this job. Fio will
1379 run until this many bytes have been transferred, unless runtime
1380 is limited by other options (such as runtime, for instance, or
1381 increased/decreased by io_size). Fio will divide this size
1382 between the available files determined by options such as
1383 nrfiles, filename, unless filesize is specified by the job. If
1384 the result of division happens to be 0, the size is set to the
1385 physical size of the given files or devices if they exist. If
1386 this option is not specified, fio will use the full size of the
1387 given files or devices. If the files do not exist, size must be
1388 given. It is also possible to give size as a percentage between
1389 1 and 100. If `size=20%' is given, fio will use 20% of the full
1390 size of the given files or devices. Can be combined with offset
1391 to constrain the start and end range that I/O will be done
1392 within.
1393
1394 io_size=int, io_limit=int
1395 Normally fio operates within the region set by size, which means
1396 that the size option sets both the region and size of I/O to be
1397 performed. Sometimes that is not what you want. With this
1398 option, it is possible to define just the amount of I/O that fio
1399 should do. For instance, if size is set to 20GiB and io_size is
1400 set to 5GiB, fio will perform I/O within the first 20GiB but
1401 exit when 5GiB have been done. The opposite is also possible --
1402 if size is set to 20GiB, and io_size is set to 40GiB, then fio
1403 will do 40GiB of I/O within the 0..20GiB region.
1404
1405 filesize=irange(int)
1406 Individual file sizes. May be a range, in which case fio will
1407 select sizes for files at random within the given range and lim‐
1408 ited to size in total (if that is given). If not given, each
1409 created file is the same size. This option overrides size in
1410 terms of file size, which means this value is used as a fixed
1411 size or possible range of each file.
1412
1413 file_append=bool
1414 Perform I/O after the end of the file. Normally fio will operate
1415 within the size of a file. If this option is set, then fio will
1416 append to the file instead. This has identical behavior to set‐
1417 ting offset to the size of a file. This option is ignored on
1418 non-regular files.
1419
1420 fill_device=bool, fill_fs=bool
1421 Sets size to something really large and waits for ENOSPC (no
1422 space left on device) as the terminating condition. Only makes
1423 sense with sequential write. For a read workload, the mount
1424 point will be filled first then I/O started on the result. This
1425 option doesn't make sense if operating on a raw device node,
1426 since the size of that is already known by the file system.
1427 Additionally, writing beyond end-of-device will not return
1428 ENOSPC there.
1429
1430 I/O engine
1431 ioengine=str
1432 Defines how the job issues I/O to the file. The following types
1433 are defined:
1434
1435 sync Basic read(2) or write(2) I/O. lseek(2) is used to
1436 position the I/O location. See fsync and fdata‐
1437 sync for syncing write I/Os.
1438
1439 psync Basic pread(2) or pwrite(2) I/O. Default on all
1440 supported operating systems except for Windows.
1441
1442 vsync Basic readv(2) or writev(2) I/O. Will emulate
1443 queuing by coalescing adjacent I/Os into a single
1444 submission.
1445
1446 pvsync Basic preadv(2) or pwritev(2) I/O.
1447
1448 pvsync2
1449 Basic preadv2(2) or pwritev2(2) I/O.
1450
1451 libaio Linux native asynchronous I/O. Note that Linux may
1452 only support queued behavior with non-buffered I/O
1453 (set `direct=1' or `buffered=0'). This engine
1454 defines engine specific options.
1455
1456 posixaio
1457 POSIX asynchronous I/O using aio_read(3) and
1458 aio_write(3).
1459
1460 solarisaio
1461 Solaris native asynchronous I/O.
1462
1463 windowsaio
1464 Windows native asynchronous I/O. Default on Win‐
1465 dows.
1466
1467 mmap File is memory mapped with mmap(2) and data copied
1468 to/from using memcpy(3).
1469
1470 splice splice(2) is used to transfer the data and
1471 vmsplice(2) to transfer data from user space to
1472 the kernel.
1473
1474 sg SCSI generic sg v3 I/O. May either be synchronous
1475 using the SG_IO ioctl, or if the target is an sg
1476 character device we use read(2) and write(2) for
1477 asynchronous I/O. Requires filename option to
1478 specify either block or character devices. This
1479 engine supports trim operations. The sg engine
1480 includes engine specific options.
1481
1482 null Doesn't transfer any data, just pretends to. This
1483 is mainly used to exercise fio itself and for
1484 debugging/testing purposes.
1485
1486 net Transfer over the network to given `host:port'.
1487 Depending on the protocol used, the hostname,
1488 port, listen and filename options are used to
1489 specify what sort of connection to make, while the
1490 protocol option determines which protocol will be
1491 used. This engine defines engine specific options.
1492
1493 netsplice
1494 Like net, but uses splice(2) and vmsplice(2) to
1495 map data and send/receive. This engine defines
1496 engine specific options.
1497
1498 cpuio Doesn't transfer any data, but burns CPU cycles
1499 according to the cpuload and cpuchunks options.
1500 Setting cpuload=85 will cause that job to do noth‐
1501 ing but burn 85% of the CPU. In case of SMP
1502 machines, use `numjobs=<nr_of_cpu>' to get desired
1503 CPU usage, as the cpuload only loads a single CPU
1504 at the desired rate. A job never finishes unless
1505 there is at least one non-cpuio job.
1506
1507 guasi The GUASI I/O engine is the Generic Userspace
1508 Asynchronous Syscall Interface approach to async
1509 I/O. See http://www.xmailserver.org/guasi-lib.html
1510 for more info on GUASI.
1511
1512 rdma The RDMA I/O engine supports both RDMA memory
1513 semantics (RDMA_WRITE/RDMA_READ) and channel
1514 semantics (Send/Recv) for the InfiniBand, RoCE and
1515 iWARP protocols. This engine defines engine spe‐
1516 cific options.
1517
1518 falloc I/O engine that does regular fallocate to simulate
1519 data transfer as fio ioengine.
1520
1521 DDIR_READ does fallocate(,mode = FAL‐
1522 LOC_FL_KEEP_SIZE,).
1523 DDIR_WRITE does fallocate(,mode = 0).
1524 DDIR_TRIM does fallocate(,mode = FAL‐
1525 LOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE).
1526
1527 ftruncate
1528 I/O engine that sends ftruncate(2) operations in
1529 response to write (DDIR_WRITE) events. Each ftrun‐
1530 cate issued sets the file's size to the current
1531 block offset. blocksize is ignored.
1532
1533 e4defrag
1534 I/O engine that does regular EXT4_IOC_MOVE_EXT
1535 ioctls to simulate defragment activity in response
1536 to DDIR_WRITE events.
1537
1538 rados I/O engine supporting direct access to Ceph Reli‐
1539 able Autonomic Distributed Object Store (RADOS)
1540 via librados. This ioengine defines engine spe‐
1541 cific options.
1542
1543 rbd I/O engine supporting direct access to Ceph Rados
1544 Block Devices (RBD) via librbd without the need to
1545 use the kernel rbd driver. This ioengine defines
1546 engine specific options.
1547
1548 http I/O engine supporting GET/PUT requests over
1549 HTTP(S) with libcurl to a WebDAV or S3 endpoint.
1550 This ioengine defines engine specific options.
1551
1552 This engine only supports direct IO of iodepth=1;
1553 you need to scale this via numjobs. blocksize
1554 defines the size of the objects to be created.
1555
1556 TRIM is translated to object deletion.
1557
1558 gfapi Using GlusterFS libgfapi sync interface to direct
1559 access to GlusterFS volumes without having to go
1560 through FUSE. This ioengine defines engine spe‐
1561 cific options.
1562
1563 gfapi_async
1564 Using GlusterFS libgfapi async interface to direct
1565 access to GlusterFS volumes without having to go
1566 through FUSE. This ioengine defines engine spe‐
1567 cific options.
1568
1569 libhdfs
1570 Read and write through Hadoop (HDFS). The filename
1571 option is used to specify host,port of the hdfs
1572 name-node to connect. This engine interprets off‐
1573 sets a little differently. In HDFS, files once
1574 created cannot be modified so random writes are
1575 not possible. To imitate this the libhdfs engine
1576 expects a bunch of small files to be created over
1577 HDFS and will randomly pick a file from them based
1578 on the offset generated by fio backend (see the
1579 example job file to create such files, use
1580 `rw=write' option). Please note, it may be neces‐
1581 sary to set environment variables to work with
1582 HDFS/libhdfs properly. Each job uses its own con‐
1583 nection to HDFS.
1584
1585 mtd Read, write and erase an MTD character device
1586 (e.g., `/dev/mtd0'). Discards are treated as
1587 erases. Depending on the underlying device type,
1588 the I/O may have to go in a certain pattern, e.g.,
1589 on NAND, writing sequentially to erase blocks and
1590 discarding before overwriting. The trimwrite mode
1591 works well for this constraint.
1592
1593 pmemblk
1594 Read and write using filesystem DAX to a file on a
1595 filesystem mounted with DAX on a persistent memory
1596 device through the PMDK libpmemblk library.
1597
1598 dev-dax
1599 Read and write using device DAX to a persistent
1600 memory device (e.g., /dev/dax0.0) through the PMDK
1601 libpmem library.
1602
1603 external
1604 Prefix to specify loading an external I/O engine
1605 object file. Append the engine filename, e.g.
1606 `ioengine=external:/tmp/foo.o' to load ioengine
1607 `foo.o' in `/tmp'. The path can be either absolute
1608 or relative. See `engines/skeleton_external.c' in
1609 the fio source for details of writing an external
1610 I/O engine.
1611
1612 filecreate
1613 Simply create the files and do no I/O to them.
1614 You still need to set filesize so that all the
1615 accounting still occurs, but no actual I/O will be
1616 done other than creating the file.
1617
1618 libpmem
1619 Read and write using mmap I/O to a file on a
1620 filesystem mounted with DAX on a persistent memory
1621 device through the PMDK libpmem library.
1622
1623 ime_psync
1624 Synchronous read and write using DDN's Infinite
1625 Memory Engine (IME). This engine is very basic and
1626 issues calls to IME whenever an IO is queued.
1627
1628 ime_psyncv
1629 Synchronous read and write using DDN's Infinite
1630 Memory Engine (IME). This engine uses iovecs and
1631 will try to stack as many IOs as possible (if the
1632 IOs are "contiguous" and the IO depth is not
1633 exceeded) before issuing a call to IME.
1634
1635 ime_aio
1636 Asynchronous read and write using DDN's Infinite
1637 Memory Engine (IME). This engine will try to stack
1638 as many IOs as possible by creating requests for
1639 IME. FIO will then decide when to commit these
1640 requests.
1641
1642 libiscsi
1643 Read and write an iSCSI LUN with libiscsi.
1644
1645 I/O engine specific parameters
1646 In addition, there are some parameters which are only valid when a spe‐
1647 cific ioengine is in use. These are used identically to normal parame‐
1648 ters, with the caveat that when used on the command line, they must
1649 come after the ioengine that defines them is selected.
1650
1651 (libaio)userspace_reap
1652 Normally, with the libaio engine in use, fio will use the
1653 io_getevents(3) system call to reap newly returned events. With
1654 this flag turned on, the AIO ring will be read directly from
1655 user-space to reap events. The reaping mode is only enabled when
1656 polling for a minimum of 0 events (e.g. when `iodepth_batch_com‐
1657 plete=0').
1658
1659 (pvsync2)hipri
1660 Set RWF_HIPRI on I/O, indicating to the kernel that it's of
1661 higher priority than normal.
1662
1663 (pvsync2)hipri_percentage
1664 When hipri is set this determines the probability of a pvsync2
1665 I/O being high priority. The default is 100%.
1666
1667 (cpuio)cpuload=int
1668 Attempt to use the specified percentage of CPU cycles. This is a
1669 mandatory option when using cpuio I/O engine.
1670
1671 (cpuio)cpuchunks=int
1672 Split the load into cycles of the given time. In microseconds.
1673
1674 (cpuio)exit_on_io_done=bool
1675 Detect when I/O threads are done, then exit.
1676
1677 (libhdfs)namenode=str
1678 The hostname or IP address of a HDFS cluster namenode to con‐
1679 tact.
1680
1681 (libhdfs)port
1682 The listening port of the HDFS cluster namenode.
1683
1684 (netsplice,net)port
1685 The TCP or UDP port to bind to or connect to. If this is used
1686 with numjobs to spawn multiple instances of the same job type,
1687 then this will be the starting port number since fio will use a
1688 range of ports.
1689
1690 (rdma)port
1691 The port to use for RDMA-CM communication. This should be the
1692 same value on the client and the server side.
1693
1694 (netsplice,net,rdma)hostname=str
1695 The hostname or IP address to use for TCP, UDP or RDMA-CM based
1696 I/O. If the job is a TCP listener or UDP reader, the hostname
1697 is not used and must be omitted unless it is a valid UDP multi‐
1698 cast address.
1699
1700 (netsplice,net)interface=str
1701 The IP address of the network interface used to send or receive
1702 UDP multicast.
1703
1704 (netsplice,net)ttl=int
1705 Time-to-live value for outgoing UDP multicast packets. Default:
1706 1.
1707
1708 (netsplice,net)nodelay=bool
1709 Set TCP_NODELAY on TCP connections.
1710
1711 (netsplice,net)protocol=str, proto=str
1712 The network protocol to use. Accepted values are:
1713
1714 tcp Transmission control protocol.
1715
1716 tcpv6 Transmission control protocol V6.
1717
1718 udp User datagram protocol.
1719
1720 udpv6 User datagram protocol V6.
1721
1722 unix UNIX domain socket.
1723
1724 When the protocol is TCP or UDP, the port must also be given, as
1725 well as the hostname if the job is a TCP listener or UDP reader.
1726 For unix sockets, the normal filename option should be used and
1727 the port is invalid.
1728
1729 (netsplice,net)listen
1730 For TCP network connections, tell fio to listen for incoming
1731 connections rather than initiating an outgoing connection. The
1732 hostname must be omitted if this option is used.
1733
1734 (netsplice,net)pingpong
1735 Normally a network writer will just continue writing data, and a
1736 network reader will just consume packets. If `pingpong=1' is
1737 set, a writer will send its normal payload to the reader, then
1738 wait for the reader to send the same payload back. This allows
1739 fio to measure network latencies. The submission and completion
1740 latencies then measure local time spent sending or receiving,
1741 and the completion latency measures how long it took for the
1742 other end to receive and send back. For UDP multicast traffic
1743 `pingpong=1' should only be set for a single reader when multi‐
1744 ple readers are listening to the same address.
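
As a rough sketch of how the network options above fit together, the following job file runs a TCP receiver and a sender on the same host with pingpong enabled. The port number, block size, transfer size and the one second startdelay are illustrative assumptions, not recommended values.

```ini
; Illustrative only: port, bs, size and startdelay are assumed values.
[global]
ioengine=net
protocol=tcp
port=8888
bs=4k
size=64m
pingpong=1

; The listener omits hostname, as required for a TCP listener.
[receiver]
listen
rw=read

; The sender connects to the receiver after a short delay.
[sender]
hostname=localhost
startdelay=1
rw=write
```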
1745
1746 (netsplice,net)window_size=int
1747 Set the desired socket buffer size for the connection.
1748
1749 (netsplice,net)mss=int
1750 Set the TCP maximum segment size (TCP_MAXSEG).
1751
1752 (e4defrag)donorname=str
1753 File will be used as a block donor (swap extents between files).
1754
1755 (e4defrag)inplace=int
1756 Configure donor file blocks allocation strategy:
1757
1758 0 Default. Preallocate donor's file on init.
1759
1760 1 Allocate space immediately inside defragment
1761 event, and free right after event.
1762
1763 (rbd,rados)clustername=str
1764 Specifies the name of the Ceph cluster.
1765
1766 (rbd)rbdname=str
1767 Specifies the name of the RBD.
1768
1769 (rbd,rados)pool=str
1770 Specifies the name of the Ceph pool containing RBD or RADOS
1771 data.
1772
1773 (rbd,rados)clientname=str
1774 Specifies the username (without the 'client.' prefix) used to
1775 access the Ceph cluster. If the clustername is specified, the
1776 clientname shall be the full *type.id* string. If no type. pre‐
1777 fix is given, fio will add 'client.' by default.
1778
1779 (rbd,rados)busy_poll=bool
1780 Poll store instead of waiting for completion. Usually this pro‐
1781 vides better throughput at the cost of higher (up to 100%) CPU uti‐
1782 lization.
1783
1784 (http)http_host=str
1785 Hostname to connect to. For S3, this could be the bucket name.
1786 Default is localhost.
1787
1788 (http)http_user=str
1789 Username for HTTP authentication.
1790
1791 (http)http_pass=str
1792 Password for HTTP authentication.
1793
1794 (http)https=str
1795 Whether to use HTTPS instead of plain HTTP. on enables HTTPS;
1796 insecure will enable HTTPS, but disable SSL peer verification
1797 (use with caution!). Default is off.
1798
1799 (http)http_mode=str
1800 Which HTTP access mode to use: webdav, swift, or s3. Default is
1801 webdav.
1802
1803 (http)http_s3_region=str
1804 The S3 region/zone to include in the request. Default is us-
1805 east-1.
1806
1807 (http)http_s3_key=str
1808 The S3 secret key.
1809
1810 (http)http_s3_keyid=str
1811 The S3 key/access id.
1812
1813 (http)http_swift_auth_token=str
1814 The Swift auth token. See the example configuration file on how
1815 to retrieve this.
1816
1817 (http)http_verbose=int
1818 Enable verbose requests from libcurl. Useful for debugging. 1
1819 turns on verbose logging from libcurl, 2 additionally enables
1820 HTTP IO tracing. Default is 0.
1821
1822 (mtd)skip_bad=bool
1823 Skip operations against known bad blocks.
1824
1825 (libhdfs)hdfsdirectory
1826 libhdfs will create chunks in this HDFS directory.
1827
1828 (libhdfs)chunk_size
1829 The size of the chunk to use for each file.
1830
1831 (rdma)verb=str
1832 The RDMA verb to use on this side of the RDMA ioengine connec‐
1833 tion. Valid values are write, read, send and recv. These corre‐
1834 spond to the equivalent RDMA verbs (e.g. write = rdma_write
1835 etc.). Note that this only needs to be specified on the client
1836 side of the connection. See the examples folder.
1837
1838 (rdma)bindname=str
1839 The name to use to bind the local RDMA-CM connection to a local
1840 RDMA device. This could be a hostname or an IPv4 or IPv6
1841 address. On the server side this will be passed into the
1842 rdma_bind_addr() function and on the client side it will be used
1843 in the rdma_resolve_addr() function. This can be useful when mul‐
1844 tiple paths exist between the client and the server or in cer‐
1845 tain loopback configurations.
1846
1847 (sg)readfua=bool
1848 With readfua option set to 1, read operations include the force
1849 unit access (fua) flag. Default: 0.
1850
1851 (sg)writefua=bool
1852 With writefua option set to 1, write operations include the
1853 force unit access (fua) flag. Default: 0.
1854
1855 (sg)sg_write_mode=str
1856 Specify the type of write commands to issue. This option can
1857 take three values:
1858
1859 write (default)
1860 Write opcodes are issued as usual
1861
1862 verify Issue WRITE AND VERIFY commands. The BYTCHK bit is
1863 set to 0. This directs the device to carry out a
1864 medium verification with no data comparison. The
1865 writefua option is ignored with this selection.
1866
1867 same Issue WRITE SAME commands. This transfers a single
1868 block to the device and writes this same block of
1869 data to a contiguous sequence of LBAs beginning at
1870 the specified offset. fio's block size parameter
1871 specifies the amount of data written with each
1872 command. However, the amount of data actually
1873 transferred to the device is equal to the device's
1874 block (sector) size. For a device with 512 byte
1875 sectors, blocksize=8k will write 16 sectors with
1876 each command. fio will still generate 8k of data
1877 for each command but only the first 512 bytes will
1878 be used and transferred to the device. The write‐
1879 fua option is ignored with this selection.
1880
1881
1882 I/O depth
1883 iodepth=int
1884 Number of I/O units to keep in flight against the file. Note
1885 that increasing iodepth beyond 1 will not affect synchronous
1886 ioengines (except for small degrees when verify_async is in
1887 use). Even async engines may impose OS restrictions causing the
1888 desired depth not to be achieved. This may happen on Linux when
1889 using libaio and not setting `direct=1', since buffered I/O is
1890 not async on that OS. Keep an eye on the I/O depth distribution
1891 in the fio output to verify that the achieved depth is as
1892 expected. Default: 1.
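
A minimal sketch of a job that can actually reach the requested depth on Linux, assuming the libaio engine is available: it pairs an asynchronous engine with `direct=1', as the text above advises. The job name, block size and file size are illustrative assumptions.

```ini
; Illustrative only: job name, bs and size are assumed values.
[randread-qd16]
ioengine=libaio
; Without direct=1, buffered I/O on Linux is not async and the
; achieved depth may stay well below 16.
direct=1
rw=randread
bs=4k
size=1g
iodepth=16
```

After a run, the I/O depth distribution in the output shows whether the depth of 16 was achieved.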
1893
1894 iodepth_batch_submit=int, iodepth_batch=int
1895 This defines how many pieces of I/O to submit at once. It
1896 defaults to 1 which means that we submit each I/O as soon as it
1897 is available, but can be raised to submit bigger batches of I/O
1898 at a time. If it is set to 0 the iodepth value will be used.
1899
1900 iodepth_batch_complete_min=int, iodepth_batch_complete=int
1901 This defines how many pieces of I/O to retrieve at once. It
1902 defaults to 1 which means that we'll ask for a minimum of 1 I/O
1903 in the retrieval process from the kernel. The I/O retrieval will
1904 go on until we hit the limit set by iodepth_low. If this vari‐
1905 able is set to 0, then fio will always check for completed
1906 events before queuing more I/O. This helps reduce I/O latency,
1907 at the cost of more retrieval system calls.
1908
1909 iodepth_batch_complete_max=int
1910 This defines the maximum pieces of I/O to retrieve at once. This
1911 variable should be used along with iodepth_batch_com‐
1912 plete_min=int variable, specifying the range of min and max
1913 amount of I/O which should be retrieved. By default it is equal
1914 to iodepth_batch_complete_min value. Example #1:
1915
1916 iodepth_batch_complete_min=1
1917 iodepth_batch_complete_max=<iodepth>
1918
1919 which means that we will retrieve at least 1 I/O and up to the
1920 whole submitted queue depth. If no I/O has been completed
1921 yet, we will wait. Example #2:
1922
1923 iodepth_batch_complete_min=0
1924 iodepth_batch_complete_max=<iodepth>
1925
1926 which means that we can retrieve up to the whole submitted queue
1927 depth, but if no I/O has been completed yet, we will NOT
1928 wait and immediately exit the system call. In this example we
1929 simply do polling.
1930
1931 iodepth_low=int
1932 The low water mark indicating when to start filling the queue
1933 again. Defaults to the same as iodepth, meaning that fio will
1934 attempt to keep the queue full at all times. If iodepth is set
1935 to e.g. 16 and iodepth_low is set to 4, then after fio has
1936 filled the queue of 16 requests, it will let the depth drain
1937 down to 4 before starting to fill it again.
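
The batching and watermark options above can be combined in one job. The following is a sketch under assumed values (job name, block size, file size and a batch size of 8), reusing the iodepth of 16 and iodepth_low of 4 from the description.

```ini
; Illustrative only: job name, bs, size and batch sizes are assumed.
[batched]
ioengine=libaio
direct=1
rw=randwrite
bs=4k
size=1g
iodepth=16
; Submit I/O in batches of up to 8.
iodepth_batch_submit=8
; Reap at least 1 and at most 16 completions per retrieval.
iodepth_batch_complete_min=1
iodepth_batch_complete_max=16
; Let the queue drain to 4 before refilling it.
iodepth_low=4
```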
1938
1939 serialize_overlap=bool
1940 Serialize in-flight I/Os that might otherwise cause or suffer
1941 from data races. When two or more I/Os are submitted simultane‐
1942 ously, there is no guarantee that the I/Os will be processed or
1943 completed in the submitted order. Further, if two or more of
1944 those I/Os are writes, any overlapping region between them can
1945 become indeterminate/undefined on certain storage. These issues
1946 can cause verification to fail erratically when at least one of
1947 the racing I/Os is changing data and the overlapping region has
1948 a non-zero size. Setting serialize_overlap tells fio to avoid
1949 provoking this behavior by explicitly serializing in-flight I/Os
1950 that have a non-zero overlap. Note that setting this option can
1951 reduce both performance and the iodepth achieved.
1952
1953 This option only applies to I/Os issued for a single job except
1954 when it is enabled along with io_submit_mode=offload. In offload
1955 mode, fio will check for overlap among all I/Os submitted by
1956 offload jobs with serialize_overlap enabled.
1957
1958 Default: false.
1959
1960 io_submit_mode=str
1961 This option controls how fio submits the I/O to the I/O engine.
1962 The default is `inline', which means that the fio job threads
1963 submit and reap I/O directly. If set to `offload', the job
1964 threads will offload I/O submission to a dedicated pool of I/O
1965 threads. This requires some coordination and thus has a bit of
1966 extra overhead, especially for lower queue depth I/O where it
1967 can increase latencies. The benefit is that fio can manage sub‐
1968 mission rates independently of the device completion rates. This
1969 avoids skewed latency reporting if I/O gets backed up on the
1970 device side (the coordinated omission problem).
1971
1972 I/O rate
1973 thinktime=time
1974 Stall the job for the specified period of time after an I/O has
1975 completed before issuing the next. May be used to simulate pro‐
1976 cessing being done by an application. When the unit is omitted,
1977 the value is interpreted in microseconds. See thinktime_blocks
1978 and thinktime_spin.
1979
1980 thinktime_spin=time
1981 Only valid if thinktime is set - pretend to spend CPU time doing
1982 something with the data received, before falling back to sleep‐
1983 ing for the rest of the period specified by thinktime. When the
1984 unit is omitted, the value is interpreted in microseconds.
1985
1986 thinktime_blocks=int
1987 Only valid if thinktime is set - control how many blocks to
1988 issue, before waiting thinktime usecs. If not set, defaults to 1
1989 which will make fio wait thinktime usecs after every block. This
1990 effectively makes any queue depth setting redundant, since no
1991 more than 1 I/O will be queued before we have to complete it and
1992 do our thinktime. In other words, this setting effectively caps
1993 the queue depth if the latter is larger.
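
A small sketch of the thinktime options, with assumed values: the job issues 8 blocks, then stalls for 1000 microseconds. Following the description above, the effective queue depth is capped at 8 even though iodepth is set to 32.

```ini
; Illustrative only: job name, bs, size and timings are assumed.
[bursty]
ioengine=libaio
direct=1
rw=randread
bs=4k
size=1g
iodepth=32
; No unit given, so the value is in microseconds.
thinktime=1000
thinktime_blocks=8
```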
1994
1995 rate=int[,int][,int]
1996 Cap the bandwidth used by this job. The number is in bytes/sec,
1997 the normal suffix rules apply. Comma-separated values may be
1998 specified for reads, writes, and trims as described in block‐
1999 size.
2000
2001 For example, using `rate=1m,500k' would limit reads to 1MiB/sec
2002 and writes to 500KiB/sec. Capping only reads or writes can be
2003 done with `rate=,500k' or `rate=500k,' where the former will
2004 only limit writes (to 500KiB/sec) and the latter will only limit
2005 reads.
2006
2007 rate_min=int[,int][,int]
2008 Tell fio to do whatever it can to maintain at least this band‐
2009 width. Failing to meet this requirement will cause the job to
2010 exit. Comma-separated values may be specified for reads, writes,
2011 and trims as described in blocksize.
2012
2013 rate_iops=int[,int][,int]
2014 Cap the bandwidth to this number of IOPS. Basically the same as
2015 rate, just specified independently of bandwidth. If the job is
2016 given a block size range instead of a fixed value, the smallest
2017 block size is used as the metric. Comma-separated values may be
2018 specified for reads, writes, and trims as described in block‐
2019 size.
2020
2021 rate_iops_min=int[,int][,int]
2022 If fio doesn't meet this rate of I/O, it will cause the job to
2023 exit. Comma-separated values may be specified for reads,
2024 writes, and trims as described in blocksize.
2025
2026 rate_process=str
2027 This option controls how fio manages rated I/O submissions. The
2028 default is `linear', which submits I/O in a linear fashion with
2029 fixed delays between I/Os that get adjusted based on I/O com‐
2030 pletion rates. If this is set to `poisson', fio will submit I/O
2031 based on a more real world random request flow, known as the
2032 Poisson process (https://en.wikipedia.org/wiki/Pois‐
2033 son_point_process). The lambda will be 10^6 / IOPS for the given
2034 workload.
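
As an illustration of IOPS-based rate limiting with a Poisson arrival process, the sketch below caps reads at 1000 IOPS. By the formula above, lambda is then 10^6 / 1000 = 1000. The job name, block size and file size are assumptions.

```ini
; Illustrative only: job name, bs, size and the IOPS cap are assumed.
[poisson-reads]
rw=randread
bs=4k
size=1g
rate_iops=1000
rate_process=poisson
```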
2035
2036 rate_ignore_thinktime=bool
2037 By default, fio will attempt to catch up to the specified rate
2038 setting, if any kind of thinktime setting was used. If this
2039 option is set, then fio will ignore the thinktime and continue
2040 doing IO at the specified rate, instead of entering a catch-up
2041 mode after thinktime is done.
2042
2043 I/O latency
2044 latency_target=time
2045 If set, fio will attempt to find the max performance point that
2046 the given workload will run at while maintaining a latency below
2047 this target. When the unit is omitted, the value is interpreted
2048 in microseconds. See latency_window and latency_percentile.
2049
2050 latency_window=time
2051 Used with latency_target to specify the sample window that the
2052 job is run at varying queue depths to test the performance. When
2053 the unit is omitted, the value is interpreted in microseconds.
2054
2055 latency_percentile=float
2056 The percentage of I/Os that must fall within the criteria speci‐
2057 fied by latency_target and latency_window. If not set, this
2058 defaults to 100.0, meaning that all I/Os must be equal to or below
2059 the value set by latency_target.
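
The three latency options are typically set together. The following sketch assumes a 500 millisecond target evaluated over 5 second windows, with 99.9% of I/Os required to meet it. All values, including the iodepth of 128, are illustrative.

```ini
; Illustrative only: all values below are assumed.
[latency-profile]
ioengine=libaio
direct=1
rw=randread
bs=4k
size=1g
iodepth=128
; Maximum acceptable latency: 500000 usec (500 msec).
latency_target=500000
; Evaluate over a 5000000 usec (5 sec) window.
latency_window=5000000
; 99.9% of I/Os must be at or below the target.
latency_percentile=99.9
```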
2060
2061 max_latency=time
2062 If set, fio will exit the job with an ETIMEDOUT error if it
2063 exceeds this maximum latency. When the unit is omitted, the
2064 value is interpreted in microseconds.
2065
2066 rate_cycle=int
2067 Average bandwidth for rate and rate_min over this number of mil‐
2068 liseconds. Defaults to 1000.
2069
2070 I/O replay
2071 write_iolog=str
2072 Write the issued I/O patterns to the specified file. See
2073 read_iolog. Specify a separate file for each job, otherwise the
2074 iologs will be interspersed and the file may be corrupt.
2075
2076 read_iolog=str
2077 Open an iolog with the specified filename and replay the I/O
2078 patterns it contains. This can be used to store a workload and
2079 replay it sometime later. The iolog given may also be a blktrace
2080 binary file, which allows fio to replay a workload captured by
2081 blktrace. See blktrace(8) for how to capture such logging data.
2082 For blktrace replay, the file needs to be turned into a blkparse
2083 binary data file first (`blkparse <device> -o /dev/null -d
2084 file_for_fio.bin'). You can specify a number of files by sepa‐
2085 rating the names with a ':' character. See the filename option
2086 for information on how to escape ':' and '\' characters within
2087 the file names. These files will be sequentially assigned to job
2088 clones created by numjobs.
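
A sketch of a record-then-replay workflow using two separate job files, shown here in one listing. The file names record.fio and replay.fio, the log path and the workload parameters are hypothetical.

```ini
; --- record.fio (hypothetical file name) ---
; Run a workload and log the issued I/O pattern.
[record]
rw=randrw
bs=4k
size=256m
write_iolog=/tmp/workload.iolog

; --- replay.fio (hypothetical file name) ---
; Replay the pattern captured by the job above in a later run.
[replay]
read_iolog=/tmp/workload.iolog
```

Keeping the two jobs in separate invocations ensures the log is complete before it is read.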
2089
2090 read_iolog_chunked=bool
2091 Determines how the iolog is read. If false (the default), the
2092 entire read_iolog is read at once. If true, input from the
2093 iolog is read gradually. Useful when the iolog is very large
2094 or is being generated.
2095
2096 merge_blktrace_file=str
2097 When specified, rather than replaying the logs passed to
2098 read_iolog, the logs go through a merge phase which aggregates
2099 them into a single blktrace. The resulting file is then passed
2100 on as the read_iolog parameter. The intention here is to make
2101 the order of events consistent. This limits the influence of the
2102 scheduler compared to replaying multiple blktraces via concur‐
2103 rent jobs.
2104
2105 merge_blktrace_scalars=float_list
2106 This is a percentage based option that is index paired with the
2107 list of files passed to read_iolog. When merging is performed,
2108 scale the time of each event by the corresponding amount. For
2109 example, `--merge_blktrace_scalars="50:100"' runs the first
2110 trace in halftime and the second trace in realtime. This knob is
2111 separately tunable from replay_time_scale which scales the trace
2112 during runtime and will not change the output of the merge
2113 unlike this option.
2114
2115 merge_blktrace_iters=float_list
2116 This is a whole number option that is index paired with the list
2117 of files passed to read_iolog. When merging is performed, run
2118 each trace for the specified number of iterations. For example,
2119 `--merge_blktrace_iters="2:1"' runs the first trace for two
2120 iterations and the second trace for one iteration.
2121
2122 replay_no_stall=bool
2123 When replaying I/O with read_iolog the default behavior is to
2124 attempt to respect the timestamps within the log and replay them
2125 with the appropriate delay between IOPS. By setting this vari‐
2126 able fio will not respect the timestamps and attempt to replay
2127 them as fast as possible while still respecting ordering. The
2128 result is the same I/O pattern to a given device, but different
2129 timings.
2130
2131 replay_time_scale=int
2132 When replaying I/O with read_iolog, fio will honor the original
2133 timing in the trace. With this option, it's possible to scale
2134 the time. It's a percentage option, if set to 50 it means run at
2135 50% the original IO rate in the trace. If set to 200, run at
2136 twice the original IO rate. Defaults to 100.
2137
2138 replay_redirect=str
2139 While replaying I/O patterns using read_iolog the default behav‐
2140 ior is to replay the IOPS onto the major/minor device that each
2141 IOP was recorded from. This is sometimes undesirable because on
2142 a different machine those major/minor numbers can map to a dif‐
2143 ferent device. Changing hardware on the same system can also
2144 result in a different major/minor mapping. replay_redirect
2145 causes all I/Os to be replayed onto the single specified device
2146 regardless of the device it was recorded from. i.e. `replay_re‐
2147 direct=/dev/sdc' would cause all I/O in the blktrace or iolog to
2148 be replayed onto `/dev/sdc'. This means multiple devices will be
2149 replayed onto a single device, if the trace contains multiple
2150 devices. If you want multiple devices to be replayed concur‐
2151 rently to multiple redirected devices you must blkparse your
2152 trace into separate traces and replay them with independent fio
2153 invocations. Unfortunately this also breaks the strict time
2154 ordering between multiple device accesses.
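
For illustration, a replay job that redirects a converted blktrace onto one device might look like the sketch below. It reuses the `file_for_fio.bin' and `/dev/sdc' names from the text above. Enabling replay_no_stall is an assumed choice, and any writes in the trace will be issued to the target device.

```ini
; Illustrative only. Writes recorded in the trace will overwrite
; data on the redirect target.
[blktrace-replay]
read_iolog=file_for_fio.bin
replay_redirect=/dev/sdc
; Ignore recorded timestamps and replay as fast as possible.
replay_no_stall=1
```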
2155
2156 replay_align=int
2157 Force alignment of the byte offsets in a trace to this value.
2158 The value must be a power of 2.
2159
2160 replay_scale=int
2161 Scale byte offsets down by this factor when replaying traces.
2162 Should most likely use replay_align as well.
2163
2164 replay_skip=str
2165 Sometimes it's useful to skip certain IO types in a replay
2166 trace. This could be, for instance, eliminating the writes in
2167 the trace. Or not replaying the trims/discards, if you are redi‐
2168 recting to a device that doesn't support them. This option
2169 takes a comma separated list of read, write, trim, sync.
2170
2171 Threads, processes and job synchronization
2172 thread Fio defaults to creating jobs by using fork, however if this
2173 option is given, fio will create jobs by using POSIX Threads'
2174 function pthread_create(3) to create threads instead.
2175
2176 wait_for=str
2177 If set, the current job won't be started until all workers of
2178 the specified waitee job are done. wait_for operates on the job
2179 name basis, so there are a few limitations. First, the waitee
2180 must be defined prior to the waiter job (meaning no forward ref‐
2181 erences). Second, if a job is being referenced as a waitee, it
2182 must have a unique name (no duplicate waitees).
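
A minimal sketch of wait_for, with assumed job names and sizes: the waitee is defined first and has a unique name, as the limitations above require.

```ini
; Illustrative only: job names and sizes are assumed.
[prepare]
rw=write
size=128m

; Starts only after all workers of "prepare" are done.
[consume]
wait_for=prepare
rw=read
size=128m
```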
2183
2184 nice=int
2185 Run the job with the given nice value. See man nice(2). On Win‐
2186 dows, values less than -15 set the process class to "High"; -1
2187 through -15 set "Above Normal"; 1 through 15 "Below Normal"; and
2188 above 15 "Idle" priority class.
2189
2190 prio=int
2191 Set the I/O priority value of this job. Linux limits us to a
2192 positive value between 0 and 7, with 0 being the highest. See
2193 man ionice(1). Refer to an appropriate manpage for other operat‐
2194 ing systems since meaning of priority may differ.
2195
2196 prioclass=int
2197 Set the I/O priority class. See man ionice(1).
2198
2199 cpus_allowed=str
2200 Controls the same options as cpumask, but accepts a textual
2201 specification of the permitted CPUs instead and CPUs are indexed
2202 from 0. So to use CPUs 0 and 5 you would specify
2203 `cpus_allowed=0,5'. This option also allows a range of CPUs to
2204 be specified -- say you wanted a binding to CPUs 0, 5, and 8 to
2205 15, you would set `cpus_allowed=0,5,8-15'.
2206
2207 On Windows, when `cpus_allowed' is unset only CPUs from fio's
2208 current processor group will be used and affinity settings are
2209 inherited from the system. An fio build configured to target
2210 Windows 7 makes options that set CPUs processor group aware and
2211 values will set both the processor group and a CPU from within
2212 that group. For example, on a system where processor group 0 has
2213 40 CPUs and processor group 1 has 32 CPUs, `cpus_allowed' values
2214 between 0 and 39 will bind CPUs from processor group 0 and
2215 `cpus_allowed' values between 40 and 71 will bind CPUs from pro‐
2216 cessor group 1. When using `cpus_allowed_policy=shared' all CPUs
2217 specified by a single `cpus_allowed' option must be from the
2218 same processor group. For Windows fio builds not built for Win‐
2219 dows 7, CPUs will only be selected from (and be relative to)
2220 whatever processor group fio happens to be running in and CPUs
2221 from other processor groups cannot be used.
2222
2223 cpus_allowed_policy=str
2224 Set the policy of how fio distributes the CPUs specified by
2225 cpus_allowed or cpumask. Two policies are supported:
2226
2227 shared All jobs will share the CPU set specified.
2228
2229 split Each job will get a unique CPU from the CPU set.
2230
2231 shared is the default behavior, if the option isn't specified.
2232 If split is specified, then fio will assign one cpu per
2233 job. If not enough CPUs are given for the jobs listed, then fio
2234 will roundrobin the CPUs in the set.
2235
2236 cpumask=int
2237 Set the CPU affinity of this job. The parameter given is a bit
2238 mask of allowed CPUs the job may run on. So if you want the
2239 allowed CPUs to be 1 and 5, you would pass the decimal value of
2240 (1 << 1 | 1 << 5), or 34. See man sched_setaffinity(2). This may
2241 not work on all supported operating systems or kernel versions.
2242 This option doesn't work well for a higher CPU count than what
2243 you can store in an integer mask, so it can only control cpus
2244 1-32. For boxes with larger CPU counts, use cpus_allowed.
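
To make the mask arithmetic concrete, the sketch below pins two jobs to CPUs 1 and 5, once with cpumask and once with the equivalent cpus_allowed list. Job names and workload parameters are assumptions.

```ini
; cpumask=34 is (1 << 1) | (1 << 5) = 2 + 32, selecting CPUs 1 and 5.
[mask-form]
rw=read
size=64m
cpumask=34

; The same CPU set written as a list.
[list-form]
rw=read
size=64m
cpus_allowed=1,5
```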
2245
2246 numa_cpu_nodes=str
2247 Set this job running on specified NUMA nodes' CPUs. The argu‐
2248 ments allow comma delimited list of cpu numbers, A-B ranges, or
2249 `all'. Note, to enable NUMA options support, fio must be built
2250 on a system with libnuma-dev(el) installed.
2251
2252 numa_mem_policy=str
2253 Set this job's memory policy and corresponding NUMA nodes. For‐
2254 mat of the arguments:
2255
2256 <mode>[:<nodelist>]
2257
2258 `mode' is one of the following memory policies: `default', `pre‐
2259 fer', `bind', `interleave' or `local'. For `default' and `local'
2260 memory policies, no node needs to be specified. For `prefer',
2261 only one node is allowed. For `bind' and `interleave' the
2262 `nodelist' may be as follows: a comma delimited list of numbers,
2263 A-B ranges, or `all'.
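
A sketch of the NUMA options, assuming a system with at least one NUMA node numbered 0 and an fio build with libnuma support: the job runs on node 0's CPUs and binds its memory to the same node.

```ini
; Illustrative only: node number, job name and sizes are assumed.
[numa-node0]
rw=randread
bs=4k
size=1g
numa_cpu_nodes=0
; Format is <mode>[:<nodelist>].
numa_mem_policy=bind:0
```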
2264
2265 cgroup=str
2266 Add job to this control group. If it doesn't exist, it will be
2267 created. The system must have a mounted cgroup blkio mount point
2268 for this to work. If your system doesn't have it mounted, you
2269 can do so with:
2270
2271 # mount -t cgroup -o blkio none /cgroup
2272
2273 cgroup_weight=int
2274 Set the weight of the cgroup to this value. See the documenta‐
2275 tion that comes with the kernel, allowed values are in the range
2276 of 100..1000.
2277
2278 cgroup_nodelete=bool
2279 Normally fio will delete the cgroups it has created after the
2280 job completion. To override this behavior and to leave cgroups
2281 around after the job completion, set `cgroup_nodelete=1'. This
2282 can be useful if one wants to inspect various cgroup files after
2283 job completion. Default: false.
2284
2285 flow_id=int
2286 The ID of the flow. If not specified, it defaults to being a
2287 global flow. See flow.
2288
2289 flow=int
2290 Weight in token-based flow control. If this value is used, then
2291 there is a 'flow counter' which is used to regulate the propor‐
2292 tion of activity between two or more jobs. Fio attempts to keep
2293 this flow counter near zero. The flow parameter stands for how
2294 much should be added or subtracted to the flow counter on each
2295 iteration of the main I/O loop. That is, if one job has `flow=8'
2296 and another job has `flow=-1', then there will be a roughly 1:8
2297 ratio in how much one runs vs the other.
2298
2299 flow_watermark=int
2300 The maximum value that the absolute value of the flow counter is
2301 allowed to reach before the job must wait for a lower value of
2302 the counter.
2303
2304 flow_sleep=int
2305 The period of time, in microseconds, to wait after the flow
2306 watermark has been exceeded before retrying operations.
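
The flow options can be sketched as two jobs sharing the default global flow. The weights 8 and -1 follow the description of flow above. The watermark, sleep time, runtime and workload parameters are assumed values.

```ini
; Illustrative only: watermark, sleep, runtime and sizes are assumed.
[global]
time_based
runtime=30
bs=8k
size=1g
flow_watermark=100
; Wait 1000 usec after the watermark is exceeded.
flow_sleep=1000

[reader]
rw=randread
flow=8

[writer]
rw=write
flow=-1
```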
2307
2308 stonewall, wait_for_previous
2309 Wait for preceding jobs in the job file to exit, before starting
2310 this one. Can be used to insert serialization points in the job
2311 file. A stone wall also implies starting a new reporting group,
2312 see group_reporting.
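
As a sketch of a serialization point, the job file below writes a file and then reads it back. The stonewall in the second job makes it wait for the first to exit. The file path, block size and size are hypothetical.

```ini
; Illustrative only: filename, bs and size are assumed.
[global]
filename=/tmp/fio.testfile
bs=64k
size=256m

[fill]
rw=write

; Waits for "fill" to exit and starts a new reporting group.
[readback]
stonewall
rw=read
```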
2313
2314 exitall
2315 By default, fio will continue running all other jobs when one
2316 job finishes but sometimes this is not the desired action. Set‐
2317 ting exitall will instead make fio terminate all other jobs when
2318 one job finishes.
2319
2320 exec_prerun=str
2321 Before running this job, issue the command specified through
2322 system(3). Output is redirected to a file called `jobname.pre‐
2323 run.txt'.
2324
2325 exec_postrun=str
2326 After the job completes, issue the command specified through sys‐
2327 tem(3). Output is redirected to a file called `job‐
2328 name.postrun.txt'.
2329
2330 uid=int
2331 Instead of running as the invoking user, set the user ID to this
2332 value before the thread/process does any work.
2333
2334 gid=int
2335 Set group ID, see uid.
2336
2337 Verification
2338 verify_only
2339 Do not perform specified workload, only verify data still
2340 matches previous invocation of this workload. This option allows
2341 one to check data multiple times at a later date without over‐
2342 writing it. This option makes sense only for workloads that
2343 write data, and does not support workloads with the time_based
2344 option set.
2345
2346 do_verify=bool
2347 Run the verify phase after a write phase. Only valid if verify
2348 is set. Default: true.
2349
2350 verify=str
2351 If writing to a file, fio can verify the file contents after
2352 each iteration of the job. Each verification method also implies
2353 verification of a special header, which is written to the
2354 beginning of each block. This header also includes meta information,
2355 like offset of the block, block number, timestamp when block was
2356 written, etc. verify can be combined with verify_pattern option.
2357 The allowed values are:
2358
2359 md5 Use an md5 sum of the data area and store it in
2360 the header of each block.
2361
2362 crc64 Use an experimental crc64 sum of the data area and
2363 store it in the header of each block.
2364
2365 crc32c Use a crc32c sum of the data area and store it in
2366 the header of each block. This will automatically
2367 use hardware acceleration (e.g. SSE4.2 on an x86
2368 or CRC crypto extensions on ARM64) but will fall
2369 back to software crc32c if none is found. Gener‐
2370 ally the fastest checksum fio supports when hard‐
2371 ware accelerated.
2372
2373 crc32c-intel
2374 Synonym for crc32c.
2375
2376 crc32 Use a crc32 sum of the data area and store it in
2377 the header of each block.
2378
2379 crc16 Use a crc16 sum of the data area and store it in
2380 the header of each block.
2381
2382 crc7 Use a crc7 sum of the data area and store it in
2383 the header of each block.
2384
2385 xxhash Use xxhash as the checksum function. Generally the
2386 fastest software checksum that fio supports.
2387
2388 sha512 Use sha512 as the checksum function.
2389
2390 sha256 Use sha256 as the checksum function.
2391
2392 sha1 Use optimized sha1 as the checksum function.
2393
2394 sha3-224
2395 Use optimized sha3-224 as the checksum function.
2396
2397 sha3-256
2398 Use optimized sha3-256 as the checksum function.
2399
2400 sha3-384
2401 Use optimized sha3-384 as the checksum function.
2402
2403 sha3-512
2404 Use optimized sha3-512 as the checksum function.
2405
2406 meta This option is deprecated, since now meta informa‐
2407 tion is included in generic verification header
2408 and meta verification happens by default. For
2409 detailed information see the description of the
2410 verify setting. This option is kept only for
2411 compatibility with old configurations. Do
2412 not use it.
2413
2414 pattern
2415 Verify a strict pattern. Normally fio includes a
2416 header with some basic information and checksum‐
2417 ming, but if this option is set, only the specific
2418 pattern set with verify_pattern is verified.
2419
2420 null Only pretend to verify. Useful for testing inter‐
2421 nals with `ioengine=null', not for much else.
2422
2423 This option can be used for repeated burn-in tests of a system
2424 to make sure that the written data is also correctly read back.
2425 If the data direction given is a read or random read, fio will
2426 assume that it should verify a previously written file. If the
2427 data direction includes any form of write, the verify will be of
2428 the newly written data.
2429
2430 To avoid false verification errors, do not use the norandommap
2431 option when verifying data with async I/O engines and I/O depths
2432 > 1. Alternatively, use norandommap together with the lfsr random
2433 generator to avoid writing to the same offset with multiple
2434 outstanding I/Os.
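
A minimal write-then-verify sketch, assuming a scratch file under
/tmp and illustrative sizes. Because the data direction is a write,
fio verifies the newly written data, and since do_verify defaults to
true the verify phase runs after the write phase:

    [write-and-verify]
    filename=/tmp/fio-verify.dat
    size=64m
    bs=4k
    rw=randwrite
    ; checksum stored in the header of each block
    verify=crc32c
    ; stop the job on the first verification failure
    verify_fatal=1

The job deliberately leaves norandommap unset, in line with the
caution above.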
2435
2436 verify_offset=int
2437 Swap the verification header with data somewhere else in the
2438 block before writing. It is swapped back before verifying.
2439
2440 verify_interval=int
2441 Write the verification header at a finer granularity than the
2442 blocksize. It will be written for chunks the size of
2443 verify_interval, which should divide blocksize evenly.
2444
2445 verify_pattern=str
2446 If set, fio will fill the I/O buffers with this pattern. Fio
2447 defaults to filling with totally random bytes, but sometimes
2448 it's useful to fill with a known pattern for I/O verification
2449 purposes. Depending on the width of the pattern, fio will fill
2450 1/2/3/4 bytes of the buffer at a time (the pattern can be either
2451 a decimal or a hex number). A verify_pattern larger than a
2452 32-bit quantity has to be a hex number that starts with either
2453 "0x" or "0X". Use with verify. verify_pattern also supports the
2454 %o format, which means that the offset of each block will be
2455 written and then verified back, e.g.:
2456
2457 verify_pattern=%o
2458
2459 Or use a combination of everything:
2460
2461 verify_pattern=0xff%o"abcd"-12
2462
2463 verify_fatal=bool
2464 Normally fio will keep checking the entire contents before quit‐
2465 ting on a block verification failure. If this option is set, fio
2466 will exit the job on the first observed failure. Default: false.
2467
2468 verify_dump=bool
2469 If set, dump the contents of both the original data block and
2470 the data block we read off disk to files. This allows later
2471 analysis to inspect just what kind of data corruption occurred.
2472 Off by default.
2473
2474 verify_async=int
2475 Fio will normally verify I/O inline from the submitting thread.
2476 This option takes an integer describing how many async offload
2477 threads to create for I/O verification instead, causing fio to
2478 offload the duty of verifying I/O contents to one or more sepa‐
2479 rate threads. If using this offload option, even sync I/O
2480 engines can benefit from using an iodepth setting higher than 1,
2481 as it allows them to have I/O in flight while verifies are run‐
2482 ning. Defaults to 0 async threads, i.e. verification is not
2483 asynchronous.
2484
2485 verify_async_cpus=str
2486 Tell fio to set the given CPU affinity on the async I/O verifi‐
2487 cation threads. See cpus_allowed for the format used.
2488
2489 verify_backlog=int
2490 Fio will normally verify the written contents of a job that uti‐
2491 lizes verify once that job has completed. In other words, every‐
2492 thing is written then everything is read back and verified. You
2493 may want to verify continually instead for a variety of reasons.
2494 Fio stores the meta data associated with an I/O block in memory,
2495 so for large verify workloads, quite a bit of memory would be
2496 used up holding this meta data. If this option is enabled, fio
2497 will write only N blocks before verifying these blocks.
2498
2499 verify_backlog_batch=int
2500 Control how many blocks fio will verify if verify_backlog is
2501 set. If not set, will default to the value of verify_backlog
2502 (meaning the entire queue is read back and verified). If ver‐
2503 ify_backlog_batch is less than verify_backlog then not all
2504 blocks will be verified, if verify_backlog_batch is larger than
2505 verify_backlog, some blocks will be verified more than once.
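
A sketch of continual verification, assuming illustrative file and
size values. Here fio writes 1024 blocks, then reads back and
verifies those blocks before writing more, which bounds the amount of
meta data held in memory:

    [backlog-verify]
    filename=/tmp/fio-verify.dat
    size=1g
    bs=4k
    rw=randwrite
    verify=crc32c
    verify_backlog=1024
    ; same as the default: verify the entire backlog each time
    verify_backlog_batch=1024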
2506
2507 verify_state_save=bool
2508 When a job exits during the write phase of a verify workload,
2509 save its current state. This allows fio to replay up until that
2510 point, if the verify state is loaded for the verify read phase.
2511 The format of the filename is, roughly:
2512
2513 <type>-<jobname>-<jobindex>-verify.state.
2514
2515 <type> is "local" for a local run, "sock" for a client/server
2516 socket connection, and "ip" (192.168.0.1, for instance) for a
2517 networked client/server connection. Defaults to true.
2518
2519 verify_state_load=bool
2520 If a verify termination trigger was used, fio stores the current
2521 write state of each thread. This can be used at verification
2522 time so that fio knows how far it should verify. Without this
2523 information, fio will run a full verification pass, according to
2524 the settings in the job file used. Default: false.
2525
2526 trim_percentage=int
2527 Number of verify blocks to discard/trim.
2528
2529 trim_verify_zero=bool
2530 Verify that trim/discarded blocks are returned as zeros.
2531
2532 trim_backlog=int
2533 Trim after this number of blocks are written.
2534
2535 trim_backlog_batch=int
2536 Trim this number of I/O blocks.
2537
2538 experimental_verify=bool
2539 Enable experimental verification.
2540
2541 Steady state
2542 steadystate=str:float, ss=str:float
2543 Define the criterion and limit for assessing steady state per‐
2544 formance. The first parameter designates the criterion whereas
2545 the second parameter sets the threshold. When the criterion
2546 falls below the threshold for the specified duration, the job
2547 will stop. For example, `iops_slope:0.1%' will direct fio to
2548 terminate the job when the least squares regression slope falls
2549 below 0.1% of the mean IOPS. If group_reporting is enabled this
2550 will apply to all jobs in the group. Below is the list of avail‐
2551 able steady state assessment criteria. All assessments are car‐
2552 ried out using only data from the rolling collection window.
2553 Threshold limits can be expressed as a fixed value or as a per‐
2554 centage of the mean in the collection window.
2555
2556 When using this feature, most jobs should include the time_based
2557 and runtime options or the loops option so that fio does not
2558 stop running after it has covered the full size of the specified
2559 file(s) or device(s).
2560
2561 iops Collect IOPS data. Stop the job if all
2562 individual IOPS measurements are within the
2563 specified limit of the mean IOPS (e.g.,
2564 `iops:2' means that all individual IOPS
2565 values must be within 2 of the mean,
2566 whereas `iops:0.2%' means that all individ‐
2567 ual IOPS values must be within 0.2% of the
2568 mean IOPS to terminate the job).
2569
2570 iops_slope
2571 Collect IOPS data and calculate the least
2572 squares regression slope. Stop the job if
2573 the slope falls below the specified limit.
2574
2575 bw Collect bandwidth data. Stop the job if all
2576 individual bandwidth measurements are
2577 within the specified limit of the mean
2578 bandwidth.
2579
2580 bw_slope
2581 Collect bandwidth data and calculate the
2582 least squares regression slope. Stop the
2583 job if the slope falls below the specified
2584 limit.
2585
2586 steadystate_duration=time, ss_dur=time
2587 A rolling window of this duration will be used to judge
2588 whether steady state has been reached. Data will be col‐
2589 lected once per second. The default is 0 which disables
2590 steady state detection. When the unit is omitted, the
2591 value is interpreted in seconds.
2592
2593 steadystate_ramp_time=time, ss_ramp=time
2594 Allow the job to run for the specified duration before
2595 beginning data collection for checking the steady state
2596 job termination criterion. The default is 0. When the
2597 unit is omitted, the value is interpreted in seconds.
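
A sketch of a job that stops once steady state is detected. The file
name, size, and time values are assumptions for the example. The job
is time_based with a generous runtime so that it does not end early
after covering the file once:

    [ss-randread]
    filename=/tmp/fio-ss.dat
    size=1g
    bs=4k
    rw=randread
    time_based
    ; upper bound in seconds if steady state is never reached
    runtime=3600
    ; stop when the IOPS slope falls below 0.1% of the mean IOPS
    steadystate=iops_slope:0.1%
    ; judge over a rolling 60 second window
    steadystate_duration=60
    ; skip the first 30 seconds before collecting data
    steadystate_ramp_time=30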
2598
2599 Measurements and reporting
2600 per_job_logs=bool
2601 If set, this generates bw/clat/iops logs with private filenames
2602 for each job. If not set, jobs with identical names will share the
2603 log filename. Default: true.
2604
2605 group_reporting
2606 It may sometimes be interesting to display statistics for groups
2607 of jobs as a whole instead of for each individual job. This is
2608 especially true if numjobs is used; looking at individual
2609 thread/process output quickly becomes unwieldy. To see the final
2610 report per-group instead of per-job, use group_reporting. Jobs
2611 in a file will be part of the same reporting group, unless
2612 separated by a stonewall, or by using new_group.
2613
2614 new_group
2615 Start a new reporting group. See: group_reporting. If not given,
2616 all jobs in a file will be part of the same reporting group,
2617 unless separated by a stonewall.
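
A sketch of per-group reporting with two groups, assuming /tmp as
the working directory and illustrative sizes. Each section spawns
four clones through numjobs, and new_group places the writers in
their own reporting group:

    [global]
    directory=/tmp
    size=64m
    bs=4k
    numjobs=4
    group_reporting

    [readers]
    rw=randread

    ; report the writers separately from the readers
    [writers]
    new_group
    rw=randwrite

The final report should then contain one aggregated entry for the
readers and one for the writers, rather than eight individual entries.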
2618
2619 stats=bool
2620 By default, fio collects and shows final output results for all
2621 jobs that run. If this option is set to 0, then fio will exclude
2622 this job from the final stat output.
2623
2624 write_bw_log=str
2625 If given, write a bandwidth log for this job. Can be used to
2626 store data of the bandwidth of the jobs in their lifetime.
2627
2628 If no str argument is given, the default filename of `job‐
2629 name_type.x.log' is used. Even when the argument is given, fio
2630 will still append the type of log. So if one specifies:
2631
2632 write_bw_log=foo
2633
2634 The actual log name will be `foo_bw.x.log' where `x' is the
2635 index of the job (1..N, where N is the number of jobs). If
2636 per_job_logs is false, then the filename will not include the
2637 `.x` job index.
2638
2639 The included fio_generate_plots script uses gnuplot to turn
2640 these text files into nice graphs. See the LOG FILE FORMATS sec‐
2641 tion for how data is structured within the file.
2642
2643 write_lat_log=str
2644 Same as write_bw_log, except this option creates I/O submission
2645 (e.g., `name_slat.x.log'), completion (e.g., `name_clat.x.log'),
2646 and total (e.g., `name_lat.x.log') latency files instead. See
2647 write_bw_log for details about the filename format and the LOG
2648 FILE FORMATS section for how data is structured within the
2649 files.
2650
2651 write_hist_log=str
2652 Same as write_bw_log but writes an I/O completion latency his‐
2653 togram file (e.g., `name_hist.x.log') instead. Note that this
2654 file will be empty unless log_hist_msec has also been set. See
2655 write_bw_log for details about the filename format and the LOG
2656 FILE FORMATS section for how data is structured within the file.
2657
2658 write_iops_log=str
2659 Same as write_bw_log, but writes an IOPS file (e.g.
2660 `name_iops.x.log`) instead. Because fio defaults to individual
2661 I/O logging, the value entry in the IOPS log will be 1 unless
2662 windowed logging (see log_avg_msec) has been enabled. See
2663 write_bw_log for details about the filename format and LOG FILE
2664 FORMATS for how data is structured within the file.
2665
2666 log_avg_msec=int
2667 By default, fio will log an entry in the iops, latency, or bw
2668 log for every I/O that completes. When written to disk, such a
2669 log can quickly grow to a very large size. Setting this option
2670 makes fio average each log entry over the specified period of
2671 time, reducing the resolution of the log. See log_max_value
2672 as well. Defaults to 0, logging all entries. Also see LOG FILE
2673 FORMATS section.
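
A sketch combining the logging options above, assuming the log prefix
`foo' and illustrative file and size values. With log_avg_msec=1000
each log holds one averaged entry per second instead of one entry per
I/O:

    [logged]
    filename=/tmp/fio-log.dat
    size=256m
    bs=4k
    rw=randwrite
    write_bw_log=foo
    write_lat_log=foo
    write_iops_log=foo
    ; average entries over 1000 ms windows
    log_avg_msec=1000

Following the naming rules described under write_bw_log, a single job
should produce files such as `foo_bw.1.log', `foo_clat.1.log', and
`foo_iops.1.log'.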
2674
2675 log_hist_msec=int
2676 Same as log_avg_msec, but logs entries for completion latency
2677 histograms. Computing latency percentiles from averages of
2678 intervals using log_avg_msec is inaccurate. Setting this option
2679 makes fio log histogram entries over the specified period of
2680 time, reducing log sizes for high IOPS devices while retaining
2681 percentile accuracy. See log_hist_coarseness and write_hist_log
2682 as well. Defaults to 0, meaning histogram logging is disabled.
2683
2684 log_hist_coarseness=int
2685 Integer ranging from 0 to 6, defining the coarseness of the res‐
2686 olution of the histogram logs enabled with log_hist_msec. For
2687 each increment in coarseness, fio outputs half as many bins.
2688 Defaults to 0, for which histogram logs contain 1216 latency
2689 bins. See LOG FILE FORMATS section.
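
A sketch of histogram logging, assuming the log prefix `foo' and
illustrative file and size values. write_hist_log alone would leave
the file empty, so log_hist_msec is set as well:

    [hist]
    filename=/tmp/fio-hist.dat
    size=256m
    bs=4k
    rw=randread
    write_hist_log=foo
    ; emit one histogram entry per second
    log_hist_msec=1000
    ; 1216 bins halved twice: 1216 -> 608 -> 304
    log_hist_coarseness=2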
2690
2691 log_max_value=bool
2692 If log_avg_msec is set, fio logs the average over that window.
2693 If you instead want to log the maximum value, set this option to
2694 1. Defaults to 0, meaning that averaged values are logged.
2695
2696 log_offset=bool
2697 If this is set, the iolog options will include the byte offset
2698 for the I/O entry as well as the other data values. Defaults to
2699 0 meaning that offsets are not present in logs. Also see LOG
2700 FILE FORMATS section.
2701
2702 log_compression=int
2703 If this is set, fio will compress the I/O logs as it goes, to
2704 keep the memory footprint lower. When a log reaches the speci‐
2705 fied size, that chunk is removed and compressed in the back‐
2706 ground. Given that I/O logs are fairly highly compressible, this
2707 yields a nice memory savings for longer runs. The downside is
2708 that the compression will consume some background CPU cycles, so
2709 it may impact the run. This, however, is also true if the log‐
2710 ging ends up consuming most of the system memory. So pick your
2711 poison. The I/O logs are saved normally at the end of a run, by
2712 decompressing the chunks and storing them in the specified log
2713 file. This feature depends on the availability of zlib.
2714
2715 log_compression_cpus=str
2716 Define the set of CPUs that are allowed to handle online log
2717 compression for the I/O jobs. This can provide better isolation
2718 between performance sensitive jobs, and background compression
2719 work. See cpus_allowed for the format used.
2720
2721 log_store_compressed=bool
2722 If set, fio will store the log files in a compressed format.
2723 They can be decompressed with fio, using the --inflate-log com‐
2724 mand line parameter. The files will be stored with a `.fz' suf‐
2725 fix.
2726
2727 log_unix_epoch=bool
2728 If set, fio will log Unix timestamps to the log files produced
2729 by enabling write_type_log for each log type, instead of the
2730 default zero-based timestamps.
2731
2732 block_error_percentiles=bool
2733 If set, record errors in trim block-sized units from writes and
2734 trims and output a histogram of how many trims it took to get to
2735 errors, and what kind of error was encountered.
2736
2737 bwavgtime=int
2738 Average the calculated bandwidth over the given time. Value is
2739 specified in milliseconds. If the job also does bandwidth log‐
2740 ging through write_bw_log, then the minimum of this option and
2741 log_avg_msec will be used. Default: 500ms.
2742
2743 iopsavgtime=int
2744 Average the calculated IOPS over the given time. Value is speci‐
2745 fied in milliseconds. If the job also does IOPS logging through
2746 write_iops_log, then the minimum of this option and log_avg_msec
2747 will be used. Default: 500ms.
2748
2749 disk_util=bool
2750 Generate disk utilization statistics, if the platform supports
2751 it. Default: true.
2752
2753 disable_lat=bool
2754 Disable measurements of total latency numbers. Useful only for
2755 cutting back the number of calls to gettimeofday(2), as that
2756 does impact performance at really high IOPS rates. Note that to
2757 really get rid of a large amount of these calls, this option
2758 must be used with disable_slat and disable_bw_measurement as
2759 well.
2760
2761 disable_clat=bool
2762 Disable measurements of completion latency numbers. See dis‐
2763 able_lat.
2764
2765 disable_slat=bool
2766 Disable measurements of submission latency numbers. See dis‐
2767 able_lat.
2768
2769 disable_bw_measurement=bool, disable_bw=bool
2770 Disable measurements of throughput/bandwidth numbers. See dis‐
2771 able_lat.
2772
2773 clat_percentiles=bool
2774 Enable the reporting of percentiles of completion latencies.
2775 This option is mutually exclusive with lat_percentiles.
2776
2777 lat_percentiles=bool
2778 Enable the reporting of percentiles of I/O latencies. This is
2779 similar to clat_percentiles, except that this includes the sub‐
2780 mission latency. This option is mutually exclusive with
2781 clat_percentiles.
2782
2783 percentile_list=float_list
2784 Overwrite the default list of percentiles for completion latencies
2785 and the block error histogram. Each number is a floating-point
2786 number in the range (0,100], and the maximum length of the list
2787 is 20. Use ':' to separate the numbers, and list the numbers in
2788 ascending order. For example, `--percentile_list=99.5:99.9' will
2789 cause fio to report the values of completion latency below which
2790 99.5% and 99.9% of the observed latencies fell, respectively.
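
The same option can be set in a job file. A sketch, with illustrative
file and size values, that reports four completion latency
percentiles listed in ascending order:

    [pct]
    filename=/tmp/fio-pct.dat
    size=256m
    bs=4k
    rw=randread
    clat_percentiles=1
    percentile_list=50:90:99:99.9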
2791
2792 significant_figures=int
2793 If using --output-format of `normal', set the significant fig‐
2794 ures to this value. Higher values will yield more precise IOPS
2795 and throughput units, while lower values will round. Requires a
2796 minimum value of 1 and a maximum value of 10. Defaults to 4.
2797
2798 Error handling
2799 exitall_on_error
2800 When one job finishes in error, terminate the rest. The default
2801 is to wait for each job to finish.
2802
2803 continue_on_error=str
2804 Normally fio will exit the job on the first observed failure. If
2805 this option is set, fio will continue the job when there is a
2806 'non-fatal error' (EIO or EILSEQ) until the runtime is exceeded
2807 or the I/O size specified is completed. If this option is used,
2808 there are two more stats that are appended, the total error
2809 count and the first error. The error field given in the stats is
2810 the first error that was hit during the run. The allowed values
2811 are:
2812
2813 none Exit on any I/O or verify errors.
2814
2815 read Continue on read errors, exit on all others.
2816
2817 write Continue on write errors, exit on all others.
2818
2819 io Continue on any I/O error, exit on all others.
2820
2821 verify Continue on verify errors, exit on all others.
2822
2823 all Continue on all errors.
2824
2825 0 Backward-compatible alias for 'none'.
2826
2827 1 Backward-compatible alias for 'all'.
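
A sketch of a verify job that keeps running past verification
failures, with illustrative file and size values. Any other kind of
error still ends the job:

    [tolerant-verify]
    filename=/tmp/fio-verify.dat
    size=64m
    bs=4k
    rw=randwrite
    verify=crc32c
    ; continue on verify errors, exit on all others
    continue_on_error=verify

As noted above, the stats then include the total error count and the
first error that was hit.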
2828
2829 ignore_error=str
2830 Sometimes you want to ignore some errors during a test. In that
2831 case you can specify an error list for each error type, instead
2832 of only being able to ignore the default 'non-fatal error' using
2833 continue_on_error. The format is
2834 `ignore_error=READ_ERR_LIST,WRITE_ERR_LIST,VERIFY_ERR_LIST'.
2835 Errors for a given error type are separated with ':'. An error may
2836 be a symbol ('ENOSPC', 'ENOMEM') or an integer. Example:
2837
2838 ignore_error=EAGAIN,ENOSPC:122
2839
2840 This option will ignore EAGAIN from READ, and ENOSPC and
2841 122(EDQUOT) from WRITE. This option works by overriding con‐
2842 tinue_on_error with the list of errors for each error type if
2843 any.
2844
2845 error_dump=bool
2846 If set, dump every error even if it is non-fatal. If disabled,
2847 only fatal errors will be dumped. Default: true.
2848
2849 Running predefined workloads
2850 Fio includes predefined profiles that mimic the I/O workloads generated
2851 by other tools.
2852
2853 profile=str
2854 The predefined workload to run. Current profiles are:
2855
2856 tiobench
2857 Threaded I/O bench (tiotest/tiobench) like work‐
2858 load.
2859
2860 act Aerospike Certification Tool (ACT) like workload.
2861
2862 To view a profile's additional options use --cmdhelp after specifying
2863 the profile. For example:
2864
2865 $ fio --profile=act --cmdhelp
2866
2867 Act profile options
2868 device-names=str
2869 Devices to use.
2870
2871 load=int
2872 ACT load multiplier. Default: 1.
2873
2874 test-duration=time
2875 How long the entire test takes to run. When the unit is omitted,
2876 the value is given in seconds. Default: 24h.
2877
2878 threads-per-queue=int
2879 Number of read I/O threads per device. Default: 8.
2880
2881 read-req-num-512-blocks=int
2882 Number of 512B blocks to read at a time. Default: 3.
2883
2884 large-block-op-kbytes=int
2885 Size of large block ops in KiB (writes). Default: 131072.
2886
2887 prep Set to run ACT prep phase.
2888
2889 Tiobench profile options
2890 size=str
2891 Size in MiB.
2892
2893 block=int
2894 Block size in bytes. Default: 4096.
2895
2896 numruns=int
2897 Number of runs.
2898
2899 dir=str
2900 Test directory.
2901
2902 threads=int
2903 Number of threads.
2904
2906 Fio spits out a lot of output. While running, fio will display the sta‐
2907 tus of the jobs created. An example of that would be:
2908
2909 Jobs: 1 (f=1): [_(1),M(1)][24.8%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 01m:31s]
2910
2911 The characters inside the first set of square brackets denote the cur‐
2912 rent status of each thread. The first character is the first job
2913 defined in the job file, and so forth. The possible values (in typical
2914 life cycle order) are:
2915
2916 P Thread setup, but not started.
2917 C Thread created.
2918 I Thread initialized, waiting or generating necessary data.
2919 p Thread running pre-reading file(s).
2920 / Thread is in ramp period.
2921 R Running, doing sequential reads.
2922 r Running, doing random reads.
2923 W Running, doing sequential writes.
2924 w Running, doing random writes.
2925 M Running, doing mixed sequential reads/writes.
2926 m Running, doing mixed random reads/writes.
2927 D Running, doing sequential trims.
2928 d Running, doing random trims.
2929 F Running, currently waiting for fsync(2).
2930 V Running, doing verification of written data.
2931 f Thread finishing.
2932 E Thread exited, not reaped by main thread yet.
2933 - Thread reaped.
2934 X Thread reaped, exited with an error.
2935 K Thread reaped, exited due to signal.
2936
2937 Fio will condense the thread string so as not to take up more space on the
2938 command line than needed. For instance, if you have 10 readers and 10
2939 writers running, the output would look like this:
2940
2941 Jobs: 20 (f=20): [R(10),W(10)][4.0%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 57m:36s]
2942
2943 Note that the status string is displayed in order, so it's possible to
2944 tell which of the jobs are currently doing what. In the example above
2945 this means that jobs 1--10 are readers and 11--20 are writers.
2946
2947 The other values are fairly self explanatory -- number of threads cur‐
2948 rently running and doing I/O, the number of currently open files (f=),
2949 the estimated completion percentage, the rate of I/O since last check
2950 (read speed listed first, then write speed and optionally trim speed)
2951 in terms of bandwidth and IOPS, and time to completion for the current
2952 running group. It's impossible to estimate runtime of the following
2953 groups (if any).
2954
2955 When fio is done (or interrupted by Ctrl-C), it will show the data for
2956 each thread, group of threads, and disks in that order. For each over‐
2957 all thread (or group) the output looks like:
2958
2959 Client1: (groupid=0, jobs=1): err= 0: pid=16109: Sat Jun 24 12:07:54 2017
2960 write: IOPS=88, BW=623KiB/s (638kB/s)(30.4MiB/50032msec)
2961 slat (nsec): min=500, max=145500, avg=8318.00, stdev=4781.50
2962 clat (usec): min=170, max=78367, avg=4019.02, stdev=8293.31
2963 lat (usec): min=174, max=78375, avg=4027.34, stdev=8291.79
2964 clat percentiles (usec):
2965 | 1.00th=[ 302], 5.00th=[ 326], 10.00th=[ 343], 20.00th=[ 363],
2966 | 30.00th=[ 392], 40.00th=[ 404], 50.00th=[ 416], 60.00th=[ 445],
2967 | 70.00th=[ 816], 80.00th=[ 6718], 90.00th=[12911], 95.00th=[21627],
2968 | 99.00th=[43779], 99.50th=[51643], 99.90th=[68682], 99.95th=[72877],
2969 | 99.99th=[78119]
2970 bw ( KiB/s): min= 532, max= 686, per=0.10%, avg=622.87, stdev=24.82, samples= 100
2971 iops : min= 76, max= 98, avg=88.98, stdev= 3.54, samples= 100
2972 lat (usec) : 250=0.04%, 500=64.11%, 750=4.81%, 1000=2.79%
2973 lat (msec) : 2=4.16%, 4=1.84%, 10=4.90%, 20=11.33%, 50=5.37%
2974 lat (msec) : 100=0.65%
2975 cpu : usr=0.27%, sys=0.18%, ctx=12072, majf=0, minf=21
2976 IO depths : 1=85.0%, 2=13.1%, 4=1.8%, 8=0.1%, 16=0.0%, 32=0.0%, >=64=0.0%
2977 submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
2978 complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
2979 issued rwt: total=0,4450,0, short=0,0,0, dropped=0,0,0
2980 latency : target=0, window=0, percentile=100.00%, depth=8
2981
2982 The job name (or first job's name when using group_reporting) is
2983 printed, along with the group id, count of jobs being aggregated, last
2984 error id seen (which is 0 when there are no errors), pid/tid of that
2985 thread and the time the job/group completed. Below are the I/O statis‐
2986 tics for each data direction performed (showing writes in the example
2987 above). In the order listed, they denote:
2988
2989 read/write/trim
2990 The string before the colon shows the I/O direction the
2991 statistics are for. IOPS is the average I/Os performed
2992 per second. BW is the average bandwidth rate shown as:
2993 value in power of 2 format (value in power of 10 format).
2994 The last two values show: (total I/O performed in power
2995 of 2 format / runtime of that thread).
2996
2997 slat Submission latency (min being the minimum, max being the
2998 maximum, avg being the average, stdev being the standard
2999 deviation). This is the time it took to submit the I/O.
3000 For sync I/O this row is not displayed as the slat is
3001 really the completion latency (since queue/complete is
3002 one operation there). This value can be in nanoseconds,
3003 microseconds or milliseconds --- fio will choose the most
3004 appropriate base and print that (in the example above
3005 nanoseconds was the best scale). Note: in --minimal mode
3006 latencies are always expressed in microseconds.
3007
3008 clat Completion latency. Same names as slat, this denotes the
3009 time from submission to completion of the I/O pieces. For
3010 sync I/O, clat will usually be equal (or very close) to
3011 0, as the time from submit to complete is basically just
3012 CPU time (I/O has already been done, see slat explana‐
3013 tion).
3014
3015 lat Total latency. Same names as slat and clat, this denotes
3016 the time from when fio created the I/O unit to completion
3017 of the I/O operation.
3018
3019 bw Bandwidth statistics based on samples. Same names as the
3020 xlat stats, but also includes the number of samples taken
3021 (samples) and an approximate percentage of total aggre‐
3022 gate bandwidth this thread received in its group (per).
3023 This last value is only really useful if the threads in
3024 this group are on the same disk, since they are then com‐
3025 peting for disk access.
3026
3027 iops IOPS statistics based on samples. Same names as bw.
3028
3029 lat (nsec/usec/msec)
3030 The distribution of I/O completion latencies. This is the
3031 time from when I/O leaves fio until it is completed.
3032 Unlike the separate read/write/trim sections above, the
3033 data here and in the remaining sections apply to all I/Os
3034 for the reporting group. 250=0.04% means that 0.04% of
3035 the I/Os completed in under 250us. 500=64.11% means that
3036 64.11% of the I/Os required 250 to 499us for completion.
3037
3038 cpu CPU usage. User and system time, along with the number of
3039 context switches this thread went through, and finally the
3040 number of major and minor page faults. The CPU
3041 utilization numbers are averages for the jobs in that
3042 reporting group, while the context and fault counters are
3043 summed.
3044
3045 IO depths
3046 The distribution of I/O depths over the job lifetime. The
3047 numbers are divided into powers of 2 and each entry cov‐
3048 ers depths from that value up to those that are lower
3049 than the next entry -- e.g., 16= covers depths from 16 to
3050 31. Note that the range covered by a depth distribution
3051 entry can be different to the range covered by the equiv‐
3052 alent submit/complete distribution entry.
3053
3054 IO submit
3055 How many pieces of I/O were submitted in a single submit
3056 call. Each entry denotes that amount and below, until the
3057 previous entry -- e.g., 16=100% means that we submitted
3058 anywhere between 9 and 16 I/Os per submit call. Note that
3059 the range covered by a submit distribution entry can be
3060 different to the range covered by the equivalent depth
3061 distribution entry.
3062
3063 IO complete
3064 Like the above submit number, but for completions
3065 instead.
3066
3067 IO issued rwt
3068 The number of read/write/trim requests issued, and how
3069 many of them were short or dropped.
3070
3071 IO latency
3072 These values are for latency_target and related options.
3073 When these options are engaged, this section describes
3074 the I/O depth required to meet the specified latency tar‐
3075 get.
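
The different ranges covered by depth and submit/complete entries can be sketched as follows. This is only an illustration of the two rules described above, not fio code; the function names are hypothetical, and the previous submit entry (8 in the example) is supplied by the caller.

```python
def depth_entry_range(entry):
    """Depths covered by an 'IO depths' entry: the entry itself up to
    one below the next power of 2 (e.g. 16 covers 16..31)."""
    return (entry, 2 * entry - 1)


def submit_entry_range(entry, previous_entry):
    """I/Os per call covered by an 'IO submit' or 'IO complete' entry:
    everything above the previous entry, up to and including this one
    (e.g. 16 with previous entry 8 covers 9..16)."""
    return (previous_entry + 1, entry)


print(depth_entry_range(16))       # (16, 31)
print(submit_entry_range(16, 8))   # (9, 16)
```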
3076
3077 After each client has been listed, the group statistics are printed.
3078 They will look like this:
3079
3080 Run status group 0 (all jobs):
3081 READ: bw=20.9MiB/s (21.9MB/s), 10.4MiB/s-10.8MiB/s (10.9MB/s-11.3MB/s), io=64.0MiB (67.1MB), run=2973-3069msec
3082 WRITE: bw=1231KiB/s (1261kB/s), 616KiB/s-621KiB/s (630kB/s-636kB/s), io=64.0MiB (67.1MB), run=52747-53223msec
3083
3084 For each data direction it prints:
3085
3086 bw Aggregate bandwidth of threads in this group followed by
3087 the minimum and maximum bandwidth of all the threads in
3088 this group. Values outside of brackets are power-of-2
3089 format and those within are the equivalent value in a
3090 power-of-10 format.
3091
3092 io Aggregate I/O performed of all threads in this group. The
3093 format is the same as bw.
3094
3095 run The smallest and longest runtimes of the threads in this
3096 group.
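
The relation between the power-of-2 values and the bracketed power-of-10 values can be checked with a little arithmetic. The sketch below reproduces the bracketed numbers from the example above; the helper name is hypothetical.

```python
MIB, MB = 1024 ** 2, 1000 ** 2
KIB, KB = 1024, 1000


def binary_to_decimal(value, binary_unit, decimal_unit):
    """Convert a value in a power-of-2 unit to the power-of-10 unit."""
    return value * binary_unit / decimal_unit


print(round(binary_to_decimal(20.9, MIB, MB), 1))  # READ bw: 21.9 (MB/s)
print(round(binary_to_decimal(1231, KIB, KB)))     # WRITE bw: 1261 (kB/s)
print(round(binary_to_decimal(64.0, MIB, MB), 1))  # io: 67.1 (MB)
```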
3097
3098 And finally, the disk statistics are printed. This is Linux specific.
3099 They will look like this:
3100
3101 Disk stats (read/write):
3102 sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
3103
3104 Each value is printed for both reads and writes, with reads first. The
3105 numbers denote:
3106
3107 ios Number of I/Os performed by all groups.
3108
3109 merge Number of merges performed by the I/O scheduler.
3110
3111 ticks Number of ticks we kept the disk busy.
3112
3113 in_queue
3114 Total time spent in the disk queue.
3115
3116 util The disk utilization. A value of 100% means we kept the
3117 disk busy constantly, 50% would be a disk idling half of
3118 the time.
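
For scripts that scrape the normal output, a disk statistics line like the one above can be split into its parts. This is a minimal sketch that assumes the line has exactly the layout shown in the example; the function name is hypothetical and not part of fio.

```python
import re


def parse_disk_stats(line):
    """Parse one 'Disk stats' line into a dict.

    Values of the form read/write become (read, write) tuples,
    percentages become floats, everything else becomes an int.
    """
    name, _, rest = line.strip().partition(":")
    stats = {"name": name}
    for key, value in re.findall(r"(\w+)=([^,\s]+)", rest):
        if "/" in value:
            read, write = value.split("/")
            stats[key] = (int(read), int(write))
        elif value.endswith("%"):
            stats[key] = float(value[:-1])
        else:
            stats[key] = int(value)
    return stats


line = ("sda: ios=16398/16511, merge=30/162, ticks=6853/819634, "
        "in_queue=826487, util=100.00%")
print(parse_disk_stats(line))
```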
3119
3120 It is also possible to get fio to dump the current output while it is
3121 running, without terminating the job. To do that, send fio the USR1
3122 signal. You can also get regularly timed dumps by using the --sta‐
3123 tus-interval parameter, or by creating a file in `/tmp' named
3124 `fio-dump-status'. If fio sees this file, it will unlink it and dump
3125 the current output status.
3126
3128 For scripted usage where you typically want to generate tables or
3129 graphs of the results, fio can output the results in a semicolon sepa‐
3130 rated format. The format is one long line of values, such as:
3131
3132 2;card0;0;0;7139336;121836;60004;1;10109;27.932460;116.933948;220;126861;3495.446807;1085.368601;226;126864;3523.635629;1089.012448;24063;99944;50.275485%;59818.274627;5540.657370;7155060;122104;60004;1;8338;29.086342;117.839068;388;128077;5032.488518;1234.785715;391;128085;5061.839412;1236.909129;23436;100928;50.287926%;59964.832030;5644.844189;14.595833%;19.394167%;123706;0;7313;0.1%;0.1%;0.1%;0.1%;0.1%;0.1%;100.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.02%;0.05%;0.16%;6.04%;40.40%;52.68%;0.64%;0.01%;0.00%;0.01%;0.00%;0.00%;0.00%;0.00%;0.00%
3133 A description of this job goes here.
3134
3135 The job description (if provided) follows on a second line for terse
3136 v2. It appears on the same line for other terse versions.
3137
3138 To enable terse output, use the --minimal or `--output-format=terse'
3139 command line options. The first value is the version of the terse out‐
3140 put format. If the output has to be changed for some reason, this num‐
3141 ber will be incremented by 1 to signify that change.
3142
3143 Split up, the format is as follows (comments in brackets denote when a
3144 field was introduced or whether it's specific to some terse version):
3145
3146 terse version, fio version [v3], jobname, groupid, error
3147
3148 READ status:
3149
3150 Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
3151 Submission latency: min, max, mean, stdev (usec)
3152 Completion latency: min, max, mean, stdev (usec)
3153 Completion latency percentiles: 20 fields (see below)
3154 Total latency: min, max, mean, stdev (usec)
3155 Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
3156 IOPS [v5]: min, max, mean, stdev, number of samples
3157
3158 WRITE status:
3159
3160 Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
3161 Submission latency: min, max, mean, stdev (usec)
3162 Completion latency: min, max, mean, stdev (usec)
3163 Completion latency percentiles: 20 fields (see below)
3164 Total latency: min, max, mean, stdev (usec)
3165 Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
3166 IOPS [v5]: min, max, mean, stdev, number of samples
3167
3168 TRIM status [all but version 3]:
3169
3170 Fields are similar to READ/WRITE status.
3171
3172 CPU usage:
3173
3174 user, system, context switches, major faults, minor faults
3175
3176 I/O depths:
3177
3178 <=1, 2, 4, 8, 16, 32, >=64
3179
3180 I/O latencies microseconds:
3181
3182 <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000
3183
3184 I/O latencies milliseconds:
3185
3186 <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000, >=2000
3187
3188 Disk utilization [v3]:
3189
3190 disk name, read ios, write ios, read merges, write merges, read ticks, write ticks, time spent in queue, disk utilization percentage
3191
3192 Additional Info (dependent on continue_on_error, default off):
3193
3194 total # errors, first error code
3195
3196 Additional Info (dependent on description being set):
3197
3198 Text description
3199
3200 Completion latency percentiles can be a grouping of up to 20 sets, so
3201 for the terse output fio writes all of them. Each field will look like
3202 this:
3203
3204 1.00%=6112
3205
3206 which is the Xth percentile, and the `usec' latency associated with it.
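
A percentile field of this form can be split into its two parts as sketched below. The function name is hypothetical.

```python
def parse_percentile_field(field):
    """Split a terse percentile field such as '1.00%=6112' into
    (percentile, latency in usec)."""
    percentile, _, latency = field.partition("=")
    return float(percentile.rstrip("%")), int(latency)


print(parse_percentile_field("1.00%=6112"))  # (1.0, 6112)
```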
3207
3208 For Disk utilization, all disks used by fio are shown. So for each disk
3209 there will be a disk utilization section.
3210
3211 Below is a single line containing short names for each of the fields in
3212 the minimal output v3, separated by semicolons:
3213
3214 terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
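
One way to use the line of short names is to pair it with a line of terse v3 output. The sketch below does this for the first nine fields only; the sample values are made up for illustration and do not come from a real run, and the function name is hypothetical. Note that zip() stops at the shorter of the two lists, and that with more than one disk the trailing disk fields repeat, so a plain dictionary is only adequate for the single-disk case.

```python
def label_terse_fields(names_line, values_line):
    """Pair semicolon-separated field names with terse output values."""
    names = names_line.strip().split(";")
    values = values_line.strip().split(";")
    return dict(zip(names, values))


# First nine short names from the v3 list above.
names = ("terse_version_3;fio_version;jobname;groupid;error;"
         "read_kb;read_bandwidth;read_iops;read_runtime_ms")
# Hypothetical values, for illustration only.
values = "3;fio-3.1;randread-job;0;0;65536;21845;5461;3000"

fields = label_terse_fields(names, values)
print(fields["jobname"], fields["read_iops"])
```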
3215
3216 In client/server mode terse output differs from what appears when jobs
3217 are run locally. Disk utilization data is omitted from the standard
3218 terse output and for v3 and later appears on its own separate line at
3219 the end of each terse reporting cycle.
3220
3222 The json output format is intended to be both human readable and conve‐
3223 nient for automated parsing. For the most part its sections mirror
3224 those of the normal output. The runtime value is reported in msec and
3225 the bw value is reported in 1024 bytes per second units.
3226
3228 The json+ output format is identical to the json output format except
3229 that it adds a full dump of the completion latency bins. Each bins
3230 object contains a set of (key, value) pairs where keys are latency
3231 durations and values count how many I/Os had completion latencies of
3232 the corresponding duration. For example, consider:
3233
3234 "bins" : { "87552" : 1, "89600" : 1, "94720" : 1, "96768" : 1,
3235 "97792" : 1, "99840" : 1, "100864" : 2, "103936" : 6, "104960" :
3236 534, "105984" : 5995, "107008" : 7529, ... }
3237
3238 This data indicates that one I/O required 87,552ns to complete, two
3239 I/Os required 100,864ns to complete, and 7529 I/Os required 107,008ns
3240 to complete.
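
As a rough sketch of how such a bins object might be processed, the code below counts the I/Os and finds the bin in which a given percentile falls. It uses only the entries shown in the example above (the elided part is left out), and the function names are hypothetical.

```python
bins = {
    "87552": 1, "89600": 1, "94720": 1, "96768": 1, "97792": 1,
    "99840": 1, "100864": 2, "103936": 6, "104960": 534,
    "105984": 5995, "107008": 7529,
}


def total_ios(bins):
    """Total number of I/Os counted in the bins."""
    return sum(bins.values())


def percentile_bin(bins, pct):
    """Latency (ns) of the first bin at which the cumulative count
    reaches pct percent of all I/Os."""
    target = pct / 100 * total_ios(bins)
    running = 0
    durations = sorted(bins, key=int)
    for duration in durations:
        running += bins[duration]
        if running >= target:
            return int(duration)
    return int(durations[-1])


print(total_ios(bins))           # 14072
print(percentile_bin(bins, 50))  # 107008
```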
3241
3242 Also included with fio is a Python script fio_jsonplus_clat2csv that
3243 takes json+ output and generates CSV-formatted latency data suitable
3244 for plotting.
3245
3246 The latency durations actually represent the midpoints of latency
3247 intervals. For details refer to `stat.h' in the fio source.
3248
3250 There are two trace file formats that you can encounter. The older (v1)
3251 format has been unsupported since version 1.20-rc3 (March 2008). It is
3252 still described below in case you get an old trace and want to
3253 understand it.
3254
3255 In any case the trace is a simple text file with a single action per
3256 line.
3257
3258 Trace file format v1
3259 Each line represents a single I/O action in the following for‐
3260 mat:
3261
3262 rw, offset, length
3263
3264 where `rw=0/1' for read/write, and the `offset' and `length'
3265 entries being in bytes.
3266
3267 This format is not supported in fio versions >= 1.20-rc3.
3268
3269 Trace file format v2
3270 The second version of the trace file format was added in fio
3271 version 1.17. It allows access to more than one file per trace
3272 and has a bigger set of possible file actions.
3273
3274 The first line of the trace file has to be:
3275
3276 "fio version 2 iolog"
3277
3278 Following this can be lines in two different formats, which are
3279 described below.
3280
3281 The file management format:
3282 filename action
3283
3284 The `filename' is given as an absolute path. The `action'
3285 can be one of these:
3286
3287 add Add the given `filename' to the trace.
3288
3289 open Open the file with the given `filename'.
3290 The `filename' has to have been added with
3291 the add action before.
3292
3293 close Close the file with the given `filename'.
3294 The file has to have been opened before.
3295
3296 The file I/O action format:
3297 filename action offset length
3298
3299 The `filename' is given as an absolute path, and has to
3300 have been added and opened before it can be used with
3301 this format. The `offset' and `length' are given in
3302 bytes. The `action' can be one of these:
3303
3304 wait Wait for `offset' microseconds. Everything
3305 below 100 is discarded. The time is rela‐
3306 tive to the previous `wait' statement.
3307
3308 read Read `length' bytes beginning from `off‐
3309 set'.
3310
3311 write Write `length' bytes beginning from `off‐
3312 set'.
3313
3314 sync fsync(2) the file.
3315
3316 datasync
3317 fdatasync(2) the file.
3318
3319 trim Trim the given file from the given `offset'
3320 for `length' bytes.
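
A small v2 trace can be put together by hand or with a script. The following is a minimal sketch that builds the text of a trace for a single file, assuming the quotes around the header line above are only delimiters and are not written to the file. The function name and the path `/tmp/testfile' are hypothetical.

```python
def build_iolog_v2(filename, io_actions):
    """Return the text of a v2 trace for one file.

    io_actions is a list of (action, offset, length) tuples using the
    file I/O action format; the file is added and opened first and
    closed at the end.
    """
    lines = [
        "fio version 2 iolog",
        f"{filename} add",
        f"{filename} open",
    ]
    for action, offset, length in io_actions:
        lines.append(f"{filename} {action} {offset} {length}")
    lines.append(f"{filename} close")
    return "\n".join(lines) + "\n"


trace = build_iolog_v2(
    "/tmp/testfile",
    [("write", 0, 4096), ("read", 0, 4096)],
)
print(trace, end="")
```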
3321
3323 Colocation is a common practice used to get the most out of a machine.
3324 Knowing which workloads play nicely with each other and which ones
3325 don't is a much harder task. While fio can replay workloads concur‐
3326 rently via multiple jobs, it leaves some variability up to the sched‐
3327 uler making results harder to reproduce. Merging is a way to make the
3328 order of events consistent.
3329
3330 Merging is integrated into I/O replay and done when a merge_blk‐
3331 trace_file is specified. The list of files passed to read_iolog go
3332 through the merge process and output a single file stored to the speci‐
3333 fied file. The output file is passed on as if it were the only file
3334 passed to read_iolog. An example would look like:
3335
3336 $ fio --read_iolog="<file1>:<file2>" \
3337 --merge_blktrace_file="<output_file>"
3338
3339 Creating only the merged file can be done by passing the command line
3340 argument --merge-blktrace-only.
3341
3342 Scaling traces can be done to see the relative impact of any particular
3343 trace being slowed down or sped up. merge_blktrace_scalars takes in a
3344 colon separated list of percentage scalars. It is index paired with the
3345 files passed to read_iolog.
3346
3347 With scaling, it may be desirable to match the running time of all
3348 traces. This can be done with merge_blktrace_iters. It is index paired
3349 with read_iolog just like merge_blktrace_scalars.
3350
3351 As an example, take two traces, A and B, each 60s long. If we want to
3352 see the impact of trace A issuing IOs twice as fast and repeat trace A
3353 over the runtime of trace B, the following can be done:
3354
3355 $ fio --read_iolog="<trace_a>:<trace_b>" \
3356 --merge_blktrace_file="<output_file>" --merge_blktrace_scalars="50:100" \
3357 --merge_blktrace_iters="2:1"
3358
3359 This runs trace A at 2x the speed twice for approximately the same run‐
3360 time as a single run of trace B.
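
The arithmetic behind this example can be written out as below. This is a simplified model of how the scalar and iteration count affect runtime, not fio's implementation, and the function name is hypothetical.

```python
def merged_runtime(trace_seconds, scalar_percent, iterations):
    """Approximate runtime of one trace after scaling and repeating.

    A scalar of 50 halves the time of one pass (2x speed); the
    iteration count repeats the scaled pass.
    """
    return trace_seconds * scalar_percent / 100 * iterations


print(merged_runtime(60, 50, 2))   # trace A: 60.0
print(merged_runtime(60, 100, 1))  # trace B: 60.0
```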
3361
3363 In some cases, we want to understand CPU overhead in a test. For exam‐
3364 ple, we may test patches specifically to see whether they reduce
3365 CPU usage. Fio implements a balloon approach to create a thread per
3366 CPU that runs at idle priority, meaning that it only runs when nobody
3367 else needs the CPU. By measuring the amount of work completed by the
3368 thread, the idleness of each CPU can be derived accordingly.
3369
3370 A unit of work is defined as touching a full page of unsigned characters.
3371 The mean and standard deviation of the time to complete a unit of work
3372 are reported in the "unit work" section. Options can be chosen to report
3373 detailed per-CPU idleness or overall system idleness by aggregating
3374 per-CPU stats.
3375
3377 Fio is usually run in one of two ways, when data verification is done.
3378 The first is a normal write job of some sort with verify enabled. When
3379 the write phase has completed, fio switches to reads and verifies
3380 everything it wrote. The second model is running just the write phase,
3381 and then later on running the same job (but with reads instead of
3382 writes) to repeat the same I/O patterns and verify the contents. Both
3383 of these methods depend on the write phase being completed, as fio oth‐
3384 erwise has no idea how much data was written.
3385
3386 With verification triggers, fio supports dumping the current write
3387 state to local files. Then a subsequent read verify workload can load
3388 this state and know exactly where to stop. This is useful for testing
3389 cases where power is cut to a server in a managed fashion, for
3390 instance.
3391
3392 A verification trigger consists of two things:
3393
3394 1) Storing the write state of each job.
3395
3396 2) Executing a trigger command.
3397
3398 The write state is relatively small, on the order of hundreds of bytes
3399 to single kilobytes. It contains information on the number of comple‐
3400 tions done, the last X completions, etc.
3401
3402 A trigger is invoked either through creation ('touch') of a specified
3403 file in the system, or through a timeout setting. If fio is run with
3404 `--trigger-file=/tmp/trigger-file', then it will continually check for
3405 the existence of `/tmp/trigger-file'. When it sees this file, it will
3406 fire off the trigger (thus saving state, and executing the trigger com‐
3407 mand).
3408
3409 For client/server runs, there's both a local and remote trigger. If fio
3410 is running as a server backend, it will send the job states back to the
3411 client for safe storage, then execute the remote trigger, if specified.
3412 If a local trigger is specified, the server will still send back the
3413 write state, but the client will then execute the trigger.
3414
3415 Verification trigger example
3416 Let's say we want to run a powercut test on the remote Linux
3417 machine 'server'. Our write workload is in `write-test.fio'. We
3418 want to cut power to 'server' at some point during the run, and
3419 we'll run this test from the safety of our local machine,
3420 'localbox'. On the server, we'll start the fio backend normally:
3421
3422 server# fio --server
3423
3424 and on the client, we'll fire off the workload:
3425
3426 localbox$ fio --client=server \
3427 --trigger-file=/tmp/my-trigger \
3428 --trigger-remote="bash -c \"echo b > /proc/sysrq-trigger\""
3429
3430 We set `/tmp/my-trigger' as the trigger file, and we tell fio to
3431 execute:
3432
3433 echo b > /proc/sysrq-trigger
3434
3435 on the server once it has received the trigger and sent us the
3436 write state. This will work, but it's not really cutting power
3437 to the server, it's merely abruptly rebooting it. If we have a
3438 remote way of cutting power to the server through IPMI or simi‐
3439 lar, we could do that through a local trigger command instead.
3440 Let's assume we have a script that does IPMI reboot of a given
3441 hostname, ipmi-reboot. On localbox, we could then have run fio
3442 with a local trigger instead:
3443
3444 localbox$ fio --client=server \
3445 --trigger-file=/tmp/my-trigger --trigger="ipmi-reboot server"
3446
3447 For this case, fio would wait for the server to send us the
3448 write state, then execute `ipmi-reboot server' when that hap‐
3449 pened.
3450
3451 Loading verify state
3452 To load stored write state, a read verification job file must
3453 contain the verify_state_load option. If that is set, fio will
3454 load the previously stored state. For a local fio run this is
3455 done by loading the files directly, and on a client/server run,
3456 the server backend will ask the client to send the files over
3457 and load them from there.
3458
3460 Fio supports a variety of log file formats, for logging latencies,
3461 bandwidth, and IOPS. The logs share a common format, which looks like
3462 this:
3463
3464 time (msec), value, data direction, block size (bytes), offset
3465 (bytes)
3466
3467 `Time' for the log entry is always in milliseconds. The `value' logged
3468 depends on the type of log and will be one of the following:
3469
3470 Latency log
3471 Value is latency in nsecs
3472
3473 Bandwidth log
3474 Value is in KiB/sec
3475
3476 IOPS log
3477 Value is IOPS
3478
3479 `Data direction' is one of the following:
3480
3481 0 I/O is a READ
3482
3483 1 I/O is a WRITE
3484
3485 2 I/O is a TRIM
3486
3487 The entry's `block size' is always in bytes. The `offset' is the posi‐
3488 tion in bytes from the start of the file for that particular I/O. The
3489 logging of the offset can be toggled with log_offset.
3490
3491 Fio defaults to logging every individual I/O but when windowed logging
3492 is set through log_avg_msec, either the average (by default) or the
3493 maximum (log_max_value is set) `value' seen over the specified period
3494 of time is recorded. Each `data direction' seen within the window
3495 period will aggregate its values in a separate row. Further, when using
3496 windowed logging the `block size' and `offset' entries will always con‐
3497 tain 0.
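
A log entry in this common format can be parsed as sketched below. The sample line is hypothetical and not taken from a real log; the parser treats the offset as optional because its logging can be toggled, and the names used here are not part of fio.

```python
from typing import NamedTuple, Optional

DIRECTIONS = {0: "READ", 1: "WRITE", 2: "TRIM"}


class LogEntry(NamedTuple):
    time_msec: int
    value: int            # nsec, KiB/sec or IOPS, depending on the log
    direction: str
    block_size: int       # bytes; 0 with windowed logging
    offset: Optional[int]  # bytes; None if the offset was not logged


def parse_log_line(line):
    """Parse one comma-separated log line into a LogEntry."""
    fields = [int(field) for field in line.strip().split(",")]
    offset = fields[4] if len(fields) > 4 else None
    return LogEntry(fields[0], fields[1], DIRECTIONS[fields[2]],
                    fields[3], offset)


# Hypothetical latency log line: 1000 msec, 2345 nsec, read, 4096, 8192.
print(parse_log_line("1000, 2345, 0, 4096, 8192"))
```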
3498
3500 Normally fio is invoked as a stand-alone application on the machine
3501 where the I/O workload should be generated. However, the backend and
3502 frontend of fio can be run separately, i.e., the fio server can generate
3503 an I/O workload on the "Device Under Test" while being controlled by a
3504 client on another machine.
3505
3506 Start the server on the machine which has access to the storage DUT:
3507
3508 $ fio --server=args
3509
3510 where `args' defines what fio listens to. The arguments are of the form
3511 `type:hostname' or `IP,port'. `type' is either `ip' (or ip4) for TCP/IP
3512 v4, `ip6' for TCP/IP v6, or `sock' for a local unix domain socket.
3513 `hostname' is either a hostname or IP address, and `port' is the port
3514 to listen to (only valid for TCP/IP, not a local socket). Some exam‐
3515 ples:
3516
3517 1) fio --server
3518 Start a fio server, listening on all interfaces on the
3519 default port (8765).
3520
3521 2) fio --server=ip:hostname,4444
3522 Start a fio server, listening on IP belonging to hostname
3523 and on port 4444.
3524
3525 3) fio --server=ip6:::1,4444
3526 Start a fio server, listening on IPv6 localhost ::1 and
3527 on port 4444.
3528
3529 4) fio --server=,4444
3530 Start a fio server, listening on all interfaces on port
3531 4444.
3532
3533 5) fio --server=1.2.3.4
3534 Start a fio server, listening on IP 1.2.3.4 on the
3535 default port.
3536
3537 6) fio --server=sock:/tmp/fio.sock
3538 Start a fio server, listening on the local socket
3539 `/tmp/fio.sock'.
3540
3541 Once a server is running, a "client" can connect to the fio server
3542 with:
3543
3544 $ fio <local-args> --client=<server> <remote-args> <job file(s)>
3545
3546 where `local-args' are arguments for the client where it is running,
3547 `server' is the connect string, and `remote-args' and `job file(s)' are
3548 sent to the server. The `server' string follows the same format as it
3549 does on the server side, to allow IP/hostname/socket and port strings.
3550
3551 Fio can connect to multiple servers this way:
3552
3553 $ fio --client=<server1> <job file(s)> --client=<server2> <job
3554 file(s)>
3555
3556 If the job file is located on the fio server, then you can tell the
3557 server to load a local file as well. This is done by using
3558 --remote-config:
3559
3560 $ fio --client=server --remote-config /path/to/file.fio
3561
3562 Then fio will open this local (to the server) job file instead of being
3563 passed one from the client.
3564
3565 If you have many servers (example: 100 VMs/containers), you can input a
3566 pathname of a file containing host IPs/names as the parameter value for
3567 the --client option. For example, here is a `host.list' file
3568 containing 2 hostnames:
3569
3570 host1.your.dns.domain
3571 host2.your.dns.domain
3572
3573 The fio command would then be:
3574
3575 $ fio --client=host.list <job file(s)>
3576
3577 In this mode, you cannot input server-specific parameters or job files
3578 -- all servers receive the same job file.
3579
3580 In order to let `fio --client' runs use a shared filesystem from multi‐
3581 ple hosts, `fio --client' now prepends the IP address of the server to
3582 the filename. For example, if fio is using the directory `/mnt/nfs/fio'
3583 and is writing filename `fileio.tmp', with a --client `hostfile' con‐
3584 taining two hostnames `h1' and `h2' with IP addresses 192.168.10.120
3585 and 192.168.10.121, then fio will create two files:
3586
3587 /mnt/nfs/fio/192.168.10.120.fileio.tmp
3588 /mnt/nfs/fio/192.168.10.121.fileio.tmp
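
The naming rule from this example can be reproduced as follows, for instance to locate the per-server files afterwards. This is a sketch of the rule as described above, with a hypothetical function name.

```python
import posixpath


def client_filename(directory, server_ip, filename):
    """Build the per-server path: the server IP is prepended to the
    filename, separated by a dot."""
    return posixpath.join(directory, f"{server_ip}.{filename}")


for ip in ("192.168.10.120", "192.168.10.121"):
    print(client_filename("/mnt/nfs/fio", ip, "fileio.tmp"))
```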
3589
3590 Terse output in client/server mode will differ slightly from what is
3591 produced when fio is run in stand-alone mode. See the terse output sec‐
3592 tion for details.
3593
3595 fio was written by Jens Axboe <axboe@kernel.dk>.
3596 This man page was written by Aaron Carroll <aaronc@cse.unsw.edu.au>
3597 based on documentation by Jens Axboe.
3598 This man page was rewritten by Tomohiro Kusumi <tkusumi@tuxera.com>
3599 based on documentation by Jens Axboe.
3600
3602 Report bugs to the fio mailing list <fio@vger.kernel.org>.
3603 See REPORTING-BUGS.
3604
3605 REPORTING-BUGS: http://git.kernel.dk/cgit/fio/plain/REPORTING-BUGS
3606
3608 For further documentation see HOWTO and README.
3609 Sample jobfiles are available in the `examples/' directory.
3610 These are typically located under `/usr/share/doc/fio'.
3611
3612 HOWTO: http://git.kernel.dk/cgit/fio/plain/HOWTO
3613 README: http://git.kernel.dk/cgit/fio/plain/README
3614
3615
3616
3617User Manual August 2017 fio(1)