JDUPES(1) General Commands Manual JDUPES(1)

NAME
jdupes - finds and performs actions upon duplicate files

SYNOPSIS
jdupes [ options ] DIRECTORIES ...

DESCRIPTION
Searches the given path(s) for duplicate files. Such files are found by
comparing file sizes, then partial and full file hashes, followed by a
byte-by-byte comparison. The default behavior with no other "action
options" specified (delete, summarize, link, dedupe, etc.) is to print
sets of matching files.
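
The staged comparison described above can be sketched in Python. This is
only an illustration of the size, partial hash, full hash, byte-by-byte
sequence, not the jdupes implementation: SHA-256 stands in for whatever
hash jdupes uses, the 4096-byte partial window is taken from the CAVEATS
text, and the names find_duplicates, _digest and _same_bytes are
hypothetical. No options (such as zero-length file handling) are modeled.

```python
import hashlib
import os
from collections import defaultdict

PARTIAL_BYTES = 4096  # assumed partial-hash window


def _digest(path, limit=None):
    """Hash the first `limit` bytes of a file, or all of it if limit is None."""
    h = hashlib.sha256()  # stand-in hash, chosen only for the sketch
    remaining = limit
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            want = 65536 if remaining is None else min(65536, remaining)
            block = f.read(want)
            if not block:
                break
            h.update(block)
            if remaining is not None:
                remaining -= len(block)
    return h.digest()


def _same_bytes(a, b, chunk=65536):
    """Byte-by-byte comparison of two files."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            block_a, block_b = fa.read(chunk), fb.read(chunk)
            if block_a != block_b:
                return False
            if not block_a:
                return True


def _bucket(groups, key):
    """Split each group by key and drop groups left with a single file."""
    out = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        out.extend(g for g in buckets.values() if len(g) > 1)
    return out


def find_duplicates(paths):
    """Return lists of paths whose contents are identical."""
    groups = _bucket([list(paths)], os.path.getsize)              # 1. size
    groups = _bucket(groups, lambda p: _digest(p, PARTIAL_BYTES))  # 2. partial hash
    groups = _bucket(groups, _digest)                              # 3. full hash
    confirmed = []
    for group in groups:                                           # 4. byte-by-byte
        sets = []
        for path in group:
            for s in sets:
                if _same_bytes(s[0], path):
                    s.append(path)
                    break
            else:
                sets.append([path])
        confirmed.extend(s for s in sets if len(s) > 1)
    return confirmed
```

Each stage only examines files that survived the previous, cheaper
stage, which is why most non-duplicates are rejected without being read
in full.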

OPTIONS
-@ --loud
output annoying low-level debug info while running

-0 --print-null
when printing matches, use null bytes instead of CR/LF bytes, just
like 'find -print0' does. This has no effect with any action mode
other than the default "print matches" (delete, link, etc. will still
print normal line endings in the output).

-1 --one-file-system
do not match files that are on different filesystems or devices

-A --no-hidden
exclude hidden files from consideration

-B --dedupe
issue the btrfs same-extents ioctl to trigger a deduplication on disk.
The program must be built with btrfs support for this option to be
available

-C --chunk-size=BYTES
set the I/O chunk size manually; larger values may improve performance
on rotating media by reducing the number of head seeks required, but
they also increase memory usage and can reduce performance in some
cases

-D --debug
if this feature is compiled in, show debugging statistics and info at
the end of program execution

-d --delete
prompt user for files to preserve, deleting all others (see CAVEATS
below)

-f --omit-first
omit the first file in each set of matches

-H --hard-links
normally, when two or more files point to the same disk area they are
treated as non-duplicates; this option will change this behavior

-h --help
displays help

-i --reverse
reverse (invert) the sort order of matches

-I --isolate
isolate each command-line parameter from one another; only match if
the files are under different parameter specifications

-L --link-hard
replace all duplicate files with hardlinks to the first file in each
set of duplicates

-m --summarize
summarize duplicate file information

-M --print-summarize
print matches and summarize the duplicate file information at the end

-N --no-prompt
when used together with --delete, preserve the first file in each set
of duplicates and delete the others without prompting the user

-O --param-order
parameter order preservation is more important than the chosen sort;
this is particularly useful with the -N option to ensure that
automatic deletion behaves in a controllable way

-o --order=WORD
order files according to WORD:
  time - sort by modification time
  name - sort by filename (default)

-p --permissions
don't consider files with different owner/group or permission bits as
duplicates

-P --print=type
print extra information to stdout; valid options are:
  early - matches that pass early size/permission/link/etc. checks
  partial - files whose partial hashes match
  fullhash - files whose full hashes match

-Q --quick
[WARNING: RISK OF DATA LOSS, SEE CAVEATS] skip byte-for-byte
verification of duplicate pairs (use hashes only)

-q --quiet
hide progress indicator

-R --recurse:
for each directory given after this option follow subdirectories
encountered within (note the ':' at the end of option; see the
Examples section below for further explanation)

-r --recurse
for every directory given follow subdirectories encountered within

-l --link-soft
replace all duplicate files with symlinks to the first file in each
set of duplicates

-S --size
show size of duplicate files

-s --symlinks
follow symlinked directories

-T --partial-only
[WARNING: EXTREME RISK OF DATA LOSS, SEE CAVEATS] match based on hash
of first block of file data, ignoring the rest

-u --print-unique
print only a list of unique (non-duplicate, unmatched) files

-v --version
display jdupes version and compilation feature flags

-X --ext-filter=spec:info
exclude/filter files based on specified criteria; general format:

jdupes -X filter[:value][size_suffix]

Some filters take no value or multiple values. Filters that can take a
numeric option generally support the size multipliers K/M/G/T/P/E with
or without an added iB or B. Multipliers are binary-style unless the B
suffix is used, which selects decimal multipliers. For example, 16k or
16kib = 16384; 16kb = 16000. Multipliers are case-insensitive.

Filters have cumulative effects: jdupes -X size+:99 -X size-:101 will
cause only files of exactly 100 bytes in size to be included.

Extension matching is case-insensitive. Path substring matching is
case-sensitive.

Supported filters are:

`size[+-=]:number[suffix]'
match only if size is greater than (+), less than (-), or equal to (=)
the specified number. The +/- and = specifiers can be combined, e.g.
"size+=:4K" will only consider files with a size greater than or equal
to four kilobytes (4096 bytes).

`noext:ext1[,ext2,...]'
exclude files with certain extension(s), specified as a
comma-separated list. Do not use a leading dot.

`onlyext:ext1[,ext2,...]'
only include files with certain extension(s), specified as a
comma-separated list. Do not use a leading dot.

`nostr:text_string'
exclude all paths containing the substring text_string. This scans the
full file path, so it can be used to match directories:
-X nostr:dir_name/

`onlystr:text_string'
require all paths to contain the substring text_string. This scans the
full file path, so it can be used to match directories:
-X onlystr:dir_name/

`newer:datetime'
only include files newer than specified date. Date/time format:
"YYYY-MM-DD HH:MM:SS" (time is optional).

`older:datetime'
only include files older than specified date. Date/time format:
"YYYY-MM-DD HH:MM:SS" (time is optional).

-z --zero-match
consider zero-length files to be duplicates; this replaces the old
default behavior when -n was not specified

-Z --soft-abort
if the user aborts the program (as with CTRL-C) act on the matches
that were found before the abort was received. For example, if -L and
-Z are specified, all matches found prior to the abort will be hard
linked. The default behavior without -Z is to abort without taking any
actions.
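
The size-suffix and size-filter rules given under -X above can be
sketched in Python. This is a rough model built only from the
description in this page, not the jdupes parser; the names parse_size,
size_filter and passes_all are hypothetical.

```python
import re

_UNITS = "KMGTPE"


def parse_size(text):
    """Convert '16k', '16KiB' or '16kb' to a byte count.

    A bare letter or an 'iB' suffix is binary (1024-based);
    a 'B' suffix is decimal (1000-based). Case is ignored.
    """
    m = re.fullmatch(r"(\d+)(?:([kmgtpe])(ib|b)?)?", text.strip().lower())
    if m is None:
        raise ValueError(f"bad size: {text!r}")
    number, unit, tail = m.groups()
    if unit is None:
        return int(number)
    base = 1000 if tail == "b" else 1024
    return int(number) * base ** (_UNITS.index(unit.upper()) + 1)


def size_filter(spec):
    """Build a predicate from a spec such as 'size+=:4K'."""
    m = re.fullmatch(r"size([+\-=]+):(.+)", spec)
    if m is None:
        raise ValueError(f"bad size filter: {spec!r}")
    ops, limit = m.group(1), parse_size(m.group(2))

    def keep(size):
        return (("+" in ops and size > limit)
                or ("-" in ops and size < limit)
                or ("=" in ops and size == limit))

    return keep


def passes_all(size, specs):
    """Filters are cumulative: a file must pass every one of them."""
    return all(size_filter(spec)(size) for spec in specs)
```

With these definitions, passes_all(100, ["size+:99", "size-:101"]) is
true while 99 and 101 are rejected, matching the "exactly 100 bytes"
example given for cumulative filters.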

NOTES
A set of arrows is used in hard linking to show what action was taken
on each link candidate. These arrows are as follows:

----> This file was successfully hard linked to the first file in the
      duplicate chain

-@@-> This file was successfully symlinked to the first file in the
      chain

-==-> This file was already a hard link to the first file in the chain

-//-> Linking this file failed due to an error during the linking
      process

Duplicate files are listed together in groups with each file displayed
on a separate line. The groups are then separated from each other by
blank lines.
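
A script that consumes the default output can split it back into match
sets along those blank lines. The following Python sketch assumes the
plain default listing (no -S size lines, no -0) and file names that
contain no newline characters; parse_match_sets is a hypothetical
helper name.

```python
def parse_match_sets(output):
    """Split default jdupes-style output into lists of file paths.

    Each non-empty line is one file; a blank line ends the current set.
    """
    sets, current = [], []
    for line in output.split("\n"):
        if line:
            current.append(line)
        elif current:
            sets.append(current)
            current = []
    if current:  # output did not end with a blank line
        sets.append(current)
    return sets
```

For file names that may contain newlines, the -0 option is the safer
choice, as with 'find -print0'.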

EXAMPLES
jdupes a --recurse: b
will follow subdirectories under b, but not those under a.

jdupes a --recurse b
will follow subdirectories under both a and b.

jdupes -O dir1 dir3 dir2
will always place 'dir1' results first in any match set (where
relevant)

CAVEATS
Using -1 or --one-file-system prevents matches that cross filesystems,
but a more relaxed form of this option may be added that allows
cross-matching for all filesystems that each parameter is present on.

When using -d or --delete, care should be taken to insure against
accidental data loss.

-Z or --soft-abort used to be --hardabort in jdupes prior to v1.5 and
had the opposite behavior. Defaulting to taking action on abort is
probably not what most users would expect. The decision to invert
rather than reassign to a different option was made because this
feature was still fairly new at the time of the change.

The -O or --param-order option allows the user greater control over
what appears in the first position of a match set, specifically for
keeping the -N option from deleting all but one file in a set in a
seemingly random way. All directories specified on the command line
will be used as the sorting order of result sets first, followed by the
sorting algorithm set by the -o or --order option. This means that the
order of all match pairs for a single directory specification will
retain the old sorting behavior even if this option is specified.
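
The two-level ordering described above can be pictured as a sort on a
compound key: command-line position first, chosen sort second. The
Python sketch below is an illustration only; plain string comparison
stands in for the "name" sort (the comparison jdupes actually uses may
differ), and param_index and order_match_set are hypothetical names.

```python
def param_index(path, params):
    """Index of the first command-line parameter that contains `path`."""
    for i, directory in enumerate(params):
        prefix = directory.rstrip("/") + "/"
        if path.startswith(prefix):
            return i
    return len(params)  # not under any parameter: sort last


def order_match_set(files, params, param_order=False):
    """Order one match set, optionally honoring parameter order first."""
    if param_order:
        # Primary key: position of the parameter on the command line.
        # Secondary key: the ordinary sort (here, by name).
        return sorted(files, key=lambda p: (param_index(p, params), p))
    return sorted(files)
```

For the command line "dir1 dir3 dir2", a set containing dir2/a.txt,
dir3/b.txt and dir1/z.txt is ordered dir1/z.txt, dir3/b.txt, dir2/a.txt
with param_order=True, so the file kept by -N would be the one under
dir1.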

When -d or --delete is used together with -s or --symlinks, a user
could accidentally preserve a symlink while deleting the file it points
to.

The -Q or --quick option only reads each file once, hashes it, and
performs comparisons based solely on the hashes. There is a small but
significant risk of a hash collision, which is the reason for the
failsafe byte-for-byte comparison that this option explicitly bypasses.
Do not use it on ANY data set for which any amount of data loss is
unacceptable. This option is not included in the help text for the
program due to its risky nature. You have been warned!

The -T or --partial-only option produces results based on a hash of the
first block of file data in each file, ignoring everything else in the
file. Partial hash checks have always been an important exclusion step
in the jdupes algorithm, usually hashing the first 4096 bytes of data
and allowing files that are different at the start to be rejected
early. In certain scenarios it may be a useful heuristic for a user to
see that a set of files has the same size and the same starting data,
even if the remaining data does not match. Examples include comparing
files with damaged or missing data blocks, such as an incomplete file
transfer, or checking a data recovery against known-good copies to see
what damaged data can be deleted in favor of restoring the known-good
copy. This option is meant to be used with informational actions and
can result in EXTREME DATA LOSS if used with options that delete files,
create hard links, or perform other destructive actions on data based
on the matching output. Because of the potential for massive data
destruction, this option MUST BE SPECIFIED TWICE to take effect and
will error out if it is only specified once.
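
Why a first-block match is unsafe for destructive actions can be shown
with a small Python sketch. It assumes a 4096-byte first block, as
mentioned above, and uses SHA-256 as a stand-in hash; first_block_hash
and partial_only_match are hypothetical names, not jdupes functions.

```python
import hashlib
import os


def first_block_hash(path, block_size=4096):
    """Hash only the first block_size bytes of a file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read(block_size)).digest()


def partial_only_match(a, b, block_size=4096):
    """True if two files have the same size and the same first block.

    Nothing after the first block is examined, so files that differ
    later on are still reported as a match.
    """
    return (os.path.getsize(a) == os.path.getsize(b)
            and first_block_hash(a, block_size) == first_block_hash(b, block_size))
```

Two equally sized files that share their first 4096 bytes but diverge
afterwards satisfy partial_only_match, which is exactly the case where
deleting or hard linking one of them destroys data.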

Using the -C or --chunk-size option to override I/O chunk size can
increase performance on rotating storage media by reducing "head
thrashing," reading larger amounts of data sequentially from each file.
This tunable size can have bad side effects; the default size maximizes
algorithmic performance without regard to the I/O characteristics of
any given device and uses a modest amount of memory, but other values
may greatly increase memory usage or incur a lot more system call
overhead. Try several different values to see how they affect
performance for your hardware and data set. This option does not affect
match results in any way, so even if it slows down the file matching
process it will not hurt anything.
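
The claim that chunk size changes only speed and memory use, never the
result, follows from how chunked reading works. The Python sketch below
is an illustration rather than jdupes code; SHA-256 is a stand-in hash
and hash_file is a hypothetical name.

```python
import hashlib


def hash_file(path, chunk_size):
    """Hash a whole file, reading chunk_size bytes per read call."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            h.update(block)  # same byte stream regardless of chunk_size
    return h.hexdigest()
```

A smaller chunk_size means more read calls (more system call overhead);
a larger one means a bigger buffer held in memory. The digest is
identical either way because the bytes fed to the hash are the same.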

REPORTING BUGS
Send bug reports to jody@jodybruchon.com or use the issue tracker at:

http://github.com/jbruchon/jdupes/issues

SUPPORTING DEVELOPMENT
If you find this software useful, please consider financially
supporting its development through the author's home page:

https://www.jodybruchon.com/

AUTHOR
jdupes is created and maintained by Jody Bruchon <jody@jodybruchon.com>
and was forked from fdupes 1.51 by Adrian Lopez <adrian2@caribe.net>

LICENSE
The MIT License (MIT)

Copyright (C) 2015-2021 Jody Lee Bruchon and contributors
Forked from fdupes 1.51, Copyright (C) 1999-2014 Adrian Lopez and
contributors

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.