agedu(1)                         Simon Tatham                         agedu(1)

NAME

       agedu - correlate disk usage with last-access times to identify large
       and disused data

SYNOPSIS

       agedu [ options ] action [action...]

DESCRIPTION

       agedu scans a directory tree and produces reports about how much disk
       space is used in each directory and subdirectory, and also how that
       usage of disk space corresponds to files with last-access times a long
       time ago.

       In other words, agedu is a tool you might use to help you free up disk
       space. It lets you see which directories are taking up the most space,
       as du does; but unlike du, it also distinguishes between large
       collections of data which are still in use and ones which have not been
       accessed in months or years - for instance, large archives downloaded,
       unpacked, used once, and never cleaned up. Where du helps you find
       what's using your disk space, agedu helps you find what's wasting your
       disk space.

       agedu has several operating modes. In one mode, it scans your disk and
       builds an index file containing a data structure which allows it to
       efficiently retrieve any information it might need. Typically, you
       would use it in this mode first, and then run it in one of a number of
       `query' modes to display a report of the disk space usage of a
       particular directory and its subdirectories. Those reports can be
       produced as plain text (much like du) or as HTML. agedu can even run as
       a miniature web server, presenting each directory's HTML report with
       hyperlinks to let you navigate around the file system to similar
       reports for other directories.

       So you would typically start using agedu by telling it to do a scan of
       a directory tree and build an index. This is done with a command such
       as

       $ agedu -s /home/fred

       which will build a large data file called agedu.dat in your current
       directory. (If that current directory is inside /home/fred, don't worry
       - agedu is smart enough to discount its own index file.)

       Having built the index, you would now query it for reports of disk
       space usage. If you have a graphical web browser, the simplest and
       nicest way to query the index is by running agedu in web server mode:

       $ agedu -w

       which will print (among other messages) a URL on its standard output
       along the lines of

       URL: http://127.0.0.1:48638/

       (That URL will always begin with `127.', meaning that it's in the
       localhost address space. So only processes running on the same computer
       can even try to connect to that web server, and also there is access
       control to prevent other users from seeing it - see below for more
       detail.)

       Now paste that URL into your web browser, and you will be shown a
       graphical representation of the disk usage in /home/fred and its
       immediate subdirectories, with varying colours used to show the
       difference between disused and recently-accessed data. Click on any
       subdirectory to descend into it and see a report for its subdirectories
       in turn; click on parts of the pathname at the top of any page to
       return to higher-level directories. When you've finished browsing, you
       can just press Ctrl-D to send an end-of-file indication to agedu, and
       it will shut down.

       After that, you probably want to delete the data file agedu.dat, since
       it's pretty large. In fact, the command agedu -R will do this for you;
       and you can chain agedu commands on the same command line, so that
       instead of the above you could have done

       $ agedu -s /home/fred -w -R

       for a single self-contained run of agedu which builds its index, serves
       web pages from it, and cleans it up when finished.
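
       If you do this often, the chained command can be wrapped in a small
       shell function. The following is only a sketch: the function name
       agedu_oneshot is hypothetical and not part of agedu, and it uses
       nothing beyond the -s, -w and -R options described above.

```shell
# Hypothetical helper (not part of agedu): scan a tree, serve the web
# report, and remove the index afterwards.
agedu_oneshot() {
    dir=${1:-.}    # directory to scan; defaults to the current directory
    # -R comes last so the index is deleted only after the web server exits.
    agedu -s "$dir" -w -R
}
```

       For example, `agedu_oneshot /home/fred' runs the same three actions as
       the command shown above.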

       In some situations, you might want to scan the directory structure of
       one computer, but run agedu's user interface on another. In that case,
       you can do your scan using the agedu -S option in place of agedu -s,
       which will make agedu not bother building an index file but instead
       just write out its scan results in plain text on standard output; then
       you can funnel that output to the other machine using SSH (or whatever
       other technique you prefer), and there, run agedu -L to load in the
       textual dump and turn it into an index file. For example, you might run
       a command like this (plus any ssh options you need) on the machine you
       want to scan:

       $ agedu -S /home/fred | ssh indexing-machine agedu -L

       or, equivalently, run something like this on the other machine:

       $ ssh machine-to-scan agedu -S /home/fred | agedu -L

       Either way, the agedu -L command will create an agedu.dat index file,
       which you can then use with agedu -w just as above.

       (Another way to do this might be to build the index file on the first
       machine as normal, and then just copy it to the other machine once it's
       complete. However, for efficiency, the index file is formatted
       differently depending on the CPU architecture that agedu is compiled
       for. So if that doesn't match between the two machines - e.g. if one is
       a 32-bit machine and one 64-bit - then agedu.dat files written on one
       machine will not work on the other. The technique described above using
       -S and -L should work between any two machines.)
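
       The second form above can be sketched as a reusable function. This is
       an illustration only: agedu_remote_scan is a hypothetical name, it
       assumes agedu is installed on both machines and is on the remote PATH,
       and it uses the -f option (described under OPTIONS) to name the local
       index file.

```shell
# Hypothetical helper: scan DIR on HOST over SSH and build a local index.
# Usage: agedu_remote_scan HOST DIR [INDEXFILE]
agedu_remote_scan() {
    host=$1
    dir=$2
    index=${3:-agedu.dat}
    # The remote side dumps its scan as text (-S); the local side loads the
    # dump (-L) into the index file named by -f.
    ssh "$host" agedu -S "$dir" | agedu -L -f "$index"
}
```

       Note that ssh hands the remote command to a shell on the other
       machine, so a directory name containing spaces or shell metacharacters
       would need extra quoting that this sketch does not attempt.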

       If you don't have a graphical web browser, you can do text-based
       queries instead of using agedu's web interface. Having scanned
       /home/fred in any of the ways suggested above, you might run

       $ agedu -t /home/fred

       which again gives a summary of the disk usage in /home/fred and its
       immediate subdirectories; but this time agedu will print it on standard
       output, in much the same format as du. If you then want to find out how
       much old data is there, you can add the -a option to show only files
       last accessed a certain length of time ago. For example, to show only
       files which haven't been looked at in six months or more:

       $ agedu -t /home/fred -a 6m

       That's the essence of what agedu does. It has other modes of operation
       for more complex situations, and the usual array of configurable
       options. The following sections contain a complete reference for all
       its functionality.

OPERATING MODES

       This section describes the operating modes supported by agedu. Each of
       these is in the form of a command-line option, sometimes with an
       argument. Multiple operating-mode options may appear on the command
       line, in which case agedu will perform the specified actions one after
       another. For instance, as shown in the previous section, you might want
       to perform a disk scan and immediately launch a web server giving
       reports from that scan.

       -s directory or --scan directory
              In this mode, agedu scans the file system starting at the
              specified directory, and indexes the results of the scan into a
              large data file which other operating modes can query.

              By default, the scan is restricted to a single file system
              (since the expected use of agedu is that you would probably use
              it because a particular disk partition was running low on
              space). You can remove that restriction using the --cross-fs
              option; other configuration options allow you to include or
              exclude files or entire subdirectories from the scan. See the
              next section for full details of the configurable options.

              The index file is created with restrictive permissions, in case
              the file system you are scanning contains confidential
              information in its structure.

              Index files are dependent on the characteristics of the CPU
              architecture you created them on. You should not expect to be
              able to move an index file between different types of computer
              and have it continue to work. If you need to transfer the
              results of a disk scan to a different kind of computer, see the
              -D and -L options below.

       -w or --web
              In this mode, agedu expects to find an index file already
              written. It allocates a network port, and starts up a web server
              on that port which serves reports generated from the index file.
              By default it invents its own URL and prints it out.

              The web server runs until agedu receives an end-of-file event on
              its standard input. (The expected usage is that you run it from
              the command line, immediately browse web pages until you're
              satisfied, and then press Ctrl-D.) To disable the EOF behaviour,
              use the --no-eof option.

              In case the index file contains any confidential information
              about your file system, the web server protects the pages it
              serves from access by other people. On Linux, this is done
              transparently by means of using /proc/net/tcp to check the owner
              of each incoming connection; failing that, the web server will
              require a password to view the reports, and agedu will print the
              password it invented on standard output along with the URL.

              Configurable options for this mode let you specify your own
              address and port number to listen on, and also specify your own
              choice of authentication method (including turning
              authentication off completely) and a username and password of
              your choice.

       -t directory or --text directory
              In this mode, agedu generates a textual report on standard
              output, listing the disk usage in the specified directory and
              all its subdirectories down to a given depth. By default that
              depth is 1, so that you see a report for directory itself and
              all of its immediate subdirectories. You can configure a
              different depth (or no depth limit) using -d, described in the
              next section.

              Used on its own, -t merely lists the total disk usage in each
              subdirectory; agedu's additional ability to distinguish unused
              from recently-used data is not activated. To activate it, use
              the -a option to specify a minimum age.

              The directory structure stored in agedu's index file is treated
              as a set of literal strings. This means that you cannot refer to
              directories by synonyms. So if you ran agedu -s ., then all the
              path names you later pass to the -t option must be either `.' or
              begin with `./'. Similarly, symbolic links within the directory
              you scanned will not be followed; you must refer to each
              directory by its canonical, symlink-free pathname.

       -R or --remove
              In this mode, agedu deletes its index file. Running just agedu
              -R on its own is therefore equivalent to typing rm agedu.dat.
              However, you can also put -R on the end of a command line to
              indicate that agedu should delete its index file after it
              finishes performing other operations.

       -S directory or --scan-dump directory
              In this mode, agedu will scan a directory tree and convert the
              results straight into a textual dump on standard output, without
              generating an index file at all. The dump data is intended for
              agedu -L to read.

       -L or --load
              In this mode, agedu expects to read a dump produced by the -S
              option from its standard input. It constructs an index file from
              that dump, exactly as it would have if it had read the same data
              from a disk scan in -s mode.

       -D or --dump
              In this mode, agedu reads an existing index file and produces a
              dump of its contents on standard output, in the same format used
              by -S and -L. This option could be used to convert an existing
              index file into a format acceptable to a different kind of
              computer, by dumping it using -D and then loading the dump back
              in on the other machine using -L.

              (The output of agedu -D on an existing index file will not be
              exactly identical to what agedu -S would have originally
              produced, due to a difference in treatment of last-access times
              on directories. However, it should be effectively equivalent for
              most purposes. See the documentation of the --dir-atime option
              in the next section for further detail.)

       -H directory or --html directory
              In this mode, agedu will generate an HTML report of the disk
              usage in the specified directory and its immediate
              subdirectories, in the same form that it serves from its web
              server in -w mode.

              By default, a single HTML report will be generated and simply
              written to standard output, with no hyperlinks pointing to other
              similar pages. If you also specify the -d option (see below),
              agedu will instead write out a collection of HTML files with
              hyperlinks between them, and call the top-level file index.html.
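
              As a sketch of the multiple-file form, the function below
              combines -H with the -d and -o options documented in the next
              section. The name agedu_html_tree is hypothetical, and the
              default depth of 1 is this sketch's own choice rather than
              anything agedu prescribes.

```shell
# Hypothetical helper: write a linked set of HTML reports for DIR into
# OUTDIR. Usage: agedu_html_tree DIR OUTDIR [DEPTH]
agedu_html_tree() {
    dir=$1
    outdir=$2
    depth=${3:-1}
    # With -d present, -o names a directory that will hold index.html
    # and the other generated pages.
    agedu -H "$dir" -d "$depth" -o "$outdir"
}
```

              For example, `agedu_html_tree /home/fred report' would leave
              the top-level page in report/index.html.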

       --cgi  In this mode, agedu will run as the bulk of a CGI script which
              provides the same set of web pages as the built-in web server
              would. It will read the usual CGI environment variables, and
              write CGI-style data to its standard output.

              The actual CGI program itself should be a tiny wrapper around
              agedu which passes it the --cgi option, and also (probably) -f
              to locate the index file. agedu will do everything else. For
              example, your script might read

              #!/bin/sh
              /some/path/to/agedu --cgi -f /some/other/path/to/agedu.dat

              (Note that agedu will produce the entire CGI output, including
              status code, HTTP headers and the full HTML document. If you try
              to surround the call to agedu --cgi with code that adds your own
              HTML header and footer, you won't get the results you want, and
              agedu's HTTP-level features such as auto-redirecting to
              canonical versions of URIs will stop working.)

              No access control is performed in this mode: restricting access
              to CGI scripts is assumed to be the job of the web server.

       -h or --help
              Causes agedu to print some help text and terminate immediately.

       -V or --version
              Causes agedu to print its version number and terminate
              immediately.

OPTIONS

       This section describes the various configuration options that affect
       agedu's operation in one mode or another.

       The following option affects nearly all modes (except -S):

       -f filename or --file filename
              Specifies the location of the index file which agedu creates,
              reads or removes depending on its operating mode. By default,
              this is simply `agedu.dat', in whatever is the current working
              directory when you run agedu.

       The following options affect the disk-scanning modes, -s and -S:

       --cross-fs and --no-cross-fs
              These configure whether or not the disk scan is permitted to
              cross between different file systems. The default is not to:
              agedu will normally skip over subdirectories on which a
              different file system is mounted. This makes it convenient when
              you want to free up space on a particular file system which is
              running low. However, in other circumstances you might wish to
              see general information about the use of space no matter which
              file system it's on (for instance, if your real concern is your
              backup media running out of space, and if your backups do not
              treat different file systems specially); in that situation, use
              --cross-fs.

              (Note that this default is the opposite way round from the
              corresponding option in du.)

       --prune wildcard and --prune-path wildcard
              These cause particular files or directories to be omitted
              entirely from the scan. If agedu's scan encounters a file or
              directory whose name matches the wildcard provided to the
              --prune option, it will not include that file in its index, and
              also if it's a directory it will skip over it and not scan its
              contents.

              Note that in most Unix shells, wildcards will probably need to
              be escaped on the command line, to prevent the shell from
              expanding the wildcard before agedu sees it.

              --prune-path is similar to --prune, except that the wildcard is
              matched against the entire pathname instead of just the filename
              at the end of it. So whereas --prune *a*b* will match any file
              whose actual name contains an a somewhere before a b,
              --prune-path *a*b* will also match a file whose name contains b
              and which is inside a directory containing an a, or any file
              inside a directory of that form, and so on.
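
              The difference between matching the final filename and matching
              the whole pathname can be illustrated with the shell's own
              pattern matching. This is a sketch only: the function names are
              hypothetical, and shell `case' globbing merely stands in for
              agedu's wildcard matcher, which may differ in detail.

```shell
# Illustration only: shell `case` patterns stand in for agedu's wildcards.
# Usage: matches_name PATTERN PATH   /   matches_path PATTERN PATH

matches_name() {    # in the style of --prune: test the last component only
    case ${2##*/} in
        $1) return 0 ;;
    esac
    return 1
}

matches_path() {    # in the style of --prune-path: test the whole pathname
    case $2 in
        $1) return 0 ;;
    esac
    return 1
}

# blob.txt has no `a' before a `b', but ./data/blob.txt as a whole does.
matches_name '*a*b*' ./data/blob.txt || echo "name: no match"
matches_path '*a*b*' ./data/blob.txt && echo "path: match"
```

              The patterns are single-quoted here for the reason given above:
              otherwise the shell would expand them before the function (or
              agedu) ever saw them.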

       --exclude wildcard and --exclude-path wildcard
              These cause particular files or directories to be omitted from
              the index, but not from the scan. If agedu's scan encounters a
              file or directory whose name matches the wildcard provided to
              the --exclude option, it will not include that file in its index
              - but unlike --prune, if the file in question is a directory it
              will still scan its contents and index them if they are not
              ruled out themselves by --exclude options.

              As above, --exclude-path is similar to --exclude, except that
              the wildcard is matched against the entire pathname.

       --include wildcard and --include-path wildcard
              These cause particular files or directories to be re-included in
              the index and the scan, if they had previously been ruled out by
              one of the above exclude or prune options. You can interleave
              include, exclude and prune options as you wish on the command
              line, and if more than one of them applies to a file then the
              last one takes priority.

              For example, if you wanted to see only the disk space taken up
              by MP3 files, you might run

              $ agedu -s . --exclude '*' --include '*.mp3'

              which will cause everything to be omitted from the index, but
              then the MP3 files to be put back in. If you then wanted only a
              subset of those MP3s, you could then exclude some of them again
              by adding, say, `--exclude-path './queen/*'' (or, more
              efficiently, `--prune ./queen') on the end of that command.

              As with the previous two options, --include-path is similar to
              --include except that the wildcard is matched against the entire
              pathname.
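
              The `last one takes priority' rule can be modelled in a few
              lines of shell. This is a simplified sketch, not agedu's
              implementation: index_decision is a hypothetical name, only the
              filename-based include and exclude rules are modelled (not
              --prune or the -path variants), and shell `case' globbing
              stands in for agedu's wildcard matcher.

```shell
# Illustration only: report whether a file would end up in the index.
# Usage: index_decision PATH [include|exclude PATTERN]...
# Rules are given in command-line order; a file matched by no rule is
# included, as in a scan with no include/exclude options at all.
index_decision() {
    name=${1##*/}       # rules here test the final filename only
    shift
    verdict=include
    while [ $# -ge 2 ]; do
        case $name in
            $2) verdict=$1 ;;   # a later match overrides an earlier one
        esac
        shift 2
    done
    printf '%s\n' "$verdict"
}
```

              With the rules from the MP3 example, `index_decision
              ./a/song.mp3 exclude '*' include '*.mp3'' prints include,
              while the same rules applied to ./a/notes.txt print exclude.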

       --progress, --no-progress and --tty-progress
              When agedu is scanning a directory tree, it will typically print
              a one-line progress report every second showing where it has
              reached in the scan, so you can have some idea of how much
              longer it will take. (Of course, it can't predict exactly how
              long it will take, since it doesn't know which of the
              directories it hasn't scanned yet will turn out to be huge.)

              By default, those progress reports are displayed on agedu's
              standard error channel, if that channel points to a terminal
              device. If you need to manually enable or disable them, you can
              use the above three options to do so: --progress unconditionally
              enables the progress reports, --no-progress unconditionally
              disables them, and --tty-progress reverts to the default
              behaviour which is conditional on standard error being a
              terminal.

       --dir-atime and --no-dir-atime
              In normal operation, agedu ignores the atimes (last access
              times) on the directories it scans: it only pays attention to
              the atimes of the files inside those directories. This is
              because directory atimes tend to be reset by a lot of system
              administrative tasks, such as cron jobs which scan the file
              system for one reason or another - or even other invocations of
              agedu itself, though it tries to avoid modifying any atimes if
              possible. So the literal atimes on directories are typically not
              representative of how long ago the data in question was last
              accessed with real intent to use that data in particular.

              Instead, agedu makes up a fake atime for every directory it
              scans, which is equal to the newest atime of any file in or
              below that directory (or the directory's last modification time,
              whichever is newest). This is based on the assumption that all
              important accesses to directories are actually accesses to the
              files inside those directories, so that when any file is
              accessed all the directories on the path leading to it should be
              considered to have been accessed as well.

              In unusual cases it is possible that a directory itself might
              embody important data which is accessed by reading the
              directory. In that situation, agedu's atime-faking policy will
              misreport the directory as disused. In the unlikely event that
              such directories form a significant part of your disk space
              usage, you might want to turn off the faking. The --dir-atime
              option does this: it causes the disk scan to read the original
              atimes of the directories it scans.

              The faking of atimes on directories also requires a processing
              pass over the index file after the main disk scan is complete.
              --dir-atime also turns this pass off. Hence, this option affects
              the -L option as well as -s and -S.

              (The previous section mentioned that there might be subtle
              differences between the output of agedu -s /path -D and agedu -S
              /path. This is why. Doing a scan with -s and then dumping it
              with -D will dump the fully faked atimes on the directories,
              whereas doing a scan-to-dump with -S will dump only partially
              faked atimes - specifically, each directory's last modification
              time - since the subsequent processing pass will not have had a
              chance to take place. However, loading either of the resulting
              dump files with -L will perform the atime-faking processing
              pass, leading to the same data in the index file in each case.
              In normal usage it should be safe to ignore all of this
              complexity.)
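
              The atime-faking rule can be modelled on a toy data set. The
              sketch below is not agedu's code and does not read agedu's dump
              format: the record layout (`f ATIME PATH' for files, `d MTIME
              PATH' for directories, with no spaces in paths) and the name
              fake_dir_atimes are invented for illustration. It also assumes
              that only a directory's own modification time is considered,
              not those of its subdirectories, which the text above leaves
              open.

```shell
# Illustration only: compute a fake atime for each directory as the newest
# of its own mtime and the atimes of all files in or below it.
# Input (stdin):  "f ATIME PATH" for files, "d MTIME PATH" for directories.
# Output:         "PATH FAKE_ATIME" for each directory, sorted by path.
fake_dir_atimes() {
    awk '
        $1 == "d" { dirs[$3] = $2 + 0 }     # start from the directory mtime
        $1 == "f" { fpath[++n] = $3; ftime[n] = $2 + 0 }
        END {
            for (d in dirs) {
                best = dirs[d]
                for (i = 1; i <= n; i++)
                    # a file counts if its path starts with "<dir>/"
                    if (index(fpath[i], d "/") == 1 && ftime[i] > best)
                        best = ftime[i]
                print d, best
            }
        }
    ' | LC_ALL=C sort
}
```

              Fed a directory ./proj with mtime 100 that contains a file last
              accessed at time 300, the function reports 300 for ./proj: the
              file access counts as an access to every directory above it.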

       --mtime
              This option causes agedu to index files by their last
              modification time instead of their last access time. You might
              want to use this if your last access times were completely
              useless for some reason: for example, if you had recently
              searched every file on your system, the system would have lost
              all the information about what files you hadn't recently
              accessed before then. Using this option is liable to be less
              effective at finding genuinely wasted space than the normal mode
              (that is, it will be more likely to flag things as disused when
              they're not, so you will have more candidates to go through by
              hand looking for data you don't need), but may be better than
              nothing if your last-access times are unhelpful.

              Another use for this mode might be to find recently created
              large data. If your disk has been gradually filling up for
              years, the default mode of agedu will let you find unused data
              to delete; but if you know your disk had plenty of space
              recently and now it's suddenly full, and you suspect that some
              rogue program has left a large core dump or output file, then
              agedu --mtime might be a convenient way to locate the culprit.

       The following option affects all the modes that generate reports: the
       web server mode -w, the stand-alone HTML generation mode -H and the
       text report mode -t.

       --files
              This option causes agedu's reports to list the individual files
              in each directory, instead of just giving a combined report for
              everything that's not in a subdirectory.

       The following option affects the text report mode -t.

       -a age or --age age
              This option tells agedu to report only files of at least the
              specified age. An age is specified as a number, followed by one
              of `y' (years), `m' (months), `w' (weeks) or `d' (days). (This
              syntax is also used by the -r option.) For example, -a 6m will
              produce a text report which includes only files at least six
              months old.
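
              The age syntax is simple enough to parse by hand, for instance
              when validating an argument in a wrapper script before passing
              it to -a. The function below is a sketch under stated
              assumptions: age_to_days is a hypothetical name, and the
              conversion factors (365-day years, 30-day months) are rough
              approximations chosen for this example, not the values agedu
              itself uses.

```shell
# Illustration only: convert an age such as 6m or 2y to a number of days.
# Assumes 1y = 365 days and 1m = 30 days; agedu's own arithmetic may differ.
# Prints the day count, or returns non-zero for a malformed age.
age_to_days() {
    num=${1%?}            # everything except the last character
    unit=${1#"$num"}      # the last character
    case $num in
        ''|*[!0-9]*) return 1 ;;    # the number must be all digits
    esac
    # Strip leading zeros so the shell does not read the number as octal.
    while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do
        num=${num#0}
    done
    case $unit in
        y) echo $((num * 365)) ;;
        m) echo $((num * 30)) ;;
        w) echo $((num * 7)) ;;
        d) echo "$num" ;;
        *) return 1 ;;
    esac
}
```

              Under those assumptions, `age_to_days 6m' prints 180, and an
              argument with an unknown unit such as 6x is rejected.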
479
480       The following options affect the stand-alone HTML  generation  mode  -H
481       and the text report mode -t.
482
483       -d depth or --depth depth
484              This  option  controls the maximum depth to which agedu recurses
485              when generating a text or HTML report.
486
487              In text mode, the default is 1, meaning  that  the  report  will
488              include  the  directory given on the command line and all of its
489              immediate subdirectories. A depth of two includes another  level
490              below  that, and so on; a depth of zero means only the directory
491              on the command line.
492
493              In HTML mode, specifying this option switches agedu from writing
494              out  a single HTML file to writing out multiple files which link
495              to each other. A depth of 1 means agedu will write out  an  HTML
496              file  for the given directory and also one for each of its imme‐
497              diate subdirectories.
498
499              If you want agedu to recurse as deeply  as  possible,  give  the
500              special word `max' as an argument to -d.
501
502       -o filename or --output filename
503              This option is used to specify an output file for agedu to write
504              its report to. In text mode or single-file HTML mode, the  argu‐
505              ment  is  treated  as  the name of a file. In multiple-file HTML
506              mode, the argument is treated as the name of  a  directory:  the
507              directory  will be created if it does not already exist, and the
508              output HTML files will be created inside it.
509
510       The following option affects only the stand-alone HTML generation  mode
511       -H, and even then, only in recursive mode (with -d):
512
513       --numeric
514              This  option  tells  agedu to name most of its output HTML files
515              numerically. The root of the whole output file  collection  will
516              still  be  called  index.html,  but all the rest will have names
517              like 73.html or 12525.html. (The numbers are  essentially  arbi‐
518              trary;  in  fact, they're indices of nodes in the data structure
519              used by agedu's index file.)
520
521              This system of file naming is less intuitive than the default of
522              naming  files  after the sub-pathname they index. It's also less
523              stable: the same pathname will not necessarily be represented by
524              the  same  filename  if agedu -H is re-run after another scan of
525              the same directory tree. However, it does have the  virtue  that
526              it  keeps  the  filenames  short, so that even if your directory
527              tree is very deep, the output HTML files  won't  exceed  any  OS
528              limit on filename length.
529
530       The  following options affect the web server mode -w, and in some cases
531       also the stand-alone HTML generation mode -H:
532
533       -r age range or --age-range age range
534              The HTML reports produced by agedu use a  range  of  colours  to
535              indicate  how  long ago data was last accessed, running from red
536              (representing the most disused data) to green (representing  the
537              newest).  By default, the lengths of time represented by the two
538              ends of that spectrum are chosen by examining the data  file  to
539              see what range of ages appears in it. However, you might want to
540              set your own limits, and you can do this using -r.
541
542              The argument to -r consists of a single age, or two  ages  sepa‐
543              rated  by  a  minus sign. An age is a number, followed by one of
544              `y' (years), `m' (months), `w' (weeks) or `d' (days). (This syn‐
545              tax  is  also used by the -a option.) The first age in the range
546              represents the oldest data, and will  be  coloured  red  in  the
547              HTML;  the  second age represents the newest, coloured green. If
548              the second age is not specified, it will  default  to  zero  (so
549              that green means data which has been accessed just now).
550
551              For  example,  -r 2y will mark data in red if it has been unused
552              for two years or more, and green if it has  been  accessed  just
553              now. -r 2y-3m will similarly mark data red if it has been unused
554              for two years or more, but will mark it green  if  it  was  last
555              accessed three months ago or more recently.
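
The age-range syntax described above can be sketched as a small parser. This is only an illustration of the syntax, not agedu's own code: the function names are hypothetical, ages are assumed to be whole numbers, and the number of days assigned to a month or a year is an assumption, since this page does not say how agedu converts them.

```python
import re

# Assumed conversion factors, in days. The man page does not state how
# agedu converts months or years, so these values are illustrative only.
DAYS_PER_UNIT = {"y": 365, "m": 30, "w": 7, "d": 1}

# An age is a number followed by one of y, m, w or d.
AGE_RE = re.compile(r"(\d+)([ymwd])")


def parse_age(text):
    """Convert a single age such as '2y' or '3m' to a number of days."""
    match = AGE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid age: {text!r}")
    count, unit = match.groups()
    return int(count) * DAYS_PER_UNIT[unit]


def parse_age_range(text):
    """Return (red_days, green_days) for a -r style argument.

    The first age is the oldest (red) end of the spectrum. The second,
    if present after a minus sign, is the newest (green) end; if it is
    omitted, it defaults to zero.
    """
    oldest, separator, newest = text.partition("-")
    red_days = parse_age(oldest)
    green_days = parse_age(newest) if separator else 0
    return red_days, green_days
```

Under these assumptions, parse_age_range("2y-3m") yields the pair (730, 90), and parse_age_range("2y") yields (730, 0).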
556
557       --address addr[:port]
558              Specifies  the  network  address  and port number on which agedu
559              should listen when running its web server. If you want agedu  to
560              listen  for  connections  coming in from any source, specify the
561              address as the special value ANY. If the port number is omitted,
562              an arbitrary unused port will be chosen for you and displayed.
563
564              If  you  specify  this  option,  agedu will not print its URL on
565              standard output (since you are expected to know what address you
566              told it to listen to).
567
568       --auth auth-type
569              Specifies  how  agedu  should control access to the web pages it
570              serves. The options are as follows:
571
572              magic  This option only works on Linux, and only when the incom‐
573                     ing  connection  is  from  the same machine that agedu is
574                     running on. On Linux, the special file /proc/net/tcp con‐
575                     tains  a  list  of network connections currently known to
576                     the operating system kernel, including which user id cre‐
577                     ated them. So agedu will look up each incoming connection
578                     in that file, and allow access if it comes from the  same
579                     user  id  under which agedu itself is running. Therefore,
580                     in agedu's normal web server mode, you can safely run  it
581                     on a multi-user machine and no other user will be able to
582                     read data out of your index file.
583
584              basic  In this mode, agedu will use HTTP  Basic  authentication:
585                     the user will have to provide a username and password via
586                     their browser. agedu will normally make up a username and
587                     password  for  the purpose, but you can specify your own;
588                     see below.
589
590              none   In this mode, the web server is  unauthenticated:  anyone
591                     connecting to it has full access to the reports generated
592                     by agedu. Do not do this unless there is  nothing  confi‐
593                     dential at all in your index file, or unless you are cer‐
594                     tain that nobody but you can run processes on  your  com‐
595                     puter.
596
597              default
598                     This is the default mode if you do not specify one of the
599                     above. In this mode, agedu  will  attempt  to  use  Linux
600                     magic  authentication,  but if it detects at startup time
601                     that /proc/net/tcp is absent or  non-functional  then  it
602                     will  fall  back  to  using HTTP Basic authentication and
603                     invent a user name and password.
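
To make the magic mode described above more concrete, here is a minimal sketch of the kind of lookup involved: given the text of /proc/net/tcp, find the user id that created a particular connection. It is not agedu's actual implementation. The function names and the sample table are hypothetical, only IPv4 entries are handled, and the address decoding assumes a little-endian host, where the kernel prints 127.0.0.1 as 0100007F.

```python
def decode_endpoint(field):
    """Decode a /proc/net/tcp 'ADDR:PORT' field into (dotted_quad, port).

    Assumes IPv4 and a little-endian host, so the address bytes appear
    in reverse order in the hexadecimal text.
    """
    hex_addr, hex_port = field.split(":")
    octets = reversed(bytes.fromhex(hex_addr))
    return ".".join(str(octet) for octet in octets), int(hex_port, 16)


def owner_uid(proc_net_tcp_text, client, server):
    """Return the uid that owns the client end of a connection, or None.

    'client' and 'server' are (address, port) pairs. The client's socket
    is the table entry whose local endpoint is the client and whose
    remote endpoint is the server.
    """
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Skip the header line and anything too short to be an entry.
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        local = decode_endpoint(fields[1])
        remote = decode_endpoint(fields[2])
        if local == client and remote == server:
            return int(fields[7])  # the eighth column is the uid
    return None


# Hypothetical sample: a listener on 127.0.0.1:8080 owned by uid 1000,
# and a client connection from port 50000 owned by uid 1001.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12345
   1: 0100007F:C350 0100007F:1F90 01 00000000:00000000 00:00000000 00000000  1001        0 12346
"""
```

A server using this approach would compare the result with its own user id (for example os.getuid() in Python) and refuse the connection if the two differ or if no matching entry is found.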
604
605       --auth-file filename or --auth-fd fd
606              When agedu is using HTTP  Basic  authentication,  these  options
607              allow  you  to  specify  your own user name and password. If you
608              specify --auth-file, these will be read from the specified file;
609              if  you specify --auth-fd they will instead be read from a given
610              file descriptor which you should have arranged to pass to agedu.
611              In either case, the authentication details should consist of the
612              username, followed by a colon, followed by  the  password,  fol‐
613              lowed  immediately  by end of file (no trailing newline, or else
614              it will be considered part of the password).
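
Because a trailing newline would become part of the password, it is worth writing the credentials file carefully. The following sketch shows one way to do that. The helper name is hypothetical, and the rejection of colons in the username is an assumption based on HTTP Basic authentication, in which the first colon separates the two fields.

```python
import os


def write_auth_file(path, username, password):
    """Write 'username:password' to path with no trailing newline."""
    if ":" in username:
        # Assumption: the first colon ends the username, so a username
        # containing one could not be represented unambiguously.
        raise ValueError("username must not contain a colon")
    # Mode 0o600 applies only if the file is newly created; an existing
    # file keeps whatever permissions it already had.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
        handle.write(f"{username}:{password}")
```

A file written this way could then be passed to agedu with a command along the lines of agedu -w --auth basic --auth-file agedu.auth, where agedu.auth is whatever path was given to the helper.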
615
616       --title title
617              Specify the string that appears at the start of the <title> sec‐
618              tion  of  the  output  HTML  pages. The default is `agedu'. This
619              title is followed by a colon and then the  path  you're  viewing
620              within  the  index  file.  You might use this option if you were
621              serving agedu reports for several different servers  and  wanted
622              to make it clearer which one a user was looking at.
623
624       --no-eof
625              Stop  agedu  in  web server mode from looking for end-of-file on
626              standard input and treating it as a signal to terminate.
627

LIMITATIONS

629       The data file is pretty large. The core of agedu is the tree-based data
630       structure  it  uses  in  its  index in order to efficiently perform the
631       queries it needs; this data structure requires O(N log N) storage. This
632       is larger than you might expect; a scan of my own home directory,
633       containing half a million files and directories and about 20GB of data,
634       produced an index file over 60MB in size. Furthermore, since the data
635       file must be memory-mapped during most processing, it  can  never  grow
636       larger  than  available  address  space, so a really big filesystem may
637       need to be indexed on a 64-bit computer. (This is one  reason  for  the
638       existence  of  the  -D  and  -L options: you can do the scanning on the
639       machine with access to the filesystem, and the indexing  on  a  machine
640       big enough to handle it.)
641
642       The  data structure also does not usefully permit access control within
643       the data file, so it would be difficult - even given the willingness to
644       do  additional  coding  - to run a system-wide agedu scan on a cron job
645       and serve the right subset of reports to each user.
646
647       In certain circumstances, agedu can report false  positives  (reporting
648       files  as  disused which are in fact in use) as well as the more benign
649       false negatives (reporting files as in use which are not). This  arises
650       when  a  file  is, semantically speaking, `read' without actually being
651       physically read. Typically this occurs when a  program  checks  whether
652       the  file's mtime has changed and only bothers re-reading it if it has;
653       programs which do this include rsync(1) and make(1). Such programs will
654       fail to update the atime of unmodified files despite depending on their
655       continued existence; a directory full of such files will be reported as
656       disused  by  agedu  even  in  situations where deleting them will cause
657       trouble.
658
659       Finally, of course, agedu's normal usage mode depends critically on the
660       OS  providing last-access times which are at least approximately right.
661       So a file system mounted with Linux's `noatime' option, or the  equiva‐
662       lent on any other OS, will not give useful results! (However, the Linux
663       mount option  `relatime',  which  distributions  now  tend  to  use  by
664       default, should be fine for all but specialist purposes: it reduces the
665       accuracy of last-access times so that they might be wrong by up  to  24
666       hours, but if you're looking for files that have been unused for months
667       or years, that's not a problem.)
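
Before trusting a scan, it may help to check how the file systems involved are mounted. The sketch below classifies mount points from text in the format of Linux's /proc/mounts (device, mount point, file system type, comma-separated options, and two further fields). The function name and sample data are hypothetical, and the check is only a rough guide: it looks at the listed mount options and nothing else.

```python
def atime_policy(mounts_text):
    """Map each mount point to 'noatime', 'relatime' or 'other'."""
    policies = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        mount_point = fields[1]
        options = fields[3].split(",")
        if "noatime" in options:
            policies[mount_point] = "noatime"
        elif "relatime" in options:
            policies[mount_point] = "relatime"
        else:
            policies[mount_point] = "other"
    return policies


# Hypothetical sample in /proc/mounts format.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0
/dev/sdb1 /data ext4 rw,noatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""
```

With the sample above, /data would be classified as noatime, so an agedu scan of that tree would not be expected to give useful results, while / would be classified as relatime.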
668

LICENCE

670       agedu is free software, distributed under the MIT licence.  Type  agedu
671       --licence to see the full licence text.
672
673
674
675Simon Tatham                      2008‐11‐02                          agedu(1)