agedu(1)                         Simon Tatham                         agedu(1)

NAME

       agedu - correlate disk usage with last-access times to identify large
       and disused data

SYNOPSIS

       agedu [ options ] action [action...]

DESCRIPTION

       agedu scans a directory tree and produces reports about how much disk
       space is used in each directory and subdirectory, and also how that
       usage of disk space corresponds to files with last-access times a long
       time ago.

       In other words, agedu is a tool you might use to help you free up disk
       space. It lets you see which directories are taking up the most space,
       as du does; but unlike du, it also distinguishes between large
       collections of data which are still in use and ones which have not been
       accessed in months or years - for instance, large archives downloaded,
       unpacked, used once, and never cleaned up. Where du helps you find
       what's using your disk space, agedu helps you find what's wasting your
       disk space.

       agedu has several operating modes. In one mode, it scans your disk and
       builds an index file containing a data structure which allows it to
       efficiently retrieve any information it might need. Typically, you
       would use it in this mode first, and then run it in one of a number of
       `query' modes to display a report of the disk space usage of a
       particular directory and its subdirectories. Those reports can be
       produced as plain text (much like du) or as HTML. agedu can even run as
       a miniature web server, presenting each directory's HTML report with
       hyperlinks to let you navigate around the file system to similar
       reports for other directories.

       So you would typically start using agedu by telling it to do a scan of
       a directory tree and build an index. This is done with a command such
       as

       $ agedu -s /home/fred

       which will build a large data file called agedu.dat in your current
       directory. (If that current directory is inside /home/fred, don't worry
       - agedu is smart enough to discount its own index file.)

       Having built the index, you would now query it for reports of disk
       space usage. If you have a graphical web browser, the simplest and
       nicest way to query the index is by running agedu in web server mode:

       $ agedu -w

       which will print (among other messages) a URL on its standard output
       along the lines of

       URL: http://127.0.0.1:48638/

       (That URL will always begin with `127.', meaning that it's in the
       localhost address space. So only processes running on the same computer
       can even try to connect to that web server, and also there is access
       control to prevent other users from seeing it - see below for more
       detail.)

       Now paste that URL into your web browser, and you will be shown a
       graphical representation of the disk usage in /home/fred and its
       immediate subdirectories, with varying colours used to show the
       difference between disused and recently-accessed data. Click on any
       subdirectory to descend into it and see a report for its subdirectories
       in turn; click on parts of the pathname at the top of any page to
       return to higher-level directories. When you've finished browsing, you
       can just press Ctrl-D to send an end-of-file indication to agedu, and
       it will shut down.

       After that, you probably want to delete the data file agedu.dat, since
       it's pretty large. In fact, the command agedu -R will do this for you;
       and you can chain agedu commands on the same command line, so that
       instead of the above you could have done

       $ agedu -s /home/fred -w -R

       for a single self-contained run of agedu which builds its index, serves
       web pages from it, and cleans it up when finished.

       In some situations, you might want to scan the directory structure of
       one computer, but run agedu's user interface on another. In that case,
       you can do your scan using the agedu -S option in place of agedu -s,
       which will make agedu not bother building an index file but instead
       just write out its scan results in plain text on standard output; then
       you can funnel that output to the other machine using SSH (or whatever
       other technique you prefer), and there, run agedu -L to load in the
       textual dump and turn it into an index file. For example, you might run
       a command like this (plus any ssh options you need) on the machine you
       want to scan:

       $ agedu -S /home/fred | ssh indexing-machine agedu -L

       or, equivalently, run something like this on the other machine:

       $ ssh machine-to-scan agedu -S /home/fred | agedu -L

       Either way, the agedu -L command will create an agedu.dat index file,
       which you can then use with agedu -w just as above.

       (Another way to do this might be to build the index file on the first
       machine as normal, and then just copy it to the other machine once it's
       complete. However, for efficiency, the index file is formatted
       differently depending on the CPU architecture that agedu is compiled
       for. So if that doesn't match between the two machines - e.g. if one is
       a 32-bit machine and one 64-bit - then agedu.dat files written on one
       machine will not work on the other. The technique described above using
       -S and -L should work between any two machines.)

       If you don't have a graphical web browser, you can do text-based
       queries instead of using agedu's web interface. Having scanned
       /home/fred in any of the ways suggested above, you might run

       $ agedu -t /home/fred

       which again gives a summary of the disk usage in /home/fred and its
       immediate subdirectories; but this time agedu will print it on standard
       output, in much the same format as du. If you then want to find out how
       much old data is there, you can add the -a option to show only files
       last accessed a certain length of time ago. For example, to show only
       files which haven't been looked at in six months or more:

       $ agedu -t /home/fred -a 6m

       That's the essence of what agedu does. It has other modes of operation
       for more complex situations, and the usual array of configurable
       options. The following sections contain a complete reference for all
       its functionality.

OPERATING MODES

       This section describes the operating modes supported by agedu. Each of
       these is in the form of a command-line option, sometimes with an
       argument. Multiple operating-mode options may appear on the command
       line, in which case agedu will perform the specified actions one after
       another. For instance, as shown in the previous section, you might want
       to perform a disk scan and immediately launch a web server giving
       reports from that scan.

       -s directory or --scan directory
              In this mode, agedu scans the file system starting at the
              specified directory, and indexes the results of the scan into a
              large data file which other operating modes can query.

              By default, the scan is restricted to a single file system
              (since the expected use of agedu is that you would probably use
              it because a particular disk partition was running low on
              space). You can remove that restriction using the --cross-fs
              option; other configuration options allow you to include or
              exclude files or entire subdirectories from the scan. See the
              next section for full details of the configurable options.

              The index file is created with restrictive permissions, in case
              the file system you are scanning contains confidential
              information in its structure.

              Index files are dependent on the characteristics of the CPU
              architecture you created them on. You should not expect to be
              able to move an index file between different types of computer
              and have it continue to work. If you need to transfer the
              results of a disk scan to a different kind of computer, see the
              -D and -L options below.

       -w or --web
              In this mode, agedu expects to find an index file already
              written. It allocates a network port, and starts up a web server
              on that port which serves reports generated from the index file.
              By default it invents its own URL and prints it out.

              The web server runs until agedu receives an end-of-file event on
              its standard input. (The expected usage is that you run it from
              the command line, immediately browse web pages until you're
              satisfied, and then press Ctrl-D.) To disable the EOF behaviour,
              use the --no-eof option.

              In case the index file contains any confidential information
              about your file system, the web server protects the pages it
              serves from access by other people. On Linux, this is done
              transparently by means of using /proc/net/tcp to check the owner
              of each incoming connection; failing that, the web server will
              require a password to view the reports, and agedu will print the
              password it invented on standard output along with the URL.

              Configurable options for this mode let you specify your own
              address and port number to listen on, and also specify your own
              choice of authentication method (including turning
              authentication off completely) and a username and password of
              your choice.

       -t directory or --text directory
              In this mode, agedu generates a textual report on standard
              output, listing the disk usage in the specified directory and
              all its subdirectories down to a given depth. By default that
              depth is 1, so that you see a report for directory itself and
              all of its immediate subdirectories. You can configure a
              different depth (or no depth limit) using -d, described in the
              next section.

              Used on its own, -t merely lists the total disk usage in each
              subdirectory; agedu's additional ability to distinguish unused
              from recently-used data is not activated. To activate it, use
              the -a option to specify a minimum age.

              The directory structure stored in agedu's index file is treated
              as a set of literal strings. This means that you cannot refer to
              directories by synonyms. So if you ran agedu -s ., then all the
              path names you later pass to the -t option must be either `.' or
              begin with `./'. Similarly, symbolic links within the directory
              you scanned will not be followed; you must refer to each
              directory by its canonical, symlink-free pathname.
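              The literal-string rule for -t can be pictured with a small
              Python sketch. The function name is_under_scan_root is
              hypothetical and this is not agedu's own lookup code; it only
              illustrates, under that assumption, that a query path is
              compared textually against the scanned root, with no path
              normalisation and no symlink resolution.

```python
def is_under_scan_root(query: str, scan_root: str) -> bool:
    """Return True if `query` names the scanned root or something below it.

    The comparison is purely textual: no normalisation, no symlink lookup.
    """
    if query == scan_root:
        return True
    prefix = scan_root if scan_root.endswith("/") else scan_root + "/"
    return query.startswith(prefix)


# After a scan started with `agedu -s .`:
print(is_under_scan_root("./src", "."))  # spelled relative to `.`: True
print(is_under_scan_root("src", "."))    # same directory, other spelling: False
```

              Under this model, a scan rooted at `.' only ever answers
              queries spelled as `.' or `./something'.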

       -R or --remove
              In this mode, agedu deletes its index file. Running just agedu
              -R on its own is therefore equivalent to typing rm agedu.dat.
              However, you can also put -R on the end of a command line to
              indicate that agedu should delete its index file after it
              finishes performing other operations.

       -S directory or --scan-dump directory
              In this mode, agedu will scan a directory tree and convert the
              results straight into a textual dump on standard output, without
              generating an index file at all. The dump data is intended for
              agedu -L to read.

       -L or --load
              In this mode, agedu expects to read a dump produced by the -S
              option from its standard input. It constructs an index file from
              that dump, exactly as it would have if it had read the same data
              from a disk scan in -s mode.

       -D or --dump
              In this mode, agedu reads an existing index file and produces a
              dump of its contents on standard output, in the same format used
              by -S and -L. This option could be used to convert an existing
              index file into a format acceptable to a different kind of
              computer, by dumping it using -D and then loading the dump back
              in on the other machine using -L.

              (The output of agedu -D on an existing index file will not be
              exactly identical to what agedu -S would have originally
              produced, due to a difference in treatment of last-access times
              on directories. However, it should be effectively equivalent for
              most purposes. See the documentation of the --dir-atime option
              in the next section for further detail.)

       -H directory or --html directory
              In this mode, agedu will generate an HTML report of the disk
              usage in the specified directory and its immediate
              subdirectories, in the same form that it serves from its web
              server in -w mode.

              By default, a single HTML report will be generated and simply
              written to standard output, with no hyperlinks pointing to other
              similar pages. If you also specify the -d option (see below),
              agedu will instead write out a collection of HTML files with
              hyperlinks between them, and call the top-level file index.html.

       --cgi  In this mode, agedu will run as the bulk of a CGI script which
              provides the same set of web pages as the built-in web server
              would. It will read the usual CGI environment variables, and
              write CGI-style data to its standard output.

              The actual CGI program itself should be a tiny wrapper around
              agedu which passes it the --cgi option, and also (probably) -f
              to locate the index file. agedu will do everything else. For
              example, your script might read

              #!/bin/sh
              /some/path/to/agedu --cgi -f /some/other/path/to/agedu.dat

              (Note that agedu will produce the entire CGI output, including
              status code, HTTP headers and the full HTML document. If you try
              to surround the call to agedu --cgi with code that adds your own
              HTML header and footer, you won't get the results you want, and
              agedu's HTTP-level features such as auto-redirecting to
              canonical versions of URIs will stop working.)

              No access control is performed in this mode: restricting access
              to CGI scripts is assumed to be the job of the web server.

       --presort and --postsort
              In these two modes, agedu will expect to read a textual data
              dump from its standard input of the form produced by -S (and
              -D). It will transform the data into a different version of its
              text dump format, and write the transformed version on standard
              output.

              The ordinary dump file format is reasonably readable, but
              loading it into an index file using agedu -L requires it to be
              sorted in a specific order, which is complicated to describe and
              difficult to implement using ordinary Unix sorting tools. So if
              you want to construct your own data dump from a source of your
              own that agedu itself doesn't know how to scan, you will need to
              make sure it's sorted in the right order.

              To help with this, agedu provides a secondary dump format which
              is `sortable', in the sense that ordinary sort(1) without
              arguments will arrange it into the right order. However, the
              sortable format is much more unreadable and also twice the size,
              so you wouldn't want to write it directly!

              So the recommended procedure is to generate dump data in the
              ordinary format; then pipe it through agedu --presort to turn it
              into the sortable format; then sort it; then pipe it into agedu
              -L (which can accept either the normal or the sortable format as
              input). For example:

              generate_custom_data.sh | agedu --presort | sort | agedu -L

              If you need to transform the sorted dump file back into the
              ordinary format, agedu --postsort can do that. But since agedu
              -L can accept either format as input, you may not need to.

       -h or --help
              Causes agedu to print some help text and terminate immediately.

       -V or --version
              Causes agedu to print its version number and terminate
              immediately.

OPTIONS

       This section describes the various configuration options that affect
       agedu's operation in one mode or another.

       The following option affects nearly all modes (except -S):

       -f filename or --file filename
              Specifies the location of the index file which agedu creates,
              reads or removes depending on its operating mode. By default,
              this is simply `agedu.dat', in whatever is the current working
              directory when you run agedu.

       The following options affect the disk-scanning modes, -s and -S:

       --cross-fs and --no-cross-fs
              These configure whether or not the disk scan is permitted to
              cross between different file systems. The default is not to:
              agedu will normally skip over subdirectories on which a
              different file system is mounted. This makes it convenient when
              you want to free up space on a particular file system which is
              running low. However, in other circumstances you might wish to
              see general information about the use of space no matter which
              file system it's on (for instance, if your real concern is your
              backup media running out of space, and if your backups do not
              treat different file systems specially); in that situation, use
              --cross-fs.

              (Note that this default is the opposite way round from the
              corresponding option in du.)

       --prune wildcard and --prune-path wildcard
              These cause particular files or directories to be omitted
              entirely from the scan. If agedu's scan encounters a file or
              directory whose name matches the wildcard provided to the
              --prune option, it will not include that file in its index, and
              also if it's a directory it will skip over it and not scan its
              contents.

              Note that in most Unix shells, wildcards will probably need to
              be escaped on the command line, to prevent the shell from
              expanding the wildcard before agedu sees it.

              --prune-path is similar to --prune, except that the wildcard is
              matched against the entire pathname instead of just the filename
              at the end of it. So whereas --prune *a*b* will match any file
              whose actual name contains an a somewhere before a b,
              --prune-path *a*b* will also match a file whose name contains b
              and which is inside a directory containing an a, or any file
              inside a directory of that form, and so on.
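              The difference between matching the final name and matching
              the whole pathname can be sketched in Python. This uses
              Python's fnmatch module as a stand-in and is not agedu's own
              matcher; the two helper names are hypothetical. It assumes, as
              the description of --prune-path implies, that `*' is allowed to
              match across `/'.

```python
import posixpath
from fnmatch import fnmatchcase


def name_matches(pattern: str, path: str) -> bool:
    """--prune style: test only the last component of the pathname."""
    return fnmatchcase(posixpath.basename(path), pattern)


def path_matches(pattern: str, path: str) -> bool:
    """--prune-path style: test the entire pathname as one string."""
    return fnmatchcase(path, pattern)


path = "./data/blobs"        # an `a' in the directory, a `b' in the name
print(name_matches("*a*b*", path))   # False: "blobs" has no a before a b
print(path_matches("*a*b*", path))   # True: the a comes from "data"
```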

       --exclude wildcard and --exclude-path wildcard
              These cause particular files or directories to be omitted from
              the index, but not from the scan. If agedu's scan encounters a
              file or directory whose name matches the wildcard provided to
              the --exclude option, it will not include that file in its index
              - but unlike --prune, if the file in question is a directory it
              will still scan its contents and index them if they are not
              ruled out themselves by --exclude options.

              As above, --exclude-path is similar to --exclude, except that
              the wildcard is matched against the entire pathname.

       --include wildcard and --include-path wildcard
              These cause particular files or directories to be re-included in
              the index and the scan, if they had previously been ruled out by
              one of the above exclude or prune options. You can interleave
              include, exclude and prune options as you wish on the command
              line, and if more than one of them applies to a file then the
              last one takes priority.

              For example, if you wanted to see only the disk space taken up
              by MP3 files, you might run

              $ agedu -s . --exclude '*' --include '*.mp3'

              which will cause everything to be omitted from the scan, but
              then the MP3 files to be put back in. If you then wanted only a
              subset of those MP3s, you could then exclude some of them again
              by adding, say, `--exclude-path './queen/*'' (or, more
              efficiently, `--prune ./queen') on the end of that command.

              As with the previous two options, --include-path is similar to
              --include except that the wildcard is matched against the entire
              pathname.
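              The `last one takes priority' rule can be illustrated with a
              short Python sketch. It is a simplified model, not agedu's
              implementation: the rule tuples and the function is_indexed are
              hypothetical, Python's fnmatch stands in for agedu's wildcard
              matching, and the extra effect of --prune on directory
              traversal is deliberately left out.

```python
import posixpath
from fnmatch import fnmatchcase

# Each rule is (action, wildcard, match_whole_path), in command-line order.
# This list models: --exclude '*' --include '*.mp3' --exclude-path './queen/*'
RULES = [
    ("exclude", "*", False),
    ("include", "*.mp3", False),
    ("exclude", "./queen/*", True),
]


def is_indexed(path, rules):
    """Apply every rule in order; the last rule that matches decides."""
    verdict = True  # with no matching rule, the file is indexed
    for action, wildcard, match_whole_path in rules:
        subject = path if match_whole_path else posixpath.basename(path)
        if fnmatchcase(subject, wildcard):
            verdict = (action == "include")
    return verdict


for candidate in ("./abba/waterloo.mp3", "./queen/bohemian.mp3", "./notes.txt"):
    print(candidate, is_indexed(candidate, RULES))
```

              Only the first of the three candidate paths survives: the
              second is put back by the include rule and then ruled out again
              by the later path rule, and the third never matches an include.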

       --progress, --no-progress and --tty-progress
              When agedu is scanning a directory tree, it will typically print
              a one-line progress report every second showing where it has
              reached in the scan, so you can have some idea of how much
              longer it will take. (Of course, it can't predict exactly how
              long it will take, since it doesn't know which of the
              directories it hasn't scanned yet will turn out to be huge.)

              By default, those progress reports are displayed on agedu's
              standard error channel, if that channel points to a terminal
              device. If you need to manually enable or disable them, you can
              use the above three options to do so: --progress unconditionally
              enables the progress reports, --no-progress unconditionally
              disables them, and --tty-progress reverts to the default
              behaviour which is conditional on standard error being a
              terminal.
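              The three-way choice can be modelled as follows. This is a
              sketch of the decision only, with a hypothetical function name
              and mode strings chosen to mirror the option names; it does not
              reproduce how agedu itself detects a terminal.

```python
import io
import sys


def progress_enabled(mode, stream=None):
    """Decide whether progress reports should be shown on `stream`.

    `mode` is one of "progress", "no-progress" or "tty-progress".
    """
    if stream is None:
        stream = sys.stderr
    if mode == "progress":
        return True
    if mode == "no-progress":
        return False
    if mode == "tty-progress":
        return stream.isatty()  # the default: only when writing to a terminal
    raise ValueError("unknown progress mode: %r" % (mode,))


# An in-memory stream is never a terminal, so the default stays quiet there.
print(progress_enabled("tty-progress", io.StringIO()))
```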

       --dir-atime and --no-dir-atime
              In normal operation, agedu ignores the atimes (last access
              times) on the directories it scans: it only pays attention to
              the atimes of the files inside those directories. This is
              because directory atimes tend to be reset by a lot of system
              administrative tasks, such as cron jobs which scan the file
              system for one reason or another - or even other invocations of
              agedu itself, though it tries to avoid modifying any atimes if
              possible. So the literal atimes on directories are typically not
              representative of how long ago the data in question was last
              accessed with real intent to use that data in particular.

              Instead, agedu makes up a fake atime for every directory it
              scans, which is equal to the newest atime of any file in or
              below that directory (or the directory's last modification time,
              whichever is newest). This is based on the assumption that all
              important accesses to directories are actually accesses to the
              files inside those directories, so that when any file is
              accessed all the directories on the path leading to it should be
              considered to have been accessed as well.

              In unusual cases it is possible that a directory itself might
              embody important data which is accessed by reading the
              directory. In that situation, agedu's atime-faking policy will
              misreport the directory as disused. In the unlikely event that
              such directories form a significant part of your disk space
              usage, you might want to turn off the faking. The --dir-atime
              option does this: it causes the disk scan to read the original
              atimes of the directories it scans.

              The faking of atimes on directories also requires a processing
              pass over the index file after the main disk scan is complete.
              --dir-atime also turns this pass off. Hence, this option affects
              the -L option as well as -s and -S.

              (The previous section mentioned that there might be subtle
              differences between the output of agedu -s /path -D and agedu -S
              /path. This is why. Doing a scan with -s and then dumping it
              with -D will dump the fully faked atimes on the directories,
              whereas doing a scan-to-dump with -S will dump only partially
              faked atimes - specifically, each directory's last modification
              time - since the subsequent processing pass will not have had a
              chance to take place. However, loading either of the resulting
              dump files with -L will perform the atime-faking processing
              pass, leading to the same data in the index file in each case.
              In normal usage it should be safe to ignore all of this
              complexity.)
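              The faked directory atime described for this option amounts to
              a maximum taken over a subtree. The Python sketch below works
              on a small in-memory tree; the Node class and the function
              faked_atime are hypothetical illustrations, and agedu's real
              processing pass operates on its index file rather than on a
              structure like this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    atime: int = 0            # last access time, in arbitrary units
    mtime: int = 0            # last modification time
    is_dir: bool = False
    children: List["Node"] = field(default_factory=list)


def faked_atime(node: Node) -> int:
    """Files keep their own atime. A directory gets the newest atime of any
    file in or below it, or its own mtime if that is newer."""
    if not node.is_dir:
        return node.atime
    newest = node.mtime
    for child in node.children:
        newest = max(newest, faked_atime(child))
    return newest


tree = Node("home", mtime=100, is_dir=True, children=[
    Node("old.tar", atime=50),
    Node("proj", mtime=10, is_dir=True, children=[Node("main.c", atime=400)]),
])
print(faked_atime(tree))  # 400, inherited from proj/main.c
```

              Note that the directory's own atime field is never consulted,
              which matches the default behaviour; --dir-atime corresponds to
              skipping this computation and using the stored value instead.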

       --mtime
              This option causes agedu to index files by their last
              modification time instead of their last access time. You might
              want to use this if your last access times were completely
              useless for some reason: for example, if you had recently
              searched every file on your system, the system would have lost
              all the information about what files you hadn't recently
              accessed before then. Using this option is liable to be less
              effective at finding genuinely wasted space than the normal mode
              (that is, it will be more likely to flag things as disused when
              they're not, so you will have more candidates to go through by
              hand looking for data you don't need), but may be better than
              nothing if your last-access times are unhelpful.

              Another use for this mode might be to find recently created
              large data. If your disk has been gradually filling up for
              years, the default mode of agedu will let you find unused data
              to delete; but if you know your disk had plenty of space
              recently and now it's suddenly full, and you suspect that some
              rogue program has left a large core dump or output file, then
              agedu --mtime might be a convenient way to locate the culprit.

       --logicalsize
              This option causes agedu to consider the size of each file to be
              its `logical' size, rather than the amount of space it consumes
              on disk. (That is, it will use st_size instead of st_blocks in
              the data returned from stat(2).) This option makes agedu less
              accurate at reporting how much of your disk is used, but it
              might be useful in specialist cases, such as working around a
              file system that is misreporting physical sizes.

              For most files, the physical size of a file will be larger than
              the logical size, reflecting the fact that filesystem layouts
              generally allocate a whole number of blocks of the disk to each
              file, so some space is wasted at the end of the last block. So
              counting only the logical file size will typically cause
              under-reporting of the disk usage (perhaps large under-reporting
              in the case of a very large number of very small files).

              On the other hand, sometimes a file with a very large logical
              size can have `holes' where no data is actually stored, in which
              case using the logical size of the file will over-report its
              disk usage. So the use of logical sizes can give wrong answers
              in both directions.
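              The two size measures can be compared directly from Python,
              whose os.stat exposes the same st_size and st_blocks fields on
              Unix-like systems. The sketch assumes st_blocks counts 512-byte
              units, which holds on Linux but is not guaranteed everywhere,
              and the helper name is hypothetical. The physical figure
              printed depends on the file system, so no particular value is
              claimed for it.

```python
import os
import tempfile


def logical_and_physical_size(path):
    """Return (logical bytes, bytes allocated on disk) for `path`."""
    st = os.stat(path)
    logical = st.st_size
    physical = st.st_blocks * 512  # assumption: 512-byte units, as on Linux
    return logical, physical


# Write a tiny file and compare the two measures.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10)
    tmp_path = tmp.name

logical, physical = logical_and_physical_size(tmp_path)
os.unlink(tmp_path)
print("logical:", logical, "physical:", physical)
```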
516
517       The  following  option affects all the modes that generate reports: the
518       web server mode -w, the stand-alone HTML generation  mode  -H  and  the
519       text report mode -t.
520
521       --files
522              This  option causes agedu's reports to list the individual files
523              in each directory, instead of just giving a combined report  for
524              everything that's not in a subdirectory.
525
526       The following option affects the text report mode -t.
527
528       -a age or --age age
529              This  option  tells  agedu  to report only files of at least the
530              specified age. An age is specified as a number, followed by  one
531              of  `y'  (years), `m' (months), `w' (weeks) or `d' (days). (This
532              syntax is also used by the -r option.) For example, -a  6m  will
533              produce  a  text  report  which includes only files at least six
534              months old.
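
              The age syntax can be sketched in Python as follows. This is an
              illustration only, not agedu's own parser: the function names
              are hypothetical, whole numbers are assumed, and the lengths
              chosen for a month and a year are assumptions, since this
              manual does not say how agedu converts them.

```python
import re

# Approximate unit lengths in days. How agedu itself converts months and
# years is not stated in this manual; these values are assumptions.
DAYS_PER_UNIT = {"y": 365, "m": 30, "w": 7, "d": 1}


def parse_age(text):
    """Convert an age such as '6m' or '2y' into a number of days."""
    match = re.fullmatch(r"(\d+)([ymwd])", text)
    if match is None:
        raise ValueError(f"not a valid age: {text!r}")
    count, unit = match.groups()
    return int(count) * DAYS_PER_UNIT[unit]


def at_least_as_old(atime, now, age_text):
    """True if a last-access time (in seconds) is at least age_text before now."""
    return now - atime >= parse_age(age_text) * 86400
```

              With these assumptions, a filter equivalent to -a 6m would keep
              a file only if at_least_as_old() returns true for its
              last-access time.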
535
536       The following options affect the stand-alone HTML  generation  mode  -H
537       and the text report mode -t.
538
539       -d depth or --depth depth
540              This  option  controls the maximum depth to which agedu recurses
541              when generating a text or HTML report.
542
543              In text mode, the default is 1, meaning  that  the  report  will
544              include  the  directory given on the command line and all of its
545              immediate subdirectories. A depth of two includes another  level
546              below  that, and so on; a depth of zero means only the directory
547              on the command line.
548
549              In HTML mode, specifying this option switches agedu from writing
550              out  a single HTML file to writing out multiple files which link
551              to each other. A depth of 1 means agedu will write out  an  HTML
552              file  for the given directory and also one for each of its imme‐
553              diate subdirectories.
554
555              If you want agedu to recurse as deeply  as  possible,  give  the
556              special word `max' as an argument to -d.
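
              The meaning of the depth values in text mode can be sketched in
              Python as below. This walks a live directory tree rather than
              an agedu index, purely to show which directories each depth
              would cover; the function name is hypothetical.

```python
import os


def directories_to_depth(root, depth):
    """List root and its subdirectories down to the given depth.

    depth is a non-negative integer, or the string 'max' for no limit.
    """
    limit = None if depth == "max" else int(depth)
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        found.append(dirpath)
        rel = os.path.relpath(dirpath, root)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if limit is not None and level >= limit:
            # Pruning dirnames in place stops os.walk from descending.
            dirnames[:] = []
    return found
```

              A depth of 0 yields only the starting directory, 1 adds its
              immediate subdirectories, and `max' yields the whole tree.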
557
558       -o filename or --output filename
559              This option is used to specify an output file for agedu to write
560              its report to. In text mode or single-file HTML mode, the  argu‐
561              ment  is  treated  as  the name of a file. In multiple-file HTML
562              mode, the argument is treated as the name of  a  directory:  the
563              directory  will be created if it does not already exist, and the
564              output HTML files will be created inside it.
565
566       The following option affects only the stand-alone HTML generation  mode
567       -H, and even then, only in recursive mode (with -d):
568
569       --numeric
570              This  option  tells  agedu to name most of its output HTML files
571              numerically. The root of the whole output file  collection  will
572              still  be  called  index.html,  but all the rest will have names
573              like 73.html or 12525.html. (The numbers are  essentially  arbi‐
574              trary;  in  fact, they're indices of nodes in the data structure
575              used by agedu's index file.)
576
577              This system of file naming is less intuitive than the default of
578              naming  files  after the sub-pathname they index. It's also less
579              stable: the same pathname will not necessarily be represented by
580              the  same  filename  if agedu -H is re-run after another scan of
581              the same directory tree. However, it does have the  virtue  that
582              it  keeps  the  filenames  short, so that even if your directory
583              tree is very deep, the output HTML files  won't  exceed  any  OS
584              limit on filename length.
585
586       The  following options affect the web server mode -w, and in some cases
587       also the stand-alone HTML generation mode -H:
588
589       -r age range or --age-range age range
590              The HTML reports produced by agedu use a  range  of  colours  to
591              indicate  how  long ago data was last accessed, running from red
592              (representing the most disused data) to green (representing  the
593              newest).  By default, the lengths of time represented by the two
594              ends of that spectrum are chosen by examining the data  file  to
595              see what range of ages appears in it. However, you might want to
596              set your own limits, and you can do this using -r.
597
598              The argument to -r consists of a single age, or two  ages  sepa‐
599              rated  by  a  minus sign. An age is a number, followed by one of
600              `y' (years), `m' (months), `w' (weeks) or `d' (days). (This syn‐
601              tax  is  also used by the -a option.) The first age in the range
602              represents the oldest data, and will  be  coloured  red  in  the
603              HTML;  the  second age represents the newest, coloured green. If
604              the second age is not specified, it will  default  to  zero  (so
605              that green means data which has been accessed just now).
606
607              For  example,  -r 2y will mark data in red if it has been unused
608              for two years or more, and green if it has  been  accessed  just
609              now. -r 2y-3m will similarly mark data red if it has been unused
610              for two years or more, but will mark it green  if  it  has  been
611              accessed three months ago or later.
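
              The following Python sketch shows one way to read an age range
              and place a given age on the red-to-green scale. It is not
              agedu's code: the names are hypothetical, the month and year
              lengths are assumed, and the linear scale between the two ends
              is an assumption, since this manual does not describe how
              intermediate colours are chosen.

```python
import re

# Approximate unit lengths in days; assumed, as the manual does not say
# how months and years are converted.
DAYS_PER_UNIT = {"y": 365, "m": 30, "w": 7, "d": 1}


def parse_age(text):
    """Convert an age such as '3m' into a number of days."""
    match = re.fullmatch(r"(\d+)([ymwd])", text)
    if match is None:
        raise ValueError(f"not a valid age: {text!r}")
    return int(match.group(1)) * DAYS_PER_UNIT[match.group(2)]


def parse_age_range(text):
    """Return (oldest_days, newest_days) for an argument like '2y-3m'."""
    oldest, sep, newest = text.partition("-")
    # A missing second age defaults to zero, i.e. "accessed just now".
    return parse_age(oldest), (parse_age(newest) if sep else 0)


def disuse_fraction(age_days, oldest_days, newest_days):
    """Map an age to 0.0 (green end) through 1.0 (red end), clamped."""
    if oldest_days <= newest_days:
        raise ValueError("oldest age must exceed newest age")
    fraction = (age_days - newest_days) / (oldest_days - newest_days)
    return min(1.0, max(0.0, fraction))
```

              Ages beyond either end of the range are clamped, matching the
              description above of data unused for two years `or more'.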
612
613       --address addr[:port]
614              Specifies  the  network  address  and port number on which agedu
615              should listen when running its web server. If you want agedu  to
616              listen  for  connections  coming in from any source, specify the
617              address as the special value ANY. If the port number is omitted,
618              an arbitrary unused port will be chosen for you and displayed.
619
620              If  you  specify  this  option,  agedu will not print its URL on
621              standard output (since you are expected to know what address you
622              told it to listen to).
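
              One common way for a server to obtain an arbitrary unused port
              is to bind to port 0 and ask the operating system which port it
              assigned. The Python sketch below shows that technique; whether
              agedu does exactly this is an assumption, as is the mapping of
              ANY to the IPv4 wildcard address, and the function name is
              hypothetical.

```python
import socket


def listen_on(address="127.0.0.1", port=None):
    """Open a listening TCP socket; port=None lets the OS pick one."""
    # Treating ANY as the IPv4 wildcard address is an assumption here.
    host = "0.0.0.0" if address == "ANY" else address
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Binding to port 0 asks the kernel for an arbitrary unused port.
    sock.bind((host, 0 if port is None else port))
    sock.listen()
    # getsockname() reveals the port that was actually assigned.
    return sock, sock.getsockname()[1]
```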
623
624       --auth auth-type
625              Specifies  how  agedu  should control access to the web pages it
626              serves. The options are as follows:
627
628              magic  This option only works on Linux, and only when the incom‐
629                     ing  connection  is  from  the same machine that agedu is
630                     running on. On Linux, the special file /proc/net/tcp con‐
631                     tains  a  list  of network connections currently known to
632                     the operating system kernel, including which user id cre‐
633                     ated them. So agedu will look up each incoming connection
634                     in that file, and allow access if it comes from the  same
635                     user  id  under which agedu itself is running. Therefore,
636                     in agedu's normal web server mode, you can safely run  it
637                     on a multi-user machine and no other user will be able to
638                     read data out of your index file.
639
640              basic  In this mode, agedu will use HTTP  Basic  authentication:
641                     the user will have to provide a username and password via
642                     their browser. agedu will normally make up a username and
643                     password  for  the purpose, but you can specify your own;
644                     see below.
645
646              none   In this mode, the web server is  unauthenticated:  anyone
647                     connecting to it has full access to the reports generated
648                     by agedu. Do not do this unless there is  nothing  confi‐
649                     dential at all in your index file, or unless you are cer‐
650                     tain that nobody but you can run processes on  your  com‐
651                     puter.
652
653              default
654                     This is the default mode if you do not specify one of the
655                     above. In this mode, agedu  will  attempt  to  use  Linux
656                     magic  authentication,  but if it detects at startup time
657                     that /proc/net/tcp is absent or  non-functional  then  it
658                     will  fall  back  to  using HTTP Basic authentication and
659                     invent a user name and password.
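
              The lookup performed by the magic mode can be sketched in
              Python as below. This is a simplified illustration, not
              agedu's implementation: it matches connections by port number
              only and ignores the address fields, and the function names
              are hypothetical. It relies on the /proc/net/tcp layout, in
              which each row lists the local and remote address as
              HEXIP:HEXPORT and the owning uid in the eighth column.

```python
import os


def connection_owner(table_text, client_port, server_port):
    """Return the uid owning the client end of a local TCP connection.

    table_text is the contents of /proc/net/tcp. Returns None if no
    matching row is found.
    """
    for line in table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        # The client's socket has the client's port as its local end and
        # the server's port as its remote end.
        if local_port == client_port and remote_port == server_port:
            return int(fields[7])
    return None


def allow(table_text, client_port, server_port):
    """Allow access only if the client belongs to the current user."""
    return connection_owner(table_text, client_port, server_port) == os.getuid()
```

              Because the table only describes sockets on the local machine,
              a check of this kind can say nothing about connections
              arriving from elsewhere, which is why the magic mode is
              limited to same-machine connections.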
660
661       --auth-file filename or --auth-fd fd
662              When agedu is using HTTP  Basic  authentication,  these  options
663              allow  you  to  specify  your own user name and password. If you
664              specify --auth-file, these will be read from the specified file;
665              if  you specify --auth-fd they will instead be read from a given
666              file descriptor which you should have arranged to pass to agedu.
667              In either case, the authentication details should consist of the
668              username, followed by a colon, followed by  the  password,  fol‐
669              lowed  immediately  by end of file (no trailing newline, or else
670              it will be considered part of the password).
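
              Since a trailing newline would become part of the password, it
              can help to create the file programmatically. The Python sketch
              below writes and re-reads such a file; the function names are
              hypothetical, the restrictive file mode is a precaution added
              by this example, and splitting at the first colon is an
              assumption.

```python
import os


def write_auth_file(path, username, password):
    """Write 'username:password' with no trailing newline, mode 0600."""
    if ":" in username:
        raise ValueError("username must not contain a colon")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as f:
        # No newline: everything after the colon, up to end of file,
        # is treated as the password.
        f.write(f"{username}:{password}")


def read_auth_file(path):
    """Split the file contents at the first colon (assumed behaviour)."""
    with open(path, encoding="utf-8", newline="") as f:
        username, _, password = f.read().partition(":")
    return username, password
```

              The resulting file could then be passed to agedu with
              --auth-file alongside --auth basic.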
671
672       --title title
673              Specify the string that appears at the start of the <title> sec‐
674              tion  of  the  output  HTML  pages. The default is `agedu'. This
675              title is followed by a colon and then the  path  you're  viewing
676              within  the  index  file.  You might use this option if you were
677              serving agedu reports for several different servers  and  wanted
678              to make it clearer which one a user was looking at.
679
680       --launch shell-command
681              Specify  a  command  to be run with the base URL of the web user
682              interface, once the web server has started up. The command  will
683              be  interpreted by /bin/sh, and the base URL will be appended to
684              it as an extra argument word.
685
686              A typical use for this would be  `--launch=browse',  which  uses
687              the  XDG  `browse'  command  to automatically open the agedu web
688              interface in your default browser. However, other uses are  pos‐
689              sible:  for  example, you could provide a command which communi‐
690              cates the URL to some other software that will use it for  some‐
691              thing.
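
              The effect of appending the URL as an extra argument word can
              be sketched in Python as follows. How agedu actually assembles
              the command line is an assumption here; the function name and
              the example URL are hypothetical.

```python
import shlex
import subprocess


def run_launch_command(shell_command, base_url):
    """Run shell_command via /bin/sh with base_url as a final argument."""
    # Quoting the URL keeps it a single word whatever characters it holds.
    # Whether agedu builds the command line this way is an assumption.
    full_command = f"{shell_command} {shlex.quote(base_url)}"
    return subprocess.run(
        ["/bin/sh", "-c", full_command],
        capture_output=True, text=True, check=True,
    )


if __name__ == "__main__":
    result = run_launch_command("echo", "http://localhost:1234/")
    print(result.stdout, end="")
```

              Because the command is a shell command rather than a single
              program name, it may itself contain arguments; the URL simply
              follows them.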
692
693       --no-eof
694              Stop  agedu  in  web server mode from looking for end-of-file on
695              standard input and treating it as a signal to terminate.
696

LIMITATIONS

698       The data file is pretty large. The core of agedu is the tree-based data
699       structure  it  uses  in  its  index in order to efficiently perform the
700       queries it needs; this data structure requires O(N log N) storage. This
701       is  larger than you might expect; a scan of my own home directory, con‐
702       taining half a million files and directories and about  20GB  of  data,
703       produced  an  index file over 60MB in size. Furthermore, since the data
704       file must be memory-mapped during most processing, it  can  never  grow
705       larger  than  available  address  space, so a really big filesystem may
706       need to be indexed on a 64-bit computer. (This is one  reason  for  the
707       existence  of  the  -D  and  -L options: you can do the scanning on the
708       machine with access to the filesystem, and the indexing  on  a  machine
709       big enough to handle it.)
710
711       The  data structure also does not usefully permit access control within
712       the data file, so it would be difficult - even given the willingness to
713       do  additional  coding  - to run a system-wide agedu scan on a cron job
714       and serve the right subset of reports to each user.
715
716       In certain circumstances, agedu can report false  positives  (reporting
717       files  as  disused which are in fact in use) as well as the more benign
718       false negatives (reporting files as in use which are not). This  arises
719       when  a  file  is, semantically speaking, `read' without actually being
720       physically read. Typically this occurs when a  program  checks  whether
721       the  file's mtime has changed and only bothers re-reading it if it has;
722       programs which do this include rsync(1) and make(1). Such programs will
723       fail to update the atime of unmodified files despite depending on their
724       continued existence; a directory full of such files will be reported as
725       disused  by  agedu  even  in  situations where deleting them will cause
726       trouble.
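
       The kind of check that causes this can be sketched in Python as below.
       It is a generic illustration with a hypothetical function name, not
       code from rsync(1) or make(1): the program consults only the file's
       metadata, so the file's contents are never read and its atime is left
       untouched.

```python
import os


def changed_since(path, last_mtime_ns):
    """Report whether a file's mtime differs from a remembered value.

    Only the file's metadata is consulted; its data is never read, so
    this check does not update the file's atime.
    """
    return os.stat(path).st_mtime_ns != last_mtime_ns
```

       A program that calls a function like this on every run, and re-reads
       the file only when it returns true, depends on the file without ever
       refreshing its last-access time.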
727
728       Finally, of course, agedu's normal usage mode depends critically on the
729       OS  providing last-access times which are at least approximately right.
730       So a file system mounted with Linux's `noatime' option, or the  equiva‐
731       lent on any other OS, will not give useful results! (However, the Linux
732       mount option  `relatime',  which  distributions  now  tend  to  use  by
733       default, should be fine for all but specialist purposes: it reduces the
734       accuracy of last-access times so that they might be wrong by up  to  24
735       hours, but if you're looking for files that have been unused for months
736       or years, that's not a problem.)
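
       On Linux, the options in effect for a mounted file system are listed
       in /proc/mounts, so it is possible to check before scanning. The
       Python sketch below classifies a mount point from the text of that
       file; the function name is hypothetical, and for simplicity it does
       not decode the escapes used for mount points containing spaces.

```python
def atime_policy(mounts_text, mount_point):
    """Return 'noatime', 'relatime' or 'other' for a mount point.

    mounts_text is the contents of /proc/mounts, whose whitespace-separated
    fields are: device, mount point, file system type, options, and two
    numeric fields. Returns None if the mount point is not listed.
    """
    policy = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
            if "noatime" in options:
                policy = "noatime"
            elif "relatime" in options:
                policy = "relatime"
            else:
                policy = "other"
    # Later lines override earlier ones, since a later mount on the same
    # mount point hides the earlier one.
    return policy
```

       A result of `noatime' for the file system being scanned suggests that
       the last-access times in the resulting index will not be meaningful.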
737

LICENCE

739       agedu is free software, distributed under the MIT licence.  Type  agedu
740       --licence to see the full licence text.
741
742
743
744Simon Tatham                      2008‐11‐02                          agedu(1)