thttpd(8)                   System Manager's Manual                  thttpd(8)

NAME

       thttpd - tiny/turbo/throttling HTTP server

SYNOPSIS

       thttpd [-C configfile] [-p port] [-d dir] [-dd data_dir] [-r|-nor]
       [-s|-nos] [-v|-nov] [-g|-nog] [-u user] [-c cgipat] [-t throttles] [-h
       host] [-l logfile] [-i pidfile] [-T charset] [-P P3P] [-M maxage] [-V]
       [-D]

DESCRIPTION

       thttpd is a simple, small, fast, and secure HTTP server.  It doesn't
       have a lot of special features, but it suffices for most uses of the
       web, it's about as fast as the best full-featured servers (Apache,
       NCSA, Netscape), and it has one extremely useful feature
       (URL-traffic-based throttling) that no other server currently has.

OPTIONS

       -C     Specifies a config file to read.  All options can be set either
              by command-line flags or in the config file.  See below for
              details.

       -p     Specifies an alternate port number to listen on.  The default is
              80.  The config-file option name for this flag is "port", and
              the config.h option is DEFAULT_PORT.

       -d     Specifies a directory to chdir() to at startup.  This is merely
              a convenience - you could just as easily do a cd in the shell
              script that invokes the program.  The config-file option name
              for this flag is "dir", and the config.h options are WEBDIR and
              USE_USER_DIR.

       -r     Do a chroot() at initialization time, restricting file access to
              the program's current directory.  If -r is the compiled-in
              default, then -nor disables it.  See below for details.  The
              config-file option names for this flag are "chroot" and
              "nochroot", and the config.h option is ALWAYS_CHROOT.

       -dd    Specifies a directory to chdir() to after chrooting.  If you're
              not chrooting, you might as well do a single chdir() with the -d
              flag.  If you are chrooting, this lets you put the web files in
              a subdirectory of the chroot tree, instead of in the top level
              mixed in with the chroot files.  The config-file option name for
              this flag is "data_dir".

       -nos   Don't do explicit symbolic link checking.  Normally, thttpd
              explicitly expands any symbolic links in filenames, to check
              that the resulting path stays within the original document tree.
              If you want to turn off this check and save some CPU time, you
              can use the -nos flag; however, this is not recommended.  Note,
              though, that if you are using the chroot option, the symlink
              checking is unnecessary and is turned off, so the safe way to
              save those CPU cycles is to use chroot.  The config-file option
              names for this flag are "symlinkcheck" and "nosymlinkcheck".

       -v     Do el-cheapo virtual hosting.  If -v is the compiled-in default,
              then -nov disables it.  See below for details.  The config-file
              option names for this flag are "vhost" and "novhost", and the
              config.h option is ALWAYS_VHOST.

       -g     Use a global passwd file.  This means that every file in the
              entire document tree is protected by the single .htpasswd file
              at the top of the tree.  Otherwise the semantics of the
              .htpasswd file are the same.  If this option is set but there is
              no .htpasswd file in the top-level directory, then thttpd
              proceeds as if the option was not set - first looking for a
              local .htpasswd file, and if that doesn't exist either then
              serving the file without any password.  If -g is the compiled-in
              default, then -nog disables it.  The config-file option names
              for this flag are "globalpasswd" and "noglobalpasswd", and the
              config.h option is ALWAYS_GLOBAL_PASSWD.

       -u     Specifies what user to switch to after initialization when
              started as root.  The default is "nobody".  The config-file
              option name for this flag is "user", and the config.h option is
              DEFAULT_USER.

       -c     Specifies a wildcard pattern for CGI programs, for instance
              "**.cgi" or "/cgi-bin/*".  See below for details.  The
              config-file option name for this flag is "cgipat", and the
              config.h option is CGI_PATTERN.

       -t     Specifies a file of throttle settings.  See below for details.
              The config-file option name for this flag is "throttles".

       -h     Specifies a hostname to bind to, for multihoming.  The default
              is to bind to all hostnames supported on the local machine.  See
              below for details.  The config-file option name for this flag is
              "host", and the config.h option is SERVER_NAME.

       -l     Specifies a file for logging.  If no -l argument is specified,
              thttpd logs via syslog().  If "-l /dev/null" is specified,
              thttpd doesn't log at all.  The config-file option name for this
              flag is "logfile".

       -i     Specifies a file to write the process-id to.  If no file is
              specified, no process-id is written.  You can use this file to
              send signals to thttpd.  See below for details.  The config-file
              option name for this flag is "pidfile".

       -T     Specifies the character set to use with text MIME types.  The
              default is UTF-8.  The config-file option name for this flag is
              "charset", and the config.h option is DEFAULT_CHARSET.

       -P     Specifies a P3P server privacy header to be returned with all
              responses.  See http://www.w3.org/P3P/ for details.  Thttpd
              doesn't do anything at all with the string except put it in the
              P3P: response header.  The config-file option name for this flag
              is "p3p".

       -M     Specifies the number of seconds to be used in a "Cache-Control:
              max-age" header to be returned with all responses.  An
              equivalent "Expires" header is also generated.  The default is
              no Cache-Control or Expires headers, which is just fine for most
              sites.  The config-file option name for this flag is "max_age".

       -V     Shows the current version info.

       -D     This was originally just a debugging flag; however, it's worth
              mentioning because one of the things it does is prevent thttpd
              from making itself a background daemon.  Instead it runs in the
              foreground like a regular program.  This is necessary when you
              want to run thttpd wrapped in a little shell script that
              restarts it if it exits.
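
       The restart wrapper mentioned under -D could be sketched roughly as
       follows.  The helper name restart_loop, the RESTART_DELAY variable,
       the restart cap, and the config-file path are assumptions made for
       illustration; only the -D and -C flags come from this page.

```shell
#!/bin/sh
# restart_loop MAX CMD [ARGS...]
# Hypothetical helper: run CMD in the foreground and start it again each
# time it exits, at most MAX times.  Prints the number of runs when done.
restart_loop() {
    max=$1
    shift
    runs=0
    while [ "$runs" -lt "$max" ]; do
        # Blocks until the command exits; a non-zero status is only reported.
        "$@" || echo "command exited with status $?" >&2
        runs=$((runs + 1))
        # Pause between restarts so a crash loop cannot spin the CPU.
        sleep "${RESTART_DELAY:-5}"
    done
    echo "$runs"
}

# Assumed invocation (the path is illustrative).  -D keeps thttpd in the
# foreground, so the loop can notice when it exits:
#   restart_loop 1000 thttpd -D -C /usr/local/etc/thttpd.conf
```

       Capping the number of restarts is a design choice of this sketch, so
       that a server that cannot start at all does not loop forever.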

CONFIG-FILE

       All the command-line options can also be set in a config file.  One
       advantage of using a config file is that the file can be changed, and
       thttpd will pick up the changes with a restart.

       The syntax of the config file is simple: a series of "option" or
       "option=value" entries separated by whitespace.  The option names are
       listed above with their corresponding command-line flags.
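
       As a rough illustration of that syntax, the following sketch writes a
       small config file.  The option names are the ones listed under
       OPTIONS; every value, and the file location, is an assumed example.

```shell
#!/bin/sh
# Write a small thttpd config file.  Every value below is an assumed
# example; only the option names come from the OPTIONS section.
conf=${THTTPD_CONF:-./thttpd.conf}      # hypothetical location

cat > "$conf" <<'EOF'
port=8080
dir=/usr/local/www
chroot
user=nobody
cgipat=/cgi-bin/*
pidfile=/var/run/thttpd.pid
EOF

# Start the server with it (left commented so the sketch has no side effects):
#   thttpd -C "$conf"
```

       Options that take no value, such as chroot, appear as a bare name;
       the others use the option=value form.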

CHROOT

       chroot() is a system call that restricts the program's view of the
       filesystem to the current directory and directories below it.  It
       becomes impossible for remote users to access any file outside of the
       initial directory.  The restriction is inherited by child processes, so
       CGI programs get it too.  This is a very strong security measure, and
       is recommended.  The only downside is that only root can call chroot(),
       so this means the program must be started as root.  However, the last
       thing it does during initialization is to give up root access by
       becoming another user, so this is safe.

       The program can also be compile-time configured to always do a
       chroot(), without needing the -r flag.

       Note that with some other web servers, such as NCSA httpd, setting up a
       directory tree for use with chroot() is complicated, involving creating
       a bunch of special directories and copying in various files.  With
       thttpd it's a lot easier: all you have to do is make sure any shells,
       utilities, and config files used by your CGI programs and scripts are
       available.  If you have CGI disabled, or if you make a policy that all
       CGI programs must be written in a compiled language such as C and
       statically linked, then you probably don't have to do any setup at all.

       However, one thing you should do is tell syslogd about the chroot tree,
       so that thttpd can still generate syslog messages.  Check your system's
       syslogd man page for how to do this.  In FreeBSD you would put
       something like this in /etc/rc.conf:
           syslogd_flags="-l /usr/local/www/data/dev/log"
       Substitute in your own chroot tree's pathname, of course.  Don't worry
       about creating the log socket; syslogd wants to do that itself.  (You
       may need to create the dev directory.)  In Linux the flag is -a instead
       of -l, and there may be other differences.

       Relevant config.h option: ALWAYS_CHROOT.

CGI

       thttpd supports the CGI 1.1 spec.

       In order for a CGI program to be run, its name must match the pattern
       specified either at compile time or on the command line with the -c
       flag.  This is a simple shell-style filename pattern.  You can use * to
       match any string not including a slash, or ** to match any string
       including slashes, or ? to match any single character.  You can also
       use multiple such patterns separated by |.  The patterns get checked
       against the filename part of the incoming URL.  Don't forget to quote
       any wildcard characters so that the shell doesn't mess with them.

       Restricting CGI programs to a single directory lets the site
       administrator review them for security holes, and is strongly
       recommended.  If there are individual users that you trust, you can
       enable their directories too.

       If no CGI pattern is specified, either at run time or at compile time,
       then CGI programs cannot be run at all.  If you want to disable CGI as
       a security measure, that's how you do it: just comment out the patterns
       in the config file and don't run with the -c flag.

       Note: the current working directory when a CGI program gets run is the
       directory that the CGI program lives in.  This isn't in the CGI 1.1
       spec, but it's what most other HTTP servers do.

       Relevant config.h options: CGI_PATTERN, CGI_TIMELIMIT, CGI_NICE,
       CGI_PATH, CGI_LD_LIBRARY_PATH, CGIBINDIR.
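
       The quoting advice above can be illustrated with a short sketch.  It
       only builds an argument list and prints what thttpd would receive;
       the -d directory is an assumed example path.

```shell
#!/bin/sh
# Keep the CGI pattern in single quotes.  Unquoted, the shell would treat
# "|" as a pipe and could expand "*" against files in the current directory.
cgipat='/cgi-bin/*|**.cgi'

# Build the argument list without running anything.
# /usr/local/www is an assumed example directory.
set -- thttpd -d /usr/local/www -c "$cgipat"

echo "argument count: $#"
echo "pattern as thttpd would receive it: $5"

# To launch for real, run the saved argument list:
#   "$@"
```

       Because the variable is expanded inside double quotes, the pattern
       reaches the program as a single, unmodified argument.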

BASIC AUTHENTICATION

       Basic Authentication is available as an option at compile time.  If
       enabled, it uses a password file in the directory to be protected,
       called .htpasswd by default.  This file holds the familiar
       colon-separated username/encrypted-password pairs, one record per
       line.  The protection does not carry over to subdirectories.  The
       utility program htpasswd(1) is included to help create and modify
       .htpasswd files.

       Relevant config.h option: AUTH_FILE.

THROTTLING

       The throttle file lets you set maximum byte rates on URLs or URL
       groups.  You can optionally set a minimum rate too.  The format of the
       throttle file is very simple.  A # starts a comment, and the rest of
       the line is ignored.  Blank lines are ignored.  The rest of the lines
       should consist of a pattern, whitespace, and a number.  The pattern is
       a simple shell-style filename pattern, using ?, **, and *, or multiple
       such patterns separated by |.

       The numbers in the file are byte rates, specified in units of bytes per
       second.  For comparison, a v.90 modem gives about 5000 B/s depending on
       compression, a double-B-channel ISDN line about 12800 B/s, and a T1
       line is about 150000 B/s.  If you want to set a minimum rate as well,
       use number-number.

       Example:
         # throttle file for www.acme.com

         **              2000-100000  # limit total web usage to 2/3 of our T1,
                                      # but never go below 2000 B/s
         **.jpg|**.gif   50000   # limit images to 1/3 of our T1
         **.mpg          20000   # and movies to even less
         jef/**          20000   # jef's pages are too popular

       Throttling is implemented by checking each incoming URL filename
       against all of the patterns in the throttle file.  The server
       accumulates statistics on how much bandwidth each pattern has accounted
       for recently (via a rolling average).  If a URL matches a pattern that
       has been exceeding its specified limit, then the data returned is
       actually slowed down, with pauses between each block.  If that's not
       possible (e.g. for CGI programs) or if the bandwidth has gotten far
       larger than the limit, then the server returns a special code saying
       'try again later'.

       The minimum rates are implemented similarly.  If too many people are
       trying to fetch something at the same time, throttling may slow down
       each connection so much that it's not really usable.  Furthermore, all
       those slow connections clog up the server, using up file handles and
       connection slots.  Setting a minimum rate says that past a certain
       point you should not even bother - the server returns the 'try again
       later' code and the connection isn't even started.

       There is no provision for setting a maximum connections/second
       throttle, because throttling a request uses as much CPU as handling it,
       so there would be no point.  There is also no provision for throttling
       the number of simultaneous connections on a per-URL basis.  However,
       you can control the overall number of connections for the whole server
       very simply, by setting the operating system's per-process file
       descriptor limit before starting thttpd.  Be sure to set the hard
       limit, not the soft limit.
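
       Setting the descriptor limit before starting the server might look
       like the sketch below.  The helper name start_with_fd_limit, the
       limit of 256, and the config-file path are assumptions; the sketch
       also assumes a shell such as bash or dash, where ulimit -n given
       without -H or -S sets both the hard and the soft limit.

```shell
#!/bin/sh
# start_with_fd_limit LIMIT CMD [ARGS...]
# Hypothetical helper: lower the file descriptor limit, then run CMD.
start_with_fd_limit() {
    limit=$1
    shift
    (
        # In bash and dash, ulimit -n with neither -H nor -S sets both the
        # hard and the soft limit.  The subshell keeps the change local.
        ulimit -n "$limit" || exit 1
        exec "$@"
    )
}

# Assumed invocation (limit and path are illustrative):
#   start_with_fd_limit 256 thttpd -C /usr/local/etc/thttpd.conf
```

       Running the command inside a subshell means the calling shell keeps
       its own limits, since a lowered hard limit cannot be raised again by
       an unprivileged process.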

MULTIHOMING

       Multihoming means using one machine to serve multiple hostnames.  For
       instance, if you're an internet provider and you want to let all of
       your customers have customized web addresses, you might have
       www.joe.acme.com, www.jane.acme.com, and your own www.acme.com, all
       running on the same physical hardware.  This feature is also known as
       "virtual hosts".  There are three steps to setting this up.

       Step one: make DNS entries for all of the hostnames.  The current way
       to do this, allowed by HTTP/1.1, is to use CNAME aliases, like so:
         www.acme.com IN A 192.100.66.1
         www.joe.acme.com IN CNAME www.acme.com
         www.jane.acme.com IN CNAME www.acme.com
       However, this is incompatible with older HTTP/1.0 browsers.  If you
       want to stay compatible, there's a different way - use A records
       instead, each with a different IP address, like so:
         www.acme.com IN A 192.100.66.1
         www.joe.acme.com IN A 192.100.66.200
         www.jane.acme.com IN A 192.100.66.201
       This is bad because it uses extra IP addresses, a somewhat scarce
       resource.  But if you want people with older browsers to be able to
       visit your sites, you still have to do it this way.

       Step two: if you're using the modern CNAME method of multihoming, then
       you can skip this step.  Otherwise, using the older multiple-IP-address
       method, you must set up IP aliases or multiple interfaces for the extra
       addresses.  You can use ifconfig(8)'s alias command to tell the machine
       to answer to all of the different IP addresses.  Example:
         ifconfig le0 www.acme.com
         ifconfig le0 www.joe.acme.com alias
         ifconfig le0 www.jane.acme.com alias
       If your OS's version of ifconfig doesn't have an alias command, you're
       probably out of luck (but see
       http://www.acme.com/software/thttpd/notes.html).

       Step three: set up thttpd to handle the multiple hosts.  The easiest
       way is with the -v flag, or the ALWAYS_VHOST config.h option.  This
       works with either CNAME multihosting or multiple-IP multihosting.  What
       it does is send each incoming request to a subdirectory based on the
       hostname it's intended for.  All you have to do in order to set things
       up is to create those subdirectories in the directory where thttpd will
       run.  With the example above, you'd do like so:
         mkdir www.acme.com www.joe.acme.com www.jane.acme.com
       If you're using old-style multiple-IP multihosting, you should also
       create symbolic links from the numeric addresses to the names, like so:
         ln -s www.acme.com 192.100.66.1
         ln -s www.joe.acme.com 192.100.66.200
         ln -s www.jane.acme.com 192.100.66.201
       This lets the older HTTP/1.0 browsers find the right subdirectory.

       There's an optional alternate step three if you're using multiple-IP
       multihosting: run a separate thttpd process for each hostname, using
       the -h flag to specify which one is which.  This gives you more
       flexibility, since you can run each of these processes in separate
       directories, with different throttle files, etc.  Example:
         thttpd -r -d /usr/www -h www.acme.com
         thttpd -r -d /usr/www/joe -u joe -h www.joe.acme.com
         thttpd -r -d /usr/www/jane -u jane -h www.jane.acme.com
       But remember, this multiple-process method does not work with CNAME
       multihosting - for that, you must use a single thttpd process with the
       -v flag.

CUSTOM ERRORS

       thttpd lets you define your own custom error pages for the various HTTP
       errors.  There's a separate file for each error number, all stored in
       one special directory.  The directory name is "errors", at the top of
       the web directory tree.  The error files should be named "errNNN.html",
       where NNN is the error number.  So for example, to make a custom error
       page for the authentication failure error, which is number 401, you
       would put your HTML into the file "errors/err401.html".  If no custom
       error file is found for a given error number, then the usual built-in
       error page is generated.

       If you're using the virtual hosts option, you can also have different
       custom error pages for each different virtual host.  In this case you
       put another "errors" directory in the top of that virtual host's web
       tree.  thttpd will look first in the virtual host errors directory, and
       then in the server-wide errors directory, and if neither of those has
       an appropriate error file then it will generate the built-in error.
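
       The directory layout described above can be created along these
       lines.  This is only a sketch: the WEBROOT location, the virtual host
       name, and the page text are assumed examples.

```shell
#!/bin/sh
# Create a server-wide 401 page and a per-virtual-host override.
# WEBROOT and the host name are assumed examples.
WEBROOT=${WEBROOT:-./www}
vhost=www.joe.acme.com

mkdir -p "$WEBROOT/errors" "$WEBROOT/$vhost/errors"

# Server-wide page, used when the virtual host has no page of its own.
cat > "$WEBROOT/errors/err401.html" <<'EOF'
<html><head><title>401 Unauthorized</title></head>
<body><p>A valid user name and password are required.</p></body></html>
EOF

# Host-specific page; it is looked up first for requests to this host.
cat > "$WEBROOT/$vhost/errors/err401.html" <<'EOF'
<html><head><title>401 Unauthorized</title></head>
<body><p>Please log in to use this site.</p></body></html>
EOF

# Keep the pages world-readable and non-executable, like other data files.
chmod 644 "$WEBROOT/errors/err401.html" "$WEBROOT/$vhost/errors/err401.html"
```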

NON-LOCAL REFERRERS

       Sometimes another site on the net will embed your image files in their
       HTML files, which basically means they're stealing your bandwidth.  You
       can prevent them from doing this by using non-local referrer filtering.
       With this option, certain files can only be fetched via a local
       referrer.  The files have to be referenced by a local web page.  If a
       web page on some other site references the files, that fetch will be
       blocked.  There are three config-file variables for this feature:

       urlpat A wildcard pattern for the URLs that should require a local
              referrer.  This is typically just image files, sound files, and
              so on.  For example:
                urlpat=**.jpg|**.gif|**.au|**.wav
              For most sites, that one setting is all you need to enable
              referrer filtering.

       noemptyreferrers
              By default, requests with no referrer at all, or a null
              referrer, or a referrer with no apparent hostname, are allowed.
              With this variable set, such requests are disallowed.

       localpat
              A wildcard pattern that specifies the local host or hosts.  This
              is used to determine if the host in the referrer is local or
              not.  If not specified it defaults to the actual local hostname.

SYMLINKS

       thttpd is very picky about symbolic links.  Before delivering any file,
       it first checks each element in the path to see if it's a symbolic
       link, and expands them all out to get the final actual filename.  Along
       the way it checks for things like links with ".." that go above the
       server's directory, and absolute symlinks (ones that start with a /).
       These are prohibited as security holes, so the server returns an error
       page for them.  This means you can't set up your web directory with a
       bunch of symlinks pointing to individual users' home web directories.
       Instead you do it the other way around - the user web directories are
       real subdirs of the main web directory, and in each user's home dir
       there's a symlink pointing to their actual web dir.

       The CGI pattern is also affected - it gets matched against the
       fully-expanded filename.  So, if you have a single CGI directory but
       then put a symbolic link in it pointing somewhere else, that won't
       work.  The CGI program will be treated as a regular file and returned
       to the client, instead of getting run.  This could be confusing.
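
       The "other way around" layout for user directories could be set up
       roughly as follows.  All paths, the user name, and the link name
       public_html are hypothetical; substitute your own.

```shell
#!/bin/sh
# Lay out a user's pages the way the symlink checks expect: the real
# directory lives inside the web tree, and the symlink sits in the user's
# home directory.  All paths and names here are hypothetical.
WEBROOT=${WEBROOT:-./www}
USERHOME=${USERHOME:-./home/jef}
user=jef

mkdir -p "$WEBROOT/users/$user" "$USERHOME"

# Resolve the web directory to an absolute path so the link works from
# anywhere.  The link itself lies outside the tree that thttpd serves.
target=$(cd "$WEBROOT/users/$user" && pwd)
ln -s "$target" "$USERHOME/public_html"
```

       The user edits files through the link in the home directory, while
       the server only ever sees a real directory inside its own tree.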

PERMISSIONS

       thttpd is also picky about file permissions.  It wants data files
       (HTML, images) to be world readable.  Readable by the group that the
       thttpd process runs as is not enough - thttpd checks explicitly for the
       world-readable bit.  This is so that no one ever gets surprised by a
       file that's not set world-readable and yet somehow is readable by the
       HTTP server and therefore the *whole* world.

       The same logic applies to directories.  As with the standard Unix "ls"
       program, thttpd will only let you look at the contents of a directory
       if its read bit is on; but as with data files, this must be the
       world-read bit, not just the group-read bit.

       thttpd also wants the execute bit to be *off* for data files.  A file
       that is marked executable but doesn't match the CGI pattern might be a
       script or program that got accidentally left in the wrong directory.
       Allowing people to fetch the contents of the file might be a security
       breach, so this is prohibited.  Of course if an executable file *does*
       match the CGI pattern, then it just gets run as a CGI.

       In summary, data files should be mode 644 (rw-r--r--), directories
       should be 755 (rwxr-xr-x) if you want to allow indexing and 711
       (rwx--x--x) to disallow it, and CGI programs should be mode 755
       (rwxr-xr-x) or 711 (rwx--x--x).
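
       Applied to a small example tree, the modes in the summary look like
       this.  The tree location and the file and directory names are
       assumptions made for illustration.

```shell
#!/bin/sh
# Apply the modes from the summary to a small example tree.
# The tree location and the names in it are assumed examples.
WEBROOT=${WEBROOT:-./www}

mkdir -p "$WEBROOT/images" "$WEBROOT/private" "$WEBROOT/cgi-bin"
: > "$WEBROOT/index.html"
printf '#!/bin/sh\necho "Content-type: text/plain"\necho\necho hello\n' \
    > "$WEBROOT/cgi-bin/hello"

# Data file: world-readable, execute bit off.
chmod 644 "$WEBROOT/index.html"
# Directories whose contents may be indexed.
chmod 755 "$WEBROOT" "$WEBROOT/images" "$WEBROOT/cgi-bin"
# Directory that can be traversed but not listed.
chmod 711 "$WEBROOT/private"
# CGI program: executable.
chmod 755 "$WEBROOT/cgi-bin/hello"
```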

LOGS

       thttpd does all of its logging via syslog(3).  The facility it uses is
       configurable.  Aside from error messages, there are only a few log
       entry types of interest, all fairly similar to CERN Common Log Format:
         Aug  6 15:40:34 acme thttpd[583]: 165.113.207.103 - - "GET /file" 200 357
         Aug  6 15:40:43 acme thttpd[583]: 165.113.207.103 - - "HEAD /file" 200 0
         Aug  6 15:41:16 acme thttpd[583]: referrer http://www.acme.com/ -> /dir
         Aug  6 15:41:16 acme thttpd[583]: user-agent Mozilla/1.1N
       The package includes a script for translating these log entries into
       CERN-compatible files.  Note that thttpd does not translate numeric IP
       addresses into domain names.  This is both to save time and as a minor
       security measure (the numeric address is harder to spoof).

       Relevant config.h option: LOG_FACILITY.

       If you'd rather log directly to a file, you can use the -l command-line
       flag.  But note that error messages still go to syslog.

SIGNALS

       thttpd handles several signals, which you can send via the standard
       Unix kill(1) command:

       INT,TERM
              These signals tell thttpd to shut down immediately.  Any
              requests in progress get aborted.

       USR1   This signal tells thttpd to shut down as soon as it's done
              servicing all current requests.  In addition, the network socket
              it uses to accept new connections gets closed immediately, which
              means a fresh thttpd can be started up immediately.

       USR2   This signal tells thttpd to generate the statistics syslog
              messages immediately, instead of waiting for the regular hourly
              update.

       HUP    This signal tells thttpd to close and re-open its (non-syslog)
              log file, for instance if you rotated the logs and want it to
              start using the new one.  This is a little tricky to set up
              correctly, for instance if you are using chroot() then the log
              file must be within the chroot tree, but it's definitely doable.
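
       Combined with the pid file written by the -i flag, sending these
       signals might be wrapped as in the sketch below.  The helper name
       thttpd_signal and the pid-file path are assumptions made for
       illustration.

```shell
#!/bin/sh
# thttpd_signal SIGNAL PIDFILE
# Hypothetical helper: send SIGNAL to the process whose id is stored in
# PIDFILE, i.e. the file named with the -i flag.
thttpd_signal() {
    sig=$1
    pidfile=$2
    [ -r "$pidfile" ] || { echo "cannot read $pidfile" >&2; return 1; }
    kill -s "$sig" "$(cat "$pidfile")"
}

# Assumed examples (the pid file path is illustrative):
#   thttpd_signal USR1 /var/run/thttpd.pid   # finish current requests, then exit
#   thttpd_signal USR2 /var/run/thttpd.pid   # log statistics now
#   thttpd_signal HUP  /var/run/thttpd.pid   # re-open the log after rotating it
```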

SEE ALSO

       redirect(8), ssi(8), makeweb(1), htpasswd(1), syslogtocern(8),
       weblog_parse(1), http_get(1)

THANKS

       Many thanks to contributors, reviewers, testers: John LoVerso, Jordan
       Hayes, Chris Torek, Jim Thompson, Barton Schaffer, Geoff Adams, Dan
       Kegel, John Hascall, Bennett Todd, KIKUCHI Takahiro, Catalin Ionescu.
       Special thanks to Craig Leres for substantial debugging and
       development, and for not complaining about my coding style very much.

AUTHOR

       Copyright © 1995,1998,1999,2000 by Jef Poskanzer <jef@mail.acme.com>.
       All rights reserved.

                               29 February 2000                      thttpd(8)