goaccess(1)                      User Manuals                      goaccess(1)

NAME

6       goaccess - fast web log analyzer and interactive viewer.
7

SYNOPSIS

9       goaccess [filename] [options...] [-c][-M][-H][-q][-d][...]
10

DESCRIPTION

       GoAccess is an open source real-time web log analyzer and interactive
       viewer that runs in a terminal on *nix systems or through your browser.
15
16       It provides fast and valuable HTTP statistics for system administrators
17       that require a visual server report on the fly.
18
19       GoAccess parses the specified web log file and outputs the data to  the
20       X terminal. Features include:
21
22
23       General Statistics:
              This panel gives a summary of several metrics, such as the
              number of valid and invalid requests, time taken to analyze the
              dataset, unique visitors, requested files, static files (CSS,
              ICO, JPG, etc.), HTTP referrers, 404s, size of the parsed log
              file, and bandwidth consumption.
29
30       Unique visitors
31              This panel shows metrics such as hits, unique visitors and cumu‐
32              lative bandwidth per date. HTTP requests containing the same IP,
33              the  same  date, and the same user agent are considered a unique
34              visitor. By default, it includes web crawlers/spiders.
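
              The unique-visitor rule above can be sketched with awk. This
              is only an illustration, not GoAccess' own code: it assumes
              Combined Log Format input, and the function name and sample
              lines are made up for the example.

```shell
# Count distinct (IP, date, user agent) combinations in Combined Log Format
# lines read from stdin. Illustrative only; not how GoAccess is implemented.
count_unique_visitors() {
    awk -F'"' '
        {
            split($1, head, " ")            # head[1] = IP, head[4] = "[dd/Mon/yyyy:HH:MM:SS"
            ip  = head[1]
            day = substr(head[4], 2, 11)    # keep dd/Mon/yyyy, drop "[" and the time
            key = ip "|" day "|" $6         # $6 is the quoted user agent
            if (!(key in seen)) { seen[key] = 1; n++ }
        }
        END { print n + 0 }
    '
}

# Hypothetical sample data.
sample='192.0.2.1 - - [05/Jun/2016:16:01:02 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"
192.0.2.1 - - [05/Jun/2016:16:09:44 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0"
192.0.2.1 - - [05/Jun/2016:17:30:00 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/7.47.0"
198.51.100.7 - - [06/Jun/2016:08:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"'

unique=$(printf '%s\n' "$sample" | count_unique_visitors)
echo "$unique"
```

              The first two sample lines share IP, date and user agent, so
              they count as one visitor; the different user agent and the
              different host each add another.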
35
36              Optionally, date specificity can be set to the hour level  using
37              --date-spec=hr  which will display dates such as 05/Jun/2016:16.
38              This is great if you want to track your  daily  traffic  at  the
39              hour level.
40
41       Requested files
42              This panel displays the most requested files on your web server.
43              It shows hits, unique visitors, and percentage, along  with  the
44              cumulative bandwidth, protocol, and the request method used.
45
46       Requested static files
              Lists the most frequently requested static files, such as JPG,
              CSS, SWF, JS, GIF, and PNG file types, along with the same
              metrics as the previous panel. Additional static file extensions
              can be added in the configuration file.
51
52       404 or Not Found
              Displays the same metrics as the previous request panels, but
              its data covers only the requests for pages that were not found
              on the server (HTTP status code 404).
56
57       Hosts  This panel has detailed information  on  the  hosts  themselves.
58              This  is  great for spotting aggressive crawlers and identifying
59              who's eating your bandwidth.
60
61              Expanding the panel can display more information such as  host's
62              reverse DNS lookup result, country of origin and city. If the -a
63              argument is enabled, a list of user agents can be  displayed  by
64              selecting the desired IP address, and then pressing ENTER.
65
66       Operating Systems
67              This panel will report which operating system the host used when
68              it hit the server. It attempts to provide the most specific ver‐
69              sion of each operating system.
70
71       Browsers
72              This  panel  will report which browser the host used when it hit
73              the server. It attempts to provide the most specific version  of
74              each browser.
75
76       Visit Times
77              This  panel  will display an hourly report. This option displays
78              24 data points, one for each hour of the day.
79
              Optionally, hour specificity can be set to the tenth of an hour
              level using --hour-spec=min, which will display hours as 16:4.
              This is great if you want to spot peaks of traffic on your
              server.
84
85       Virtual Hosts
86              This  panel  will display all the different virtual hosts parsed
87              from the access log. This panel  is  displayed  if  %v  is  used
88              within the log-format string.
89
90       Referrers URLs
              If the host in question accessed the site via another resource,
              or was linked/diverted to you from another host, the URL it was
              referred from is shown in this panel. This panel is disabled by
              default; see `--ignore-panel` in your configuration file to
              enable it.
96
97       Referring Sites
              This panel displays only the host part of the URL the request
              came from, not the whole URL.
100
101       Keyphrases
              Reports keyphrases used on Google search, Google cache, and
              Google translate that have led to your web server. At present,
              it only supports Google search queries via HTTP. This panel is
              disabled by default; see `--ignore-panel` in your configuration
              file to enable it.
107
108       Geo Location
109              Determines where an IP address is geographically  located.  Sta‐
110              tistics are broken down by continent and country. It needs to be
111              compiled with GeoLocation support.
112
113       HTTP Status Codes
114              The values of the numeric status code to HTTP requests.
115
116       Remote User (HTTP authentication)
117              This is the userid of the  person  requesting  the  document  as
118              determined  by HTTP authentication. If the document is not pass‐
119              word protected, this part will be "-"  just  like  the  previous
120              one.  This  panel  is  not enabled unless %e is given within the
121              log-format variable.
122
123
124       NOTE: Optionally and if configured, all panels can display the  average
125       time taken to serve the request.
126
127

STORAGE

129       There  are three storage options that can be used with GoAccess. Choos‐
130       ing one will depend on your environment and needs.
131
132       Default Hash Tables
133              In-memory storage provides better performance  at  the  cost  of
134              limiting  the  dataset  size to the amount of available physical
135              memory. By default GoAccess uses in-memory hash tables. If  your
136              dataset  can  fit in memory, then this will perform fine. It has
137              very good memory usage and pretty good performance.
138
139       Tokyo Cabinet On-Disk B+ Tree
140              Use this storage method for large datasets where it is not  pos‐
141              sible  to  fit  everything  in  memory.  The B+ tree database is
142              slower than any of the hash databases since data has to be  com‐
143              mitted to disk. However, using an SSD greatly increases the per‐
144              formance. You may also use this storage method if you need  data
145              persistence to quickly load statistics at a later date.
146
147       Tokyo Cabinet In-memory Hash Database
              An alternative to the default hash tables. It uses generic
              typing, and thus its performance in terms of memory and speed
              is average.
151

CONFIGURATION

153       Multiple  options can be used to configure GoAccess. For a complete up-
154       to-date list of configure options, run ./configure --help
155
156       --enable-debug
157              Compile with debugging symbols and turn off  compiler  optimiza‐
158              tions.
159
160       --enable-utf8
161              Compile with wide character support. Ncursesw is required.
162
163       --enable-geoip=<legacy|mmdb>
164              Compile  with  GeoLocation support. MaxMind's GeoIP is required.
165              legacy will utilize the original  GeoIP  databases.   mmdb  will
166              utilize the enhanced GeoIP2 databases.
167
168       --enable-tcb=<memhash|btree>
169              Compile  with  Tokyo Cabinet storage support.  memhash will uti‐
170              lize Tokyo Cabinet's on-memory hash database.  btree  will  uti‐
171              lize Tokyo Cabinet's on-disk B+ Tree database.
172
173       --disable-zlib
174              Disable zlib compression on B+ Tree database.
175
176       --disable-bzip
177              Disable bzip2 compression on B+ Tree database.
178
179       --with-getline
180              Dynamically  expands  line  buffer  in  order to parse full line
181              requests instead of using a fixed size buffer of 4096.
182
183       --with-openssl
184              Compile GoAccess with OpenSSL support for its WebSocket server.
185

OPTIONS

187       The following options can be supplied to the command  or  specified  in
188       the  configuration  file.  If specified in the configuration file, long
189       options need to be used without prepending --  and  without  using  the
190       equal sign =.
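
       The rule above (drop the leading -- and replace = with a space) can
       be sketched as a small helper. The function name is hypothetical and
       it only handles options of the form --name=value; check the sample
       configuration file for how options without a value are written.

```shell
# Convert a long command-line option into its configuration-file form.
# Hypothetical helper; handles --name=value options only.
to_config_line() {
    opt=${1#--}                      # drop the leading --
    case $opt in
        *=*) printf '%s %s\n' "${opt%%=*}" "${opt#*=}" ;;
        *)   echo "expected --name=value, got: $1" >&2; return 1 ;;
    esac
}

line=$(to_config_line '--date-spec=hr')
echo "$line"
```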
191
192   LOG/DATE/TIME FORMAT
193       --time-format=<timeformat>
194              The  time-format variable followed by a space, specifies the log
195              format time containing either a name of a predefined format (see
196              options below) or any combination of regular characters and spe‐
197              cial format specifiers.
198
199              They all begin with a percentage (%) sign. See  `man  strftime`.
200              %T or %H:%M:%S.
201
202              Note  that  if  a timestamp is given in microseconds, %f must be
203              used as time-format
204
205       --date-format=<dateformat>
              The date-format variable followed by a space, specifies the log
              format date containing either a name of a predefined format (see
              options below) or any combination of regular characters and
              special format specifiers.
210
211              They  all  begin with a percentage (%) sign. See `man strftime`.
212              %Y-%m-%d.
213
214              Note that if a timestamp is given in microseconds,  %f  must  be
215              used as date-format
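
              To see what a given set of strftime specifiers produces, date(1)
              accepts the same specifiers. The sketch below renders the
              current UTC time using %d/%b/%Y and %H:%M:%S; these two
              format strings are chosen for illustration and must be replaced
              by whatever matches your own log.

```shell
# Render the current UTC time with two strftime-style formats.
# LC_ALL=C keeps the month abbreviation (%b) in English.
stamp_date=$(LC_ALL=C date -u +'%d/%b/%Y')
stamp_time=$(LC_ALL=C date -u +'%H:%M:%S')

echo "date-format %d/%b/%Y -> $stamp_date"
echo "time-format %H:%M:%S -> $stamp_time"
```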
216
217       --log-format=<logformat>
218              The log-format variable followed by a space or \t for tab-delim‐
219              ited, specifies the log format string.
220
221              Note that if there are spaces  within  the  format,  the  string
222              needs  to be enclosed in single/double quotes. Inner quotes need
223              to be escaped.
224
225              In addition to specifying the  raw  log/date/time  formats,  for
226              simplicity, any of the following predefined log format names can
227              be supplied to the log/date/time-format variables. GoAccess  can
228              also handle one predefined name in one variable and another pre‐
229              defined name in another variable.
230
231                COMBINED     - Combined Log Format,
232                VCOMBINED    - Combined Log Format with Virtual Host,
233                COMMON       - Common Log Format,
234                VCOMMON      - Common Log Format with Virtual Host,
235                W3C          - W3C Extended Log File Format,
236                SQUID        - Native Squid Log Format,
237                CLOUDFRONT   - Amazon CloudFront Web Distribution,
238                CLOUDSTORAGE - Google Cloud Storage,
239                AWSELB       - Amazon Elastic Load Balancing,
240                AWSS3        - Amazon Simple Storage Service (S3)
241
              Note: Piping data into GoAccess won't prompt a log/date/time
              configuration dialog. You will need to define the format
              beforehand in your configuration file or on the command line.
245
246   USER INTERFACE OPTIONS
247       -c --config-dialog
248              Prompt log/time/date configuration window on program start. Only
249              when curses is initialized.
250
251       -i --hl-header
252              Color highlight active terminal panel.
253
254       -m --with-mouse
255              Enable mouse support on main terminal dashboard.
256
       --color=<fg:bg[attrs, PANEL]>
258              Specify custom colors for the terminal output.
259
260              Color Syntax
261                DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
262
263               FG# = foreground color [-1...255] (-1 = default term color)
264               BG# = background color [-1...255] (-1 = default term color)
265
266              Optionally,  it  is possible to apply color attributes (multiple
267              attributes are comma separated), such as: bold, underline,  nor‐
268              mal, reverse, blink
269
270              If  desired,  it  is  possible to apply custom colors per panel,
271              that is, a metric in the REQUESTS panel can be of color A, while
272              the same metric in the BROWSERS panel can be of color B.
273
274              Available color definitions:
275                COLOR_MTRC_HITS
276                COLOR_MTRC_VISITORS
277                COLOR_MTRC_DATA
278                COLOR_MTRC_BW
279                COLOR_MTRC_AVGTS
280                COLOR_MTRC_CUMTS
281                COLOR_MTRC_MAXTS
282                COLOR_MTRC_PROT
283                COLOR_MTRC_MTHD
284                COLOR_MTRC_HITS_PERC
285                COLOR_MTRC_HITS_PERC_MAX
286                COLOR_MTRC_VISITORS_PERC
287                COLOR_MTRC_VISITORS_PERC_MAX
288                COLOR_PANEL_COLS
289                COLOR_BARS
290                COLOR_ERROR
291                COLOR_SELECTED
292                COLOR_PANEL_ACTIVE
293                COLOR_PANEL_HEADER
294                COLOR_PANEL_DESC
295                COLOR_OVERALL_LBLS
296                COLOR_OVERALL_VALS
297                COLOR_OVERALL_PATH
298                COLOR_ACTIVE_LABEL
299                COLOR_BG
300                COLOR_DEFAULT
301                COLOR_PROGRESS
302
303              See configuration file for a sample color scheme.
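
              As a rough sketch of the color syntax above, the helper below
              builds a "DEFINITION colorFG#:colorBG#" string and rejects
              values outside the documented -1...255 range. The function name
              is hypothetical, attributes and panel are left out, and how the
              resulting string is quoted on the command line is an assumption.

```shell
# Build a color definition string after checking the documented range.
# Hypothetical helper; attributes and PANEL are intentionally omitted.
color_def() {
    # $1 = definition name, $2 = foreground number, $3 = background number
    for n in "$2" "$3"; do
        if [ "$n" -lt -1 ] || [ "$n" -gt 255 ]; then
            echo "color out of range [-1...255]: $n" >&2
            return 1
        fi
    done
    printf '%s color%s:color%s\n' "$1" "$2" "$3"
}

def=$(color_def COLOR_MTRC_HITS 110 -1)
echo "--color='$def'"
```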
304
305       --color-scheme=<1|2|3>
306              Choose  among  color schemes.  1 for the default grey scheme.  2
307              for the green scheme.  3 for the Monokai scheme (shown  only  if
308              terminal supports 256 colors).
309
310       --crawlers-only
311              Parse and display only crawlers (bots).
312
313       --html-custom-css=<path/custom.css>
314              Specifies a custom CSS file path to load in the HTML report.
315
316       --html-custom-js=<path/custom.js>
317              Specifies a custom JS file path to load in the HTML report.
318
319       --html-report-title=<title>
320              Set HTML report page title and header.
321
322       --html-prefs=<JSON>
323              Set  HTML report default preferences. Supply a valid JSON object
324              containing the HTML preferences.  It allows the ability to  cus‐
325              tomize each panel plot. See example below.
326
327              Note: The JSON object passed needs to be a one line JSON string.
328              For instance,
329
330              --html-prefs='{"theme":"bright","perPage":5,"layout":"horizon‐
331              tal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
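
              If the preferences are kept in a readable multi-line form, one
              way to obtain the required one-line string is to strip the
              newlines before passing it on. A minimal sketch, reusing the
              preferences from the example above (the variable names are
              arbitrary):

```shell
# Preferences kept in a readable, multi-line form.
prefs_multiline='{
  "theme": "bright",
  "perPage": 5,
  "layout": "horizontal",
  "showTables": true,
  "visitors": { "plot": { "chartType": "bar" } }
}'

# Remove the newlines so the JSON object fits on a single line.
prefs=$(printf '%s' "$prefs_multiline" | tr -d '\n')
echo "--html-prefs='$prefs'"
```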
332
333       --json-pretty-print
334              Format JSON output using tabs and newlines.
335
              Note: This is not recommended when outputting a real-time HTML
              report since the WebSocket payload will be much larger.
338
339       --max-items=<number>
340              The maximum number of items to display per  panel.  The  maximum
341              can be a number between 1 and n.
342
343              Note:  Only  the  CSV  and  JSON  output  allow a maximum number
344              greater than the default value of 366 (or 50  in  the  real-time
345              HTML output) items per panel.
346
347       --no-color
348              Turn  off  colored output. This is the  default output on termi‐
349              nals that do not support colors.
350
351       --no-column-names
352              Don't write column names in the terminal output. By default,  it
353              displays column names for each available metric in every panel.
354
355       --no-csv-summary
356              Disable summary metrics on the CSV output.
357
358       --no-progress
359              Disable progress metrics [total requests/requests per second].
360
361       --no-tab-scroll
362              Disable  scrolling  through panels when TAB is pressed or when a
363              panel is selected using a numeric key.
364
365       --no-html-last-updated
366              Do not show the last updated field displayed in the HTML  gener‐
367              ated report.
368
369       --no-parsing-spinner
              Do not show the progress metrics and parsing spinner.
371
372   SERVER OPTIONS
373       --addr Specify  IP address to bind the server to. Otherwise it binds to
374              0.0.0.0.
375
376              Usually there is no need to  specify  the  address,  unless  you
377              intentionally  would  like  to  bind  the  server to a different
378              address within your server.
379
380       --daemonize
381              Run GoAccess as daemon (only if --real-time-html enabled).
382
383              Note: It's important to make use of absolute paths across  GoAc‐
384              cess' configuration.
385
386       --origin=<url>
387              Ensure  clients  send  the specified origin header upon the Web‐
388              Socket handshake.
389
390       --pid-file=<path/goaccess.pid>
              Write the daemon PID to a file when used along with the
              --daemonize option.
393
394       --port=<port>
395              Specify  the  port to use. By default GoAccess' WebSocket server
396              listens on port 7890.
397
398       --real-time-html
399              Enable real-time HTML output.
400
              GoAccess uses its own WebSocket server to push the data from the
              server to the client. See http://gwsocket.io for more details on
              how the WebSocket server works.
404
405       --ws-url=<[scheme://]url[:port]>
406              URL to which the WebSocket server responds. This is the URL sup‐
407              plied to the WebSocket constructor on the client side.
408
409              Optionally,  it is possible to specify the WebSocket URI scheme,
410              such as ws:// or wss:// for unencrypted  and  encrypted  connec‐
411              tions. e.g., wss://goaccess.io
412
413              If  GoAccess is running behind a proxy, you could set the client
414              side to connect to a different port by specifying the host  fol‐
415              lowed by a colon and the port.  e.g., goaccess.io:9999
416
417              By default, it will attempt to connect to the generated report's
418              hostname. If GoAccess is running on a remote server, the host of
419              the  remote  server should be specified here. Also, make sure it
420              is a valid host and NOT an http address.
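
              The server options above might be combined along the following
              lines. This is a sketch only: the paths and certificate files
              are hypothetical, and the run helper prints the command rather
              than executing it unless RUN=1 is set.

```shell
# Dry-run helper: prints the command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Hypothetical paths; adjust to your own setup.
LOG=/var/log/nginx/access.log
REPORT=/var/www/html/report.html

# Real-time HTML report served over an encrypted WebSocket connection.
run goaccess -f "$LOG" --log-format=COMBINED -o "$REPORT" \
    --real-time-html --ws-url=wss://goaccess.io \
    --ssl-cert=/etc/ssl/example.crt --ssl-key=/etc/ssl/example.key
```

              Note that --ssl-cert and --ssl-key only apply when GoAccess
              was configured using --with-openssl.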
421
422       --fifo-in=<path/file>
              Creates a named pipe (FIFO) that reads from the given
              path/file.
425
426       --fifo-out=<path/file>
427              Creates a named pipe (FIFO) that writes to the given path/file.
428
429       --ssl-cert=<cert.crt>
430              Path to TLS/SSL certificate. In order to enable TLS/SSL support,
431              GoAccess requires that --ssl-cert and --ssl-key are used.
432
433              Only if configured using --with-openssl
434
435       --ssl-key=<priv.key>
436              Path to TLS/SSL private key. In order to enable TLS/SSL support,
437              GoAccess requires that --ssl-cert and --ssl-key are used.
438
439              Only if configured using --with-openssl
440
441   FILE OPTIONS
442       -f --log-file=<logfile>
443              Specify  the  path  to  the input log file. If set in the config
444              file, it will take priority over -f from the command line.
445
446       -S --log-size=<bytes>
447              Specify the log size in bytes. This is  useful  when  piping  in
448              logs for processing in which the log size can be explicitly set.
449
450       -l --debug-file=<debugfile>
451              Send all debug messages to the specified file.
452
453       -p --config-file=<configfile>
454              Specify a custom configuration file to use. If set, it will take
455              priority over the global configuration file (if any).
456
457       --invalid-requests=<filename>
458              Log invalid requests to the specified file.
459
460       --no-global-config
461              Do not load the global configuration file. This directory should
462              normally    be    /usr/local/etc,    unless    specified    with
463              --sysconfdir=/dir.  See --dcf option  for  finding  the  default
464              configuration file.
465
466   PARSE OPTIONS
467       -a --agent-list
468              Enable a list of user-agents by host. For faster parsing, do not
469              enable this flag.
470
471       -d --with-output-resolver
472              Enable IP resolver on HTML|JSON output.
473
474       -e --exclude-ip=<IP|IP-range>
475              Exclude an IPv4 or IPv6  from  being  counted.   Ranges  can  be
476              included as well using a dash in between the IPs (start-end).
477
478              Examples:
479                exclude-ip 127.0.0.1
480                exclude-ip 192.168.0.1-192.168.0.100
481                exclude-ip ::1
482                exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
483
484       -H --http-protocol=<yes|no>
485              Set/unset  HTTP request protocol. This will create a request key
486              containing the request protocol + the actual request.
487
488       -M --http-method=<yes|no>
489              Set/unset HTTP request method. This will create  a  request  key
490              containing the request method + the actual request.
491
492       -o --output=<path/file.[json|csv|html]>
493              Write  output to stdout given one of the following files and the
494              corresponding extension for the output format:
495
496                /path/file.csv  - Comma-separated values (CSV)
497                /path/file.json - JSON (JavaScript Object Notation)
498                /path/file.html - HTML
499
500       -q --no-query-string
              Ignore the request's query string, e.g.,
              www.google.com/page.htm?query => www.google.com/page.htm.
503
504              Note: Removing the query string can greatly decrease memory con‐
505              sumption, especially on timestamped requests.
506
507       -r --no-term-resolver
508              Disable IP resolver on terminal output.
509
510       --444-as-404
511              Treat non-standard status code 444 as 404.
512
513       --4xx-to-unique-count
514              Add 4xx client errors to the unique visitors count.
515
516       --accumulated-time
517              Store accumulated processing time from parsing day-by-day logs.
518
519              Only if configured with --enable-tcb=btree
520
521       --anonymize-ip
              Anonymize the client IP address. The IP anonymization option
              sets the last octet of IPv4 user IP addresses and the last 80
              bits of IPv6 addresses to zeros.
              e.g., 192.168.20.100 => 192.168.20.0
              e.g., 2a03:2880:2110:df07:face:b00c::1 => 2a03:2880:2110:df07::
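
              For IPv4, the effect described above amounts to replacing the
              last octet with 0. A minimal sketch (the function name is
              hypothetical, and this is not GoAccess' own code):

```shell
# Zero the last octet of a dotted-quad IPv4 address.
anonymize_ipv4() {
    printf '%s.0\n' "${1%.*}"    # strip the final ".octet", then append ".0"
}

anon=$(anonymize_ipv4 192.168.20.100)
echo "$anon"
```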
527
528       --all-static-files
529              Include  static  files  that  contain  a  query  string.   e.g.,
530              /fonts/fontawesome-webfont.woff?v=4.0.3
531
532       --browsers-file=<path>
              Include an additional delimited list of browsers/crawlers/feeds
              etc. See config/browsers.list for an example or
              https://raw.githubusercontent.com/allinurl/goaccess/master/config/browsers.list
537
538       --date-spec=<date|hr>
539              Set the date specificity to either date (default) or hr to  dis‐
540              play hours appended to the date.
541
              This is used in the visitors panel. It's useful for tracking
              visitors at the hour level. For instance, an hour specificity
              would display traffic as 18/Dec/2010:19.
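
              The hour-level key shown above is the timestamp with minutes
              and seconds dropped. A sketch of that truncation, assuming a
              dd/Mon/yyyy:HH:MM:SS timestamp (the function name is
              hypothetical):

```shell
# Reduce a dd/Mon/yyyy:HH:MM:SS timestamp to its date:hour part.
hour_bucket() {
    printf '%s\n' "${1%:*:*}"    # remove the trailing ":MM:SS"
}

bucket=$(hour_bucket '18/Dec/2010:19:05:33')
echo "$bucket"
```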
545
546       --double-decode
              Decode double-encoded values. This includes the user-agent,
              request, and referer.
549
550       --enable-panel=<PANEL>
551              Enable parsing and displaying the given panel.
552
553              Available panels:
554                VISITORS
555                REQUESTS
556                REQUESTS_STATIC
557                NOT_FOUND
558                HOSTS
559                OS
560                BROWSERS
561                VISIT_TIMES
562                VIRTUAL_HOSTS
563                REFERRERS
564                REFERRING_SITES
565                KEYPHRASES
566                STATUS_CODES
567                REMOTE_USER
568                GEO_LOCATION
569
570       --hide-referer=<NEEDLE>
              Hide a referer but still count it. Wildcards are allowed in the
              needle, e.g., *.bing.com.
573
574       --hour-spec=<hr|min>
575              Set the time specificity to either hour (default) or min to dis‐
576              play the tenth of an hour appended to the hour.
577
578              This is used in the time distribution  panel.  It's  useful  for
579              tracking peaks of traffic on your server at specific times.
580
581       --ignore-crawlers
582              Ignore crawlers from being counted.
583
584       --ignore-panel=<PANEL>
585              Ignore parsing and displaying the given panel.
586
587              Available panels:
588                VISITORS
589                REQUESTS
590                REQUESTS_STATIC
591                NOT_FOUND
592                HOSTS
593                OS
594                BROWSERS
595                VISIT_TIMES
596                VIRTUAL_HOSTS
597                REFERRERS
598                REFERRING_SITES
599                KEYPHRASES
600                STATUS_CODES
601                REMOTE_USER
602
603       --ignore-referer=<referer>
604              Ignore  referers  from  being  counted. Wildcards allowed. e.g.,
605              *.domain.com ww?.domain.*
606
607       --ignore-status=<CODE>
608              Ignore parsing and displaying one or  multiple  status  code(s).
609              For multiple status codes, use this option multiple times.
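
              Repeating the option for several codes can be scripted. The
              sketch below only assembles and prints the command line; the
              status codes and the log file name are arbitrary examples.

```shell
# Build one --ignore-status option per status code.
set --
for code in 400 502; do
    set -- "$@" "--ignore-status=$code"
done

# Print the resulting command line (not executed here).
printf '%s\n' "goaccess -f access.log --log-format=COMBINED $*"
```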
610
611       --num-tests=<number>
612              Number of lines from the access log to test against the provided
613              log/date/time format. By default, the parser is set to  test  10
614              lines.   If  set  to 0, the parser won't test any lines and will
615              parse the  whole  access  log.  If  a  line  matches  the  given
616              log/date/time format before it reaches <number>, the parser will
617              consider the log to be valid,  otherwise  GoAccess  will  return
618              EXIT_FAILURE and display the relevant error messages.
619
620       --process-and-exit
621              Parse  log  and  exit  without outputting data. Useful if we are
622              looking to only add new data to  the  on-disk  database  without
623              outputting to a file or a terminal.
624
625       --real-os
              Display real OS names, e.g., Windows XP, Snow Leopard.
627
628       --sort-panel=<PANEL,FIELD,ORDER>
629              Sort panel on initial load. Sort options are separated by comma.
630              Options are in the form: PANEL,METRIC,ORDER
631
632              Available metrics:
633                BY_HITS     - Sort by hits
634                BY_VISITORS - Sort by unique visitors
635                BY_DATA     - Sort by data
636                BY_BW       - Sort by bandwidth
637                BY_AVGTS    - Sort by average time served
638                BY_CUMTS    - Sort by cumulative time served
639                BY_MAXTS    - Sort by maximum time served
640                BY_PROT     - Sort by http protocol
641                BY_MTHD     - Sort by http method
642
643              Available orders:
644                ASC
645                DESC
646
647       --static-file=<extension>
              Add a static file extension, e.g., .mp3. Extensions are case
              sensitive.
650
651   GEOLOCATION OPTIONS
652       -g --std-geoip
653              Standard GeoIP database for less memory usage.
654
655       --geoip-database=<geofile>
              Specify path to GeoIP database file, e.g., GeoLiteCity.dat.
657
              If using GeoIP2, you will need to download the GeoLite2 City or
              Country database from MaxMind.com and use the option
              --geoip-database to specify the database. Updated database
              files for GeoIP legacy are also available from MaxMind.com as
              GeoLite Legacy Databases. IPv4 and IPv6 files are supported as
              well. For updated DB URLs, please see the default GoAccess
              configuration file.
665
666              Note: --geoip-city-data is an alias of --geoip-database.
667
668   OTHER OPTIONS
669       -h --help
670              The help.
671
672       -s --storage
673              Display current storage method. i.e., B+ Tree, Hash.
674
675       -V --version
676              Display version information and exit.
677
678       --dcf  Display  the  path  of  the default config file when `-p` is not
679              used.
680
681   ON-DISK STORAGE OPTIONS
682       --keep-db-files
683              Persist parsed data into disk. If database  files  exist,  files
684              will  be  overwritten.  This should be set to the first dataset.
685              Setting it to false will delete all database files when  exiting
686              the program. See examples below.
687
688              Only if configured with --enable-tcb=btree
689
690       --load-from-disk
691              Load previously stored data from disk. If reading persisted data
692              only, the database files need to exist.  See  keep-db-files  and
693              examples below.
694
695              Only if configured with --enable-tcb=btree
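
              A possible sequence for processing logs incrementally with
              --keep-db-files and --load-from-disk is sketched below. The log
              names and the database directory are hypothetical; a fixed
              --db-path is assumed so that later runs find the same files.
              The run helper prints each command unless RUN=1 is set.

```shell
# Dry-run helper: prints the command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

DB=/var/lib/goaccess-db    # hypothetical directory for the on-disk database

# First dataset: persist the parsed data to disk.
run goaccess -f access.log.1 --log-format=COMBINED --keep-db-files --db-path="$DB"

# Later dataset: load what was stored, add the new log, keep the files.
run goaccess -f access.log --log-format=COMBINED --load-from-disk --keep-db-files --db-path="$DB"

# Read the persisted data only (the database files need to exist).
run goaccess --load-from-disk --keep-db-files --db-path="$DB"
```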
696
697       --db-path=<dir>
698              Path  where  the  on-disk database files are stored. The default
699              value is the /tmp/goaccess<PID> directory (created on-demand).
700
701              Only if configured with --enable-tcb=btree
702
703       --xmmap=<num>
704              Set the size in bytes of the extra mapped  memory.  The  default
705              value is 0.
706
707              Only if configured with --enable-tcb=btree
708
709       --cache-lcnum=<num>
              Specifies the maximum number of leaf nodes to be cached. If it
              is not more than 0, the default value is used. The default
              value is 1024. A larger value improves speed at the cost of
              higher memory consumption; a lower value reduces memory
              consumption.
715
716              Only if configured with --enable-tcb=btree
717
718       --cache-ncnum=<num>
719              Specifies  the maximum number of non-leaf nodes to be cached. If
720              it is not more than 0,  the  default  value  is  specified.  The
721              default value is 512.
722
723              Only if configured with --enable-tcb=btree
724
725       --tune-lmemb=<num>
726              Specifies  the number of members in each leaf page. If it is not
727              more than 0, the default value is specified. The  default  value
728              is 128.
729
730              Only if configured with --enable-tcb=btree
731
732       --tune-nmemb=<num>
733              Specifies  the number of members in each non-leaf page. If it is
734              not more than 0, the default value  is  specified.  The  default
735              value is 256.
736
737              Only if configured with --enable-tcb=btree
738
739       --tune-bnum=<num>
740              Specifies  the  number of elements of the bucket array. If it is
741              not more than 0, the default value  is  specified.  The  default
742              value is 32749. The suggested size of the bucket array is about
743              1 to 4 times the number of pages to be stored.
744
745              Only if configured with --enable-tcb=btree
746
747       --compression=<zlib|bz2>
748              Specifies that each page is compressed with ZLIB|BZ2 encoding.
749
750              Only if configured with --enable-tcb=btree
751
752

CUSTOM LOG/DATE FORMAT

754       GoAccess can parse virtually any web log format.
755
756       Predefined options include Common Log Format (CLF), Combined Log Format
757       (XLF/ELF) including virtual host, Amazon CloudFront (Download
758       Distribution), Google Cloud Storage and W3C format (IIS).
759
760       GoAccess allows any custom format string as well.
761
762       There are two ways to configure the log format.  The easiest is to  run
763       GoAccess with -c to prompt a configuration window. Otherwise, it can be
764       configured under ~/.goaccessrc or the %sysconfdir%.
765
766       time-format
767              The time-format variable, followed by a space, specifies the log
768              time format, containing any combination of regular characters and
769              special format specifiers. They all begin with a percentage (%)
770              sign. See `man strftime`. e.g., %T or %H:%M:%S.
771
772              Note: If a timestamp is given in microseconds, %f must be used
773              as time-format.
774
775       date-format
776              The date-format variable, followed by a space, specifies the log
777              date format, containing any combination of regular characters and
778              special format specifiers. They all begin with a percentage (%)
779              sign. See `man strftime`. e.g., %Y-%m-%d.
780
781              Note: If a timestamp is given in microseconds, %f must be used
782              as date-format.
783
784       log-format
785              The log-format variable, followed by a space or \t, specifies
786              the log format string.
787
788       %x     A  date  and time field matching the time-format and date-format
789              variables. This is used when given a timestamp  or  the  date  &
790              time  are  concatenated  as a single string (e.g., 1501647332 or
791              20170801235000) instead of the date and time being in two  sepa‐
792              rated variables.
793
794       %t     time field matching the time-format variable.
795
796       %d     date field matching the date-format variable.
797
798       %v     The  canonical  Server  Name  of  the server serving the request
799              (Virtual Host).
800
801       %e     This is the userid of the  person  requesting  the  document  as
802              determined by HTTP authentication.
803
804       %h     host (the client IP address, either IPv4 or IPv6)
805
806       %r     The request line from the client. This requires specific
807              delimiters around the request (single quotes, double quotes, or
808              anything else) to be parsable. If not, use a combination of
809              special format specifiers such as %m %U %H.
810
811       %q     The query string.
812
813       %m     The request method.
814
815       %U     The URL path requested.
816
817              Note: If the query string is in %U, there is no need to use %q.
818              However, if the URL path does not include any query string, you
819              may use %q and the query string will be appended to the request.
820
821       %H     The request protocol.
822
823       %s     The status code that the server sends back to the client.
824
825       %b     The size of the object returned to the client.
826
827       %R     The "Referrer" HTTP request header.
828
829       %u     The user-agent HTTP request header.
830
831       %D     The time taken to serve the request, in microseconds as a  deci‐
832              mal number.
833
834       %T     The  time  taken to serve the request, in seconds with millisec‐
835              onds resolution.
836
837       %L     The time taken to serve the request, in milliseconds as a  deci‐
838              mal number.
839
840       %^     Ignore this field.
841
842       %~     Move forward through the log string until a non-space (!isspace)
843              char is found.
844
845       ~h     The host (the client IP address, either IPv4 or IPv6) in an
846              X-Forwarded-For (XFF) field.
847
848              It uses a special specifier which consists of a tilde before the
849              host specifier, followed by the character(s) that delimit the
850              XFF field, which are enclosed by curly braces (e.g., ~h{," }).
851
852              For  example,  ~h{,"  }  is used in order to parse "11.25.11.53,
853              17.68.33.17" field which is  delimited  by  a  double  quote,  a
854              comma, and a space.
855
856       Note:  In  order to get the average, cumulative and maximum time served
857       in GoAccess, you will need to start logging response times in your  web
858       server. In Nginx you can add $request_time to your log format, or %D in
859       Apache.
860
861       Important: If multiple time served specifiers  are  used  at  the  same
862       time,  the first option specified in the format string will take prior‐
863       ity over the other specifiers.
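
       As a minimal sketch of the note above, assume an Nginx server whose
       access log uses the stock combined layout with $request_time appended.
       The log_format name "timed" and the shell variable timed_format are
       made up for illustration. Since $request_time is expressed in seconds
       with millisecond resolution, the matching specifier is %T:

```shell
# Nginx side (assumed server configuration, shown here only as a comment):
#   log_format timed '$remote_addr - $remote_user [$time_local] "$request" '
#                    '$status $body_bytes_sent "$http_referer" '
#                    '"$http_user_agent" $request_time';
#   access_log /var/log/nginx/access.log timed;

# Matching GoAccess format string: the combined layout plus %T at the end.
# A date-format of %d/%b/%Y and a time-format of %H:%M:%S are assumed.
timed_format='%h %^[%d:%t %^] "%r" %s %b "%R" "%u" %T'

# Print it as a configuration-file line.
printf 'log-format %s\n' "$timed_format"
```

       The printed line could be placed in the configuration file, or the
       string could be passed on the command line with --log-format.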
864
865       GoAccess requires the following fields:
866
867              %h a valid IPv4/6
868
869              %d a valid date
870
871              %r the request
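
       The specifiers above can be combined into a configuration file. The
       following is a sketch only: it writes the three format variables for a
       typical combined log line to a temporary file (the path is
       hypothetical; GoAccess itself reads ~/.goaccessrc) so the result can
       be inspected before it is copied into place.

```shell
# Temporary sample file; copy its contents to ~/.goaccessrc once verified.
conf="$(mktemp)"

# Intended for lines shaped like:
#   10.0.0.1 - - [02/Mar/2016:08:14:04 -0600] "GET /a HTTP/1.1" 200 6715 "-" "curl/7.0"
# %h, %d and %r are the required fields; %^ skips the fields we do not need.
cat > "$conf" <<'EOF'
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF

cat "$conf"
```

       Each variable is followed by a single space and then its value, as
       described for time-format, date-format and log-format.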
872

INTERACTIVE MENU

874       F1 or h
875              Main help.
876
877       F5     Redraw main window.
878
879       q      Quit the program, close the current window, or collapse the active module.
880
881       o or  ENTER
882              Expand selected module or open window
883
884       0-9 and Shift + 0
885              Set selected module to active
886
887       j      Scroll down within expanded module
888
889       k      Scroll up within expanded module
890
891       c      Set or change scheme color.
892
893       TAB    Forward iteration of modules. Starts from current active module.
894
895       SHIFT + TAB
896              Backward iteration of modules. Starts from current  active  mod‐
897              ule.
898
899       ^f     Scroll forward one screen within an active module.
900
901       ^b     Scroll backward one screen within an active module.
902
903       s      Sort options for active module
904
905       /      Search across all modules (regex allowed)
906
907       n      Find the position of the next occurrence across all modules.
908
909       g      Move to the first item or top of screen.
910
911       G      Move to the last item or bottom of screen.
912

EXAMPLES

914       Note: Piping data into GoAccess won't prompt a log/date/time
915       configuration dialog; you will need to define it beforehand in your
916       configuration file or on the command line.
917
918
919   DIFFERENT OUTPUTS
920       To output to a terminal and generate an interactive report:
921
922              # goaccess access.log
923
924       To generate an HTML report:
925
926              # goaccess access.log -a -o report.html
927
928       To generate a JSON report:
929
930              # goaccess access.log -a -d -o report.json
931
932       To generate a CSV file:
933
934              # goaccess access.log --no-csv-summary -o report.csv
935
936       GoAccess  also  allows  great  flexibility  for real-time filtering and
937       parsing. For instance, to quickly diagnose issues  by  monitoring  logs
938       since goaccess was started:
939
940              # tail -f access.log | goaccess -
941
942       Even better, to filter while keeping the pipe open to preserve
943       real-time analysis, we can use tail -f and a pattern-matching tool
944       such as grep, awk, sed, etc.:
945
946              # tail -f access.log | grep -i --line-buffered 'firefox' |
947              goaccess --log-format=COMBINED -
948
949       or to parse from the beginning of the file while keeping the pipe
950       open and applying a filter:
951
952              # tail -f -n +0 access.log | grep -i --line-buffered 'firefox' |
953              goaccess --log-format=COMBINED -o report.html --real-time-html -
954
955   MULTIPLE LOG FILES
956       There are several ways to parse multiple logs with GoAccess.  The  sim‐
957       plest is to pass multiple log files to the command line:
958
959              # goaccess access.log access.log.1
960
961       It's  even  possible  to  parse files from a pipe while reading regular
962       files:
963
964              # cat access.log.2 | goaccess access.log access.log.1 -
965
966       Note that the single dash is appended to the command line to let  GoAc‐
967       cess know that it should read from the pipe.
968
969       Now  if we want to add more flexibility to GoAccess, we can do a series
970       of pipes. For instance, if we would like to process all compressed  log
971       files access.log.*.gz in addition to the current log file, we can do:
972
973              # zcat access.log.*.gz | goaccess access.log -
974
975       Note: On Mac OS X, use gunzip -c instead of zcat.
976
977   REAL TIME HTML OUTPUT
978       GoAccess  has  the ability to output real-time data in the HTML report.
979       You can even email the HTML file since it is composed of a single  file
980       with no external file dependencies, how neat is that!
981
982       The  process  of  generating a real-time HTML report is very similar to
983       the process of creating  a  static  report.  Only  --real-time-html  is
984       needed to make it real-time.
985
986              #  goaccess access.log -o /usr/share/nginx/html/site/report.html
987              --real-time-html
988
989       By default, GoAccess will use the host name of  the  generated  report.
990       Optionally, you can specify the URL to which the client's browser will
991       connect. See https://goaccess.io/faq for a more detailed example.
992
993              # goaccess access.log -o report.html --real-time-html
994              --ws-url=goaccess.io
995
996       By default, GoAccess listens on port 7890. To use a different port,
997       specify it as follows (make sure the port is open):
998
999              #   goaccess   access.log   -o   report.html    --real-time-html
1000              --port=9870
1001
1002       And  to  bind  the  WebSocket  server to a different address other than
1003       0.0.0.0, you can specify it as:
1004
1005              #   goaccess   access.log   -o   report.html    --real-time-html
1006              --addr=127.0.0.1
1007
1008       Note:  To  output real time data over a TLS/SSL connection, you need to
1009       use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
1010
1011   WORKING WITH DATES
1012       Another useful pipe is filtering the web log by date.
1013
1014       The following will get all HTTP requests starting on 05/Dec/2010  until
1015       the end of the file.
1016
1017              # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -
1018
1019       or using relative dates such as one week ago (GNU date syntax):
1020
1021              # sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log
1022              | goaccess -a -
1023
1024       If we want to parse only a certain time-frame from DATE a to DATE b, we
1025       can do:
1026
1027              # sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' access.log | goaccess -a -
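
       Before piping a date range into GoAccess, it can help to check what
       the sed filter actually keeps. The sketch below uses a made-up
       five-line log; note that a sed range ends at the first line matching
       the closing pattern, so only the first request of the end date
       survives, and nothing is printed if the start date never appears.

```shell
# Hypothetical sample log used only to inspect the filter.
log="$(mktemp)"
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Nov/2010:23:59:59 +0000] "GET /a HTTP/1.1" 200 10 "-" "-"
10.0.0.2 - - [05/Nov/2010:00:00:01 +0000] "GET /b HTTP/1.1" 200 10 "-" "-"
10.0.0.3 - - [20/Nov/2010:12:00:00 +0000] "GET /c HTTP/1.1" 200 10 "-" "-"
10.0.0.4 - - [05/Dec/2010:00:00:01 +0000] "GET /d HTTP/1.1" 200 10 "-" "-"
10.0.0.5 - - [05/Dec/2010:09:00:00 +0000] "GET /e HTTP/1.1" 200 10 "-" "-"
EOF

# Keep everything from the first 05/Nov/2010 line through the first
# 05/Dec/2010 line. Slashes are escaped because / delimits the sed address.
filtered="$(sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' "$log")"
printf '%s\n' "$filtered"
```

       Here the requests for /b, /c and /d are kept, while /a (too early) and
       /e (after the range has closed) are dropped. To include the whole end
       date, one option is to close the range on the following day instead.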
1028
1029   VIRTUAL HOSTS
1030       Assume your log contains the virtual host (server blocks) field, for
1031       instance:
1032
1033              vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
1034              /shop/bag-p-20  HTTP/1.1"  200  6715 "-" "Apache (internal dummy
1035              connection)"
1036
1037       and you would like to append the virtual host to the request in order
1038       to see which virtual host the top URLs belong to:
1039
1040              # awk '$8=$1$8' access.log | goaccess -a -
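
       To see what the awk rewrite does to a single entry, the sketch below
       runs it on the sample line above (the variable names are illustrative
       only). Field 1 is the virtual host and field 8 is the URL path, so the
       assignment prefixes the path with the host.

```shell
# The sample entry from above, on one line.
line='vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"'

# $8=$1$8 assigns host+path to field 8; the non-empty result is "true",
# so awk prints the rebuilt line (fields re-joined with single spaces).
rewritten="$(printf '%s\n' "$line" | awk '$8=$1$8')"
printf '%s\n' "$rewritten"
```

       The rewritten line still begins with the virtual host field, so the
       log format in use needs to account for it, for example with %v.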
1041
1042       To exclude a list of virtual hosts you can do the following:
1043
1044              #  grep  -v  "`cat  exclude_vhost_list_file`" vhost_access.log |
1045              goaccess -
1046
1047   FILES & STATUS CODES
1048       To parse specific pages, e.g., page views, html, htm, php, etc.  within
1049       a request:
1050
1051              # awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -
1052
1053       Note: $7 is the request field for the common and combined log formats
1054       (without Virtual Host). If your log includes the Virtual Host, you
1055       probably want to use $8 instead. It's best to check which field you are
1056       targeting, e.g.:
1057
1058              # tail -10 access.log | awk '{print $8}'
1059
1060       Or to parse a specific status code, e.g., 500 (Internal Server Error):
1061
1062              # awk '$9~/500/' access.log | goaccess -
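
       The field-based filters in this section can be tried on a small,
       made-up log first. This sketch assumes the common/combined layout
       without a virtual host, so $7 is the URL path and $9 the status code.
       It uses a stricter variant of the patterns: the extension match is
       anchored to the end of the path, and the status is compared as a
       number rather than as a regular expression.

```shell
# Hypothetical three-line sample log.
log="$(mktemp)"
cat > "$log" <<'EOF'
10.0.0.1 - - [02/Mar/2016:08:14:04 -0600] "GET /index.html HTTP/1.1" 200 6715 "-" "-"
10.0.0.2 - - [02/Mar/2016:08:14:05 -0600] "GET /api/items HTTP/1.1" 500 120 "-" "-"
10.0.0.3 - - [02/Mar/2016:08:14:06 -0600] "POST /login.php HTTP/1.1" 302 0 "-" "-"
EOF

# Requests whose path ends in .html, .htm or .php (dots escaped so they
# match a literal dot instead of any character).
pages="$(awk '$7 ~ /\.(html|htm|php)$/' "$log")"

# Requests answered with exactly status 500.
errors="$(awk '$9 == 500' "$log")"

printf 'pages:\n%s\nerrors:\n%s\n' "$pages" "$errors"
```

       Anchoring with $ means a path carrying a query string, such as
       /login.php?next=/, would not match; drop the anchor if those requests
       should be included.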
1063
1064   SERVER
1065       Also, it is worth pointing out that if we want to run GoAccess at lower
1066       priority, we can run it as:
1067
1068              # nice -n 19 goaccess -f access.log -a
1069
1070       and  if  you don't want to install it on your server, you can still run
1071       it from your local machine:
1072
1073              # ssh root@server 'cat /var/log/apache2/access.log'  |  goaccess
1074              -a -
1075
1076   INCREMENTAL LOG PROCESSING
1077       GoAccess  has the ability to process logs incrementally through the on-
1078       disk B+Tree database. It works in the following way:
1079
1080
1081       1  A dataset must be persisted first  with  --keep-db-files,  then  the
1082          same dataset can be loaded with --load-from-disk.
1083
1084       2  If  new data is passed (piped or through a log file), it will append
1085          it to the original dataset.
1086
1087       3  To preserve the data at all times, --keep-db-files must be used.
1088
1089       4  If --load-from-disk is used without --keep-db-files, database  files
1090          will be deleted upon closing the program.
1091
1092       For instance:
1093
1094              // last month access log
1095              goaccess access.log.1 --keep-db-files
1096
1097       then, load it with:
1098
1099              // append this month access log, and preserve new data
1100              goaccess access.log --load-from-disk --keep-db-files
1101
1102       To read persisted data only (without parsing new data):
1103
1104              goaccess --load-from-disk --keep-db-files
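
       The workflow above could be wrapped in two small shell functions, for
       instance for use from a scheduled job. This is a sketch under stated
       assumptions: the function names, the GOACCESS override and the DB_DIR
       location are hypothetical, and --db-path is used so that every run
       looks in the same directory instead of the per-process default.

```shell
# Command to run; override it (e.g. GOACCESS=echo) for a dry run.
GOACCESS="${GOACCESS:-goaccess}"
# Hypothetical fixed location for the on-disk database files.
DB_DIR="${DB_DIR:-/var/lib/goaccess-db}"

# First run: parse the initial log and persist the dataset.
seed_dataset() {
    "$GOACCESS" "$1" --keep-db-files --db-path="$DB_DIR"
}

# Later runs: load the stored data, append the new log, keep the result.
append_dataset() {
    "$GOACCESS" "$1" --load-from-disk --keep-db-files --db-path="$DB_DIR"
}
```

       A first call such as seed_dataset access.log.1 would be followed by
       append_dataset access.log on later runs. Each log should be passed
       only once, since parsing the same file again with both options set
       counts its entries twice.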
1105

NOTES

1107       Each active panel has a total of 366 items, or 50 in the real-time HTML
1108       report. The number of items is customizable using max-items. However,
1109       only the CSV and JSON output allow a maximum number greater than the
1110       default value of 366 items per panel.
1111
1112       When analyzing the same log file twice using  the  on-disk  B+Tree  and
1113       using  --keep-db-files  and --load-from-disk on each run, GoAccess will
1114       count each entry twice. Issue #334 will address this issue.
1115
1116       A hit is a request (line in the access log), e.g.,  10  requests  =  10
1117       hits.  HTTP requests with the same IP, date, and user agent are consid‐
1118       ered a unique visit.
1119

BUGS

1121       If you think you have found a bug, please send me an email to
1122       goaccess@prosoftcorp.com or use the issue tracker at
1123       https://github.com/allinurl/goaccess/issues
1124

AUTHOR

1126       Gerardo Orellana <goaccess@prosoftcorp.com>. For more details about it,
1127       or new releases, please visit https://goaccess.io
1128
1129
1130
1131Linux                            NOVEMBER 2018                     goaccess(1)