goaccess(1)                      User Manuals                      goaccess(1)

NAME

6       goaccess - fast web log analyzer and interactive viewer.
7

SYNOPSIS

9       goaccess [filename] [options...] [-c][-M][-H][-q][-d][...]
10

DESCRIPTION

       GoAccess is an open source real-time web log analyzer and interactive
       viewer that runs in a terminal on *nix systems or through your
       browser.
15
16       It provides fast and valuable HTTP statistics for system administrators
17       that require a visual server report on the fly.
18
19       GoAccess parses the specified web log file and outputs the data to  the
20       X terminal. Features include:
21
22
23       General Statistics:
              This panel gives a summary of several metrics, such as the
              number of valid and invalid requests, time taken to analyze the
              dataset, unique visitors, requested files, static files (CSS,
              ICO, JPG, etc.), HTTP referrers, 404s, size of the parsed log
              file and bandwidth consumption.
29
30       Unique visitors
31              This panel shows metrics such as hits, unique visitors and cumu‐
32              lative bandwidth per date. HTTP requests containing the same IP,
33              the  same  date, and the same user agent are considered a unique
34              visitor. By default, it includes web crawlers/spiders.
35
36              Optionally, date specificity can be set to the hour level  using
37              --date-spec=hr  which will display dates such as 05/Jun/2016:16,
38              or to the minute  level  producing  05/Jun/2016:16:59.  This  is
39              great  if  you  want  to track your daily traffic at the hour or
40              minute level.
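
              The visitor definition above can be sketched in a few lines of
              Python. This is only an illustration of the counting rule (same
              IP, same date, same user agent), not GoAccess's implementation;
              the function name, the tuple layout and the sample requests are
              assumptions made for the example. Passing an hour-level format
              mimics what --date-spec=hr does to the date key.

```python
from collections import defaultdict
from datetime import datetime


def count_unique_visitors(requests, date_fmt="%d/%b/%Y"):
    """Count distinct (IP, user agent) pairs per date bucket.

    `requests` is an iterable of (ip, timestamp, user_agent) tuples, where
    timestamp is a datetime. This layout is hypothetical and chosen only
    for the example.
    """
    seen = defaultdict(set)
    for ip, ts, agent in requests:
        # The bucket key plays the role of the "date" in the definition.
        seen[ts.strftime(date_fmt)].add((ip, agent))
    return {bucket: len(pairs) for bucket, pairs in seen.items()}


# Made-up sample data.
requests = [
    ("10.0.0.1", datetime(2016, 6, 5, 16, 59), "Firefox"),
    ("10.0.0.1", datetime(2016, 6, 5, 17, 1), "Firefox"),  # same visitor, same day
    ("10.0.0.1", datetime(2016, 6, 5, 17, 2), "curl"),     # different user agent
    ("10.0.0.2", datetime(2016, 6, 6, 9, 0), "Firefox"),   # different IP and day
]

per_day = count_unique_visitors(requests)
per_hour = count_unique_visitors(requests, "%d/%b/%Y:%H")
```

              With the date-level key the first two requests collapse into one
              visitor; with the hour-level key they land in different buckets
              and are counted once in each.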
41
42       Requested files
43              This panel displays the most  requested  (non-static)  files  on
44              your  web  server.  It shows hits, unique visitors, and percent‐
45              age, along with the cumulative bandwidth, protocol, and the  re‐
46              quest method used.
47
48       Requested static files
              Lists the most frequently requested static files such as JPG,
              CSS, SWF, JS, GIF, and PNG file types, along with the same
              metrics as the last panel. Additional static files can be added
              to the configuration file.
53
54       404 or Not Found
              Displays the same metrics as the previous request panels.
              However, its data contains all pages that were not found on the
              server, commonly known as the 404 status code.
58
59       Hosts  This panel has detailed information  on  the  hosts  themselves.
60              This  is  great for spotting aggressive crawlers and identifying
61              who's eating your bandwidth.
62
63              Expanding the panel can display more information such as  host's
64              reverse DNS lookup result, country of origin and city. If the -a
65              argument is enabled, a list of user agents can be  displayed  by
66              selecting the desired IP address, and then pressing ENTER.
67
68       Operating Systems
69              This panel will report which operating system the host used when
70              it hit the server. It attempts to provide the most specific ver‐
71              sion of each operating system.
72
73       Browsers
74              This  panel  will report which browser the host used when it hit
75              the server. It attempts to provide the most specific version  of
76              each browser.
77
78       Visit Times
79              This  panel  will display an hourly report. This option displays
80              24 data points, one for each hour of the day.
81
              Optionally, hour specificity can be set to the tenth of an hour
              level using --hour-spec=min, which will display hours as 16:4.
              This is great if you want to spot peaks of traffic on your
              server.
86
87       Virtual Hosts
88              This  panel  will display all the different virtual hosts parsed
89              from the access log. This panel  is  displayed  if  %v  is  used
90              within the log-format string.
91
92       Referrers URLs
              If the host in question accessed the site via another resource,
              or was linked/diverted to you from another host, the URL they
              were referred from will be provided in this panel. Disabled by
              default. See `--ignore-panel` in your configuration file to
              enable it.
98
99       Referring Sites
              This panel displays only the host part of the URL the request
              came from, not the whole URL.
102
103       Keyphrases
              It reports keyphrases used on Google search, Google cache, and
              Google translate that have led to your web server. At present,
              it only supports Google search queries via HTTP. Disabled by
              default. See `--ignore-panel` in your configuration file to
              enable it.
109
110       Geo Location
111              Determines where an IP address is geographically  located.  Sta‐
112              tistics are broken down by continent and country. It needs to be
113              compiled with GeoLocation support.
114
115       HTTP Status Codes
              The numeric status codes returned for HTTP requests.
117
118       Remote User (HTTP authentication)
              This is the userid of the person requesting the document as
              determined by HTTP authentication. If the document is not
              password protected, this value will be "-". This panel is not
              enabled unless %e is given within the log-format variable.
124
125       Cache Status
126              If you are using caching on your server, you may be at the point
127              where  you  want  to  know  if  your request is being cached and
128              served from the cache. This panel shows the cache status of  the
129              object the server served. This panel is not enabled unless %C is
              given within the log-format variable. The status can be one of
              `MISS`, `BYPASS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED`
              or `HIT`.
133
134       MIME Types
135              This  panel specifies Media Types (formerly known as MIME types)
136              and Media Subtypes which will be assigned and listed underneath.
137              This panel is not enabled unless %M is given within the log-for‐
138              mat   variable.   See    https://www.iana.org/assignments/media-
139              types/media-types.xhtml for more details.
140
141       Encryption Settings
              This panel shows the SSL/TLS protocol used along with the cipher
              suites. This panel is not enabled unless %K is given within the
              log-format variable.
145
146
147       NOTE:  Optionally and if configured, all panels can display the average
148       time taken to serve the request.
149
150

STORAGE

152       There are three storage options that can be used with GoAccess.  Choos‐
153       ing one will depend on your environment and needs.
154
155       Default Hash Tables
156              In-memory  storage  provides  better  performance at the cost of
157              limiting the dataset size to the amount  of  available  physical
158              memory.  GoAccess  uses  in-memory hash tables. It has very good
159              memory usage and pretty good performance. This storage has  sup‐
160              port for on-disk persistence.
161

CONFIGURATION

163       Multiple  options can be used to configure GoAccess. For a complete up-
164       to-date list of configure options, run ./configure --help
165
166       --enable-debug
167              Compile with debugging symbols and turn off  compiler  optimiza‐
168              tions.
169
170       --enable-utf8
171              Compile with wide character support. Ncursesw is required.
172
173       --enable-geoip=<legacy|mmdb>
174              Compile  with  GeoLocation support. MaxMind's GeoIP is required.
175              legacy will utilize the original  GeoIP  databases.   mmdb  will
176              utilize the enhanced GeoIP2 databases.
177
178       --with-getline
179              Dynamically  expands line buffer in order to parse full line re‐
180              quests instead of using a fixed size buffer of 4096.
181
182       --with-openssl
183              Compile GoAccess with OpenSSL support for its WebSocket server.
184

OPTIONS

186       The following options can be supplied to the command  or  specified  in
187       the  configuration  file.  If specified in the configuration file, long
188       options need to be used without prepending --  and  without  using  the
189       equal sign =.
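
       The rule above (drop the leading -- and replace the equal sign with a
       space) can be sketched as a small Python helper. The function name is
       hypothetical and the helper only covers long options of the form
       --name=value; how flags without a value are written in the
       configuration file is not covered here.

```python
def cli_to_config_line(option):
    """Turn a long option '--name=value' into a config line 'name value'.

    Hypothetical helper for illustration only.
    """
    if not option.startswith("--"):
        raise ValueError("only long options can be used in the configuration file")
    name, sep, value = option[2:].partition("=")
    if not name or not sep:
        raise ValueError("this sketch only covers options of the form --name=value")
    # No '--' prefix and a space instead of '='.
    return f"{name} {value}"


line = cli_to_config_line("--log-format=COMBINED")
```

       For instance, --exclude-ip=127.0.0.1 on the command line corresponds
       to the line exclude-ip 127.0.0.1 in the configuration file.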
190
191   LOG/DATE/TIME FORMAT
192       --time-format=<timeformat>
193              The  time-format variable followed by a space, specifies the log
194              format time containing either a name of a predefined format (see
195              options below) or any combination of regular characters and spe‐
196              cial format specifiers.
197
              The format specifiers all begin with a percentage (%) sign. See
              `man strftime`. e.g., %T or %H:%M:%S.
200
201              Note  that  if  a timestamp is given in microseconds, %f must be
202              used as time-format.  If the timestamp is given in  milliseconds
203              %* must be used as time-format.
204
205       --date-format=<dateformat>
              The date-format variable followed by a space, specifies the log
              format date containing either a name of a predefined format (see
              options below) or any combination of regular characters and
              special format specifiers.

              The format specifiers all begin with a percentage (%) sign. See
              `man strftime`. e.g., %Y-%m-%d.
213
214              Note  that  if  a timestamp is given in microseconds, %f must be
215              used as date-format.  If the timestamp is given in  milliseconds
216              %* must be used as date-format.
217
218       --datetime-format=<date_time_format>
219              The  date and time format combines the two variables into a sin‐
220              gle option. This gives the ability to get the  timezone  from  a
221              request  and  convert  it  to  another  timezone for output. See
222              --tz=<timezone>
223
224              They all begin with a percentage (%) sign. See  `man  strftime`.
225              e.g., %d/%b/%Y:%H:%M:%S %z.
226
227              Note that if --datetime-format is used, %x must be passed in the
228              log-format variable to represent the date and time field.
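
              The example format above can be tried out with Python's
              datetime.strptime, which accepts the same strftime-style
              specifiers used in it. This is only an illustration of the
              format string and of converting a parsed timestamp to another
              timezone (as --tz does for the report output). It is not how
              GoAccess parses logs, and the sample timestamp is made up.

```python
from datetime import datetime, timezone

# The format string from the example above.
DATETIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_log_timestamp(text):
    """Parse a timestamp such as '05/Jun/2016:16:59:00 +0200'."""
    return datetime.strptime(text, DATETIME_FORMAT)


stamp = parse_log_timestamp("05/Jun/2016:16:59:00 +0200")

# Because %z captured the offset, the value can be moved to another zone.
as_utc = stamp.astimezone(timezone.utc)
utc_text = as_utc.strftime(DATETIME_FORMAT)
```

              Without %z in the format there is no offset to convert from,
              which is why the timezone has to be part of the date/time
              format.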
229
230       --log-format=<logformat>
231              The log-format variable followed by a space or \t for tab-delim‐
232              ited, specifies the log format string.
233
234              Note  that  if  there  are  spaces within the format, the string
235              needs to be enclosed in single/double quotes. Inner quotes  need
236              to be escaped.
237
238              In  addition  to  specifying  the raw log/date/time formats, for
239              simplicity, any of the following predefined log format names can
240              be  supplied to the log/date/time-format variables. GoAccess can
241              also handle one predefined name in one variable and another pre‐
242              defined name in another variable.
243
244                COMBINED     - Combined Log Format,
245                VCOMBINED    - Combined Log Format with Virtual Host,
246                COMMON       - Common Log Format,
247                VCOMMON      - Common Log Format with Virtual Host,
248                W3C          - W3C Extended Log File Format,
249                SQUID        - Native Squid Log Format,
250                CLOUDFRONT   - Amazon CloudFront Web Distribution,
251                CLOUDSTORAGE - Google Cloud Storage,
252                AWSELB       - Amazon Elastic Load Balancing,
253                AWSS3        - Amazon Simple Storage Service (S3)
254                AWSALB       - Amazon Application Load Balancer
255                CADDY        - Caddy's JSON Structured format
256
              Note: Piping data into GoAccess won't prompt a log/date/time
              configuration dialog. You will need to define the formats
              beforehand in your configuration file or on the command line.
260
261   USER INTERFACE OPTIONS
262       -c --config-dialog
263              Prompt log/time/date configuration window on program start. Only
264              when curses is initialized.
265
266       -i --hl-header
267              Color highlight active terminal panel.
268
269       -m --with-mouse
270              Enable mouse support on main terminal dashboard.
271
       --color=<fg:bg[attrs, PANEL]>
273              Specify custom colors for the terminal output.
274
275              Color Syntax
276                DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
277
278               FG# = foreground color [-1...255] (-1 = default term color)
279               BG# = background color [-1...255] (-1 = default term color)
280
281              Optionally, it is possible to apply color  attributes  (multiple
282              attributes  are comma separated), such as: bold, underline, nor‐
283              mal, reverse, blink
284
285              If desired, it is possible to apply  custom  colors  per  panel,
286              that is, a metric in the REQUESTS panel can be of color A, while
287              the same metric in the BROWSERS panel can be of color B.
288
289              Available color definitions:
290                COLOR_MTRC_HITS
291                COLOR_MTRC_VISITORS
292                COLOR_MTRC_DATA
293                COLOR_MTRC_BW
294                COLOR_MTRC_AVGTS
295                COLOR_MTRC_CUMTS
296                COLOR_MTRC_MAXTS
297                COLOR_MTRC_PROT
298                COLOR_MTRC_MTHD
299                COLOR_MTRC_HITS_PERC
300                COLOR_MTRC_HITS_PERC_MAX
301                COLOR_MTRC_VISITORS_PERC
302                COLOR_MTRC_VISITORS_PERC_MAX
303                COLOR_PANEL_COLS
304                COLOR_BARS
305                COLOR_ERROR
306                COLOR_SELECTED
307                COLOR_PANEL_ACTIVE
308                COLOR_PANEL_HEADER
309                COLOR_PANEL_DESC
310                COLOR_OVERALL_LBLS
311                COLOR_OVERALL_VALS
312                COLOR_OVERALL_PATH
313                COLOR_ACTIVE_LABEL
314                COLOR_BG
315                COLOR_DEFAULT
316                COLOR_PROGRESS
317
318              See configuration file for a sample color scheme.
319
320       --color-scheme=<1|2|3>
321              Choose among color schemes.  1 for the default grey  scheme.   2
322              for  the  green scheme.  3 for the Monokai scheme (shown only if
323              terminal supports 256 colors).
324
325       --crawlers-only
326              Parse and display only crawlers (bots).
327
328       --html-custom-css=<path/custom.css>
329              Specifies a custom CSS file path to load in the HTML report.
330
331       --html-custom-js=<path/custom.js>
332              Specifies a custom JS file path to load in the HTML report.
333
334       --html-report-title=<title>
335              Set HTML report page title and header.
336
337       --html-refresh=<secs>
338              Refresh the HTML report every X seconds. The value has to be be‐
339              tween  1  and 60 seconds. The default is set to refresh the HTML
340              report every 1 second.
341
342       --html-prefs=<JSON>
343              Set HTML report default preferences. Supply a valid JSON  object
344              containing  the  HTML preferences. It allows the ability to cus‐
345              tomize each panel plot. See example below.
346
347              Note: The JSON object passed needs to be a one line JSON string.
348              For instance,
349
350              --html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
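
              Since the JSON object has to be passed as a one line string, it
              can be convenient to generate the argument rather than type it.
              A minimal Python sketch, reusing the preference keys from the
              example above (the helper name is hypothetical):

```python
import json
import shlex

# Preferences copied from the example above.
prefs = {
    "theme": "bright",
    "perPage": 5,
    "layout": "horizontal",
    "showTables": True,
    "visitors": {"plot": {"chartType": "bar"}},
}


def html_prefs_argument(prefs):
    """Build a shell-safe --html-prefs argument from a dict."""
    # Compact separators keep the JSON on a single line without spaces.
    one_line = json.dumps(prefs, separators=(",", ":"))
    # shlex.quote wraps the JSON so the shell passes it through unchanged.
    return "--html-prefs=" + shlex.quote(one_line)


argument = html_prefs_argument(prefs)
```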
351
352       --json-pretty-print
353              Format JSON output using tabs and newlines.
354
              Note: This is not recommended when outputting a real-time HTML
              report since the WebSocket payload will be much larger.
357
358       --max-items=<number>
359              The maximum number of items to display per  panel.  The  maximum
360              can be a number between 1 and n.
361
362              Note:  Only  the  CSV  and  JSON  output  allow a maximum number
363              greater than the default value of 366 (or 50  in  the  real-time
364              HTML output) items per panel.
365
366       --no-color
367              Turn off colored output. This is the default output on terminals
368              that do not support colors.
369
370       --no-column-names
371              Don't write column names in the terminal output. By default,  it
372              displays column names for each available metric in every panel.
373
374       --no-csv-summary
375              Disable summary metrics on the CSV output.
376
377       --no-progress
378              Disable progress metrics [total requests/requests per second].
379
380       --no-tab-scroll
381              Disable  scrolling  through panels when TAB is pressed or when a
382              panel is selected using a numeric key.
383
384       --no-html-last-updated
385              Do not show the last updated field displayed in the HTML  gener‐
386              ated report.
387
388       --no-parsing-spinner
              Do not show the progress metrics and parsing spinner.
390
391       --tz=<timezone>
              Outputs the report date/time data in the given timezone. Note
              that it uses the canonical timezone name, e.g., Europe/Berlin,
              America/Chicago or Africa/Cairo. If an invalid timezone name is
              given, the output will be in GMT. See --datetime-format in order
              to properly specify a timezone in the date/time format.
397
398   SERVER OPTIONS
       Note: This is just a WebSocket server to provide the raw real-time
       data. It is not a web server itself. To access your report's HTML
       file, you will still need your own HTTP server: place the generated
       report in its document root directory and open the HTML file in your
       browser. The browser will then open another WebSocket connection to
       the WebSocket server you may set up here, to keep the dashboard
       up-to-date.
405
406       --addr Specify IP address to bind the server to. Otherwise it binds  to
407              0.0.0.0.
408
409              Usually  there is no need to specify the address, unless you in‐
410              tentionally would like to bind the server to a different address
411              within your server.
412
413       --daemonize
414              Run GoAccess as daemon (only if --real-time-html enabled).
415
416              Note:  It's important to make use of absolute paths across GoAc‐
417              cess' configuration.
418
419       --user-name=<username>
420              Run GoAccess as the specified user.
421
              Note: It's important to ensure the user or the user's group can
              access the input and output files as well as any other files
              needed. Other groups the user belongs to will be ignored. As
              such, it's advised to run GoAccess behind an SSL proxy as it's
              unlikely this user can access the SSL certificates.
427
428       --origin=<url>
429              Ensure clients send the specified origin header  upon  the  Web‐
430              Socket handshake.
431
432       --pid-file=<path/goaccess.pid>
              Write the daemon PID to a file when used along with the
              --daemonize option.
435
436       --port=<port>
437              Specify the port to use. By default GoAccess'  WebSocket  server
438              listens on port 7890.
439
440       --real-time-html
441              Enable real-time HTML output.
442
              GoAccess uses its own WebSocket server to push the data from the
              server to the client. See http://gwsocket.io for more details on
              how the WebSocket server works.
446
447       --ws-url=<[scheme://]url[:port]>
448              URL to which the WebSocket server responds. This is the URL sup‐
449              plied to the WebSocket constructor on the client side.
450
451              Optionally, it is possible to specify the WebSocket URI  scheme,
452              such  as  ws://  or wss:// for unencrypted and encrypted connec‐
453              tions. e.g., wss://goaccess.io
454
455              If GoAccess is running behind a proxy, you could set the  client
456              side  to connect to a different port by specifying the host fol‐
457              lowed by a colon and the port.  e.g., goaccess.io:9999
458
459              By default, it will attempt to connect to the generated report's
460              hostname. If GoAccess is running on a remote server, the host of
461              the remote server should be specified here. Also, make  sure  it
462              is a valid host and NOT an http address.
463
464       --ping-interval=<secs>
465              Enable  WebSocket  ping with specified interval in seconds. This
466              helps prevent idle connections getting disconnected.
467
468       --fifo-in=<path/file>
              Creates a named pipe (FIFO) that reads from the given path/file.
471
472       --fifo-out=<path/file>
473              Creates a named pipe (FIFO) that writes to the given path/file.
474
475       --ssl-cert=<cert.crt>
476              Path to TLS/SSL certificate. In order to enable TLS/SSL support,
477              GoAccess requires that --ssl-cert and --ssl-key are used.
478
479              Only if configured using --with-openssl
480
481       --ssl-key=<priv.key>
482              Path to TLS/SSL private key. In order to enable TLS/SSL support,
483              GoAccess requires that --ssl-cert and --ssl-key are used.
484
485              Only if configured using --with-openssl
486
487   FILE OPTIONS
488       -      The log file to parse is read from stdin.
489
490       -f --log-file=<logfile>
491              Specify  the  path  to  the input log file. If set in the config
492              file, it will take priority over -f from the command line.
493
494       -S --log-size=<bytes>
495              Specify the log size in bytes. This is  useful  when  piping  in
496              logs for processing in which the log size can be explicitly set.
497
498       -l --debug-file=<debugfile>
499              Send all debug messages to the specified file.
500
501       -p --config-file=<configfile>
502              Specify a custom configuration file to use. If set, it will take
503              priority over the global configuration file (if any).
504
505       --invalid-requests=<filename>
506              Log invalid requests to the specified file.
507
508       --unknowns-log=<filename>
509              Log unknown browsers and OSs to the specified file.
510
511       --no-global-config
512              Do not load the global configuration file. This directory should
513              normally    be    /usr/local/etc,    unless    specified    with
514              --sysconfdir=/dir.  See --dcf option  for  finding  the  default
515              configuration file.
516
517   PARSE OPTIONS
518       -a --agent-list
519              Enable a list of user-agents by host. For faster parsing, do not
520              enable this flag.
521
522       -d --with-output-resolver
523              Enable IP resolver on HTML|JSON output.
524
525       -e --exclude-ip=<IP|IP-range>
526              Exclude an IPv4 or IPv6 from being counted.  Ranges can  be  in‐
527              cluded as well using a dash in between the IPs (start-end).
528
529              Examples:
530                exclude-ip 127.0.0.1
531                exclude-ip 192.168.0.1-192.168.0.100
532                exclude-ip ::1
533                exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
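
              The range syntax above (a single address, or start-end) can be
              modelled with Python's ipaddress module. This is a sketch of the
              matching rule only, under the assumption that ranges are
              inclusive at both ends; the function names are hypothetical and
              this is not GoAccess's own parser.

```python
import ipaddress


def parse_exclusion(spec):
    """Return (first, last) addresses for 'IP' or 'start-end'."""
    start, sep, end = spec.partition("-")
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end) if sep else first
    # Both ends must be the same IP version and in ascending order.
    if first.version != last.version or first > last:
        raise ValueError(f"invalid range: {spec}")
    return first, last


def is_excluded(ip, specs):
    """True if `ip` falls inside any of the exclusion specs (inclusive)."""
    addr = ipaddress.ip_address(ip)
    for spec in specs:
        first, last = parse_exclusion(spec)
        if addr.version == first.version and first <= addr <= last:
            return True
    return False


# The values from the examples above.
specs = [
    "127.0.0.1",
    "192.168.0.1-192.168.0.100",
    "::1",
    "0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808",
]
```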
534
535       -H --http-protocol=<yes|no>
536              Set/unset  HTTP request protocol. This will create a request key
537              containing the request protocol + the actual request.
538
539       -M --http-method=<yes|no>
540              Set/unset HTTP request method. This will create  a  request  key
541              containing the request method + the actual request.
542
543       -o --output=<path/file.[json|csv|html]>
544              Write  output to stdout given one of the following files and the
545              corresponding extension for the output format:
546
547                /path/file.csv - Comma-separated values (CSV)
548                /path/file.json - JSON (JavaScript Object Notation)
549                /path/file.html - HTML
550
551       -q --no-query-string
552              Ignore        request's        query        string.        i.e.,
553              www.google.com/page.htm?query => www.google.com/page.htm.
554
555              Note: Removing the query string can greatly decrease memory con‐
556              sumption, especially on timestamped requests.
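
              Why this saves memory can be seen with a small Python sketch:
              requests that differ only in their query string collapse into a
              single key once the query is removed. The sample requests are
              made up and the helper name is hypothetical.

```python
from collections import Counter


def strip_query(request):
    """Drop everything from the first '?' onwards."""
    return request.split("?", 1)[0]


# Made-up requests that differ only by a timestamp in the query string.
requests = ["/page.htm?t=1465142340", "/page.htm?t=1465142341", "/page.htm"]

with_query = Counter(requests)                            # one key per variant
without_query = Counter(strip_query(r) for r in requests)  # a single key
```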
557
558       -r --no-term-resolver
559              Disable IP resolver on terminal output.
560
561       --444-as-404
562              Treat non-standard status code 444 as 404.
563
564       --4xx-to-unique-count
565              Add 4xx client errors to the unique visitors count.
566
567       --anonymize-ip
              Anonymize the client IP address. The IP anonymization option
              sets the last octet of IPv4 user IP addresses and the last 80
              bits of IPv6 addresses to zeros. e.g.,

                192.168.20.100 => 192.168.20.0
                2a03:2880:2110:df07:face:b00c::1 => 2a03:2880:2110:df07::
573
574       --anonymize-level
575              Specifies the anonymization levels: 1 => default, 2 => strong, 3
576              => pedantic.
577
              ┌─────────────┬─────────┬─────────┬─────────┐
              │ Bits-hidden │ Level 1 │ Level 2 │ Level 3 │
              ├─────────────┼─────────┼─────────┼─────────┤
              │ IPv4        │ 8       │ 16      │ 24      │
              ├─────────────┼─────────┼─────────┼─────────┤
              │ IPv6        │ 64      │ 80      │ 96      │
              └─────────────┴─────────┴─────────┴─────────┘
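
              The effect of each level can be sketched by zeroing the number
              of low-order bits listed in the table. The Python below is an
              illustration built from the table values only, not GoAccess's
              code, and the function name is hypothetical.

```python
import ipaddress

# Bits hidden per IP version and level, taken from the table.
BITS_HIDDEN = {
    4: {1: 8, 2: 16, 3: 24},
    6: {1: 64, 2: 80, 3: 96},
}


def anonymize(ip, level=1):
    """Zero the low-order bits of `ip` according to the given level."""
    addr = ipaddress.ip_address(ip)
    hidden = BITS_HIDDEN[addr.version][level]
    # Shifting right and back left clears the lowest `hidden` bits.
    masked = (int(addr) >> hidden) << hidden
    return str(type(addr)(masked))


v4_default = anonymize("192.168.20.100")
v6_default = anonymize("2a03:2880:2110:df07:face:b00c::1")
```

              With level 1, the two addresses from the --anonymize-ip examples
              come out as 192.168.20.0 and 2a03:2880:2110:df07::.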
585
586       --all-static-files
587              Include   static  files  that  contain  a  query  string.  e.g.,
588              /fonts/fontawesome-webfont.woff?v=4.0.3
589
590       --browsers-file=<path>
              By default GoAccess parses an "essential/basic" curated list of
              browsers & crawlers. If you need to add additional browsers, use
              this option. Include an additional delimited list of
              browsers/crawlers/feeds etc. See config/browsers.list for an
              example or
              https://raw.githubusercontent.com/allinurl/goaccess/master/config/browsers.list
597
598       --date-spec=<date|hr|min>
599              Set the date specificity to either date (default), hr to display
600              hours or min to display minutes appended to the date.
601
              This is used in the visitors panel. It's useful for tracking
              visitors at the hour level. For instance, an hour specificity
              would display traffic as 18/Dec/2010:19, and a minute
              specificity as 18/Dec/2010:19:59.
606
607       --double-decode
              Decode double-encoded values. This includes user-agent, request,
              and referrer.
610
611       --enable-panel=<PANEL>
612              Enable parsing and displaying the given panel.
613
614              Available panels:
615                VISITORS
616                REQUESTS
617                REQUESTS_STATIC
618                NOT_FOUND
619                HOSTS
620                OS
621                BROWSERS
622                VISIT_TIMES
623                VIRTUAL_HOSTS
624                REFERRERS
625                REFERRING_SITES
626                KEYPHRASES
627                STATUS_CODES
628                REMOTE_USER
629                CACHE_STATUS
630                GEO_LOCATION
631                MIME_TYPE
632                TLS_TYPE
633
634       --fname-as-vhost=<regex>
635              Use log filename(s) as virtual host(s). POSIX regex is passed to
636              extract  the  virtual  host from the filename. e.g., --fname-as-
637              vhost='[a-z]*.[a-z]*' can be used to extract awesome.com.log  =>
638              awesome.com.
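
              The example regex can be checked with a short Python sketch.
              Note the assumptions: Python's re module is not a POSIX regex
              engine (for this pattern both give the same match), the pattern
              is applied to the base name of the file, and the helper name and
              the /var/log path are made up for illustration.

```python
import posixpath
import re


def vhost_from_filename(path, pattern):
    """Return the first match of `pattern` in the file's base name, or None."""
    match = re.search(pattern, posixpath.basename(path))
    return match.group(0) if match else None


# Pattern taken from the example above.
vhost = vhost_from_filename("/var/log/awesome.com.log", "[a-z]*.[a-z]*")
```

              Because the unescaped dot matches any character, a stricter
              pattern such as '[a-z]*\.[a-z]*' may be preferable for file
              names that contain other separators.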
639
640       --hide-referrer=<NEEDLE>
              Hide a referrer but still count it. Wildcards are allowed in
              the needle, e.g., *.bing.com.
643
644       --hour-spec=<hr|min>
645              Set the time specificity to either hour (default) or min to dis‐
646              play the tenth of an hour appended to the hour.
647
648              This  is  used  in  the time distribution panel. It's useful for
649              tracking peaks of traffic on your server at specific times.
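
              One way to read the tenth-of-an-hour label (16:4 in the Visit
              Times description) is that the last digit of the minute is
              dropped, so 16:40 through 16:49 share one bucket. The Python
              sketch below follows that reading. It is an assumption based on
              the example, and both the function name and the two-digit label
              used for the hour level are hypothetical.

```python
def visit_time_bucket(hour, minute, spec="hr"):
    """Return the bucket label for a request time.

    spec="hr"  -> one bucket per hour
    spec="min" -> one bucket per ten minutes (tens digit of the minute)
    """
    if spec == "hr":
        return f"{hour:02d}"
    if spec == "min":
        return f"{hour:02d}:{minute // 10}"
    raise ValueError(f"unknown specificity: {spec}")


label = visit_time_bucket(16, 45, "min")
```

              Under this reading the hour level yields 24 data points and the
              min level yields 144, six per hour.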
650
651       --ignore-crawlers
652              Ignore crawlers from being counted.
653
654       --unknowns-as-crawlers
655              Classify unknown OS and browsers as crawlers.
656
657       --ignore-panel=<PANEL>
658              Ignore parsing and displaying the given panel.
659
660              Available panels:
661                VISITORS
662                REQUESTS
663                REQUESTS_STATIC
664                NOT_FOUND
665                HOSTS
666                OS
667                BROWSERS
668                VISIT_TIMES
669                VIRTUAL_HOSTS
670                REFERRERS
671                REFERRING_SITES
672                KEYPHRASES
673                STATUS_CODES
674                REMOTE_USER
675                CACHE_STATUS
676                GEO_LOCATION
677                MIME_TYPE
678                TLS_TYPE
679
680       --ignore-referrer=<referrer>
              Ignore referrers from being counted. Wildcards allowed. e.g.,
              *.domain.com ww?.domain.*
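
              The two wildcard examples can be explored with Python's fnmatch
              module, where * matches any run of characters and ? matches a
              single character. This is an approximation for illustration:
              fnmatch also understands [seq] patterns, which may not apply
              here, and the host names below are made up.

```python
from fnmatch import fnmatchcase

# Patterns taken from the example above.
PATTERNS = ["*.domain.com", "ww?.domain.*"]


def is_ignored(referrer_host, patterns=PATTERNS):
    """True if the host matches any of the wildcard patterns."""
    return any(fnmatchcase(referrer_host, p) for p in patterns)


hits = [h for h in ("www.domain.com", "ww2.domain.net", "example.org")
        if is_ignored(h)]
```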
683
684       --ignore-statics=<req|panel>
685              Ignore static file requests.
686
              req
                Only ignore static file requests from valid requests.

              panel
                Ignore static file requests from panels.

                Note that it will count them towards the total number of
                requests.
695
696       --ignore-status=<CODE>
697              Ignore parsing and displaying one or  multiple  status  code(s).
698              For multiple status codes, use this option multiple times.
699
700       --keep-last=<num_days>
701              Keep the last specified number of days in storage. This will re‐
702              cycle the storage tables. e.g., keep &  show  only  the  last  7
703              days.
704
705       --no-ip-validation
              Disable client IP validation. Useful if IP addresses have been
              obfuscated before being logged. The log still needs to contain
              a placeholder for %h; usually it's a resolved IP, e.g.,
              ord37s19-in-f14.1e100.net.
710
711       --no-strict-status
712              Disable HTTP status code validation. Some servers  would  record
713              this  value  only  if a connection was established to the target
714              and the target sent a response.  Otherwise, it could be recorded
715              as -.
716
717       --num-tests=<number>
718              Number of lines from the access log to test against the provided
719              log/date/time format. By default, the parser is set to  test  10
720              lines.  If  set  to  0, the parser won't test any lines and will
721              parse the  whole  access  log.  If  a  line  matches  the  given
722              log/date/time format before it reaches <number>, the parser will
723              consider the log to be valid;  otherwise  GoAccess  will  return
724              EXIT_FAILURE and display the relevant error messages.
725
726       --process-and-exit
727              Parse  log  and  exit  without outputting data. Useful if we are
728              looking to only add new data to  the  on-disk  database  without
729              outputting to a file or a terminal.
730
731       --real-os
732              Display real OS names, e.g., Windows XP, Snow Leopard.
733
734       --sort-panel=<PANEL,METRIC,ORDER>
735              Sort panel on initial load. Sort options are separated by commas.
736              Options are in the form: PANEL,METRIC,ORDER
737
738              Available metrics:
739                BY_HITS     - Sort by hits
740                BY_VISITORS - Sort by unique visitors
741                BY_DATA     - Sort by data
742                BY_BW       - Sort by bandwidth
743                BY_AVGTS    - Sort by average time served
744                BY_CUMTS    - Sort by cumulative time served
745                BY_MAXTS    - Sort by maximum time served
746                BY_PROT     - Sort by http protocol
747                BY_MTHD     - Sort by http method
748
749              Available orders:
750                ASC
751                DESC
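
A minimal sketch of assembling the option value from its three parts. The chosen panel, metric and order are arbitrary examples, and it is an assumption that the panel names match those listed under --ignore-panel.

```shell
#!/bin/sh
# Hypothetical choices: sort the REQUESTS panel by bandwidth, descending.
PANEL=REQUESTS
METRIC=BY_BW
ORDER=DESC

# The three parts are joined by commas, in the form PANEL,METRIC,ORDER.
SORT_OPT="--sort-panel=${PANEL},${METRIC},${ORDER}"

# Print the resulting command line instead of running it.
echo "goaccess access.log ${SORT_OPT}"
```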
752
753       --static-file=<extension>
754              Add a static file extension, e.g., .mp3. Extensions  are  case-
755              sensitive.
756
757   GEOLOCATION OPTIONS
758       -g --std-geoip
759              Standard GeoIP database for less memory usage.
760
761       --geoip-database=<geofile>
762              Specify the path to the GeoIP database file, e.g., GeoLiteCity.dat.
763
764              If  using GeoIP2, you will need to download the GeoLite2 City or
765              Country database from MaxMind.com and use the  --geoip-database
766              option  to  specify  it. Updated database files for GeoIP legacy
767              are available from  MaxMind.com  as  GeoLite  Legacy  Databases.
768              IPv4 and IPv6 files are supported as well.
769              For  updated  DB  URLs,  please  see  the  default  GoAccess
770              configuration file.
771
772              Note: --geoip-city-data is an alias of --geoip-database.
773
774   OTHER OPTIONS
775       -h --help
776              Display the help and exit.
777
778       -s --storage
779              Display the current storage method, e.g., B+ Tree, Hash.
780
781       -V --version
782              Display version information and exit.
783
784       --dcf  Display  the  path  of  the default config file when `-p` is not
785              used.
786
787   PERSISTENCE STORAGE OPTIONS
788       --persist
789              Persist parsed data to disk. If database  files  exist,  files
790              will  be  overwritten.  This should be set to the first dataset.
791              See examples below.
792
793       --restore
794              Load previously stored data from disk. If reading persisted data
795              only,  the database files need to exist. See --persist and exam‐
796              ples below.
797
798       --db-path=<dir>
799              Path where the on-disk database files are  stored.  The  default
800              value is the /tmp directory.
801
802

CUSTOM LOG/DATE FORMAT

804       GoAccess can parse virtually any web log format.
805
806       Predefined options include Common Log Format (CLF), Combined Log  For‐
807       mat (XLF/ELF), including virtual host, Amazon CloudFront (Download Dis‐
808       tribution), Google Cloud Storage and W3C format (IIS).
809
810       GoAccess allows any custom format string as well.
811
812       There  are two ways to configure the log format.  The easiest is to run
813       GoAccess with -c to prompt a configuration window. Otherwise, it can be
814       configured under ~/.goaccessrc or the %sysconfdir%.
815
816       time-format
817              The  time-format variable followed by a space, specifies the log
818              format time containing any combination of regular characters and
819              special format specifiers.  They all begin with a percentage (%)
820              sign. See `man strftime`. e.g., %T or %H:%M:%S.
821
822              Note: If a timestamp is given in microseconds, %f must  be  used
823              as time-format or %* if the timestamp is given in milliseconds.
824
825       date-format
826              The  date-format variable followed by a space, specifies the log
827              format date containing any combination of regular characters and
828              special  format specifiers. They all begin with a percentage (%)
829              sign. See `man strftime`. e.g., %Y-%m-%d.
830
831              Note: If a timestamp is given in microseconds, %f must  be  used
832              as date-format or %* if the timestamp is given in milliseconds.
833
834       log-format
835              The  log-format  variable,  followed by a space or \t, specifies
836              the log format string.
837
838       %x     A date and time field matching the time-format  and  date-format
839              variables.  This  is  used  when given a timestamp or the date &
840              time are concatenated as a single string  (e.g.,  1501647332  or
841              20170801235000)  instead of the date and time being in two sepa‐
842              rate variables.
843
844       %t     time field matching the time-format variable.
845
846       %d     date field matching the date-format variable.
847
848       %v     The canonical Server Name of  the  server  serving  the  request
849              (Virtual Host).
850
851       %e     This  is the userid of the person requesting the document as de‐
852              termined by HTTP authentication.
853
854       %C     The cache status of the object the server served.
855
856       %h     host (the client IP address, either IPv4 or IPv6)
857
858       %r     The request line from the client. This requires specific  delim‐
859              iters around the request (such as single quotes, double  quotes,
860              or anything else) to be parsable. If not, use a  combination  of
861              special format specifiers such as %m %U %H.
862
863       %q     The query string.
864
865       %m     The request method.
866
867       %U     The URL path requested.
868
869              Note:  If the query string is in %U, there is no need to use %q.
870              However, if the URL path does not include any query string, you
871              may use %q and the query string will be appended to the request.
872
873       %H     The request protocol.
874
875       %s     The status code that the server sends back to the client.
876
877       %b     The size of the object returned to the client.
878
879       %R     The "Referrer" HTTP request header.
880
881       %u     The user-agent HTTP request header.
882
883       %K     The  TLS  encryption  settings  chosen  for  the connection. (In
884              Apache LogFormat: %{SSL_PROTOCOL}x)
885
886       %k     The TLS encryption  settings  chosen  for  the  connection.  (In
887              Apache LogFormat: %{SSL_CIPHER}x)
888
889       %M     The  MIME-type  of the requested resource. (In Apache LogFormat:
890              %{Content-Type}o)
891
892       %D     The time taken to serve the request, in microseconds as a  deci‐
893              mal number.
894
895       %T     The  time  taken to serve the request, in seconds with millisec‐
896              onds resolution.
897
898       %L     The time taken to serve the request, in milliseconds as a  deci‐
899              mal number.
900
901       %^     Ignore this field.
902
903       %~     Move forward through the log string until a non-space (!isspace)
904              char is found.
905
906       ~h     The host (the client IP address, either IPv4 or IPv6) in  an  X-
907              Forwarded-For (XFF) field.
908
909              It uses a special specifier which consists of a tilde before the
910              host specifier, followed by the character(s)  that  delimit  the
911              XFF field, which are enclosed by curly braces, e.g., ~h{, }
912
913              For example, "~h{, }" is used to parse the "11.25.11.53,
914              17.68.33.17" field, which is delimited by a comma  and  a  space
915              (and enclosed by double quotes).
916
917
918              ┌─────────────────────────────────────┬───────────┐
919              │ XFF field                           │ specifier │
920              ├─────────────────────────────────────┼───────────┤
921              │ "192.1.2.3, 192.68.33.1, 192.1.1.2" │ "~h{, }"  │
922              ├─────────────────────────────────────┼───────────┤
923              │ "192.1.2.12","192.68.33.17"         │ ~h{", }   │
924              ├─────────────────────────────────────┼───────────┤
925              │ 192.1.2.12, 192.68.33.17            │ ~h{, }    │
926              ├─────────────────────────────────────┼───────────┤
927              │ 192.1.2.1 192.68.33.1 192.1.1.2     │ ~h{ }     │
928              └─────────────────────────────────────┴───────────┘
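
To make the role of the delimiter characters concrete, the following sketch splits an XFF field on the characters given inside the curly braces. It only illustrates the splitting idea; `split_xff` is a hypothetical helper and does not reproduce how GoAccess itself selects the client IP.

```shell
#!/bin/sh
# split_xff FIELD DELIMS
# Hypothetical helper: turn every delimiter character into a newline and
# drop the empty lines left by consecutive delimiters.
split_xff() {
    printf '%s\n' "$1" | tr "$2" '[\n*]' | grep -v '^$'
}

# Comma and space as delimiters, as in ~h{, }
split_xff '192.1.2.12, 192.68.33.17' ', '

# Double quote, comma and space as delimiters, as in ~h{", }
split_xff '"192.1.2.12","192.68.33.17"' '", '
```

Both calls print the two addresses on separate lines, which shows why quoted fields need the double quote included in the delimiter set.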
929
930
931       Note: In order to get the average, cumulative and maximum  time  served
932       in  GoAccess, you will need to start logging response times in your web
933       server. In Nginx you can add $request_time to your log format, or %D in
934       Apache.
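
As a hedged sketch, suppose Nginx's $request_time (seconds with millisecond resolution) is appended to an otherwise standard combined log line. That layout, the sample data and the variable name TIMED_FORMAT are assumptions for illustration; the matching time-served specifier is then %T.

```shell
#!/bin/sh
# Made-up log line: combined format with $request_time appended at the end
# (an assumed layout; adjust to your own server configuration).
cat > sample_timed.log <<'EOF'
10.0.0.1 - - [05/Jun/2016:16:59:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0" 0.123
EOF

# %T matches a time served in seconds with millisecond resolution.
TIMED_FORMAT='%h %^[%d:%t %^] "%r" %s %b "%R" "%u" %T'

# Show the trailing field, i.e. the value %T would consume.
awk '{ print $NF }' sample_timed.log
```

For an Apache log that records %D instead, the last specifier would be %D, since that value is in microseconds.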
935
936       Important:  If  multiple  time  served  specifiers are used at the same
937       time, the first option specified in the format string will take  prior‐
938       ity over the other specifiers.
939
940       GoAccess requires the following fields:
941
942              %h a valid IPv4/6
943
944              %d a valid date
945
946              %r the request
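
The specifiers above can be combined as in the following sketch, which pairs a made-up Combined Log Format line with a format string covering the required %h, %d and %r fields. The file names and sample data are assumptions, and GoAccess is only invoked if it is found on the system.

```shell
#!/bin/sh
# Made-up sample line in Combined Log Format.
cat > sample_access.log <<'EOF'
192.168.0.1 - - [05/Jun/2016:16:59:01 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# %h host, %^ ignored fields, %d:%t date and time, "%r" quoted request,
# %s status, %b size, "%R" referrer, "%u" user agent.
LOG_FORMAT='%h %^[%d:%t %^] "%r" %s %b "%R" "%u"'
DATE_FORMAT='%d/%b/%Y'
TIME_FORMAT='%H:%M:%S'

# Only run GoAccess when it is installed; report.html is a hypothetical name.
if command -v goaccess >/dev/null 2>&1; then
    goaccess sample_access.log --log-format="$LOG_FORMAT" \
        --date-format="$DATE_FORMAT" --time-format="$TIME_FORMAT" \
        -o report.html
else
    echo "goaccess not found; format string: $LOG_FORMAT"
fi
```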
947

INTERACTIVE MENU

949       F1 or h
950              Main help.
951
952       F5     Redraw main window.
953
954       q      Quit the program, current window or collapse active module
955
956       o or ENTER
957              Expand selected module or open window
958
959       0-9 and Shift + 0
960              Set selected module to active
961
962       j      Scroll down within expanded module
963
964       k      Scroll up within expanded module
965
966       c      Set or change scheme color.
967
968       TAB    Forward iteration of modules. Starts from current active module.
969
970       SHIFT + TAB
971              Backward  iteration  of modules. Starts from current active mod‐
972              ule.
973
974       ^f     Scroll forward one screen within an active module.
975
976       ^b     Scroll backward one screen within an active module.
977
978       s      Sort options for active module
979
980       /      Search across all modules (regex allowed)
981
982       n      Find the position of the next occurrence across all modules.
983
984       g      Move to the first item or top of screen.
985
986       G      Move to the last item or bottom of screen.
987

EXAMPLES

989       Note: Piping data into GoAccess won't prompt a log/date/time configura‐
990       tion dialog; you will need to define it beforehand in your configuration
991       file or on the command line.
992
993
994   DIFFERENT OUTPUTS
995       To output to a terminal and generate an interactive report:
996
997              # goaccess access.log
998
999       To generate an HTML report:
1000
1001              # goaccess access.log -a -o report.html
1002
1003       To generate a JSON report:
1004
1005              # goaccess access.log -a -d -o report.json
1006
1007       To generate a CSV file:
1008
1009              # goaccess access.log --no-csv-summary -o report.csv
1010
1011       GoAccess also allows great  flexibility  for  real-time  filtering  and
1012       parsing.  For  instance,  to quickly diagnose issues by monitoring logs
1013       since goaccess was started:
1014
1015              # tail -f access.log | goaccess -
1016
1017       And even better, to filter while keeping a pipe  open  to  preserve
1018       real-time  analysis,  we can make use of tail -f and a pattern-matching
1019       tool such as grep, awk, sed, etc.:
1020
1021              # tail -f access.log | grep -i --line-buffered 'firefox' | goac‐
1022              cess --log-format=COMBINED -
1023
1024       or  to  parse from the beginning of the file while keeping the pipe
1025       open and applying a filter:
1026
1027              # tail -f -n +0 access.log | grep -i --line-buffered 'firefox' |
1028              goaccess --log-format=COMBINED -o report.html --real-time-html -
1029
1030       or  to convert the log date timezone to a different timezone, e.g., Eu‐
1031       rope/Berlin
1032
1033              # goaccess access.log --log-format='%h %^[%x] "%r"  %s  %b  "%R"
1034              "%u"'    --datetime-format='%d/%b/%Y:%H:%M:%S    %z'    --tz=Eu‐
1035              rope/Berlin --date-spec=min
1036
1037   MULTIPLE LOG FILES
1038       There are several ways to parse multiple logs with GoAccess.  The  sim‐
1039       plest is to pass multiple log files to the command line:
1040
1041              # goaccess access.log access.log.1
1042
1043       It's  even  possible  to  parse files from a pipe while reading regular
1044       files:
1045
1046              # cat access.log.2 | goaccess access.log access.log.1 -
1047
1048       Note that the single dash is appended to the command line to let  GoAc‐
1049       cess know that it should read from the pipe.
1050
1051       Now  if we want to add more flexibility to GoAccess, we can do a series
1052       of pipes. For instance, if we would like to process all compressed  log
1053       files access.log.*.gz in addition to the current log file, we can do:
1054
1055              # zcat access.log.*.gz | goaccess access.log -
1056
1057       Note: On Mac OS X, use gunzip -c instead of zcat.
1058
1059   REAL TIME HTML OUTPUT
1060       GoAccess  has  the ability to output real-time data in the HTML report.
1061       You can even email the HTML file since it is composed of a single  file
1062       with no external file dependencies, how neat is that!
1063
1064       The  process  of  generating a real-time HTML report is very similar to
1065       the process of creating  a  static  report.  Only  --real-time-html  is
1066       needed to make it real-time.
1067
1068              #  goaccess access.log -o /usr/share/nginx/html/site/report.html
1069              --real-time-html
1070
1071       By default, GoAccess will use the host name of  the  generated  report.
1072       Optionally,  you can specify the URL to which the client's browser will
1073       connect. See https://goaccess.io/faq for a more detailed example.
1074
1075              # goaccess  access.log  -o  report.html  --real-time-html  --ws-
1076              url=goaccess.io
1077
1078       By  default,  GoAccess  listens  on  port 7890. To use a different port,
1079       you can specify it as follows (make sure the port is open):
1080
1081              #   goaccess   access.log   -o   report.html    --real-time-html
1082              --port=9870
1083
1084       And  to  bind  the  WebSocket  server  to  an  address  other  than
1085       0.0.0.0, you can specify it as:
1086
1087              #   goaccess   access.log   -o   report.html    --real-time-html
1088              --addr=127.0.0.1
1089
1090       Note:  To  output real time data over a TLS/SSL connection, you need to
1091       use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
1092
1093   WORKING WITH DATES
1094       Another useful pipe would be filtering dates out of the web log.
1095
1096       The following will get all HTTP requests starting on 05/Dec/2010  until
1097       the end of the file.
1098
1099              # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -
1100
1101       or using relative dates, e.g., starting one week ago:
1102
1103              #  sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log
1104              | goaccess -a -
1105
1106       If we want to parse only a certain time-frame from DATE a to DATE b, we
1107       can do:
1108
1109              # sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
1110
1111       If we want to preserve only a certain amount of data and recycle storage,
1112       we can keep only a certain number of days. For instance to keep &  show
1113       the last 5 days:
1114
1115              # goaccess access.log --keep-last=5
1116
1117   VIRTUAL HOSTS
1118       Assume your log contains the virtual host (server blocks) field,  for
1119       instance:
1120
1121              vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
1122              /shop/bag-p-20  HTTP/1.1"  200  6715 "-" "Apache (internal dummy
1123              connection)"
1124
1125       And you would like to append the virtual host to the request  in  order
1126       to see which virtual host the top URLs belong to:
1127
1128              # awk '$8=$1$8' access.log | goaccess -a -
1129
1130       To exclude a list of virtual hosts you can do the following:
1131
1132              #  grep  -v  "`cat  exclude_vhost_list_file`" vhost_access.log |
1133              goaccess -
1134
1135   FILES & STATUS CODES
1136       To parse specific pages, e.g., page views, html, htm, php, etc.  within
1137       a request:
1138
1139              # awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -
1140
1141       Note: $7 is the request field for the common and combined log  formats
1142       (without Virtual Host). If your log includes Virtual  Host,  then  you
1143       probably want to use $8 instead. It's best to check which field you are
1144       shooting for, e.g.:
1145
1146              # tail -10 access.log | awk '{print $8}'
1147
1148       Or to parse a specific status code, e.g., 500 (Internal Server Error):
1149
1150              # awk '$9~/500/' access.log | goaccess -
1151
1152   SERVER
1153       Also, it is worth pointing out that if we want to run GoAccess at lower
1154       priority, we can run it as:
1155
1156              # nice -n 19 goaccess -f access.log -a
1157
1158       and  if  you don't want to install it on your server, you can still run
1159       it from your local machine:
1160
1161              # ssh -n root@server  'tail  -f  /var/log/apache2/access.log'  |
1162              goaccess -
1163
1164       Note:  SSH requires -n so GoAccess can read from stdin. Also, make sure
1165       to use SSH keys for authentication as it won't work if a passphrase  is
1166       required.
1167
1168   INCREMENTAL LOG PROCESSING
1169       GoAccess  has the ability to process logs incrementally through its in‐
1170       ternal storage and dump its data to disk. It  works  in  the  following
1171       way:
1172
1173
1174       1  A  dataset  must  be  persisted  first with --persist, then the same
1175          dataset can be loaded with --restore.
1176
1177       2  If new data is passed (piped or through a log file),  it
1178          will append it to the original dataset.
1179
1180
1181       NOTES
1182
1183       GoAccess  keeps  track  of  inodes of all the files processed (assuming
1184       files will stay on the same partition),  in  addition,  it  extracts  a
1185       snippet  of  data  from the log along with the last line parsed of each
1186       file  and  the  timestamp  of  the  last   line   parsed.   e.g.,   in‐
1187       ode:29627417|line:20012|ts:20171231235059
1188
1189       First, it checks whether the snippet matches the log being parsed.  If
1190       it  does, it assumes the log hasn't changed dramatically, e.g., hasn't
1191       been truncated. If the inode does not match the current file, it parses
1192       all lines. If the current file matches the inode, it  then  reads  the
1193       remaining lines and updates the count of lines parsed and the timestamp.
1194       As an extra precaution, it won't parse log lines with a timestamp  less
1195       than or equal to the one stored.
1196
1197       Piped data works based off the timestamp of the last line read. For in‐
1198       stance, it will parse and discard all incoming entries until it finds a
1199       timestamp greater than or equal to the one stored.
1200
1201
1202       For instance:
1203
1204              // last month access log
1205              # goaccess access.log.1 --persist
1206
1207       then, load it with
1208
1209              // append this month access log, and preserve new data
1210              # goaccess access.log --restore --persist
1211
1212       To read persisted data only (without parsing new data)
1213
1214              # goaccess --restore
1215

NOTES

1217       Each  active panel has a total of 366 items or 50 in the real-time HTML
1218       report. The number of items is customizable using max-items. Note that
1219       HTML,  CSV  and JSON output allow a maximum number greater than the de‐
1220       fault value of 366 items per panel.
1221
1222       A hit is a request (line in the access log), e.g.,  10  requests  =  10
1223       hits.  HTTP requests with the same IP, date, and user agent are consid‐
1224       ered a unique visit.
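
The difference between hits and unique visits can be approximated outside GoAccess with a short awk sketch. It assumes Combined Log Format input and made-up sample data; `count_visits` is a hypothetical helper, and the result is only a rough cross-check, not GoAccess's own computation.

```shell
#!/bin/sh
# Made-up sample: three requests from one client over two days, plus one
# request from a second client.
cat > sample_visits.log <<'EOF'
10.0.0.1 - - [05/Jun/2016:16:59:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.1 - - [05/Jun/2016:17:02:10 +0000] "GET /about HTTP/1.1" 200 256 "-" "Mozilla/5.0"
10.0.0.1 - - [06/Jun/2016:09:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.2 - - [05/Jun/2016:18:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/7.47.0"
EOF

# count_visits FILE: every line is a hit; a visit is a distinct
# (IP, date, user agent) combination.
count_visits() {
    awk -F'"' '
    {
        split($1, head, " ")            # head[1] = IP, head[4] = "[dd/Mon/yyyy:..."
        date = substr(head[4], 2, 11)   # keep only dd/Mon/yyyy
        key = head[1] SUBSEP date SUBSEP $6   # $6 = user agent
        if (!(key in seen)) { seen[key] = 1; visits++ }
        hits++
    }
    END { printf "hits=%d visits=%d\n", hits, visits }
    ' "$1"
}

count_visits sample_visits.log   # prints: hits=4 visits=3
```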
1225
1226       The generated report will attempt to reconnect to the WebSocket  server
1227       after  1 second with exponential backoff. It will attempt to connect 20
1228       times.
1229

BUGS

1231       If you think you have found a bug, please send me an email at
1232       goaccess@prosoftcorp.com or use the issue tracker at
1233       https://github.com/allinurl/goaccess/issues
1234

AUTHOR

1236       Gerardo Orellana <hello@goaccess.io>. For more details about it, or new
1237       releases, please visit https://goaccess.io
1238
1239
1240
1241GNU+Linux                        DECEMBER 2022                     goaccess(1)