goaccess(1)                      User Manuals                      goaccess(1)

NAME

6       goaccess - fast web log analyzer and interactive viewer.
7

SYNOPSIS

9       goaccess [filename] [options...] [-c][-M][-H][-q][-d][...]
10

DESCRIPTION

       GoAccess is an open source real-time web log analyzer and interactive
       viewer that runs in a terminal on *nix systems or through your
       browser.
15
16       It provides fast and valuable HTTP statistics for system administrators
17       that require a visual server report on the fly.
18
19       GoAccess parses the specified web log file and outputs the data to  the
20       X terminal. Features include:
21
22
23       General Statistics:
              This panel gives a summary of several metrics, such as the
              number of valid and invalid requests, time taken to analyze the
              dataset, unique visitors, requested files, static files (CSS,
              ICO, JPG, etc.), HTTP referrers, 404s, size of the parsed log
              file and bandwidth consumption.
29
30       Unique visitors
31              This panel shows metrics such as hits, unique visitors and cumu‐
32              lative bandwidth per date. HTTP requests containing the same IP,
33              the  same  date, and the same user agent are considered a unique
34              visitor. By default, it includes web crawlers/spiders.
35
              Optionally, date specificity can be set to the hour level using
              --date-spec=hr, which displays dates such as 05/Jun/2016:16, or
              to the minute level using --date-spec=min, which displays dates
              such as 05/Jun/2016:16:59. This is useful if you want to track
              your daily traffic at the hour or minute level.
41
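              The unique-visitor rule described above can be sketched as
              follows. This is only an illustration in Python, not GoAccess's
              own code; the helper name unique_visitors_per_date and the
              sample hits are hypothetical.

```python
from collections import defaultdict

def unique_visitors_per_date(hits):
    """Count unique visitors per date.

    `hits` is an iterable of (ip, date, user_agent) tuples.
    """
    seen = defaultdict(set)
    for ip, date, user_agent in hits:
        # Requests sharing IP, date and user agent collapse into one visitor.
        seen[date].add((ip, user_agent))
    return {date: len(visitors) for date, visitors in seen.items()}

# Hypothetical sample data
hits = [
    ("192.0.2.1", "05/Jun/2016", "Mozilla/5.0"),
    ("192.0.2.1", "05/Jun/2016", "Mozilla/5.0"),  # same visitor again
    ("192.0.2.1", "05/Jun/2016", "curl/7.88"),    # same IP, different agent
    ("192.0.2.2", "06/Jun/2016", "Mozilla/5.0"),
]
print(unique_visitors_per_date(hits))
```

              Here the first two hits count once, while the third counts
              separately because its user agent differs.
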
42       Requested files
43              This panel displays the most  requested  (non-static)  files  on
44              your  web  server.  It shows hits, unique visitors, and percent‐
45              age, along with the cumulative bandwidth, protocol, and the  re‐
46              quest method used.
47
48       Requested static files
              Lists the most frequently requested static files, such as JPG,
              CSS, SWF, JS, GIF, and PNG file types, along with the same
              metrics as the previous panel. Additional static file types can
              be added to the configuration file.
53
54       404 or Not Found
              Displays the same metrics as the previous request panels.
              However, its data contains all pages that were not found on the
              server, commonly known as the 404 status code.
58
59       Hosts  This panel has detailed information  on  the  hosts  themselves.
60              This  is  great for spotting aggressive crawlers and identifying
61              who's eating your bandwidth.
62
63              Expanding the panel can display more information such as  host's
64              reverse DNS lookup result, country of origin and city. If the -a
65              argument is enabled, a list of user agents can be  displayed  by
66              selecting the desired IP address, and then pressing ENTER.
67
68       Operating Systems
69              This panel will report which operating system the host used when
70              it hit the server. It attempts to provide the most specific ver‐
71              sion of each operating system.
72
73       Browsers
74              This  panel  will report which browser the host used when it hit
75              the server. It attempts to provide the most specific version  of
76              each browser.
77
78       Visit Times
79              This  panel  will display an hourly report. This option displays
80              24 data points, one for each hour of the day.
81
              Optionally, hour specificity can be set to the tenth of an hour
              level using --hour-spec=min, which will display hours as 16:4.
              This is great if you want to spot peaks of traffic on your
              server.
86
87       Virtual Hosts
88              This  panel  will display all the different virtual hosts parsed
89              from the access log. This panel  is  displayed  if  %v  is  used
90              within the log-format string.
91
       Referrer URLs
              If the host in question accessed the site via another resource,
              or was linked/diverted to you from another host, the URL they
              were referred from will be provided in this panel. See
              `--ignore-panel` in your configuration file to enable it.
              Disabled by default.
98
99       Referring Sites
              This panel displays only the host part of the URL the request
              came from, not the whole URL.
102
103       Keyphrases
              It reports keyphrases used on Google search, Google cache, and
              Google translate that have led to your web server. At present,
              it only supports Google search queries via HTTP. See
              `--ignore-panel` in your configuration file to enable it.
              Disabled by default.
109
110       Geo Location
111              Determines where an IP address is geographically  located.  Sta‐
112              tistics are broken down by continent and country. It needs to be
113              compiled with GeoLocation support.
114
115       HTTP Status Codes
              The numeric status codes returned in response to HTTP requests.
117
118       ASN    This panel displays ASN (Autonomous  System  Numbers)  data  for
119              GeoIP2 and legacy databases. Great for detecting malicious traf‐
120              fic and blocking accordingly.
121
122       Remote User (HTTP authentication)
              This is the userid of the person requesting the document as
              determined by HTTP authentication. If the document is not
              password protected, this part will be "-". This panel is not
              enabled unless %e is given within the log-format variable.
128
129       Cache Status
              If you are using caching on your server, you may want to know
              whether your requests are being cached and served from the
              cache. This panel shows the cache status of the object the
              server served. This panel is not enabled unless %C is given
              within the log-format variable. The status can be either
              `MISS`, `BYPASS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED`
              or `HIT`.
137
138       MIME Types
139              This  panel specifies Media Types (formerly known as MIME types)
140              and Media Subtypes which will be assigned and listed underneath.
141              This panel is not enabled unless %M is given within the log-for‐
142              mat   variable.   See    https://www.iana.org/assignments/media-
143              types/media-types.xhtml for more details.
144
145       Encryption Settings
              This panel shows the SSL/TLS protocol used along with the
              cipher suites. This panel is not enabled unless %K is given
              within the log-format variable.
149
150
151       NOTE:  Optionally and if configured, all panels can display the average
152       time taken to serve the request.
153
154

STORAGE

156       There are three storage options that can be used with GoAccess.  Choos‐
157       ing one will depend on your environment and needs.
158
159       Default Hash Tables
160              In-memory  storage  provides  better  performance at the cost of
161              limiting the dataset size to the amount  of  available  physical
162              memory.  GoAccess  uses  in-memory hash tables. It has very good
163              memory usage and pretty good performance. This storage has  sup‐
164              port for on-disk persistence.
165

CONFIGURATION

167       Multiple  options can be used to configure GoAccess. For a complete up-
168       to-date list of configure options, run ./configure --help
169
170       --enable-debug
171              Compile with debugging symbols and turn off  compiler  optimiza‐
172              tions.
173
174       --enable-utf8
175              Compile with wide character support. Ncursesw is required.
176
177       --enable-geoip=<legacy|mmdb>
178              Compile  with  GeoLocation support. MaxMind's GeoIP is required.
179              legacy will utilize the original  GeoIP  databases.   mmdb  will
180              utilize the enhanced GeoIP2 databases.
181
182       --with-getline
183              Dynamically  expands line buffer in order to parse full line re‐
184              quests instead of using a fixed size buffer of 4096.
185
186       --with-openssl
187              Compile GoAccess with OpenSSL support for its WebSocket server.
188

OPTIONS

190       The following options can be supplied to the command  or  specified  in
191       the  configuration  file.  If specified in the configuration file, long
192       options need to be used without prepending --  and  without  using  the
193       equal sign =.
194
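       The rule above for moving a long option into the configuration file
       can be sketched as a small text transformation. This is a minimal
       illustration in Python; the helper name to_config_line is hypothetical
       and it only covers options given as --name=value.

```python
def to_config_line(option):
    """Rewrite a long option with a value, e.g. '--date-spec=hr', as a config line."""
    if not option.startswith("--") or "=" not in option:
        raise ValueError("this sketch only handles --name=value options")
    name, _, value = option[2:].partition("=")
    # Drop the leading "--" and replace the first "=" with a space.
    return f"{name} {value}"

for opt in ("--date-spec=hr", "--log-format=COMBINED", "--port=7890"):
    print(to_config_line(opt))
```

       For example, --date-spec=hr on the command line becomes the line
       "date-spec hr" in the configuration file.
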
195   LOG/DATE/TIME FORMAT
196       --time-format=<timeformat>
              The time-format variable followed by a space, specifies the log
              format time containing either a name of a predefined format (see
              options below) or any combination of regular characters and
              special format specifiers.

              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %T or %H:%M:%S.

              Note that if a timestamp is given in microseconds, %f must be
              used as time-format. If the timestamp is given in milliseconds,
              %* must be used as time-format.
208
209       --date-format=<dateformat>
              The date-format variable followed by a space, specifies the log
              format date containing either a name of a predefined format (see
              options below) or any combination of regular characters and
              special format specifiers.

              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %Y-%m-%d.

              Note that if a timestamp is given in microseconds, %f must be
              used as date-format. If the timestamp is given in milliseconds,
              %* must be used as date-format.
221
222       --datetime-format=<date_time_format>
223              The  date and time format combines the two variables into a sin‐
224              gle option. This gives the ability to get the  timezone  from  a
225              request  and  convert  it  to  another  timezone for output. See
226              --tz=<timezone>
227
228              They all begin with a percentage (%) sign. See  `man  strftime`.
229              e.g., %d/%b/%Y:%H:%M:%S %z.
230
231              Note that if --datetime-format is used, %x must be passed in the
232              log-format variable to represent the date and time field.
233
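              To see what the example specifiers in %d/%b/%Y:%H:%M:%S %z
              capture, the same format string can be tried in Python, whose
              strptime accepts these particular specifiers. This is only a
              sketch for experimenting with a format; the sample timestamp is
              made up, and an English (C) locale is assumed for the month
              name.

```python
from datetime import datetime, timezone

DATETIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # example format from the text above

# Parse a made-up log timestamp, keeping its UTC offset.
stamp = datetime.strptime("05/Jun/2016:16:59:03 +0200", DATETIME_FORMAT)

# Convert the parsed time to another timezone (UTC here) for output.
as_utc = stamp.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2016-06-05T14:59:03+00:00
```
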
234       --log-format=<logformat>
235              The log-format variable followed by a space or \t for tab-delim‐
236              ited, specifies the log format string.
237
238              Note  that  if  there  are  spaces within the format, the string
239              needs to be enclosed in single/double quotes. Inner quotes  need
240              to be escaped.
241
242              In  addition  to  specifying  the raw log/date/time formats, for
243              simplicity, any of the following predefined log format names can
244              be  supplied to the log/date/time-format variables. GoAccess can
245              also handle one predefined name in one variable and another pre‐
246              defined name in another variable.
247
248                COMBINED     - Combined Log Format,
249                VCOMBINED    - Combined Log Format with Virtual Host,
250                COMMON       - Common Log Format,
251                VCOMMON      - Common Log Format with Virtual Host,
252                W3C          - W3C Extended Log File Format,
253                SQUID        - Native Squid Log Format,
254                CLOUDFRONT   - Amazon CloudFront Web Distribution,
255                CLOUDSTORAGE - Google Cloud Storage,
256                AWSELB       - Amazon Elastic Load Balancing,
257                AWSS3        - Amazon Simple Storage Service (S3)
258                AWSALB       - Amazon Application Load Balancer
259                CADDY        - Caddy's JSON Structured format
260
              Note: Piping data into GoAccess won't prompt a log/date/time
              configuration dialog. You will need to define it beforehand in
              your configuration file or on the command line.
264
265   USER INTERFACE OPTIONS
266       -c --config-dialog
267              Prompt log/time/date configuration window on program start. Only
268              when curses is initialized.
269
270       -i --hl-header
271              Color highlight active terminal panel.
272
273       -m --with-mouse
274              Enable mouse support on main terminal dashboard.
275
       --color=<fg:bg[attrs, PANEL]>
277              Specify custom colors for the terminal output.
278
279              Color Syntax
280                DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
281
282               FG# = foreground color [-1...255] (-1 = default term color)
283               BG# = background color [-1...255] (-1 = default term color)
284
285              Optionally, it is possible to apply color  attributes  (multiple
286              attributes  are comma separated), such as: bold, underline, nor‐
287              mal, reverse, blink
288
289              If desired, it is possible to apply  custom  colors  per  panel,
290              that is, a metric in the REQUESTS panel can be of color A, while
291              the same metric in the BROWSERS panel can be of color B.
292
293              Available color definitions:
294                COLOR_MTRC_HITS
295                COLOR_MTRC_VISITORS
296                COLOR_MTRC_DATA
297                COLOR_MTRC_BW
298                COLOR_MTRC_AVGTS
299                COLOR_MTRC_CUMTS
300                COLOR_MTRC_MAXTS
301                COLOR_MTRC_PROT
302                COLOR_MTRC_MTHD
303                COLOR_MTRC_HITS_PERC
304                COLOR_MTRC_HITS_PERC_MAX
305                COLOR_MTRC_VISITORS_PERC
306                COLOR_MTRC_VISITORS_PERC_MAX
307                COLOR_PANEL_COLS
308                COLOR_BARS
309                COLOR_ERROR
310                COLOR_SELECTED
311                COLOR_PANEL_ACTIVE
312                COLOR_PANEL_HEADER
313                COLOR_PANEL_DESC
314                COLOR_OVERALL_LBLS
315                COLOR_OVERALL_VALS
316                COLOR_OVERALL_PATH
317                COLOR_ACTIVE_LABEL
318                COLOR_BG
319                COLOR_DEFAULT
320                COLOR_PROGRESS
321
322              See configuration file for a sample color scheme.
323
324       --color-scheme=<1|2|3>
325              Choose among color schemes.  1 for the default grey  scheme.   2
326              for  the  green scheme.  3 for the Monokai scheme (shown only if
327              terminal supports 256 colors).
328
329       --crawlers-only
330              Parse and display only crawlers (bots).
331
332       --html-custom-css=<path/custom.css>
333              Specifies a custom CSS file path to load in the HTML report.
334
335       --html-custom-js=<path/custom.js>
336              Specifies a custom JS file path to load in the HTML report.
337
338       --html-report-title=<title>
339              Set HTML report page title and header.
340
341       --html-refresh=<secs>
342              Refresh the HTML report every X seconds. The value has to be be‐
343              tween  1  and 60 seconds. The default is set to refresh the HTML
344              report every 1 second.
345
346       --html-prefs=<JSON>
347              Set HTML report default preferences. Supply a valid JSON  object
348              containing  the  HTML preferences. It allows the ability to cus‐
349              tomize each panel plot. See example below.
350
351              Note: The JSON object passed needs to be a one line JSON string.
352              For instance,
353
354              --html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
355
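              One way to produce the required one-line JSON string is to
              build the preferences as a data structure and serialize it
              compactly. The sketch below uses Python's json and shlex
              modules and reuses the keys from the example above; it only
              prints the option and does not run GoAccess.

```python
import json
import shlex

prefs = {
    "theme": "bright",
    "perPage": 5,
    "layout": "horizontal",
    "showTables": True,
    "visitors": {"plot": {"chartType": "bar"}},
}

# Compact separators keep the object on a single line without extra spaces.
one_line = json.dumps(prefs, separators=(",", ":"))

# shlex.quote wraps the JSON in single quotes so a POSIX shell passes it intact.
print("--html-prefs=" + shlex.quote(one_line))
```
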
356       --json-pretty-print
357              Format JSON output using tabs and newlines.
358
              Note: This is not recommended when outputting a real-time HTML
              report since the WebSocket payload will be much larger.
361
362       --max-items=<number>
363              The maximum number of items to display per  panel.  The  maximum
364              can be a number between 1 and n.
365
366              Note:  Only  the  CSV  and  JSON  output  allow a maximum number
367              greater than the default value of 366 (or 50  in  the  real-time
368              HTML output) items per panel.
369
370       --no-color
371              Turn off colored output. This is the default output on terminals
372              that do not support colors.
373
374       --no-column-names
375              Don't write column names in the terminal output. By default,  it
376              displays column names for each available metric in every panel.
377
378       --no-csv-summary
379              Disable summary metrics on the CSV output.
380
381       --no-progress
382              Disable progress metrics [total requests/requests per second].
383
384       --no-tab-scroll
385              Disable  scrolling  through panels when TAB is pressed or when a
386              panel is selected using a numeric key.
387
388       --no-html-last-updated
389              Do not show the last updated field displayed in the HTML  gener‐
390              ated report.
391
392       --no-parsing-spinner
              Do not show the progress metrics and parsing spinner.
394
395       --tz=<timezone>
              Outputs the report date/time data in the given timezone. Note
              that it uses the canonical timezone name, e.g., Europe/Berlin,
              America/Chicago or Africa/Cairo. If an invalid timezone name is
              given, the output will be in GMT. See --datetime-format in order
              to properly specify a timezone in the date/time format.
401
402   SERVER OPTIONS
       Note: This is just a WebSocket server to provide the raw real-time
       data. It is not a web server itself. To access your report's HTML
       file, you will still need your own HTTP server: place the generated
       report in its document root directory and open the HTML file in your
       browser. The browser will then open another WebSocket connection to
       the WebSocket server you may set up here, to keep the dashboard
       up-to-date.
409
410       --addr Specify IP address to bind the server to. Otherwise it binds  to
411              0.0.0.0.
412
413              Usually  there is no need to specify the address, unless you in‐
414              tentionally would like to bind the server to a different address
415              within your server.
416
417       --daemonize
418              Run GoAccess as daemon (only if --real-time-html enabled).
419
420              Note:  It's important to make use of absolute paths across GoAc‐
421              cess' configuration.
422
423       --user-name=<username>
424              Run GoAccess as the specified user.
425
              Note: It's important to ensure the user or the user's group can
              access the input and output files as well as any other files
              needed. Other groups the user belongs to will be ignored. As
              such, it's advised to run GoAccess behind an SSL proxy as it's
              unlikely this user can access the SSL certificates.
431
432       --origin=<url>
433              Ensure clients send the specified origin header  upon  the  Web‐
434              Socket handshake.
435
436       --pid-file=<path/goaccess.pid>
              Write the daemon PID to a file when used along with the
              --daemonize option.
439
440       --port=<port>
441              Specify the port to use. By default GoAccess'  WebSocket  server
442              listens on port 7890.
443
444       --real-time-html
445              Enable real-time HTML output.
446
              GoAccess uses its own WebSocket server to push the data from the
              server to the client. See http://gwsocket.io for more details
              on how the WebSocket server works.
450
451       --ws-url=<[scheme://]url[:port]>
452              URL to which the WebSocket server responds. This is the URL sup‐
453              plied to the WebSocket constructor on the client side.
454
455              Optionally, it is possible to specify the WebSocket URI  scheme,
456              such  as  ws://  or wss:// for unencrypted and encrypted connec‐
457              tions. e.g., wss://goaccess.io
458
459              If GoAccess is running behind a proxy, you could set the  client
460              side  to connect to a different port by specifying the host fol‐
461              lowed by a colon and the port.  e.g., goaccess.io:9999
462
463              By default, it will attempt to connect to the generated report's
464              hostname. If GoAccess is running on a remote server, the host of
465              the remote server should be specified here. Also, make  sure  it
466              is a valid host and NOT an http address.
467
468       --ping-interval=<secs>
469              Enable  WebSocket  ping with specified interval in seconds. This
470              helps prevent idle connections getting disconnected.
471
472       --fifo-in=<path/file>
              Creates a named pipe (FIFO) that reads from the given
              path/file.
475
476       --fifo-out=<path/file>
477              Creates a named pipe (FIFO) that writes to the given path/file.
478
479       --ssl-cert=<cert.crt>
480              Path to TLS/SSL certificate. In order to enable TLS/SSL support,
481              GoAccess requires that --ssl-cert and --ssl-key are used.
482
483              Only if configured using --with-openssl
484
485       --ssl-key=<priv.key>
486              Path to TLS/SSL private key. In order to enable TLS/SSL support,
487              GoAccess requires that --ssl-cert and --ssl-key are used.
488
489              Only if configured using --with-openssl
490
491   FILE OPTIONS
492       -      The log file to parse is read from stdin.
493
494       -f --log-file=<logfile>
495              Specify  the  path  to  the input log file. If set in the config
496              file, it will take priority over -f from the command line.
497
498       -S --log-size=<bytes>
499              Specify the log size in bytes. This is  useful  when  piping  in
500              logs for processing in which the log size can be explicitly set.
501
502       -l --debug-file=<debugfile>
503              Send all debug messages to the specified file.
504
505       -p --config-file=<configfile>
506              Specify a custom configuration file to use. If set, it will take
507              priority over the global configuration file (if any).
508
509       --external-assets
              Output HTML assets to external JS/CSS files. Great if you are
              setting up Content Security Policy (CSP). This will create two
              separate files, goaccess.js and goaccess.css, in the same
              directory as your report.html file.
514
515       --invalid-requests=<filename>
516              Log invalid requests to the specified file.
517
518       --unknowns-log=<filename>
519              Log unknown browsers and OSs to the specified file.
520
521       --no-global-config
522              Do not load the global configuration file. This directory should
523              normally    be    /usr/local/etc,    unless    specified    with
524              --sysconfdir=/dir.   See  --dcf  option  for finding the default
525              configuration file.
526
527   PARSE OPTIONS
528       -a --agent-list
529              Enable a list of user-agents by host. For faster parsing, do not
530              enable this flag.
531
532       -d --with-output-resolver
533              Enable IP resolver on HTML|JSON output.
534
535       -e --exclude-ip=<IP|IP-range>
536              Exclude  an  IPv4 or IPv6 from being counted.  Ranges can be in‐
537              cluded as well using a dash in between the IPs (start-end).
538
539              Examples:
540                exclude-ip 127.0.0.1
541                exclude-ip 192.168.0.1-192.168.0.100
542                exclude-ip ::1
543                exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
544
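              The single-address and start-end forms above can be checked
              with a short sketch. It uses Python's ipaddress module and a
              hypothetical helper, is_excluded; it illustrates the range
              semantics described here and is not GoAccess's own matching
              code.

```python
import ipaddress

def is_excluded(ip, spec):
    """Return True if `ip` equals a single-IP spec or lies within a start-end spec."""
    addr = ipaddress.ip_address(ip)
    start_text, dash, end_text = spec.partition("-")
    start = ipaddress.ip_address(start_text)
    end = ipaddress.ip_address(end_text) if dash else start
    if addr.version != start.version or addr.version != end.version:
        return False  # IPv4 and IPv6 addresses are never compared
    return start <= addr <= end

print(is_excluded("192.168.0.50", "192.168.0.1-192.168.0.100"))   # True
print(is_excluded("192.168.0.101", "192.168.0.1-192.168.0.100"))  # False
print(is_excluded("::1", "::1"))                                   # True
```
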
545       -H --http-protocol=<yes|no>
546              Set/unset HTTP request protocol. This will create a request  key
547              containing the request protocol + the actual request.
548
549       -M --http-method=<yes|no>
550              Set/unset  HTTP  request  method. This will create a request key
551              containing the request method + the actual request.
552
553       -o --output=<path/file.[json|csv|html]>
554              Write output to stdout given one of the following files and  the
555              corresponding extension for the output format:
556
557                /path/file.csv - Comma-separated values (CSV)
558                /path/file.json - JSON (JavaScript Object Notation)
559                /path/file.html - HTML
560
561       -q --no-query-string
562              Ignore        request's        query        string.        i.e.,
563              www.google.com/page.htm?query => www.google.com/page.htm.
564
565              Note: Removing the query string can greatly decrease memory con‐
566              sumption, especially on timestamped requests.
567
568       -r --no-term-resolver
569              Disable IP resolver on terminal output.
570
571       --444-as-404
572              Treat non-standard status code 444 as 404.
573
574       --4xx-to-unique-count
575              Add 4xx client errors to the unique visitors count.
576
577       --anonymize-ip
              Anonymize the client IP address. The IP anonymization option
              sets the last octet of IPv4 user IP addresses and the last 80
              bits of IPv6 addresses to zeros, e.g.,

                192.168.20.100 => 192.168.20.0
                2a03:2880:2110:df07:face:b00c::1 => 2a03:2880:2110:df07::
583
584       --anonymize-level
585              Specifies the anonymization levels: 1 => default, 2 => strong, 3
586              => pedantic.
587
              ┌─────────────┬─────────┬─────────┬─────────┐
              │ Bits-hidden │ Level 1 │ Level 2 │ Level 3 │
              ├─────────────┼─────────┼─────────┼─────────┤
              │ IPv4        │ 8       │ 16      │ 24      │
              ├─────────────┼─────────┼─────────┼─────────┤
              │ IPv6        │ 64      │ 80      │ 96      │
              └─────────────┴─────────┴─────────┴─────────┘
595
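              The effect of each level can be illustrated by clearing the
              number of low-order bits listed in the table. The sketch below
              is a Python illustration under that assumption; the helper
              name anonymize is hypothetical and this is not GoAccess's own
              code.

```python
import ipaddress

# Bits hidden per anonymization level, copied from the table above.
HIDDEN_BITS = {
    4: {1: 8, 2: 16, 3: 24},
    6: {1: 64, 2: 80, 3: 96},
}

def anonymize(ip, level=1):
    """Zero the low-order bits of `ip` for the given level."""
    addr = ipaddress.ip_address(ip)
    hidden = HIDDEN_BITS[addr.version][level]
    # Shifting right then left clears the lowest `hidden` bits.
    masked = (int(addr) >> hidden) << hidden
    return str(type(addr)(masked))

print(anonymize("192.168.20.100"))     # 192.168.20.0
print(anonymize("192.168.20.100", 2))  # 192.168.0.0
print(anonymize("192.168.20.100", 3))  # 192.0.0.0
```
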
596       --all-static-files
597              Include  static  files  that  contain  a  query  string.   e.g.,
598              /fonts/fontawesome-webfont.woff?v=4.0.3
599
600       --browsers-file=<path>
              By default GoAccess parses an "essential/basic" curated list of
              browsers & crawlers. If you need to add additional browsers, use
              this option. Include an additional delimited list of
              browsers/crawlers/feeds etc. See config/browsers.list for an
              example or
              https://raw.githubusercontent.com/allinurl/goaccess/master/config/browsers.list
607
608       --date-spec=<date|hr|min>
609              Set the date specificity to either date (default), hr to display
610              hours or min to display minutes appended to the date.
611
              This is used in the visitors panel. It's useful for tracking
              visitors at the hour level. For instance, an hour specificity
              would display traffic as 18/Dec/2010:19, and a minute
              specificity as 18/Dec/2010:19:59.
616
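              The three specificity levels amount to truncating a timestamp
              at different points. A minimal Python sketch, assuming the
              output style of the examples above and an English (C) locale;
              the names SPEC_PATTERNS and visitor_key are hypothetical:

```python
from datetime import datetime

# One strftime pattern per --date-spec value.
SPEC_PATTERNS = {
    "date": "%d/%b/%Y",
    "hr": "%d/%b/%Y:%H",
    "min": "%d/%b/%Y:%H:%M",
}

def visitor_key(timestamp, spec="date"):
    """Return the label a request would be grouped under for `spec`."""
    return timestamp.strftime(SPEC_PATTERNS[spec])

ts = datetime(2010, 12, 18, 19, 59, 7)
for spec in ("date", "hr", "min"):
    print(spec, visitor_key(ts, spec))
```
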
617       --double-decode
              Decode double-encoded values. This includes user-agent,
              request, and referrer.
620
621       --enable-panel=<PANEL>
622              Enable parsing and displaying the given panel.
623
624              Available panels:
625                VISITORS
626                REQUESTS
627                REQUESTS_STATIC
628                NOT_FOUND
629                HOSTS
630                OS
631                BROWSERS
632                VISIT_TIMES
633                VIRTUAL_HOSTS
634                REFERRERS
635                REFERRING_SITES
636                KEYPHRASES
637                STATUS_CODES
638                REMOTE_USER
639                CACHE_STATUS
640                GEO_LOCATION
641                MIME_TYPE
642                TLS_TYPE
643
644       --fname-as-vhost=<regex>
645              Use log filename(s) as virtual host(s). POSIX regex is passed to
646              extract the virtual host from the  filename.  e.g.,  --fname-as-
647              vhost='[a-z]*.[a-z]*'  can be used to extract awesome.com.log =>
648              awesome.com.
649
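              The example pattern can be tried out before passing it to
              GoAccess. The sketch below uses Python's re module as a
              stand-in for a POSIX regex engine; the two agree for this
              simple pattern, but that is an assumption to verify for more
              complex expressions. The helper name vhost_from_filename is
              hypothetical.

```python
import re

def vhost_from_filename(filename, pattern):
    """Return the first part of `filename` matched by `pattern`, or None."""
    match = re.search(pattern, filename)
    return match.group(0) if match else None

print(vhost_from_filename("awesome.com.log", r"[a-z]*.[a-z]*"))  # awesome.com
```

              Note that the unescaped dot matches any character. Writing
              [a-z]*\.[a-z]* would restrict it to a literal dot.
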
650       --hide-referrer=<NEEDLE>
              Hide a referrer but still count it. Wildcards are allowed in
              the needle, e.g., *.bing.com.
653
654       --hour-spec=<hr|min>
655              Set the time specificity to either hour (default) or min to dis‐
656              play the tenth of an hour appended to the hour.
657
658              This is used in the time distribution  panel.  It's  useful  for
659              tracking peaks of traffic on your server at specific times.
660
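              The difference between the two settings can be sketched as
              follows. This Python illustration assumes the label style
              shown in this manual (16:4 for any minute from 16:40 to
              16:49); the helper name visit_time_key is hypothetical.

```python
def visit_time_key(hour, minute, spec="hr"):
    """Return the time-distribution label for a request at hour:minute."""
    if spec == "hr":
        return f"{hour:02d}"
    if spec == "min":
        # Integer division by 10 keeps only the tens digit of the minute.
        return f"{hour:02d}:{minute // 10}"
    raise ValueError("spec must be 'hr' or 'min'")

print(visit_time_key(16, 47))         # 16
print(visit_time_key(16, 47, "min"))  # 16:4
```

              With hr there are 24 possible labels per day. With min there
              are 144 (24 hours times 6 ten-minute slots).
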
661       --ignore-crawlers
662              Ignore crawlers from being counted.
663
664       --unknowns-as-crawlers
665              Classify unknown OS and browsers as crawlers.
666
667       --ignore-panel=<PANEL>
668              Ignore parsing and displaying the given panel.
669
670              Available panels:
671                VISITORS
672                REQUESTS
673                REQUESTS_STATIC
674                NOT_FOUND
675                HOSTS
676                OS
677                BROWSERS
678                VISIT_TIMES
679                VIRTUAL_HOSTS
680                REFERRERS
681                REFERRING_SITES
682                KEYPHRASES
683                STATUS_CODES
684                REMOTE_USER
685                CACHE_STATUS
686                GEO_LOCATION
687                MIME_TYPE
688                TLS_TYPE
689
690       --ignore-referrer=<referrer>
691              Ignore  referrers  from  being counted. Wildcards allowed. e.g.,
692              *.domain.com ww?.domain.*
693
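              To get a feel for which referrers the example patterns would
              cover, shell-style wildcard matching can be approximated with
              Python's fnmatch module. This is an assumption made for
              illustration: GoAccess's own matcher may differ in details,
              and fnmatch also gives special meaning to square brackets.
              The helper name referrer_ignored is hypothetical.

```python
from fnmatch import fnmatchcase

def referrer_ignored(host, patterns):
    """Return True if `host` matches any wildcard pattern (* and ?)."""
    return any(fnmatchcase(host, pattern) for pattern in patterns)

patterns = ["*.domain.com", "ww?.domain.*"]
for host in ("blog.domain.com", "ww2.domain.org", "example.org"):
    print(host, referrer_ignored(host, patterns))
```
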
694       --ignore-statics=<req|panel>
695              Ignore static file requests.
696
              req
                Only ignore static requests from valid requests.

              panel
                Ignore static requests from panels.

                Note that it will count them towards the total number of
                requests.
705
706       --ignore-status=<CODE>
707              Ignore  parsing  and  displaying one or multiple status code(s).
708              For multiple status codes, use this option multiple times.
709
710       --keep-last=<num_days>
711              Keep the last specified number of days in storage. This will re‐
712              cycle  the  storage  tables.  e.g.,  keep & show only the last 7
713              days.
714
715       --no-ip-validation
716              Disable client IP validation. Useful if IP addresses  have  been
717              obfuscated before being logged. The log still needs to contain
718              a placeholder for %h; usually it's a resolved IP, e.g.,
719              ord37s19-in-f14.1e100.net.
720
721       --no-strict-status
722              Disable  HTTP  status code validation. Some servers would record
723              this value only if a connection was established  to  the  target
724              and the target sent a response.  Otherwise, it could be recorded
725              as -.
726
727       --num-tests=<number>
728              Number of lines from the access log to test against the provided
729              log/date/time  format.  By default, the parser is set to test 10
730              lines. If set to 0, the parser won't test  any  lines  and  will
731              parse  the  whole  access  log.  If  a  line  matches  the given
732              log/date/time format before it reaches <number>, the parser will
733              consider  the  log  to  be valid, otherwise GoAccess will return
734              EXIT_FAILURE and display the relevant error messages.
735
736       --process-and-exit
737              Parse log and exit without outputting data.  Useful  if  we  are
738              looking  to  only  add  new data to the on-disk database without
739              outputting to a file or a terminal.
740
741       --real-os
742              Display real OS names, e.g., Windows XP, Snow Leopard.
743
744       --sort-panel=<PANEL,FIELD,ORDER>
745              Sort panel on initial load. Sort options are separated by comma.
746              Options are in the form: PANEL,METRIC,ORDER
747
748              Available metrics:
749                BY_HITS     - Sort by hits
750                BY_VISITORS - Sort by unique visitors
751                BY_DATA     - Sort by data
752                BY_BW       - Sort by bandwidth
753                BY_AVGTS    - Sort by average time served
754                BY_CUMTS    - Sort by cumulative time served
755                BY_MAXTS    - Sort by maximum time served
756                BY_PROT     - Sort by http protocol
757                BY_MTHD     - Sort by http method
758
759              Available orders:
760                ASC
761                DESC
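
       As a rough illustration of the PANEL,METRIC,ORDER form, the sketch
       below checks a sort specification against the metrics and orders
       listed above before it is passed to --sort-panel. The helper name
       valid_sort_spec and the sample value are hypothetical, and the panel
       name itself is not validated here.

```shell
# Hypothetical helper: succeed only if the METRIC and ORDER parts of a
# PANEL,METRIC,ORDER string are among the values listed in this section.
valid_sort_spec() {
  old_ifs=$IFS
  IFS=,
  set -- $1          # split the argument on commas
  IFS=$old_ifs
  [ "$#" -eq 3 ] || return 1
  case "$2" in
    BY_HITS|BY_VISITORS|BY_DATA|BY_BW|BY_AVGTS|BY_CUMTS|BY_MAXTS|BY_PROT|BY_MTHD) ;;
    *) return 1 ;;
  esac
  case "$3" in
    ASC|DESC) ;;
    *) return 1 ;;
  esac
}

spec='REQUESTS,BY_HITS,ASC'
if valid_sort_spec "$spec"; then
  # Print the command instead of running it, so the sketch has no side effects.
  echo "goaccess access.log --sort-panel=$spec"
else
  echo "invalid sort specification: $spec" >&2
fi
```

       The printed command can then be run by hand against a real log file.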
762
763       --static-file=<extension>
764              Add a static file extension, e.g., .mp3. Extensions are case
765              sensitive.
766
767   GEOLOCATION OPTIONS
768       -g --std-geoip
769              Standard GeoIP database for less memory usage.
770
771       --geoip-database=<geofile>
772              Specify path to GeoIP database file, e.g., GeoLiteCity.dat.
773
774              If using GeoIP2, you will need to download the GeoLite2 City  or
775              Country database from MaxMind.com and use the option
776              --geoip-database to specify the database. Updated database files
777              for GeoIP legacy are also available from MaxMind.com as GeoLite
778              Legacy Databases. IPv4 and IPv6 files are supported as well.
779              For updated DB URLs, please see the default GoAccess
780              configuration file.
781
782              Note: --geoip-city-data is an alias of --geoip-database.
783
784   OTHER OPTIONS
785       -h --help
786              The help.
787
788       -s --storage
789              Display current storage method, e.g., B+ Tree, Hash.
790
791       -V --version
792              Display version information and exit.
793
794       --dcf  Display the path of the default config file  when  `-p`  is  not
795              used.
796
797   PERSISTENCE STORAGE OPTIONS
798       --persist
799              Persist parsed data to disk. If database files exist, they
800              will be overwritten. This should be set for the first dataset.
801              See examples below.
802
803       --restore
804              Load previously stored data from disk. If reading persisted data
805              only, the database files need to exist. See --persist and  exam‐
806              ples below.
807
808       --db-path=<dir>
809              Path  where  the  on-disk database files are stored. The default
810              value is the /tmp directory.
811
812

CUSTOM LOG/DATE FORMAT

814       GoAccess can parse virtually any web log format.
815
816       Predefined options include, Common Log Format (CLF), Combined Log  For‐
817       mat (XLF/ELF), including virtual host, Amazon CloudFront (Download Dis‐
818       tribution), Google Cloud Storage and W3C format (IIS).
819
820       GoAccess allows any custom format string as well.
821
822       There are two ways to configure the log format.  The easiest is to  run
823       GoAccess with -c to prompt a configuration window. Otherwise, it can be
824       configured under ~/.goaccessrc or the %sysconfdir%.
825
826       time-format
827              The time-format variable, followed by a space, specifies the time
828              format of the log. It can contain any combination of regular
829              characters and special format specifiers, which all begin with a
830              percent (%) sign. See `man strftime`. e.g., %T or %H:%M:%S.
831
832              Note: If the timestamp is given in microseconds, use %f as the
833              time-format; if it is given in milliseconds, use %*.
834
835       date-format
836              The date-format variable, followed by a space, specifies the date
837              format of the log. It can contain any combination of regular
838              characters and special format specifiers, which all begin with a
839              percent (%) sign. See `man strftime`. e.g., %Y-%m-%d.
840
841              Note: If the timestamp is given in microseconds, use %f as the
842              date-format; if it is given in milliseconds, use %*.
843
844       log-format
845              The log-format variable, followed by a space or \t, specifies
846              the log format string.
847
848       %x     A  date  and time field matching the time-format and date-format
849              variables. This is used when the log provides a timestamp, or
850              the date and time concatenated as a single string (e.g.,
851              1501647332 or 20170801235000), instead of the date and time in
852              two separate fields.
853
854       %t     time field matching the time-format variable.
855
856       %d     date field matching the date-format variable.
857
858       %v     The  canonical  Server  Name  of  the server serving the request
859              (Virtual Host).
860
861       %e     This is the userid of the person requesting the document as  de‐
862              termined by HTTP authentication.
863
864       %C     The cache status of the object the server served.
865
866       %h     host (the client IP address, either IPv4 or IPv6)
867
868       %r     The request line from the client. This requires specific
869              delimiters around the request (single quotes, double quotes, or
870              anything else) to be parsable. If there are none, use a
871              combination of special format specifiers such as %m %U %H.
872
873       %q     The query string.
874
875       %m     The request method.
876
877       %U     The URL path requested.
878
879              Note: If the query string is in %U, there is no need to use  %q.
880              However, if the URL path does not include any query string, you
881              may use %q and the query string will be appended to the request.
882
883       %H     The request protocol.
884
885       %s     The status code that the server sends back to the client.
886
887       %b     The size of the object returned to the client.
888
889       %R     The "Referrer" HTTP request header.
890
891       %u     The user-agent HTTP request header.
892
893       %K     The TLS protocol version chosen for the connection. (In
894              Apache LogFormat: %{SSL_PROTOCOL}x)
895
896       %k     The TLS cipher suite chosen for the connection. (In
897              Apache LogFormat: %{SSL_CIPHER}x)
898
899       %M     The MIME-type of the requested resource. (In  Apache  LogFormat:
900              %{Content-Type}o)
901
902       %D     The  time taken to serve the request, in microseconds as a deci‐
903              mal number.
904
905       %T     The time taken to serve the request, in seconds  with  millisec‐
906              onds resolution.
907
908       %L     The  time taken to serve the request, in milliseconds as a deci‐
909              mal number.
910
911       %n     The time taken to serve the request, in nanoseconds.
912
913       %^     Ignore this field.
914
915       %~     Move forward through the log string until a non-space (!isspace)
916              char is found.
917
918       ~h     The host (the client IP address, either IPv4 or IPv6) in an
919              X-Forwarded-For (XFF) field.
920
921              It uses a special specifier which consists of a tilde before the
922              host  specifier,  followed  by the character(s) that delimit the
923              XFF field, which are enclosed by curly braces, e.g., ~h{, }
924
925              For example, "~h{, }" is used in order  to  parse  "11.25.11.53,
926              17.68.33.17"  field  which  is  delimited by a comma and a space
927              (enclosed by double quotes).
928
929
930              ┌──────────────────────────────────────┬───────────┐
931              │ XFF field                            │ specifier │
932              ├──────────────────────────────────────┼───────────┤
933              │ "192.1.2.3, 192.68.33.17, 192.1.1.2" │ "~h{, }"  │
934              ├──────────────────────────────────────┼───────────┤
935              │ "192.1.2.12", "192.68.33.17"         │ ~h{", }   │
936              ├──────────────────────────────────────┼───────────┤
937              │ 192.1.2.12, 192.68.33.17             │ ~h{, }    │
938              ├──────────────────────────────────────┼───────────┤
939              │ 192.1.2.14 192.68.33.17 192.1.1.2    │ ~h{ }     │
940              └──────────────────────────────────────┴───────────┘
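
       To make the role of the delimiter characters concrete, the sketch
       below splits a sample XFF value on a comma and a space, the same two
       characters given inside the braces of ~h{, }. It only imitates the
       splitting step with tr; how GoAccess then selects the client address
       from the list is not shown, and the function name split_xff is
       hypothetical.

```shell
# Split an X-Forwarded-For value on commas and spaces, one address per line.
# tr -s squeezes the run of newlines produced by the ", " pair into one.
split_xff() {
  printf '%s\n' "$1" | tr -s ', ' '\n\n' | sed '/^$/d'
}

split_xff '11.25.11.53, 17.68.33.17'
# prints:
# 11.25.11.53
# 17.68.33.17
```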
941
942
943       Note:  In  order to get the average, cumulative and maximum time served
944       in GoAccess, you will need to start logging response times in your  web
945       server. In Nginx you can add $request_time to your log format, or %D in
946       Apache.
947
948       Important: If multiple time served specifiers  are  used  at  the  same
949       time,  the first option specified in the format string will take prior‐
950       ity over the other specifiers.
951
952       GoAccess requires the following fields:
953
954              %h a valid IPv4/6
955
956              %d a valid date
957
958              %r the request
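
       A minimal sketch of how the specifiers fit together, assuming a log
       in the combined log format such as

              10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
              /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Mozilla/5.0"

       The file names access.log and report.html are placeholders, and the
       shell variable names are only there for readability.

```shell
# Combined log format spelled out with the specifiers described above.
LOG_FORMAT='%h %^[%d:%t %^] "%r" %s %b "%R" "%u"'
DATE_FORMAT='%d/%b/%Y'
TIME_FORMAT='%H:%M:%S'

# Check that the required specifiers %h, %d and %r are all present.
for spec in %h %d %r; do
  case "$LOG_FORMAT" in
    *"$spec"*) ;;
    *) echo "missing required specifier: $spec" >&2; exit 1 ;;
  esac
done

# Only run goaccess when it is installed and the placeholder log exists.
if command -v goaccess >/dev/null 2>&1 && [ -r access.log ]; then
  goaccess access.log --log-format="$LOG_FORMAT" \
    --date-format="$DATE_FORMAT" --time-format="$TIME_FORMAT" \
    -o report.html
fi
```

       Here %^ skips the two "- -" fields and the timezone offset, while
       %d:%t reads the date and time on either side of the first colon.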
959

INTERACTIVE MENU

961       F1 or h
962              Main help.
963
964       F5     Redraw main window.
965
966       q      Quit the program, current window or collapse active module
967
968       o or ENTER
969              Expand selected module or open window
970
971       0-9 and Shift + 0
972              Set selected module to active
973
974       j      Scroll down within expanded module
975
976       k      Scroll up within expanded module
977
978       c      Set or change scheme color.
979
980       TAB    Forward iteration of modules. Starts from current active module.
981
982       SHIFT + TAB
983              Backward iteration of modules. Starts from current  active  mod‐
984              ule.
985
986       ^f     Scroll forward one screen within an active module.
987
988       ^b     Scroll backward one screen within an active module.
989
990       s      Sort options for active module
991
992       /      Search across all modules (regex allowed)
993
994       n      Find the position of the next occurrence across all modules.
995
996       g      Move to the first item or top of screen.
997
998       G      Move to the last item or bottom of screen.
999

EXAMPLES

1001       Note: Piping data into GoAccess won't prompt a log/date/time
1002       configuration dialog; you will need to define the format beforehand in
1003       your configuration file or on the command line.
1004
1005
1006   DIFFERENT OUTPUTS
1007       To output to a terminal and generate an interactive report:
1008
1009              # goaccess access.log
1010
1011       To generate an HTML report:
1012
1013              # goaccess access.log -a -o report.html
1014
1015       To generate a JSON report:
1016
1017              # goaccess access.log -a -d -o report.json
1018
1019       To generate a CSV file:
1020
1021              # goaccess access.log --no-csv-summary -o report.csv
1022
1023       GoAccess  also  allows  great  flexibility  for real-time filtering and
1024       parsing. For instance, to quickly diagnose issues  by  monitoring  logs
1025       since goaccess was started:
1026
1027              # tail -f access.log | goaccess -
1028
1029       Even better, to filter while keeping the pipe open to preserve
1030       real-time analysis, we can make use of tail -f and a pattern-matching
1031       tool such as grep, awk, sed, etc.:
1032
1033              # tail -f access.log | grep -i --line-buffered 'firefox' | goac‐
1034              cess --log-format=COMBINED -
1035
1036       or to parse from the beginning of the file while keeping the pipe
1037       open and applying a filter:
1038
1039              # tail -f -n +0 access.log | grep -i --line-buffered 'firefox' |
1040              goaccess --log-format=COMBINED -o report.html --real-time-html -
1041
1042       or to convert the log date timezone to a different timezone, e.g.,
1043       Europe/Berlin:
1044
1045              #  goaccess  access.log  --log-format='%h %^[%x] "%r" %s %b "%R"
1046              "%u"'    --datetime-format='%d/%b/%Y:%H:%M:%S    %z'    --tz=Eu‐
1047              rope/Berlin --date-spec=min
1048
1049   MULTIPLE LOG FILES
1050       There  are  several ways to parse multiple logs with GoAccess. The sim‐
1051       plest is to pass multiple log files to the command line:
1052
1053              # goaccess access.log access.log.1
1054
1055       It's even possible to parse files from a  pipe  while  reading  regular
1056       files:
1057
1058              # cat access.log.2 | goaccess access.log access.log.1 -
1059
1060       Note  that the single dash is appended to the command line to let GoAc‐
1061       cess know that it should read from the pipe.
1062
1063       Now if we want to add more flexibility to GoAccess, we can do a  series
1064       of  pipes. For instance, if we would like to process all compressed log
1065       files access.log.*.gz in addition to the current log file, we can do:
1066
1067              # zcat access.log.*.gz | goaccess access.log -
1068
1069       Note: On Mac OS X, use gunzip -c instead of zcat.
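
       The sketch below wraps the idea in a small function that
       concatenates plain and gzip-compressed logs, using gunzip -c
       throughout so the same line works on Linux and Mac OS X. The function
       name cat_logs and the throwaway demo files are hypothetical.

```shell
# Concatenate plain and gzip-compressed logs to stdout.
cat_logs() {
  for f in "$@"; do
    case "$f" in
      *.gz) gunzip -c "$f" ;;
      *)    cat "$f" ;;
    esac
  done
}

# Demo with throwaway files so the sketch runs without real logs.
tmp=$(mktemp -d)
printf 'line from rotated log\n' > "$tmp/access.log.1"
gzip "$tmp/access.log.1"
printf 'line from current log\n' > "$tmp/access.log"
cat_logs "$tmp"/access.log.*.gz "$tmp/access.log"
```

       With real logs, the output can be piped on, for example:
       cat_logs access.log.*.gz access.log | goaccess --log-format=COMBINED -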
1070
1071   REAL TIME HTML OUTPUT
1072       GoAccess has the ability to output real-time data in the  HTML  report.
1073       You  can even email the HTML file since it is composed of a single file
1074       with no external file dependencies, how neat is that!
1075
1076       The process of generating a real-time HTML report is  very  similar  to
1077       the  process  of  creating  a  static  report. Only --real-time-html is
1078       needed to make it real-time.
1079
1080              # goaccess access.log -o  /usr/share/nginx/html/site/report.html
1081              --real-time-html
1082
1083       By  default,  GoAccess  will use the host name of the generated report.
1084       Optionally, you can specify the URL to which the client's browser will
1085       connect. See https://goaccess.io/faq for a more detailed example.
1086
1087              #  goaccess  access.log  -o  report.html  --real-time-html --ws-
1088              url=goaccess.io
1089
1090       By default, GoAccess listens on port 7890. To use a different port,
1091       specify it as follows (make sure the port is open):
1092
1093              #    goaccess   access.log   -o   report.html   --real-time-html
1094              --port=9870
1095
1096       And to bind the WebSocket server to an address other than
1097       0.0.0.0, you can specify it as:
1098
1099              #    goaccess   access.log   -o   report.html   --real-time-html
1100              --addr=127.0.0.1
1101
1102       Note: To output real time data over a TLS/SSL connection, you  need  to
1103       use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
1104
1105   WORKING WITH DATES
1106       Another useful pipe would be filtering dates out of the web log.
1107
1108       The  following will get all HTTP requests starting on 05/Dec/2010 until
1109       the end of the file.
1110
1111              # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -
1112
1113       or using relative dates such as yesterday or one week ago:
1114
1115              # sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log
1116              | goaccess -a -
1117
1118       If we want to parse only a certain time-frame from DATE a to DATE b, we
1119       can do:
1120
1121              # sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
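
       A small self-contained sketch of the date-range filter, run against a
       throwaway sample log so its behaviour can be checked before piping
       into goaccess. The log entries and the function name filter_range are
       made up for illustration.

```shell
# Throwaway sample log (hypothetical entries) to try the range filter on.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Nov/2010:10:00:00 +0000] "GET /a HTTP/1.1" 200 100 "-" "curl"
10.0.0.2 - - [05/Nov/2010:10:00:00 +0000] "GET /b HTTP/1.1" 200 100 "-" "curl"
10.0.0.3 - - [20/Nov/2010:10:00:00 +0000] "GET /c HTTP/1.1" 200 100 "-" "curl"
10.0.0.4 - - [05/Dec/2010:10:00:00 +0000] "GET /d HTTP/1.1" 200 100 "-" "curl"
10.0.0.5 - - [05/Dec/2010:11:00:00 +0000] "GET /e HTTP/1.1" 200 100 "-" "curl"
10.0.0.6 - - [06/Dec/2010:10:00:00 +0000] "GET /f HTTP/1.1" 200 100 "-" "curl"
EOF

# Print from the first 05/Nov/2010 line through the first 05/Dec/2010 line.
# The slashes are escaped because / also delimits the sed pattern.
filter_range() {
  sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' "$1"
}

filter_range "$log"
```

       Note that a sed range stops at the first line matching the end
       pattern, so only the first 05/Dec/2010 entry (/d) is printed here;
       later entries from that day are left out.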
1122
1123       If we want to preserve only a certain amount of data and recycle storage,
1124       we can keep only a certain number of days. For instance, to keep and show
1125       the last 5 days:
1126
1127              # goaccess access.log --keep-last=5
1128
1129   VIRTUAL HOSTS
1130       Assume your log contains the virtual host (server blocks) field, for
1131       instance:
1132
1133              vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
1134              /shop/bag-p-20 HTTP/1.1" 200 6715 "-"  "Apache  (internal  dummy
1135              connection)"
1136
1137       And  you  would like to append the virtual host to the request in order
1138       to see which virtual host the top URLs belong to:
1139
1140              # awk '$8=$1$8' access.log | goaccess -a -
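
       To see what the awk rewrite does to a single entry, the sketch below
       feeds it the sample line from this section. The function name
       prefix_vhost is hypothetical.

```shell
# Prepend the virtual host ($1) to the request path ($8).
# The assignment is also the awk pattern; it is non-empty, so the line prints.
prefix_vhost() {
  awk '$8=$1$8' "$@"
}

line='vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"'
printf '%s\n' "$line" | prefix_vhost
```

       The request in the output reads "GET vhost.com:80/shop/bag-p-20
       HTTP/1.1", so requests from different virtual hosts no longer
       collapse into the same path.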
1141
1142       To exclude a list of virtual hosts you can do the following:
1143
1144              # grep -v  "`cat  exclude_vhost_list_file`"  vhost_access.log  |
1145              goaccess -
1146
1147   FILES & STATUS CODES
1148       To  parse specific pages, e.g., page views, html, htm, php, etc. within
1149       a request:
1150
1151              # awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -
1152
1153       Note that $7 is the request field for the common and combined log formats
1154       (without Virtual Host). If your log includes Virtual Host, then you
1155       probably want to use $8 instead. It's best to check which field you are
1156       after, e.g.:
1157
1158              # tail -10 access.log | awk '{print $8}'
1159
1160       Or to parse a specific status code, e.g., 500 (Internal Server Error):
1161
1162              # awk '$9~/500/' access.log | goaccess -
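
       The two filters can be tried on a throwaway log before piping into
       goaccess. This is a sketch assuming the combined log format without a
       virtual host field, so $7 is the request path and $9 the status; the
       sample entries and the function names by_status and pages_only are
       hypothetical.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [02/Mar/2016:08:14:04 -0600] "GET /index.html HTTP/1.1" 200 6715 "-" "curl"
10.0.0.2 - - [02/Mar/2016:08:14:05 -0600] "GET /cart.php HTTP/1.1" 500 312 "-" "curl"
10.0.0.3 - - [02/Mar/2016:08:14:06 -0600] "GET /logo.png HTTP/1.1" 200 500 "-" "curl"
EOF

# Requests whose status field ($9) equals the given code.
by_status() {
  awk -v code="$1" '$9 == code' "$2"
}

# Requests for pages; the dots are escaped so they match literally.
pages_only() {
  awk '$7 ~ /\.html|\.htm|\.php/' "$1"
}

by_status 500 "$log"
pages_only "$log"
```

       The third entry has a response size of 500 but a status of 200, so
       by_status 500 prints only the /cart.php request.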
1163
1164   SERVER
1165       Also, it is worth pointing out that if we want to run GoAccess at lower
1166       priority, we can run it as:
1167
1168              # nice -n 19 goaccess -f access.log -a
1169
1170       and if you don't want to install it on your server, you can  still  run
1171       it from your local machine:
1172
1173              #  ssh  -n  root@server  'tail -f /var/log/apache2/access.log' |
1174              goaccess -
1175
1176       Note: SSH requires -n so GoAccess can read from stdin. Also, make  sure
1177       to  use SSH keys for authentication as it won't work if a passphrase is
1178       required.
1179
1180   INCREMENTAL LOG PROCESSING
1181       GoAccess has the ability to process logs incrementally through its  in‐
1182       ternal  storage  and  dump  its data to disk. It works in the following
1183       way:
1184
1185
1186       1  A dataset must be persisted first with --persist, then the same
1187          dataset can be loaded with --restore.
1188
1189       2  If new data is passed (piped or through a log file), it will be
1190          appended to the original dataset.
1191
1192
1193       NOTES
1194
1195       GoAccess keeps track of inodes of all  the  files  processed  (assuming
1196       files will stay on the same partition). In addition, it extracts a
1197       snippet of data from the log along with the last line  parsed  of  each
1198       file and the timestamp of the last line parsed, e.g.,
1199       inode:29627417|line:20012|ts:20171231235059
1200
1201       First, it checks whether the snippet matches the log being parsed. If it
1202       does, it assumes the log hasn't changed dramatically, e.g., hasn't been
1203       truncated. If the inode does not match the current file, it parses  all
1204       lines. If the current file matches the inode, it then reads the
1205       remaining lines and updates the count of lines parsed and the timestamp.
1206       As an extra precaution, it won't parse log lines with a timestamp ≤
1207       the one stored.
1208
1209       Piped data works based off the timestamp of the last line read. For
1210       instance, it will parse and discard all incoming entries until it finds a
1211       timestamp >= the one stored.
1212
1213
1214       For instance:
1215
1216              // last month access log
1217              # goaccess access.log.1 --persist
1218
1219       then, load it with
1220
1221              // append this month access log, and preserve new data
1222              # goaccess access.log --restore --persist
1223
1224       To read persisted data only (without parsing new data)
1225
1226              # goaccess --restore
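
       For a job that runs repeatedly, the choice between the first run and
       later runs can be sketched as below: --persist alone while the
       database directory is still empty, and --restore --persist once it
       holds files. The function name persist_flags and the directory
       /tmp/goaccess-db are assumptions for illustration; the command is
       printed rather than executed.

```shell
# Choose persistence flags: --persist for the first dataset, and
# --restore --persist once database files already exist in the directory.
persist_flags() {
  if [ -n "$(ls -A "$1" 2>/dev/null)" ]; then
    echo "--restore --persist"
  else
    echo "--persist"
  fi
}

DB_DIR=${DB_DIR:-/tmp/goaccess-db}   # hypothetical directory
mkdir -p "$DB_DIR"
flags=$(persist_flags "$DB_DIR")
echo "goaccess access.log $flags --db-path=$DB_DIR"
```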
1227

NOTES

1229       Each active panel has a total of 366 items or 50 in the real-time  HTML
1230       report. The number of items is customizable using max-items. Note that
1231       HTML, CSV and JSON output allow a maximum number greater than the
1232       default value of 366 items per panel.
1233
1234       A  hit  is  a  request (line in the access log), e.g., 10 requests = 10
1235       hits. HTTP requests with the same IP, date, and user agent are  consid‐
1236       ered a unique visit.
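
       The difference between hits and unique visits can be illustrated
       with a short awk sketch over a combined-format log. It only
       approximates the definition above (same IP, date, and user agent);
       GoAccess's own counting may differ in details such as crawler
       handling. The sample entries and the function name count_visitors
       are hypothetical.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [05/Jun/2016:16:59:01 +0000] "GET / HTTP/1.1" 200 100 "-" "Firefox"
10.0.0.1 - - [05/Jun/2016:17:03:10 +0000] "GET /about HTTP/1.1" 200 100 "-" "Firefox"
10.0.0.1 - - [05/Jun/2016:17:05:00 +0000] "GET / HTTP/1.1" 200 100 "-" "Chrome"
EOF

# Count every line as a hit, and each distinct (IP, date, user agent)
# triple as one unique visitor.
count_visitors() {
  awk -F'"' '{
    split($1, a, " ")             # a[1] = client IP, a[4] = "[dd/Mon/yyyy:..."
    day = substr(a[4], 2, 11)     # dd/Mon/yyyy
    hits++
    key = a[1] SUBSEP day SUBSEP $6   # $6 = user agent
    if (!(key in seen)) { seen[key] = 1; visitors++ }
  } END { printf "hits=%d visitors=%d\n", hits, visitors }' "$@"
}

count_visitors "$log"
# prints: hits=3 visitors=2
```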
1237
1238       If  you want to enable dual-stack support, please use --addr=:: instead
1239       of the default --addr=0.0.0.0.
1240
1241       The generated report will attempt to reconnect to the WebSocket  server
1242       after  1 second with exponential backoff. It will attempt to connect 20
1243       times.
1244

BUGS

1246       If you think you have found a bug, please send me an email to
1247       goaccess@prosoftcorp.com or use the issue tracker at
1248       https://github.com/allinurl/goaccess/issues
1249

AUTHOR

1251       Gerardo Orellana <hello@goaccess.io>. For more details about it, or new
1252       releases, please visit https://goaccess.io
1253
1254
1255
1256GNU+Linux                       SEPTEMBER 2023                     goaccess(1)