goaccess(1)                      User Manuals                      goaccess(1)

NAME

6       goaccess - fast web log analyzer and interactive viewer.
7

SYNOPSIS

9       goaccess [filename] [options...] [-c][-M][-H][-q][-d][...]
10

DESCRIPTION

       GoAccess is an open source real-time web log analyzer and interactive
       viewer that runs in a terminal on *nix systems or through your
       browser.
15
16       It provides fast and valuable HTTP statistics for system administrators
17       that require a visual server report on the fly.
18
       GoAccess parses the specified web log file and outputs the data to the
       terminal. Features include:
21
22
23       General Statistics:
              This panel gives a summary of several metrics, such as the
              number of valid and invalid requests, time taken to analyze the
              dataset, unique visitors, requested files, static files (CSS,
              ICO, JPG, etc.), HTTP referrers, 404s, size of the parsed log
              file, and bandwidth consumption.
29
30       Unique visitors
31              This panel shows metrics such as hits, unique visitors and cumu‐
32              lative bandwidth per date. HTTP requests containing the same IP,
33              the  same  date, and the same user agent are considered a unique
34              visitor. By default, it includes web crawlers/spiders.
35
36              Optionally, date specificity can be set to the hour level  using
37              --date-spec=hr  which will display dates such as 05/Jun/2016:16,
38              or to the minute  level  producing  05/Jun/2016:16:59.  This  is
39              great  if  you  want  to track your daily traffic at the hour or
40              minute level.
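
              The unique-visitor rule above can be illustrated with a short
              Python sketch. It is not GoAccess's implementation; the sample
              requests and the function name are hypothetical and only show
              how the (IP, date, user agent) combination acts as the key.

```python
def count_unique_visitors(requests):
    """Count distinct (ip, date, user_agent) combinations."""
    return len({(ip, date, agent) for ip, date, agent in requests})


# Hypothetical requests as (ip, date, user_agent) tuples.
hits = [
    ("10.0.0.1", "05/Jun/2016", "Mozilla/5.0"),
    ("10.0.0.1", "05/Jun/2016", "Mozilla/5.0"),  # same visitor again
    ("10.0.0.1", "05/Jun/2016", "curl/7.68.0"),  # same IP, different agent
    ("10.0.0.1", "06/Jun/2016", "Mozilla/5.0"),  # same IP and agent, next day
]
unique = count_unique_visitors(hits)
print(unique)
```

              Four hits collapse into three visitors here because only the
              first two share all three fields.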
41
42       Requested files
43              This panel displays the most  requested  (non-static)  files  on
44              your  web  server.  It shows hits, unique visitors, and percent‐
45              age, along with the cumulative bandwidth, protocol, and the  re‐
46              quest method used.
47
48       Requested static files
              Lists the most frequently requested static files, such as JPG,
              CSS, SWF, JS, GIF, and PNG file types, along with the same
              metrics as the previous panel. Additional static files can be
              added to the configuration file.
53
54       404 or Not Found
              Displays the same metrics as the previous request panels;
              however, its data contains all pages that were not found on the
              server, commonly identified by the 404 status code.
58
59       Hosts  This panel has detailed information  on  the  hosts  themselves.
60              This  is  great for spotting aggressive crawlers and identifying
61              who's eating your bandwidth.
62
63              Expanding the panel can display more information such as  host's
64              reverse DNS lookup result, country of origin and city. If the -a
65              argument is enabled, a list of user agents can be  displayed  by
66              selecting the desired IP address, and then pressing ENTER.
67
68       Operating Systems
69              This panel will report which operating system the host used when
70              it hit the server. It attempts to provide the most specific ver‐
71              sion of each operating system.
72
73       Browsers
74              This  panel  will report which browser the host used when it hit
75              the server. It attempts to provide the most specific version  of
76              each browser.
77
78       Visit Times
79              This  panel  will display an hourly report. This option displays
80              24 data points, one for each hour of the day.
81
              Optionally, hour specificity can be set to the tenth of an hour
              level using --hour-spec=min, which will display hours as 16:4.
              This is great if you want to spot peaks of traffic on your
              server.
86
87       Virtual Hosts
88              This  panel  will display all the different virtual hosts parsed
89              from the access log. This panel  is  displayed  if  %v  is  used
90              within the log-format string.
91
92       Referrers URLs
              If the host in question accessed the site via another resource,
              or was linked/diverted to you from another host, the URL they
              were referred from will be provided in this panel. This panel is
              disabled by default. See `--ignore-panel` in your configuration
              file to enable it.
98
99       Referring Sites
              This panel displays only the host part of the URL where the
              request came from, not the whole URL.
102
103       Keyphrases
              It reports keyphrases used on Google search, Google cache, and
              Google translate that have led to your web server. At present,
              it only supports Google search queries via HTTP. This panel is
              disabled by default. See `--ignore-panel` in your configuration
              file to enable it.
109
110       Geo Location
111              Determines where an IP address is geographically  located.  Sta‐
112              tistics are broken down by continent and country. It needs to be
113              compiled with GeoLocation support.
114
115       HTTP Status Codes
              The numeric status codes returned in response to HTTP requests.
117
118       Remote User (HTTP authentication)
119              This is the userid of the person requesting the document as  de‐
120              termined by HTTP authentication. If the document is not password
121              protected, this part will be "-" just  like  the  previous  one.
122              This panel is not enabled unless %e is given within the log-for‐
123              mat variable.
124
125       Cache Status
126              If you are using caching on your server, you may be at the point
127              where  you  want  to  know  if  your request is being cached and
128              served from the cache. This panel shows the cache status of  the
129              object the server served. This panel is not enabled unless %C is
              given within the log-format variable. The status can be either
              `MISS`, `BYPASS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED`
              or `HIT`.
133
134       MIME Types
135              This  panel specifies Media Types (formerly known as MIME types)
136              and Media Subtypes which will be assigned and listed underneath.
137              This panel is not enabled unless %M is given within the log-for‐
138              mat   variable.   See    https://www.iana.org/assignments/media-
139              types/media-types.xhtml for more details.
140
141       Encryption Settings
              This panel shows the SSL/TLS protocol used along with the cipher
              suites. This panel is not enabled unless %K is given within the
              log-format variable.
145
146
147       NOTE:  Optionally and if configured, all panels can display the average
148       time taken to serve the request.
149
150

STORAGE

152       There are three storage options that can be used with GoAccess.  Choos‐
153       ing one will depend on your environment and needs.
154
155       Default Hash Tables
              In-memory storage provides better performance at the cost of
              limiting the dataset size to the amount of available physical
              memory. GoAccess uses in-memory hash tables, which offer very
              good memory usage and good performance. This storage supports
              on-disk persistence.
161

CONFIGURATION

163       Multiple  options can be used to configure GoAccess. For a complete up-
164       to-date list of configure options, run ./configure --help
165
166       --enable-debug
167              Compile with debugging symbols and turn off  compiler  optimiza‐
168              tions.
169
170       --enable-utf8
171              Compile with wide character support. Ncursesw is required.
172
173       --enable-geoip=<legacy|mmdb>
174              Compile  with  GeoLocation support. MaxMind's GeoIP is required.
175              legacy will utilize the original  GeoIP  databases.   mmdb  will
176              utilize the enhanced GeoIP2 databases.
177
178       --with-getline
179              Dynamically  expands line buffer in order to parse full line re‐
180              quests instead of using a fixed size buffer of 4096.
181
182       --with-openssl
183              Compile GoAccess with OpenSSL support for its WebSocket server.
184

OPTIONS

186       The following options can be supplied to the command  or  specified  in
187       the  configuration  file.  If specified in the configuration file, long
188       options need to be used without prepending --  and  without  using  the
189       equal sign =.
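
       The rule above (drop the leading -- and replace = with a space) can be
       sketched in Python. The helper name is hypothetical and the sketch
       only handles long options that carry a value; it is an illustration,
       not part of GoAccess.

```python
def cli_to_config_line(option):
    """Turn '--name=value' into the config-file form 'name value'."""
    if not option.startswith("--") or "=" not in option:
        raise ValueError(f"expected --name=value, got {option!r}")
    # Split on the first '=' only, so values may themselves contain '='.
    name, value = option[2:].split("=", 1)
    return f"{name} {value}"


lines = [
    cli_to_config_line(opt)
    for opt in ("--log-format=COMBINED", "--exclude-ip=127.0.0.1")
]
print("\n".join(lines))
```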
190
191   LOG/DATE/TIME FORMAT
192       --time-format=<timeformat>
              The time-format variable, followed by a space, specifies the
              log format time, containing either the name of a predefined
              format (see options below) or any combination of regular
              characters and special format specifiers.

              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %T or %H:%M:%S.

              Note that if a timestamp is given in microseconds, %f must be
              used as time-format. If the timestamp is given in milliseconds,
              %* must be used as time-format.
204
205       --date-format=<dateformat>
              The date-format variable, followed by a space, specifies the
              log format date, containing either the name of a predefined
              format (see options below) or any combination of regular
              characters and special format specifiers.

              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %Y-%m-%d.

              Note that if a timestamp is given in microseconds, %f must be
              used as date-format. If the timestamp is given in milliseconds,
              %* must be used as date-format.
217
218       --datetime-format=<date_time_format>
              The date and time format combines the two variables into a
              single option. This gives the ability to get the timezone from
              a request and convert it to another timezone for output. See
              --tz=<timezone>.
223
224              They all begin with a percentage (%) sign. See  `man  strftime`.
225              e.g., %d/%b/%Y:%H:%M:%S %z.
226
227              Note that if --datetime-format is used, %x must be passed in the
228              log-format variable to represent the date and time field.
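
              As a rough illustration of what the example format describes,
              the Python sketch below parses a timestamp with the same
              strftime-style specifiers and converts it to UTC. The sample
              timestamp is hypothetical, and Python's parser only stands in
              for GoAccess's own.

```python
from datetime import datetime, timezone

DATETIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # the example format from the text

# Hypothetical log timestamp, two hours ahead of UTC.
stamp = datetime.strptime("05/Jun/2016:16:59:00 +0200", DATETIME_FORMAT)

# Convert the parsed, timezone-aware value to another timezone (UTC here).
as_utc = stamp.astimezone(timezone.utc)
print(as_utc.strftime(DATETIME_FORMAT))
```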
229
230       --log-format=<logformat>
231              The log-format variable followed by a space or \t for tab-delim‐
232              ited, specifies the log format string.
233
234              Note  that  if  there  are  spaces within the format, the string
235              needs to be enclosed in single/double quotes. Inner quotes  need
236              to be escaped.
237
238              In  addition  to  specifying  the raw log/date/time formats, for
239              simplicity, any of the following predefined log format names can
240              be  supplied to the log/date/time-format variables. GoAccess can
241              also handle one predefined name in one variable and another pre‐
242              defined name in another variable.
243
244                COMBINED     - Combined Log Format,
245                VCOMBINED    - Combined Log Format with Virtual Host,
246                COMMON       - Common Log Format,
247                VCOMMON      - Common Log Format with Virtual Host,
248                W3C          - W3C Extended Log File Format,
249                SQUID        - Native Squid Log Format,
250                CLOUDFRONT   - Amazon CloudFront Web Distribution,
251                CLOUDSTORAGE - Google Cloud Storage,
252                AWSELB       - Amazon Elastic Load Balancing,
253                AWSS3        - Amazon Simple Storage Service (S3)
254                AWSALB       - Amazon Application Load Balancer
255                CADDY        - Caddy's JSON Structured format
256
              Note: Piping data into GoAccess won't prompt a log/date/time
              configuration dialog; you will need to define it beforehand in
              your configuration file or on the command line.
260
261   USER INTERFACE OPTIONS
262       -c --config-dialog
263              Prompt log/time/date configuration window on program start. Only
264              when curses is initialized.
265
266       -i --hl-header
267              Color highlight active terminal panel.
268
269       -m --with-mouse
270              Enable mouse support on main terminal dashboard.
271
       --color=<fg:bg[attrs, PANEL]>
273              Specify custom colors for the terminal output.
274
275              Color Syntax
276                DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
277
278               FG# = foreground color [-1...255] (-1 = default term color)
279               BG# = background color [-1...255] (-1 = default term color)
280
281              Optionally, it is possible to apply color  attributes  (multiple
282              attributes  are comma separated), such as: bold, underline, nor‐
283              mal, reverse, blink
284
285              If desired, it is possible to apply  custom  colors  per  panel,
286              that is, a metric in the REQUESTS panel can be of color A, while
287              the same metric in the BROWSERS panel can be of color B.
288
289              Available color definitions:
290                COLOR_MTRC_HITS
291                COLOR_MTRC_VISITORS
292                COLOR_MTRC_DATA
293                COLOR_MTRC_BW
294                COLOR_MTRC_AVGTS
295                COLOR_MTRC_CUMTS
296                COLOR_MTRC_MAXTS
297                COLOR_MTRC_PROT
298                COLOR_MTRC_MTHD
299                COLOR_MTRC_HITS_PERC
300                COLOR_MTRC_HITS_PERC_MAX
301                COLOR_MTRC_VISITORS_PERC
302                COLOR_MTRC_VISITORS_PERC_MAX
303                COLOR_PANEL_COLS
304                COLOR_BARS
305                COLOR_ERROR
306                COLOR_SELECTED
307                COLOR_PANEL_ACTIVE
308                COLOR_PANEL_HEADER
309                COLOR_PANEL_DESC
310                COLOR_OVERALL_LBLS
311                COLOR_OVERALL_VALS
312                COLOR_OVERALL_PATH
313                COLOR_ACTIVE_LABEL
314                COLOR_BG
315                COLOR_DEFAULT
316                COLOR_PROGRESS
317
318              See configuration file for a sample color scheme.
319
320       --color-scheme=<1|2|3>
321              Choose among color schemes.  1 for the default grey  scheme.   2
322              for  the  green scheme.  3 for the Monokai scheme (shown only if
323              terminal supports 256 colors).
324
325       --crawlers-only
326              Parse and display only crawlers (bots).
327
328       --html-custom-css=<path/custom.css>
329              Specifies a custom CSS file path to load in the HTML report.
330
331       --html-custom-js=<path/custom.js>
332              Specifies a custom JS file path to load in the HTML report.
333
334       --html-report-title=<title>
335              Set HTML report page title and header.
336
337       --html-refresh=<secs>
338              Refresh the HTML report every X seconds. The value has to be be‐
339              tween  1  and 60 seconds. The default is set to refresh the HTML
340              report every 1 second.
341
342       --html-prefs=<JSON>
343              Set HTML report default preferences. Supply a valid JSON  object
344              containing  the  HTML preferences. It allows the ability to cus‐
345              tomize each panel plot. See example below.
346
347              Note: The JSON object passed needs to be a one line JSON string.
348              For instance,
349
350              --html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
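
              One way to produce such a one-line JSON string is to serialize
              the preferences programmatically. The Python sketch below is
              only an illustration; it reuses the keys from the example
              above and assumes a POSIX shell for the quoting.

```python
import json
import shlex

# Preferences copied from the example above.
prefs = {
    "theme": "bright",
    "perPage": 5,
    "layout": "horizontal",
    "showTables": True,
    "visitors": {"plot": {"chartType": "bar"}},
}

# Compact separators keep the object on a single line with no spaces.
one_line = json.dumps(prefs, separators=(",", ":"))

# Quote the JSON so a POSIX shell passes it through as one argument.
argument = "--html-prefs=" + shlex.quote(one_line)
print(argument)
```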
351
352       --json-pretty-print
353              Format JSON output using tabs and newlines.
354
              Note: This is not recommended when outputting a real-time HTML
              report since the WebSocket payload will be much larger.
357
358       --max-items=<number>
359              The maximum number of items to display per  panel.  The  maximum
360              can be a number between 1 and n.
361
362              Note:  Only  the  CSV  and  JSON  output  allow a maximum number
363              greater than the default value of 366 (or 50  in  the  real-time
364              HTML output) items per panel.
365
366       --no-color
367              Turn off colored output. This is the default output on terminals
368              that do not support colors.
369
370       --no-column-names
371              Don't write column names in the terminal output. By default,  it
372              displays column names for each available metric in every panel.
373
374       --no-csv-summary
375              Disable summary metrics on the CSV output.
376
377       --no-progress
378              Disable progress metrics [total requests/requests per second].
379
380       --no-tab-scroll
381              Disable  scrolling  through panels when TAB is pressed or when a
382              panel is selected using a numeric key.
383
384       --no-html-last-updated
385              Do not show the last updated field displayed in the HTML  gener‐
386              ated report.
387
388       --no-parsing-spinner
              Do not show the progress metrics and parsing spinner.
390
391       --tz=<timezone>
              Outputs the report date/time data in the given timezone. Note
              that it uses the canonical timezone name, e.g., Europe/Berlin,
              America/Chicago, or Africa/Cairo. If an invalid timezone name
              is given, the output will be in GMT. See --datetime-format in
              order to properly specify a timezone in the date/time format.
397
398   SERVER OPTIONS
       Note: This is just a WebSocket server to provide the raw real-time
       data. It is not a web server itself. To access the report's HTML
       file, you will still need your own HTTP server: place the generated
       report in its document root directory and open the HTML file in your
       browser. The browser will then open another WebSocket connection to
       the WebSocket server you may set up here, to keep the dashboard up to
       date.
405
406       --addr Specify IP address to bind the server to. Otherwise it binds  to
407              0.0.0.0.
408
409              Usually  there is no need to specify the address, unless you in‐
410              tentionally would like to bind the server to a different address
411              within your server.
412
413       --daemonize
414              Run GoAccess as daemon (only if --real-time-html enabled).
415
416              Note:  It's important to make use of absolute paths across GoAc‐
417              cess' configuration.
418
419       --user-name=<username>
420              Run GoAccess as the specified user.
421
              Note: It's important to ensure the user or the user's group can
              access the input and output files as well as any other files
              needed. Other groups the user belongs to will be ignored. As
              such, it's advised to run GoAccess behind an SSL proxy as it's
              unlikely this user can access the SSL certificates.
427
428       --origin=<url>
429              Ensure clients send the specified origin header  upon  the  Web‐
430              Socket handshake.
431
432       --pid-file=<path/goaccess.pid>
              Write the daemon PID to a file when used along with the
              --daemonize option.
435
436       --port=<port>
437              Specify the port to use. By default GoAccess'  WebSocket  server
438              listens on port 7890.
439
440       --real-time-html
441              Enable real-time HTML output.
442
              GoAccess uses its own WebSocket server to push the data from the
              server to the client. See http://gwsocket.io for more details
              on how the WebSocket server works.
446
447       --ws-url=<[scheme://]url[:port]>
448              URL to which the WebSocket server responds. This is the URL sup‐
449              plied to the WebSocket constructor on the client side.
450
451              Optionally, it is possible to specify the WebSocket URI  scheme,
452              such  as  ws://  or wss:// for unencrypted and encrypted connec‐
453              tions. e.g., wss://goaccess.io
454
455              If GoAccess is running behind a proxy, you could set the  client
456              side  to connect to a different port by specifying the host fol‐
457              lowed by a colon and the port.  e.g., goaccess.io:9999
458
459              By default, it will attempt to connect to the generated report's
460              hostname. If GoAccess is running on a remote server, the host of
461              the remote server should be specified here. Also, make  sure  it
462              is a valid host and NOT an http address.
463
464       --ping-interval=<secs>
465              Enable  WebSocket  ping with specified interval in seconds. This
466              helps prevent idle connections getting disconnected.
467
468       --fifo-in=<path/file>
              Creates a named pipe (FIFO) that reads from the given
              path/file.
471
472       --fifo-out=<path/file>
473              Creates a named pipe (FIFO) that writes to the given path/file.
474
475       --ssl-cert=<cert.crt>
476              Path to TLS/SSL certificate. In order to enable TLS/SSL support,
477              GoAccess requires that --ssl-cert and --ssl-key are used.
478
479              Only if configured using --with-openssl
480
481       --ssl-key=<priv.key>
482              Path to TLS/SSL private key. In order to enable TLS/SSL support,
483              GoAccess requires that --ssl-cert and --ssl-key are used.
484
485              Only if configured using --with-openssl
486
487   FILE OPTIONS
488       -      The log file to parse is read from stdin.
489
490       -f --log-file=<logfile>
491              Specify  the  path  to  the input log file. If set in the config
492              file, it will take priority over -f from the command line.
493
494       -S --log-size=<bytes>
495              Specify the log size in bytes. This is  useful  when  piping  in
496              logs for processing in which the log size can be explicitly set.
497
498       -l --debug-file=<debugfile>
499              Send all debug messages to the specified file.
500
501       -p --config-file=<configfile>
502              Specify a custom configuration file to use. If set, it will take
503              priority over the global configuration file (if any).
504
505       --invalid-requests=<filename>
506              Log invalid requests to the specified file.
507
508       --unknowns-log=<filename>
509              Log unknown browsers and OSs to the specified file.
510
511       --no-global-config
              Do not load the global configuration file. The global
              configuration file is normally located in /usr/local/etc,
              unless specified with --sysconfdir=/dir. See the --dcf option
              for finding the default configuration file.
516
517   PARSE OPTIONS
518       -a --agent-list
519              Enable a list of user-agents by host. For faster parsing, do not
520              enable this flag.
521
522       -d --with-output-resolver
523              Enable IP resolver on HTML|JSON output.
524
525       -e --exclude-ip=<IP|IP-range>
              Exclude an IPv4 or IPv6 address from being counted. Ranges can
              be included as well using a dash in between the IPs
              (start-end).
528
529              Examples:
530                exclude-ip 127.0.0.1
531                exclude-ip 192.168.0.1-192.168.0.100
532                exclude-ip ::1
533                exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
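
              The start-end range semantics can be sketched with Python's
              ipaddress module. This illustrates the documented behavior
              under the assumption that ranges are inclusive; the function
              name is hypothetical and this is not GoAccess's own matcher.

```python
import ipaddress


def is_excluded(ip, spec):
    """Return True if ip matches a single address or a start-end range."""
    addr = ipaddress.ip_address(ip)
    # IP literals never contain '-', so the dash can only be the separator.
    start, _, end = spec.partition("-")
    low = ipaddress.ip_address(start)
    high = ipaddress.ip_address(end) if end else low
    if not (addr.version == low.version == high.version):
        return False  # never compare IPv4 against IPv6
    return low <= addr <= high  # assumed inclusive on both ends


print(is_excluded("192.168.0.50", "192.168.0.1-192.168.0.100"))
print(is_excluded("192.168.0.101", "192.168.0.1-192.168.0.100"))
```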
534
535       -H --http-protocol=<yes|no>
536              Set/unset  HTTP request protocol. This will create a request key
537              containing the request protocol + the actual request.
538
539       -M --http-method=<yes|no>
540              Set/unset HTTP request method. This will create  a  request  key
541              containing the request method + the actual request.
542
543       -o --output=<path/file.[json|csv|html]>
              Write output to the given file. The file extension selects the
              output format:
546
547                /path/file.csv - Comma-separated values (CSV)
548                /path/file.json - JSON (JavaScript Object Notation)
549                /path/file.html - HTML
550
551       -q --no-query-string
              Ignore the request's query string, e.g.,
              www.google.com/page.htm?query => www.google.com/page.htm.
554
555              Note: Removing the query string can greatly decrease memory con‐
556              sumption, especially on timestamped requests.
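
              A minimal Python sketch of the idea, with a hypothetical helper
              name and sample requests: everything from the first ? onward is
              dropped, so requests that differ only in their query string
              collapse into a single key.

```python
def strip_query_string(request):
    """Drop everything from the first '?' onward."""
    return request.split("?", 1)[0]


# Hypothetical timestamped requests that differ only in the query string.
requests = [
    "www.google.com/page.htm?query",
    "www.google.com/page.htm?ts=1465142340",
    "www.google.com/page.htm?ts=1465142341",
]
keys = {strip_query_string(r) for r in requests}
print(keys)
```

              Three distinct request strings become one stored key, which is
              where the memory saving comes from.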
557
558       -r --no-term-resolver
559              Disable IP resolver on terminal output.
560
561       --444-as-404
562              Treat non-standard status code 444 as 404.
563
564       --4xx-to-unique-count
565              Add 4xx client errors to the unique visitors count.
566
567       --anonymize-ip
              Anonymize the client IP address. The IP anonymization option
              sets the last octet of IPv4 user IP addresses and the last 64
              bits of IPv6 addresses to zeros.
              e.g., 192.168.20.100 => 192.168.20.0
              e.g., 2a03:2880:2110:df07:face:b00c::1 => 2a03:2880:2110:df07::
573
574       --anonymize-level
575              Specifies the anonymization levels: 1 => default, 2 => strong, 3
576              => pedantic.
577
              ┌────────────┬─────────┬─────────┬─────────┐
              │Bits-hidden │ Level 1 │ Level 2 │ Level 3 │
              ├────────────┼─────────┼─────────┼─────────┤
              │IPv4        │ 8       │ 16      │ 24      │
              ├────────────┼─────────┼─────────┼─────────┤
              │IPv6        │ 64      │ 80      │ 96      │
              └────────────┴─────────┴─────────┴─────────┘
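
              The bit counts in the table can be applied with Python's
              ipaddress module, as in the sketch below. It assumes the hidden
              bits are the trailing (rightmost) bits, which matches the
              --anonymize-ip examples; the function name is hypothetical and
              this is not GoAccess's code.

```python
import ipaddress

# Bits hidden per level, copied from the table: level -> (IPv4, IPv6).
HIDDEN_BITS = {1: (8, 64), 2: (16, 80), 3: (24, 96)}


def anonymize(ip, level=1):
    """Zero the trailing bits of an address for the given level."""
    addr = ipaddress.ip_address(ip)
    v4_bits, v6_bits = HIDDEN_BITS[level]
    hidden = v4_bits if addr.version == 4 else v6_bits
    prefix = addr.max_prefixlen - hidden
    # strict=False lets ip_network() mask off the host bits for us.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize("192.168.20.100"))
print(anonymize("2a03:2880:2110:df07:face:b00c::1"))
```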
585
586       --all-static-files
587              Include   static  files  that  contain  a  query  string.  e.g.,
588              /fonts/fontawesome-webfont.woff?v=4.0.3
589
590       --browsers-file=<path>
              By default GoAccess parses an "essential/basic" curated list of
              browsers & crawlers. If you need to add additional browsers, use
              this option. Include an additional delimited list of
              browsers/crawlers/feeds etc. See config/browsers.list for an
              example or
              https://raw.githubusercontent.com/allinurl/goaccess/master/config/browsers.list
597
598       --date-spec=<date|hr|min>
599              Set the date specificity to either date (default), hr to display
600              hours or min to display minutes appended to the date.
601
              This is used in the visitors panel. It's useful for tracking
              visitors at the hour level. For instance, an hour specificity
              would display traffic as 18/Dec/2010:19, and a minute
              specificity as 18/Dec/2010:19:59.
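
              The three specificity levels can be pictured as three strftime
              patterns. The Python sketch below reproduces the example keys
              above; the mapping and function name are hypothetical.

```python
from datetime import datetime

# One strftime pattern per --date-spec value (illustrative mapping).
SPEC_FORMATS = {
    "date": "%d/%b/%Y",
    "hr": "%d/%b/%Y:%H",
    "min": "%d/%b/%Y:%H:%M",
}


def visitors_key(when, spec="date"):
    """Format a timestamp at the requested date specificity."""
    return when.strftime(SPEC_FORMATS[spec])


moment = datetime(2010, 12, 18, 19, 59, 30)
for spec in ("date", "hr", "min"):
    print(spec, visitors_key(moment, spec))
```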
606
607       --double-decode
              Decode double-encoded values. This includes user-agent,
              request, and referrer.
610
611       --enable-panel=<PANEL>
612              Enable parsing and displaying the given panel.
613
614              Available panels:
615                VISITORS
616                REQUESTS
617                REQUESTS_STATIC
618                NOT_FOUND
619                HOSTS
620                OS
621                BROWSERS
622                VISIT_TIMES
623                VIRTUAL_HOSTS
624                REFERRERS
625                REFERRING_SITES
626                KEYPHRASES
627                STATUS_CODES
628                REMOTE_USER
629                CACHE_STATUS
630                GEO_LOCATION
631                MIME_TYPE
632                TLS_TYPE
633
634       --hide-referrer=<NEEDLE>
              Hide a referrer but still count it. Wildcards are allowed in
              the needle, e.g., *.bing.com.
637
638       --hour-spec=<hr|min>
639              Set the time specificity to either hour (default) or min to dis‐
640              play the tenth of an hour appended to the hour.
641
642              This is used in the time distribution  panel.  It's  useful  for
643              tracking peaks of traffic on your server at specific times.
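
              A small Python sketch of the tenth-of-an-hour bucketing,
              assuming the bucket is the minute divided by ten (so 16:47
              falls into 16:4, as in the Visit Times description). The
              function name and the zero-padded hour label are assumptions.

```python
def visit_time_key(hour, minute, spec="hr"):
    """Return the time bucket for a request at hour:minute."""
    if spec == "min":
        # Tenth of an hour: minutes 40-49 all land in bucket 4.
        return f"{hour:02d}:{minute // 10}"
    return f"{hour:02d}"


print(visit_time_key(16, 47, "min"))
print(visit_time_key(16, 47))
```

              With hour specificity a full day yields 24 buckets; with the
              tenth-of-an-hour specificity it yields 144.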
644
645       --ignore-crawlers
646              Ignore crawlers from being counted.
647
648       --ignore-panel=<PANEL>
649              Ignore parsing and displaying the given panel.
650
651              Available panels:
652                VISITORS
653                REQUESTS
654                REQUESTS_STATIC
655                NOT_FOUND
656                HOSTS
657                OS
658                BROWSERS
659                VISIT_TIMES
660                VIRTUAL_HOSTS
661                REFERRERS
662                REFERRING_SITES
663                KEYPHRASES
664                STATUS_CODES
665                REMOTE_USER
666                CACHE_STATUS
667                GEO_LOCATION
668                MIME_TYPE
669                TLS_TYPE
670
671       --ignore-referrer=<referrer>
              Ignore referrers from being counted. Wildcards allowed. e.g.,
              *.domain.com ww?.domain.*
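
              To get a feel for the two wildcard patterns above, the Python
              sketch below uses fnmatch as a stand-in matcher, assuming *
              matches any run of characters and ? matches exactly one.
              GoAccess's own matching may differ in details (fnmatch also
              understands [seq] classes, for instance).

```python
from fnmatch import fnmatchcase

# Patterns copied from the example above.
IGNORED = ["*.domain.com", "ww?.domain.*"]


def is_ignored(referrer_host):
    """Return True if the host matches any ignored pattern."""
    return any(fnmatchcase(referrer_host, pattern) for pattern in IGNORED)


# Hypothetical referrer hosts.
for host in ("blog.domain.com", "ww2.domain.org", "example.com"):
    print(host, is_ignored(host))
```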
674
675       --ignore-statics=<req|panel>
676              Ignore static file requests.
677
              req
                Only ignore requests from valid requests.

              panel
                Ignore requests from panels.

                Note that it will count them towards the total number of
                requests.
686
687       --ignore-status=<CODE>
688              Ignore  parsing  and  displaying one or multiple status code(s).
689              For multiple status codes, use this option multiple times.
690
691       --keep-last=<num_days>
692              Keep the last specified number of days in storage. This will re‐
693              cycle  the  storage  tables.  e.g.,  keep & show only the last 7
694              days.
695
696       --no-ip-validation
              Disable client IP validation. Useful if IP addresses have been
              obfuscated before being logged. The log still needs to contain
              a placeholder for %h, usually a resolved IP, e.g.,
              ord37s19-in-f14.1e100.net.
701
702       --no-strict-status
703              Disable  HTTP  status code validation. Some servers would record
704              this value only if a connection was established  to  the  target
705              and the target sent a response.  Otherwise, it could be recorded
706              as -.
707
708       --num-tests=<number>
709              Number of lines from the access log to test against the provided
710              log/date/time  format.  By default, the parser is set to test 10
711              lines. If set to 0, the parser won't test  any  lines  and  will
712              parse  the  whole  access  log.  If  a  line  matches  the given
713              log/date/time format before it reaches <number>, the parser will
714              consider  the  log  to  be valid, otherwise GoAccess will return
715              EXIT_FAILURE and display the relevant error messages.
716
717       --process-and-exit
718              Parse log and exit without outputting data.  Useful  if  we  are
719              looking  to  only  add  new data to the on-disk database without
720              outputting to a file or a terminal.
721
722       --real-os
              Display real OS names, e.g., Windows XP, Snow Leopard.
724
725       --sort-panel=<PANEL,FIELD,ORDER>
              Sort panel on initial load. Sort options are separated by commas.
727              Options are in the form: PANEL,METRIC,ORDER
728
729              Available metrics:
730                BY_HITS     - Sort by hits
731                BY_VISITORS - Sort by unique visitors
732                BY_DATA     - Sort by data
733                BY_BW       - Sort by bandwidth
734                BY_AVGTS    - Sort by average time served
735                BY_CUMTS    - Sort by cumulative time served
736                BY_MAXTS    - Sort by maximum time served
737                BY_PROT     - Sort by http protocol
738                BY_MTHD     - Sort by http method
739
740              Available orders:
741                ASC
742                DESC
743
744       --static-file=<extension>
              Add static file extension, e.g., .mp3. Extensions are case
              sensitive.
747
748   GEOLOCATION OPTIONS
749       -g --std-geoip
750              Standard GeoIP database for less memory usage.
751
752       --geoip-database=<geofile>
              Specify path to GeoIP database file, e.g., GeoLiteCity.dat.

              If using GeoIP2, you will need to download the GeoLite2 City or
              Country database from MaxMind.com and use the option
              --geoip-database to specify the database. Updated database files
              for GeoIP legacy are available as GeoLite Legacy Databases from
              MaxMind.com. IPv4 and IPv6 files are supported as well. For
              updated DB URLs, please see the default GoAccess configuration
              file.
762
763              Note: --geoip-city-data is an alias of --geoip-database.
764
765   OTHER OPTIONS
766       -h --help
              Display the help message.
768
769       -s --storage
              Display current storage method, e.g., B+ Tree, Hash.
771
772       -V --version
773              Display version information and exit.
774
775       --dcf  Display the path of the default config file  when  `-p`  is  not
776              used.
777
778   PERSISTENCE STORAGE OPTIONS
779       --persist
              Persist parsed data to disk. If database files exist, they will
              be overwritten. This should be set on the first dataset. See
              examples below.
783
784       --restore
785              Load previously stored data from disk. If reading persisted data
786              only, the database files need to exist. See --persist and  exam‐
787              ples below.
788
789       --db-path=<dir>
790              Path  where  the  on-disk database files are stored. The default
791              value is the /tmp directory.
792
793

CUSTOM LOG/DATE FORMAT

795       GoAccess can parse virtually any web log format.
796
       Predefined options include Common Log Format (CLF), Combined Log Format
       (XLF/ELF), including virtual host, Amazon CloudFront (Download
       Distribution), Google Cloud Storage and W3C format (IIS).
800
801       GoAccess allows any custom format string as well.
802
803       There are two ways to configure the log format.  The easiest is to  run
804       GoAccess with -c to prompt a configuration window. Otherwise, it can be
805       configured under ~/.goaccessrc or the %sysconfdir%.
806
807       time-format
              The time-format variable, followed by a space, specifies the log
              format time containing any combination of regular characters and
              special format specifiers. They all begin with a percentage (%)
              sign. See `man strftime`. e.g., %T or %H:%M:%S.
812
813              Note:  If  a timestamp is given in microseconds, %f must be used
814              as time-format or %* if the timestamp is given in milliseconds.
815
816       date-format
              The date-format variable, followed by a space, specifies the log
              format date containing any combination of regular characters and
              special format specifiers. They all begin with a percentage (%)
              sign. See `man strftime`. e.g., %Y-%m-%d.
821
822              Note:  If  a timestamp is given in microseconds, %f must be used
823              as date-format or %* if the timestamp is given in milliseconds.
824
825       log-format
              The log-format variable, followed by a space or \t, specifies
              the log format string.
828
       %x     A date and time field matching the time-format and date-format
              variables. This is used when given a timestamp or when the date
              and time are concatenated as a single string (e.g., 1501647332
              or 20170801235000) instead of being in two separate variables.
834
835       %t     time field matching the time-format variable.
836
837       %d     date field matching the date-format variable.
838
839       %v     The  canonical  Server  Name  of  the server serving the request
840              (Virtual Host).
841
842       %e     This is the userid of the person requesting the document as  de‐
843              termined by HTTP authentication.
844
845       %C     The cache status of the object the server served.
846
847       %h     host (the client IP address, either IPv4 or IPv6)
848
       %r     The request line from the client. This requires specific
              delimiters around the request (such as single quotes, double
              quotes, or anything else) to be parsable. If not, use a
              combination of special format specifiers such as %m %U %H.
853
854       %q     The query string.
855
856       %m     The request method.
857
858       %U     The URL path requested.
859
              Note: If the query string is in %U, there is no need to use %q.
              However, if the URL path does not include any query string, you
              may use %q and the query string will be appended to the request.
863
864       %H     The request protocol.
865
866       %s     The status code that the server sends back to the client.
867
868       %b     The size of the object returned to the client.
869
870       %R     The "Referrer" HTTP request header.
871
872       %u     The user-agent HTTP request header.
873
       %K     The TLS protocol version chosen for the connection. (In Apache
              LogFormat: %{SSL_PROTOCOL}x)

       %k     The TLS cipher suite chosen for the connection. (In Apache
              LogFormat: %{SSL_CIPHER}x)
879
880       %M     The MIME-type of the requested resource. (In  Apache  LogFormat:
881              %{Content-Type}o)
882
883       %D     The  time taken to serve the request, in microseconds as a deci‐
884              mal number.
885
886       %T     The time taken to serve the request, in seconds  with  millisec‐
887              onds resolution.
888
889       %L     The  time taken to serve the request, in milliseconds as a deci‐
890              mal number.
891
892       %^     Ignore this field.
893
894       %~     Move forward through the log string until a non-space (!isspace)
895              char is found.
896
897       ~h     The  host  (the  client IP address, either IPv4 or IPv6) in a X-
898              Forwarded-For (XFF) field.
899
              It uses a special specifier which consists of a tilde before the
              host specifier, followed by the character(s) that delimit the
              XFF field, which are enclosed by curly braces, e.g., "~h{, }".
903
904              For example, "~h{, }" is used in order  to  parse  "11.25.11.53,
905              17.68.33.17"  field  which  is  delimited by a comma and a space
906              (enclosed by double quotes).
907
908
              ┌──────────────────────────────────────┬───────────┐
              │ XFF field                            │ specifier │
              ├──────────────────────────────────────┼───────────┤
              │ "192.1.2.3, 192.68.33.17, 192.1.1.2" │ "~h{, }"  │
              ├──────────────────────────────────────┼───────────┤
              │ "192.1.2.12","192.68.33.17"          │ ~h{", }   │
              ├──────────────────────────────────────┼───────────┤
              │ 192.1.2.12, 192.68.33.17             │ ~h{, }    │
              ├──────────────────────────────────────┼───────────┤
              │ 192.1.2.14 192.68.33.17 192.1.1.2    │ ~h{ }     │
              └──────────────────────────────────────┴───────────┘
920
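       As a rough illustration of what the delimiter set inside the curly
       braces means, the following shell sketch splits a sample XFF value on
       commas and spaces using tr. It only mimics the splitting step and is
       not how GoAccess parses the field internally; the variable names are
       chosen for this example.

```shell
# Sample XFF value delimited by a comma and a space, as handled by ~h{, }.
xff='192.1.2.12, 192.68.33.17'

# Turn every delimiter character (comma or space) into a newline and
# squeeze runs of newlines, leaving one address per line.
hosts=$(printf '%s\n' "$xff" | tr -s ', ' '\n\n')
printf '%s\n' "$hosts"
```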
921
922       Note:  In  order to get the average, cumulative and maximum time served
923       in GoAccess, you will need to start logging response times in your  web
924       server. In Nginx you can add $request_time to your log format, or %D in
925       Apache.
926
927       Important: If multiple time served specifiers  are  used  at  the  same
928       time,  the first option specified in the format string will take prior‐
929       ity over the other specifiers.
930
931       GoAccess requires the following fields:
932
933              %h a valid IPv4/6
934
935              %d a valid date
936
937              %r the request
938
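       To tie the variables above together, here is a minimal sketch of a
       configuration for a log in the combined format. The temporary file
       path and the access.log and report.html names are assumptions for
       this example; the format strings are the ones commonly used for
       combined logs and may need adjusting for your server.

```shell
# Write a minimal configuration file to a temporary location.
conf=$(mktemp)
cat > "$conf" <<'EOF'
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF

# Only run GoAccess if it is installed and the sample log exists.
if command -v goaccess >/dev/null 2>&1 && [ -f access.log ]; then
    goaccess access.log -p "$conf" -o report.html
fi
```

       Here %^ skips the fields that are not needed (the identity fields and
       the timezone offset), while %h, %d and %r cover the required fields.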

INTERACTIVE MENU

940       F1 or h
941              Main help.
942
943       F5     Redraw main window.
944
945       q      Quit the program, current window or collapse active module
946
947       o or ENTER
948              Expand selected module or open window
949
950       0-9 and Shift + 0
951              Set selected module to active
952
953       j      Scroll down within expanded module
954
955       k      Scroll up within expanded module
956
957       c      Set or change scheme color.
958
959       TAB    Forward iteration of modules. Starts from current active module.
960
961       SHIFT + TAB
962              Backward iteration of modules. Starts from current  active  mod‐
963              ule.
964
965       ^f     Scroll forward one screen within an active module.
966
967       ^b     Scroll backward one screen within an active module.
968
969       s      Sort options for active module
970
971       /      Search across all modules (regex allowed)
972
973       n      Find the position of the next occurrence across all modules.
974
975       g      Move to the first item or top of screen.
976
977       G      Move to the last item or bottom of screen.
978

EXAMPLES

       Note: Piping data into GoAccess won't prompt a log/date/time
       configuration dialog. You will need to define the format beforehand in
       your configuration file or on the command line.
983
984
985   DIFFERENT OUTPUTS
986       To output to a terminal and generate an interactive report:
987
988              # goaccess access.log
989
990       To generate an HTML report:
991
992              # goaccess access.log -a -o report.html
993
994       To generate a JSON report:
995
996              # goaccess access.log -a -d -o report.json
997
998       To generate a CSV file:
999
1000              # goaccess access.log --no-csv-summary -o report.csv
1001
1002       GoAccess  also  allows  great  flexibility  for real-time filtering and
1003       parsing. For instance, to quickly diagnose issues  by  monitoring  logs
1004       since goaccess was started:
1005
1006              # tail -f access.log | goaccess -
1007
       Even better, to filter while keeping the pipe open to preserve
       real-time analysis, we can make use of tail -f and a pattern matching
       tool such as grep, awk, sed, etc.:

              # tail -f access.log | grep -i --line-buffered 'firefox' |
                goaccess --log-format=COMBINED -
1014
       or to parse from the beginning of the file while keeping the pipe open
       and applying a filter:
1017
1018              # tail -f -n +0 access.log | grep -i --line-buffered 'firefox' |
1019              goaccess --log-format=COMBINED -o report.html --real-time-html -
1020
1021   MULTIPLE LOG FILES
1022       There are several ways to parse multiple logs with GoAccess.  The  sim‐
1023       plest is to pass multiple log files to the command line:
1024
1025              # goaccess access.log access.log.1
1026
1027       It's  even  possible  to  parse files from a pipe while reading regular
1028       files:
1029
1030              # cat access.log.2 | goaccess access.log access.log.1 -
1031
1032       Note that the single dash is appended to the command line to let  GoAc‐
1033       cess know that it should read from the pipe.
1034
1035       Now  if we want to add more flexibility to GoAccess, we can do a series
1036       of pipes. For instance, if we would like to process all compressed  log
1037       files access.log.*.gz in addition to the current log file, we can do:
1038
1039              # zcat access.log.*.gz | goaccess access.log -
1040
1041       Note: On Mac OS X, use gunzip -c instead of zcat.
1042
1043   REAL TIME HTML OUTPUT
1044       GoAccess  has  the ability to output real-time data in the HTML report.
1045       You can even email the HTML file since it is composed of a single  file
1046       with no external file dependencies, how neat is that!
1047
1048       The  process  of  generating a real-time HTML report is very similar to
1049       the process of creating  a  static  report.  Only  --real-time-html  is
1050       needed to make it real-time.
1051
              # goaccess access.log -o /usr/share/nginx/html/site/report.html \
                --real-time-html
1054
       By default, GoAccess will use the host name of the generated report.
       Optionally, you can specify the URL to which the client's browser will
       connect. See https://goaccess.io/faq for a more detailed example.

              # goaccess access.log -o report.html --real-time-html \
                --ws-url=goaccess.io
1061
       By default, GoAccess listens on port 7890. To use a different port,
       specify it with --port (make sure the port is open):

              # goaccess access.log -o report.html --real-time-html --port=9870
1067
       To bind the WebSocket server to an address other than 0.0.0.0, specify
       it with --addr:

              # goaccess access.log -o report.html --real-time-html \
                --addr=127.0.0.1
1073
1074       Note:  To  output real time data over a TLS/SSL connection, you need to
1075       use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
1076
1077   WORKING WITH DATES
       Another useful pipe would be filtering dates out of the web log.

       The following will get all HTTP requests starting on 05/Dec/2010 until
       the end of the file.

              # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -
1084
       or using relative dates, such as yesterday or one week ago:

              # sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log |
                goaccess -a -
1089
1090       If we want to parse only a certain time-frame from DATE a to DATE b, we
1091       can do:
1092
              # sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
1094
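       The date filtering above can be tried on a small sample before piping
       real data into GoAccess. The sketch below uses a made-up three-line
       log in a temporary file; slashes in the date are escaped because / is
       also the sed address delimiter.

```shell
# Build a tiny sample log (hypothetical entries in the combined format).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Dec/2010:23:59:59 +0000] "GET /a HTTP/1.1" 200 512 "-" "curl/7.0"
10.0.0.2 - - [05/Dec/2010:00:00:01 +0000] "GET /b HTTP/1.1" 200 512 "-" "curl/7.0"
10.0.0.3 - - [06/Dec/2010:08:15:00 +0000] "GET /c HTTP/1.1" 404 128 "-" "curl/7.0"
EOF

# Print everything from the first line matching 05/Dec/2010 to the end.
filtered=$(sed -n '/05\/Dec\/2010/,$ p' "$log")
printf '%s\n' "$filtered"
```

       Only the entries from 05/Dec/2010 onwards are printed; the same
       output could then be piped into goaccess with a trailing dash.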
1095       If we want to preserve only certain amount of data and recycle storage,
1096       we can keep only a certain number of days. For instance to keep &  show
1097       the last 5 days:
1098
1099              # goaccess access.log --keep-last=5
1100
1101   VIRTUAL HOSTS
       Assume your log contains the virtual host (server blocks) field, for
       instance:

              vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
              /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy
              connection)"

       To append the virtual host to the request in order to see which
       virtual host the top URLs belong to:

              # awk '$8=$1$8' access.log | goaccess -a -
1113
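       To see what the awk expression does before feeding the result to
       GoAccess, it can be applied to the sample line on its own. This is a
       small sketch; the variable names are chosen for this example.

```shell
line='vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"'

# Assigning to $8 rebuilds the record; the non-empty result is truthy,
# so awk prints the modified line.
rewritten=$(printf '%s\n' "$line" | awk '$8=$1$8')
printf '%s\n' "$rewritten"

# Field 8 now carries the virtual host in front of the path.
request=$(printf '%s\n' "$rewritten" | awk '{print $8}')
printf '%s\n' "$request"
```

       Because the virtual host becomes part of the request path, requests
       for the same path on different virtual hosts show up as separate
       entries.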
1114       To exclude a list of virtual hosts you can do the following:
1115
              # grep -v "`cat exclude_vhost_list_file`" vhost_access.log |
                goaccess -
1118
1119   FILES & STATUS CODES
1120       To parse specific pages, e.g., page views, html, htm, php, etc.  within
1121       a request:
1122
              # awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -

       Note that $7 is the request field for the common and combined log
       formats (without Virtual Host). If your log includes Virtual Host, you
       probably want to use $8 instead. It's best to check which field you are
       shooting for, e.g.:
1129
1130              # tail -10 access.log | awk '{print $8}'
1131
1132       Or to parse a specific status code, e.g., 500 (Internal Server Error):
1133
1134              # awk '$9~/500/' access.log | goaccess -
1135
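       The field-based filters in this section can be checked against a few
       sample lines first. The sketch below uses a made-up log without a
       virtual host field, so the request path is $7 and the status code is
       $9. Note that an unescaped dot in an awk regex matches any character,
       which is why the dots are escaped here.

```shell
# Hypothetical sample entries in the combined log format.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [05/Dec/2010:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/7.0"
10.0.0.2 - - [05/Dec/2010:00:00:02 +0000] "GET /xhtml-guide HTTP/1.1" 200 256 "-" "curl/7.0"
10.0.0.3 - - [05/Dec/2010:00:00:03 +0000] "GET /app.php HTTP/1.1" 500 128 "-" "curl/7.0"
10.0.0.4 - - [05/Dec/2010:00:00:04 +0000] "GET /logo.png HTTP/1.1" 200 2048 "-" "curl/7.0"
EOF

# Count requests whose path contains a literal .html, .htm or .php.
pages=$(awk '$7 ~ /\.html|\.htm|\.php/ {n++} END {print n+0}' "$log")

# Count requests whose status code is exactly 500.
errors=$(awk '$9 == 500 {n++} END {print n+0}' "$log")

printf 'pages=%s errors=%s\n' "$pages" "$errors"
```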
1136   SERVER
1137       Also, it is worth pointing out that if we want to run GoAccess at lower
1138       priority, we can run it as:
1139
1140              # nice -n 19 goaccess -f access.log -a
1141
1142       and  if  you don't want to install it on your server, you can still run
1143       it from your local machine:
1144
              # ssh -n root@server 'tail -f /var/log/apache2/access.log' |
                goaccess -
1147
1148       Note:  SSH requires -n so GoAccess can read from stdin. Also, make sure
1149       to use SSH keys for authentication as it won't work if a passphrase  is
1150       required.
1151
1152   INCREMENTAL LOG PROCESSING
1153       GoAccess  has the ability to process logs incrementally through its in‐
1154       ternal storage and dump its data to disk. It  works  in  the  following
1155       way:
1156
1157
       1  A dataset must be persisted first with --persist.

       2  The same dataset can then be loaded with --restore. If new data is
          passed (piped or through a log file), it will be appended to the
          original dataset.
1163
1164
1165       NOTES
1166
       GoAccess keeps track of the inodes of all the files processed (assuming
       files will stay on the same partition). In addition, it extracts a
       snippet of data from the log along with the last line parsed of each
       file and the timestamp of the last line parsed, e.g.,
       inode:29627417|line:20012|ts:20171231235059
1172
       First, it checks whether the snippet matches the log being parsed. If
       it does, it assumes the log hasn't changed dramatically, e.g., hasn't
       been truncated. If the inode does not match the current file, it parses
       all lines. If the current file matches the inode, it then reads the
       remaining lines and updates the count of lines parsed and the
       timestamp. As an extra precaution, it won't parse log lines with a
       timestamp less than or equal to the one stored.

       Piped data works based off the timestamp of the last line read. For
       instance, it will parse and discard all incoming entries until it finds
       a timestamp greater than or equal to the one stored.
1184
1185
1186       For instance:
1187
1188              // last month access log
1189              # goaccess access.log.1 --persist
1190
1191       then, load it with
1192
1193              // append this month access log, and preserve new data
1194              # goaccess access.log --restore --persist
1195
1196       To read persisted data only (without parsing new data)
1197
1198              # goaccess --restore
1199
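       The timestamp rule described in the notes above for piped data can be
       pictured with a small awk filter. This is only an illustration of the
       rule as stated, not GoAccess's actual implementation; the entries and
       variable names are hypothetical.

```shell
# Timestamp of the last line read, in the same form as the ts: value.
stored=20171231235059

# Hypothetical piped entries: a timestamp followed by the rest of the line.
entries='20171231235058 GET /old
20171231235059 GET /boundary
20180101000001 GET /new'

# Discard entries until one has a timestamp >= the stored value,
# then keep everything from that point on.
kept=$(printf '%s\n' "$entries" |
    awk -v stored="$stored" 'seen || $1 >= stored { seen = 1; print }')
printf '%s\n' "$kept"
```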

NOTES

       Each active panel has a total of 366 items, or 50 in the real-time HTML
       report. The number of items is customizable using max-items. Note that
       HTML, CSV and JSON output allow a maximum number greater than the
       default value of 366 items per panel.
1205
1206       A hit is a request (line in the access log), e.g.,  10  requests  =  10
1207       hits.  HTTP requests with the same IP, date, and user agent are consid‐
1208       ered a unique visit.
1209
1210       The generated report will attempt to reconnect to the WebSocket  server
1211       after  1 second with exponential backoff. It will attempt to connect 20
1212       times.
1213

BUGS

       If you think you have found a bug, please send me an email to
       goaccess@prosoftcorp.com or use the issue tracker at
       https://github.com/allinurl/goaccess/issues
1218

AUTHOR

       Gerardo Orellana <hello@goaccess.io>. For more details about it, or new
       releases, please visit https://goaccess.io
1222
1223
1224
1225GNU+Linux                          JUNE 2022                       goaccess(1)