goaccess(1)                      User Manuals                      goaccess(1)

NAME

6       goaccess - fast web log analyzer and interactive viewer.
7

SYNOPSIS

9       goaccess [filename] [options...] [-c][-M][-H][-q][-d][...]
10

DESCRIPTION

       GoAccess is an open source real-time web log analyzer and
       interactive viewer that runs in a terminal on *nix systems or through
       your browser.
15
16       It provides fast and valuable HTTP statistics for system administrators
17       that require a visual server report on the fly.
18
19       GoAccess parses the specified web log file and outputs the data to  the
20       X terminal. Features include:
21
22
23       General Statistics:
              This panel gives a summary of several metrics, such as the
              number of valid and invalid requests, time taken to analyze the
              dataset, unique visitors, requested files, static files (CSS,
              ICO, JPG, etc.), HTTP referrers, 404s, size of the parsed log
              file and bandwidth consumption.
29
30       Unique visitors
31              This panel shows metrics such as hits, unique visitors and cumu‐
32              lative bandwidth per date. HTTP requests containing the same IP,
33              the  same  date, and the same user agent are considered a unique
34              visitor. By default, it includes web crawlers/spiders.
35
36              Optionally, date specificity can be set to the hour level  using
37              --date-spec=hr  which will display dates such as 05/Jun/2016:16.
38              This is great if you want to track your  daily  traffic  at  the
39              hour level.
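
              The unique-visitor rule above can be sketched as follows. This
              is a minimal illustration in Python, not GoAccess's own code;
              the (ip, date, user_agent) tuple layout and the function name
              are assumptions made for the example.

```python
from collections import defaultdict

def unique_visitors_per_date(requests):
    """Count unique visitors per date.

    `requests` is an iterable of (ip, date, user_agent) tuples
    (hypothetical layout). Requests sharing the same IP, date and
    user agent collapse into a single visitor.
    """
    seen = defaultdict(set)
    for ip, date, user_agent in requests:
        seen[date].add((ip, user_agent))
    return {date: len(pairs) for date, pairs in seen.items()}

sample = [
    ("10.0.0.1", "05/Jun/2016", "Mozilla/5.0"),
    ("10.0.0.1", "05/Jun/2016", "Mozilla/5.0"),  # same visitor, second hit
    ("10.0.0.1", "05/Jun/2016", "curl/7.47"),    # same IP, different agent
    ("10.0.0.2", "06/Jun/2016", "Mozilla/5.0"),
]
counts = unique_visitors_per_date(sample)
```

              With --date-spec=hr the date key would carry the hour as well
              (such as 05/Jun/2016:16), so the same grouping would apply per
              hour.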
40
41       Requested files
42              This  panel  displays  the  most requested (non-static) files on
43              your web server.  It shows hits, unique visitors,  and  percent‐
44              age,  along  with  the  cumulative  bandwidth, protocol, and the
45              request method used.
46
47       Requested static files
              Lists the most frequently requested static files such as JPG,
              CSS, SWF, JS, GIF, and PNG file types, along with the same
              metrics as the last panel. Additional static files can be added
              to the configuration file.
52
53       404 or Not Found
              Displays the same metrics as the previous request panels;
              however, its data contains all pages that were not found on the
              server, commonly known as the 404 status code.
57
58       Hosts  This  panel  has  detailed  information on the hosts themselves.
59              This is great for spotting aggressive crawlers  and  identifying
60              who's eating your bandwidth.
61
62              Expanding  the panel can display more information such as host's
63              reverse DNS lookup result, country of origin and city. If the -a
64              argument  is  enabled, a list of user agents can be displayed by
65              selecting the desired IP address, and then pressing ENTER.
66
67       Operating Systems
68              This panel will report which operating system the host used when
69              it hit the server. It attempts to provide the most specific ver‐
70              sion of each operating system.
71
72       Browsers
73              This panel will report which browser the host used when  it  hit
74              the  server. It attempts to provide the most specific version of
75              each browser.
76
77       Visit Times
78              This panel will display an hourly report. This  option  displays
79              24 data points, one for each hour of the day.
80
              Optionally, hour specificity can be set to the tenth of an hour
              level using --hour-spec=min, which will display hours as 16:4.
              This is great if you want to spot peaks of traffic on your
              server.
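
              A minimal sketch of how a request time could map to a panel
              bucket, assuming that a value such as 16:4 denotes the hour
              followed by the tens digit of the minute (16:40 to 16:49).
              That reading and the function name are assumptions, not taken
              from GoAccess itself.

```python
def visit_time_bucket(hour, minute, spec="hr"):
    """Return the bucket label for a request time.

    spec="hr"  -> one bucket per hour, e.g. "16"
    spec="min" -> hour plus the tens digit of the minute, e.g. "16:4"
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour or minute out of range")
    if spec == "hr":
        return f"{hour:02d}"
    if spec == "min":
        return f"{hour:02d}:{minute // 10}"
    raise ValueError("spec must be 'hr' or 'min'")
```

              Under this assumption the default setting yields 24 buckets,
              while min yields six buckets per hour.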
85
86       Virtual Hosts
87              This panel will display all the different virtual  hosts  parsed
88              from  the  access  log.  This  panel  is displayed if %v is used
89              within the log-format string.
90
91       Referrers URLs
              If the host in question accessed the site via another resource,
              or was linked/diverted to you from another host, the URL they
              were referred from will be provided in this panel. This panel
              is disabled by default; see `--ignore-panel` in your
              configuration file to enable it.
97
98       Referring Sites
              This panel displays only the host part of the URL the request
              came from, not the whole URL.
101
102       Keyphrases
              It reports keyphrases used on Google search, Google cache, and
              Google translate that have led to your web server. At present,
              it only supports Google search queries via HTTP. This panel is
              disabled by default; see `--ignore-panel` in your configuration
              file to enable it.
108
109       Geo Location
110              Determines  where  an IP address is geographically located. Sta‐
111              tistics are broken down by continent and country. It needs to be
112              compiled with GeoLocation support.
113
114       HTTP Status Codes
115              The values of the numeric status code to HTTP requests.
116
117       Remote User (HTTP authentication)
              This is the userid of the person requesting the document as
              determined by HTTP authentication. If the document is not
              password protected, this part will be "-". This panel is not
              enabled unless %e is given within the log-format variable.
123
124       Cache Status
              If you are using caching on your server, you may want to know
              whether your requests are being cached and served from the
              cache. This panel shows the cache status of the object the
              server served. This panel is not enabled unless %C is given
              within the log-format variable. The status can be either
              `MISS`, `BYPASS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED`
              or `HIT`.
132
133       MIME Types
134              This panel specifies Media Types (formerly known as MIME  types)
135              and Media Subtypes which will be assigned and listed underneath.
136              This panel is not enabled unless %M is given within the log-for‐
137              mat    variable.   See   https://www.iana.org/assignments/media-
138              types/media-types.xhtml for more details.
139
140       Encryption Settings
              This panel shows the SSL/TLS protocol used along with the
              cipher suites. This panel is not enabled unless %K is given
              within the log-format variable.
144
145
146       NOTE: Optionally and if configured, all panels can display the  average
147       time taken to serve the request.
148
149

STORAGE

151       There  are three storage options that can be used with GoAccess. Choos‐
152       ing one will depend on your environment and needs.
153
154       Default Hash Tables
155              In-memory storage provides better performance  at  the  cost  of
156              limiting  the  dataset  size to the amount of available physical
157              memory. GoAccess uses in-memory hash tables. It  has  very  good
158              memory  usage and pretty good performance. This storage has sup‐
159              port for on-disk persistence.
160

CONFIGURATION

162       Multiple options can be used to configure GoAccess. For a complete  up-
163       to-date list of configure options, run ./configure --help
164
165       --enable-debug
166              Compile  with  debugging symbols and turn off compiler optimiza‐
167              tions.
168
169       --enable-utf8
170              Compile with wide character support. Ncursesw is required.
171
172       --enable-geoip=<legacy|mmdb>
173              Compile with GeoLocation support. MaxMind's GeoIP  is  required.
174              legacy  will  utilize  the  original GeoIP databases.  mmdb will
175              utilize the enhanced GeoIP2 databases.
176
177       --with-getline
178              Dynamically expands line buffer in  order  to  parse  full  line
179              requests instead of using a fixed size buffer of 4096.
180
181       --with-openssl
182              Compile GoAccess with OpenSSL support for its WebSocket server.
183

OPTIONS

185       The  following  options  can be supplied to the command or specified in
186       the configuration file. If specified in the  configuration  file,  long
187       options  need  to  be  used without prepending -- and without using the
188       equal sign =.
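
       The command-line to configuration-file mapping described above can be
       sketched as follows. The helper name is hypothetical, and the sketch
       only covers long options that carry a value.

```python
def to_config_line(cli_option):
    """Turn '--name=value' into the config-file form 'name value'."""
    if not cli_option.startswith("--"):
        raise ValueError("expected a long option starting with --")
    name, sep, value = cli_option[2:].partition("=")
    if not name or not sep:
        raise ValueError("expected the form --name=value")
    # Drop the leading dashes and replace the equal sign with a space.
    return f"{name} {value}"
```

       For instance, --log-format=COMBINED on the command line would be
       written as log-format COMBINED in the configuration file.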
189
190   LOG/DATE/TIME FORMAT
191       --time-format=<timeformat>
192              The time-format variable followed by a space, specifies the  log
193              format time containing either a name of a predefined format (see
194              options below) or any combination of regular characters and spe‐
195              cial format specifiers.
196
              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %T or %H:%M:%S.

              Note that if a timestamp is given in microseconds, %f must be
              used as time-format. If the timestamp is given in milliseconds,
              %* must be used as time-format.
203
204       --date-format=<dateformat>
              The date-format variable followed by a space, specifies the log
              format date containing either a name of a predefined format (see
              options below) or any combination of regular characters and
              special format specifiers.

              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %Y-%m-%d.

              Note that if a timestamp is given in microseconds, %f must be
              used as date-format. If the timestamp is given in milliseconds,
              %* must be used as date-format.
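
              To get a feel for the strftime-style specifiers, the sketch
              below uses Python's datetime.strptime as a stand-in parser. It
              is not GoAccess's parser: Python accepts only a subset of the
              C specifiers, and the GoAccess-specific meanings of %f and %*
              described above are not covered. The sample timestamp and the
              two format strings are assumptions chosen for illustration.

```python
from datetime import datetime

# Hypothetical formats for a timestamp such as 05/Jun/2016:16:30:00.
date_format = "%d/%b/%Y"
time_format = "%H:%M:%S"

stamp = "05/Jun/2016:16:30:00"
# %b matches abbreviated month names in the current locale
# (English names in the default C locale).
parsed = datetime.strptime(stamp, f"{date_format}:{time_format}")
```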
216
217       --log-format=<logformat>
218              The log-format variable followed by a space or \t for tab-delim‐
219              ited, specifies the log format string.
220
221              Note  that  if  there  are  spaces within the format, the string
222              needs to be enclosed in single/double quotes. Inner quotes  need
223              to be escaped.
224
225              In  addition  to  specifying  the raw log/date/time formats, for
226              simplicity, any of the following predefined log format names can
227              be  supplied to the log/date/time-format variables. GoAccess can
228              also handle one predefined name in one variable and another pre‐
229              defined name in another variable.
230
231                COMBINED     - Combined Log Format,
232                VCOMBINED    - Combined Log Format with Virtual Host,
233                COMMON       - Common Log Format,
234                VCOMMON      - Common Log Format with Virtual Host,
235                W3C          - W3C Extended Log File Format,
236                SQUID        - Native Squid Log Format,
237                CLOUDFRONT   - Amazon CloudFront Web Distribution,
238                CLOUDSTORAGE - Google Cloud Storage,
239                AWSELB       - Amazon Elastic Load Balancing,
240                AWSS3        - Amazon Simple Storage Service (S3)
241                CADDY        - Caddy's JSON Structured format
242
              Note: Piping data into GoAccess won't prompt a log/date/time
              configuration dialog; you will need to define it beforehand in
              your configuration file or on the command line.
246
247   USER INTERFACE OPTIONS
248       -c --config-dialog
249              Prompt log/time/date configuration window on program start. Only
250              when curses is initialized.
251
252       -i --hl-header
253              Color highlight active terminal panel.
254
255       -m --with-mouse
256              Enable mouse support on main terminal dashboard.
257
       --color=<fg:bg[attrs, PANEL]>
259              Specify custom colors for the terminal output.
260
261              Color Syntax
262                DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
263
264               FG# = foreground color [-1...255] (-1 = default term color)
265               BG# = background color [-1...255] (-1 = default term color)
266
267              Optionally, it is possible to apply color  attributes  (multiple
268              attributes  are comma separated), such as: bold, underline, nor‐
269              mal, reverse, blink
270
271              If desired, it is possible to apply  custom  colors  per  panel,
272              that is, a metric in the REQUESTS panel can be of color A, while
273              the same metric in the BROWSERS panel can be of color B.
274
275              Available color definitions:
276                COLOR_MTRC_HITS
277                COLOR_MTRC_VISITORS
278                COLOR_MTRC_DATA
279                COLOR_MTRC_BW
280                COLOR_MTRC_AVGTS
281                COLOR_MTRC_CUMTS
282                COLOR_MTRC_MAXTS
283                COLOR_MTRC_PROT
284                COLOR_MTRC_MTHD
285                COLOR_MTRC_HITS_PERC
286                COLOR_MTRC_HITS_PERC_MAX
287                COLOR_MTRC_VISITORS_PERC
288                COLOR_MTRC_VISITORS_PERC_MAX
289                COLOR_PANEL_COLS
290                COLOR_BARS
291                COLOR_ERROR
292                COLOR_SELECTED
293                COLOR_PANEL_ACTIVE
294                COLOR_PANEL_HEADER
295                COLOR_PANEL_DESC
296                COLOR_OVERALL_LBLS
297                COLOR_OVERALL_VALS
298                COLOR_OVERALL_PATH
299                COLOR_ACTIVE_LABEL
300                COLOR_BG
301                COLOR_DEFAULT
302                COLOR_PROGRESS
303
304              See configuration file for a sample color scheme.
305
306       --color-scheme=<1|2|3>
307              Choose among color schemes.  1 for the default grey  scheme.   2
308              for  the  green scheme.  3 for the Monokai scheme (shown only if
309              terminal supports 256 colors).
310
311       --crawlers-only
312              Parse and display only crawlers (bots).
313
314       --html-custom-css=<path/custom.css>
315              Specifies a custom CSS file path to load in the HTML report.
316
317       --html-custom-js=<path/custom.js>
318              Specifies a custom JS file path to load in the HTML report.
319
320       --html-report-title=<title>
321              Set HTML report page title and header.
322
323       --html-refresh=<secs>
324              Refresh the HTML report every X seconds. The  value  has  to  be
325              between 1 and 60 seconds. The default is set to refresh the HTML
326              report every 1 second.
327
328       --html-prefs=<JSON>
329              Set HTML report default preferences. Supply a valid JSON  object
330              containing  the  HTML preferences. It allows the ability to cus‐
331              tomize each panel plot. See example below.
332
333              Note: The JSON object passed needs to be a one line JSON string.
334              For instance,
335
336              --html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
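
              One way to produce such a one-line JSON string is sketched
              below in Python. The preference keys are copied from the
              example above; the variable names are illustrative only.

```python
import json
import shlex

prefs = {
    "theme": "bright",
    "perPage": 5,
    "layout": "horizontal",
    "showTables": True,
    "visitors": {"plot": {"chartType": "bar"}},
}

# Compact separators keep the object on a single line with no spaces.
one_line = json.dumps(prefs, separators=(",", ":"))

# Quote the JSON so a POSIX shell passes it through as one argument.
argument = "--html-prefs=" + shlex.quote(one_line)
```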
337
338       --json-pretty-print
339              Format JSON output using tabs and newlines.
340
              Note: This is not recommended when outputting a real-time HTML
              report since the WebSocket payload will be much larger.
343
344       --max-items=<number>
345              The maximum number of items to display per  panel.  The  maximum
346              can be a number between 1 and n.
347
348              Note:  Only  the  CSV  and  JSON  output  allow a maximum number
349              greater than the default value of 366 (or 50  in  the  real-time
350              HTML output) items per panel.
351
352       --no-color
353              Turn off colored output. This is the default output on terminals
354              that do not support colors.
355
356       --no-column-names
357              Don't write column names in the terminal output. By default,  it
358              displays column names for each available metric in every panel.
359
360       --no-csv-summary
361              Disable summary metrics on the CSV output.
362
363       --no-progress
364              Disable progress metrics [total requests/requests per second].
365
366       --no-tab-scroll
367              Disable  scrolling  through panels when TAB is pressed or when a
368              panel is selected using a numeric key.
369
370       --no-html-last-updated
371              Do not show the last updated field displayed in the HTML  gener‐
372              ated report.
373
374       --no-parsing-spinner
              Do not show the progress metrics and parsing spinner.
376
377   SERVER OPTIONS
378       --addr Specify  IP address to bind the server to. Otherwise it binds to
379              0.0.0.0.
380
381              Usually there is no need to  specify  the  address,  unless  you
382              intentionally  would  like  to  bind  the  server to a different
383              address within your server.
384
385       --daemonize
386              Run GoAccess as daemon (only if --real-time-html enabled).
387
388              Note: It's important to make use of absolute paths across  GoAc‐
389              cess' configuration.
390
391       --user-name=<username>
392              Run GoAccess as the specified user.
393
394              Note:  It's important to ensure the user or the users' group can
395              access the input and output files as well  as  any  other  files
396              needed.   Other  groups the user belongs to will be ignored.  As
397              such it's advised to run GoAccess behind a  SSL  proxy  as  it's
398              unlikely this user can access the SSL certificates.
399
400       --origin=<url>
401              Ensure  clients  send  the specified origin header upon the Web‐
402              Socket handshake.
403
404       --pid-file=<path/goaccess.pid>
405              Write the daemon PID to a file when used along  the  --daemonize
406              option.
407
408       --port=<port>
409              Specify  the  port to use. By default GoAccess' WebSocket server
410              listens on port 7890.
411
412       --real-time-html
413              Enable real-time HTML output.
414
              GoAccess uses its own WebSocket server to push the data from the
              server to the client. See http://gwsocket.io for more details
              on how the WebSocket server works.
418
419       --ws-url=<[scheme://]url[:port]>
420              URL to which the WebSocket server responds. This is the URL sup‐
421              plied to the WebSocket constructor on the client side.
422
423              Optionally,  it is possible to specify the WebSocket URI scheme,
424              such as ws:// or wss:// for unencrypted  and  encrypted  connec‐
425              tions. e.g., wss://goaccess.io
426
427              If  GoAccess is running behind a proxy, you could set the client
428              side to connect to a different port by specifying the host  fol‐
429              lowed by a colon and the port.  e.g., goaccess.io:9999
430
431              By default, it will attempt to connect to the generated report's
432              hostname. If GoAccess is running on a remote server, the host of
433              the  remote  server should be specified here. Also, make sure it
434              is a valid host and NOT an http address.
435
436       --fifo-in=<path/file>
              Creates a named pipe (FIFO) that reads from the given
              path/file.
439
440       --fifo-out=<path/file>
441              Creates a named pipe (FIFO) that writes to the given path/file.
442
443       --ssl-cert=<cert.crt>
444              Path to TLS/SSL certificate. In order to enable TLS/SSL support,
445              GoAccess requires that --ssl-cert and --ssl-key are used.
446
447              Only if configured using --with-openssl
448
449       --ssl-key=<priv.key>
450              Path to TLS/SSL private key. In order to enable TLS/SSL support,
451              GoAccess requires that --ssl-cert and --ssl-key are used.
452
453              Only if configured using --with-openssl
454
455   FILE OPTIONS
456       -      The log file to parse is read from stdin.
457
458       -f --log-file=<logfile>
459              Specify  the  path  to  the input log file. If set in the config
460              file, it will take priority over -f from the command line.
461
462       -S --log-size=<bytes>
463              Specify the log size in bytes. This is  useful  when  piping  in
464              logs for processing in which the log size can be explicitly set.
465
466       -l --debug-file=<debugfile>
467              Send all debug messages to the specified file.
468
469       -p --config-file=<configfile>
470              Specify a custom configuration file to use. If set, it will take
471              priority over the global configuration file (if any).
472
473       --invalid-requests=<filename>
474              Log invalid requests to the specified file.
475
476       --unknowns-log=<filename>
477              Log unknown browsers and OSs to the specified file.
478
479       --no-global-config
480              Do not load the global configuration file. This directory should
481              normally    be    /usr/local/etc,    unless    specified    with
482              --sysconfdir=/dir.  See --dcf option  for  finding  the  default
483              configuration file.
484
485   PARSE OPTIONS
486       -a --agent-list
487              Enable a list of user-agents by host. For faster parsing, do not
488              enable this flag.
489
490       -d --with-output-resolver
491              Enable IP resolver on HTML|JSON output.
492
493       -e --exclude-ip=<IP|IP-range>
              Exclude an IPv4 or IPv6 address from being counted. Ranges can
              be included as well using a dash in between the IPs (start-end).
496
497              Examples:
498                exclude-ip 127.0.0.1
499                exclude-ip 192.168.0.1-192.168.0.100
500                exclude-ip ::1
501                exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
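
              The range semantics can be sketched with Python's ipaddress
              module. This is an illustration only: the function name is
              hypothetical, and treating both ends of the range as inclusive
              is an assumption.

```python
import ipaddress

def is_excluded(ip, rule):
    """Return True if `ip` matches an exclude-ip rule.

    `rule` is either a single address or a 'start-end' range.
    """
    addr = ipaddress.ip_address(ip)
    start_text, sep, end_text = rule.partition("-")
    start = ipaddress.ip_address(start_text)
    end = ipaddress.ip_address(end_text) if sep else start
    # IPv4 and IPv6 addresses cannot be compared with each other.
    if addr.version != start.version or addr.version != end.version:
        return False
    return start <= addr <= end
```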
502
503       -H --http-protocol=<yes|no>
504              Set/unset  HTTP request protocol. This will create a request key
505              containing the request protocol + the actual request.
506
507       -M --http-method=<yes|no>
508              Set/unset HTTP request method. This will create  a  request  key
509              containing the request method + the actual request.
510
511       -o --output=<path/file.[json|csv|html]>
              Write output to the given file. The output format is selected
              by the file extension:
514
515                /path/file.csv - Comma-separated values (CSV)
516                /path/file.json - JSON (JavaScript Object Notation)
517                /path/file.html - HTML
518
519       -q --no-query-string
              Ignore request's query string, e.g.,
              www.google.com/page.htm?query => www.google.com/page.htm.
522
523              Note: Removing the query string can greatly decrease memory con‐
524              sumption, especially on timestamped requests.
525
526       -r --no-term-resolver
527              Disable IP resolver on terminal output.
528
529       --444-as-404
530              Treat non-standard status code 444 as 404.
531
532       --4xx-to-unique-count
533              Add 4xx client errors to the unique visitors count.
534
535       --anonymize-ip
              Anonymize the client IP address. The IP anonymization option
              sets the last octet of IPv4 user IP addresses and the last 80
              bits of IPv6 addresses to zeros, e.g.,
                192.168.20.100 => 192.168.20.0
                2a03:2880:2110:df07:face:b00c::1 => 2a03:2880:2110:df07::
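
              The masking can be sketched as below. This is an illustration
              in Python, not GoAccess's implementation. The description
              states 80 bits for IPv6, while the sample output above keeps
              the first 64 bits, so the number of zeroed bits is left as a
              parameter; the function and parameter names are hypothetical.

```python
import ipaddress

def anonymize_ip(ip, ipv4_zero_bits=8, ipv6_zero_bits=80):
    """Zero the trailing bits of an IPv4 or IPv6 address."""
    addr = ipaddress.ip_address(ip)
    zero_bits = ipv4_zero_bits if addr.version == 4 else ipv6_zero_bits
    # Shift the trailing bits out and back in, leaving zeros behind.
    masked = (int(addr) >> zero_bits) << zero_bits
    # Rebuild with the same address class so IPv6 stays IPv6.
    return str(type(addr)(masked))
```

              Calling anonymize_ip with ipv6_zero_bits=64 corresponds to the
              IPv6 sample shown above.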
541
542       --all-static-files
543              Include  static  files  that  contain  a  query  string.   e.g.,
544              /fonts/fontawesome-webfont.woff?v=4.0.3
545
546       --browsers-file=<path>
              By default GoAccess parses an "essential/basic" curated list of
              browsers & crawlers. If you need to add additional browsers, use
              this option. Include an additional delimited list of
              browsers/crawlers/feeds etc. See config/browsers.list for an
              example or
              https://raw.githubusercontent.com/allinurl/goaccess/master/config/browsers.list
553
554       --date-spec=<date|hr>
555              Set the date specificity to either date (default) or hr to  dis‐
556              play hours appended to the date.
557
              This is used in the visitors panel. It's useful for tracking
              visitors at the hour level. For instance, an hour specificity
              would display traffic as 18/Dec/2010:19.
561
562       --double-decode
              Decode double-encoded values. This includes user-agent,
              request, and referer.
565
566       --enable-panel=<PANEL>
567              Enable parsing and displaying the given panel.
568
569              Available panels:
570                VISITORS
571                REQUESTS
572                REQUESTS_STATIC
573                NOT_FOUND
574                HOSTS
575                OS
576                BROWSERS
577                VISIT_TIMES
578                VIRTUAL_HOSTS
579                REFERRERS
580                REFERRING_SITES
581                KEYPHRASES
582                STATUS_CODES
583                REMOTE_USER
584                CACHE_STATUS
585                GEO_LOCATION
586                MIME_TYPE
587                TLS_TYPE
588
589       --hide-referer=<NEEDLE>
              Hide a referer but still count it. Wildcards are allowed in the
              needle, e.g., *.bing.com.
592
593       --hour-spec=<hr|min>
594              Set the time specificity to either hour (default) or min to dis‐
595              play the tenth of an hour appended to the hour.
596
597              This is used in the time distribution  panel.  It's  useful  for
598              tracking peaks of traffic on your server at specific times.
599
600       --ignore-crawlers
601              Ignore crawlers from being counted.
602
603       --ignore-panel=<PANEL>
604              Ignore parsing and displaying the given panel.
605
606              Available panels:
607                VISITORS
608                REQUESTS
609                REQUESTS_STATIC
610                NOT_FOUND
611                HOSTS
612                OS
613                BROWSERS
614                VISIT_TIMES
615                VIRTUAL_HOSTS
616                REFERRERS
617                REFERRING_SITES
618                KEYPHRASES
619                STATUS_CODES
620                REMOTE_USER
621                CACHE_STATUS
622                GEO_LOCATION
623                MIME_TYPE
624                TLS_TYPE
625
626       --ignore-referer=<referer>
627              Ignore  referers  from  being  counted. Wildcards allowed. e.g.,
628              *.domain.com ww?.domain.*
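
              The wildcard patterns can be approximated with Python's
              fnmatch module, as in the sketch below. GoAccess uses its own
              matcher, so details may differ (fnmatch also gives special
              meaning to square brackets); the function name is
              hypothetical.

```python
from fnmatch import fnmatchcase

def referer_ignored(host, patterns):
    """Return True if `host` matches any wildcard pattern.

    '*' matches any run of characters, '?' matches exactly one.
    """
    return any(fnmatchcase(host, pattern) for pattern in patterns)

patterns = ["*.domain.com", "ww?.domain.*"]
```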
629
630       --ignore-statics=<req|panel>
631              Ignore static file requests.
632
633              req
634                Only ignore request from valid requests
635
636              panels
637                Ignore request from panels.
638
                Note that it will count them towards the total number of
                requests.
641
642       --ignore-status=<CODE>
643              Ignore  parsing  and  displaying one or multiple status code(s).
644              For multiple status codes, use this option multiple times.
645
646       --keep-last=<num_days>
647              Keep the last specified number of days  in  storage.  This  will
648              recycle  the  storage  tables. e.g., keep & show only the last 7
649              days.
650
651       --no-ip-validation
              Disable client IP validation. Useful if IP addresses have been
              obfuscated before being logged. The log still needs to contain
              a placeholder for %h; usually it's a resolved IP, e.g.,
              ord37s19-in-f14.1e100.net.
656
657       --no-strict-status
658              Disable  HTTP  status code validation. Some servers would record
659              this value only if a connection was established  to  the  target
660              and the target sent a response.  Otherwise, it could be recorded
661              as -.
662
663       --num-tests=<number>
664              Number of lines from the access log to test against the provided
665              log/date/time  format.  By default, the parser is set to test 10
666              lines. If set to 0, the parser won't test  any  lines  and  will
667              parse  the  whole  access  log.  If  a  line  matches  the given
668              log/date/time format before it reaches <number>, the parser will
669              consider  the  log  to  be valid, otherwise GoAccess will return
670              EXIT_FAILURE and display the relevant error messages.
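
              The test logic described above can be sketched as follows. The
              matches_format callback stands in for the real log/date/time
              format check and is purely hypothetical, as is the function
              name.

```python
from itertools import islice

def log_format_is_valid(lines, matches_format, num_tests=10):
    """Decide whether parsing should proceed.

    Tests up to `num_tests` leading lines; a single match is enough.
    With num_tests=0 no lines are tested and parsing goes ahead.
    """
    if num_tests == 0:
        return True
    return any(matches_format(line) for line in islice(lines, num_tests))
```

              A False result corresponds to the EXIT_FAILURE case described
              above.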
671
672       --process-and-exit
673              Parse log and exit without outputting data.  Useful  if  we  are
674              looking  to  only  add  new data to the on-disk database without
675              outputting to a file or a terminal.
676
677       --real-os
              Display real OS names, e.g., Windows XP, Snow Leopard.
679
680       --sort-panel=<PANEL,FIELD,ORDER>
681              Sort panel on initial load. Sort options are separated by comma.
682              Options are in the form: PANEL,METRIC,ORDER
683
684              Available metrics:
685                BY_HITS     - Sort by hits
686                BY_VISITORS - Sort by unique visitors
687                BY_DATA     - Sort by data
688                BY_BW       - Sort by bandwidth
689                BY_AVGTS    - Sort by average time served
690                BY_CUMTS    - Sort by cumulative time served
691                BY_MAXTS    - Sort by maximum time served
692                BY_PROT     - Sort by http protocol
693                BY_MTHD     - Sort by http method
694
695              Available orders:
696                ASC
697                DESC
698
699       --static-file=<extension>
              Add static file extension, e.g., .mp3. Extensions are case
              sensitive.
702
703   GEOLOCATION OPTIONS
704       -g --std-geoip
705              Standard GeoIP database for less memory usage.
706
707       --geoip-database=<geofile>
708              Specify path to GeoIP database file. i.e., GeoLiteCity.dat.
709
              If using GeoIP2, you will need to download the GeoLite2 City or
              Country database from MaxMind.com and use the option
              --geoip-database to specify the database. Updated database
              files for GeoIP legacy are also available from MaxMind.com as
              GeoLite Legacy Databases. IPv4 and IPv6 files are supported as
              well. For updated DB URLs, please see the default GoAccess
              configuration file.
717
718              Note: --geoip-city-data is an alias of --geoip-database.
719
720   OTHER OPTIONS
721       -h --help
722              Display the help message and exit.
723
724       -s --storage
725              Display current storage method, e.g., B+ Tree, Hash.
726
727       -V --version
728              Display version information and exit.
729
730       --dcf  Display the path of the default config file  when  `-p`  is  not
731              used.
732
733   PERSISTENCE STORAGE OPTIONS
734       --persist
735              Persist parsed data to disk. If database files exist, they
736              will be overwritten. This should be set for the first dataset.
737              See examples below.
738
739       --restore
740              Load previously stored data from disk. If reading persisted data
741              only, the database files need to exist. See --persist and  exam‐
742              ples below.
743
744       --db-path=<dir>
745              Path  where  the  on-disk database files are stored. The default
746              value is the /tmp directory.
747
748

CUSTOM LOG/DATE FORMAT

750       GoAccess can parse virtually any web log format.
751
752       Predefined options include Common Log Format (CLF) and Combined Log
753       Format (XLF/ELF), each with or without virtual host, Amazon CloudFront
754       (Download Distribution), Google Cloud Storage and W3C format (IIS).
755
756       GoAccess allows any custom format string as well.
757
758       There are two ways to configure the log format. The easiest is to run
759       GoAccess with -c to open a configuration window. Otherwise, it can be
760       configured in ~/.goaccessrc or in the file under %sysconfdir%.
761
762       time-format
763              The time-format variable, followed by a space, specifies the log
764              format time containing any combination of regular characters and
765              special format specifiers. They all begin with a percentage (%)
766              sign. See `man strftime`. e.g., %T or %H:%M:%S.
767
768              Note:  If  a timestamp is given in microseconds, %f must be used
769              as time-format or %* if the timestamp is given in milliseconds.
770
771       date-format
772              The date-format variable, followed by a space, specifies the log
773              format date containing any combination of regular characters and
774              special format specifiers. They all begin with a percentage  (%)
775              sign. See `man strftime`. e.g., %Y-%m-%d.
776
777              Note:  If  a timestamp is given in microseconds, %f must be used
778              as date-format or %* if the timestamp is given in milliseconds.
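
       Because both variables take strftime specifiers, date(1) can be used to
       preview what a format string produces before it goes into the
       configuration. This is only a sketch: it assumes GNU date (for -d), and
       the sample timestamp is illustrative.

```shell
# Preview strftime-style formats with GNU date. LC_ALL=C keeps month
# names in English, as they appear in typical web logs; -u avoids any
# dependence on the local time zone.
sample='2016-03-02 08:14:04'
date_fmt='%d/%b/%Y'
time_fmt='%H:%M:%S'
d=$(LC_ALL=C date -u -d "$sample" "+$date_fmt")
t=$(LC_ALL=C date -u -d "$sample" "+$time_fmt")
echo "date-format $date_fmt -> $d"
echo "time-format $time_fmt -> $t"
```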
779
780       log-format
781              The log-format variable, followed by a space or \t, specifies
782              the log format string.
783
784       %x     A  date  and time field matching the time-format and date-format
785              variables. This is used when given a timestamp  or  the  date  &
786              time  are  concatenated  as a single string (e.g., 1501647332 or
787              20170801235000) instead of the date and time being in two  sepa‐
788              rated variables.
789
790       %t     time field matching the time-format variable.
791
792       %d     date field matching the date-format variable.
793
794       %v     The  canonical  Server  Name  of  the server serving the request
795              (Virtual Host).
796
797       %e     This is the userid of the  person  requesting  the  document  as
798              determined by HTTP authentication.
799
800       %C     The cache status of the object the server served.
801
802       %h     host (the client IP address, either IPv4 or IPv6)
803
804       %r     The request line from the client. This requires specific
805              delimiters around the request (such as single quotes, double
806              quotes, or anything else) to be parsable. If not, we have to use
807              a combination of special format specifiers such as %m %U %H.
808
809       %q     The query string.
810
811       %m     The request method.
812
813       %U     The URL path requested.
814
815              Note: If the query string is in %U, there is no need to use  %q.
816              However, if the URL path does not include any query string, you
817              may use %q and the query string will be appended to the request.
818
819       %H     The request protocol.
820
821       %s     The status code that the server sends back to the client.
822
823       %b     The size of the object returned to the client.
824
825       %R     The "Referrer" HTTP request header.
826
827       %u     The user-agent HTTP request header.
828
829       %K     The TLS encryption  settings  chosen  for  the  connection.  (In
830              Apache LogFormat: %{SSL_PROTOCOL}x)
831
832       %k     The  TLS  encryption  settings  chosen  for  the connection. (In
833              Apache LogFormat: %{SSL_CIPHER}x)
834
835       %M     The MIME-type of the requested resource. (In  Apache  LogFormat:
836              %{Content-Type}o)
837
838       %D     The  time taken to serve the request, in microseconds as a deci‐
839              mal number.
840
841       %T     The time taken to serve the request, in seconds  with  millisec‐
842              onds resolution.
843
844       %L     The  time taken to serve the request, in milliseconds as a deci‐
845              mal number.
846
847       %^     Ignore this field.
848
849       %~     Move forward through the log string until a non-space (!isspace)
850              char is found.
851
852       ~h     The host (the client IP address, either IPv4 or IPv6) in an
853              X-Forwarded-For (XFF) field.
854
855              It uses a special specifier which consists of a tilde before the
856              host  specifier,  followed  by the character(s) that delimit the
857              XFF field, which are enclosed by curly braces (e.g., ~h{," }).
858
859              For example, ~h{," } is used in  order  to  parse  "11.25.11.53,
860              17.68.33.17"  field  which  is  delimited  by  a double quote, a
861              comma, and a space.
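
       To make the delimiter set concrete, the sketch below splits the same
       sample field on a double quote, a comma, and a space using awk. It only
       illustrates what "delimited by" means here; it is not how GoAccess
       parses the field, and it says nothing about which address GoAccess
       selects.

```shell
# Illustration only: split an XFF field on the delimiters , " and space,
# the same set named inside the braces of ~h{," }.
xff='"11.25.11.53, 17.68.33.17"'
ips=$(printf '%s\n' "$xff" |
      awk -F'[," ]+' '{ for (i = 1; i <= NF; i++) if ($i != "") print $i }')
echo "$ips"
```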
862
863       Note: In order to get the average, cumulative and maximum  time  served
864       in  GoAccess, you will need to start logging response times in your web
865       server. In Nginx you can add $request_time to your log format, or %D in
866       Apache.
867
868       Important:  If  multiple  time  served  specifiers are used at the same
869       time, the first option specified in the format string will take  prior‐
870       ity over the other specifiers.
871
872       GoAccess requires the following fields:
873
874              %h a valid IPv4/6
875
876              %d a valid date
877
878              %r the request
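
       Putting the three variables together, a configuration for Combined Log
       Format lines might look like the sketch below, written to a temporary
       file. The format string is assumed to be equivalent to the predefined
       COMBINED format; compare it against your own log lines before relying
       on it.

```shell
# Write a small configuration file containing the three format variables.
# The quoted heredoc delimiter keeps % and " characters literal.
conf=$(mktemp)
cat > "$conf" <<'EOF'
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF
cat "$conf"
# Usage (requires goaccess): goaccess access.log -p "$conf"
```

       The format string contains the three required fields (%h, %d and %r);
       the remaining specifiers are optional.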
879

INTERACTIVE MENU

881       F1 or h
882              Main help.
883
884       F5     Redraw main window.
885
886       q      Quit the program, current window or collapse active module
887
888       o or ENTER
889              Expand selected module or open window
890
891       0-9 and Shift + 0
892              Set selected module to active
893
894       j      Scroll down within expanded module
895
896       k      Scroll up within expanded module
897
898       c      Set or change scheme color.
899
900       TAB    Forward iteration of modules. Starts from current active module.
901
902       SHIFT + TAB
903              Backward  iteration  of modules. Starts from current active mod‐
904              ule.
905
906       ^f     Scroll forward one screen within an active module.
907
908       ^b     Scroll backward one screen within an active module.
909
910       s      Sort options for active module
911
912       /      Search across all modules (regex allowed)
913
914       n      Find the position of the next occurrence across all modules.
915
916       g      Move to the first item or top of screen.
917
918       G      Move to the last item or bottom of screen.
919

EXAMPLES

921       Note: Piping data into GoAccess won't prompt a log/date/time
922       configuration dialog. You will need to define the format beforehand in
923       your configuration file or on the command line.
924
925
926   DIFFERENT OUTPUTS
927       To output to a terminal and generate an interactive report:
928
929              # goaccess access.log
930
931       To generate an HTML report:
932
933              # goaccess access.log -a -o report.html
934
935       To generate a JSON report:
936
937              # goaccess access.log -a -d -o report.json
938
939       To generate a CSV file:
940
941              # goaccess access.log --no-csv-summary -o report.csv
942
943       GoAccess also allows great  flexibility  for  real-time  filtering  and
944       parsing.  For  instance,  to quickly diagnose issues by monitoring logs
945       since goaccess was started:
946
947              # tail -f access.log | goaccess -
948
949       Even better, to filter while keeping the pipe open to preserve
950       real-time analysis, we can make use of tail -f and a pattern matching
951       tool such as grep, awk, sed, etc.:
952
953              # tail -f access.log | grep -i --line-buffered 'firefox' |
954              goaccess --log-format=COMBINED -
955
956       or to parse from the beginning of the file while keeping the pipe
957       open and applying a filter:
958
959              # tail -f -n +0 access.log | grep -i --line-buffered 'firefox' |
960              goaccess --log-format=COMBINED -o report.html --real-time-html -
961
962   MULTIPLE LOG FILES
963       There  are  several ways to parse multiple logs with GoAccess. The sim‐
964       plest is to pass multiple log files to the command line:
965
966              # goaccess access.log access.log.1
967
968       It's even possible to parse files from a  pipe  while  reading  regular
969       files:
970
971              # cat access.log.2 | goaccess access.log access.log.1 -
972
973       Note  that the single dash is appended to the command line to let GoAc‐
974       cess know that it should read from the pipe.
975
976       Now if we want to add more flexibility to GoAccess, we can do a  series
977       of  pipes. For instance, if we would like to process all compressed log
978       files access.log.*.gz in addition to the current log file, we can do:
979
980              # zcat access.log.*.gz | goaccess access.log -
981
982       Note: On Mac OS X, use gunzip -c instead of zcat.
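
       A portable variant of the pipe above can be sketched with gunzip -c,
       which is available on both Linux and Mac OS X. The file names and
       contents below are sample data created in a scratch directory.

```shell
# Build two small compressed logs in a scratch directory (sample data).
dir=$(mktemp -d)
printf 'line one\n' | gzip -c > "$dir/access.log.1.gz"
printf 'line two\n' | gzip -c > "$dir/access.log.2.gz"

# gunzip -c writes the decompressed data to stdout, like zcat on Linux.
total=$(gunzip -c "$dir"/access.log.*.gz | wc -l)
echo "decompressed lines: $total"
# Usage (requires goaccess):
#   gunzip -c access.log.*.gz | goaccess access.log -
```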
983
984   REAL TIME HTML OUTPUT
985       GoAccess has the ability to output real-time data in the  HTML  report.
986       You  can even email the HTML file since it is composed of a single file
987       with no external file dependencies, how neat is that!
988
989       The process of generating a real-time HTML report is  very  similar  to
990       the  process  of  creating  a  static  report. Only --real-time-html is
991       needed to make it real-time.
992
993              # goaccess access.log -o  /usr/share/nginx/html/site/report.html
994              --real-time-html
995
996       By default, GoAccess will use the host name of the generated report.
997       Optionally, you can specify the URL to which the client's browser will
998       connect. See https://goaccess.io/faq for a more detailed example.
999
1000              # goaccess access.log -o report.html --real-time-html
1001              --ws-url=goaccess.io
1002
1003       By default, GoAccess listens on port 7890. To use a different port,
1004       specify it as follows (make sure the port is open):
1005
1006              # goaccess access.log -o report.html --real-time-html
1007              --port=9870
1008
1009       To bind the WebSocket server to an address other than 0.0.0.0, specify
1010       it as:
1011
1012              # goaccess access.log -o report.html --real-time-html
1013              --addr=127.0.0.1
1014
1015       Note: To output real time data over a TLS/SSL connection, you  need  to
1016       use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
1017
1018   WORKING WITH DATES
1019       Another useful pipe is one that filters the web log by date.
1020
1021       The  following will get all HTTP requests starting on 05/Dec/2010 until
1022       the end of the file.
1023
1024              # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -
1025
1026       or using a relative date, such as yesterday or one week ago:
1027
1028              # sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log
1029              | goaccess -a -
1030
1031       If we want to parse only a certain time-frame from DATE a to DATE b, we
1032       can do:
1033
1034              # sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
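
       The behaviour of a sed address range can be checked on a small sample
       before it is applied to a real log. The sketch below builds a
       five-line log (made-up entries) and selects a date range, with the
       slashes of the log date escaped inside the sed pattern.

```shell
# Sample log with one entry per day of interest (illustrative data).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Nov/2010:10:00:00 +0000] "GET /a HTTP/1.1" 200 10
10.0.0.1 - - [05/Nov/2010:10:00:00 +0000] "GET /b HTTP/1.1" 200 10
10.0.0.1 - - [20/Nov/2010:10:00:00 +0000] "GET /c HTTP/1.1" 200 10
10.0.0.1 - - [05/Dec/2010:10:00:00 +0000] "GET /d HTTP/1.1" 200 10
10.0.0.1 - - [06/Dec/2010:10:00:00 +0000] "GET /e HTTP/1.1" 200 10
EOF
# Print from the first line matching the start date through the first
# line matching the end date (inclusive).
selected=$(sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' "$log")
echo "$selected"
```

       Note that a sed range closes at the first line matching the end
       pattern, so only the first request of the end date is included.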
1035
1036       If we want to preserve only a certain amount of data and recycle
1037       storage, we can keep only a certain number of days. For instance, to
1038       keep and show the last 5 days:
1039
1040              # goaccess access.log --keep-last=5
1041
1042   VIRTUAL HOSTS
1043       Assume your log contains the virtual host (server blocks) field, for
1044       instance:
1045
1046              vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
1047              /shop/bag-p-20 HTTP/1.1" 200 6715 "-"  "Apache  (internal  dummy
1048              connection)"
1049
1050       And you would like to append the virtual host to the request in order
1051       to see which virtual host the top URLs belong to:
1052
1053              # awk '$8=$1$8' access.log | goaccess -a -
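
       The effect of the awk expression can be inspected on the sample line
       above without running GoAccess. This is a sketch for checking field
       positions only.

```shell
line='vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"'
# $8 (the URL path) is prefixed with $1 (the virtual host); the non-empty
# result of the assignment is true, so awk prints the rebuilt line.
out=$(printf '%s\n' "$line" | awk '$8=$1$8')
echo "$out"
# Extract the rewritten request path to confirm the change.
req=$(printf '%s\n' "$out" | awk '{ print $8 }')
echo "$req"
```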
1054
1055       To exclude a list of virtual hosts you can do the following:
1056
1057              # grep -v  "`cat  exclude_vhost_list_file`"  vhost_access.log  |
1058              goaccess -
1059
1060   FILES & STATUS CODES
1061       To  parse specific pages, e.g., page views, html, htm, php, etc. within
1062       a request:
1063
1064              # awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -
1065
1066       Note: $7 is the request field for the common and combined log formats
1067       (without Virtual Host). If your log includes Virtual Host, then you
1068       probably want to use $8 instead. It's best to check which field you are
1069       shooting for, e.g.:
1070
1071              # tail -10 access.log | awk '{print $8}'
1072
1073       Or to parse a specific status code, e.g., 500 (Internal Server Error):
1074
1075              # awk '$9~/500/' access.log | goaccess -
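
       Both filters can be tried on a few sample lines first. The sketch
       below uses made-up entries in a log without a virtual host field, so
       the request path is $7 and the status code is $9. It uses a numeric
       comparison for the status code, which is stricter than a regex match.

```shell
# Three illustrative log lines: two pages and one static file.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [02/Mar/2016:08:14:04 -0600] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [02/Mar/2016:08:14:05 -0600] "GET /cart.php HTTP/1.1" 500 0
10.0.0.3 - - [02/Mar/2016:08:14:06 -0600] "GET /logo.png HTTP/1.1" 200 2048
EOF
# Count page requests; the dots are escaped so they match literally.
pages=$(awk '$7 ~ /\.html|\.htm|\.php/' "$log" | wc -l)
# Count requests whose status code is exactly 500.
errors=$(awk '$9 == 500' "$log" | wc -l)
echo "pages: $pages, 500s: $errors"
```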
1076
1077   SERVER
1078       Also, it is worth pointing out that if we want to run GoAccess at lower
1079       priority, we can run it as:
1080
1081              # nice -n 19 goaccess -f access.log -a
1082
1083       and if you don't want to install it on your server, you can  still  run
1084       it from your local machine:
1085
1086              #  ssh  -n  root@server  'tail -f /var/log/apache2/access.log' |
1087              goaccess -
1088
1089       Note: SSH requires -n so GoAccess can read from stdin. Also, make  sure
1090       to  use SSH keys for authentication as it won't work if a passphrase is
1091       required.
1092
1093   INCREMENTAL LOG PROCESSING
1094       GoAccess has the ability to  process  logs  incrementally  through  its
1095       internal  storage  and dump its data to disk. It works in the following
1096       way:
1097
1098
1099       1  A dataset must be persisted first with --persist, then the same
1100          dataset can be loaded with --restore.
1101
1102       2  If new data is passed (piped or through a log file), it will
1103          append it to the original dataset.
1104
1105
1106       NOTES
1107
1108       GoAccess keeps track of the inodes of all the files processed (assuming
1109       files will stay on the same partition). In addition, it extracts a
1110       snippet of data from the log along with the last line parsed of each
1111       file and the timestamp of the last line parsed, e.g.,
1112       inode:29627417|line:20012|ts:20171231235059
1113
1114       First, it checks whether the snippet matches the log being parsed. If it
1115       does, it assumes the log hasn't changed dramatically, e.g., hasn't been
1116       truncated. If the inode does not match the current file, it parses all
1117       lines. If the current file matches the inode, it then reads the
1118       remaining lines and updates the count of lines parsed and the timestamp.
1119       As an extra precaution, it won't parse log lines with a timestamp less
1120       than or equal to the one stored.
1121
1122       Piped data works based off the timestamp of the last line read. For
1123       instance, it will parse and discard all incoming entries until it finds
1124       a timestamp greater than or equal to the one stored.
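
       The rule for piped data can be pictured with a toy model in awk. This
       is not GoAccess's implementation; the line layout (a leading numeric
       timestamp followed by text) and the entries are made up for the
       illustration.

```shell
# Toy model of the rule above: drop entries until one has a timestamp
# greater than or equal to the stored one, then keep everything after it.
stored=20171231235059
kept=$(printf '%s\n' \
    '20171231235058 old entry' \
    '20171231235059 boundary entry' \
    '20180101000001 new entry' |
    awk -v s="$stored" 'found || $1 + 0 >= s + 0 { found = 1; print }')
echo "$kept"
```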
1125
1126
1127       For instance:
1128
1129              // last month access log
1130              # goaccess access.log.1 --persist
1131
1132       then, load it with
1133
1134              // append this month access log, and preserve new data
1135              # goaccess access.log --restore --persist
1136
1137       To read persisted data only (without parsing new data):
1138
1139              # goaccess --restore
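
       For a periodic job, the choice between the first run and later runs
       could be sketched as below. It assumes the database directory is used
       only by GoAccess, so that a non-empty directory means a dataset was
       persisted earlier; that heuristic is an assumption of this sketch, not
       documented behaviour. A scratch directory stands in for the fixed path
       a real job would use with --db-path.

```shell
# Scratch directory for the sketch; a real job would use a fixed path.
db_dir=$(mktemp -d)

# Assumption: a non-empty directory means data was persisted before.
if [ -n "$(ls -A "$db_dir")" ]; then
    opts="--restore --persist"   # append to the stored dataset
else
    opts="--persist"             # first run: create the dataset
fi

# Show the command that would be run (goaccess itself is not invoked).
cmd="goaccess access.log $opts --db-path=$db_dir"
echo "$cmd"
```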
1140

NOTES

1142       Each active panel has a total of 366 items, or 50 in the real-time HTML
1143       report. The number of items is customizable using max-items. However,
1144       only the CSV and JSON output allow a maximum number greater than the
1145       default value of 366 items per panel.
1146
1147       A  hit  is  a  request (line in the access log), e.g., 10 requests = 10
1148       hits. HTTP requests with the same IP, date, and user agent are  consid‐
1149       ered a unique visit.
1150

BUGS

1152       If you think you have found a bug, please send me an email at
1153       goaccess@prosoftcorp.com or use the issue tracker at
1154       https://github.com/allinurl/goaccess/issues
1155

AUTHOR

1157       Gerardo Orellana <goaccess@prosoftcorp.com>. For more details about
1158       GoAccess, or new releases, please visit https://goaccess.io
1159
1160
1161
1162Linux                            FEBRUARY 2021                     goaccess(1)