goaccess(1)                      User Manuals                      goaccess(1)

NAME

6       goaccess - fast web log analyzer and interactive viewer.
7

SYNOPSIS

9       goaccess [filename] [ options ... ] [-c][-M][-H][-q][-d][...]
10

DESCRIPTION

       GoAccess is an open source real-time web log analyzer and interactive
       viewer that runs in a terminal on *nix systems or through your
       browser.
15
16       It provides fast and valuable HTTP statistics for system administrators
17       that require a visual server report on the fly.
18
19       GoAccess parses the specified web log file and outputs the data to  the
20       X terminal. Features include:
21
22
23       General Statistics:
24              This  panel  gives a summary of several metrics, such as: number
25              of valid  and  invalid  requests,  time  taken  to  analyze  the
              dataset, unique visitors, requested files, static files (CSS,
              ICO, JPG, etc.), HTTP referrers, 404s, size of the parsed log
              file and bandwidth consumption.
29
30       Unique visitors
31              This panel shows metrics such as hits, unique visitors and cumu‐
32              lative bandwidth per date. HTTP requests containing the same IP,
33              the  same  date, and the same user agent are considered a unique
34              visitor. By default, it includes web crawlers/spiders.
35
36              Optionally, date specificity can be set to the hour level  using
37              --date-spec=hr  which will display dates such as 05/Jun/2016:16.
38              This is great if you want to track your  daily  traffic  at  the
39              hour level.
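
              The counting rule above (same IP, same date, same user agent)
              can be pictured with a minimal Python sketch. This is only an
              illustration, not how GoAccess is implemented; the tuple layout
              and the name unique_visitors_per_date are assumptions made for
              the example.

```python
from collections import defaultdict

def unique_visitors_per_date(requests):
    """Count distinct (IP, user agent) pairs for each date.

    `requests` is an iterable of (ip, date, user_agent) tuples. This
    layout is hypothetical and chosen only for the example.
    """
    seen = defaultdict(set)
    for ip, date, user_agent in requests:
        seen[date].add((ip, user_agent))
    return {date: len(pairs) for date, pairs in seen.items()}

sample = [
    ("10.0.0.1", "05/Jun/2016", "Firefox"),
    ("10.0.0.1", "05/Jun/2016", "Firefox"),  # same visitor, second hit
    ("10.0.0.1", "05/Jun/2016", "curl"),     # same IP, different agent
    ("10.0.0.2", "06/Jun/2016", "Firefox"),
]
counts = unique_visitors_per_date(sample)
```

              With the sample above, the first date has two unique visitors
              even though it received three hits.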
40
41       Requested files
42              This panel displays the most requested files on your web server.
43              It shows hits, unique visitors, and percentage, along  with  the
44              cumulative bandwidth, protocol, and the request method used.
45
46       Requested static files
              Lists the most frequently requested static files such as JPG,
              CSS, SWF, JS, GIF, and PNG file types, along with the same
              metrics as the last panel. Additional static files can be added
              to the configuration file.
51
52       404 or Not Found
              Displays the same metrics as the previous request panels;
              however, its data contains all pages that were not found on the
              server, commonly known as the 404 status code.
56
57       Hosts  This panel has detailed information  on  the  hosts  themselves.
58              This  is  great for spotting aggressive crawlers and identifying
59              who's eating your bandwidth.
60
61              Expanding the panel can display more information such as  host's
62              reverse DNS lookup result, country of origin and city. If the -a
63              argument is enabled, a list of user agents can be  displayed  by
64              selecting the desired IP address, and then pressing ENTER.
65
66       Operating Systems
67              This panel will report which operating system the host used when
68              it hit the server. It attempts to provide the most specific ver‐
69              sion of each operating system.
70
71       Browsers
72              This  panel  will report which browser the host used when it hit
73              the server. It attempts to provide the most specific version  of
74              each browser.
75
76       Visit Times
77              This  panel  will display an hourly report. This option displays
78              24 data points, one for each hour of the day.
79
              Optionally, hour specificity can be set to the tenth of an hour
              level using --hour-spec=min, which will display hours as 16:4.
              This is great if you want to spot peaks of traffic on your
              server.
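
              As a rough sketch of the two levels of hour specificity, the
              Python snippet below groups a time into either an hourly bucket
              or a tenth-of-an-hour bucket. The label format is inferred from
              the 16:4 example above, and visit_time_bucket is a hypothetical
              name; GoAccess itself may format or compute this differently.

```python
def visit_time_bucket(hour, minute, spec="hr"):
    """Return the bucket label for a request time.

    spec="hr"  -> one bucket per hour (24 per day)
    spec="min" -> one bucket per tenth of an hour (10-minute slot)
    """
    if spec == "min":
        return f"{hour:02d}:{minute // 10}"
    return f"{hour:02d}"

hourly = visit_time_bucket(16, 47)          # hour-level bucket
tenth = visit_time_bucket(16, 47, "min")    # 16:40 to 16:49 slot
```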
84
85       Virtual Hosts
86              This  panel  will display all the different virtual hosts parsed
87              from the access log. This panel  is  displayed  if  %v  is  used
88              within the log-format string.
89
90       Referrers URLs
              If the host in question accessed the site via another resource,
              or was linked/diverted to you from another host, the URL they
              were referred from will be provided in this panel. This panel
              is disabled by default. See `--ignore-panel` in your
              configuration file to enable it.
96
97       Referring Sites
              This panel will display only the host part of the URL where the
              request came from, not the whole URL.
100
101       Keyphrases
              It reports keyphrases used on Google search, Google cache, and
              Google translate that have led to your web server. At present,
              it only supports Google search queries via HTTP. This panel is
              disabled by default. See `--ignore-panel` in your configuration
              file to enable it.
107
108       Geo Location
109              Determines where an IP address is geographically  located.  Sta‐
110              tistics are broken down by continent and country. It needs to be
111              compiled with GeoLocation support.
112
113       HTTP Status Codes
114              The values of the numeric status code to HTTP requests.
115
116       Remote User (HTTP authentication)
              This is the userid of the person requesting the document as
              determined by HTTP authentication. If the document is not
              password protected, this part will be "-". This panel is not
              enabled unless %e is given within the log-format variable.
122
123
124       NOTE: Optionally and if configured, all panels can display the  average
125       time taken to serve the request.
126
127

STORAGE

129       There  are three storage options that can be used with GoAccess. Choos‐
130       ing one will depend on your environment and needs.
131
132       Default Hash Tables
133              In-memory storage provides better performance  at  the  cost  of
134              limiting  the  dataset  size to the amount of available physical
135              memory. By default GoAccess uses in-memory hash tables. If  your
136              dataset  can  fit in memory, then this will perform fine. It has
137              very good memory usage and pretty good performance.
138
139       Tokyo Cabinet On-Disk B+ Tree
140              Use this storage method for large datasets where it is not  pos‐
141              sible  to  fit  everything  in  memory.  The B+ tree database is
142              slower than any of the hash databases since data has to be  com‐
143              mitted to disk. However, using an SSD greatly increases the per‐
144              formance. You may also use this storage method if you need  data
145              persistence to quickly load statistics at a later date.
146
147       Tokyo Cabinet In-memory Hash Database
              An alternative to the default hash tables. It uses generic
              typing and thus its performance in terms of memory and speed is
              average.
151

CONFIGURATION

153       Multiple  options can be used to configure GoAccess. For a complete up-
154       to-date list of configure options, run ./configure --help
155
156       --enable-debug
157              Compile with debugging symbols and turn off  compiler  optimiza‐
158              tions.
159
160       --enable-utf8
161              Compile with wide character support. Ncursesw is required.
162
163       --enable-geoip=<legacy|geoip2>
164              Compile  with  GeoLocation support. MaxMind's GeoIP is required.
165              legacy will utilize the original GeoIP databases.   geoip2  will
166              utilize the enhanced GeoIP2 databases.
167
168       --enable-tcb=<memhash|btree>
169              Compile  with  Tokyo Cabinet storage support.  memhash will uti‐
170              lize Tokyo Cabinet's on-memory hash database.  btree  will  uti‐
171              lize Tokyo Cabinet's on-disk B+ Tree database.
172
173       --disable-zlib
174              Disable zlib compression on B+ Tree database.
175
176       --disable-bzip
177              Disable bzip2 compression on B+ Tree database.
178
179       --with-getline
180              Dynamically  expands  line  buffer  in  order to parse full line
181              requests instead of using a fixed size buffer of 4096.
182
183       --with-openssl
184              Compile GoAccess with OpenSSL support for its WebSocket server.
185

OPTIONS

187       The following options can be supplied to the command  or  specified  in
188       the  configuration  file.  If specified in the configuration file, long
189       options need to be used without prepending --  and  without  using  the
190       equal sign =.
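
       The rule above for long options that take a value can be sketched in
       Python as follows. The helper name to_config_line is hypothetical, and
       the sketch deliberately covers only the --name=value form that the
       paragraph describes.

```python
def to_config_line(option):
    """Turn a '--name=value' command-line option into a config file line.

    Only the '--name=value' form is handled; options without a value are
    outside the scope of this sketch.
    """
    if not option.startswith("--"):
        raise ValueError("expected a long option starting with '--'")
    name, sep, value = option[2:].partition("=")
    if not sep:
        raise ValueError("expected the form --name=value")
    return f"{name} {value}"

line = to_config_line("--date-spec=hr")
```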
191
192   LOG/DATE/TIME FORMAT
193       --time-format=<timeformat>
              The time-format variable, followed by a space, specifies the
              log format time containing either a name of a predefined format
              (see options below) or any combination of regular characters
              and special format specifiers.

              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %T or %H:%M:%S.

              Note that if a timestamp is given in microseconds, %f must be
              used as time-format.
204
205       --date-format=<dateformat>
              The date-format variable, followed by a space, specifies the
              log format date containing either a name of a predefined format
              (see options below) or any combination of regular characters
              and special format specifiers.

              They all begin with a percentage (%) sign. See `man strftime`.
              e.g., %Y-%m-%d.

              Note that if a timestamp is given in microseconds, %f must be
              used as date-format.
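
              To see how strftime-style specifiers map onto a concrete
              timestamp, the Python sketch below parses a date and a time
              using the example formats above. Python's strptime accepts a
              similar, but not identical, set of specifiers (for instance it
              does not accept %T), so treat this as an illustration of the
              idea rather than of GoAccess's own parser. The name
              parse_timestamp is hypothetical.

```python
from datetime import datetime

date_format = "%Y-%m-%d"   # example date-format from the text above
time_format = "%H:%M:%S"   # example time-format from the text above

def parse_timestamp(date_text, time_text):
    """Parse separate date and time fields with the formats above."""
    return datetime.strptime(
        f"{date_text} {time_text}", f"{date_format} {time_format}"
    )

parsed = parse_timestamp("2016-06-05", "16:41:07")
```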
216
217       --log-format=<logformat>
218              The log-format variable followed by a space or \t for tab-delim‐
219              ited, specifies the log format string.
220
221              Note that if there are spaces  within  the  format,  the  string
222              needs  to be enclosed in single/double quotes. Inner quotes need
223              to be escaped.
224
225              In addition to specifying the  raw  log/date/time  formats,  for
226              simplicity, any of the following predefined log format names can
227              be supplied to the log/date/time-format variables. GoAccess  can
228              also handle one predefined name in one variable and another pre‐
229              defined name in another variable.
230
231                COMBINED     - Combined Log Format,
232                VCOMBINED    - Combined Log Format with Virtual Host,
233                COMMON       - Common Log Format,
234                VCOMMON      - Common Log Format with Virtual Host,
235                W3C          - W3C Extended Log File Format,
236                SQUID        - Native Squid Log Format,
237                CLOUDFRONT   - Amazon CloudFront Web Distribution,
238                CLOUDSTORAGE - Google Cloud Storage,
239                AWSELB       - Amazon Elastic Load Balancing,
240                AWSS3        - Amazon Simple Storage Service (S3)
241
              Note: Piping data into GoAccess won't prompt a log/date/time
              configuration dialog; you will need to define it beforehand in
              your configuration file or on the command line.
245
246   USER INTERFACE OPTIONS
247       -c --config-dialog
248              Prompt log/time/date configuration window on program start. Only
249              when curses is initialized.
250
251       -i --hl-header
252              Color highlight active panel.
253
254       -m --with-mouse
255              Enable mouse support on main terminal dashboard.
256
       --color=<fg:bg[attrs, PANEL]>
258              Specify custom colors for the terminal output.
259
260              Color Syntax
261                DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
262
263               FG# = foreground color [-1...255] (-1 = default term color)
264               BG# = background color [-1...255] (-1 = default term color)
265
266              Optionally,  it  is possible to apply color attributes (multiple
267              attributes are comma separated), such as: bold, underline,  nor‐
268              mal, reverse, blink
269
270              If  desired,  it  is  possible to apply custom colors per panel,
271              that is, a metric in the REQUESTS panel can be of color A, while
272              the same metric in the BROWSERS panel can be of color B.
273
274              Available color definitions:
275                COLOR_MTRC_HITS
276                COLOR_MTRC_VISITORS
277                COLOR_MTRC_DATA
278                COLOR_MTRC_BW
279                COLOR_MTRC_AVGTS
280                COLOR_MTRC_CUMTS
281                COLOR_MTRC_MAXTS
282                COLOR_MTRC_PROT
283                COLOR_MTRC_MTHD
284                COLOR_MTRC_PERC
285                COLOR_MTRC_PERC_MAX
286                COLOR_PANEL_COLS
287                COLOR_BARS
288                COLOR_ERROR
289                COLOR_SELECTED
290                COLOR_PANEL_ACTIVE
291                COLOR_PANEL_HEADER
292                COLOR_PANEL_DESC
293                COLOR_OVERALL_LBLS
294                COLOR_OVERALL_VALS
295                COLOR_OVERALL_PATH
296                COLOR_ACTIVE_LABEL
297                COLOR_BG
298                COLOR_DEFAULT
299                COLOR_PROGRESS
300
301              See configuration file for a sample color scheme.
302
303       --color-scheme=<1|2|3>
304              Choose  among  color schemes.  1 for the default grey scheme.  2
305              for the green scheme.  3 for the Monokai scheme (shown  only  if
306              terminal supports 256 colors).
307
308       --crawlers-only
309              Parse and display only crawlers (bots).
310
311       --html-custom-css=<path.css>
312              Specifies a custom CSS file path to load in the HTML report.
313
314       --html-custom-js=<path.js>
315              Specifies a custom JS file path to load in the HTML report.
316
317       --html-report-title=<title>
318              Set HTML report page title and header.
319
320       --html-prefs=<JSON>
321              Set  HTML report default preferences. Supply a valid JSON object
322              containing the HTML preferences.  It allows the ability to  cus‐
323              tomize each panel plot. See example below.
324
325              Note: The JSON object passed needs to be a one line JSON string.
326              For instance,
327
328              --html-prefs='{"theme":"bright","perPage":5,"layout":"horizon‐
329              tal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
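
              Since the preferences must be passed as a one-line JSON string,
              one way to build the argument is to serialize it compactly and
              let a helper handle the shell quoting. The Python sketch below
              does this for the same preferences shown above; it only
              constructs the argument text and does not run GoAccess.

```python
import json
import shlex

prefs = {
    "theme": "bright",
    "perPage": 5,
    "layout": "horizontal",
    "showTables": True,
    "visitors": {"plot": {"chartType": "bar"}},
}

# Compact separators keep the JSON on a single line with no extra spaces.
one_line = json.dumps(prefs, separators=(",", ":"))

# shlex.quote wraps the JSON in single quotes so the shell passes it intact.
argument = "--html-prefs=" + shlex.quote(one_line)
```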
330
331       --json-pretty-print
332              Format JSON output using tabs and newlines.
333
              Note: This is not recommended when outputting a real-time HTML
              report since the WebSocket payload will be much larger.
336
337       --max-items=<number>
338              The maximum number of items to display per  panel.  The  maximum
339              can be a number between 1 and n.
340
341              Note:  Only  the  CSV  and  JSON  output  allow a maximum number
342              greater than the default value of 366 (or 50  in  the  real-time
343              HTML output) items per panel.
344
345       --no-color
346              Turn  off  colored output. This is the  default output on termi‐
347              nals that do not support colors.
348
349       --no-column-names
350              Don't write column names in the terminal output. By default,  it
351              displays column names for each available metric in every panel.
352
353       --no-csv-summary
354              Disable summary metrics on the CSV output.
355
356       --no-progress
357              Disable progress metrics [total requests/requests per second].
358
359       --no-tab-scroll
360              Disable  scrolling  through panels when TAB is pressed or when a
361              panel is selected using a numeric key.
362
363       --no-html-last-updated
364              Do not show the last updated field displayed in the HTML  gener‐
365              ated report.
366
367   SERVER OPTIONS
368       --addr Specify  IP address to bind the server to. Otherwise it binds to
369              0.0.0.0.
370
371              Usually there is no need to  specify  the  address,  unless  you
372              intentionally  would  like  to  bind  the  server to a different
373              address within your server.
374
375       --daemonize
376              Run GoAccess as daemon (only if --real-time-html enabled).
377
378       --origin=<url>
379              Ensure clients send the specified origin header  upon  the  Web‐
380              Socket handshake.
381
382       --port=<port>
383              Specify  the  port to use. By default GoAccess' WebSocket server
384              listens on port 7890.
385
386       --real-time-html
387              Enable real-time HTML output.
388
              GoAccess uses its own WebSocket server to push the data from the
              server to the client. See http://gwsocket.io for more details
              on how the WebSocket server works.
392
393       --ws-url=<[scheme://]url[:port]>
394              URL to which the WebSocket server responds. This is the URL sup‐
395              plied to the WebSocket constructor on the client side.
396
397              Optionally,  it is possible to specify the WebSocket URI scheme,
398              such as ws:// or wss:// for unencrypted  and  encrypted  connec‐
399              tions. e.g., wss://goaccess.io
400
401              If  GoAccess is running behind a proxy, you could set the client
402              side to connect to a different port by specifying the host  fol‐
403              lowed by a colon and the port.  e.g., goaccess.io:9999
404
405              By default, it will attempt to connect to the generated report's
406              hostname. If GoAccess is running on a remote server, the host of
407              the  remote  server should be specified here. Also, make sure it
408              is a valid host and NOT an http address.
409
410       --fifo-in=<path/file>
              Creates a named pipe (FIFO) that reads from the given
              path/file.
413
414       --fifo-out=<path/file>
415              Creates a named pipe (FIFO) that writes to the given path/file.
416
417       --ssl-cert=<cert.crt>
418              Path to TLS/SSL certificate. In order to enable TLS/SSL support,
419              GoAccess requires that --ssl-cert and --ssl-key are used.
420
421              Only if configured using --with-openssl
422
423       --ssl-key=<priv.key>
424              Path to TLS/SSL private key. In order to enable TLS/SSL support,
425              GoAccess requires that --ssl-cert and --ssl-key are used.
426
427              Only if configured using --with-openssl
428
429   FILE OPTIONS
430       -f --log-file=<logfile>
431              Specify  the  path  to  the input log file. If set in the config
432              file, it will take priority over -f from the command line.
433
434       -l --debug-file=<debugfile>
435              Send all debug messages to the specified file.
436
437       -p --config-file=<configfile>
438              Specify a custom configuration file to use. If set, it will take
439              priority over the global configuration file (if any).
440
441       --invalid-requests=<filename>
442              Log invalid requests to the specified file.
443
444       --no-global-config
              Do not load the global configuration file. The global
              configuration file is normally located in /usr/local/etc,
              unless specified with --sysconfdir=/dir.
448
449   PARSE OPTIONS
450       -a --agent-list
451              Enable a list of user-agents by host. For faster parsing, do not
452              enable this flag.
453
454       -d --with-output-resolver
455              Enable IP resolver on HTML|JSON output.
456
457       -e --exclude-ip=<IP|IP-range>
458              Exclude an IPv4 or IPv6  from  being  counted.   Ranges  can  be
459              included as well using a dash in between the IPs (start-end).
460
461              Examples:
462                exclude-ip 127.0.0.1
463                exclude-ip 192.168.0.1-192.168.0.100
464                exclude-ip ::1
465                exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
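
              The start-end range syntax can be pictured with Python's
              standard ipaddress module. This is a sketch of the matching
              rule only, under the assumption that both ends of a range are
              inclusive; is_excluded is a hypothetical helper, not part of
              GoAccess.

```python
import ipaddress

def is_excluded(ip_text, rule):
    """Return True if `ip_text` matches a single IP or a start-end range."""
    ip = ipaddress.ip_address(ip_text)
    start_text, _, end_text = rule.partition("-")
    start = ipaddress.ip_address(start_text)
    end = ipaddress.ip_address(end_text) if end_text else start
    # IPv4 and IPv6 addresses cannot be ordered against each other.
    if not (ip.version == start.version == end.version):
        return False
    return start <= ip <= end

inside = is_excluded("192.168.0.50", "192.168.0.1-192.168.0.100")
outside = is_excluded("192.168.0.101", "192.168.0.1-192.168.0.100")
```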
466
467       -H --http-protocol=<yes|no>
468              Set/unset  HTTP request protocol. This will create a request key
469              containing the request protocol + the actual request.
470
471       -M --http-method=<yes|no>
472              Set/unset HTTP request method. This will create  a  request  key
473              containing the request method + the actual request.
474
475       -o --output=<path/file.[json|csv|html]>
              Write output to the given file. The file extension determines
              the output format:
478
479                /path/file.csv  - Comma-separated values (CSV)
480                /path/file.json - JSON (JavaScript Object Notation)
481                /path/file.html - HTML
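
              The extension-to-format mapping listed above amounts to a small
              lookup, sketched here in Python. The name output_format is
              hypothetical, and the sketch assumes the extension is compared
              exactly as written.

```python
import os.path

FORMATS = {".csv": "CSV", ".json": "JSON", ".html": "HTML"}

def output_format(path):
    """Return the output format implied by the file extension."""
    extension = os.path.splitext(path)[1]
    if extension not in FORMATS:
        raise ValueError(f"unsupported output extension: {extension!r}")
    return FORMATS[extension]

chosen = output_format("/path/file.json")
```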
482
483       -q --no-query-string
484              Ignore        request's        query        string.        i.e.,
485              www.google.com/page.htm?query => www.google.com/page.htm.
486
487              Note: Removing the query string can greatly decrease memory con‐
488              sumption, especially on timestamped requests.
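
              The memory effect described in the note can be illustrated with
              a short Python sketch: requests that differ only in a
              timestamped query string collapse into a single key once the
              query string is dropped. The function name and the sample
              requests are made up for the example.

```python
def strip_query_string(request):
    """Drop everything from the first '?' onward."""
    return request.split("?", 1)[0]

requests = [
    "/page.htm?ts=1465144867",
    "/page.htm?ts=1465144868",
    "/page.htm?ts=1465144869",
]

keys_with_query = set(requests)
keys_without_query = {strip_query_string(r) for r in requests}
```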
489
490       -r --no-term-resolver
491              Disable IP resolver on terminal output.
492
493       --444-as-404
494              Treat non-standard status code 444 as 404.
495
496       --4xx-to-unique-count
497              Add 4xx client errors to the unique visitors count.
498
499       --all-static-files
500              Include  static  files  that  contain  a  query  string.   e.g.,
501              /fonts/fontawesome-webfont.woff?v=4.0.3
502
503       --date-spec=<date|hr>
504              Set  the date specificity to either date (default) or hr to dis‐
505              play hours appended to the date.
506
              This is used in the visitors panel. It's useful for tracking
              visitors at the hour level. For instance, an hour specificity
              would display traffic as 18/Dec/2010:19.
510
511       --double-decode
              Decode double-encoded values. This includes user-agent,
              request, and referer.
514
515       --enable-panel=<PANEL>
516              Enable parsing and displaying the given panel.
517
518              Available panels:
519                VISITORS
520                REQUESTS
521                REQUESTS_STATIC
522                NOT_FOUND
523                HOSTS
524                OS
525                BROWSERS
526                VISIT_TIMES
527                VIRTUAL_HOSTS
528                REFERRERS
529                REFERRING_SITES
530                KEYPHRASES
531                STATUS_CODES
532                REMOTE_USER
533                GEO_LOCATION
534
535       --hour-spec=<hr|min>
536              Set the time specificity to either hour (default) or min to dis‐
537              play the tenth of an hour appended to the hour.
538
539              This is used in the time distribution  panel.  It's  useful  for
540              tracking peaks of traffic on your server at specific times.
541
542       --ignore-crawlers
543              Ignore crawlers from being counted.
544
545       --ignore-panel=<PANEL>
546              Ignore parsing and displaying the given panel.
547
548              Available panels:
549                VISITORS
550                REQUESTS
551                REQUESTS_STATIC
552                NOT_FOUND
553                HOSTS
554                OS
555                BROWSERS
556                VISIT_TIMES
557                VIRTUAL_HOSTS
558                REFERRERS
559                REFERRING_SITES
560                KEYPHRASES
561                STATUS_CODES
562                REMOTE_USER
563
564       --ignore-referer=<referer>
565              Ignore  referers  from  being  counted. Wildcards allowed. e.g.,
566              *.domain.com ww?.domain.*
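
              As an illustration of the wildcard patterns above, the Python
              sketch below uses shell-style matching from the standard
              fnmatch module, where * matches any run of characters and ?
              matches exactly one. Whether GoAccess's own matcher behaves
              identically in every edge case is an assumption;
              is_ignored_referer is a hypothetical helper.

```python
from fnmatch import fnmatchcase

patterns = ["*.domain.com", "ww?.domain.*"]

def is_ignored_referer(host, patterns):
    """Return True if the referer host matches any wildcard pattern."""
    return any(fnmatchcase(host, pattern) for pattern in patterns)

first = is_ignored_referer("www.domain.com", patterns)
second = is_ignored_referer("ww2.domain.org", patterns)
third = is_ignored_referer("example.org", patterns)
```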
567
568       --ignore-status=<CODE>
569              Ignore parsing and displaying one or  multiple  status  code(s).
570              For multiple status codes, use this option multiple times.
571
572       --num-tests=<number>
573              Number of lines from the access log to test against the provided
574              log/date/time format. By default, the parser is set to  test  10
575              lines.   If  set  to 0, the parser won't test any lines and will
576              parse the  whole  access  log.  If  a  line  matches  the  given
577              log/date/time format before it reaches <number>, the parser will
578              consider the log to be valid,  otherwise  GoAccess  will  return
579              EXIT_FAILURE and display the relevant error messages.
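
              The testing rule above can be summarized in a few lines of
              Python. This is a sketch of the described behavior, not the
              parser itself; log_looks_valid and has_three_fields are
              hypothetical names, and the latter is a toy stand-in for a real
              log/date/time format check.

```python
from itertools import islice

def log_looks_valid(lines, matches_format, num_tests=10):
    """Return True if parsing should proceed under the rule above."""
    if num_tests == 0:
        return True  # testing disabled: the whole log is parsed regardless
    # Valid as soon as any of the first `num_tests` lines matches.
    return any(matches_format(line) for line in islice(lines, num_tests))

def has_three_fields(line):
    # Toy stand-in for a real log/date/time format check.
    return len(line.split()) == 3

good = ["garbage", "2016-06-05 16:41:07 10.0.0.1"]
bad = ["garbage"] * 20 + ["2016-06-05 16:41:07 10.0.0.1"]
```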
580
581       --process-and-exit
582              Parse  log  and  exit  without outputting data. Useful if we are
583              looking to only add new data to  the  on-disk  database  without
584              outputting to a file or a terminal.
585
586       --real-os
              Display real OS names, e.g., Windows XP, Snow Leopard.
588
589       --sort-panel=<PANEL,FIELD,ORDER>
590              Sort panel on initial load. Sort options are separated by comma.
591              Options are in the form: PANEL,METRIC,ORDER
592
593              Available metrics:
594                BY_HITS     - Sort by hits
595                BY_VISITORS - Sort by unique visitors
596                BY_DATA     - Sort by data
597                BY_BW       - Sort by bandwidth
598                BY_AVGTS    - Sort by average time served
599                BY_CUMTS    - Sort by cumulative time served
600                BY_MAXTS    - Sort by maximum time served
601                BY_PROT     - Sort by http protocol
602                BY_MTHD     - Sort by http method
603
604              Available orders:
605                ASC
606                DESC
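
              A value for this option can be checked against the metrics and
              orders listed above before it is used. The Python sketch below
              does only that validation; parse_sort_panel is a hypothetical
              helper and the panel name is not checked here.

```python
METRICS = {
    "BY_HITS", "BY_VISITORS", "BY_DATA", "BY_BW", "BY_AVGTS",
    "BY_CUMTS", "BY_MAXTS", "BY_PROT", "BY_MTHD",
}
ORDERS = {"ASC", "DESC"}

def parse_sort_panel(value):
    """Split 'PANEL,METRIC,ORDER' and validate the metric and order."""
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError("expected the form PANEL,METRIC,ORDER")
    panel, metric, order = parts
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric}")
    if order not in ORDERS:
        raise ValueError(f"unknown order: {order}")
    return panel, metric, order

parsed_sort = parse_sort_panel("REQUESTS,BY_HITS,ASC")
```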
607
608       --static-file=<extension>
              Add static file extension, e.g., .mp3. Extensions are case
              sensitive.
611
612   GEOLOCATION OPTIONS
613       -g --std-geoip
614              Standard GeoIP database for less memory usage.
615
616       --geoip-database=<geofile>
              Specify path to GeoIP database file, e.g., GeoLiteCity.dat. The
              file needs to be downloaded from maxmind.com. IPv4 and IPv6
              files are supported as well. Note: `--geoip-city-data` is an
              alias of `--geoip-database`.
621
622   OTHER OPTIONS
623       -h --help
624              The help.
625
626       -s --storage
              Display current storage method, e.g., B+ Tree, Hash.
628
629       -V --version
630              Display version information and exit.
631
632       --dcf  Display the path of the default config file  when  `-p`  is  not
633              used.
634
635   ON-DISK STORAGE OPTIONS
636       --keep-db-files
637              Persist  parsed  data  into disk. If database files exist, files
638              will be overwritten. This should be set to  the  first  dataset.
639              Setting  it to false will delete all database files when exiting
640              the program. See examples below.
641
642              Only if configured with --enable-tcb=btree
643
644       --load-from-disk
645              Load previously stored data from disk. If reading persisted data
646              only,  the  database  files need to exist. See keep-db-files and
647              examples below.
648
649              Only if configured with --enable-tcb=btree
650
651       --db-path=<dir>
652              Path where the on-disk database files are  stored.  The  default
653              value is the /tmp directory.
654
655              Only if configured with --enable-tcb=btree
656
657       --xmmap=<num>
658              Set  the  size  in bytes of the extra mapped memory. The default
659              value is 0.
660
661              Only if configured with --enable-tcb=btree
662
663       --cache-lcnum=<num>
              Specifies the maximum number of leaf nodes to be cached. If it
              is not more than 0, the default value is used. The default
              value is 1024. A larger value increases speed at the cost of
              higher memory consumption; a lower value decreases memory
              consumption.
669
670              Only if configured with --enable-tcb=btree
671
672       --cache-ncnum=<num>
673              Specifies the maximum number of non-leaf nodes to be cached.  If
674              it  is  not  more  than  0,  the default value is specified. The
675              default value is 512.
676
677              Only if configured with --enable-tcb=btree
678
679       --tune-lmemb=<num>
680              Specifies the number of members in each leaf page. If it is  not
681              more  than  0, the default value is specified. The default value
682              is 128.
683
684              Only if configured with --enable-tcb=btree
685
686       --tune-nmemb=<num>
687              Specifies the number of members in each non-leaf page. If it  is
688              not  more  than  0,  the default value is specified. The default
689              value is 256.
690
691              Only if configured with --enable-tcb=btree
692
693       --tune-bnum=<num>
              Specifies the number of elements of the bucket array. If it is
              not more than 0, the default value is specified. The default
              value is 32749. The suggested size of the bucket array is about
              1 to 4 times the number of all pages to be stored.
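
              The sizing suggestion is simple arithmetic, shown here as a
              Python sketch. The page count of 10000 is an arbitrary example
              value, and suggested_bnum_range is a hypothetical helper.

```python
DEFAULT_BNUM = 32749  # default bucket array size stated above

def suggested_bnum_range(page_count):
    """Return the (low, high) bucket array sizes: 1x to 4x the page count."""
    return page_count, 4 * page_count

low, high = suggested_bnum_range(10000)
default_is_in_range = low <= DEFAULT_BNUM <= high
```

              For this example page count, the default value already falls
              inside the suggested range.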
698
699              Only if configured with --enable-tcb=btree
700
701       --compression=<zlib|bz2>
702              Specifies that each page is compressed with ZLIB|BZ2 encoding.
703
704              Only if configured with --enable-tcb=btree
705
706

CUSTOM LOG/DATE FORMAT

708       GoAccess can parse virtually any web log format.
709
710       Predefined  options include, Common Log Format (CLF), Combined Log For‐
711       mat (XLF/ELF), including virtual host, Amazon CloudFront (Download Dis‐
712       tribution), Google Cloud Storage and W3C format (IIS).
713
714       GoAccess allows any custom format string as well.
715
716       There  are two ways to configure the log format.  The easiest is to run
717       GoAccess with -c to prompt a configuration window. Otherwise, it can be
718       configured under ~/.goaccessrc or the %sysconfdir%.
719
720       time-format
              The time-format variable, followed by a space, specifies the
              log format time containing any combination of regular
              characters and special format specifiers. They all begin with a
              percentage (%) sign. See `man strftime`. e.g., %T or %H:%M:%S.

              Note: If a timestamp is given in microseconds, %f must be used
              as time-format.
728
729       date-format
730              The date-format variable, followed by a space, specifies the log
731              format date. It may contain any combination of regular characters
732              and special format specifiers, which all begin with a percent (%)
733              sign. See `man strftime`. Example: %Y-%m-%d.
734
735              Note: If a timestamp is given in microseconds, %f must be used
736              as date-format.
737
738       log-format
739              The log-format variable, followed by a space or \t, specifies
740              the log format string.
741
742       %x     A date and time field matching the time-format and date-format
743              variables. This is used when a timestamp is given instead of the
744              date and time being in two separate fields.
745
746       %t     The time field matching the time-format variable.
747
748       %d     The date field matching the date-format variable.
749
750       %v     The canonical Server Name of  the  server  serving  the  request
751              (Virtual Host).
752
753       %e     This  is  the  userid  of  the person requesting the document as
754              determined by HTTP authentication.
755
756       %h     The host (the client IP address, either IPv4 or IPv6).
757
758       %r     The request line from the client. This requires specific
759              delimiters around the request (such as single quotes, double
760              quotes, or anything else) to be parsable. If there are none, use
761              a combination of special format specifiers such as %m %U %H.
762
763       %q     The query string.
764
765       %m     The request method.
766
767       %U     The URL path requested.
768
769              Note: If the query string is in %U, there is no need to use %q.
770              However, if the URL path does not include any query string, you
771              may use %q and the query string will be appended to the request.
772
773       %H     The request protocol.
774
775       %s     The status code that the server sends back to the client.
776
777       %b     The size of the object returned to the client.
778
779       %R     The "Referrer" HTTP request header.
780
781       %u     The user-agent HTTP request header.
782
783       %D     The  time taken to serve the request, in microseconds as a deci‐
784              mal number.
785
786       %T     The time taken to serve the request, in seconds  with  millisec‐
787              onds resolution.
788
789       %L     The  time taken to serve the request, in milliseconds as a deci‐
790              mal number.
791
792       %^     Ignore this field.
793
794       %~     Move forward through the log string until a non-space (!isspace)
795              char is found.
796
797       ~h     The host (the client IP address, either IPv4 or IPv6) in an
798              X-Forwarded-For (XFF) field.
799
800              It uses a special specifier which consists of a tilde before the
801              host specifier, followed by the character(s) that delimit the
802              XFF field, which are enclosed in curly braces (e.g., ~h{," }).
803
804              For example, ~h{," } is used in  order  to  parse  "11.25.11.53,
805              17.68.33.17"  field  which  is  delimited  by  a double quote, a
806              comma, and a space.
807
808       Note: In order to get the average, cumulative and maximum  time  served
809       in  GoAccess, you will need to start logging response times in your web
810       server. In Nginx you can add $request_time to your log format, or %D in
811       Apache.
812
813       Important:  If  multiple  time  served  specifiers are used at the same
814       time, the first option specified in the format string will take  prior‐
815       ity over the other specifiers.
816
817       GoAccess requires the following fields:
818
819              %h a valid IPv4/6
820
821              %d a valid date
822
823              %r the request
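
       As an illustration of how the specifiers above combine, the sketch
       below spells out a format string for the Combined Log Format and
       passes it through --log-format, --date-format and --time-format. The
       sample request, the scratch directory and the file names
       (sample_access.log, sample_report.html) are made up for the example,
       and goaccess is only invoked if it is installed.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# One made-up request in Combined Log Format.
printf '%s\n' '192.0.2.10 - - [05/Jun/2016:16:02:11 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"' \
    > "$workdir/sample_access.log"

# %h host, %^ ignored field, %d:%t date and time, "%r" quoted request,
# %s status code, %b size, "%R" referrer, "%u" user agent.
LOG_FORMAT='%h %^[%d:%t %^] "%r" %s %b "%R" "%u"'
DATE_FORMAT='%d/%b/%Y'
TIME_FORMAT='%H:%M:%S'

# Only generate the report when goaccess is available on this machine.
if command -v goaccess >/dev/null 2>&1; then
    goaccess "$workdir/sample_access.log" \
        --log-format="$LOG_FORMAT" \
        --date-format="$DATE_FORMAT" \
        --time-format="$TIME_FORMAT" \
        -o "$workdir/sample_report.html"
fi
```

       The three required fields (%h, %d and %r) are all present in this
       string; the remaining specifiers are optional.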
824

INTERACTIVE MENU

826       F1 or h
827              Main help.
828
829       F5     Redraw main window.
830
831       q      Quit the program, current window or collapse active module
832
833       o or  ENTER
834              Expand selected module or open window
835
836       0-9 and Shift + 0
837              Set selected module to active
838
839       j      Scroll down within expanded module
840
841       k      Scroll up within expanded module
842
843       c      Set or change the color scheme.
844
845       TAB    Forward iteration of modules. Starts from current active module.
846
847       SHIFT + TAB
848              Backward  iteration  of modules. Starts from current active mod‐
849              ule.
850
851       ^f     Scroll forward one screen within an active module.
852
853       ^b     Scroll backward one screen within an active module.
854
855       s      Sort options for active module
856
857       /      Search across all modules (regex allowed)
858
859       n      Find the position of the next occurrence across all modules.
860
861       g      Move to the first item or top of screen.
862
863       G      Move to the last item or bottom of screen.
864

EXAMPLES

866   DIFFERENT OUTPUTS
867       To output to a terminal and generate an interactive report:
868
869              # goaccess access.log
870
871       To generate an HTML report:
872
873              # goaccess access.log -a -o report.html
874
875       To generate a JSON report:
876
877              # goaccess access.log -a -d -o report.json
878
879       To generate a CSV file:
880
881              # goaccess access.log --no-csv-summary -o report.csv
882
883       GoAccess also allows great  flexibility  for  real-time  filtering  and
884       parsing.  For  instance,  to quickly diagnose issues by monitoring logs
885       since goaccess was started:
886
887              # tail -f access.log | goaccess -
888
889       Even better, to filter while keeping the pipe open to preserve
890       real-time analysis, we can make use of tail -f and a pattern-matching
891       tool such as grep, awk, sed, etc.:
892
893              # tail -f access.log | grep -i --line-buffered 'firefox' |
894                goaccess --log-format=COMBINED -
895
896   MULTIPLE LOG FILES
897       There  are  several ways to parse multiple logs with GoAccess. The sim‐
898       plest is to pass multiple log files to the command line:
899
900              # goaccess access.log access.log.1
901
902       It's even possible to parse files from a  pipe  while  reading  regular
903       files:
904
905              # cat access.log.2 | goaccess access.log access.log.1 -
906
907       Note  that the single dash is appended to the command line to let GoAc‐
908       cess know that it should read from the pipe.
909
910       Now if we want to add more flexibility to GoAccess, we can do a  series
911       of  pipes. For instance, if we would like to process all compressed log
912       files access.log.*.gz in addition to the current log file, we can do:
913
914              # zcat access.log.*.gz | goaccess access.log -
915
916       Note: On Mac OS X, use gunzip -c instead of zcat.
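
       A portable variant of the pipeline above, as a minimal sketch: gzip -dc
       behaves like zcat and should be available on both Linux and Mac OS X.
       The file contents and the scratch directory are made up, and
       "cat FILE -" stands in for "goaccess FILE -" so the example runs
       without GoAccess; it merges a regular file with the data arriving on
       the pipe in the same way.

```shell
# Build a few made-up log files in a scratch directory.
workdir=$(mktemp -d)
printf 'old request 1\nold request 2\n' | gzip > "$workdir/access.log.2.gz"
printf 'older request 1\n' | gzip > "$workdir/access.log.3.gz"
printf 'current request 1\n' > "$workdir/access.log"

# gzip -dc writes the decompressed data to stdout, like zcat.
# "cat FILE -" reads FILE first and then standard input.
total=$(gzip -dc "$workdir"/access.log.*.gz \
    | cat "$workdir/access.log" - \
    | wc -l | tr -d ' ')

echo "lines that would reach goaccess: $total"
```

       To run the real analysis, replace the cat stage with
       goaccess access.log - as shown earlier.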
917
918   REAL TIME HTML OUTPUT
919       GoAccess has the ability to output real-time data in the HTML report.
920       You can even email the HTML file since it is a single file with no
921       external file dependencies. How neat is that!
922
923       The process of generating a real-time HTML report is  very  similar  to
924       the  process  of  creating  a  static  report. Only --real-time-html is
925       needed to make it real-time.
926
927              # goaccess access.log -o /usr/share/nginx/html/site/report.html \
928                --real-time-html
929
930       By default, GoAccess will use the host name of the generated report.
931       Optionally, you can specify the URL to which the client's browser will
932       connect. See http://goaccess.io/faq for a more detailed example.
933
934              # goaccess access.log -o report.html --real-time-html \
935                --ws-url=goaccess.io
936
937       By default, GoAccess listens on port 7890. To use a different port,
938       specify it as follows (make sure the port is open):
939
940              # goaccess access.log -o report.html --real-time-html \
941                --port=9870
942
943       To bind the WebSocket server to an address other than 0.0.0.0,
944       specify it as follows:
945
946              # goaccess access.log -o report.html --real-time-html \
947                --addr=127.0.0.1
948
949       Note: To output real time data over a TLS/SSL connection, you  need  to
950       use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
951
952   WORKING WITH DATES
953       Another useful pipe is one that filters the web log by date.
954
955       The  following will get all HTTP requests starting on 05/Dec/2010 until
956       the end of the file.
957
958              # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -
959
960       or using relative dates such as yesterday or one week ago:
961
962              # sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log |
963                goaccess -a -
964
965       If we want to parse only a certain time-frame from DATE a to DATE b, we
966       can do:
967
968              # sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
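
       The date range selection can be tried on a small sample before
       pointing it at a real log. The sketch below uses made-up requests and
       a scratch directory, and relies on the sed address form \#regex#, which
       makes '#' the delimiter so that the slashes in the date need no
       escaping.

```shell
# Three made-up requests on consecutive days.
workdir=$(mktemp -d)
printf '%s\n' \
    '192.0.2.1 - - [04/Dec/2010:23:59:59 +0000] "GET /a HTTP/1.1" 200 10' \
    '192.0.2.2 - - [05/Dec/2010:00:00:01 +0000] "GET /b HTTP/1.1" 200 20' \
    '192.0.2.3 - - [06/Dec/2010:08:30:00 +0000] "GET /c HTTP/1.1" 404 30' \
    > "$workdir/access.log"

# Print from the first line matching 05/Dec/2010 to the end of the file.
from_dec5=$(sed -n '\#05/Dec/2010#,$ p' "$workdir/access.log")

printf '%s\n' "$from_dec5"
```

       Only the requests from 05/Dec/2010 onwards are printed; the output can
       then be piped into goaccess -a - as in the examples above.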
969
970   VIRTUAL HOSTS
971       Suppose your log contains the virtual host (server blocks) field, for
972       instance:
973
974              vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
975              /shop/bag-p-20 HTTP/1.1" 200 6715 "-"  "Apache  (internal  dummy
976              connection)"
977
978       And you would like to append the virtual host to the request in order
979       to see which virtual host the top URLs belong to:
980
981              # awk '$8=$1$8' access.log | goaccess -a -
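
       To see what the awk rewrite does before feeding it to GoAccess, it
       can be applied to the sample line shown above. This is only a sketch;
       the variable names are illustrative.

```shell
line='vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"'

# Field 1 is the virtual host and field 8 the URL path. The assignment
# rewrites field 8 and, because the result is a non-empty string, it also
# acts as a true pattern, so awk prints the modified line.
rewritten=$(printf '%s\n' "$line" | awk '$8=$1$8')

printf '%s\n' "$rewritten"
```

       The request in the output reads "GET vhost.com:80/shop/bag-p-20
       HTTP/1.1", so each virtual host shows up as part of the requested
       path.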
982
983       To exclude a list of virtual hosts you can do the following:
984
985              # grep -v  "`cat  exclude_vhost_list_file`"  vhost_access.log  |
986              goaccess -
987
988   FILES & STATUS CODES
989       To  parse specific pages, e.g., page views, html, htm, php, etc. within
990       a request:
991
992              # awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -
993
994       Note that $7 is the request field for the common and combined log
995       formats (without Virtual Host). If your log includes Virtual Host, you
996       probably want to use $8 instead. It's best to check which field you are
997       aiming for, e.g.:
998
999              # tail -10 access.log | awk '{print $8}'
1000
1001       Or to parse a specific status code, e.g., 500 (Internal Server Error):
1002
1003              # awk '$9~/500/' access.log | goaccess -
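
       The same idea extends to whole classes of status codes and to stricter
       file matching. The sketch below runs on made-up requests in a scratch
       directory and assumes a log without a virtual host field, so that $7
       is the request path and $9 the status code.

```shell
# Four made-up requests: two pages, one API call, one image.
workdir=$(mktemp -d)
printf '%s\n' \
    '192.0.2.1 - - [05/Dec/2010:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 100' \
    '192.0.2.2 - - [05/Dec/2010:10:00:01 +0000] "GET /api/report HTTP/1.1" 500 0' \
    '192.0.2.3 - - [05/Dec/2010:10:00:02 +0000] "GET /cart.php HTTP/1.1" 503 0' \
    '192.0.2.4 - - [05/Dec/2010:10:00:03 +0000] "GET /logo.png HTTP/1.1" 200 2048' \
    > "$workdir/access.log"

# Any 5xx response: anchor the pattern so only three-digit codes match.
server_errors=$(awk '$9 ~ /^5[0-9][0-9]$/' "$workdir/access.log")

# Pages only: a literal dot, then the extension at the end of the path
# or right before a query string.
pages=$(awk '$7 ~ /\.(html?|php)([?]|$)/' "$workdir/access.log")

printf 'server errors:\n%s\n' "$server_errors"
printf 'pages:\n%s\n' "$pages"
```

       Either result can be piped into goaccess - in place of printing it.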
1004
1005   SERVER
1006       Also, it is worth pointing out that if we want to run GoAccess at lower
1007       priority, we can run it as:
1008
1009              # nice -n 19 goaccess -f access.log -a
1010
1011       If you don't want to install it on your server, you can still run
1012       it from your local machine:
1013
1014              # ssh root@server 'cat /var/log/apache2/access.log' |
1015                goaccess -a -
1016
1017   INCREMENTAL LOG PROCESSING
1018       GoAccess has the ability to process logs incrementally through the  on-
1019       disk B+Tree database. It works in the following way:
1020
1021
1022       1  A  dataset  must  be  persisted first with --keep-db-files, then the
1023          same dataset can be loaded with --load-from-disk.
1024
1025       2  If new data is passed (piped or through a log file), it is appended
1026          to the original dataset.
1027
1028       3  To preserve the data at all times, --keep-db-files must be used.
1029
1030       4  If  --load-from-disk is used without --keep-db-files, database files
1031          will be deleted upon closing the program.
1032
1033       For instance:
1034
1035              // last month access log
1036              goaccess access.log.1 --keep-db-files
1037
1038       then, load it with
1039
1040              // append this month access log, and preserve new data
1041              goaccess access.log --load-from-disk --keep-db-files
1042
1043       To read persisted data only (without parsing new data):
1044
1045              goaccess --load-from-disk --keep-db-files
1046

NOTES

1048       Each active panel has a total of 366 items, or 50 in the real-time HTML
1049       report. The number of items is customizable using max-items. However,
1050       only the CSV and JSON output allow a maximum number greater than the
1051       default value of 366 items per panel.
1052
1053       When  analyzing  the  same  log file twice using the on-disk B+Tree and
1054       using --keep-db-files and --load-from-disk on each run,  GoAccess  will
1055       count each entry twice. Issue #334 will address this issue.
1056
1057       A  hit  is  a  request (line in the access log), e.g., 10 requests = 10
1058       hits. HTTP requests with the same IP, date, and user agent are  consid‐
1059       ered a unique visit.
1060

BUGS

1062       If you think you have found a bug, please send me an email at
1063       goaccess@prosoftcorp.com or use the issue tracker at
1064       https://github.com/allinurl/goaccess/issues
1065

AUTHOR

1067       Gerardo Orellana <goaccess@prosoftcorp.com>. For more details about it,
1068       or new releases, please visit http://goaccess.io
1069
1070
1071
1072Linux                             MARCH 2017                       goaccess(1)