goaccess(1) User Manuals goaccess(1)
2
3
4
6 goaccess - fast web log analyzer and interactive viewer.
7
9 goaccess [filename] [ options ... ] [-c][-M][-H][-q][-d][...]
10
GoAccess is an open source real-time web log analyzer and
interactive viewer that runs in a terminal on *nix systems or through
your browser.
15
16 It provides fast and valuable HTTP statistics for system administrators
17 that require a visual server report on the fly.
18
19 GoAccess parses the specified web log file and outputs the data to the
20 X terminal. Features include:
21
22
23 General Statistics:
This panel gives a summary of several metrics, such as: number
of valid and invalid requests, time taken to analyze the
dataset, unique visitors, requested files, static files (CSS,
ICO, JPG, etc.), HTTP referrers, 404s, size of the parsed log
file, and bandwidth consumption.
29
30 Unique visitors
31 This panel shows metrics such as hits, unique visitors and cumu‐
32 lative bandwidth per date. HTTP requests containing the same IP,
33 the same date, and the same user agent are considered a unique
34 visitor. By default, it includes web crawlers/spiders.
35
36 Optionally, date specificity can be set to the hour level using
37 --date-spec=hr which will display dates such as 05/Jun/2016:16.
38 This is great if you want to track your daily traffic at the
39 hour level.
40
41 Requested files
42 This panel displays the most requested files on your web server.
43 It shows hits, unique visitors, and percentage, along with the
44 cumulative bandwidth, protocol, and the request method used.
45
46 Requested static files
Lists the most frequently requested static files, such as JPG,
CSS, SWF, JS, GIF, and PNG file types, along with the same
metrics as the previous panel. Additional static file extensions
can be added in the configuration file.
51
52 404 or Not Found
Displays the same metrics as the previous request panels;
however, its data contains all pages that were not found on the
server, commonly identified by the 404 status code.
56
57 Hosts This panel has detailed information on the hosts themselves.
58 This is great for spotting aggressive crawlers and identifying
59 who's eating your bandwidth.
60
61 Expanding the panel can display more information such as host's
62 reverse DNS lookup result, country of origin and city. If the -a
63 argument is enabled, a list of user agents can be displayed by
64 selecting the desired IP address, and then pressing ENTER.
65
66 Operating Systems
67 This panel will report which operating system the host used when
68 it hit the server. It attempts to provide the most specific ver‐
69 sion of each operating system.
70
71 Browsers
72 This panel will report which browser the host used when it hit
73 the server. It attempts to provide the most specific version of
74 each browser.
75
76 Visit Times
77 This panel will display an hourly report. This option displays
78 24 data points, one for each hour of the day.
79
Optionally, hour specificity can be set to the tenth of an hour
level using --hour-spec=min, which will display hours as 16:4.
This is great if you want to spot peaks of traffic on your
server.
84
85 Virtual Hosts
86 This panel will display all the different virtual hosts parsed
87 from the access log. This panel is displayed if %v is used
88 within the log-format string.
89
90 Referrers URLs
If the host in question accessed the site via another resource,
or was linked/diverted to you from another host, the URL they
were referred from will be provided in this panel. This panel is
disabled by default; see `--ignore-panel` in your configuration
file to enable it.
96
97 Referring Sites
This panel displays only the host part of the URL the request
came from, not the whole URL.
100
101 Keyphrases
It reports keyphrases used on Google search, Google cache, and
Google translate that have led to your web server. At present,
it only supports Google search queries via HTTP. This panel is
disabled by default; see `--ignore-panel` in your configuration
file to enable it.
107
108 Geo Location
109 Determines where an IP address is geographically located. Sta‐
110 tistics are broken down by continent and country. It needs to be
111 compiled with GeoLocation support.
112
113 HTTP Status Codes
114 The values of the numeric status code to HTTP requests.
115
116 Remote User (HTTP authentication)
This is the userid of the person requesting the document as
determined by HTTP authentication. If the document is not
password protected, this part will be "-". This panel is not
enabled unless %e is given within the log-format variable.
122
123
124 NOTE: Optionally and if configured, all panels can display the average
125 time taken to serve the request.
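
The unique-visitor rule given for the Unique visitors panel (same IP, same date, same user agent) can be approximated outside GoAccess. The following is only a rough sketch under stated assumptions: the function name `count_unique_visitors` and the sample log lines are hypothetical, the input is assumed to be in combined log format, and GoAccess's own counting may differ in details such as crawler handling.

```shell
# Count unique visitors per day: one per distinct (IP, date, user agent).
# Assumes combined-format lines where the user agent is the third quoted field.
count_unique_visitors() {
    awk -F'"' '{
        split($1, head, " ")           # head[1] = IP, head[4] = [dd/Mon/yyyy:hh:mm:ss
        day = substr(head[4], 2, 11)   # dd/Mon/yyyy
        key = head[1] SUBSEP day SUBSEP $6
        if (!(key in seen)) { seen[key] = 1; visitors[day]++ }
    }
    END { for (d in visitors) print d, visitors[d] }' "$@"
}

# Hypothetical sample: two hits from one browser, one hit from another agent.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10.0.0.1 - - [05/Jun/2016:16:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.1 - - [05/Jun/2016:16:05:09 +0000] "GET /about HTTP/1.1" 200 256 "-" "Mozilla/5.0"
10.0.0.1 - - [05/Jun/2016:17:10:30 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/7.47.0"
EOF

visitors=$(count_unique_visitors "$sample")
rm -f "$sample"
printf '%s\n' "$visitors"    # 05/Jun/2016 2
```

The same IP address counts twice here because the user agent differs, which is why a shared IP can contribute several unique visitors on one date.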
126
127
129 There are three storage options that can be used with GoAccess. Choos‐
130 ing one will depend on your environment and needs.
131
132 Default Hash Tables
133 In-memory storage provides better performance at the cost of
134 limiting the dataset size to the amount of available physical
135 memory. By default GoAccess uses in-memory hash tables. If your
136 dataset can fit in memory, then this will perform fine. It has
137 very good memory usage and pretty good performance.
138
139 Tokyo Cabinet On-Disk B+ Tree
140 Use this storage method for large datasets where it is not pos‐
141 sible to fit everything in memory. The B+ tree database is
142 slower than any of the hash databases since data has to be com‐
143 mitted to disk. However, using an SSD greatly increases the per‐
144 formance. You may also use this storage method if you need data
145 persistence to quickly load statistics at a later date.
146
147 Tokyo Cabinet In-memory Hash Database
An alternative to the default hash tables. It uses generic
typing and thus its performance in terms of memory and speed is
average.
151
153 Multiple options can be used to configure GoAccess. For a complete up-
154 to-date list of configure options, run ./configure --help
155
156 --enable-debug
157 Compile with debugging symbols and turn off compiler optimiza‐
158 tions.
159
160 --enable-utf8
161 Compile with wide character support. Ncursesw is required.
162
163 --enable-geoip=<legacy|geoip2>
164 Compile with GeoLocation support. MaxMind's GeoIP is required.
165 legacy will utilize the original GeoIP databases. geoip2 will
166 utilize the enhanced GeoIP2 databases.
167
168 --enable-tcb=<memhash|btree>
169 Compile with Tokyo Cabinet storage support. memhash will uti‐
170 lize Tokyo Cabinet's on-memory hash database. btree will uti‐
171 lize Tokyo Cabinet's on-disk B+ Tree database.
172
173 --disable-zlib
174 Disable zlib compression on B+ Tree database.
175
176 --disable-bzip
177 Disable bzip2 compression on B+ Tree database.
178
179 --with-getline
180 Dynamically expands line buffer in order to parse full line
181 requests instead of using a fixed size buffer of 4096.
182
183 --with-openssl
184 Compile GoAccess with OpenSSL support for its WebSocket server.
185
187 The following options can be supplied to the command or specified in
188 the configuration file. If specified in the configuration file, long
189 options need to be used without prepending -- and without using the
190 equal sign =.
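
As a small illustration of the rule above, a long option can be rewritten mechanically into its configuration-file form. The helper name `opt_to_conf` is hypothetical and is not part of GoAccess; this sketch only handles options that take a value.

```shell
# Hypothetical helper: rewrite a long command-line option into the
# configuration-file form (no leading "--", space instead of "=").
opt_to_conf() {
    printf '%s\n' "$1" | sed -e 's/^--//' -e 's/=/ /'
}

opt_to_conf '--date-spec=hr'          # prints: date-spec hr
opt_to_conf '--log-format=COMBINED'   # prints: log-format COMBINED
```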
191
192 LOG/DATE/TIME FORMAT
193 --time-format=<timeformat>
The time-format variable followed by a space, specifies the log
format time containing either a name of a predefined format (see
options below) or any combination of regular characters and
special format specifiers.

They all begin with a percentage (%) sign. See `man strftime`.
e.g., %T or %H:%M:%S.

Note that if a timestamp is given in microseconds, %f must be
used as time-format.
204
205 --date-format=<dateformat>
The date-format variable followed by a space, specifies the log
format date containing either a name of a predefined format (see
options below) or any combination of regular characters and
special format specifiers.

They all begin with a percentage (%) sign. See `man strftime`.
e.g., %Y-%m-%d.

Note that if a timestamp is given in microseconds, %f must be
used as date-format.
216
217 --log-format=<logformat>
218 The log-format variable followed by a space or \t for tab-delim‐
219 ited, specifies the log format string.
220
221 Note that if there are spaces within the format, the string
222 needs to be enclosed in single/double quotes. Inner quotes need
223 to be escaped.
224
225 In addition to specifying the raw log/date/time formats, for
226 simplicity, any of the following predefined log format names can
227 be supplied to the log/date/time-format variables. GoAccess can
228 also handle one predefined name in one variable and another pre‐
229 defined name in another variable.
230
231 COMBINED - Combined Log Format,
232 VCOMBINED - Combined Log Format with Virtual Host,
233 COMMON - Common Log Format,
234 VCOMMON - Common Log Format with Virtual Host,
235 W3C - W3C Extended Log File Format,
236 SQUID - Native Squid Log Format,
237 CLOUDFRONT - Amazon CloudFront Web Distribution,
238 CLOUDSTORAGE - Google Cloud Storage,
239 AWSELB - Amazon Elastic Load Balancing,
240 AWSS3 - Amazon Simple Storage Service (S3)
241
242 Note: Piping data into GoAccess won't prompt a log/date/time
243 configuration dialog, you will need to previously define it in
244 your configuration file or in the command line.
245
246 USER INTERFACE OPTIONS
247 -c --config-dialog
248 Prompt log/time/date configuration window on program start. Only
249 when curses is initialized.
250
251 -i --hl-header
252 Color highlight active panel.
253
254 -m --with-mouse
255 Enable mouse support on main terminal dashboard.
256
--color=<fg:bg[attrs, PANEL]>
258 Specify custom colors for the terminal output.
259
260 Color Syntax
261 DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
262
263 FG# = foreground color [-1...255] (-1 = default term color)
264 BG# = background color [-1...255] (-1 = default term color)
265
266 Optionally, it is possible to apply color attributes (multiple
267 attributes are comma separated), such as: bold, underline, nor‐
268 mal, reverse, blink
269
270 If desired, it is possible to apply custom colors per panel,
271 that is, a metric in the REQUESTS panel can be of color A, while
272 the same metric in the BROWSERS panel can be of color B.
273
274 Available color definitions:
275 COLOR_MTRC_HITS
276 COLOR_MTRC_VISITORS
277 COLOR_MTRC_DATA
278 COLOR_MTRC_BW
279 COLOR_MTRC_AVGTS
280 COLOR_MTRC_CUMTS
281 COLOR_MTRC_MAXTS
282 COLOR_MTRC_PROT
283 COLOR_MTRC_MTHD
284 COLOR_MTRC_PERC
285 COLOR_MTRC_PERC_MAX
286 COLOR_PANEL_COLS
287 COLOR_BARS
288 COLOR_ERROR
289 COLOR_SELECTED
290 COLOR_PANEL_ACTIVE
291 COLOR_PANEL_HEADER
292 COLOR_PANEL_DESC
293 COLOR_OVERALL_LBLS
294 COLOR_OVERALL_VALS
295 COLOR_OVERALL_PATH
296 COLOR_ACTIVE_LABEL
297 COLOR_BG
298 COLOR_DEFAULT
299 COLOR_PROGRESS
300
301 See configuration file for a sample color scheme.
302
303 --color-scheme=<1|2|3>
304 Choose among color schemes. 1 for the default grey scheme. 2
305 for the green scheme. 3 for the Monokai scheme (shown only if
306 terminal supports 256 colors).
307
308 --crawlers-only
309 Parse and display only crawlers (bots).
310
311 --html-custom-css=<path.css>
312 Specifies a custom CSS file path to load in the HTML report.
313
314 --html-custom-js=<path.js>
315 Specifies a custom JS file path to load in the HTML report.
316
317 --html-report-title=<title>
318 Set HTML report page title and header.
319
320 --html-prefs=<JSON>
321 Set HTML report default preferences. Supply a valid JSON object
322 containing the HTML preferences. It allows the ability to cus‐
323 tomize each panel plot. See example below.
324
325 Note: The JSON object passed needs to be a one line JSON string.
326 For instance,
327
--html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
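
Since the preferences must be a single-line JSON string, one way to keep them readable is to store them pretty-printed and collapse them before use. This is only a sketch: the helper name `compact_json` and the temporary preferences file are hypothetical, and the approach is safe only when no string value spans lines or starts with whitespace.

```shell
# Collapse a pretty-printed JSON file into one line by stripping leading
# indentation and removing newlines (no JSON parsing is performed).
compact_json() {
    sed -e 's/^[[:space:]]*//' "$1" | tr -d '\n'
}

# Hypothetical preferences file, using keys from the example above.
prefs_file=$(mktemp)
cat > "$prefs_file" <<'EOF'
{
  "theme": "bright",
  "perPage": 5,
  "layout": "horizontal",
  "showTables": true
}
EOF

prefs=$(compact_json "$prefs_file")
rm -f "$prefs_file"
printf '%s\n' "$prefs"
```

The resulting value could then be supplied as --html-prefs="$prefs".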
330
331 --json-pretty-print
332 Format JSON output using tabs and newlines.
333
Note: This is not recommended when outputting a real-time HTML
report since the WebSocket payload will be much larger.
336
337 --max-items=<number>
338 The maximum number of items to display per panel. The maximum
339 can be a number between 1 and n.
340
341 Note: Only the CSV and JSON output allow a maximum number
342 greater than the default value of 366 (or 50 in the real-time
343 HTML output) items per panel.
344
345 --no-color
346 Turn off colored output. This is the default output on termi‐
347 nals that do not support colors.
348
349 --no-column-names
350 Don't write column names in the terminal output. By default, it
351 displays column names for each available metric in every panel.
352
353 --no-csv-summary
354 Disable summary metrics on the CSV output.
355
356 --no-progress
357 Disable progress metrics [total requests/requests per second].
358
359 --no-tab-scroll
360 Disable scrolling through panels when TAB is pressed or when a
361 panel is selected using a numeric key.
362
363 --no-html-last-updated
364 Do not show the last updated field displayed in the HTML gener‐
365 ated report.
366
367 SERVER OPTIONS
368 --addr Specify IP address to bind the server to. Otherwise it binds to
369 0.0.0.0.
370
371 Usually there is no need to specify the address, unless you
372 intentionally would like to bind the server to a different
373 address within your server.
374
375 --daemonize
376 Run GoAccess as daemon (only if --real-time-html enabled).
377
378 --origin=<url>
379 Ensure clients send the specified origin header upon the Web‐
380 Socket handshake.
381
382 --port=<port>
383 Specify the port to use. By default GoAccess' WebSocket server
384 listens on port 7890.
385
386 --real-time-html
387 Enable real-time HTML output.
388
GoAccess uses its own WebSocket server to push the data from the
server to the client. See http://gwsocket.io for more details
on how the WebSocket server works.
392
393 --ws-url=<[scheme://]url[:port]>
394 URL to which the WebSocket server responds. This is the URL sup‐
395 plied to the WebSocket constructor on the client side.
396
397 Optionally, it is possible to specify the WebSocket URI scheme,
398 such as ws:// or wss:// for unencrypted and encrypted connec‐
399 tions. e.g., wss://goaccess.io
400
401 If GoAccess is running behind a proxy, you could set the client
402 side to connect to a different port by specifying the host fol‐
403 lowed by a colon and the port. e.g., goaccess.io:9999
404
405 By default, it will attempt to connect to the generated report's
406 hostname. If GoAccess is running on a remote server, the host of
407 the remote server should be specified here. Also, make sure it
408 is a valid host and NOT an http address.
409
410 --fifo-in=<path/file>
Creates a named pipe (FIFO) that reads from the given
path/file.
413
414 --fifo-out=<path/file>
415 Creates a named pipe (FIFO) that writes to the given path/file.
416
417 --ssl-cert=<cert.crt>
418 Path to TLS/SSL certificate. In order to enable TLS/SSL support,
419 GoAccess requires that --ssl-cert and --ssl-key are used.
420
421 Only if configured using --with-openssl
422
423 --ssl-key=<priv.key>
424 Path to TLS/SSL private key. In order to enable TLS/SSL support,
425 GoAccess requires that --ssl-cert and --ssl-key are used.
426
427 Only if configured using --with-openssl
428
429 FILE OPTIONS
430 -f --log-file=<logfile>
431 Specify the path to the input log file. If set in the config
432 file, it will take priority over -f from the command line.
433
434 -l --debug-file=<debugfile>
435 Send all debug messages to the specified file.
436
437 -p --config-file=<configfile>
438 Specify a custom configuration file to use. If set, it will take
439 priority over the global configuration file (if any).
440
441 --invalid-requests=<filename>
442 Log invalid requests to the specified file.
443
444 --no-global-config
Do not load the global configuration file. The global
configuration file is normally located under /usr/local/etc,
unless specified otherwise with --sysconfdir=/dir.
448
449 PARSE OPTIONS
450 -a --agent-list
451 Enable a list of user-agents by host. For faster parsing, do not
452 enable this flag.
453
454 -d --with-output-resolver
455 Enable IP resolver on HTML|JSON output.
456
457 -e --exclude-ip=<IP|IP-range>
458 Exclude an IPv4 or IPv6 from being counted. Ranges can be
459 included as well using a dash in between the IPs (start-end).
460
461 Examples:
462 exclude-ip 127.0.0.1
463 exclude-ip 192.168.0.1-192.168.0.100
464 exclude-ip ::1
465 exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
466
467 -H --http-protocol=<yes|no>
468 Set/unset HTTP request protocol. This will create a request key
469 containing the request protocol + the actual request.
470
471 -M --http-method=<yes|no>
472 Set/unset HTTP request method. This will create a request key
473 containing the request method + the actual request.
474
475 -o --output=<path/file.[json|csv|html]>
Write output to the given file. The file extension determines
the output format:
478
479 /path/file.csv - Comma-separated values (CSV)
480 /path/file.json - JSON (JavaScript Object Notation)
481 /path/file.html - HTML
482
483 -q --no-query-string
484 Ignore request's query string. i.e.,
485 www.google.com/page.htm?query => www.google.com/page.htm.
486
487 Note: Removing the query string can greatly decrease memory con‐
488 sumption, especially on timestamped requests.
489
490 -r --no-term-resolver
491 Disable IP resolver on terminal output.
492
493 --444-as-404
494 Treat non-standard status code 444 as 404.
495
496 --4xx-to-unique-count
497 Add 4xx client errors to the unique visitors count.
498
499 --all-static-files
500 Include static files that contain a query string. e.g.,
501 /fonts/fontawesome-webfont.woff?v=4.0.3
502
503 --date-spec=<date|hr>
504 Set the date specificity to either date (default) or hr to dis‐
505 play hours appended to the date.
506
This is used in the visitors panel. It's useful for tracking
visitors at the hour level. For instance, an hour specificity
would display traffic as 18/Dec/2010:19.
510
511 --double-decode
Decode double-encoded values. This includes the user-agent,
request, and referer.
514
515 --enable-panel=<PANEL>
516 Enable parsing and displaying the given panel.
517
518 Available panels:
519 VISITORS
520 REQUESTS
521 REQUESTS_STATIC
522 NOT_FOUND
523 HOSTS
524 OS
525 BROWSERS
526 VISIT_TIMES
527 VIRTUAL_HOSTS
528 REFERRERS
529 REFERRING_SITES
530 KEYPHRASES
531 STATUS_CODES
532 REMOTE_USER
533 GEO_LOCATION
534
535 --hour-spec=<hr|min>
536 Set the time specificity to either hour (default) or min to dis‐
537 play the tenth of an hour appended to the hour.
538
539 This is used in the time distribution panel. It's useful for
540 tracking peaks of traffic on your server at specific times.
541
542 --ignore-crawlers
543 Ignore crawlers from being counted.
544
545 --ignore-panel=<PANEL>
546 Ignore parsing and displaying the given panel.
547
548 Available panels:
549 VISITORS
550 REQUESTS
551 REQUESTS_STATIC
552 NOT_FOUND
553 HOSTS
554 OS
555 BROWSERS
556 VISIT_TIMES
557 VIRTUAL_HOSTS
558 REFERRERS
559 REFERRING_SITES
560 KEYPHRASES
561 STATUS_CODES
562 REMOTE_USER
563
564 --ignore-referer=<referer>
565 Ignore referers from being counted. Wildcards allowed. e.g.,
566 *.domain.com ww?.domain.*
567
568 --ignore-status=<CODE>
569 Ignore parsing and displaying one or multiple status code(s).
570 For multiple status codes, use this option multiple times.
571
572 --num-tests=<number>
573 Number of lines from the access log to test against the provided
574 log/date/time format. By default, the parser is set to test 10
575 lines. If set to 0, the parser won't test any lines and will
576 parse the whole access log. If a line matches the given
577 log/date/time format before it reaches <number>, the parser will
578 consider the log to be valid, otherwise GoAccess will return
579 EXIT_FAILURE and display the relevant error messages.
580
581 --process-and-exit
582 Parse log and exit without outputting data. Useful if we are
583 looking to only add new data to the on-disk database without
584 outputting to a file or a terminal.
585
586 --real-os
Display real OS names, e.g., Windows XP, Snow Leopard.
588
--sort-panel=<PANEL,METRIC,ORDER>
Sort panel on initial load. Sort options are separated by commas
and take the form PANEL,METRIC,ORDER.
592
593 Available metrics:
594 BY_HITS - Sort by hits
595 BY_VISITORS - Sort by unique visitors
596 BY_DATA - Sort by data
597 BY_BW - Sort by bandwidth
598 BY_AVGTS - Sort by average time served
599 BY_CUMTS - Sort by cumulative time served
600 BY_MAXTS - Sort by maximum time served
601 BY_PROT - Sort by http protocol
602 BY_MTHD - Sort by http method
603
604 Available orders:
605 ASC
606 DESC
607
608 --static-file=<extension>
Add static file extension, e.g., .mp3. Extensions are case
sensitive.
611
612 GEOLOCATION OPTIONS
613 -g --std-geoip
614 Standard GeoIP database for less memory usage.
615
616 --geoip-database=<geofile>
617 Specify path to GeoIP database file. i.e., GeoLiteCity.dat. File
618 needs to be downloaded from maxmind.com. IPv4 and IPv6 files are
619 supported as well. Note: `--geoip-city-data` is an alias of
620 `--geoip-database`.
621
622 OTHER OPTIONS
623 -h --help
624 The help.
625
626 -s --storage
627 Display current storage method. i.e., B+ Tree, Hash.
628
629 -V --version
630 Display version information and exit.
631
632 --dcf Display the path of the default config file when `-p` is not
633 used.
634
635 ON-DISK STORAGE OPTIONS
636 --keep-db-files
637 Persist parsed data into disk. If database files exist, files
638 will be overwritten. This should be set to the first dataset.
639 Setting it to false will delete all database files when exiting
640 the program. See examples below.
641
642 Only if configured with --enable-tcb=btree
643
644 --load-from-disk
645 Load previously stored data from disk. If reading persisted data
646 only, the database files need to exist. See keep-db-files and
647 examples below.
648
649 Only if configured with --enable-tcb=btree
650
651 --db-path=<dir>
652 Path where the on-disk database files are stored. The default
653 value is the /tmp directory.
654
655 Only if configured with --enable-tcb=btree
656
657 --xmmap=<num>
658 Set the size in bytes of the extra mapped memory. The default
659 value is 0.
660
661 Only if configured with --enable-tcb=btree
662
663 --cache-lcnum=<num>
664 Specifies the maximum number of leaf nodes to be cached. If it
665 is not more than 0, the default value is specified. The default
666 value is 1024. Setting a larger value will increase speed per‐
667 formance, however, memory consumption will increase. Lower value
668 will decrease memory consumption.
669
670 Only if configured with --enable-tcb=btree
671
672 --cache-ncnum=<num>
673 Specifies the maximum number of non-leaf nodes to be cached. If
674 it is not more than 0, the default value is specified. The
675 default value is 512.
676
677 Only if configured with --enable-tcb=btree
678
679 --tune-lmemb=<num>
680 Specifies the number of members in each leaf page. If it is not
681 more than 0, the default value is specified. The default value
682 is 128.
683
684 Only if configured with --enable-tcb=btree
685
686 --tune-nmemb=<num>
687 Specifies the number of members in each non-leaf page. If it is
688 not more than 0, the default value is specified. The default
689 value is 256.
690
691 Only if configured with --enable-tcb=btree
692
693 --tune-bnum=<num>
Specifies the number of elements of the bucket array. If it is
not more than 0, the default value is specified. The default
value is 32749. The suggested size of the bucket array is about
1 to 4 times the number of all pages to be stored.
698
699 Only if configured with --enable-tcb=btree
700
701 --compression=<zlib|bz2>
702 Specifies that each page is compressed with ZLIB|BZ2 encoding.
703
704 Only if configured with --enable-tcb=btree
705
706
708 GoAccess can parse virtually any web log format.
709
Predefined options include Common Log Format (CLF), Combined Log
Format (XLF/ELF), including virtual host, Amazon CloudFront (Download
Distribution), Google Cloud Storage and W3C format (IIS).
713
714 GoAccess allows any custom format string as well.
715
716 There are two ways to configure the log format. The easiest is to run
717 GoAccess with -c to prompt a configuration window. Otherwise, it can be
718 configured under ~/.goaccessrc or the %sysconfdir%.
719
720 time-format
The time-format variable followed by a space, specifies the log
format time containing any combination of regular characters and
special format specifiers. They all begin with a percentage (%)
sign. See `man strftime`. e.g., %T or %H:%M:%S.
725
726 Note: If a timestamp is given in microseconds, %f must be used
727 as time-format
728
729 date-format
730 The date-format variable followed by a space, specifies the log
731 format date containing any combination of regular characters and
732 special format specifiers. They all begin with a percentage (%)
733 sign. See `man strftime`. e.g., %Y-%m-%d.
734
735 Note: If a timestamp is given in microseconds, %f must be used
736 as date-format
737
738 log-format
739 The log-format variable followed by a space or \t , specifies
740 the log format string.
741
742 %x A date and time field matching the time-format and date-format
743 variables. This is used when a timestamp is given instead of the
744 date and time being in two separated variables.
745
746 %t time field matching the time-format variable.
747
748 %d date field matching the date-format variable.
749
750 %v The canonical Server Name of the server serving the request
751 (Virtual Host).
752
753 %e This is the userid of the person requesting the document as
754 determined by HTTP authentication.
755
756 %h host (the client IP address, either IPv4 or IPv6)
757
758 %r The request line from the client. This requires specific delim‐
759 iters around the request (as single quotes, double quotes, or
760 anything else) to be parsable. If not, we have to use a combina‐
761 tion of special format specifiers as %m %U %H.
762
763 %q The query string.
764
765 %m The request method.
766
767 %U The URL path requested.
768
Note: If the query string is in %U, there is no need to use %q.
However, if the URL path does not include any query string, you
may use %q and the query string will be appended to the request.
772
773 %H The request protocol.
774
775 %s The status code that the server sends back to the client.
776
777 %b The size of the object returned to the client.
778
779 %R The "Referrer" HTTP request header.
780
781 %u The user-agent HTTP request header.
782
783 %D The time taken to serve the request, in microseconds as a deci‐
784 mal number.
785
786 %T The time taken to serve the request, in seconds with millisec‐
787 onds resolution.
788
789 %L The time taken to serve the request, in milliseconds as a deci‐
790 mal number.
791
792 %^ Ignore this field.
793
794 %~ Move forward through the log string until a non-space (!isspace)
795 char is found.
796
797 ~h The host (the client IP address, either IPv4 or IPv6) in a X-
798 Forwarded-For (XFF) field.
799
800 It uses a special specifier which consists of a tilde before the
801 host specifier, followed by the character(s) that delimit the
802 XFF field, which are enclosed by curly braces (i.e., ~h{," })
803
804 For example, ~h{," } is used in order to parse "11.25.11.53,
805 17.68.33.17" field which is delimited by a double quote, a
806 comma, and a space.
807
808 Note: In order to get the average, cumulative and maximum time served
809 in GoAccess, you will need to start logging response times in your web
810 server. In Nginx you can add $request_time to your log format, or %D in
811 Apache.
812
813 Important: If multiple time served specifiers are used at the same
814 time, the first option specified in the format string will take prior‐
815 ity over the other specifiers.
816
817 GoAccess requires the following fields:
818
819 %h a valid IPv4/6
820
821 %d a valid date
822
823 %r the request
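
A quick sanity check for the three required specifiers can be done before handing a format string to GoAccess. This is a simplistic sketch: the helper name `has_required_fields` is hypothetical, the sample format string is an assumed Apache combined-style layout built from the specifiers described above, and a format that relies on %x or on %m %U %H in place of %d or %r would be rejected by this check even though it may be usable.

```shell
# Hypothetical pre-flight check: succeed only if the log-format string
# contains the specifiers listed as required (%h, %d and %r).
has_required_fields() {
    for spec in '%h' '%d' '%r'; do
        case "$1" in
            *"$spec"*) ;;                                  # specifier present
            *) echo "missing $spec" >&2; return 1 ;;
        esac
    done
}

# Assumed combined-style format string, for illustration only.
combined='%h %^[%d:%t %^] "%r" %s %b "%R" "%u"'
has_required_fields "$combined" && echo "format looks usable"
```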
824
826 F1 or h
827 Main help.
828
829 F5 Redraw main window.
830
831 q Quit the program, current window or collapse active module
832
833 o or ENTER
834 Expand selected module or open window
835
836 0-9 and Shift + 0
837 Set selected module to active
838
839 j Scroll down within expanded module
840
841 k Scroll up within expanded module
842
843 c Set or change scheme color.
844
845 TAB Forward iteration of modules. Starts from current active module.
846
847 SHIFT + TAB
848 Backward iteration of modules. Starts from current active mod‐
849 ule.
850
851 ^f Scroll forward one screen within an active module.
852
853 ^b Scroll backward one screen within an active module.
854
855 s Sort options for active module
856
857 / Search across all modules (regex allowed)
858
859 n Find the position of the next occurrence across all modules.
860
861 g Move to the first item or top of screen.
862
863 G Move to the last item or bottom of screen.
864
866 DIFFERENT OUTPUTS
867 To output to a terminal and generate an interactive report:
868
869 # goaccess access.log
870
871 To generate an HTML report:
872
873 # goaccess access.log -a -o report.html
874
875 To generate a JSON report:
876
877 # goaccess access.log -a -d -o report.json
878
879 To generate a CSV file:
880
881 # goaccess access.log --no-csv-summary -o report.csv
882
883 GoAccess also allows great flexibility for real-time filtering and
884 parsing. For instance, to quickly diagnose issues by monitoring logs
885 since goaccess was started:
886
887 # tail -f access.log | goaccess -
888
Even better, to filter while keeping the pipe open to preserve
real-time analysis, we can make use of tail -f and a pattern
matching tool such as grep, awk, sed, etc.:
892
# tail -f access.log | grep -i --line-buffered 'firefox' | goaccess --log-format=COMBINED -
895
896 MULTIPLE LOG FILES
897 There are several ways to parse multiple logs with GoAccess. The sim‐
898 plest is to pass multiple log files to the command line:
899
900 # goaccess access.log access.log.1
901
902 It's even possible to parse files from a pipe while reading regular
903 files:
904
905 # cat access.log.2 | goaccess access.log access.log.1 -
906
907 Note that the single dash is appended to the command line to let GoAc‐
908 cess know that it should read from the pipe.
909
910 Now if we want to add more flexibility to GoAccess, we can do a series
911 of pipes. For instance, if we would like to process all compressed log
912 files access.log.*.gz in addition to the current log file, we can do:
913
914 # zcat access.log.*.gz | goaccess access.log -
915
916 Note: On Mac OS X, use gunzip -c instead of zcat.
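
For scripts that must run on both Linux and Mac OS X, using gunzip -c everywhere avoids the difference noted above. A minimal sketch, with hypothetical file names and contents created in a temporary directory:

```shell
# Create two small compressed "rotated logs" (hypothetical content).
tmpdir=$(mktemp -d)
printf 'old request\n'   | gzip > "$tmpdir/access.log.1.gz"
printf 'older request\n' | gzip > "$tmpdir/access.log.2.gz"

# gunzip -c writes the decompressed data of every matching file to stdout.
rotated=$(gunzip -c "$tmpdir"/access.log.*.gz)
rm -rf "$tmpdir"
printf '%s\n' "$rotated"
```

In practice the decompressed stream would be piped into goaccess with a trailing single dash instead of being captured in a variable.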
917
918 REAL TIME HTML OUTPUT
GoAccess has the ability to output real-time data in the HTML report.
You can even email the HTML file since it is composed of a single file
with no external file dependencies, how neat is that!
922
923 The process of generating a real-time HTML report is very similar to
924 the process of creating a static report. Only --real-time-html is
925 needed to make it real-time.
926
927 # goaccess access.log -o /usr/share/nginx/html/site/report.html
928 --real-time-html
929
By default, GoAccess will use the host name of the generated report.
Optionally, you can specify the URL to which the client's browser will
connect. See http://goaccess.io/faq for a more detailed example.
933
# goaccess access.log -o report.html --real-time-html --ws-url=goaccess.io
936
By default, GoAccess listens on port 7890. To use a different port,
you can specify it as follows (make sure the port is open):
939
940 # goaccess access.log -o report.html --real-time-html
941 --port=9870
942
943 And to bind the WebSocket server to a different address other than
944 0.0.0.0, you can specify it as:
945
946 # goaccess access.log -o report.html --real-time-html
947 --addr=127.0.0.1
948
949 Note: To output real time data over a TLS/SSL connection, you need to
950 use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
951
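The address and port options above can be combined in one invocation.
As a rough sketch, the hypothetical helper below (realtime_report is an
illustrative name, not part of GoAccess) starts a real-time report on a
chosen address and port, falling back to the defaults named in the text
when they are omitted.

```shell
# Hypothetical helper (not part of GoAccess).
# Arguments: log file, output HTML path, [bind address], [port].
realtime_report() {
    log=$1
    out=$2
    addr=${3:-0.0.0.0}   # default bind address stated in the text
    port=${4:-7890}      # default port stated in the text
    goaccess "$log" -o "$out" --real-time-html --addr="$addr" --port="$port"
}

# Usage (assumes goaccess is installed):
#   realtime_report access.log report.html 127.0.0.1 9870
```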
WORKING WITH DATES
Another useful pipe is one that filters the web log by date.

The following will get all HTTP requests starting on 05/Dec/2010 until
the end of the file:

# sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -

or using relative dates, such as one week ago (the -d option shown
here requires GNU date):

# sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log | goaccess -a -

If we want to parse only a certain time frame from DATE a to DATE b, we
can do the following (note that sed ends the range at the first line
matching DATE b):

# sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -

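Escaping every slash in the date is easy to get wrong. A minimal
sketch of an alternative is to give sed a different regex delimiter.
The function name from_date is hypothetical, and the pattern assumes
timestamps of the form [dd/Mon/yyyy:HH:MM:SS zone], as in the common
and combined log formats.

```shell
# Hypothetical helper: print every line from the first entry dated $1
# (e.g. 05/Dec/2010) through the end of input. Remaining arguments are
# log files; with none, standard input is read.
from_date() {
    day=$1
    shift
    # \|...| makes '|' the regex delimiter, so the slashes in the date
    # need no escaping; '\[' and ':' anchor the match to the timestamp.
    sed -n '\|\['"$day"':|,$ p' "$@"
}

# Usage (assumes goaccess is installed):
#   from_date 05/Dec/2010 access.log | goaccess -a -
```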
VIRTUAL HOSTS
Assume your log contains the virtual host (server block) field, for
instance:

vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"

To prepend the virtual host to the request, in order to see which
virtual host the top URLs belong to:

# awk '$8=$1$8' access.log | goaccess -a -

To exclude a list of virtual hosts you can do the following:

# grep -v "`cat exclude_vhost_list_file`" vhost_access.log | goaccess -

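To see what the awk rewrite does, it can be applied to the sample log
line from this section on its own. This is only an illustration; the
variable names line and rewritten are arbitrary.

```shell
# Sample line from this section, stored in a variable for illustration.
line='vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"'

# Field 1 is the virtual host and field 8 the request path. The
# assignment joins them, and because the result is a non-empty string
# awk prints the rebuilt line.
rewritten=$(printf '%s\n' "$line" | awk '$8=$1$8')
printf '%s\n' "$rewritten"
```

The request field of the printed line becomes
vhost.com:80/shop/bag-p-20, so requests for the same path on different
virtual hosts are distinct in what GoAccess receives.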
FILES & STATUS CODES
To parse only specific pages within a request, e.g., page views for
html, htm, php, etc.:

# awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -

Note that $7 is the request field for the common and combined log
formats (without a virtual host). If your log includes the virtual
host, you probably want to use $8 instead. It's best to check which
field you are targeting, e.g.:

# tail -10 access.log | awk '{print $8}'

Or to parse a specific status code, e.g., 500 (Internal Server Error):

# awk '$9~/500/' access.log | goaccess -

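Because the position of the status field depends on the log format, it
can help to pass the field number explicitly. The sketch below uses a
hypothetical helper named by_status. It compares the field for exact
equality instead of matching a regular expression.

```shell
# Hypothetical helper: keep only lines whose status field equals a code.
# $1 = status code, $2 = field number of the status (9 for the common
# and combined formats, 10 when a virtual host column comes first).
# Remaining arguments are log files; with none, standard input is read.
by_status() {
    code=$1
    field=$2
    shift 2
    awk -v code="$code" -v field="$field" '$field == code' "$@"
}

# Usage (assumes goaccess is installed):
#   by_status 500 9 access.log | goaccess -
```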
SERVER
To run GoAccess at a lower priority, we can run it as:

# nice -n 19 goaccess -f access.log -a

If you don't want to install it on your server, you can still run it
from your local machine:

# ssh root@server 'cat /var/log/apache2/access.log' | goaccess -a -

INCREMENTAL LOG PROCESSING
GoAccess has the ability to process logs incrementally through the
on-disk B+Tree database. It works in the following way:

1 A dataset must be persisted first with --keep-db-files, then the
  same dataset can be loaded with --load-from-disk.

2 If new data is passed (piped or through a log file), it is appended
  to the original dataset.

3 To preserve the data at all times, --keep-db-files must be used.

4 If --load-from-disk is used without --keep-db-files, database files
  will be deleted upon closing the program.

For instance:

// last month access log
goaccess access.log.1 --keep-db-files

then, load it with:

// append this month access log, and preserve new data
goaccess access.log --load-from-disk --keep-db-files

To read persisted data only (without parsing new data):

goaccess --load-from-disk --keep-db-files

NOTES
Each active panel shows up to 366 items, or 50 in the real-time HTML
report. The number of items is customizable using --max-items. However,
only the CSV and JSON outputs allow a maximum greater than the default
value of 366 items per panel.

When analyzing the same log file twice using the on-disk B+Tree and
using --keep-db-files and --load-from-disk on each run, GoAccess will
count each entry twice. Issue #334 will address this.

A hit is a request (line in the access log), e.g., 10 requests = 10
hits. HTTP requests with the same IP, date, and user agent are
considered a unique visit.

BUGS
If you think you have found a bug, please send an email to
goaccess@prosoftcorp.com or use the issue tracker at
https://github.com/allinurl/goaccess/issues

AUTHOR
Gerardo Orellana <goaccess@prosoftcorp.com>. For more details about
GoAccess, or new releases, please visit http://goaccess.io


Linux MARCH 2017 goaccess(1)