MCELOG(8)               Linux's Administrator's Manual               MCELOG(8)

NAME

       mcelog - Decode kernel machine check log on x86 machines

SYNOPSIS

       mcelog [options] [device]
       mcelog [options] --daemon
       mcelog [options] --client
       mcelog [options] --ascii
       mcelog [options] --is-cpu-supported
       mcelog --version
       mcelog --supported

DESCRIPTION

       X86 CPUs report errors detected by the CPU as machine check events
       (MCEs). These can be data corruption detected in the CPU caches or in
       main memory by an integrated memory controller, data transfer errors on
       the front side bus or CPU interconnect, or other internal errors.
       Possible causes include cosmic radiation, unstable power supplies,
       cooling problems, broken hardware, running systems out of
       specification, or bad luck.

       Most errors can be corrected by the CPU's internal error correction
       mechanisms. Uncorrected errors cause machine check exceptions, which
       may kill processes or panic the machine. A small number of corrected
       errors is usually not a cause for worry, but a large number can
       indicate future failure.

       When a corrected or recovered error happens, the x86 kernel writes a
       record describing the MCE into an internal ring buffer available
       through the /dev/mcelog device. mcelog retrieves errors from
       /dev/mcelog, decodes them into a human readable format and prints them
       on the standard output or optionally into the system log.

       Optionally it can also take further actions, such as keeping
       statistics or triggering shell scripts on specific events. By default
       mcelog supports offlining memory pages with persistent corrected
       errors, offlining CPU cores that have developed cache problems, and
       otherwise logging specific events to the system log after they cross a
       threshold.

       The normal operating modes for mcelog are running as a regular cron
       job (traditional way, deprecated), running as a trigger directly
       executed by the kernel, or running as a daemon with the --daemon
       option.

       When an uncorrected machine check error happens that the kernel cannot
       recover from, it will usually panic the system. In this case, if there
       was a warm reset after the panic, mcelog should pick up the machine
       check errors after reboot. This is not possible after a cold reset.

       In addition, mcelog can be used on the command line to decode the
       kernel output for a fatal machine check panic in text format using the
       --ascii option. This is typically used to decode the panic console
       output of a fatal machine check if the system was power cycled or
       mcelog didn't run immediately after reboot.

       When the panic triggers a kdump kexec crash kernel, the crash kernel
       boot up script should log the machine checks to disk, otherwise they
       might be lost.

       Note that after mcelog retrieves an error the kernel doesn't store it
       anymore (different from dmesg(1)), so the output should always be
       saved somewhere and mcelog should not be run in uncontrolled ways.

       When invoked with the --is-cpu-supported option, mcelog exits with
       code 0 if the current CPU is supported, 1 otherwise.
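       The exit code makes --is-cpu-supported convenient to use from scripts.
       The following is a minimal sketch, not part of mcelog itself; the
       function name report_cpu_support is made up for illustration.

```shell
# Hypothetical helper: print whether mcelog supports the current CPU.
# Relies only on the documented exit code of --is-cpu-supported
# (0 = supported, 1 = not supported).
report_cpu_support() {
    if mcelog --is-cpu-supported >/dev/null 2>&1; then
        echo "supported"
    else
        echo "unsupported"
    fi
}
```

       Note that this sketch also prints "unsupported" when mcelog is not
       installed, because the command then fails with a non-zero exit code.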

OPTIONS

       When the --syslog option is specified, output is redirected to the
       system log. The --syslog-error option causes the normal machine checks
       to be logged as LOG_ERR (implies --syslog). Normally only fatal errors
       or high level remarks are logged with error level. High level one line
       summaries of specific errors are also logged to the syslog by default
       unless mcelog operates in --ascii mode.

       When the --logfile=file option is specified, log output is appended to
       the specified file. With the --no-syslog option mcelog will never log
       anything to the syslog.

       When the --cpu=cputype option is specified, the CPU type used for
       decoding is set to cputype. See mcelog --help for a list of valid CPUs.
       Note that specifying an incorrect CPU can lead to incorrect decoding
       output. The default is either the CPU of the machine that reported the
       machine check (needs a newer kernel version) or the CPU of the machine
       mcelog is running on, so normally this option doesn't have to be used.
       Older versions of mcelog had separate options for different CPU types.
       These are still implemented, but deprecated and undocumented now.

       With the --dmi option mcelog will look up the DIMMs reported in machine
       checks in the SMBIOS/DMI tables of the BIOS and map the DIMMs to board
       identifiers. This only works when the BIOS reports the identifiers
       correctly. Unfortunately, the information reported by the BIOS is often
       either subtly or obviously wrong or useless. This option requires that
       mcelog has read access to /dev/mem (normally requires root) and runs on
       the same machine in the same hardware configuration as when the machine
       check event happened.

       When --ignorenodev is specified, mcelog will exit silently when the
       device cannot be opened. This is useful in virtualized environments
       with limited devices.

       When --filter is specified, mcelog will filter out known broken machine
       check events (default on). When the --no-filter option is specified,
       mcelog does not filter events.

       When --raw is specified, mcelog will not decode, but just dump the
       machine check log in a raw hex format. This can be useful for automatic
       post processing.

       When a device is specified, the machine check logs are read from device
       instead of the default /dev/mcelog.

       With the --ascii option mcelog decodes a fatal machine check panic
       generated by the kernel ("CPU n: Machine Check Exception ...") in ASCII
       from standard input and exits afterwards. Note that when the panic
       comes from a different machine than the one mcelog is running on, you
       might need to specify the correct cputype on older kernels. On newer
       kernels which output the PROCESSOR field this is not needed anymore.

       When the --file filename option is specified, mcelog --ascii will read
       the ASCII machine check record from the input file filename instead of
       standard input.
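       As a sketch of the --ascii workflow described above: assuming the panic
       console output was saved to a text file, it could be decoded with a
       small wrapper like the one below. The function name decode_panic and
       the example path are illustrative assumptions, not part of mcelog.

```shell
# Hypothetical wrapper: decode a saved fatal machine check panic message.
# $1 is a text file holding the kernel's "Machine Check Exception" output.
decode_panic() {
    if [ ! -r "$1" ]; then
        echo "decode_panic: cannot read $1" >&2
        return 1
    fi
    # --ascii decodes panic text; --file reads it from a file instead of stdin.
    mcelog --ascii --file "$1"
}
```

       A call such as decode_panic /tmp/panic.txt would then print the decoded
       record. For a panic from another machine running an older kernel, a
       --cpu=cputype option may have to be added to the mcelog command.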
       
       With the --config-file file option mcelog reads the specified config
       file. The default is /etc/mcelog/mcelog.conf. See also CONFIG FILE
       below.

       With the --daemon option mcelog will run in the background. This gives
       the fastest reaction time and is the recommended operating mode. If no
       output option is selected (--logfile, --syslog or --syslog-error), this
       option implies --logfile=/var/log/mcelog. Important messages will be
       logged as one-liner summaries to syslog unless --no-syslog is given.
       The option --foreground will prevent mcelog from giving up the terminal
       in daemon mode. This is intended for debugging.
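       The daemon-related options can be combined on one command line. The
       sketch below uses only options documented on this page; the function
       name start_mcelog_daemon is an illustrative assumption, and the default
       paths are the ones listed under FILES.

```shell
# Hypothetical helper: start mcelog as a daemon with an explicit log file
# and PID file. Both paths can be overridden by the caller.
start_mcelog_daemon() {
    logfile=${1:-/var/log/mcelog}
    pidfile=${2:-/var/run/mcelog.pid}
    mcelog --daemon --logfile="$logfile" --pidfile "$pidfile"
}
```

       Adding --foreground to the mcelog command keeps the daemon attached to
       the terminal, which can help while debugging such a setup.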
       
       With the --client option mcelog will query a running daemon for
       accumulated errors.

       With the --cpumhz=mhz option mcelog assumes the CPU has a frequency of
       mhz MHz when decoding the time of the event using the CPU time stamp
       counter. This also forces decoding. Note that this can be unreliable on
       some systems with CPU frequency scaling or deep C states, where the CPU
       time stamp counter does not increase linearly. By default the frequency
       of the current CPU is used when mcelog determines it is safe to use.
       Newer kernels report the time directly in the event and don't need this
       anymore.
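       To illustrate why a frequency is needed at all: a time stamp counter
       value is a cycle count, so dividing it by the frequency gives elapsed
       seconds. The sketch below shows only this arithmetic, assuming a
       counter that ticks at a constant rate. It is not mcelog's actual code,
       and the function name tsc_to_seconds is hypothetical.

```shell
# Hypothetical helper: convert a TSC cycle count to whole seconds,
# assuming the counter ticks at a constant rate of $2 megahertz.
tsc_to_seconds() {
    tsc=$1
    mhz=$2
    echo $(( tsc / (mhz * 1000000) ))
}
```

       For example, tsc_to_seconds 6000000000 2000 prints 3, since six billion
       cycles at 2000 MHz take three seconds. If the counter did not run at a
       constant rate, the result would be off, which is the unreliability
       mentioned above.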
       
       The --pidfile file option writes the process ID of the daemon into the
       given file. Only valid in daemon mode.

       Mcelog will enable extended error reporting from the memory controller
       on processors that support it unless you tell it not to with the
       --no-imc-log option. You might need this option when decoding old logs
       from a system where this mode was not enabled.

       --version displays the version of mcelog and exits.

       --supported returns 0 if the system has processors which support MCE,
       and 1 otherwise.

CONFIG FILE

       mcelog supports a config file to set defaults. Command line options
       override the config file. By default the config file is read from
       /etc/mcelog/mcelog.conf unless overridden with the --config-file
       option.

       The general format is optionname = value. White space is currently not
       allowed in value, except at the end, where it is dropped. Comments
       start with #.

       All command line options that are not commands can be specified in the
       config file. For example, to enable the --no-syslog option use
       no-syslog = yes (or no to disable). When the option has an argument,
       write it as the value: logfile = /tmp/logfile

       For more information on the config file please see mcelog.conf(5).
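       Putting the rules above together, a small config file might look like
       the one written by the sketch below. The two settings are the examples
       from this page; the function name write_sample_config and any target
       path are illustrative assumptions.

```shell
# Hypothetical helper: write a sample mcelog config file to the path in $1.
write_sample_config() {
    cat > "$1" <<'EOF'
# Comments start with '#'.
# Equivalent to passing --no-syslog on the command line.
no-syslog = yes
# Equivalent to passing --logfile=/tmp/logfile.
logfile = /tmp/logfile
EOF
}
```

       The resulting file could then be handed to mcelog with the
       --config-file option instead of the default /etc/mcelog/mcelog.conf.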

NOTES

       The kernel prefers old messages over new. If the log buffer overflows
       only old ones will be kept.

       The exact output in the log file depends on the CPU, unless the --raw
       option is used.

       mcelog will report serious errors to the syslog during decoding.

SIGNALS

       When mcelog runs in daemon mode and receives a SIGUSR1 it will close
       and reopen the log files. This can be used to rotate logs without
       restarting the daemon.
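       A log rotation step built on this signal could look like the following
       sketch. It assumes the daemon was started with --pidfile so that its
       process ID can be read from a file; the function name
       rotate_mcelog_log and the ".1" suffix are illustrative choices.

```shell
# Hypothetical helper: rotate the mcelog log file without restarting the
# daemon. $1 = log file, $2 = PID file written via --pidfile.
rotate_mcelog_log() {
    logfile=${1:-/var/log/mcelog}
    pidfile=${2:-/var/run/mcelog.pid}
    [ -f "$logfile" ] || return 1
    [ -r "$pidfile" ] || return 1
    # Move the old log aside, then ask the daemon to reopen its log files.
    mv "$logfile" "$logfile.1" || return 1
    kill -USR1 "$(cat "$pidfile")"
}
```

       The log is moved before the signal is sent, so that the daemon opens a
       fresh file under the original name when it reopens its logs.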

FILES

       /dev/mcelog (char 10, minor 227)

       /etc/mcelog/mcelog.conf

       /var/log/mcelog

       /var/run/mcelog.pid

SEE ALSO

       mcelog.conf(5), mcelog.triggers(5)

       http://www.mcelog.org

       AMD x86-64 architecture programmer's manual, Volume 2, System
       programming

       Intel 64 and IA-32 Architectures Software Developer's manual, Volume 3,
       System programming guide Chapter 15 and 16. http://www.intel.com/sdm

       Datasheet of your CPU.


                                   Mar 2015                          MCELOG(8)