MCELOG(8)               Linux's Administrator's Manual               MCELOG(8)

NAME

       mcelog - Decode kernel machine check log on x86 machines

SYNOPSIS

       mcelog [options] [device]
       mcelog [options] --daemon
       mcelog [options] --client
       mcelog [options] --ascii
       mcelog [options] --is-cpu-supported
       mcelog --version

DESCRIPTION

       X86 CPUs report errors detected by the CPU as machine check events
       (MCEs). These can be data corruption detected in the CPU caches or in
       main memory by an integrated memory controller, data transfer errors
       on the front side bus or CPU interconnect, or other internal errors.
       Possible causes include cosmic radiation, unstable power supplies,
       cooling problems, broken hardware, running systems out of
       specification, or bad luck.

       Most errors can be corrected by the CPU through internal error
       correction mechanisms. Uncorrected errors cause machine check
       exceptions, which may kill processes or panic the machine. A small
       number of corrected errors is usually not a cause for worry, but a
       large number can indicate future failure.

       When a corrected or recovered error happens, the x86 kernel writes a
       record describing the MCE into an internal ring buffer available
       through the /dev/mcelog device. mcelog retrieves errors from
       /dev/mcelog, decodes them into a human-readable format and prints
       them on the standard output or, optionally, into the system log.

       mcelog can also take further actions, such as keeping statistics or
       triggering shell scripts on specific events. By default mcelog
       supports offlining memory pages with persistent corrected errors,
       offlining CPU cores that have developed cache problems, and otherwise
       logging specific events to the system log once they cross a
       threshold.

       The normal operating modes for mcelog are: running as a regular cron
       job (the traditional way, now deprecated), running as a trigger
       directly executed by the kernel, or running as a daemon with the
       --daemon option.

       When an uncorrected machine check error happens that the kernel
       cannot recover from, it will usually panic the system. If the panic
       is followed by a warm reset, mcelog should pick up the machine check
       errors after reboot. This is not possible after a cold reset.

       In addition, mcelog can be used on the command line to decode the
       kernel output for a fatal machine check panic in text format using
       the --ascii option. This is typically used to decode the panic
       console output of a fatal machine check if the system was power
       cycled or mcelog didn't run immediately after reboot.

       When the panic triggers a kdump kexec crash kernel, the crash kernel
       boot-up script should log the machine checks to disk; otherwise they
       might be lost.

       Note that after mcelog retrieves an error the kernel no longer stores
       it (unlike dmesg(1)), so the output should always be saved somewhere
       and mcelog should not be run in uncontrolled ways.

       When invoked with the --is-cpu-supported option, mcelog exits with
       code 0 if the current CPU is supported and 1 otherwise.

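       The exit code of --is-cpu-supported lends itself to scripting. The
       following is a minimal Python sketch, not part of mcelog itself. The
       helper names cpu_is_supported and main are hypothetical, and treating
       a missing mcelog binary as "not supported" is an assumption of this
       sketch.

```python
import subprocess
import sys


def cpu_is_supported(command=("mcelog", "--is-cpu-supported")):
    """Return True if the command exits with status 0, False otherwise.

    `command` is a parameter so the helper can be exercised without mcelog
    installed; the default is the invocation documented in this manual.
    """
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # Assumption of this sketch: a binary that cannot be executed is
        # treated the same as an unsupported CPU.
        return False
    return result.returncode == 0


def main():
    """Report the result and return a shell-style exit code."""
    if cpu_is_supported():
        print("CPU is supported by mcelog")
        return 0
    print("CPU not supported, or mcelog could not be run", file=sys.stderr)
    return 1
```

       A boot script could call main() and skip starting the daemon when the
       result is non-zero.
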

OPTIONS

       When the --syslog option is specified, output is redirected to the
       system log. The --syslog-error option causes normal machine checks
       to be logged as LOG_ERR (implies --syslog). Normally only fatal
       errors or high-level remarks are logged at error level. High-level
       one-line summaries of specific errors are also logged to the syslog
       by default unless mcelog operates in --ascii mode.

       When the --logfile=file option is specified, log output is appended
       to the specified file. With the --no-syslog option mcelog will never
       log anything to the syslog.

       When the --cpu=cputype option is specified, machine checks are
       decoded for the CPU type cputype. See mcelog --help for a list of
       valid CPUs. Note that specifying an incorrect CPU can lead to
       incorrect decoding output. The default is either the CPU of the
       machine that reported the machine check (needs a newer kernel
       version) or the CPU of the machine mcelog is running on, so normally
       this option doesn't have to be used. Older versions of mcelog had
       separate options for different CPU types. These are still
       implemented, but are now deprecated and undocumented.

       With the --dmi option mcelog will look up the DIMMs reported in
       machine checks in the SMBIOS/DMI tables of the BIOS and map the DIMMs
       to board identifiers. This only works when the BIOS reports the
       identifiers correctly. Unfortunately, the information reported by
       the BIOS is often subtly or obviously wrong, or useless. This option
       requires that mcelog has read access to /dev/mem (normally requires
       root) and runs on the same machine, in the same hardware
       configuration, as when the machine check event happened.

       When --ignorenodev is specified, mcelog will exit silently when the
       device cannot be opened. This is useful in virtualized environments
       with limited devices.

       When --filter is specified, mcelog will filter out known broken
       machine check events (default on). When the --no-filter option is
       specified, mcelog does not filter events.

       When --raw is specified, mcelog will not decode, but just dump the
       mcelog in a raw hex format. This can be useful for automatic
       post-processing.

       When a device is specified, the machine check logs are read from
       device instead of the default /dev/mcelog.

       With the --ascii option mcelog decodes a fatal machine check panic
       generated by the kernel ("CPU n: Machine Check Exception ...") in
       ASCII from standard input and exits afterwards. Note that when the
       panic comes from a different machine than the one mcelog is running
       on, you might need to specify the correct cputype on older kernels.
       On newer kernels, which output the PROCESSOR field, this is no longer
       needed.

       When the --file filename option is specified, mcelog --ascii will
       read the ASCII machine check record from the input file filename
       instead of standard input.

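       The two input paths for --ascii (standard input, or a file named with
       --file) can be driven from a script. The following Python sketch
       assumes mcelog is on the PATH. The function names decode_panic_text
       and decode_panic_file are hypothetical, and the command parameter
       exists only so the helpers can be tried with a stand-in program.

```python
import subprocess


def decode_panic_text(panic_text, command=("mcelog", "--ascii")):
    """Feed saved panic console text to the command on standard input.

    Returns whatever the command writes to standard output. Raises
    subprocess.CalledProcessError if the command exits with a non-zero
    status.
    """
    result = subprocess.run(
        list(command),
        input=panic_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def decode_panic_file(path, command=("mcelog", "--ascii")):
    """Ask the command to read the panic record from a file via --file."""
    result = subprocess.run(
        list(command) + ["--file", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

       When the panic text comes from a different machine running an older
       kernel, the cputype option described above would have to be added to
       the command tuple.
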
       With the --config-file file option mcelog reads the specified config
       file. The default is /etc/mcelog/mcelog.conf. See also CONFIG FILE
       below.

       With the --daemon option mcelog will run in the background. This
       gives the fastest reaction time and is the recommended operating
       mode. If no output option is selected (--logfile, --syslog or
       --syslog-error), this option implies --logfile=/var/log/mcelog.
       Important messages will be logged as one-line summaries to syslog
       unless --no-syslog is given. The option --foreground will prevent
       mcelog from giving up the terminal in daemon mode. This is intended
       for debugging.

       With the --client option mcelog will query a running daemon for
       accumulated errors.

       With the --cpumhz=mhz option mcelog assumes the CPU has a frequency
       of mhz MHz when decoding the time of the event using the CPU time
       stamp counter. This also forces decoding. Note that this can be
       unreliable on some systems with CPU frequency scaling or deep C
       states, where the CPU time stamp counter does not increase linearly.
       By default the frequency of the current CPU is used when mcelog
       determines it is safe to use. Newer kernels report the time directly
       in the event and no longer need this.

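       The arithmetic behind --cpumhz can be sketched as follows. This is
       only an illustration of converting a time stamp counter value to
       seconds at an assumed constant frequency; it is not taken from the
       mcelog source, and the function name tsc_to_seconds is hypothetical.

```python
def tsc_to_seconds(tsc_ticks, cpu_mhz):
    """Convert a time stamp counter value to seconds since the counter started.

    Assumes the counter advanced at a constant cpu_mhz megahertz. With
    frequency scaling or deep C states that assumption may not hold, which
    is why the result can be unreliable.
    """
    if cpu_mhz <= 0:
        raise ValueError("cpu_mhz must be positive")
    ticks_per_second = cpu_mhz * 1_000_000
    return tsc_ticks / ticks_per_second
```

       For example, 3,000,000,000 ticks at 3000 MHz correspond to one second.
       If the counter actually ran at a lower rate for part of that time,
       the computed value is off by the same proportion.
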
       The --pidfile file option writes the process ID of the daemon into
       the specified file. It is only valid in daemon mode.

       mcelog will enable extended error reporting from the memory
       controller on processors that support it unless you tell it not to
       with the --no-imc-log option. You might need this option when
       decoding old logs from a system where this mode was not enabled.

       --version displays the version of mcelog and exits.

CONFIG FILE

       mcelog supports a config file to set defaults. Command line options
       override the config file. By default the config file is read from
       /etc/mcelog/mcelog.conf unless overridden with the --config-file
       option.

       The general format is optionname = value. White space is not
       currently allowed in value, except at the end, where it is dropped.
       Comments start with #.

       All command line options that are not commands can be specified in
       the config file. For example, to enable the --no-syslog option use
       no-syslog = yes (or no to disable). When the option takes an
       argument, give the argument as the value: logfile = /tmp/logfile

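       The line format described above can be illustrated with a small
       parser. This Python sketch is not mcelog's own parser. The function
       name parse_config is hypothetical, and two behaviours are assumptions:
       a # starts a comment anywhere on a line, and lines without an = sign
       are skipped.

```python
def parse_config(text):
    """Parse 'optionname = value' lines into a dict of strings.

    Trailing white space after the value is dropped, matching the
    description in this section.
    """
    options = {}
    for raw_line in text.splitlines():
        # Assumption: everything from '#' to the end of the line is a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            # Blank lines, comment-only lines and anything without '=' are
            # skipped (assumption of this sketch).
            continue
        name, value = line.split("=", 1)
        options[name.strip()] = value.strip()
    return options
```

       Given the two example settings from this section, the result maps
       no-syslog to "yes" and logfile to "/tmp/logfile".
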
       For more information on the config file please see mcelog.conf(5).

NOTES

       The kernel prefers old messages over new. If the log buffer
       overflows, only old ones will be kept.

       The exact output in the log file depends on the CPU, unless the --raw
       option is used.

       mcelog will report serious errors to the syslog during decoding.

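       The overflow behaviour in the first note (old records kept, new ones
       discarded) can be modelled with a toy buffer. This Python sketch only
       illustrates the policy and does not reflect the kernel's actual data
       structure. The class name KeepOldestBuffer and its capacity are
       hypothetical.

```python
class KeepOldestBuffer:
    """Bounded buffer that discards new records once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []
        self.dropped = 0

    def add(self, record):
        """Store the record if there is room; otherwise count it as dropped."""
        if len(self.records) >= self.capacity:
            self.dropped += 1
            return False
        self.records.append(record)
        return True

    def drain(self):
        """Return all stored records and empty the buffer.

        Reading is destructive: once drained, the records are gone.
        """
        drained, self.records = self.records, []
        return drained
```

       With a capacity of two, adding three records keeps the first two and
       drops the third. Draining frees the space again, which is one reason
       to read the log regularly and to save what was read.
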

SIGNALS

       When mcelog runs in daemon mode and receives a SIGUSR1, it will close
       and reopen the log files. This can be used to rotate logs without
       restarting the daemon.

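       A log rotation hook could send the signal as in the following Python
       sketch. It assumes the daemon was started with a --pidfile option
       pointing at /var/run/mcelog.pid (the path listed under FILES) and
       that the caller has permission to signal the daemon. The function
       name request_log_reopen and the send_signal parameter are
       hypothetical.

```python
import os
import signal


def request_log_reopen(pidfile="/var/run/mcelog.pid", send_signal=os.kill):
    """Read the daemon's process ID from pidfile and send it SIGUSR1.

    send_signal defaults to os.kill; it is a parameter so the helper can be
    tried without a running daemon. Returns the process ID that was
    signalled.
    """
    with open(pidfile) as handle:
        pid = int(handle.read().strip())
    send_signal(pid, signal.SIGUSR1)
    return pid
```

       A rotation script would first rename the current log file and then
       call request_log_reopen() so that the daemon continues in a new file.
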

FILES

       /dev/mcelog (char 10, minor 227)

       /etc/mcelog/mcelog.conf

       /var/log/mcelog

       /var/run/mcelog.pid

SEE ALSO

       mcelog.conf(5), mcelog.triggers(5)

       http://www.mcelog.org

       AMD x86-64 Architecture Programmer's Manual, Volume 2, System
       Programming

       Intel 64 and IA-32 Architectures Software Developer's Manual, Volume
       3, System Programming Guide, Chapters 15 and 16.
       http://www.intel.com/sdm

       Datasheet of your CPU.



                                   Mar 2015                          MCELOG(8)