MCELOG(8)               Linux's Administrator's Manual               MCELOG(8)

NAME

       mcelog - Decode kernel machine check log on x86 machines

SYNOPSIS

       mcelog [options] [device]
       mcelog [options] --daemon
       mcelog [options] --client
       mcelog [options] --ascii
       mcelog --version

DESCRIPTION

       X86 CPUs report errors detected by the CPU as machine check events
       (MCEs). These can be data corruption detected in the CPU caches, in
       main memory by an integrated memory controller, data transfer errors
       on the front side bus or CPU interconnect, or other internal errors.
       Possible causes include cosmic radiation, unstable power supplies,
       cooling problems, broken hardware, or bad luck.

       Most errors can be corrected by the CPU through internal error
       correction mechanisms. Uncorrected errors cause machine check
       exceptions, which may panic the machine.

       When a corrected error happens the x86 kernel writes a record
       describing the MCE into an internal ring buffer available through the
       /dev/mcelog device. mcelog retrieves errors from /dev/mcelog, decodes
       them into a human-readable format and prints them on the standard
       output or optionally into the system log.

       Optionally it can also perform further actions such as keeping
       statistics or triggering shell scripts on specific events.

       The normal operating modes for mcelog are running as a regular cron
       job (the traditional way, deprecated), running as a trigger directly
       executed by the kernel, or running as a daemon with the --daemon
       option.

       When an uncorrected machine check error happens that the kernel cannot
       recover from, it will usually panic the system. In this case, if there
       was a warm reset after the panic, mcelog should pick up the machine
       check errors after reboot. This is not possible after a cold reset.

       In addition mcelog can be used on the command line to decode the
       kernel output for a fatal machine check panic in text format using the
       --ascii option. This is typically used to decode the panic console
       output of a fatal machine check if the system was power cycled or
       mcelog didn't run immediately after reboot.

       When the panic triggers a kdump kexec crash kernel, the crash kernel
       boot-up script should log the machine checks to disk; otherwise they
       might be lost.

       Note that after mcelog retrieves an error the kernel no longer stores
       it (unlike dmesg(1)), so the output should always be saved somewhere
       and mcelog should not be run in uncontrolled ways.
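
       Because each record is consumed when it is read, a one-off run is
       best wrapped so that its output is always appended to a file. The
       following is a minimal Python sketch of that idea, not part of
       mcelog: the function name and the log path are hypothetical, and the
       command is passed in as an argument list so the caller decides
       exactly what is run.

```python
import subprocess


def run_and_save(command, log_path):
    """Run command and append everything it prints on stdout to log_path.

    command is an argument list such as ["mcelog"]; log_path is a
    hypothetical destination chosen by the administrator.
    Returns the exit status of the command.
    """
    result = subprocess.run(command, stdout=subprocess.PIPE, text=True)
    # Write the output before looking at the exit status, so that records
    # already taken from the kernel are not lost if the command failed.
    # Append rather than overwrite, so records from earlier runs are kept.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(result.stdout)
    return result.returncode
```

       A call could look like run_and_save(["mcelog"], "/var/log/mce.log"),
       where the path is only an example.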

OPTIONS

       When the --syslog option is specified, output is redirected to the
       system log. The --syslog-error option causes normal machine checks to
       be logged as LOG_ERR (implies --syslog). Normally only fatal errors or
       high level remarks are logged with error level. High level one-line
       summaries of specific errors are also logged to the syslog by default
       unless mcelog operates in --ascii mode.

       When the --logfile=file option is specified, log output is appended to
       the specified file. With the --no-syslog option mcelog will never log
       anything to the syslog.

       When the --cpu=cputype option is specified, the CPU type to decode for
       is set to cputype. See mcelog --help for a list of valid CPUs. Note
       that specifying an incorrect CPU can lead to incorrect decoding
       output. The default is either the CPU of the machine that reported the
       machine check (needs a newer kernel version) or the CPU of the machine
       mcelog is running on, so normally this option doesn't have to be used.
       Older versions of mcelog had separate options for different CPU types.
       These are still implemented, but deprecated and no longer documented.

       With the --dmi option mcelog will look up the addresses reported in
       machine checks in the SMBIOS/DMI tables of the BIOS. This can
       sometimes tell you which DIMM or memory controller has developed a
       problem. More often the information reported by the BIOS is either
       subtly or obviously wrong or useless. This option requires that mcelog
       has read access to /dev/mem (normally requires root) and runs on the
       same machine in the same hardware configuration as when the machine
       check event happened.

       When --ignorenodev is specified mcelog will exit silently when the
       device cannot be opened. This is useful in virtualized environments
       with limited devices.

       When --filter is specified mcelog will filter out known broken machine
       check events (default on). When the --no-filter option is specified
       mcelog does not filter events.

       When --raw is specified mcelog will not decode, but just dump the
       mcelog in a raw hex format. This can be useful for automatic
       post-processing.

       When a device is specified the machine check logs are read from device
       instead of the default /dev/mcelog.

       With the --ascii option mcelog decodes a fatal machine check panic
       generated by the kernel ("CPU n: Machine Check Exception ...") in
       ASCII from standard input and exits afterwards. Note that when the
       panic comes from a different machine than the one mcelog is running
       on, you might need to specify the correct cputype on older kernels. On
       newer kernels which output the PROCESSOR field this is not needed
       anymore.

       When the --file filename option is specified mcelog --ascii will read
       the ASCII machine check record from input file filename instead of
       standard input.
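
       The way --ascii, --file and --cpu combine can be sketched as a small
       Python helper that builds the argument list for decoding a saved
       panic message. The helper and its parameter names are hypothetical;
       it uses only the options described above and does not run mcelog
       itself.

```python
def ascii_decode_command(panic_file=None, cputype=None):
    """Build the argument list for decoding a saved panic message.

    Uses only options described in this manual: --ascii, --file and --cpu.
    """
    argv = ["mcelog", "--ascii"]
    if cputype is not None:
        # Only needed when the panic text lacks the PROCESSOR field,
        # i.e. it came from another machine running an older kernel.
        argv.append(f"--cpu={cputype}")
    if panic_file is not None:
        # Without --file, mcelog --ascii reads the record from standard input.
        argv += ["--file", panic_file]
    return argv
```

       The resulting list can be handed to a process launcher such as
       subprocess.run. Valid values for cputype are listed by mcelog --help.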

       With the --config-file file option mcelog reads the specified config
       file. The default is /etc/mcelog.conf. See also CONFIG FILE below.

       With the --daemon option mcelog will run in the background. This gives
       the fastest reaction time and is the recommended operating mode. This
       option implies --syslog. The option --foreground will prevent mcelog
       from giving up the terminal in daemon mode. This is intended for
       debugging.

       With the --client option mcelog will query a running daemon for
       accumulated errors.

       With the --cpumhz=mhz option mcelog assumes a CPU frequency of mhz
       when decoding the time of the event using the CPU time stamp counter.
       This also forces decoding. Note that this can be unreliable on some
       systems with CPU frequency scaling or deep C states, where the CPU
       time stamp counter does not increase linearly. By default the
       frequency of the current CPU is used when mcelog determines it is safe
       to use. Newer kernels report the time directly in the event and don't
       need this anymore.

       The --pidfile file option writes the process ID of the daemon into
       file. This is only valid in daemon mode.

       --version displays the version of mcelog and exits.

CONFIG FILE

       mcelog supports a config file to set defaults. Command line options
       override the config file. By default the config file is read from
       /etc/mcelog.conf unless overridden with the --config-file option.

       The general format is

              optionname = value

       White space is currently not allowed in value, except at the end where
       it is dropped. Comments start with #.

       All command line options that are not commands can be specified in the
       config file. For example, to enable the --no-syslog option use

              no-syslog = yes

       (or no to disable). When the option has an argument use

              logfile = /tmp/logfile
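
       The format rules above can be made concrete with a short Python
       sketch. It is an illustration of the rules as stated here, not
       mcelog's actual parser. The function name is hypothetical, and
       treating a # anywhere on a line as the start of a comment is an
       assumption, since the text only says that comments start with #.

```python
def parse_config(text):
    """Parse 'optionname = value' lines following the rules stated above."""
    options = {}
    for raw_line in text.splitlines():
        # Assumption: everything from '#' onward is a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        name, sep, value = line.partition("=")
        # Trailing white space in the value is dropped, as documented.
        name, value = name.strip(), value.strip()
        if not sep or not name:
            raise ValueError(f"expected 'optionname = value': {raw_line!r}")
        if any(ch.isspace() for ch in value):
            raise ValueError(f"white space not allowed in value: {raw_line!r}")
        options[name] = value
    return options
```

       Given the two example lines above, the sketch returns a dictionary
       mapping no-syslog to yes and logfile to /tmp/logfile, and it rejects
       a value such as "/tmp/log file" because of the embedded space.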

NOTES

       The kernel prefers old messages over new. If the log buffer overflows
       only old ones will be kept.
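
       The overflow policy can be illustrated with a toy Python model of a
       fixed-size log that refuses new records once it is full. The class
       name, the capacity and the overflowed flag are assumptions made for
       the illustration; the sketch does not reproduce the kernel's
       implementation.

```python
class KeepOldestLog:
    """Fixed-size log that drops new records once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []
        self.overflowed = False

    def add(self, record):
        """Store record if there is room; return True if it was kept."""
        if len(self.records) >= self.capacity:
            # Buffer full: the new record is discarded, old ones stay.
            self.overflowed = True
            return False
        self.records.append(record)
        return True
```

       With a capacity of two, adding three records keeps the first two
       and discards the third, which is why the log should be read out
       regularly.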

       The exact output in the log file depends on the CPU, unless the --raw
       option is used.

       mcelog will report serious errors to the syslog during decoding.

FILES

       /dev/mcelog (char 10, minor 227)

       /etc/mcelog/mcelog.conf

       /sys/devices/system/machinecheck/machinecheck0/trigger

SEE ALSO

       AMD x86-64 Architecture Programmer's Manual, Volume 2, System
       Programming

       Intel 64 and IA-32 Architectures Software Developer's Manual, Volume
       3, System Programming Guide, Parts 1 and 2. Machine checks are
       described in Chapter 14 in Part 1 and in Appendix E in Part 2.

       Datasheet of your CPU.

                                   May 2009                          MCELOG(8)