mcelog.conf(5) File Formats Manual mcelog.conf(5)

NAME
mcelog.conf - mcelog.conf reference

SYNOPSIS
/etc/mcelog.conf

DESCRIPTION
/etc/mcelog.conf is the main configuration file for mcelog(8). It is a
configuration file divided into sections, including a default section.

General format

optionname = value

White space is currently not allowed in value, except at the end, where
it is dropped.

In general, all command line options that are not commands work here.
See man mcelog or mcelog --help for a list. For example, to enable the
--no-syslog option use

no-syslog = yes (or no to disable)

When the option takes an argument

logfile = /tmp/logfile

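As an illustrative sketch only, the fragment below combines the two
forms above at the top of /etc/mcelog.conf. The values are examples
rather than recommendations, and the use of # for comment lines is an
assumption that this page does not state.

```ini
# Hypothetical fragment; values are examples only.
# Assumption: lines starting with # are treated as comments.

# Boolean form of a command line switch (--no-syslog)
no-syslog = yes

# Option that takes an argument; no white space inside the value
logfile = /tmp/logfile
```

The two lines fit together because logfile only applies when no syslog
logging is active.
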
Below are the options which are not command line options.

Set the CPU type for which mcelog decodes events:

cpu = type

For valid values for type, see mcelog --help. If this value is set
incorrectly, the decoded output will likely be incorrect. By default,
when this parameter is not set, mcelog uses the CPU it is running on.
On very new kernels, the mcelog events reported by the kernel also
carry the CPU type, which is used when available and not overridden.

Enable daemon mode:

daemon = yes

By default mcelog just processes the currently pending events and
exits. In daemon mode it keeps running in the background, polls the
kernel for events and decodes them.

Filter out known broken events by default.

filter = yes

Don't log memory errors individually. They still get accounted if that
is enabled.

filter-memory-errors = yes

Output in undecoded raw format, which is easier for machines to read
(default is decoded).

raw = yes

Set the CPU frequency in MHz to decode uptime from the time stamp
counter. The output is unreliable: a lot of systems don't have a linear
time stamp clock, and the output is wrong then. Normally mcelog tries
to figure out whether the TSC is reliable and only then uses the
current frequency. Setting a frequency forces timestamp decoding. This
setting is obsolete with modern kernels, which report the event time
directly.

cpumhz = 1800.00

Log output options

Log decoded machine checks to syslog (default is stdout, or syslog in
daemon mode).

syslog = yes

Log decoded machine checks to syslog with error level.

syslog-error = yes

Never log anything to syslog.

no-syslog = yes

Append log output to logfile instead of stdout. This applies only when
no syslog logging is active.

logfile = filename

Use SMBIOS information to decode DIMMs (needs root). This function is
currently not recommended and is generally not needed. The exception is
memdb prepopulation, which is configured separately below.

dmi = no

When in daemon mode, run as this user after setup. Note that the
triggers will run as this user too. Setting this to a non-root user
means that triggers cannot take some corrective actions, like offlining
objects.

run-credentials-user = root

Group to run as in daemon mode. Defaults to the group of the
run-credentials-user.

run-credentials-group = nobody

The server config section

User allowed to access the client socket. When set to *, any user
matches. root is always allowed to access. Default: root only.

client-user = root

Group allowed to access mcelog. When no group is configured, any group
matches (but the user is still checked). When set to *, any group
matches.

client-group = root

Path to the Unix socket for client<->server communication. When no
socket-path is configured, the server will not start.

socket-path = /var/run/mcelog-client

When mcelog starts, it checks whether a server is already running. This
configures the timeout for this check.

initial-ping-timeout = 2

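A minimal sketch of how the server options might look together in the
file. The [server] header syntax is an assumption based on the section
name; this page does not show how sections are delimited. The values
are the ones used in the examples above.

```ini
# Sketch only. Assumption: sections are introduced by a [name] header.
[server]
# Restrict client access to user root and group root
client-user = root
client-group = root
# Without socket-path the server does not start
socket-path = /var/run/mcelog-client
# Timeout for the check whether a server is already running
initial-ping-timeout = 2
```
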
The dimm config section

Enable in-memory DIMM error tracking. This only works on supported
systems with an integrated memory controller, and only takes effect in
daemon mode.

dimm-tracking-enabled = yes

Use DMI information from the BIOS to prepopulate the DIMM database.
Note that this might not work with every BIOS and requires mcelog to
run as root. The alternative is to let mcelog create DIMM objects on
demand.

dmi-prepopulate = yes

Execute these triggers when the rate of corrected or uncorrected errors
per DIMM exceeds the threshold. Note that when the hardware does not
report DIMMs, this might also be per channel. The default of 10 / 24h
is reasonable for server-quality DDR3 DIMMs as of 2009/10.

uc-error-trigger = dimm-error-trigger

uc-error-threshold = 1 / 24h

ce-error-trigger = dimm-error-trigger

ce-error-threshold = 10 / 24h

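As a sketch of how the DIMM options might be written together: the
fragment below assumes that the section is introduced by a [dimm]
header (this page does not show the header syntax) and reads a
threshold such as 10 / 24h as 10 errors per 24 hours. The values are
the ones used in the examples above.

```ini
# Sketch only. Assumption: sections are introduced by a [name] header.
[dimm]
# Only takes effect in daemon mode
dimm-tracking-enabled = yes
# Requires mcelog to run as root; might not work with every BIOS
dmi-prepopulate = yes

# Uncorrected errors: threshold of 1 error per 24 hours
uc-error-trigger = dimm-error-trigger
uc-error-threshold = 1 / 24h

# Corrected errors: threshold of 10 errors per 24 hours
ce-error-trigger = dimm-error-trigger
ce-error-threshold = 10 / 24h
```
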
The socket config section

Enable memory error accounting per socket.

socket-tracking-enabled = yes

Threshold and trigger for uncorrected memory errors on a socket.

mem-uc-error-trigger = socket-memory-error-trigger

mem-uc-error-threshold = 100 / 24h

Trigger script for corrected memory errors on a socket.

mem-ce-error-trigger = socket-memory-error-trigger

Threshold at which to trigger a corrected error for the socket.

mem-ce-error-threshold = 100 / 24h

Should socket error threshold events be logged explicitly?

mem-ce-error-log = yes

Trigger script for uncorrected bus error events.

bus-uc-threshold-trigger = bus-error-trigger

Trigger script for uncorrected IOMCA errors.

iomca-threshold-trigger = iomca-error-trigger

Trigger script for other uncategorized errors.

unknown-threshold-trigger = unknown-error-trigger

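The memory-related socket options pair each trigger with a threshold.
A sketch of that pairing follows, assuming a [socket] section header
(not shown on this page) and reusing the example values from above.

```ini
# Sketch only. Assumption: sections are introduced by a [name] header.
[socket]
socket-tracking-enabled = yes

# Uncorrected memory errors per socket: trigger and its threshold
mem-uc-error-trigger = socket-memory-error-trigger
mem-uc-error-threshold = 100 / 24h

# Corrected memory errors per socket: trigger, threshold and logging
mem-ce-error-trigger = socket-memory-error-trigger
mem-ce-error-threshold = 100 / 24h
mem-ce-error-log = yes
```
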
The cache config section

Processing of cache error thresholds reported by Intel CPUs.

cache-threshold-trigger = cache-error-trigger

Should cache threshold events be logged explicitly?

cache-threshold-log = yes

The page config section

Memory error accounting per 4k memory page.

Threshold for the corrected memory errors trigger script.

memory-ce-threshold = 10 / 24h

Trigger script for corrected errors.

memory-ce-trigger = page-error-trigger

Should page threshold events be logged explicitly?

memory-ce-log = yes

Specify the internal action mcelog takes when a page error threshold is
exceeded. This is done in addition to executing the trigger script, if
available.

off             no action
account         only account errors
soft            try to soft-offline the page without killing any
                processes. This requires an up-to-date kernel. Might
                not be successful.
hard            try to hard-offline the page by killing processes.
                Requires an up-to-date kernel. Might not be successful.
soft-then-hard  first try to soft-offline, then try hard-offlining

memory-ce-action = off|account|soft|hard|soft-then-hard

memory-ce-action = soft

Trigger script run before doing a soft memory offline. This trigger
will scan and run all the scripts in page-error-pre-soft-trigger.extern.

memory-pre-sync-soft-ce-trigger = page-error-pre-sync-soft-trigger

Trigger script run after completing a soft memory offline. This trigger
will scan and run all the scripts in
page-error-post-soft-trigger.extern.

memory-post-sync-soft-ce-trigger = page-error-post-sync-soft-trigger

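A sketch of a complete page section, under the assumption that it is
introduced by a [page] header (the header syntax is not shown on this
page). It uses the soft action from the example above; the other
values are also taken from this section.

```ini
# Sketch only. Assumption: sections are introduced by a [name] header.
[page]
# Threshold of 10 corrected errors per 24 hours for a 4k page
memory-ce-threshold = 10 / 24h
memory-ce-trigger = page-error-trigger
memory-ce-log = yes

# Internal action, taken in addition to the trigger script:
# try to soft-offline the page without killing any processes
memory-ce-action = soft

# Triggers run before and after the soft offline
memory-pre-sync-soft-ce-trigger = page-error-pre-sync-soft-trigger
memory-post-sync-soft-ce-trigger = page-error-post-sync-soft-trigger
```
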
The trigger config section

Maximum number of running triggers.

children-max = 2

Execute triggers in this directory.

directory = /etc/mcelog

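For completeness, a sketch of the trigger section, assuming a [trigger]
header (not shown on this page). That the trigger names used in the
other sections refer to scripts in this directory is an inference from
the description, not something this page states.

```ini
# Sketch only. Assumption: sections are introduced by a [name] header.
[trigger]
# Maximum number of running triggers
children-max = 2
# Triggers are executed in this directory
directory = /etc/mcelog
```
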
SEE ALSO
mcelog(8), mcelog.triggers(5) http://www.mcelog.org