1heart(3) Erlang Module Definition heart(3)
2
3
4
5 NAME
6 heart - Heartbeat monitoring of an Erlang runtime system.
7
8 DESCRIPTION
9 This module contains the interface to the heart process. heart sends
10 periodic heartbeats to an external port program, which is also named
11 heart. The purpose of the heart port program is to check that the
12 Erlang runtime system it is supervising is still running. If the port
13 program has not received any heartbeats within HEART_BEAT_TIMEOUT sec‐
14 onds (defaults to 60 seconds), the system can be rebooted.
15
16 An Erlang runtime system to be monitored by a heart program is to be
17 started with command-line flag -heart (see also erl(1)). The heart
18 process is then started automatically:
19
20 % erl -heart ...
21
22 If the system is to be rebooted because of missing heartbeats, or a
23 terminated Erlang runtime system, environment variable HEART_COMMAND
24 must be set before the system is started. If this variable is not set,
25 a warning text is printed but the system does not reboot.
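
Because a missing HEART_COMMAND only produces a warning, it can be useful to check for it from inside the node at startup. The following is a minimal sketch, not part of the heart module: the module name heart_env and the function reboot_command/0 are hypothetical, and treating an empty value as "not set" is an assumption made for this example.

```erlang
-module(heart_env).
-export([reboot_command/0]).

%% Returns {ok, Command} if HEART_COMMAND is set to a non-empty string
%% in the environment of the running node, otherwise not_set.
%% Assumption: an empty value is treated the same as an unset variable.
reboot_command() ->
    case os:getenv("HEART_COMMAND") of
        false   -> not_set;
        ""      -> not_set;
        Command -> {ok, Command}
    end.
```

An application could call heart_env:reboot_command() early during startup and log a message if it returns not_set.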
26
27 To reboot on Windows, HEART_COMMAND can be set to heart -shutdown
28 (included in the Erlang delivery) or to any other suitable program that
29 can activate a reboot.
30
31 The environment variable HEART_BEAT_TIMEOUT can be used to configure
32 the heart time-outs; it can be set in the operating system shell before
33 Erlang is started or be specified at the command line:
34
35 % erl -heart -env HEART_BEAT_TIMEOUT 30 ...
36
37 The value (in seconds) must be in the range 10 < X <= 65535.
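
The range above can be checked before a node is deployed. The sketch below is illustrative only: the module heart_timeout and its functions are hypothetical helpers, not part of the heart module, and the return values are chosen for this example.

```erlang
-module(heart_timeout).
-export([valid/1, from_env/0]).

%% True if Seconds satisfies the documented range 10 < X =< 65535.
valid(Seconds) when is_integer(Seconds) ->
    Seconds > 10 andalso Seconds =< 65535;
valid(_) ->
    false.

%% Reads HEART_BEAT_TIMEOUT from the environment of the running node.
%% Returns {ok, Seconds} for a value in range, default if the variable
%% is unset, and {error, Raw} for anything else.
from_env() ->
    case os:getenv("HEART_BEAT_TIMEOUT") of
        false ->
            default;
        Raw ->
            case string:to_integer(Raw) of
                {Seconds, ""} ->
                    case valid(Seconds) of
                        true  -> {ok, Seconds};
                        false -> {error, Raw}
                    end;
                _ ->
                    {error, Raw}
            end
    end.
```

Note that the lower bound is exclusive: valid(10) is false, while valid(11) is true.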
38
39 Notice that if the system clock is adjusted by more than
40 HEART_BEAT_TIMEOUT seconds, heart times out and tries to reboot the
41 system. This can occur, for example, if the system clock is adjusted
42 automatically by use of the Network Time Protocol (NTP).
43
44 If a crash occurs, an erl_crash.dump is not written unless environment
45 variable ERL_CRASH_DUMP_SECONDS is set:
46
47 % erl -heart -env ERL_CRASH_DUMP_SECONDS 10 ...
48
49 If a regular core dump is wanted, let heart know by setting the kill
50 signal to abort using environment variable HEART_KILL_SIGNAL=SIGABRT.
51 If unset, or not set to SIGABRT, the default behavior is a kill signal
52 using SIGKILL:
53
54 % erl -heart -env HEART_KILL_SIGNAL SIGABRT ...
55
56 If heart should not kill the Erlang runtime system, this can be indi‐
57 cated using the environment variable HEART_NO_KILL=TRUE. This can be
58 useful if the command executed by heart takes care of this, for example
59 as part of a specific cleanup sequence. If unset, or not set to TRUE,
60 the default behavior is to kill as described above.
61
62 % erl -heart -env HEART_NO_KILL TRUE ...
63
64 Furthermore, ERL_CRASH_DUMP_SECONDS affects the behavior of heart as
65 follows:
66
67 ERL_CRASH_DUMP_SECONDS=0:
68 Suppresses the writing of a crash dump file entirely, thus reboot‐
69 ing the runtime system immediately. This is the same as not setting
70 the environment variable.
71
72 ERL_CRASH_DUMP_SECONDS=-1:
73 Setting the environment variable to a negative value delays the
74 reboot of the runtime system until the crash dump file is completely
75 written.
76
77 ERL_CRASH_DUMP_SECONDS=S:
78 heart waits for S seconds to let the crash dump file be written.
79 After S seconds, heart reboots the runtime system, whether the
80 crash dump file is written or not.
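
The three cases above can be summarized as a small decision function. This is only a restatement of the rules as described here, not code taken from heart. The module name heart_dump_policy and the result atoms are hypothetical.

```erlang
-module(heart_dump_policy).
-export([describe/1]).

%% Maps a value of ERL_CRASH_DUMP_SECONDS (or the atom unset) to the
%% behavior described above. The result atoms are illustrative names.
describe(unset) ->
    reboot_immediately_no_dump;
describe(0) ->
    reboot_immediately_no_dump;
describe(S) when is_integer(S), S < 0 ->
    wait_until_dump_written;
describe(S) when is_integer(S), S > 0 ->
    {wait_at_most_seconds, S}.
```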
81
82 In the following descriptions, all functions fail with reason badarg if
83 heart is not started.
84
85 DATA TYPES
86 heart_option() = check_schedulers
87
88 EXPORTS
89 set_cmd(Cmd) -> ok | {error, {bad_cmd, Cmd}}
90
91 Types:
92
93 Cmd = string()
94
95 Sets a temporary reboot command. This command is used if a
96 command other than the one specified with environment variable
97 HEART_COMMAND is to be used to reboot the system. The new Erlang
98 runtime system (if it misbehaves) reboots using environment
99 variable HEART_COMMAND.
100
101 Limitations: Command string Cmd is sent to the heart program as
102 an ISO Latin-1 or UTF-8 encoded binary, depending on the file‐
103 name encoding mode of the emulator (see file:native_name_encod‐
104 ing/0). The size of the encoded binary must be less than 2047
105 bytes.
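
The size limitation can be checked before calling set_cmd/1. The following sketch assumes that encoding the command with the encoding returned by file:native_name_encoding/0 matches what heart sends to the port program, as described above. The module heart_cmd_check and its functions are hypothetical helpers.

```erlang
-module(heart_cmd_check).
-export([encoded_size/1, fits/1]).

%% Size in bytes of Cmd after encoding it with the file name encoding
%% of the emulator (latin1 or utf8).
encoded_size(Cmd) ->
    Encoding = file:native_name_encoding(),
    case unicode:characters_to_binary(Cmd, unicode, Encoding) of
        Bin when is_binary(Bin) -> {ok, byte_size(Bin)};
        _                       -> {error, not_encodable}
    end.

%% True if the encoded command is below the documented 2047-byte limit.
fits(Cmd) ->
    case encoded_size(Cmd) of
        {ok, Size} -> Size < 2047;
        {error, _} -> false
    end.
```

A caller could guard the call as heart_cmd_check:fits(Cmd) andalso heart:set_cmd(Cmd), which returns false without contacting heart when the command is too long.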
106
107 clear_cmd() -> ok
108
109 Clears the temporary reboot command. If the system terminates, the
110 normal HEART_COMMAND is used to reboot.
111
112 get_cmd() -> {ok, Cmd}
113
114 Types:
115
116 Cmd = string()
117
118 Gets the temporary reboot command. If the command is cleared,
119 the empty string is returned.
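
The three command functions are typically used together. The sketch below installs a temporary reboot command only for the duration of a critical operation. The module heart_cmd_demo and the function with_temporary_cmd/2 are hypothetical. The node must have been started with -heart, otherwise the calls fail with badarg as noted above.

```erlang
-module(heart_cmd_demo).
-export([with_temporary_cmd/2]).

%% Runs Fun with Cmd installed as the temporary reboot command and
%% clears the command afterwards, even if Fun raises an exception.
%% Returns the result of Fun, or the error tuple from heart:set_cmd/1.
with_temporary_cmd(Cmd, Fun) when is_function(Fun, 0) ->
    case heart:set_cmd(Cmd) of
        ok ->
            try
                Fun()
            after
                heart:clear_cmd()
            end;
        {error, {bad_cmd, _}} = Error ->
            Error
    end.
```

While Fun is running, heart:get_cmd() is expected to return the temporary command. After the call it returns {ok, ""} again.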
120
121 set_callback(Module, Function) ->
122 ok | {error, {bad_callback, {Module, Function}}}
123
124 Types:
125
126 Module = Function = atom()
127
128 This validation callback will be executed before any heartbeat
129 is sent to the port program. For the validation to succeed it
130 needs to return with the value ok.
131
132 An exception within the callback will be treated as a validation
133 failure.
134
135 The callback will be removed if the system reboots.
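
A validation callback might look as follows. This is a sketch under assumptions: the callback is assumed to be invoked with no arguments, and my_app_sup is a hypothetical registered process name standing in for whatever the application considers essential. The module name heart_check is likewise hypothetical.

```erlang
-module(heart_check).
-export([install/0, validate/0]).

%% Registers validate/0 as the heart validation callback.
%% Requires a node started with -heart.
install() ->
    heart:set_callback(?MODULE, validate).

%% Hypothetical health check: validation succeeds only while a process
%% is registered under the assumed name my_app_sup. Any return value
%% other than ok counts as a validation failure.
validate() ->
    case whereis(my_app_sup) of
        Pid when is_pid(Pid) -> ok;
        _                    -> {error, my_app_sup_not_running}
    end.
```

Since the callback is run before heartbeats are sent, it is best kept short and free of blocking calls.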
136
137 clear_callback() -> ok
138
139 Removes the validation callback that is called before heartbeats.
140
141 get_callback() -> {ok, {Module, Function}} | none
142
143 Types:
144
145 Module = Function = atom()
146
147 Gets the validation callback. If the callback is cleared, none
148 is returned.
149
150 set_options(Options) -> ok | {error, {bad_options, Options}}
151
152 Types:
153
154 Options = [heart_option()]
155
156 Valid options for set_options are:
157
158 check_schedulers:
159 If enabled, a signal will be sent to each scheduler to check
160 its responsiveness. The system check occurs before any
161 heartbeat is sent to the port program. If any scheduler is not
162 responsive enough, the heart program does not receive its
163 heartbeat and thus eventually terminates the node.
164
165 Returns with the value ok if the options are valid.
166
167 get_options() -> {ok, Options} | none
168
169 Types:
170
171 Options = [atom()]
172
173 Returns {ok, Options} where Options is a list of the options
174 currently enabled for heart. If no options are enabled, none is
175 returned.
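
Enabling the scheduler check and reading the options back can be sketched as follows. The module heart_opts_demo and the function enable_scheduler_check/0 are hypothetical. The node must have been started with -heart, otherwise the calls fail with badarg.

```erlang
-module(heart_opts_demo).
-export([enable_scheduler_check/0]).

%% Enables the check_schedulers option and returns what heart reports
%% afterwards: the result of heart:get_options/0 on success, or the
%% error tuple from heart:set_options/1.
enable_scheduler_check() ->
    case heart:set_options([check_schedulers]) of
        ok ->
            heart:get_options();
        {error, {bad_options, _}} = Error ->
            Error
    end.
```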
176
177
178
179Ericsson AB kernel 5.4.3.2 heart(3)