heart(3)                   Erlang Module Definition                   heart(3)

NAME

       heart - Heartbeat monitoring of an Erlang runtime system.

DESCRIPTION

       This module contains the interface to the heart process. heart sends
       periodic heartbeats to an external port program, which is also named
       heart. The purpose of the heart port program is to check that the
       Erlang runtime system it is supervising is still running. If the port
       program has not received any heartbeats within HEART_BEAT_TIMEOUT
       seconds (defaults to 60 seconds), the system can be rebooted.

       An Erlang runtime system to be monitored by a heart program is to be
       started with command-line flag -heart (see also erl(1)). The heart
       process is then started automatically:

       % erl -heart ...

       If the system is to be rebooted because of missing heartbeats, or a
       terminated Erlang runtime system, environment variable HEART_COMMAND
       must be set before the system is started. If this variable is not set,
       a warning text is printed but the system does not reboot.

       To reboot on Windows, HEART_COMMAND can be set to heart -shutdown
       (included in the Erlang delivery) or to any other suitable program that
       can activate a reboot.

       The environment variable HEART_BEAT_TIMEOUT can be used to configure
       the heart time-outs; it can be set in the operating system shell before
       Erlang is started or be specified at the command line:

       % erl -heart -env HEART_BEAT_TIMEOUT 30 ...

       The value (in seconds) must be in the range 10 < X <= 65535.

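       The documented range can be checked before the node is started, or
       from inside a running node. The following is a minimal sketch and not
       part of heart: the module heart_timeout_example and both of its
       functions are hypothetical names introduced for illustration.

```erlang
-module(heart_timeout_example).
-export([valid_timeout/1, env_timeout/0]).

%% Hypothetical helper: true if Seconds lies in the documented
%% range 10 < X =< 65535.
valid_timeout(Seconds) when is_integer(Seconds) ->
    Seconds > 10 andalso Seconds =< 65535;
valid_timeout(_) ->
    false.

%% Reads HEART_BEAT_TIMEOUT from the environment. Returns
%% {ok, Seconds}, {error, Value} for a non-integer or out-of-range
%% value, or unset if the variable is not defined.
env_timeout() ->
    case os:getenv("HEART_BEAT_TIMEOUT") of
        false ->
            unset;
        Value ->
            case string:to_integer(Value) of
                {Seconds, []} when is_integer(Seconds) ->
                    case valid_timeout(Seconds) of
                        true -> {ok, Seconds};
                        false -> {error, Value}
                    end;
                _ ->
                    {error, Value}
            end
    end.
```

       Note that the lower bound is exclusive, so a value of 10 is rejected
       by this check.
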
       When running on OSs lacking support for monotonic time, heart is
       susceptible to system clock adjustments of more than HEART_BEAT_TIMEOUT
       seconds. When this happens, heart times out and tries to reboot the
       system. This can occur, for example, if the system clock is adjusted
       automatically by use of the Network Time Protocol (NTP).

       If a crash occurs, an erl_crash.dump is not written unless environment
       variable ERL_CRASH_DUMP_SECONDS is set:

       % erl -heart -env ERL_CRASH_DUMP_SECONDS 10 ...

       If a regular core dump is wanted, let heart know by setting the kill
       signal to abort using environment variable HEART_KILL_SIGNAL=SIGABRT.
       If unset, or not set to SIGABRT, the default behavior is a kill signal
       using SIGKILL:

       % erl -heart -env HEART_KILL_SIGNAL SIGABRT ...

       If heart should not kill the Erlang runtime system, this can be
       indicated using the environment variable HEART_NO_KILL=TRUE. This can
       be useful if the command executed by heart takes care of this, for
       example as part of a specific cleanup sequence. If unset, or not set to
       TRUE, the default behavior is to kill as described above:

       % erl -heart -env HEART_NO_KILL TRUE ...

       Furthermore, ERL_CRASH_DUMP_SECONDS affects the behavior of heart as
       follows:

         ERL_CRASH_DUMP_SECONDS=0:
           Suppresses the writing of a crash dump file entirely, thus
           rebooting the runtime system immediately. This is the same as not
           setting the environment variable.

         ERL_CRASH_DUMP_SECONDS=-1:
           Setting the environment variable to a negative value delays the
           reboot of the runtime system until the crash dump file is
           completely written.

         ERL_CRASH_DUMP_SECONDS=S:
           heart waits for S seconds to let the crash dump file be written.
           After S seconds, heart reboots the runtime system, whether the
           crash dump file is written or not.

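       The three cases above can be summarized as a small lookup function.
       This is only an illustrative sketch of the documented behavior, not
       code used by heart: the module heart_dump_example, the function
       dump_policy/1, and the returned atoms are hypothetical.

```erlang
-module(heart_dump_example).
-export([dump_policy/1]).

%% Hypothetical helper: maps a value of ERL_CRASH_DUMP_SECONDS
%% (or the atom unset) to the behavior described in the list above.
dump_policy(unset) ->
    no_dump_reboot_immediately;
dump_policy(0) ->
    no_dump_reboot_immediately;
dump_policy(S) when is_integer(S), S < 0 ->
    wait_until_dump_written;
dump_policy(S) when is_integer(S), S > 0 ->
    {wait_at_most_seconds, S}.
```
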
       In the following descriptions, all functions fail with reason badarg if
       heart is not started.

DATA TYPES

       heart_option() = check_schedulers

EXPORTS

       set_cmd(Cmd) -> ok | {error, {bad_cmd, Cmd}}

              Types:

                 Cmd = string()

              Sets a temporary reboot command. This command is used if a
              HEART_COMMAND other than the one specified with the environment
              variable is to be used to reboot the system. The new Erlang
              runtime system started by that reboot uses, if it in turn
              misbehaves, environment variable HEART_COMMAND to reboot.

              Limitations: Command string Cmd is sent to the heart program as
              an ISO Latin-1 or UTF-8 encoded binary, depending on the
              filename encoding mode of the emulator (see
              file:native_name_encoding/0). The size of the encoded binary
              must be less than 2047 bytes.

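              The size limitation can be checked before calling set_cmd/1.
              The following is a minimal sketch under the assumption that
              encoding Cmd with unicode:characters_to_binary/3 matches what
              is sent to the heart program. The module heart_cmd_example and
              its functions are hypothetical.

```erlang
-module(heart_cmd_example).
-export([fits/1, set_temporary_cmd/1]).

%% Hypothetical helper: true if Cmd, encoded according to the
%% filename encoding mode of the emulator, is below the documented
%% limit of 2047 bytes. A string that cannot be encoded yields false.
fits(Cmd) ->
    Encoding = file:native_name_encoding(),
    case unicode:characters_to_binary(Cmd, unicode, Encoding) of
        Bin when is_binary(Bin) -> byte_size(Bin) < 2047;
        _ -> false
    end.

%% Calls heart:set_cmd/1 only if the command passes the size check.
%% heart:set_cmd/1 fails with reason badarg if heart is not started.
set_temporary_cmd(Cmd) ->
    case fits(Cmd) of
        true -> heart:set_cmd(Cmd);
        false -> {error, {bad_cmd, Cmd}}
    end.
```

              The error tuple returned for an oversized command mirrors the
              documented return value of set_cmd/1, so callers can handle
              both cases the same way.
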
       clear_cmd() -> ok

              Clears the temporary reboot command. If the system terminates,
              the normal HEART_COMMAND is used to reboot.

       get_cmd() -> {ok, Cmd}

              Types:

                 Cmd = string()

              Gets the temporary reboot command. If the command is cleared,
              the empty string is returned.

       set_callback(Module, Function) ->
                       ok | {error, {bad_callback, {Module, Function}}}

              Types:

                 Module = Function = atom()

              Sets a validation callback that is executed before any
              heartbeat is sent to the port program. For the validation to
              succeed, the callback must return the value ok.

              An exception within the callback is treated as a validation
              failure.

              The callback is removed if the system reboots.

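              A validation callback might look like the following sketch. It
              assumes that the callback is invoked without arguments, which
              the description above does not state explicitly. The module
              heart_callback_example and the registered name
              critical_server_example are hypothetical.

```erlang
-module(heart_callback_example).
-export([install/0, check/0]).

%% Hypothetical validation callback. It returns ok only while
%% something is registered as critical_server_example; any other
%% return value does not satisfy the "must return ok" rule.
check() ->
    case whereis(critical_server_example) of
        undefined -> {error, critical_server_down};
        _ -> ok
    end.

%% Registers check/0 as the validation callback. Fails with reason
%% badarg if heart is not started.
install() ->
    heart:set_callback(?MODULE, check).
```

              Because the callback runs before heartbeats are sent, it should
              presumably return quickly; a slow check delays the heartbeat
              itself.
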
       clear_callback() -> ok

              Removes the validation callback that is called before
              heartbeats.

       get_callback() -> {ok, {Module, Function}} | none

              Types:

                 Module = Function = atom()

              Gets the validation callback. If the callback is cleared, none
              is returned.

       set_options(Options) -> ok | {error, {bad_options, Options}}

              Types:

                 Options = [heart_option()]

              Valid options for set_options are:

                check_schedulers:
                  If enabled, a signal is sent to each scheduler to check its
                  responsiveness. The system check occurs before any
                  heartbeat is sent to the port program. If any scheduler is
                  not responsive enough, the heart program does not receive
                  its heartbeat and thus eventually terminates the node.

              Returns the value ok if the options are valid.

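              Enabling the scheduler check can be sketched as follows. The
              module heart_options_example and its functions are hypothetical,
              and the pre-check only reflects the single option listed under
              DATA TYPES.

```erlang
-module(heart_options_example).
-export([valid_options/1, enable_scheduler_check/0]).

%% Hypothetical pre-check: true if every element of the list is
%% the documented heart_option() check_schedulers.
valid_options(Options) when is_list(Options) ->
    lists:all(fun(Opt) -> Opt =:= check_schedulers end, Options);
valid_options(_) ->
    false.

%% Enables the scheduler responsiveness check. Fails with reason
%% badarg if heart is not started.
enable_scheduler_check() ->
    heart:set_options([check_schedulers]).
```
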
       get_options() -> {ok, Options} | none

              Types:

                 Options = [atom()]

              Returns {ok, Options} where Options is a list of current options
              enabled for heart. If no options are enabled, none is returned.



Ericsson AB                     kernel 6.3.1.1                        heart(3)