peer(3)                  Erlang Module Definition                  peer(3)

NAME
peer - Start and control linked Erlang nodes.

DESCRIPTION
This module provides functions for starting linked Erlang nodes. The
node that spawns new nodes is called the origin, and the newly started
nodes are peer nodes, or peers. A peer node automatically terminates
when it loses the control connection to the origin. This connection can
be an Erlang distribution connection or an alternative connection: TCP
or standard I/O. The alternative connection makes it possible to execute
remote procedure calls even when Erlang distribution is not available,
which in turn makes it possible to test the distribution itself.

Peer node terminal input/output is relayed through the origin. If a
standard I/O alternative connection is requested, console output also
goes via the origin, which allows debugging of node startup and boot
script execution (see -init_debug). Unlike slave(3), file I/O is not
redirected.

The peer node can start on the same host, on a different host (via ssh),
or in a separate container (for example, Docker). When the peer starts
on the same host as the origin, it inherits the current directory and
environment variables from the origin.
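As a minimal sketch of the description above, the following module starts a peer over standard I/O, evaluates one remote call through the alternative connection, and stops the peer. The module name peer_intro is hypothetical, and the sketch assumes an "erl" executable can be located by the origin.

```erlang
-module(peer_intro).
-export([run/0]).

%% Starts a peer over stdin/stdout, runs one remote call, stops the peer.
run() ->
    %% No node name is requested, so this sketch does not depend on
    %% Erlang distribution; all traffic uses the alternative connection.
    {ok, Peer, _Node} = peer:start_link(#{connection => standard_io}),
    %% Evaluated on the peer node, result returned to the origin.
    Sum = peer:call(Peer, lists, sum, [[1, 2, 3]]),
    ok = peer:stop(Peer),
    Sum.
```

Because the peer is linked to the calling process, it is also taken down if the caller terminates before peer:stop/1 is reached.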

Note:
This module is designed to facilitate multi-node testing with Common
Test. Use the ?CT_PEER() macro to start a linked peer node according to
Common Test conventions: crash dumps are written to a specific location,
and the node name is prefixed with the module name, calling function,
and origin OS process ID. Use random_name/1 to create sufficiently
unique node names if you need more control.

A peer node started without an alternative connection behaves similarly
to slave(3). When an alternative connection is requested, the behaviour
is similar to test_server:start_node(Name, peer, Args).

EXAMPLE
The following example implements a test suite that starts extra Erlang
nodes. It employs a number of techniques to speed up testing and to shut
down peer nodes reliably:

* Peers start linked to the test runner process. If the test case
  fails, the peer node is stopped automatically, leaving no rogue nodes
  running in the background.

* Arguments used to start the peer are saved in the control process
  state for manual analysis. If the test case fails, the CRASH REPORT
  contains these arguments.

* Multiple test cases can run concurrently, which speeds up the overall
  testing process. Peer node names are unique even when multiple
  instances of the same test suite run in parallel.

    -module(my_SUITE).
    -behaviour(ct_suite).
    -export([all/0, groups/0]).
    -export([basic/1, args/1, named/1, restart_node/1, multi_node/1]).

    -include_lib("common_test/include/ct.hrl").

    groups() ->
        [{quick, [parallel],
            [basic, args, named, restart_node, multi_node]}].

    all() ->
        [{group, quick}].

    basic(Config) when is_list(Config) ->
        {ok, Peer, _Node} = ?CT_PEER(),
        peer:stop(Peer).

    args(Config) when is_list(Config) ->
        %% specify additional arguments to the new node
        {ok, Peer, _Node} = ?CT_PEER(["-emu_flavor", "smp"]),
        peer:stop(Peer).

    named(Config) when is_list(Config) ->
        %% pass the test case name down to the function starting nodes
        Peer = start_node_impl(named_test),
        peer:stop(Peer).

    start_node_impl(ActualTestCase) ->
        {ok, Peer, Node} = ?CT_PEER(#{name => ?CT_PEER_NAME(ActualTestCase)}),
        %% extra setup needed for multiple test cases
        ok = rpc:call(Node, application, set_env, [kernel, key, value]),
        Peer.

    restart_node(Config) when is_list(Config) ->
        Name = ?CT_PEER_NAME(),
        {ok, Peer, Node} = ?CT_PEER(#{name => Name}),
        peer:stop(Peer),
        %% restart the node with the same name as before
        {ok, Peer2, Node} = ?CT_PEER(#{name => Name, args => ["+fnl"]}),
        peer:stop(Peer2).
102
103 The next example demonstrates how to start multiple nodes concurrently:
104
105 multi_node(Config) when is_list(Config) ->
106 Peers = [?CT_PEER(#{wait_boot => {self(), tag}})
107 || _ <- lists:seq(1, 4)],
108 %% wait for all nodes to complete boot process, get their names:
109 _Nodes = [receive {tag, {started, Node, Peer}} -> Node end
110 || {ok, Peer} <- Peers],
111 [peer:stop(Peer) || {ok, Peer} <- Peers].
112
113
114 Start a peer on a different host. Requires ssh key-based authentication
115 set up, allowing "another_host" connection without password prompt.
116
117 Ssh = os:find_executable("ssh"),
118 peer:start_link(#{exec => {Ssh, ["another_host", "erl"]},
119 connection => standard_io}),
120
121
122 The following Common Test case demonstrates Docker integration, start‐
123 ing two containers with hostnames "one" and "two". In this example Er‐
124 lang nodes running inside containers form an Erlang cluster.
125
126 docker(Config) when is_list(Config) ->
127 Docker = os:find_executable("docker"),
128 PrivDir = proplists:get_value(priv_dir, Config),
129 build_release(PrivDir),
130 build_image(PrivDir),
131
132 %% start two Docker containers
133 {ok, Peer, Node} = peer:start_link(#{name => lambda,
134 connection => standard_io,
135 exec => {Docker, ["run", "-h", "one", "-i", "lambda"]}}),
136 {ok, Peer2, Node2} = peer:start_link(#{name => lambda,
137 connection => standard_io,
138 exec => {Docker, ["run", "-h", "two", "-i", "lambda"]}}),
139
140 %% find IP address of the second node using alternative connection RPC
141 {ok, Ips} = peer:call(Peer2, inet, getifaddrs, []),
142 {"eth0", Eth0} = lists:keyfind("eth0", 1, Ips),
143 {addr, Ip} = lists:keyfind(addr, 1, Eth0),
144
145 %% make first node to discover second one
146 ok = peer:call(Peer, inet_db, set_lookup, [[file]]),
147 ok = peer:call(Peer, inet_db, add_host, [Ip, ["two"]]),
148
149 %% join a cluster
150 true = peer:call(Peer, net_kernel, connect_node, [Node2]),
151 %% verify that second peer node has only the first node visible
152 [Node] = peer:call(Peer2, erlang, nodes, []),
153
154 %% stop peers, causing containers to also stop
155 peer:stop(Peer2),
156 peer:stop(Peer).
157
158 build_release(Dir) ->
159 %% load sasl.app file, otherwise application:get_key will fail
160 application:load(sasl),
161 %% create *.rel - release file
162 RelFile = filename:join(Dir, "lambda.rel"),
163 Release = {release, {"lambda", "1.0.0"},
164 {erts, erlang:system_info(version)},
165 [{App, begin {ok, Vsn} = application:get_key(App, vsn), Vsn end}
166 || App <- [kernel, stdlib, sasl]]},
167 ok = file:write_file(RelFile, list_to_binary(lists:flatten(
168 io_lib:format("~tp.", [Release])))),
169 RelFileNoExt = filename:join(Dir, "lambda"),
170
171 %% create boot script
172 {ok, systools_make, []} = systools:make_script(RelFileNoExt,
173 [silent, {outdir, Dir}]),
174 %% package release into *.tar.gz
175 ok = systools:make_tar(RelFileNoExt, [{erts, code:root_dir()}]).
176
177 build_image(Dir) ->
178 %% Create Dockerfile example, working only for Ubuntu 20.04
179 %% Expose port 4445, and make Erlang distribution to listen
180 %% on this port, and connect to it without EPMD
181 %% Set cookie on both nodes to be the same.
182 BuildScript = filename:join(Dir, "Dockerfile"),
183 Dockerfile =
184 "FROM ubuntu:20.04 as runner\n"
185 "EXPOSE 4445\n"
186 "WORKDIR /opt/lambda\n"
187 "COPY lambda.tar.gz /tmp\n"
188 "RUN tar -zxvf /tmp/lambda.tar.gz -C /opt/lambda\n"
189 "ENTRYPOINT [\"/opt/lambda/erts-" ++ erlang:system_info(version) ++
190 "/bin/dyn_erl\", \"-boot\", \"/opt/lambda/releases/1.0.0/start\","
191 " \"-kernel\", \"inet_dist_listen_min\", \"4445\","
192 " \"-erl_epmd_port\", \"4445\","
193 " \"-setcookie\", \"secret\"]\n",
194 ok = file:write_file(BuildScript, Dockerfile),
195 os:cmd("docker build -t lambda " ++ Dir).
196
197
199 server_ref() = pid()
200
201 Identifies the controlling process of a peer node.
202
203 start_options() =
204 #{name => atom() | string(),
205 longnames => boolean(),
206 host => string(),
207 peer_down => stop | continue | crash,
208 exec => exec(),
209 connection => connection(),
210 args => [string()],
211 env => [{string(), string()}],
212 wait_boot => wait_boot(),
213 shutdown =>
214 close | halt |
215 {halt, disconnect_timeout()} |
216 disconnect_timeout()}
217
218 Options that can be used when starting a peer node through
219 start/1 and start_link/0,1.
220
221 name:
222 Node name (the part before "@"). When name is not specified,
223 but host is, peer follows compatibility behaviour and uses
224 the origin node name.
225
226 host:
227 Enforces a specific host name. Can be used to override the
228 default behaviour and start "node@localhost" instead of
229 "node@realhostname".
230
231 longnames:
232 Use long names to start a node. Default is taken from the
233 origin using net_kernel:longnames(). If the origin is not
234 distributed, short names is the default.
235
236 peer_down:
237 Defines the peer control process behaviour when the control
238 connection is closed from the peer node side (for example
239 when the peer crashes or dumps core). When set to stop (de‐
240 fault), a lost control connection causes the control process
241 to exit normally. Setting peer_down to continue keeps the
242 control process running, and crash will cause the control‐
243 ling process to exit abnormally.
244
245 exec:
246 Alternative mechanism to start peer nodes with, for example,
247 ssh instead of the default bash.
248
249 connection:
250 Alternative connection specification. See the connection
251 datatype.
252
253 args:
254 Extra command line arguments to append to the "erl" command.
255 Arguments are passed as is, no escaping or quoting is needed
256 or accepted.
257
258 env:
259 List of environment variables with their values. This list
260 is applied to a locally started executable. If you need to
261 change the environment of the remote peer, adjust args to
262 contain -env ENV_KEY ENV_VALUE.
263
264 wait_boot:
265 Specifies the start/start_link timeout. See wait_boot
266 datatype.
267
268 shutdown:
269 Specifies the peer node stopping behaviour. See stop().
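The following sketch combines several of the options above in one map. The module name peer_options, the flag -demo_flag, and the variable PEER_DEMO are hypothetical names chosen for illustration; the sketch assumes the peer is started locally, so that env applies to it.

```erlang
-module(peer_options).
-export([run/0]).

run() ->
    Options = #{connection => standard_io,
                %% passed as is to the "erl" command line
                args => ["-demo_flag", "on"],
                %% applied to the locally started executable
                env => [{"PEER_DEMO", "1"}],
                %% the default: control process exits normally
                %% when the peer goes down
                peer_down => stop},
    {ok, Peer, _Node} = peer:start_link(Options),
    %% Read both settings back on the peer node.
    Flag = peer:call(Peer, init, get_argument, [demo_flag]),
    Env = peer:call(Peer, os, getenv, ["PEER_DEMO"]),
    ok = peer:stop(Peer),
    {Flag, Env}.
```

Reading the values back with init:get_argument/1 and os:getenv/1 is only one way to confirm that the options reached the peer.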

peer_state() = booting | running | {down, Reason :: term()}

    Peer node state.

connection() =
    0..65535 | {inet:ip_address(), 0..65535} | standard_io

    Alternative connection between the origin and the peer. When the
    connection closes, the peer node terminates automatically. If
    the peer_down startup flag is set to crash, the controlling
    process on the origin node exits with the corresponding reason,
    effectively providing a two-way link.

    When connection is set to a port number, the origin starts
    listening on the requested TCP port, and the peer node connects
    to the port. When it is set to an {IP, Port} tuple, the origin
    listens only on the specified IP. The port number can be set to
    0 for automatic selection.

    Using the standard_io alternative connection starts the peer
    attached to the origin (other connections use the -detached flag
    to erl). In this mode, the peer and the origin communicate via
    stdin/stdout.
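A TCP alternative connection might be requested as in the sketch below, which binds the origin to the loopback address and lets the operating system pick the port. The module name peer_tcp_demo is hypothetical.

```erlang
-module(peer_tcp_demo).
-export([run/0]).

run() ->
    %% {IP, Port}: listen only on 127.0.0.1; port 0 selects a free port.
    %% A plain port number, for example connection => 0, would listen
    %% without restricting the address.
    {ok, Peer, _Node} = peer:start_link(#{connection => {{127, 0, 0, 1}, 0}}),
    %% Both calls travel over the TCP alternative connection.
    Alive = peer:call(Peer, erlang, is_alive, []),
    Sum = peer:call(Peer, erlang, '+', [1, 2]),
    ok = peer:stop(Peer),
    {Alive, Sum}.
```

The first element of the result reports whether the peer itself is distributed, which depends on whether a node name was requested.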

exec() = file:name() | {file:name(), [string()]}

    Overrides the executable to start peer nodes with. By default it
    is the path to "erl", taken from init:get_argument(progname). If
    progname is not known, peer makes a best guess given the current
    ERTS version.

    When a tuple is passed, the first element is the path to the
    executable, and the second element is prepended to the final
    command line. This can be used to start peers on a remote host
    or in a Docker container. See the examples above.

    This option is useful for testing backwards compatibility with
    previous releases installed at specific paths, or when the
    Erlang installation location is missing from the PATH.

wait_boot() = timeout() | {pid(), Tag :: term()} | false

    Specifies the start/start_link timeout in milliseconds. Can be
    set to false, allowing the peer to start asynchronously. If
    {Pid, Tag} is specified instead of a timeout, the peer sends a
    message tagged with Tag to the requested process when the boot
    completes, for example {Tag, {started, Node, Peer}} as received
    in the multi_node example above.

disconnect_timeout() = 1000..4294967295 | infinity

    Disconnect timeout. See stop().

EXPORTS
call(Dest :: server_ref(),
     Module :: module(),
     Function :: atom(),
     Args :: [term()]) ->
        Result :: term()

call(Dest :: server_ref(),
     Module :: module(),
     Function :: atom(),
     Args :: [term()],
     Timeout :: timeout()) ->
        Result :: term()

    Uses the alternative connection to evaluate apply(Module,
    Function, Args) on the peer node and returns the corresponding
    value Result. Timeout is an integer representing the timeout in
    milliseconds, or the atom infinity, which prevents the operation
    from ever timing out.

    When an alternative connection is not requested, this function
    raises an exit with the noconnection reason. Use the erpc module
    to communicate over Erlang distribution.
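The sketch below exercises call/4 and call/5, including a remote function that runs longer than the requested timeout. The module name peer_call_demo is hypothetical. The sketch assumes the timeout surfaces as an exit in the calling process, which the text above does not state, so it catches any exit rather than matching a specific reason.

```erlang
-module(peer_call_demo).
-export([run/0]).

run() ->
    {ok, Peer, _Node} = peer:start_link(#{connection => standard_io}),
    %% call/4: no explicit timeout.
    Reversed = peer:call(Peer, lists, reverse, [[1, 2, 3]]),
    %% call/5: explicit timeout in milliseconds.
    Upper = peer:call(Peer, string, uppercase, ["peer"], 5000),
    %% The remote function sleeps for 2000 ms but only 100 ms are allowed.
    %% Assumption: the timeout is reported as an exit in the caller.
    Slow = try peer:call(Peer, timer, sleep, [2000], 100)
           catch exit:_ -> timed_out
           end,
    ok = peer:stop(Peer),
    {Reversed, Upper, Slow}.
```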

cast(Dest :: server_ref(),
     Module :: module(),
     Function :: atom(),
     Args :: [term()]) ->
        ok

    Uses the alternative connection to evaluate apply(Module,
    Function, Args) on the peer node. No response is delivered to
    the calling process.

    peer:cast/4 fails silently when the alternative connection is
    not configured. Use the erpc module to communicate over Erlang
    distribution.
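Since cast/4 delivers no response, one way to observe its effect is to poll the peer with call/4 afterwards, as in this sketch. The module name peer_cast_demo and the demo_key/demo_value pair are hypothetical. The sketch assumes nothing about ordering between a cast and a later call, which is why it retries.

```erlang
-module(peer_cast_demo).
-export([run/0]).

run() ->
    {ok, Peer, _Node} = peer:start_link(#{connection => standard_io}),
    %% Fire and forget: set an application variable on the peer.
    ok = peer:cast(Peer, application, set_env, [stdlib, demo_key, demo_value]),
    %% Poll until the value is visible, for at most 50 * 50 ms.
    Result = wait_for_env(Peer, 50),
    ok = peer:stop(Peer),
    Result.

wait_for_env(_Peer, 0) ->
    not_set;
wait_for_env(Peer, Retries) ->
    case peer:call(Peer, application, get_env, [stdlib, demo_key]) of
        {ok, Value} ->
            {ok, Value};
        undefined ->
            timer:sleep(50),
            wait_for_env(Peer, Retries - 1)
    end.
```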

send(Dest :: server_ref(),
     To :: pid() | atom(),
     Message :: term()) ->
        ok

    Uses the alternative connection to send Message to a process on
    the peer node. Silently fails if no alternative connection is
    configured. The process can be referenced by process ID or
    registered name.

get_state(Dest :: server_ref()) -> peer_state()

    Returns the peer node state. The initial state is booting; the
    node stays in that state until the boot script is complete, and
    then the node progresses to running. If the node stops
    (gracefully or not), the state changes to down.
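The state transitions can be observed by starting a peer asynchronously and polling get_state/1, as in the following sketch. The module name peer_state_demo, the 100 ms poll interval, and the retry limit are arbitrary choices for illustration. Both return shapes allowed by the start_link/1 specification are accepted.

```erlang
-module(peer_state_demo).
-export([run/0]).

run() ->
    %% wait_boot => false returns without waiting for the boot to finish.
    Result = peer:start_link(#{connection => standard_io,
                               wait_boot => false}),
    %% The specification allows either shape here.
    Peer = case Result of
               {ok, Pid} -> Pid;
               {ok, Pid, _Node} -> Pid
           end,
    State = wait_running(Peer, 150),
    ok = peer:stop(Peer),
    State.

%% Polls until the peer leaves the booting state or retries run out.
wait_running(Peer, 0) ->
    peer:get_state(Peer);
wait_running(Peer, Retries) ->
    case peer:get_state(Peer) of
        booting ->
            timer:sleep(100),
            wait_running(Peer, Retries - 1);
        Other ->
            Other
    end.
```

In real code, wait_boot => {Pid, Tag} avoids polling altogether by delivering a message when the boot completes.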

random_name() -> string()

    The same as random_name(peer).

random_name(Prefix :: string() | atom()) -> string()

    Creates a sufficiently unique node name for the current host,
    combining a prefix, a unique number, and the current OS process
    ID.

    Note:
    Use the ?CT_PEER(["erl_arg1"]) macro provided by Common Test
    (-include_lib("common_test/include/ct.hrl")) for convenience. It
    starts a new peer using Erlang distribution as the control
    channel, supplies the calling module's code path to the peer,
    and uses the calling function name as the name prefix.
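As a small illustration of random_name/0,1, the sketch below generates a few names without starting any node. The module name peer_names and the prefix worker are hypothetical.

```erlang
-module(peer_names).
-export([run/0]).

run() ->
    %% Default prefix.
    Default = peer:random_name(),
    %% The prefix can be given as an atom or as a string.
    A = peer:random_name(worker),
    B = peer:random_name("worker"),
    {Default, A, B}.
```

Any of the returned strings can be passed as the name option of start/1 or start_link/1.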

start(Options :: start_options()) ->
         {ok, pid()} | {ok, pid(), node()} | {error, Reason}

    Types:

        Reason = term()

    Starts a peer node with the specified start_options(). Returns
    the controlling process and the full peer node name, unless
    wait_boot is not requested and the host name is not known in
    advance.

start_link() -> {ok, pid(), node()} | {error, Reason :: term()}

    The same as start_link(#{name => random_name()}).

start_link(Options :: start_options()) ->
              {ok, pid()} | {ok, pid(), node()} | {error, Reason}

    Types:

        Reason = term()

    Starts a peer node in the same way as start/1, except that the
    peer node is linked to the currently executing process. If that
    process terminates, the peer node also terminates.

    Accepts start_options(). Returns the controlling process and the
    full peer node name, unless wait_boot is not requested and the
    host name is not known in advance.

    When the standard_io alternative connection is requested, and
    wait_boot is not set to false, a failed peer boot sequence
    causes the caller to exit with the {boot_failed, {exit_status,
    ExitCode}} reason.

stop(Dest :: server_ref()) -> ok

    Types:

        disconnect_timeout() = 1000..4294967295 | infinity

    Stops a peer node. How the node is stopped depends on the
    shutdown option passed when starting the peer node. Currently
    the following shutdown options are supported:

    halt:
        This is the default shutdown behavior. It behaves as
        shutdown option {halt, DefaultTimeout} where DefaultTimeout
        currently equals 5000.

    {halt, Timeout :: disconnect_timeout()}:
        Triggers a call to erlang:halt() on the peer node and then
        waits for the Erlang distribution connection to the peer
        node to be taken down. If this connection has not been taken
        down after Timeout milliseconds, it will forcefully be taken
        down by peer:stop/1. See the warning below for more info
        about this.

    Timeout :: disconnect_timeout():
        Triggers a call to init:stop() on the peer node and then
        waits for the Erlang distribution connection to the peer
        node to be taken down. If this connection has not been taken
        down after Timeout milliseconds, it will forcefully be taken
        down by peer:stop/1. See the warning below for more info
        about this.

    close:
        Close the control connection to the peer node and return.
        This is the fastest way for the caller of peer:stop/1 to
        stop a peer node.

        Note that if the Erlang distribution connection is not used
        as control connection it might not have been taken down when
        peer:stop/1 returns. Also note that the warning below
        applies when the Erlang distribution connection is used as
        control connection.

    Warning:
    In the cases where the Erlang distribution connection is taken
    down by peer:stop/1, other code independent of the peer code
    might react to the connection loss before the peer node is
    stopped which might cause undesirable effects. For example,
    global might trigger even more Erlang distribution connections
    to other nodes to be taken down. The potential undesirable
    effects are, however, not limited to this. It is hard to say
    what the effects will be since these effects can be caused by
    any code with links or monitors to something on the origin node,
    or code monitoring the connection to the origin node.
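To compare the shutdown options listed above, the following sketch starts one peer per option and stops each of them. The module name peer_shutdown is hypothetical, and the timeout values are arbitrary values within the disconnect_timeout() range. The peers use the standard_io connection, so the warning about the Erlang distribution connection is not exercised here.

```erlang
-module(peer_shutdown).
-export([run/0]).

run() ->
    %% One peer per supported shutdown option.
    [stop_with(Shutdown) || Shutdown <- [halt, {halt, 2000}, 3000, close]].

stop_with(Shutdown) ->
    {ok, Peer, _Node} = peer:start_link(#{connection => standard_io,
                                          shutdown => Shutdown}),
    %% The shutdown option chosen at start decides how stop/1 behaves.
    peer:stop(Peer).
```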

Maxim Fedorov, WhatsApp Inc.        stdlib 4.2                     peer(3)