peer(3)                 Erlang Module Definition                 peer(3)

peer - Start and control linked Erlang nodes.

This module provides functions for starting linked Erlang nodes. The
node spawning new nodes is called origin, and newly started nodes are
peer nodes, or peers. A peer node automatically terminates when it
loses the control connection to the origin. This connection can be an
Erlang distribution connection, or an alternative: TCP or standard
I/O. The alternative connection provides a way to execute remote
procedure calls even when Erlang distribution is not available, which
makes it possible to test the distribution itself.

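As a rough illustration of the alternative connection described above,
the following sketch starts a peer over standard I/O and evaluates a
call through that connection. The module name peer_stdio_demo and the
function run/0 are made up for this example. It assumes that erl can be
started on the local host and that a short node name can be registered.

```erlang
-module(peer_stdio_demo).
-export([run/0]).

%% Hypothetical helper: start a named peer whose control connection is
%% standard I/O, evaluate one call through it, then stop the peer.
run() ->
    {ok, Peer, _Node} = peer:start_link(#{name => peer:random_name(),
                                          connection => standard_io}),
    %% This call travels over the peer's stdin/stdout, not over
    %% Erlang distribution.
    [1, 2, 3] = peer:call(Peer, lists, seq, [1, 3]),
    peer:stop(Peer).
```

Calling peer_stdio_demo:run() is expected to return ok once the peer
has been stopped.
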
Peer node terminal input/output is relayed through the origin. If a
standard I/O alternative connection is requested, console output also
goes via the origin, allowing debugging of node startup and boot script
execution (see -init_debug). File I/O is not redirected, contrary to
slave(3) behaviour.

The peer node can start on the same or a different host (via ssh) or in
a separate container (for example Docker). When the peer starts on the
same host as the origin, it inherits the current directory and
environment variables from the origin.

Note:
    This module is designed to facilitate multi-node testing with Common
    Test. Use the ?CT_PEER() macro to start a linked peer node according
    to Common Test conventions (crash dumps written to a specific
    location, node name prefixed with module name, calling function, and
    origin OS process ID). Use random_name/1 to create sufficiently
    unique node names if you need more control.

    A peer node started without alternative connection behaves similarly
    to slave(3). When an alternative connection is requested, the
    behaviour is similar to test_server:start_node(Name, peer, Args).

The following example implements a test suite starting extra Erlang
nodes. It employs a number of techniques to speed up testing and
reliably shut down peer nodes:

  * Peers start linked to the test runner process. If the test case
    fails, the peer node is stopped automatically, leaving no rogue
    nodes running in the background.

  * Arguments used to start the peer are saved in the control process
    state for manual analysis. If the test case fails, the CRASH REPORT
    contains these arguments.

  * Multiple test cases can run concurrently, speeding up the overall
    testing process. Peer node names are unique even when there are
    multiple instances of the same test suite running in parallel.

    -module(my_SUITE).
    -behaviour(ct_suite).
    -export([all/0, groups/0]).
    -export([basic/1, args/1, named/1, restart_node/1, multi_node/1]).

    -include_lib("common_test/include/ct.hrl").

    groups() ->
        [{quick, [parallel],
            [basic, args, named, restart_node, multi_node]}].

    all() ->
        [{group, quick}].

    basic(Config) when is_list(Config) ->
        {ok, Peer, _Node} = ?CT_PEER(),
        peer:stop(Peer).

    args(Config) when is_list(Config) ->
        %% specify additional arguments to the new node
        {ok, Peer, _Node} = ?CT_PEER(["-emu_flavor", "smp"]),
        peer:stop(Peer).

    named(Config) when is_list(Config) ->
        %% pass test case name down to function starting nodes
        Peer = start_node_impl(named_test),
        peer:stop(Peer).

    start_node_impl(ActualTestCase) ->
        {ok, Peer, Node} = ?CT_PEER(#{name => ?CT_PEER_NAME(ActualTestCase)}),
        %% extra setup needed for multiple test cases
        ok = rpc:call(Node, application, set_env, [kernel, key, value]),
        Peer.

    restart_node(Config) when is_list(Config) ->
        Name = ?CT_PEER_NAME(),
        {ok, Peer, Node} = ?CT_PEER(#{name => Name}),
        peer:stop(Peer),
        %% restart the node with the same name as before
        {ok, Peer2, Node} = ?CT_PEER(#{name => Name, args => ["+fnl"]}),
        peer:stop(Peer2).

The next example demonstrates how to start multiple nodes concurrently:

    multi_node(Config) when is_list(Config) ->
        Peers = [?CT_PEER(#{wait_boot => {self(), tag}})
            || _ <- lists:seq(1, 4)],
        %% wait for all nodes to complete boot process, get their names:
        _Nodes = [receive {tag, {started, Node, Peer}} -> Node end
            || {ok, Peer} <- Peers],
        [peer:stop(Peer) || {ok, Peer} <- Peers].

The next example starts a peer on a different host. It requires ssh
key-based authentication to be set up, so that connecting to
"another_host" does not prompt for a password.

    Ssh = os:find_executable("ssh"),
    peer:start_link(#{exec => {Ssh, ["another_host", "erl"]},
        connection => standard_io}).

The following Common Test case demonstrates Docker integration,
starting two containers with hostnames "one" and "two". In this example
Erlang nodes running inside containers form an Erlang cluster.

    docker(Config) when is_list(Config) ->
        Docker = os:find_executable("docker"),
        PrivDir = proplists:get_value(priv_dir, Config),
        build_release(PrivDir),
        build_image(PrivDir),

        %% start two Docker containers
        {ok, Peer, Node} = peer:start_link(#{name => lambda,
            connection => standard_io,
            exec => {Docker, ["run", "-h", "one", "-i", "lambda"]}}),
        {ok, Peer2, Node2} = peer:start_link(#{name => lambda,
            connection => standard_io,
            exec => {Docker, ["run", "-h", "two", "-i", "lambda"]}}),

        %% find IP address of the second node using alternative connection RPC
        {ok, Ips} = peer:call(Peer2, inet, getifaddrs, []),
        {"eth0", Eth0} = lists:keyfind("eth0", 1, Ips),
        {addr, Ip} = lists:keyfind(addr, 1, Eth0),

        %% make the first node discover the second one
        ok = peer:call(Peer, inet_db, set_lookup, [[file]]),
        ok = peer:call(Peer, inet_db, add_host, [Ip, ["two"]]),

        %% join a cluster
        true = peer:call(Peer, net_kernel, connect_node, [Node2]),
        %% verify that the second peer node has only the first node visible
        [Node] = peer:call(Peer2, erlang, nodes, []),

        %% stop peers, causing containers to also stop
        peer:stop(Peer2),
        peer:stop(Peer).

    build_release(Dir) ->
        %% load sasl.app file, otherwise application:get_key will fail
        application:load(sasl),
        %% create *.rel - release file
        RelFile = filename:join(Dir, "lambda.rel"),
        Release = {release, {"lambda", "1.0.0"},
            {erts, erlang:system_info(version)},
            [{App, begin {ok, Vsn} = application:get_key(App, vsn), Vsn end}
                || App <- [kernel, stdlib, sasl]]},
        ok = file:write_file(RelFile, list_to_binary(lists:flatten(
            io_lib:format("~tp.", [Release])))),
        RelFileNoExt = filename:join(Dir, "lambda"),

        %% create boot script
        {ok, systools_make, []} = systools:make_script(RelFileNoExt,
            [silent, {outdir, Dir}]),
        %% package release into *.tar.gz
        ok = systools:make_tar(RelFileNoExt, [{erts, code:root_dir()}]).

    build_image(Dir) ->
        %% Create Dockerfile example, working only for Ubuntu 20.04
        %% Expose port 4445, and make Erlang distribution to listen
        %% on this port, and connect to it without EPMD
        %% Set cookie on both nodes to be the same.
        BuildScript = filename:join(Dir, "Dockerfile"),
        Dockerfile =
            "FROM ubuntu:20.04 as runner\n"
            "EXPOSE 4445\n"
            "WORKDIR /opt/lambda\n"
            "COPY lambda.tar.gz /tmp\n"
            "RUN tar -zxvf /tmp/lambda.tar.gz -C /opt/lambda\n"
            "ENTRYPOINT [\"/opt/lambda/erts-" ++ erlang:system_info(version) ++
            "/bin/dyn_erl\", \"-boot\", \"/opt/lambda/releases/1.0.0/start\","
            " \"-kernel\", \"inet_dist_listen_min\", \"4445\","
            " \"-erl_epmd_port\", \"4445\","
            " \"-setcookie\", \"secret\"]\n",
        ok = file:write_file(BuildScript, Dockerfile),
        os:cmd("docker build -t lambda " ++ Dir).

server_ref() = pid()

    Identifies the controlling process of a peer node.

start_options() =
    #{name => atom() | string(),
      longnames => boolean(),
      host => string(),
      peer_down => stop | continue | crash,
      connection => connection(),
      exec => exec(),
      detached => boolean(),
      args => [string()],
      post_process_args => fun(([string()]) -> [string()]),
      env => [{string(), string()}],
      wait_boot => wait_boot(),
      shutdown =>
          close | halt |
          {halt, disconnect_timeout()} |
          disconnect_timeout()}

    Options that can be used when starting a peer node through
    start/1 and start_link/0,1.

    name:
        Node name (the part before "@"). When name is not specified,
        but host is, peer follows compatibility behaviour and uses
        the origin node name.

    longnames:
        Use long names to start a node. Default is taken from the
        origin using net_kernel:longnames(). If the origin is not
        distributed, short names is the default.

    host:
        Enforces a specific host name. Can be used to override the
        default behaviour and start "node@localhost" instead of
        "node@realhostname".

    peer_down:
        Defines the peer control process behaviour when the control
        connection is closed from the peer node side (for example
        when the peer crashes or dumps core). When set to stop
        (default), a lost control connection causes the control
        process to exit normally. Setting peer_down to continue keeps
        the control process running, and crash will cause the
        controlling process to exit abnormally.

    connection:
        Alternative connection specification. See the connection
        datatype.

    exec:
        Alternative mechanism to start peer nodes with, for example,
        ssh instead of the default bash.

    detached:
        Defines whether to pass the -detached flag to the started
        peer. This option cannot be set to false using the
        standard_io alternative connection type. Default is true.

    args:
        Extra command line arguments to append to the "erl" command.
        Arguments are passed as is, no escaping or quoting is needed
        or accepted.

    post_process_args:
        Allows the user to change the arguments passed to exec
        before the peer is started. This can for example be useful
        when the exec program wants the arguments to "erl" as a
        single argument. Example:

            peer:start(#{ name => peer:random_name(),
              exec => {os:find_executable("bash"),["-c","erl"]},
              post_process_args =>
                 fun(["-c"|Args]) -> ["-c", lists:flatten(lists:join($\s, Args))] end
              }).

    env:
        List of environment variables with their values. This list
        is applied to a locally started executable. If you need to
        change the environment of the remote peer, adjust args to
        contain -env ENV_KEY ENV_VALUE.

    wait_boot:
        Specifies the start/start_link timeout. See wait_boot
        datatype.

    shutdown:
        Specifies the peer node stopping behaviour. See stop().

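To make the args and env options above more concrete, here is a small
sketch that passes one extra command line flag and one environment
variable to a locally started peer and reads both back. The module name
peer_args_env_demo, the flag -demo_flag, and the variable DEMO_VAR are
invented for this example. It assumes a local erl and a registrable
short node name.

```erlang
-module(peer_args_env_demo).
-export([run/0]).

%% Hypothetical helper showing the args and env start options.
run() ->
    {ok, Peer, _Node} =
        peer:start_link(#{name => peer:random_name(),
                          connection => standard_io,
                          %% each list element is one argument, passed
                          %% as is, so the embedded space needs no quoting
                          args => ["-demo_flag", "some value"],
                          env => [{"DEMO_VAR", "42"}]}),
    %% user flags given on the command line are visible through init
    {ok, [["some value"]]} = peer:call(Peer, init, get_argument, [demo_flag]),
    %% the environment variable is set for the locally started peer
    "42" = peer:call(Peer, os, getenv, ["DEMO_VAR"]),
    peer:stop(Peer).
```
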
peer_state() = booting | running | {down, Reason :: term()}

    Peer node state.

connection() =
    0..65535 | {inet:ip_address(), 0..65535} | standard_io

    Alternative connection between the origin and the peer. When the
    connection closes, the peer node terminates automatically. If
    the peer_down startup flag is set to crash, the controlling
    process on the origin node exits with corresponding reason,
    effectively providing a two-way link.

    When connection is set to a port number, the origin starts
    listening on the requested TCP port, and the peer node connects to
    the port. When it is set to an {IP, Port} tuple, the origin
    listens only on the specified IP. The port number can be set to 0
    for automatic selection.

    Using the standard_io alternative connection starts the peer
    attached to the origin (other connections use -detached flag to
    erl). In this mode peer and origin communicate via stdin/stdout.

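The {IP, Port} form of connection() can be sketched as follows. The
example listens on the loopback address only and lets the system pick
the port by passing 0. The module name peer_tcp_demo is hypothetical,
and the sketch assumes that the peer runs on the same host as the
origin, so that the loopback address is reachable.

```erlang
-module(peer_tcp_demo).
-export([run/0]).

%% Hypothetical helper: use a TCP alternative connection bound to
%% 127.0.0.1 with an automatically selected port.
run() ->
    {ok, Peer, _Node} =
        peer:start_link(#{name => peer:random_name(),
                          connection => {{127, 0, 0, 1}, 0}}),
    %% The call is evaluated on the peer, which is a separate OS process.
    PeerOsPid = peer:call(Peer, os, getpid, []),
    true = PeerOsPid =/= os:getpid(),
    peer:stop(Peer).
```
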
exec() = file:name() | {file:name(), [string()]}

    Overrides executable to start peer nodes with. By default it is
    the path to "erl", taken from init:get_argument(progname). If
    progname is not known, peer makes a best guess given the current
    ERTS version.

    When a tuple is passed, the first element is the path to the
    executable, and the second element is prepended to the final
    command line. This can be used to start peers on a remote host or
    in a Docker container. See the examples above.

    This option is useful for testing backwards compatibility with
    previous releases, installed at specific paths, or when the
    Erlang installation location is missing from the PATH.

wait_boot() = timeout() | {pid(), Tag :: term()} | false

    Specifies start/start_link timeout in milliseconds. Can be set
    to false, allowing the peer to start asynchronously. If {Pid,
    Tag} is specified instead of a timeout, the peer will send Tag
    to the requested process.

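The {Pid, Tag} form of wait_boot can be sketched outside of Common Test
as well. The message shape {Tag, {started, Node, Peer}} is taken from
the multi_node example earlier in this page. The module name
peer_async_demo, the tag booted, and the 15 second guard timeout are
choices made for this illustration only.

```erlang
-module(peer_async_demo).
-export([run/0]).

%% Hypothetical helper: start a peer without blocking, then wait for
%% the boot notification tagged with 'booted'.
run() ->
    Opts = #{name => peer:random_name(),
             connection => standard_io,
             wait_boot => {self(), booted}},
    %% The spec allows both return shapes, so accept either one.
    Peer = case peer:start_link(Opts) of
               {ok, P} -> P;
               {ok, P, _} -> P
           end,
    receive
        {booted, {started, Node, Peer}} ->
            ok = peer:stop(Peer),
            {ok, Node}
    after 15000 ->
        %% guard timeout chosen arbitrarily for this sketch
        peer:stop(Peer),
        error(boot_timeout)
    end.
```
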
disconnect_timeout() = 1000..4294967295 | infinity

    Disconnect timeout. See stop().

call(Dest :: server_ref(),
     Module :: module(),
     Function :: atom(),
     Args :: [term()]) ->
        Result :: term()

call(Dest :: server_ref(),
     Module :: module(),
     Function :: atom(),
     Args :: [term()],
     Timeout :: timeout()) ->
        Result :: term()

    Uses the alternative connection to evaluate apply(Module,
    Function, Args) on the peer node and returns the corresponding
    value Result. Timeout is an integer representing the timeout in
    milliseconds or the atom infinity which prevents the operation
    from ever timing out.

    When an alternative connection is not requested, this function
    raises an exit signal with the noconnection reason. Use the erpc
    module to communicate over Erlang distribution.

cast(Dest :: server_ref(),
     Module :: module(),
     Function :: atom(),
     Args :: [term()]) ->
        ok

    Uses the alternative connection to evaluate apply(Module,
    Function, Args) on the peer node. No response is delivered to the
    calling process.

    peer:cast/4 fails silently when the alternative connection is
    not configured. Use the erpc module to communicate over Erlang
    distribution.

send(Dest :: server_ref(),
     To :: pid() | atom(),
     Message :: term()) ->
        ok

    Uses the alternative connection to send Message to a process on
    the peer node. Silently fails if no alternative connection is
    configured. The process can be referenced by process ID or
    registered name.

get_state(Dest :: server_ref()) -> peer_state()

    Returns the peer node state. The initial state is booting; the
    node stays in that state until the boot script is complete, and
    then the node progresses to running. If the node stops
    (gracefully or not), the state changes to down.

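A short sketch of querying the state follows. Because start_link/1
with the default wait_boot only returns after the boot has completed,
the state observed right afterwards is expected to be running. The
module name peer_state_demo is hypothetical.

```erlang
-module(peer_state_demo).
-export([run/0]).

%% Hypothetical helper: return the peer state observed right after a
%% synchronous start.
run() ->
    {ok, Peer, _Node} = peer:start_link(#{name => peer:random_name(),
                                          connection => standard_io}),
    State = peer:get_state(Peer),
    ok = peer:stop(Peer),
    State.
```
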
random_name() -> string()

    The same as random_name(peer).

random_name(Prefix :: string() | atom()) -> string()

    Creates a sufficiently unique node name for the current host,
    combining a prefix, a unique number, and the current OS process
    ID.

    Note:
        Use the ?CT_PEER(["erl_arg1"]) macro provided by Common Test
        (via -include_lib("common_test/include/ct.hrl")) for
        convenience. It starts a new peer using Erlang distribution as
        the control channel, supplies the calling module's code path
        to the peer, and uses the calling function name for the name
        prefix.

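The following sketch shows that random_name/1 accepts both an atom and
a string prefix and that successive calls yield different names. It
does not start any node. The module name peer_name_demo is made up for
this example, and the exact layout of the generated string is
deliberately not relied upon.

```erlang
-module(peer_name_demo).
-export([run/0]).

%% Hypothetical helper: generate two node names with the same prefix.
run() ->
    A = peer:random_name(demo),      %% atom prefix
    B = peer:random_name("demo"),    %% string prefix
    true = is_list(A) andalso is_list(B),
    %% the unique number differs between calls, so the names differ
    true = A =/= B,
    {A, B}.
```

Either result can be used as the name option of start_options().
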
start(Options :: start_options()) ->
         {ok, pid()} | {ok, pid(), node()} | {error, Reason}

    Types:

        Reason = term()

    Starts a peer node with the specified start_options(). Returns
    the controlling process and the full peer node name, unless
    wait_boot is not requested and the host name is not known in
    advance.

start_link() -> {ok, pid(), node()} | {error, Reason :: term()}

    The same as start_link(#{name => random_name()}).

start_link(Options :: start_options()) ->
              {ok, pid()} | {ok, pid(), node()} | {error, Reason}

    Types:

        Reason = term()

    Starts a peer node in the same way as start/1, except that the
    peer node is linked to the currently executing process. If that
    process terminates, the peer node also terminates.

    Accepts start_options(). Returns the controlling process and the
    full peer node name, unless wait_boot is not requested and host
    name is not known in advance.

    When the standard_io alternative connection is requested, and
    wait_boot is not set to false, a failed peer boot sequence
    causes the caller to exit with the {boot_failed, {exit_status,
    ExitCode}} reason.

stop(Dest :: server_ref()) -> ok

    Types:

        disconnect_timeout() = 1000..4294967295 | infinity

    Stops a peer node. How the node is stopped depends on the
    shutdown option passed when starting the peer node. Currently the
    following shutdown options are supported:

    halt:
        This is the default shutdown behavior. It behaves as
        shutdown option {halt, DefaultTimeout} where DefaultTimeout
        currently equals 5000.

    {halt, Timeout :: disconnect_timeout()}:
        Triggers a call to erlang:halt() on the peer node and then
        waits for the Erlang distribution connection to the peer
        node to be taken down. If this connection has not been taken
        down after Timeout milliseconds, it will forcefully be taken
        down by peer:stop/1. See the warning below for more info
        about this.

    Timeout :: disconnect_timeout():
        Triggers a call to init:stop() on the peer node and then
        waits for the Erlang distribution connection to the peer
        node to be taken down. If this connection has not been taken
        down after Timeout milliseconds, it will forcefully be taken
        down by peer:stop/1. See the warning below for more info
        about this.

    close:
        Close the control connection to the peer node and return.
        This is the fastest way for the caller of peer:stop/1 to
        stop a peer node.

        Note that if the Erlang distribution connection is not used
        as control connection it might not have been taken down when
        peer:stop/1 returns. Also note that the warning below
        applies when the Erlang distribution connection is used as
        control connection.

    Warning:
        In the cases where the Erlang distribution connection is taken
        down by peer:stop/1, other code independent of the peer code
        might react to the connection loss before the peer node is
        stopped which might cause undesirable effects. For example,
        global might trigger even more Erlang distribution connections
        to other nodes to be taken down. The potential undesirable
        effects are, however, not limited to this. It is hard to say
        what the effects will be since these effects can be caused by
        any code with links or monitors to something on the origin
        node, or code monitoring the connection to the origin node.

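As an illustration of how a shutdown option is chosen at start time and
takes effect in peer:stop/1, the sketch below requests the close
behaviour. Any of the other values listed above would be supplied in
the same place. The module name peer_shutdown_demo is hypothetical.

```erlang
-module(peer_shutdown_demo).
-export([run/0]).

%% Hypothetical helper: select the 'close' shutdown behaviour when the
%% peer is started, then stop it.
run() ->
    {ok, Peer, _Node} = peer:start_link(#{name => peer:random_name(),
                                          connection => standard_io,
                                          shutdown => close}),
    ok = peer:stop(Peer),
    %% the control process on the origin is gone once stop/1 returns
    false = is_process_alive(Peer),
    ok.
```
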

Maxim Fedorov, WhatsApp Inc.      stdlib 5.1.1      peer(3)