BFCTL(1)                         LAM COMMANDS                         BFCTL(1)

NAME

       bfctl, sweep - Control LAM buffers.

SYNTAX

       bfctl [-hR] [-s <space>] [-e <event>] <nodes>

       sweep <nodes>

OPTIONS

       -h             Print the command help menu.

       -R             Reset the state of the buffer daemon, removing all
                      buffered messages.

       -e <event>     Sweep (clean) out buffered messages of a specific event.

       -s <space>     Limit the total size, in bytes, of a node's buffer pool.

DESCRIPTION

       Most MPI users will probably not need the bfctl and sweep commands;
       see lamclean(1).  These commands are installed only if LAM/MPI was
       configured with the --with-trillium switch.

       The bfctl command controls buffering parameters on any node.  It must
       be called with an option: bfctl <nodes> by itself does nothing.  sweep
       is used after an application program error or premature termination to
       remove all messages held in buffers.

       The -s <space> option sets the maximum number of bytes that the buffer
       daemon's buffer pool may consume; the default is 2 Mbytes.  <space>
       should not be less than MAXNMSGLEN (defined in <net.h>).
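
       As a rough sketch of how a <space> value might be worked out, the
       shell fragment below converts a size in megabytes to the hexadecimal
       byte count used in the EXAMPLES section.  It only prints the bfctl
       command line rather than running it.  The function names mb_to_bytes
       and show_bfctl_space are made up for illustration, and a megabyte is
       assumed to be 1048576 bytes, in line with the 0x100000 example.  The
       sketch does not check the MAXNMSGLEN lower bound; that value has to be
       looked up in <net.h>.

```shell
# Convert a size in megabytes (assumed 1048576 bytes each) to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print (do not run) the bfctl command that would set the pool size.
# $1 = size in megabytes, $2 = node identifier such as h or n1.
show_bfctl_space() {
    printf 'bfctl -s 0x%x %s\n' "$(mb_to_bytes "$1")" "$2"
}

show_bfctl_space 1 h
show_bfctl_space 2 h
```

       The second call corresponds to the documented 2 Mbyte default.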

       After an application program error or the premature termination of an
       application process, unwanted messages often remain in the buffers.
       Sweep the buffers clean before running the application program again.
       bfctl -R <nodes> removes all messages from the internal buffer pool on
       the given nodes.  sweep <nodes> is equivalent to bfctl -R <nodes>.
       Sweeping can also be selective: the -e <event> option removes only the
       messages of the specified event.
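
       The clean-up described above might be scripted along the following
       lines.  This is only a sketch: the run helper and the DRY_RUN variable
       are hypothetical, the event numbers 4 and 7 are placeholders, and the
       node identifiers N and n1 follow the notation of the EXAMPLES section.
       By default the commands are printed, not executed.

```shell
# Print each command by default; set DRY_RUN=0 to execute it on a LAM system.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Selective sweep: remove only the messages of the chosen events on node 1.
for event in 4 7; do
    run bfctl -e "$event" n1
done

# Full sweep: remove every buffered message on all nodes.
run sweep N
```

       Printing first makes it easy to review which buffers would be cleared
       before anything is removed.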

   Message Buffering
       The purpose of LAM network buffering is to receive, store, and forward
       messages in order to provide very loose synchronization for senders,
       to allow selective out-of-order synchronization for receivers, and to
       facilitate debugging of synchronization errors.

       Two communicating processes using the network functions nsend(2) and
       nrecv(2) (or functions built upon these) can choose whether or not to
       use the network buffers.  By default, they are used.  The message is
       routed to the buffer daemon on each node along the path from the
       sender to the receiver.  If the two processes are on different nodes,
       the buffer daemon on the sender's node is skipped.  The receiver
       synchronizes by first sending a query to the local buffer daemon and
       then waiting for a message to arrive on the selected event.  If the
       buffer daemon already holds a synchronizing message, it forwards it to
       the receiver immediately.  Otherwise the buffer daemon forwards the
       message when it arrives.  The sender blocks only if there is no
       appropriate buffer space available on the receiver's node and on all
       nodes in between.

   Bypassing Buffers
       Buffering is turned off by setting the NOBUF flag in the nh_flags
       field of the network message descriptor before calling nrecv(2) in the
       receiver and nsend(2) in the sender.  Use the NOBUF flag with care.
       Setting the flag in one process but not the other may inhibit
       synchronization.  Toggling the NOBUF flag within a stream of messages
       to the same receiver on the same synchronization point (event and
       type, see nsend(2)) may cause messages to arrive out of order.  Even
       without buffering, the node-to-node links can hold one or more
       messages.  Thus the sender blocks when all the links on the path to
       the receiver's node are full of messages.  When the sender and
       receiver are on the same node, synchronization is strong and the
       sender blocks until the receiver takes the message.

       The buffer daemon refuses to receive any message for buffering while
       the current size of the buffer pool exceeds the upper size limit.  It
       resumes receiving messages when space is freed by forwarding messages
       to receivers or other nodes.
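
       The pool limit rule in the last paragraph can be pictured with a toy
       model.  The fragment below is not the buffer daemon's code; it is a
       small shell simulation in which POOL_LIMIT, buffer_msg and forward_msg
       are invented names and the message sizes are arbitrary.  It assumes,
       following the wording above, that a message is refused only while the
       pool's current size already exceeds the limit.

```shell
POOL_LIMIT=1000   # hypothetical limit in bytes, kept small for the demo
pool_used=0       # bytes currently held in the simulated pool

# Accept a message of $1 bytes unless the pool is already over the limit.
buffer_msg() {
    if [ "$pool_used" -gt "$POOL_LIMIT" ]; then
        echo "refused $1"
        return 1
    fi
    pool_used=$(( pool_used + $1 ))
    echo "buffered $1 (pool=$pool_used)"
}

# Forwarding a message of $1 bytes releases its space in the pool.
forward_msg() {
    pool_used=$(( pool_used - $1 ))
    echo "forwarded $1 (pool=$pool_used)"
}

buffer_msg 600            # accepted, pool is empty
buffer_msg 600            # accepted, pool was below the limit
buffer_msg 600 || true    # refused, pool now exceeds the limit
forward_msg 600           # space is freed by forwarding
buffer_msg 600            # accepted again
```

       In the model, as in the description, a refused sender can make
       progress only after some buffered message has been forwarded.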

EXAMPLES

       bfctl -s 0x100000 h
           Allow one megabyte of total message buffer space on the local node.

       sweep N
           Clean out all buffers on all nodes.

       bfctl -e 4 n1
           Remove all messages with event 4 on node 1.

SEE ALSO

       bfstate(1), lamclean(1)



LAM 7.1.2                         March, 2006                         BFCTL(1)