netpipe(1)                          netpipe                          netpipe(1)


NAME
       NetPIPE - Network Protocol Independent Performance Evaluator

SYNOPSIS
       NPtcp [-h receiver_hostname] [-b TCP_buffer_sizes] [options]

       mpirun [-machinefile hostlist] -np 2 NPmpi [-a] [-S] [-z] [options]

       mpirun [-machinefile hostlist] -np 2 NPmpi2 [-f] [-g] [options]

       NPpvm [options]

       See the TESTING sections below for a more complete description of how
       to run NetPIPE in each environment.  The OPTIONS section describes the
       general options available for all modules.  See the README file from
       the tarball at http://www.bitspjoule.org/Projects/NetPIPE/ for
       documentation on the InfiniBand, GM, SHMEM, LAPI, and memcpy modules.

DESCRIPTION
       NetPIPE uses a simple series of ping-pong tests over a range of
       message sizes to provide a complete measure of the performance of a
       network.  It bounces messages of increasing size between two
       processes, whether across a network or within an SMP system.  Message
       sizes are chosen at regular intervals, and with slight perturbations,
       to provide a complete evaluation of the communication system.  Each
       data point involves many ping-pong tests to provide an accurate
       timing.  Latencies are calculated by dividing the round-trip time in
       half for small messages (less than 64 bytes).

       The communication time for small messages is dominated by the overhead
       in the communication layers, meaning that the transmission is latency
       bound.  For larger messages, the communication rate becomes bandwidth
       limited by some component in the communication subsystem (PCI bus,
       network card link, network switch).

       These measurements can be done at the message-passing layer (MPI,
       MPI-2, and PVM) or at the native communications layers they run upon
       (TCP/IP, GM for Myrinet cards, InfiniBand, SHMEM for Cray T3E systems,
       and LAPI for IBM SP systems).  Recent work is aimed at measuring some
       internal system properties, such as the memcpy module that measures
       internal memory copy rates, or a disk module under development that
       measures the performance of various I/O devices.

       Some uses for NetPIPE include:

              Comparing the latency and maximum throughput of various network
              cards.

              Comparing the performance between different types of networks.

              Looking for inefficiencies in the message-passing layer by
              comparing it to the native communication layer.

              Optimizing the message-passing layer and tuning OS and driver
              parameters for optimal performance of the communication
              subsystem.

       NetPIPE is provided with many modules allowing it to interface with a
       wide variety of communication layers.  It is fairly easy to write new
       interfaces for other reliable protocols by using the existing modules
       as examples.

TESTING TCP
       NPtcp can be launched in two ways: by manually starting NPtcp on both
       systems, or by using the nplaunch script.  To manually start NPtcp,
       the NetPIPE receiver must be started first on the remote system using
       the command:

              NPtcp [options]

       then the primary transmitter is started on the local system with the
       command:

              NPtcp -h receiver_hostname [options]

       Any options used must be the same on both sides.  The -P parameter can
       be used to override the default port number, which is helpful when
       running several streams through a router to a single endpoint.
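
       For example, to run one of several streams on an alternate port (the
       host name and port number below are only illustrative), start the
       receiver and then the transmitter with the same -P value:

              NPtcp -P 7002

              NPtcp -h node1 -P 7002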

       The nplaunch script uses ssh to launch the remote receiver before
       starting the local transmitter.  To use rsh, simply change the
       nplaunch script.

              nplaunch NPtcp -h receiver_hostname [options]

       The -b TCP_buffer_sizes option sets the TCP socket buffer size, which
       can greatly influence the maximum throughput on some systems.  A
       throughput graph that flattens out suddenly may be a sign that the
       performance is limited by the socket buffer sizes.
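
       For example, to test with 256 KB socket buffers (the host name and
       buffer size below are only illustrative), use the same -b value on
       both sides:

              NPtcp -b 262144

              NPtcp -h node1 -b 262144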

       Several other protocols are testable in the same way as TCP.  These
       include TCP6 (TCP over IPv6), SCTP, and IPX.  They are started in the
       same way, but the program names are NPtcp6, NPsctp, and NPipx,
       respectively.
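
       For example, an SCTP test is started just like a TCP test (the host
       name below is only illustrative):

              NPsctp

              NPsctp -h node1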

TESTING MPI and MPI-2
       Use of the MPI interface for NetPIPE depends on the MPI implementation
       being used.  All will require the number of processes to be specified,
       usually with a -np 2 argument.  Cluster environments may require a
       list of the hosts being used, either during initialization of MPI
       (during lamboot for LAM-MPI) or when each job is run (using a
       -machinefile argument for MPICH).  For LAM-MPI, for example, put the
       list of hosts in hostlist, then boot LAM and run NetPIPE using:

              lamboot -v -b hostlist

              mpirun -np 2 NPmpi [NetPIPE options]

       For MPICH use a command like:

              mpirun -machinefile hostlist -np 2 NPmpi [NetPIPE options]

       To test the 1-sided communications of the MPI-2 standard, compile
       using:

              make mpi2

       Run as described above, and MPI will use 1-sided MPI_Put() calls in
       both directions, with each receiver blocking until the last byte has
       been overwritten before bouncing the message back.  Use the -f option
       to force usage of a fence to block rather than an overwrite of the
       last byte.  The -g option will use MPI_Get() functions to transfer the
       data rather than MPI_Put().
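
       For example, to measure the MPI_Get() direction under MPICH (reusing
       the hostlist file described above), a run might look like:

              mpirun -machinefile hostlist -np 2 NPmpi2 -g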

TESTING PVM
       Start the pvm system using:

              pvm

       and add a second machine with the PVM command

              add receiver_hostname

       Exit the PVM command-line interface using quit, then run the PVM
       NetPIPE receiver on one system with the command:

              NPpvm [options]

       and run the PVM NetPIPE transmitter on the other system with the
       command:

              NPpvm -h receiver_hostname [options]

       Any options used must be the same on both sides.  The nplaunch script
       may also be used with NPpvm as described above for NPtcp.
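
       For example, to launch both ends with a single command:

              nplaunch NPpvm -h receiver_hostname [options]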

TESTING METHODOLOGY
       NetPIPE tests network performance by sending a number of messages at
       each block size, starting from the lower bound on the message sizes.

       The message size is incremented until the upper bound on the message
       size is reached or the time to transmit a block exceeds one second,
       whichever occurs first.  Message sizes are chosen at regular
       intervals, with slight perturbations about them, to provide a more
       complete evaluation of the communication subsystem.

       The NetPIPE output file may be graphed using a program such as
       gnuplot(1).  The output file contains three columns: the number of
       bytes in the block, the transfer rate in bits per second, and the time
       to transfer the block (half the round-trip time).  The first two
       columns are normally used to graph the throughput versus block size,
       while the third column provides the latency.  For example, the
       throughput versus block size graph can be created by graphing bytes
       versus bits per second.  Sample gnuplot(1) commands for such a graph
       would be

              set logscale x

              plot "np.out"
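
       The latency can be plotted in a similar way by graphing the first
       column against the third, for example:

              set logscale x

              plot "np.out" using 1:3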

OPTIONS
       -a     Asynchronous mode: prepost receives (MPI, IB modules).

       -b TCP_buffer_sizes
              Set the send and receive TCP buffer sizes (TCP module only).

       -B     Burst mode where all receives are preposted at once (MPI, IB
              modules).

       -f     Use a fence to block for completion (MPI2 module only).

       -g     Use MPI_Get() instead of MPI_Put() (MPI2 module only).

       -h hostname
              Specify the name of the receiver host to connect to (TCP, PVM,
              IB, GM).

       -I     Invalidate cache to measure performance without cache effects
              (mostly affects IB and memcpy modules).

       -i     Do an integrity check instead of a performance evaluation.

       -l starting_msg_size
              Specify the lower bound for the size of messages to be tested.

       -n nrepeats
              Set the number of repeats for each test to a constant.
              Otherwise, the number of repeats is chosen to provide an
              accurate timing for each test.  Be very careful when specifying
              a low number: the time for the ping-pong test must still exceed
              the timer accuracy.

       -O source_offset,dest_offset
              Specify the source and destination offsets of the buffers from
              perfect page alignment.

       -o output_filename
              Specify the output filename (default is np.out).

       -p perturbation_size
              NetPIPE chooses the message sizes at regular intervals,
              increasing them exponentially from the lower boundary to the
              upper boundary.  At each point, it also tests perturbations of
              3 bytes above and 3 bytes below each test point to find
              idiosyncrasies in the system.  This perturbation value can be
              changed using the -p option, or turned off using -p 0.

       -r     Reset the TCP sockets after every test (TCP module only).  This
              is necessary for some streaming tests to get good measurements,
              since the socket window size may otherwise collapse (see the
              example following this list).

       -s     Set streaming mode, where data is only transmitted in one
              direction.

       -S     Use synchronous sends (MPI module only).

       -u upper_bound
              Specify the upper bound on the size of messages being tested.
              By default, NetPIPE will stop when the time to transmit a block
              exceeds one second.

       -z     Receive messages using MPI_ANY_SOURCE (MPI module only).

       -2     Set bidirectional mode, where both sides send and receive at
              the same time (supported by most modules).  You may need to use
              -a to choose asynchronous communications for MPI to avoid
              freeze-ups.  For TCP, the maximum test size will be limited by
              the TCP buffer sizes.
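
       For example, a streaming TCP test with socket resets, writing to a
       custom output file (the host name and file name below are only
       illustrative), could be run with the same options on both ends:

              NPtcp -s -r -o stream.out

              NPtcp -h node1 -s -r -o stream.out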

FILES
       np.out Default output file for NetPIPE.  Overridden by the -o option.

AUTHOR
       The original NetPIPE core plus the TCP and MPI modules were written by
       Quinn Snell, Armin Mikler, Guy Helmer, and John Gustafson.  NetPIPE is
       currently being developed and maintained by Dave Turner, with
       contributions from many students (Bogdan Vasiliu, Adam Oline, Xuehua
       Chen, and Brian Smith).

       Send comments and bug reports to <netpipe@bitspjoule.org>.

       Additional information about NetPIPE can be found on the World Wide
       Web at http://www.bitspjoule.org/Projects/NetPIPE/

BUGS
       As of version 3.6.1, there is a bug that causes NetPIPE to segfault on
       RedHat Enterprise systems.  I will debug this as soon as I get access
       to a few such systems.  -Dave Turner (turner@ameslab.gov)

NetPIPE                          June 1, 2004                       netpipe(1)