DIETAGENT(1) DIETAGENT(1)

dietAgent - DIET grid middleware agent

dietAgent - Main binary of the DIET Architecture for running DIET
Agents (master and local agents)

dietAgent [config file] ...

dietAgent is the main binary of the DIET distribution. It is used
for both master and local agents of a DIET hierarchy.

A DIET platform is built upon Server Daemons (SeDs). Requests are
distributed among a hierarchy of agents. The scheduler can rely on
resource availability information collected from three different
tools: NWS sensors, which are placed on every node of the hierarchy;
the application-centric performance prediction tool FAST, which
relies on NWS information; or CoRI Easy, a module based on simple
system calls and basic performance tests.

The different components of a DIET architecture are the following:

Client
A client is an application which uses DIET to solve computational
problems. Clients can be web pages, Problem Solving Environments
(PSEs) such as Matlab or Scilab, or native programs.

Master Agent (MA)
An MA manages computation requests from clients. It chooses the
best server available to handle the request based on performance
information collected from servers. The reference of the chosen
server is then returned to the client.

Local Agent (LA)
An LA transmits requests between MAs and servers. LAs store a
list of services available in their subtree. For each service,
LAs store a list of children (either agents or servers) providing
it. Depending on the underlying network topology, a hierarchy of
LAs may exist between the MA and the appropriate servers. One of
an LA's tasks is to do a partial scheduling on its subtree,
effectively reducing the workload of its MA.

Server Daemon (SeD)
A SeD encapsulates a computational resource. For instance, it can
be located on the entry point of a parallel computer. A SeD stores
a list of locally available data, the available computational
solvers and performance-related information (amount of available
memory or number of resources). During registration, a SeD declares
to its parent agent (LA or MA) every computational problem it can
solve. A SeD can send performance and hardware information by
using the CoRI module, or performance predictions for some kinds
of problems by using the FAST module.

Master Agent DAG (MA DAG)
The Master Agent DAG (MADAG) provides DAG workflow scheduling.
This agent serves as the entry point to the DIET hierarchy for a
client that wants to submit a workflow. The language supported
by the MADAG is based on XML.

DIET relies on the CORBA naming service for service discovery, allowing
every entity to interconnect. The reference to the omniORB naming
service is written in a CORBA configuration file whose path is given to
omniORB through the environment variable OMNIORB_CONFIG.

The lines concerning the name server in the omniORB configuration
file are built as follows:

InitRef = NameService=corbaname::<hostname>:<port>

The name server port is the port given as an argument to the -start
option of omniNames. You also need to update your LD_LIBRARY_PATH to
point to <install dir>/lib, so your LD_LIBRARY_PATH environment
variable should now be:
LD_LIBRARY_PATH=<omniORB home>/lib:<install dir>/lib

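The two settings above can be assembled mechanically. The following is a minimal sketch, not part of DIET: the function names, the host name and the installation paths are all hypothetical and only illustrate the line formats quoted above.

```python
def initref_line(hostname: str, port: int) -> str:
    """Build the name-server line of an omniORB configuration file."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"InitRef = NameService=corbaname::{hostname}:{port}"


def ld_library_path(omniorb_home: str, install_dir: str) -> str:
    """Join the two library directories named in the text with ':'."""
    return f"{omniorb_home}/lib:{install_dir}/lib"


if __name__ == "__main__":
    # Hypothetical host, port and paths, chosen only for illustration.
    print(initref_line("nameserver.example.org", 2809))
    print("LD_LIBRARY_PATH=" + ld_library_path("/opt/omniORB", "/opt/diet"))
```

The port passed to initref_line should be the one given to the -start option of omniNames.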
NB1: In order to avoid name collisions, every agent must be
assigned a different name in the name server; since they don't
have any children, SeDs do not need names assigned to them and they
don't register with the name server.

NB2: Each DIET hierarchy can use a different name server, or
multiple hierarchies can share one name server (assuming all agents
are assigned unique names). In a multi-MA environment, in order for
multiple hierarchies to be able to cooperate, it is necessary that
they all share the same name server.

config_file
Configuration file used by the agent to launch the DIET entity.

Every DIET entity requires a configuration file.

Please note that:

· comments start with '#' and finish at the end of the current
line,

· meaningful lines have the format: keyword = value, following the
format of configuration files for omniORB 4,

· for options that accept 0 or 1, 0 means no and 1 means yes, and

· keywords are case sensitive.

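The four rules above are enough to read such a file. The sketch below is an illustration only and not DIET's own parser; parse_diet_config and as_bool are hypothetical names, and the sketch assumes that a '#' never appears inside a value.

```python
def parse_diet_config(text: str) -> dict:
    """Parse 'keyword = value' lines, ignoring '#' comments and blank lines."""
    options = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # a comment runs to end of line
        if not line:
            continue
        keyword, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {lineno}: expected 'keyword = value'")
        # Keywords are case sensitive, so no case folding is applied.
        options[keyword.strip()] = value.strip()
    return options


def as_bool(value: str) -> bool:
    """Interpret a 0/1 option: 0 means no, 1 means yes."""
    if value not in ("0", "1"):
        raise ValueError(f"expected 0 or 1, got {value!r}")
    return value == "1"


if __name__ == "__main__":
    sample = "# an MA\nagentType = DIET_MASTER_AGENT\nname = MA\nuseLogService = 0\n"
    config = parse_diet_config(sample)
    print(config["agentType"], as_bool(config["useLogService"]))
```

Because keywords are kept exactly as written, a misspelt or differently capitalised keyword ends up under its own key instead of silently matching the intended one.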
Depending on the type of DIET element, different kinds of keywords
can be found. Here is a list of the possible keywords for a DIET
agent configuration file:

traceLevel
Integer value corresponding to the trace level of the DIET
agent:

· 0: DIET does not print anything.

· 1: DIET prints only warnings and errors on the standard
error output.

· 2: [default] DIET prints information on the main steps of a
call.

· 5: DIET prints information on all internal steps too.

· 10: DIET prints all the communication structures too.

· >10: (traceLevel - 10) is given to the ORB to print CORBA
messages too.

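As a worked instance of the last rule, the arithmetic can be written out as follows. This is a sketch with a hypothetical function name; returning 0 for values of 10 or less is an assumption, since the text only specifies the case above 10.

```python
def orb_trace_level(trace_level: int) -> int:
    """Return the level handed to the ORB for a given traceLevel."""
    if trace_level < 0:
        raise ValueError("traceLevel must not be negative")
    # Above 10, the excess over 10 is passed on to the ORB.
    # Assumption: at 10 or below, nothing (0) is passed on.
    return trace_level - 10 if trace_level > 10 else 0


if __name__ == "__main__":
    for level in (2, 10, 12):
        print(level, "->", orb_trace_level(level))
```

With traceLevel = 12, for example, the ORB would receive 2.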
agentType
Type of the agent. Three values are possible:

· DIET_MASTER_AGENT (or MA) for a Master Agent

· DIET_LOCAL_AGENT (or LA) for a Local Agent

· DIET_MA_DAG for an MA DAG Agent

dietPort
Integer setting the listening port of the agent. If left empty,
the ORB will get an open port from the system (if the default,
2809, is busy).

dietHostName
String setting the listening interface of the agent. If left
empty, the ORB will use the system hostname (the first one if
several are available).

name
String identifying the element. Clients and child nodes (LAs
and SeDs) must point to the same CORBA Naming Service hosting
the MA.

parentName
String identifying the parent agent.

[Remark: Only DIET Local Agents can use the parentName keyword]

fastUse
Boolean enabling/disabling the FAST module. If set to 0, all LDAP
and NWS parameters are ignored, and all requests to FAST are
disabled (when DIET is compiled with FAST). This is useful
for testing a DIET platform without having to deploy an LDAP
base or an NWS platform.

[Remark: DIET must be compiled with FAST]

ldapUse
Boolean enabling/disabling LDAP support.

[Remark: DIET must be compiled with FAST]

ldapBase
String giving the address, in the form host:port, of the LDAP
base that stores FAST-known services.

[Remark: DIET must be compiled with FAST]

ldapMask
String specifying the mask registered in the LDAP base.

[Remark: DIET must be compiled with FAST]

nwsUse
Boolean enabling/disabling NWS support.

[Remark: DIET must be compiled with FAST]

nwsNameserver
String giving the address of the NWS naming service, in the form
host:port.

[Remark: DIET must be compiled with FAST]

nwsForecaster
String representing the NWS forecast module used by FAST.

[Remark: DIET must be compiled with FAST]

useLogService
Boolean enabling/disabling the LogService for monitoring
purposes.

lsOutbuffersize
Integer setting the size of the outgoing messages buffer.

lsFlushinterval
Integer setting the flush interval for the outgoing messages
buffer.

neighbours
String listing the MAs that must be contacted to build a
federation. It is formatted as a white-space separated list of
addresses in the form host:port.

[Remark: DIET must be compiled with the Multi-MA option]

minimumNeighbours
Integer setting the minimum number of connected neighbours. If the
agent has fewer connected neighbours, it will try to establish new
connections.

[Remark: DIET must be compiled with the Multi-MA option]

maximumNeighbours
Integer setting the maximum number of connected neighbours. Beyond
this number, the agent refuses new connections.

[Remark: DIET must be compiled with the Multi-MA option]

updateLinkPeriod
Integer setting the period (in seconds) at which the agent checks
the status of its neighbours and tries to establish new
connections if their number is less than minimumNeighbours.

[Remark: DIET must be compiled with the Multi-MA option]

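The four Multi-MA keywords interact, which the following sketch tries to make concrete. It is not DIET code: all function names are hypothetical, and reading "maximum" as "refuse once the count has reached maximumNeighbours" is an assumption about the exact boundary.

```python
def parse_neighbours(value: str) -> list:
    """Split a white-space separated list of host:port addresses."""
    result = []
    for item in value.split():
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        result.append((host, int(port)))
    return result


def should_seek_connections(connected: int, minimum_neighbours: int) -> bool:
    """Checked every updateLinkPeriod seconds: too few neighbours?"""
    return connected < minimum_neighbours


def accepts_connection(connected: int, maximum_neighbours: int) -> bool:
    """Assumption: a new connection is refused once the maximum is reached."""
    return connected < maximum_neighbours


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(parse_neighbours("ma1.example.org:5000 ma2.example.org:5001"))
    print(should_seek_connections(connected=1, minimum_neighbours=2))
    print(accepts_connection(connected=5, maximum_neighbours=5))
```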
bindServicePort
Integer defining the port used by the MA to share its IOR.

[Remark: Option used only by MAs]

useConcJobLimit
Boolean enabling/disabling the SeD restriction on concurrent
solves. This should be used in conjunction with maxConcJobs.

[Remark: Option used only by SeDs]

maxConcJobs
Integer setting the maximum number of jobs running at once.
This should be used in conjunction with useConcJobLimit.

[Remark: Option used only by SeDs]

locationID
String used for alternative transfer cost prediction in Dagda.

[Remark: Option used only by SeDs]

MADAGNAME
String corresponding to the name of the MADAG agent.

[Remark: DIET must be compiled with the workflow option]

[Remark: Option used only by clients]

schedulerModule
Path to the scheduler library module containing the scheduler
implementation.

[Remark: DIET must be compiled with the User Scheduling option]

[Remark: Option used only by agents]

moduleConfigFile
String corresponding to an optional configuration file for the
module.

[Remark: DIET must be compiled with the User Scheduling option]

[Remark: Option used only by agents]

batchName
String corresponding to the name of the queue where the job will
be submitted.

[Remark: DIET must be compiled with the Batch option]

[Remark: Option used only by SeDs]

pathToNFS
Path to the NFS directory where you have read/write permissions.

[Remark: DIET must be compiled with the Batch option]

[Remark: Option used only by SeDs]

pathToTmp
Path to the temporary directory where you have read/write
permissions.

[Remark: DIET must be compiled with the Batch option]

[Remark: Option used only by SeDs]

internOARbatchQueueName
String that is only useful when using the CoRI batch features
with OAR 1.6.

[Remark: DIET must be compiled with the Batch option]

[Remark: Option used only by SeDs]

initRequestID
Integer setting the starting value of the MA request ID.

[Remark: Option used only by MAs]

maxMsgSize
Integer setting the maximum size of CORBA messages sent by
Dagda. By default, it is the same as the omniORB giopMaxMsgSize
value.

maxDiskSpace
Integer setting the maximum disk space available to Dagda for
storing data. When set to 0, Dagda ignores any disk quota. By
default, it is the disk space available on the partition set by
storageDirectory.

maxMemSpace
Integer setting the maximum memory available to Dagda. When set
to 0, Dagda ignores any memory usage limitation. By default,
there is no limitation.

cacheAlgorithm
String defining the cache replacement algorithm used when Dagda
needs more memory for storing a piece of data. Possible values
are: LRU, LFU, FIFO. By default, there is no cache replacement
algorithm and Dagda never overwrites data.

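The three cacheAlgorithm values name standard replacement policies. The sketch below shows how each one would pick the item to evict; it uses the textbook definitions of LRU, LFU and FIFO and is not taken from Dagda's implementation. The Entry record and pick_victim function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    data_id: str
    stored_at: int   # logical time at which the item was stored
    last_used: int   # logical time of the most recent access
    use_count: int   # number of accesses so far


def pick_victim(entries, algorithm):
    """Return the entry to evict, or None if nothing may be overwritten."""
    if algorithm is None or not entries:
        return None  # no replacement algorithm: data is never overwritten
    keys = {
        "LRU": lambda e: e.last_used,   # least recently used
        "LFU": lambda e: e.use_count,   # least frequently used
        "FIFO": lambda e: e.stored_at,  # oldest stored item first
    }
    if algorithm not in keys:
        raise ValueError(f"unknown cacheAlgorithm: {algorithm!r}")
    return min(entries, key=keys[algorithm])


if __name__ == "__main__":
    cache = [
        Entry("a", stored_at=1, last_used=9, use_count=5),
        Entry("b", stored_at=2, last_used=3, use_count=7),
        Entry("c", stored_at=3, last_used=8, use_count=1),
    ]
    for name in ("LRU", "LFU", "FIFO"):
        print(name, pick_victim(cache, name).data_id)
```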
shareFiles
Boolean enabling/disabling Dagda file sharing with its children.
Requires that the path is accessible by the children (e.g. an NFS
partition shared by parent and children). By default, there is no
file sharing.

dataBackupFile
Path to the backup file used by Dagda on user request
(checkpointing). By default, checkpointing is disabled.

[Remark: Option used by agents and SeDs]

restoreOnStart
Boolean defining whether Dagda must load the file set by
dataBackupFile at startup and restore all data recorded during the
last checkpointing event. Disabled by default.

[Remark: Option used by agents and SeDs]

storageDirectory
String defining the directory where Dagda will store data files.
By default, /tmp is used.

Specific options setting the scheduling policy used by the client
whenever it submits a request:

· BURST_REQUEST: round robin on the available SeDs

· BURST_LIMIT: only allows a certain number of requests per SeD in
parallel; the limit can be set with "void
setAllowedReqPerSeD(unsigned ix)"

[Remark: DIET must be compiled with the Custom Client Scheduling
(CCS) option]

[Remark: Option used by clients]

clientMaxNbSeD
Integer value representing the maximum number of SeDs the client
should receive.

[Remark: Option used by clients]

DIET needs some variables to be defined in order for the agent to be
able to find all the mandatory libraries and the CORBA naming service.

LD_LIBRARY_PATH
This environment variable must contain the path to the omniORB
libraries.

OMNIORB_CONFIG
Path to the CORBA configuration file where the reference to the
omniORB naming service is written.

Here are examples of configuration files for the Master Agent and
Local Agents.

· Configuration file for the Master Agent:

# file MA_example.cfg, configuration file for an MA
agentType = DIET_MASTER_AGENT
name = MA
#traceLevel = 2 # default
#dietPort = <port> # not needed
#dietHostname = <hostname|IP>
useLogService = 0 # default
lsOutbuffersize = 0 # default
lsFlushinterval = 10000 # default

· Configuration file for the Local Agent:

# file LA_example.cfg, configuration file for an LA
agentType = DIET_LOCAL_AGENT
name = LA
useLogService = 0 # default
lsOutbuffersize = 0 # default
lsFlushinterval = 10000 # default

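The examples above can also be generated and cross-checked by a small script. The sketch below is an illustration with hypothetical helper names. It additionally sets parentName = MA for the Local Agent, which the example above does not show; treating parentName as required for every LA is an assumption made for this sketch.

```python
def render_config(options: dict) -> str:
    """Render options as 'keyword = value' lines, one per line."""
    return "".join(f"{key} = {value}\n" for key, value in options.items())


def check_hierarchy(agents: list) -> None:
    """Raise ValueError if names collide or an LA has no known parent."""
    names = [agent["name"] for agent in agents]
    if len(names) != len(set(names)):
        raise ValueError("every agent needs a different name in the name server")
    for agent in agents:
        if agent["agentType"] == "DIET_LOCAL_AGENT":
            parent = agent.get("parentName")
            # Assumption: each LA names an existing agent other than itself.
            if parent not in names or parent == agent["name"]:
                raise ValueError(f"LA {agent['name']!r} has no valid parentName")


if __name__ == "__main__":
    ma = {"agentType": "DIET_MASTER_AGENT", "name": "MA"}
    la = {"agentType": "DIET_LOCAL_AGENT", "name": "LA", "parentName": "MA"}
    check_hierarchy([ma, la])
    print(render_config(ma))
    print(render_config(la))
```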
If you find this software interesting, or if you find a bug, please
send us a mail at <diet-dev@ens-lyon.fr> with a description of the
problem, the version of the program and any information that could
help us fix it.

Copyright
(C)2010, GRAAL, INRIA Rhone-Alpes, 46 allee d'Italie, 69364 Lyon cedex
07, France, all rights reserved <diet-dev@ens-lyon.fr>

License
This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version. This program is distributed in the hope that
it will be useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
the GNU General Public License for more details. You should have
received a copy of the GNU General Public License along with this
program. If not, see <http://www.gnu.org/licenses/>.

GRAAL
INRIA Rhone-Alpes
46 allee d'Italie 69364 Lyon cedex 07, FRANCE
Email: <diet-dev@ens-lyon.fr>
www: http://graal.ens-lyon.fr/DIET

SysFera
13 avenue Albert Einstein
69100 Villeurbanne, FRANCE
Email: <contact@sysfera.com>
www: http://www.sysfera.com

omniNames(1), dietForwarder(1), maDagAgent(1)

david.loureiro@sysfera.com

License: CeCILL

DIET developers

0.1 2010-09-07 DIETAGENT(1)