MADAGAGENT(1) MADAGAGENT(1)

maDagAgent - DIET grid middleware agent

maDagAgent - Main binary of the DIET Architecture for managing workflows

maDagAgent config_file [sched] [pfm] [IRD] ...

The maDagAgent is the daemon responsible for managing workflow
execution.

A DIET platform is built upon Server Daemons (SeDs). Requests are
distributed among a hierarchy of agents. The scheduler can rely on
resource availability information collected from three different
tools: NWS sensors, which are placed on every node of the hierarchy;
the application-centric performance prediction tool FAST, which relies
on NWS information; or CoRI Easy, a module based on simple system calls
and basic performance tests.

The different components of a DIET architecture are the following:

Client
    A client is an application which uses DIET to solve computational
    problems. Clients can be web pages, PSE scripts such as Matlab or
    Scilab, or native programs.

Master Agent (MA)
    An MA manages computation requests from clients. It chooses the
    best server available to handle the request based on performance
    information collected from servers. The reference of the chosen
    server is then returned to the client.

Local Agent (LA)
    An LA transmits requests between MAs and servers. LAs store a
    list of services available in their subtree. For each service,
    LAs store a list of children (either agents or servers) providing
    that service. Depending on the underlying network topology, a
    hierarchy of LAs may exist between the MA and the appropriate
    servers. One of an LA's tasks is to do partial scheduling on its
    subtree, effectively reducing the workload of its MA.

Server Daemon (SeD)
    A SeD encapsulates a computational resource. For instance, it can
    be located on the entry point of a parallel computer. A SeD stores
    a list of locally available data, available computational solvers
    and performance-related information (available memory amount or
    number of resources). During registration, a SeD declares to its
    parent agent (LA or MA) every computational problem it can solve.
    A SeD can send performance and hardware information by using the
    CoRI module, or performance predictions for some kinds of problems
    by using the FAST module.

Master Agent DAG (MA DAG)
    The Master Agent DAG (MADAG) provides DAG workflow scheduling.
    This agent serves as the entry point to the DIET hierarchy for a
    client that wants to submit a workflow. The language supported by
    the MADAG is based on XML.

DIET relies on the CORBA naming service for service discovery, allowing
every entity to interconnect. The reference to the omniORB naming
service is written in a CORBA configuration file whose path is given to
omniORB through the environment variable OMNIORB_CONFIG.

The lines concerning the name server in the omniORB configuration file
are built as follows:

    InitRef = NameService=corbaname::<hostname>:<port>

The name server port is the port given as an argument to the -start
option of omniNames. You also need to update your LD_LIBRARY_PATH to
point to <install dir>/lib, so your LD_LIBRARY_PATH environment
variable should now be:

    LD_LIBRARY_PATH=<omniORB home>/lib:<install dir>/lib

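The setup described above can be sketched in shell as follows. This is
only an illustration: the host name, port, install prefixes and the
location of the generated file are hypothetical values chosen for the
example, not defaults shipped with DIET or omniORB.

```shell
# Hypothetical values: adjust host, port and install paths to your site.
NS_HOST="localhost"          # assumed name server host
NS_PORT="2809"               # assumed port passed to omniNames -start
OMNIORB_HOME="/opt/omniORB"  # assumed omniORB install prefix
DIET_HOME="/opt/diet"        # assumed DIET install prefix

# Write a one-line omniORB configuration file pointing at the name server.
OMNIORB_CONFIG="${TMPDIR:-/tmp}/omniORB4.cfg"
printf 'InitRef = NameService=corbaname::%s:%s\n' \
    "$NS_HOST" "$NS_PORT" > "$OMNIORB_CONFIG"
export OMNIORB_CONFIG

# Put both library directories first, keeping any existing value.
LD_LIBRARY_PATH="$OMNIORB_HOME/lib:$DIET_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
```

Every DIET entity started from the same shell then inherits both
variables.
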
NB1: In order to avoid name collisions, every agent must be assigned a
different name in the name server. Since they don't have any children,
SeDs do not need names assigned to them and they don't register with
the name server.

NB2: Each DIET hierarchy can use a different name server, or multiple
hierarchies can share one name server (assuming all agents are assigned
unique names). In a multi-MA environment, multiple hierarchies can only
cooperate if they all share the same name server.

config_file
    Configuration file used by the agent to launch the DIET entity.

sched
    The policy used for scheduling workflows. This option can take
    the following values:

    · -basic (default):

    · -g_heft:

    · -g_aging_heft:

    · -fairness:

    · -srpt:

    · -fcfs: first come, first served

pfm

    · -pfm_any (default)

    · -pfm_sameservices

IRD

    · -IRD value

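As an illustration of the synopsis, the sketch below assembles a
command line from a configuration file, a scheduling policy and a pfm
option. The file name MA_DAG.cfg is a hypothetical choice, and the
script only prints the command rather than starting the daemon.

```shell
# Assumed file name; any path to a valid agent configuration file works.
CONFIG_FILE="MA_DAG.cfg"
# Option values taken from the lists above.
SCHED="-fcfs"
PFM="-pfm_any"

# The configuration file comes first, followed by the optional flags.
CMD="maDagAgent $CONFIG_FILE $SCHED $PFM"
echo "$CMD"
# To start the daemon, run the printed command in a shell where
# OMNIORB_CONFIG and LD_LIBRARY_PATH are set.
```
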
Every DIET entity requires a configuration file.

Please note that:

· comments start with '#' and finish at the end of the current line,

· meaningful lines have the format: keyword = value, following the
  format of configuration files for omniORB 4,

· for options that accept 0 or 1, 0 means no and 1 means yes, and

· keywords are case sensitive.

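The format rules above can be illustrated with a small shell helper
that reads one value from a configuration file. This is a sketch for
inspecting files by hand, not the parser used by DIET; the function
name get_value and the sample file are hypothetical, and values that
themselves contain '=' are not handled.

```shell
# Sample file, written to a temporary location for the example.
CFG="${TMPDIR:-/tmp}/diet_example.cfg"
cat > "$CFG" <<'EOF'
# full-line comment
traceLevel = 2        # trailing comment
agentType = DIET_MA_DAG
name = mad
EOF

# get_value KEY FILE: print the value of KEY, ignoring comments.
get_value() {
    sed -e 's/#.*$//' "$2" |
    awk -F'=' -v key="$1" '
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the keyword
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
            if (k == key) print v
        }'
}

get_value traceLevel "$CFG"   # prints 2
get_value tracelevel "$CFG"   # prints nothing: keywords are case sensitive
```
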
Depending on the type of DIET element, different keywords can be used.
Here is a list of the possible keywords for a DIET agent configuration
file:

traceLevel
    Integer value corresponding to the trace level for the DIET agent:

    · 0: DIET does not print anything.

    · 1: DIET prints only warnings and errors on the standard error
      output.

    · 2: [default] DIET prints information on the main steps of a
      call.

    · 5: DIET prints information on all internal steps too.

    · 10: DIET prints all the communication structures too.

    · >10: (traceLevel - 10) is given to the ORB to print CORBA
      messages too.

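The last rule is simple arithmetic, sketched below. The function name
orb_trace_level is hypothetical, and returning 0 for levels of 10 or
below is an assumption made for the example; the text above only
defines the value handed to the ORB for levels greater than 10.

```shell
# orb_trace_level LEVEL: print the value passed on to the ORB.
# Assumption: levels of 10 or below pass 0 (no CORBA messages).
orb_trace_level() {
    if [ "$1" -gt 10 ]; then
        echo $(( $1 - 10 ))
    else
        echo 0
    fi
}

orb_trace_level 2    # prints 0
orb_trace_level 25   # prints 15
```
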
agentType
    Three possible values:

    · DIET_MASTER_AGENT (or MA) for a Master Agent

    · DIET_LOCAL_AGENT (or LA) for a Local Agent

    · DIET_MA_DAG for an MA DAG Agent

dietPort
    Integer setting the listening port of the agent. If left empty,
    the ORB will get an open port from the system (if the default,
    2809, is busy).

dietHostName
    String setting the listening interface of the agent. If left
    empty, the ORB will use the system hostname (the first one if
    several are available).

name
    String identifying the element. Clients and child nodes (LAs and
    SeDs) must point to the same CORBA Naming Service that hosts the
    MA.

parentName
    String identifying the parent agent.

    [Remark: Only DIET Local Agents can use the parentName keyword]

fastUse
    Boolean enabling/disabling the FAST module. If set to 0, all LDAP
    and NWS parameters are ignored, and all requests to FAST are
    disabled (when DIET is compiled with FAST). This is useful for
    testing a DIET platform without having to deploy an LDAP base or
    an NWS platform.

    [Remark: DIET must be compiled with FAST]

ldapUse
    Boolean enabling/disabling LDAP support.

    [Remark: DIET must be compiled with FAST]

ldapBase
    String representing the address of the LDAP base storing
    FAST-known services, in the form host:port.

    [Remark: DIET must be compiled with FAST]

ldapMask
    String specifying the mask registered in the LDAP base.

    [Remark: DIET must be compiled with FAST]

nwsUse
    Boolean enabling/disabling NWS support.

    [Remark: DIET must be compiled with FAST]

nwsNameserver
    String representing the NWS naming service address in the form
    host:port.

    [Remark: DIET must be compiled with FAST]

nwsForecaster
    String representing the NWS forecast module used by FAST.

    [Remark: DIET must be compiled with FAST]

useLogService
    Boolean enabling/disabling the LogService for monitoring purposes.

lsOutbuffersize
    Integer setting the size of the outgoing messages buffer.

lsFlushinterval
    Integer setting the flush interval for the outgoing messages
    buffer.

neighbours
    String listing the MAs that must be contacted to build a
    federation. It is formatted as a whitespace-separated list of
    addresses in the form host:port.

    [Remark: DIET must be compiled with the Multi-MA option]

minimumNeighbours
    Integer setting the minimum number of connected neighbours. If the
    agent has fewer connected neighbours, it will try to establish new
    connections.

    [Remark: DIET must be compiled with the Multi-MA option]

maximumNeighbours
    Integer setting the maximum number of connected neighbours. Beyond
    this limit, the agent will refuse new connections.

    [Remark: DIET must be compiled with the Multi-MA option]

updateLinkPeriod
    Integer setting the period (in seconds) at which the agent checks
    the status of its neighbours and tries to establish new
    connections if their number is less than minimumNeighbours.

    [Remark: DIET must be compiled with the Multi-MA option]

bindServicePort
    Integer defining the port used by the MA to share its IOR.

    [Remark: Option used only by MAs]

useConcJobLimit
    Boolean enabling/disabling the SeD restriction on concurrent
    solves. This should be used in conjunction with maxConcJobs.

    [Remark: Option used only by SeDs]

maxConcJobs
    Integer setting the maximum number of jobs running at once. This
    should be used in conjunction with useConcJobLimit.

    [Remark: Option used only by SeDs]

locationID
    String used for alternative transfer cost prediction in Dagda.

    [Remark: Option used only by SeDs]

MADAGNAME
    String corresponding to the name of the MADAG agent.

    [Remark: DIET must be compiled with the workflow option]

    [Remark: Option used only by clients]

schedulerModule
    Path to the scheduler library module containing the scheduler
    implementation.

    [Remark: DIET must be compiled with the User Scheduling option]

    [Remark: Option used only by agents]

moduleConfigFile
    String corresponding to an optional configuration file for the
    module.

    [Remark: DIET must be compiled with the User Scheduling option]

    [Remark: Option used only by agents]

batchName
    String corresponding to the name of the queue where the job will
    be submitted.

    [Remark: DIET must be compiled with the Batch option]

    [Remark: Option used only by SeDs]

pathToNFS
    Path to the NFS directory where you have read/write permissions.

    [Remark: DIET must be compiled with the Batch option]

    [Remark: Option used only by SeDs]

pathToTmp
    Path to the temporary directory where you have read/write
    permissions.

    [Remark: DIET must be compiled with the Batch option]

    [Remark: Option used only by SeDs]

internOARbatchQueueName
    String only useful when using CORI batch features with OAR 1.6.

    [Remark: DIET must be compiled with the Batch option]

    [Remark: Option used only by SeDs]

initRequestID
    Integer setting the MA Request ID starting value.

    [Remark: Option used only by MAs]

maxMsgSize
    Integer setting the maximum size of CORBA messages sent by Dagda.
    By default, it is the same as the omniORB giopMaxMsgSize size.

maxDiskSpace
    Integer setting the maximum disk space available to Dagda for
    storing data. When set to 0, Dagda will ignore any disk quota. By
    default, it is the same value as the available disk space on the
    partition set by storageDirectory.

maxMemSpace
    Integer setting the maximum memory available to Dagda. When set
    to 0, Dagda will ignore any memory usage limitation. By default,
    there is no limitation.

cacheAlgorithm
    String defining the cache replacement algorithm used when Dagda
    needs more memory for storing a piece of data. Possible values
    are: LRU, LFU, FIFO. By default, there is no cache replacement
    algorithm and Dagda never overwrites data.

shareFiles
    Boolean enabling/disabling Dagda file sharing with its children.
    Requires that the path is accessible by the children (i.e. an NFS
    partition shared by parent and children). By default, there is no
    file sharing.

dataBackupFile
    Path to the backup file used by Dagda on user request
    (checkpointing). By default, checkpointing is disabled.

    [Remark: Option used by agents and SeDs]

restoreOnStart
    Boolean defining whether Dagda has to load the file set by
    dataBackupFile at startup and restore all data recorded during the
    last checkpointing event. Disabled by default.

    [Remark: Option used by agents and SeDs]

storageDirectory
    String defining the directory where Dagda will store data files.
    By default, /tmp is used.

DIET needs some variables to be defined in order for the agent to be
able to find all the mandatory libraries and the CORBA naming service.

LD_LIBRARY_PATH
    This environment variable must contain the path to the omniORB
    libraries.

OMNIORB_CONFIG
    Path to the CORBA configuration file where the reference to the
    omniORB naming service is written.

Here is an example configuration file for the MA DAG agent:

    traceLevel = 2
    agentType = DIET_MA_DAG
    name = mad
    parentName = MA1

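The example above can be written to disk and sanity-checked before the
agent is started, as in the following sketch. The file location and
the choice of agentType, name and parentName as the keywords to check
are assumptions made for illustration; this page does not state which
keywords are mandatory.

```shell
# Write the example configuration to a temporary file.
CFG="${TMPDIR:-/tmp}/MA_DAG.cfg"
cat > "$CFG" <<'EOF'
traceLevel = 2
agentType = DIET_MA_DAG
name = mad
parentName = MA1
EOF

# Collect any keyword from the (assumed) checklist that is absent.
missing=""
for key in agentType name parentName; do
    grep -q "^[[:space:]]*$key[[:space:]]*=" "$CFG" || missing="$missing $key"
done

if [ -n "$missing" ]; then
    echo "missing keywords:$missing" >&2
else
    echo "config ok: maDagAgent $CFG"
fi
```
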
If you find this software interesting, or if you find a bug, please
send us an email at <diet-dev@ens-lyon.fr> with a description of the
problem, the version of the program and any information that could
help us fix it.

Copyright
(C) 2010, GRAAL, INRIA Rhone-Alpes, 46 allee d'Italie, 69364 Lyon cedex
07, France. All rights reserved. <diet-dev@ens-lyon.fr>

License
This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version. This program is distributed in the hope that
it will be useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
the GNU General Public License for more details. You should have
received a copy of the GNU General Public License along with this
program. If not, see <http://www.gnu.org/licenses/>.

GRAAL
INRIA Rhone-Alpes
46 allee d'Italie 69364 Lyon cedex 07, FRANCE
Email: <diet-dev@ens-lyon.fr>
www: http://graal.ens-lyon.fr/DIET

SysFera
13 avenue Albert Einstein
69100 Villeurbanne, FRANCE
Email: <contact@sysfera.com>
www: http://www.sysfera.com

omniNames(1), dietAgent(1), dietForwarder(1)

benjamin.depardon@sysfera.com

License: CeCILL

DIET developers

0.1 2011-05-23 MADAGAGENT(1)