MADAGAGENT(1)                                                    MADAGAGENT(1)

NAME

       maDagAgent - DIET grid middleware agent; main binary of the DIET
       architecture for managing workflows

SYNOPSIS

          maDagAgent config_file [sched] [pfm] [IRD] ...

DESCRIPTION

       The maDagAgent is the daemon responsible for managing workflow
       execution.

DIET PLATFORM

       A DIET platform is built upon Server Daemons (SeDs). Requests are
       distributed among a hierarchy of agents. The scheduler can rely on
       resource availability information collected from three different
       tools: NWS sensors, which are placed on every node of the hierarchy;
       the application-centric performance prediction tool FAST, which relies
       on NWS information; or CoRI Easy, a module based on simple system
       calls and basic performance tests.

       The different components of a DIET architecture are the following:

       Client A client is an application which uses DIET to solve
              computational problems. Clients can be web pages, PSE scripts
              such as Matlab or Scilab, or native programs.

       Master Agent (MA)
              An MA manages computation requests from clients. It chooses the
              best server available to handle the request based on performance
              information collected from servers. The reference of the chosen
              server is then returned to the client.

       Local Agent (LA)
              An LA transmits requests between MAs and servers. LAs store a
              list of services available in their subtree. For each service,
              LAs store a list of children (either agents or servers) that
              provide it. Depending on the underlying network topology, a
              hierarchy of LAs may exist between the MA and the appropriate
              servers. One of an LA's tasks is to do partial scheduling on its
              subtree, which reduces the workload of its MA.

       Server Daemon (SeD)
              A SeD encapsulates a computational resource. For instance, it
              can be located on the entry point of a parallel computer. A SeD
              stores a list of locally available data, available computational
              solvers, and performance-related information (amount of
              available memory or number of resources). During registration, a
              SeD declares to its parent agent (LA or MA) every computational
              problem it can solve. A SeD can send performance and hardware
              information by using the CoRI module, or performance predictions
              for some kinds of problems by using the FAST module.

       Master Agent DAG (MA DAG)
              The Master Agent DAG (MADAG) provides DAG workflow scheduling.
              This agent serves as the entry point to the DIET hierarchy for a
              client that wants to submit a workflow. The language supported
              by the MADAG is based on XML.

CORBA USAGE FOR DIET

       DIET relies on the CORBA naming service for service discovery, allowing
       every entity to interconnect. The reference to the omniORB naming
       service is written in a CORBA configuration file whose path is given
       to omniORB through the environment variable OMNIORB_CONFIG.

       The line concerning the name server in the omniORB configuration file
       is built as follows:

          InitRef = NameService=corbaname::<hostname>:<port>

       The name server port is the port given as an argument to the -start
       option of omniNames. You also need to update your LD_LIBRARY_PATH to
       point to <install dir>/lib. Your LD_LIBRARY_PATH environment variable
       should then be:

          LD_LIBRARY_PATH=<omniORB home>/lib:<install dir>/lib

          NB1: In order to avoid name collisions, every agent must be assigned
          a different name in the name server. Since they don’t have any
          children, SeDs do not need names assigned to them and they don’t
          register with the name server.

          NB2: Each DIET hierarchy can use a different name server, or
          multiple hierarchies can share one name server (assuming all agents
          are assigned unique names). In a multi-MA environment, all
          hierarchies must share the same name server in order to cooperate.
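
       As an illustration only, the name server line can be generated from a
       host and a port. The sketch below is not part of DIET; the function
       names are hypothetical, and the host and port in the usage note are
       placeholder values.

```python
def init_ref_line(hostname: str, port: int) -> str:
    """Return the name server line in the format shown above."""
    return f"InitRef = NameService=corbaname::{hostname}:{port}"


def write_omniorb_config(path: str, hostname: str, port: int) -> None:
    """Write a one-line omniORB configuration file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(init_ref_line(hostname, port) + "\n")
```

       For example, write_omniorb_config("omniORB4.cfg", "localhost", 2809)
       would create a file that OMNIORB_CONFIG can then point to, assuming
       omniNames was started on that host and port.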

OPTIONS

       config_file
              Configuration file used by the agent to launch the DIET entity.

       sched  The policy used for scheduling workflows. This option can take
              the following values:

              · -basic (default)

              · -g_heft

              · -g_aging_heft

              · -fairness

              · -srpt

              · -fcfs: first come, first served

       pfm

              · -pfm_any (default)

              · -pfm_sameservices

       IRD

              · -IRD value
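
       A launcher script could check the option values before starting the
       agent. The following is a minimal sketch under stated assumptions: the
       helper build_command is hypothetical, the argument order follows the
       synopsis, and the value of -IRD is assumed to be passed as a separate
       argument.

```python
from typing import List, Optional

# Values copied from the option lists above.
SCHED_POLICIES = ("-basic", "-g_heft", "-g_aging_heft",
                  "-fairness", "-srpt", "-fcfs")
PFM_MODES = ("-pfm_any", "-pfm_sameservices")


def build_command(config_file: str,
                  sched: str = "-basic",
                  pfm: str = "-pfm_any",
                  ird: Optional[str] = None) -> List[str]:
    """Assemble an argument list in the order given by the synopsis."""
    if sched not in SCHED_POLICIES:
        raise ValueError(f"unknown scheduling policy: {sched}")
    if pfm not in PFM_MODES:
        raise ValueError(f"unknown pfm value: {pfm}")
    argv = ["maDagAgent", config_file, sched, pfm]
    if ird is not None:
        # Assumption: the value follows -IRD as its own argument.
        argv += ["-IRD", ird]
    return argv
```

       The resulting list can be handed to subprocess.run() once the
       environment described under ENVIRONMENT is in place.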

DIET CONFIGURATION FILE

       Every DIET entity requires a configuration file.

       Please note that:

       · comments start with ’#’ and finish at the end of the current line,

       · meaningful lines have the format: keyword = value, following the
         format of configuration files for omniORB 4,

       · for options that accept 0 or 1, 0 means no and 1 means yes, and

       · keywords are case sensitive.

       Depending on the type of DIET element, different keywords may be
       found. Here is a list of the possible keywords for a DIET agent
       configuration file:

       traceLevel
              Integer value corresponding to the trace level of the DIET
              agent:

              · 0: DIET does not print anything.

              · 1: DIET prints only warnings and errors on the standard
                error output.

              · 2: [default] DIET prints information on the main steps of a
                call.

              · 5: DIET prints information on all internal steps too.

              · 10: DIET prints all the communication structures too.

              · >10: (traceLevel - 10) is given to the ORB to print CORBA
                messages too.

       agentType
              Three possible values:

              · DIET_MASTER_AGENT (or MA) for a Master Agent

              · DIET_LOCAL_AGENT (or LA) for a Local Agent

              · DIET_MA_DAG for an MA DAG agent

       dietPort
              Integer setting the listening port of the agent. If left empty,
              the ORB will get an open port from the system (if the default
              port 2809 is busy).

       dietHostName
              String setting the listening interface of the agent. If left
              empty, the ORB will use the system hostname (the first one if
              several are available).

       name   String identifying the element. Clients and child nodes (LAs
              and SeDs) must point to the same CORBA Naming Service hosting
              the MA.

       parentName
              String identifying the parent agent.

              [Remark: Only DIET Local Agents can use the parentName keyword]

       fastUse
              Boolean enabling/disabling the FAST module. If set to 0, all
              LDAP and NWS parameters are ignored, and all requests to FAST
              are disabled (when DIET is compiled with FAST). This is useful
              for testing a DIET platform without having to deploy an LDAP
              base or an NWS platform.

              [Remark: DIET must be compiled with FAST]

       ldapUse
              Boolean enabling/disabling LDAP support.

              [Remark: DIET must be compiled with FAST]

       ldapBase
              String giving the address of the LDAP base that stores
              FAST-known services, in the form host:port.

              [Remark: DIET must be compiled with FAST]

       ldapMask
              String specifying the mask registered in the LDAP base.

              [Remark: DIET must be compiled with FAST]

       nwsUse Boolean enabling/disabling NWS support.

              [Remark: DIET must be compiled with FAST]

       nwsNameserver
              String giving the address of the NWS naming service, in the
              form host:port.

              [Remark: DIET must be compiled with FAST]

       nwsForecaster
              String representing the NWS forecast module used by FAST.

              [Remark: DIET must be compiled with FAST]

       useLogService
              Boolean enabling/disabling the LogService for monitoring
              purposes.

       lsOutbuffersize
              Integer setting the size of the outgoing messages buffer.

       lsFlushinterval
              Integer setting the flush interval for the outgoing messages
              buffer.

       neighbours
              String listing the MAs that must be contacted to build a
              federation. It is formatted as a whitespace-separated list of
              addresses in the form host:port.

              [Remark: DIET must be compiled with the Multi-MA option]

       minimumNeighbours
              Integer setting the minimum number of connected neighbours. If
              the agent has fewer connected neighbours, it will try to
              establish new connections.

              [Remark: DIET must be compiled with the Multi-MA option]

       maximumNeighbours
              Integer setting the maximum number of connected neighbours.
              Beyond this number, the agent will refuse new connections.

              [Remark: DIET must be compiled with the Multi-MA option]

       updateLinkPeriod
              Integer setting the period (in seconds) at which the agent
              checks the status of its neighbours and tries to establish new
              connections if their number is less than minimumNeighbours.

              [Remark: DIET must be compiled with the Multi-MA option]

       bindServicePort
              Integer defining the port used by the MA to share its IOR.

              [Remark: Option used only by MAs]

       useConcJobLimit
              Boolean enabling/disabling the SeD restriction on concurrent
              solves. This should be used in conjunction with maxConcJobs.

              [Remark: Option used only by SeDs]

       maxConcJobs
              Integer setting the maximum number of jobs running at once.
              This should be used in conjunction with useConcJobLimit.

              [Remark: Option used only by SeDs]

       locationID
              String used for alternative transfer cost prediction in Dagda.

              [Remark: Option used only by SeDs]

       MADAGNAME
              String corresponding to the name of the MADAG agent.

              [Remark: DIET must be compiled with the workflow option]

              [Remark: Option used only by clients]

       schedulerModule
              Path to the scheduler library module containing the scheduler
              implementation.

              [Remark: DIET must be compiled with the User Scheduling option]

              [Remark: Option used only by agents]

       moduleConfigFile
              String corresponding to an optional configuration file for the
              module.

              [Remark: DIET must be compiled with the User Scheduling option]

              [Remark: Option used only by agents]

       batchName
              String corresponding to the name of the queue where the job will
              be submitted.

              [Remark: DIET must be compiled with the Batch option]

              [Remark: Option used only by SeDs]

       pathToNFS
              Path to the NFS directory where you have read/write permissions.

              [Remark: DIET must be compiled with the Batch option]

              [Remark: Option used only by SeDs]

       pathToTmp
              Path to the temporary directory where you have read/write
              permissions.

              [Remark: DIET must be compiled with the Batch option]

              [Remark: Option used only by SeDs]

       internOARbatchQueueName
              String only useful when using CoRI batch features with OAR 1.6.

              [Remark: DIET must be compiled with the Batch option]

              [Remark: Option used only by SeDs]

       initRequestID
              Integer setting the MA Request ID starting value.

              [Remark: Option used only by MAs]

       maxMsgSize
              Integer setting the maximum size of CORBA messages sent by
              Dagda. By default, it is the same as the omniORB giopMaxMsgSize
              size.

       maxDiskSpace
              Integer setting the maximum disk space available to Dagda for
              storing data. When set to 0, Dagda will ignore any disk quota.
              By default, it is the available disk space on the partition set
              by storageDirectory.

       maxMemSpace
              Integer setting the maximum memory available to Dagda. When set
              to 0, Dagda will ignore any memory usage limitation. By default,
              there is no limitation.

       cacheAlgorithm
              String defining the cache replacement algorithm used when Dagda
              needs more memory for storing a piece of data. Possible values
              are: LRU, LFU, FIFO. By default, there is no cache replacement
              algorithm: Dagda never overwrites data.

       shareFiles
              Boolean enabling/disabling Dagda file sharing with its children.
              Requires that the path is accessible by the children (i.e. an
              NFS partition shared by parent and children). By default, there
              is no file sharing.

       dataBackupFile
              Path to the backup file used by Dagda on user request
              (checkpointing). By default, checkpointing is disabled.

              [Remark: Option used by agents and SeDs]

       restoreOnStart
              Boolean defining whether Dagda has to load the file set by
              dataBackupFile at startup and restore all data recorded during
              the last checkpointing event. Disabled by default.

              [Remark: Option used by agents and SeDs]

       storageDirectory
              String defining the directory where Dagda will store data files.
              By default, /tmp is used.
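
       Some of the keywords described above have values with a fixed shape
       (0 or 1, or a list of host:port addresses) that can be checked before
       an agent is started. The sketch below is only an illustration: the
       helper names are hypothetical, and the rule that minimumNeighbours
       should not exceed maximumNeighbours is an assumption, not something
       this manual states.

```python
from typing import Dict, List, Tuple


def parse_bool(value: str) -> bool:
    """Interpret a 0/1 option: 0 means no, 1 means yes."""
    if value not in ("0", "1"):
        raise ValueError(f"expected 0 or 1, got {value!r}")
    return value == "1"


def parse_neighbours(value: str) -> List[Tuple[str, int]]:
    """Split a whitespace-separated list of host:port addresses."""
    result = []
    for item in value.split():
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        result.append((host, int(port)))
    return result


def check_neighbour_limits(config: Dict[str, str]) -> None:
    """Reject a minimum above the maximum (assumed consistency rule)."""
    if "minimumNeighbours" in config and "maximumNeighbours" in config:
        low = int(config["minimumNeighbours"])
        high = int(config["maximumNeighbours"])
        if low > high:
            raise ValueError("minimumNeighbours exceeds maximumNeighbours")
```

       The host names used with parse_neighbours are up to the deployment;
       the function only checks the host:port shape of each entry.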

ENVIRONMENT

       DIET needs some variables to be defined in order for the agent to be
       able to find all the mandatory libraries and the CORBA naming service.

       LD_LIBRARY_PATH
              This environment variable must contain the path to the omniORB
              libraries.

       OMNIORB_CONFIG
              Path to the CORBA configuration file where the reference to the
              omniORB naming service is written.
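
       When the agent is started from a script, both variables can be set for
       the child process only. The following is a minimal sketch, assuming
       that the libraries live in a lib subdirectory of the omniORB and DIET
       installation directories; agent_environment is a hypothetical helper
       and all paths are placeholders.

```python
import os
from typing import Dict, Mapping, Optional


def agent_environment(omniorb_home: str,
                      install_dir: str,
                      omniorb_config: str,
                      base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with both variables set."""
    env = dict(os.environ if base is None else base)
    lib_dirs = [os.path.join(omniorb_home, "lib"),
                os.path.join(install_dir, "lib")]
    existing = env.get("LD_LIBRARY_PATH")
    if existing:
        # Keep whatever was already on the library path, after our entries.
        lib_dirs.append(existing)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(lib_dirs)
    env["OMNIORB_CONFIG"] = omniorb_config
    return env
```

       The returned dictionary can be passed as the env argument of
       subprocess.run() so that the calling shell is left unchanged.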

EXAMPLES

       Here is an example of a configuration file for the MA DAG agent.

          traceLevel = 2
          agentType = DIET_MA_DAG
          name = mad
          parentName = MA1
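
       The format rules listed under DIET CONFIGURATION FILE are simple
       enough to check a file by hand or with a few lines of code. The sketch
       below is not the parser used by DIET: parse_diet_config is a
       hypothetical helper, and it assumes that '#' never appears inside a
       value.

```python
from typing import Dict


def parse_diet_config(text: str) -> Dict[str, str]:
    """Parse 'keyword = value' lines; '#' starts a comment."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        # Drop the comment part, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        keyword, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a 'keyword = value' line: {raw_line!r}")
        # Keywords are case sensitive, so they are stored unchanged.
        settings[keyword.strip()] = value.strip()
    return settings


EXAMPLE = """\
# MA DAG agent configuration
traceLevel = 2
agentType = DIET_MA_DAG
name = mad
parentName = MA1
"""
```

       Calling parse_diet_config(EXAMPLE) yields a dictionary with the four
       keywords of the example above; all values are kept as strings.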

REPORTING BUGS

       If you find this software interesting, or if you find a bug, please
       send us an email at <diet-dev@ens-lyon.fr> with a description of the
       problem, the version of the program, and any information that could
       help us fix it.

   Copyright
       (C)2010, GRAAL, INRIA Rhone-Alpes, 46 allee d'Italie, 69364 Lyon cedex
       07, France. All rights reserved. <diet-dev@ens-lyon.fr>

   License
       This program is free software: you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation, either version 3 of the License, or (at your
       option) any later version. This program is distributed in the hope that
       it will be useful, but WITHOUT ANY WARRANTY; without even the implied
       warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
       the GNU General Public License for more details. You should have
       received a copy of the GNU General Public License along with this
       program. If not, see <http://www.gnu.org/licenses/>.

AUTHORS

          GRAAL
          INRIA Rhone-Alpes
          46 allee d'Italie 69364 Lyon cedex 07, FRANCE
          Email: <diet-dev@ens-lyon.fr>
          www: http://graal.ens-lyon.fr/DIET

          SysFera
          13 avenue Albert Einstein
          69100 Villeurbanne, FRANCE
          Email: <contact@sysfera.com>
          www: http://www.sysfera.com

SEE ALSO

       omniNames(1), dietAgent(1), dietForwarder(1)

AUTHOR

       benjamin.depardon@sysfera.com

       License: CeCILL

       DIET developers

0.1                               2011-05-23                     MADAGAGENT(1)