Bio::AnalysisI(3)     User Contributed Perl Documentation    Bio::AnalysisI(3)

NAME

       Bio::AnalysisI - An interface to any (local or remote) analysis tool

SYNOPSIS

       This is an interface module; you do not instantiate it directly. Use
       the "Bio::Tools::Run::Analysis" module instead:

         use Bio::Tools::Run::Analysis;
         my $tool = Bio::Tools::Run::Analysis->new(@args);

DESCRIPTION

       This interface contains all public methods for accessing and
       controlling local and remote analysis tools. It is meant to be used on
       the client side.

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other
       Bioperl modules. Send your comments and suggestions preferably to the
       Bioperl mailing list.  Your participation is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and
       responsive experts will be able to look at the problem and quickly
       address it. Please include a thorough description of the problem with
       code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of
       the bugs and their resolution. Bug reports can be submitted via the
       web:

         http://bugzilla.open-bio.org/

AUTHOR

       Martin Senger (martin.senger@gmail.com)

COPYRIGHT

       Copyright (c) 2003, Martin Senger and EMBL-EBI.  All Rights Reserved.

       This module is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

DISCLAIMER

       This software is provided "as is" without warranty of any kind.

SEE ALSO

       http://www.ebi.ac.uk/Tools/webservices/soaplab/guide

APPENDIX

       This is actually the main documentation...

       If you try to call any of these methods directly on this
       "Bio::AnalysisI" object you will get a "not implemented" error message.
       You need to call them on a "Bio::Tools::Run::Analysis" object instead.

   analysis_name
        Usage   : $tool->analysis_name;
        Returns : the name of this analysis
        Args    : none

   analysis_spec
        Usage   : $tool->analysis_spec;
        Returns : a hash reference describing this analysis
        Args    : none

       The returned hash reference uses the following keys (not all of them
       are always present, and others may be present as well): "name",
       "type", "version", "supplier", "installation", "description".

       Here is an example output:

         Analysis 'edit.seqret':
               installation => EMBL-EBI
               description => Reads and writes (returns) sequences
               supplier => EMBOSS
               version => 2.6.0
               type => edit
               name => seqret

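       A minimal sketch of producing a report like the one above from the
       hash reference. The hash below is a hard-coded stand-in for the value
       returned by "analysis_spec", "format_spec" is a hypothetical helper
       (not part of BioPerl), and building the title as "type.name" is an
       assumption read off the example.

```perl
use strict;
use warnings;

# Stand-in for the hash reference returned by $tool->analysis_spec;
# the values are copied from the example output above.
my $spec = {
    name         => 'seqret',
    type         => 'edit',
    version      => '2.6.0',
    supplier     => 'EMBOSS',
    installation => 'EMBL-EBI',
    description  => 'Reads and writes (returns) sequences',
};

# Hypothetical helper: format the specification as text.
# Any key may be absent, so undefined values are skipped.
sub format_spec {
    my ($spec) = @_;
    my $title = join '.', grep { defined } @{$spec}{qw(type name)};
    my @lines = ("Analysis '$title':");
    for my $key (sort keys %$spec) {
        next unless defined $spec->{$key};
        push @lines, "      $key => $spec->{$key}";
    }
    return join("\n", @lines) . "\n";
}

print format_spec($spec);
```
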
   describe
        Usage   : $tool->describe;
        Returns : a detailed XML description of this analysis
        Args    : none

       The returned XML string contains metadata describing this analysis
       service. It also includes the metadata returned (in a form that is
       easier to use) by the methods "analysis_spec", "input_spec" and
       "result_spec".

       The DTD used for returned metadata is based on the adopted standard
       (BSA specification for analysis engine):

         <!ELEMENT DsLSRAnalysis (analysis)+>

         <!ELEMENT analysis (description?, input*, output*, extension?)>

         <!ATTLIST analysis
             type          CDATA #REQUIRED
             name          CDATA #IMPLIED
             version       CDATA #IMPLIED
             supplier      CDATA #IMPLIED
             installation  CDATA #IMPLIED>

         <!ELEMENT description ANY>
         <!ELEMENT extension ANY>

         <!ELEMENT input (default?, allowed*, extension?)>

         <!ATTLIST input
             type          CDATA #REQUIRED
             name          CDATA #REQUIRED
             mandatory     (true|false) "false">

         <!ELEMENT default (#PCDATA)>
         <!ELEMENT allowed (#PCDATA)>

         <!ELEMENT output (extension?)>

         <!ATTLIST output
             type          CDATA #REQUIRED
             name          CDATA #REQUIRED>

       However, the DTD may be extended by provider-specific metadata. For
       example, the EBI experimental SOAP-based service on top of EMBOSS uses
       a DTD explained at "http://www.ebi.ac.uk/~senger/applab".

   input_spec
        Usage   : $tool->input_spec;
        Returns : an array reference with hashes as elements
        Args    : none

       The analysis input data are named, and can also be associated with a
       default value, with allowed values and with a few other attributes. The
       names are important for feeding the service with the input data (the
       inputs are given to the methods "create_job", "run", and/or "wait_for"
       as name/value pairs).

       Here is a (slightly shortened) example of an input specification:

        $input_spec = [
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'sequence_usa'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'sequence_direct_data'
                 },
                 {
                   'mandatory' => 'false',
                   'allowed_values' => [
                                         'gcg',
                                         'gcg8',
                                         ...
                                         'raw'
                                       ],
                   'type' => 'String',
                   'name' => 'sformat'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'sbegin'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'send'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'sprotein'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'snucleotide'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'sreverse'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'slower'
                 },
                 {
                   'mandatory' => 'false',
                   'type' => 'String',
                   'name' => 'supper'
                 },
                 {
                   'mandatory' => 'false',
                   'default' => 'false',
                   'type' => 'String',
                   'name' => 'firstonly'
                 },
                 {
                   'mandatory' => 'false',
                   'default' => 'fasta',
                   'allowed_values' => [
                                         'gcg',
                                         'gcg8',
                                         'embl',
                                         ...
                                         'raw'
                                       ],
                   'type' => 'String',
                   'name' => 'osformat'
                 }
               ];

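       Because the specification is plain Perl data, it can be used to check
       inputs on the client before a job is created. The following is only a
       sketch: "check_inputs" is a hypothetical helper (not part of BioPerl),
       and the specification is a stand-in that keeps just the allowed values
       visible in the shortened example above.

```perl
use strict;
use warnings;

# Abbreviated stand-in for the array reference returned by $tool->input_spec.
my $input_spec = [
    { mandatory => 'false', type => 'String', name => 'sequence_usa' },
    { mandatory => 'false', type => 'String', name => 'sformat',
      allowed_values => [ 'gcg', 'gcg8', 'raw' ] },   # list is abbreviated
    { mandatory => 'false', type => 'String', name => 'firstonly',
      default => 'false' },
];

# Hypothetical helper: return a list of problems found in a hash of
# name => value inputs, judged against the given specification.
sub check_inputs {
    my ($spec, $inputs) = @_;
    my %by_name = map { ( $_->{name} => $_ ) } @$spec;
    my @problems;
    for my $name (sort keys %$inputs) {
        my $entry = $by_name{$name};
        if (!$entry) {
            push @problems, "unknown input '$name'";
            next;
        }
        my $allowed = $entry->{allowed_values} or next;
        my $value   = $inputs->{$name};
        push @problems, "value '$value' not allowed for '$name'"
            unless grep { $_ eq $value } @$allowed;
    }
    for my $entry (@$spec) {
        push @problems, "missing mandatory input '$entry->{name}'"
            if ($entry->{mandatory} // 'false') eq 'true'
            && !exists $inputs->{ $entry->{name} };
    }
    return @problems;
}

print "$_\n" for check_inputs($input_spec, { sformat => 'raw', colour => 'red' });
```

       A real check would need the complete list of allowed values from the
       server, and would have to leave values starting with "@" alone, since
       those name local files rather than literal data.
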
   result_spec
        Usage   : $tool->result_spec;
        Returns : a hash reference with result names as keys
                  and result types as values
        Args    : none

       The analysis results are named and can be retrieved using their names
       by the methods "results" and "result".

       Here is an example of the result specification (again for the service
       edit.seqret):

         $result_spec = {
                 'outseq' => 'String',
                 'report' => 'String',
                 'detailed_status' => 'String'
               };

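       As a small illustration, the result names can be listed, or checked
       before results are requested. The hash is a stand-in copied from the
       example above, and "unknown_results" is a hypothetical helper, not a
       BioPerl method.

```perl
use strict;
use warnings;

# Stand-in for the hash reference returned by $tool->result_spec.
my $result_spec = {
    outseq          => 'String',
    report          => 'String',
    detailed_status => 'String',
};

# Hypothetical helper: return the requested names that the analysis
# does not advertise in its result specification.
sub unknown_results {
    my ($spec, @wanted) = @_;
    return grep { !exists $spec->{$_} } @wanted;
}

# List every advertised result with its type.
printf "%-16s %s\n", $_, $result_spec->{$_} for sort keys %$result_spec;

# 'graph' is a made-up name used only to show a failed lookup.
print "not available: $_\n" for unknown_results($result_spec, 'outseq', 'graph');
```
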
   create_job
        Usage   : $tool->create_job ( {'sequence'=>'tatat'} )
        Returns : Bio::Tools::Run::Analysis::Job
        Args    : data and parameters for this execution
                  (in various formats)

       Create an object representing a single execution of this analysis tool.

       Call this method if you wish to "stage the scene" - to create a job
       with all input data but without actually running it. This method is
       called automatically from other methods ("run" and "wait_for"), so
       usually you do not need to call it directly.

       The input data and parameters for this execution can be specified in
       various ways:

       array reference
           The array has scalar elements of the form

              name = [[@]value]

           where "name" is the name of an input data item or input parameter
           (see method "input_spec" to find which names are recognized by
           this analysis) and "value" is a value for this data/parameter. If
           "value" is missing, 1 is assumed (which is convenient for
           boolean options). If "value" starts with "@" it is treated as a
           local filename, and the contents of that file are used as the
           data/parameter value.

       hash reference
           The same as with the array reference but now there is no need to
           use an equal sign. The hash keys are input names and the hash
           values are their data. The values can again start with a "@" sign
           indicating a local filename.

       scalar
           In this case, the parameter represents a job ID obtained in some
           previous invocation - such a job already exists on the server
           side, and we are just re-creating it here using the same job ID.

           TBD: here we should allow the same by using a reference to the
           Bio::Tools::Run::Analysis::Job object.

       undef
           Finally, if the parameter is undefined, the server is asked to
           create an empty job. The input data may be added later using
           "set_data..." method(s) - see scripts/papplmaker.PLS for details.

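       The relation between the array and hash forms can be sketched as
       follows. Both helpers are hypothetical and only illustrate the
       conventions described above; this is not how the module itself
       processes its arguments, and treating "name=" with an empty value
       like a missing value is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [ 'name=value', 'flag', ... ] into
# { name => 'value', flag => 1, ... }.
sub inputs_from_array {
    my ($pairs) = @_;
    my %inputs;
    for my $item (@$pairs) {
        # Split on the first '=' only, so values may themselves contain '='.
        my ($name, $value) = split /\s*=\s*/, $item, 2;
        # A missing value means 1, which suits boolean options.
        $inputs{$name} = (defined $value && length $value) ? $value : 1;
    }
    return \%inputs;
}

# Hypothetical helper: true if a value names a local file ('@' prefix).
sub is_file_reference {
    my ($value) = @_;
    return $value =~ /^\@/ ? 1 : 0;
}

my $inputs = inputs_from_array([ 'sequence=@my.seq', 'osformat=embl', 'firstonly' ]);
for my $name (sort keys %$inputs) {
    my $kind = is_file_reference($inputs->{$name}) ? 'file' : 'literal';
    print "$name => $inputs->{$name} ($kind)\n";
}
```
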
   run
        Usage   : $tool->run ( ['sequence=@my.seq', 'osformat=embl'] )
        Returns : Bio::Tools::Run::Analysis::Job,
                  representing started job (an execution)
        Args    : the same as for create_job

       Create a job and start it, but do not wait for its completion.

   wait_for
        Usage   : $tool->wait_for ( { 'sequence' => '@my.file' } )
        Returns : Bio::Tools::Run::Analysis::Job,
                  representing finished job
        Args    : the same as for create_job

       Create a job, start it and wait for its completion.

       Note that this is a blocking method. It returns only after the executed
       job finishes, either normally or with an error.

       Usually, after this call, you ask for results of the finished job:

           $tool->wait_for (...)->results;

Module Bio::AnalysisI::JobI

       An interface to the public methods provided by
       "Bio::Tools::Run::Analysis::Job" objects.

       The "Bio::Tools::Run::Analysis::Job" objects represent a created,
       running, or finished execution of an analysis tool.

       The factory for these objects is the module "Bio::Tools::Run::Analysis",
       where the following methods return a "Bio::Tools::Run::Analysis::Job"
       object:

           create_job   (returning a prepared job)
           run          (returning a running job)
           wait_for     (returning a finished job)

   id
        Usage   : $job->id;
        Returns : this job ID
        Args    : none

       Each job (an execution) is identifiable by this unique ID, which can be
       used later to re-create the same job (in other words: to re-connect to
       the same job). It is useful in cases when a job takes a long time to
       finish and your client program does not want to wait for it within the
       same session.

   Bio::AnalysisI::JobI::run
        Usage   : $job->run
        Returns : itself
        Args    : none

       It starts a previously created job. The job must already have all input
       data filled in. This differs from the method of the same name of the
       "Bio::Tools::Run::Analysis" object, where the "run" method also creates
       a new job, allowing input data to be set.

   Bio::AnalysisI::JobI::wait_for
        Usage   : $job->wait_for
        Returns : itself
        Args    : none

       It waits until a previously started execution of this job finishes.

   terminate
        Usage   : $job->terminate
        Returns : itself
        Args    : none

       Stop the currently running job (represented by this object). This is a
       definitive stop; there is no way to resume it later.

   last_event
        Usage   : $job->last_event
        Returns : an XML string
        Args    : none

       It returns a short XML document showing what happened last with this
       job. The DTD used is:

          <!-- place for extensions -->
          <!ENTITY % event_body_template "(state_changed | heartbeat_progress | percent_progress | time_progress | step_progress)">

          <!ELEMENT analysis_event (message?, (%event_body_template;)?)>

          <!ATTLIST analysis_event
              timestamp  CDATA #IMPLIED>

          <!ELEMENT message (#PCDATA)>

          <!ELEMENT state_changed EMPTY>
          <!ENTITY % analysis_state "created | running | completed | terminated_by_request | terminated_by_error">
          <!ATTLIST state_changed
              previous_state  (%analysis_state;) "created"
              new_state       (%analysis_state;) "created">

          <!ELEMENT heartbeat_progress EMPTY>

          <!ELEMENT percent_progress EMPTY>
          <!ATTLIST percent_progress
              percentage CDATA #REQUIRED>

          <!ELEMENT time_progress EMPTY>
          <!ATTLIST time_progress
              remaining CDATA #REQUIRED>

          <!ELEMENT step_progress EMPTY>
          <!ATTLIST step_progress
              total_steps      CDATA #IMPLIED
              steps_completed CDATA #REQUIRED>

       Here is an example of what is returned after a job was created and
       started, but before it finishes (note that the example uses an analysis
       'showdb' which does not need any input data):

          use Bio::Tools::Run::Analysis;
          print Bio::Tools::Run::Analysis->new (-name => 'display.showdb')
                    ->run
                    ->last_event;

       It prints:

          <?xml version = "1.0"?>
          <analysis_event>
            <message>Mar 3, 2003 5:14:46 PM (Europe/London)</message>
            <state_changed previous_state="created" new_state="running"/>
          </analysis_event>

       The same example but now after it finishes:

          use Bio::Tools::Run::Analysis;
          print Bio::Tools::Run::Analysis->new (-name => 'display.showdb')
                    ->wait_for
                    ->last_event;

       It prints:

          <?xml version = "1.0"?>
          <analysis_event>
            <message>Mar 3, 2003 5:17:14 PM (Europe/London)</message>
            <state_changed previous_state="running" new_state="completed"/>
          </analysis_event>

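       A client usually wants the state transition rather than the raw XML.
       The sketch below pulls it out of the sample event shown above.
       "state_change" is a hypothetical helper, and the regular expressions
       are an assumption that holds only for flat documents like this one; a
       real XML parser is the safer choice for anything richer.

```perl
use strict;
use warnings;

# Sample event, copied from the example output above.
my $event = <<'XML';
<?xml version = "1.0"?>
<analysis_event>
  <message>Mar 3, 2003 5:17:14 PM (Europe/London)</message>
  <state_changed previous_state="running" new_state="completed"/>
</analysis_event>
XML

# Hypothetical helper: return (previous_state, new_state), or an empty
# list if the event carries no state_changed element.
sub state_change {
    my ($xml) = @_;
    my ($attrs) = $xml =~ /<state_changed\b([^>]*)>/ or return;
    my %attr = $attrs =~ /(\w+)\s*=\s*"([^"]*)"/g;
    return ($attr{previous_state}, $attr{new_state});
}

my ($from, $to) = state_change($event);
print "job went from $from to $to\n" if defined $from && defined $to;
```

       Note that the DTD gives both attributes the default "created", so an
       attribute that is missing from the text comes back undefined from this
       helper rather than with its default value.
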
   status
        Usage   : $job->status
        Returns : string describing the job status
        Args    : none

       It returns one of the following strings (and perhaps more if a server
       implementation extended possible job states):

          CREATED
          RUNNING
          COMPLETED
          TERMINATED_BY_REQUEST
          TERMINATED_BY_ERROR

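       As an alternative to the blocking "wait_for", a client can poll
       "status" until the job reaches a terminal state. The sketch below
       assumes that the last three states listed above are the terminal ones.
       "MockJob" is a hypothetical stand-in that only mimics the "status"
       method, so that the loop can be run without a server.

```perl
use strict;
use warnings;

# Terminal states, taken from the list above; a server may define more.
my %FINISHED = map { ( $_ => 1 ) }
    qw(COMPLETED TERMINATED_BY_REQUEST TERMINATED_BY_ERROR);

# Hypothetical helper: true if the status string names a terminal state.
sub is_finished {
    my ($status) = @_;
    return $FINISHED{ uc $status } ? 1 : 0;
}

# Hypothetical stand-in for a job object: it reports RUNNING twice and
# COMPLETED from the third call on.
{
    package MockJob;
    sub new {
        my $class = shift;
        return bless { calls => 0 }, $class;
    }
    sub status {
        my $self = shift;
        return ++$self->{calls} > 2 ? 'COMPLETED' : 'RUNNING';
    }
}

my $job   = MockJob->new;
my $polls = 0;
until (is_finished($job->status)) {
    $polls++;
    # With a real job, pause here (for example with sleep) between polls.
}
print "finished after $polls unsuccessful polls\n";
```
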
   created
        Usage   : $job->created (1)
        Returns : time when this job was created
        Args    : optional

       Without any argument it returns the creation time of this job in
       seconds, counting from the beginning of the UNIX epoch (1.1.1970). With
       a true argument it returns a formatted time, using rules described in
       "Bio::Tools::Run::Analysis::Utils::format_time".

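       If you prefer to format the unformatted value yourself, core Perl is
       sufficient. This sketch uses an epoch value taken from the "times"
       example in this documentation; it does not claim to reproduce the
       rules of "format_time", and it uses "gmtime" only so that the output
       does not depend on the local time zone.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Seconds since the epoch, as returned when no argument is given.
my $created = 1046713926;

# Scalar gmtime yields a ctime-style string; strftime gives full control.
my $ctime_style = scalar gmtime $created;
my $iso_style   = strftime '%Y-%m-%d %H:%M:%S', gmtime $created;

print "$ctime_style\n";   # Mon Mar  3 17:52:06 2003
print "$iso_style\n";     # 2003-03-03 17:52:06
```
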
   started
        Usage   : $job->started (1)
        Returns : time when this job was started
        Args    : optional

       See "created".

   ended
        Usage   : $job->ended (1)
        Returns : time when this job was terminated
        Args    : optional

       See "created".

   elapsed
        Usage   : $job->elapsed
        Returns : elapsed time of the execution of the given job
                  (in milliseconds), or 0 if the job was not yet started
        Args    : none

       Note that some server implementations cannot count in milliseconds, so
       the returned time may be rounded to seconds.

   times
        Usage   : $job->times ('formatted')
        Returns : a hash reference with all time characteristics
        Args    : optional

       It is a convenience method returning a hash reference with the
       following keys:

          created
          started
          ended
          elapsed

       See "created" for remarks on time formatting.

       An example - both for formatted and unformatted times:

          use Data::Dumper;
          use Bio::Tools::Run::Analysis;
          my $rh = Bio::Tools::Run::Analysis->new(-name => 'nucleic_cpg_islands.cpgplot')
                    ->wait_for ( { 'sequence_usa' => 'embl:hsu52852' } )
                    ->times (1);
          print Data::Dumper->Dump ( [$rh], ['Times']);
          $rh = Bio::Tools::Run::Analysis->new(-name => 'nucleic_cpg_islands.cpgplot')
                    ->wait_for ( { 'sequence_usa' => 'embl:AL499624' } )
                    ->times;
          print Data::Dumper->Dump ( [$rh], ['Times']);

       It prints:

          $Times = {
                  'ended'   => 'Mon Mar  3 17:52:06 2003',
                  'started' => 'Mon Mar  3 17:52:05 2003',
                  'elapsed' => '1000',
                  'created' => 'Mon Mar  3 17:52:05 2003'
                };
          $Times = {
                  'ended'   => '1046713961',
                  'started' => '1046713926',
                  'elapsed' => '35000',
                  'created' => '1046713926'
                };

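       The units differ between the keys: "created", "started" and "ended"
       are in seconds, while "elapsed" is in milliseconds. A short sketch,
       using the unformatted values from the second dump above as a stand-in
       for the hash returned by "times", shows how they relate.

```perl
use strict;
use warnings;

# Unformatted times, copied from the second dump above.
my $times = {
    created => '1046713926',
    started => '1046713926',
    ended   => '1046713961',
    elapsed => '35000',
};

# Seconds spent waiting before the start, and seconds spent running.
my $queued  = $times->{started} - $times->{created};
my $running = $times->{ended}   - $times->{started};

# 'elapsed' is reported in milliseconds, so convert it before comparing.
printf "queued %d s, ran %d s, server reports %.1f s\n",
    $queued, $running, $times->{elapsed} / 1000;
```
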
   results
        Usage   : $job->results (...)
        Returns : one or more results created by this job
        Args    : various, see below

       This is a complex method that tries to make sense of all kinds of
       results. In particular, it helps to put binary results (such as images)
       into local files. Generally it deals with the following facts:

       ·   Each analysis tool may produce several results.

       ·   Some results may contain binary data not suitable for printing into
           a terminal window.

       ·   Some results may be split into a variable number of parts (this is
           mainly true for image results, which can consist of several *.png
           files).

       Note also that results have names to distinguish them if there are
       several. The names can be obtained by the method "result_spec".

       Here are the rules for how the method works:

           Retrieving NAMED results:
           -------------------------
            results ('name1', ...)   => return results as they are, no storing into files

            results ( { 'name1' => 'filename', ... } )  => store into 'filename', return 'filename'
            results ( 'name1=filename', ...)            => ditto

            results ( { 'name1' => '-', ... } )         => send result to the STDOUT, do not return anything
            results ( 'name1=-', ...)                   => ditto

            results ( { 'name1' => '@', ... } )  => store into a file whose name is invented by
                                                    this method, perhaps using RESULT_NAME_TEMPLATE env
            results ( 'name1=@', ...)            => ditto

            results ( { 'name1' => '?', ... } )  => find out the type of this result and then use
                                                    {'name1'=>'@'} for binary files, and a regular
                                                    return for non-binary files
            results ( 'name1=?', ...)            => ditto

           Retrieving ALL results:
           -----------------------
            results()     => return all results as they are, no storing into files

            results ('@') => return all results, as if each of them given
                             as {'name' => '@'} (see above)

            results ('?') => return all results, as if each of them given
                             as {'name' => '?'} (see above)

           Misc:
           -----
            * any result can be returned as a scalar value, or as an array reference
              (the latter is used for results consisting of several parts, such as images);
              this applies regardless of whether the returned result is the result itself
              or a filename created for the result

            * look in the documentation of the panalysis[.PLS] script for examples
              (especially how to use various templates for inventing file names)

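       The rules for the 'name=target' string form can be summarized by a
       small classifier. "describe_request" is a hypothetical helper written
       only to restate the rules above; it is not part of BioPerl and does
       not retrieve anything.

```perl
use strict;
use warnings;

# Hypothetical helper: say what a single results() argument in the
# 'name' or 'name=target' string form asks for.
sub describe_request {
    my ($request) = @_;
    my ($name, $target) = split /=/, $request, 2;
    return "$name: return as is"                         unless defined $target;
    return "$name: send to STDOUT"                       if $target eq '-';
    return "$name: store in a generated file name"       if $target eq '@';
    return "$name: generated file if binary, else value" if $target eq '?';
    return "$name: store in file '$target'";
}

print describe_request($_), "\n"
    for 'report', 'outseq=seq.embl', 'outseq=-', 'outseq=@', 'outseq=?';
```

       The hash reference form carries the same information, with the result
       name as the key and the target as the value.
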
   result
        Usage   : $job->result (...)
        Returns : the first result
        Args    : see 'results'

   remove
        Usage   : $job->remove
        Returns : 1
        Args    : none

       The job object is not actually removed at this time, but it is marked
       (by setting the "_destroy_on_exit" attribute to 1) as ready for
       deletion when the client program ends (including a request to the
       server to forget the job mirror object on the server side).



perl v5.12.0                      2010-04-29                 Bio::AnalysisI(3)