BP_PANALYSIS(1)       User Contributed Perl Documentation      BP_PANALYSIS(1)

NAME

       panalysis.PLS - An example/tutorial script showing how to access
       analysis tools

SYNOPSIS

        # run an analysis with your sequence in a local file
          ./panalysis.PLS -n 'edit.seqret' -w -r \
                          sequence_direct_data=@/home/testdata/my.seq

       See more examples in the text below.

DESCRIPTION

       A client showing how to use the "Bio::Tools::Run::Analysis" module, a
       module for executing and controlling local or remote analysis tools.
       It also calls methods from the "Bio::Tools::Run::AnalysisFactory"
       module, which provides lists of available analyses.

       Primarily, this client is meant as an example of how to use the
       analysis modules, and as a way to test them. However, because it has
       many options in order to cover as many methods as possible, it can
       also be used as a fully functional command-line client for accessing
       various analysis tools.

       Defining location and access method

       "panalysis.PLS" is independent of the access method used to reach
       remote analyses (analyses running on a different machine). The method
       used to communicate with the analyses is defined by the "-A" option,
       with the default value soap. The other possible values (not yet
       supported, but coming soon) are corba and local.

       Each access method may give a different meaning to the "-l" parameter,
       which defines the location of the services giving access to the
       analysis tools. For example, the soap access expects the URL of a Web
       Service in the "-l" option, while the corba access may expect a
       stringified Interoperable Object Reference (IOR) there.

       The default location for the soap access is
       "http://www.ebi.ac.uk/soaplab/services", which represents services
       running at the European Bioinformatics Institute on top of over a
       hundred EMBOSS analyses (and a few others).

       Available analyses

       "panalysis.PLS" can show a list of available analyses (from the given
       location, using the given access method). The "-L" option shows all
       analyses, the "-c" option lists all available categories (a category
       is a group of analyses with similar functionality or processing a
       similar type of data), and the "-C" option shows only the analyses
       available within the given category.

       Note that all these functions are provided by the module
       "Bio::Tools::Run::AnalysisFactory" (or, more precisely, by one of its
       access-dependent sub-classes). The module also has a factory method
       "create_analysis", which is not used by this script.

       Service

       A "service" is a higher level of abstraction of an analysis tool. It
       understands a well-defined interface (module "Bio::AnalysisI"), a fact
       which allows this script to be independent of the access protocol to
       the various services.

       The service name must be given by the "-n" option. This option can be
       omitted only if you invoke just the "factory" methods (described
       above).

       Each service (representing an analysis tool, a program, or an
       application) has a description, available through the options "-a"
       (analysis name, type, etc.), "-i" and "-I" (specification of the
       analysis input data, most importantly their names), and "-o" and "-O"
       (result names and their types). The option "-d" gives the most
       detailed description, in XML format.

       The service description is useful, but the main purpose of a service
       is to invoke the underlying analysis tool. For each invocation, the
       service creates a "job" and feeds it with input data. There are three
       stages: (a) create a job, (b) run the job, and (c) wait for its
       completion. Correspondingly, there are three options: "-b", which just
       creates (builds) a job; "-x", which creates a job and executes it; and
       "-w", which creates a job, runs it, and blocks the client until the
       job is finished. Only one of these options is ever used, so it does
       not make sense to give more than one; if several are given,
       "panalysis.PLS" prioritizes them in the order "-x", "-w", and "-b".

       All of these options take input data from the command line (see the
       "Input data" section below) and all of them return (internally) an
       object representing a job. There are many methods (options) dealing
       with job objects (see the "Job" section below).

       A last note in this section: the "-b" option is actually optional. A
       job is created even without this option when some input data are
       found on the command line. You have to use it, however, if you do not
       pass any data to an analysis tool (an example would be the famous
       "Classic::HelloWorld" service).
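       The way the three job options interact can be sketched in a few lines
       of Python. This is only an illustration of the behaviour documented
       above, not code taken from the script; the function name
       choose_job_action and its two arguments are hypothetical.

```python
def choose_job_action(options, has_input_data):
    """Return the job option that takes effect, or None if no job is made.

    options        -- set of option strings found on the command line
    has_input_data -- True if any name=value pairs were given
    """
    # Documented priority when several options are given: -x, -w, -b.
    for option in ("-x", "-w", "-b"):
        if option in options:
            return option
    # Documented shortcut: input data alone are enough to build a job.
    if has_input_data:
        return "-b"
    return None
```

       Under these assumptions, giving both "-b" and "-w" behaves like "-w"
       alone, and giving only input data behaves like "-b".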

       Input data

       Input data are given as name/value pairs, put on the command line with
       an equal sign between the name and the value. If the value part starts
       with an unescaped "@" character, the rest is taken as a local file
       name, and "panalysis.PLS" reads the file and uses its contents
       instead. Examples:

          panalysis.PLS -n edit.seqret -w -r \
                        sequence_direct_data='tatatctcccc' osformat=embl

          panalysis.PLS ...
                      sequence_direct_data=@/my/data/my.seq

       The names of input data come from the "input specification", which can
       be shown by the "-i" or "-I" options. With the "-I" option, the input
       specification also shows, for some inputs, a list of allowed values.
       The specification does not, however, tell which input data are
       mutually exclusive, or what other constraints apply. If there is a
       conflict, an error message is produced later (before the job starts).

       Input data are used when any of the options "-b", "-x", or "-w" is
       present and the option "-j" is not (see the "Job" section below about
       this option).
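       The name/value convention described above can be illustrated with a
       short Python sketch. It is an assumption-laden illustration, not the
       script's own code: the function name parse_input is hypothetical, and
       because the text does not say how an "@" is escaped, escaping is not
       handled here.

```python
from pathlib import Path


def parse_input(argument):
    """Split one 'name=value' argument and resolve an '@file' value."""
    # Split on the first '=' only, so the value may itself contain '='.
    name, separator, value = argument.partition("=")
    if not separator or not name:
        raise ValueError(f"expected name=value, got {argument!r}")
    if value.startswith("@"):
        # The rest of the value names a local file; use its contents.
        value = Path(value[1:]).read_text()
    return name, value
```

       With this sketch, "osformat=embl" yields the pair ("osformat",
       "embl"), while "sequence_direct_data=@/my/data/my.seq" yields the
       contents of that file as the value.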

       Job

       Each service (defined by the name given in the "-n" option) can be
       executed one or more times, with the same or, more usually, with
       different input data. Each execution creates a job object. Actually,
       the job is created even before execution (remember that option "-b"
       builds a job but does not execute it yet).

       Any job, executed or not, is persistent and can be used again later
       from another invocation of the "panalysis.PLS" script, unless you
       explicitly destroy the job using option "-z".

       A job created by the options "-b", "-x", and "-w" (and by input data)
       can be accessed in the same "panalysis.PLS" invocation using various
       job-related options; the most important are "-r" and "-R" for
       retrieving results from the finished job.

       However, you can also re-create a job created by a previous
       invocation. Assuming that you know the job ID ("panalysis.PLS" always
       prints it on the standard error when a new job is created), use
       option "-j" to re-create the job.

       Example:

          ./panalysis.PLS -n 'edit.seqret' \
                        sequence_direct_data=@/home/testdata/my.seq

       It prints:

          JOB ID: edit.seqret/bb494b:ef55e47c99:-8000

       The next invocation (asking to run the job, wait for its completion,
       and show the job status) can be:

          ./panalysis.PLS -n 'edit.seqret' \
                        -j edit.seqret/bb494b:ef55e47c99:-8000 \
                        -w -s

       Later, yet another invocation can ask for the results:

          ./panalysis.PLS -n 'edit.seqret' \
                        -j edit.seqret/bb494b:ef55e47c99:-8000 \
                        -r
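       When the invocations above are driven from another program, the job
       ID has to be picked out of the standard error text. A minimal Python
       sketch follows; it assumes the ID is printed on a line of the form
       "JOB ID: <id>" exactly as in the example above, and the function name
       extract_job_id is hypothetical.

```python
import re

# Matches a whole line such as "JOB ID: edit.seqret/bb494b:ef55e47c99:-8000".
_JOB_ID_LINE = re.compile(r"^JOB ID:\s*(\S+)\s*$", re.MULTILINE)


def extract_job_id(stderr_text):
    """Return the first job ID found in the captured text, or None."""
    match = _JOB_ID_LINE.search(stderr_text)
    return match.group(1) if match else None
```

       The returned string can then be passed unchanged to the "-j" option
       of a later invocation.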

       Here is a list of all job options (except those for results, which are
       described in the "Results" section below):

       Job execution and termination
           The same options "-x" and "-w" described above are available for
           executing a job, and for executing it and waiting for its
           completion. Here, however, the options act on the job given by
           the "-j" option and do not use any input data from the command
           line (the input data had to be supplied when the job was
           created).

           Additionally, there is a "-k" option to kill a running job.

       Job characteristics
           Other options report the job status ("-s"), the job execution
           times ("-t" and "-T"), and the last available event that happened
           to the job ("-e"). Note that event notification is not yet fully
           implemented, so this option will change in the future to reflect
           more notification capabilities.

       Results

       Of course, the most important things about analysis tools are their
       results. The results are named (in a similar way to the input data)
       and can be retrieved all in one go using option "-r" (so you do not
       actually need to know their names), or by specifying all or some
       result names using the "-R" option.

       If a result does not exist (either not yet, or because the name is
       wrong), an undef value is returned (no error message is produced).

       Some results are better saved directly into files than shown in the
       terminal window (this applies to binary results, which mostly contain
       images). "panalysis.PLS" helps to deal with binary results by saving
       them automatically to local files (actually it is the module
       "Bio::Tools::Run::Analysis" and its submodules that help with the
       binary data).

       So why not use traditional shell redirection to a file? There are two
       reasons. First, a job can produce more than one result, so they would
       be mixed together. Second, and more importantly, each result can
       consist of several parts whose number is not known in advance and
       which cannot be mixed together in one file. Again, this is typical of
       binary data returning images: an invocation can produce many images.

       The "-r" option retrieves all available results and treats them as
       described by the '?' format below.

       The "-R" option takes a comma-separated list of result names. Each
       name can be either a simple name (as specified by the "result
       specification" obtainable using the "-o" or "-O" options), or an
       equal-sign-separated name/format construct suggesting what to do with
       the result. The possibilities are:

       result-name
           Prints the given result on the standard output.

       result-name=filename
           Saves the given result into the given file.

       result-name=@
           Saves the given result into a file whose name is automatically
           invented, and guarantees that the same name will not be used in
           the next invocation.

       result-name=@template
           Saves the given result into a file whose name is given by the
           "template". The template can contain several strings which are
           substituted before it is used as the filename:

           Any '*'
               Will be replaced by a unique number

           $ANALYSIS or ${ANALYSIS}
               Will be replaced by the current analysis name

           $RESULT or ${RESULT}
               Will be replaced by the current result name

               How to tell what to do with results? Each result name

           Additionally, a template can be given as the environment variable
           "RESULT_FILENAME_TEMPLATE". This variable is used for any result
           whose format is a simple "?" or "@" character.

       result-name=?
           First decides whether the given result is binary or not. Binary
           results are then saved into local files whose names are
           automatically invented; other results are sent to the standard
           output.

       result-name=?template
           The same as above, but the filenames for binary results are
           derived from the given template (using the same rules as
           described above).
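       The forms listed above can be told apart mechanically. The following
       Python sketch shows one way to classify a "-R" value; it is an
       illustration only, the function name parse_result_spec and the mode
       labels are hypothetical, and it assumes that neither result names nor
       templates contain a comma.

```python
def parse_result_spec(spec):
    """Turn a '-R' value into a list of (name, mode, argument) tuples.

    mode is one of:
      'stdout'     -- plain result-name, print the result
      'named_file' -- result-name=filename
      'auto_file'  -- result-name=@ or result-name=@template
      'detect'     -- result-name=? or result-name=?template
    argument is the filename or template, or None if none was given.
    """
    parsed = []
    for item in spec.split(","):
        name, separator, fmt = item.partition("=")
        if not separator:
            parsed.append((name, "stdout", None))
        elif fmt.startswith("@"):
            parsed.append((name, "auto_file", fmt[1:] or None))
        elif fmt.startswith("?"):
            parsed.append((name, "detect", fmt[1:] or None))
        else:
            parsed.append((name, "named_file", fmt))
    return parsed
```

       For instance, the value "report,outseq" produces two 'stdout'
       entries, while "Graphics_in_PNG=@" produces a single 'auto_file'
       entry with no template.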

       Examples:

          -r
          -R report
          -R report,outseq
          -R Graphics_in_PNG=@
          -R 'Graphics_in_PNG=@$ANALYSIS-*-$RESULT'
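       The template substitution rules can be sketched in Python as follows.
       This is not the implementation used by "Bio::Tools::Run::Analysis".
       The function name expand_template is hypothetical, the unique number
       is passed in by the caller because the text does not say how it is
       chosen, and the sketch assumes every '*' in one template receives the
       same number.

```python
import re

# One pattern for every placeholder, so substituted text is never rescanned.
_PLACEHOLDER = re.compile(r"\$\{(ANALYSIS|RESULT)\}|\$(ANALYSIS|RESULT)|\*")


def expand_template(template, analysis, result, unique_number):
    """Expand '*', $ANALYSIS/${ANALYSIS} and $RESULT/${RESULT}."""
    values = {"ANALYSIS": analysis, "RESULT": result}

    def substitute(match):
        key = match.group(1) or match.group(2)
        # No captured key means the match was a '*'.
        return values[key] if key else str(unique_number)

    return _PLACEHOLDER.sub(substitute, template)
```

       With the analysis name "edit.seqret", the result name
       "Graphics_in_PNG" and the number 7, the template
       "$ANALYSIS-*-$RESULT" expands to "edit.seqret-7-Graphics_in_PNG".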

       Note that the result formatting will be enriched in the future by
       using existing data type parsers in Bioperl.

FEEDBACK

       Mailing Lists

       User feedback is an integral part of the evolution of this and other
       Bioperl modules. Send your comments and suggestions, preferably to the
       Bioperl mailing list. Your participation is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

       Reporting Bugs

       Report bugs to the Bioperl bug tracking system to help us keep track
       of the bugs and their resolution. Bug reports can be submitted via the
       web:

         http://bugzilla.open-bio.org/

AUTHOR

       Martin Senger (martin.senger@gmail.com)

COPYRIGHT

       Copyright (c) 2003, Martin Senger and EMBL-EBI.  All Rights Reserved.

       This script is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

DISCLAIMER

       This software is provided "as is" without warranty of any kind.

BUGS AND LIMITATIONS

       None known at the time of writing.

perl v5.8.8                       2007-04-19                   BP_PANALYSIS(1)