BP_PANALYSIS(1)       User Contributed Perl Documentation      BP_PANALYSIS(1)

NAME

       panalysis.PLS - An example/tutorial script showing how to access
       analysis tools

SYNOPSIS

        # run an analysis with your sequence in a local file
          ./panalysis.PLS -n 'edit.seqret' -w -r \
                          sequence_direct_data=@/home/testdata/my.seq

        See more examples in the text below.

DESCRIPTION

       A client showing how to use the "Bio::Tools::Run::Analysis" module, a
       module for executing and controlling local or remote analysis tools.
       It also calls methods from the "Bio::Tools::Run::AnalysisFactory"
       module, which provides lists of available analyses.

       Primarily, this client is meant as an example of how to use the
       analysis modules, and also as a way to test them. However, because it
       has many options in order to cover as many methods as possible, it can
       also be used as a fully functional command-line client for accessing
       various analysis tools.

   Defining location and access method
       "panalysis.PLS" is independent of the access method to the remote
       analyses (the analyses running on different machines). The method
       used to communicate with the analyses is defined by the "-A" option,
       with the default value soap. The other possible values (not yet
       supported, but coming soon) are corba and local.

       Each access method may give a different meaning to the "-l" parameter,
       which defines the location of the services giving access to the
       analysis tools. For example, the soap access expects the URL of a Web
       Service in the "-l" option, while the corba access may expect a
       stringified Interoperable Object Reference (IOR) there.

       The default location for the soap access is
       "http://www.ebi.ac.uk/soaplab/services", which represents services
       running at the European Bioinformatics Institute on top of over a
       hundred EMBOSS analyses (and a few others).

   Available analyses
       "panalysis.PLS" can show a list of available analyses (from the given
       location, using the given access method). The "-L" option shows all
       analyses, the "-c" option lists all available categories (a category
       is a group of analyses with similar functionality or processing a
       similar type of data), and the "-C" option shows only the analyses
       available within the given category.

       Note that all these functions are provided by the module
       "Bio::Tools::Run::AnalysisFactory" (or rather, by one of its
       access-dependent sub-classes). The module also has a factory method,
       "create_analysis", which is not used by this script.

   Service
       A "service" is a higher level of abstraction of an analysis tool. It
       understands a well-defined interface (module "Bio::AnalysisI"), a fact
       which allows this script to be independent of the access protocol to
       the various services.

       The service name must be given by the "-n" option. This option can be
       omitted only if you invoke just the "factory" methods (described
       above).

       Each service (representing an analysis tool, a program, or an
       application) has its description, available by using the options "-a"
       (analysis name, type, etc.), "-i", "-I" (specification of analysis
       input data; most important are their names), and "-o", "-O" (result
       names and their types). The option "-d" gives the most detailed
       description, in XML format.

       The service description is useful, but the main purpose of a service
       is to invoke the underlying analysis tool. For each invocation, the
       service creates a "job" and feeds it with input data. There are three
       stages: (a) create a job, (b) run the job, and (c) wait for its
       completion. Correspondingly, there are three options: "-b", which just
       creates (builds) a job; "-x", which creates a job and executes it; and
       "-w", which creates a job, runs it and blocks the client until the job
       is finished. Only one of these options is ever used, so it does not
       make sense to give more than one of them ("panalysis.PLS" prioritizes
       them in the order "-x", "-w", and "-b").

       All of these options take input data from the command-line (see the
       next section) and all of them return (internally) an object
       representing a job. There are many methods (options) dealing with job
       objects (see the "Job" section below).

       A last note in this section: the "-b" option is actually optional - a
       job is created even without this option when some input data are found
       on the command-line. You have to use it, however, if you do not pass
       any data to an analysis tool (an example would be the famous
       "Classic::HelloWorld" service).

   Input data
       Input data are given as name/value pairs, put on the command-line with
       an equal sign between the name and the value. If the value part starts
       with an unescaped "@" character, it is taken as a local file name;
       "panalysis.PLS" reads the file and uses its contents instead.
       Examples:

          panalysis.PLS -n edit.seqret -w -r \
                        sequence_direct_data='tatatctcccc' osformat=embl

          panalysis.PLS ... \
                      sequence_direct_data=@/my/data/my.seq

       The names of input data come from the "input specification" that can be
       shown by the "-i" or "-I" options. The input specification (when using
       option "-I") also shows, for some inputs, a list of allowed values.
       The specification, however, does not tell which input data are
       mutually exclusive, or what other constraints apply. If there is a
       conflict, an error message is produced later (before the job starts).

       Input data are used when any of the options "-b", "-x", or "-w" is
       present and option "-j" is not present (see the next section about
       this job option).

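       The name/value convention described above can be sketched in shell.
       This is only an illustration of the rule as stated here, not the
       script's own code: the function name "resolve_input" is hypothetical,
       and because the text does not say how an "@" is escaped, the sketch
       treats every leading "@" as a file reference.

```shell
#!/bin/sh
# Hypothetical helper: split one NAME=VALUE argument and, when VALUE starts
# with "@", replace it with the contents of the named local file.
resolve_input() {
    pair=$1
    name=${pair%%=*}      # text before the first "="
    value=${pair#*=}      # text after the first "="
    case $value in
        @*) value=$(cat "${value#@}") ;;   # "@file" -> file contents
    esac
    printf '%s=%s\n' "$name" "$value"
}

# Usage: a literal value, then a value read from a temporary file.
tmp=$(mktemp)
printf 'tatatctcccc\n' > "$tmp"
resolve_input "osformat=embl"
resolve_input "sequence_direct_data=@$tmp"
rm -f "$tmp"
```

       Reading the file on the client side means the remote service only
       ever sees the data itself, never a local path.
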
   Job
       Each service (defined by a name given in the "-n" option) can be
       executed one or more times, with the same or, more usually, with
       different input data. Each execution creates a job object. Actually,
       the job is created even before execution (remember that option "-b"
       builds a job but does not execute it yet).

       Any job, executed or not, is persistent and can be used again later
       from another invocation of the "panalysis.PLS" script, unless you
       explicitly destroy the job using option "-z".

       A job created by options "-b", "-x" and "-w" (and by input data) can be
       accessed in the same "panalysis.PLS" invocation using various
       job-related options; the most important are "-r" and "-R" for
       retrieving results from the finished job.

       However, you can also re-create a job created by a previous invocation.
       Assuming that you know the job ID ("panalysis.PLS" always prints it on
       the standard error when a new job is created), use option "-j" to
       re-create the job.

       Example:

          ./panalysis.PLS -n 'edit.seqret' \
                        sequence_direct_data=@/home/testdata/my.seq

       It prints:

          JOB ID: edit.seqret/bb494b:ef55e47c99:-8000

       The next invocation (asking to run the job, to wait for its completion
       and to show the job status) can be:

          ./panalysis.PLS -n 'edit.seqret' \
                        -j edit.seqret/bb494b:ef55e47c99:-8000 \
                        -w -s

       And later, another invocation can ask for the results:

          ./panalysis.PLS -n 'edit.seqret' \
                        -j edit.seqret/bb494b:ef55e47c99:-8000 \
                        -r

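       Since the job ID is printed on the standard error, a wrapper script
       can pick it up for later invocations. A minimal sketch, assuming the
       line always carries the "JOB ID: " prefix shown in the example; the
       function name "extract_job_id" is hypothetical, and a sample line
       stands in for real output.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin and print the job ID from the
# first line of the form "JOB ID: <id>" (the format shown in the example).
extract_job_id() {
    sed -n 's/^JOB ID: //p' | head -n 1
}

# Usage with a sample line standing in for the script's standard error.
job_id=$(printf 'JOB ID: edit.seqret/bb494b:ef55e47c99:-8000\n' | extract_job_id)
echo "$job_id"
```

       With the real script, the standard error could be fed to the function
       using the usual shell redirection "2>&1 >/dev/null" before the pipe.
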
       Here is a list of all job options (except those for results, which are
       covered in the next section):

       Job execution and termination
           The same options "-x" and "-w", described above, execute a job, or
           execute it and wait for its completion. Here, however, the options
           act on the job given by the "-j" option and do not use any input
           data from the command-line (the input data had to be given when
           the job was created).

           Additionally, there is a "-k" option to kill a running job.

       Job characteristics
           Other options report the job status ("-s"), the job execution
           times ("-t" and "-T"), and the last available event that happened
           to the job ("-e"). Note that event notification is not yet fully
           implemented, so this option will change in the future to reflect
           more notification capabilities.

   Results
       Of course, the most important thing about the analysis tools is their
       results. The results are named (in a similar way to the input data)
       and they can be retrieved all in one go using option "-r" (so you do
       not actually need to know their names), or by specifying (all or some)
       result names using the "-R" option.

       If a result does not exist (either not yet, or because the name is
       wrong), an undef value is returned (no error message is produced).

       Some results are better saved directly into files instead of being
       shown in the terminal window (this applies to binary results, mostly
       containing images). "panalysis.PLS" helps to deal with binary results
       by saving them automatically to local files (actually, it is the
       module "Bio::Tools::Run::Analysis" and its submodules that help with
       the binary data).

       So why not use traditional shell redirection to a file? There are two
       reasons. First, a job can produce more than one result, so they would
       be mixed together. But mainly, each result can consist of several
       parts whose number is not known in advance and which cannot be mixed
       together in one file. Again, this is typical for binary data
       returning images - an invocation can produce many images.

       The "-r" option retrieves all available results and treats them as
       described by the '?' format below.

       The "-R" option takes a comma-separated list of result names. Each
       name can be either a simple name (as specified by the "result
       specification" obtainable using the "-o" or "-O" options), or an
       equal-sign-separated name/format construct suggesting what to do with
       the result. The possibilities are:

       result-name
           Prints the given result on the standard output.

       result-name=filename
           Saves the given result into the given file.

       result-name=@
           Saves the given result into a file whose name is automatically
           invented, and guarantees that the same name will not be used in
           the next invocation.

       result-name=@template
           Saves the given result into a file whose name is given by the
           "template". The template can contain several strings which are
           substituted before it is used as the filename:

           Any '*'
               Will be replaced by a unique number

           $ANALYSIS or ${ANALYSIS}
               Will be replaced by the current analysis name

           $RESULT or ${RESULT}
               Will be replaced by the current result name

           Additionally, a template can be given in the environment variable
           "RESULT_FILENAME_TEMPLATE". This variable is used for any result
           whose format is a simple "?" or "@" character.

       result-name=?
           First decides whether the given result is binary or not. Then,
           binary results are saved into local files whose names are
           automatically invented, and other results are sent to the
           standard output.

       result-name=?template
           The same as above, but the filenames for binary files are derived
           from the given template (using the same rules as described above).

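       The formats listed above can be told apart by the character that
       follows the equal sign. The following sketch only mirrors that list;
       it is not the script's own code, and the function name
       "describe_result_spec" and its output wording are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: describe what one "-R" item asks for, following the
# list of formats above. It only classifies the item; it saves nothing.
describe_result_spec() {
    spec=$1
    name=${spec%%=*}                 # result name before the first "="
    case $spec in
        *=*) fmt=${spec#*=} ;;       # format part after the first "="
        *)   echo "$name: print on standard output"; return ;;
    esac
    case $fmt in
        @)   echo "$name: save to an auto-named file" ;;
        @*)  echo "$name: save to a file named from template ${fmt#@}" ;;
        \?)  echo "$name: binary to an auto-named file, text to standard output" ;;
        \?*) echo "$name: binary to a file named from template ${fmt#?}, text to standard output" ;;
        *)   echo "$name: save to file $fmt" ;;
    esac
}

# Usage: split a comma-separated "-R" value and describe each item.
list='report,outseq=out.txt,Graphics_in_PNG=@'
old_ifs=$IFS
IFS=,
set -f                               # no filename globbing while splitting
for item in $list; do
    describe_result_spec "$item"
done
set +f
IFS=$old_ifs
```
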
       Examples:

          -r
          -R report
          -R report,outseq
          -R Graphics_in_PNG=@
          -R 'Graphics_in_PNG=@$ANALYSIS-*-$RESULT'

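       The template substitution rules can be illustrated with a small shell
       function. This is a sketch of the rules as listed in this section, not
       the module's implementation: the function name "expand_template" is
       hypothetical, and the unique number is simply passed in by the caller,
       since the text does not say how the real one is generated.

```shell
#!/bin/sh
# Hypothetical helper: expand a result filename template.
#   $1 template, $2 analysis name, $3 result name, $4 unique number
# Assumes the names contain no "/", "&" or "\" (special in sed replacements).
expand_template() {
    printf '%s\n' "$1" | sed \
        -e "s/\*/$4/g" \
        -e "s/[\$]{ANALYSIS}/$2/g" \
        -e "s/[\$]ANALYSIS/$2/g" \
        -e "s/[\$]{RESULT}/$3/g" \
        -e "s/[\$]RESULT/$3/g"
}

# Single quotes keep the shell from expanding "$ANALYSIS" and "*" itself.
expand_template '$ANALYSIS-*-$RESULT' edit.seqret Graphics_in_PNG 1
# prints: edit.seqret-1-Graphics_in_PNG
```

       The same quoting concern applies when a template is typed on the
       command line: unquoted, the shell would substitute its own variables
       before the script ever sees the template.
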
       Note that the result formatting will be enriched in the future by
       using existing data type parsers in bioperl.

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other
       Bioperl modules. Send your comments and suggestions preferably to the
       Bioperl mailing list.  Your participation is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of
       the bugs and their resolution. Bug reports can be submitted via the
       web:

         http://bugzilla.open-bio.org/

AUTHOR

       Martin Senger (martin.senger@gmail.com)

       Copyright (c) 2003, Martin Senger and EMBL-EBI.  All Rights Reserved.

       This script is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

DISCLAIMER

       This software is provided "as is" without warranty of any kind.

BUGS AND LIMITATIONS

       None known at the time of writing this.

perl v5.12.0                      2010-04-29                   BP_PANALYSIS(1)