BP_PANALYSIS(1) User Contributed Perl Documentation BP_PANALYSIS(1)

NAME
panalysis.PLS - An example/tutorial script showing how to access analysis tools

SYNOPSIS
    # run an analysis with your sequence in a local file
    ./panalysis.PLS -n 'edit.seqret' -w -r \
        sequence_direct_data=@/home/testdata/my.seq

See more examples in the text below.

DESCRIPTION
A client showing how to use the "Bio::Tools::Run::Analysis" module, a
module for executing and controlling local or remote analysis tools.
It also calls methods from the "Bio::Tools::Run::AnalysisFactory"
module, which provides lists of available analyses.

This client is meant primarily as an example of how to use the analysis
modules, and also as a way to test them. However, because it has many
options in order to cover as many methods as possible, it can also be
used as a fully functional command-line client for accessing various
analysis tools.

Defining location and access method
"panalysis.PLS" is independent of the access method used to reach
remote analyses (analyses running on a different machine). The method
used to communicate with the analyses is defined by the "-A" option,
whose default value is soap. The other possible values (not yet
supported, but coming soon) are corba and local.

Each access method may give a different meaning to the "-l" parameter,
which defines the location of the services giving access to the
analysis tools. For example, the soap access expects the URL of a Web
Service in the "-l" option, while the corba access may expect a
stringified Interoperable Object Reference (IOR) there.

The default location for the soap access is
"http://www.ebi.ac.uk/soaplab/services", which represents services
running at the European Bioinformatics Institute on top of over a
hundred EMBOSS analyses (and a few others).

Available analyses
"panalysis.PLS" can show a list of available analyses (from the given
location, using the given access method). The "-L" option shows all
analyses, the "-c" option lists all available categories (a category is
a group of analyses with similar functionality or processing a similar
type of data), and the "-C" option shows only the analyses available
within the given category.
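
As an illustration only, the relationship between analyses and
categories can be modelled in a few lines of Python. The sketch assumes
that a full analysis name has the form category.name, as in
'edit.seqret'; this naming rule is an assumption, and the function name
is hypothetical. The real lists come from the service, through the
"Bio::Tools::Run::AnalysisFactory" module.

```python
from collections import defaultdict


def group_by_category(analysis_names):
    """Group full analysis names by the part before the first dot.

    Assumption: names look like 'category.name'. A name without a dot
    is filed under the empty category ''.
    """
    groups = defaultdict(list)
    for full_name in analysis_names:
        category, sep, _ = full_name.partition(".")
        groups[category if sep else ""].append(full_name)
    return dict(groups)
```

In this model, the keys of the returned dictionary correspond to what
"-c" lists, and the value stored under one key corresponds to what "-C"
shows for that category.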

Note that all these functions are provided by the
"Bio::Tools::Run::AnalysisFactory" module (or, more precisely, by one
of its access-dependent subclasses). The module also has a factory
method, "create_analysis", which is not used by this script.

Service
A "service" is a higher level of abstraction of an analysis tool. It
understands a well-defined interface (module "Bio::AnalysisI"), a fact
which allows this script to be independent of the access protocol used
by the various services.

The service name must be given by the "-n" option. This option can be
omitted only if you invoke just the "factory" methods (described
above).

Each service (representing an analysis tool, a program, or an
application) has a description, available through the options "-a"
(analysis name, type, etc.), "-i" and "-I" (specification of the
analysis input data; the most important parts are their names), and
"-o" and "-O" (result names and their types). The option "-d" gives the
most detailed description, in XML format.

The service description is useful, but the main purpose of a service
is to invoke the underlying analysis tool. For each invocation, the
service creates a "job" and feeds it with input data. There are three
stages: (a) create a job, (b) run the job, and (c) wait for its
completion. Correspondingly, there are three options: "-b", which just
creates (builds) a job; "-x", which creates a job and executes it; and
"-w", which creates a job, runs it, and blocks the client until the job
is finished. Only one of these options is ever used, so it does not
make sense to give more than one of them; "panalysis.PLS" prioritizes
them in the order "-x", "-w", and "-b".
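
The priority rule can be sketched in Python as follows. This is a
minimal model of the behaviour described above, not the script's actual
code; the function name and the representation of options as a set of
letters are assumptions made for the example.

```python
def select_stage(options):
    """Return the one job option that takes effect, or None.

    `options` is any collection of option letters seen on the command
    line. The order follows the documented priority: -x, then -w,
    then -b.
    """
    for flag in ("x", "w", "b"):
        if flag in options:
            return flag
    return None
```

With this model, giving both "-b" and "-w" selects "w", and giving all
three selects "x".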

All of these options take input data from the command line (see the
next section) and all of them return (internally) an object
representing a job. There are many methods (options) dealing with job
objects (see the "Job" section below).

A last note in this section: the "-b" option is actually optional. A
job is created even without it when some input data are found on the
command line. You have to use it, however, if you do not pass any data
to an analysis tool (an example would be the famous
"Classic::HelloWorld" service).

Input data
Input data are given as name/value pairs, put on the command line with
an equal sign between the name and the value. If the value part starts
with an unescaped "@" character, the rest is used as a local file name:
"panalysis.PLS" reads the file and uses its contents instead. Examples:

    panalysis.PLS -n edit.seqret -w -r \
        sequence_direct_data='tatatctcccc' osformat=embl

    panalysis.PLS ... \
        sequence_direct_data=@/my/data/my.seq
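
The name/value convention can be modelled with a short Python sketch.
It is an illustration of the rule described above, not the script's
implementation: the function name is hypothetical, and the sketch does
not model escaping of a leading "@", because the escape syntax is not
specified here.

```python
from pathlib import Path


def parse_inputs(args):
    """Turn ['name=value', ...] into a dict of input data.

    A value starting with '@' is treated as a local file name, and the
    file's contents are used as the value instead.
    """
    inputs = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {arg!r}")
        if value.startswith("@"):
            value = Path(value[1:]).read_text()
        inputs[name] = value
    return inputs
```

Only the first equal sign separates the name from the value, so a value
may itself contain equal signs.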

The names of input data come from the "input specification", which can
be shown by the "-i" or "-I" options. With the "-I" option, the input
specification also shows, for some inputs, a list of allowed values.
The specification, however, does not tell which input data are mutually
exclusive, or what other constraints apply. If there is a conflict, an
error message is produced later (before the job starts).

Input data are used when any of the options "-b", "-x", or "-w" is
present and the option "-j" is not present (see the next section about
this job option).

Job
Each service (defined by a name given in the "-n" option) can be
executed one or more times, with the same or, more usually, with
different input data. Each execution creates a job object. Actually,
the job is created even before execution (remember that the "-b" option
builds a job but does not execute it yet).

Any job, executed or not, is persistent and can be used again later
from another invocation of the "panalysis.PLS" script, unless you
explicitly destroy the job using the "-z" option.

A job created by the options "-b", "-x", or "-w" (and by input data)
can be accessed in the same "panalysis.PLS" invocation using various
job-related options, the most important being "-r" and "-R" for
retrieving results from the finished job.

However, you can also re-create a job created by a previous invocation.
Assuming that you know the job ID ("panalysis.PLS" always prints it on
the standard error when a new job is created), use the "-j" option to
re-create the job.

Example:

    ./panalysis.PLS -n 'edit.seqret' \
        sequence_direct_data=@/home/testdata/my.seq

It prints:

    JOB ID: edit.seqret/bb494b:ef55e47c99:-8000

The next invocation (asking to run the job, to wait for its completion,
and to show the job status) can be:

    ./panalysis.PLS -n 'edit.seqret' \
        -j edit.seqret/bb494b:ef55e47c99:-8000 \
        -w -s

A later invocation can then ask for the results:

    ./panalysis.PLS -n 'edit.seqret' \
        -j edit.seqret/bb494b:ef55e47c99:-8000 \
        -r
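
When invocations are chained from another program, the job ID has to be
picked out of the first invocation's standard error. A minimal Python
sketch follows, assuming the ID is reported on a line of the form
"JOB ID: ..." exactly as in the example above; the function name is
hypothetical.

```python
import re

# Matches a whole line such as "JOB ID: edit.seqret/bb494b:ef55e47c99:-8000".
JOB_ID_LINE = re.compile(r"^JOB ID:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def extract_job_id(stderr_text):
    """Return the first job ID found in captured standard error, or None."""
    match = JOB_ID_LINE.search(stderr_text)
    return match.group(1) if match else None
```

The returned string can then be passed to the "-j" option of a later
invocation.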

Here is a list of all job options (except those for results, which are
described in the next section):

Job execution and termination
    The same options "-x" and "-w" described above are available for
    executing a job, and for executing it and waiting for its
    completion. Here, however, they act on the job given by the "-j"
    option and do not use any input data from the command line (the
    input data had to be supplied when the job was created).

    Additionally, there is a "-k" option to kill a running job.

Job characteristics
    Other options report the job status ("-s"), the job execution
    times ("-t" and "-T"), and the last available event that happened
    to the job ("-e"). Note that event notification is not yet fully
    implemented, so this option will change in the future to reflect
    more notification capabilities.

Results
Of course, the most important thing about analysis tools is their
results. The results are named (in a similar way to the input data) and
they can be retrieved all in one go using the "-r" option (so you do
not actually need to know their names), or by specifying (all or some)
result names using the "-R" option.

If a result does not exist (either not yet, or because the name is
wrong), an undef value is returned (no error message is produced).

Some results are better saved directly into files than shown in the
terminal window (this applies to binary results, which mostly contain
images). "panalysis.PLS" helps to deal with binary results by saving
them automatically to local files (actually, it is the
"Bio::Tools::Run::Analysis" module and its submodules that handle the
binary data).

So why not use traditional shell redirection to a file? There are two
reasons. First, a job can produce more than one result, so they would
be mixed together. Second, and more importantly, each result can
consist of several parts whose number is not known in advance and which
cannot be mixed together in one file. Again, this is typical for binary
data such as images: a single invocation can produce many images.

The "-r" option retrieves all available results and treats them as
described by the '?' format below.

The "-R" option takes a comma-separated list of result names. Each name
can be either a simple name (as given by the "result specification"
obtainable using the "-o" or "-O" options), or an equal-sign-separated
name/format construct suggesting what to do with the result. The
possibilities are:

result-name
    Prints the given result on the standard output.

result-name=filename
    Saves the given result into the given file.

result-name=@
    Saves the given result into a file whose name is automatically
    invented, and guarantees that the same name will not be used in the
    next invocation.

result-name=@template
    Saves the given result into a file whose name is given by the
    "template". The template can contain several strings which are
    substituted before it is used as the filename:

    Any '*'
        Will be replaced by a unique number

    $ANALYSIS or ${ANALYSIS}
        Will be replaced by the current analysis name

    $RESULT or ${RESULT}
        Will be replaced by the current result name

    How to tell what to do with results? Each result name

    Additionally, a template can be given in the environment variable
    "RESULT_FILENAME_TEMPLATE". This variable is used for any result
    whose format is a simple "?" or "@" character.

result-name=?
    First decides whether the given result is binary or not. Binary
    results are then saved into local files whose names are
    automatically invented; other results are sent to the standard
    output.

result-name=?template
    The same as above, but the filenames for binary results are derived
    from the given template (using the same rules as described above).
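
The template substitutions listed above can be sketched in Python. This
only models the three documented placeholders; the function name is
hypothetical, and how the script chooses its unique number is not
described here, so the caller supplies it.

```python
def expand_template(template, analysis, result, unique):
    """Expand '*', $ANALYSIS / ${ANALYSIS} and $RESULT / ${RESULT}.

    `unique` stands in for the unique number; it is passed in by the
    caller because its source is an assumption of this sketch.
    """
    name = template.replace("*", str(unique))
    for var, value in (("ANALYSIS", analysis), ("RESULT", result)):
        # Replace the braced form first so its braces are not left behind.
        name = name.replace("${" + var + "}", value).replace("$" + var, value)
    return name
```

For example, the template $ANALYSIS-*-$RESULT with the analysis
edit.seqret, the result Graphics_in_PNG, and the number 7 expands to
edit.seqret-7-Graphics_in_PNG.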

Examples:

    -r
    -R report
    -R report,outseq
    -R Graphics_in_PNG=@
    -R 'Graphics_in_PNG=@$ANALYSIS-*-$RESULT'
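
How a "-R" value breaks down into the forms listed above can be sketched
in Python. This is an illustrative model, not the script's parser: the
function name and the action labels are hypothetical, and the sketch
assumes that templates and filenames contain no commas.

```python
def parse_result_spec(spec):
    """Split a -R value into (result_name, action, argument) triples.

    Actions (labels invented for this sketch):
      'print'    - plain name: write the result to standard output
      'save_as'  - name=filename: save into the given file
      'save'     - name=@ or name=@template: save into a generated file
      'auto'     - name=? or name=?template: save only binary results
    The argument is the filename or template, or None when absent.
    """
    parsed = []
    for item in spec.split(","):
        name, sep, fmt = item.partition("=")
        if not sep:
            parsed.append((name, "print", None))
        elif fmt.startswith("@"):
            parsed.append((name, "save", fmt[1:] or None))
        elif fmt.startswith("?"):
            parsed.append((name, "auto", fmt[1:] or None))
        else:
            parsed.append((name, "save_as", fmt))
    return parsed
```

Under this model the "-r" option behaves as if every available result
had been requested with the 'auto' action.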

Note that result formatting will be enriched in the future by using
the existing data type parsers in Bioperl.

FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to the
Bioperl mailing list. Your participation is much appreciated.

    bioperl-l@bioperl.org - General discussion
    http://bioperl.org/wiki/Mailing_lists - About the mailing lists

Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of
the bugs and their resolution. Bug reports can be submitted via the
web:

    http://bugzilla.open-bio.org/

AUTHOR
Martin Senger (martin.senger@gmail.com)

COPYRIGHT
Copyright (c) 2003, Martin Senger and EMBL-EBI. All Rights Reserved.

This script is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

DISCLAIMER
This software is provided "as is" without warranty of any kind.

BUGS
None known at the time of writing this.



perl v5.12.0 2010-04-29 BP_PANALYSIS(1)