BP_PANALYSIS(1) User Contributed Perl Documentation BP_PANALYSIS(1)

NAME

panalysis.PLS - An example/tutorial script showing how to access analysis tools

SYNOPSIS

# run an analysis with your sequence in a local file
./panalysis.PLS -n 'edit.seqret' -w -r \
    sequence_direct_data=@/home/testdata/my.seq

See more examples in the text below.

DESCRIPTION

A client showing how to use the "Bio::Tools::Run::Analysis" module, a
module for executing and controlling local or remote analysis tools. It
also calls methods from the "Bio::Tools::Run::AnalysisFactory" module,
a module providing lists of available analyses.

Primarily, this client is meant as an example of how to use the analysis
modules, and also as a way to test them. However, because it has many
options in order to cover as many methods as possible, it can also be
used as a fully functional command-line client for accessing various
analysis tools.

Defining location and access method

"panalysis.PLS" is independent of the access method used to reach the
remote analyses (the analyses running on different machines). The
method used to communicate with the analyses is defined by the "-A"
option, with the default value soap. The other possible values (not yet
supported, but coming soon) are corba and local.

Each access method may give a different meaning to the "-l" parameter,
which defines the location of the services giving access to the analysis
tools. For example, the soap access expects the URL of a Web Service in
the "-l" option, while the corba access may expect a stringified
Interoperable Object Reference (IOR) there.

The default location for the soap access is
"http://www.ebi.ac.uk/soaplab/services", which represents services
running at the European Bioinformatics Institute on top of over a
hundred EMBOSS analyses (and a few others).

Available analyses

"panalysis.PLS" can show a list of available analyses (from the given
location, using the given access method). The "-L" option shows all
analyses, the "-c" option lists all available categories (a category is
a group of analyses with similar functionality or processing a similar
type of data), and the "-C" option shows only the analyses available
within the given category.

Note that all these functions are provided by the module
"Bio::Tools::Run::AnalysisFactory" (or, more precisely, by one of its
access-dependent sub-classes). The module also has a factory method,
"create_analysis", which is not used by this script.
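
The relationship between the three listing options can be pictured with
a small sketch. The Python code below is only an illustration and not
the script's Perl implementation: the catalogue contents and all
function names are hypothetical (only "edit.seqret" appears in this
document), and a real list would come from the
"Bio::Tools::Run::AnalysisFactory" module.

```python
from typing import Dict, List

# Hypothetical catalogue mapping a category to its analyses. Only
# "edit.seqret" is taken from this document; the rest is made up.
CATALOGUE: Dict[str, List[str]] = {
    "edit": ["edit.seqret", "edit.example_tool"],
    "display": ["display.example_plot"],
}


def available_analyses(catalogue: Dict[str, List[str]]) -> List[str]:
    """All analyses, regardless of category (what "-L" would show)."""
    return sorted(name for names in catalogue.values() for name in names)


def available_categories(catalogue: Dict[str, List[str]]) -> List[str]:
    """All category names (what "-c" would show)."""
    return sorted(catalogue)


def available_analyses_in(catalogue: Dict[str, List[str]],
                          category: str) -> List[str]:
    """Analyses of one category only (what "-C" would show)."""
    return sorted(catalogue.get(category, []))
```

An unknown category simply yields an empty list in this sketch; how the
real factory reacts to an unknown category is not stated here.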

Service

A "service" is a higher level of abstraction of an analysis tool. It
understands a well-defined interface (module "Bio::AnalysisI"), a fact
which allows this script to be independent of the access protocol to
the various services.

The service name must be given by the "-n" option. This option can be
omitted only if you invoke just the "factory" methods (described
above).

Each service (representing an analysis tool, a program, or an
application) has its description, available by using the options "-a"
(analysis name, type, etc.), "-i", "-I" (specification of analysis
input data; the most important part is their names), and "-o", "-O"
(result names and their types). The option "-d" gives the most detailed
description, in XML format.

The service description is useful, but the main purpose of a service is
to invoke the underlying analysis tool. For each invocation, the
service creates a "job" and feeds it with input data. There are three
stages: (a) create a job, (b) run the job, and (c) wait for its
completion. Correspondingly, there are three options: "-b", which just
creates (builds) a job; "-x", which creates a job and executes it; and
"-w", which creates a job, runs it, and blocks the client until the job
is finished. Only one of these options is ever used, so it does not
make sense to give more than one of them; "panalysis.PLS" prioritizes
them in the order "-x", "-w", and "-b".

All of these options take input data from the command-line (see the
next section) and all of them return (internally) an object
representing a job. There are many methods (options) dealing with job
objects (see the section after next).

A last note in this section: the "-b" option is actually optional. A
job is created even without this option when some input data are found
on the command-line. You have to use it, however, if you do not pass
any data to an analysis tool (an example would be the famous
"Classic::HelloWorld" service).
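
The rules for "-b", "-x" and "-w" described in this section can be
summarized in a short sketch. This is a Python illustration written for
this text, not the script's own code; the function name and the action
labels are hypothetical.

```python
from typing import Optional, Sequence, Set


def choose_action(flags: Set[str], inputs: Sequence[str]) -> Optional[str]:
    """Pick the single job action implied by the command-line.

    The documented priority is "-x", then "-w", then "-b". Without any
    of them, a job is still created if input data were given.
    """
    for flag, action in (("-x", "create+run"),
                         ("-w", "create+run+wait"),
                         ("-b", "create")):
        if flag in flags:
            return action
    # "-b" is optional when some name=value input data are present.
    return "create" if inputs else None
```

For example, giving both "-w" and "-b" results in the "-w" behaviour,
and giving no option but some input data still builds a job.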

Input data

Input data are given as name/value pairs, put on the command-line with
an equal sign between the name and the value. If the value part starts
with an un-escaped character "@", it is used as a local file name:
"panalysis.PLS" reads the file and uses its contents instead. Examples:

panalysis.PLS -n edit.seqret -w -r \
    sequence_direct_data='tatatctcccc' osformat=embl

panalysis.PLS ... \
    sequence_direct_data=@/my/data/my.seq

The names of input data come from the "input specification" that can be
shown by the "-i" or "-I" options. For some inputs, the input
specification (when using option "-I") also shows a list of allowed
values. The specification, however, does not tell which input data are
mutually exclusive, or what other constraints apply. If there is a
conflict, an error message is produced later (before the job starts).

Input data are used when any of the options "-b", "-x", or "-w" is
present and the option "-j" is not present (see the next section about
this job option).
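
The handling of name/value pairs and of the "@" prefix can be sketched
as follows. This is a Python illustration under stated assumptions, not
the script's Perl code: the function name is hypothetical, and because
this text does not say how an "@" is escaped, the sketch assumes a
leading backslash.

```python
from typing import Dict, List


def parse_inputs(args: List[str]) -> Dict[str, str]:
    """Turn name=value arguments into a dictionary of input data."""
    inputs: Dict[str, str] = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {arg!r}")
        if value.startswith("@"):
            # Un-escaped "@": the rest is a local file name, and the
            # file contents become the value.
            with open(value[1:], encoding="utf-8") as handle:
                value = handle.read()
        elif value.startswith("\\@"):
            # Assumed escape convention: drop the backslash and keep
            # a literal "@" at the start of the value.
            value = value[1:]
        inputs[name] = value
    return inputs
```

Note that this sketch does no validation against the input
specification; as described above, conflicts between inputs are only
reported later, before the job starts.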

Job

Each service (defined by a name given in the "-n" option) can be
executed one or more times, with the same or, more usually, with
different input data. Each execution creates a job object. Actually,
the job is created even before execution (remember that option "-b"
builds a job but does not execute it yet).

Any job, executed or not, is persistent and can be used again later
from another invocation of the "panalysis.PLS" script, unless you
explicitly destroy the job using option "-z".

A job created by options "-b", "-x" and "-w" (and by input data) can be
accessed in the same "panalysis.PLS" invocation using various
job-related options; the most important are "-r" and "-R" for
retrieving results from the finished job.

However, you can also re-create a job created by a previous invocation.
Assuming that you know the job ID ("panalysis.PLS" always prints it on
the standard error when a new job is created), use option "-j" to
re-create the job.

Example:

./panalysis.PLS -n 'edit.seqret' \
    sequence_direct_data=@/home/testdata/my.seq

It prints:

JOB ID: edit.seqret/bb494b:ef55e47c99:-8000

The next invocation (asking to run the job, to wait for its completion,
and to show the job status) can be:

./panalysis.PLS -n 'edit.seqret' \
    -j edit.seqret/bb494b:ef55e47c99:-8000 \
    -w -s

A later invocation can then ask for results:

./panalysis.PLS -n 'edit.seqret' \
    -j edit.seqret/bb494b:ef55e47c99:-8000 \
    -r
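
When these invocations are chained from another program, the job ID has
to be picked out of the standard error text first. The following Python
sketch shows one way to do that. It assumes the line always has the
"JOB ID: ..." shape shown in the example above, and both function names
are hypothetical helpers, not part of "panalysis.PLS".

```python
import re
from typing import List, Optional

# Matches a line such as "JOB ID: edit.seqret/bb494b:ef55e47c99:-8000".
_JOB_ID_LINE = re.compile(r"^JOB ID:\s*(\S+)\s*$", re.MULTILINE)


def extract_job_id(stderr_text: str) -> Optional[str]:
    """Return the job ID found in captured standard error, if any."""
    match = _JOB_ID_LINE.search(stderr_text)
    return match.group(1) if match else None


def followup_command(service: str, job_id: str, *options: str) -> List[str]:
    """Build the argument list for a later invocation on the same job."""
    return ["./panalysis.PLS", "-n", service, "-j", job_id, *options]
```

The returned list can be passed to a process-spawning function as is,
which avoids any shell quoting problems with the job ID.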

Here is a list of all job options (except those for results, which are
described in the next section):

Job execution and termination
    These are the same options "-x" and "-w" described above, for
    executing a job and for executing it and waiting for its
    completion. Here, however, the options act on the job given by the
    "-j" option and do not use any input data from the command-line
    (the input data had to be given when the job was created).

    Additionally, there is a "-k" option to kill a running job.

Job characteristics
    Other options report the job status ("-s"), the job execution times
    ("-t" and "-T"), and the last available event that happened to the
    job ("-e"). Note that the event notification is not yet fully
    implemented, so this option will change in the future to reflect
    more notification capabilities.

Results

Of course, the most important thing about the analysis tools is their
results. The results are named (in a similar way to the input data)
and they can be retrieved all in one go using option "-r" (so you do
not actually need to know their names), or by specifying (all or some)
result names using the "-R" option.

If a result does not exist (either not yet, or because the name is
wrong), an undef value is returned (no error message is produced).

Some results are better saved directly into files than shown in the
terminal window (this applies to binary results, which mostly contain
images). "panalysis.PLS" helps to deal with binary results by saving
them automatically to local files (actually, it is the module
"Bio::Tools::Run::Analysis" and its submodules that help with the
binary data).

So why not use a traditional shell redirection to a file? There are
two reasons. First, a job can produce more than one result, so they
would be mixed together. Second, and more importantly, each result can
consist of several parts whose number is not known in advance and which
cannot be mixed together in one file. Again, this is typical for binary
data returning images: one invocation can produce many images.

The "-r" option retrieves all available results and treats them as
described by the '?' format below.

The "-R" option takes a comma-separated list of result names. Each of
the names can be either a simple name (as specified by the "result
specification" obtainable using the "-o" or "-O" options), or an
equal-sign-separated name/format construct suggesting what to do with
the result. The possibilities are:

result-name
    It prints the given result on the standard output.

result-name=filename
    It saves the given result into the given file.

result-name=@
    It saves the given result into a file whose name is automatically
    invented, and it guarantees that the same name will not be used in
    the next invocation.

result-name=@template
    It saves the given result into a file whose name is given by the
    "template". The template can contain several strings which are
    substituted before using it as the filename:

    Any '*'
        Will be replaced by a unique number

    $ANALYSIS or ${ANALYSIS}
        Will be replaced by the current analysis name

    $RESULT or ${RESULT}
        Will be replaced by the current result name

    How to tell what to do with results? Each result name

    Additionally, a template can be given as an environment variable
    "RESULT_FILENAME_TEMPLATE". Such a variable is used for any result
    having in its format a simple "?" or "@" character.

result-name=?
    It first decides whether the given result is binary or not. Then,
    the binary results are saved into local files whose names are
    automatically invented, and the other results are sent to the
    standard output.

result-name=?template
    The same as above, but the filenames for binary files are deduced
    from the given template (using the same rules as described above).
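
The template substitution described above can be sketched in a few
lines. This Python code is an illustration only, not the actual Perl
implementation: the function name is hypothetical, and a simple counter
stands in for the "unique number", whose real source is not specified
in this text.

```python
import itertools
import re
from typing import Optional

# Assumption: a process-wide counter is good enough as a unique number.
_unique_numbers = itertools.count(1)
_VARIABLE = re.compile(r"\$\{(ANALYSIS|RESULT)\}|\$(ANALYSIS|RESULT)")


def expand_template(template: str, analysis: str, result: str,
                    unique: Optional[int] = None) -> str:
    """Expand '*', $ANALYSIS/${ANALYSIS} and $RESULT/${RESULT}."""
    if unique is None:
        unique = next(_unique_numbers)
    values = {"ANALYSIS": analysis, "RESULT": result}
    # Replace '*' first, so that a '*' inside a substituted name
    # would not be touched afterwards.
    with_number = template.replace("*", str(unique))
    return _VARIABLE.sub(lambda m: values[m.group(1) or m.group(2)],
                         with_number)
```

Passing "unique" explicitly makes the result predictable, which is
convenient when trying the function out; leaving it out draws the next
number from the counter.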

Examples:

-r
-R report
-R report,outseq
-R Graphics_in_PNG=@
-R 'Graphics_in_PNG=@$ANALYSIS-*-$RESULT'
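
How a "-R" value breaks down into the individual requests listed above
can be sketched as follows. This is a hedged Python illustration, not
the script's own parser: the "ResultRequest" type, the mode labels and
the function name are all hypothetical, and the fallback to
"RESULT_FILENAME_TEMPLATE" follows the description given earlier.

```python
import os
from typing import List, Mapping, NamedTuple, Optional


class ResultRequest(NamedTuple):
    name: str               # result name
    mode: str               # "stdout", "file", "auto-file" or "detect"
    target: Optional[str]   # file name or template, if any


def parse_result_spec(spec: str,
                      env: Optional[Mapping[str, str]] = None
                      ) -> List[ResultRequest]:
    """Split a comma-separated "-R" value into per-result requests."""
    env = os.environ if env is None else env
    default_template = env.get("RESULT_FILENAME_TEMPLATE")
    requests = []
    for item in spec.split(","):
        name, sep, fmt = item.partition("=")
        if not sep:
            requests.append(ResultRequest(name, "stdout", None))
        elif fmt.startswith("@"):
            # A bare "@" falls back to the environment template.
            requests.append(
                ResultRequest(name, "auto-file", fmt[1:] or default_template))
        elif fmt.startswith("?"):
            # "?" defers the binary/text decision to retrieval time.
            requests.append(
                ResultRequest(name, "detect", fmt[1:] or default_template))
        else:
            requests.append(ResultRequest(name, "file", fmt))
    return requests
```

The sketch only classifies the requests. Deciding whether a result is
binary, and inventing file names when no template is available, would
happen later and is not covered here.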

Note that the result formatting will be enriched in the future by using
existing data type parsers in bioperl.

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to the
Bioperl mailing list. Your participation is much appreciated.

bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of
the bugs and their resolution. Bug reports can be submitted via the
web:

http://bugzilla.open-bio.org/

AUTHOR

Martin Senger (martin.senger@gmail.com)

COPYRIGHT

Copyright (c) 2003, Martin Senger and EMBL-EBI. All Rights Reserved.

This script is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

DISCLAIMER

This software is provided "as is" without warranty of any kind.

BUGS

None known at the time of writing this.

perl v5.8.8 2007-04-19 BP_PANALYSIS(1)