exonerate-server(1)        sequence comparison server        exonerate-server(1)

exonerate-server - a sequence comparison server for exonerate

exonerate-server [ options ] <index path>

exonerate-server is a multi-threaded server for the exonerate sequence
alignment program.

It uses a set of sequences and a corresponding index file to allow fast
searching of large datasets.

First, an .esd file must be made from the sequence files. The .esd
file is an Exonerate Sequence Dataset file, and can be used to group
together any set of sequences in which each sequence has a unique
identifier. This is done with the fasta2esd utility.

    fasta2esd genome.fasta genome.esd

Next, an .esi file must be made from the .esd file. The .esi file is an
Exonerate Sequence Index file, and contains an index or set of indices
corresponding to a particular dataset. This is done with the esd2esi
utility.

    esd2esi genome.esd genome.esi

Once the .esi file has been generated, exonerate-server may be started.

    exonerate-server genome.esi

While the server is running, exonerate may be used to query the server
by replacing the target sequences on the command line with the name of
the server and its port number. The default port number for
exonerate-server is 12886.

    exonerate query.fasta localhost:12886
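The steps above can be strung together in a script. The following is a minimal sketch that only assembles the four command lines shown above. The function name `pipeline_commands` and its parameters are hypothetical, and the output file names are derived from the FASTA file name purely for illustration.

```python
from pathlib import Path


def pipeline_commands(fasta, query="query.fasta", host="localhost", port=12886):
    """Return argv lists for the dataset, index, server and query steps."""
    fasta = Path(fasta)
    esd = fasta.with_suffix(".esd")  # Exonerate Sequence Dataset
    esi = fasta.with_suffix(".esi")  # Exonerate Sequence Index
    return [
        ["fasta2esd", str(fasta), str(esd)],
        ["esd2esi", str(esd), str(esi)],
        # The server keeps running until stopped, so a real script would
        # start this step in the background before issuing the query.
        ["exonerate-server", "--port", str(port), str(esi)],
        ["exonerate", query, f"{host}:{port}"],
    ]


for argv in pipeline_commands("genome.fasta"):
    print(" ".join(argv))
```

Nothing is executed here. A caller could pass each list to `subprocess.run`, or to `subprocess.Popen` for the server step.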

Some of the command line options for exonerate-server are the same
as for the exonerate client, and these are documented in the man page
for exonerate. The other options, which are specific to
exonerate-server, are documented here.

--port <port>
    Specify the port on which the server should listen. By default,
    exonerate-server will listen on port 12886, but alternative
    ports may be specified with this option.

--input <index file>
    Specify the index file to be used when the server is started.
    This option is mandatory. The index file is a .esi file
    generated by the esd2esi utility.

--preload <boolean>
    By default, the indices contained in the .esi file and the
    sequences referenced in the corresponding .esd file are loaded
    into memory when the server is started. This is necessary to
    achieve fast performance that would otherwise be hampered by
    frequent disk accesses. This option allows the index and
    sequence preloading to be turned off, which makes the server
    run much more slowly, but gives faster startup and a smaller
    memory footprint. Turning preloading off is not advised except
    when testing or debugging the server.

--maxconnections <count>
    The server is multithreaded. This option sets the number of
    client processes which are allowed to connect to the server
    simultaneously. For good performance, it should not be set to
    more than the number of CPUs on the machine on which the server
    is running.
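The advice on --maxconnections can be applied when building the server command line. This is a sketch under assumptions: `server_argv` is a hypothetical helper, and `os.cpu_count()` is used as a stand-in for "the number of CPUs on the machine".

```python
import os


def server_argv(index_path, port=12886, maxconnections=None):
    """Build an exonerate-server command line with a capped connection count."""
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    if maxconnections is None:
        maxconnections = cpus
    # Never request more simultaneous connections than there are CPUs.
    maxconnections = max(1, min(maxconnections, cpus))
    return [
        "exonerate-server",
        "--port", str(port),
        "--maxconnections", str(maxconnections),
        index_path,
    ]


print(" ".join(server_argv("genome.esi")))
```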

--verbosity <level>
    Set the verbosity level for the server. If it is zero, the
    server will be silent, and the higher the number, the more
    messages are reported by the server about what is happening.

This section documents the communication interface between the client
and server. The interface is documented for people wishing to write
their own custom server to sit behind exonerate. For normal use of
exonerate, it is not necessary to know this.

The interface works by the client sending simple command lines and the
server sending simple reply lines over a socket. All the commands and
replies are simple lines of ASCII text, so it is possible to use telnet
as a client for testing a server.

Any command is a single line of text, but a reply may contain many
lines of text. Each reply line has the form <tag>: <message>

Any reply can include lines with the tag warning: or error:. These
warning: and error: lines are echoed by the client, and the client will
exit after receiving any error: reply.
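The reply format and the handling of warning: and error: lines can be sketched as follows. This is an illustration of the rules described above, not the exonerate client's own code. The names `ServerError`, `parse_reply_line` and `handle_reply_lines` are hypothetical.

```python
class ServerError(Exception):
    """Raised when the server sends an error: line."""


def parse_reply_line(line):
    """Split a '<tag>: <message>' reply line into (tag, message)."""
    tag, sep, message = line.rstrip("\r\n").partition(":")
    if not sep:
        raise ValueError(f"not a tagged reply line: {line!r}")
    return tag.strip(), message.strip()


def handle_reply_lines(lines):
    """Echo warnings, stop on the first error, return the remaining pairs."""
    result = []
    for line in lines:
        tag, message = parse_reply_line(line)
        if tag == "warning":
            print(f"warning: {message}")
        elif tag == "error":
            raise ServerError(message)
        else:
            result.append((tag, message))
    return result


print(parse_reply_line("version: exonerate-server 2.0.0"))
```

Only the first colon separates tag from message, so a message that itself contains a colon is kept intact.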

When the server is returning a multiline reply, the first line must
show the number of lines in the whole reply as:

    linecount: <count>

The count includes the linecount: line itself. For examples, see the
replies to the get hsps commands in the example session below.
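Reading a reply that may or may not start with a linecount: line could look like the sketch below. It assumes, as the example session suggests, that the count includes the linecount: line itself. `read_reply` is a hypothetical helper that works on any text stream, for example one obtained from a socket with `makefile("r")`.

```python
import io


def read_reply(stream):
    """Read one complete reply from a text stream and return its lines."""
    first = stream.readline()
    if not first:
        raise EOFError("server closed the connection")
    first = first.rstrip("\r\n")
    tag, _, message = first.partition(":")
    if tag != "linecount":
        return [first]  # single-line reply
    count = int(message)  # total lines, including the linecount: line
    lines = [first]
    for _ in range(count - 1):
        line = stream.readline()
        if not line:
            raise EOFError("reply ended before linecount was reached")
        lines.append(line.rstrip("\r\n"))
    return lines


# Demonstration with canned text in place of a socket.
canned = io.StringIO(
    "linecount: 3\n"
    "hspset: 12423 1 349 41\n"
    "hspset: 44900 1 356 47\n"
    "ok: query strand revcomp\n"
)
print(read_reply(canned))  # the three-line reply
print(read_reply(canned))  # the following single-line reply
```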

The client will only open a single connection to any server, although a
multithreaded server is obviously required to allow multiple clients to
connect simultaneously.
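For readers writing a custom server, the overall shape might resemble the following sketch, which uses Python's standard `socketserver` module to serve each client in its own thread. It answers only the version and exit commands. The server name, the version string and the choice to answer unknown commands with an error: line are all assumptions made for illustration.

```python
import socketserver

SERVER_NAME = "toy-server"  # hypothetical name
SERVER_VERSION = "0.1"      # hypothetical version


def handle_command(line):
    """Return the reply lines for one command, or None to close the connection."""
    command = line.strip()
    if command == "version":
        return [f"version: {SERVER_NAME} {SERVER_VERSION}"]
    if command == "exit":
        return None
    return [f"error: unsupported command {command!r}"]


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        for raw in self.rfile:  # one command per line
            reply = handle_command(raw.decode("ascii", "replace"))
            if reply is None:
                break  # no reply: just close the connection
            text = "\n".join(reply) + "\n"
            self.wfile.write(text.encode("ascii", "replace"))


class Server(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True  # one thread per connected client


def serve(host="localhost", port=12886):
    """Run the toy server until interrupted."""
    with Server((host, port), Handler) as server:
        server.serve_forever()
```

Calling `serve()` starts the listener, after which the toy server can be exercised with telnet in the same way as the real one.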

Commands and replies used in the interface.

Command: version
Reply:   version: <server name> <server version>

Command: exit
Reply:   ( no reply - server closes connection )

Command: dbinfo
Reply:   dbinfo: <type> <masked> <num_seqs> <max_seq_len> <total_seq_len>

    The dbinfo command returns information about the database
    loaded on the server. The returned fields are:

    <type>           either dna or protein
    <masked>         either softmasked or unmasked
    <num_seqs>       the number of sequences in the database
    <max_seq_len>    the length of the longest sequence in the
                     database
    <total_seq_len>  the total length of all the sequences in the
                     database
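A client might turn the dbinfo: reply into a typed record along these lines. The `DbInfo` class and `parse_dbinfo` function are hypothetical names. The field order follows the reply format given above.

```python
from dataclasses import dataclass


@dataclass
class DbInfo:
    seq_type: str       # "dna" or "protein"
    masked: str         # "softmasked" or "unmasked"
    num_seqs: int
    max_seq_len: int
    total_seq_len: int


def parse_dbinfo(line):
    """Parse a 'dbinfo: ...' reply line into a DbInfo record."""
    tag, _, message = line.partition(":")
    if tag.strip() != "dbinfo":
        raise ValueError(f"not a dbinfo reply: {line!r}")
    seq_type, masked, num_seqs, max_len, total_len = message.split()
    return DbInfo(seq_type, masked, int(num_seqs), int(max_len), int(total_len))


info = parse_dbinfo("dbinfo: dna softmasked 100000 1701 38113579")
print(info.num_seqs, info.total_seq_len)
```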

Command: lookup <eid>
Reply:   lookup: <iid>

    The lookup command is used to map an external identifier to
    an internal identifier.

Command: get info <iid>
Reply:   seqinfo: <len> <checksum> <eid> [ <def> ]

    The get info command returns information about a sequence in
    the database. The returned fields are:

    <len>       the sequence length
    <checksum>  a gcg format checksum (see below)
    <eid>       the external id (e.g. from the fasta header)
    <def>       a description line for the sequence (also
                from the fasta header); this field is
                optional and may be omitted
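Because the description field is optional and may contain spaces, a parser has to split off only the first three fields. The sketch below shows one way to do that. `parse_seqinfo` is a hypothetical name and the dictionary keys simply mirror the field names above.

```python
def parse_seqinfo(line):
    """Parse a 'seqinfo: <len> <checksum> <eid> [ <def> ]' reply line."""
    tag, _, message = line.partition(":")
    if tag.strip() != "seqinfo":
        raise ValueError(f"not a seqinfo reply: {line!r}")
    # Split at most three times so the description keeps its spaces.
    fields = message.strip().split(None, 3)
    if len(fields) < 3:
        raise ValueError(f"too few fields in seqinfo reply: {line!r}")
    return {
        "len": int(fields[0]),
        "checksum": int(fields[1]),
        "eid": fields[2],
        "def": fields[3] if len(fields) == 4 else None,
    }


reply = ("seqinfo: 62 2028 AA159529.1 zo72g05.s1 Stratagene pancreas "
         "(#937208) Homo sapiens cDNA")
print(parse_seqinfo(reply))
```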

Command: get seq <iid>
Reply:   seq: <seq>

    The get seq command returns a whole sequence on one line.

Command: get subseq <iid> <start> <len>
Reply:   subseq: <sequence>

    The get subseq command returns part of a sequence. The start
    of the sequence is position zero. For example, get subseq 0 0 10
    will return the first 10 bases of the first sequence in the
    database.
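The zero-based coordinates correspond directly to a slice. The sketch below reproduces the get subseq 88065 10 20 reply from the example session. The `subseq` helper is hypothetical, and the range check is an assumption, since the behaviour of the real server for out-of-range requests is not documented here.

```python
def subseq(sequence, start, length):
    """Return `length` symbols of `sequence` starting at zero-based `start`."""
    if start < 0 or length < 0 or start + length > len(sequence):
        raise ValueError("subsequence out of range")
    return sequence[start:start + length]


SEQ = "NAACTCATCNTTTTCTGCTGNATCCTCTTCACCAGTTTGGGGGANGGCCTGCACTTCCANAG"
print(subseq(SEQ, 10, 20))  # TTTTCTGCTGNATCCTCTTC
```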

Command: set query <seq>
Reply:   ok: <len> <checksum>

    The set query command is used to send a query sequence to the
    server. It returns the length of the sequence and a gcg
    checksum.
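The checksum calculation is not spelled out in this page. The sketch below uses the classic GCG checksum formula, in which each symbol's uppercase character code is weighted by its position modulo 57 and the sum is taken modulo 10000. Treating this as the formula used by exonerate-server is an assumption, although it does reproduce the ok: 62 2028 reply for the query in the example session.

```python
def gcg_checksum(sequence):
    """Classic GCG checksum: position-weighted sum of character codes."""
    total = 0
    for i, symbol in enumerate(sequence):
        total += ((i % 57) + 1) * ord(symbol.upper())
    return total % 10000


def set_query_reply(sequence):
    """Build the reply a server could send for 'set query <seq>'."""
    return f"ok: {len(sequence)} {gcg_checksum(sequence)}"


QUERY = "NAACTCATCNTTTTCTGCTGNATCCTCTTCACCAGTTTGGGGGANGGCCTGCACTTCCANAG"
print(set_query_reply(QUERY))  # ok: 62 2028
```

A custom server needs to return the same value as the client computes, so it is worth checking any implementation against a known reply such as this one.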

Command: revcomp <query | target>
Reply:   ok: <query | target> strand <forward | revcomp>

    The revcomp query command makes the server reverse complement
    the query. This saves the bandwidth of sending the query
    twice.

    The revcomp target command tells the server to treat the
    database as its reverse complement. The client only sends
    this command when searching a translated database, so it need
    not be implemented for most types of search.
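A custom server has to perform the reverse complement itself when it receives revcomp query. A minimal sketch follows. It maps only A, C, G, T and N in either case and leaves any other symbol unchanged, which is a simplification. The name `revcomp` is chosen to match the command and is otherwise hypothetical.

```python
# Translation table for the four bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(revcomp("AACGN"))  # NCGTT
```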

Command: set param <name> <value>
Reply:   ok: <set | ignored>

    The set param command sends parameters from the exonerate
    command line to the server. A basic server implementation may
    ignore all of these commands, but they cannot be ignored if
    optimal performance is required.

Command: get hsps
Reply:   hspset: <iid> { <query_pos> <target_pos> <length> }
Or:      hspset: empty

    The get hsps command is the main command for getting sets of
    HSPs. The server may return multiple hspsets. The returned
    fields are:

    <iid>         the internal id of the target sequence for
                  this hspset
    <query_pos>   the HSP query start position
    <target_pos>  the HSP target start position
    <length>      the HSP length

    The last three fields represent an HSP, and may be repeated
    many times on one hspset: reply line.
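The repeated triples on an hspset: line can be unpacked as in the sketch below. `parse_hspset` is a hypothetical helper. Returning None for hspset: empty is a design choice made here for illustration.

```python
def parse_hspset(line):
    """Parse an 'hspset:' reply line into (iid, [(query_pos, target_pos, length), ...])."""
    tag, _, message = line.partition(":")
    if tag.strip() != "hspset":
        raise ValueError(f"not an hspset reply: {line!r}")
    fields = message.split()
    if fields == ["empty"]:
        return None
    if len(fields) < 4 or (len(fields) - 1) % 3 != 0:
        raise ValueError(f"malformed hspset reply: {line!r}")
    iid = int(fields[0])
    numbers = [int(field) for field in fields[1:]]
    # Group the remaining numbers into (query_pos, target_pos, length) triples.
    hsps = [tuple(numbers[i:i + 3]) for i in range(0, len(numbers), 3)]
    return iid, hsps


print(parse_hspset("hspset: 61781 1 358 41 36 392 26"))
```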

A simple example client-server dialog.

    % telnet localhost 12886
    Trying 127.0.0.1...
    Connected to localhost.localdomain.
    Escape character is '^]'.
    % version
    version: exonerate-server 2.0.0
    % dbinfo
    dbinfo: dna softmasked 100000 1701 38113579
    % lookup AA159529.1
    lookup: 88065
    % get info 88065
    seqinfo: 62 2028 AA159529.1 zo72g05.s1 Stratagene pancreas (#937208) Homo sapiens cDNA
    % get seq 88065
    seq: NAACTCATCNTTTTCTGCTGNATCCTCTTCACCAGTTTGGGGGANGGCCTGCACTTCCANAG
    % get subseq 88065 10 20
    subseq: TTTTCTGCTGNATCCTCTTC
    % set query NAACTCATCNTTTTCTGCTGNATCCTCTTCACCAGTTTGGGGGANGGCCTGCACTTCCANAG
    ok: 62 2028
    % get hsps
    linecount: 15
    hspset: 12423 1 349 41
    hspset: 44900 1 356 47
    hspset: 61781 1 358 41 36 392 26
    hspset: 70065 1 349 41 36 383 26
    hspset: 88065 1 1 61
    hspset: 91032 1 357 41 36 391 26
    hspset: 91442 1 350 41 36 384 26
    hspset: 92971 1 348 41 36 382 26
    hspset: 94311 1 375 41
    hspset: 95381 1 346 41 36 380 26
    hspset: 96808 10 385 32 36 410 26
    hspset: 88449 18 11 22
    hspset: 91036 6 6 56
    hspset: 93736 36 400 26
    % revcomp query
    ok: query strand revcomp
    % get hsps
    linecount: 6
    hspset: 12564 0 64 26 20 83 41
    hspset: 61780 0 266 61
    hspset: 29148 0 116 61
    hspset: 25849 15 445 22
    hspset: 93938 26 265 34
    % exit
    Connection closed by foreign host.

Not documented yet.

1.  Example of creating a translated index and running a fast
    protein2genome search using exonerate-server

    fasta2esd human.genomic.fasta human.genomic.esd
    esd2esi --translate yes human.genomic.esd human.genomic.trans.esi
    exonerate-server --port 1234 human.genomic.trans.esi
    exonerate pep.fasta localhost:1234 --model p2g --seedrepeat 3 --geneseed 250

This documentation accompanies version 2.2.0 of the exonerate package.

Guy St.C. Slater. <guy@ebi.ac.uk>. See the AUTHORS file accompanying
the source code for a list of contributors.

The source code for the exonerate package is available under the terms
of the GNU General Public Licence.

Please see the file COPYING which was distributed with this package, or
http://www.gnu.org/licenses/gpl.txt for details.

This package has been developed as part of the ensembl project. Please
see http://www.ensembl.org/ for more information.

exonerate(1)

exonerate-server                January 2008                exonerate-server(1)