hmmpfam(1)                       HMMER Manual                       hmmpfam(1)

NAME
       hmmpfam - search one or more sequences against an HMM database

SYNOPSIS
       hmmpfam [options] hmmfile seqfile

DESCRIPTION
       hmmpfam reads a sequence file seqfile and compares each sequence in it,
       one at a time, against all the HMMs in hmmfile, looking for
       significantly similar sequence matches.

       hmmfile is looked for first in the current working directory, then in
       a directory named by the environment variable HMMERDB. This lets
       administrators install HMM libraries such as Pfam in a common
       location.
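The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the documented order (current working directory first, then the HMMERDB directory), not the code hmmpfam itself uses. The function name locate_hmmfile is hypothetical, and HMMERDB is treated here as naming a single directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def locate_hmmfile(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first existing candidate path for an HMM file, or None.

    Hypothetical helper: mirrors the documented search order only.
    """
    env = os.environ if env is None else env
    # 1. Look in the current working directory first.
    candidates = [Path.cwd() / name]
    # 2. Then look in the directory named by HMMERDB, if it is set.
    #    Assumption: HMMERDB names exactly one directory.
    hmmerdb = env.get("HMMERDB")
    if hmmerdb:
        candidates.append(Path(hmmerdb) / name)
    for path in candidates:
        if path.is_file():
            return path
    return None
```

A wrapper script could call locate_hmmfile("Pfam") before launching a long run, so that a missing library is reported up front.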

       There is a separate output report for each sequence in seqfile. This
       report consists of three sections: a ranked list of the best scoring
       HMMs, a list of the best scoring domains in order of their occurrence
       in the sequence, and alignments for all the best scoring domains. A
       sequence score may be higher than a domain score for the same sequence
       if there is more than one domain in the sequence; the sequence score
       takes into account all the domains. All HMMs that pass the -E and -T
       cutoffs are shown in the first list, then every domain found in this
       list is shown in the second list of domain hits. If desired, E-value
       and bit score thresholds may also be applied to the domain list using
       the --domE and --domT options.

OPTIONS
       -h     Print brief help; includes version number and summary of all
              options, including expert options.

       -n     Specify that models and sequence are nucleic acid, not protein.
              Other HMMER programs autodetect this; but because of the order
              in which hmmpfam accesses data, it can't reliably determine the
              correct "alphabet" by itself.

       -A <n> Limits the alignment output to the <n> best scoring domains.
              -A0 shuts off the alignment output and can be used to reduce the
              size of output files.

       -E <x> Set the E-value cutoff for the per-sequence ranked hit list to
              <x>, where <x> is a positive real number. The default is 10.0.
              Hits with E-values better than (less than) this threshold will
              be shown.

       -T <x> Set the bit score cutoff for the per-sequence ranked hit list to
              <x>, where <x> is a real number. The default is negative
              infinity; by default, the threshold is controlled by E-value and
              not by bit score. Hits with bit scores better than (greater
              than) this threshold will be shown.
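As a rough illustration of how the -E and -T cutoffs combine, the sketch below applies both to a single hit using the documented defaults. The function name is hypothetical, and treating a hit that lands exactly on a cutoff as passing is an assumption; the text above only says "better than".

```python
import math


def passes_sequence_cutoffs(evalue: float, score: float,
                            e_cutoff: float = 10.0,
                            t_cutoff: float = -math.inf) -> bool:
    """Return True if a hit would appear in the per-sequence ranked list.

    Defaults follow the man page: -E 10.0 and -T negative infinity.
    Assumption: hits exactly at a cutoff are kept.
    """
    # A hit must satisfy both cutoffs: low enough E-value, high enough score.
    return evalue <= e_cutoff and score >= t_cutoff
```

With the defaults, only the E-value matters, because every finite bit score exceeds negative infinity.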

       -Z <n> Calculate the E-value scores as if we had seen a sequence
              database of <n> sequences. The default is arbitrarily set to
              59021, the size of Swissprot 34.
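Because an E-value is the expected number of false hits in a database of a given size, it scales linearly with the value passed to -Z. The following sketch converts an E-value reported under one database size to the value expected under another. The function name is hypothetical, and the linear relationship (E-value = P-value times Z) is stated here as an assumption about how the scores are calculated.

```python
DEFAULT_Z = 59021  # default database size given in the man page


def rescale_evalue(evalue: float, new_z: int, old_z: int = DEFAULT_Z) -> float:
    """Convert an E-value computed for old_z sequences to new_z sequences.

    Assumption: E-value = P-value * Z, so the ratio of Z values is the
    only correction needed.
    """
    if old_z <= 0 or new_z <= 0:
        raise ValueError("database sizes must be positive")
    return evalue * new_z / old_z
```

For example, doubling the assumed database size doubles every E-value, so a hit that just passed -E 10.0 under the default would no longer pass.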

EXPERT OPTIONS
       --acc  Report HMM accessions instead of names in the output reports.
              Useful for high-throughput annotation, where the data are being
              parsed for storage in a relational database.

       --compat
              Use the output format of HMMER 2.1.1, the 1998-2001 public
              release; provided so 2.1.1 parsers don't have to be rewritten.

       --cpu <n>
              Sets the maximum number of CPUs that the program will run on.
              The default is to use all CPUs in the machine. Overrides the
              HMMER_NCPU environment variable. Only affects threaded versions
              of HMMER (the default on most systems).

       --cut_ga
              Use Pfam GA (gathering threshold) score cutoffs. Equivalent to
              -T <GA1> --domT <GA2>, but the GA1 and GA2 cutoffs are read
              from each HMM in hmmfile individually. hmmbuild puts these
              cutoffs there if the alignment file was annotated in a
              Pfam-friendly alignment format (extended SELEX or Stockholm
              format) and the optional GA annotation line was present. If
              these cutoffs are not set in the HMM file, --cut_ga does not
              work.

       --cut_tc
              Use Pfam TC (trusted cutoff) score cutoffs. Equivalent to
              -T <TC1> --domT <TC2>, but the TC1 and TC2 cutoffs are read
              from each HMM in hmmfile individually. hmmbuild puts these
              cutoffs there if the alignment file was annotated in a
              Pfam-friendly alignment format (extended SELEX or Stockholm
              format) and the optional TC annotation line was present. If
              these cutoffs are not set in the HMM file, --cut_tc does not
              work.

       --cut_nc
              Use Pfam NC (noise cutoff) score cutoffs. Equivalent to
              -T <NC1> --domT <NC2>, but the NC1 and NC2 cutoffs are read
              from each HMM in hmmfile individually. hmmbuild puts these
              cutoffs there if the alignment file was annotated in a
              Pfam-friendly alignment format (extended SELEX or Stockholm
              format) and the optional NC annotation line was present. If
              these cutoffs are not set in the HMM file, --cut_nc does not
              work.
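Since the three cutoff options depend on annotation stored in each HMM, it can be useful to check a library before a long run. The sketch below scans the text of an HMM file and lists the models that lack a given cutoff line. It rests on assumptions about the ASCII HMM save format that this page does not spell out: each model has a NAME line, cutoffs appear on a line starting with GA, TC, or NC followed by two numbers, and each model ends with a line containing only "//". The function name is hypothetical.

```python
from typing import List


def models_missing_cutoff(hmm_text: str, tag: str = "GA") -> List[str]:
    """Return names of models in hmm_text that have no `tag` cutoff line.

    Assumed layout per model: 'NAME <name>', optional '<tag> <x1> <x2>',
    and a terminating '//' line.
    """
    missing: List[str] = []
    name, has_tag = None, False
    for line in hmm_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "NAME" and len(fields) > 1:
            name = fields[1]
        elif fields[0] == tag and len(fields) >= 3:
            has_tag = True
        elif fields[0] == "//":
            # End of one model: record it if the cutoff line never appeared.
            if name is not None and not has_tag:
                missing.append(name)
            name, has_tag = None, False
    return missing
```

An empty result for tag "GA" suggests that --cut_ga has the annotation it needs for every model in the file.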

       --domE <x>
              Set the E-value cutoff for the per-domain ranked hit list to
              <x>, where <x> is a positive real number. The default is
              infinity; by default, all domains in the sequences that passed
              the first threshold will be reported in the second list, so that
              the number of domains reported in the per-sequence list is
              consistent with the number that appear in the per-domain list.

       --domT <x>
              Set the bit score cutoff for the per-domain ranked hit list to
              <x>, where <x> is a real number. The default is negative
              infinity; by default, all domains in the sequences that passed
              the first threshold will be reported in the second list, so that
              the number of domains reported in the per-sequence list is
              consistent with the number that appear in the per-domain list.
              Important note: only one domain in a sequence is absolutely
              controlled by this parameter, or by --domE. The second and
              subsequent domains in a sequence have a de facto bit score
              threshold of 0 because of the details of how HMMER works. HMMER
              requires at least one pass through the main model per sequence;
              to do more than one pass (more than one domain) the multidomain
              alignment must have a better score than the single domain
              alignment, and hence the extra domains must contribute positive
              score. See the User's Guide for more detail.

       --forward
              Use the Forward algorithm instead of the Viterbi algorithm to
              determine the per-sequence scores. Per-domain scores are still
              determined by the Viterbi algorithm. Some have argued that
              Forward is a more sensitive algorithm for detecting remote
              sequence homologues; my experiments with HMMER have not
              confirmed this, however.

       --informat <s>
              Assert that the input seqfile is in format <s>; do not run
              Babelfish format autodetection. This increases the reliability
              of the program somewhat, because the Babelfish can make
              mistakes; particularly recommended for unattended,
              high-throughput runs of HMMER. Valid format strings include
              FASTA, GENBANK, EMBL, GCG, PIR, STOCKHOLM, SELEX, MSF, CLUSTAL,
              and PHYLIP. See the User's Guide for a complete list.
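For unattended runs, the command line is often assembled by a script. The sketch below builds an argument list that asserts the input format explicitly and optionally adds --acc, --cpu, and -E as described on this page. The function name, parameter names, and the file names in the usage note are hypothetical; only the option spellings come from this page.

```python
from typing import List, Optional


def build_hmmpfam_argv(hmmfile: str, seqfile: str,
                       informat: str = "FASTA",
                       e_cutoff: Optional[float] = None,
                       cpus: Optional[int] = None,
                       use_accessions: bool = False) -> List[str]:
    """Assemble an hmmpfam command line as a list of arguments."""
    # Always assert the input format, so autodetection is skipped.
    argv = ["hmmpfam", "--informat", informat]
    if use_accessions:
        argv.append("--acc")
    if cpus is not None:
        argv += ["--cpu", str(cpus)]
    if e_cutoff is not None:
        argv += ["-E", str(e_cutoff)]
    # Options come first, then hmmfile and seqfile, as in the SYNOPSIS.
    argv += [hmmfile, seqfile]
    return argv
```

The resulting list can be handed to subprocess.run(argv, check=True, capture_output=True, text=True), for example with build_hmmpfam_argv("Pfam", "seqs.fa", cpus=2, use_accessions=True). Passing a list rather than a shell string avoids quoting problems with unusual file names.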

       --null2
              Turn off the post hoc second null model. By default, each
              alignment is rescored by a postprocessing step that takes into
              account possible biased composition in either the HMM or the
              target sequence. This is almost essential in database searches,
              especially with local alignment models. There is a very small
              chance that this postprocessing might remove real matches, and
              in these cases --null2 may improve sensitivity at the expense of
              reducing specificity by letting biased composition hits through.

       --pvm  Run on a Parallel Virtual Machine (PVM). The PVM must already be
              running. The client program hmmpfam-pvm must be installed on all
              the PVM nodes. The HMM database hmmfile and an associated GSI
              index file hmmfile.gsi must also be installed on all the PVM
              nodes. (The GSI index is produced by the program hmmindex.)
              Because the PVM implementation is I/O bound, it is highly
              recommended that each node have a local copy of hmmfile rather
              than NFS mounting a shared copy. Optional PVM support must have
              been compiled into HMMER for --pvm to function.

       --xnu  Turn on XNU filtering of target protein sequences. Has no effect
              on nucleic acid sequences. In trial experiments, --xnu appears
              to perform less well than the default post hoc null2 model.

SEE ALSO
       Master man page, with full list of and guide to the individual man
       pages: see hmmer(1).

       For complete documentation, see the user guide that came with the
       distribution (Userguide.pdf); or see the HMMER web page,
       http://hmmer.wustl.edu/.

COPYRIGHT
       Copyright (C) 1992-2003 HHMI/Washington University School of Medicine.
       Freely distributed under the GNU General Public License (GPL).
       See the file COPYING in your distribution for details on redistribution
       conditions.

AUTHOR
       Sean Eddy
       HHMI/Dept. of Genetics
       Washington Univ. School of Medicine
       4566 Scott Ave.
       St Louis, MO 63110 USA
       http://www.genetics.wustl.edu/eddy/

HMMER 2.3.2                        Oct 2003                         hmmpfam(1)