hmmsearch(1) HMMER Manual hmmsearch(1)

NAME
hmmsearch - search a sequence database with a profile HMM

SYNOPSIS
hmmsearch [options] hmmfile seqfile

DESCRIPTION
hmmsearch reads an HMM from hmmfile and searches seqfile for
significantly similar sequence matches.

hmmsearch looks for seqfile first in the current working directory, then
in a directory named by the environment variable BLASTDB. This lets
users search existing BLAST databases, if BLAST has been configured for
the site.

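The lookup order described above can be sketched in Python. This is
only an illustration, not HMMER's own code: the function name
resolve_seqfile is hypothetical, and the sketch assumes BLASTDB names a
single directory, as the text states.

```python
import os


def resolve_seqfile(seqfile, environ=None, cwd=None):
    """Return the first existing path for seqfile, or None.

    Follows the order given in the text: the current working
    directory first, then the directory named by BLASTDB.
    Assumption: BLASTDB holds exactly one directory name.
    """
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd

    candidates = [os.path.join(cwd, seqfile)]
    blastdb = environ.get("BLASTDB")
    if blastdb:
        candidates.append(os.path.join(blastdb, seqfile))

    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

A copy of the file in the current directory therefore shadows a file
of the same name in the BLASTDB directory.
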
hmmsearch may take minutes or even hours to run, depending on the size
of the sequence database. It is a good idea to redirect the output to a
file.

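For scripted runs, one way to send the output to a file is sketched
below in Python. The helper name run_to_file and the file names in the
comment are hypothetical; the helper works for any command line.

```python
import subprocess


def run_to_file(argv, out_path):
    """Run argv, write its standard output to out_path, return the exit code."""
    with open(out_path, "w") as out:
        return subprocess.run(argv, stdout=out).returncode


# Hypothetical file names, for illustration only:
# run_to_file(["hmmsearch", "globin.hmm", "swissprot.fa"], "globin.out")
```

Diagnostics written to standard error still reach the terminal, so a
long unattended run can be checked without opening the output file.
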
The output consists of four sections: a ranked list of the best scoring
sequences, a ranked list of the best scoring domains, alignments for
all the best scoring domains, and a histogram of the scores. A
sequence score may be higher than a domain score for the same sequence
if there is more than one domain in the sequence; the sequence score
takes into account all the domains. All sequences scoring above the -E
and -T cutoffs are shown in the first list, then every domain found in
this list is shown in the second list of domain hits. If desired,
E-value and bit score thresholds may also be applied to the domain list
using the --domE and --domT options.

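The two-stage selection described above can be sketched as follows.
This is an illustration of the reporting logic only, not HMMER's code:
the function name select_hits and the record layout are hypothetical,
and treating a score or E-value exactly equal to a cutoff as failing is
an assumption.

```python
import math


def select_hits(sequences, E=10.0, T=-math.inf,
                domE=math.inf, domT=-math.inf):
    """Return (sequence hits, domain hits) for the two ranked lists.

    sequences is a list of dicts with keys 'name', 'score', 'evalue'
    and 'domains'; each domain is a dict with 'score' and 'evalue'.
    """
    # First list: sequences passing the per-sequence -E and -T cutoffs.
    seq_hits = [s for s in sequences
                if s["evalue"] < E and s["score"] > T]

    # Second list: domains of those sequences, optionally narrowed
    # further by the per-domain --domE and --domT cutoffs.
    dom_hits = [(s["name"], d)
                for s in seq_hits
                for d in s["domains"]
                if d["evalue"] < domE and d["score"] > domT]
    return seq_hits, dom_hits
```

With the default per-domain cutoffs (infinity and negative infinity),
every domain of every reported sequence appears in the second list.
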
OPTIONS
-h Print brief help; includes version number and summary of all
options, including expert options.

-A <n> Limits the alignment output to the <n> best scoring domains.
-A0 shuts off the alignment output and can be used to reduce the
size of output files.

-E <x> Set the E-value cutoff for the per-sequence ranked hit list to
<x>, where <x> is a positive real number. The default is 10.0.
Hits with E-values better than (less than) this threshold will
be shown.

-T <x> Set the bit score cutoff for the per-sequence ranked hit list to
<x>, where <x> is a real number. The default is negative infinity;
by default, the threshold is controlled by E-value and not
by bit score. Hits with bit scores better than (greater than)
this threshold will be shown.

-Z <n> Calculate the E-value scores as if we had seen a sequence
database of <n> sequences. The default is the number of sequences
seen in your database file <seqfile>.

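To see what -Z changes, the sketch below rescales an E-value from one
database size to another. It assumes the E-value is the P-value of the
score multiplied by the number of sequences searched; that relation is
an assumption of this sketch and is not stated on this page. The
function name rescale_evalue is hypothetical.

```python
def rescale_evalue(evalue, z_old, z_new):
    """Rescale an E-value computed for z_old sequences to z_new sequences.

    Assumes E = P * Z, so the P-value (E / z_old) is unchanged and only
    the database size factor is swapped.
    """
    if z_old <= 0 or z_new <= 0:
        raise ValueError("database sizes must be positive")
    return evalue * z_new / z_old
```

Under that assumption, a hit with E-value 0.5 in a search of 1,000
sequences would have E-value 50 if reported as though 100,000
sequences had been searched, and would no longer pass the default -E
cutoff of 10.0.
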
EXPERT OPTIONS
--compat
Use the output format of HMMER 2.1.1, the 1998-2001 public
release; provided so 2.1.1 parsers don't have to be rewritten.

--cpu <n>
Sets the maximum number of CPUs that the program will run on.
The default is to use all CPUs in the machine. Overrides the
HMMER_NCPU environment variable. Only affects threaded versions
of HMMER (the default on most systems).

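The precedence between --cpu, HMMER_NCPU, and the default can be
sketched as below. This only illustrates the order stated above; the
function name effective_cpus is hypothetical, and the sketch assumes
HMMER_NCPU holds a plain integer.

```python
import os


def effective_cpus(cpu_option=None, environ=None, available=None):
    """Return the CPU limit implied by --cpu, HMMER_NCPU, or the default.

    cpu_option is the value given to --cpu, or None if it was not used.
    """
    environ = os.environ if environ is None else environ
    if available is None:
        available = os.cpu_count() or 1

    if cpu_option is not None:        # --cpu overrides the environment
        return cpu_option
    env_value = environ.get("HMMER_NCPU")
    if env_value is not None:         # assumed to be a plain integer
        return int(env_value)
    return available                  # default: all CPUs in the machine
```
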
--cut_ga
Use Pfam GA (gathering threshold) score cutoffs. Equivalent to
-T <GA1> --domT <GA2>, but the GA1 and GA2 cutoffs are read
from the HMM file. hmmbuild puts these cutoffs there if the
alignment file was annotated in a Pfam-friendly alignment format
(extended SELEX or Stockholm format) and the optional GA
annotation line was present. If these cutoffs are not set in the HMM
file, --cut_ga doesn't work.

--cut_tc
Use Pfam TC (trusted cutoff) score cutoffs. Equivalent to
-T <TC1> --domT <TC2>, but the TC1 and TC2 cutoffs are read
from the HMM file. hmmbuild puts these cutoffs there if the
alignment file was annotated in a Pfam-friendly alignment format
(extended SELEX or Stockholm format) and the optional TC
annotation line was present. If these cutoffs are not set in the HMM
file, --cut_tc doesn't work.

--cut_nc
Use Pfam NC (noise cutoff) score cutoffs. Equivalent to
-T <NC1> --domT <NC2>, but the NC1 and NC2 cutoffs are read from
the HMM file. hmmbuild puts these cutoffs there if the alignment
file was annotated in a Pfam-friendly alignment format (extended
SELEX or Stockholm format) and the optional NC annotation line
was present. If these cutoffs are not set in the HMM file,
--cut_nc doesn't work.

--domE <x>
Set the E-value cutoff for the per-domain ranked hit list to
<x>, where <x> is a positive real number. The default is infinity;
by default, all domains in the sequences that passed the
first threshold will be reported in the second list, so that the
number of domains reported in the per-sequence list is consistent
with the number that appear in the per-domain list.

--domT <x>
Set the bit score cutoff for the per-domain ranked hit list to
<x>, where <x> is a real number. The default is negative infinity;
by default, all domains in the sequences that passed the
first threshold will be reported in the second list, so that the
number of domains reported in the per-sequence list is consistent
with the number that appear in the per-domain list.
Important note: only one domain in a sequence is absolutely
controlled by this parameter, or by --domE. The second and
subsequent domains in a sequence have a de facto bit score threshold
of 0 because of the details of how HMMER works. HMMER requires
at least one pass through the main model per sequence; to do
more than one pass (more than one domain) the multidomain
alignment must have a better score than the single domain alignment,
and hence the extra domains must contribute positive score. See
the User's Guide for more detail.

--forward
Use the Forward algorithm instead of the Viterbi algorithm to
determine the per-sequence scores. Per-domain scores are still
determined by the Viterbi algorithm. Some have argued that Forward
is a more sensitive algorithm for detecting remote sequence
homologues; my experiments with HMMER have not confirmed this,
however.

--informat <s>
Assert that the input seqfile is in format <s>; do not run
Babelfish format autodetection. This increases the reliability of
the program somewhat, because the Babelfish can make mistakes;
particularly recommended for unattended, high-throughput runs of
HMMER. Valid format strings include FASTA, GENBANK, EMBL, GCG,
PIR, STOCKHOLM, SELEX, MSF, CLUSTAL, and PHYLIP. See the User's
Guide for a complete list.

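For unattended pipelines, a wrapper can check the format name before
launching a run. The sketch below is a hypothetical helper, not part of
HMMER: it knows only the format strings listed above (the User's Guide
has the complete list), and converting the name to upper case is this
sketch's own choice rather than documented hmmsearch behavior.

```python
# Format strings named on this page; not the complete list.
KNOWN_FORMATS = ("FASTA", "GENBANK", "EMBL", "GCG", "PIR",
                 "STOCKHOLM", "SELEX", "MSF", "CLUSTAL", "PHYLIP")


def informat_args(fmt):
    """Return the --informat arguments for fmt, or raise ValueError."""
    name = fmt.upper()
    if name not in KNOWN_FORMATS:
        raise ValueError("format not listed on the man page: " + fmt)
    return ["--informat", name]
```

The returned list can be spliced into an argument vector ahead of the
hmmfile and seqfile arguments.
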
--null2
Turn off the post hoc second null model. By default, each
alignment is rescored by a postprocessing step that takes into
account possible biased composition in either the HMM or the
target sequence. This is almost essential in database searches,
especially with local alignment models. There is a very small
chance that this postprocessing might remove real matches, and
in these cases --null2 may improve sensitivity at the expense of
reducing specificity by letting biased composition hits through.

--pvm Run on a Parallel Virtual Machine (PVM). The PVM must already be
running. The client program hmmsearch-pvm must be installed on
all the PVM nodes. Optional PVM support must have been compiled
into HMMER.

--xnu Turn on XNU filtering of target protein sequences. Has no effect
on nucleic acid sequences. In trial experiments, --xnu appears
to perform less well than the default post hoc null2 model.

SEE ALSO
Master man page, with full list of and guide to the individual man
pages: see hmmer(1).

For complete documentation, see the user guide that came with the
distribution (Userguide.pdf); or see the HMMER web page,
http://hmmer.wustl.edu/.

COPYRIGHT
Copyright (C) 1992-2003 HHMI/Washington University School of Medicine.
Freely distributed under the GNU General Public License (GPL).
See the file COPYING in your distribution for details on redistribution
conditions.

AUTHOR
Sean Eddy
HHMI/Dept. of Genetics
Washington Univ. School of Medicine
4566 Scott Ave.
St Louis, MO 63110 USA
http://www.genetics.wustl.edu/eddy/

HMMER 2.3.2 Oct 2003 hmmsearch(1)