hmmsearch(1)

hmmsearch(1)                     HMMER Manual                     hmmsearch(1)

NAME

       hmmsearch - search a sequence database with a profile HMM

SYNOPSIS

       hmmsearch [options] hmmfile seqfile

DESCRIPTION

       hmmsearch reads an HMM from hmmfile and searches seqfile for
       significantly similar sequence matches.

       hmmsearch looks for seqfile first in the current working directory,
       then in the directory named by the environment variable BLASTDB. This
       lets users search existing BLAST databases if BLAST has been
       configured for the site.
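
       The lookup order described above can be sketched in Python. This is
       an illustration only, not HMMER's own code: the helper name
       find_seqfile is hypothetical, and the sketch assumes BLASTDB names a
       single directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_seqfile(name: str, cwd: Optional[str] = None,
                 env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first existing candidate for `name`, or None.

    Hypothetical helper: checks the current working directory first,
    then the directory named by BLASTDB (assumed to be one directory).
    """
    env = os.environ if env is None else env
    candidates = [Path(cwd or os.getcwd()) / name]
    blastdb = env.get("BLASTDB")
    if blastdb:
        candidates.append(Path(blastdb) / name)
    for path in candidates:
        if path.is_file():
            return path
    return None
```

       A file present in both places resolves to the copy in the current
       working directory, because that candidate is tested first.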

       hmmsearch may take minutes or even hours to run, depending on the size
       of the sequence database. It is a good idea to redirect the output to a
       file.

       The output consists of four sections: a ranked list of the best scoring
       sequences, a ranked list of the best scoring domains, alignments for
       all the best scoring domains, and a histogram of the scores. A
       sequence score may be higher than a domain score for the same sequence
       if there is more than one domain in the sequence; the sequence score
       takes into account all the domains. All sequences that pass the -E and
       -T cutoffs are shown in the first list; every domain found in those
       sequences is then shown in the second list of domain hits. If desired,
       E-value and bit score thresholds may also be applied to the domain list
       using the --domE and --domT options.
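
       The two-stage thresholding described above can be sketched as
       follows. The class and function names are hypothetical, the
       comparisons are assumed to be inclusive at the cutoff, and ranking by
       bit score is an assumption; the sketch also ignores the de facto
       score floor on second and later domains noted under --domT.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Domain:
    score: float   # per-domain bit score
    evalue: float  # per-domain E-value


@dataclass
class SeqHit:
    name: str
    score: float   # per-sequence bit score
    evalue: float  # per-sequence E-value
    domains: List[Domain] = field(default_factory=list)


def filter_hits(hits: List[SeqHit], E: float = 10.0, T: float = -math.inf,
                domE: float = math.inf, domT: float = -math.inf
                ) -> Tuple[List[SeqHit], List[Tuple[str, Domain]]]:
    """Apply the per-sequence cutoffs, then the per-domain cutoffs."""
    # Stage 1: a sequence must satisfy both -E and -T to be listed.
    seq_list = [h for h in hits if h.evalue <= E and h.score >= T]
    seq_list.sort(key=lambda h: h.score, reverse=True)
    # Stage 2: only domains from listed sequences are considered,
    # and they must satisfy both --domE and --domT.
    dom_list = [(h.name, d) for h in seq_list for d in h.domains
                if d.evalue <= domE and d.score >= domT]
    dom_list.sort(key=lambda nd: nd[1].score, reverse=True)
    return seq_list, dom_list
```

       With the default arguments, which mirror the defaults documented
       under OPTIONS, every domain of a listed sequence appears in the
       domain list.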

OPTIONS

       -h     Print brief help; includes version number and summary of all
              options, including expert options.

       -A <n> Limits the alignment output to the <n> best scoring domains.
              -A0 shuts off the alignment output and can be used to reduce the
              size of output files.

       -E <x> Set the E-value cutoff for the per-sequence ranked hit list to
              <x>, where <x> is a positive real number. The default is 10.0.
              Hits with E-values better than (less than) this threshold will
              be shown.

       -T <x> Set the bit score cutoff for the per-sequence ranked hit list to
              <x>, where <x> is a real number. The default is negative
              infinity; by default, the threshold is controlled by E-value and
              not by bit score. Hits with bit scores better than (greater
              than) this threshold will be shown.

       -Z <n> Calculate the E-value scores as if we had seen a sequence
              database of <n> sequences. The default is the number of
              sequences seen in your database file <seqfile>.
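
       To illustrate what -Z changes, the sketch below rescales an E-value
       from one database size to another. It assumes the usual relation in
       which an E-value is the per-sequence P-value multiplied by the number
       of sequences searched; that relation is not spelled out in this page,
       and the function name is hypothetical.

```python
def rescale_evalue(evalue: float, z_old: int, z_new: int) -> float:
    """Rescale an E-value computed for z_old sequences to z_new sequences.

    Assumes E = P * Z, so the underlying P-value is evalue / z_old.
    """
    if z_old <= 0 or z_new <= 0:
        raise ValueError("database sizes must be positive")
    return evalue * z_new / z_old
```

       Under this assumption, a hit with E-value 0.5 in a search of 1,000
       sequences would have E-value 50 if the same score were evaluated with
       -Z 100000.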

EXPERT OPTIONS

       --compat
              Use the output format of HMMER 2.1.1, the 1998-2001 public
              release; provided so 2.1.1 parsers don't have to be rewritten.

       --cpu <n>
              Sets the maximum number of CPUs that the program will run on.
              The default is to use all CPUs in the machine. Overrides the
              HMMER_NCPU environment variable. Only affects threaded versions
              of HMMER (the default on most systems).

       --cut_ga
              Use Pfam GA (gathering threshold) score cutoffs. Equivalent to
              -T <GA1> --domT <GA2>, but the GA1 and GA2 cutoffs are read
              from the HMM file. hmmbuild puts these cutoffs there if the
              alignment file was annotated in a Pfam-friendly alignment format
              (extended SELEX or Stockholm format) and the optional GA
              annotation line was present. If these cutoffs are not set in the
              HMM file, --cut_ga cannot be used.

       --cut_tc
              Use Pfam TC (trusted cutoff) score cutoffs. Equivalent to
              -T <TC1> --domT <TC2>, but the TC1 and TC2 cutoffs are read
              from the HMM file. hmmbuild puts these cutoffs there if the
              alignment file was annotated in a Pfam-friendly alignment format
              (extended SELEX or Stockholm format) and the optional TC
              annotation line was present. If these cutoffs are not set in the
              HMM file, --cut_tc cannot be used.

       --cut_nc
              Use Pfam NC (noise cutoff) score cutoffs. Equivalent to
              -T <NC1> --domT <NC2>, but the NC1 and NC2 cutoffs are read
              from the HMM file. hmmbuild puts these cutoffs there if the
              alignment file was annotated in a Pfam-friendly alignment format
              (extended SELEX or Stockholm format) and the optional NC
              annotation line was present. If these cutoffs are not set in the
              HMM file, --cut_nc cannot be used.
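
       As a rough illustration of where these cutoffs come from, the sketch
       below collects GA, TC and NC annotation lines from Stockholm-format
       text. It assumes each line has the form "#=GF GA <seq> <dom>" with an
       optional trailing semicolon, with the per-sequence value first and
       the per-domain value second. The function name is hypothetical and
       this is not hmmbuild's parser.

```python
from typing import Dict, Tuple


def read_pfam_cutoffs(stockholm_text: str) -> Dict[str, Tuple[float, float]]:
    """Return {tag: (per_sequence_cutoff, per_domain_cutoff)} for GA/TC/NC."""
    cutoffs: Dict[str, Tuple[float, float]] = {}
    for line in stockholm_text.splitlines():
        fields = line.split()
        # Assumed layout: "#=GF", tag, per-sequence value, per-domain value.
        if (len(fields) >= 4 and fields[0] == "#=GF"
                and fields[1] in ("GA", "TC", "NC")):
            seq_cut = float(fields[2].rstrip(";"))
            dom_cut = float(fields[3].rstrip(";"))
            cutoffs[fields[1]] = (seq_cut, dom_cut)
    return cutoffs
```

       A tag missing from the returned dictionary corresponds to the case
       described above in which the matching --cut option is unavailable.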

       --domE <x>
              Set the E-value cutoff for the per-domain ranked hit list to
              <x>, where <x> is a positive real number. The default is
              infinity; by default, all domains in the sequences that passed
              the first threshold will be reported in the second list, so that
              the number of domains reported in the per-sequence list is
              consistent with the number that appear in the per-domain list.

       --domT <x>
              Set the bit score cutoff for the per-domain ranked hit list to
              <x>, where <x> is a real number. The default is negative
              infinity; by default, all domains in the sequences that passed
              the first threshold will be reported in the second list, so that
              the number of domains reported in the per-sequence list is
              consistent with the number that appear in the per-domain list.
              Important note: only one domain in a sequence is absolutely
              controlled by this parameter, or by --domE. The second and
              subsequent domains in a sequence have a de facto bit score
              threshold of 0 because of the details of how HMMER works. HMMER
              requires at least one pass through the main model per sequence;
              to do more than one pass (more than one domain) the multidomain
              alignment must have a better score than the single domain
              alignment, and hence the extra domains must contribute positive
              score. See the User's Guide for more detail.

       --forward
              Use the Forward algorithm instead of the Viterbi algorithm to
              determine the per-sequence scores. Per-domain scores are still
              determined by the Viterbi algorithm. Some have argued that
              Forward is a more sensitive algorithm for detecting remote
              sequence homologues; my experiments with HMMER have not
              confirmed this, however.

       --informat <s>
              Assert that the input seqfile is in format <s>; do not run
              Babelfish format autodetection. This increases the reliability
              of the program somewhat, because the Babelfish can make
              mistakes; particularly recommended for unattended,
              high-throughput runs of HMMER. Valid format strings include
              FASTA, GENBANK, EMBL, GCG, PIR, STOCKHOLM, SELEX, MSF, CLUSTAL,
              and PHYLIP. See the User's Guide for a complete list.

       --null2
              Turn off the post hoc second null model. By default, each
              alignment is rescored by a postprocessing step that takes into
              account possible biased composition in either the HMM or the
              target sequence. This is almost essential in database searches,
              especially with local alignment models. There is a very small
              chance that this postprocessing might remove real matches, and
              in these cases --null2 may improve sensitivity at the expense of
              reducing specificity by letting biased composition hits through.

       --pvm  Run on a Parallel Virtual Machine (PVM). The PVM must already be
              running. The client program hmmsearch-pvm must be installed on
              all the PVM nodes. Optional PVM support must have been compiled
              into HMMER.

       --xnu  Turn on XNU filtering of target protein sequences. Has no effect
              on nucleic acid sequences. In trial experiments, --xnu appears
              to perform less well than the default post hoc null2 model.

COPYRIGHT

       Copyright (C) 1992-2003 HHMI/Washington University School of Medicine.
       Freely distributed under the GNU General Public License (GPL).
       See the file COPYING in your distribution for details on redistribution
       conditions.

AUTHOR

       Sean Eddy
       HHMI/Dept. of Genetics
       Washington Univ. School of Medicine
       4566 Scott Ave.
       St Louis, MO 63110 USA
       http://www.genetics.wustl.edu/eddy/

HMMER 2.3.2                        Oct 2003                       hmmsearch(1)