hmmpgmd_shard(1)                 HMMER Manual                 hmmpgmd_shard(1)

NAME

       hmmpgmd_shard - sharded daemon for database search web services

SYNOPSIS

       hmmpgmd_shard [options]

DESCRIPTION

       The hmmpgmd_shard program is a sharded version of the hmmpgmd program
       that we use internally to implement high-performance HMMER services
       that can be accessed via the internet. See the hmmpgmd man page for a
       discussion of how the base hmmpgmd program is used; this man page
       covers only the differences between hmmpgmd_shard and hmmpgmd. The base
       hmmpgmd program loads its entire database file into RAM on every worker
       node, even though each worker node searches only a predictable fraction
       of the database(s) in that file. This wastes RAM, particularly when
       many worker nodes are used to accelerate searches of large databases.

       Hmmpgmd_shard addresses this by dividing protein sequence database
       files into shards. Each worker node loads only 1/Nth of the database
       file, where N is the number of worker nodes attached to the master.
       HMM database files are not sharded, so every worker node still loads
       the entire HMM database file into RAM. Current HMM databases are much
       smaller than current protein sequence databases, and easily fit into
       the RAM of modern servers even without sharding.

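       The 1/N relationship above can be illustrated with a rough estimate of
       how much database data each worker holds. This is only a sketch: the
       function name and the example sizes are hypothetical, shards are
       assumed to be equal in size, and per-process overhead is ignored.

```python
def per_worker_db_bytes(seqdb_bytes: int, hmmdb_bytes: int,
                        num_workers: int, sharded: bool) -> int:
    """Estimate the database bytes one worker keeps in RAM.

    Assumption: shards are equal in size. The real split may differ slightly.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if sharded:
        # Each worker loads about 1/N of the sequence database (rounded up).
        seq_part = -(-seqdb_bytes // num_workers)
    else:
        # Without sharding, every worker loads the full sequence database.
        seq_part = seqdb_bytes
    # HMM databases are never sharded, so each worker holds all of it.
    return seq_part + hmmdb_bytes


if __name__ == "__main__":
    GIB = 1024 ** 3
    seqdb, hmmdb = 64 * GIB, 2 * GIB  # hypothetical database sizes
    for n in (1, 4, 16):
        plain = per_worker_db_bytes(seqdb, hmmdb, n, sharded=False)
        shard = per_worker_db_bytes(seqdb, hmmdb, n, sharded=True)
        print(f"N={n}: unsharded {plain / GIB:.1f} GiB, "
              f"sharded {shard / GIB:.1f} GiB per worker")
```

       In this model the unsharded footprint per worker stays constant as
       workers are added, while the sharded footprint shrinks toward the size
       of the HMM database alone.
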
       Hmmpgmd_shard is used in the same manner as hmmpgmd, except that it
       takes one additional argument: --num_shards <n>, which specifies the
       number of shards that protein databases will be divided into and
       defaults to 1 if unspecified. This argument is only valid for the
       master node of a hmmpgmd system (i.e., when --master is passed to the
       hmmpgmd_shard program), and must be equal to the number of worker
       nodes that will connect to the master node. Hmmpgmd_shard will signal
       an error if more than num_shards worker nodes attempt to connect to the
       master node, or if a search is started when fewer than num_shards
       workers are connected to the master.

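       As an illustration of how --num_shards pairs with the number of
       workers, the sketch below builds the command lines for one master and
       its workers. It only uses options listed in this man page; the helper
       names, the database file name, and the IP address are hypothetical,
       and the commands are printed rather than executed.

```python
import shlex
from typing import List


def master_argv(seqdb: str, num_workers: int) -> List[str]:
    """Command line for the master; --num_shards equals the worker count."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return ["hmmpgmd_shard", "--master",
            "--seqdb", seqdb,
            "--num_shards", str(num_workers)]


def worker_argv(master_ip: str, cpu: int) -> List[str]:
    """Command line for one worker that connects to the master at master_ip."""
    return ["hmmpgmd_shard", "--worker", master_ip, "--cpu", str(cpu)]


def launch_plan(seqdb: str, master_ip: str,
                num_workers: int, cpu: int) -> List[List[str]]:
    """One master command followed by exactly num_workers worker commands."""
    plan = [master_argv(seqdb, num_workers)]
    plan += [worker_argv(master_ip, cpu) for _ in range(num_workers)]
    return plan


if __name__ == "__main__":
    # Hypothetical file name and address, for illustration only.
    for argv in launch_plan("seqs.hmmpgmd", "192.0.2.10", num_workers=3, cpu=8):
        print(shlex.join(argv))
```

       Deriving both the --num_shards value and the number of worker commands
       from the same num_workers variable keeps the two from drifting apart,
       which is the mismatch the errors described above guard against.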

OPTIONS

       -h     Help; print a brief reminder of command line usage and all
              available options.

       --master
              Run as the master server.

       --worker <s>
              Run as a worker, connecting to the master server that is running
              on IP address <s>.

       --cport <n>
              Port to use for communication between clients and the master
              server. The default is 51371.

       --wport <n>
              Port to use for communication between workers and the master
              server. The default is 51372.

       --ccncts <n>
              Maximum number of client connections to accept. The default is
              16.

       --wcncts <n>
              Maximum number of worker connections to accept. The default is
              32.

       --pid <f>
              Name of file into which the process id will be written.

       --seqdb <f>
              Name of the file (in hmmpgmd format) containing protein
              sequences. The contents of this file will be cached for
              searches.

       --hmmdb <f>
              Name of the file containing protein HMMs. The contents of this
              file will be cached for searches.

       --cpu <n>
              Number of parallel threads to use (for --worker).

       --num_shards <n>
              Number of shards to divide cached sequence database(s) into.
              HMM databases are not sharded, due to their small size. This
              option is only valid when the --master option is present, and
              defaults to 1 if not specified. Hmmpgmd_shard requires that the
              number of shards be equal to the number of worker nodes, and
              will give errors if more than num_shards workers attempt to
              connect to the master node or if a search is started with fewer
              than num_shards workers connected to the master.

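       The two error conditions described for --num_shards can be pictured
       with a small toy model. This is not the daemon's implementation: the
       class and method names are hypothetical, and it only mirrors the rule
       that exactly num_shards workers must be connected before a search.

```python
class ShardedMasterModel:
    """Toy model of the worker-count rule; not the real hmmpgmd_shard code."""

    def __init__(self, num_shards: int = 1) -> None:
        # The man page gives 1 as the default when --num_shards is omitted.
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.num_shards = num_shards
        self.workers = []

    def connect_worker(self, name: str) -> None:
        """Accept a worker unless num_shards workers are already connected."""
        if len(self.workers) >= self.num_shards:
            raise RuntimeError("more than num_shards workers tried to connect")
        self.workers.append(name)

    def start_search(self) -> str:
        """Allow a search only when every shard has a connected worker."""
        if len(self.workers) < self.num_shards:
            raise RuntimeError("fewer than num_shards workers are connected")
        return "search started"


if __name__ == "__main__":
    master = ShardedMasterModel(num_shards=2)
    master.connect_worker("worker-a")
    try:
        master.start_search()
    except RuntimeError as err:
        print("error:", err)
    master.connect_worker("worker-b")
    print(master.start_search())
```

       In the model, a search succeeds only in the single state where the
       number of connected workers equals num_shards.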

SEE ALSO

       See hmmpgmd(1) for a description of the base hmmpgmd command and how
       the daemon should be used.

       See hmmer(1) for a master man page with a list of all the individual
       man pages for programs in the HMMER package.

       For complete documentation, see the user guide that came with your
       HMMER distribution (Userguide.pdf); or see the HMMER web page
       (http://hmmer.org/).

COPYRIGHT

       Copyright (C) 2020 Howard Hughes Medical Institute.
       Freely distributed under the BSD open source license.

       For additional information on copyright and licensing, see the file
       called COPYRIGHT in your HMMER source distribution, or see the HMMER
       web page (http://hmmer.org/).

AUTHOR

       http://eddylab.org

HMMER 3.3.2                        Nov 2020                   hmmpgmd_shard(1)