hmmpgmd_shard(1)                  HMMER Manual                  hmmpgmd_shard(1)

NAME
hmmpgmd_shard - sharded daemon for database search web services

SYNOPSIS
hmmpgmd_shard [options]

DESCRIPTION
The hmmpgmd_shard program provides a sharded version of the hmmpgmd
program that we use internally to implement high-performance HMMER
services that can be accessed via the internet. See the hmmpgmd man page
for a discussion of how the base hmmpgmd program is used. This man
page discusses only the differences between hmmpgmd_shard and hmmpgmd.
The base hmmpgmd program loads the entirety of its database file into
RAM on every worker node, even though each worker node searches only a
predictable fraction of the database(s) contained in that file when
performing searches. This wastes RAM, particularly when many worker
nodes are used to accelerate searches of large databases.

Hmmpgmd_shard addresses this by dividing protein sequence database
files into shards. Each worker node loads only 1/Nth of the database
file, where N is the number of worker nodes attached to the master.
HMM database files are not sharded, so every worker node loads the
entire HMM database file into RAM. Current HMM databases are much
smaller than current protein sequence databases, and easily fit into
the RAM of modern servers even without sharding.
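
The memory saving described above can be estimated with simple integer arithmetic. The following is a rough sketch only: the database sizes and worker count are hypothetical figures, the function names are invented for illustration, and the calculation assumes equal-sized shards and ignores any per-process overhead.

```shell
# Hypothetical figures, chosen only to illustrate the arithmetic.
seqdb_mib=65536   # cached protein sequence database size, in MiB
hmmdb_mib=2048    # cached HMM database size, in MiB
workers=8         # number of worker nodes (N)

# Per-worker footprint when every worker holds both whole files,
# as with the base hmmpgmd program.
# Usage: per_worker_unsharded SEQDB_MIB HMMDB_MIB
per_worker_unsharded() {
  echo $(( $1 + $2 ))
}

# Per-worker footprint when the sequence database is split into N
# equal shards (rounded up) and the HMM database is loaded whole.
# Usage: per_worker_sharded SEQDB_MIB HMMDB_MIB N
per_worker_sharded() {
  echo $(( ($1 + $3 - 1) / $3 + $2 ))
}

echo "unsharded: $(per_worker_unsharded "$seqdb_mib" "$hmmdb_mib") MiB per worker"
echo "sharded:   $(per_worker_sharded "$seqdb_mib" "$hmmdb_mib" "$workers") MiB per worker"
```

With these assumed figures, each worker holds 67584 MiB without sharding and 10240 MiB with sharding. The unsharded HMM database becomes the larger share of the footprint as N grows.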

Hmmpgmd_shard is used in the same manner as hmmpgmd, except that it
takes one additional argument: --num_shards <n>, which specifies the
number of shards that protein databases will be divided into, and
defaults to 1 if unspecified. This argument is only valid for the master
node of a hmmpgmd system (i.e., when --master is passed to the
hmmpgmd_shard program), and must be equal to the number of worker nodes
that will connect to the master node. Hmmpgmd_shard will signal an error
if more than num_shards worker nodes attempt to connect to the master
node or if a search is started when fewer than num_shards workers are
connected to the master.
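
The two error conditions above can be restated as a pair of small predicates. This is an illustrative model of the rule as described here, not HMMER code; the function names can_accept_worker and can_start_search are hypothetical.

```shell
# can_accept_worker NUM_SHARDS CONNECTED
# Succeeds if one more worker may connect without the total
# exceeding NUM_SHARDS.
can_accept_worker() {
  [ "$2" -lt "$1" ]
}

# can_start_search NUM_SHARDS CONNECTED
# Succeeds only when the number of connected workers equals
# NUM_SHARDS, so that every shard has a worker to search it.
can_start_search() {
  [ "$2" -eq "$1" ]
}

# Example: a master started with --num_shards 4 and 3 workers connected.
if can_accept_worker 4 3; then echo "a fourth worker may connect"; fi
if ! can_start_search 4 3; then echo "a search now would be an error"; fi
```

In this model, a master with fewer than num_shards workers is not itself in an error state. The error arises only if a search is started before the remaining workers have connected.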

OPTIONS
-h
    Help; print a brief reminder of command line usage and all
    available options.

--master
    Run as the master server.

--worker <s>
    Run as a worker, connecting to the master server that is running
    on IP address <s>.

--cport <n>
    Port to use for communication between clients and the master
    server. The default is 51371.

--wport <n>
    Port to use for communication between workers and the master
    server. The default is 51372.

--ccncts <n>
    Maximum number of client connections to accept. The default is
    16.

--wcncts <n>
    Maximum number of worker connections to accept. The default is
    32.

--pid <f>
    Name of file into which the process id will be written.

--seqdb <f>
    Name of the file (in hmmpgmd format) containing protein
    sequences. The contents of this file will be cached for searches.

--hmmdb <f>
    Name of the file containing protein HMMs. The contents of this
    file will be cached for searches.

--cpu <n>
    Number of parallel threads to use (for --worker).

--num_shards <n>
    Number of shards to divide cached sequence database(s) into.
    HMM databases are not sharded, due to their small size. This
    option is only valid when the --master option is present, and
    defaults to 1 if not specified. Hmmpgmd_shard requires that the
    number of shards be equal to the number of worker nodes, and
    will give errors if more than num_shards workers attempt to
    connect to the master node or if a search is started with fewer
    than num_shards workers connected to the master.
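
The options above might be combined as follows for a small deployment. This sketch only prints the command lines rather than running them. The database path, IP address, thread count, and worker count are hypothetical placeholders, and the helper functions master_cmd and worker_cmd are invented for illustration. Only options documented on this page are used.

```shell
# Hypothetical values; substitute your own path, address, and counts.
master_ip=192.0.2.10
seqdb=/data/seqdb.hmmpgmd
workers=4

# Print (not run) the master command line. --num_shards is set to the
# number of workers, as the option description requires.
# Usage: master_cmd SEQDB_FILE NUM_WORKERS
master_cmd() {
  echo "hmmpgmd_shard --master --seqdb $1 --num_shards $2"
}

# Print (not run) the command line for one worker.
# Usage: worker_cmd MASTER_IP THREADS
worker_cmd() {
  echo "hmmpgmd_shard --worker $1 --cpu $2"
}

master_cmd "$seqdb" "$workers"
i=0
while [ "$i" -lt "$workers" ]; do
  worker_cmd "$master_ip" 8
  i=$((i + 1))
done
```

In practice each worker command would be run on its own host. Deriving --num_shards from the same variable that controls how many workers are launched keeps the two counts from drifting apart.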

SEE ALSO
See hmmpgmd(1) for a description of the base hmmpgmd command and how
the daemon should be used.

See hmmer(1) for a master man page with a list of all the individual man
pages for programs in the HMMER package.

For complete documentation, see the user guide that came with your
HMMER distribution (Userguide.pdf); or see the HMMER web page
(http://hmmer.org/).

COPYRIGHT
Copyright (C) 2020 Howard Hughes Medical Institute.
Freely distributed under the BSD open source license.

For additional information on copyright and licensing, see the file
called COPYRIGHT in your HMMER source distribution, or see the HMMER
web page (http://hmmer.org/).

AUTHOR
http://eddylab.org

HMMER 3.3.2                         Nov 2020                    hmmpgmd_shard(1)