esl-construct(1) Easel Manual esl-construct(1)

esl-construct - describe or create a consensus secondary structure

esl-construct [options] msafile

esl-construct reports information on existing consensus secondary
structure annotation of an alignment, or derives new consensus secondary
structures based on structure annotation for individual aligned
sequences.

The alignment file must contain either individual sequence secondary
structure annotation (Stockholm #=GR SS), consensus secondary structure
annotation (Stockholm #=GC SS_cons), or both. All structure annotation
must be in WUSS notation (Vienna dot-parentheses notation will be
correctly interpreted). At present, the alignment file must be in
Stockholm format and contain RNA or DNA sequences.
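
To make concrete what a "basepair" means in such annotation, the
following Python sketch converts a WUSS or dot-parentheses string into
a list of paired alignment columns. It is an illustration only, not
Easel code: the function name wuss_to_pairs is hypothetical, and the
sketch assumes that only the bracket pairs <>, (), [] and {} denote
basepairs, that all bracket types share one nesting stack, and that
pseudoknot letters (Aa, Bb, ...) can be treated as unpaired.

```python
def wuss_to_pairs(ss):
    """Return sorted (i, j) basepairs, as 1-based columns, from a WUSS string.

    Assumption: every opening bracket pairs with the nearest unmatched
    closing bracket, regardless of bracket type. Any other character
    (including pseudoknot letters) is treated as unpaired.
    """
    closers = {">", ")", "]", "}"}
    openers = {"<", "(", "[", "{"}
    stack = []   # columns of currently unmatched opening brackets
    pairs = []
    for col, ch in enumerate(ss, start=1):
        if ch in openers:
            stack.append(col)
        elif ch in closers:
            if not stack:
                raise ValueError(f"unmatched closing bracket at column {col}")
            pairs.append((stack.pop(), col))
    if stack:
        raise ValueError(f"unmatched opening bracket at column {stack[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    print(wuss_to_pairs("<<<..>>>"))
    print(wuss_to_pairs("((.[[..]].))"))
```

The same function accepts Vienna-style strings such as "((..))"
because parentheses are one of the bracket types it recognizes.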

By default, esl-construct lists the sequences in the alignment that
have structure annotation and the number of basepairs in those
structures. If the alignment also contains consensus structure
annotation, the default output also lists how many of the individual
basepairs overlap with the consensus basepairs and how many conflict
with a consensus basepair.

For the purposes of this miniapp, a basepair 'conflict' exists between
two basepairs in different structures, one between columns i and j and
the other between columns k and l, if (i == k and j != l) or (j == l
and i != k).
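
The conflict definition above can be written as a small predicate. The
Python sketch below is an illustration, not Easel code: the names
conflicts and count_overlap_and_conflict are hypothetical, basepairs
are assumed to be (left, right) column tuples, and "overlap" is assumed
to mean that an individual basepair is identical to a consensus
basepair.

```python
def conflicts(bp1, bp2):
    """True if basepairs (i, j) and (k, l) conflict as defined in the text."""
    i, j = bp1
    k, l = bp2
    return (i == k and j != l) or (j == l and i != k)


def count_overlap_and_conflict(indi_pairs, cons_pairs):
    """Count individual basepairs that match, or conflict with, the consensus.

    Assumption: an individual basepair is counted once as conflicting
    if it conflicts with at least one consensus basepair.
    """
    cons = set(cons_pairs)
    overlap = sum(1 for bp in indi_pairs if bp in cons)
    conflict = sum(
        1 for bp in indi_pairs if any(conflicts(bp, c) for c in cons)
    )
    return overlap, conflict


if __name__ == "__main__":
    consensus = [(1, 20), (2, 19), (3, 18)]
    individual = [(1, 20), (2, 17), (4, 18), (5, 15)]
    print(count_overlap_and_conflict(individual, consensus))
```

Note that the definition, taken literally, compares only left column
with left column and right column with right column, so the sketch does
the same.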

esl-construct can also be used to derive a new consensus structure
based on structure annotation for individual sequences in the alignment
by using any of the following options: -x, -r, -c, --indi <s>, --ffreq
<x>, --fmin. These are described below. All of these options require
that the -o <f> option also be used to specify the new alignment file
<f> to create. Differences between the new alignment(s) and the input
alignment(s) will be limited to the consensus secondary structure
(#=GC SS_cons) annotation and possibly reference (#=GC RF) annotation.

-h     Print brief help; includes version number and summary of all
       options, including expert options.

-a     Print to the screen all alignment positions that are involved in
       at least one conflicting basepair in at least one sequence, then
       exit.

-v     Be verbose; with no other options, list individual sequence
       basepair conflicts as well as summary statistics.

-x     Compute a new consensus structure as the maximally sized set of
       basepairs (greatest number of basepairs) chosen from all
       individual structures that contains 0 conflicts. Output the
       alignment with the new SS_cons annotation. This option must be
       used in combination with the -o option.

-r     Remove any consensus basepairs that conflict with >= 1
       individual basepair and output the alignment with the new
       SS_cons annotation. This option must be used in combination with
       the -o option.

-c     Define a new consensus secondary structure as the individual
       structure annotation that has the maximum number of consistent
       basepairs with the existing consensus secondary structure
       annotation. This option must be used in combination with the -o
       option.

--rfc  With -c, set the reference annotation (#=GC RF) as the sequence
       whose individual structure becomes the consensus structure.

--indi <s>
       Define a new consensus secondary structure as the individual
       structure annotation from the sequence named <s>. This option
       must be used in combination with the -o option.

--rfindi
       With --indi <s>, set the reference annotation (#=GC RF) as the
       sequence named <s>.

--ffreq <x>
       Define a new consensus structure as the set of basepairs between
       columns i:j that are paired in more than a fraction <x> of the
       individual sequence structures. This option must be used in
       combination with the -o option.

--fmin Same as --ffreq <x> except find the minimal <x> that gives a
       consistent consensus structure. A consistent structure has each
       base (alignment position) as a member of at most 1 basepair.

-o <f> Output the alignment(s) with new consensus structure annotation
       to file <f>.

--pfam With -o, specify that the alignment output format be Pfam
       format, a special type of non-interleaved Stockholm in which
       each sequence appears on a single line.

-l <f> Create a new file <f> that lists the sequences that have at
       least one basepair that conflicts with a consensus basepair.

--lmax <n>
       With -l, list in the list file only those sequences that have
       more than <n> basepairs that conflict with the consensus
       structure.
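
The frequency-based options can be illustrated with a short Python
sketch. It is not Easel's implementation: the function names are
hypothetical, each individual structure is assumed to be a set of
(i, j) column pairs, and the fraction is assumed to be taken over the
sequences that carry structure annotation. The --fmin search is
interpreted here as looking for the smallest threshold that still
yields a consistent structure, on the reasoning that any sufficiently
large <x> trivially gives a consistent (possibly empty) structure.

```python
from collections import Counter


def pair_counts(structures):
    """Count in how many individual structures each basepair occurs."""
    return Counter(bp for pairs in structures for bp in set(pairs))


def ffreq_consensus(structures, x):
    """Basepairs present in more than fraction x of the structures."""
    n = len(structures)
    return {bp for bp, c in pair_counts(structures).items() if c / n > x}


def is_consistent(pairs):
    """True if every column belongs to at most one basepair."""
    cols = [col for bp in pairs for col in bp]
    return len(cols) == len(set(cols))


def fmin_threshold(structures):
    """Smallest candidate x whose ffreq_consensus is consistent.

    The selected set only changes at observed frequencies, so it is
    enough to try 0 and each observed frequency in ascending order.
    """
    n = len(structures)
    candidates = {0.0} | {c / n for c in pair_counts(structures).values()}
    for x in sorted(candidates):
        if is_consistent(ffreq_consensus(structures, x)):
            return x
    return 1.0  # not reached for non-empty input: the top frequency selects nothing


if __name__ == "__main__":
    structures = [
        {(1, 10), (2, 9)},
        {(1, 10), (2, 8)},
        {(1, 10), (2, 9)},
    ]
    print(sorted(ffreq_consensus(structures, 0.5)))
    print(fmin_threshold(structures))
```

In the example, column 2 is paired with column 9 in two structures and
with column 8 in one, so a threshold of 0 gives an inconsistent
structure and the search moves on to the next observed frequency.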

http://bioeasel.org/

Copyright (C) 2020 Howard Hughes Medical Institute.
Freely distributed under the BSD open source license.

http://eddylab.org

Easel 0.48 Nov 2020 esl-construct(1)