esl-construct(1)                 Easel Manual                 esl-construct(1)

NAME

       esl-construct - describe or create a consensus secondary structure

SYNOPSIS

       esl-construct [options] msafile

DESCRIPTION

       esl-construct reports information on existing consensus secondary
       structure annotation of an alignment or derives new consensus secondary
       structures based on structure annotation for individual aligned
       sequences.

       The alignment file must contain either individual sequence secondary
       structure annotation (Stockholm #=GR SS), consensus secondary structure
       annotation (Stockholm #=GC SS_cons), or both. All structure annotation
       must be in WUSS notation (Vienna dot-parentheses notation will be
       correctly interpreted). At present, the alignment file must be in
       Stockholm format and contain RNA or DNA sequences.

       By default, esl-construct lists the sequences in the alignment that
       have structure annotation and the number of basepairs in those
       structures. If the alignment also contains consensus structure
       annotation, the default output also lists how many of the individual
       basepairs overlap with the consensus basepairs and how many conflict
       with a consensus basepair.

       For the purposes of this miniapp, a basepair 'conflict' exists between
       two basepairs in different structures, one between columns i and j and
       the other between columns k and l, if (i == k and j != l) or (j == l
       and i != k).

       esl-construct can also be used to derive a new consensus structure
       based on structure annotation for individual sequences in the alignment
       by using any of the following options: -x, -r, -c, --indi <s>, --ffreq
       <x>, --fmin. These are described below. All of these options require
       that the -o <f> option be used as well, to specify that a new alignment
       file <f> be created. Differences between the new alignment(s) and the
       input alignment(s) will be limited to the consensus secondary structure
       (#=GC SS_cons) annotation and possibly reference (#=GC RF) annotation.
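       The overlap and conflict counts described above can be illustrated
       with a short Python sketch. This is not the esl-construct
       implementation. The function names (pairs_from_ss, conflicts, tally)
       are hypothetical, structures are assumed to be nested strings that use
       only the bracket pairs <>, (), [] and {} (pseudoknot letters are
       treated as unpaired), and "overlap" is assumed to mean that the same
       basepair occurs in both structures.

```python
# Illustrative sketch only; names and simplifications are assumptions,
# not taken from the esl-construct source.
OPEN_TO_CLOSE = {"<": ">", "(": ")", "[": "]", "{": "}"}
CLOSERS = set(OPEN_TO_CLOSE.values())


def pairs_from_ss(ss):
    """Return the set of basepairs (i, j), 1-based columns with i < j."""
    stack = []
    pairs = set()
    for col, ch in enumerate(ss, start=1):
        if ch in OPEN_TO_CLOSE:
            stack.append((col, ch))
        elif ch in CLOSERS:
            if not stack:
                raise ValueError(f"unmatched '{ch}' at column {col}")
            left_col, left_ch = stack.pop()
            if OPEN_TO_CLOSE[left_ch] != ch:
                raise ValueError(f"mismatched pair at columns {left_col}, {col}")
            pairs.add((left_col, col))
        # Any other character is treated as an unpaired column.
    if stack:
        raise ValueError(f"unmatched '{stack[-1][1]}' at column {stack[-1][0]}")
    return pairs


def conflicts(bp1, bp2):
    """Conflict rule from the text: same left column with different right
    columns, or same right column with different left columns."""
    i, j = bp1
    k, l = bp2
    return (i == k and j != l) or (j == l and i != k)


def tally(indi_pairs, cons_pairs):
    """Count individual basepairs that are identical to a consensus basepair
    and those that conflict with at least one consensus basepair."""
    overlap = sum(1 for bp in indi_pairs if bp in cons_pairs)
    conflict = sum(
        1 for bp in indi_pairs if any(conflicts(bp, c) for c in cons_pairs)
    )
    return overlap, conflict


cons = pairs_from_ss("<<<...>>>.")   # consensus: (1,9), (2,8), (3,7)
indi = pairs_from_ss("<<<..>>>..")   # individual: (1,8), (2,7), (3,6)
print(tally(indi, cons))             # prints (0, 3)
```

       In this example every individual basepair shares its left column with
       a consensus basepair but closes one column earlier, so none overlap
       and all three conflict.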

OPTIONS

       -h     Print brief help; includes version number and summary of all
              options, including expert options.

       -a     List to the screen all alignment positions that are involved in
              at least one conflicting basepair in at least one sequence, and
              then exit.

       -v     Be verbose; with no other options, list individual sequence
              basepair conflicts as well as summary statistics.

       -x     Compute a new consensus structure as the maximally sized set of
              basepairs (greatest number of basepairs) chosen from all
              individual structures that contains 0 conflicts. Output the
              alignment with the new SS_cons annotation. This option must be
              used in combination with the -o option.

       -r     Remove any consensus basepairs that conflict with >= 1
              individual basepair and output the alignment with the new
              SS_cons annotation. This option must be used in combination
              with the -o option.

       -c     Define a new consensus secondary structure as the individual
              structure annotation that has the maximum number of consistent
              basepairs with the existing consensus secondary structure
              annotation. This option must be used in combination with the -o
              option.

       --rfc  With -c, set the reference annotation (#=GC RF) as the sequence
              whose individual structure becomes the consensus structure.

       --indi <s>
              Define a new consensus secondary structure as the individual
              structure annotation from sequence named <s>. This option must
              be used in combination with the -o option.

       --rfindi
              With --indi <s>, set the reference annotation (#=GC RF) as the
              sequence named <s>.

       --ffreq <x>
              Define a new consensus structure as the set of basepairs between
              columns i:j that are paired in more than <x> fraction of the
              individual sequence structures. This option must be used in
              combination with the -o option.

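       The frequency rule behind --ffreq can be sketched in Python as
       follows. This is an illustration under stated assumptions, not the
       program's code: the function name ffreq_consensus is hypothetical,
       each individual structure is given as a set of (i, j) column pairs,
       and the fraction is taken over the number of structures supplied
       (how esl-construct treats sequences without structure annotation is
       not specified here).

```python
from collections import Counter


def ffreq_consensus(structures, x):
    """Keep basepairs found in more than fraction x of the structures.

    structures: list of sets of (i, j) basepairs (hypothetical input form).
    """
    if not structures:
        return set()
    # Count, for each basepair, how many structures contain it.
    counts = Counter(bp for pairs in structures for bp in set(pairs))
    n = len(structures)
    return {bp for bp, c in counts.items() if c / n > x}


structures = [
    {(1, 9), (2, 8), (3, 7)},
    {(1, 9), (2, 8)},
    {(1, 9), (2, 7)},
]
print(sorted(ffreq_consensus(structures, 0.5)))  # prints [(1, 9), (2, 8)]
```

       With a lower threshold such as 0.3, both (2, 8) and (2, 7) pass the
       filter in this example, so column 2 would belong to two basepairs.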
       --fmin Same as --ffreq <x> except find the maximal <x> that gives a
              consistent consensus structure. A consistent structure has each
              base (alignment position) as a member of at most 1 basepair.

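       The consistency condition stated for --fmin can be checked with a few
       lines of Python. This is a minimal sketch, assuming a structure is
       given as a set of (i, j) column pairs; the function name
       is_consistent is hypothetical and is not part of Easel.

```python
from collections import Counter


def is_consistent(pairs):
    """True if every alignment column belongs to at most one basepair."""
    uses = Counter(col for bp in pairs for col in bp)
    return all(n <= 1 for n in uses.values())


print(is_consistent({(1, 9), (2, 8)}))          # prints True
print(is_consistent({(1, 9), (2, 8), (2, 7)}))  # prints False
```

       A check of this kind could be applied to the basepair set obtained at
       each candidate threshold <x>.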
       -o <f> Output the alignment(s) with new consensus structure annotation
              to file <f>.

       --pfam With -o, specify that the alignment output format be Pfam
              format, a special type of non-interleaved Stockholm in which
              each sequence appears on a single line.

       -l <f> Create a new file <f> that lists the sequences that have at
              least one basepair that conflicts with a consensus basepair.

       --lmax <n>
              With -l, list in the list file only sequences that have more
              than <n> basepairs that conflict with the consensus structure.

SEE ALSO

       http://bioeasel.org/

COPYRIGHT

       Copyright (C) 2020 Howard Hughes Medical Institute.
       Freely distributed under the BSD open source license.

AUTHOR

       http://eddylab.org

Easel 0.48                         Nov 2020                   esl-construct(1)