esl-ssdraw(1)

esl-ssdraw(1)                    Easel Manual                    esl-ssdraw(1)

NAME

       esl-ssdraw - create postscript secondary structure diagrams

SYNOPSIS

       esl-ssdraw [options] msafile postscript_template postscript_output_file

DESCRIPTION

       esl-ssdraw reads an existing template consensus secondary structure
       diagram from postscript_template and creates new postscript diagrams
       including the template structure but with positions colored
       differently based on alignment statistics such as frequency of gaps
       per position, average posterior probability per position or
       information content per position. Additionally, all or some of the
       aligned sequences can be drawn separately, with nucleotides or
       posterior probabilities mapped onto the corresponding positions of
       the consensus structure.

       The alignment must be in Stockholm format with per-column reference
       annotation (#=GC RF). The sequences in the alignment must be RNA or
       DNA sequences. The postscript_template file must contain one page
       that includes <rflen> consensus nucleotides (positions), where
       <rflen> is the number of nongap characters in the reference (RF)
       annotation of the first alignment in msafile. The specific format
       required in the postscript_template is described below in the INPUT
       section. Postscript diagrams will only be created for the first
       alignment in msafile.

OUTPUT

       By default (if run with zero command line options), esl-ssdraw will
       create a six or seven page postscript_output_file, with each page
       displaying a different alignment statistic. These pages display the
       alignment consensus sequence, information content per position,
       mutual information per position, frequency of inserts per position,
       average length of inserts per position, frequency of deletions
       (gaps) per position, and average posterior probability per position
       (if posterior probabilities exist in the alignment). If -d is
       enabled, all of these pages plus additional ones, such as individual
       sequences (see discussion of --indi below), will be drawn. These
       pages can be selected to be drawn individually by using the command
       line options --cons, --info, --mutinfo, --ifreq, --iavglen, --dall,
       and --prob. The calculation of the statistics for each of these
       options is discussed below in the description for each option.
       Importantly, only so-called 'consensus' positions of the alignment
       will be drawn. A consensus position is one that is a nongap
       nucleotide in the 'reference' annotation of the Stockholm alignment
       (#=GC RF) read from msafile.

       By default, a consensus sequence for the input alignment will be
       calculated and displayed on the alignment statistic diagrams. The
       consensus sequence is defined as the most common nucleotide at each
       consensus position of the alignment. The consensus sequence will not
       be displayed if the --no-cnt option is used. The --cthresh, --cambig,
       and --athresh options affect the definition of the consensus
       sequence as explained below in the descriptions for those options.

       If the --tabfile <f> option is used, a tab-delimited text file <f>
       will be created that includes per-position lists of the numerical
       values for each of the calculated statistics that were drawn to
       postscript_output_file. Comment lines in <f> are prefixed with a '#'
       character and explain the meaning of each of the tab-delimited
       columns and how each of the statistics was calculated.

       If --indi is used, esl-ssdraw will create diagrams showing each
       sequence in the alignment on a separate page, with aligned
       nucleotides in their corresponding position in the structure
       diagram. By default, basepaired nucleotides will be colored based on
       their basepair type: either Watson-Crick (A:U, U:A, C:G, or G:C),
       G:U or U:G, or non-canonical (the other ten possible basepairs).
       This coloring can be turned off with the --no-bp option. Also by
       default, nucleotides that differ from the most common nucleotide at
       each aligned consensus position will be outlined. If the most common
       nucleotide occurs in more than 75% of sequences that do not have a
       gap at that position, the outline will be bold. Outlining can be
       turned off with the --no-ol option.

       With --indi, if the alignment contains posterior probability
       annotation (#=GR PP), the postscript_output_file will contain an
       additional page for each sequence drawn with positions colored by
       the posterior probability of each aligned nucleotide. No posterior
       probability pages will be drawn if the --no-pp option is used.

       esl-ssdraw can also be used to draw 'mask' diagrams which color
       positions of the structure one of two colors depending on whether
       they are included or excluded by a mask. This is enabled with the
       --mask-col <f> option. <f> must contain a single line of <rflen>
       characters, where <rflen> is the number of nongap RF characters in
       the alignment. The line must contain only '0' and '1' characters. A
       '0' at position <x> of the string indicates position <x> is excluded
       from the mask, and a '1' indicates position <x> is included by the
       mask. A page comparing the overlap of the <f> mask from --mask-col
       and another mask in <f2> will be created if the --mask-diff <f2>
       option is used.

       If the --mask <f> option is used, positions excluded by the mask in
       <f> will be drawn differently (as open circles by default) than
       positions included by the mask. The style of the masked positions
       can be modified with the --mask-u, --mask-x, and --mask-a options.

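       The mask file format described above is simple enough to generate
       with a few lines of code. The following Python sketch is only an
       illustration and is not part of Easel: the function name mask_line
       is hypothetical, and it assumes that consensus positions are
       numbered from 1.

```python
def mask_line(included, rflen):
    """Return a mask string of rflen '0'/'1' characters.

    `included` is an iterable of consensus positions kept by the mask.
    Positions are assumed to be 1-based (an assumption of this sketch).
    """
    keep = set(included)
    for pos in keep:
        if not 1 <= pos <= rflen:
            raise ValueError(f"position {pos} is outside 1..{rflen}")
    # '1' marks an included position, '0' an excluded one.
    return "".join("1" if pos in keep else "0" for pos in range(1, rflen + 1))


def write_mask(path, included, rflen):
    """Write the mask as the single line that a mask file must contain."""
    with open(path, "w") as fh:
        fh.write(mask_line(included, rflen) + "\n")
```

       For example, mask_line([1, 2, 5], 6) yields the string '110010',
       which excludes consensus positions 3, 4 and 6.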
       Finally, two different types of input files can be used to customize
       output diagrams using the --dfile and --efile options, as described
       below.

INPUT

       The postscript_template_file is a postscript file that must be in a
       very specific format in order for esl-ssdraw to work. The specifics
       of the format, described below, are likely to change in future
       versions of esl-ssdraw. The postscript_output_file files generated
       by esl-ssdraw will not be valid postscript_template_file format
       (i.e. an output file from esl-ssdraw cannot be used as a
       postscript_template_file in a subsequent run of the program).

       An example postscript_template_file ('trna-ssdraw.ps') is included
       with the Easel distribution in the 'testsuite/' subdirectory of the
       top-level 'easel' directory.

       The postscript_template_file is a valid postscript file. It includes
       postscript commands for drawing a secondary structure. The commands
       specify x and y coordinates for placing each nucleotide on the page.
       The postscript_template_file might also contain commands for drawing
       lines connecting basepaired positions and tick marks indicating
       every tenth position, though these are not required, as explained
       below.

       If you are unfamiliar with the postscript language, it may be useful
       for you to know that a postscript page is, by default, 612 points
       wide and 792 points tall. The (0,0) coordinate of a postscript file
       is at the bottom left corner of the page, (0,792) is the top left,
       (612,0) is the bottom right, and (612,792) is the top right.
       esl-ssdraw uses 8 point by 8 point cells for drawing positions of
       the consensus secondary structure. The 'scale' section of the
       postscript_template_file allows for different 'zoom levels', as
       described below. Also, it is important to know that postscript lines
       beginning with '%' are considered comments and do not include
       postscript commands.

       An esl-ssdraw postscript_template_file contains n >= 1 pages, each
       specifying a consensus secondary structure diagram. Each page is
       delimited by a 'showpage' line in an 'ignore' section (as described
       below). esl-ssdraw will read all pages of the
       postscript_template_file and then choose the appropriate one that
       corresponds with the alignment in msafile based on the consensus
       (nongap RF) length of the alignment. For an alignment of consensus
       length <rflen>, the first page of postscript_template_file that has
       a structure diagram with consensus length <rflen> will be used as
       the template structure for the alignment.

       Each page of postscript_template_file contains blocks of text
       organized into nine different possible sections. Each section must
       begin with a single line '% begin <sectionname>' and end with a
       single line '% end <sectionname>' and have n >= 1 lines in between.
       On the begin and end lines, there must be at least one space between
       the '%' and the 'begin' or 'end'. <sectionname> must be one of the
       following: 'modelname', 'legend', 'scale', 'regurgitate', 'ignore',
       'text positiontext', 'text nucleotides', 'lines positionticks', or
       'lines bpconnects'. The n >= 1 lines in between the begin and end
       lines of each section must be in a specific format that differs for
       each section as described below.

       Importantly, each page must end with an 'ignore' section that
       includes a single line 'showpage' between the begin and end lines.
       This lets esl-ssdraw know that a page has ended and another might
       follow.

       Each page of a postscript_template_file must include a single
       'modelname' section. This section must include exactly one line in
       between its begin and end lines. This line must begin with a '%'
       character followed by a single space. The remainder of the line will
       be parsed as the model name and will appear on each page of
       postscript_output_file in the header section. If the name is more
       than 16 characters, it will be truncated in the output.

       Each page of a postscript_template_file must include a single
       'legend' section. This section must include exactly one line in
       between its begin and end lines. This line must be formatted as
       '% <d1> <f1> <f2> <d2> <f3>', where <d1> is an integer specifying
       the consensus position with relation to which the legend will be
       placed; <f1> and <f2> specify the x and y axis offsets for the top
       left corner of the legend relative to the x and y position of
       consensus position <d1>; <d2> specifies the size of a cell in the
       legend and <f3> specifies how many extra points should be between
       the right hand edge of the legend and the end of the page. For
       example, the line '% 34 -40. -30. 12 0.' specifies that the legend
       be placed 40 points to the left and 30 points below the 34th
       consensus position, that cells appearing in the legend be squares of
       size 12 points by 12 points, and that the right hand side of the
       legend be flush against the right hand edge of the printable page.

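       To make the five fields of the legend line concrete, here is a
       minimal Python sketch that splits such a line into named values. It
       is an illustration only and does not reproduce esl-ssdraw's own
       parser; the names Legend and parse_legend_line are hypothetical.

```python
from typing import NamedTuple


class Legend(NamedTuple):
    position: int        # <d1>: consensus position the legend is placed relative to
    x_offset: float      # <f1>: x offset of the legend's top left corner
    y_offset: float      # <f2>: y offset of the legend's top left corner
    cell_size: int       # <d2>: size of one legend cell, in points
    right_margin: float  # <f3>: extra points between legend and page edge


def parse_legend_line(line):
    """Parse a line of the form '% <d1> <f1> <f2> <d2> <f3>'."""
    tokens = line.split()
    if len(tokens) != 6 or tokens[0] != "%":
        raise ValueError("expected '% <d1> <f1> <f2> <d2> <f3>'")
    return Legend(int(tokens[1]), float(tokens[2]), float(tokens[3]),
                  int(tokens[4]), float(tokens[5]))
```

       Applied to the example line '% 34 -40. -30. 12 0.', this gives
       position 34, offsets -40.0 and -30.0, a cell size of 12 and a right
       margin of 0.0.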
       Each page of a postscript_template_file must include a single
       'scale' section. This section must include exactly one line in
       between its begin and end lines. This line must be formatted as
       '<f1> <f2> scale', where <f1> and <f2> are both positive real
       numbers that are identical; for example, '1.7 1.7 scale' is valid,
       but '1.7 2.7 scale' is not. This line is a valid postscript command
       which specifies the scale or zoom level on the pages in the output.
       If <f1> and <f2> are '1.0', the default scale is used, for which the
       total size of the page is 612 points wide and 792 points tall. A
       scale of 2.0 will reduce this to 306 points wide by 396 points tall.
       A scale of 0.5 will increase it to 1224 points wide by 1584 points
       tall. A single cell corresponding to one position of the secondary
       structure is 8 points by 8 points. For larger RNAs, a scale of less
       than 1.0 is appropriate (for example, SSU rRNA models (about 1500
       nt) use a scale of about 0.6), and for smaller RNAs, a scale of more
       than 1.0 might be desirable (tRNA (about 70 nt) uses a scale of
       1.7). The best way to determine the exact scale to use is trial and
       error.

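       The relationship between the scale and the usable page size is a
       simple division, which the following Python sketch spells out. The
       function names are hypothetical and the code is not taken from
       esl-ssdraw; it only restates the arithmetic given above.

```python
PAGE_WIDTH = 612.0   # default postscript page width, in points
PAGE_HEIGHT = 792.0  # default postscript page height, in points


def parse_scale_line(line):
    """Return the scale from a '<f1> <f2> scale' line, checking <f1> == <f2>."""
    tokens = line.split()
    if len(tokens) != 3 or tokens[2] != "scale":
        raise ValueError("expected '<f1> <f2> scale'")
    f1, f2 = float(tokens[0]), float(tokens[1])
    if f1 <= 0 or f1 != f2:
        raise ValueError("<f1> and <f2> must be identical positive numbers")
    return f1


def scaled_page(scale):
    """Return (width, height) of the page in scaled points."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return PAGE_WIDTH / scale, PAGE_HEIGHT / scale
```

       For instance, scaled_page(2.0) returns (306.0, 396.0) and
       scaled_page(0.5) returns (1224.0, 1584.0), matching the figures
       quoted above. Dividing the scaled width by 8 gives a rough upper
       bound on how many cells fit across the page.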
       Each page of a postscript_template_file can include n >= 0
       'regurgitate' sections. These sections can include any number of
       lines. The text in this section will not be parsed by esl-ssdraw but
       will be included in each page of postscript_output_file. The format
       of the lines in this section must therefore be valid postscript
       commands. An example of content that might be in a regurgitate
       section is commands to draw lines and text annotating the anticodon
       on a tRNA secondary structure diagram.

       Each page of a postscript_template_file must include at least 1
       'ignore' section. One of these sections must include a single line
       that reads 'showpage'. This section should be placed at the end of
       each page of the template file. Other ignore sections can include
       any number of lines. The text in these sections will not be parsed
       by esl-ssdraw nor will it be included in each page of
       postscript_output_file. An ignore section can contain comments or
       postscript commands that draw features of the
       postscript_template_file that are unwanted in the
       postscript_output_file.

       Each page of a postscript_template_file must include a single 'text
       nucleotides' section. This section must include exactly <rflen>
       lines, indicating that the consensus secondary structure has exactly
       <rflen> nucleotide positions. Each line must be of the format
       '(<c>) <x> <y> moveto show' where <c> is a nucleotide (this can
       actually be any character), and <x> and <y> are the coordinates
       specifying the location of the nucleotide on the page; they should
       be positive real numbers. The best way to determine what these
       coordinates should be is manually by trial and error, by inspecting
       the resulting structure as you add each nucleotide. Note that
       esl-ssdraw will color an 8 point by 8 point cell for each position,
       so nucleotides should be placed about 8 points apart from each
       other.

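       As a starting point for the trial-and-error placement described
       above, the lines of a 'text nucleotides' section can be generated
       rather than typed by hand. The Python sketch below is hypothetical:
       it lays the nucleotides out on one horizontal row, 8 points apart,
       which is not a real secondary structure layout but produces lines in
       the stated format that can then be edited.

```python
def nucleotide_section(seq, x0=100.0, y0=400.0, step=8.0):
    """Return the lines of a 'text nucleotides' section for `seq`.

    Nucleotides are placed left to right starting at (x0, y0), `step`
    points apart. The straight-line layout and the default coordinates
    are arbitrary choices made for this illustration.
    """
    lines = ["% begin text nucleotides"]
    for i, nt in enumerate(seq):
        x = x0 + i * step
        lines.append(f"({nt}) {x:.2f} {y0:.2f} moveto show")
    lines.append("% end text nucleotides")
    return lines
```

       The section contains exactly len(seq) nucleotide lines between its
       begin and end lines, so len(seq) must equal <rflen> for the page to
       match the alignment.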
       Each page of a postscript_template_file may or may not include a
       single 'text positiontext' section. This section can include n >= 1
       lines, each specifying text to be placed next to specific positions
       of the structure, for example, to number them. Each line must be of
       the format '(<s>) <x> <y> moveto show' where <s> is a string of text
       to place at coordinates (<x>,<y>) of the postscript page. Currently,
       the best way to determine what these coordinates should be is
       manually by trial and error, by inspecting the resulting diagram as
       you add each line.

       Each page of a postscript_template_file may or may not include a
       single 'lines positionticks' section. This section can include
       n >= 1 lines, each specifying the location of a tick mark on the
       diagram. Each line must be of the format
       '<x1> <y1> <x2> <y2> moveto show'. A tick mark (line of width 2.0)
       will be drawn from point (<x1>,<y1>) to point (<x2>,<y2>) on each
       page of postscript_output_file. Currently, the best way to determine
       what these coordinates should be is manually by trial and error, by
       inspecting the resulting diagram as you add each line.

       Each page of a postscript_template_file may or may not include a
       single 'lines bpconnects' section. This section must include <nbp>
       lines, where <nbp> is the number of basepairs in the consensus
       structure of the input msafile annotated as #=GC SS_cons. Each line
       should connect two basepaired positions in the consensus structure
       diagram. Each line must be of the format
       '<x1> <y1> <x2> <y2> moveto show'. A line will be drawn from point
       (<x1>,<y1>) to point (<x2>,<y2>) on each page of
       postscript_output_file. Currently, the best way to determine what
       these coordinates should be is manually by trial and error, by
       inspecting the resulting diagram as you add each line.

REQUIRED MEMORY

       The memory required by esl-ssdraw will be equal to roughly the
       larger of 2 Mb and the size of the first alignment in msafile. If
       the --small option is used, the memory required will be independent
       of the alignment size. To use --small the alignment must be in Pfam
       format, a non-interleaved (1 line/seq) version of Stockholm format.

       If the --indi option is used, the required memory may exceed the
       size of the alignment by up to ten-fold, and the output
       postscript_output_file may be up to 50 times larger than the
       msafile.

OPTIONS

       -h     Print brief help; includes version number and summary of all
              options, including expert options.

       -d     Draw the default set of alignment summary diagrams: consensus
              sequence, information content, mutual information, insert
              frequency, average insert length, deletion frequency, and
              average posterior probability (if posterior probability
              annotation exists in the alignment). These diagrams are also
              drawn by default (if zero command line options are used), but
              using the -d option allows the user to add additional pages,
              such as individual aligned sequences with --indi.

       --mask <f>
              Read the mask from file <f>, and draw positions differently
              in postscript_output_file depending on whether they are
              included or excluded by the mask. <f> must contain a single
              line of length <rflen> with only '0' and '1' characters.
              <rflen> is the number of nongap characters in the reference
              (#=GC RF) annotation of the first alignment in msafile. A '0'
              at position <x> of the mask indicates position <x> is
              excluded by the mask, and a '1' indicates that position <x>
              is included by the mask.

       --small
              Operate in memory saving mode. Without --indi, required RAM
              will be independent of the size of the alignment in msafile.
              With --indi, the required RAM will be roughly ten times the
              size of the alignment in msafile. For --small to work, the
              alignment must be in Pfam Stockholm (non-interleaved 1
              line/seq) format.

       --rf   Add a page to postscript_output_file showing the reference
              sequence from the #=GC RF annotation in msafile. By default,
              basepaired nucleotides will be colored based on what type of
              basepair they are. To turn this off, use --no-bp. This page
              is drawn by default (if zero command-line options are used).

       --info Add a page to postscript_output_file with consensus (nongap
              RF) positions colored based on their information content
              from the alignment. Information content is calculated as
              2.0 - H, where H = - sum_x p_x log_2 p_x for x in {A,C,G,U}.
              This page is drawn by default (if zero command-line options
              are used).

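       The information content calculation can be sketched in a few lines
       of Python. This is an illustration under stated assumptions, not
       the code used by esl-ssdraw: H is taken to be the Shannon entropy
       (with the leading minus sign), gaps and any characters other than
       A, C, G, U are simply skipped, T is treated as U, and a column with
       no countable nucleotides is given a value of 0.0. The function name
       is hypothetical.

```python
import math


def information_content(column):
    """Return 2.0 - H for one alignment column given as a string.

    H is the Shannon entropy (in bits) of the A/C/G/U frequencies.
    Gaps and other characters are ignored (an assumption of this sketch).
    """
    counts = {nt: 0 for nt in "ACGU"}
    for ch in column.upper().replace("T", "U"):
        if ch in counts:
            counts[ch] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no nucleotides to count; a convention chosen here
    entropy = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            entropy -= p * math.log2(p)
    return 2.0 - entropy
```

       A perfectly conserved column such as 'AAAA' gives 2.0 bits, a column
       split evenly between two nucleotides gives 1.0 bit, and a column
       with all four nucleotides at equal frequency gives 0.0 bits.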
       --mutinfo
              Add a page to postscript_output_file with basepaired
              consensus (nongap RF) positions colored based on the amount
              of mutual information they have in the alignment. Mutual
              information is
              sum_{x,y} p_{x,y} log_2 (p_{x,y} / (p_x * p_y)), where x and
              y are the four possible bases A,C,G,U. p_x is the fraction of
              aligned sequences that have nucleotide x in the left half (5'
              half) of the basepair. p_y is the fraction of aligned
              sequences that have nucleotide y in the position
              corresponding to the right half (3' half) of the basepair.
              And p_{x,y} is the fraction of aligned sequences that have
              basepair x:y. For all p_x, p_y and p_{x,y}, only sequences
              that have a nongap nucleotide at both the left and right half
              of the basepair are counted. This page is drawn by default
              (if zero command-line options are used).

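       As an illustration of the mutual information calculation for one
       basepair, the following Python sketch uses the standard definition,
       sum over x,y of p_{x,y} log_2 (p_{x,y} / (p_x * p_y)), and counts
       only sequences with a nucleotide in both halves of the pair. It is
       not the esl-ssdraw implementation; the function name is
       hypothetical, and treating any character outside A, C, G, U as a
       gap is an assumption.

```python
import math
from collections import Counter


def mutual_information(left, right):
    """Mutual information (bits) between two paired alignment columns.

    `left` and `right` are equal-length strings holding, per sequence,
    the character at the 5' and 3' half of the basepair.
    """
    pairs = [(x, y) for x, y in zip(left.upper(), right.upper())
             if x in "ACGU" and y in "ACGU"]
    n = len(pairs)
    if n == 0:
        return 0.0
    pair_counts = Counter(pairs)
    left_counts = Counter(x for x, _ in pairs)
    right_counts = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in pair_counts.items():
        p_xy = count / n
        p_x = left_counts[x] / n
        p_y = right_counts[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi
```

       Two columns that covary perfectly between G:C and C:G in equal
       proportion give 1.0 bit, while a pair that is G:C in every sequence
       gives 0.0 bits because there is no variation to share.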
       --ifreq
              Add a page to postscript_output_file with each consensus
              (nongap RF) position colored based on the fraction of
              sequences that span each position that have at least 1
              inserted nucleotide after the position. A sequence s spans
              consensus position x that is actual alignment position a if s
              has at least one nongap nucleotide aligned to a position
              b <= a and at least one nongap nucleotide aligned to a
              consensus position c >= a. This page is drawn by default (if
              zero command-line options are used).

       --iavglen
              Add a page to postscript_output_file with each consensus
              (nongap RF) position colored based on average length of
              insertions that occur after it. The average is calculated as
              the total number of inserted nucleotides after position x,
              divided by the number of sequences that have at least 1
              inserted nucleotide after position x (so the minimum possible
              average insert length is 1.0).

       --dall Add a page to postscript_output_file with each consensus
              (nongap RF) position colored based on the fraction of
              sequences that have a gap (delete) at the position. This page
              is drawn by default (if zero command-line options are used).

       --dint Add a page to postscript_output_file with each consensus
              (nongap RF) position colored based on the fraction of
              sequences that have an internal gap (delete) at the position.
              An internal gap in a sequence is one that occurs after (3'
              of) the sequence's first aligned nucleotide and before (5'
              of) the sequence's final aligned nucleotide. This page is
              drawn by default (if zero command-line options are used).

       --prob Add a page to postscript_output_file with positions colored
              based on average posterior probability (PP). The alignment
              must contain #=GR PP annotation for all sequences. PP
              annotation is converted to numerical PP values as follows:
              '*' = 0.975, '9' = 0.90, '8' = 0.80, '7' = 0.70, '6' = 0.60,
              '5' = 0.50, '4' = 0.40, '3' = 0.30, '2' = 0.20, '1' = 0.10,
              '0' = 0.025. This page is drawn by default (if zero
              command-line options are used).

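       The PP conversion table above maps directly onto a lookup table.
       The Python sketch below averages the converted values for one
       column of #=GR PP annotation. It is illustrative only: the names
       are hypothetical, and skipping characters that are not in the table
       (such as gap symbols) is an assumption of the sketch.

```python
# Numerical value assigned to each PP annotation character.
PP_VALUES = {
    "*": 0.975, "9": 0.90, "8": 0.80, "7": 0.70, "6": 0.60, "5": 0.50,
    "4": 0.40, "3": 0.30, "2": 0.20, "1": 0.10, "0": 0.025,
}


def average_pp(pp_column):
    """Average posterior probability for one column of PP characters.

    Characters absent from PP_VALUES are ignored. Returns None when the
    column contains no PP characters at all.
    """
    values = [PP_VALUES[ch] for ch in pp_column if ch in PP_VALUES]
    if not values:
        return None
    return sum(values) / len(values)
```

       For example, a column whose PP characters are '9' and '1', with a
       gap in a third sequence, averages to 0.5 under these assumptions.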
       --span Add a page to postscript_output_file with consensus (nongap
              RF) positions colored based on the fraction of sequences that
              'span' the position. A sequence s spans consensus position x
              that is actual alignment position a if s has at least one
              nongap nucleotide aligned to a position b <= a and at least
              one nongap nucleotide aligned to a consensus position c >= a.
              This page is drawn by default (if zero command-line options
              are used).

OPTIONS FOR DRAWING INDIVIDUAL ALIGNED SEQUENCES

       --indi Add a page displaying the aligned nucleotides in their
              corresponding consensus positions of the structure diagram
              for each aligned sequence in the alignment. By default,
              basepaired nucleotides will be colored based on what type of
              basepair they are. To turn this off, use --no-bp. If
              posterior probability information (#=GR PP) exists in the
              alignment, one additional page per sequence will be drawn
              displaying the posterior probabilities.

       -f     With --indi, force esl-ssdraw to create a diagram, even if it
              is predicted to be large (> 100 Mb). By default, if the
              predicted size exceeds 100 Mb, esl-ssdraw will fail with a
              warning.

OPTIONS FOR OMITTING PARTS OF THE DIAGRAMS

       --no-leg
              Omit the legend on all pages of postscript_output_file.

       --no-head
              Omit the header on all pages of postscript_output_file.

       --no-foot
              Omit the footer on all pages of postscript_output_file.

OPTIONS FOR SIMPLE TWO-COLOR MASK DIAGRAMS

       --mask-col
              With --mask, postscript_output_file will contain exactly 1
              page showing positions included by the mask as black squares,
              and positions excluded as pink squares.

       --mask-diff <f>
              With --mask <f2> and --mask-col, postscript_output_file will
              contain one additional page comparing the mask from <f> and
              the mask from <f2>. Positions will be colored based on
              whether they are included by one mask and not the other,
              excluded by both masks, or included by both masks.

EXPERT OPTIONS FOR CONTROLLING INDIVIDUAL SEQUENCE DIAGRAMS

       --no-pp
              When used in combination with --indi, do not draw posterior
              probability structure diagrams for each sequence, even if the
              alignment has PP annotation.

       --no-bp
              Do not color basepaired nucleotides based on their basepair
              type.

       --no-ol
              When used in combination with --indi, do not outline
              nucleotides that differ from the majority rule consensus
              nucleotide given the alignment.

       --no-ntpp
              When used in combination with --indi, do not draw nucleotides
              on the individual sequence posterior probability diagrams.

EXPERT OPTIONS RELATED TO CONSENSUS SEQUENCE DEFINITION

       --no-cnt
              Do not draw consensus nucleotides on alignment statistic
              diagrams (such as information content diagrams). By default,
              the consensus nucleotide is defined as the most frequent
              nucleotide in the alignment at the corresponding position.
              Consensus nucleotides that occur in at least <x> fraction of
              the aligned sequences (that do not contain a gap at the
              position) are capitalized. By default <x> is 0.75, but this
              can be changed with the --cthresh <x> option.

       --cthresh <x>
              Specify the threshold for capitalizing consensus nucleotides
              defined by the majority rule (i.e. when --cambig is not
              enabled) as <x>.

       --cambig
              Change how consensus nucleotides are calculated from majority
              rule to the least ambiguous IUPAC nucleotide that represents
              at least <x> fraction of the nongap nucleotides at each
              consensus position. By default <x> is 0.9, but this can be
              changed with the --athresh <x> option.

       --athresh <x>
              With --cambig, specify the threshold for defining consensus
              nucleotides as the least ambiguous IUPAC nucleotide that
              represents at least <x> fraction of the nongap nucleotides at
              each position.

EXPERT OPTIONS CONTROLLING STYLE OF MASKING POSITIONS

       --mask-u
              With --mask, change the style of masked columns to squares.

       --mask-x
              With --mask, change the style of masked columns to x's.

       --mask-a
              With --mask and --mask-u or --mask-x, draw the alternative
              style of square or 'x' masks.

EXPERT OPTIONS RELATED TO INPUT FILES

       --dfile <f>
              Read the 'draw file' <f> which specifies numerical values for
              each consensus position in one or more postscript pages. For
              each page, the draw file must include <rflen>+3 lines
              (<rflen> is defined in the DESCRIPTION section). The first
              three lines are special. The following <rflen> 'value lines'
              each must contain a single number, the numerical value for
              the corresponding position. The first of the three special
              lines defines the 'description' for the page. This should be
              text that describes what the numerical values refer to for
              the page. The maximum allowable length is roughly 50
              characters (the exact maximum length depends on the template
              file and the program will report an informative error message
              upon execution if it is exceeded). The second special line
              defines the 'legend header' line that will appear immediately
              above the legend. It has a maximum allowable length of about
              30 characters. The third special line per page must contain
              exactly 7 numbers, which must be in increasing order, each
              separated by a space. These numbers define the numerical
              ranges for the six different colors used to draw the
              consensus positions on the page. The first number defines the
              minimum value for the first color (blue) and must be less
              than or equal to the minimum value from the value lines. The
              second number defines the minimum value for the second color
              (turquoise). The third, fourth, fifth and sixth numbers
              define the minimum values for the third, fourth, fifth and
              sixth colors (light green, yellow, orange, red), and the
              seventh and final number defines the maximum value for red
              and must be equal to or greater than the maximum value from
              the value lines. After the <rflen> value lines, there must
              exist a special line with only '//', signifying the end of a
              page. The draw file <f> must end with this special '//' line,
              even if it only includes a single page. A draw file
              specifying <n> pages should include exactly
              <n> * (<rflen> + 4) lines.

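       The draw file layout (three special lines, <rflen> value lines and
       a closing '//') can be produced programmatically. The Python sketch
       below builds the lines for one page and checks the constraints on
       the seven color boundaries. It is a hypothetical helper, not part
       of Easel, and it interprets 'increasing order' as strictly
       increasing, which is an assumption.

```python
def draw_file_page(description, legend_header, limits, values):
    """Return the lines of one --dfile page as a list of strings.

    `limits` holds the 7 color boundaries; `values` holds one number per
    consensus position (so len(values) should equal <rflen>).
    """
    if not values:
        raise ValueError("at least one value line is required")
    if len(limits) != 7:
        raise ValueError("the third special line needs exactly 7 numbers")
    if any(a >= b for a, b in zip(limits, limits[1:])):
        raise ValueError("the 7 numbers must be in increasing order")
    if limits[0] > min(values) or limits[-1] < max(values):
        raise ValueError("limits must bracket all values")
    lines = [description, legend_header, " ".join(str(x) for x in limits)]
    lines.extend(str(v) for v in values)  # one value line per position
    lines.append("//")                    # end-of-page marker
    return lines
```

       Each page built this way has len(values) + 4 lines, so joining the
       lines of <n> pages with newlines gives a file of <n> * (<rflen> + 4)
       lines, as required.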
       --efile <f>
              Read the 'expert draw file' <f> which specifies the colors
              and nucleotides to draw on each consensus position in one or
              more postscript pages. Unlike with the --dfile option, no
              legend will be drawn when --efile is used. For each page, the
              draw file must include <rflen> lines, each with four or five
              tab-delimited tokens. The first four tokens on line <x>
              specify the color to paint position <x> and must be real
              numbers between 0 and 1. The four numbers specify the cyan,
              magenta, yellow and black values, respectively, in the CMYK
              color scheme for the postscript file. The fifth token on line
              <x> specifies which nucleotide to write on position <x> (on
              top of the colored background). If the fifth token does not
              exist, no nucleotide will be written. After the <rflen>
              lines, there must exist a special line with only '//',
              signifying the end of a page. The expert draw file <f> must
              end with this special '//' line, even if it only includes a
              single page. An expert draw file specifying <n> pages should
              include exactly <n> * (<rflen> + 1) lines.

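       An expert draw file page can be generated in the same spirit. The
       Python sketch below is a hypothetical helper that formats one
       tab-delimited line per consensus position from a CMYK tuple and an
       optional nucleotide, followed by the '//' terminator. Writing the
       color values with two decimal places is an arbitrary choice of the
       sketch, not a requirement stated above.

```python
def expert_draw_page(cells):
    """Return the lines of one --efile page.

    `cells` is a list of (cmyk, nucleotide) pairs, one per consensus
    position. `cmyk` is a 4-tuple of numbers in [0, 1]; `nucleotide` is a
    single character, or None to leave the position without a letter.
    """
    lines = []
    for cmyk, nucleotide in cells:
        if len(cmyk) != 4 or any(not 0 <= v <= 1 for v in cmyk):
            raise ValueError("CMYK needs four values between 0 and 1")
        tokens = [f"{v:.2f}" for v in cmyk]
        if nucleotide is not None:
            tokens.append(nucleotide)  # optional fifth token
        lines.append("\t".join(tokens))
    lines.append("//")  # end-of-page marker
    return lines
```

       A page for <rflen> positions built this way has <rflen> + 1 lines,
       consistent with the line count given above.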
       --ifile <f>
              Read insert information from the file <f>, which may have
              been created with INFERNAL's cmalign(1) program. The insert
              information in msafile will be ignored and the information
              from <f> will supersede it. Inserts are columns that are gaps
              in the reference (#=GC RF) annotation.

COPYRIGHT

       Copyright (C) 2020 Howard Hughes Medical Institute.
       Freely distributed under the BSD open source license.

AUTHOR

       http://eddylab.org

Easel 0.48                         Nov 2020                      esl-ssdraw(1)