esl-compstruct(1)                Easel Manual                esl-compstruct(1)

NAME
       esl-compstruct - calculate accuracy of RNA secondary structure
       predictions

SYNOPSIS
       esl-compstruct [options] trusted_file test_file

DESCRIPTION
       esl-compstruct evaluates the accuracy of RNA secondary structure
       predictions on a per-base-pair basis. The trusted_file contains one or
       more sequences with trusted (known) RNA secondary structure annotation.
       The test_file contains the same sequences, in the same order, with
       predicted RNA secondary structure annotation. esl-compstruct reads the
       structures, compares them, and calculates both the sensitivity (the
       fraction of true base pairs that are correctly predicted) and the
       positive predictive value (PPV; the fraction of predicted base pairs
       that are true). Results are reported for each individual sequence, and
       in summary for all sequences together.
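
       The two measures can be sketched in a few lines of Python. This is an
       illustration of the definitions only, not the program's source code:
       the function name accuracy() and the choice to report 0.0 when a
       structure has no base pairs are assumptions made for the example.

```python
def accuracy(trusted_pairs, predicted_pairs):
    """Return (sensitivity, ppv) under the exact-match rule.

    Each argument is an iterable of (i, j) base pairs with i < j.
    Returning 0.0 for an empty structure is an assumption of this sketch.
    """
    trusted = set(trusted_pairs)
    predicted = set(predicted_pairs)
    correct = len(trusted & predicted)  # pairs present in both structures
    sensitivity = correct / len(trusted) if trusted else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


trusted = {(1, 20), (2, 19), (3, 18), (4, 17)}
predicted = {(1, 20), (2, 19), (3, 17)}
sens, ppv = accuracy(trusted, predicted)
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
# prints: sensitivity = 0.50, PPV = 0.67
```

       Here two of the four trusted pairs are recovered (sensitivity 0.50),
       and two of the three predicted pairs are true (PPV 0.67).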

       Both files must contain secondary structure annotation in WUSS
       notation. Only SELEX and Stockholm formats support structure markup at
       present.
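
       As a rough illustration of what reading a WUSS string involves, the
       Python sketch below turns a structure line into a set of base pairs.
       It assumes the usual WUSS conventions: the bracket pairs <>, (), []
       and {} mark nested base pairs, matching upper/lower case letters
       (Aa, Bb, ...) mark pseudoknots, and every other character is
       unpaired. The function name wuss_to_pairs() is hypothetical, and the
       sketch assumes a well-formed string; it is not how esl-compstruct
       itself parses its input.

```python
OPENERS = set("<([{")
CLOSERS = set(">)]}")


def wuss_to_pairs(ss, include_pseudoknots=False):
    """Convert a well-formed WUSS string to a set of 1-based (i, j) pairs.

    All bracket types share one stack, since nested pairs never cross.
    Each pseudoknot letter gets its own stack, keyed by its upper-case form.
    """
    nested = []
    knots = {}
    pairs = set()
    for pos, ch in enumerate(ss, start=1):
        if ch in OPENERS:
            nested.append(pos)
        elif ch in CLOSERS:
            pairs.add((nested.pop(), pos))
        elif ch.isalpha() and include_pseudoknots:
            if ch.isupper():
                knots.setdefault(ch, []).append(pos)
            else:
                pairs.add((knots[ch.upper()].pop(), pos))
    return pairs


ss = "<<..AA..>>..aa"
print(sorted(wuss_to_pairs(ss)))
print(sorted(wuss_to_pairs(ss, include_pseudoknots=True)))
```

       With include_pseudoknots left at False, the letter-annotated pairs
       are skipped, which mirrors the default behavior of ignoring
       pseudoknots.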

       The default definition of a correctly predicted base pair is that a
       true pair (i,j) must exactly match a predicted pair (i,j).

       Mathews and colleagues (Mathews et al., JMB 288:911-940, 1999) use a
       more relaxed definition. Mathews defines "correct" as follows: a true
       pair (i,j) is correctly predicted if any of the following pairs is
       predicted: (i,j), (i+1,j), (i-1,j), (i,j+1), or (i,j-1). This rule
       allows for "slipped helices" that are off by one base. The -m option
       activates this rule for both sensitivity and PPV. For PPV, the rule
       is reversed: a predicted pair (i,j) is considered to be true if the
       true structure contains any of the five pairs (i,j), (i+1,j),
       (i-1,j), (i,j+1), or (i,j-1).
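
       The relaxed rule can be sketched in Python as follows. The helper
       names slipped() and mathews_accuracy() are hypothetical, and
       reporting 0.0 for an empty structure is an assumption; the sketch
       only restates the rule described above.

```python
def slipped(i, j):
    """The five pairs that count as a match for (i, j) under the relaxed rule."""
    return {(i, j), (i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)}


def mathews_accuracy(trusted_pairs, predicted_pairs):
    """Return (sensitivity, ppv) under the relaxed rule."""
    trusted = set(trusted_pairs)
    predicted = set(predicted_pairs)
    # Sensitivity: a true pair counts if any of its five variants was predicted.
    found = sum(1 for (i, j) in trusted if slipped(i, j) & predicted)
    # Positive predictive value: the rule is reversed, so a predicted pair
    # counts if any of its five variants is in the true structure.
    right = sum(1 for (i, j) in predicted if slipped(i, j) & trusted)
    sensitivity = found / len(trusted) if trusted else 0.0
    ppv = right / len(predicted) if predicted else 0.0
    return sensitivity, ppv


trusted = {(1, 20), (2, 19), (3, 18), (4, 17)}
predicted = {(1, 20), (2, 19), (3, 17), (8, 12)}
print(mathews_accuracy(trusted, predicted))
# prints: (1.0, 0.75)
```

       Under the exact-match rule the same input gives 0.5 for both
       measures. The slipped pair (3,17) rescues the true pairs (3,18) and
       (4,17), while the unrelated pair (8,12) still counts against the
       prediction.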

OPTIONS
       -h     Print brief help; includes version number and summary of all
              options, including expert options.

       -m     Use the Mathews relaxed accuracy rule (see above), instead of
              requiring exact prediction of base pairs.

       -p     Count pseudoknotted base pairs towards the accuracy, in either
              trusted or predicted structures. By default, pseudoknots are
              ignored.

              Normally, only the trusted_file would have pseudoknot
              annotation, since most RNA secondary structure prediction
              programs do not predict pseudoknots. Using the -p option
              allows you to penalize the prediction program for not
              predicting known pseudoknots. In a case where both the
              trusted_file and the test_file have pseudoknot annotation, the
              -p option lets you count pseudoknots in evaluating the
              prediction accuracy. Beware, however, of the case where you
              use a pseudoknot-capable prediction program to generate the
              test_file, but the trusted_file does not have pseudoknot
              annotation. In this case, -p will penalize any predicted
              pseudoknots when it calculates PPV, even if they're right,
              because they don't appear in the trusted annotation. This is
              probably not what you'd want to do.
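
              A small worked example in Python shows the size of that
              penalty. The pair sets and the helper ppv() are made up for
              illustration: the trusted structure has three nested pairs
              and no pseudoknot annotation, while the prediction recovers
              all three and adds a two-pair pseudoknot.

```python
def ppv(trusted, predicted):
    """Fraction of predicted pairs that appear in the trusted structure."""
    return len(trusted & predicted) / len(predicted) if predicted else 0.0


trusted_nested = {(1, 30), (2, 29), (3, 28)}
predicted_nested = {(1, 30), (2, 29), (3, 28)}
# Predicted pseudoknot pairs; the trusted annotation has no pseudoknots at all.
predicted_knot = {(10, 40), (11, 39)}

# Default behavior: pseudoknotted pairs are left out of the comparison.
default_ppv = ppv(trusted_nested, predicted_nested)
# With pseudoknots counted, the two extra pairs are scored as false predictions.
counted_ppv = ppv(trusted_nested, predicted_nested | predicted_knot)

print(default_ppv, counted_ppv)
# prints: 1.0 0.6
```

              The prediction drops from a PPV of 1.0 to 0.6 even though the
              pseudoknot may be real, only because the trusted annotation
              does not record it.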

EXPERT OPTIONS
       --quiet
              Don't print any verbose header information. (Used by
              regression test scripts, for example, to suppress version/date
              information.)

SEE ALSO
       http://bioeasel.org/

COPYRIGHT
       Copyright (C) 2020 Howard Hughes Medical Institute.
       Freely distributed under the BSD open source license.

AUTHOR
       http://eddylab.org

Easel 0.48                         Nov 2020                  esl-compstruct(1)