samtools-flagstat(1)         Bioinformatics tools         samtools-flagstat(1)

NAME

       samtools flagstat - counts the number of alignments for each FLAG type

SYNOPSIS

       samtools flagstat in.sam|in.bam|in.cram

DESCRIPTION

       Does a full pass through the input file to calculate and print
       statistics to stdout.

       Provides counts for each of 13 categories based primarily on bit flags
       in the FLAG field. Information on the meaning of the flags is given in
       the SAM specification document
       <https://samtools.github.io/hts-specs/SAMv1.pdf>.

       Each category in the output is broken down into QC pass and QC fail.
       In the default output format, these are presented as "#PASS + #FAIL"
       followed by a description of the category.

       The first row of output gives the total number of reads that are QC
       pass and fail (according to flag bit 0x200). For example:

         122 + 28 in total (QC-passed reads + QC-failed reads)

       This would indicate that there are a total of 150 reads in the input
       file, 122 of which are marked as QC pass and 28 of which are marked as
       "not passing quality controls".
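
       As an illustration of the "#PASS + #FAIL" layout described above, the
       following is a minimal Python sketch that splits one line of the
       default output into its two counts and its description. The function
       name parse_count_line and the regular expression are assumptions made
       for this example; they are not part of samtools.

```python
import re

# Matches the leading "#PASS + #FAIL" pair followed by the description.
LINE_RE = re.compile(r"^\s*(\d+) \+ (\d+) (.+)$")


def parse_count_line(line):
    """Return (qc_passed, qc_failed, description) for one count line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a flagstat count line: {line!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


passed, failed, description = parse_count_line(
    "122 + 28 in total (QC-passed reads + QC-failed reads)"
)
print(passed + failed)  # 150, the total number of reads in the example
```

       The sketch only relies on the two counts coming first, so it should
       also accept lines whose description ends in a percentage pair.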

       Following this, additional categories are given for reads which are:

                         primary
                                neither 0x100 nor 0x800 bit set

                         secondary
                                0x100 bit set

                         supplementary
                                0x800 bit set

                         duplicates
                                0x400 bit set

                         primary duplicates
                                0x400 bit set and neither 0x100 nor 0x800 bit
                                set

                         mapped 0x4 bit not set

                         primary mapped
                                0x4, 0x100 and 0x800 bits not set

                         paired in sequencing
                                0x1 bit set

                         read1  both 0x1 and 0x40 bits set

                         read2  both 0x1 and 0x80 bits set

                         properly paired
                                both 0x1 and 0x2 bits set and 0x4 bit not set

                         with itself and mate mapped
                                0x1 bit set and neither 0x4 nor 0x8 bits set

                         singletons
                                both 0x1 and 0x8 bits set and bit 0x4 not set

       And finally, two rows are given that additionally filter on the
       reference name (RNAME), mate reference name (MRNM), and mapping
       quality (MAPQ) fields:

                         with mate mapped to a different chr
                                0x1 bit set and neither 0x4 nor 0x8 bits set
                                and MRNM not equal to RNAME

                         with mate mapped to a different chr (mapQ>=5)
                                0x1 bit set and neither 0x4 nor 0x8 bits set
                                and MRNM not equal to RNAME and MAPQ >= 5
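
       The bit tests in the category list above can be sketched in Python as
       follows. This follows the wording of the list only and is not taken
       from the samtools source, so the real implementation may differ in
       detail. The function name categories, its parameters, and the
       treatment of "=" as "same reference as RNAME" are assumptions made for
       this example.

```python
# FLAG bits, named after their meaning in the SAM specification.
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8
READ1, READ2 = 0x40, 0x80
SECONDARY, DUPLICATE, SUPPLEMENTARY = 0x100, 0x400, 0x800


def categories(flag, rname="*", mrnm="*", mapq=0):
    """Return the names of the listed categories that one record matches."""
    primary = not (flag & (SECONDARY | SUPPLEMENTARY))
    mapped = not (flag & UNMAPPED)
    paired = bool(flag & PAIRED)
    both_mapped = paired and not (flag & (UNMAPPED | MATE_UNMAPPED))
    # Assumption: "=" in the mate reference field means "same as RNAME".
    diff_chr = both_mapped and mrnm not in ("=", rname)
    tests = {
        "primary": primary,
        "secondary": bool(flag & SECONDARY),
        "supplementary": bool(flag & SUPPLEMENTARY),
        "duplicates": bool(flag & DUPLICATE),
        "primary duplicates": bool(flag & DUPLICATE) and primary,
        "mapped": mapped,
        "primary mapped": mapped and primary,
        "paired in sequencing": paired,
        "read1": paired and bool(flag & READ1),
        "read2": paired and bool(flag & READ2),
        "properly paired": paired and bool(flag & PROPER_PAIR) and mapped,
        "with itself and mate mapped": both_mapped,
        "singletons": paired and bool(flag & MATE_UNMAPPED) and mapped,
        "with mate mapped to a different chr": diff_chr,
        "with mate mapped to a different chr (mapQ>=5)": diff_chr and mapq >= 5,
    }
    return [name for name, hit in tests.items() if hit]


# FLAG 99 = 0x1 + 0x2 + 0x20 + 0x40: a properly paired, mapped first read.
print(categories(99, rname="chr1", mrnm="=", mapq=60))
```

       A record can fall into several categories at once, which is why the
       sketch returns a list rather than a single name.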

ALTERNATIVE OUTPUT FORMATS

       The -O option can be used to select two alternative formats for the
       output.

       Using -O tsv selects a tab-separated values format that can easily be
       imported into spreadsheet software. In this format the first column
       contains the values for QC-passed reads, the second column has the
       values for QC-failed reads and the third contains the category names.
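
       A minimal Python sketch of reading the three-column layout described
       above. The function name read_flagstat_tsv and the sample text are made
       up for this example; in particular, the exact category names written
       by samtools are not reproduced here. Values are kept as strings, since
       a column may hold something other than a plain count.

```python
import csv
import io


def read_flagstat_tsv(text):
    """Map each category name to its (qc_passed, qc_failed) string values."""
    table = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        passed, failed, name = row
        table[name] = (passed, failed)
    return table


# Hypothetical sample, not real samtools output.
sample = "34\t0\ttotal\n32\t0\tmapped\n"
print(read_flagstat_tsv(sample)["mapped"])  # ('32', '0')
```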

       Using -O json generates an ECMA-404 JSON data interchange format object
       <https://www.json.org/>. The top-level object contains two named
       objects, QC-passed reads and QC-failed reads. These contain the various
       categories listed above as names and the corresponding count as value.
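
       A minimal Python sketch of reading that structure. The two top-level
       names come from the description above; the function name
       load_flagstat_json and the sample document are assumptions made for
       this example, and the sample assumes that counts are written as JSON
       numbers.

```python
import json


def load_flagstat_json(text):
    """Return the (QC-passed, QC-failed) category dictionaries."""
    data = json.loads(text)
    return data["QC-passed reads"], data["QC-failed reads"]


# Hypothetical sample, not real samtools output.
sample = """
{
  "QC-passed reads": {"mapped": 32, "mapped %": 94.12},
  "QC-failed reads": {"mapped": 0, "mapped %": null}
}
"""
passed, failed = load_flagstat_json(sample)
print(passed["mapped"], failed["mapped %"])  # 32 None
```

       A JSON null is read back as None in Python, so callers need to check
       for it before doing arithmetic on a percentage.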

       For the default format, mapped shows the count as a percentage of the
       total number of QC-passed or QC-failed reads after the category name.
       For example:

         32 + 0 mapped (94.12% : N/A)

       The properly paired and singletons counts work in a similar way but the
       percentage is against the total number of QC-passed and QC-failed
       pairs. The primary mapped count is a percentage of the total number of
       QC-passed and QC-failed primary reads.

       In the tsv and json formats, these percentages are listed in separate
       categories mapped %, primary mapped %, properly paired %, and
       singletons %. If the percentage cannot be calculated (because the total
       is zero) then in the default and tsv formats it will be reported as
       `N/A'. In the json format, it will be reported as a JSON `null' value.
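
       The percentage rule described above, including the zero-total case,
       can be sketched in Python as follows. The function name percentage is
       an assumption, and the totals used (34 QC-passed reads and no
       QC-failed reads) are inferred from the 94.12% figure in the example
       rather than stated in the text.

```python
def percentage(count, total):
    """Format count as a percentage of total, or N/A when total is zero."""
    if total == 0:
        return "N/A"
    return f"{100.0 * count / total:.2f}%"


# Assumed totals: 34 QC-passed reads, 0 QC-failed reads.
print(f"32 + 0 mapped ({percentage(32, 34)} : {percentage(0, 0)})")
```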

OPTIONS

       -@ INT    Set number of additional threads to use when reading the
                 file.

       -O FORMAT Set the output format. FORMAT can be set to `default',
                 `json' or `tsv' to select the default, JSON or tab-separated
                 values output format. If this option is not used, the
                 default format will be selected.
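
       As a usage sketch, the two options above can be combined when calling
       samtools from Python. The helper names flagstat_command and
       run_flagstat are assumptions made for this example. Running the
       command needs samtools on the PATH and a real input file, so the
       example below only builds and prints the argument list.

```python
import subprocess


def flagstat_command(path, fmt="default", threads=0):
    """Build the argument list for a samtools flagstat call."""
    if fmt not in ("default", "json", "tsv"):
        raise ValueError(f"unknown output format: {fmt!r}")
    cmd = ["samtools", "flagstat", "-O", fmt]
    if threads > 0:
        cmd += ["-@", str(threads)]  # additional reading threads
    cmd.append(path)
    return cmd


def run_flagstat(path, fmt="default", threads=0):
    """Run samtools flagstat and return what it printed to stdout."""
    result = subprocess.run(
        flagstat_command(path, fmt, threads),
        check=True, capture_output=True, text=True,
    )
    return result.stdout


print(flagstat_command("in.bam", fmt="json", threads=4))
```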

AUTHOR

       Written by Heng Li from the Sanger Institute.

SEE ALSO

       samtools(1), samtools-idxstats(1), samtools-stats(1)

       Samtools website: <http://www.htslib.org/>

samtools-1.13                     7 July 2021             samtools-flagstat(1)