samtools-flagstat(1)         Bioinformatics tools         samtools-flagstat(1)

NAME

       samtools flagstat - counts the number of alignments for each FLAG type

SYNOPSIS

       samtools flagstat in.sam|in.bam|in.cram

DESCRIPTION

       Does a full pass through the input file to calculate and print
       statistics to stdout.

       Provides counts for each of 13 categories based primarily on bit flags
       in the FLAG field.  Information on the meaning of the flags is given in
       the SAM specification document
       <https://samtools.github.io/hts-specs/SAMv1.pdf>.

       Each category in the output is broken down into QC pass and QC fail.
       In the default output format, these are presented as "#PASS + #FAIL"
       followed by a description of the category.

       The first row of output gives the total number of reads that are QC
       pass and fail (according to flag bit 0x200).  For example:

         122 + 28 in total (QC-passed reads + QC-failed reads)

       This indicates that there are a total of 150 reads in the input file,
       122 of which are marked as QC pass and 28 of which are marked as "not
       passing quality controls".

       Following this, additional categories are given for reads which are:

           primary
                  neither 0x100 nor 0x800 bit set

           secondary
                  0x100 bit set

           supplementary
                  0x800 bit set

           duplicates
                  0x400 bit set

           primary duplicates
                  0x400 bit set and neither 0x100 nor 0x800 bit set

           mapped
                  0x4 bit not set

           primary mapped
                  0x4, 0x100 and 0x800 bits not set

           paired in sequencing
                  0x1 bit set

           read1
                  both 0x1 and 0x40 bits set

           read2
                  both 0x1 and 0x80 bits set

           properly paired
                  both 0x1 and 0x2 bits set and 0x4 bit not set

           with itself and mate mapped
                  0x1 bit set and neither 0x4 nor 0x8 bits set

           singletons
                  both 0x1 and 0x8 bits set and bit 0x4 not set

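       The bit tests listed above can be written out directly as mask
       operations.  This sketch follows the definitions exactly as worded in
       this page and is not taken from the samtools source; the constant names
       and the function flag_categories are hypothetical.

```python
# Bit values of the FLAG field used in the category definitions.
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8
READ1, READ2 = 0x40, 0x80
SECONDARY, DUPLICATE, SUPPLEMENTARY = 0x100, 0x400, 0x800


def flag_categories(flag):
    """Return the names of the listed categories that a FLAG value matches."""
    is_primary = not flag & (SECONDARY | SUPPLEMENTARY)
    is_mapped = not flag & UNMAPPED
    is_paired = bool(flag & PAIRED)
    is_duplicate = bool(flag & DUPLICATE)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    tests = [
        ("primary", is_primary),
        ("secondary", bool(flag & SECONDARY)),
        ("supplementary", bool(flag & SUPPLEMENTARY)),
        ("duplicates", is_duplicate),
        ("primary duplicates", is_duplicate and is_primary),
        ("mapped", is_mapped),
        ("primary mapped", is_mapped and is_primary),
        ("paired in sequencing", is_paired),
        ("read1", is_paired and bool(flag & READ1)),
        ("read2", is_paired and bool(flag & READ2)),
        ("properly paired",
         is_paired and bool(flag & PROPER_PAIR) and is_mapped),
        ("with itself and mate mapped",
         is_paired and is_mapped and not mate_unmapped),
        ("singletons", is_paired and mate_unmapped and is_mapped),
    ]
    return [name for name, matched in tests if matched]
```

       For instance, a FLAG of 99 (0x1, 0x2, 0x20 and 0x40 set) falls into the
       primary, mapped, primary mapped, paired in sequencing, read1, properly
       paired, and with itself and mate mapped categories.
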
       And finally, two rows are given that additionally filter on the
       reference name (RNAME), mate reference name (MRNM), and mapping quality
       (MAPQ) fields:

           with mate mapped to a different chr
                  0x1 bit set and neither 0x4 nor 0x8 bits set and MRNM not
                  equal to RNAME

           with mate mapped to a different chr (mapQ>=5)
                  0x1 bit set and neither 0x4 nor 0x8 bits set and MRNM not
                  equal to RNAME and MAPQ >= 5
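
       The two conditions above differ only in the MAPQ threshold, so they can
       be sketched as one predicate.  The function mate_on_different_chr and
       its min_mapq parameter are hypothetical, and the sketch assumes that
       rname and mrnm are already resolved reference names (not the "="
       shorthand used in SAM text).

```python
def mate_on_different_chr(flag, rname, mrnm, mapq, min_mapq=0):
    """Apply the 'mate mapped to a different chr' test to one record."""
    # 0x1 set, and neither 0x4 (read unmapped) nor 0x8 (mate unmapped) set.
    both_mapped = bool(flag & 0x1) and not flag & (0x4 | 0x8)
    return both_mapped and mrnm != rname and mapq >= min_mapq


# First row: no MAPQ filter.  Second row: MAPQ >= 5.
plain = mate_on_different_chr(0x1, "chr1", "chr2", mapq=3)
filtered = mate_on_different_chr(0x1, "chr1", "chr2", mapq=3, min_mapq=5)
```

       With a MAPQ of 3, the record in this example counts towards the first
       row but not towards the mapQ>=5 row.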

ALTERNATIVE OUTPUT FORMATS

       The -O option can be used to select one of two alternative formats for
       the output.

       Using -O tsv selects a tab-separated values format that can easily be
       imported into spreadsheet software.  In this format the first column
       contains the values for QC-passed reads, the second column has the
       values for QC-failed reads and the third contains the category names.

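       The three-column layout can also be read programmatically.  The sketch
       below uses Python's csv module on a made-up sample; the counts, the
       exact category spellings in SAMPLE_TSV and the function name
       read_flagstat_tsv are assumptions for illustration only.

```python
import csv
import io

# Illustrative text shaped like the three-column layout described above.
# The counts and category spellings are made up for this sketch.
SAMPLE_TSV = (
    "122\t28\ttotal (QC-passed reads + QC-failed reads)\n"
    "120\t2\tmapped\n"
)


def read_flagstat_tsv(text):
    """Map each category name to its (QC-passed, QC-failed) cell values."""
    rows = {}
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for qc_pass, qc_fail, category in reader:
        # Cells are kept as strings; convert them only where a count is known.
        rows[category] = (qc_pass, qc_fail)
    return rows


rows = read_flagstat_tsv(SAMPLE_TSV)
mapped_total = sum(int(cell) for cell in rows["mapped"])
```

       Keeping the cells as strings avoids failing on rows whose values are
       not plain integers.
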
       Using -O json generates an ECMA-404 JSON data interchange format object
       <https://www.json.org/>.  The top-level object contains two named
       objects, "QC-passed reads" and "QC-failed reads".  These contain the
       various categories listed above as names and the corresponding count as
       value.

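       A JSON report with this shape can be loaded with any JSON parser.  The
       sketch below uses a made-up sample; only the two top-level names come
       from the description above, while the inner keys "total" and "mapped"
       and all counts are assumptions for illustration.

```python
import json

# Illustrative document with the two top-level objects described above.
# The inner category keys and the counts are made up for this sketch.
SAMPLE_JSON = """
{
  "QC-passed reads": {"total": 122, "mapped": 120},
  "QC-failed reads": {"total": 28, "mapped": 2}
}
"""

report = json.loads(SAMPLE_JSON)
passed = report["QC-passed reads"]
failed = report["QC-failed reads"]

# Add the QC-passed and QC-failed counts category by category.
combined = {name: passed[name] + failed[name] for name in passed}
```

       Here combined holds the per-category sums over both QC classes.
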
       For the default format, mapped shows the count as a percentage of the
       total number of QC-passed or QC-failed reads after the category name.
       For example:

           32 + 0 mapped (94.12% : N/A)

       The properly paired and singletons counts work in a similar way but the
       percentage is against the total number of QC-passed and QC-failed
       pairs.  The primary mapped count is a percentage of the total number of
       QC-passed and QC-failed primary reads.

       In the tsv and json formats, these percentages are listed in separate
       categories mapped %, primary mapped %, properly paired %, and
       singletons %.  If the percentage cannot be calculated (because the total
       is zero) then in the default and tsv formats it will be reported as
       `N/A'.  In the json format, it will be reported as a JSON `null' value.
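
       The percentage rule, including the `N/A' case for a zero total, can be
       sketched as follows.  The function names are hypothetical, and the
       totals of 34 QC-passed and 0 QC-failed reads are assumed values chosen
       to be consistent with the example row above.

```python
def percentage(count, total):
    """Format count as a percentage of total, or N/A if total is zero."""
    if total == 0:
        return "N/A"
    return f"{100.0 * count / total:.2f}%"


def mapped_row(mapped_pass, mapped_fail, total_pass, total_fail):
    """Build a default-format 'mapped' row from counts and totals."""
    pass_pct = percentage(mapped_pass, total_pass)
    fail_pct = percentage(mapped_fail, total_fail)
    return f"{mapped_pass} + {mapped_fail} mapped ({pass_pct} : {fail_pct})"


# Assumed totals: 34 QC-passed reads and no QC-failed reads.
row = mapped_row(32, 0, 34, 0)
```

       With these assumed totals, row reproduces the example line, since 32 of
       34 is 94.12% and the QC-failed percentage has a zero denominator.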

OPTIONS

       -@ INT    Set number of additional threads to use when reading the
                 file.

       -O FORMAT Set the output format.  FORMAT can be set to `default',
                 `json' or `tsv' to select the default, JSON or tab-separated
                 values output format.  If this option is not used, the
                 default format will be selected.
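
       The options can be combined when calling the command from a script.
       The following sketch builds the argument list from the two options
       documented above; the helper names flagstat_command and run_flagstat
       are hypothetical, and running it assumes a samtools executable on the
       PATH.

```python
import subprocess


def flagstat_command(path, output_format="default", threads=0):
    """Build an argument list for samtools flagstat with -O and -@."""
    if output_format not in ("default", "json", "tsv"):
        raise ValueError(f"unsupported format: {output_format!r}")
    command = ["samtools", "flagstat", "-O", output_format]
    if threads > 0:
        # -@ sets the number of additional threads used for reading.
        command += ["-@", str(threads)]
    command.append(path)
    return command


def run_flagstat(path, output_format="default", threads=0):
    """Run the command and return what it printed to stdout."""
    result = subprocess.run(
        flagstat_command(path, output_format, threads),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


command = flagstat_command("in.bam", output_format="json", threads=2)
```

       Passing the arguments as a list avoids shell quoting problems with
       unusual file names.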

AUTHOR

       Written by Heng Li from the Sanger Institute.

SEE ALSO

       samtools(1), samtools-idxstats(1), samtools-stats(1)

       Samtools website: <http://www.htslib.org/>

samtools-1.15.1                  7 April 2022             samtools-flagstat(1)