samtools-view(1)             Bioinformatics tools             samtools-view(1)

NAME

       samtools view - views and converts SAM/BAM/CRAM files

SYNOPSIS

       samtools view [options] in.sam|in.bam|in.cram [region...]

DESCRIPTION

       With no options or regions specified, prints all alignments in the
       specified input alignment file (in SAM, BAM, or CRAM format) to
       standard output in SAM format (with no header).

       You may specify one or more space-separated region specifications after
       the input filename to restrict output to only those alignments which
       overlap the specified region(s). Use of region specifications requires
       a coordinate-sorted and indexed input file (in BAM or CRAM format).

       The -b, -C, -1, -u, -h, -H, and -c options change the output format
       from the default of headerless SAM, and the -o and -U options set the
       output file name(s).

       The -t and -T options provide additional reference data. One of these
       two options is required when SAM input does not contain @SQ headers,
       and the -T option is required whenever writing CRAM output.

       The -L, -M, -N, -r, -R, -d, -D, -s, -q, -l, -m, -f, -F, and -G options
       filter the alignments that will be included in the output to only those
       alignments that match certain criteria.

       The -x, -B, --add-flags, and --remove-flags options modify the data
       which is contained in each alignment.

       The -X option allows the user to specify the location of the index
       file(s) explicitly when the data folder does not contain an index
       file. See the EXAMPLES section for sample usage.

       Finally, the -@ option can be used to allocate additional threads to be
       used for compression, and the -? option requests a long help message.


       REGIONS:
              Regions can be specified as: RNAME[:STARTPOS[-ENDPOS]] and all
              position coordinates are 1-based.

              Important note: when multiple regions are given, some alignments
              may be output multiple times if they overlap more than one of
              the specified regions.

              Examples of region specifications:

              chr1      Output all alignments mapped to the reference sequence
                        named `chr1' (i.e. @SQ SN:chr1).

              chr2:1000000
                        The region on chr2 beginning at base position
                        1,000,000 and ending at the end of the chromosome.

              chr3:1000-2000
                        The 1001bp region on chr3 beginning at base position
                        1,000 and ending at base position 2,000 (including
                        both end positions).

              '*'       Output the unmapped reads at the end of the file.
                        (This does not include any unmapped reads placed on a
                        reference sequence alongside their mapped mates.)

              .         Output all alignments. (Mostly unnecessary as not
                        specifying a region at all has the same effect.)
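
       The 1-based, inclusive coordinate convention described above can be
       checked with a little shell arithmetic. The sketch below is
       illustrative only: region_length is a hypothetical helper, not part of
       samtools, and it assumes the reference name contains no colon.

```shell
# region_length SPEC: print the number of bases covered by a region of the
# form RNAME:STARTPOS-ENDPOS (1-based, both end positions included).
# Hypothetical helper for illustration; not a samtools command.
region_length() {
    case $1 in
        *:*-*)
            range=${1#*:}          # strip the leading "RNAME:"
            start=${range%-*}      # text before the dash
            end=${range#*-}        # text after the dash
            echo $(( end - start + 1 ))
            ;;
        *)
            # No explicit end position: the length depends on the reference.
            echo "unknown"
            ;;
    esac
}

region_length chr3:1000-2000   # prints 1001
region_length chr2:1000000     # prints unknown
```

       Because both end positions are included, the length is
       ENDPOS - STARTPOS + 1, which gives the 1001bp quoted for chr3:1000-2000.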

OPTIONS

       -b, --bam Output in the BAM format.

       -C, --cram
                 Output in the CRAM format (requires -T).

       -1, --fast
                 Enable fast BAM compression (implies -b).

       -u, --uncompressed
                 Output uncompressed BAM. This option saves time spent on
                 compression/decompression and is thus preferred when the
                 output is piped to another samtools command.

       -h, --with-header
                 Include the header in the output.

       -H, --header-only
                 Output the header only.

       --no-header
                 When producing SAM format, output alignment records but not
                 headers. This is the default; the option can be used to
                 reset the effect of -h/-H.

       -c, --count
                 Instead of printing the alignments, only count them and print
                 the total number. All filter options, such as -f, -F, and -q,
                 are taken into account.

       -?, --help
                 Output long help and exit immediately.

       -o FILE, --output FILE
                 Output to FILE [stdout].

       -U FILE, --unoutput FILE, --output-unselected FILE
                 Write alignments that are not selected by the various filter
                 options to FILE. When this option is used, all alignments
                 (or all alignments intersecting the regions specified) are
                 written to either the output file or this file, but never
                 both.

       -t FILE, --fai-reference FILE
                 A tab-delimited FILE. Each line must contain the reference
                 name in the first column and the length of the reference in
                 the second column, with one line for each distinct reference.
                 Any additional fields beyond the second column are ignored.
                 This file also defines the order of the reference sequences
                 in sorting. If you run: `samtools faidx <ref.fa>', the
                 resulting index file <ref.fa>.fai can be used as this FILE.
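
       As a rough illustration of what the -t file supplies, the sketch below
       turns the first two columns of such a file into @SQ header lines of the
       kind the SAM input lacks. The helper name fai_to_sq and the sample
       names and lengths are made up for the example; this only models the
       idea and is not how samtools itself is implemented.

```shell
# fai_to_sq: read a tab-delimited file on stdin and print one @SQ line per
# reference, using only column 1 (name) and column 2 (length).
# Hypothetical helper for illustration; not a samtools command.
fai_to_sq() {
    awk -F '\t' '{ printf "@SQ\tSN:%s\tLN:%s\n", $1, $2 }'
}

# Hypothetical .fai-style input; the third column stands in for the
# additional fields, which are ignored.
printf 'refA\t1000\t6\nrefB\t250\t1023\n' | fai_to_sq
```

       The lines come out in the same order as the input file, which mirrors
       the statement that the file defines the reference order.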

       -T FILE, --reference FILE
                 A FASTA format reference FILE, optionally compressed by bgzip
                 and ideally indexed by samtools faidx. If an index is not
                 present, one will be generated for you, provided the
                 reference file is local.

                 If the reference file is not local, but is accessed instead
                 via an https://, s3:// or other URL, the index file will need
                 to be supplied by the server alongside the reference. It is
                 possible to have the reference and index files in different
                 locations by supplying both to this option separated by the
                 string "##idx##", for example:

                 -T ftp://x.com/ref.fa##idx##ftp://y.com/index.fa.fai

                 However, note that only the location of the reference will be
                 stored in the output file header. If this method is used to
                 make CRAM files, the CRAM reader may not be able to find the
                 index, and may not be able to decode the file unless it can
                 get the references it needs using a different method.
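
       The "##idx##" convention is simply one string with a fixed separator.
       A minimal sketch of how such an argument splits into its two
       locations, using the example URLs from the text; the function name
       split_ref_idx is hypothetical, and the real parsing happens inside the
       tool rather than in the shell.

```shell
# split_ref_idx ARG: print the reference location and the index location
# from a "REF##idx##INDEX" string, one per line.
# Hypothetical helper for illustration; not a samtools command.
split_ref_idx() {
    sep='##idx##'
    case $1 in
        *"$sep"*)
            printf '%s\n' "${1%%"$sep"*}"   # everything before the separator
            printf '%s\n' "${1#*"$sep"}"    # everything after it
            ;;
        *)
            printf '%s\n' "$1"              # no separate index location given
            ;;
    esac
}

split_ref_idx 'ftp://x.com/ref.fa##idx##ftp://y.com/index.fa.fai'
```

       Only the first of the two printed locations corresponds to what is
       recorded in the output header.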

       -L FILE, --target-file FILE, --targets-file FILE
                 Only output alignments overlapping the input BED FILE [null].

       -M, --use-index
                 Use the multi-region iterator on the union of a BED file and
                 command-line region arguments. This avoids re-reading the
                 same regions of files, so it can sometimes be much faster.
                 Note that this also suppresses repeated output of the same
                 alignment: without it, an alignment that overlaps multiple
                 regions specified on the command line will be reported
                 multiple times. Using a BED file is optional; when one is
                 given, its path must be preceded by the -L option.

       --region-file FILE, --regions-file FILE
                 Use an index and multi-region iterator to only output
                 alignments overlapping the input BED FILE. Equivalent to
                 -M -L FILE or --use-index --target-file FILE.

       -N FILE, --qname-file FILE
                 Output only alignments with read names listed in FILE.

       -r STR, --read-group STR
                 Output alignments in read group STR [null]. Note that
                 records with no RG tag will also be output when using this
                 option. This behaviour may change in a future release.

       -R FILE, --read-group-file FILE
                 Output alignments in read groups listed in FILE [null]. Note
                 that records with no RG tag will also be output when using
                 this option. This behaviour may change in a future release.

       -d STR1[:STR2], --tag STR1[:STR2]
                 Only output alignments with tag STR1 and associated value
                 STR2, which can be a string or an integer [null]. The value
                 can be omitted, in which case only the tag is considered.

       -D STR:FILE, --tag-file STR:FILE
                 Only output alignments with tag STR and associated values
                 listed in FILE [null].

       -q INT, --min-MQ INT
                 Skip alignments with MAPQ smaller than INT [0].

       -l STR, --library STR
                 Only output alignments in library STR [null].

       -m INT, --min-qlen INT
                 Only output alignments with number of CIGAR bases consuming
                 query sequence ≥ INT [0].
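
       To make the -m criterion concrete: in the SAM specification the CIGAR
       operations M, I, S, = and X consume query bases, while D, N, H and P
       do not. The sketch below sums the query-consuming lengths of a CIGAR
       string; cigar_qlen is a hypothetical helper written for this
       illustration, and the CIGAR strings are made-up examples.

```shell
# cigar_qlen CIGAR: print the number of CIGAR bases that consume query
# sequence (operations M, I, S, = and X).
# Hypothetical helper for illustration; not a samtools command.
cigar_qlen() {
    printf '%s\n' "$1" | awk '{
        n = 0
        s = $0
        # Peel off one "<length><op>" element at a time from the front.
        while (match(s, /^[0-9]+[MIDNSHP=X]/)) {
            len = substr(s, 1, RLENGTH - 1) + 0
            op  = substr(s, RLENGTH, 1)
            if (op ~ /[MIS=X]/) n += len
            s = substr(s, RLENGTH + 1)
        }
        print n
    }'
}

cigar_qlen 10S80M2I5D8M   # prints 100 (10 + 80 + 2 + 8; the 5D is skipped)
```

       Under this reading, an alignment with that CIGAR string would pass
       -m 100 but not -m 101.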

       -e STR, --expr STR
                 Only include alignments that match the filter expression STR.
                 The syntax for these expressions is described in the main
                 samtools(1) man page under the FILTER EXPRESSIONS heading.

       -f FLAG, --require-flags FLAG
                 Only output alignments with all bits set in FLAG present in
                 the FLAG field. FLAG can be specified in hex by beginning
                 with `0x' (i.e. /^0x[0-9A-F]+/), in octal by beginning with
                 `0' (i.e. /^0[0-7]+/), as a decimal number not beginning with
                 `0' or as a comma-separated list of flag names.

                 For a list of flag names see samtools-flags(1).

       -F FLAG, --excl-flags FLAG, --exclude-flags FLAG
                 Do not output alignments with any bits set in FLAG present in
                 the FLAG field. FLAG can be specified in hex by beginning
                 with `0x' (i.e. /^0x[0-9A-F]+/), in octal by beginning with
                 `0' (i.e. /^0[0-7]+/), as a decimal number not beginning with
                 `0' or as a comma-separated list of flag names.

       -G FLAG   Do not output alignments with all bits set in FLAG present in
                 the FLAG field. This is the complement of -f: the alignments
                 kept by -f12 and those kept by -G12 together make up the
                 whole input, as if no filtering had been applied. FLAG can
                 be specified in hex by beginning with `0x' (i.e.
                 /^0x[0-9A-F]+/), in octal by beginning with `0' (i.e.
                 /^0[0-7]+/), as a decimal number not beginning with `0' or
                 as a comma-separated list of flag names.
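
       The three flag filters reduce to bitwise tests, and the hex, octal and
       decimal spellings of FLAG match the integer constants understood by
       POSIX shell arithmetic. The sketch below mimics the keep/drop decision
       for a single FLAG value. flag_passes is a hypothetical helper, flag
       names are not handled, and treating a zero -G mask as "no filter" is
       an assumption made for the sketch.

```shell
# flag_passes FLAG REQUIRE EXCLUDE_ANY EXCLUDE_ALL
# Succeeds (exit status 0) if an alignment with the given FLAG would be kept
# by: -f REQUIRE -F EXCLUDE_ANY -G EXCLUDE_ALL
# Masks may be decimal, 0x-prefixed hex or 0-prefixed octal.
# Hypothetical helper for illustration; not a samtools command.
flag_passes() {
    flag=$1 req=$2 any=$3 all=$4
    # -f: every bit of REQUIRE must be set in FLAG
    [ $(( $flag & $req )) -eq $(( $req )) ] || return 1
    # -F: no bit of EXCLUDE_ANY may be set in FLAG
    [ $(( $flag & $any )) -eq 0 ] || return 1
    # -G: drop only when all bits of EXCLUDE_ALL are set (0 disables the test)
    if [ $(( $all )) -ne 0 ] && [ $(( $flag & $all )) -eq $(( $all )) ]; then
        return 1
    fi
    return 0
}

flag_passes 99 0x3 0x4 0 && echo "99: kept by -f 0x3 -F 0x4"
flag_passes 77 0 0 12    || echo "77: dropped by -G 12"
flag_passes 73 0 0 12    && echo "73: kept by -G 12"
```

       FLAG 77 has both bits of 12 (4 and 8) set and is dropped by -G 12,
       whereas FLAG 73 has only bit 8 set and is kept; -F 12 would drop both.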

       -x STR, --remove-tag STR
                 Read tag to exclude from output (repeatable) [null].

       -B, --remove-B
                 Collapse the backward CIGAR operation.

       --add-flags FLAG
                 Adds flag(s) to read. FLAG can be specified in hex by
                 beginning with `0x' (i.e. /^0x[0-9A-F]+/), in octal by
                 beginning with `0' (i.e. /^0[0-7]+/), as a decimal number
                 not beginning with `0' or as a comma-separated list of flag
                 names.

       --remove-flags FLAG
                 Remove flag(s) from read. FLAG is specified in the same way
                 as with the --add-flags option.

       --subsample FLOAT
                 Output only a proportion of the input alignments, as
                 specified by 0.0 ≤ FLOAT ≤ 1.0, which gives the fraction of
                 templates/pairs to be kept. This subsampling acts in the same
                 way on all of the alignment records in the same template or
                 read pair, so it never keeps a read but not its mate.

       --subsample-seed INT
                 Subsampling seed used to influence which subset of reads is
                 kept. When subsampling data that has previously been
                 subsampled, be sure to use a different seed value from those
                 used previously; otherwise more reads will be retained than
                 expected. [0]

       -s FLOAT  Subsampling shorthand option: -s INT.FRAC is equivalent to
                 --subsample-seed INT --subsample 0.FRAC.
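
       The shorthand can be unpacked mechanically: the digits before the
       decimal point are the seed and the digits after it are the fraction.
       A small sketch, with expand_s as a hypothetical helper name; reading
       an empty integer part as seed 0 is an assumption based on the default
       shown for --subsample-seed.

```shell
# expand_s INT.FRAC: print the long options equivalent to "-s INT.FRAC".
# Hypothetical helper for illustration; not a samtools command.
expand_s() {
    case $1 in
        *.*)
            seed=${1%%.*}      # digits before the first "."
            frac=${1#*.}       # digits after it
            echo "--subsample-seed ${seed:-0} --subsample 0.${frac}"
            ;;
        *)
            echo "expand_s: expected INT.FRAC, got '$1'" >&2
            return 1
            ;;
    esac
}

expand_s 42.1   # prints: --subsample-seed 42 --subsample 0.1
expand_s .25    # prints: --subsample-seed 0 --subsample 0.25
```

       So -s 42.1 would keep roughly 10% of templates using seed 42, and
       changing only the integer part selects a different subset of the same
       size.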

       -@ INT, --threads INT
                 Number of BAM compression threads to use in addition to main
                 thread [0].

       -S        Ignored for compatibility with previous samtools versions.
                 Previously this option was required if input was in SAM
                 format, but now the correct format is automatically detected
                 by examining the first few characters of input.

       -X, --customized-index
                 Include a customized index file as part of the arguments.
                 See the EXAMPLES section for sample usage.

       --no-PG   Do not add a @PG line to the header of the output file.

EXAMPLES

       o Import SAM to BAM when @SQ lines are present in the header:

           samtools view -bo aln.bam aln.sam

         If @SQ lines are absent:

           samtools faidx ref.fa
           samtools view -bt ref.fa.fai -o aln.bam aln.sam

         where ref.fa.fai is generated automatically by the faidx command.


       o Convert a BAM file to a CRAM file using a local reference sequence.

           samtools view -C -T ref.fa -o aln.cram aln.bam


       o Convert a BAM file to a CRAM with NM and MD tags stored verbatim
         rather than calculating on the fly during CRAM decode, so that mixed
         data sets with MD/NM only on some records, or NM calculated using
         different definitions of mismatch, can be decoded without change.
         The second command demonstrates how to decode such a file. The
         request to not decode MD here is turning off auto-generation of both
         MD and NM; it will still emit the MD/NM tags on records that had
         these stored verbatim.

           samtools view -C --output-fmt-option store_md=1 --output-fmt-option store_nm=1 -o aln.cram aln.bam
           samtools view --input-fmt-option decode_md=0 -o aln.new.bam aln.cram


       o An alternative way of achieving the above is listing multiple options
         after the --output-fmt or -O option. The commands below are
         equivalent to the two above.

           samtools view -O cram,store_md=1,store_nm=1 -o aln.cram aln.bam
           samtools view --input-fmt cram,decode_md=0 -o aln.new.bam aln.cram


       o Include customized index file as a part of arguments.

           samtools view [options] -X /data_folder/data.bam /index_folder/data.bai chrM:1-10


       o Output alignments in read group grp2 (records with no RG tag will
         also be in the output).

           samtools view -r grp2 -o /data_folder/data.rg2.bam /data_folder/data.bam


       o Only keep reads with tag BC where the barcode matches one of the
         barcodes listed in the barcode file.

           samtools view -D BC:barcodes.txt -o /data_folder/data.barcodes.bam /data_folder/data.bam


       o Only keep reads with tag RG and read group grp2. This does almost
         the same as -r grp2 but will not keep records without the RG tag.

           samtools view -d RG:grp2 -o /data_folder/data.rg2_only.bam /data_folder/data.bam


       o Remove the actions of samtools markdup. Clear the duplicate flag and
         remove the dt tag, keep the header.

           samtools view -h --remove-flags DUP -x dt -o /data_folder/dat.no_dup_markings.bam /data_folder/data.bam

AUTHOR

       Written by Heng Li from the Sanger Institute.

SEE ALSO

       samtools(1), samtools-tview(1), sam(5)

       Samtools website: <http://www.htslib.org/>


samtools-1.13                     7 July 2021                 samtools-view(1)