samtools-view(1) Bioinformatics tools samtools-view(1)
2
3
4
NAME
6 samtools view - views and converts SAM/BAM/CRAM files
7
SYNOPSIS
9 samtools view [options] in.sam|in.bam|in.cram [region...]
10
11
DESCRIPTION
13 With no options or regions specified, prints all alignments in the
14 specified input alignment file (in SAM, BAM, or CRAM format) to stan‐
15 dard output in SAM format (with no header).
16
17 You may specify one or more space-separated region specifications after
18 the input filename to restrict output to only those alignments which
19 overlap the specified region(s). Use of region specifications requires
20 a coordinate-sorted and indexed input file (in BAM or CRAM format).
21
22 The -b, -C, -1, -u, -h, -H, and -c options change the output format
23 from the default of headerless SAM, and the -o and -U options set the
24 output file name(s).
25
26 The -t and -T options provide additional reference data. One of these
27 two options is required when SAM input does not contain @SQ headers,
28 and the -T option is required whenever writing CRAM output.
29
30 The -L, -M, -N, -r, -R, -d, -D, -s, -q, -l, -m, -f, -F, -G, and --rf
31 options filter the alignments that will be included in the output to
32 only those alignments that match certain criteria.
33
The -p option sets the UNMAP flag on filtered alignments and then
writes them to the output file.
36
37 The -x, -B, --add-flags, and --remove-flags options modify the data
38 which is contained in each alignment.
39
The -X option lets you specify the location of the index file(s)
explicitly, for use when the data folder does not contain an index
file. See the EXAMPLES section for sample usage.
43
44 Finally, the -@ option can be used to allocate additional threads to be
45 used for compression, and the -? option requests a long help message.
46
47
48 REGIONS:
49 Regions can be specified as: RNAME[:STARTPOS[-ENDPOS]] and all
50 position coordinates are 1-based.
51
52 Important note: when multiple regions are given, some alignments
53 may be output multiple times if they overlap more than one of
54 the specified regions.
55
56 Examples of region specifications:
57
58 chr1 Output all alignments mapped to the reference sequence
59 named `chr1' (i.e. @SQ SN:chr1).
60
61 chr2:1000000
62 The region on chr2 beginning at base position
63 1,000,000 and ending at the end of the chromosome.
64
65 chr3:1000-2000
66 The 1001bp region on chr3 beginning at base position
67 1,000 and ending at base position 2,000 (including
68 both end positions).
69
70 '*' Output the unmapped reads at the end of the file.
71 (This does not include any unmapped reads placed on a
72 reference sequence alongside their mapped mates.)
73
74 . Output all alignments. (Mostly unnecessary as not
75 specifying a region at all has the same effect.)
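
The region syntax above can be illustrated with a small shell sketch.
The helper names parse_region and region_length are hypothetical and
are not part of samtools. The sketch assumes that RNAME contains no
`:' character and handles only the RNAME[:STARTPOS[-ENDPOS]] form, not
the special `*' and `.' regions.

```shell
# Hypothetical helper (not part of samtools): split a region string of
# the form RNAME[:STARTPOS[-ENDPOS]] and print "RNAME START END".
# A missing START is printed as 1; a missing END is printed as "-".
# Assumption: RNAME contains no ':' character.
parse_region() {
    region=$1
    case $region in
        *:*) name=${region%%:*}; range=${region#*:} ;;
        *)   name=$region;       range= ;;
    esac
    case $range in
        '')  start=1;            end=- ;;
        *-*) start=${range%%-*}; end=${range#*-} ;;
        *)   start=$range;       end=- ;;
    esac
    printf '%s %s %s\n' "$name" "$start" "$end"
}

# Coordinates are 1-based and both end positions are included,
# so the length of a closed region is END - START + 1.
region_length() {
    set -- $(parse_region "$1")
    if [ "$3" = - ]; then
        return 1    # open-ended: the reference length would be needed
    fi
    echo $(( $3 - $2 + 1 ))
}

parse_region chr2:1000000       # prints: chr2 1000000 -
region_length chr3:1000-2000    # prints: 1001
```

The second call reproduces the 1001bp figure quoted for chr3:1000-2000.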
76
77
78
OPTIONS
80 -b, --bam Output in the BAM format.
81
82 -C, --cram
83 Output in the CRAM format (requires -T).
84
85 -1, --fast
86 Enable fast compression. This also changes the default out‐
87 put format to BAM, but this can be overridden by the explicit
88 format options or using a filename with a known suffix.
89
90 -u, --uncompressed
91 Output uncompressed data. This also changes the default out‐
92 put format to BAM, but this can be overridden by the explicit
93 format options or using a filename with a known suffix.
94
95 This option saves time spent on compression/decompression and
96 is thus preferred when the output is piped to another sam‐
97 tools command.
98
99 -h, --with-header
100 Include the header in the output.
101
102 -H, --header-only
103 Output the header only.
104
105 --no-header
106 When producing SAM format, output alignment records but not
107 headers. This is the default; the option can be used to re‐
108 set the effect of -h/-H.
109
110 -c, --count
111 Instead of printing the alignments, only count them and print
112 the total number. All filter options, such as -f, -F, and -q,
113 are taken into account. The -p option is ignored in this
114 mode.
115
116 -?, --help
117 Output long help and exit immediately.
118
119 -o FILE, --output FILE
120 Output to FILE [stdout].
121
122 -U FILE, --unoutput FILE, --output-unselected FILE
123 Write alignments that are not selected by the various filter
124 options to FILE. When this option is used, all alignments
125 (or all alignments intersecting the regions specified) are
126 written to either the output file or this file, but never
127 both.
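
The "either the output file or this file, but never both" behaviour
can be pictured with a plain-text stand-in. The function split_by_mapq
below is hypothetical: it works on headerless SAM text rather than
through samtools, and it models only a MAPQ filter such as -q, with the
two file arguments playing the roles of -o and -U.

```shell
# Hypothetical stand-in (not part of samtools) for the split made by
#     samtools view -q MIN -o PASS -U FAIL in.bam
# It reads headerless SAM text on stdin and looks only at MAPQ, which
# is column 5 of a SAM record.
# Usage: split_by_mapq MIN PASS FAIL < records.sam
split_by_mapq() {
    awk -F '\t' -v min="$1" -v pass="$2" -v fail="$3" '
        $5 + 0 >= min + 0 { print > pass; next }  # selected: main output
                          { print > fail }        # not selected: -U file
    '
}
```

Every input line is written exactly once, so the line counts of the two
files always add up to the number of input records.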
128
129 -p, --unmap
130 Set the UNMAP flag on alignments that are not selected by the
131 filter options. These alignments are then written to the
132 normal output. This is not compatible with -U.
133
134 -t FILE, --fai-reference FILE
135 A tab-delimited FILE. Each line must contain the reference
136 name in the first column and the length of the reference in
137 the second column, with one line for each distinct reference.
138 Any additional fields beyond the second column are ignored.
139 This file also defines the order of the reference sequences
140 in sorting. If you run: `samtools faidx <ref.fa>', the re‐
141 sulting index file <ref.fa>.fai can be used as this FILE.
142
143 -T FILE, --reference FILE
144 A FASTA format reference FILE, optionally compressed by bgzip
145 and ideally indexed by samtools faidx. If an index is not
146 present one will be generated for you, if the reference file
147 is local.
148
149 If the reference file is not local, but is accessed instead
150 via an https://, s3:// or other URL, the index file will need
151 to be supplied by the server alongside the reference. It is
152 possible to have the reference and index files in different
153 locations by supplying both to this option separated by the
154 string "##idx##", for example:
155
156 -T ftp://x.com/ref.fa##idx##ftp://y.com/index.fa.fai
157
158 However, note that only the location of the reference will be
159 stored in the output file header. If this method is used to
160 make CRAM files, the cram reader may not be able to find the
161 index, and may not be able to decode the file unless it can
162 get the references it needs using a different method.
163
164 -L FILE, --target-file FILE, --targets-file FILE
165 Only output alignments overlapping the input BED FILE [null].
166
167 -M, --use-index
168 Use the multi-region iterator on the union of a BED file and
169 command-line region arguments. This avoids re-reading the
170 same regions of files so can sometimes be much faster. Note
171 this also removes duplicate sequences. Without this a se‐
172 quence that overlaps multiple regions specified on the com‐
173 mand line will be reported multiple times. The usage of a
174 BED file is optional and its path has to be preceded by -L
175 option.
176
177 --region-file FILE, --regions-file FILE
178 Use an index and multi-region iterator to only output align‐
179 ments overlapping the input BED FILE. Equivalent to -M -L
180 FILE or --use-index --target-file FILE.
181
182 -N FILE, --qname-file FILE
183 Output only alignments with read names listed in FILE.
184
185 -r STR, --read-group STR
186 Output alignments in read group STR [null]. Note that
187 records with no RG tag will also be output when using this
188 option. This behaviour may change in a future release.
189
190 -R FILE, --read-group-file FILE
191 Output alignments in read groups listed in FILE [null]. Note
192 that records with no RG tag will also be output when using
193 this option. This behaviour may change in a future release.
194
195 -d STR1[:STR2], --tag STR1[:STR2]
196 Only output alignments with tag STR1 and associated value
197 STR2, which can be a string or an integer [null]. The value
198 can be omitted, in which case only the tag is considered.
199
200 -D STR:FILE, --tag-file STR:FILE
201 Only output alignments with tag STR and associated values
202 listed in FILE [null].
203
204 -q INT, --min-MQ INT
205 Skip alignments with MAPQ smaller than INT [0].
206
207 -l STR, --library STR
208 Only output alignments in library STR [null].
209
210 -m INT, --min-qlen INT
211 Only output alignments with number of CIGAR bases consuming
212 query sequence ≥ INT [0]
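
As an illustration of the quantity that -m compares against INT, the
sketch below sums the lengths of the CIGAR operations that consume
query sequence (M, I, S, = and X in the SAM specification). The helper
name cigar_query_length is hypothetical, and the sketch is not the
samtools implementation.

```shell
# Hypothetical helper (not part of samtools): print the number of CIGAR
# bases that consume query sequence. D, N, H and P do not count.
cigar_query_length() {
    printf '%s\n' "$1" | awk '{
        n = 0; s = $0
        while (match(s, /^[0-9]+[MIDNSHP=X]/)) {
            len = substr(s, 1, RLENGTH - 1) + 0   # numeric part
            op  = substr(s, RLENGTH, 1)           # operation letter
            if (op ~ /[MIS=X]/) n += len
            s = substr(s, RLENGTH + 1)
        }
        print n
    }'
}

cigar_query_length 5S90M2I3D    # prints: 97
```

Here 5S, 90M and 2I are counted (5 + 90 + 2) while the 3D deletion is
not, so this record would pass -m 97 but not -m 98.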
213
214 -e STR, --expr STR
215 Only include alignments that match the filter expression STR.
216 The syntax for these expressions is described in the main
217 samtools(1) man page under the FILTER EXPRESSIONS heading.
218
219 -f FLAG, --require-flags FLAG
220 Only output alignments with all bits set in FLAG present in
221 the FLAG field. FLAG can be specified in hex by beginning
222 with `0x' (i.e. /^0x[0-9A-F]+/), in octal by beginning with
223 `0' (i.e. /^0[0-7]+/), as a decimal number not beginning with
224 '0' or as a comma-separated list of flag names.
225
226
227 For a list of flag names see samtools-flags(1).
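
The accepted FLAG spellings can be illustrated with shell arithmetic,
which happens to read a leading `0x' as hex and a leading `0' as octal
in the same way. The helper flag_value is hypothetical and is not part
of samtools. Its name table uses the bit values from the SAM
specification with the names listed by samtools-flags(1), and the
sketch assumes names are given in upper case.

```shell
# Hypothetical helper (not part of samtools): print the decimal value
# of a FLAG argument given as hex, octal, decimal or a list of names.
flag_value() {
    case $1 in
        [0-9]*) echo $(( $1 )); return ;;   # 0x.. hex, 0.. octal, decimal
    esac
    total=0
    old_ifs=$IFS; IFS=,
    for name in $1; do
        case $name in
            PAIRED)        bit=1 ;;
            PROPER_PAIR)   bit=2 ;;
            UNMAP)         bit=4 ;;
            MUNMAP)        bit=8 ;;
            REVERSE)       bit=16 ;;
            MREVERSE)      bit=32 ;;
            READ1)         bit=64 ;;
            READ2)         bit=128 ;;
            SECONDARY)     bit=256 ;;
            QCFAIL)        bit=512 ;;
            DUP)           bit=1024 ;;
            SUPPLEMENTARY) bit=2048 ;;
            *) IFS=$old_ifs
               echo "unknown flag name: $name" >&2
               return 1 ;;
        esac
        total=$(( $total | $bit ))
    done
    IFS=$old_ifs
    echo "$total"
}

flag_value 0x10            # prints: 16
flag_value 014             # prints: 12
flag_value UNMAP,MUNMAP    # prints: 12
```

The last two calls show that 014 (octal) and UNMAP,MUNMAP name the same
two bits as the decimal value 12.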
228
229 -F FLAG, --excl-flags FLAG, --exclude-flags FLAG
230 Do not output alignments with any bits set in FLAG present in
231 the FLAG field. FLAG can be specified in hex by beginning
232 with `0x' (i.e. /^0x[0-9A-F]+/), in octal by beginning with
233 `0' (i.e. /^0[0-7]+/), as a decimal number not beginning with
234 '0' or as a comma-separated list of flag names.
235
--rf FLAG, --incl-flags FLAG, --include-flags FLAG
237 Only output alignments with any bit set in FLAG present in
238 the FLAG field. FLAG can be specified in hex by beginning
239 with `0x' (i.e. /^0x[0-9A-F]+/), in octal by beginning with
240 `0' (i.e. /^0[0-7]+/), as a decimal number not beginning with
241 '0' or as a comma-separated list of flag names.
242
-G FLAG Do not output alignments with all bits set in FLAG present in
the FLAG field. This is the opposite of -f: the alignments kept by
-f12 and those kept by -G12 together make up the whole input. FLAG
can be specified in hex by beginning with `0x' (i.e.
/^0x[0-9A-F]+/), in octal by beginning with `0' (i.e. /^0[0-7]+/),
as a decimal number not beginning with '0' or as a comma-separated
list of flag names.
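
The four flag tests (-f, -F, --rf and -G) differ only in whether they
ask for all, any or none of the bits. The sketch below restates them as
bit arithmetic. The function passes_flag_filters is hypothetical and is
not the samtools implementation; it assumes that a mask of 0 means the
corresponding option was not given.

```shell
# Hypothetical helper (not part of samtools): succeed if a record whose
# FLAG field is $1 passes the masks for -f ($2), -F ($3), --rf ($4)
# and -G ($5). Pass 0 for an option that is not in use.
passes_flag_filters() {
    flag=$1 req=$2 excl=$3 incl=$4 alloff=$5
    # -f: every bit of the mask must be set
    [ $(( $flag & $req )) -eq $(( $req )) ] || return 1
    # -F: no bit of the mask may be set
    [ $(( $flag & $excl )) -eq 0 ] || return 1
    # --rf: at least one bit of the mask must be set
    if [ $(( $incl )) -ne 0 ]; then
        [ $(( $flag & $incl )) -ne 0 ] || return 1
    fi
    # -G: reject only when every bit of the mask is set
    if [ $(( $alloff )) -ne 0 ]; then
        [ $(( $flag & $alloff )) -ne $(( $alloff )) ] || return 1
    fi
    return 0
}

passes_flag_filters 77 12 0 0 0 && echo "77 kept by -f 12"
passes_flag_filters 73 0 0 0 12 && echo "73 kept by -G 12"
```

FLAG 77 has both the UNMAP and MUNMAP bits set, so -f 12 keeps it and
-G 12 drops it. FLAG 73 has only MUNMAP set, so the outcome is reversed.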
250
251 -x STR, --remove-tag STR
252 Read tag(s) to exclude from output (repeatable) [null]. This
253 can be a single tag or a comma separated list. Alternatively
254 the option itself can be repeated multiple times.
255
256 If the list starts with a `^' then it is negated and treated
257 as a request to remove all tags except those in STR. The list
258 may be empty, so -x ^ will remove all tags.
259
260 Note that tags will only be removed from reads that pass fil‐
261 tering.
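
The effect of a leading `^' on the tag list can be sketched as a
membership test. The function tag_is_removed is hypothetical and is not
part of samtools; it models a single -x argument only.

```shell
# Hypothetical helper (not part of samtools): succeed if tag $2 would
# be removed by a -x argument of $1.
tag_is_removed() {
    list=$1 tag=$2
    case $list in
        '^'*) negate=1; list=${list#'^'} ;;   # negated: keep only these
        *)    negate=0 ;;
    esac
    case ",$list," in
        *",$tag,"*) listed=1 ;;
        *)          listed=0 ;;
    esac
    if [ "$negate" -eq 1 ]; then
        [ "$listed" -eq 0 ]    # removed unless it is in the keep list
    else
        [ "$listed" -eq 1 ]    # removed only if it is in the list
    fi
}

tag_is_removed 'NM,MD' NM  && echo "NM removed by -x NM,MD"
tag_is_removed '^NM,MD' RG && echo "RG removed by -x ^NM,MD"
tag_is_removed '^' RG      && echo "RG removed by -x ^"
```

All three lines are printed. With the empty negated list `^' no tag is
in the keep list, so every tag is removed.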
262
263 --keep-tag STR
264 This keeps only tags listed in STR and is directly equivalent
265 to --remove-tag ^STR. Specifying an empty list will remove
266 all tags. If both --keep-tag and --remove-tag are specified
267 then --keep-tag has precedence.
268
269 Note that tags will only be removed from reads that pass fil‐
270 tering.
271
272 -B, --remove-B
273 Collapse the backward CIGAR operation.
274
275 --add-flags FLAG
276 Adds flag(s) to read. FLAG can be specified in hex by begin‐
277 ning with `0x' (i.e. /^0x[0-9A-F]+/), in octal by beginning
278 with `0' (i.e. /^0[0-7]+/), as a decimal number not beginning
279 with '0' or as a comma-separated list of flag names.
280
281 --remove-flags FLAG
282 Remove flag(s) from read. FLAG is specified in the same way
283 as with the --add-flags option.
284
285 --subsample FLOAT
286 Output only a proportion of the input alignments, as speci‐
287 fied by 0.0 ≤ FLOAT ≤ 1.0, which gives the fraction of tem‐
288 plates/pairs to be kept. This subsampling acts in the same
289 way on all of the alignment records in the same template or
290 read pair, so it never keeps a read but not its mate.
291
292 --subsample-seed INT
293 Subsampling seed used to influence which subset of reads is
294 kept. When subsampling data that has previously been subsam‐
295 pled, be sure to use a different seed value from those used
296 previously; otherwise more reads will be retained than ex‐
297 pected. [0]
298
299 -s FLOAT Subsampling shorthand option: -s INT.FRAC is equivalent to
300 --subsample-seed INT --subsample 0.FRAC.
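
The -s shorthand can be unpacked mechanically. The sketch below splits
an INT.FRAC value at the first `.' and prints the equivalent long
options. The helper name expand_subsample is hypothetical and is not
part of samtools.

```shell
# Hypothetical helper (not part of samtools): expand a -s INT.FRAC
# value into the equivalent --subsample-seed and --subsample options.
expand_subsample() {
    case $1 in
        *.*) seed=${1%%.*}; frac=${1#*.} ;;
        *)   echo "expected INT.FRAC, got: $1" >&2; return 1 ;;
    esac
    echo "--subsample-seed $seed --subsample 0.$frac"
}

expand_subsample 35.2     # prints: --subsample-seed 35 --subsample 0.2
expand_subsample 7.125    # prints: --subsample-seed 7 --subsample 0.125
```

So -s 35.2 keeps about 20% of the templates using seed 35.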
301
302 -@ INT, --threads INT
303 Number of BAM compression threads to use in addition to main
304 thread [0].
305
306 -P, --fetch-pairs
307 Retrieve pairs even when the mate is outside of the requested
308 region. Enabling this option also turns on the multi-region
309 iterator (-M). A region to search must be specified, either
310 on the command-line, or using the -L option. The input file
311 must be an indexed regular file.
312
313 This option first scans the requested region, using the RNEXT
314 and PNEXT fields of the records that have the PAIRED flag set
315 and pass other filtering options to find where paired reads
316 are located. These locations are used to build an expanded
317 region list, and a set of QNAMEs to allow from the new re‐
318 gions. It will then make a second pass, collecting all reads
319 from the originally-specified region list together with reads
320 from additional locations that match the allowed set of
321 QNAMEs. Any other filtering options used will be applied to
322 all reads found during this second pass.
323
324 As this option links reads using RNEXT and PNEXT, it is im‐
325 portant that these fields are set accurately. Use 'samtools
326 fixmate' to correct them if necessary.
327
328 Note that this option does not work with the -c, --count; -U,
329 --output-unselected; or -p, --unmap options.
330
331 -S Ignored for compatibility with previous samtools versions.
332 Previously this option was required if input was in SAM for‐
333 mat, but now the correct format is automatically detected by
334 examining the first few characters of input.
335
-X, --customized-index
Take the index file location from the command line: the argument
following the input file is used as its index. See the EXAMPLES
section for sample usage.
339
340 --no-PG Do not add a @PG line to the header of the output file.
341
342
EXAMPLES
344 o Import SAM to BAM when @SQ lines are present in the header:
345
346 samtools view -bo aln.bam aln.sam
347
348 If @SQ lines are absent:
349
350 samtools faidx ref.fa
351 samtools view -bt ref.fa.fai -o aln.bam aln.sam
352
353 where ref.fa.fai is generated automatically by the faidx command.
354
355
356 o Convert a BAM file to a CRAM file using a local reference sequence.
357
358 samtools view -C -T ref.fa -o aln.cram aln.bam
359
360
361
362 o Convert a BAM file to a CRAM with NM and MD tags stored verbatim
363 rather than calculating on the fly during CRAM decode, so that mixed
364 data sets with MD/NM only on some records, or NM calculated using
365 different definitions of mismatch, can be decoded without change.
366 The second command demonstrates how to decode such a file. The re‐
367 quest to not decode MD here is turning off auto-generation of both MD
368 and NM; it will still emit the MD/NM tags on records that had these
369 stored verbatim.
370
371 samtools view -C --output-fmt-option store_md=1 --output-fmt-option store_nm=1 -o aln.cram aln.bam
372 samtools view --input-fmt-option decode_md=0 -o aln.new.bam aln.cram
373
374
375 o An alternative way of achieving the above is listing multiple options
376 after the --output-fmt or -O option. The commands below are equiva‐
377 lent to the two above.
378
379 samtools view -O cram,store_md=1,store_nm=1 -o aln.cram aln.bam
380 samtools view --input-fmt cram,decode_md=0 -o aln.new.bam aln.cram
381
382
383
o Specify a customized index file location on the command line.
385
386 samtools view [options] -X /data_folder/data.bam /index_folder/data.bai chrM:1-10
387
388
389
390 o Output alignments in read group grp2 (records with no RG tag will
391 also be in the output).
392
393 samtools view -r grp2 -o /data_folder/data.rg2.bam /data_folder/data.bam
394
395
396
o Only keep reads that have the tag BC and whose barcode matches one of
the barcodes listed in the barcode file.
399
400 samtools view -D BC:barcodes.txt -o /data_folder/data.barcodes.bam /data_folder/data.bam
401
402
403
o Only keep reads with tag RG and read group grp2. This does almost
the same as -r grp2 but will not keep records without the RG tag.
406
407 samtools view -d RG:grp2 -o /data_folder/data.rg2_only.bam /data_folder/data.bam
408
409
410
411 o Remove the actions of samtools markdup. Clear the duplicate flag and
412 remove the dt tag, keep the header.
413
414 samtools view -h --remove-flags DUP -x dt -o /data_folder/dat.no_dup_markings.bam /data_folder/data.bam
415
416
417
AUTHOR
419 Written by Heng Li from the Sanger Institute.
420
421
SEE ALSO
423 samtools(1), samtools-tview(1), sam(5)
424
425 Samtools website: <http://www.htslib.org/>
426
427
428
samtools-1.15.1 7 April 2022 samtools-view(1)