samtools-merge(1)            Bioinformatics tools            samtools-merge(1)

NAME
samtools merge - merges multiple sorted files into a single file

SYNOPSIS
samtools merge [options] -o out.bam [options] in1.bam ... inN.bam

samtools merge [options] out.bam in1.bam ... inN.bam

DESCRIPTION
Merge multiple sorted alignment files, producing a single sorted output
file that contains all the input records and maintains the existing
sort order.

The output file can be specified via -o as shown in the first synopsis.
Otherwise the first non-option filename argument is taken to be out.bam
rather than an input file, as in the second synopsis. There is no
default; to write to standard output (or to a pipe), use either “-o -” or
the equivalent using “-” as the first filename argument.
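As an illustration of the standard-output form, the sketch below wraps "-o -" in a shell function so that the merged stream can be piped onward. The function name merge_to_stdout and the file names in the comment are hypothetical placeholders, not part of samtools.

```shell
# merge_to_stdout is a hypothetical helper name, not a samtools feature.
# It passes "-o -" so that the merged BAM is written to standard output.
merge_to_stdout() {
    samtools merge -o - "$@"
}

# Possible use, assuming in1.bam and in2.bam exist and share a sort order:
#   merge_to_stdout in1.bam in2.bam | samtools view -c -
```

Using "-" as the first filename argument instead of "-o -" should behave the same way, as described above.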

If -h is specified, the @SQ headers of the input files will be merged
into the specified header; otherwise they will be merged into a composite
header created from the input headers. If, while merging @SQ lines for
coordinate-sorted input files, a conflict arises as to the order (for
example, input1.bam has @SQ for a,b,c and input2.bam has b,a,c), then the
resulting output file will need to be re-sorted back into coordinate
order.
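One way to spot such a conflict before merging is to compare the @SQ name order of the input headers. The sketch below is a conservative check under stated assumptions: the helper names sq_names and same_sq_order are hypothetical, and the header text files are assumed to have been produced beforehand, for example with "samtools view -H".

```shell
# sq_names: read SAM header text on stdin and print the SN value of each
# @SQ line, in the order the lines appear.
sq_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# same_sq_order H1 H2: succeed only when both header files list exactly
# the same reference names in the same order (stricter than necessary,
# since a header listing only a subset in a compatible order also fails).
same_sq_order() {
    [ "$(sq_names < "$1")" = "$(sq_names < "$2")" ]
}

# Possible use, with placeholder file names:
#   samtools view -H input1.bam > h1.sam
#   samtools view -H input2.bam > h2.sam
#   same_sq_order h1.sam h2.sam || echo "@SQ order differs; output may need re-sorting"
```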

Unless the -c or -p flags are specified, when @RG and @PG records are
merged into the output header, any IDs that duplicate existing IDs in
the output header will have a suffix appended to differentiate them from
similar header records from other files, and the read records will be
updated to reflect this.

The ordering of the records in the input files must match the usage of
the -n and -t command-line options. If it does not, the output order
will be undefined. See samtools-sort(1) for information about record
ordering.
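A quick sanity check is to read the SO field of each input's @HD header line and pick the merge flag that matches it. The sketch below assumes the headers carry an accurate SO value; the helper names hd_sort_order and merge_flag_for are hypothetical, and tag-sorted inputs (-t) are not detected by this check.

```shell
# hd_sort_order: read SAM header text on stdin and print the SO value
# from the @HD line, if there is one.
hd_sort_order() {
    awk -F '\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) { print substr($i, 4); exit }
    }'
}

# merge_flag_for ORDER: print the merge flag that matches a sort order.
# Prints nothing for coordinate order, "-n" for name order, and fails
# for anything else so the caller can stop and check by hand.
merge_flag_for() {
    case "$1" in
        coordinate) ;;
        queryname)  printf '%s\n' "-n" ;;
        *)          return 1 ;;
    esac
}

# Possible use, with a placeholder file name:
#   order=$(samtools view -H in1.bam | hd_sort_order)
#   merge_flag_for "$order" || echo "unexpected sort order: $order"
```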

OPTIONS
-1        Use Deflate compression level 1 to compress the output.

-b FILE   List of input BAM files, one file per line.

-f        Overwrite the output file if it already exists.

-h FILE   Use the lines of FILE as `@' headers to be copied to out.bam,
          replacing any header lines that would otherwise be copied from
          in1.bam. (FILE is actually in SAM format, though any alignment
          records it may contain are ignored.)

-n        The input alignments are sorted by read names rather than by
          chromosomal coordinates.

-o FILE   Write merged output to FILE, specifying the filename via an
          option rather than as the first filename argument. When -o is
          used, all non-option filename arguments specify input files to
          be merged.

-t TAG    The input alignments have been sorted by the value of TAG, then
          by either position or name (if -n is given).

-R STR    Merge files in the region specified by STR [null].

-r        Attach an RG tag to each alignment. The tag value is inferred
          from file names.

-u        Write uncompressed BAM output.

-c        When several input files contain @RG headers with the same ID,
          emit only one of them (namely, the header line from the first
          file we find that ID in) to the merged output file. Combining
          these similar headers is usually the right thing to do when the
          files being merged originated from the same file.

          Without -c, all @RG headers appear in the output file, with
          random suffixes added to their IDs where necessary to
          differentiate them.

-p        Similarly, for each @PG ID in the set of files to merge, use
          the @PG line of the first file we find that ID in rather than
          adding a suffix to differentiate similar IDs.

-X        Allow the user to specify customized index file location(s)
          when the data folder does not contain any index file. See the
          EXAMPLES section for sample usage.

-L FILE   BED file specifying multiple regions on which the merge will
          be performed. This option extends the -R option and cannot be
          used together with it.

--no-PG   Do not add a @PG line to the header of the output file.

-@, --threads INT
          Number of input/output compression threads to use in addition
          to main thread [0].
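
To make the -b and -L options more concrete, the following sketch writes a list file and a BED file and passes both to a single merge. All file names, the region coordinates, and the wrapper name merge_listed_regions are hypothetical. It also assumes that the inputs are coordinate-sorted and indexed, and that -b, -L and -o can be combined in one invocation.

```shell
# One input path per line, as expected by -b (placeholder file names).
printf '%s\n' in1.bam in2.bam in3.bam > inputs.txt

# Two placeholder regions in BED format: tab-separated, 0-based start,
# end position exclusive.
printf 'chr1\t1000\t2000\nchr2\t500\t900\n' > regions.bed

# merge_listed_regions LIST BED OUT: hypothetical wrapper that merges
# the files named in LIST, restricted to the regions in BED, into OUT.
merge_listed_regions() {
    samtools merge -b "$1" -L "$2" -o "$3"
}

# Possible use once the inputs and their indexes exist:
#   merge_listed_regions inputs.txt regions.bed merged.bam
```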

EXAMPLES
o Attach the RG tag while merging sorted alignments:

    perl -e 'print "\@RG\tID:ga\tSM:hs\tLB:ga\tPL:Illumina\n\@RG\tID:454\tSM:hs\tLB:454\tPL:454\n"' > rg.txt
    samtools merge -rh rg.txt merged.bam ga.bam 454.bam

  The value in an RG tag is determined by the name of the file the read
  comes from. In this example, in merged.bam, reads from ga.bam will be
  tagged RG:Z:ga, while reads from 454.bam will be tagged RG:Z:454.

o Include customized index files as part of the arguments:

    samtools merge [options] -X <out.bam> </data_folder/in1.bam> [</data_folder/in2.bam> ... </data_folder/inN.bam>] </index_folder/index1.bai> [</index_folder/index2.bai> ... </index_folder/indexN.bai>]
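
For the RG example above, the tag values can be previewed before merging. The sketch below assumes, based only on that example, that -r derives the value by dropping the directory part and a trailing ".bam" from each input name; the helper name rg_id_for is hypothetical and this is not a statement about samtools internals.

```shell
# rg_id_for FILE: hypothetical helper that mimics the naming shown in the
# RG example by stripping the directory and a trailing ".bam".
rg_id_for() {
    basename "$1" .bam
}

# Preview the tag each input would receive under that assumption.
for f in ga.bam 454.bam; do
    printf '%s -> RG:Z:%s\n' "$f" "$(rg_id_for "$f")"
done
```

If the previewed values do not match the @RG IDs in the header file given to -h, the reads would carry RG tags with no corresponding header line.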

AUTHOR
Written by Heng Li from the Sanger Institute.

SEE ALSO
samtools(1), samtools-sort(1), sam(5)

Samtools website: <http://www.htslib.org/>

samtools-1.15.1                   7 April 2022                samtools-merge(1)