scramble(1)                      Staden io_lib                     scramble(1)

NAME

       scramble - Converts between the SAM, BAM and CRAM file formats.

SYNOPSIS

       scramble [options] [input_file [output_file]]

DESCRIPTION

       scramble converts between various next-gen sequencing alignment file
       formats, including SAM, BAM and CRAM. It can either act as a pipe,
       reading stdin and writing to stdout, or operate on named files.

       When operating as a pipe the input type defaults to SAM or BAM; use
       the -I cram option to indicate that the input is in CRAM format. The
       output defaults to BAM, but can be changed with the -O format option.
       When given filenames the file type is chosen automatically based on
       the filename suffix.
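
       As an illustration of the suffix-based choice described above, the
       following sketch maps a filename to the value one would pass to -I
       or -O. The function name format_from_suffix is hypothetical, and the
       set of suffixes shown is an assumption rather than the list scramble
       itself recognises.

```shell
# Hypothetical helper: guess the scramble format name from a filename suffix.
# The suffix list is an assumption, not taken from the scramble source.
format_from_suffix() {
    case "$1" in
        *.sam)  echo sam ;;
        *.bam)  echo bam ;;
        *.cram) echo cram ;;
        *)      echo unknown; return 1 ;;
    esac
}
```

       For example, format_from_suffix in.cram prints cram, which could
       then be passed explicitly as -I cram when the same data arrives on a
       pipe.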

OPTIONS

       -I format
              Selects the input format, where format is one of sam, bam or
              cram. Use this when reading via a pipe to avoid input bytes
              being consumed when attempting to detect if the input is in SAM
              or BAM format.

       -O format
              Selects the output format, where format is one of sam, bam or
              cram.

       -1 to -9
              Sets the compression level from 1 (low compression, fast) to 9
              (high compression, slow) when writing in BAM or CRAM format.
              This is only used during writing.

       -0 or -u
              Writes uncompressed data. In BAM this still uses BGZF
              containers, but with no internal compression. In CRAM it stores
              blocks in RAW format instead. The option has no effect on SAM
              output.

       -j     CRAM encoding only. Add bzip2 to the list of compression codecs
              potentially used during CRAM creation.

       -Z     CRAM encoding only. Add lzma to the list of compression codecs
              potentially used during CRAM creation. Given the slow
              compression speed of lzma, it is only used where it gives a
              significant advantage over zlib or bzip2. With higher
              compression levels (-7) this weighting is ignored, as lzma
              decompression speed is acceptable, albeit still slower than
              zlib.

       -m     CRAM decoding only. Generate MD:Z: and NM:I: auxiliary fields
              based on the reference-based compression.

       -M     CRAM encoding only. Forcibly pack sequences from multiple
              references into the same slice. Normally CRAM will start a new
              slice when changing from one reference to another, but will
              still automatically switch to multi-reference slices if the
              number of sequences per slice becomes too small.

       -R range
              Currently for CRAM input only, but SAM/BAM support is pending.
              This indicates a reference sequence name and optionally a start
              and end location within that reference, using the syntax
              ref_name or ref_name:start-end. For efficient operation the
              CRAM file needs a .crai format index (built using the
              cram_index program).

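              The two range syntaxes accepted by -R can be sketched as
              follows. The helper name make_range and the file names are
              hypothetical; the command is printed rather than executed, so
              the sketch runs without scramble or an indexed CRAM file.

```shell
# Hypothetical helper: format a -R argument as ref_name or ref_name:start-end.
make_range() {
    if [ $# -eq 3 ]; then
        printf '%s:%s-%s\n' "$1" "$2" "$3"
    else
        printf '%s\n' "$1"
    fi
}

# Print the command line instead of running it (in.cram and out.sam are
# placeholder names).
echo scramble -R "$(make_range MT 1000 2000)" in.cram out.sam
```
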
       -r ref.fa
              CRAM encoding only. Use this to specify the reference fasta
              file. Note that if the input SAM or BAM file has a file: or
              local file system based URI specified in the @SQ headers then
              this option may not be necessary.

       -s number
              CRAM encoding only. Specifies the number of sequences per
              slice. Defaults to 10000.

       -S number
              CRAM encoding only. Specifies the number of slices per
              container. Defaults to 1.

       -t number
              BAM and CRAM only. Specifies the number of compression or
              decompression threads, adaptively shared between both encoding
              and decoding. Defaults to 1 (no threading).

       -V version_string
              CRAM encoding only. Sets the CRAM file format version.
              Supported values are "2.0", "2.1" and "3.0".

       -e     CRAM encoding only. Embed snippets of the reference sequence in
              every slice. This means the files can be decoded without
              needing to specify the reference fasta file.

       -x     CRAM encoding only. Omit reference based compression and
              instead store details of every base verbatim.

       -B     Experimental, encoding only. When storing quality values, bin
              into 8 discrete values (plus 0), as typically used by modern
              Illumina instruments. (Note that the bins may not be precisely
              the same ranges.)

       -!     CRAM v3.0 and above decoding only. Do not check CRCs. This
              option should only be used when attempting to recover from
              data corruption.

EXAMPLES

       To convert a BAM file from stdin to CRAM on stdout, using reference
       MT.fa:

           some_command | scramble -I bam -O cram -r MT.fa | some_command

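       The same conversion on named files, combined with the compression
       options described under OPTIONS, might look like the sketch below.
       The SCRAMBLE variable and the function name bam_to_small_cram are
       hypothetical conveniences, not part of scramble; the file names are
       placeholders.

```shell
# SCRAMBLE is a hypothetical override for the scramble binary, so the
# function can be pointed at a stub for a dry run.
SCRAMBLE=${SCRAMBLE:-scramble}

# Convert a BAM file to CRAM at compression level 7, allowing bzip2 blocks.
# $1 = reference fasta, $2 = input BAM, $3 = output CRAM.
# With named files the formats are taken from the filename suffixes.
bam_to_small_cram() {
    "$SCRAMBLE" -7 -j -r "$1" "$2" "$3"
}
```

       A call such as bam_to_small_cram MT.fa in.bam out.cram then runs
       scramble -7 -j -r MT.fa in.bam out.cram.
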
       The default CRAM output format is version 3.0, so no version needs to
       be specified when converting from 2.1 to 3.0. To perform the reverse
       use:

           scramble -V 2.1 in.cram out.cram

AUTHOR

       James Bonfield, Wellcome Trust Sanger Institute



                                 March 19 2013                     scramble(1)