mkdssp(1)                        USER COMMANDS                       mkdssp(1)

NAME

       mkdssp - Calculate secondary structure for proteins in a PDB file

SYNOPSIS

       mkdssp [OPTION] pdbfile [dsspfile]

DESCRIPTION

       The mkdssp program was originally designed by Wolfgang Kabsch and Chris
       Sander to standardize secondary structure assignment. DSSP is a
       database of secondary structure assignments (and much more) for all
       protein entries in the Protein Data Bank (PDB), and mkdssp is the
       application that calculates the DSSP entries from PDB entries. Please
       note that mkdssp does not predict secondary structure.

OPTIONS

       If you invoke mkdssp with only one parameter, it is interpreted as the
       PDB file to process and the output is sent to stdout. If a second
       parameter is specified, it is interpreted as the name of the DSSP file
       to create. Both the input and the output file names may end in .gz or
       .bz2, in which case the file is read or written with the corresponding
       compression.

       -i, --input filename
              The file name of a PDB formatted file containing the protein
              structure data. This file may be compressed with gzip or
              bzip2.

       -o, --output filename
              The file name of the DSSP file to create. If the file name ends
              in .gz or .bz2, a compressed file is created.

       -v, --verbose
              Write out diagnostic information.

       --version
              Print the version number and exit.

       -h, --help
              Print the help message and exit.
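
       As an illustration of the options above, the following Python sketch
       assembles an mkdssp command line from the documented -i, -o and -v
       flags. The helper names build_mkdssp_command and run_mkdssp are
       hypothetical, and the sketch assumes that mkdssp is installed and can
       be found on the PATH.

```python
import subprocess


def build_mkdssp_command(pdb_path, dssp_path=None, verbose=False):
    """Return an argument list using only the flags documented above."""
    cmd = ["mkdssp"]
    if verbose:
        cmd.append("-v")
    cmd += ["-i", pdb_path]
    if dssp_path is not None:
        # Without -o, mkdssp writes the DSSP output to stdout.
        cmd += ["-o", dssp_path]
    return cmd


def run_mkdssp(pdb_path, dssp_path=None, verbose=False):
    """Run mkdssp (assumed to be on PATH) and return whatever it printed."""
    cmd = build_mkdssp_command(pdb_path, dssp_path, verbose)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

       Calling run_mkdssp with only an input file returns the DSSP text
       itself. When an output name ending in .gz or .bz2 is given, the
       compressed file is created by mkdssp and the returned text holds only
       what mkdssp printed to stdout.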

THEORY

       The DSSP program works by calculating the most likely secondary
       structure assignment given the 3D structure of a protein. It does this
       by reading the positions of the atoms in a protein (the ATOM records in
       a PDB file), followed by a calculation of the H-bond energy between all
       atoms. The best two H-bonds for each atom are then used to determine
       the most likely class of secondary structure for each residue in the
       protein.

       This means you need a full and valid 3D structure of a protein to be
       able to calculate its secondary structure. There is no magic in DSSP,
       so it cannot, for example, guess the secondary structure of a mutated
       protein for which you don't have the 3D structure.
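
       The H-bond energy step can be sketched as follows. This page does not
       spell out the formula, so the sketch assumes the electrostatic model
       from the original Kabsch and Sander definition, in which the energy is
       evaluated between a backbone C=O group and a backbone N-H group and a
       bond is accepted below -0.5 kcal/mol. The function names are
       hypothetical, and mkdssp itself may apply extra clamping or rounding
       that is not shown here.

```python
import math

# Constants assumed from the Kabsch-Sander model (not listed in this page).
Q1 = 0.42            # partial charge on C and O, in units of e
Q2 = 0.20            # partial charge on N and H, in units of e
F = 332.0            # conversion factor giving kcal/mol for distances in Angstrom
HBOND_CUTOFF = -0.5  # kcal/mol; lower energies count as an H-bond


def hbond_energy(n, h, c, o):
    """Electrostatic energy between an N-H donor and a C=O acceptor.

    Each argument is an (x, y, z) coordinate tuple in Angstrom.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(n, h, c, o):
    """True when the energy falls below the assumed cutoff."""
    return hbond_energy(n, h, c, o) < HBOND_CUTOFF
```

       For a collinear N-H...O=C arrangement with an N to O distance of about
       2.9 Angstrom the energy is clearly below the cutoff, whereas groups
       that are far apart give an energy close to zero.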

DSSP FILE FORMAT

       The header part of each DSSP file is self-explanatory: it contains some
       of the information copied over from the PDB file and some statistics
       gathered while calculating the secondary structure.

       The second half of the file contains the calculated secondary structure
       information per residue. What follows is a brief explanation of each
       column.

       Column Name             Description
       ────────────────────────────────────────────────────────────────────────
       #                       The residue number as counted by mkdssp.
       RESIDUE                 The residue number as specified by the PDB
                               file, followed by a chain identifier.
       AA                      The one letter code for the amino acid. A
                               lower case letter means this residue is a
                               cysteine that forms a disulfide bridge
                               with the other cysteine marked with the
                               same lower case letter in this column.
       STRUCTURE               This is a complex column containing
                               multiple sub columns. The first sub column
                               contains a letter indicating the secondary
                               structure assigned to this residue. Valid
                               values are:
                                     Code            Description
                                       H             Alpha Helix
                                       B             Beta Bridge
                                       E             Strand
                                       G             Helix-3
                                       I             Helix-5
                                       T             Turn
                                       S             Bend
                               What follows are three columns indicating,
                               for each of the three helix types (3, 4
                               and 5), whether this residue is a candidate
                               in forming this helix. A > character
                               indicates it starts a helix, a number
                               indicates it is inside such a helix and a <
                               character means it ends the helix.
                               The next column contains an S character if
                               this residue is a possible bend.
                               Then there is a column indicating the
                               chirality, which is either positive or
                               negative (i.e. the alpha torsion is either
                               positive or negative).
                               The last two columns contain beta bridge
                               labels. Lower case here means parallel
                               bridge and upper case means antiparallel.
       BP1 and BP2             The first and second bridge pair
                               candidates, followed by a letter
                               indicating the sheet.
       ACC                     The accessibility of this residue: the
                               surface area, expressed in square
                               Ångström, that can be accessed by a water
                               molecule.
       N-H-->O..O-->H-N        Four columns that give, for each residue,
                               the H-bond energy with another residue
                               where the current residue is either
                               acceptor or donor. Each column contains two
                               numbers: the first is an offset from the
                               current residue to the partner residue in
                               this H-bond (in DSSP numbering), the second
                               is the calculated energy for this H-bond.
       TCO                     The cosine of the angle between C=O of the
                               current residue and C=O of the previous
                               residue. For alpha-helices, TCO is near
                               +1, for beta-sheets TCO is near -1. Not
                               used for structure definition.
       Kappa                   The virtual bond angle (bend angle)
                               defined by the three C-alpha atoms of the
                               residues current - 2, current and current
                               + 2. Used to define bend (structure code
                               'S').
       PHI and PSI             IUPAC peptide backbone torsion angles.
       X-CA, Y-CA and Z-CA     The C-alpha coordinates.
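
       As a rough illustration of reading the per-residue part of a DSSP
       file, the Python sketch below tallies the secondary structure codes
       from the table above. It rests on two assumptions that this page does
       not state: the residue section starts after a header line beginning
       with "#  RESIDUE", and the structure code sits at the 0-based
       character offset 16 of each residue line. Check both against your own
       files before relying on them. The function names are hypothetical.

```python
from collections import Counter

# Codes and descriptions as listed in the STRUCTURE row of the table.
SS_DESCRIPTIONS = {
    "H": "Alpha Helix",
    "B": "Beta Bridge",
    "E": "Strand",
    "G": "Helix-3",
    "I": "Helix-5",
    "T": "Turn",
    "S": "Bend",
}

SS_COLUMN = 16  # assumed 0-based offset of the structure code


def residue_records(dssp_text):
    """Return the lines that follow the (assumed) '#  RESIDUE' header line."""
    lines = dssp_text.splitlines()
    for index, line in enumerate(lines):
        if line.lstrip().startswith("#  RESIDUE"):
            return lines[index + 1:]
    return []


def tally_structure(dssp_text):
    """Count residues per secondary structure class."""
    counts = Counter()
    for line in residue_records(dssp_text):
        code = line[SS_COLUMN:SS_COLUMN + 1]
        if code in SS_DESCRIPTIONS:
            counts[SS_DESCRIPTIONS[code]] += 1
    return counts
```

       Residues without one of the listed codes are simply left out of the
       tally.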
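
       The TCO and Kappa definitions above come down to simple vector
       geometry, sketched below in Python. Kappa is computed here as the
       angle between the direction from C-alpha(current - 2) to
       C-alpha(current) and the direction from C-alpha(current) to
       C-alpha(current + 2), so a straight chain gives 0 degrees. The
       70 degree bend threshold is taken from the original DSSP definition
       and is an assumption as far as this page is concerned. All function
       names are hypothetical.

```python
import math

BEND_THRESHOLD_DEGREES = 70.0  # assumed from the original DSSP definition


def _sub(a, b):
    """Component-wise difference a - b of two (x, y, z) tuples."""
    return tuple(x - y for x, y in zip(a, b))


def _cos_angle(u, v):
    """Cosine of the angle between vectors u and v, clamped to [-1, 1]."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return max(-1.0, min(1.0, dot / norm))


def tco(c_cur, o_cur, c_prev, o_prev):
    """Cosine of the angle between C=O of the current and previous residue."""
    return _cos_angle(_sub(c_cur, o_cur), _sub(c_prev, o_prev))


def kappa(ca_prev2, ca_cur, ca_next2):
    """Bend angle in degrees from the C-alpha atoms of residues i-2, i, i+2."""
    cos_k = _cos_angle(_sub(ca_cur, ca_prev2), _sub(ca_next2, ca_cur))
    return math.degrees(math.acos(cos_k))


def is_bend(ca_prev2, ca_cur, ca_next2):
    """True when kappa exceeds the assumed bend threshold."""
    return kappa(ca_prev2, ca_cur, ca_next2) > BEND_THRESHOLD_DEGREES
```

       Two carbonyl groups pointing the same way give a TCO of +1 and
       opposite carbonyl groups give -1, which matches the helix and sheet
       values mentioned in the table.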

HISTORY

       The original DSSP application was written by Wolfgang Kabsch and Chris
       Sander in Pascal. This version is a complete rewrite in C++ based on
       the original source code. A few bugs have been fixed since and the
       algorithms have been tweaked here and there.

TODO

       The code desperately needs an update. The first thing that needs
       implementing is improved recognition of pi-helices. A second
       improvement would be to use an angle-dependent H-bond energy
       calculation.

BUGS

       If you find any, please let me know.

AUTHOR

       Maarten L. Hekkelman (m.hekkelman (at) cmbi.ru.nl)

version 2.0.4                     18-apr-2012                        mkdssp(1)