gfs2(5)                       File Formats Manual                      gfs2(5)


       gfs2 - GFS2 reference guide


       Overview of the GFS2 filesystem


       GFS2 is a clustered filesystem, designed for sharing data between
       multiple nodes connected to a common shared storage device. It can
       also be used as a local filesystem on a single node. However, since
       the design is aimed at clusters, that will usually result in lower
       performance than using a filesystem designed specifically for
       single-node use.

       GFS2 is a journaling filesystem, and one journal is required for each
       node that will mount the filesystem. The one exception is spectator
       mounts, which are equivalent to mounting a read-only block device.
       They can neither recover a journal nor write to the filesystem, so
       they do not require a journal.


       lockproto=LockProtoName
              This specifies which inter-node lock protocol is used by the
              GFS2 filesystem for this mount, overriding the default lock
              protocol name stored in the filesystem's on-disk superblock.

              The LockProtoName must be one of the supported locking
              protocols; currently these are lock_nolock and lock_dlm.

              The default lock protocol name is written to disk when the
              filesystem is created with the -p option of mkfs.gfs2(8). It
              can be changed on disk with the sb proto command of the
              gfs2_tool(8) utility.

              The lockproto mount option should be used only under special
              circumstances in which you want to temporarily use a different
              lock protocol without changing the on-disk default. Using the
              incorrect lock protocol on a cluster filesystem mounted from
              more than one node will almost certainly result in filesystem
              corruption.
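
              As an illustration of a temporary override, the sketch below
              builds (but does not run) a mount command with an explicit
              lockproto value and rejects anything other than the two
              protocols named above. The function name gfs2_mount_cmd and the
              device and mount point paths are hypothetical.

```shell
# Hypothetical helper: print a mount command that overrides the lock protocol.
# Nothing is executed; the caller can review the line before running it.
gfs2_mount_cmd() {
    # $1 = lock protocol, $2 = block device, $3 = mount point
    case "$1" in
        lock_nolock|lock_dlm) ;;
        *) echo "unsupported lock protocol: $1" >&2; return 1 ;;
    esac
    printf 'mount -t gfs2 -o lockproto=%s %s %s\n' "$1" "$2" "$3"
}

# Example paths are assumptions, not values taken from this page.
gfs2_mount_cmd lock_nolock /dev/vg0/gfs2lv /mnt/gfs2
```

              Printing the command rather than running it leaves room to
              confirm that no other node has the filesystem mounted first.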

       locktable=LockTableName
              This specifies the identity of the cluster and of the filesystem
              for this mount, overriding the default cluster/filesystem
              identity stored in the filesystem's on-disk superblock. The
              cluster/filesystem name is recognized globally throughout the
              cluster, and establishes a unique namespace for the inter-node
              locking system, enabling the mounting of multiple GFS2
              filesystems.

              The format of LockTableName is lock-module-specific. For
              lock_dlm, the format is clustername:fsname. For lock_nolock,
              the field is ignored.

              The default cluster/filesystem name is written to disk when the
              filesystem is created with the -t option of mkfs.gfs2(8). It
              can be changed on disk with the sb table command of the
              gfs2_tool(8) utility.

              The locktable mount option should be used only under special
              circumstances in which you want to mount the filesystem in a
              different cluster, or mount it under a different filesystem
              name, without changing the on-disk default.
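
              The clustername:fsname format used by lock_dlm can be checked
              before it is passed to mount or mkfs.gfs2. The sketch below is
              a hypothetical helper, parse_locktable, that splits a name into
              its two parts. Rejecting empty parts and extra colons is an
              assumption made here for safety, not a rule stated in this
              page.

```shell
# Hypothetical helper: split a lock_dlm lock table name into its two parts.
# Prints "cluster=<name> fs=<name>" on success, returns 1 otherwise.
parse_locktable() {
    case "$1" in
        ''|:*|*:|*:*:*) return 1 ;;   # empty input, empty part, or extra colon
        *:*) printf 'cluster=%s fs=%s\n' "${1%%:*}" "${1#*:}" ;;
        *) return 1 ;;                # no colon at all
    esac
}

# "abc" is the sample cluster name used later in this page; "fs1" is made up.
parse_locktable abc:fs1
```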

       localflocks
              This flag tells GFS2 that it is running as a local (not
              clustered) filesystem, so it can allow the kernel VFS layer to
              do all flock and fcntl file locking. When running in cluster
              mode, these file locks require inter-node locks, and require
              the support of GFS2. When running locally, better performance
              is achieved by letting VFS handle the whole job.

              This is turned on automatically by the lock_nolock module.

       errors=[panic|withdraw]
              Setting errors=panic causes GFS2 to oops when encountering an
              error that would otherwise cause the mount to withdraw or print
              an assertion warning. The default setting is errors=withdraw.
              This option should not be used in a production system. It
              replaces the earlier debug option on kernel versions 2.6.31 and
              above.

       acl    Enables POSIX Access Control List acl(5) support within GFS2.

       spectator
              Mount this filesystem using a special form of read-only mount.
              The mount does not use one of the filesystem's journals. The
              node is unable to recover journals for other nodes.

       norecovery
              A synonym for spectator.

       suiddir
              Sets the owner of any newly created file or directory to that
              of the parent directory, if the parent directory has the
              S_ISUID permission attribute bit set. Sets S_ISUID in any new
              directory, if its parent directory's S_ISUID is set. Strips all
              execution bits on a new file, if the parent directory's owner
              is different from the owner of the process creating the file.
              Set this option only if you know why you are setting it.

       quota=[off|account|on]
              Turns quotas on or off for a filesystem. Setting the quotas to
              be in the "account" state causes the per UID/GID usage
              statistics to be correctly maintained by the filesystem; limit
              and warn values are ignored. The default value is "off".

       discard
              Causes GFS2 to generate "discard" I/O requests for blocks which
              have been freed. These can be used by suitable hardware to
              implement thin-provisioning and similar schemes. This feature is
              supported in kernel version 2.6.30 and above.

       barrier
              This option, which defaults to on, causes GFS2 to send I/O
              barriers when flushing the journal. The option is automatically
              turned off if the underlying device does not support I/O
              barriers. We highly recommend the use of I/O barriers with GFS2
              at all times unless the block device is designed so that it
              cannot lose its write cache content (e.g. it's on a UPS, or it
              doesn't have a write cache).

       commit=secs
              This is similar to the ext3 commit= option in that it sets the
              maximum number of seconds between journal commits if there is
              dirty data in the journal. The default is 60 seconds. This
              option is only provided in kernel versions 2.6.31 and above.

       data=[ordered|writeback]
              When data=ordered is set, the user data modified by a
              transaction is flushed to the disk before the transaction is
              committed to disk. This should prevent the user from seeing
              uninitialized blocks in a file after a crash. The
              data=writeback mode writes the user data to the disk at any
              time after it's dirtied. This doesn't provide the same
              consistency guarantee as ordered mode, but it should be
              slightly faster for some workloads. The default is ordered
              mode.

       meta   This option results in selecting the meta filesystem root rather
              than the normal filesystem root. This option is normally only
              used by the GFS2 utility functions. Altering any file on the
              GFS2 meta filesystem may render the filesystem unusable, so only
              experts in the GFS2 on-disk layout should use this option.

       quota_quantum=secs
              This sets the number of seconds for which a change in the quota
              information may sit on one node before being written to the
              quota file. This is the preferred way to set this parameter. The
              value is an integer number of seconds greater than zero. The
              default is 60 seconds. Shorter settings result in faster updates
              of the lazy quota information and less likelihood of someone
              exceeding their quota. Longer settings make filesystem
              operations involving quotas faster and more efficient.

       statfs_quantum=secs
              Setting statfs_quantum to 0 is the preferred way to select the
              slow version of statfs. The default value is 30 seconds, which
              sets the maximum time period before statfs changes will be
              synced to the master statfs file. This can be adjusted to allow
              for faster, less accurate statfs values or slower, more
              accurate values. When set to 0, statfs will always report the
              true values.

       statfs_percent=value
              This setting provides a bound on the maximum percentage change
              in the statfs information on a local basis before it is synced
              back to the master statfs file, even if the time period has not
              expired. If the setting of statfs_quantum is 0, then this
              setting is ignored.
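
              The interaction between statfs_quantum and statfs_percent can
              be modelled as a small decision function. The sketch below is
              one reading of the two descriptions above, not the kernel's
              code. The function name and argument layout are hypothetical,
              and measuring the percentage against the free-block count at
              the last sync is an assumption that this page does not state.

```shell
# Hypothetical model of the sync decision described above.
# $1 = statfs_quantum (secs)      $2 = statfs_percent
# $3 = seconds since last sync    $4 = local change in free blocks
# $5 = free blocks recorded at the last sync (assumed baseline)
statfs_needs_sync() {
    quantum=$1 percent=$2 elapsed=$3 change=$4 baseline=$5
    # statfs_quantum=0 selects the slow, always-accurate statfs,
    # so statfs_percent is ignored and nothing is deferred.
    if [ "$quantum" -eq 0 ]; then
        echo exact
        return 0
    fi
    # Only the size of the change matters, not its direction.
    if [ "$change" -lt 0 ]; then
        change=$(( -change ))
    fi
    if [ "$elapsed" -ge "$quantum" ]; then
        echo sync-time
    elif [ "$percent" -gt 0 ] && [ $(( change * 100 )) -ge $(( baseline * percent )) ]; then
        echo sync-percent
    else
        echo wait
    fi
}

# 600 blocks changed against a baseline of 10000 exceeds a 5% bound.
statfs_needs_sync 30 5 10 600 10000
```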

       rgrplvb
              This flag tells gfs2 to look for information about a resource
              group's free space and unlinked inodes in its glock lock value
              block. This keeps gfs2 from having to read in the resource group
              data from disk, speeding up allocations in some cases. This
              option was added in the 3.6 Linux kernel. Prior to this kernel,
              no information was saved to the resource group lvb.

              Note: To safely turn on this option, all nodes mounting the
              filesystem must be running at least a 3.6 Linux kernel. If any
              nodes had previously mounted the filesystem using older kernels,
              the filesystem must be unmounted on all nodes before it can be
              mounted with this option enabled. This option does not need to
              be enabled on all nodes using a filesystem.

       loccookie
              This flag tells gfs2 to use location based readdir cookies,
              instead of its usual filename hash readdir cookies. The filename
              hash cookies are not guaranteed to be unique, and as the number
              of files in a directory increases, so does the likelihood of a
              collision. NFS requires readdir cookies to be unique, which can
              cause problems with very large directories (over 100,000
              files). With this flag set, gfs2 will try to give out location
              based cookies. Since the cookie is 31 bits, gfs2 will eventually
              run out of unique cookies, and will fall back to using hash
              cookies. The maximum number of files that could have unique
              location cookies, assuming perfectly even hashing and names of
              8 or fewer characters, is 1,073,741,824. An average directory
              should be able to give out well over half a billion location
              based cookies. This option was added in the 4.5 Linux kernel.
              Prior to this kernel, gfs2 did not add directory entries in a
              way that allowed it to use location based readdir cookies.

              Note: To safely turn on this option, all nodes mounting the
              filesystem must be running at least a 4.5 Linux kernel. If this
              option is only enabled on some of the nodes mounting a
              filesystem, the cookies returned by nodes using this option will
              not be valid on nodes that are not using this option, and vice
              versa. Finally, when first enabling this option on a filesystem
              that had been previously mounted without it, you must make sure
              that there are no outstanding cookies being cached by other
              software, such as NFS.
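
              The cookie figures above can be checked with shell arithmetic.
              A 31-bit cookie has 2^31 possible values, and the quoted
              maximum of 1,073,741,824 equals 2^30. The helper name
              cookie_space is hypothetical, and the sketch assumes a shell
              with 64-bit arithmetic.

```shell
# Hypothetical helper: number of distinct values representable in $1 bits.
cookie_space() {
    echo $(( 1 << $1 ))
}

cookie_space 31   # every value a 31-bit cookie can take
cookie_space 30   # equals the 1,073,741,824 maximum quoted above
```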


       GFS2 doesn't support errors=remount-ro or data=journal. It is not
       possible to switch support for user and group quotas on and off
       independently of each other. Some of the error messages are rather
       cryptic. If you encounter one of these messages, check first that
       gfs_controld is running and second that you have enough journals on
       the filesystem for the number of nodes in use.


       mount(8) for general mount options, chmod(1) and chmod(2) for access
       permission flags, acl(5) for access control lists, lvm(8) for volume
       management, ccs(7) for cluster management, umount(8), initrd(4).

       The GFS2 documentation has been split into a number of sections:

       gfs2_edit(8)   A GFS2 debug tool (use with caution)
       fsck.gfs2(8)   The GFS2 file system checker
       gfs2_grow(8)   Growing a GFS2 file system
       gfs2_jadd(8)   Adding a journal to a GFS2 file system
       mkfs.gfs2(8)   Make a GFS2 file system
       gfs2_quota(8)  Manipulate GFS2 disk quotas
       gfs2_tool(8)   Tool to manipulate a GFS2 file system (obsolete)
       tunegfs2(8)    Tool to manipulate GFS2 superblocks


       GFS2 clustering is driven by the dlm, which depends on dlm_controld to
       provide clustering from userspace. dlm_controld clustering is built on
       corosync cluster/group membership and messaging.

       Follow these steps to manually configure and run gfs2/dlm/corosync.

       1. create /etc/corosync/corosync.conf and copy to all nodes

       In this sample, replace cluster_name and IP addresses, and add nodes as
       needed. If using only two nodes, uncomment the two_node line. See
       corosync.conf(5) for more information.

       totem {
               version: 2
               secauth: off
               cluster_name: abc
       }

       nodelist {
               node {
                       ring0_addr:
                       nodeid: 1
               }
               node {
                       ring0_addr:
                       nodeid: 2
               }
               node {
                       ring0_addr:
                       nodeid: 3
               }
       }

       quorum {
               provider: corosync_votequorum
       #       two_node: 1
       }

       logging {
               to_syslog: yes
       }
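
       When the node count changes often, the nodelist section can be
       generated rather than edited by hand. The sketch below is a
       hypothetical helper, corosync_nodelist, that prints one node entry per
       address and numbers the nodes in argument order. The addresses shown
       come from a documentation-only range and are placeholders.

```shell
# Hypothetical helper: print a nodelist section, one node per address given.
# Node IDs are assigned 1, 2, 3, ... in argument order.
corosync_nodelist() {
    id=1
    echo 'nodelist {'
    for addr in "$@"; do
        printf '        node {\n'
        printf '                ring0_addr: %s\n' "$addr"
        printf '                nodeid: %d\n' "$id"
        printf '        }\n'
        id=$(( id + 1 ))
    done
    echo '}'
}

# 192.0.2.0/24 is reserved for documentation; substitute real node addresses.
corosync_nodelist 192.0.2.1 192.0.2.2 192.0.2.3
```

       The output can be pasted into corosync.conf in place of the
       hand-written nodelist block.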

       2. start corosync on all nodes

       systemctl start corosync

       Run corosync-quorumtool to verify that all nodes are listed.

       3. create /etc/dlm/dlm.conf and copy to all nodes

       * To use no fencing, use this line:

       enable_fencing=0

       * To use no fencing, but exercise fencing functions, use this line:

       fence_all /bin/true

       The "true" binary will be executed for all nodes and will succeed (exit
       0) immediately.

       * To use manual fencing, use this line:

       fence_all /bin/false

       The "false" binary will be executed for all nodes and will fail (exit
       1) immediately.

       When a node fails, manually run: dlm_tool fence_ack <nodeid>

       * To use stonith/pacemaker for fencing, use this line:

       fence_all /usr/sbin/dlm_stonith

       The "dlm_stonith" binary will be executed for all nodes. If
       stonith/pacemaker systems are not available, dlm_stonith will fail and
       this config becomes the equivalent of the previous /bin/false config.

       * To use an APC power switch, use these lines:

       device  apc /usr/sbin/fence_apc ipaddr= login=admin password=pw
       connect apc node=1 port=1
       connect apc node=2 port=2
       connect apc node=3 port=3

       Other network switch based agents are configured similarly.

       * To use sanlock/watchdog fencing, use these lines:

       device wd /usr/sbin/fence_sanlock path=/dev/fence/leases
       connect wd node=1 host_id=1
       connect wd node=2 host_id=2
       unfence wd

       See fence_sanlock(8) for more information.

       * For other fencing configurations see the dlm.conf(5) man page.

       4. start dlm_controld on all nodes

       systemctl start dlm

       Run "dlm_tool status" to verify that all nodes are listed.

       5. if using clvm, start clvmd on all nodes

       systemctl start clvmd

       6. make new gfs2 file systems

       mkfs.gfs2 -p lock_dlm -t cluster_name:fs_name -j num /path/to/storage

       The cluster_name must match the name used in step 1 above. The fs_name
       must be a unique name in the cluster. The -j option is the number of
       journals to create; there must be one for each node that will mount
       the fs.
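
       Since the journal count has to track the number of mounting nodes, it
       can help to derive the mkfs.gfs2 command from that number. The sketch
       below is a hypothetical helper, mkfs_gfs2_cmd, that only prints the
       command. The filesystem name fs1 is made up, and abc is the sample
       cluster name from step 1.

```shell
# Hypothetical helper: print the mkfs.gfs2 command for a cluster of N nodes.
# $1 = cluster name, $2 = filesystem name, $3 = node count, $4 = device
mkfs_gfs2_cmd() {
    case "$3" in
        ''|*[!0-9]*|0)
            echo "node count must be a positive integer" >&2
            return 1 ;;
    esac
    # One journal per node that will mount the filesystem.
    printf 'mkfs.gfs2 -p lock_dlm -t %s:%s -j %s %s\n' "$1" "$2" "$3" "$4"
}

# Nothing is formatted here; review the printed line before running it.
mkfs_gfs2_cmd abc fs1 3 /path/to/storage
```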

       7. mount gfs2 file systems

       mount /path/to/storage /mountpoint

       Run "dlm_tool ls" to verify the nodes that have each fs mounted.

       8. shut down

       umount -a -t gfs2
       systemctl stop clvmd
       systemctl stop dlm
       systemctl stop corosync

       More setup information:
       dlm_controld(8),
       dlm_tool(8),
       dlm.conf(5),
       corosync(8),
       corosync.conf(5)