gfs2(5)                       File Formats Manual                      gfs2(5)

NAME
       gfs2 - GFS2 reference guide

SYNOPSIS
       Overview of the GFS2 filesystem

DESCRIPTION
       GFS2 is a clustered filesystem, designed for sharing data between
       multiple nodes connected to a common shared storage device. It can
       also be used as a local filesystem on a single node; however, since
       the design is aimed at clusters, this will usually result in lower
       performance than a filesystem designed specifically for single-node
       use.

       GFS2 is a journaling filesystem, and one journal is required for
       each node that will mount the filesystem. The one exception is
       spectator mounts, which are equivalent to mounting a read-only
       block device and, as such, can neither recover a journal nor write
       to the filesystem, so they do not require a journal to be assigned
       to them.

OPTIONS
       lockproto=LockProtoName
              This specifies which inter-node lock protocol is used by the
              GFS2 filesystem for this mount, overriding the default lock
              protocol name stored in the filesystem's on-disk superblock.

              The LockProtoName must be one of the supported locking
              protocols; currently these are lock_nolock and lock_dlm.

              The default lock protocol name is written to disk initially
              when creating the filesystem with mkfs.gfs2(8), using its -p
              option. It can be changed on-disk with the gfs2_tool(8)
              utility's "sb proto" command.

              The lockproto mount option should be used only under special
              circumstances in which you want to temporarily use a
              different lock protocol without changing the on-disk
              default. Using the incorrect lock protocol on a cluster
              filesystem mounted from more than one node will almost
              certainly result in filesystem corruption.
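
              For example, to temporarily mount a filesystem on a single
              node without cluster locking, while no other node has it
              mounted (a sketch; the device and mount point below are
              placeholders):

              mount -t gfs2 -o lockproto=lock_nolock /dev/vg0/lv_gfs2 /mnt/gfs2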

       locktable=LockTableName
              This specifies the identity of the cluster and of the
              filesystem for this mount, overriding the default
              cluster/filesystem identity stored in the filesystem's
              on-disk superblock. The cluster/filesystem name is
              recognized globally throughout the cluster, and establishes
              a unique namespace for the inter-node locking system,
              enabling the mounting of multiple GFS2 filesystems.

              The format of LockTableName is lock-module-specific. For
              lock_dlm, the format is clustername:fsname. For
              lock_nolock, the field is ignored.

              The default cluster/filesystem name is written to disk
              initially when creating the filesystem with mkfs.gfs2(8),
              using its -t option. It can be changed on-disk with the
              gfs2_tool(8) utility's "sb table" command.

              The locktable mount option should be used only under special
              circumstances in which you want to mount the filesystem in a
              different cluster, or mount it as a different filesystem
              name, without changing the on-disk default.
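
              For example, to mount the filesystem under a different
              cluster name without rewriting the superblock (a sketch; the
              cluster name, filesystem name, device and mount point are
              placeholders):

              mount -t gfs2 -o locktable=newcluster:myfs /dev/vg0/lv_gfs2 /mnt/gfs2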

       localflocks
              This flag tells GFS2 that it is running as a local (not
              clustered) filesystem, so it can allow the kernel VFS layer
              to do all flock and fcntl file locking. When running in
              cluster mode, these file locks require inter-node locks and
              the support of GFS2. When running locally, better
              performance is achieved by letting VFS handle the whole job.

              This is turned on automatically by the lock_nolock module.

       errors=[panic|withdraw]
              Setting errors=panic causes GFS2 to oops when encountering
              an error that would otherwise cause the mount to withdraw or
              print an assertion warning. The default setting is
              errors=withdraw. This option should not be used in a
              production system. On kernel versions 2.6.31 and above it
              replaces the earlier debug option.

       acl    Enables POSIX Access Control List (acl(5)) support within
              GFS2.

       spectator
              Mount this filesystem using a special form of read-only
              mount. The mount does not use one of the filesystem's
              journals, and the node is unable to recover journals for
              other nodes.
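
              A spectator mount might look like the following (a sketch;
              the device and mount point are placeholders):

              mount -t gfs2 -o spectator /dev/vg0/lv_gfs2 /mnt/gfs2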

       norecovery
              A synonym for spectator.

       suiddir
              Sets the owner of any newly created file or directory to
              that of the parent directory, if the parent directory has
              the S_ISUID permission bit set. Sets S_ISUID on any new
              directory if its parent directory's S_ISUID is set. Strips
              all execution bits on a new file if the parent directory's
              owner differs from the owner of the process creating the
              file. Set this option only if you know why you are setting
              it.

       quota=[off/account/on]
              Turns quotas on or off for a filesystem. Setting the quotas
              to be in the "account" state causes the per-UID/GID usage
              statistics to be correctly maintained by the filesystem;
              limit and warn values are ignored. The default value is
              "off".
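
              For example, to track usage without enforcing limits (a
              sketch; the device and mount point are placeholders):

              mount -t gfs2 -o quota=account /dev/vg0/lv_gfs2 /mnt/gfs2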

       discard
              Causes GFS2 to generate "discard" I/O requests for blocks
              which have been freed. These can be used by suitable
              hardware to implement thin-provisioning and similar schemes.
              This feature is supported in kernel version 2.6.30 and
              above.
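
              For example, on thinly provisioned storage (a sketch; the
              device below is a placeholder):

              mount -t gfs2 -o discard /dev/mapper/thin-gfs2 /mnt/gfs2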

       barrier
              This option, which defaults to on, causes GFS2 to send I/O
              barriers when flushing the journal. The option is
              automatically turned off if the underlying device does not
              support I/O barriers. We highly recommend the use of I/O
              barriers with GFS2 at all times unless the block device is
              designed so that it cannot lose its write cache content
              (e.g. it is on a UPS, or it doesn't have a write cache).

       commit=secs
              This is similar to the ext3 commit= option in that it sets
              the maximum number of seconds between journal commits if
              there is dirty data in the journal. The default is 60
              seconds. This option is only provided in kernel versions
              2.6.31 and above.

       data=[ordered|writeback]
              When data=ordered is set, the user data modified by a
              transaction is flushed to the disk before the transaction is
              committed to disk. This should prevent the user from seeing
              uninitialized blocks in a file after a crash. The
              data=writeback mode writes the user data to the disk at any
              time after it is dirtied. This doesn't provide the same
              consistency guarantee as ordered mode, but it should be
              slightly faster for some workloads. The default is ordered
              mode.
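
              Several of these options can be combined in a single mount,
              for example (a sketch; the option values, device and mount
              point are illustrative, not recommendations):

              mount -t gfs2 -o noatime,commit=30,data=writeback /dev/vg0/lv_gfs2 /mnt/gfs2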

       meta   This option results in selecting the meta filesystem root
              rather than the normal filesystem root. It is normally only
              used by the GFS2 utility functions. Altering any file on
              the GFS2 meta filesystem may render the filesystem unusable,
              so only experts in the GFS2 on-disk layout should use this
              option.

       quota_quantum=secs
              This sets the number of seconds for which a change in the
              quota information may sit on one node before being written
              to the quota file. This is the preferred way to set this
              parameter. The value is an integer number of seconds
              greater than zero. The default is 60 seconds. Shorter
              settings result in faster updates of the lazy quota
              information and less likelihood of someone exceeding their
              quota. Longer settings make filesystem operations involving
              quotas faster and more efficient.

       statfs_quantum=secs
              Setting statfs_quantum to 0 is the preferred way to set the
              slow version of statfs. The default value is 30 secs, which
              sets the maximum time period before statfs changes will be
              synced to the master statfs file. This can be adjusted to
              allow for faster, less accurate statfs values or slower,
              more accurate values. When set to 0, statfs will always
              report the true values.
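
              For example, to make statfs always report accurate values at
              some cost in performance (a sketch; the device and mount
              point are placeholders):

              mount -t gfs2 -o statfs_quantum=0 /dev/vg0/lv_gfs2 /mnt/gfs2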

       statfs_percent=value
              This setting provides a bound on the maximum percentage
              change in the statfs information on a local basis before it
              is synced back to the master statfs file, even if the time
              period has not expired. If the setting of statfs_quantum is
              0, then this setting is ignored.

       rgrplvb
              This flag tells gfs2 to look for information about a
              resource group's free space and unlinked inodes in its glock
              lock value block. This keeps gfs2 from having to read in
              the resource group data from disk, speeding up allocations
              in some cases. This option was added in the 3.6 Linux
              kernel. Prior to this kernel, no information was saved to
              the resource group lvb. Note: To safely turn on this
              option, all nodes mounting the filesystem must be running at
              least a 3.6 Linux kernel. If any nodes had previously
              mounted the filesystem using older kernels, the filesystem
              must be unmounted on all nodes before it can be mounted with
              this option enabled. This option does not need to be
              enabled on all nodes using a filesystem.

       loccookie
              This flag tells gfs2 to use location-based readdir cookies,
              instead of its usual filename-hash readdir cookies. The
              filename-hash cookies are not guaranteed to be unique, and
              as the number of files in a directory increases, so does the
              likelihood of a collision. NFS requires readdir cookies to
              be unique, which can cause problems with very large
              directories (over 100,000 files). With this flag set, gfs2
              will try to give out location-based cookies. Since the
              cookie is 31 bits, gfs2 will eventually run out of unique
              cookies and will fall back to using hash cookies. The
              maximum number of files that could have unique location
              cookies, assuming perfectly even hashing and names of 8 or
              fewer characters, is 1,073,741,824. An average directory
              should be able to give out well over half a billion
              location-based cookies. This option was added in the 4.5
              Linux kernel. Prior to this kernel, gfs2 did not add
              directory entries in a way that allowed it to use
              location-based readdir cookies. Note: To safely turn on
              this option, all nodes mounting the filesystem must be
              running at least a 4.5 Linux kernel. If this option is only
              enabled on some of the nodes mounting a filesystem, the
              cookies returned by nodes using this option will not be
              valid on nodes that are not using this option, and vice
              versa. Finally, when first enabling this option on a
              filesystem that had been previously mounted without it, you
              must make sure that there are no outstanding cookies being
              cached by other software, such as NFS.

BUGS
       GFS2 doesn't support errors=remount-ro or data=journal. It is not
       possible to switch support for user and group quotas on and off
       independently of each other. Some of the error messages are rather
       cryptic; if you encounter one of these messages, check firstly that
       gfs_controld is running and secondly that you have enough journals
       on the filesystem for the number of nodes in use.

SEE ALSO
       mount(8) for general mount options, chmod(1) and chmod(2) for
       access permission flags, acl(5) for access control lists, lvm(8)
       for volume management, ccs(7) for cluster management, umount(8),
       initrd(4).

       The GFS2 documentation has been split into a number of sections:

       gfs2_edit(8)   A GFS2 debug tool (use with caution)
       fsck.gfs2(8)   The GFS2 file system checker
       gfs2_grow(8)   Growing a GFS2 file system
       gfs2_jadd(8)   Adding a journal to a GFS2 file system
       mkfs.gfs2(8)   Make a GFS2 file system
       gfs2_quota(8)  Manipulate GFS2 disk quotas
       gfs2_tool(8)   Tool to manipulate a GFS2 file system (obsolete)
       tunegfs2(8)    Tool to manipulate GFS2 superblocks

SETUP
       GFS2 clustering is driven by the dlm, which depends on dlm_controld
       to provide clustering from userspace. dlm_controld clustering is
       built on corosync cluster/group membership and messaging.

       Follow these steps to manually configure and run gfs2/dlm/corosync.

       1. create /etc/corosync/corosync.conf and copy to all nodes

       In this sample, replace cluster_name and IP addresses, and add
       nodes as needed. If using only two nodes, uncomment the two_node
       line. See corosync.conf(5) for more information.

       totem {
               version: 2
               secauth: off
               cluster_name: abc
       }

       nodelist {
               node {
                       ring0_addr: 10.10.10.1
                       nodeid: 1
               }
               node {
                       ring0_addr: 10.10.10.2
                       nodeid: 2
               }
               node {
                       ring0_addr: 10.10.10.3
                       nodeid: 3
               }
       }

       quorum {
               provider: corosync_votequorum
               # two_node: 1
       }

       logging {
               to_syslog: yes
       }

       2. start corosync on all nodes

       systemctl start corosync

       Run corosync-quorumtool to verify that all nodes are listed.
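
       Optionally, on a systemd-based distribution (an assumption; adjust
       for the local init system), corosync can also be enabled so that it
       starts at boot:

       systemctl enable corosync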

       3. create /etc/dlm/dlm.conf and copy to all nodes

       * To use no fencing, use this line:

       enable_fencing=0

       * To use no fencing, but exercise fencing functions, use this line:

       fence_all /bin/true

       The "true" binary will be executed for all nodes and will succeed
       (exit 0) immediately.

       * To use manual fencing, use this line:

       fence_all /bin/false

       The "false" binary will be executed for all nodes and will fail
       (exit 1) immediately.

       When a node fails, manually run: dlm_tool fence_ack <nodeid>

       * To use stonith/pacemaker for fencing, use this line:

       fence_all /usr/sbin/dlm_stonith

       The "dlm_stonith" binary will be executed for all nodes. If
       stonith/pacemaker systems are not available, dlm_stonith will fail
       and this config becomes the equivalent of the previous /bin/false
       config.

       * To use an APC power switch, use these lines:

       device apc /usr/sbin/fence_apc ipaddr=1.1.1.1 login=admin password=pw
       connect apc node=1 port=1
       connect apc node=2 port=2
       connect apc node=3 port=3

       Other network switch based agents are configured similarly.

       * To use sanlock/watchdog fencing, use these lines:

       device wd /usr/sbin/fence_sanlock path=/dev/fence/leases
       connect wd node=1 host_id=1
       connect wd node=2 host_id=2
       unfence wd

       See fence_sanlock(8) for more information.

       * For other fencing configurations see the dlm.conf(5) man page.

       4. start dlm_controld on all nodes

       systemctl start dlm

       Run "dlm_tool status" to verify that all nodes are listed.

       5. if using clvm, start clvmd on all nodes

       systemctl start clvmd

       6. make new gfs2 file systems

       mkfs.gfs2 -p lock_dlm -t cluster_name:fs_name -j num /path/to/storage

       The cluster_name must match the name used in step 1 above. The
       fs_name must be a unique name in the cluster. The -j option is the
       number of journals to create; there must be one for each node that
       will mount the filesystem.
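
       As a concrete sketch matching the three-node sample configuration
       in step 1 (the filesystem name and logical volume path below are
       placeholders):

       mkfs.gfs2 -p lock_dlm -t abc:mygfs2 -j 3 /dev/vg0/lv_gfs2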

       7. mount gfs2 file systems

       mount /path/to/storage /mountpoint

       Run "dlm_tool ls" to verify the nodes that have each fs mounted.
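
       If the filesystem should also be listed in /etc/fstab, an entry of
       the following form can be used (a sketch; the device, mount point
       and options are placeholders, and the cluster services above must
       be running before the mount is attempted):

       /dev/vg0/lv_gfs2  /mnt/gfs2  gfs2  noatime  0 0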

       8. shut down

       umount -a -t gfs2
       systemctl stop clvmd
       systemctl stop dlm
       systemctl stop corosync

       More setup information:
       dlm_controld(8),
       dlm_tool(8),
       dlm.conf(5),
       corosync(8),
       corosync.conf(5)


                                                                       gfs2(5)