LVMCACHE(7)                                                        LVMCACHE(7)

NAME

       lvmcache - LVM caching

DESCRIPTION

       lvm(8) includes two kinds of caching that can be used to improve the
       performance of a Logical Volume (LV). When caching, varying subsets of
       an LV's data are temporarily stored on a smaller, faster device (e.g.
       an SSD) to improve the performance of the LV.

       To do this with lvm, a new special LV is first created from the faster
       device. This LV will hold the cache. Then, the new fast LV is attached
       to the main LV by way of an lvconvert command. lvconvert inserts one of
       the device mapper caching targets into the main LV's I/O path. The
       device mapper target combines the main LV and fast LV into a hybrid
       device that looks like the main LV, but has better performance. While
       the main LV is being used, portions of its data will be temporarily and
       transparently stored on the special fast LV.

       The two kinds of caching are:

       • A read and write hot-spot cache, using the dm-cache kernel module.
         This cache tracks access patterns and adjusts its content
         deliberately so that commonly used parts of the main LV are likely
         to be found on the fast storage. LVM refers to this using the LV
         type cache.

       • A write cache, using the dm-writecache kernel module. This cache can
         be used with SSD or PMEM devices to speed up all writes to the main
         LV. Data read from the main LV is not stored in the cache, only newly
         written data. LVM refers to this using the LV type writecache.

USAGE

   1. Identify main LV that needs caching
       The main LV may already exist, and is located on larger, slower
       devices. A main LV would be created with a command like:

       # lvcreate -n main -L Size vg /dev/slow_hhd

   2. Identify fast LV to use as the cache
       A fast LV is created using one or more fast devices, like an SSD. This
       special LV will be used to hold the cache:

       # lvcreate -n fast -L Size vg /dev/fast_ssd

       # lvs -a
         LV   Attr       Type   Devices
         fast -wi------- linear /dev/fast_ssd
         main -wi------- linear /dev/slow_hhd

   3. Start caching the main LV
       To start caching the main LV, convert the main LV to the desired
       caching type, and specify the fast LV to use as the cache:

       using dm-cache (with cachepool):

       # lvconvert --type cache --cachepool fast vg/main

       using dm-cache (with cachevol):

       # lvconvert --type cache --cachevol fast vg/main

       using dm-writecache (with cachevol):

       # lvconvert --type writecache --cachevol fast vg/main

       For more alternatives see:
       dm-cache command shortcut
       dm-cache with separate data and metadata LVs

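       The three lvconvert forms in step 3 differ only in the --type value
       and in the option that names the fast LV. As a minimal sketch (the
       function name attach_cmd and its mode labels are hypothetical and not
       part of lvm), the following shell helper prints the matching command
       instead of running it, so that it can be reviewed first:

```shell
# Print (do not run) the lvconvert command that attaches a fast LV to a
# main LV. The mode labels are hypothetical names used only in this sketch.
# usage: attach_cmd cachepool|cachevol|writecache FastLV VG/MainLV
attach_cmd() {
    mode=$1
    fast=$2
    main=$3
    case $mode in
        cachepool)  echo "lvconvert --type cache --cachepool $fast $main" ;;
        cachevol)   echo "lvconvert --type cache --cachevol $fast $main" ;;
        writecache) echo "lvconvert --type writecache --cachevol $fast $main" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

       For example, attach_cmd cachevol fast vg/main prints the dm-cache
       cachevol command shown above.
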
   4. Display LVs
       Once the fast LV has been attached to the main LV, lvm reports the main
       LV type as either cache or writecache depending on the type used.
       While attached, the fast LV is hidden, and renamed with a _cvol or
       _cpool suffix. It is displayed by lvs -a. The _corig or _wcorig LV
       represents the original LV without the cache.

       using dm-cache (with cachepool):

       # lvs -ao+devices
         LV                 Pool         Type       Devices
         main               [fast_cpool] cache      main_corig(0)
         [fast_cpool]                    cache-pool fast_cpool_cdata(0)
         [fast_cpool_cdata]              linear     /dev/fast_ssd
         [fast_cpool_cmeta]              linear     /dev/fast_ssd
         [main_corig]                    linear     /dev/slow_hhd

       using dm-cache (with cachevol):

       # lvs -ao+devices
         LV           Pool        Type   Devices
         main         [fast_cvol] cache  main_corig(0)
         [fast_cvol]              linear /dev/fast_ssd
         [main_corig]             linear /dev/slow_hhd

       using dm-writecache (with cachevol):

       # lvs -ao+devices
         LV            Pool        Type       Devices
         main          [fast_cvol] writecache main_wcorig(0)
         [fast_cvol]               linear     /dev/fast_ssd
         [main_wcorig]             linear     /dev/slow_hhd

   5. Use the main LV
       Use the LV until the cache is no longer wanted, or needs to be changed.

   6. Stop caching
       To stop caching the main LV and also remove the unneeded cache pool,
       use the --uncache option:

       # lvconvert --uncache vg/main

       # lvs -a
         LV   VG Attr       Type   Devices
         main vg -wi------- linear /dev/slow_hhd

       To stop caching the main LV but keep the fast LV, separate the fast LV
       from the main LV with the --splitcache option. This changes the type
       of the main LV back to what it was before the cache was attached.

       # lvconvert --splitcache vg/main

       # lvs -a
         LV   VG Attr       Type   Devices
         fast vg -wi------- linear /dev/fast_ssd
         main vg -wi------- linear /dev/slow_hhd

   7. Create a new LV with caching
       A new LV can be created with caching attached at the time of creation
       using the following command:

       # lvcreate --type cache|writecache -n Name -L Size
            --cachedevice /dev/fast_ssd vg /dev/slow_hhd

       The main LV is created with the specified Name and Size from the
       slow_hhd. A hidden fast LV is created on the fast_ssd and is then
       attached to the new main LV. If the fast_ssd is unused, the entire disk
       will be used as the cache unless the --cachesize option is used to
       specify a size for the fast LV. The --cachedevice option can be
       repeated to use multiple disks for the fast LV.

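       Repeating --cachedevice for several fast disks could be sketched as
       follows. The helper name create_cached_cmd and its argument order are
       assumptions made for this illustration; it only prints the command
       and does not run lvcreate.

```shell
# Print (do not run) an lvcreate command that creates a cached LV,
# repeating --cachedevice once per fast device.
# usage: create_cached_cmd cache|writecache Name Size VG SlowPV FastPV...
create_cached_cmd() {
    lvtype=$1
    name=$2
    size=$3
    vg=$4
    slow=$5
    shift 5
    cmd="lvcreate --type $lvtype -n $name -L $size"
    for dev in "$@"; do
        cmd="$cmd --cachedevice $dev"
    done
    echo "$cmd $vg $slow"
}
```

       For example, create_cached_cmd writecache main 10G vg /dev/slow_hhd
       /dev/fast_ssd1 /dev/fast_ssd2 prints a command with two --cachedevice
       options.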

OPTIONS

   option args
       --cachepool CachePoolLV|LV

       Pass this option a cachepool LV or a standard LV. When using a cache
       pool, lvm places cache data and cache metadata on different LVs. The
       two LVs together are called a cache pool. This gives slightly better
       performance for dm-cache and permits specific placement and segment
       type selection for data and metadata volumes. A cache pool is
       represented as a special type of LV that cannot be used directly. If a
       standard LV is passed with this option, lvm will first convert it to a
       cache pool by combining it with another LV to use for metadata. This
       option can be used with dm-cache.

       --cachevol LV

       Pass this option a fast LV that should be used to hold the cache. With
       a cachevol, cache data and metadata are stored in different parts of
       the same fast LV. This option can be used with dm-writecache or
       dm-cache.

       --cachedevice PV

       This option can be used in place of --cachevol, in which case a
       cachevol LV will be created using the specified device. This option
       can be repeated to create a cachevol using multiple devices, or a tag
       name can be specified in which case the cachevol will be created using
       any of the devices with the given tag. If a named cache device is
       unused, the entire device will be used to create the cachevol. To
       create a cachevol of a specific size from the cache devices, include
       the --cachesize option.

   dm-cache block size
       A cache pool will have a logical block size of 4096 bytes if it is
       created on a device with a logical block size of 4096 bytes.

       If a main LV has logical block size 512 (with an existing xfs file
       system using that size), then it cannot use a cache pool with a 4096
       logical block size. If the cache pool is attached, the main LV will
       likely fail to mount.

       To avoid this problem, use a mkfs option to specify a 4096 block size
       for the file system, or attach the cache pool before running mkfs.

   dm-writecache block size
       The dm-writecache block size can be 4096 bytes (the default), or 512
       bytes. The default 4096 has better performance and should be used
       except when 512 is necessary for compatibility. The dm-writecache block
       size is specified with --cachesettings block_size=4096|512 when caching
       is started.

       When a file system like xfs already exists on the main LV prior to
       caching, and the file system is using a block size of 512, then the
       writecache block size should be set to 512. (The file system will
       likely fail to mount if writecache block size of 4096 is used in this
       case.)

       Check the xfs sector size while the fs is mounted:

       # xfs_info /dev/vg/main
       Look for sectsz=512 or sectsz=4096

       The writecache block size should be chosen to match the xfs sectsz
       value.

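       Matching the writecache block size to the xfs sector size might be
       scripted roughly as below. The function names are hypothetical, and
       the sketch assumes that the first sectsz= field in the xfs_info
       output is the one of interest.

```shell
# Extract the first sectsz value from xfs_info output read on stdin.
# Assumption: the first sectsz= field is the one that matters here.
xfs_sectsz() {
    grep -o 'sectsz=[0-9]*' | head -n 1 | cut -d= -f2
}

# Map a sector size to the block_size setting accepted by --cachesettings.
# Only 512 and 4096 are valid dm-writecache block sizes.
writecache_block_size() {
    case $1 in
        512|4096) echo "block_size=$1" ;;
        *) echo "unsupported sector size: $1" >&2; return 1 ;;
    esac
}
```

       A possible use is to run xfs_info /dev/vg/main | xfs_sectsz and pass
       the result to writecache_block_size to obtain the --cachesettings
       value.
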
       It is also possible to specify a sector size of 4096 to mkfs.xfs when
       creating the file system. In this case the writecache block size of
       4096 can be used.

       The writecache block size is displayed by the command:
       lvs -o writecacheblocksize VG/LV

   dm-writecache memory usage
       The amount of main system memory used by dm-writecache can be a factor
       when selecting the writecache cachevol size and the writecache block
       size.

       • writecache block size 4096: each 100 GiB of writecache cachevol uses
         slightly over 2 GiB of system memory.

       • writecache block size 512: each 100 GiB of writecache cachevol uses a
         little over 16 GiB of system memory.

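       The two ratios above can be turned into a rough estimate. The sketch
       below assumes that memory use scales linearly with cachevol size,
       which this page does not state explicitly, and the function name is
       hypothetical. Because the figures are given as "slightly over", the
       result is best read as a lower bound.

```shell
# Rough dm-writecache memory estimate in MiB for a cachevol size in GiB.
# Uses only the ratios given above: about 2 GiB per 100 GiB at block size
# 4096 and about 16 GiB per 100 GiB at block size 512.
# Assumption: linear scaling with cachevol size. Result rounds down.
writecache_mem_mib() {
    size_gib=$1
    block_size=$2
    case $block_size in
        4096) per_100=2 ;;
        512)  per_100=16 ;;
        *) echo "block size must be 4096 or 512" >&2; return 1 ;;
    esac
    echo $(( size_gib * per_100 * 1024 / 100 ))
}
```

       For example, writecache_mem_mib 50 512 prints 8192, i.e. at least
       8 GiB of system memory for a 50 GiB cachevol with 512-byte blocks.
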
   dm-writecache settings
       To specify dm-writecache tunable settings on the command line, use:
       --cachesettings 'option=N' or
       --cachesettings 'option1=N option2=N ...'

       For example, --cachesettings 'high_watermark=90 writeback_jobs=4'.

       To include settings when caching is started, run:

       # lvconvert --type writecache --cachevol fast \
            --cachesettings 'option=N' vg/main

       To change settings for an existing writecache, run:

       # lvchange --cachesettings 'option=N' vg/main

       To clear all settings that have been applied, run:

       # lvchange --cachesettings '' vg/main

       To view the settings that are applied to a writecache LV, run:

       # lvs -o cachesettings vg/main

       Tunable settings are:

       high_watermark = <percent>
              Start writeback when the writecache usage reaches this percent
              (0-100).

       low_watermark = <percent>
              Stop writeback when the writecache usage reaches this percent
              (0-100).

       writeback_jobs = <count>
              Limit the number of blocks that are in flight during writeback.
              Setting this value reduces writeback throughput, but it may
              improve latency of read requests.

       autocommit_blocks = <count>
              When the application writes this number of blocks without
              issuing the FLUSH request, the blocks are automatically
              committed.

       autocommit_time = <milliseconds>
              The data is automatically committed if this time passes and no
              FLUSH request is received.

       fua = 0|1
              Use the FUA flag when writing data from persistent memory back
              to the underlying device. Applicable only to persistent memory.

       nofua = 0|1
              Don't use the FUA flag when writing back data and send the FLUSH
              request afterwards. Some underlying devices perform better with
              fua, some with nofua. Testing is necessary to determine which.
              Applicable only to persistent memory.

       cleaner = 0|1
              Setting cleaner=1 enables the writecache cleaner mode in which
              data is gradually flushed from the cache. If this is done prior
              to detaching the writecache, then the splitcache command will
              have little or no flushing to perform. If not done beforehand,
              the splitcache command enables the cleaner mode and waits for
              flushing to complete before detaching the writecache. Adding
              cleaner=0 to the splitcache command will skip the cleaner mode,
              and any required flushing is performed during device suspend.

       max_age = <milliseconds>
              Specifies the maximum age of a block in milliseconds. If a block
              is stored in the cache for too long, it will be written to the
              underlying device and cleaned up.

       metadata_only = 0|1
              Only metadata is promoted to the cache. This option improves
              performance for heavier REQ_META workloads.

       pause_writeback = <milliseconds>
              Pause writeback if there was some write I/O redirected to the
              origin volume in the last number of milliseconds.

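       The cleaner setting described above suggests a two-step detach:
       enable cleaner mode first, then split the cache. A minimal sketch,
       using a hypothetical helper that only prints the two commands for
       review, might look like this:

```shell
# Print (do not run) a detach sequence that starts flushing the writecache
# with cleaner=1 before the cache is split from the main LV.
# usage: detach_writecache_plan VG/MainLV
detach_writecache_plan() {
    lv=$1
    echo "lvchange --cachesettings 'cleaner=1' $lv"
    echo "lvconvert --splitcache $lv"
}
```

       Running the first printed command ahead of time leaves the later
       splitcache with little or no flushing to perform.
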
   dm-writecache using metadata profiles
       In addition to specifying writecache settings on the command line, they
       can also be set in lvm.conf, or in a profile file, using the
       allocation/cache_settings/writecache config structure shown below.

       It's possible to prepare a number of different profile files in the
       /etc/lvm/profile directory and specify the file name of the profile
       when starting writecache.

       Example
       # cat <<EOF > /etc/lvm/profile/cache_writecache.profile
       allocation {
              cache_settings {
                     writecache {
                            high_watermark=60
                            writeback_jobs=1024
                     }
              }
       }
       EOF

       # lvcreate -an -L10G --name fast vg /dev/fast_ssd
       # lvcreate --type writecache -L10G --name main --cachevol fast \
          --metadataprofile cache_writecache vg /dev/slow_hdd

   dm-cache with separate data and metadata LVs
       The preferred way of using dm-cache is to place the cache metadata and
       cache data on separate LVs. To do this, a "cache pool" is created,
       which is a special LV that references two sub LVs, one for data and one
       for metadata.

       To create a cache pool of a given data size and let lvm2 calculate an
       appropriate metadata size:

       # lvcreate --type cache-pool -L DataSize -n fast vg /dev/fast_ssd1

       To create a cache pool from a separate LV and let lvm2 calculate an
       appropriate cache metadata size:

       # lvcreate -n fast -L DataSize vg /dev/fast_ssd1
       # lvconvert --type cache-pool vg/fast /dev/fast_ssd1

       To create a cache pool from two separate LVs:

       # lvcreate -n fast -L DataSize vg /dev/fast_ssd1
       # lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
       # lvconvert --type cache-pool --poolmetadata fastmeta vg/fast

       Then use the cache pool LV to start caching the main LV:

       # lvconvert --type cache --cachepool fast vg/main

       A variation of the same procedure automatically creates a cache pool
       when caching is started. To do this, use a standard LV as the
       --cachepool (this will hold cache data), and use another standard LV as
       the --poolmetadata (this will hold cache metadata). LVM will create a
       cache pool LV from the two specified LVs, and use the cache pool to
       start caching the main LV.

       # lvcreate -n fast -L DataSize vg /dev/fast_ssd1
       # lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
       # lvconvert --type cache --cachepool fast \
               --poolmetadata fastmeta vg/main

   dm-cache cache modes
       The default dm-cache cache mode is "writethrough". Writethrough
       ensures that any data written will be stored both in the cache and on
       the origin LV. The loss of a device associated with the cache in this
       case would not mean the loss of any data.

       A second cache mode is "writeback". Writeback delays writing data
       blocks from the cache back to the origin LV. This mode will increase
       performance, but the loss of a cache device can result in lost data.

       With the --cachemode option, the cache mode can be set when caching is
       started, or changed on an LV that is already cached. The current cache
       mode can be displayed with the cache_mode reporting option:

       lvs -o+cache_mode VG/LV

       lvm.conf(5) allocation/cache_mode
       defines the default cache mode.

       # lvconvert --type cache --cachemode writethrough \
               --cachepool fast vg/main

       # lvconvert --type cache --cachemode writethrough \
               --cachevol fast vg/main

   dm-cache chunk size
       The size of data blocks managed by dm-cache can be specified with the
       --chunksize option when caching is started. The default unit is KiB.
       The value must be a multiple of 32 KiB between 32 KiB and 1 GiB. Cache
       chunks larger than 512 KiB should be used only when necessary.

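       The chunk size rule can be checked before a value is passed to
       --chunksize. The following is a sketch only; the function name is
       hypothetical and the value is assumed to be given in KiB, the default
       unit.

```shell
# Succeed if a dm-cache chunk size in KiB is a multiple of 32 KiB between
# 32 KiB and 1 GiB (1048576 KiB); fail otherwise.
valid_chunk_kib() {
    kib=$1
    [ "$kib" -ge 32 ] && [ "$kib" -le 1048576 ] && [ $(( kib % 32 )) -eq 0 ]
}
```

       For example, valid_chunk_kib 512 succeeds, while valid_chunk_kib 48
       fails because 48 is not a multiple of 32.
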
       Using a chunk size that is too large can result in wasteful use of the
       cache, in which small reads and writes cause large sections of an LV to
       be stored in the cache. It can also require increasing the migration
       threshold, which defaults to 2048 sectors (1 MiB). lvm2 ensures that
       the migration threshold is at least 8 chunks in size. In some cases
       this may result in a very high bandwidth load from transferring data
       between the cache LV and its cache origin LV. However, choosing a chunk
       size that is too small can result in more overhead trying to manage the
       numerous chunks that become mapped into the cache. Overhead can include
       both excessive CPU time searching for chunks, and excessive memory
       tracking chunks.

       Command to display the chunk size:

       lvs -o+chunksize VG/LV

       lvm.conf(5) allocation/cache_pool_chunk_size
       controls the default chunk size.

       The default value is shown by:

       lvmconfig --type default allocation/cache_pool_chunk_size

       Command to check the migration threshold (in sectors) of a running
       cached LV:

       lvs -o+kernel_cache_settings VG/LV

   dm-cache cache settings
       To set dm-cache cache settings, use:

       --cachesettings 'option1=N option2=N ...'

       To drop a cache setting and restore its default kernel value, use the
       special keyword 'default' as the option parameter:

       --cachesettings 'option1=default option2=default ...'

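       Building the reset string for several options at once could be
       sketched as follows. The helper name is hypothetical; it only
       formats the --cachesettings argument and does not call lvm.

```shell
# Build a --cachesettings argument that resets each named option to its
# default kernel value, for example "migration_threshold=default".
# usage: reset_settings_arg option1 [option2 ...]
reset_settings_arg() {
    out=""
    for opt in "$@"; do
        out="${out:+$out }$opt=default"
    done
    echo "$out"
}
```

       The result can then be quoted and passed on, for example as
       lvchange --cachesettings "$(reset_settings_arg migration_threshold)"
       VG/LV.
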
   dm-cache migration threshold cache setting
       Migrating data between the origin and cache LV uses bandwidth. The
       user can set a throttle to prevent more than a certain amount of
       migration occurring at any one time. Currently dm-cache does not take
       normal I/O traffic going to the devices into account.

       The migration threshold can be set via the cache policy settings as
       "migration_threshold=<#sectors>", which sets the maximum number of
       sectors being migrated. The default is 2048 sectors (1 MiB) or 8 cache
       chunks, whichever of those two values is larger.

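       The default described above can be worked out from the chunk size. The
       sketch below uses hypothetical function names and assumes 512-byte
       sectors, which follows from 2048 sectors being 1 MiB.

```shell
# Default migration threshold in sectors for a chunk size given in KiB:
# the larger of 2048 sectors (1 MiB) and 8 cache chunks.
default_migration_threshold() {
    chunk_kib=$1
    eight_chunks=$(( 8 * chunk_kib * 2 ))   # 1 KiB = 2 sectors of 512 bytes
    if [ "$eight_chunks" -gt 2048 ]; then
        echo "$eight_chunks"
    else
        echo 2048
    fi
}

# Convert a sector count to MiB (2048 sectors per MiB), rounding down.
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}
```

       With a 64 KiB chunk size the 2048-sector floor applies, while a
       512 KiB chunk size gives 8192 sectors, which sectors_to_mib reports
       as 4 MiB.
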
       Command to set migration threshold to 2 MiB (4096 sectors):

       lvchange --cachesettings 'migration_threshold=4096' VG/LV

       Commands to display the migration threshold and the chunk size:

       lvs -o+kernel_cache_settings,cache_settings VG/LV
       lvs -o+chunksize VG/LV

       Command to restore/revert to default value:

       lvchange --cachesettings 'migration_threshold=default' VG/LV

   dm-cache cache policy
       The dm-cache subsystem has additional per-LV parameters: the cache
       policy to use, and possibly tunable parameters for the cache policy.
       Three policies are currently available: "smq" is the default policy,
       "mq" is an older implementation, and "cleaner" is used to force the
       cache to write back (flush) all cached writes to the origin LV.

       The older "mq" policy has a number of tunable parameters. The defaults
       are chosen to be suitable for the majority of systems, but in special
       circumstances, changing the settings can improve performance. Newer
       kernels, however, alias this policy to the "smq" policy. Cache settings
       used to configure the "mq" policy [random_threshold,
       sequential_threshold, discard_promote_adjustment,
       read_promote_adjustment, write_promote_adjustment] are thus silently
       ignored, and performance matches "smq".

       With the --cachepolicy and --cachesettings options, the cache policy
       and settings can be set when caching is started, or changed on an
       existing cached LV (both options can be used together). The current
       cache policy and settings can be displayed with the cache_policy and
       cache_settings reporting options:

       lvs -o+cache_policy,cache_settings VG/LV

       Change the cache policy and settings of an existing LV:
       # lvchange --cachepolicy mq --cachesettings \
            'migration_threshold=2048 random_threshold=4' vg/main

       lvm.conf(5) allocation/cache_policy
       defines the default cache policy.

       lvm.conf(5) allocation/cache_settings
       defines the default cache settings.

   dm-cache using metadata profiles
       Cache pools allow a variety of options to be set. Many of these
       settings can be specified in lvm.conf or in profile settings. You can
       prepare a number of different profiles in the /etc/lvm/profile
       directory and just specify the metadata profile file name when caching
       an LV or creating a cache pool. Check the output of lvmconfig --type
       default --withcomments for a detailed description of all individual
       cache settings.

       Example
       # cat <<EOF > /etc/lvm/profile/cache_big_chunk.profile
       allocation {
              cache_pool_metadata_require_separate_pvs=0
              cache_pool_chunk_size=512
              cache_metadata_format=2
              cache_mode="writethrough"
              cache_policy="smq"
              cache_settings {
                     smq {
                            migration_threshold=8192
                     }
              }
       }
       EOF

       # lvcreate --cache -L10G --metadataprofile cache_big_chunk vg/main \
               /dev/fast_ssd
       # lvcreate --cache -L10G vg/main --config \
               'allocation/cache_pool_chunk_size=512' /dev/fast_ssd

   dm-cache spare metadata LV
       See lvmthin(7) for a description of the "pool metadata spare" LV. The
       same concept is used for cache pools.

   dm-cache metadata formats
       There are two disk formats for dm-cache metadata. The metadata format
       can be specified with --cachemetadataformat when caching is started,
       and cannot be changed. Format 2 has better performance; it is more
       compact, and stores dirty bits in a separate btree, which improves the
       speed of shutting down the cache. With auto, lvm selects the best
       option provided by the current dm-cache kernel module.

   RAID1 cache device
       RAID1 can be used to create the fast LV holding the cache so that it
       can tolerate a device failure. (When using dm-cache with separate data
       and metadata LVs, each of the sub-LVs can use RAID1.)

       # lvcreate -n main -L Size vg /dev/slow
       # lvcreate --type raid1 -m 1 -n fast -L Size vg /dev/ssd1 /dev/ssd2
       # lvconvert --type cache --cachevol fast vg/main

   dm-cache command shortcut
       A single command can be used to cache the main LV with automatic
       creation of a cache pool:

       # lvcreate --cache --size CacheDataSize VG/LV [FastPVs]

       or the longer variant

       # lvcreate --type cache --size CacheDataSize \
               --name NameCachePool VG/LV [FastPVs]

       In this command, the specified LV already exists, and is the main LV to
       be cached. The command creates a new cache pool of the given size on
       the optionally specified fast PV(s) (typically an SSD). The cache pool
       uses the given name, or, if no name is given, a name selected
       automatically from the sequence lvolX_cpool. The command then attaches
       the new cache pool to the existing main LV to begin caching.

       (Note: ensure that the specified main LV is a standard LV. If a cache
       pool LV is mistakenly specified, then the command does something
       different.)

       (Note: the type option is interpreted differently by this command than
       by normal lvcreate commands in which --type specifies the type of the
       newly created LV. In this case, an LV with type cache-pool is being
       created, and the existing main LV is being converted to type cache.)

SEE ALSO

       lvm.conf(5), lvchange(8), lvcreate(8), lvdisplay(8), lvextend(8),
       lvremove(8), lvrename(8), lvresize(8), lvs(8),
       vgchange(8), vgmerge(8), vgreduce(8), vgsplit(8),

       cache_check(8), cache_dump(8), cache_repair(8)

Red Hat, Inc           LVM TOOLS 2.03.22(2) (2023-08-02)           LVMCACHE(7)