LVMCACHE(7)                                                        LVMCACHE(7)

NAME

lvmcache — LVM caching

DESCRIPTION

lvm(8) includes two kinds of caching that can be used to improve the
performance of a Logical Volume (LV). When caching, varying subsets of an
LV's data are temporarily stored on a smaller, faster device (e.g. an SSD)
to improve the performance of the LV.

To do this with lvm, a new special LV is first created from the faster
device. This LV will hold the cache. Then, the new fast LV is attached to
the main LV by way of an lvconvert command. lvconvert inserts one of the
device mapper caching targets into the main LV's i/o path. The device
mapper target combines the main LV and fast LV into a hybrid device that
looks like the main LV, but has better performance. While the main LV is
being used, portions of its data will be temporarily and transparently
stored on the special fast LV.

The two kinds of caching are:

• A read and write hot-spot cache, using the dm-cache kernel module.
  This cache tracks access patterns and adjusts its content deliberately
  so that commonly used parts of the main LV are likely to be found on
  the fast storage. LVM refers to this using the LV type cache.

• A write cache, using the dm-writecache kernel module. This cache can
  be used with SSD or PMEM devices to speed up all writes to the main
  LV. Data read from the main LV is not stored in the cache, only newly
  written data. LVM refers to this using the LV type writecache.

USAGE

1. Identify main LV that needs caching

The main LV may already exist, and is located on larger, slower devices.
A main LV would be created with a command like:

   $ lvcreate -n main -L Size vg /dev/slow_hdd

2. Identify fast LV to use as the cache

A fast LV is created using one or more fast devices, like an SSD. This
special LV will be used to hold the cache:

   $ lvcreate -n fast -L Size vg /dev/fast_ssd

   $ lvs -a
     LV   Attr       Type   Devices
     fast -wi------- linear /dev/fast_ssd
     main -wi------- linear /dev/slow_hdd

3. Start caching the main LV

To start caching the main LV, convert the main LV to the desired caching
type, and specify the fast LV to use as the cache:

using dm-cache:

   $ lvconvert --type cache --cachevol fast vg/main

using dm-writecache:

   $ lvconvert --type writecache --cachevol fast vg/main

using dm-cache (with cachepool):

   $ lvconvert --type cache --cachepool fast vg/main

4. Display LVs

Once the fast LV has been attached to the main LV, lvm reports the main
LV type as either cache or writecache depending on the type used. While
attached, the fast LV is hidden, and renamed with a _cvol or _cpool
suffix. It is displayed by lvs -a. The _corig or _wcorig LV represents
the original LV without the cache.

using dm-cache:

   $ lvs -a
     LV           Pool        Type   Devices
     main         [fast_cvol] cache  main_corig(0)
     [fast_cvol]              linear /dev/fast_ssd
     [main_corig]             linear /dev/slow_hdd

using dm-writecache:

   $ lvs -a
     LV            Pool        Type       Devices
     main          [fast_cvol] writecache main_wcorig(0)
     [fast_cvol]               linear     /dev/fast_ssd
     [main_wcorig]             linear     /dev/slow_hdd

using dm-cache (with cachepool):

   $ lvs -a
     LV                 Pool         Type       Devices
     main               [fast_cpool] cache      main_corig(0)
     [fast_cpool]                    cache-pool fast_cpool_cdata(0)
     [fast_cpool_cdata]              linear     /dev/fast_ssd
     [fast_cpool_cmeta]              linear     /dev/fast_ssd
     [main_corig]                    linear     /dev/slow_hdd

5. Use the main LV

Use the LV until the cache is no longer wanted, or needs to be changed.

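For example, the cached LV can be used like any standard LV (an
illustrative xfs sequence; any file system or application can use the
device):

   $ mkfs.xfs /dev/vg/main
   $ mount /dev/vg/main /mnt
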
6. Stop caching

To stop caching the main LV, separate the fast LV from the main LV. This
changes the type of the main LV back to what it was before the cache was
attached.

   $ lvconvert --splitcache vg/main

   $ lvs -a
     LV   VG Attr       Type   Devices
     fast vg -wi------- linear /dev/fast_ssd
     main vg -wi------- linear /dev/slow_hdd

To stop caching the main LV and also remove the unneeded cache pool, use
the --uncache option:

   $ lvconvert --uncache vg/main

   $ lvs -a
     LV   VG Attr       Type   Devices
     main vg -wi------- linear /dev/slow_hdd

Create a new LV with caching.

A new LV can be created with caching attached at the time of creation
using the following command:

   $ lvcreate --type cache|writecache -n Name -L Size \
         --cachedevice /dev/fast_ssd vg /dev/slow_hdd

The main LV is created with the specified Name and Size from the
slow_hdd. A hidden fast LV is created on the fast_ssd and is then
attached to the new main LV. If the fast_ssd is unused, the entire disk
will be used as the cache unless the --cachesize option is used to
specify a size for the fast LV. The --cachedevice option can be repeated
to use multiple disks for the fast LV.

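For example, an illustrative combination of these options (the device
names and the CacheSize value are placeholders):

   $ lvcreate --type cache -n main -L Size \
         --cachedevice /dev/fast_ssd1 --cachedevice /dev/fast_ssd2 \
         --cachesize CacheSize vg /dev/slow_hdd
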

OPTIONS

option args

--cachevol LV

Pass this option a fast LV that should be used to hold the cache. With a
cachevol, cache data and metadata are stored in different parts of the
same fast LV. This option can be used with dm-writecache or dm-cache.

--cachepool CachePoolLV|LV

Pass this option a cachepool LV or a standard LV. When using a cache
pool, lvm places cache data and cache metadata on different LVs. The two
LVs together are called a cache pool. This gives slightly better
performance for dm-cache and permits specific placement and segment type
selection for data and metadata volumes. A cache pool is represented as
a special type of LV that cannot be used directly. If a standard LV is
passed with this option, lvm will first convert it to a cache pool by
combining it with another LV to use for metadata. This option can be
used with dm-cache.

--cachedevice PV

This option can be used in place of --cachevol, in which case a cachevol
LV will be created using the specified device. This option can be
repeated to create a cachevol using multiple devices, or a tag name can
be specified in which case the cachevol will be created using any of the
devices with the given tag. If a named cache device is unused, the
entire device will be used to create the cachevol. To create a cachevol
of a specific size from the cache devices, include the --cachesize
option.

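For example, an illustrative use of a tag (this assumes the fast PVs
were previously tagged, e.g. with pvchange --addtag fast /dev/fast_ssd):

   $ lvconvert --type cache --cachedevice @fast --cachesize 16G vg/main
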

dm-cache block size

A cache pool will have a logical block size of 4096 bytes if it is
created on a device with a logical block size of 4096 bytes.

If a main LV has logical block size 512 (with an existing xfs file
system using that size), then it cannot use a cache pool with a 4096
logical block size. If the cache pool is attached, the main LV will
likely fail to mount.

To avoid this problem, use a mkfs option to specify a 4096 block size
for the file system, or attach the cache pool before running mkfs.

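For example, with xfs the sector size can be given at mkfs time (an
illustrative command; other file systems have equivalent mkfs options):

   $ mkfs.xfs -s size=4096 /dev/vg/main
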

dm-writecache block size

The dm-writecache block size can be 4096 bytes (the default), or 512
bytes. The default 4096 has better performance and should be used except
when 512 is necessary for compatibility. The dm-writecache block size is
specified with --cachesettings block_size=4096|512 when caching is
started.

When a file system like xfs already exists on the main LV prior to
caching, and the file system is using a block size of 512, then the
writecache block size should be set to 512. (The file system will likely
fail to mount if a writecache block size of 4096 is used in this case.)
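
For example, to start caching with a 512-byte writecache block size:

   $ lvconvert --type writecache --cachevol fast \
         --cachesettings 'block_size=512' vg/main
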

Check the xfs sector size while the fs is mounted:

   $ xfs_info /dev/vg/main
   Look for sectsz=512 or sectsz=4096

The writecache block size should be chosen to match the xfs sectsz
value.

It is also possible to specify a sector size of 4096 to mkfs.xfs when
creating the file system. In this case the writecache block size of 4096
can be used.


dm-writecache settings

Tunable parameters can be passed to the dm-writecache kernel module
using the --cachesettings option when caching is started, e.g.

   $ lvconvert --type writecache --cachevol fast \
         --cachesettings 'high_watermark=N writeback_jobs=N' vg/main

Tunable options are:

• high_watermark = <percent>

  Start writeback when the writecache usage reaches this percent
  (0-100).

• low_watermark = <percent>

  Stop writeback when the writecache usage reaches this percent (0-100).

• writeback_jobs = <count>

  Limit the number of blocks that are in flight during writeback.
  Setting this value reduces writeback throughput, but it may improve
  latency of read requests.

• autocommit_blocks = <count>

  When the application writes this number of blocks without issuing a
  FLUSH request, the blocks are automatically committed.

• autocommit_time = <milliseconds>

  The data is automatically committed if this time passes and no FLUSH
  request is received.

• fua = 0|1

  Use the FUA flag when writing data from persistent memory back to the
  underlying device. Applicable only to persistent memory.

• nofua = 0|1

  Don't use the FUA flag when writing back data and send the FLUSH
  request afterwards. Some underlying devices perform better with fua,
  some with nofua. Testing is necessary to determine which. Applicable
  only to persistent memory.

• cleaner = 0|1

  Setting cleaner=1 enables the writecache cleaner mode, in which data
  is gradually flushed from the cache. If this is done prior to
  detaching the writecache, then the splitcache command will have little
  or no flushing to perform. If not done beforehand, the splitcache
  command enables the cleaner mode and waits for flushing to complete
  before detaching the writecache. Adding cleaner=0 to the splitcache
  command will skip the cleaner mode, and any required flushing is
  performed in device suspend.

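For example, an illustrative way to enable cleaner mode ahead of
detaching (this assumes the setting can be changed on an existing
writecache LV with lvchange):

   $ lvchange --cachesettings 'cleaner=1' vg/main
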

dm-cache with separate data and metadata LVs

When using dm-cache, the cache metadata and cache data can be stored on
separate LVs. To do this, a "cache pool" is created, which is a special
LV that references two sub LVs, one for data and one for metadata.

To create a cache pool from two separate LVs:

   $ lvcreate -n fast -L DataSize vg /dev/fast_ssd1
   $ lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
   $ lvconvert --type cache-pool --poolmetadata fastmeta vg/fast

Then use the cache pool LV to start caching the main LV:

   $ lvconvert --type cache --cachepool fast vg/main

A variation of the same procedure automatically creates a cache pool
when caching is started. To do this, use a standard LV as the
--cachepool (this will hold cache data), and use another standard LV as
the --poolmetadata (this will hold cache metadata). LVM will create a
cache pool LV from the two specified LVs, and use the cache pool to
start caching the main LV.

   $ lvcreate -n fast -L DataSize vg /dev/fast_ssd1
   $ lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
   $ lvconvert --type cache --cachepool fast --poolmetadata fastmeta vg/main


dm-cache cache modes

The default dm-cache cache mode is "writethrough". Writethrough ensures
that any data written will be stored both in the cache and on the origin
LV. The loss of a device associated with the cache in this case would
not mean the loss of any data.

A second cache mode is "writeback". Writeback delays writing data blocks
from the cache back to the origin LV. This mode will increase
performance, but the loss of a cache device can result in lost data.

With the --cachemode option, the cache mode can be set when caching is
started, or changed on an LV that is already cached. The current cache
mode can be displayed with the cache_mode reporting option:

   lvs -o+cache_mode VG/LV

lvm.conf(5) allocation/cache_mode defines the default cache mode.

Setting the cache mode when caching is started:

   $ lvconvert --type cache --cachevol fast \
         --cachemode writethrough vg/main

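Changing the cache mode of an LV that is already cached, for example:

   $ lvchange --cachemode writeback vg/main
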

dm-cache chunk size

The size of data blocks managed by dm-cache can be specified with the
--chunksize option when caching is started. The default unit is KiB.
The value must be a multiple of 32KiB between 32KiB and 1GiB. Cache
chunks bigger than 512KiB should be used only when necessary.

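For example, an illustrative command starting caching with a 256KiB
chunk size:

   $ lvconvert --type cache --cachepool fast --chunksize 256k vg/main
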
Using a chunk size that is too large can result in wasteful use of the
cache, in which small reads and writes cause large sections of an LV to
be stored in the cache. It can also require increasing the migration
threshold, which defaults to 2048 sectors (1MiB). lvm2 ensures the
migration threshold is at least 8 chunks in size. This may in some cases
result in a very high bandwidth load when transferring data between the
cache LV and its origin LV. However, choosing a chunk size that is too
small can result in more overhead trying to manage the numerous chunks
that become mapped into the cache. Overhead can include both excessive
CPU time searching for chunks, and excessive memory tracking chunks.

Command to display the chunk size:

   lvs -o+chunksize VG/LV

lvm.conf(5) allocation/cache_pool_chunk_size controls the default chunk
size.

The default value is shown by:

   lvmconfig --type default allocation/cache_pool_chunk_size

Checking the migration threshold (in sectors) of a running cached LV:

   lvs -o+kernel_cache_settings VG/LV

dm-cache migration threshold

Migrating data between the origin and cache LV uses bandwidth. The user
can set a throttle to prevent more than a certain amount of migration
occurring at any one time. Currently, dm-cache does not take into
account normal i/o traffic going to the devices.

The user can set the migration threshold via the cache policy settings,
as "migration_threshold=<#sectors>", to set the maximum number of
sectors migrated at one time; the default is 2048 sectors (1MiB).

Command to set the migration threshold to 2MiB (4096 sectors):

   lvchange --cachesettings 'migration_threshold=4096' VG/LV

Command to display the migration threshold:

   lvs -o+kernel_cache_settings,cache_settings VG/LV

dm-cache cache policy

The dm-cache subsystem has additional per-LV parameters: the cache
policy to use, and possibly tunable parameters for the cache policy.
Three policies are currently available: "smq" is the default policy,
"mq" is an older implementation, and "cleaner" is used to force the
cache to write back (flush) all cached writes to the origin LV.

The older "mq" policy has a number of tunable parameters. The defaults
are chosen to be suitable for the majority of systems, but in special
circumstances, changing the settings can improve performance.

With the --cachepolicy and --cachesettings options, the cache policy and
settings can be set when caching is started, or changed on an existing
cached LV (both options can be used together). The current cache policy
and settings can be displayed with the cache_policy and cache_settings
reporting options:

   lvs -o+cache_policy,cache_settings VG/LV

Change the cache policy and settings of an existing LV:

   $ lvchange --cachepolicy mq --cachesettings \
         'migration_threshold=2048 random_threshold=4' vg/main

lvm.conf(5) allocation/cache_policy defines the default cache policy.

lvm.conf(5) allocation/cache_settings defines the default cache
settings.

dm-cache spare metadata LV

See lvmthin(7) for a description of the "pool metadata spare" LV. The
same concept is used for cache pools.

dm-cache metadata formats

There are two disk formats for dm-cache metadata. The metadata format
can be specified with --cachemetadataformat when caching is started, and
cannot be changed. Format 2 has better performance; it is more compact,
and stores dirty bits in a separate btree, which improves the speed of
shutting down the cache. With auto, lvm selects the best option provided
by the current dm-cache kernel module.

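For example, an illustrative command requesting format 2 explicitly when
caching is started:

   $ lvconvert --type cache --cachepool fast \
         --cachemetadataformat 2 vg/main
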

RAID1 cache device

RAID1 can be used to create the fast LV holding the cache so that it can
tolerate a device failure. (When using dm-cache with separate data and
metadata LVs, each of the sub-LVs can use RAID1.)

   $ lvcreate -n main -L Size vg /dev/slow
   $ lvcreate --type raid1 -m 1 -n fast -L Size vg /dev/ssd1 /dev/ssd2
   $ lvconvert --type cache --cachevol fast vg/main

dm-cache command shortcut

A single command can be used to create a cache pool and attach that new
cache pool to a main LV:

   $ lvcreate --type cache --name Name --size Size VG/LV [PV]

In this command, the specified LV already exists, and is the main LV to
be cached. The command creates a new cache pool with the given name and
size, using the optionally specified PV (typically an ssd). Then it
attaches the new cache pool to the existing main LV to begin caching.

(Note: ensure that the specified main LV is a standard LV. If a cache
pool LV is mistakenly specified, then the command does something
different: it creates a new LV and attaches the specified cache pool to
it, rather than caching the existing LV.)

(Note: the type option is interpreted differently by this command than
by normal lvcreate commands in which --type specifies the type of the
newly created LV. In this case, an LV with type cache-pool is being
created, and the existing main LV is being converted to type cache.)

SEE ALSO

lvm.conf(5), lvchange(8), lvcreate(8), lvdisplay(8), lvextend(8),
lvremove(8), lvrename(8), lvresize(8), lvs(8), vgchange(8), vgmerge(8),
vgreduce(8), vgsplit(8)

Red Hat, Inc           LVM TOOLS 2.03.11(2) (2021-01-08)         LVMCACHE(7)