LVMCACHE(7)                                                        LVMCACHE(7)

lvmcache - LVM caching

lvm(8) includes two kinds of caching that can be used to improve the
performance of a Logical Volume (LV). When caching, varying subsets of
an LV's data are temporarily stored on a smaller, faster device (e.g.
an SSD).

To do this with lvm, a new special LV is first created from the faster
device. This LV will hold the cache. Then, the new fast LV is attached
to the main LV by way of an lvconvert command. lvconvert inserts one of
the device mapper caching targets into the main LV's I/O path. The
device mapper target combines the main LV and fast LV into a hybrid
device that looks like the main LV, but has better performance. While
the main LV is being used, portions of its data will be temporarily and
transparently stored on the special fast LV.

The two kinds of caching are:

· A read and write hot-spot cache, using the dm-cache kernel module.
  This cache tracks access patterns and adjusts its content
  deliberately so that commonly used parts of the main LV are likely to
  be found on the fast storage. LVM refers to this using the LV type
  cache.

· A write cache, using the dm-writecache kernel module. This cache can
  be used with SSD or PMEM devices to speed up all writes to the main
  LV. Data read from the main LV is not stored in the cache, only newly
  written data. LVM refers to this using the LV type writecache.

Both kinds of caching use similar lvm commands:

1. Identify main LV that needs caching

The main LV may already exist, and is located on larger, slower
devices. A main LV would be created with a command like:

$ lvcreate -n main -L Size vg /dev/slow_hhd

2. Identify fast LV to use as the cache

A fast LV is created using one or more fast devices, like an SSD. This
special LV will be used to hold the cache:

$ lvcreate -n fast -L Size vg /dev/fast_ssd

$ lvs -a
LV   Attr       Type   Devices
fast -wi------- linear /dev/fast_ssd
main -wi------- linear /dev/slow_hhd

3. Start caching the main LV

To start caching the main LV, convert the main LV to the desired
caching type, and specify the fast LV to use as the cache:

using dm-cache:

$ lvconvert --type cache --cachevol fast vg/main

using dm-writecache:

$ lvconvert --type writecache --cachevol fast vg/main

using dm-cache (with cachepool):

$ lvconvert --type cache --cachepool fast vg/main

4. Display LVs

Once the fast LV has been attached to the main LV, lvm reports the main
LV type as either cache or writecache depending on the type used.
While attached, the fast LV is hidden, and renamed with a _cvol or
_cpool suffix. It is displayed by lvs -a. The _corig or _wcorig LV
represents the original LV without the cache.

using dm-cache:

$ lvs -a
LV           Pool        Type   Devices
main         [fast_cvol] cache  main_corig(0)
[fast_cvol]              linear /dev/fast_ssd
[main_corig]             linear /dev/slow_hhd

using dm-writecache:

$ lvs -a
LV            Pool        Type       Devices
main          [fast_cvol] writecache main_wcorig(0)
[fast_cvol]               linear     /dev/fast_ssd
[main_wcorig]             linear     /dev/slow_hhd

using dm-cache (with cachepool):

$ lvs -a
LV                 Pool         Type       Devices
main               [fast_cpool] cache      main_corig(0)
[fast_cpool]                    cache-pool fast_cpool_cdata(0)
[fast_cpool_cdata]              linear     /dev/fast_ssd
[fast_cpool_cmeta]              linear     /dev/fast_ssd
[main_corig]                    linear     /dev/slow_hhd

5. Use the main LV

Use the LV until the cache is no longer wanted, or needs to be changed.

6. Stop caching

To stop caching the main LV, separate the fast LV from the main LV.
This changes the type of the main LV back to what it was before the
cache was attached.

$ lvconvert --splitcache vg/main

$ lvs -a
LV   VG Attr       Type   Devices
fast vg -wi------- linear /dev/fast_ssd
main vg -wi------- linear /dev/slow_hhd
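
The six steps above can be collected into a small dry-run helper. This is
only a sketch: cache_cycle is a hypothetical function name, not part of
lvm, and it prints the commands from steps 3, 4 and 6 instead of running
them, so that they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of lvm): print the commands for one
# attach/inspect/detach cycle so they can be reviewed before running.
# Arguments: VG name, main LV name, fast LV name, cache type.
cache_cycle() {
    vg=$1 main=$2 fast=$3 type=$4
    case "$type" in
        cache|writecache) ;;
        *) echo "unsupported cache type: $type" >&2; return 1 ;;
    esac
    echo "lvconvert --type $type --cachevol $fast $vg/$main"   # step 3
    echo "lvs -a $vg"                                          # step 4
    echo "lvconvert --splitcache $vg/$main"                    # step 6
}

cache_cycle vg main fast writecache
```

The names vg, main and fast match the examples in this page; any other
cache type than cache or writecache is rejected.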


option args

--cachevol LV

Pass this option a fast LV that should be used to hold the cache. With
a cachevol, cache data and metadata are stored in different parts of
the same fast LV. This option can be used with dm-writecache or
dm-cache.

--cachepool CachePoolLV|LV

Pass this option a cachepool LV or a standard LV. When using a cache
pool, lvm places cache data and cache metadata on different LVs. The
two LVs together are called a cache pool. This permits specific
placement of data and metadata. A cache pool is represented as a
special type of LV that cannot be used directly. If a standard LV is
passed with this option, lvm will first convert it to a cache pool by
combining it with another LV to use for metadata. This option can be
used with dm-cache.


dm-cache block size

A cache pool will have a logical block size of 4096 bytes if it is
created on a device with a logical block size of 4096 bytes.

If a main LV has logical block size 512 (with an existing xfs file
system using that size), then it cannot use a cache pool with a 4096
logical block size. If the cache pool is attached, the main LV will
likely fail to mount.

To avoid this problem, use a mkfs option to specify a 4096 block size
for the file system, or attach the cache pool before running mkfs.
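
A pre-check for this mismatch might look like the sketch below. It
assumes the logical block sizes are obtained with blockdev --getss from
util-linux; check_block_sizes is a hypothetical helper, and it flags only
the one combination described above.

```shell
#!/bin/sh
# Hypothetical helper: compare the logical block size of the main LV with
# that of the device that will hold the cache pool.
# Returns non-zero for the combination described above: 512 on the main
# LV and 4096 on the cache pool.
check_block_sizes() {
    main_bs=$1 pool_bs=$2
    if [ "$main_bs" -eq 512 ] && [ "$pool_bs" -eq 4096 ]; then
        echo "mismatch: main LV uses 512, cache pool would use 4096"
        return 1
    fi
    echo "ok: main=$main_bs pool=$pool_bs"
}

# Example with literal values; on a real system the sizes could come from
#   blockdev --getss /dev/vg/main
#   blockdev --getss /dev/fast_ssd
check_block_sizes 4096 4096
```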


dm-writecache block size

The dm-writecache block size can be 4096 bytes (the default), or 512
bytes. The default 4096 has better performance and should be used
except when 512 is necessary for compatibility. The dm-writecache
block size is specified with --cachesettings block_size=4096|512 when
caching is started.

When a file system like xfs already exists on the main LV prior to
caching, and the file system is using a block size of 512, then the
writecache block size should be set to 512. (The file system will
likely fail to mount if a writecache block size of 4096 is used in
this case.)

Check the xfs sector size while the fs is mounted:

$ xfs_info /dev/vg/main
Look for sectsz=512 or sectsz=4096

The writecache block size should be chosen to match the xfs sectsz
value.

It is also possible to specify a sector size of 4096 to mkfs.xfs when
creating the file system. In this case a writecache block size of
4096 can be used.
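
Matching the block size to sectsz can be scripted. The sketch below reads
xfs_info output on standard input and prints the corresponding
block_size setting; writecache_block_size is a hypothetical helper, and
the sample input is an abbreviated stand-in, not real xfs_info output.

```shell
#!/bin/sh
# Hypothetical helper: read xfs_info output on stdin and print the
# --cachesettings value that matches the first sectsz field found.
writecache_block_size() {
    sectsz=$(sed -n 's/.*sectsz=\([0-9][0-9]*\).*/\1/p' | head -n 1)
    case "$sectsz" in
        512|4096) echo "block_size=$sectsz" ;;
        *) echo "unexpected or missing sectsz: '$sectsz'" >&2; return 1 ;;
    esac
}

# Illustrative input only; on a real system use:
#   xfs_info /dev/vg/main | writecache_block_size
printf '%s\n' 'meta-data=/dev/vg/main isize=512' \
              '         = sectsz=512 attr=2' | writecache_block_size
```

The printed value is in the form expected by --cachesettings when
caching is started.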


dm-writecache settings

Tunable parameters can be passed to the dm-writecache kernel module
using the --cachesettings option when caching is started, e.g.

$ lvconvert --type writecache --cachevol fast \
    --cachesettings 'high_watermark=N writeback_jobs=N' vg/main

Tunable options are:

· high_watermark = <percent>

  Start writeback when the writecache usage reaches this percent
  (0-100).

· low_watermark = <percent>

  Stop writeback when the writecache usage drops to this percent
  (0-100).

· writeback_jobs = <count>

  Limit the number of blocks that are in flight during writeback.
  Setting this value reduces writeback throughput, but it may improve
  latency of read requests.

· autocommit_blocks = <count>

  When the application writes this number of blocks without issuing a
  FLUSH request, the blocks are automatically committed.

· autocommit_time = <milliseconds>

  The data is automatically committed if this time passes and no FLUSH
  request is received.

· fua = 0|1

  Use the FUA flag when writing data from persistent memory back to the
  underlying device. Applicable only to persistent memory.

· nofua = 0|1

  Don't use the FUA flag when writing back data and send the FLUSH
  request afterwards. Some underlying devices perform better with fua,
  some with nofua. Testing is necessary to determine which. Applicable
  only to persistent memory.
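
Because the watermarks are percentages, a settings string can be checked
before it is handed to lvconvert. The following is a sketch under that
assumption; writecache_settings is a hypothetical helper and the values
50 and 45 are only examples.

```shell
#!/bin/sh
# Hypothetical helper: build a --cachesettings string from two watermark
# percentages, rejecting anything outside the documented 0-100 range.
writecache_settings() {
    high=$1 low=$2
    for pct in "$high" "$low"; do
        case "$pct" in
            ''|*[!0-9]*) echo "not a whole number: '$pct'" >&2; return 1 ;;
        esac
        if [ "$pct" -gt 100 ]; then
            echo "out of range (0-100): $pct" >&2; return 1
        fi
    done
    echo "high_watermark=$high low_watermark=$low"
}

writecache_settings 50 45
```

The output can be quoted and passed as the argument of --cachesettings
in the lvconvert command shown earlier in this section.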


dm-cache with separate data and metadata LVs

When using dm-cache, the cache metadata and cache data can be stored on
separate LVs. To do this, a "cache pool" is created, which is a special
LV that references two sub LVs, one for data and one for metadata.

To create a cache pool from two separate LVs:

$ lvcreate -n fast -L DataSize vg /dev/fast_ssd1
$ lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
$ lvconvert --type cache-pool --poolmetadata fastmeta vg/fast

Then use the cache pool LV to start caching the main LV:

$ lvconvert --type cache --cachepool fast vg/main

A variation of the same procedure automatically creates a cache pool
when caching is started. To do this, use a standard LV as the
--cachepool (this will hold cache data), and use another standard LV as
the --poolmetadata (this will hold cache metadata). LVM will create a
cache pool LV from the two specified LVs, and use the cache pool to
start caching the main LV.

$ lvcreate -n fast -L DataSize vg /dev/fast_ssd1
$ lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
$ lvconvert --type cache --cachepool fast --poolmetadata fastmeta vg/main


dm-cache cache modes

The default dm-cache cache mode is "writethrough". Writethrough
ensures that any data written will be stored both in the cache and on
the origin LV. The loss of a device associated with the cache in this
case would not mean the loss of any data.

A second cache mode is "writeback". Writeback delays writing data
blocks from the cache back to the origin LV. This mode will increase
performance, but the loss of a cache device can result in lost data.

With the --cachemode option, the cache mode can be set when caching is
started, or changed on an LV that is already cached. The current cache
mode can be displayed with the cache_mode reporting option:

$ lvs -o+cache_mode VG/LV

lvm.conf(5) allocation/cache_mode
defines the default cache mode.

$ lvconvert --type cache --cachevol fast \
    --cachemode writethrough vg/main
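
Changing the mode of an LV that is already cached can be sketched as a
dry run. The helper name cache_mode_cmd is hypothetical, and the use of
lvchange with --cachemode is an assumption here, since this page only
shows the option with lvconvert; the command is printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: print the command that would change the cache mode
# of an already cached LV. Only the two modes described above are
# accepted by this sketch.
cache_mode_cmd() {
    mode=$1 lv=$2
    case "$mode" in
        writethrough|writeback)
            echo "lvchange --cachemode $mode $lv" ;;
        *)
            echo "unknown cache mode: $mode" >&2; return 1 ;;
    esac
}

cache_mode_cmd writeback vg/main
```

After running the printed command, the result can be confirmed with the
cache_mode reporting option shown above.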


dm-cache chunk size

The size of data blocks managed by dm-cache can be specified with the
--chunksize option when caching is started. The default unit is KiB.
The value must be a multiple of 32KiB between 32KiB and 1GiB.

Using a chunk size that is too large can result in wasteful use of the
cache, in which small reads and writes cause large sections of an LV to
be stored in the cache. However, choosing a chunk size that is too
small can result in more overhead trying to manage the numerous chunks
that become mapped into the cache. Overhead can include both excessive
CPU time searching for chunks, and excessive memory tracking chunks.

Command to display the chunk size:
$ lvs -o+chunksize VG/LV

lvm.conf(5) allocation/cache_pool_chunk_size
controls the default chunk size.

The default value is shown by:
$ lvmconfig --type default allocation/cache_pool_chunk_size
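
The chunk size rule and the size trade-off can be checked with shell
arithmetic. The two helpers below are hypothetical, and the 10GiB cache
size is only an example; sizes are given in KiB.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a chunk size given in KiB is a multiple
# of 32KiB between 32KiB and 1GiB (1GiB = 1048576KiB).
valid_chunk_kib() {
    kib=$1
    case "$kib" in
        ''|0*|*[!0-9]*) return 1 ;;   # empty, leading zero, or non-digit
    esac
    [ "$kib" -ge 32 ] && [ "$kib" -le 1048576 ] && [ $((kib % 32)) -eq 0 ]
}

# Number of chunks needed to map a cache of a given size, both in KiB.
chunk_count() {
    echo $(( $1 / $2 ))
}

valid_chunk_kib 64 && echo "64KiB is a valid chunk size"
chunk_count 10485760 64      # 10GiB cache with 64KiB chunks
chunk_count 10485760 1024    # same cache with 1MiB chunks
```

For the example cache, 64KiB chunks give 163840 chunks to track, while
1MiB chunks give 10240, which illustrates the overhead side of the
trade-off described above.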


dm-cache cache policy

The dm-cache subsystem has additional per-LV parameters: the cache
policy to use, and possibly tunable parameters for the cache policy.
Three policies are currently available: "smq" is the default policy,
"mq" is an older implementation, and "cleaner" is used to force the
cache to write back (flush) all cached writes to the origin LV.

The older "mq" policy has a number of tunable parameters. The defaults
are chosen to be suitable for the majority of systems, but in special
circumstances, changing the settings can improve performance.

With the --cachepolicy and --cachesettings options, the cache policy
and settings can be set when caching is started, or changed on an
existing cached LV (both options can be used together). The current
cache policy and settings can be displayed with the cache_policy and
cache_settings reporting options:

$ lvs -o+cache_policy,cache_settings VG/LV

Change the cache policy and settings of an existing LV.

$ lvchange --cachepolicy mq --cachesettings \
    'migration_threshold=2048 random_threshold=4' vg/main

lvm.conf(5) allocation/cache_policy
defines the default cache policy.

lvm.conf(5) allocation/cache_settings
defines the default cache settings.


dm-cache spare metadata LV

See lvmthin(7) for a description of the "pool metadata spare" LV. The
same concept is used for cache pools.


dm-cache metadata formats

There are two disk formats for dm-cache metadata. The metadata format
can be specified with --cachemetadataformat when caching is started,
and cannot be changed. Format 2 has better performance; it is more
compact, and stores dirty bits in a separate btree, which improves the
speed of shutting down the cache. With auto, lvm selects the best
option provided by the current dm-cache kernel module.


RAID1 cache device

RAID1 can be used to create the fast LV holding the cache so that it
can tolerate a device failure. (When using dm-cache with separate data
and metadata LVs, each of the sub-LVs can use RAID1.)

$ lvcreate -n main -L Size vg /dev/slow
$ lvcreate --type raid1 -m 1 -n fast -L Size vg /dev/ssd1 /dev/ssd2
$ lvconvert --type cache --cachevol fast vg/main


dm-cache command shortcut

A single command can be used to create a cache pool and attach that new
cache pool to a main LV:

$ lvcreate --type cache --name Name --size Size VG/LV [PV]

In this command, the specified LV already exists, and is the main LV to
be cached. The command creates a new cache pool with the given name
and size, using the optionally specified PV (typically an SSD). Then
it attaches the new cache pool to the existing main LV to begin
caching.

(Note: ensure that the specified main LV is a standard LV. If a cache
pool LV is mistakenly specified, then the command does something
different.)

(Note: the --type option is interpreted differently by this command
than by normal lvcreate commands, in which --type specifies the type of
the newly created LV. In this case, an LV with type cache-pool is being
created, and the existing main LV is being converted to type cache.)


lvm.conf(5), lvchange(8), lvcreate(8), lvdisplay(8), lvextend(8),
lvremove(8), lvrename(8), lvresize(8), lvs(8), vgchange(8), vgmerge(8),
vgreduce(8), vgsplit(8)


Red Hat, Inc          LVM TOOLS 2.03.09(2) (2020-03-26)          LVMCACHE(7)