VIRTIO-FORWARDER(8)            Virtio-forwarder            VIRTIO-FORWARDER(8)

NAME
virtio-forwarder - Virtio-forwarder Documentation

INTRODUCTION

virtio-forwarder (VIO4WD) is a userspace networking application that
forwards bi-directional traffic between SR-IOV virtual functions (VFs)
and virtio networking devices in QEMU virtual machines.
virtio-forwarder implements a virtio backend driver using the DPDK's
vhost-user library and services designated VFs by means of the DPDK
poll mode driver (PMD) mechanism.

VIO4WD supports up to 64 forwarding instances, where an instance is
essentially a VF <-> virtio pairing. Packets received on the VFs are
sent on their corresponding virtio backend and vice versa. The relay
principle allows a user to benefit from technologies provided by both
NICs and the virtio network driver: a NIC may offload some or all
network functions, while virtio enables VM live migration and is also
agnostic to the underlying hardware.

REQUIREMENTS

· QEMU version 2.5 or newer must be used for the virtual machine
  hypervisor. The older QEMU 2.3 and 2.4 do work with
  virtio-forwarder, but with known bugs, less optimised performance
  and missing features.

· libvirt 1.2.6 or newer (only if libvirt is used to manage the VMs;
  manually scripted QEMU command line VMs do not require libvirt).

· 2M hugepages must be configured in Linux, a corresponding hugetlbfs
  mountpoint must exist, and at least 1375 hugepages must be free for
  use by virtio-forwarder.

· The SR-IOV VFs added to the relay must be bound to the igb_uio
  driver on the host (see the binding example after this list).

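As an illustration of the last requirement, a VF can be bound to igb_uio
with DPDK's dpdk-devbind.py utility. This is only a sketch: the PCI
address is an example, and the script path depends on your DPDK
installation:

    # load the UIO modules (igb_uio is built as part of DPDK/dpdk-kmods)
    modprobe uio
    modprobe igb_uio
    # unbind the VF from its kernel driver and bind it to igb_uio
    dpdk-devbind.py --bind=igb_uio 0000:05:08.1
    # verify the binding
    dpdk-devbind.py --status
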
HUGEPAGES

virtio-forwarder requires 2M hugepages, while QEMU/KVM performs better
with 1G hugepages. To set up the system for use with libvirt, QEMU and
virtio-forwarder, add the following to the Linux kernel command line
parameters:

    hugepagesz=2M hugepages=1375 default_hugepagesz=1G hugepagesz=1G hugepages=8

Alternatively, the following can be done after each boot:

    # Reserve at least 1375 * 2M for virtio-forwarder:
    echo 1375 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
    # Reserve 8G for application hugepages (modify this as needed):
    echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

Note that reserving hugepages after boot may fail if not enough
contiguous free memory is available, so it is recommended to reserve
them at boot time with the Linux kernel command line parameters. This
is especially true for 1G hugepages.

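To make the kernel parameters persistent on a GRUB-based system, they
can be appended to the GRUB configuration. A sketch follows; the exact
file and regeneration command vary by distribution:

    # in /etc/default/grub
    GRUB_CMDLINE_LINUX="... hugepagesz=2M hugepages=1375 default_hugepagesz=1G hugepagesz=1G hugepages=8"
    # then regenerate the bootloader configuration, e.g.
    update-grub                               # Debian/Ubuntu
    grub2-mkconfig -o /boot/grub2/grub.cfg    # RHEL/CentOS
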
hugetlbfs needs to be mounted on the filesystem to allow applications
to create and allocate handles to the mapped memory. The following
lines mount the two types of hugepages on /dev/hugepages (2M) and
/dev/hugepages-1G (1G):

    grep hugetlbfs /proc/mounts | grep -q "pagesize=2M" || \
        ( mkdir -p /dev/hugepages && mount nodev -t hugetlbfs -o rw,pagesize=2M /dev/hugepages/ )
    grep hugetlbfs /proc/mounts | grep -q "pagesize=1G" || \
        ( mkdir -p /dev/hugepages-1G && mount nodev -t hugetlbfs -o rw,pagesize=1G /dev/hugepages-1G/ )

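The same mounts can be made persistent across reboots with hugetlbfs
entries in /etc/fstab. This is a sketch: adjust the mountpoints to your
setup and make sure the mountpoint directories exist at boot (on some
systems a systemd mount unit may be more appropriate than fstab):

    # /etc/fstab
    nodev  /dev/hugepages     hugetlbfs  pagesize=2M  0  0
    nodev  /dev/hugepages-1G  hugetlbfs  pagesize=1G  0  0
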
Finally, libvirt requires a special directory inside the hugepage
mounts, with the correct permissions, in order to create the necessary
per-VM handles:

    mkdir /dev/hugepages-1G/libvirt
    mkdir /dev/hugepages/libvirt
    chown [libvirt-]qemu:kvm -R /dev/hugepages-1G/libvirt
    chown [libvirt-]qemu:kvm -R /dev/hugepages/libvirt

NOTE:
  Substitute /dev/hugepages[-1G] with your actual hugepage mount
  directory. A 2M hugepage mount location is created by default by
  some distributions.

NOTE:
  After these mounts have been prepared, the libvirt daemon will
  probably need to be restarted.
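
On systemd-based distributions this is typically:

    systemctl restart libvirtd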

libvirt and apparmor
On Ubuntu systems, libvirt's apparmor permissions might need to be
modified to allow read/write access to the hugepages directory and
library files for QEMU:

    # in /etc/apparmor.d/abstractions/libvirt-qemu
    # for latest QEMU
    /usr/lib/x86_64-linux-gnu/qemu/* rmix,
    # for access to hugepages
    owner "/dev/hugepages/libvirt/qemu/**" rw,
    owner "/dev/hugepages-1G/libvirt/qemu/**" rw,

Be sure to substitute the hugetlbfs mountpoints that you use into the
above. It may also be prudent to check for any deny lines in the
apparmor configuration that may refer to paths used by
virtio-forwarder, such as hugepage mounts or vhostuser sockets
(default /tmp).

SELinux
On RHEL or CentOS systems, SELinux's access control policies may need
to be changed to allow virtio-forwarder to work. The semanage utility
can be used to put the svirt_t domain into permissive mode, thereby
allowing the relay to function:

    yum install policycoreutils-python
    semanage permissive -a svirt_t

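To confirm the change, the list of permissive domains can be inspected;
svirt_t should appear in the output:

    semanage permissive -l
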
INSTALLATION

virtio-forwarder packages are hosted on copr and ppa. To install, add
the applicable repository and launch the appropriate package manager:

    # rpms
    yum install yum-plugin-copr
    yum copr enable netronome/virtio-forwarder
    yum install virtio-forwarder

    # debs
    add-apt-repository ppa:netronome/virtio-forwarder
    apt-get update
    apt-get install virtio-forwarder

The package install configures virtio-forwarder as a systemd/upstart
service. Boot time startup can be configured using the appropriate
initialization utility, e.g. systemctl enable virtio-forwarder.

After installation, the software can be manually started using the
following command:

    systemctl start virtio-forwarder    # systemd
    start virtio-forwarder              # upstart

Configuration variables taken into account at startup can be set in
the /etc/default/virtioforwarder file. The next section highlights
some important options.

The virtio-forwarder daemon can be stopped by substituting stop for
start in the commands of the respective initialization utilities.

An additional CPU load balancing component is installed alongside
virtio-forwarder. The service, vio4wd_core_scheduler, is managed
exactly like virtio-forwarder with regard to starting, stopping and
configuration.

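To check that the services are running and to inspect their output, for
example on a systemd host:

    systemctl status virtio-forwarder vio4wd_core_scheduler
    journalctl -u virtio-forwarder -u vio4wd_core_scheduler
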
CONFIGURATION

Both the virtio-forwarder and vio4wd_core_scheduler daemons read from
/etc/default/virtioforwarder at startup. The file takes the form of
variable=value entries, one per line. Lines starting with the "#"
character are treated as comments and ignored. The file comes
pre-populated with sane default values, but may require alterations to
comply with different setups. The following list describes a subset of
the available options and their use:

VIRTIOFWD_CPU_MASK
    CPUs to use for worker threads, given either as comma-separated
    integers or as a hex bitmap starting with 0x.
    Valid values: 0 - number of host CPUs.  Default: 1,2

VIRTIOFWD_LOG_LEVEL
    Log threshold, 0-7 (least to most verbose).
    Valid values: 0-7.  Default: 6

VIRTIOFWD_OVSDB_SOCK_PATH
    Path to the ovsdb socket file used for port control.
    Valid values: system path.
    Default: /usr/local/var/run/openvswitch/db.sock

VIRTIOFWD_HUGETLBFS_MOUNT_POINT
    Mount path to hugepages for vhost-user communication with VMs.
    This must match the path configured for libvirt/QEMU.
    Valid values: system path.  Default: /mnt/huge

VIRTIOFWD_SOCKET_OWNER
    vhost-user unix socket ownership username.
    Valid values: username.  Default: libvirt-qemu

VIRTIOFWD_SOCKET_GROUP
    vhost-user unix socket ownership groupname.
    Valid values: groupname.  Default: kvm

VIO4WD_CORE_SCHED_ENABLE
    Use dynamic CPU load balancing. Toggle flag to enable the CPU
    migration API to be exposed. vio4wd_core_scheduler requires this
    option to function.
    Valid values: true or false.  Default: false

VIRTIOFWD_CPU_PINS
    Relay CPU pinnings. A semicolon-delimited list of strings
    specifying which CPU(s) to use for the specified relay instances.
    Valid values: <vf>:<cpu>[,<cpu>].  Default: None

VIRTIOFWD_DYNAMIC_SOCKETS
    Enable dynamic sockets. virtio-forwarder will not create or listen
    to any sockets when dynamic sockets are enabled. Instead, socket
    registration/deregistration must happen through the ZMQ port
    control client.
    Valid values: true or false.  Default: false

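A minimal /etc/default/virtioforwarder sketch tying some of these
options together (the values shown are illustrative and should be
adapted to your host):

    # CPUs used for relay worker threads
    VIRTIOFWD_CPU_MASK=1,2
    # hugetlbfs mountpoint shared with libvirt/QEMU
    VIRTIOFWD_HUGETLBFS_MOUNT_POINT=/dev/hugepages-1G
    # ownership of the vhost-user sockets
    VIRTIOFWD_SOCKET_OWNER=libvirt-qemu
    VIRTIOFWD_SOCKET_GROUP=kvm
    # allow the external load balancer to migrate relays between CPUs
    VIO4WD_CORE_SCHED_ENABLE=true
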
ADDING AND REMOVING VFS

virtio-forwarder implements different methods for the addition and
removal of VFs and bonds. Depending on the use case, one of the
following may be appropriate:

· ZeroMQ port control, for manual device and socket management at
  run-time. Run /usr/lib[64]/virtio-forwarder/virtioforwarder_port_control.py -h
  for usage guidelines. To enable ZeroMQ VF management, set
  VIRTIOFWD_ZMQ_PORT_CONTROL_EP to an appropriate path in the
  configuration file.

  The port control client is the preferred device management tool, and
  is the only utility that can exercise all the device related
  features of virtio-forwarder. In particular, bond creation/deletion
  and dynamic socket registration/deregistration are only exposed to
  the port control client. The examples below demonstrate the
  different modes of operation, and a concrete end-to-end example is
  given at the end of this section:

  · Add VF

        virtioforwarder_port_control.py add --virtio-id=<ID> \
            --pci-addr=<PCI_ADDR>

  · Remove VF

        virtioforwarder_port_control.py remove --virtio-id=<ID> \
            --pci-addr=<PCI_ADDR>

  · Add bond

        virtioforwarder_port_control.py add --virtio-id=<ID> \
            --name=<BOND_NAME> --pci-addr=<PCI_ADDR> --pci-addr=<PCI_ADDR> \
            [--mode=<MODE>]

  · Remove bond

        virtioforwarder_port_control.py remove --virtio-id=<ID> \
            --name=<BOND_NAME>

  · Add device <-> vhost-user socket pair

        virtioforwarder_port_control.py add_sock \
            --vhost-path=</path/to/vhostuser.sock> --pci-addr=<PCI_ADDR> \
            [--pci-addr=<PCI_ADDR> --name=<BOND_NAME> [--mode=<MODE>]]

  · Remove device <-> vhost-user socket pair

        virtioforwarder_port_control.py remove_sock \
            --vhost-path=</path/to/vhostuser.sock> \
            (--pci-addr=<PCI_ADDR>|--name=<BOND_NAME>)

  NOTE:

    · A bond operation is assumed when multiple PCI addresses are
      provided.

    · Bond names are required to start with net_bonding.

    · Socket operations only apply if virtio-forwarder was started
      with the VIRTIOFWD_DYNAMIC_SOCKETS option enabled.

· Static VF entries in /etc/default/virtioforwarder. VFs specified
  here are added when the daemon starts. The VIRTIOFWD_STATIC_VFS
  variable is used for this purpose, with the syntax
  <PCI>=<virtio_id>, e.g. 0000:05:08.1=1. Multiple entries can be
  specified using bash arrays. The following examples are all valid:

    · VIRTIOFWD_STATIC_VFS=0000:05:08.1=1

    · VIRTIOFWD_STATIC_VFS=(0000:05:08.1=1)

    · VIRTIOFWD_STATIC_VFS=(0000:05:08.1=1 0000:05:08.2=2 0000:05:08.3=3)

· OVSDB monitor: The ovs-vsctl command manipulates the OVSDB, which is
  monitored for changes by virtio-forwarder. To add a VF to
  virtio-forwarder, the ovs-vsctl command can be used with a special
  external_ids value containing an indication to use the relay. The
  bridge name br-virtio in this example is arbitrary; any bridge name
  may be used:

      ovs-vsctl add-port br-virtio eth100 -- set interface \
          eth100 external_ids:virtio_forwarder=1

  Note that the ports in the OVSDB remain configured across OvS
  restarts, and when virtio-forwarder starts it will find the initial
  list of ports with associated virtio-forwarder indications and
  recreate the necessary associations.

  Changing an interface with no virtio-forwarder indication to one
  with a virtio-forwarder indication, or changing one with a
  virtio-forwarder indication to one without, also works, e.g.:

      # add to OvS bridge without virtio-forwarder (ignored by virtio-forwarder)
      ovs-vsctl add-port br-virtio eth100
      # add virtio-forwarder (detected by virtio-forwarder)
      ovs-vsctl set interface eth100 external_ids:virtio_forwarder=1
      # remove virtio-forwarder (detected by virtio-forwarder and removed from
      # relay, but remains on OvS bridge)
      ovs-vsctl remove interface eth100 external_ids virtio_forwarder

  The external_ids of a particular interface can be viewed with
  ovs-vsctl as follows:

      ovs-vsctl list interface eth100 | grep external_ids

  A list of all the interfaces with external_ids can be queried from
  OVSDB:

      ovsdb-client --pretty -f list dump Interface name external_ids | \
          grep -A2 -E "external_ids.*: {.+}"

· Inter-process communication (IPC), which implements a file monitor
  for VF management. Set VIRTIOFWD_IPC_PORT_CONTROL in the
  configuration file to non-null to enable.

NOTE:
  ZMQ, OVSDB and IPC port control are mutually exclusive.

WARNING:
  Relayed VFs cannot be used for SR-IOV passthrough while in use by
  virtio-forwarder, as libvirt will disregard the igb_uio binding of
  relayed VFs when establishing a passthrough connection. This causes
  irrevocable interference with the igb_uio module, leading to an
  eventual segmentation fault.

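As a concrete end-to-end example of the ZeroMQ method above (the
endpoint path and PCI address are illustrative, and the port control
script may live under /usr/lib/ or /usr/lib64/ depending on the
distribution):

    # in /etc/default/virtioforwarder: enable the ZeroMQ port control endpoint
    VIRTIOFWD_ZMQ_PORT_CONTROL_EP=ipc:///var/run/virtio-forwarder/port_control
    # restart the daemon, then add VF 0000:05:08.1 as relay instance 1
    /usr/lib64/virtio-forwarder/virtioforwarder_port_control.py add \
        --virtio-id=1 --pci-addr=0000:05:08.1
    # ... and remove it again
    /usr/lib64/virtio-forwarder/virtioforwarder_port_control.py remove \
        --virtio-id=1 --pci-addr=0000:05:08.1
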
CPU AFFINITIES

The VIRTIOFWD_CPU_PINS variable in the configuration file can be used
to control VF relay CPU affinities. The format of the option is
--virtio-cpu=<vf>:<cpu>[,<cpu>], where <cpu> must be a valid CPU
enabled in the VIRTIOFWD_CPU_MASK configuration option. Specifying two
CPUs for a particular VF allows the VF-to-virtio and virtio-to-VF
relay directions to be serviced by separate CPUs, enabling higher
performance to a particular virtio endpoint in a VM. If a given VF is
not bound to a CPU (or CPUs), that VF relay will be assigned to the
least busy CPU in the list of CPUs provided in the configuration. The
option may contain multiple affinity specifiers, one for each VF
number.

CPU LOAD BALANCING

In some scenarios, virtio-forwarder's CPU assignments may result in
poor relay-to-CPU affinities because the network load is unevenly
distributed among worker cores. A relay's throughput will suffer when
it is serviced by worker cores under excessive processing load. Manual
pinnings may also prove suboptimal under varying network requirements.
The external vio4wd_core_scheduler load balancing daemon is included
to address this issue. The balancer daemon gathers network load
periodically in order to determine and apply an optimal affinity
solution. ZeroMQ is used for inter-process communication. Note that
VIO4WD_CORE_SCHED_ENABLE must be explicitly set to true for
virtio-forwarder to create and listen on the ZeroMQ endpoint required
for CPU migration.

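For example, to enable dynamic load balancing (the variable is
documented in the configuration section above; the service commands
assume a systemd host):

    # in /etc/default/virtioforwarder
    VIO4WD_CORE_SCHED_ENABLE=true

    systemctl restart virtio-forwarder
    systemctl start vio4wd_core_scheduler
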
NOTE:
  When running, the load balancer may overwrite manual pinnings at any
  time!

RUNNING VIRTUAL MACHINES

QEMU virtual machines can be run manually on the command line, or by
using libvirt to manage them. To use QEMU manually with the vhost-user
backed VirtIO which virtio-forwarder provides, the following example
can be used:

    -object memory-backend-file,id=mem,size=3584M,mem-path=/dev/hugepages-1G,share=on,prealloc=on \
    -numa node,memdev=mem -mem-prealloc \
    -chardev socket,id=chr0,path=/tmp/virtio-forwarder1.sock \
    -netdev type=vhost-user,id=guest3,chardev=chr0,vhostforce \
    -device virtio-net-pci,netdev=guest3,csum=off,gso=off,guest_tso4=off,guest_tso6=off,\
    guest_ecn=off,mac=00:03:02:03:04:01

It is important for the VM memory to be marked as shareable (share=on)
and preallocated (prealloc=on and -mem-prealloc); the mem-path must
also be set to the hugepage mount point used on the system. The path
of the socket must be set to the correct virtio-forwarder vhost-user
instance, and the MAC address may be configured as needed.

Virtual machines may also be managed using libvirt, and this requires
some specific XML snippets in the libvirt VM domain specification
file:

    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB' nodeset='0'/>
      </hugepages>
    </memoryBacking>

    <cpu mode='custom' match='exact'>
      <model fallback='allow'>SandyBridge</model>
      <feature policy='require' name='ssse3'/>
      <numa>
        <cell id='0' cpus='0-1' memory='3670016' unit='KiB' memAccess='shared'/>
      </numa>
    </cpu>

If only 2M hugepages are in use on the system, the domain can be
configured with the following page size:

    <page size='2' unit='MiB' nodeset='0'/>

Note that the emulated CPU requires SSSE3 instructions for DPDK
support.

The following snippet illustrates how to add a vhost-user interface to
the domain:

    <devices>
      <interface type='vhostuser'>
        <source type='unix' path='/tmp/virtio-forwarderRELAYID.sock' mode='client'/>
        <model type='virtio'/>
        <alias name='net1'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
      </interface>
    </devices>

NOTE:
  When starting the domain, make sure that the permissions are
  correctly set on the relay vhost-user socket, and that the required
  permissions are added to the apparmor profile. The
  VIRTIOFWD_SOCKET_OWNER and VIRTIOFWD_SOCKET_GROUP options in the
  configuration file can also be used to set the permissions on the
  vhostuser sockets.

VHOSTUSER CLIENT MODE

The VIRTIOFWD_VHOST_CLIENT option can be used to put virtio-forwarder
in vhostuser client mode instead of the default server mode. This
requires the VM to use QEMU 2.7 or newer, and the VM must be
configured to use vhostuser server mode, e.g. for libvirt:

    <interface type='vhostuser'>
      <mac address='52:54:00:bf:e3:ae'/>
      <source type='unix' path='/tmp/virtio-forwarder1.sock' mode='server'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>

or, when using a QEMU command line directly:

    -chardev socket,id=charnet1,path=/tmp/virtio-forwarder1.sock,server

The advantage of this mode is that virtio-forwarder will attempt to
re-establish broken vhostuser connections automatically. In
particular, this allows virtio-forwarder to be restarted while a VM is
running (and still have virtio connectivity afterwards), as well as
having a VM restarted while virtio-forwarder is running. In the
default virtio-forwarder vhostuser server mode, only the latter is
possible.

MULTIQUEUE VIRTIO

virtio-forwarder supports multiqueue virtio up to a maximum of 32
queues, where the QEMU VM is configured in the standard way. For
libvirt-managed VMs, libvirt version 1.2.17 or newer is required for
multiqueue support; one can then simply add <driver queues='4'/>
inside the vhostuser interface section of the libvirt XML, where 4 is
the number of queues required, e.g.:

    <interface type='vhostuser'>
      <mac address='52:54:00:bf:e3:ae'/>
      <source type='unix' path='/tmp/virtio-forwarder1.sock' mode='client'/>
      <model type='virtio'/>
      <driver queues='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>

This results in the following command line parameters to QEMU:

    -chardev socket,id=charnet1,path=/tmp/virtio-forwarder1.sock -netdev type=vhost-user,\
    id=hostnet1,chardev=charnet1,queues=4 -device virtio-net-pci,mq=on,vectors=10,\
    netdev=hostnet1,id=net1,mac=52:54:00:bf:e3:ae,bus=pci.0,addr=0x6

That is, the queues item is added to the netdev option, and the mq and
vectors items are added to the device option, where the vectors value
must be (queues+1)*2.

To enable multiqueue inside the VM:

    # to see max and current queues:
    ethtool -l eth1
    # to set queues:
    ethtool -L eth1 combined 4

PERFORMANCE TUNING

Important aspects that influence performance are resource contention,
and CPU and memory NUMA affinities. The following are general
guidelines to follow for a performance oriented setup:

· Pin VM VCPUs.

· Dedicate worker CPUs for relays.

· Do not make any overlapping CPU assignments.

· Set the NUMA affinity of a VM's backing memory and ensure that it
  matches the VCPUs. The numatune libvirt xml snippet can be used for
  this (see the sketch after this list).

· Keep hyperthread partners idle.

· Disable interrupts on the applicable CPUs.

· Keep all components on the same NUMA node. If you want to utilize
  the other NUMA node, assign everything (VCPUs, VM memory, VIO4WD
  workers) to that node so that only the PCI device is cross-socket.

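A minimal libvirt sketch of the NUMA and pinning related elements (the
node and CPU numbers are illustrative; align them with the NUMA node of
the NIC and with the CPUs given to virtio-forwarder):

    <vcpu placement='static'>2</vcpu>
    <cputune>
      <vcpupin vcpu='0' cpuset='4'/>
      <vcpupin vcpu='1' cpuset='5'/>
    </cputune>
    <numatune>
      <memory mode='strict' nodeset='0'/>
    </numatune>
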
If a VM's backing memory is confined to a particular NUMA node,
virtio-forwarder will automatically align the corresponding relay's
memory pool with the VM's upon connection in order to limit QPI
crossings. Moreover, the CPU load balancing daemon will only consider
CPUs that are local to a relay's NUMA node to service it.

DEBUGGING

Helper and debugging scripts are located in
/usr/lib[64]/virtio-forwarder/. Here are pointers to using some of the
more useful ones:

· virtioforwarder_stats.py: gathers statistics (including rate stats)
  from running relay instances.

· virtioforwarder_core_pinner.py: manually pins relay instances to
  CPUs at runtime, using the same syntax as the environment file, that
  is, --virtio-cpu=RN:Ci,Cj (see the example after this list). Run
  without arguments to get the current relay-to-CPU mapping. Note that
  the mappings may be overridden by the load balancer if it is also
  running. The same is true for mappings provided in the configuration
  file.

· virtioforwarder_monitor_load.py: provides a bar-like representation
  of the current load on worker CPUs. Useful to monitor the work of
  the load balancer.

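For example, to pin relay 1 to CPUs 2 and 3 and then inspect the
resulting mapping (the relay and CPU numbers are illustrative; the
scripts may be under /usr/lib/ or /usr/lib64/):

    /usr/lib64/virtio-forwarder/virtioforwarder_core_pinner.py --virtio-cpu=1:2,3
    /usr/lib64/virtio-forwarder/virtioforwarder_core_pinner.py
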
System logs can be viewed by running journalctl -u virtio-forwarder -u
vio4wd_core_scheduler on systemd-enabled systems. Syslog provides the
same information on older systems.

USING VIRTIO 1.0

To enable VirtIO 1.0 (as opposed to legacy VirtIO), the backend
virtual PCI device provided by QEMU needs to be enabled. Using QEMU
2.5, you need to supply an extra command line parameter to prevent
VirtIO 1.0 support from being disabled (it is disabled by default,
since there are apparently still known issues with performance,
stability and live migration):

    -global virtio-pci.disable-modern=off

This can be done in a libvirt domain by ensuring the domain spec
starts with something like:

    <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

and by adding the following just prior to the closing </domain> tag:

    <qemu:commandline>
      <qemu:arg value='-global'/>
      <qemu:arg value='virtio-pci.disable-modern=off'/>
    </qemu:commandline>

In addition to this, the vhost or vhost-user backend connected to the
device in QEMU must support VirtIO 1.0. The vhostuser interface which
virtio-forwarder supplies does support this, but if the host is
running a Linux kernel older than 4.0, you likely won't have vhost-net
(kernel) support for any network interfaces in your QEMU VM which are
not connected to virtio-forwarder, for example a bridged management
network interface. Libvirt will by default use vhost-net for such an
interface; you can disable vhost-net by adding <driver name='qemu'/>
to the relevant bridge interface as follows:

    <interface type='bridge'>
      ...
      <model type='virtio'/>
      <driver name='qemu'/>
      ...
    </interface>

To use VirtIO 1.0 with DPDK inside a VM, you will need to use DPDK
16.04. To use a VirtIO 1.0 netdev in the VM, the VM must be running
Linux kernel version 4.0 or newer.

LIVE MIGRATION

virtio-forwarder is compatible with QEMU VM live migration as
abstracted by libvirt, and has been tested using QEMU 2.5 with libvirt
1.2.16. The VM configuration must conform to some requirements to
allow live migration to take place. In short:

· The VM disk image must reside on shared network storage accessible
  to both the source and destination machines.

· The same versions of QEMU must be available on both machines.

· The apparmor configuration must be correct on both machines.

· The VM disk cache must be disabled, e.g. <driver name='qemu'
  type='qcow2' cache='none'/> (inside the disk element).

· The hugepages for both machines must be correctly configured.

· Ensure both machines have Linux kernels new enough to support
  vhost-net live migration for any virtio network devices not using
  the vhostuser interface, or configure such interfaces to only use
  the vanilla QEMU virtio backend, e.g. <model type='virtio'/>
  <driver name='qemu'/> (inside the relevant interface elements).

The VM live migration can be initiated from the source machine by
giving the VM name and target user and hostname as follows:

    virsh migrate --live <vm_name> qemu+ssh://<user@host>/system

The --verbose argument can optionally be added for extra information.
If all goes well, virsh list on the source machine should no longer
show <vm_name>, and it should instead appear in the output of virsh
list on the destination machine. If anything goes wrong, the following
log files often have additional details to help troubleshoot the
problem:

    journalctl
    /var/log/syslog
    /var/log/libvirt/libvirt.log
    /var/log/libvirt/qemu/<vm_name>.log

In the simplest scenario, the source and destination machines have the
same VM configuration, particularly with respect to the vhostuser
socket used on virtio-forwarder. It may be handy to configure the
vhostuser socket in the VM to point to a symlink file which links to
one of the virtio-forwarder sockets. This is one way to allow the
source and destination machines to use different vhostuser sockets if
necessary. For example, on the source machine one might be using a
symlink called /tmp/vm_abc.sock linking to /tmp/virtio-forwarder1.sock,
while on the destination machine /tmp/vm_abc.sock might link to
/tmp/virtio-forwarder13.sock.

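For example, such a symlink could be created on each host before
defining the VM (using the paths from the scenario above):

    # on the source machine
    ln -s /tmp/virtio-forwarder1.sock /tmp/vm_abc.sock
    # on the destination machine
    ln -s /tmp/virtio-forwarder13.sock /tmp/vm_abc.sock
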
It is also possible to migrate between machines where one is using
virtio-forwarder and the other is using a different virtio backend
driver (this could be a different vhostuser implementation, or even
vhost-net or the plain QEMU backend). The key to achieving this is the
--xml parameter of the virsh migrate command (virsh help migrate
reveals: --xml <string> filename containing updated XML for the
target).

Here is an example of the procedure to migrate from a vhostuser VM
(connected to virtio-forwarder) to a non-vhostuser VM:

On the destination machine, set up a libvirt network that you want to
migrate the interface onto, e.g. named 'migrate', by passing the
following XML file to virsh net-define <xml_file> and activating it
with virsh net-start migrate; virsh net-autostart migrate:

    <network>
      <name>migrate</name>
      <bridge name='migratebr0' stp='off' delay='0'/>
    </network>

On the source machine (where the VM is defined to use vhostuser
connected to virtio-forwarder), dump the VM XML to a file by running
virsh dumpxml <vm_name> >domain.xml. Edit the domain.xml file to
change the vhostuser interfaces to be sourced by the migrate network,
i.e. change these:

    <interface type='vhostuser'>
      <mac address='00:0a:00:00:00:00'/>
      <source type='unix' path='/tmp/virtio-forwarder0.sock' mode='client'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </interface>

to these:

    <interface type='network'>
      <mac address='00:0a:00:00:00:00'/>
      <source network='migrate'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </interface>

Finally, once you have this modified domain.xml file, the VM can be
migrated as follows:

    virsh migrate --live <vm_name> qemu+ssh://<user@host>/system --xml domain.xml

Migrating from a non virtio-forwarder machine to a virtio-forwarder
machine follows this same procedure in reverse: a new XML file is made
in which the migrate network interfaces are changed to vhostuser
interfaces.

AUTHOR
Bert van Leeuwen, Frik Botha

1.1.99.51                        Feb 18, 2019              VIRTIO-FORWARDER(8)