VIRTIOFSD(1)                         QEMU                         VIRTIOFSD(1)

NAME

       virtiofsd - QEMU virtio-fs shared file system daemon

SYNOPSIS

       virtiofsd [OPTIONS]

DESCRIPTION

       Share a host directory tree with a guest through a virtio-fs device.
       This program is a vhost-user backend that implements the virtio-fs
       device. Each virtio-fs device instance requires its own virtiofsd
       process.

       This program is designed to work with QEMU's --device vhost-user-fs-pci
       but should work with any virtual machine monitor (VMM) that supports
       vhost-user. See the EXAMPLES section below.

       This program must be run as the root user. It drops privileges where
       possible during startup, although it must remain able to create and
       access files with any uid/gid:

       • The ability to invoke syscalls is limited using seccomp(2).

       • Linux capabilities(7) are dropped.

       In "namespace" sandbox mode the program switches into a new mount
       namespace and invokes pivot_root(2) to make the shared directory tree
       its root. New pid and net namespaces are also created to isolate the
       process.

       In "chroot" sandbox mode the program invokes chroot(2) to make the
       shared directory tree its root. This mode is intended for container
       environments where the container runtime has already set up the
       namespaces and the program does not have permission to create
       namespaces itself.

       Both sandbox modes prevent "file system escapes" due to symlinks and
       other file system objects that might lead to files outside the shared
       directory.

OPTIONS

       -h, --help
              Print help.

       -V, --version
              Print version.

       -d     Enable debug output.

       --syslog
              Print log messages to syslog instead of stderr.

       -o OPTION

              • debug - Enable debug output.

              • flock|no_flock - Enable/disable flock. The default is
                no_flock.

              • modcaps=CAPLIST - Modify the list of capabilities allowed;
                CAPLIST is a colon-separated list of capabilities, each
                preceded by either + or -, e.g. "+sys_admin:-chown".

              • log_level=LEVEL - Print only log messages matching LEVEL or
                more severe. LEVEL is one of err, warn, info, or debug. The
                default is info.

              • posix_lock|no_posix_lock - Enable/disable remote POSIX locks.
                The default is no_posix_lock.

              • readdirplus|no_readdirplus - Enable/disable readdirplus. The
                default is readdirplus.

              • sandbox=namespace|chroot - Select the sandbox mode. The
                default is "namespace".

                - namespace: Create mount, pid, and net namespaces and
                  pivot_root(2) into the shared directory.

                - chroot: chroot(2) into the shared directory (use in
                  containers).

              • source=PATH - Share the host directory tree located at PATH.
                This option is required.

              • timeout=TIMEOUT - I/O timeout in seconds. The default depends
                on the --cache option.

              • writeback|no_writeback - Enable/disable the writeback cache.
                The cache allows the FUSE client to buffer and merge write
                requests. The default is no_writeback.

              • xattr|no_xattr - Enable/disable extended attributes (xattr) on
                files and directories. The default is no_xattr.

              • posix_acl|no_posix_acl - Enable/disable POSIX ACL support.
                POSIX ACLs are disabled by default.

              • security_label|no_security_label - Enable/disable security
                label support. Security labels are disabled by default. When
                enabled, the client can send a MAC label for a file during
                file creation. This is typically expected to be an SELinux
                security label. The server tries to set that label on the
                newly created file atomically wherever possible.

              • killpriv_v2|no_killpriv_v2 - Enable/disable
                FUSE_HANDLE_KILLPRIV_V2 support. KILLPRIV_V2 is enabled by
                default as long as the client supports it. Enabling this
                option improves performance in the write path.

       --socket-path=PATH
              Listen on the vhost-user UNIX domain socket at PATH.

       --socket-group=GROUP
              Set the vhost-user UNIX domain socket gid to GROUP.

       --fd=FDNUM
              Accept connections from the vhost-user UNIX domain socket file
              descriptor FDNUM. The file descriptor must already be listening
              for connections.

       --thread-pool-size=NUM
              Restrict the number of worker threads per request queue to NUM.
              The default is 0.

       --cache=none|auto|always
              Select the desired trade-off between coherency and performance.
              none forbids the FUSE client from caching, to achieve the best
              coherency at the cost of performance. auto behaves similarly to
              NFS, with a 1 second metadata cache timeout. always sets a long
              cache lifetime at the expense of coherency. The default is
              auto.

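       The --fd option above expects a descriptor that is already listening.
       The following Python sketch shows one way a launcher might prepare such
       a socket and build the matching command line. The helper names
       (make_listening_socket, virtiofsd_argv) and the paths are hypothetical;
       only --fd and -o source= come from this page. The sketch builds the
       argument list and leaves starting the daemon to the caller.

```python
import os
import socket
import tempfile


def make_listening_socket(path):
    """Create a UNIX stream socket bound to path and already listening."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(1)
    return sock


def virtiofsd_argv(fd, source):
    """Build a virtiofsd command line that uses --fd instead of --socket-path."""
    return ["virtiofsd", f"--fd={fd}", "-o", f"source={source}"]


if __name__ == "__main__":
    # Hypothetical socket location and shared directory, for illustration only.
    sock_path = os.path.join(tempfile.mkdtemp(), "vhost-fs.sock")
    listener = make_listening_socket(sock_path)
    argv = virtiofsd_argv(listener.fileno(), "/var/lib/fs/vm001")
    print(argv)
    # To actually start the daemon (needs root and an installed virtiofsd),
    # the descriptor has to stay open in the child, for example:
    #   subprocess.Popen(argv, pass_fds=[listener.fileno()])
    listener.close()
    os.unlink(sock_path)
```

       Passing the descriptor this way lets the launcher create the socket
       with the ownership and permissions it wants before the daemon starts.
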

EXTENDED ATTRIBUTE (XATTR) MAPPING

       By default the names of xattrs used by the client are passed through
       to the server file system. This can be a problem where either those
       xattr names are used by something on the server (e.g. SELinux
       client/server confusion) or where virtiofsd is running in a container
       with restricted privileges and cannot access some attributes.

   Mapping syntax
       A mapping of xattr names can be made using -o xattrmap=mapping, where
       the mapping string consists of a series of rules.

       The first matching rule terminates the mapping. The set of rules must
       include a terminating rule to match any remaining attributes at the
       end.

       Each rule consists of a number of fields separated by a separator,
       which is the first non-whitespace character in the rule. The same
       separator must then be used for the whole rule. White space may be
       added before and after each rule.

       Using ':' as the separator, a rule is of the form:

       :type:scope:key:prepend:

       scope is one of:

       • 'client' - match 'key' against an xattr name from the client for
         setxattr/getxattr/removexattr.

       • 'server' - match 'prepend' against an xattr name from the server
         for listxattr.

       • 'all' - can be used to make a single rule where both the server
         and client matches are triggered.

       type is one of:

       • 'prefix' - prepends and strips a prefix; the modified attributes
         are then passed on to the client/server.

       • 'ok' - terminates the rule set when a match is found while allowing
         matching xattrs through unchanged. It is intended both as a way of
         explicitly terminating the list of rules and as a way to let some
         xattrs skip the following rules.

       • 'bad' - if a client tries to use a name matching 'key', it is denied
         with EPERM; when the server passes an attribute name matching
         'prepend', it is hidden. In many ways its use is very like 'ok', as
         either an explicit terminator or for special handling of certain
         patterns.

       • 'unsupported' - if a client tries to use a name matching 'key', it
         is denied with ENOTSUP; when the server passes an attribute name
         matching 'prepend', it is hidden. In many ways its use is very like
         'ok', as either an explicit terminator or for special handling of
         certain patterns.

       key is a string tested as a prefix on an attribute name originating on
       the client. It may be empty, in which case a 'client' rule will always
       match on client names.

       prepend is a string tested as a prefix on an attribute name originating
       on the server, and used as a new prefix. It may be empty, in which case
       a 'server' rule will always match on all names from the server.

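       The rule syntax described above can be sketched as a small parser. This
       is an illustrative Python model, not the daemon's code: the function
       name parse_xattrmap is hypothetical, and the sketch only handles the
       four-field :type:scope:key:prepend: form.

```python
def parse_xattrmap(spec):
    """Split a mapping string into (type, scope, key, prepend) tuples."""
    rules = []
    pos = 0
    while True:
        # White space may appear before and after each rule.
        while pos < len(spec) and spec[pos].isspace():
            pos += 1
        if pos == len(spec):
            return rules
        # The first non-whitespace character is this rule's separator.
        sep = spec[pos]
        pos += 1
        fields = []
        for _ in range(4):  # type, scope, key, prepend
            end = spec.find(sep, pos)
            if end < 0:
                raise ValueError(f"rule not terminated by {sep!r}")
            fields.append(spec[pos:end])
            pos = end + 1
        rules.append(tuple(fields))


if __name__ == "__main__":
    # Each rule may pick its own separator; here '/' and then ':'.
    for rule in parse_xattrmap(" /bad/server//security./ \n :ok:all::: "):
        print(rule)
```

       Because every field is closed by the separator, an empty key or prepend
       simply shows up as two adjacent separators.
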
       For example:
          :prefix:client:trusted.:user.virtiofs.:

          will match 'trusted.' attributes in client calls and prefix them
          before passing them to the server.

          :prefix:server::user.virtiofs.:

          will strip 'user.virtiofs.' from all server replies.

          :prefix:all:trusted.:user.virtiofs.:

          combines the previous two cases into a single rule.

          :ok:client:user.::

          will allow get/set xattr for 'user.' xattrs and ignore the
          following rules.

          :ok:server::security.:

          will pass 'security.' xattrs in listxattr from the server and
          ignore the following rules.

          :ok:all:::

          will terminate the rule search, passing any remaining attributes in
          both directions.

          :bad:server::security.:

          will hide 'security.' xattrs in listxattr from the server.

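       The behaviour of the rules above can be modelled in a few lines of
       Python. This is a simplified sketch based only on the descriptions in
       this page, not the daemon's implementation. The function names are
       hypothetical, and treating a name that matches no rule as denied or
       hidden is an assumption (the rule set is required to end with a
       terminating rule anyway).

```python
import errno


def _applies(scope, side):
    """An 'all' rule is checked on both the client and the server side."""
    return scope == "all" or scope == side


def map_client(rules, name):
    """Return the name passed on to the server, or raise OSError if denied."""
    for rtype, scope, key, prepend in rules:
        if not _applies(scope, "client") or not name.startswith(key):
            continue
        if rtype == "bad":
            raise OSError(errno.EPERM, "denied by 'bad' rule")
        if rtype == "unsupported":
            raise OSError(errno.ENOTSUP, "denied by 'unsupported' rule")
        if rtype == "ok":
            return name
        if rtype == "prefix":
            return prepend + name
    raise OSError(errno.EPERM, "no rule matched")  # assumption, see lead-in


def map_server(rules, name):
    """Return the name shown to the client, or None if it is hidden."""
    for rtype, scope, key, prepend in rules:
        if not _applies(scope, "server") or not name.startswith(prepend):
            continue
        if rtype in ("bad", "unsupported"):
            return None
        if rtype == "ok":
            return name
        if rtype == "prefix":
            return name[len(prepend):]
    return None  # assumption, see lead-in


if __name__ == "__main__":
    rules = [
        ("prefix", "all", "trusted.", "user.virtiofs."),
        ("ok", "all", "", ""),
    ]
    print(map_client(rules, "trusted.foo"))                # user.virtiofs.trusted.foo
    print(map_server(rules, "user.virtiofs.trusted.foo"))  # trusted.foo
    print(map_client(rules, "user.bar"))                   # user.bar
```

       Note that 'key' is only consulted for names coming from the client and
       'prepend' only for names coming from the server, which is why a single
       'all' rule can serve both directions.
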
       A simpler 'map' type provides a shorter syntax for the common case:

       :map:key:prepend:

       The 'map' type adds a number of separate rules to add prepend as a
       prefix to the matched key (or all attributes if key is empty). There
       may be at most one 'map' rule and it must be the last rule in the set.

       Note: When the 'security.capability' xattr is remapped, the daemon has
       to do extra work to remove it during many operations, which the host
       kernel normally does itself.

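       The mapping examples later in this page give two equivalences for
       'map' rules, one with an empty key and one with the key 'trusted.'.
       The Python sketch below reproduces only those two documented
       expansions. The function name expand_map is hypothetical, and the
       rules the daemon generates internally may differ in detail.

```python
def expand_map(key, prepend):
    """Expand a 'map' rule into (type, scope, key, prepend) tuples.

    Follows the two equivalences given in the mapping examples of this page.
    """
    rules = [("prefix", "all", key, prepend)]
    if key == "":
        # Everything is prefixed; hide whatever else the host has set.
        rules.append(("bad", "all", "", ""))
    else:
        rules.append(("bad", "server", "", key))      # hide unprefixed host names
        rules.append(("bad", "client", prepend, ""))  # block direct client access
        rules.append(("ok", "all", "", ""))           # let everything else through
    return rules


if __name__ == "__main__":
    for rule in expand_map("trusted.", "user.virtiofs."):
        print(rule)
```
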
   Security considerations
       Operating systems typically partition the xattr namespace using
       well-defined name prefixes. Each partition may have different access
       controls applied. For example, on Linux there are multiple partitions:

       • system.* - access varies depending on attribute and filesystem

       • security.* - only processes with CAP_SYS_ADMIN

       • trusted.* - only processes with CAP_SYS_ADMIN

       • user.* - any process granted by file permissions / ownership

       Other operating systems, such as FreeBSD, have different name prefixes
       and access control rules.

       When remapping attributes on the host, it is important to ensure that
       the remapping does not allow a guest user to evade the guest access
       control rules.

       Consider the case where trusted.* from the guest is remapped to
       user.virtiofs.trusted.* in the host. An unprivileged user in a Linux
       guest has the ability to write to xattrs under user.*. Thus the user
       can evade the access control restriction on trusted.* by instead
       writing to user.virtiofs.trusted.*.

       As noted above, the partitions used and the access controls applied
       vary across guest operating systems, so it is not wise to try to
       predict what the guest OS will use.

       The simplest way to avoid an insecure configuration is to remap all
       xattrs at once to a given fixed prefix. This is shown in example (1)
       below.

       If selectively mapping only a subset of xattr prefixes, then rules must
       be added to explicitly block direct access to the target of the
       remapping. This is shown in example (2) below.

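       The bypass described above, and the effect of an explicit blocking
       rule, can be illustrated with a small Python model of client-side
       matching. It is a sketch built from the rule descriptions in this page
       rather than the daemon's code; the names map_client, INSECURE and
       GUARDED are hypothetical.

```python
import errno


def map_client(rules, name):
    """Return the host-side name for a client xattr, or raise OSError."""
    for rtype, scope, key, prepend in rules:
        if scope not in ("client", "all") or not name.startswith(key):
            continue
        if rtype == "bad":
            raise OSError(errno.EPERM, "denied by 'bad' rule")
        if rtype == "ok":
            return name
        if rtype == "prefix":
            return prepend + name
    raise OSError(errno.EPERM, "no rule matched")  # assumption of this sketch


# Remaps trusted.* but leaves the remapping target reachable.
INSECURE = [
    ("prefix", "client", "trusted.", "user.virtiofs."),
    ("ok", "all", "", ""),
]

# Same remapping, with direct access to the target blocked.
GUARDED = [
    ("prefix", "client", "trusted.", "user.virtiofs."),
    ("bad", "client", "user.virtiofs.", ""),
    ("ok", "all", "", ""),
]

if __name__ == "__main__":
    # A privileged guest process and an unprivileged one end up on the
    # same host attribute when the target is not blocked.
    print(map_client(INSECURE, "trusted.foo"))
    print(map_client(INSECURE, "user.virtiofs.trusted.foo"))
    try:
        map_client(GUARDED, "user.virtiofs.trusted.foo")
    except OSError as exc:
        print("blocked:", errno.errorcode[exc.errno])
```

       In the guarded rule set the 'bad' rule has to come before the final
       'ok' rule, because the first matching rule terminates the mapping.
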
   Mapping examples
       1. Prefix all attributes with 'user.virtiofs.'

          -o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"

       This uses two rules, using : as the field separator; the first rule
       prefixes and strips 'user.virtiofs.', the second rule hides any
       non-prefixed attributes that the host set.

       This is equivalent to the 'map' rule:

          -o xattrmap=":map::user.virtiofs.:"

       2. Prefix 'trusted.' attributes, allow others through

          "/prefix/all/trusted./user.virtiofs./
           /bad/server//trusted./
           /bad/client/user.virtiofs.//
           /ok/all///"

       Here there are four rules, using / as the field separator, and also
       demonstrating that new lines can be included between rules. The first
       rule is the prefixing of 'trusted.' and stripping of 'user.virtiofs.'.
       The second rule hides unprefixed 'trusted.' attributes on the host.
       The third rule stops a guest from explicitly setting the
       'user.virtiofs.' path directly, to prevent access control bypass on the
       target of the earlier prefix remapping. Finally, the fourth rule lets
       all remaining attributes through.

       This is equivalent to the 'map' rule:

          -o xattrmap="/map/trusted./user.virtiofs./"

       3. Hide 'security.' attributes, and allow everything else

          "/bad/all/security./security./
           /ok/all///"

       The first rule combines what could be separate client and server rules
       into a single 'all' rule, matching 'security.' in either client
       arguments or lists returned from the host. This stops the client from
       seeing any 'security.' attributes on the server and from setting any.

SELINUX SUPPORT

       SELinux support can be enabled by running virtiofsd with the option
       "-o security_label". However, this tries to save the guest's security
       context in the security.selinux xattr on the host, which might fail if
       the host's SELinux policy does not permit virtiofsd to perform this
       operation.

       Hence, it is preferred to remap the guest's "security.selinux" xattr
       to, say, "trusted.virtiofs.security.selinux" on the host.

       "-o xattrmap=:map:security.selinux:trusted.virtiofs.:"

       This ensures that the guest's and host's SELinux xattrs on the same
       file remain separate and do not interfere with each other. It also
       allows host and guest to implement their own separate SELinux
       policies.

       Setting a trusted xattr on the host requires CAP_SYS_ADMIN, so this
       capability needs to be added to the daemon.

       "-o modcaps=+sys_admin"

       Granting CAP_SYS_ADMIN increases the risk to the system: virtiofsd
       becomes more powerful and, if it is compromised, can do a lot of
       damage to the host system. Keep this trade-off in mind when making a
       decision.

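       The three options discussed in this section can be collected into one
       command line. The Python sketch below assumes that the separate -o
       options can simply be given together in one invocation. The helper
       name selinux_remap_args is hypothetical, and the socket path and
       shared directory are borrowed from the EXAMPLES section.

```python
import shlex


def selinux_remap_args(socket_path, source):
    """Build a virtiofsd command line with SELinux labels remapped."""
    return [
        "virtiofsd",
        f"--socket-path={socket_path}",
        "-o", f"source={source}",
        "-o", "security_label",
        # Keep guest labels apart from the host's own security.selinux.
        "-o", "xattrmap=:map:security.selinux:trusted.virtiofs.:",
        # trusted.* xattrs on the host require CAP_SYS_ADMIN.
        "-o", "modcaps=+sys_admin",
    ]


if __name__ == "__main__":
    argv = selinux_remap_args("/var/run/vm001-vhost-fs.sock", "/var/lib/fs/vm001")
    print(shlex.join(argv))
```
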

EXAMPLES

       Export /var/lib/fs/vm001/ on vhost-user UNIX domain socket
       /var/run/vm001-vhost-fs.sock:

          host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
          host# qemu-system-x86_64 \
                -chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \
                -device vhost-user-fs-pci,chardev=char0,tag=myfs \
                -object memory-backend-memfd,id=mem,size=4G,share=on \
                -numa node,memdev=mem \
                ...
          guest# mount -t virtiofs myfs /mnt

AUTHOR

       Stefan Hajnoczi <stefanha@redhat.com>, Masayoshi Mizuma
       <m.mizuma@jp.fujitsu.com>

       2023, The QEMU Project Developers

7.2.6                            Sep 26, 2023                     VIRTIOFSD(1)