VIRTIOFSD(1)                         QEMU                         VIRTIOFSD(1)

NAME

       virtiofsd - QEMU virtio-fs shared file system daemon

SYNOPSIS

       virtiofsd [OPTIONS]

DESCRIPTION

       Share a host directory tree with a guest through a virtio-fs device.
       This program is a vhost-user backend that implements the virtio-fs
       device.  Each virtio-fs device instance requires its own virtiofsd
       process.

       This program is designed to work with QEMU's --device vhost-user-fs-pci
       but should work with any virtual machine monitor (VMM) that supports
       vhost-user.  See the EXAMPLES section below.

       This program must be run as the root user.  It must be able to create
       and access files with any uid/gid, but it drops privileges where
       possible during startup:

       • The ability to invoke syscalls is limited using seccomp(2).

       • Linux capabilities(7) are dropped.

       In "namespace" sandbox mode the program switches into a new mount
       namespace and invokes pivot_root(2) to make the shared directory tree
       its root.  New pid and net namespaces are also created to isolate the
       process.

       In "chroot" sandbox mode the program invokes chroot(2) to make the
       shared directory tree its root.  This mode is intended for container
       environments where the container runtime has already set up the
       namespaces and the program does not have permission to create
       namespaces itself.

       Both sandbox modes prevent "file system escapes" due to symlinks and
       other file system objects that might lead to files outside the shared
       directory.

OPTIONS

       -h, --help
              Print help.

       -V, --version
              Print version.

       -d     Enable debug output.

       --syslog
              Print log messages to syslog instead of stderr.

       -o OPTION

              • debug - Enable debug output.

              • flock|no_flock - Enable/disable flock.  The default is
                no_flock.

              • modcaps=CAPLIST - Modify the list of capabilities allowed;
                CAPLIST is a colon-separated list of capabilities, each
                preceded by either + or -, e.g. "+sys_admin:-chown".

              • log_level=LEVEL - Print only log messages matching LEVEL or
                more severe.  LEVEL is one of err, warn, info, or debug.  The
                default is info.

              • posix_lock|no_posix_lock - Enable/disable remote POSIX locks.
                The default is no_posix_lock.

              • readdirplus|no_readdirplus - Enable/disable readdirplus.  The
                default is readdirplus.

              • sandbox=namespace|chroot - Sandbox mode.  The default is
                namespace.

                - namespace: Create mount, pid, and net namespaces and
                  pivot_root(2) into the shared directory.

                - chroot: chroot(2) into the shared directory (use in
                  containers).

              • source=PATH - Share host directory tree located at PATH.  This
                option is required.

              • timeout=TIMEOUT - I/O timeout in seconds.  The default depends
                on the cache mode (see --cache).

              • writeback|no_writeback - Enable/disable writeback cache.  The
                cache allows the FUSE client to buffer and merge write
                requests.  The default is no_writeback.

              • xattr|no_xattr - Enable/disable extended attributes (xattr) on
                files and directories.  The default is no_xattr.

              • posix_acl|no_posix_acl - Enable/disable POSIX ACL support.
                POSIX ACLs are disabled by default.

       --socket-path=PATH
              Listen on vhost-user UNIX domain socket at PATH.

       --socket-group=GROUP
              Set the vhost-user UNIX domain socket gid to GROUP.

       --fd=FDNUM
              Accept connections from vhost-user UNIX domain socket file
              descriptor FDNUM.  The file descriptor must already be listening
              for connections.

       --thread-pool-size=NUM
              Restrict the number of worker threads per request queue to NUM.
              The default is 64.

       --cache=none|auto|always
              Select the desired trade-off between coherency and performance.
              none forbids the FUSE client from caching to achieve best
              coherency at the cost of performance.  auto behaves like NFS,
              with a 1 second metadata cache timeout.  always sets a long
              cache lifetime at the expense of coherency.  The default is
              auto.

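       The --fd option expects a descriptor that is already listening.  One
       way a launcher might prepare such a descriptor is sketched below in
       Python.  This is an illustration under assumptions, not part of
       virtiofsd: the helper names (make_listening_socket, virtiofsd_argv,
       launch) and the socket path are invented here, and only --fd and
       -o source= are taken from the option list above.

```python
import os
import socket
import subprocess
import tempfile


def make_listening_socket(path):
    """Create a UNIX stream socket bound to `path` and already listening."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(1)
    # Python sockets are non-inheritable by default; pass_fds in launch()
    # is what makes the descriptor available to the child process.
    return sock


def virtiofsd_argv(fd, source):
    """Build an argument list using only options documented above."""
    return ["virtiofsd", f"--fd={fd}", "-o", f"source={source}"]


def launch(sock, source):
    """Start virtiofsd with the listening descriptor (needs root)."""
    argv = virtiofsd_argv(sock.fileno(), source)
    return subprocess.Popen(argv, pass_fds=[sock.fileno()])


if __name__ == "__main__":
    # Demonstration only: prepare the socket and print the command line
    # without starting the daemon.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "vhost-fs.sock")  # hypothetical path
        with make_listening_socket(path) as sock:
            print(virtiofsd_argv(sock.fileno(), "/var/lib/fs/vm001"))
```

       launch() is not called in the demonstration because the daemon has to
       run as root.  pass_fds keeps the descriptor open under the same number
       in the child, so the number given to --fd stays valid.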

EXTENDED ATTRIBUTE (XATTR) MAPPING

       By default the names of xattrs used by the client are passed through to
       the server file system.  This can be a problem where either those xattr
       names are used by something on the server (e.g. SELinux client/server
       confusion) or where virtiofsd is running in a container with
       restricted privileges where it cannot access some attributes.

   Mapping syntax
       A mapping of xattr names can be made using -o xattrmap=mapping where
       the mapping string consists of a series of rules.

       The first matching rule terminates the mapping.  The set of rules must
       include a terminating rule to match any remaining attributes at the
       end.

       Each rule consists of a number of fields separated with a separator
       that is the first non-white space character in the rule.  This
       separator must then be used for the whole rule.  White space may be
       added before and after each rule.

       Using ':' as the separator a rule is of the form:

       :type:scope:key:prepend:

       scope is one of:

       • 'client' - match 'key' against an xattr name from the client for
         setxattr/getxattr/removexattr

       • 'server' - match 'prepend' against an xattr name from the server
         for listxattr

       • 'all' - can be used to make a single rule where both the server
         and client matches are triggered.

       type is one of:

       • 'prefix' - is designed to prepend and strip a prefix; the modified
         attributes are then passed on to the client/server.

       • 'ok' - causes the rule set to be terminated when a match is found
         while allowing matching xattrs through unchanged.  It is intended
         both as a way of explicitly terminating the list of rules, and to
         allow some xattrs to skip following rules.

       • 'bad' - if a client tries to use a name matching 'key' it is denied
         using EPERM; when the server passes an attribute name matching
         'prepend' it is hidden.  In many ways its use is very like 'ok', as
         either an explicit terminator or for special handling of certain
         patterns.

       • 'unsupported' - if a client tries to use a name matching 'key' it is
         denied using ENOTSUP; when the server passes an attribute name
         matching 'prepend' it is hidden.  In many ways its use is very like
         'ok', as either an explicit terminator or for special handling of
         certain patterns.

       key is a string tested as a prefix on an attribute name originating on
       the client.  It may be empty, in which case a 'client' rule will always
       match on client names.

       prepend is a string tested as a prefix on an attribute name originating
       on the server, and used as a new prefix.  It may be empty, in which
       case a 'server' rule will always match on all names from the server.

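       The field layout described above can be sketched in Python.  This is
       a reading aid built only from the text of this section, not
       virtiofsd's own parser: the Rule type and the parse_rules name are
       invented for illustration, and the 'map' shorthand is deliberately
       not handled.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    type: str      # 'prefix', 'ok', 'bad' or 'unsupported'
    scope: str     # 'client', 'server' or 'all'
    key: str       # prefix tested against names coming from the client
    prepend: str   # prefix tested against names coming from the server


TYPES = {"prefix", "ok", "bad", "unsupported"}
SCOPES = {"client", "server", "all"}


def parse_rules(mapping: str) -> List[Rule]:
    """Split a mapping string into rules shaped like :type:scope:key:prepend:."""
    rules = []
    rest = mapping.lstrip()
    while rest:
        sep = rest[0]  # first non-white space character of this rule
        parts = rest[1:].split(sep, 4)
        if len(parts) < 5:
            raise ValueError(f"incomplete rule: {rest!r}")
        rtype, scope, key, prepend, rest = parts
        if rtype not in TYPES or scope not in SCOPES:
            raise ValueError(f"unknown type or scope: {rtype!r}, {scope!r}")
        rules.append(Rule(rtype, scope, key, prepend))
        rest = rest.lstrip()  # white space is allowed between rules
    return rules


for rule in parse_rules(":prefix:all::user.virtiofs.::bad:all:::"):
    print(rule)
```

       Because the separator is re-read at the start of every rule, each rule
       in a mapping string may use a different separator character.
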
       e.g.:
          :prefix:client:trusted.:user.virtiofs.:

          will match 'trusted.' attributes in client calls and prefix them
          before passing them to the server.

          :prefix:server::user.virtiofs.:

          will strip 'user.virtiofs.' from all server replies.

          :prefix:all:trusted.:user.virtiofs.:

          combines the previous two cases into a single rule.

          :ok:client:user.::

          will allow get/set xattr for 'user.' xattrs and ignore following
          rules.

          :ok:server::security.:

          will pass 'security.' xattrs in listxattr from the server and ignore
          following rules.

          :ok:all:::

          will terminate the rule search passing any remaining attributes in
          both directions.

          :bad:server::security.:

          would hide 'security.' xattrs in listxattr from the server.

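       The first-match behaviour of these rules can be modelled in a few
       lines of Python.  The model below follows the descriptions in this
       section only and is an assumption about semantics, not a copy of the
       daemon's code; the function names are invented, and raising
       ValueError when no rule matches is a choice made for the sketch.

```python
import errno

# Each rule is (type, scope, key, prepend), already split into fields.


def map_client_name(rules, name):
    """Map a name used by the client; return the name to use on the server."""
    for rtype, scope, key, prepend in rules:
        if scope in ("client", "all") and name.startswith(key):
            if rtype == "prefix":
                return prepend + name          # add the new prefix
            if rtype == "ok":
                return name                    # pass through unchanged
            if rtype == "bad":
                raise PermissionError(errno.EPERM, "xattr name denied", name)
            if rtype == "unsupported":
                raise OSError(errno.ENOTSUP, "xattr name unsupported", name)
    raise ValueError("rule set has no terminating rule")


def map_server_name(rules, name):
    """Map a name listed by the server; return None if it is hidden."""
    for rtype, scope, key, prepend in rules:
        if scope in ("server", "all") and name.startswith(prepend):
            if rtype == "prefix":
                return name[len(prepend):]     # strip the prefix again
            if rtype == "ok":
                return name
            return None                        # 'bad' and 'unsupported' hide it
    raise ValueError("rule set has no terminating rule")


rules = [
    ("prefix", "all", "trusted.", "user.virtiofs."),
    ("ok", "all", "", ""),
]
print(map_client_name(rules, "trusted.foo"))                # user.virtiofs.trusted.foo
print(map_server_name(rules, "user.virtiofs.trusted.foo"))  # trusted.foo
print(map_client_name(rules, "user.bar"))                   # user.bar
```
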
       A simpler 'map' type provides a shorter syntax for the common case:

       :map:key:prepend:

       The 'map' type adds a number of separate rules to add prepend as a
       prefix to the matched key (or all attributes if key is empty).  There
       may be at most one 'map' rule and it must be the last rule in the set.

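       The Mapping examples section states two equivalences for 'map': with
       an empty key it matches the two rules of example (1), and with the key
       'trusted.' it matches the four rules of example (2).  The Python
       sketch below generalises those two cases.  The names expand_map and
       to_mapping are invented here, and extending the pattern to other keys
       is an assumption rather than a statement about virtiofsd's internals.

```python
def expand_map(key, prepend):
    """Return (type, scope, key, prepend) rules equivalent to :map:key:prepend:."""
    if key == "":
        # Example (1): prefix everything, hide anything left unprefixed.
        return [
            ("prefix", "all", "", prepend),
            ("bad", "all", "", ""),
        ]
    # Example (2): prefix only names starting with key.
    return [
        ("prefix", "all", key, prepend),   # add/strip the prefix
        ("bad", "server", "", key),        # hide unprefixed host attributes
        ("bad", "client", prepend, ""),    # block direct use of the target
        ("ok", "all", "", ""),             # let everything else through
    ]


def to_mapping(rules, sep=":"):
    """Join rules back into a mapping string using sep as field separator."""
    return "".join(sep + sep.join(rule) + sep for rule in rules)


print(to_mapping(expand_map("", "user.virtiofs.")))
print(to_mapping(expand_map("trusted.", "user.virtiofs."), sep="/"))
```

       The first call prints the mapping string of example (1); the second
       prints the four rules of example (2) on a single line.
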
       Note: When the 'security.capability' xattr is remapped, the daemon has
       to do extra work to remove it during many operations, which the host
       kernel normally does itself.

   Security considerations
       Operating systems typically partition the xattr namespace using
       well-defined name prefixes.  Each partition may have different access
       controls applied.  For example, on Linux there are multiple
       partitions:

       • system.* - access varies depending on attribute and file system

       • security.* - only processes with CAP_SYS_ADMIN

       • trusted.* - only processes with CAP_SYS_ADMIN

       • user.* - any process granted by file permissions / ownership

       Other operating systems, such as FreeBSD, have different name prefixes
       and access control rules.

       When remapping attributes on the host, it is important to ensure that
       the remapping does not allow a guest user to evade the guest access
       control rules.

       Consider the case where trusted.* from the guest is remapped to
       user.virtiofs.trusted.* on the host.  An unprivileged user in a Linux
       guest has the ability to write to xattrs under user.*.  Thus the user
       can evade the access control restriction on trusted.* by instead
       writing to user.virtiofs.trusted.*.

       As noted above, the partitions used and the access controls applied
       vary across guest operating systems, so it is not wise to try to
       predict what the guest OS will use.

       The simplest way to avoid an insecure configuration is to remap all
       xattrs at once, to a given fixed prefix.  This is shown in example (1)
       below.

       If selectively mapping only a subset of xattr prefixes, then rules must
       be added to explicitly block direct access to the target of the
       remapping.  This is shown in example (2) below.

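       The bypass described above can be made concrete with a small Python
       model of client-side matching.  It is a simplified sketch based on the
       rule descriptions in this document, with invented names, and it
       returns None for a refused request instead of an error code.

```python
def client_to_server(rules, name):
    """Return the server-side name for a client xattr name, or None if refused."""
    for rtype, scope, key, prepend in rules:
        if scope in ("client", "all") and name.startswith(key):
            if rtype == "prefix":
                return prepend + name
            if rtype == "ok":
                return name
            return None  # 'bad' / 'unsupported': the request is refused
    return None


# Prefix rule plus a pass-through terminator, with no blocking rule.
unsafe = [
    ("prefix", "all", "trusted.", "user.virtiofs."),
    ("ok", "all", "", ""),
]
# The same mapping with the blocking rules from example (2).
safe = [
    ("prefix", "all", "trusted.", "user.virtiofs."),
    ("bad", "server", "", "trusted."),
    ("bad", "client", "user.virtiofs.", ""),
    ("ok", "all", "", ""),
]

privileged = client_to_server(unsafe, "trusted.secret")
forged = client_to_server(unsafe, "user.virtiofs.trusted.secret")
print(privileged == forged)   # True: both reach the same host attribute
print(client_to_server(safe, "user.virtiofs.trusted.secret"))  # None
```

       With the unsafe rule set, a name that only needs user.* permissions in
       the guest lands on the same host attribute as the privileged
       trusted.* name.  The third rule of the safe set refuses it.
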
   Mapping examples
       1. Prefix all attributes with 'user.virtiofs.'

          -o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"

       This uses two rules, using : as the field separator; the first rule
       prefixes and strips 'user.virtiofs.', the second rule hides any
       non-prefixed attributes that the host set.

       This is equivalent to the 'map' rule:

          -o xattrmap=":map::user.virtiofs.:"

       2. Prefix 'trusted.' attributes, allow others through

          "/prefix/all/trusted./user.virtiofs./
           /bad/server//trusted./
           /bad/client/user.virtiofs.//
           /ok/all///"

       Here there are four rules, using / as the field separator, and also
       demonstrating that new lines can be included between rules.  The first
       rule is the prefixing of 'trusted.' and stripping of 'user.virtiofs.'.
       The second rule hides unprefixed 'trusted.' attributes on the host.
       The third rule stops a guest from explicitly setting the
       'user.virtiofs.' path directly, to prevent access control bypass on
       the target of the earlier prefix remapping.  Finally, the fourth rule
       lets all remaining attributes through.

       This is equivalent to the 'map' rule:

          -o xattrmap="/map/trusted./user.virtiofs./"

       3. Hide 'security.' attributes, and allow everything else

          "/bad/all/security./security./
           /ok/all///"

       The first rule combines what could be separate client and server rules
       into a single 'all' rule, matching 'security.' in either client
       arguments or lists returned from the host.  This stops the client
       seeing any 'security.' attributes on the server and stops it setting
       any.

EXAMPLES

       Export /var/lib/fs/vm001/ on vhost-user UNIX domain socket
       /var/run/vm001-vhost-fs.sock:

          host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
          host# qemu-system-x86_64 \
                -chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \
                -device vhost-user-fs-pci,chardev=char0,tag=myfs \
                -object memory-backend-memfd,id=mem,size=4G,share=on \
                -numa node,memdev=mem \
                ...
          guest# mount -t virtiofs myfs /mnt

AUTHOR

       Stefan Hajnoczi <stefanha@redhat.com>, Masayoshi Mizuma
       <m.mizuma@jp.fujitsu.com>

COPYRIGHT

       2022, The QEMU Project Developers

6.2.0                            Jun 11, 2022                     VIRTIOFSD(1)