datalad create-sibling-ria(1)  General Commands Manual  datalad create-sibling-ria(1)

NAME

       datalad create-sibling-ria - creates a sibling to a dataset in a RIA
       store

SYNOPSIS

       datalad create-sibling-ria [-h] -s NAME [-d DATASET]
              [--storage-name NAME] [--alias ALIAS] [--post-update-hook]
              [--shared {false|true|umask|group|all|world|everybody|0xxx}]
              [--group GROUP] [--storage-sibling MODE] [--existing MODE]
              [--new-store-ok] [--trust-level TRUST-LEVEL] [-r] [-R LEVELS]
              [--no-storage-sibling]
              [--push-url ria+<ssh|file>://<host>[/path]] [--version]
              ria+<ssh|file|http(s)>://<host>[/path]

DESCRIPTION

       Communication with a dataset in a RIA store is implemented via two
       siblings: a regular Git remote (repository sibling) and a git-annex
       special remote for data transfer (storage sibling), with the former
       having a publication dependency on the latter. By default, the name of
       the storage sibling is derived from the repository sibling's name by
       appending "-storage".

       The store's base path is expected to not exist, be an empty directory,
       or be a valid RIA store.

   Notes

   *RIA URL format*
       Interactions with new or existing RIA stores require RIA URLs to
       identify the store or specific datasets inside of it.

       The general structure of a RIA URL pointing to a store takes the form
       ``ria+[scheme]://<storelocation>`` (e.g.,
       ``ria+ssh://[user@]hostname:/absolute/path/to/ria-store`` or
       ``ria+file:///absolute/path/to/ria-store``).

       The general structure of a RIA URL pointing to a dataset in a store
       (for example, for cloning) takes a similar form, but appends either the
       dataset's UUID or a "~" symbol followed by the dataset's alias name:
       ``ria+[scheme]://<storelocation>#<dataset-UUID>`` or
       ``ria+[scheme]://<storelocation>#~<aliasname>``. In addition, a
       specific version identifier can be appended to the URL with an
       additional "@" symbol:
       ``ria+[scheme]://<storelocation>#<dataset-UUID>@<dataset-version>``,
       where ``dataset-version`` refers to a branch or tag.
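       The URL forms above can be assembled mechanically. The following is
       a minimal sketch in plain Python, not DataLad's own code; the helper
       names ``ria_store_url`` and ``ria_dataset_url`` are hypothetical and
       exist only for illustration.

```python
def ria_store_url(scheme, location):
    """Return a RIA URL pointing at a store (hypothetical helper)."""
    if scheme not in ("ssh", "file", "http", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    # For 'file', location is an absolute path, which yields 'ria+file:///...'.
    return f"ria+{scheme}://{location}"


def ria_dataset_url(store_url, dataset_id=None, alias=None, version=None):
    """Append '#<UUID>' or '#~<alias>', plus an optional '@<version>'."""
    if (dataset_id is None) == (alias is None):
        raise ValueError("give exactly one of dataset_id or alias")
    fragment = dataset_id if dataset_id is not None else f"~{alias}"
    if version is not None:
        # The text above shows '@<version>' only with the UUID form;
        # allowing it after an alias as well is an assumption of this sketch.
        fragment += f"@{version}"
    return f"{store_url}#{fragment}"


store = ria_store_url("file", "/absolute/path/to/ria-store")
by_alias = ria_dataset_url(store, alias="myds")
by_id = ria_dataset_url(
    store, dataset_id="12468afe-59ec-11ea-93d7-f0d5bf7b5561", version="main"
)
```

       Here ``myds`` and ``main`` are placeholder alias and branch names.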

   *RIA store layout*
       A RIA store is a directory tree with a dedicated subdirectory for each
       dataset in the store. The subdirectory name is constructed from the
       DataLad dataset ID, e.g. ``124/68afe-59ec-11ea-93d7-f0d5bf7b5561``,
       where the first three characters of the ID are used for an intermediate
       subdirectory in order to mitigate file system limitations for stores
       containing a large number of datasets.
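       The mapping from dataset ID to subdirectory can be sketched as
       follows. This is an illustration only; ``dataset_subdir`` is a
       hypothetical helper, and the full ID is reconstructed from the example
       path above.

```python
from pathlib import PurePosixPath


def dataset_subdir(dataset_id):
    """Return the store-relative directory for a dataset ID.

    The first three characters form the intermediate directory and the
    remainder forms the dataset directory inside it.
    """
    if len(dataset_id) <= 3:
        raise ValueError("dataset ID is too short to split")
    return PurePosixPath(dataset_id[:3]) / dataset_id[3:]


subdir = dataset_subdir("12468afe-59ec-11ea-93d7-f0d5bf7b5561")
print(subdir)  # prints 124/68afe-59ec-11ea-93d7-f0d5bf7b5561
```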

       By default, a dataset in a RIA store consists of two components: a Git
       repository (for all dataset contents stored in Git) and a storage
       sibling (for dataset content stored in git-annex).

       It is possible to selectively disable either component using
       ``storage-sibling='only'`` (no Git repository) or
       ``storage-sibling='off'`` (no storage sibling). If neither component
       is disabled, a dataset's subdirectory layout in a RIA store contains a
       standard bare Git repository and an ``annex/`` subdirectory inside of
       it. The latter holds a git-annex object store and comprises the
       storage sibling. Disabling the standard Git remote
       (``storage-sibling='only'``) results in not having the bare Git
       repository; disabling the storage sibling (``storage-sibling='off'``)
       results in not having the ``annex/`` subdirectory.

       Optionally, there can be a further subdirectory ``archives`` with
       (compressed) 7z archives of annex objects. The storage remote is able
       to pull annex objects from these archives if it cannot find them in
       the regular annex object store. This feature can be useful for storing
       large collections of rarely changing data on systems that limit the
       number of files that can be stored.

       Each dataset directory also contains a ``ria-layout-version`` file that
       identifies the data organization (as, for example, described above).

       Lastly, there is a global ``ria-layout-version`` file at the store's
       base path that identifies where dataset subdirectories themselves are
       located. At present, this file must contain a single line stating the
       version (currently "1"). This line MUST end with a newline character.

       It is possible to define an alias for an individual dataset in a store
       by placing a symlink to the dataset location into an ``alias/``
       directory in the root of the store. This enables dataset access via
       URLs of the format ``ria+<protocol>://<storelocation>#~<aliasname>``.
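       As a rough illustration of the alias mechanism, the sketch below
       creates such a symlink by hand in a throwaway directory. It is not
       DataLad's implementation (the ``--alias`` option takes care of this in
       practice); ``add_alias`` is a hypothetical helper, and the use of a
       relative link target is an assumption.

```python
import os
import tempfile
from pathlib import Path


def add_alias(store_root, dataset_id, alias):
    """Create <store>/alias/<alias> pointing at the dataset directory."""
    store_root = Path(store_root)
    target = store_root / dataset_id[:3] / dataset_id[3:]
    alias_dir = store_root / "alias"
    alias_dir.mkdir(exist_ok=True)
    link = alias_dir / alias
    # Assumption: a relative target, so the link survives moving the store.
    os.symlink(os.path.relpath(target, alias_dir), link)
    return link


# Demonstrate on a temporary stand-in for a store (no real RIA store needed).
with tempfile.TemporaryDirectory() as tmp:
    ds_id = "12468afe-59ec-11ea-93d7-f0d5bf7b5561"
    ds_dir = Path(tmp) / ds_id[:3] / ds_id[3:]
    ds_dir.mkdir(parents=True)
    link = add_alias(tmp, ds_id, "myds")
    link_target = os.readlink(link)
    resolves_to_dataset = link.resolve() == ds_dir.resolve()
```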

       Compared to standard git-annex object stores, the ``annex/``
       subdirectories used as storage siblings follow a different layout
       naming scheme ('dirhashmixed' instead of 'dirhashlower'). This is
       mostly noted as a technical detail, but it also serves to remind
       git-annex power users to refrain from running git-annex commands
       directly in-store, as doing so can cause severe damage due to the
       layout difference. Interactions should be handled via the ORA special
       remote instead.

   *Error logging*
       To enable error logging at the remote end, append a pipe symbol and an
       "l" to the version number in ria-layout-version (like so: ``1|l\n``).

       Error logging will create files in an "error_log" directory whenever
       the git-annex special remote (storage sibling) raises an exception,
       storing its Python traceback. The log files are named according to the
       scheme ``<dataset id>.<annex uuid of the remote>.log``, showing "who"
       ran into this issue with which dataset. Because logging can potentially
       leak personal data (such as local file paths), it can be disabled
       client-side by setting the configuration variable
       ``annex.ora-remote.<storage-sibling-name>.ignore-remote-config``.

OPTIONS

       ria+<ssh|file|http(s)>://<host>[/path]
              URL identifying the target RIA store and access protocol. If
              ``--push-url`` is given in addition, this URL is used for read
              access only. Otherwise it is used for write access too, and to
              create the repository sibling in the RIA store. Note that
              HTTP(S) is currently valid for consumption only, so
              ``--push-url`` must also be provided in that case. Constraints:
              value must be a string or value must be NONE

       -h, --help, --help-np
              show this help message. --help-np forcefully disables the use of
              a pager for displaying the help message

       -s NAME, --name NAME
              Name of the sibling. With --recursive, the same name will be
              used to label all the subdatasets' siblings. Constraints: value
              must be a string or value must be NONE

       -d DATASET, --dataset DATASET
              specify the dataset to process. If no dataset is given, an
              attempt is made to identify the dataset based on the current
              working directory. Constraints: Value must be a Dataset or a
              valid identifier of a Dataset (e.g. a path) or value must be
              NONE

       --storage-name NAME
              Name of the storage sibling (git-annex special remote). Must not
              be identical to the sibling name. If not specified, defaults to
              the sibling name plus a '-storage' suffix. If only a storage
              sibling is created, this setting is ignored, and the primary
              sibling name is used. Constraints: value must be a string or
              value must be NONE

       --alias ALIAS
              Alias for the dataset in the RIA store. Adds the necessary
              symlink so that this dataset can be cloned from the RIA store
              using the given ALIAS instead of its ID. With --recursive, only
              the top dataset will be aliased. Constraints: value must be a
              string or value must be NONE

       --post-update-hook
              Enable Git's default post-update-hook for the created sibling.
              This is useful when the sibling is made accessible via a "dumb
              server" that requires running 'git update-server-info' to let
              Git interact properly with it.

       --shared {false|true|umask|group|all|world|everybody|0xxx}
              If given, configures the permissions in the RIA store for
              multi-user access. Possible values for this option are identical
              to those of `git init --shared` and are described in its
              documentation. Constraints: value must be a string or value must
              be convertible to type bool or value must be NONE

       --group GROUP
              Filesystem group for the repository. Specifying the group is
              crucial when --shared=group. Constraints: value must be a string
              or value must be NONE

       --storage-sibling MODE
              By default, an ORA storage sibling and a Git repository sibling
              are created (on). Alternatively, creation of the storage sibling
              can be disabled (off), or only a storage sibling can be created,
              without a Git sibling (only). In the latter mode, no Git
              installation is required on the target host. Constraints: value
              must be one of ('only',) or value must be convertible to type
              bool or value must be NONE [Default: True]

       --existing MODE
              Action to perform if a (storage) sibling is already configured
              under the given name and/or a target already exists. In this
              case, a dataset can be skipped ('skip'), an existing target
              repository can be forcefully re-initialized and the sibling
              (re-)configured ('reconfigure'), or the command can be
              instructed to fail ('error'). Constraints: value must be one of
              ('skip', 'error', 'reconfigure') [Default: 'error']

       --new-store-ok
              When set, a new store will be created, if necessary. Otherwise,
              a sibling will only be created if the URL points to an existing
              RIA store.

       --trust-level TRUST-LEVEL
              specify a trust level for the storage sibling. If not specified,
              the default git-annex trust level is used. 'trust' should be
              used with care (see the git-annex-trust man page). Constraints:
              value must be one of ('trust', 'semitrust', 'untrust')

       -r, --recursive
              if set, recurse into potential subdatasets.

       -R LEVELS, --recursion-limit LEVELS
              limit recursion into subdatasets to the given number of levels.
              Constraints: value must be convertible to type 'int' or value
              must be NONE

       --no-storage-sibling
              This option is deprecated. Use '--storage-sibling off' instead.

       --push-url ria+<ssh|file>://<host>[/path]
              URL identifying the target RIA store and access protocol for
              write access to the storage sibling. If given, this will also be
              used for creation of the repository sibling in the RIA store.
              Constraints: value must be a string or value must be NONE

       --version
              show the module and its version which provides the command
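       As an illustration of how these options combine, the sketch below
       assembles a command line from a few of them and enforces the rule that
       an HTTP(S) store URL needs ``--push-url``. It only builds the argument
       list and does not run DataLad; ``create_sibling_ria_argv`` is a
       hypothetical helper, and the sibling name, alias and paths are
       placeholders.

```python
import shlex


def create_sibling_ria_argv(name, url, *, dataset=None, alias=None,
                            push_url=None, new_store_ok=False,
                            recursive=False):
    """Assemble an argument list for 'datalad create-sibling-ria'."""
    if not url.startswith("ria+"):
        raise ValueError("expected a RIA URL starting with 'ria+'")
    # HTTP(S) is valid for consumption only, so a push URL is required.
    if url.startswith(("ria+http://", "ria+https://")) and push_url is None:
        raise ValueError("an HTTP(S) store URL requires --push-url")
    argv = ["datalad", "create-sibling-ria", "-s", name]
    if dataset is not None:
        argv += ["-d", dataset]
    if alias is not None:
        argv += ["--alias", alias]
    if new_store_ok:
        argv.append("--new-store-ok")
    if recursive:
        argv.append("-r")
    if push_url is not None:
        argv += ["--push-url", push_url]
    argv.append(url)  # the store URL is the positional argument
    return argv


argv = create_sibling_ria_argv(
    "ria-backup",
    "ria+file:///absolute/path/to/ria-store",
    alias="myds",
    new_store_ok=True,
)
print(shlex.join(argv))
```

       The resulting list could be handed to ``subprocess.run`` on a machine
       where DataLad is installed; that step is left out here on purpose.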

AUTHORS

       datalad is developed by The DataLad Team and Contributors
       <team@datalad.org>.

datalad create-sibling-ria 0.19.3    2023-08-11    datalad create-sibling-ria(1)