datalad create(1)          General Commands Manual          datalad create(1)

NAME
       datalad create - create a new dataset from scratch.

SYNOPSIS
       datalad create [-h] [-f] [-D DESCRIPTION] [-d DATASET] [--no-annex]
              [--fake-dates] [-c PROC] [--version] [PATH] ...

DESCRIPTION
       This command initializes a new dataset at a given location, or in the
       current directory. The new dataset can optionally be registered in an
       existing superdataset. For that, the new dataset's path must be located
       within the superdataset, and the superdataset must be given explicitly
       via --dataset. It is recommended to provide a brief description that
       labels the dataset's nature *and* location, e.g. "Michael's music on
       black laptop". This helps humans identify data locations in distributed
       scenarios. By default, an identifier composed of the user name, the
       machine name, and the path is generated.

       This command only creates a new dataset; it does not add existing
       content to it, even if the target directory already contains additional
       files or directories.

       Plain Git repositories can be created via --no-annex. However, the
       result will not be a full dataset, and, consequently, not all features
       are supported (e.g. a description).

       To create a local version of a remote dataset, use the `install`
       command instead.

       NOTE   Power-user info: This command uses git init and git annex init
              to prepare the new dataset. Registering it in a superdataset is
              performed via a git submodule add operation in the discovered
              superdataset.
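
The registration step described in the note can be approximated with plain Git. The following is a minimal sketch, not DataLad's actual implementation: the directory names super and sub and the demo identity are made up for illustration, and the git annex init step is left out so that the script runs without git-annex installed.

```shell
#!/bin/sh
# Sketch only: mimics "create, then register in a superdataset" with plain Git.
# The names "super" and "sub" are hypothetical.
set -eu
work=$(mktemp -d)

# Prepare the would-be superdataset with one commit.
git init -q "$work/super"
git -C "$work/super" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init superdataset"

# Prepare the new repository inside the superdataset's tree.
# It needs a commit before it can be registered as a submodule.
git init -q "$work/super/sub"
git -C "$work/super/sub" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init subdataset"

# Register the existing repository in the superdataset.
git -C "$work/super" submodule add ./sub sub

# The registration is recorded in .gitmodules and staged as a gitlink.
cat "$work/super/.gitmodules"
```

The new repository has to live inside the superdataset's directory tree for this to work, which matches the requirement stated in the description above.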

   Examples
       Create a dataset 'mydataset' in the current directory:

           % datalad create mydataset

       Apply the text2git procedure upon creation of a dataset:

           % datalad create -c text2git mydataset

       Create a subdataset in the root of an existing dataset:

           % datalad create -d . mysubdataset

       Create a dataset in an existing, non-empty directory:

           % datalad create --force

       Create a plain Git repository:

           % datalad create --no-annex mydataset

OPTIONS
       PATH   path where the dataset shall be created; directories will be
              created as necessary. If no location is provided, a dataset will
              be created in the location specified by --dataset (if given) or
              the current working directory. Either way, the command will
              error if the target directory is not empty. Use --force to
              create a dataset in a non-empty directory. Constraints: value
              must be a string, a Dataset or a valid identifier of a Dataset
              (e.g. a path), or NONE

       INIT OPTIONS
              options to pass to git init. Any argument specified after the
              destination path of the repository will be passed to git-init
              as-is. Note that not all options will lead to viable results.
              For example, '--bare' will not yield a repository where DataLad
              can adjust files in its working tree.
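
Why '--bare' is a poor fit can be seen with plain Git alone. The sketch below involves no DataLad; the directory names are arbitrary. It compares a regular repository with a bare one, which has no working tree to place files in.

```shell
#!/bin/sh
# Plain-Git illustration; directory names are arbitrary.
set -eu
work=$(mktemp -d)

# A regular repository keeps its database in .git next to a working tree.
git init -q "$work/regular"
regular_is_bare=$(git -C "$work/regular" rev-parse --is-bare-repository)

# A bare repository is only the database, without a working tree.
git init -q --bare "$work/bare.git"
bare_is_bare=$(git -C "$work/bare.git" rev-parse --is-bare-repository)

echo "regular:  bare=$regular_is_bare"
echo "bare.git: bare=$bare_is_bare"
```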

       -h, --help, --help-np
              show this help message. --help-np forcefully disables the use of
              a pager for displaying the help message

       -f, --force
              enforce creation of a dataset in a non-empty directory.

       -D DESCRIPTION, --description DESCRIPTION
              short description to use for a dataset location. Its primary
              purpose is to help humans identify a dataset copy (e.g.,
              "mike's dataset on lab server"). Note that when a dataset is
              published, this information becomes available on the remote
              side. Constraints: value must be a string or NONE

       -d DATASET, --dataset DATASET
              specify the dataset to perform the create operation on. If a
              dataset is given along with PATH, a new subdataset will be
              created in it at the PATH provided to the create command. If a
              dataset is given but PATH is unspecified, a new dataset will be
              created at the location specified by this option. Constraints:
              value must be a Dataset or a valid identifier of a Dataset (e.g.
              a path), or NONE

       --no-annex
              if set, a plain Git repository will be created without any
              annex.

       --fake-dates
              configure the repository to use fake dates. The date for a new
              commit will be set to one second later than the latest commit in
              the repository. This can be used to anonymize dates.
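
The "one second later than the latest commit" rule can be illustrated with plain Git. This is a sketch of the rule only, not of how DataLad implements --fake-dates: the starting timestamp is arbitrary, and commit_at is a hypothetical helper defined just for this example.

```shell
#!/bin/sh
# Sketch of the dating rule only; not DataLad's implementation.
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Hypothetical helper: commit with explicit author and committer timestamps
# (seconds since the epoch, UTC).
commit_at() {
    GIT_AUTHOR_DATE="$1 +0000" GIT_COMMITTER_DATE="$1 +0000" \
        git -c user.name=demo -c user.email=demo@example.com \
            commit -q --allow-empty -m "$2"
}

# Arbitrary starting point for the first commit.
commit_at 1112911993 "first commit"

# Date the next commit one second after the latest one.
latest=$(git log -1 --format=%ct)
next=$((latest + 1))
commit_at "$next" "second commit"

# List committer timestamps, newest first.
git log --format='%ct %s'
```

Because each timestamp is derived from the previous commit rather than from the clock, the history carries no information about when the work actually happened.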

       -c PROC, --cfg-proc PROC
              run cfg_PROC procedure(s) on the created dataset (can be
              specified multiple times). Use run-procedure --discover to get a
              list of available procedures, such as cfg_text2git.

       --version
              show the module that provides the command, and its version

AUTHORS
       datalad is developed by The DataLad Team and Contributors
       <team@datalad.org>.

datalad create 0.19.3              2023-08-11              datalad create(1)