datalad-run(1)

datalad run(1)              General Commands Manual             datalad run(1)

NAME

       datalad run - run an arbitrary shell command and record its impact on a
       dataset.

SYNOPSIS

       datalad run [-h] [-d DATASET] [-i PATH] [-o PATH]
              [--expand {inputs|outputs|both}]
              [--assume-ready {inputs|outputs|both}] [--explicit]
              [-m MESSAGE] [--sidecar {yes|no}] [--dry-run {basic|command}]
              [-J NJOBS] [--version] ...

DESCRIPTION

       It is recommended to craft the command such that it can run in the root
       directory of the dataset that the command will be recorded in. However,
       as long as the command is executed somewhere underneath the dataset
       root, the exact location will be recorded relative to the dataset root.

       If the executed command did not alter the dataset in any way, no record
       of the command execution is made.

       If the given command errors, a CommandError exception with the same
       exit code will be raised, and no modifications will be saved. By
       default, command execution will not be attempted if an error occurred
       during input or output preparation. This default ``stop`` behavior can
       be overridden via --on-failure ....

       In the presence of subdatasets, the full dataset hierarchy will be
       checked for unsaved changes prior to command execution, and changes in
       any dataset will be saved after execution. Any modification of
       subdatasets is also saved in their respective superdatasets to capture
       a comprehensive record of the entire dataset hierarchy state. The
       associated provenance record is duplicated in each modified
       (sub)dataset, although it is only fully interpretable and re-executable
       in the actual top-level superdataset. For this reason the provenance
       record contains the dataset ID of that superdataset.

   Command format
       A few placeholders are supported in the command via Python format
       specification. "{pwd}" will be replaced with the full path of the
       current working directory. "{dspath}" will be replaced with the full
       path of the dataset that run is invoked on. "{tmpdir}" will be replaced
       with the full path of a temporary directory. "{inputs}" and "{outputs}"
       represent the values specified by --input and --output. If multiple
       values are specified, the values will be joined by a space. The order
       of the values will match the order given on the command line, with any
       globs expanded in alphabetical order (like bash). Individual values can
       be accessed with an integer index (e.g., "{inputs[0]}").
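
       The substitution rules above follow Python's str.format conventions, so
       they can be illustrated without DataLad itself. The sketch below is not
       DataLad's implementation: the SpaceJoined helper and all paths are
       hypothetical, chosen only to mimic the documented behavior that a
       multi-valued placeholder renders as a space-joined string while still
       allowing access by integer index.

```python
class SpaceJoined(list):
    """Hypothetical helper: a list that formats as its items joined by spaces."""

    def __format__(self, format_spec):
        return " ".join(self)


# Example values (assumptions for illustration, not DataLad output).
fields = {
    "pwd": "/home/me/ds/code",
    "dspath": "/home/me/ds",
    "inputs": SpaceJoined(["data/a.dat", "data/b.dat"]),
    "outputs": SpaceJoined(["out/result.txt"]),
}

whole = "cat {inputs} > {outputs}".format(**fields)
first = "head -n1 {inputs[0]}".format(**fields)
where = "ls {dspath}".format(**fields)

print(whole)  # cat data/a.dat data/b.dat > out/result.txt
print(first)  # head -n1 data/a.dat
print(where)  # ls /home/me/ds
```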

       Note that the representation of the inputs or outputs in the formatted
       command string depends on whether the command is given as a list of
       arguments or as a string (quotes surrounding the command). The
       concatenated list of inputs or outputs will be surrounded by quotes
       when the command is given as a list but not when it is given as a
       string. This means that the string form is required if you need to pass
       each input as a separate argument to a preceding script (i.e., write
       the command as "./script {inputs}", quotes included). The string form
       should also be used if the input or output paths contain spaces or
       other characters that need to be escaped.
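
       The practical effect of this quoting rule can be seen with Python's
       standard shlex module, which splits a command line the way a POSIX
       shell would. This only illustrates shell word splitting, not DataLad's
       internals; the script and file names are made up.

```python
import shlex

# List form: the joined inputs arrive wrapped in quotes, i.e. as ONE argument.
quoted = shlex.split("./script 'a.dat b.dat'")

# String form: the joined inputs are left bare, so each is its own argument.
bare = shlex.split("./script a.dat b.dat")

print(quoted)  # ['./script', 'a.dat b.dat']
print(bare)    # ['./script', 'a.dat', 'b.dat']
```

       In the first case the script would receive a single file name that
       contains a space, which is rarely what is intended.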

       To escape a brace character, double it (i.e., "{{" or "}}").
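
       Brace doubling matters for commands that use braces themselves, such as
       awk programs. A minimal sketch using plain Python string formatting
       (the awk command and the file name are illustrative assumptions):

```python
# Doubled braces come out as literal single braces; {inputs} is substituted.
template = "awk '{{print $1}}' {inputs}"
command = template.format(inputs="table.txt")

print(command)  # awk '{print $1}' table.txt
```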

       Custom placeholders can be added as configuration variables under
       "datalad.run.substitutions". As an example:

       Add a placeholder "name" with the value "joe"::

         % datalad configuration --scope branch set datalad.run.substitutions.name=joe
         % datalad save -m "Configure name placeholder" .datalad/config

       Access the new placeholder in a command::

         % datalad run "echo my name is {name} >me"

   Examples
       Run an executable script and record the impact on a dataset::

        % datalad run -m 'run my script' 'code/script.sh'

       Run a command and specify a directory as a dependency for the run. The
       contents of the dependency will be retrieved prior to running the
       script::

        % datalad run -m 'run my script' -i 'data/*' 'code/script.sh'

       Run an executable script and specify output files of the script to be
       unlocked prior to running the script::

        % datalad run -m 'run my script' -i 'data/*' -o 'output_dir/*' \
            'code/script.sh'

       Specify multiple inputs and outputs::

        % datalad run -m 'run my script' -i 'data/*' -i 'datafile.txt' \
            -o 'output_dir/*' -o 'outfile.txt' 'code/script.sh'

       Use ** to match any file at any directory depth recursively. A single *
       does not check files within matched directories::

        % datalad run -m 'run my script' -i 'data/**/*.dat' \
            -o 'output_dir/**' 'code/script.sh'
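
       The difference between * and ** in the last example can be reproduced
       with Python's standard glob module, which follows the same convention
       when recursive matching is enabled. This is a sketch only: whether
       DataLad uses this module internally is not stated here, and the
       directory layout is a hypothetical one created in a temporary
       directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one file directly in data/, one in a subdirectory.
    os.makedirs(os.path.join(root, "data", "sub"))
    for rel in ("data/a.dat", "data/sub/b.dat"):
        open(os.path.join(root, rel), "w").close()

    def matches(pattern):
        """Return sorted matches for pattern, relative to the temporary root."""
        found = glob.glob(os.path.join(root, pattern), recursive=True)
        return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in found)

    shallow = matches("data/*.dat")    # single *: stays within data/
    deep = matches("data/**/*.dat")    # **: descends to any depth

print(shallow)  # ['data/a.dat']
print(deep)     # ['data/a.dat', 'data/sub/b.dat']
```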

OPTIONS

       COMMAND
              command for execution. A leading '--' can be used to
              disambiguate this command from the preceding options to DataLad.

       -h, --help, --help-np
              show this help message. --help-np forcefully disables the use of
              a pager for displaying the help message

       -d DATASET, --dataset DATASET
              specify the dataset to record the command results in. An attempt
              is made to identify the dataset based on the current working
              directory. If a dataset is given, the command will be executed in
              the root directory of this dataset. Constraints: Value must be a
              Dataset or a valid identifier of a Dataset (e.g. a path) or
              value must be NONE

       -i PATH, --input PATH
              A dependency for the run. Before running the command, the
              content for this relative path will be retrieved. A value of "."
              means "run datalad get .". The value can also be a glob. This
              option can be given more than once.

       -o PATH, --output PATH
              Prepare this relative path to be an output file of the command.
              A value of "." means "run datalad unlock ." (and will fail if
              some content isn't present). For any other value, if the content
              of this file is present, unlock the file. Otherwise, remove it.
              The value can also be a glob. This option can be given more than
              once.

       --expand {inputs|outputs|both}
              Expand globs when storing inputs and/or outputs in the commit
              message. Constraints: value must be one of ('inputs', 'outputs',
              'both')

       --assume-ready {inputs|outputs|both}
              Assume that inputs do not need to be retrieved and/or outputs do
              not need to be unlocked or removed before running the command.
              This option allows you to avoid the expense of these preparation
              steps if you know that they are unnecessary. Constraints: value
              must be one of ('inputs', 'outputs', 'both')

       --explicit
              Consider the specification of inputs and outputs to be explicit.
              Don't warn if the repository is dirty, and only save
              modifications to the listed outputs.

       -m MESSAGE, --message MESSAGE
              a description of the state or the changes made to a dataset.
              Constraints: value must be a string or value must be NONE

       --sidecar {yes|no}
              By default, the configuration variable
              'datalad.run.record-sidecar' determines whether a record with
              information on a command's execution is placed into a separate
              record file instead of the commit message (default: off). This
              option can be used to override the configured behavior on a
              case-by-case basis. Sidecar files are placed into the dataset's
              '.datalad/runinfo' directory (customizable via the
              'datalad.run.record-directory' configuration variable).
              Constraints: value must be NONE or value must be convertible to
              type bool

       --dry-run {basic|command}
              Do not run the command; just display details about the command
              execution. A value of "basic" reports a few important details
              about the execution, including the expanded command and expanded
              inputs and outputs. "command" displays the expanded command
              only. Note that input and output globs underneath an uninstalled
              dataset will be left unexpanded because no subdatasets will be
              installed for a dry run. Constraints: value must be one of
              ('basic', 'command')

       -J NJOBS, --jobs NJOBS
              how many parallel jobs (where possible) to use. "auto"
              corresponds to the number defined by the
              'datalad.runtime.max-annex-jobs' configuration item. NOTE: This
              option can only parallelize input retrieval (get) and output
              recording (save). DataLad does NOT parallelize your scripts for
              you. Constraints: value must be convertible to type 'int' or
              value must be NONE or value must be one of ('auto',)

       --version
              show the module and its version which provides the command

AUTHORS

       datalad is developed by The DataLad Team and Contributors
       <team@datalad.org>.

datalad run 0.19.3                2023-08-11                    datalad run(1)