Files
=====
DataLad-remake relies on text files to receive or record information
about the computations it should perform. Some files may need to be
provided by the user, and others are generated by DataLad-remake. Below
we provide an overview of these files and their formats.
Compute template
----------------
This TOML file is used as input for ``datalad make``. It defines a
command to be executed and a way to parameterize it, so that the same
command can compute different files with different parameters.
Two key-value pairs should be present in the file:

- ``parameters``: array. A list of parameter names that can be
  substituted into the command. Values for the parameters can be
  provided later with ``datalad make --parameter`` or
  ``datalad make --parameter-list``.
- ``command``: array. A list of arguments which constitute the command
  to be executed (similar to the ``args`` argument of Python's
  ``subprocess.run``). The available parameters (see above) can be
  used by placing their names in curly braces (``{someparam}``).

The file should be present in
``$DATASET/.datalad/remake/methods``. The name of the file (chosen
arbitrarily by the user) should be used as the input argument for the
``make`` command.
Input list
----------
An optional file which contains a list of input file patterns, used
with ``datalad make --input-list`` (as an alternative to specifying
individual files with ``--input``). This is useful if a large number
of input file patterns should be provided.
The format is one file pattern per line, given as a path relative to
DATASET. Empty lines, i.e. lines that contain only newlines, and lines
that start with ``#`` are ignored. Leading and trailing whitespace is
removed from each line.
The file patterns support `Python globbing`_. Globbing recurses into
subdatasets: if a pattern matches a subdataset path, the subdataset
will be installed, recursively if needed. As a result, patterns like
``**`` might pull in a huge number of datasets.
This file does not need to be saved in the dataset.
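
The line rules described above can be sketched in a few lines of
Python. The function name ``read_pattern_list`` and the example content
are hypothetical. The sketch also assumes that whitespace is stripped
before the ``#`` check, which the description does not state
explicitly.

```python
def read_pattern_list(text):
    """Return the patterns in a list file, skipping blanks and comments."""
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: whitespace is removed before checking for '#'.
        if not stripped or stripped.startswith("#"):
            continue
        patterns.append(stripped)
    return patterns


# Hypothetical list file content with a comment, a blank line, and padding.
example = "# hypothetical input list\nsub-01/anat/*.nii.gz\n\n  code/process.py  \n"
print(read_pattern_list(example))
```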
.. _`python globbing`: https://docs.python.org/3/library/glob.html
Output list
-----------
An optional file which contains a list of output file patterns, used
with ``datalad make --output-list`` (as an alternative to specifying
individual files with ``--output``). This is useful if a large number
of output file patterns should be provided.
The format is one file pattern per line, given as a path relative to
DATASET. Empty lines, i.e. lines that contain only newlines, and lines
that start with ``#`` are ignored. Leading and trailing whitespace is
removed from each line.
The file patterns support `Python globbing`_. Output globbing is
performed in the worktree after the computation.
This file does not need to be saved in the dataset.
Parameter list
--------------
An optional file which contains a list of parameters, used with
``datalad make --parameter-list`` (as an alternative to specifying
individual parameters with ``--parameter``). This is useful if a large
number of parameters should be provided.
The format is one ``<name>=<value>`` string per line. Empty lines,
i.e. lines that contain only newlines, and lines that start with ``#``
are ignored. Leading and trailing whitespace is removed from each
line.
This file does not need to be saved in the dataset.
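
A parameter list can be read along the following lines. This is a
sketch only: the function ``read_parameter_list`` and the example
parameters are hypothetical. It assumes that each line consists of a
name and a value separated by the first ``=``, and that whitespace is
stripped before the ``#`` check.

```python
def read_parameter_list(text):
    """Parse 'name=value' lines into a dict, skipping blanks and comments."""
    parameters = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Assumption: split on the first '=' so values may contain '='.
        name, separator, value = stripped.partition("=")
        if not separator:
            raise ValueError(f"not a name=value line: {stripped!r}")
        parameters[name] = value
    return parameters


# Hypothetical parameter list content.
example = "# hypothetical parameters\nname=world\n\n  expression=a=b  \n"
print(read_parameter_list(example))
```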
Compute specification
---------------------
This is a generated JSON file, created and saved in the dataset as a
result of running ``datalad make``. It is used to store computation
details (based on arguments provided to ``datalad make``) which can be
associated with one or more files.
It is placed under ``datalad/make/specifications``; its file name is
created by hashing its contents.
As a generated file, it should not require editing by the user.
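
The idea of naming a file after a hash of its contents can be sketched
as follows. The JSON fields, the function ``specification_name``, and
the choice of SHA-256 are assumptions made for illustration. The hash
function and the specification layout that DataLad-remake actually uses
are not described here.

```python
import hashlib
import json


def specification_name(spec):
    """Derive a file name from the serialized specification contents."""
    # Sorting keys makes the serialization, and thus the name, deterministic.
    serialized = json.dumps(spec, sort_keys=True).encode("utf-8")
    # Assumption: SHA-256 is used here only as an example hash function.
    return hashlib.sha256(serialized).hexdigest()


# Hypothetical specification fields.
spec = {"method": "hello", "parameters": {"name": "world"}}
name = specification_name(spec)
print(name)
```

With such a scheme, identical specifications map to the same file name,
so a single file can be associated with more than one computed file.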