datalad.api.make

datalad.api.make(dataset: DatasetParameter | None = None, *, template: str = '', label: str = '', prospective_execution: bool = False, branch: str | None = None, input: list[str] | None = None, input_list: Path | None = None, output: list[str] | None = None, output_list: Path | None = None, parameter: list[str] | None = None, parameter_list: Path | None = None, stdout: str | None = None, allow_untrusted_execution: bool = False) → Generator

Specify a computation and optionally execute it
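
A minimal sketch of a call, assuming DataLad and the datalad-remake extension are installed. The keyword names come from the signature above; the template name, label, file patterns, and parameter value are hypothetical placeholders, and the helper functions `build_make_kwargs` and `register_computation` exist only for this illustration.

```python
from pathlib import Path


def build_make_kwargs(dataset_path: Path) -> dict:
    """Assemble keyword arguments for datalad.api.make.

    Keyword names follow the documented signature; every value below is a
    hypothetical placeholder.
    """
    return {
        "dataset": str(dataset_path),
        "template": "to_upper",            # hypothetical template name
        "label": "uppercase-v1",           # hypothetical label
        "input": ["raw/*.txt"],            # hypothetical input pattern
        "output": ["derived/result.txt"],  # hypothetical output pattern
        "parameter": ["encoding=utf-8"],   # hypothetical <name>=<value> pair
        "prospective_execution": True,     # only register, do not compute now
        "on_failure": "stop",
        "return_type": "list",
    }


def register_computation(dataset_path: Path) -> list:
    """Register a computation without executing it."""
    # Imported lazily: requires DataLad with the datalad-remake extension.
    from datalad.api import make

    return make(**build_make_kwargs(dataset_path))
```

Because `prospective_execution` is set, the output pattern in this sketch is a plain file name rather than a glob, since output patterns are stored verbatim in that mode.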

Parameters

dataset: Dataset to be used as a configuration source. Beyond reading configuration items, this command does not interact with the dataset. [Default: None]
template: Name of the computing template (the template must be present in $DATASET/.datalad/make/methods/). [Default: ‘’]
label: Label of the computation. This is a user-defined name that is used to identify and prioritize computations if more than one computation is registered for a file. If no label is provided, the template name is used. Prioritization is done by reading the git configuration datalad.make.priority (which should contain a comma-separated list of labels). If this configuration key does not exist, the priority list is read from the file $DATASET/.datalad/make/priority. If that does not exist either, a random computation is chosen. [Default: ‘’]
prospective_execution (bool, optional): Do not perform the computation now; only register the compute instructions. A later datalad get <file> or git annex get <file> will trigger the computation. Note: if this option is provided, input and output patterns are stored verbatim. Input globbing is performed when the computation is triggered, but the names of the created output files will be the verbatim output pattern strings. [Default: False]
branch: Branch (or commit) to use for the computation. If not specified, HEAD is used. [Default: None]
input: An input file pattern (repeat for multiple inputs). File patterns support Python globbing. Globbing is performed by installing all possibly matching subdatasets and globbing in those, recursively, so expressions like ** might pull in a huge number of datasets. Input file patterns must be relative; they are resolved from the root of the dataset. [Default: None]
input_list: Name of a file that contains a list of input file patterns. Format is one file per line, relative path from dataset. Empty lines, i.e. lines that contain only newlines, and lines that start with ‘#’ are ignored. Line content is stripped before being used. This is useful if a large number of input file patterns should be provided. [Default: None]
output: An output file pattern (repeat for multiple outputs). File patterns support Python globbing; output globbing is performed in the worktree after the computation. Output file patterns must be relative; they are resolved from the root of the dataset. [Default: None]
output_list: Name of a file that contains a list of output file patterns. Format is one file per line, relative path from dataset. Empty lines, i.e. lines that contain only newlines, and lines that start with ‘#’ are ignored. Line content is stripped before being used. This is useful if a large number of output file patterns should be provided. [Default: None]
parameter: Input parameter in the form <name>=<value> (repeat for multiple parameters). [Default: None]
parameter_list: Name of a file that contains a list of parameters. Format is one <name>=<value> string per line. Empty lines, i.e. lines that contain only newlines, and lines that start with ‘#’ are ignored. Line content is stripped before being used. This is useful if a large number of parameters should be provided. [Default: None]
stdout: Name of a file that will receive stdout output from the computation. If not given, stdout output is discarded. It is preferable NOT to add the stdout file to the dataset on which the computation is performed, because stdout output tends to differ between runs, for example due to time stamps or other non-deterministic factors. [Default: None]
allow_untrusted_execution (bool, optional): Skip commit signature verification before executing code. This should only be used in a strictly controlled environment with fully trusted datasets. A fully trusted dataset is one in which every commit stems from a trusted entity. This option has no effect when combined with --prospective-execution. DO NOT USE THIS OPTION unless you are sure you understand the consequences, one of which is that arbitrary parties can execute arbitrary code under your account on your infrastructure. [Default: False]
on_failure ({‘ignore’, ‘continue’, ‘stop’}, optional): Behavior on failure. With ‘ignore’, any failure is reported but does not cause an exception. With ‘continue’, an exception is raised at the end if any failure occurred, but processing of other actions continues for as long as possible. With ‘stop’, processing stops on the first failure and an exception is raised. A failure is any result with status ‘impossible’ or ‘error’. The raised exception is an IncompleteResultsError that carries the result dictionaries of the failures in its failed attribute. [Default: ‘continue’]
result_filter (callable or None, optional): If given, each to-be-returned status dictionary is passed to this callable, and is only returned if the callable’s return value does not evaluate to False or a ValueError exception is raised. If the given callable supports **kwargs, it is additionally passed the keyword arguments of the original API call. [Default: None]
result_renderer: Select the rendering mode for command results. ‘tailored’ enables a command-specific rendering style that is typically tailored to human consumption, if there is one for a specific command, and otherwise falls back on the ‘generic’ result renderer; ‘generic’ renders each result in one line with key info like action, status, path, and an optional message; ‘json’ is a complete JSON line serialization of the full result record; ‘json_pp’ is like ‘json’, but pretty-printed spanning multiple lines; ‘disabled’ turns off result rendering entirely; ‘<template>’ reports any value(s) of any result properties in any format indicated by the template (e.g. ‘{path}’; compare with JSON output for all key-value choices). The template syntax follows the Python “format() language”. It is possible to report individual dictionary values, e.g. ‘{metadata[name]}’. If a 2nd-level key contains a colon, e.g. ‘music:Genre’, ‘:’ must be substituted by ‘#’ in the template, like so: ‘{metadata[music#Genre]}’. [Default: ‘tailored’]
result_xfm ({‘datasets’, ‘successdatasets-or-none’, ‘paths’, ‘relpaths’, ‘metadata’} or callable or None, optional): If given, each to-be-returned result status dictionary is passed to this callable, and its return value becomes the result instead. This is different from result_filter, as it can perform arbitrary transformation of the result value. This is mostly useful for top-level command invocations that need to provide the results in a particular format. Instead of a callable, a label for a pre-crafted result transformation can be given. [Default: None]
return_type ({‘generator’, ‘list’, ‘item-or-list’}, optional): Return value behavior switch. If ‘item-or-list’, a single value is returned instead of a one-item return value list, or a list in case of multiple return values. None is returned in case of an empty list. [Default: ‘list’]
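
The list-file rules described for input_list, output_list, and parameter_list (strip each line, skip empty lines and lines starting with ‘#’) and the <name>=<value> form of parameters can be sketched in plain Python. This is an illustration, not the extension's own code: the function names `read_list_lines` and `parse_parameters` are hypothetical, and it assumes that stripping happens before the comment check and that a parameter is split at its first ‘=’.

```python
from __future__ import annotations


def read_list_lines(text: str) -> list[str]:
    """Return the usable entries of a list file given as text.

    Assumption: each line is stripped first, then empty lines and
    lines starting with '#' are dropped.
    """
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        entries.append(stripped)
    return entries


def parse_parameters(entries: list[str]) -> dict[str, str]:
    """Turn '<name>=<value>' strings into a dictionary.

    Assumption: the split happens at the first '=', so a value may
    itself contain '='.
    """
    params = {}
    for entry in entries:
        name, sep, value = entry.partition("=")
        if not sep or not name:
            raise ValueError(f"expected <name>=<value>, got {entry!r}")
        params[name] = value
    return params


example = "# hypothetical parameter list\n\n  threshold=0.5  \nexpr=a=b\n"
parameters = parse_parameters(read_list_lines(example))
```

With these assumptions, the comment and the blank line are skipped, and `expr` keeps the value `a=b`.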

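The prioritization described for the label parameter (git configuration first, then the priority file, then a random choice) can be sketched as follows. This is a hedged illustration only: the function name `choose_label` is hypothetical, the priority file is assumed to use the same comma-separated format as the configuration value, and falling back to a random choice when no listed label matches is also an assumption.

```python
from __future__ import annotations

import random


def choose_label(
    available: list[str],
    git_config_value: str | None = None,
    priority_file_text: str | None = None,
) -> str:
    """Pick one label from the computations registered for a file.

    git_config_value stands in for the content of datalad.make.priority,
    priority_file_text for the content of $DATASET/.datalad/make/priority.
    """
    if git_config_value is not None:
        source = git_config_value
    elif priority_file_text is not None:
        # Assumption: the file uses the same comma-separated format.
        source = priority_file_text
    else:
        # Neither source exists: a random computation is chosen.
        return random.choice(available)

    priority = [item.strip() for item in source.split(",") if item.strip()]
    for label in priority:
        if label in available:
            return label
    # Assumption: no listed label is registered, so fall back to random.
    return random.choice(available)
```

For example, `choose_label(["fast", "exact"], git_config_value="exact, fast")` selects `exact`, because it comes first in the priority list.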