--plugin

This is the main entry point to all installed analysis tools and their history. The tools are implemented as plug-ins to the system. For more information on how to create a plug-in, see the Developing a Plugin guides. An overview of the installed tools is available in the BUG.

Basic Usage

To get the help:

$ freva --plugin --help
freva --plugin [opt] query 
opt:
[...]

To list all available analysis tools:

$ freva --plugin
PCA: Principal Component Analysis
...

See also the "Overview":www-miklip.dkrz.de/plugins of tools in the framework.

To select a particular tool:

$ freva --plugin pca
Missing required configuration for: input, variable

The PCA tool complains because its configuration is incomplete: the required parameters input and variable are missing.

To get the help of a particular tool:

$ freva --plugin pca --help
PCA (v3.1.0): Principal Component Analysis
Options:
areaweight     (default: False)
               Whether or not you want to have your data area weighted. This is
               done per latitude with sqrt(cos(latitude)).

boots          (default: 100)
               Number of bootstraps.
[...]
input          (default: None) [mandatory]
               An arbitrary NetCDF file. There are only two restrictions to your
               NetCDF file: a) Time has to be the very first dimension in the
               variable you like to analyze. b) All dimensions in your variable
               need to be defined as variables themselves with equal names.
               Both, a) and b), are usually true.
[...]

For each configuration parameter, the help shows its name, its default value (None means no value is set), whether it is mandatory (marked with [mandatory] next to the default value) and an explanation of the parameter.
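
As a quick way to spot the mandatory parameters, the help text can be filtered for the [mandatory] marker. The following is a minimal sketch that works on a shortened copy of the help output shown above; the variable names help_text and mandatory are chosen for illustration only, and the filter assumes the parameter name is the first word on the marked line.

```shell
# Sample text copied from the PCA help output shown above (shortened).
help_text='areaweight     (default: False)
boots          (default: 100)
input          (default: None) [mandatory]'

# Print the first word of every line that carries the [mandatory] marker.
mandatory=$(printf '%s\n' "$help_text" | awk '/\[mandatory\]/ { print $1 }')

echo "Mandatory options: $mandatory"
```

Assuming the real help output keeps this layout, the same awk filter could be applied to the output of freva --plugin pca --help through a pipe.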

To pass the values to the tool you just need to use the key=value construct like this:

$ freva --plugin pca input=myfile.nc outputdir=/tmp eofs=3
[...]
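
To illustrate the key=value construct, here is a small sketch of how such arguments can be split at the first equals sign. This is not the parser freva uses; split_pairs is a hypothetical helper written only to show how each argument breaks down into a parameter name and a value.

```shell
# Hypothetical helper (not part of freva): split each key=value
# argument at the first '=' and print both halves.
split_pairs() {
    for arg in "$@"; do
        key=${arg%%=*}     # everything before the first '='
        value=${arg#*=}    # everything after the first '='
        printf '%s -> %s\n' "$key" "$value"
    done
}

split_pairs input=myfile.nc outputdir=/tmp eofs=3
```

Because the split happens at the first equals sign, a value may itself contain further equals signs without being cut off.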

You may even define variables in terms of other variables, like the projection name above. When doing so from the shell, remember to escape the $ sign with a backslash (\) or to put the value in single quotes (double quotes do not work). For example:

$ freva --plugin pca input=myfile_\${eofs}.nc outputdir=/tmp eofs=3
#or
$ freva --plugin pca 'input=myfile_${eofs}.nc' outputdir=/tmp eofs=3
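
The difference between the quoting styles can be checked without running any tool. The sketch below uses a hypothetical helper, show_arg, that prints exactly what the shell hands over as its first argument; it assumes the shell itself has no value for eofs.

```shell
# Hypothetical helper: print the first argument exactly as received.
show_arg() { printf '%s\n' "$1"; }

# The shell's own eofs is empty here; only the tool knows eofs=3.
eofs=""

escaped=$(show_arg input=myfile_\${eofs}.nc)
single=$(show_arg 'input=myfile_${eofs}.nc')
double=$(show_arg "input=myfile_${eofs}.nc")

echo "backslash:     $escaped"   # input=myfile_${eofs}.nc
echo "single quotes: $single"    # input=myfile_${eofs}.nc
echo "double quotes: $double"    # input=myfile_.nc
```

With a backslash or single quotes the literal text ${eofs} reaches the tool, which can then substitute its own value. With double quotes the shell expands the variable first, so the tool never sees the reference.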

To learn more about this bash feature, see the sections on quoting and shell parameter expansion in the bash manual.

Quoting is very important in any shell, so if you use quotes, be sure you know how they work. It may help you avoid losing data!

Configuring the tools

All configurations are saved in the --history, where they can be viewed, saved, turned back into a command and restarted!

You may want to save the configuration of the tool:

$ freva --plugin pca --save-config=/home/<user_account>/evaluation_system/config/pca/pca.conf variable=tas input=myfile.nc outputdir=/tmp eofs=3
INFO:__main__:Configuration file saved in /home/<user_account>/evaluation_system/config/pca/pca.conf

Note that this starts the tool. To save the configuration without starting the tool, use the -n or --dry-run flag.
Also note that this stores the configuration in a special directory structure so the system can find it again.

You can save the configuration somewhere else:

$ freva --plugin pca --save-config=/home/<user_account>/evaluation_system/config/pca/pca.conf --dry-run --tool pca variable=tas input=myfile.nc outputdir=/tmp eofs=3
INFO:__main__:Configuration file saved in pca.conf

The stored configuration overrides the default one. A possible use case:
  1. Change the defaults to suit your general needs:
    $ freva --plugin pca --save-config=XXX --dry-run outputdir=/my_output_dir shiftlats=false
    
  2. Prepare some configurations you'll be using repeatedly:
    $ freva --plugin pca --save-config=XXX --dry-run --config-file pca.tas.conf --tool pca variable=tas
    $ freva --plugin pca --save-config=XXX --dry-run --config-file pca.uas.conf --tool pca variable=uas
    

Scheduling

Instead of running your job directly in the terminal, you can hand it over to the SLURM scheduler.

To run the murcss tool on the variable tas, the command is:

$ freva --plugin murcss variable=tas ...

Execution takes some time (here roughly one minute) and prints:
Searching Files
Remapping Files
Calculating ensemble mean
Calculating crossvalidated mean
Calculating Anomalies
Analyzing year 2 to 9
Analyzing year 1 to 1
Analyzing year 2 to 5
Analyzing year 6 to 9
Finished.
Calculation took 63.4807469845 seconds

To schedule the same task you would use

$ freva --plugin murcss variable=tas ... --batchmode=true 

instead. The output changes to:
Scheduled job with history id 414
You can view the job's status with the command squeue
Your job's progress will be shown with the command
tail -f  /home/zmaw/u290038/evaluation_system/slurm/murcss/slurm-1437.out

The last line shows the command for viewing the output created by the tool.
In this example you would type:

$ tail -f  /home/zmaw/u290038/evaluation_system/slurm/murcss/slurm-1437.out
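
If you schedule jobs from a script, it can be handy to pick the history id and the log file out of the batch-mode message. The sketch below does this for a copy of the output shown above; the variable names output, history_id and log_file are illustrative, and the patterns assume the message keeps exactly this wording.

```shell
# Sample text copied from the batch-mode output shown above.
output="Scheduled job with history id 414
You can view the job's status with the command squeue
Your job's progress will be shown with the command
tail -f  /home/zmaw/u290038/evaluation_system/slurm/murcss/slurm-1437.out"

# The history id is the number at the end of the first line.
history_id=$(printf '%s\n' "$output" |
    sed -n 's/^Scheduled job with history id \([0-9][0-9]*\)$/\1/p')

# The log file is the last word of the line starting with "tail -f".
log_file=$(printf '%s\n' "$output" | awk '/^tail -f/ { print $NF }')

echo "history id: $history_id"
echo "log file:   $log_file"
```

The extracted path can then be passed to tail -f as described above.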

For jobs with a long run time, or for a large number of jobs, you should consider
scheduling them in batch mode!

--help

$ freva --plugin --help
Applies some analysis to the given data.
See https://code.zmaw.de/projects/miklip-d-integration/wiki/Analyze for more information.

The "query" part is a key=value list used for configuring the tool. It's tool dependent so check that tool help.

For Example:
    freva --plugin pca eofs=4 bias=False input=myfile.nc outputdir=/tmp/test

Usage: freva --plugin [options]

Options:
  -d, --debug         turn on debugging info and show stack trace on
                      exceptions.
  -h, --help          show this help message and exit
  --repos-version     show the version number from the repository
  --caption=CAPTION   sets a caption for the results
  --save              saves the configuration locally for this user.
  --save-config=FILE  saves the configuration at the given file path
  --show-config       shows the resulting configuration (implies dry-run).
  --scheduled-id=ID   Runs a scheduled job from database
  --dry-run           dry-run, perform no computation. This is used for
                      viewing and handling the configuration.
  --batchmode=BOOL    creates a SLURM job