CDO goes quantum: introduction of the Bra-Ket

Added by Ralf Mueller about 6 years ago

This post is not about a specific operator or module but about a subtle little extension of the chaining of operators on the command line. But some background first ...

The operators built into CDO can be seen as separate functions, where the output of the called function is the input of the calling function. For simple operations this might look like this:

cdo -copy -fldmean ifile ofile
ofile = copy(fldmean(ifile)) 
cdo -div -mul afile -selname,var bfile maskfile ofile
ofile = div(mul(afile,selname(var,bfile)),maskfile)

What we use on the command line is the so-called Polish notation (prefix notation), which can be written without parentheses if and only if the number of inputs and outputs is fixed for each function.
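To make the fixed-arity rule concrete, here is a minimal Python sketch that turns a chain into the nested function form shown above. It is a toy model, not CDO's actual parser: the ARITY table and the function names parse and to_nested are my own assumptions for illustration, and the output file is left out of the chain.

```python
# Toy model of prefix (Polish) notation with fixed arity; not CDO's real parser.
# ARITY is an assumed table: number of input streams per operator.
ARITY = {"copy": 1, "fldmean": 1, "selname": 1, "mul": 2, "div": 2}


def parse(tokens):
    """Consume one expression from the front of `tokens` and return it as text."""
    token = tokens.pop(0)
    if not token.startswith("-"):
        return token  # a plain file name is a leaf
    # Split "-selname,var" into the operator name and its parameters.
    name, _, params = token[1:].partition(",")
    # Fixed arity: the operator knows exactly how many inputs to read.
    inputs = [parse(tokens) for _ in range(ARITY[name])]
    args = ([params] if params else []) + inputs
    return f"{name}({','.join(args)})"


def to_nested(chain):
    """Translate an operator chain (without the output file) to nested calls."""
    tokens = chain.split()
    expr = parse(tokens)
    if tokens:
        raise ValueError(f"unused arguments: {tokens}")
    return expr


print(to_nested("-copy -fldmean ifile"))
# copy(fldmean(ifile))
print(to_nested("-div -mul afile -selname,var bfile maskfile"))
# div(mul(afile,selname(var,bfile)),maskfile)
```

Because every operator reads a known number of inputs, the parser never has to guess where one argument ends and the next begins. That is exactly what breaks down once an operator accepts an arbitrary number of inputs.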

Consequently, operators with an arbitrary number of inputs or outputs need special handling. For inputs, these operators are

after enspctl gather outputext selall
afterburner ensrange graph outputf select
cat ensrkhist_space info outputfld sinfo
collgrid ensrkhist_time infoc outputint sinfoc
copy ensrkhistspace infon outputkey sinfon
delete ensrkhisttime infop outputsrv sinfop
ensavg ensroc infos outputtab sinfov
ensbrs ensskew infov outputts sorttaxis
enscrps ensstd map outputxyz sorttimestamp
enskurt ensstd1 merge seinfo szip
ensmax enssum mergetime seinfoc xinfon
ensmean ensvar output seinfon
ensmin ensvar1 outputarr seinfop

For outputs the list is

distgrid eofcoeff eofcoeff3d intyear
scatter splitcode splitday splitgrid
splithour splitlevel splitmon splitname
splitparam splitrec splitseas splitsel
splittabnum splitvar splityear splityearmon
splitzaxis

Arbitrary Outputs

For operators with arbitrary outputs the rule is simple: since the operator calling such an operation cannot know how many inputs to read from, these operators may only be called at the very end of a chain. Hence the outputs are always files on disk.

Arbitrary Inputs

Here the rule is different: operators with an arbitrary number of inputs behave greedily (like regular expressions), which means they read as many inputs as possible. This leads to a limitation, because such operators cannot be part of complex chains that involve operators with more than one input stream. As a consequence, operators like cat and merge are mostly used in stand-alone calls, because their ability for chaining is somewhat limited.

Let's have a look at an example:

cdo -infov -div -fldmean -cat   -for,1,10 -mulc,-1 -for,1,5   -fldmax -topo 

Don't be confused by the missing input data file: -for and -topo create the inputs and -infov writes to stdout. Hence I can illustrate things without going into the details of special input files.

The above call does not work because -cat is greedy (typical for a cat, btw - I love cats ...):

% cdo -infov -div -fldmean -cat   -for,1,10 -mulc,-1 -for,1,5   -fldmax -topo

cdo (Abort): Too few streams specified! Operator -div needs 2 input and 1 output streams.

What happened? -div needs two input streams and one output stream, but our -cat has claimed all possible streams on its right-hand side as input and didn't leave anything for the remaining input or output stream of -div.

How to deal with that? A change in the processing rule is unlikely to help here: limiting the number of inputs to 2 would lead to even longer chains like

cdo -cat afile -cat bfile -cat cfile dfile outfile
- certainly nothing readable. The solution is to reintroduce parentheses, but only where needed! On the Unix command line there are not many characters left for that purpose. That's why we decided to use the detached square bracket, i.e. a square bracket that has a space on the left and right of it.

The above call now looks like

cdo -infov -div -fldmean -cat [ -for,1,10 -mulc,-1 -for,1,5 ] -fldmax -topo 
and it works perfectly with cdo-1.9.5. It's even a lot more readable: -div's first input is -fldmean -cat [ -for,1,10 -mulc,-1 -for,1,5 ] and its second is -fldmax -topo.
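
The difference between the greedy and the bracketed call can be reproduced with a small Python sketch. Again, this is only a toy model under my own assumptions, not the code inside CDO: the ARITY table is made up for this example (None marks an operator with arbitrary inputs) and the error message is not CDO's.

```python
# Toy model only; assumed arities. None marks an operator with arbitrary inputs.
ARITY = {"infov": 1, "div": 2, "fldmean": 1, "fldmax": 1, "mulc": 1,
         "cat": None, "for": 0, "topo": 0}


def parse(tokens):
    """Consume one expression from the front of `tokens` and return it as text."""
    if not tokens:
        raise ValueError("too few streams")
    token = tokens.pop(0)
    if not token.startswith("-"):
        return token  # a plain file name is a leaf
    name, _, params = token[1:].partition(",")
    count = ARITY[name]
    if count is not None:
        # Fixed arity: read exactly `count` inputs.
        inputs = [parse(tokens) for _ in range(count)]
    else:
        # Arbitrary inputs: a detached "[" limits the operator to the bracket,
        # otherwise it greedily takes everything that is left.
        bracketed = bool(tokens) and tokens[0] == "["
        if bracketed:
            tokens.pop(0)
        inputs = []
        while tokens and tokens[0] != "]":
            inputs.append(parse(tokens))
        if bracketed:
            if not tokens:
                raise ValueError("missing closing bracket")
            tokens.pop(0)  # drop the "]"
    args = ([params] if params else []) + inputs
    return f"{name}({','.join(args)})"


greedy = "-infov -div -fldmean -cat -for,1,10 -mulc,-1 -for,1,5 -fldmax -topo"
fenced = "-infov -div -fldmean -cat [ -for,1,10 -mulc,-1 -for,1,5 ] -fldmax -topo"

try:
    parse(greedy.split())
except ValueError as err:
    print("greedy call fails:", err)

print(parse(fenced.split()))
```

In the greedy call, cat swallows -fldmax -topo as a third input, so div finds nothing left for its second stream. With the brackets, cat stops at the closing bracket and the result nests as div(fldmean(cat(...)),fldmax(topo())) inside infov.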

Please come up with some other use-cases!