Sort Grid Cell Values From Greatest to Smallest (Rank)

Added by Aaron Perry over 4 years ago

I have a grid of precipitation and I need to sort the gridded data (cells) from greatest to smallest.

Is CDO a good tool to do this with?


Replies (20)

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

hi Aaron!
how do you want to store the results? as netcdf?

cheers
ralf

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 4 years ago

Hi Ralf,

Yes, NetCDF is great or GRIB-2.

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

the only CF-conformant way of storing this is as an unstructured grid, where you have one long 1D vector of locations. this vector can be ordered by cell value, and you would still be able to process the file with common tools. I am not 100% sure this can be done with CDO, since there is no operator that sorts along data values at the moment. A possible workaround could be this:

  1. remap your input to an unstructured or curvilinear grid (number of longitudes = gridsize)
  2. output the values including cell indices to stdout with cdo outputkey,value,xind ...
  3. sort this with the unix command sort to get an ordered list of cell indices
  4. use cdo selgridcell,.... to select and save the cells in the order coming from the previous step

In case this works, I doubt that it will scale up to very large numbers of points like millions ... it's defo worth a try
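
Steps 2 and 3 of the workaround above can be sketched in Python. This is only an illustration: it assumes the text printed by cdo outputkey,value,xind,nohead has one "value index" pair per line, and the function name ordered_cell_indices is made up for this example.

```python
def ordered_cell_indices(lines, descending=True):
    """Return cell indices ordered by their data value.

    `lines` is an iterable of strings of the form "<value> <index>".
    Assumption: this matches what `cdo outputkey,value,xind,nohead` prints.
    """
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        pairs.append((float(fields[0]), int(fields[1])))
    # sort on the numeric value, not on its string form
    pairs.sort(key=lambda p: p[0], reverse=descending)
    return [index for _, index in pairs]


# made-up sample input, not real model output
sample = ["0 1", "12.5 2", "407.8 3", "3.1 4"]
cells = ordered_cell_indices(sample)
print(",".join(str(i) for i in cells))  # 3,2,4,1
```

The comma-separated string is the form that step 4 would pass to selgridcell.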

cheers
ralf

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 4 years ago

Hi Ralf!

Here is my approach so far:

1) cdo outputkey,value,lon,xind,lat,yind /home/weather/qpf_3hr.grib2 > /home/weather/apcp_vals.txt
2) sort -t$'\t' -nr -k1 /home/weather/apcp_vals.txt > /home/weather/apcp_vals_sorted.txt

Pre-Sorted Vals
value lon xind lat yind
0 -122.72 1 21.138 1
0 -122.693 2 21.145 1
0 -122.667 3 21.152 1
0 -122.64 4 21.1589 1
0 -122.613 5 21.1659 1
0 -122.587 6 21.1729 1
0 -122.56 7 21.1798 1
0 -122.533 8 21.1868 1
0 -122.507 9 21.1937 1

Post-Sorted Grid Vals
value lon xind lat yind
407.8 -115.204 476 50.5722 1022
320 -115.196 476 50.5463 1021
282.2 -115.163 477 50.5772 1022
281.8 -115.253 475 50.593 1023
250.4 -115.245 475 50.5672 1022
249 -115.155 477 50.5514 1021
231.5 -115.212 476 50.5981 1023
227.2 -117.453 426 50.9492 1047
206.9 -114.562 446 44.0402 778
192.4 -121.281 328 50.0339 1035

After I have sorted the grid values from greatest to smallest, how do I get these results back into the previous gridded format?
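
One possible way to connect a sorted table like the one above back to grid cells is to turn each (xind, yind) pair into a single linear cell index, which is the kind of index selgridcell expects. The sketch below assumes the cells are stored with x varying fastest and 1-based indices; nx (the number of points in x direction) and both function names are hypothetical, and the sample rows are made up for a tiny 3-column grid.

```python
def linear_cell_index(xind, yind, nx):
    """1-based linear index, assuming x varies fastest (assumption, not checked against CDO)."""
    return (yind - 1) * nx + xind


def rank_rows(text, nx):
    """Parse 'value lon xind lat yind' rows and return (value, linear index), largest value first."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields or fields[0] == "value":
            continue  # skip blank lines and the header line
        value, _lon, xind, _lat, yind = fields
        rows.append((float(value), linear_cell_index(int(xind), int(yind), nx)))
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows


# illustrative table for a grid with nx = 3 (not taken from a real file)
table = """
value lon xind lat yind
0 -122.72 1 21.138 1
407.8 -115.204 2 50.5722 3
320 -115.196 2 50.5463 2
"""
for value, cell in rank_rows(table, nx=3):
    print(value, cell)
```

Whether the linear index really matches the cell numbering of a given file should be checked on a small test grid first.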

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

I will upload a solution soon - currently I am in a meeting. do you prefer python or bash?

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 4 years ago

I am most versed in bash but to be honest, either format would be of great help!

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

I start with python then. Please keep in mind that this list handling might not be the optimal solution. If anybody knows a better one, please drop a line here

from cdo import Cdo                                                                        
import sys

cdo = Cdo()
cdo.debug = True

inputfile = sys.argv[1]
print('inputfile='+inputfile)
outputfile = 'sortedInput.nc'

# create a mask for application of the 'reducegrid' operator. it's some sort of
# misuse, because it was developed to reduce data. Here I use it to create an
# unstructured CF-conformant grid representation of the input by using a mask
# that is 1 everywhere, so that nothing will be removed from the original data

# compute the lower boundary
LowerBoundary = float(cdo.outputkey('value,nohead', input='-fldmin '+inputfile)[0])
# mask being 1 at all points
mask = cdo.gtc(LowerBoundary-1,input=inputfile)
# transform the input into an unstructured grid in netcdf format
unstructuredInput = cdo.reducegrid(mask,input=inputfile,options='-f nc')

# create an ordered list of location indices
indexList = cdo.outputkey('value,xind,nohead', input=unstructuredInput)
print(indexList)

# create numbers from the strings so that the sorting of the data values works correctly
nestedList = [(_[1], float(_[0])) for _ in [x.split() for x in indexList]]
print(nestedList)

# select the sorted indices (ascending by value; pass reverse=True to sorted()
# to get greatest to smallest)
sortedList = sorted(nestedList, key=lambda x: x[1])
sortedIndexList = [x[0] for x in sortedList]
print(sortedIndexList)

# select (all) grid locations in the order of the data values
cdo.selgridcell(','.join(sortedIndexList),input=unstructuredInput,output=outputfile)

call it like:

python sortCdo.py <inputfile>
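
A small check like the following could be used to confirm that the values in the output really come out in the intended order. It is a sketch in plain Python; the helper name is_ordered is made up, and reading the values back (for example with the same cdo.outputkey('value,nohead', ...) call used in the script, converted to float) is left out here.

```python
def is_ordered(values, descending=True):
    """Return True if `values` is monotonically ordered in the given direction."""
    values = list(values)
    pairs = zip(values, values[1:])  # neighbouring elements
    if descending:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)


# made-up values for illustration
print(is_ordered([407.8, 320.0, 282.2, 0.0]))            # True
print(is_ordered([0.0, 282.2, 320.0], descending=False))  # True
print(is_ordered([1.0, 3.0, 2.0]))                         # False
```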

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

ZSH version

set -x
inputfile=$1
outputfile='sortedInput.nc'

min=$(cdo -outputkey,value,nohead -fldmin $inputfile)

cdo -gtc,$((min - 1)) $inputfile mask

cdo -f nc -reducegrid,mask $inputfile unstructuredInput

# create an ordered list of location indices, use options of 'sort' to control the order
cells=$(cdo -outputkey,value,xind,nohead unstructuredInput | sort -rn | rev | cut -d ' ' -f 2 | rev | xargs echo | tr ' ' ,)

# select (all) grid locations in the order of the data values
cdo -selgridcell,$cells unstructuredInput $outputfile

zsh sortCdo.zsh <inputfile>

sortCdo.zsh (535 Bytes)

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

And finally the Ruby version (my favorite)

require 'cdo'

cdo = Cdo.new
cdo.debug = true

inputfile = ARGV[0]
puts "inputfile: #{inputfile}" 
outputfile = 'sortedInput.nc'

lowerBoundary = cdo.outputkey('value,nohead', input: ' -fldmin '+inputfile)[0].to_f

mask = cdo.gtc(lowerBoundary-1.0,input: inputfile)

unstructuredInput = cdo.reducegrid(mask,input: inputfile, options: '-f nc')

sortedIndexList = cdo.outputkey('value,xind,nohead', input: unstructuredInput).map(&:split).map {|a|
  [a[0].to_f,a[1].to_i]  # convert values to float, index to integer
}.sort_by {|a|
  a[0]                   # sort by value (ascending)
}.map {|a|
  a[1]                   # select index only
}
# or in a single line
# sortedIndexList = cdo.outputkey('value,xind,nohead', input: unstructuredInput).map(&:split).map {|a| [a[0].to_f,a[1].to_i]}.sort_by {|a| a[0]}.map {|a| a[1]}
pp sortedIndexList

cdo.selgridcell(sortedIndexList, input: unstructuredInput, output: outputfile)

sortCdo.rb (909 Bytes)

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 4 years ago

Thank you for all of the help Ralf!

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 4 years ago

Unfortunately, it seems like my version (1.7.0) of cdo does not contain the reducegrid operator

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

yes, you need some more recent version. I did all this with the latest release. But 1.7.2 should have it

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 4 years ago

Is there an easy way to upgrade to v1.7.2 on Ubuntu?

I've tried sudo apt-get install cdo, but that seems to install only v1.7.0.

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago

you can use Anaconda - should be the easiest way to update.

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Nisreen Ghazi almost 4 years ago

Hi Ralf,
I'm trying to use your ruby code "sortCdo.rb" for sorting the cell values of the annual peak flow data (netCDF file), but keep getting this message:

$ ruby sortCdo.rb AnnPeakFlux.nc
Traceback (most recent call last):
5: from sortCdo.rb:3:in `<main>'
4: from sortCdo.rb:3:in `new'
3: from /home/nisre/.gem/ruby/2.6.0/gems/cdo-1.5.1/lib/cdo.rb:81:in `initialize'
2: from /home/nisre/.gem/ruby/2.6.0/gems/cdo-1.5.1/lib/cdo.rb:379:in `loadOptionalLibs'
1: from /usr/local/share/ruby/site_ruby/rubygems/core_ext/kernel_require.rb:92:in `require'
/usr/local/share/ruby/site_ruby/rubygems/core_ext/kernel_require.rb:92:in `require': cannot load such file -- numru/netcdf_miss (LoadError)

Any ideas would be highly appreciated.
cheers
Nisreen

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller almost 4 years ago

hi Nisreen!

you need to install the ruby-netcdf library with

gem install ruby-netcdf

All you need for this is netcdf (sometimes packaged as libnetcdf or libnetcdf-dev, which provides the C header files and the shared object libnetcdf.so)

which linux do you use?

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Nisreen Ghazi almost 4 years ago

Hi Ralf,
I wonder whether the provided ruby code can sort the cell values of the annual data, or whether it just shows the locations of the sorted data of one time slice... what I want to do is to sort the annual values of each grid cell and then extract the grid of second-largest values. is this the right code?
I use cygwin and though both libnetcdf and libnetcdf-dev packages are installed, I couldn't install ruby-netcdf. it seems that I need to compile and run extconf.rb:

$ gem install ruby-netcdf
Building native extensions. This could take a while...
Successfully installed narray-0.6.1.2
Successfully installed narray_miss-1.4.0
Building native extensions. This could take a while...
ERROR: Error installing ruby-netcdf:
ERROR: Failed to build gem native extension.
*** extconf.rb failed ***
    Could not create Makefile due to some reason, probably lack of necessary
    libraries and/or headers. Check the mkmf.log file for more details. You may
    need configuration options......
cheers

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller almost 4 years ago

The script takes into account all cells from all timesteps in the file. so if you want to limit this, you have to do certain selections or split the input file into appropriate chunks.
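
Sorting the values of each grid cell across timesteps and picking the second largest is a different operation from what the script does. A minimal sketch of that per-cell operation in plain Python follows; it assumes the data has already been read into one flat list of cell values per timestep (for example with netCDF4), and the function name second_largest_per_cell is made up for this example.

```python
def second_largest_per_cell(fields):
    """Return, for every cell, the second-largest value over all timesteps.

    `fields` is a list of timesteps; each timestep is a flat list of cell
    values, all of the same length. Ties count as separate values, so a cell
    whose maximum occurs twice returns that maximum.
    """
    if len(fields) < 2:
        raise ValueError("need at least two timesteps")
    ncells = len(fields[0])
    if any(len(step) != ncells for step in fields):
        raise ValueError("all timesteps must have the same number of cells")
    result = []
    for cell in range(ncells):
        # values of this one cell over time, largest first
        series = sorted((step[cell] for step in fields), reverse=True)
        result.append(series[1])
    return result


# three made-up annual fields on a grid with three cells
years = [
    [1.0, 5.0, 2.0],
    [4.0, 3.0, 2.5],
    [2.0, 9.0, 0.5],
]
print(second_largest_per_cell(years))  # [2.0, 5.0, 2.0]
```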

did you also install gcc?

If you have trouble with the ruby version, you can try the python version, too. For that you need a package called 'pip' or 'python-pip'. This is a package manager for python packages. you can install things with it like this:

pip install cdo
pip install netCDF4

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Nisreen Ghazi almost 4 years ago

Hi Ralf,

both gcc-g++ and gcc-fortran are installed, yet I couldn't install ruby-netcdf.
- moving to Python, I installed the 'python-pip' package, but I am not sure whether CDO and netCDF4 were installed correctly:

nisre@DESKTOP-SOCBQ9U ~
$ pip install cdo
Requirement already satisfied: cdo in c:\python38\lib\site-packages (1.5.3)
Requirement already satisfied: six in c:\python38\lib\site-packages (from cdo) (1.14.0)

nisre@DESKTOP-SOCBQ9U ~
$ pip install netCDF4
Requirement already satisfied: netCDF4 in c:\python38\lib\site-packages (1.5.3)
Requirement already satisfied: numpy>=1.7 in c:\python38\lib\site-packages (from netCDF4) (1.18.1)
Requirement already satisfied: cftime in c:\python38\lib\site-packages (from netCDF4) (1.1.3)

when I ran sortCdo.py,

$ python sortCdo.py QavgYr.nc

Traceback (most recent call last):
File "sortCdo.py", line 1, in <module>
from cdo import Cdo
ImportError: No module named cdo

RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller almost 4 years ago

hi!
you seem to have switched systems. What kind of environment/operating system do you work on?
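
The pip output above reports the packages under c:\python38, while the wording of the ImportError (module name without quotes) is how Python 2 phrases it, which suggests that pip and python may belong to two different installations. A small script like this one could help to check which interpreter actually runs; it is only a diagnostic sketch, and the helper name can_import is made up.

```python
from __future__ import print_function  # so the script also runs under Python 2

import sys


def can_import(name):
    """Return True if the module `name` can be imported by this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


print("interpreter:", sys.executable)
print("version:", sys.version.split()[0])
for name in ("cdo", "netCDF4"):
    print(name, "importable:", can_import(name))
```

If the interpreter shown is not the one pip installed into, calling pip through that interpreter (python -m pip install cdo) installs the package for the python that is actually used.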
