Sort Grid Cell Values From Greatest to Smallest (Rank)
Added by Aaron Perry over 5 years ago
I have a grid of precipitation and I need to sort the gridded data (cells) from greatest to smallest.
Is CDO a good tool to do this with?
Replies (20)
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
hi Aaron!
how do you want to store the results? as NetCDF?
cheers
ralf
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 5 years ago
Hi Ralf,
Yes, NetCDF is great or GRIB-2.
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
the only CF-conformant way of storing this is as an unstructured grid, where you have one long 1D vector of locations. this vector can be ordered by cell value, and the file can still be processed with common tools. I am not 100% sure this can be done with CDO, since there is no operator for sorting along data values at the moment. A possible workaround could be this:
- remap your input to an unstructured or curvilinear grid (number of longitudes = gridsize)
- output the values including cell indices to stdout with
cdo outputkey,value,xind ...
- sort this with the unix command
sort
to get an ordered list of cell indices
- use
cdo selgridcell,....
to select and save the cells in the order coming from the previous step
In case this works, I doubt that it will scale up to a very large number of points (millions), but it's definitely worth a try
cheers
ralf
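The sorting step of the workaround above can be sketched in Python without calling CDO. This is only a minimal sketch: it assumes each line of the `outputkey,value,xind` output holds a value followed by a cell index, separated by whitespace. The helper name `ordered_cell_indices` and the sample lines are made up for illustration.

```python
def ordered_cell_indices(lines, descending=True):
    """Return cell indices ordered by their data value."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        try:
            value, index = float(fields[0]), int(fields[1])
        except ValueError:
            continue  # skip a header line such as "value xind"
        pairs.append((value, index))
    # sort on the numeric value only; ties keep their input order
    pairs.sort(key=lambda pair: pair[0], reverse=descending)
    return [index for _, index in pairs]


# hypothetical sample in "value xind" layout
sample_lines = ["0 1", "407.8 2", "12.5 3", "320 4"]
cell_list = ",".join(str(i) for i in ordered_cell_indices(sample_lines))
print(cell_list)  # 2,4,3,1
```

The resulting comma-separated string has the form that the `selgridcell` step expects as its argument.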
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 5 years ago
Hi Ralf!
Here is my approach so far:
1) cdo outputkey,value,lon,xind,lat,yind /home/weather/qpf_3hr.grib2 > /home/weather/apcp_vals.txt
2) sort -t$'\t' -nr -k1 /home/weather/apcp_vals.txt > /home/weather/apcp_vals_sorted.txt
Pre-Sorted Vals
value lon xind lat yind
0 -122.72 1 21.138 1
0 -122.693 2 21.145 1
0 -122.667 3 21.152 1
0 -122.64 4 21.1589 1
0 -122.613 5 21.1659 1
0 -122.587 6 21.1729 1
0 -122.56 7 21.1798 1
0 -122.533 8 21.1868 1
0 -122.507 9 21.1937 1
Post-Sorted Grid Vals
value lon xind lat yind
407.8 -115.204 476 50.5722 1022
320 -115.196 476 50.5463 1021
282.2 -115.163 477 50.5772 1022
281.8 -115.253 475 50.593 1023
250.4 -115.245 475 50.5672 1022
249 -115.155 477 50.5514 1021
231.5 -115.212 476 50.5981 1023
227.2 -117.453 426 50.9492 1047
206.9 -114.562 446 44.0402 778
192.4 -121.281 328 50.0339 1035
After I have sorted the grid values from greatest to smallest, how do I get these results back into the previous gridded format?
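One way to look at the question: the sorted text no longer has the grid layout, but every row still carries its xind and yind, so a rank can be attached to each cell position. The following is a minimal sketch of that idea only. It assumes tab-separated columns in the order value, lon, xind, lat, yind (as in the sort command above), uses the first three sorted rows as sample data, and does not write anything back to GRIB or NetCDF. The function name `rank_by_cell` is hypothetical.

```python
import csv
import io

# first rows of the sorted table shown above (tab-separated layout is assumed)
sorted_text = (
    "407.8\t-115.204\t476\t50.5722\t1022\n"
    "320\t-115.196\t476\t50.5463\t1021\n"
    "282.2\t-115.163\t477\t50.5772\t1022\n"
)


def rank_by_cell(text):
    """Map (xind, yind) to the 1-based rank of that cell in the sorted list."""
    ranks = {}
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for position, row in enumerate(reader, start=1):
        value, lon, xind, lat, yind = row
        ranks[(int(xind), int(yind))] = position
    return ranks


ranks = rank_by_cell(sorted_text)
print(ranks[(476, 1021)])  # 2
```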
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
I will upload a solution soon - currently I am in a meeting. do you prefer python or bash?
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 5 years ago
I am most versed in bash
but to be honest, either format would be of great help!
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
I'll start with python then. Please keep in mind that this list handling might not be the optimal solution. If anybody knows a better one, please drop a line here
from cdo import Cdo
import sys
cdo = Cdo()
cdo.debug = True
inputfile = sys.argv[1]
print('inputfile='+inputfile)
outputfile = 'sortedInput.nc'
# create a mask for the 'reducegrid' operator. this is some sort of
# misuse, because it was developed to reduce data. here I use it to create an
# unstructured CF-conformant grid representation of the input by using a mask
# that is 1 everywhere, so that nothing is removed from the original data
# compute the lower boundary
LowerBoundary = float(cdo.outputkey('value,nohead', input='-fldmin '+inputfile)[0])
# mask being 1 at all points
mask = cdo.gtc(LowerBoundary-1,input=inputfile)
# transform the input into an unstructured grid in netcdf format
unstructuredInput = cdo.reducegrid(mask,input=inputfile,options='-f nc')
# create a list of values and location indices (sorted below)
indexList = cdo.outputkey('value,xind,nohead', input=unstructuredInput)
print(indexList)
# create numbers from the strings so that the sorting of the data values works correctly
nestedList = [(_[1], float(_[0])) for _ in [x.split() for x in indexList]]
print(nestedList)
# sort by data value and select the indices (sorted() is ascending by default;
# pass reverse=True for greatest-to-smallest order)
sortedList = sorted(nestedList, key=lambda x: x[1])
sortedIndexList = [x[0] for x in sortedList]
print(sortedIndexList)
# select (all) grid locations in the order of the data values
cdo.selgridcell(','.join(sortedIndexList),input=unstructuredInput,output=outputfile)
call it like:
python sortCdo.py <inputfile>
sortCdo.py (1.36 KB)
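The script comment about creating numbers from the strings matters because text values do not sort in numeric order. A small illustration with made-up "value xind" lines (the data are hypothetical, not taken from any file in this thread):

```python
rows = ["9.5 1", "10.2 2", "100 3"]  # hypothetical "value xind" lines
pairs = [row.split() for row in rows]

# sorting the raw strings compares character by character, so "10.2" < "9.5"
by_string = [index for value, index in sorted(pairs, key=lambda p: p[0])]

# converting to float first gives the intended numeric order
by_number = [index for value, index in sorted(pairs, key=lambda p: float(p[0]))]

print(by_string)  # ['2', '3', '1']
print(by_number)  # ['1', '2', '3']
```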
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
ZSH version
set -x
inputfile=$1
outputfile='sortedInput.nc'
min=$(cdo -outputkey,value,nohead -fldmin $inputfile)
cdo -gtc,$((min - 1)) $inputfile mask
cdo -f nc -reducegrid,mask $inputfile unstructuredInput
# create an ordered list of location indices, use options of 'sort' to control the order
cells=$(cdo -outputkey,value,xind,nohead unstructuredInput | sort -rn | rev | cut -d ' ' -f 2 | rev | xargs echo | tr ' ' ,)
# select (all) grid locations in the order of the data values
cdo -selgridcell,$cells unstructuredInput $outputfile
zsh sortCdo.zsh <inputfile>
sortCdo.zsh (535 Bytes)
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
And finally the Ruby version (my favorite)
require 'cdo'
cdo = Cdo.new
cdo.debug = true
inputfile = ARGV[0]
puts "inputfile: #{inputfile}"
outputfile = 'sortedInput.nc'
lowerBoundary = cdo.outputkey('value,nohead', input: ' -fldmin '+inputfile)[0].to_f
mask = cdo.gtc(lowerBoundary-1.0,input: inputfile)
unstructuredInput = cdo.reducegrid(mask,input: inputfile, options: '-f nc')
sortedIndexList = cdo.outputkey('value,xind,nohead', input: unstructuredInput).map(&:split).map {|a|
  [a[0].to_f,a[1].to_i] # convert values to float, index to integer
}.sort_by {|a|
  a[0] # sort by value
}.map {|a|
  a[1] # select index only
}
# or in a single line
# sortedIndexList = cdo.outputkey('value,xind,nohead', input: unstructuredInput).map(&:split).map {|a| [a[0].to_f,a[1].to_i]}.sort_by {|a| a[0]}.map {|a| a[1]}
pp sortedIndexList
cdo.selgridcell(sortedIndexList, input: unstructuredInput, output: outputfile)
sortCdo.rb (909 Bytes)
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 5 years ago
Thank you for all of the help Ralf!
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 5 years ago
Unfortunately, it seems like my version (1.7.0) of cdo does not contain the reducegrid operator
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
yes, you need some more recent version. I did all this with the latest release. But 1.7.2 should have it
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Aaron Perry over 5 years ago
Is there an easy way to upgrade to v1.7.2 on Ubuntu?
I've tried
sudo apt-get install cdo
but that seems to only install v1.7.0.
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 5 years ago
you can use Anaconda - should be the easiest way to update.
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Nisreen Ghazi over 4 years ago
Hi Ralf,
I'm trying to use your ruby code "sortCdo.rb" for sorting the cell values of the annual peak flow data (netCDF file), but I keep getting this message:
$ ruby sortCdo.rb AnnPeakFlux.nc
Traceback (most recent call last):
5: from sortCdo.rb:3:in `<main>'
4: from sortCdo.rb:3:in `new'
3: from /home/nisre/.gem/ruby/2.6.0/gems/cdo-1.5.1/lib/cdo.rb:81:in `initialize'
2: from /home/nisre/.gem/ruby/2.6.0/gems/cdo-1.5.1/lib/cdo.rb:379:in `loadOptionalLibs'
1: from /usr/local/share/ruby/site_ruby/rubygems/core_ext/kernel_require.rb:92:in `require'
/usr/local/share/ruby/site_ruby/rubygems/core_ext/kernel_require.rb:92:in `require': cannot load such file -- numru/netcdf_miss (LoadError)
Any ideas would be highly appreciated.
cheers
Nisreen
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago
hi Nisreen!
you need to install the ruby-netcdf library with
gem install ruby-netcdf
All you need for this is netcdf (the package is sometimes called libnetcdf or libnetcdf-dev), which provides the C header files and the shared object libnetcdf.so
which linux do you use?
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Nisreen Ghazi over 4 years ago
Hi Ralf,
I wonder whether the provided ruby code can sort the cell values of the annual data, or whether it just shows the locations of the sorted data for a single time slice... what I want to do is to sort the annual values of each grid cell and then extract the grid of second-largest values. is this the right code?
I use cygwin and though both libnetcdf and libnetcdf-dev packages are installed, I couldn't install ruby-netcdf. it seems that I need to compile and run extconf.rb:
Building native extensions. This could take a while...
Successfully installed narray-0.6.1.2
Successfully installed narray_miss-1.4.0
Building native extensions. This could take a while...
ERROR: Error installing ruby-netcdf:
ERROR: Failed to build gem native extension.
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of necessary
libraries and/or headers. Check the mkmf.log file for more details. You may
need configuration options......
cheers
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago
The script takes into account all cells from all timesteps in the file. so if you want to limit this, you have to do certain selections or split the input file into appropriate chunks.
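Sorting each grid cell's annual values and taking the second largest is a different operation from the pooled sort: it sorts along time, separately for each cell. A minimal sketch in plain Python, assuming the data are already in memory as one list per year with one entry per grid cell. Reading the NetCDF file is left out, the numbers are made up, and the function name is hypothetical.

```python
import heapq

# hypothetical annual peak values: one list per year, one entry per grid cell
annual_fields = [
    [3.0, 7.5, 1.0],  # year 1
    [4.2, 6.0, 2.5],  # year 2
    [5.1, 9.0, 0.5],  # year 3
]


def second_largest_per_cell(fields):
    """Return, for every cell, the second-largest value over all years."""
    cells = zip(*fields)  # regroup: one tuple of yearly values per cell
    # nlargest(2, ...) returns the two largest values in descending order
    return [heapq.nlargest(2, series)[1] for series in cells]


print(second_largest_per_cell(annual_fields))  # [4.2, 7.5, 1.0]
```

The sketch needs at least two years of data per cell; with fewer, the `[1]` lookup raises an IndexError.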
did you also install gcc?
If you have trouble with the ruby version, you can try the python version, too. for that you need a package called 'pip' or 'python-pip'. This is a package manager for python packages. you can install things with it like
pip install cdo
pip install netCDF4
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Nisreen Ghazi over 4 years ago
Hi Ralf,
both gcc-g++ and gcc-fortran are installed, yet I couldn't install ruby-netcdf.
- moving to Python, I installed 'python-pip' package but then not sure if the CDO and the netCDF4 were installed correctly:
nisre@DESKTOP-SOCBQ9U ~
$ pip install cdo
Requirement already satisfied: cdo in c:\python38\lib\site-packages (1.5.3)
Requirement already satisfied: six in c:\python38\lib\site-packages (from cdo) (
1.14.0)
nisre@DESKTOP-SOCBQ9U ~
$ pip install netCDF4
Requirement already satisfied: netCDF4 in c:\python38\lib\site-packages (1.5.3)
Requirement already satisfied: numpy>=1.7 in c:\python38\lib\site-packages (from
netCDF4) (1.18.1)
Requirement already satisfied: cftime in c:\python38\lib\site-packages (from net
CDF4) (1.1.3)
when I ran sortCdo.py,
$ python sortCdo.py QavgYr.nc
Traceback (most recent call last):
File "sortCdo.py", line 1, in <module>
from cdo import Cdo
ImportError: No module named cdo
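A possible explanation, not confirmed in this thread, is that `pip` and `python` belong to different Python installations: pip reports packages under c:\python38, while the wording of the ImportError resembles what Python 2 prints. The following sketch shows which interpreter is actually running and whether it can import a given module. The helper name `module_location` is made up for illustration, and the code is written to run under both Python 2 and Python 3.

```python
import sys


def module_location(name):
    """Return the file a module is imported from, or None if the import fails."""
    try:
        module = __import__(name)
    except ImportError:
        return None
    return getattr(module, "__file__", "(built-in)")


print("interpreter: %s" % sys.executable)
print("version: %s" % sys.version.split()[0])
print("cdo module: %s" % module_location("cdo"))
```

If the interpreter path is not the one pip installed into, running pip through the same interpreter (for example `python -m pip install cdo`) should make the two agree.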
RE: Sort Grid Cell Values From Greatest to Smallest (Rank) - Added by Ralf Mueller over 4 years ago
hi!
you seem to have switched systems. What kind of environment/operating system do you work on?