Project

General

Profile

maskregion - Get only the data of the desired region

Added by Karin Meier-Fleischer over 3 years ago

A recurring question in the forums is

How can I extract data for a specific region from my data set and set everything else to missing value?

This is actually quite simple, but only with a little trick ;)

With the help of an NCL (https://ncl.ucar.edu/) script the latitude and longitude data can be read from a shapefile and written to an ASCII file.
The ASCII file can then be used to mask the corresponding area with CDO's operator maskregion (only the data in this area is extracted, all
others are set to missing value).

In our example we want to retrieve only the data located in Arizona, USA. We need a shapefile for the USA, which can be downloaded from
https://gadm.org/download_country_v3.html, gadm36_USA_shp.zip. And a data file, here
https://www.ncl.ucar.edu/Document/Manuals/NCL_User_Guide/Data/rectilinear_grid_2D.nc.

Let's start with the NCL script get_Arizona.ncl which reads the shapefile, selects the coordinates of the chosen state, and writes the
data to the file Arizona_lonlat.csv.

;------------------------------------------------------------------------------
; Description:  Read the content of a state shapefile and write the latitude 
;               and longitude value pairs of the selected state into a CSV 
;               file (e.g. to use it with CDO's maskregion operator).
;
; Output file:    Selected state name + '_lonlat.csv'
;
; Output format: Float values for latitude and longitude as CSV with 
;               delimiter ' '.
;
;                   -110.82376099 31.33125305
;                   -110.82608032 31.33123779
;                   ...
;
; Shapefile:    Download from https://gadm.org/
;
; Later use:    cdo -maskregion,Arizona_lonlat.csv infile outfile
;
; 08.07.20  meier-fleischer
;------------------------------------------------------------------------------  

select_state = "Arizona" 
outfile      = select_state+"_lonlat.csv" 

shpf = addfile("$HOME/data/Shapefiles/gadm36_USA/gadm36_USA_1.shp","r")

shp_states    = shpf->NAME_1                ;-- US state names [num_features]
segments      = shpf->segments              ;-- shapefile segments [num_segments, segments]
geometry      = shpf->geometry              ;-- shapefile geometry [num_features, geometry]
lon           = shpf->x                     ;-- longitude values [num_points]
lat           = shpf->y                     ;-- latitude values  [num_points]

geom_segIndex = shpf@geom_segIndex          ;-- geometry start index number
segs_xyzIndex = shpf@segs_xyzIndex          ;-- segments start index number
segs_numPnts  = shpf@segs_numPnts           ;-- segments number of points or sets

print("")
print("Selected state:  "+select_state)
print("Output file:     "+outfile)
print("")

do i = 0, dimsizes(select_state)-1
   do j = 0, dimsizes(shp_states)-1
      if (shp_states(j) .eq. select_state(i)) then 
         start_segment = geometry(j, geom_segIndex)
         start_point   = segments(start_segment, segs_xyzIndex)
         end_point     = start_point + segments(start_segment,segs_numPnts) -1

         state_lat = lat(start_point:end_point) ;-- lat values of selected state
         state_lon = lon(start_point:end_point) ;-- lon values of selected state
         print("Number of points for "+select_state+": "+dimsizes(state_lon))
         print("")
      end if
   end do   
end do

system("rm "+outfile+"; touch "+outfile)    ;-- delete output file

do k=0,dimsizes(state_lon)-1
   slist = [/state_lon(k) + " " + state_lat(k)/]  ;-- list with lon/lat values
   write_table(outfile, "a", slist, "%s")   ;-- append lon/lat values to output file
end do

Run the NCL script

ncl get_Arizona.ncl

And now, we can mask the data for Arizona. For a better view we increase the resolution of the input file.

cdo -maskregion,Arizona_lonlat.csv -remapbil,r720x360 rectilinear_grid_2D.nc Arizona_lonlat_tsurf.nc

And here's the short version, where everything is handled by a Korn-Shell script:

#!/bin/ksh

infile="$HOME/NCL/NCL_User_Guide/Version_1.0/data/rectilinear_grid_2D.nc" 

#-- create the NCL script to retrieve the CSV coordinates file for Arizona

cat << EOF > Arizona_coordinates.ncl
;------------------------------------------------------------------------------
; Description:  Read the content of a state shapefile and write the latitude 
;               and longitude value pairs of the selected state into a CSV 
;               file (e.g. to use it with CDO's maskregion operator).
;
; Output file:    Selected state name + '_lonlat.csv'
;
; Output format: Float values for latitude and longitude as CSV with 
;               delimiter ' '.
;
;                   -110.82376099 31.33125305
;                   -110.82608032 31.33123779
;                   ...
;
; Shapefile:    Download from https://gadm.org/
;
; Later use:    cdo -maskregion,Arizona_lonlat.csv infile outfile
;
; 08.07.20  meier-fleischer
;------------------------------------------------------------------------------  
select_state = "Arizona" 
outfile      = select_state+"_lonlat.csv" 

shpf = addfile("$HOME/data/Shapefiles/gadm36_USA/gadm36_USA_1.shp","r")

shp_states    = shpf->NAME_1                ;-- US state names [num_features]
segments      = shpf->segments              ;-- shapefile segments [num_segments, segments]
geometry      = shpf->geometry              ;-- shapefile geometry [num_features, geometry]
lon           = shpf->x                     ;-- longitude values [num_points]
lat           = shpf->y                     ;-- latitude values  [num_points]

geom_segIndex = shpf@geom_segIndex          ;-- geometry start index number
segs_xyzIndex = shpf@segs_xyzIndex          ;-- segments start index number
segs_numPnts  = shpf@segs_numPnts           ;-- segments number of points or sets

print("")
print("Selected state:  "+select_state)
print("Output file:     "+outfile)
print("")

do i = 0, dimsizes(select_state)-1
   do j = 0, dimsizes(shp_states)-1
      if (shp_states(j) .eq. select_state(i)) then 
         start_segment = geometry(j, geom_segIndex)
         start_point   = segments(start_segment, segs_xyzIndex)
         end_point     = start_point + segments(start_segment,segs_numPnts) -1

         state_lat = lat(start_point:end_point) ;-- lat values of selected state
         state_lon = lon(start_point:end_point) ;-- lon values of selected state
         print("Number of points for "+select_state+": "+dimsizes(state_lon))
         print("")
      end if
   end do   
end do

system("rm "+outfile+"; touch "+outfile)    ;-- delete output file

do k=0,dimsizes(state_lon)-1
   slist = [/state_lon(k) + " " + state_lat(k)/]  ;-- list with lon/lat values
   write_table(outfile, "a", slist, "%s")   ;-- append lon/lat values to output file
end do  
EOF

#-- run the NCL script

ncl Arizona_coordinates.ncl

#-- mask the data for Arizona; select variable tsurf; remap data to a higher resolution grid

cdo -maskregion,Arizona_lonlat.csv -remapbil,r720x360 ${infile} Arizona_lonlat_tsurf.nc