Forums » Operator News »
maskregion - Get only the data of the desired region
Added by Karin Meier-Fleischer over 5 years ago
A recurring question in the forums is
How can I extract data for a specific region from my data set and set everything else to missing value?
This is actually quite simple, but only with a little trick 
With the help of an NCL (https://ncl.ucar.edu/) script the latitude and longitude data can be read from a shapefile and written to an ASCII file.
The ASCII file can then be used to mask the corresponding area with CDO's operator maskregion (only the data in this area is extracted, all
others are set to missing value).
In our example we want to retrieve only the data located in Arizona, USA. We need a shapefile for the USA, which can be downloaded from
https://gadm.org/download_country_v3.html, gadm36_USA_shp.zip. And a data file, here
https://www.ncl.ucar.edu/Document/Manuals/NCL_User_Guide/Data/rectilinear_grid_2D.nc.
Let's start with the NCL script get_Arizona.ncl which reads the shapefile, selects the coordinates of the chosen state, and writes the
data to the file Arizona_lonlat.csv.
;------------------------------------------------------------------------------
; Description: Read the content of a state shapefile and write the latitude
; and longitude value pairs of the selected state into a CSV
; file (e.g. to use it with CDO's maskregion operator).
;
; Output file: Selected state name + '_lonlat.csv'
;
; Output format: Float values for latitude and longitude as CSV with
; delimiter ' '.
;
; -110.82376099 31.33125305
; -110.82608032 31.33123779
; ...
;
; Shapefile: Download from https://gadm.org/
;
; Later use: cdo -maskregion,Arizona_lonlat.csv infile outfile
;
; 08.07.20 meier-fleischer
;------------------------------------------------------------------------------
select_state = "Arizona"
outfile = select_state+"_lonlat.csv"
shpf = addfile("$HOME/data/Shapefiles/gadm36_USA/gadm36_USA_1.shp","r")
shp_states = shpf->NAME_1 ;-- US state names [num_features]
segments = shpf->segments ;-- shapefile segments [num_segments, segments]
geometry = shpf->geometry ;-- shapefile geometry [num_features, geometry]
lon = shpf->x ;-- longitude values [num_points]
lat = shpf->y ;-- latitude values [num_points]
geom_segIndex = shpf@geom_segIndex ;-- geometry start index number
segs_xyzIndex = shpf@segs_xyzIndex ;-- segments start index number
segs_numPnts = shpf@segs_numPnts ;-- segments number of points or sets
print("")
print("Selected state: "+select_state)
print("Output file: "+outfile)
print("")
do i = 0, dimsizes(select_state)-1
do j = 0, dimsizes(shp_states)-1
if (shp_states(j) .eq. select_state(i)) then
start_segment = geometry(j, geom_segIndex)
start_point = segments(start_segment, segs_xyzIndex)
end_point = start_point + segments(start_segment,segs_numPnts) -1
state_lat = lat(start_point:end_point) ;-- lat values of selected state
state_lon = lon(start_point:end_point) ;-- lon values of selected state
print("Number of points for "+select_state+": "+dimsizes(state_lon))
print("")
end if
end do
end do
system("rm "+outfile+"; touch "+outfile) ;-- delete output file
do k=0,dimsizes(state_lon)-1
slist = [/state_lon(k) + " " + state_lat(k)/] ;-- list with lon/lat values
write_table(outfile, "a", slist, "%s") ;-- append lon/lat values to output file
end do
Run the NCL script
ncl get_Arizona.ncl
And now, we can mask the data for Arizona. For a better view we increase the resolution of the input file.
cdo -maskregion,Arizona_lonlat.csv -remapbil,r720x360 rectilinear_grid_2D.nc Arizona_lonlat_tsurf.nc

And here's the short version, where everything is handled by a Korn-Shell script:
#!/bin/ksh
infile="$HOME/NCL/NCL_User_Guide/Version_1.0/data/rectilinear_grid_2D.nc"
#-- create the NCL script to retrieve the CSV coordinates file for Arizona
cat << EOF > Arizona_coordinates.ncl
;------------------------------------------------------------------------------
; Description: Read the content of a state shapefile and write the latitude
; and longitude value pairs of the selected state into a CSV
; file (e.g. to use it with CDO's maskregion operator).
;
; Output file: Selected state name + '_lonlat.csv'
;
; Output format: Float values for latitude and longitude as CSV with
; delimiter ' '.
;
; -110.82376099 31.33125305
; -110.82608032 31.33123779
; ...
;
; Shapefile: Download from https://gadm.org/
;
; Later use: cdo -maskregion,Arizona_lonlat.csv infile outfile
;
; 08.07.20 meier-fleischer
;------------------------------------------------------------------------------
select_state = "Arizona"
outfile = select_state+"_lonlat.csv"
shpf = addfile("$HOME/data/Shapefiles/gadm36_USA/gadm36_USA_1.shp","r")
shp_states = shpf->NAME_1 ;-- US state names [num_features]
segments = shpf->segments ;-- shapefile segments [num_segments, segments]
geometry = shpf->geometry ;-- shapefile geometry [num_features, geometry]
lon = shpf->x ;-- longitude values [num_points]
lat = shpf->y ;-- latitude values [num_points]
geom_segIndex = shpf@geom_segIndex ;-- geometry start index number
segs_xyzIndex = shpf@segs_xyzIndex ;-- segments start index number
segs_numPnts = shpf@segs_numPnts ;-- segments number of points or sets
print("")
print("Selected state: "+select_state)
print("Output file: "+outfile)
print("")
do i = 0, dimsizes(select_state)-1
do j = 0, dimsizes(shp_states)-1
if (shp_states(j) .eq. select_state(i)) then
start_segment = geometry(j, geom_segIndex)
start_point = segments(start_segment, segs_xyzIndex)
end_point = start_point + segments(start_segment,segs_numPnts) -1
state_lat = lat(start_point:end_point) ;-- lat values of selected state
state_lon = lon(start_point:end_point) ;-- lon values of selected state
print("Number of points for "+select_state+": "+dimsizes(state_lon))
print("")
end if
end do
end do
system("rm "+outfile+"; touch "+outfile) ;-- delete output file
do k=0,dimsizes(state_lon)-1
slist = [/state_lon(k) + " " + state_lat(k)/] ;-- list with lon/lat values
write_table(outfile, "a", slist, "%s") ;-- append lon/lat values to output file
end do
EOF
#-- run the NCL script
ncl Arizona_coordinates.ncl
#-- mask the data for Arizona; select variable tsurf; remap data to a higher resolution grid
cdo -maskregion,Arizona_lonlat.csv -remapbil,r720x360 ${infile} Arizona_lonlat_tsurf.nc
| plot_Arizona_tsurf.png (57.9 KB) plot_Arizona_tsurf.png |