Clipping a NetCDF file with regionmask and Python
Added by Benoit Decoco over 2 years ago
Hello everybody,
I have monthly estimates of PM 2.5 for 20 consecutive years stored in NetCDF files with a precision of 0.01°x0.01° (i.e. for each 0.01°x0.01° square on the world map, there is an estimate of PM 2.5). The data is here for those interested : https://sites.wustl.edu/acag/datasets/surface-pm2-5/.
I wish to clip the NetCDF files with a geodataframe containing the bounding boxes of the cities I am interested in and then make an average of the PM 2.5 for each city (as they might be in more than one 0.01°x0.01° square). The file containing my bounding boxes has the following structure :
cities longmin latmin longmax latmax WKT Rennes -1.72 48.08 -1.61 48.15 POLYGON ((-1.72 48.08, -1.61 48.08, -1.61 48.15, -1.72 48.15, -1.72 48.08)) Bursa 29.01 40.17 29.15 40.23 POLYGON ((29.01 40.17, 29.15 40.17, 29.15 40.23, 29.01 40.23, 29.01 40.17)) Copenhagen 12.45 55.61 12.67 55.73 POLYGON ((12.45 55.61, 12.67 55.61, 12.67 55.73, 12.45 55.73, 12.45 55.61)) ...
I was advised to use regionmask and following this model1 I think I have created a mask with the cities. The code is the following
import netCDF4 as nc
import pandas as pd
import geopandas as gpd
import xarray as xr
import numpy as np
import regionmask
##Importing the NetCDF file with xarray
ds_temp2 = xr.open_dataset('/path/to/file/V5GL01.HybridPM25.Global.201912-201912.nc')
ds_temp2
##creating a geodataframe with the bounded boxes of the cities file
df = pd.read_excel('/Users/lucius/Documents/MiR/Data1/bounding_boxes.xls')
print(df)
df['geometry'] = gpd.GeoSeries.from_wkt(df['geometry'])
gdf = gpd.GeoDataFrame(df, geometry='geometry')
##Creating the mask for the NetCDF file
my_list = list(gdf['cities'])
my_list_unique = set(list(gdf['cities']))
indexes = [my_list.index(x) for x in my_list_unique]
gdf_mask_poly = regionmask.Regions(name = 'cities', numbers = indexes, names = gdf.cities[indexes], abbrevs = gdf.cities[indexes], outlines = list(gdf.geometry.values[i] for i in range(0,gdf.shape0)))
gdf_mask_poly
##Applying the mask to the NetCDF file
mask = gdf_mask_poly.mask(ds_temp2, lat_name='lat', lon_name='lon')
Now I would like to clip it in order to only keep the bounding-boxes of the cities I am interested in, would you have an idea how I could do it ?
The script takes time to execute as the .nc file covers the world map and is quite heavy (140 Mb).
Would you see a more efficient way of doing it ?
I am a beginner in NetCDF4 and in Python, please forgive me if this question may seem simple to you.
Thanks for your help,
Benoît
[1]: https://www.guillaumedueymes.com/post/shapefiles_country/
Replies (1)
RE: Clipping a NetCDF file with regionmask and Python - Added by Karin Meier-Fleischer over 2 years ago
Hi Benoit,
if you are using Python and regionmask I guess you can go on with regionmask https://regionmask.readthedocs.io/en/stable/notebooks/create_own_regions.html.