Clipping a NetCDF file with mask and Python
Added by Benoit Decoco over 2 years ago
Hello everybody,
I have monthly estimates of PM 2.5 for 20 consecutive years stored in NetCDF files with a precision of 0.01°x0.01° (i.e. for each 0.01°x0.01° square on the world map, there is an estimate of PM 2.5). The data is here for those interested : https://sites.wustl.edu/acag/datasets/surface-pm2-5/.
I wish to clip the NetCDF files with a geodataframe containing the bounding boxes of the cities I am interested in and then make an average of the PM 2.5 for each city (as they might be in more than one 0.01°x0.01° square).
```
cities longmin latmin longmax latmax WKT
Rennes -1.72 48.08 -1.61 48.15 POLYGON ((-1.72 48.08, -1.61 48.08, -1.61 48.15, -1.72 48.15, -1.72 48.08))
Bursa 29.01 40.17 29.15 40.23 POLYGON ((29.01 40.17, 29.15 40.17, 29.15 40.23, 29.01 40.23, 29.01 40.17))
Copenhagen 12.45 55.61 12.67 55.73 POLYGON ((12.45 55.61, 12.67 55.61, 12.67 55.73, 12.45 55.73, 12.45 55.61))
...
```
I was advised to use regionmask and following this model[1] I think I have created a mask with the cities. The code is the following
```
import netCDF4 as nc
import pandas as pd
import geopandas as gpd
import xarray as xr
import numpy as np
import regionmask
- Importing the NetCDF file with xarray
ds_temp2 = xr.open_dataset('/path/to/file/V5GL01.HybridPM25.Global.201912-201912.nc')
ds_temp2
- creating a geodataframe with the bounded boxes of the cities file
df = pd.read_excel('/Users/lucius/Documents/MiR/Data1/bounding_boxes.xls')
print(df)
df['geometry'] = gpd.GeoSeries.from_wkt(df['geometry'])
gdf = gpd.GeoDataFrame(df, geometry='geometry')
##Creating the mask for the NetCDF file
my_list = list(gdf['cities'])
my_list_unique = set(list(gdf['cities']))
indexes = [my_list.index for x in my_list_unique]
gdf_mask_poly = regionmask.Regions(name = 'cities', numbers = indexes, names = gdf.cities[indexes], abbrevs = gdf.cities[indexes], outlines = list(gdf.geometry.values[i] for i in range(0,gdf.shape0)))
gdf_mask_poly
- Applying the mask to the NetCDF file
mask = gdf_mask_poly.mask(ds_temp2, lat_name='lat', lon_name='lon')
```
When running the last line I get
```
KeyError: 'longitude'
```
Now I would like to clip it in order to only keep the bounding-boxes of the cities I am interested in, would you have an idea how I could do it ?
The script takes time to execute as the .nc file covers the world map and is quite heavy (140 Mb).
Would you see a more efficient way of doing it ?
I am a beginner in NetCDF4 and in Python, please forgive me if this question may seem simple to you.
Thanks for your help,
Benoît
[1]: https://www.guillaumedueymes.com/post/shapefiles_country/
Replies (2)
RE: Clipping a NetCDF file with mask and Python - Added by Benoit Decoco over 2 years ago
My mistake, I am in the wrong section but can't seem to delete my own post. I beg your pardon.
RE: Clipping a NetCDF file with mask and Python - Added by Muhammad Ramzan almost 2 years ago
Hello Benoît
are you able to crop the NetCDF files based on your study area.
Thanks