Project

General

Profile

Clipping a NetCDF file with mask and Python

Added by Benoit Decoco over 2 years ago

Hello everybody,

I have monthly estimates of PM 2.5 for 20 consecutive years stored in NetCDF files with a precision of 0.01°x0.01° (i.e. for each 0.01°x0.01° square on the world map, there is an estimate of PM 2.5). The data is here for those interested : https://sites.wustl.edu/acag/datasets/surface-pm2-5/.

I wish to clip the NetCDF files with a geodataframe containing the bounding boxes of the cities I am interested in and then make an average of the PM 2.5 for each city (as they might be in more than one 0.01°x0.01° square).

```
cities longmin latmin longmax latmax WKT

Rennes -1.72 48.08 -1.61 48.15 POLYGON ((-1.72 48.08, -1.61 48.08, -1.61 48.15, -1.72 48.15, -1.72 48.08))

Bursa 29.01 40.17 29.15 40.23 POLYGON ((29.01 40.17, 29.15 40.17, 29.15 40.23, 29.01 40.23, 29.01 40.17))

Copenhagen 12.45 55.61 12.67 55.73 POLYGON ((12.45 55.61, 12.67 55.61, 12.67 55.73, 12.45 55.73, 12.45 55.61))

...
```

I was advised to use regionmask and following this model[1] I think I have created a mask with the cities. The code is the following

```
import netCDF4 as nc
import pandas as pd
import geopandas as gpd
import xarray as xr
import numpy as np
import regionmask

  1. Importing the NetCDF file with xarray

ds_temp2 = xr.open_dataset('/path/to/file/V5GL01.HybridPM25.Global.201912-201912.nc')
ds_temp2

  1. creating a geodataframe with the bounded boxes of the cities file

df = pd.read_excel('/Users/lucius/Documents/MiR/Data1/bounding_boxes.xls')
print(df)
df['geometry'] = gpd.GeoSeries.from_wkt(df['geometry'])
gdf = gpd.GeoDataFrame(df, geometry='geometry')

##Creating the mask for the NetCDF file

my_list = list(gdf['cities'])
my_list_unique = set(list(gdf['cities']))
indexes = [my_list.index(x) for x in my_list_unique]
gdf_mask_poly = regionmask.Regions(name = 'cities', numbers = indexes, names = gdf.cities[indexes], abbrevs = gdf.cities[indexes], outlines = list(gdf.geometry.values[i] for i in range(0,gdf.shape0)))
gdf_mask_poly

  1. Applying the mask to the NetCDF file
    mask = gdf_mask_poly.mask(ds_temp2, lat_name='lat', lon_name='lon')
    ```

When running the last line I get
```
KeyError: 'longitude'
```

Now I would like to clip it in order to only keep the bounding-boxes of the cities I am interested in, would you have an idea how I could do it ?

The script takes time to execute as the .nc file covers the world map and is quite heavy (140 Mb).
Would you see a more efficient way of doing it ?

I am a beginner in NetCDF4 and in Python, please forgive me if this question may seem simple to you.

Thanks for your help,

Benoît

[1]: https://www.guillaumedueymes.com/post/shapefiles_country/

Replies (2)

RE: Clipping a NetCDF file with mask and Python - Added by Benoit Decoco over 2 years ago

My mistake, I am in the wrong section but can't seem to delete my own post. I beg your pardon.

RE: Clipping a NetCDF file with mask and Python - Added by Muhammad Ramzan almost 2 years ago

Hello Benoît

are you able to crop the NetCDF files based on your study area.

Thanks

    (1-2/2)