Project

General

Profile

Clipping a NetCDF file with regionmask and Python

Added by Benoit Decoco almost 2 years ago

Hello everybody,

I have monthly estimates of PM 2.5 for 20 consecutive years stored in NetCDF files with a precision of 0.01°x0.01° (i.e. for each 0.01°x0.01° square on the world map, there is an estimate of PM 2.5). The data is here for those interested : https://sites.wustl.edu/acag/datasets/surface-pm2-5/.

I wish to clip the NetCDF files with a geodataframe containing the bounding boxes of the cities I am interested in and then make an average of the PM 2.5 for each city (as they might be in more than one 0.01°x0.01° square). The file containing my bounding boxes has the following structure :


cities      longmin latmin longmax latmax    WKT

Rennes      -1.72   48.08 -1.61    48.15     POLYGON ((-1.72 48.08, -1.61 48.08, -1.61 48.15, -1.72 48.15, -1.72 48.08))

Bursa       29.01   40.17 29.15    40.23     POLYGON ((29.01 40.17, 29.15 40.17, 29.15 40.23, 29.01 40.23, 29.01 40.17))

Copenhagen  12.45   55.61 12.67    55.73     POLYGON ((12.45 55.61, 12.67 55.61, 12.67 55.73, 12.45 55.73, 12.45 55.61))

...

I was advised to use regionmask and following this model1 I think I have created a mask with the cities. The code is the following

import netCDF4 as nc
import pandas as pd
import geopandas as gpd
import xarray as xr
import numpy as np
import regionmask

    ##Importing the NetCDF file with xarray

ds_temp2 = xr.open_dataset('/path/to/file/V5GL01.HybridPM25.Global.201912-201912.nc')
ds_temp2

    ##creating a geodataframe with the bounded boxes of the cities file

df = pd.read_excel('/Users/lucius/Documents/MiR/Data1/bounding_boxes.xls')
print(df)
df['geometry'] = gpd.GeoSeries.from_wkt(df['geometry'])
gdf = gpd.GeoDataFrame(df, geometry='geometry')

 ##Creating the mask for the NetCDF file

my_list = list(gdf['cities'])
my_list_unique = set(list(gdf['cities']))
indexes = [my_list.index(x) for x in my_list_unique]
gdf_mask_poly = regionmask.Regions(name = 'cities', numbers = indexes, names = gdf.cities[indexes], abbrevs = gdf.cities[indexes], outlines = list(gdf.geometry.values[i] for i in range(0,gdf.shape0)))
gdf_mask_poly

 ##Applying the mask to the NetCDF file
    mask = gdf_mask_poly.mask(ds_temp2, lat_name='lat', lon_name='lon')

Now I would like to clip it in order to only keep the bounding-boxes of the cities I am interested in, would you have an idea how I could do it ?

The script takes time to execute as the .nc file covers the world map and is quite heavy (140 Mb).
Would you see a more efficient way of doing it ?

I am a beginner in NetCDF4 and in Python, please forgive me if this question may seem simple to you.

Thanks for your help,

Benoît

[1]: https://www.guillaumedueymes.com/post/shapefiles_country/


Replies (1)

    (1-1/1)