Land masking many .nc files using Python subprocess loop and CDO
Added by Michael Pletcher over 3 years ago
Hi everyone,
I am attempting to mask land for a large quantity of .nc files using Python's subprocess module and CDO, but I keep receiving this error: 'cdo (Abort): Unprocessed Input, could not process all Operators/Files'. My code is below.
import datetime as dt
from datetime import datetime
import subprocess
import glob

directory = '/uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG/July/gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGHH.06/2020/183/*.nc4'
hourly = ("00.V06B.HDF5.nc4", "20.V06B.HDF5.nc4", "40.V06B.HDF5.nc4", "60.V06B.HDF5.nc4", "80.V06B.HDF5.nc4")
curDT = datetime(2020, 7, 1, 0)
endDT = datetime(2020, 7, 1, 23)
while curDT < endDT:
    for filepath in glob.iglob(directory):
        if filepath.endswith(hourly):
            mask = subprocess.call("cdo -expr,'topo = ((topo<0.0)) ? 1.0 : topo/0.0' /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG/July/gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGHH.06/2020/183/3B*.nc4 /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG_landmasks/{0}_IMERG_mask.nc".format(curDT.strftime('%m%d%H00')), shell = True)
            land_masking = subprocess.call("cdo -mul /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG_landmasks/*.nc /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG/July/gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGHH.06/2020/183/3B*.nc4 {0}_land_masked_IMERG.nc4.nc4".format(curDT.strftime('%m%d%H00')), shell = True)
    curDT = curDT + dt.timedelta(hours = 0.5)
Is this an issue with the formatting of the code, or am I not understanding how this loop is working correctly? Any help is greatly appreciated.
Thank you,
Michael
Replies (7)
RE: Land masking many .nc files using Python subprocess loop and CDO - Added by Karin Meier-Fleischer over 3 years ago
Hi Michael,
if you need to access a huge number of files you may have exceeded the maximum length of arguments in the subprocess command.
You can retrieve the system max. value with
getconf ARG_MAX
-Karin
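The limit Karin mentions can also be read from inside Python and compared with the size of a command line. The following is only a minimal sketch: `fits_in_arg_max` is a hypothetical helper, the file paths are made up, and the byte count is a rough estimate, since the operating system also counts environment variables and some per-argument overhead against the limit.

```python
import os

def fits_in_arg_max(argv):
    """Roughly check whether an argument list fits under the system ARG_MAX."""
    # Reads the same system setting that `getconf ARG_MAX` reports.
    arg_max = os.sysconf("SC_ARG_MAX")
    if arg_max <= 0:
        # No definite limit reported; assume the command fits.
        return True
    # Each argument is passed as a NUL-terminated byte string.
    needed = sum(len(os.fsencode(arg)) + 1 for arg in argv)
    return needed < arg_max

# Hypothetical example: 24 input files with long paths.
argv = ["cdo", "-mul"] + [f"/some/long/path/file_{i:02d}.nc4" for i in range(24)]
print(fits_in_arg_max(argv))
```

With only a few dozen files, the estimated size is normally far below typical limits, so a check like this mostly helps to rule the argument length in or out as the cause.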
RE: Land masking many .nc files using Python subprocess loop and CDO - Added by Michael Pletcher over 3 years ago
Hi Karin,
I have attempted to fix this issue by adding getconf ARG_MAX to the beginning of my subprocess command as shown below:
mask = subprocess.call("getconf ARG_MAX cdo -expr,'topo = ((topo<0.0)) ? 1.0 : topo/0.0' /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG/July/gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGHH.06/2020/183/3B*.nc4 /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG_landmasks/{0}_IMERG_mask.nc".format(curDT.strftime('%m%d%H00')), shell = True)
land_masking = subprocess.call("getconf ARG_MAX cdo -mul /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG_landmasks/*.nc /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG/July/gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGHH.06/2020/183/*.nc4 {0}_land_masked_IMERG.nc4.nc4".format(curDT.strftime('%m%d%H00')), shell = True)
And the following is what the command output:
Usage: getconf [-v specification] variable_name [pathname]
getconf -a [pathname]
Usage: getconf [-v specification] variable_name [pathname]
getconf -a [pathname]
The actual CDO operation did not occur, only the above text was output. In addition, the maximum number of files I will be processing is 24. Is there anything else I can do differently here?
Thanks,
Michael
RE: Land masking many .nc files using Python subprocess loop and CDO - Added by Michael Pletcher over 3 years ago
Sorry I realized that I didn't format the above response correctly. Here's my updated reply.
Hi Karin,
I have attempted to fix this issue by adding getconf ARG_MAX to the beginning of my subprocess command as shown below:
mask = subprocess.call("getconf ARG_MAX cdo -expr,'topo = ((topo<0.0)) ? 1.0 : topo/0.0' /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG/July/gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGHH.06/2020/183/3B*.nc4 /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG_landmasks/{0}_IMERG_mask.nc".format(curDT.strftime('%m%d%H00')), shell = True)
land_masking = subprocess.call("getconf ARG_MAX cdo -mul /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG_landmasks/*.nc /uufs/chpc.utah.edu/common/home/zpu-group10/pu03/mpletch/thesis/IMERG/July/gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGHH.06/2020/183/*.nc4 {0}_land_masked_IMERG.nc4.nc4".format(curDT.strftime('%m%d%H00')), shell = True)
And the following is what each command output:
Usage: getconf [-v specification] variable_name [pathname]
getconf -a [pathname]
Usage: getconf [-v specification] variable_name [pathname]
getconf -a [pathname]
The actual CDO operation did not occur, only the above text was output. In addition, the maximum number of files I will be processing is 24. Is there anything else I can do differently here?
Thanks,
Michael
RE: Land masking many .nc files using Python subprocess loop and CDO - Added by Karin Meier-Fleischer over 3 years ago
Hi Michael,
the getconf command is a shell command that shows the maximum length of a command line on your system. Don't use it in your script; it is not an environment variable setting.
To see the value, run the following in a terminal:
getconf ARG_MAX
The huge number of input files seems to exceed this max. length limit and gives an error.
-Karin
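If the value is wanted from within a script anyway, `getconf` has to be run as its own command rather than prepended to the CDO call. A minimal sketch, assuming `getconf` is on the PATH; `read_arg_max` is a hypothetical helper name:

```python
import shutil
import subprocess

def read_arg_max():
    """Run `getconf ARG_MAX` as a separate command and return the value as an int."""
    if shutil.which("getconf") is None:
        # getconf is not installed on this system.
        return None
    result = subprocess.run(
        ["getconf", "ARG_MAX"],  # a command of its own, not a prefix to cdo
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())

print(read_arg_max())
```

Prepending `getconf ARG_MAX` to the CDO command makes the shell pass the whole CDO command line to `getconf` as extra arguments, which is why only the usage message was printed and CDO never ran.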
RE: Land masking many .nc files using Python subprocess loop and CDO - Added by Michael Pletcher over 3 years ago
Hi Karin,
When I ran getconf ARG_MAX in my terminal, the number that I received was 4611686018427387903. Is this an issue?
Thank you,
Michael
RE: Land masking many .nc files using Python subprocess loop and CDO - Added by Karin Meier-Fleischer over 3 years ago
Now I see what the problem is; I was on the wrong track. You are trying to multiply multiple files in one call, which is not possible. The operator mul takes only two input fields.
See https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf#subsection.2.7.4
To multiply all fields you have to use a loop.
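One way to read this advice is one `cdo mul` call per input file, with exactly two inputs and one output in each call. The sketch below is an assumption about the intended workflow, not code from the thread: `build_mul_commands`, `run_all`, and the `land_masked_` output prefix are hypothetical, and it assumes a single mask file that is applied to every input.

```python
import glob
import os
import subprocess

def build_mul_commands(mask_path, input_paths, out_dir):
    """Return one `cdo mul` argument list per input file."""
    commands = []
    for in_path in sorted(input_paths):
        base = os.path.basename(in_path)
        out_path = os.path.join(out_dir, "land_masked_" + base)
        # mul takes exactly two inputs followed by one output.
        commands.append(["cdo", "mul", mask_path, in_path, out_path])
    return commands

def run_all(mask_path, pattern, out_dir):
    """Expand the wildcard in Python and run each command separately."""
    for cmd in build_mul_commands(mask_path, glob.glob(pattern), out_dir):
        # A list without shell=True is passed to cdo as-is, so the shell
        # cannot expand a wildcard into extra input files.
        subprocess.run(cmd, check=True)
```

Expanding the pattern with `glob` in Python, rather than leaving `3B*.nc4` or `*.nc` inside the command string for the shell to expand, keeps the number of files handed to each operator explicit.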
RE: Land masking many .nc files using Python subprocess loop and CDO - Added by Michael Pletcher over 3 years ago
Hi Karin,
I'm not sure if you fully read through my code. I am already using nested loops in my Python code, and each time the loop runs it only takes two fields.