import_binary only importing part of a data set

Added by Matt Thompson about 10 years ago

All,

A colleague is having an issue with CDO's import_binary command and I'm hoping you can help.

Namely, he has the following .ctl file:

dset /gpfsm/dnb33/wchao/TRMM.v7.bin/disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7/%y4%m2/3B42.%y2%m2%d2.%h2z.7.precipitation.bin
options template byteswapped
title TRMM 3B42 V7 three hourly TRMM rainfall
undef -9999.9                                                                  
xdef 1440 linear -179.875 0.25000000                                          
ydef 400  linear -49.8750000 0.25000000                                         
zdef 1 levels 1000                                                              
tdef 46752 linear 00:00Z01jul2005 3hr
* end_time 21:00Z31dec2013 (this_is_comment_line)
vars 1                                                                          
r         0   99 Hourly Rain Rate (mm/hr)                                       
endvars                                                                         

If I open this file in GrADS, I definitely see all the data:

ga-> open 3B42.ctl 
Scanning description file:  3B42.ctl
Data file /gpfsm/dnb33/wchao/TRMM.v7.bin/disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7/%y4%m2/3B42.%y2%m2%d2.%h2z.7.precipitation.bin is open as file 1
LON set to 0 360 
LAT set to -49.875 49.875 
LEV set to 1000 1000 
Time values set: 2005:7:1:0 2005:7:1:0 
E set to 1 1 
ga-> q file
File 1 : TRMM 3B42 V7 three hourly TRMM rainfall
  Descriptor: 3B42.ctl
  Binary: /gpfsm/dnb33/wchao/TRMM.v7.bin/disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7/%y4%m2/3B42.%y2%m2%d2.%h2z.7.precipitation.bin
  Type = Gridded
  Xsize = 1440  Ysize = 400  Zsize = 1  Tsize = 46752  Esize = 1
  Number of Variables = 1
     r  0  99  Hourly Rain Rate (mm/hr)

As you can see, Tsize is 46752 as expected. Now I run CDO:

(1360) $ cdo -f nc4 import_binary 3B42.ctl test.nc4
cdo import_binary: Processed 1 variable ( 5.18s )
(1361) $ cdo sinfon test.nc4
   File format: netCDF4
    -1 : Institut Source   Ttype    Levels Num  Gridsize Num Dtype : Parameter name
     1 : unknown  unknown  instant       1   1    576000   1  F32  : r             
   Grid coordinates :
     1 : lonlat       > size      : dim = 576000  nx = 1440  ny = 400
                        lon       : first = -179.875  last = 179.875  inc = 0.25  degrees_east  circular
                        lat       : first = -49.875  last = 49.875  inc = 0.25  degrees_north
   Vertical coordinates :
     1 : surface                  : 0 
   Time coordinate :  736 steps
     RefTime =  2005-07-01 00:00:00  Units = hours  Calendar = standard
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2005-07-01 00:00:00  2005-07-01 03:00:00  2005-07-01 06:00:00  2005-07-01 09:00:00
  2005-07-01 12:00:00  2005-07-01 15:00:00  2005-07-01 18:00:00  2005-07-01 21:00:00
  2005-07-02 00:00:00  2005-07-02 03:00:00  2005-07-02 06:00:00  2005-07-02 09:00:00
...snip...
  2005-09-30 00:00:00  2005-09-30 03:00:00  2005-09-30 06:00:00  2005-09-30 09:00:00
  2005-09-30 12:00:00  2005-09-30 15:00:00  2005-09-30 18:00:00  2005-09-30 21:00:00
cdo sinfon: Processed 1 variable over 736 timesteps ( 0.01s )

As you can see, CDO only processed 736 of the 46752 timesteps, or ~1.6% of the data (which, admittedly, is already 1.6G). Did I just hit a data limit in CDO? That is, is the file that would be generated just too big (~102 GB), so CDO decides to save me from myself?
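The numbers above can be cross-checked with a little arithmetic. This is only a rough sketch: it assumes uncompressed 4-byte floats (the F32 shown by sinfon) and ignores any netCDF header overhead, so the sizes are estimates rather than what CDO would actually write.

```python
# Rough size check using the values from the ctl file and the sinfon output.
NX, NY = 1440, 400        # xdef / ydef counts from the ctl file
BYTES_PER_VALUE = 4       # assumption: uncompressed F32, as reported by sinfon
TDEF_STEPS = 46752        # timestep count from the tdef line
IMPORTED_STEPS = 736      # timestep count reported by "cdo sinfon"
STEP_HOURS = 3            # tdef increment (3hr)

grid_points = NX * NY
bytes_per_step = grid_points * BYTES_PER_VALUE
fraction = IMPORTED_STEPS / TDEF_STEPS
days_imported = IMPORTED_STEPS * STEP_HOURS / 24


def gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


print(f"grid points per timestep: {grid_points}")
print(f"imported fraction:        {fraction:.2%}")
print(f"days imported:            {days_imported:g}")
print(f"imported size (GiB):      {gib(bytes_per_step * IMPORTED_STEPS):.2f}")
print(f"full size (GiB):          {gib(bytes_per_step * TDEF_STEPS):.1f}")
```

The 736 steps correspond to exactly 92 days, i.e. July, August and September 2005 (31 + 31 + 30), which matches the first and last timestamps in the sinfon listing: the import stopped at a month boundary.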

I ask because I made up a couple of other .ctl files with 'tdef 500' and 'tdef 1000' and, as suspected, I get 500 timesteps with the former and 736 timesteps with the latter.

Thanks,
Matt


Replies (2)

RE: import_binary only importing part of a data set - Added by Uwe Schulzweida about 10 years ago

Hi Matt,

The GrADS ctl file uses the template option, which means the input data is quite likely distributed over many input files. Unfortunately, CDO does not report an error if one of the input files is not available (I have changed this for the next release). You can check it with the CDO option -v:

cdo -v -f nc4 import_binary 3B42.ctl test.nc4

To check it with GrADS you have to read the data. The data will be read only if you access it:

ga-> q file
ga-> set t 40000
ga-> d r

Cheers,
Uwe
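Because the template option maps each timestep to its own file, one way to find a gap before running the import is to expand the dset template for every timestep and test whether the file exists. The following is a minimal sketch, not part of CDO or GrADS: the helper names expand_template and first_missing are hypothetical, and it handles only the substitution codes that appear in this ctl file (%y4, %y2, %m2, %d2, %h2).

```python
import os
from datetime import datetime, timedelta


def expand_template(template, when):
    """Fill in the GrADS template codes used in this ctl file for one time."""
    replacements = {
        "%y4": f"{when.year:04d}",
        "%y2": f"{when.year % 100:02d}",
        "%m2": f"{when.month:02d}",
        "%d2": f"{when.day:02d}",
        "%h2": f"{when.hour:02d}",
    }
    for code, value in replacements.items():
        template = template.replace(code, value)
    return template


def first_missing(template, start, step, count, exists=os.path.exists):
    """Return (timestep_number, path) for the first missing file, else None.

    Timestep numbers are 1-based, like the GrADS t index.
    """
    for i in range(count):
        path = expand_template(template, start + i * step)
        if not exists(path):
            return i + 1, path
    return None


if __name__ == "__main__":
    # dset and tdef values copied from the ctl file in the first post.
    dset = ("/gpfsm/dnb33/wchao/TRMM.v7.bin/disc2.nascom.nasa.gov/data/TRMM/"
            "Gridded/3B42_V7/%y4%m2/3B42.%y2%m2%d2.%h2z.7.precipitation.bin")
    gap = first_missing(dset, datetime(2005, 7, 1, 0), timedelta(hours=3), 46752)
    print("no gaps found" if gap is None else f"timestep {gap[0]} missing: {gap[1]}")
```

The exists argument is there only so the check can be tried out without access to the real filesystem; by default it uses os.path.exists.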

RE: import_binary only importing part of a data set - Added by Matt Thompson about 10 years ago

As always, you are right, Uwe:

cdo import_binary: Opening file: /gpfsm/dnb33/wchao/TRMM.v7.bin/disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7/200509/3B42.050930.21z.7.precipitation.bin
cdo import_binary:  Reading timestep: 736  2005-09-30 21:00:00
cdo import_binary: Opening file: /gpfsm/dnb33/wchao/TRMM.v7.bin/disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7/200510/3B42.051001.00z.7.precipitation.bin
cdo import_binary: Could not open file: /gpfsm/dnb33/wchao/TRMM.v7.bin/disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7/200510/3B42.051001.00z.7.precipitation.bin
cdo import_binary: Processed 1 variable ( 10.93s )
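With tens of thousands of timesteps the verbose output gets long, so it can help to filter it for the failure message. A small sketch, assuming the verbose output has been saved to a text file and that failures appear as "Could not open file:" lines exactly as in the excerpt above; the function name unopened_files is hypothetical.

```python
def unopened_files(log_text):
    """Return the paths from all 'Could not open file:' lines in the log text."""
    marker = "Could not open file:"
    paths = []
    for line in log_text.splitlines():
        if marker in line:
            # Everything after the marker is the path that failed to open.
            paths.append(line.split(marker, 1)[1].strip())
    return paths
```

Calling unopened_files(open("import.log").read()), where import.log is whatever file the output was captured to, would return the list of paths CDO failed to open.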

I suppose that in my GrADS testing, where I was using sdfwrite, I avoided this because of how I was reading the data (and setting t).

Thanks,
Matt
