FMI data - split by initial time

Added by Christoph Haller about 6 years ago

Hi,
I want to use FMI data sets, and need to split these files by initial time.
Files can be found here, e.g.:
http://fmi-opendata-rcrhirlam-pressure-grib.s3-website-eu-west-1.amazonaws.com/?prefix=2018/04/04/06/

file numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 has 55 timesteps

I need 55 files, one for every timestep.

Using grib tools from ECMWF, I can do the job by:
grib_copy -w stepRange:i=0 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep00.grb2
grib_copy -w stepRange:i=1 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep01.grb2
grib_copy -w stepRange:i=2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep02.grb2
grib_copy -w stepRange:i=3 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep03.grb2
....
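
The repeated grib_copy calls above could be written as a loop. This is only a sketch: it assumes the 55 timesteps correspond to steps 0 to 54, and step_file is a hypothetical helper name introduced here to build the output file names in the same pattern as above.

```shell
#!/bin/sh
# Sketch: one grib_copy call per forecast step, mirroring the commands above.
in=numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2
base=${in%.grb2}

# step_file (hypothetical helper): output name for a given step, zero-padded.
step_file() {
    printf '%sstep%02d.grb2\n' "$base" "$1"
}

step=0
# Assumption: 55 timesteps means steps 0..54.
while [ "$step" -le 54 ]; do
    # Only call grib_copy when the input file is actually present.
    if [ -f "$in" ]; then
        grib_copy -w stepRange:i="$step" "$in" "$(step_file "$step")"
    fi
    step=$((step + 1))
done
```

If the step values in a file are not contiguous, the loop bounds would need to be adjusted accordingly.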

The idea is to let cdo do the same job by:
cdo -splitsel,1 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 outfile-GeopHeight-20180404T06i

But it does not work. Split files have only information about the first level 5000.
Even "cdo info" only lists level 5000 and issues this warning:
Warning (gribapiScanTimestep) : Record 56 (name=gh id=5.3.0 lev1=10000000 lev2=0) timestep 56: Parameter not defined at timestep 1!
But it correctly reports 55 timesteps "cdo info: Processed 241309805 values from 1 variable over 55 timesteps ( 6.54s )"

Where is the problem?
Is my intended "cdo -splitsel,1" wrong or incomplete?
Is something wrong with the grib2 file itself?

TIA
Christoph


Replies (12)

RE: FMI data - split by initial time - Added by Karin Meier-Fleischer about 6 years ago

Hi Christoph,

it seems that the last timestep in your data file is corrupted. You can select the 55 timesteps correctly and use splitsel as expected with

cdo -f grb2 -splitsel,1 -seltimestep,1/55 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 split_time_step_

It will create the 55 files

split_time_step_000000.grb2
...
split_time_step_000054.grb2

-Karin

RE: FMI data - split by initial time - Added by Christoph Haller about 6 years ago

Hi Karin,

Thanks for your reply.
Unfortunately, the problem persists when I use your "cdo -f grb2 -splitsel,1 -seltimestep,1/55" suggestion.
I still get:
Warning (gribapiScanTimestep) : Record 56 (name=gh id=5.3.0 lev1=10000000 lev2=0) timestep 56: Parameter not defined at timestep 1!

When I use "cdo -f grb2 -splitsel,1 -seltimestep,1/55" on files (without pressure level information) from
http://fmi-opendata-rcrhirlam-surface-grib.s3-website-eu-west-1.amazonaws.com/?prefix=2018/04/04/06/
it works (no warning issued).

I pretty much doubt the last timestep is corrupted.
I have checked the pressure level files with Panoply, and there are only 55 timesteps, but 11 levels.
To me, it looks like cdo misinterprets the next pressure level of the same timestep as the next timestep, and skips/ignores the following 10 levels.
This conclusion is confirmed when I check the resulting files, they only have one pressure level instead of 11.

Did you download one pressure level file for testing? I know one is 3.8 GB, but it downloads really quickly.
I would very much appreciate if you could see for yourself.

Thanks for your time
Christoph

RE: FMI data - split by initial time - Added by Karin Meier-Fleischer about 6 years ago

Hi Christoph,

I downloaded the file mentioned above and used it as described above.

Which cdo version are you using? I use:

> cdo -V

Climate Data Operators version 1.9.2 (http://mpimet.mpg.de/cdo)
Compiled: by root on Karins-MBP-DKRZ (x86_64-apple-darwin16.7.0) Jan 19 2018 16:44:24
CXX Compiler: /usr/bin/clang++ -std=gnu++11 -pipe -Os -stdlib=libc++ -arch x86_64  -D_THREAD_SAFE -pthread
CXX version : unknown
C Compiler: /usr/bin/clang -pipe -Os -arch x86_64  -D_THREAD_SAFE -pthread
C version : unknown
F77 Compiler:  -pipe -Os
F77 version : ./configure: line 21776: -V: command not found
Features: 16GB DATA PTHREADS HDF5 NC4/HDF5 OPeNDAP SZ UDUNITS2 PROJ.4 XML2 MAGICS CURL FFTW3 SSE4_1
Libraries: HDF5/1.10.1 proj/4.93 xml2/2.9.7 curl/7.57.0
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5 
     CDI library version : 1.9.2 of Jan 19 2018 16:43:43
 CGRIBEX library version : 1.9.0 of Sep 29 2017 10:16:02
GRIB_API library version : 2.5.0
  NetCDF library version : 4.4.1.1 of Nov 25 2017 11:19:57 $
    HDF5 library version : 1.10.1
 SERVICE library version : 1.4.0 of Jan 19 2018 16:43:41
   EXTRA library version : 1.4.0 of Jan 19 2018 16:43:39
     IEG library version : 1.4.0 of Jan 19 2018 16:43:39
    FILE library version : 1.8.3 of Jan 19 2018 16:43:39

-Karin

RE: FMI data - split by initial time - Added by Christoph Haller about 6 years ago

Hi Karin,

Ouch, I am out-of-date:

Climate Data Operators version 1.9.1 (http://mpimet.mpg.de/cdo)
Compiled: by christoph2 on iket-tarragona (x86_64-unknown-linux-gnu) Nov 21 2017 10:01:16
CXX Compiler: g++ -std=gnu++11 -g -O2 -fopenmp
CXX version : g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
C Compiler: gcc -std=gnu99 -g -O2 -fopenmp
C version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Features: DATA PTHREADS OpenMP3 SSE2
Libraries:
Filetypes: srv ext ieg grb1 grb2 nc1
CDI library version : 1.9.1 of Nov 21 2017 09:58:33
CGRIBEX library version : 1.9.0 of Sep 29 2017 10:16:02
GRIB_API library version : 1.10.4
NetCDF library version : 4.3.3.1 of Aug 3 2015 17:12:26 $
SERVICE library version : 1.4.0 of Nov 21 2017 09:58:21
EXTRA library version : 1.4.0 of Nov 21 2017 09:58:16
IEG library version : 1.4.0 of Nov 21 2017 09:58:19
FILE library version : 1.8.3 of Nov 21 2017 09:58:16

will upgrade and report result ASAP.
Thanks, Christoph

RE: FMI data - split by initial time - Added by Christoph Haller about 6 years ago

Hi Karin,

now I am up-to-date

cdo -V

Climate Data Operators version 1.9.3 (http://mpimet.mpg.de/cdo)
Compiled: by jruser on iket-swiss-linux (x86_64-unknown-linux-gnu) Apr 23 2018 13:39:42
CXX Compiler: g++ -std=gnu++11 -g -O2 -fopenmp
CXX version : g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
C Compiler: gcc -std=gnu99 -g -O2 -fopenmp
C version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
F77 Compiler: gfortran -g -O2
F77 version : GNU Fortran (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Features: 62GB C++11 Fortran DATA PTHREADS OpenMP3 NC4/HDF5/threadsafe OPeNDAP SSE2
Libraries:
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c
CDI library version : 1.9.3 of Apr 23 2018 13:38:59
CGRIBEX library version : 1.9.0 of Jan 22 2018 09:24:03
GRIB_API library version : 2.7.0
NetCDF library version : 4.1.3 of Feb 24 2014 21:05:37 $
HDF5 library version : 1.8.11 threadsafe
EXSE library version : 1.4.0 of Apr 23 2018 13:38:53
FILE library version : 1.8.3 of Apr 23 2018 13:38:51

this file
http://s3-eu-west-1.amazonaws.com/fmi-opendata-rcrhirlam-surface-grib/2018/04/04/06/numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2
works, but this file
http://s3-eu-west-1.amazonaws.com/fmi-opendata-rcrhirlam-pressure-grib/2018/04/06/06/numerical-hirlam74-forecast-GeopHeight-20180406T060000Z.grb2
still gives
Warning (gribapiScanTimestep): Record 56 (name=gh id=5.3.0 lev1=10000000 lev2=0) timestep 56: Parameter not defined at timestep 1!

Is it possible my NetCDF library is too old?
I am puzzled.
Christoph

RE: FMI data - split by initial time - Added by Karin Meier-Fleischer about 6 years ago

See my first answer.

-Karin

RE: FMI data - split by initial time - Added by Christoph Haller about 6 years ago

Hi Karin,

I have also updated my NetCDF library

cdo -V

Climate Data Operators version 1.9.3 (http://mpimet.mpg.de/cdo)
Compiled: by jruser on iket-swiss-linux (x86_64-unknown-linux-gnu) Apr 24 2018 13:01:29
CXX Compiler: g++ -std=gnu++11 -g -O2 -fopenmp
CXX version : g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
C Compiler: gcc -std=gnu99 -g -O2 -fopenmp
C version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
F77 Compiler: gfortran -g -O2
F77 version : GNU Fortran (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Features: 62GB C++11 Fortran DATA PTHREADS OpenMP3 NC4/HDF5/threadsafe OPeNDAP SSE2
Libraries:
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5
CDI library version : 1.9.3 of Apr 24 2018 13:00:44
CGRIBEX library version : 1.9.0 of Jan 22 2018 09:24:03
GRIB_API library version : 2.7.0
NetCDF library version : 4.6.1 of Apr 24 2018 12:53:10 $
HDF5 library version : 1.8.11 threadsafe
EXSE library version : 1.4.0 of Apr 24 2018 13:00:40
FILE library version : 1.8.3 of Apr 24 2018 13:00:48

and still
cdo -f grb2 -splitsel,1 -seltimestep,1/12

on the pressure files (3.8 GB) eg
http://s3-eu-west-1.amazonaws.com/fmi-opendata-rcrhirlam-pressure-grib/2018/04/16/00/numerical-hirlam74-forecast-GeopHeight-20180416T000000Z.grb2

drops the pressure levels at 1000, 925, 850, 700, 500, 400, 300, 250, 200, 100 hPa.
Only the one level at 50 hPa is kept.

I have reduced the number of timesteps to select to 12,
now no warning is issued, but 10 levels are definitely missing.

Is it possible "cdo" cannot deal with more than one level?
No, there are operators like "sellevel" and "sellevidx", so it should.
Sorry, but now I tend to consider this a bug,
because Panoply reads correctly (please see also my first reply).

Christoph

RE: FMI data - split by initial time - Added by Ralf Mueller about 6 years ago

I am currently downloading the input file, but this error usually occurs when the first GRIB record for a certain variable does not belong to the first timestep.

In GRIB, records are independent, whereas in NetCDF variables share common dimensions. CDO/CDI has to bridge this gap between the two standards.

I will have a closer look whenever the download is complete.

cheers
ralf

RE: FMI data - split by initial time - Added by Ralf Mueller about 6 years ago

Can confirm this behaviour - the different pressure levels (50 up to 1000 hPa) are not correctly recognized by the current CDO release 1.9.3.

RE: FMI data - split by initial time - Added by Uwe Schulzweida about 6 years ago

The GRIB records in your file are sorted by levels. CDO expects the records sorted by time.
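
One way to check the record order before handing a file to cdo is to list the step of every record and see whether it ever decreases. A minimal sketch, assuming the ecCodes/GRIB API tool grib_get is installed; is_time_sorted is a hypothetical helper name, and the file name is the one discussed in this thread.

```shell
#!/bin/sh
# is_time_sorted (hypothetical helper): reads one step value per line on stdin
# and succeeds only if the values never decrease from one record to the next.
is_time_sorted() {
    awk 'NR > 1 && $1 < prev { bad = 1 } { prev = $1 } END { exit bad + 0 }'
}

in=numerical-hirlam74-forecast-GeopHeight-20180416T000000Z.grb2

# Run the check only if grib_get and the input file are available.
if command -v grib_get >/dev/null 2>&1 && [ -f "$in" ]; then
    if grib_get -p step:i "$in" | is_time_sorted; then
        echo "records are ordered by step"
    else
        echo "records are not ordered by step; re-sort before using cdo"
    fi
fi
```

For a file sorted by level, the step values restart at the beginning of each level, so the check would report the file as not ordered by step.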

RE: FMI data - split by initial time - Added by Christoph Haller about 6 years ago

Thanks to Uwe Schulzweida for resolving the case.
Will continue using the grib tools from ECMWF.
And, of course, thanks to the others who spent time on the case.

RE: FMI data - split by initial time - Added by Ralf Mueller about 6 years ago

You can use grib_copy to re-sort the data by timestep; after this, cdo works:

<ram@melian:~/local/data/cdo>
% grib_copy -B'step:i asc' numerical-hirlam74-forecast-GeopHeight-20180416T000000Z.grb2 out.grib
<ram@melian:~/local/data/cdo>
% cdo sinfov out.grib                                                                               
   File format : GRIB2
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : ECMWF    unknown  v instant      11   1   4387451   1  P24  : gh            
   Grid coordinates :
     1 : lonlat                   : points=4387451 (4633x947)
                              lon : 224.9737 to 539.9913 by 0.068009 degrees_east
                              lat : 25.64766 to 89.99931 by 0.068025 degrees_north
   Vertical coordinates :
     1 : pressure                 : levels=11
                             plev : 100000 to 5000 Pa
   Time coordinate :  unlimited steps
     RefTime =  2018-04-16 00:00:00  Units = hours  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2018-04-16 00:00:00  2018-04-16 01:00:00  2018-04-16 02:00:00  2018-04-16 03:00:00
  2018-04-16 04:00:00  2018-04-16 05:00:00  2018-04-16 06:00:00  2018-04-16 07:00:00
  2018-04-16 08:00:00  2018-04-16 09:00:00  2018-04-16 10:00:00  2018-04-16 11:00:00
  2018-04-16 12:00:00  2018-04-16 13:00:00  2018-04-16 14:00:00  2018-04-16 15:00:00
  2018-04-16 16:00:00  2018-04-16 17:00:00  2018-04-16 18:00:00  2018-04-16 19:00:00
  2018-04-16 20:00:00  2018-04-16 21:00:00  2018-04-16 22:00:00  2018-04-16 23:00:00
  2018-04-17 00:00:00  2018-04-17 01:00:00  2018-04-17 02:00:00  2018-04-17 03:00:00
  2018-04-17 04:00:00  2018-04-17 05:00:00  2018-04-17 06:00:00  2018-04-17 07:00:00
  2018-04-17 08:00:00  2018-04-17 09:00:00  2018-04-17 10:00:00  2018-04-17 11:00:00
  2018-04-17 12:00:00  2018-04-17 13:00:00  2018-04-17 14:00:00  2018-04-17 15:00:00
  2018-04-17 16:00:00  2018-04-17 17:00:00  2018-04-17 18:00:00  2018-04-17 19:00:00
  2018-04-17 20:00:00  2018-04-17 21:00:00  2018-04-17 22:00:00  2018-04-17 23:00:00
  2018-04-18 00:00:00  2018-04-18 01:00:00  2018-04-18 02:00:00  2018-04-18 03:00:00
  2018-04-18 04:00:00  2018-04-18 05:00:00  2018-04-18 06:00:00
cdo sinfon: Processed 1 variable over 55 timesteps [1.24s 68MB]
<ram@melian:~/local/data/cdo>
% cdo showlevel out.grib 
 100000 92500 85000 70000 50000 40000 30000 25000 20000 10000 5000
cdo showlevel: Processed 1 variable [0.15s 68MB]
<ram@melian:~/local/data/cdo>
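
Putting the re-sort and the split from earlier in this thread together, a sketch could look like the following. The function name resort_and_split and the output prefix are assumptions made for illustration; the grib_copy and cdo options are the ones already shown in this thread.

```shell
#!/bin/sh
# resort_and_split (hypothetical helper):
#   $1 = input GRIB2 file, $2 = re-sorted intermediate file, $3 = output prefix
resort_and_split() {
    # Re-sort the records by step first, then write one file per timestep.
    grib_copy -B 'step:i asc' "$1" "$2" &&
        cdo -f grb2 -splitsel,1 "$2" "$3"
}

in=numerical-hirlam74-forecast-GeopHeight-20180416T000000Z.grb2

# Only run when the input file is present.
if [ -f "$in" ]; then
    resort_and_split "$in" out.grib split_time_step_
fi
```

The intermediate file is as large as the input (about 3.8 GB here), so it may be worth deleting it once the split has finished.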
