FMI data - split by initial time
Added by Christoph Haller almost 7 years ago
Hi,
I want to use FMI data sets, and need to split these files by initial time.
Files can be found here, eg:
http://fmi-opendata-rcrhirlam-pressure-grib.s3-website-eu-west-1.amazonaws.com/?prefix=2018/04/04/06/
file numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 has 55 timesteps
I need 55 files, one for every timestep.
Using grib tools from ECMWF, I can do the job by:
grib_copy -w stepRange:i=0 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep00.grb2
grib_copy -w stepRange:i=1 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep01.grb2
grib_copy -w stepRange:i=2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep02.grb2
grib_copy -w stepRange:i=3 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 numerical-hirlam74-forecast-GeopHeight-20180404T060000Zstep03.grb2
....
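The repeated commands above follow one pattern, so they could be generated in a loop. This is a minimal sketch, assuming a POSIX shell and forecast steps numbered 0 to 54 (55 timesteps). The function name print_split_cmds is hypothetical, and it only prints the commands so they can be checked before anything is run.

```shell
#!/bin/sh
# Print one grib_copy command per forecast step (dry run, nothing is executed).
# Usage: print_split_cmds INPUT.grb2 LAST_STEP
print_split_cmds() {
    in=$1
    last=$2
    base=${in%.grb2}          # strip the .grb2 suffix
    i=0
    while [ "$i" -le "$last" ]; do
        # zero-pad the step to two digits for the output file name
        out=$(printf '%sstep%02d.grb2' "$base" "$i")
        printf 'grib_copy -w stepRange:i=%d %s %s\n' "$i" "$in" "$out"
        i=$((i + 1))
    done
}

# Assumption: the file holds steps 0..54, as in the question above.
print_split_cmds numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 54
```

Piping the printed lines to sh would execute them. Keeping the function as a dry run makes it easy to inspect the file names first.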
The idea is to let cdo do the same job by:
cdo -splitsel,1 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 outfile-GeopHeight-20180404T06i
But it does not work. The split files contain only the first level (5000).
Even "cdo info" lists only level 5000 and issues this warning:
Warning (gribapiScanTimestep) : Record 56 (name=gh id=5.3.0 lev1=10000000 lev2=0) timestep 56: Parameter not defined at timestep 1!
But it correctly reports 55 timesteps "cdo info: Processed 241309805 values from 1 variable over 55 timesteps ( 6.54s )"
Where is the problem?
Is my intended "cdo -splitsel,1" wrong or incomplete?
Is something wrong with the grib2 file itself?
TIA
Christoph
Replies (12)
RE: FMI data - split by initial time - Added by Karin Meier-Fleischer almost 7 years ago
Hi Christoph,
it seems that the last timestep in your data file is corrupted. You can select the 55 timesteps correctly and use splitsel as expected with
cdo -f grb2 -splitsel,1 -seltimestep,1/55 numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2 split_time_step_
It will create the 55 files
split_time_step_000000.grb2 ... split_time_step_000054.grb2
-Karin
RE: FMI data - split by initial time - Added by Christoph Haller almost 7 years ago
Hi Karin,
Thanks for your reply.
Unfortunately, the problem persists when I use your "cdo -f grb2 -splitsel,1 -seltimestep,1/55" suggestion.
I still get:
Warning (gribapiScanTimestep) : Record 56 (name=gh id=5.3.0 lev1=10000000 lev2=0) timestep 56: Parameter not defined at timestep 1!
When I use "cdo -f grb2 -splitsel,1 -seltimestep,1/55" on files (without pressure level information) from
http://fmi-opendata-rcrhirlam-surface-grib.s3-website-eu-west-1.amazonaws.com/?prefix=2018/04/04/06/
it works (no warning issued).
I very much doubt that the last timestep is corrupted.
I have checked the pressure level files with Panoply, and there are only 55 timesteps, but 11 levels.
To me, it looks like cdo misinterprets the next pressure level of the same timestep as the next timestep, and skips/ignores the following 10 levels.
This conclusion is confirmed when I check the resulting files: they have only one pressure level instead of 11.
Did you download one pressure level file for testing? I know one is 3.8 GB, but it downloads really quickly.
I would very much appreciate it if you could see for yourself.
Thanks for your time
Christoph
RE: FMI data - split by initial time - Added by Karin Meier-Fleischer almost 7 years ago
Hi Christoph,
I downloaded the file mentioned above and used it as described above.
Which cdo version are you using? I use:
> cdo -V
Climate Data Operators version 1.9.2 (http://mpimet.mpg.de/cdo)
Compiled: by root on Karins-MBP-DKRZ (x86_64-apple-darwin16.7.0) Jan 19 2018 16:44:24
CXX Compiler: /usr/bin/clang++ -std=gnu++11 -pipe -Os -stdlib=libc++ -arch x86_64 -D_THREAD_SAFE -pthread
CXX version : unknown
C Compiler: /usr/bin/clang -pipe -Os -arch x86_64 -D_THREAD_SAFE -pthread
C version : unknown
F77 Compiler: -pipe -Os
F77 version : ./configure: line 21776: -V: command not found
Features: 16GB DATA PTHREADS HDF5 NC4/HDF5 OPeNDAP SZ UDUNITS2 PROJ.4 XML2 MAGICS CURL FFTW3 SSE4_1
Libraries: HDF5/1.10.1 proj/4.93 xml2/2.9.7 curl/7.57.0
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5
CDI library version : 1.9.2 of Jan 19 2018 16:43:43
CGRIBEX library version : 1.9.0 of Sep 29 2017 10:16:02
GRIB_API library version : 2.5.0
NetCDF library version : 4.4.1.1 of Nov 25 2017 11:19:57 $
HDF5 library version : 1.10.1
SERVICE library version : 1.4.0 of Jan 19 2018 16:43:41
EXTRA library version : 1.4.0 of Jan 19 2018 16:43:39
IEG library version : 1.4.0 of Jan 19 2018 16:43:39
FILE library version : 1.8.3 of Jan 19 2018 16:43:39
-Karin
RE: FMI data - split by initial time - Added by Christoph Haller almost 7 years ago
Hi Karin,
Ouch, I am out-of-date:
Climate Data Operators version 1.9.1 (http://mpimet.mpg.de/cdo)
Compiled: by christoph2 on iket-tarragona (x86_64-unknown-linux-gnu) Nov 21 2017 10:01:16
CXX Compiler: g++ -std=gnu++11 -g -O2 -fopenmp
CXX version : g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
C Compiler: gcc -std=gnu99 -g -O2 -fopenmp
C version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Features: DATA PTHREADS OpenMP3 SSE2
Libraries:
Filetypes: srv ext ieg grb1 grb2 nc1
CDI library version : 1.9.1 of Nov 21 2017 09:58:33
CGRIBEX library version : 1.9.0 of Sep 29 2017 10:16:02
GRIB_API library version : 1.10.4
NetCDF library version : 4.3.3.1 of Aug 3 2015 17:12:26 $
SERVICE library version : 1.4.0 of Nov 21 2017 09:58:21
EXTRA library version : 1.4.0 of Nov 21 2017 09:58:16
IEG library version : 1.4.0 of Nov 21 2017 09:58:19
FILE library version : 1.8.3 of Nov 21 2017 09:58:16
Will upgrade and report the result ASAP.
Thanks, Christoph
RE: FMI data - split by initial time - Added by Christoph Haller almost 7 years ago
Hi Karin,
now I am up-to-date
cdo -V
Climate Data Operators version 1.9.3 (http://mpimet.mpg.de/cdo)
Compiled: by jruser on iket-swiss-linux (x86_64-unknown-linux-gnu) Apr 23 2018 13:39:42
CXX Compiler: g++ -std=gnu++11 -g -O2 -fopenmp
CXX version : g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
C Compiler: gcc -std=gnu99 -g -O2 -fopenmp
C version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
F77 Compiler: gfortran -g -O2
F77 version : GNU Fortran (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Features: 62GB C++11 Fortran DATA PTHREADS OpenMP3 NC4/HDF5/threadsafe OPeNDAP SSE2
Libraries:
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c
CDI library version : 1.9.3 of Apr 23 2018 13:38:59
CGRIBEX library version : 1.9.0 of Jan 22 2018 09:24:03
GRIB_API library version : 2.7.0
NetCDF library version : 4.1.3 of Feb 24 2014 21:05:37 $
HDF5 library version : 1.8.11 threadsafe
EXSE library version : 1.4.0 of Apr 23 2018 13:38:53
FILE library version : 1.8.3 of Apr 23 2018 13:38:51
this file
http://s3-eu-west-1.amazonaws.com/fmi-opendata-rcrhirlam-surface-grib/2018/04/04/06/numerical-hirlam74-forecast-GeopHeight-20180404T060000Z.grb2
works, but this file
http://s3-eu-west-1.amazonaws.com/fmi-opendata-rcrhirlam-pressure-grib/2018/04/06/06/numerical-hirlam74-forecast-GeopHeight-20180406T060000Z.grb2
still gives
Warning (gribapiScanTimestep): Record 56 (name=gh id=5.3.0 lev1=10000000 lev2=0) timestep 56: Parameter not defined at timestep 1!
Is it possible my NetCDF library is too old?
I am puzzled.
Christoph
RE: FMI data - split by initial time - Added by Karin Meier-Fleischer almost 7 years ago
See my first answer.
-Karin
RE: FMI data - split by initial time - Added by Christoph Haller almost 7 years ago
Hi Karin,
I have also updated my NetCDF library
cdo -V
Climate Data Operators version 1.9.3 (http://mpimet.mpg.de/cdo)
Compiled: by jruser on iket-swiss-linux (x86_64-unknown-linux-gnu) Apr 24 2018 13:01:29
CXX Compiler: g++ -std=gnu++11 -g -O2 -fopenmp
CXX version : g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
C Compiler: gcc -std=gnu99 -g -O2 -fopenmp
C version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
F77 Compiler: gfortran -g -O2
F77 version : GNU Fortran (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Features: 62GB C++11 Fortran DATA PTHREADS OpenMP3 NC4/HDF5/threadsafe OPeNDAP SSE2
Libraries:
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5
CDI library version : 1.9.3 of Apr 24 2018 13:00:44
CGRIBEX library version : 1.9.0 of Jan 22 2018 09:24:03
GRIB_API library version : 2.7.0
NetCDF library version : 4.6.1 of Apr 24 2018 12:53:10 $
HDF5 library version : 1.8.11 threadsafe
EXSE library version : 1.4.0 of Apr 24 2018 13:00:40
FILE library version : 1.8.3 of Apr 24 2018 13:00:48
and still
cdo -f grb2 -splitsel,1 -seltimestep,1/12
on the pressure files (3.8 GB) eg
http://s3-eu-west-1.amazonaws.com/fmi-opendata-rcrhirlam-pressure-grib/2018/04/16/00/numerical-hirlam74-forecast-GeopHeight-20180416T000000Z.grb2
drops the pressure levels at 1000, 925, 850, 700, 500, 400, 300, 250, 200, 100 hPa.
Only the single level at 50 hPa is kept.
I have reduced the number of timesteps to select to 12;
now no warning is issued, but 10 levels are definitely missing.
Is it possible that "cdo" cannot deal with more than one level?
No, there are operators like "sellevel" and "sellevidx", so it should.
Sorry, but now I tend to consider this a bug,
because Panoply reads the file correctly (please see also my first reply).
Christoph
RE: FMI data - split by initial time - Added by Ralf Mueller almost 7 years ago
I am currently downloading the input file, but this error usually occurs when the first GRIB record of a variable does not belong to the first timestep.
In GRIB, records are independent, whereas in NetCDF variables share common dimensions. This is where CDO/CDI has to bridge the gap between the two standards.
I will have a closer look whenever the download is complete.
cheers
ralf
RE: FMI data - split by initial time - Added by Ralf Mueller almost 7 years ago
Can confirm this behaviour: the different pressure levels (50 up to 1000 hPa) are not correctly recognized by the current CDO release 1.9.3.
RE: FMI data - split by initial time - Added by Uwe Schulzweida almost 7 years ago
The GRIB records in your file are sorted by levels. CDO expects the records sorted by time.
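One way to spot this ordering before running cdo is to list the step of every record in file order and check that it never decreases. This is a sketch, assuming ecCodes' grib_get can print the step key per record (for example `grib_get -p step:i FILE`, not tested on the FMI file here). The helper name steps_sorted is hypothetical and reads one step value per line from standard input.

```shell
#!/bin/sh
# Succeed (exit status 0) if the integers on stdin, one per line, never decrease.
steps_sorted() {
    awk 'NR > 1 && $1 + 0 < prev { bad = 1; exit }
         { prev = $1 + 0 }
         END { exit bad + 0 }'
}

# Made-up sample in level-major order: the steps restart for each level,
# which is the layout described for the FMI pressure file.
if printf '%s\n' 0 1 2 0 1 2 | steps_sorted; then
    echo "time-sorted"
else
    echo "not time-sorted"
fi
```

With a real file, the sample input would be replaced by the grib_get output piped into steps_sorted.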
RE: FMI data - split by initial time - Added by Christoph Haller almost 7 years ago
Thanks to Uwe Schulzweida for resolving the case.
Will continue using the grib tools from ECMWF.
And, of course, thanks to the others who spent time on the case.
RE: FMI data - split by initial time - Added by Ralf Mueller almost 7 years ago
You can use grib_copy to re-sort the data by timestep; after that, cdo works:
<ram@melian:~/local/data/cdo> % grib_copy -B'step:i asc' numerical-hirlam74-forecast-GeopHeight-20180416T000000Z.grb2 out.grib
<ram@melian:~/local/data/cdo> % cdo sinfov out.grib
   File format : GRIB2
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : ECMWF    unknown  v instant      11   1   4387451   1  P24  : gh
   Grid coordinates :
     1 : lonlat                   : points=4387451 (4633x947)
                              lon : 224.9737 to 539.9913 by 0.068009 degrees_east
                              lat : 25.64766 to 89.99931 by 0.068025 degrees_north
   Vertical coordinates :
     1 : pressure                 : levels=11
                             plev : 100000 to 5000 Pa
   Time coordinate :  unlimited steps
     RefTime =  2018-04-16 00:00:00  Units = hours  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2018-04-16 00:00:00  2018-04-16 01:00:00  2018-04-16 02:00:00  2018-04-16 03:00:00
  2018-04-16 04:00:00  2018-04-16 05:00:00  2018-04-16 06:00:00  2018-04-16 07:00:00
  2018-04-16 08:00:00  2018-04-16 09:00:00  2018-04-16 10:00:00  2018-04-16 11:00:00
  2018-04-16 12:00:00  2018-04-16 13:00:00  2018-04-16 14:00:00  2018-04-16 15:00:00
  2018-04-16 16:00:00  2018-04-16 17:00:00  2018-04-16 18:00:00  2018-04-16 19:00:00
  2018-04-16 20:00:00  2018-04-16 21:00:00  2018-04-16 22:00:00  2018-04-16 23:00:00
  2018-04-17 00:00:00  2018-04-17 01:00:00  2018-04-17 02:00:00  2018-04-17 03:00:00
  2018-04-17 04:00:00  2018-04-17 05:00:00  2018-04-17 06:00:00  2018-04-17 07:00:00
  2018-04-17 08:00:00  2018-04-17 09:00:00  2018-04-17 10:00:00  2018-04-17 11:00:00
  2018-04-17 12:00:00  2018-04-17 13:00:00  2018-04-17 14:00:00  2018-04-17 15:00:00
  2018-04-17 16:00:00  2018-04-17 17:00:00  2018-04-17 18:00:00  2018-04-17 19:00:00
  2018-04-17 20:00:00  2018-04-17 21:00:00  2018-04-17 22:00:00  2018-04-17 23:00:00
  2018-04-18 00:00:00  2018-04-18 01:00:00  2018-04-18 02:00:00  2018-04-18 03:00:00
  2018-04-18 04:00:00  2018-04-18 05:00:00  2018-04-18 06:00:00
cdo sinfon: Processed 1 variable over 55 timesteps [1.24s 68MB]
<ram@melian:~/local/data/cdo> % cdo showlevel out.grib
100000 92500 85000 70000 50000 40000 30000 25000 20000 10000 5000
cdo showlevel: Processed 1 variable [0.15s 68MB]
<ram@melian:~/local/data/cdo>
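Putting the thread together, the workflow would be to sort by step with grib_copy and then split with cdo. The sketch below only prints the two commands. The function name and the "-sorted" file name are hypothetical. The cdo line is Karin's command without seltimestep, on the assumption that the re-sorted file no longer needs it.

```shell
#!/bin/sh
# Print the sort-then-split commands for one input file (dry run).
# Usage: print_sort_split_cmds INPUT.grb2 OUTPUT_PREFIX
print_sort_split_cmds() {
    in=$1
    prefix=$2
    sorted=${in%.grb2}-sorted.grb2   # hypothetical name for the re-sorted copy
    # 1) reorder the records by step, so all levels of a step become adjacent
    printf "grib_copy -B'step:i asc' %s %s\n" "$in" "$sorted"
    # 2) write one output file per timestep
    printf 'cdo -f grb2 -splitsel,1 %s %s\n' "$sorted" "$prefix"
}

print_sort_split_cmds numerical-hirlam74-forecast-GeopHeight-20180416T000000Z.grb2 split_time_step_
```

As with any dry run, the printed commands can be reviewed first and then piped to sh. Note that the re-sorted copy needs roughly as much disk space as the 3.8 GB input.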