EOF normalization

Added by Karsten Fennig almost 12 years ago

Dear cdo team,

I'm using the cdo EOF operators (eof, eofcoeff) to filter noise from my data. This was working fine with cdo version 1.5.0 but stopped working with version 1.5.4. I found from the changelog that the normalization of the EOF vectors has been changed with version 1.5.2.

My understanding is that the sum over all products (principal component * eigenvector) should be equal to the input anomaly. This was the case with cdo version 1.5.0 before changing the normalization. I think that if you normalize the EOFs you must also scale the principal components accordingly to preserve the total sum over all elements.
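The reconstruction argument can be illustrated with a minimal sketch. It assumes a plain, unweighted inner product and two made-up orthonormal patterns on a two-point grid; none of the names or values come from CDO. It shows that summing (principal component * EOF) recovers the anomaly when the EOFs have unit norm, and that rescaling the EOFs by a factor c breaks the sum unless the principal components are rescaled as well.

```python
import math

def dot(u, v):
    """Plain (unweighted) inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def reconstruct(pcs, eofs):
    """Sum over all modes of (principal component * EOF pattern)."""
    npoints = len(eofs[0])
    return [sum(pc * e[i] for pc, e in zip(pcs, eofs)) for i in range(npoints)]

# Two orthonormal "EOF" patterns on a 2-point grid (hypothetical values).
s = 1.0 / math.sqrt(2.0)
eofs = [[s, s], [s, -s]]

# One time step of the input anomaly (hypothetical values).
anomaly = [3.0, -1.0]

# Principal components: projection of the anomaly onto each unit-norm EOF.
pcs = [dot(anomaly, e) for e in eofs]
recon = reconstruct(pcs, eofs)  # equals the anomaly

# Rescale every EOF by c and project again: the sum is now off by c**2.
c = 5.0
scaled_eofs = [[c * x for x in e] for e in eofs]
naive_pcs = [dot(anomaly, e) for e in scaled_eofs]
naive = reconstruct(naive_pcs, scaled_eofs)

# Compensating the principal components by 1/c**2 restores the anomaly.
fixed_pcs = [pc / (c * c) for pc in naive_pcs]
fixed = reconstruct(fixed_pcs, scaled_eofs)

print(recon, naive, fixed)
```

With area weights in the inner product the compensating factor would differ, which is why the scaling factor actually applied by the operator matters.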

Is there a way to return to the previous normalization or to get the correct scaling factors?

best regards,
Karsten


Replies (5)

RE: EOF normalization - Added by Jaison-Thomas Ambadan almost 12 years ago

Hi,

From CDO documentation: https://code.zmaw.de/embedded/cdo/1.5.4/cdo.html#x1-4750002.11.1

"Note, that the resulting EOF in ofile2 is ej and thus not weighted for consistency."

So I think you need to multiply ej by the grid weights before summation (to reconstruct the data). The grid weights can be obtained from the "gridweights" operator: https://code.zmaw.de/embedded/cdo/1.5.4/cdo.html#x1-6000002.15.3

Hope this helps!

Cheers,
J.

RE: EOF normalization - Added by Karsten Fennig almost 12 years ago

Hi,

this does not work, because every EOF vector is normalized to unit norm, which means the sum of all squared vector elements is equal to 1. You can test this with 'cdo -fldsum -sqr eoffile'. Technically, all elements of a vector are scaled by the same value, which is the square root of the sum of all squared elements before normalization. I need this scaling factor to reconstruct the data correctly.
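As a minimal sketch of the point above, assuming a plain Euclidean norm without area weights (this is not CDO's code, and the vector values are made up): after normalization the sum of squares is 1, and the factor needed to undo it is the square root of the sum of squares of the original vector, which cannot be recovered from the normalized vector alone.

```python
import math

# Hypothetical unnormalized eigenvector on a 3-point grid.
raw = [2.0, -1.0, 2.0]

# Scaling factor: square root of the sum of squared elements.
sum_sq = sum(x * x for x in raw)
norm = math.sqrt(sum_sq)

# Unit-norm vector, as described for the normalized EOF output.
unit = [x / norm for x in raw]

# Analogue of the 'fldsum -sqr' check: sum of squares of the normalized vector.
check = sum(x * x for x in unit)

# With the factor at hand, the original vector can be restored.
restored = [x * norm for x in unit]

print(norm, check, restored)
```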

Karsten

RE: EOF normalization - Added by MILES SOWDEN about 6 years ago

Has there been an update on this issue over the past 6 years?
I compared the PCA results between using the paraview graphical interface and using the CDO operators and they are vastly different.
Paraview gets scale factors within the range of +/- 1, typically about 0.5 per component.
CDO gets about +/-300. If I sum across all my PCA variables for each time period and divide by this sum, the values come closer to the Paraview "PCA" components.
I therefore agree with Karsten, that we need the scaling factor to correct the PCA components.

Climate Data Operators version 1.7.0 (http://mpimet.mpg.de/cdo)
Using "Bash on Ubuntu for Windows" (i.e. windows 10 native bash)

BASH Script
export CDO_WEIGHT_MODE=off
export MAX_JACOBI_ITER=100

cdo setmisstoc,0 -sub ${sFile}_tmp.nc -enlarge,${sFile}_tmp.nc -timmean ${sFile}_tmp.nc ${sFile}_IR.nc # subtract the time mean (anomalies)
cdo div ${sFile}_IR.nc -enlarge,${sFile}_tmp.nc -timstd ${sFile}_IR.nc ${sFile}_IRN.nc # normalise by the standard deviation

cdo eof,3 ${sFile}_IRN.nc ${sFile}_IR_eigval.nc ${sFile}_IR_eigvector.nc # only the first 3 EOFs are needed
cdo eofcoeff ${sFile}_IR_eigvector.nc ${sFile}_IRN.nc ${sFile}_IR_pca # project the data onto the EOFs to get the principal components
cdo -div ${sFile}_IR_eigval.nc -timsum ${sFile}_IR_eigval.nc ${sFile}_IR_explvar.nc # fraction of variance explained by each mode

RE: EOF normalization - Added by MILES SOWDEN about 6 years ago

Further comment: as Jaison mentioned, CDO does not normalise the PCA components. There seems to be a gray area in the methodology, with most reporting weighted results (e.g. Paraview) and CDO reporting unweighted components. The weights in question are not grid-cell weights (explicitly switched off in the script above) but eigenvalue weights. So it seems to me that I need to multiply eigval x PCA, and possibly the maximum of eigvector?
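For orientation, here is a generic PCA sketch of how eigenvalues relate to the size of the components. It assumes unit-norm EOFs, an unweighted covariance matrix divided by the number of time steps, and made-up data; whether Paraview or CDO follows this convention is not verified here. Under these assumptions the variance of each principal component equals its eigenvalue, so dividing a component by the square root of its eigenvalue gives a series with unit variance.

```python
import math

# Hypothetical anomalies: 4 time steps x 2 grid points, each column has zero mean.
data = [[2.0, 1.0], [-2.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]
n = len(data)

# Covariance matrix [[a, b], [b, d]], divided by the number of time steps.
a = sum(r[0] * r[0] for r in data) / n
b = sum(r[0] * r[1] for r in data) / n
d = sum(r[1] * r[1] for r in data) / n

# Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
half_trace = (a + d) / 2.0
root = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
eigvals = [half_trace + root, half_trace - root]

def unit_eigvec(lam):
    """Unit-norm eigenvector for eigenvalue lam (valid because b != 0 here)."""
    v = [b, lam - a]
    length = math.hypot(v[0], v[1])
    return [v[0] / length, v[1] / length]

eofs = [unit_eigvec(lam) for lam in eigvals]

# Principal component time series: projection of each time step onto each EOF.
pcs = [[r[0] * e[0] + r[1] * e[1] for r in data] for e in eofs]

# Variance of each principal component (the series have zero mean).
pc_var = [sum(p * p for p in series) / n for series in pcs]

# One common convention: divide by sqrt(eigenvalue) to get unit variance.
std_pcs = [[p / math.sqrt(lam) for p in series] for series, lam in zip(pcs, eigvals)]
std_var = [sum(p * p for p in series) / n for series in std_pcs]

print(eigvals, pc_var, std_var)
```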

RE: EOF normalization - Added by Ralf Mueller about 6 years ago

Hi!
From my understanding of PCA and linear vector spaces, normalizing a vector means dividing it by its length. Something like

where w_i is the cell area and W is the total area. I am not sure if the additional weighting with the cell area is really needed.
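The formula itself is not shown in the post. One form consistent with the description, assuming an area-weighted Euclidean norm (an assumption, not recovered from the original), would be:

```latex
\hat{e}_i = \frac{e_i}{\lVert e \rVert_w},
\qquad
\lVert e \rVert_w = \sqrt{\sum_i \frac{w_i}{W}\, e_i^2},
\qquad
W = \sum_i w_i
```

Dropping the factor w_i / W gives the plain Euclidean length.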

hth
ralf
