PCA
module load evaluation_system
analyze --tool PCA --help
PCA (v3.1.0): Principal Component Analysis
Options:
session      (default: 1)
             This defines two possible sessions:
             1 means the program performs a PCA computing a maximum
               number of EOFs given by EOFS.
             2 means a PCA has been already performed and a
               back-transformation is computed retaining a certain
               number of EOFs given by EOFS.
eofs         (default: 10)
             Number of EOFs that are computed by the PCA (SESSION=1) or
             that are included in the back-transformation (SESSION=2).
             If <= 0 the largest possible value will be used.
input        (default: <undefined>) [mandatory]
             An arbitrary NetCDF file. There are only two restrictions to
             your NetCDF file:
             a) Time has to be the very first dimension in the variable
                you like to analyze.
             b) All dimensions in your variable need to be defined as
                variables themselves with equal names.
             Both, a) and b), are usually true.
variable     (default: <undefined>) [mandatory]
             The name of the variable in the NetCDF INPUT file you want
             to analyze.
outputdir    (default: $USER_OUTPUT_DIR)
             The output directory.
outputplots  (default: $USER_PLOTS_DIR)
             Output directory of produced plots
outputtype   (default: eps) [mandatory]
             Output filetype, i.e. eps, pdf, png
pcafile      (default: $input.pca.$variable.nc) [mandatory]
             Filename of the PCA output. If SESSION=1 this is output, if
             SESSION=2 this is input.
projection   (default: $input.pro.$variable.nc)
             Filename of the projection (back-transformation). If
             SESSION=1 this is not applicable, if SESSION=2 this is
             output.
missingvalue (default: 1e+38)
             The missing value (fill value) of VARIABLE.
principals   (default: True)
             Whether or not you want to have principal components
             computed.
eigvalscale  (default: False)
             Whether or not you want to have eigenvectors and principal
             components scaled with the square root of the corresponding
             eigenvalues. If true, this gives a physical amplitude to the
             eigenvectors and normalizes the principal components.
normalize    (default: False)
             Whether or not you want to have your data normalized
             (divided by the standard deviation per grid point).
areaweight   (default: False)
             Whether or not you want to have your data area weighted.
             This is done per latitude with sqrt(cos(latitude)).
latname      (default: lat)
             In case you want to use AREAWEIGHT you need to specify the
             name of the dimension representing the latitude.
shiftlats    (default: False)
             Whether or not you want to have the latitudes shifted before
             performing the area weighting. This shift by half of the
             difference between two latitudes avoids the problem that the
             cosine of +/-90deg is zero. This is recommended if
             AREAWEIGHT is true and if there exists a gridpoint with
             lat=+/-90deg.
testorthog   (default: False)
             Whether or not you want to have the eigenvectors and
             principal components tested for orthogonality. A value close
             to one indicates an orthogonal basis.
bootstrap    (default: False)
             Whether or not you want uncertainties of the eigenvalues
             computed using bootstrap based on case resampling.
boots        (default: 100)
             Number of bootstraps.
threads      (default: 8)
             This lets you control the number of threads you want to use.
             If you want to make use of the full computing power then n
             is a reasonable value with n being the number of cores on
             your machine.
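
The AREAWEIGHT and SHIFTLATS options can be illustrated with a small sketch. It is not the tool's own code. It only shows why sqrt(cos(latitude)) vanishes at the poles and how a shift of half a latitude step avoids that. The function names and the example grid are hypothetical, and the direction of the shift (towards the equator) is an assumption, since the help text only gives its size.

```python
import math


def area_weights(lats_deg):
    """Return sqrt(cos(latitude)) for each latitude given in degrees."""
    weights = []
    for lat in lats_deg:
        c = math.cos(math.radians(lat))
        # Guard against tiny negative round-off near +/-90 degrees.
        weights.append(math.sqrt(max(c, 0.0)))
    return weights


def shift_towards_equator(lats_deg):
    """ASSUMPTION: move every latitude half a grid step towards the equator.

    The help text states only that the shift is half the difference between
    two latitudes; the direction used here is chosen for illustration.
    """
    half_step = abs(lats_deg[1] - lats_deg[0]) / 2.0
    shifted = []
    for lat in lats_deg:
        if lat > 0:
            shifted.append(lat - half_step)
        elif lat < 0:
            shifted.append(lat + half_step)
        else:
            shifted.append(lat)
    return shifted


# Hypothetical regular grid that includes both poles.
lats = [-90.0, -60.0, -30.0, 0.0, 30.0, 60.0, 90.0]

plain = area_weights(lats)
shifted = area_weights(shift_towards_equator(lats))

for lat, w_plain, w_shifted in zip(lats, plain, shifted):
    print(f"lat={lat:6.1f}  weight={w_plain:.4f}  shifted weight={w_shifted:.4f}")
```

Without the shift, the rows at lat=+/-90deg get a weight of practically zero, so those grid points drop out of the analysis. With the shift they keep a small but non-zero weight.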