Robust multiway analysis of compositional data in R
DI PALMA, Maria Anna; GALLO, Michele
2014-01-01
Abstract
Multiway data analysis deals with data sets that have more than two modes. The most popular models for such data are CANDECOMP/PARAFAC and TUCKER3. The standard algorithms for fitting these models are based on alternating least squares (ALS) and are therefore vulnerable to outlying data points: a single outlier can render the estimates useless, so robust methods are preferred. We present an R package, rrcov3way, which implements a set of functions for the analysis of multiway data sets, including PARAFAC and TUCKER3 as well as their robust alternatives. Compositional data can also be handled through the isometric log-ratio (ilr) transformation. Unified diagnostics, plotting functions, data examples and a manual in the form of a vignette complete the package. In the presentation, basic usage of the package will be illustrated by analyzing real data from the UNIDO INDSTAT database, which contains key industrial statistics indicators for the manufacturing sectors. A subset covering I countries, J sectors and K years, for indicators such as value added and output, will be analyzed.
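To make the ilr step mentioned in the abstract concrete, the following is a minimal sketch of an isometric log-ratio transformation for a single composition. It is written in Python with the standard library only and is not the code of rrcov3way. The function names `ilr` and `clr` are hypothetical, and the pivot-coordinate basis used here is an assumption: it is only one of many valid ilr bases, and the package may use a different one.

```python
import math


def clr(composition):
    """Centred log-ratio: log of each part minus the mean log (illustrative helper)."""
    logs = [math.log(float(p)) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def ilr(composition):
    """Map a D-part composition with strictly positive parts to D-1 ilr coordinates.

    Assumption: pivot coordinates, where coordinate i contrasts part i with
    the geometric mean of all parts that come after it.
    """
    parts = [float(p) for p in composition]
    if len(parts) < 2 or any(p <= 0 for p in parts):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]                  # logs of the parts after position i
        r = len(rest)
        mean_log_rest = sum(rest) / r        # log of their geometric mean
        coords.append(math.sqrt(r / (r + 1)) * (logs[i] - mean_log_rest))
    return coords


# Hypothetical shares of three sectors in one country-year.
shares = [0.5, 0.3, 0.2]
print(ilr(shares))
# Rescaling the parts leaves the coordinates unchanged up to rounding.
print(ilr([50, 30, 20]))
```

In a three-way setting such as countries by sectors by years, a transformation of this kind would presumably be applied along the compositional mode before fitting the model. A D-part composition yields D-1 unconstrained coordinates, which is why the transformed mode has one fewer level than the original.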