Robust multiway analysis of compositional data in R

Di Palma, Maria Anna; Gallo, Michele
2014-01-01

Abstract

Multiway data analysis deals with data sets that have more than two modes. The most popular models for such data are CANDECOMP/PARAFAC and TUCKER3. The standard algorithms for fitting these models are based on alternating least squares (ALS) and are therefore vulnerable to outlying data points: a single outlier can render the estimates useless, so robust methods are preferred. We present an R package, rrcov3way, which implements a set of functions for the analysis of multiway data sets, including PARAFAC and TUCKER3 as well as their robust alternatives. Compositional data can also be handled through the isometric log-ratio (ilr) transformation. Unified diagnostics, plotting functions, data examples and a manual in the form of a vignette complete the package. In the presentation, basic usage of the package will be illustrated by analyzing real data from the UNIDO INDSTAT database, which contains key industrial statistics indicators for the manufacturing sectors. A subset covering I countries, J sectors and K years for indicators such as value added and output will be analyzed.
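
As an illustration of the ilr step mentioned in the abstract, the following is a minimal Python sketch of the isometric log-ratio transformation for a single composition. It is not the rrcov3way code: the function names `clr` and `ilr` are hypothetical, and the pivot-coordinate basis used here is an assumption, being only one of several orthonormal bases that define a valid ilr transformation. In a three-way setting one would presumably apply such a transform along the compositional mode before fitting PARAFAC or TUCKER3.

```python
import math


def clr(x):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def ilr(x):
    """Isometric log-ratio coordinates of one composition.

    Assumption: pivot-coordinate basis. Coordinate i contrasts part i
    with the geometric mean of all parts that follow it.
    """
    if len(x) < 2 or any(v <= 0 for v in x):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(v) for v in x]
    d = len(logs)
    coords = []
    for i in range(d - 1):
        rest = logs[i + 1:]
        mean_rest = sum(rest) / len(rest)  # log of the geometric mean
        scale = math.sqrt((d - i - 1) / (d - i))
        coords.append(scale * (logs[i] - mean_rest))
    return coords
```

A composition with D parts is mapped to D - 1 real coordinates. The result depends only on the ratios between parts, so rescaling all parts by a common factor leaves it unchanged, and the Euclidean norm of the ilr vector equals that of the clr vector.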
Year: 2014
ISBN: 9788493782245
Files in this record:
File: ERCIM2014.pdf (not available)
Type: Abstract
License: PUBLIC - Public without copyright
Size: 929.78 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11574/120416