Robust methods for analysis of 3-way compositional data in R
Di Palma MA;Gallo M
2016-01-01
Abstract
Standard multivariate analysis addresses data sets represented as two-dimensional matrices. In recent years, a growing number of application areas, such as chemometrics, computer vision, econometrics and social network analysis, have involved data sets represented as multidimensional arrays, and multiway data analysis has become popular as an exploratory tool. The most popular multiway models are CANDECOMP/PARAFAC and Tucker3. The standard algorithms for computing these models are based on alternating least squares (ALS) and are thus vulnerable to outlying data points: even a single outlying data point can strongly influence the resulting model and the conclusions based on it. Robust methods are therefore preferred. Compositional data pose additional difficulties for the analysis. Such data consist of vectors of positive values summing to one or, more generally, to some constant that is fixed for all vectors, and they appear as proportions, percentages, concentrations, and absolute and relative frequencies. We present a robust version of Tucker3 that is extended to handle compositional data. This method, together with a robust version of PARAFAC that also has an option for handling compositional data, is implemented in an R package for the analysis of multiway data sets.
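To make the constant-sum constraint concrete, the following is a minimal sketch in plain Python of what a compositional vector looks like and why it needs special handling. The abstract does not say which transformation the R package applies. The centered log-ratio (clr) used here is a common choice for compositional data and is an assumption for illustration, as are the function names `closure` and `clr` and the sample values.

```python
import math

def closure(parts, total=1.0):
    """Rescale strictly positive parts so that they sum to `total`."""
    if any(p <= 0 for p in parts):
        raise ValueError("compositional parts must be strictly positive")
    s = sum(parts)
    return [total * p / s for p in parts]

def clr(parts):
    """Centered log-ratio: log of each part minus the mean of the logs.

    Assumed here for illustration; not confirmed as the package's method.
    """
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

raw = [2.0, 6.0, 12.0]        # hypothetical measurements
comp = closure(raw)           # proportions summing to 1
pct = closure(raw, 100.0)     # the same composition as percentages

clr_comp = clr(comp)
clr_pct = clr(pct)            # same coordinates as clr_comp, up to rounding
```

Because the clr coordinates depend only on ratios between parts, the proportions and the percentages map to the same point, which is the sense in which the fixed constant carries no information.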