Robust distance measure to detect outliers for categorical data
Gallo, M.
2019-01-01
Abstract
Distance-based techniques for detecting outliers appear to be effective tools for both univariate and multivariate data. However, their effectiveness is yet to be firmly established for categorical data, which poses challenges due to the polarization of cell frequencies. The purpose of this paper is to develop a new distance-based measure to detect outliers in two-dimensional contingency tables. The new distance measure, based on a pivotal element, is evaluated by comparing its performance with other suitable distance measures from the literature. The consistency of the four distance measures is examined through a simulation study, followed by application to real datasets.

File | Type | License | Size | Format
---|---|---|---|---
10.1007_s00500-019-04340-5.pdf (not available) | Pre-print document | Not public (private/restricted access) | 1.21 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.