Detection of Outliers in Categorical Data using Model Based Diagnostics

Gallo M
2018-01-01

Abstract

Detection of outliers is an important and interesting problem in data analysis. However, detecting outliers in categorical data poses additional difficulties due to polarization of cell counts. The structure and nature of the cell counts in a contingency table, which can range from zero to very high frequencies, play an important role in the analysis. The size and location of the cell frequencies can therefore create polarization, which poses an additional challenge for outlier detection. The present study considers a model-based approach to detecting outliers in an I × J contingency table. The procedure consists of fitting a Poisson log-linear model to the count data and examining different types of residuals, supplemented by a boxplot, to identify the outlying cells. The robustness of the model is investigated through a simulation study, along with applications to real datasets.
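
The procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which log-linear model or which residuals the study uses, so the following are assumptions made for illustration only: the independence model (whose Poisson maximum-likelihood fitted values have the closed form row total × column total / grand total), Pearson, deviance, and adjusted residuals, and the usual 1.5 × IQR boxplot fences. The function names and the example counts are hypothetical, and every row and column total is assumed to be positive.

```python
import numpy as np


def independence_fit(table):
    """Fitted counts under the independence log-linear model (assumed model).

    For log(mu_ij) = lambda + alpha_i + beta_j, the Poisson MLE is
    mu_ij = (row total i) * (column total j) / (grand total).
    """
    n = np.asarray(table, dtype=float)
    row = n.sum(axis=1, keepdims=True)
    col = n.sum(axis=0, keepdims=True)
    return row @ col / n.sum()


def cell_residuals(table):
    """Return Pearson, deviance, and adjusted residuals for each cell."""
    n = np.asarray(table, dtype=float)
    mu = independence_fit(n)
    pearson = (n - mu) / np.sqrt(mu)

    # Deviance residual, using the convention 0 * log(0) = 0 for empty cells.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_term = np.where(n > 0, n * np.log(n / mu), 0.0)
    unit_dev = np.maximum(2.0 * (log_term - (n - mu)), 0.0)
    deviance = np.sign(n - mu) * np.sqrt(unit_dev)

    # Adjusted (standardized) residual for the independence model.
    row_p = n.sum(axis=1, keepdims=True) / n.sum()
    col_p = n.sum(axis=0, keepdims=True) / n.sum()
    adjusted = pearson / np.sqrt((1.0 - row_p) * (1.0 - col_p))
    return pearson, deviance, adjusted


def boxplot_outliers(values, k=1.5):
    """Boolean mask of values lying outside the boxplot fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


if __name__ == "__main__":
    # Hypothetical 3 x 4 table of counts, invented for illustration.
    counts = np.array([[12, 15, 9, 14],
                       [10, 13, 11, 60],
                       [14, 12, 10, 13]])
    pearson, deviance, adjusted = cell_residuals(counts)
    flagged = boxplot_outliers(adjusted.ravel()).reshape(counts.shape)
    print("Adjusted residuals:\n", np.round(adjusted, 2))
    print("Cells outside the boxplot fences (row, column):",
          list(zip(*np.nonzero(flagged))))
```

One caveat with this kind of sketch: an extreme cell also distorts the fitted values of the other cells in its row and column, so the residuals of neighbouring cells can be inflated or the outlier itself partly masked. This is one reason a robustness study such as the one mentioned in the abstract is relevant.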
Use this identifier to cite or link to this document: https://hdl.handle.net/11574/183647