Data reduction for categorical data: an explained heterogeneity approach
IODICE D'ENZA, Alfonso
2015-01-01
Abstract
The general aim of data reduction (DR) is to synthesize the information in a data set by defining a set of homogeneous groups of observations (row-wise) and a set of linear combinations of the original attributes that approximate their relationship structure (column-wise). That is, DR combines clustering and dimension reduction techniques, which are often applied sequentially. Although the sequential approach is straightforward (dimension reduction is applied first, and the projections of the observations in the reduced space are then clustered), it may fail to recover the structure underlying the data: the low-dimensional solution may mask the groups of homogeneous observations. To overcome this problem, joint DR techniques have been proposed. In this paper we focus on the categorical data case and on how such approaches relate to explained heterogeneity.
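
The abstract does not define explained heterogeneity, so the following is only an illustrative sketch under an assumption: heterogeneity of a categorical attribute is measured with the Gini index, 1 - sum_j p_j^2, and the part "explained" by a partition of the observations is the total heterogeneity minus the weighted within-group heterogeneity. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def gini_heterogeneity(values):
    """Gini heterogeneity 1 - sum_j p_j^2 of a categorical sample."""
    n = len(values)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(values).values())


def explained_heterogeneity(values, groups):
    """Return (total, within, explained) heterogeneity for a partition.

    Assumption: 'explained' is total minus the size-weighted average of the
    within-group Gini heterogeneities. This is one common decomposition,
    not necessarily the one used in the paper.
    """
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")
    n = len(values)
    total = gini_heterogeneity(values)

    # Collect the attribute values observed in each group.
    members = {}
    for value, group in zip(values, groups):
        members.setdefault(group, []).append(value)

    within = sum(len(m) / n * gini_heterogeneity(m) for m in members.values())
    return total, within, total - within


# Hypothetical toy data: one categorical attribute, two candidate partitions.
colour = ["red", "red", "red", "blue", "blue", "blue"]
good_partition = [0, 0, 0, 1, 1, 1]  # groups coincide with the categories
poor_partition = [0, 1, 0, 1, 0, 1]  # groups mix the categories

for name, partition in [("good", good_partition), ("poor", poor_partition)]:
    total, within, explained = explained_heterogeneity(colour, partition)
    print(f"{name}: total={total:.3f} within={within:.3f} explained={explained:.3f}")
```

Under this assumed measure, a partition whose groups are homogeneous on the attribute explains all of its heterogeneity, while a partition that mixes categories explains little. That gives one way to check whether clusters found in a reduced space still reflect the categorical structure of the data.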