Iterative factor clustering of binary data
IODICE D'ENZA, Alfonso
2013-01-01
Abstract
Binary data are a special case in which both distance measures and co-occurrence measures can be used. Non-hierarchical methods based on Euclidean distance, such as the k-means algorithm or one of its variants, can be applied profitably. As the number of available attributes increases, however, overall clustering performance usually worsens. In such cases, enhancing group separability requires removing irrelevant and redundant noisy information from the data. The present approach is an attribute transformation strategy: it combines clustering and factorial techniques to identify attribute associations that characterize one or more homogeneous groups of statistical units. It also provides graphical representations that facilitate the interpretation of the results.
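The opening point of the abstract, that distance and co-occurrence measures are both usable on binary data, can be illustrated with a small sketch. It is not the method proposed in the paper: it only shows that, for 0/1 vectors, the squared Euclidean distance equals the number of mismatching attributes (the b + c cells of the 2x2 co-occurrence table), and then runs a plain Lloyd-style k-means on a toy matrix. The function names, the toy data, and the choice of initial centroids are all assumptions made for illustration.

```python
def cooccurrence(x, y):
    """Return the 2x2 co-occurrence counts (a, b, c, d) of two binary vectors.

    a: both 1, b: x=1 and y=0, c: x=0 and y=1, d: both 0.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return a, b, c, d


def sq_euclidean(x, y):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def kmeans(data, centroids, n_iter=20):
    """Plain Lloyd's k-means with squared Euclidean distance (illustrative only)."""
    for _ in range(n_iter):
        # Assignment step: each unit goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_euclidean(row, centroids[k]))
            for row in data
        ]
        # Update step: each centroid becomes the attribute-wise mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [row for row, lab in zip(data, labels) if lab == k]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: 6 statistical units described by 6 binary attributes.
data = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]

# On binary vectors the squared Euclidean distance counts the mismatches (b + c).
a, b, c, d = cooccurrence(data[0], data[2])
print("co-occurrence table:", (a, b, c, d))
print("squared Euclidean distance:", sq_euclidean(data[0], data[2]))

# Assumed initialisation: the first and fourth units serve as starting centroids.
labels, centroids = kmeans(data, [data[0], data[3]])
print("cluster labels:", labels)
```

In this sketch the centroid coordinates are the within-cluster proportions of each attribute, which is one way to read which attributes characterize a group. The attribute transformation step that the abstract describes is not reproduced here.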