Multiple correspondence analysis for the quantification and visualization of huge categorical data
IODICE D'ENZA, Alfonso
2009-01-01
Abstract
The applicability of dimensionality reduction techniques to very large categorical data sets or to categorical data streams is limited by the singular value decomposition (SVD) of suitably transformed data that they require. Applying SVD to large, high-dimensional data is infeasible because of the computational time involved and because the whole data set must be held in memory, so data streams cannot be analyzed. This paper integrates the incremental SVD procedure proposed by Brand (2003) into a multiple correspondence analysis (MCA)-like procedure, yielding a dimensionality reduction technique that can be applied to very large categorical data sets and even to categorical data streams.
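To make the idea of an incremental SVD concrete, the following is a minimal sketch of a row-block update in the general style of additive SVD updating. It is not taken from the paper: the function name `append_rows_svd`, the chunk size, and the synthetic low-rank data are assumptions made for illustration, and the MCA-specific transformation of the categorical data (which would produce the row blocks in practice) is left out.

```python
import numpy as np

def append_rows_svd(U, s, Vt, B, rank):
    """Update a thin SVD  A ~ U @ diag(s) @ Vt  after new rows B are appended to A.

    Hypothetical helper for illustration only; it never revisits the old rows of A.
    """
    k = s.size
    c = B.shape[0]
    L = B @ Vt.T                       # coordinates of the new rows in the current row space
    H = B - L @ Vt                     # part of the new rows outside that space
    Q, R = np.linalg.qr(H.T)           # orthonormal basis Q for the residual, H = R.T @ Q.T
    q = Q.shape[1]

    # Small (k+c) x (k+q) core matrix whose SVD gives the updated factors.
    K = np.block([[np.diag(s), np.zeros((k, q))],
                  [L,          R.T]])
    Uk, sk, Vkt = np.linalg.svd(K, full_matrices=False)

    # Rotate the enlarged left and right bases by the core factors.
    U_ext = np.block([[U,                np.zeros((U.shape[0], c))],
                      [np.zeros((c, k)), np.eye(c)]])
    U_new = U_ext @ Uk
    Vt_new = Vkt @ np.vstack([Vt, Q.T])

    # Truncate back to the working rank.
    return U_new[:, :rank], sk[:rank], Vt_new[:rank]

# Synthetic rank-3 data standing in for a suitably transformed data matrix (assumption).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 6))
k = 3

# Start from the first chunk, then feed the remaining rows chunk by chunk.
U, s, Vt = np.linalg.svd(X[:50], full_matrices=False)
U, s, Vt = U[:, :k], s[:k], Vt[:k]
for start in range(50, 200, 50):
    U, s, Vt = append_rows_svd(U, s, Vt, X[start:start + 50], k)

# Reference: singular values from a batch SVD of the full matrix.
s_batch = np.linalg.svd(X, compute_uv=False)[:k]
```

Each update only decomposes a small core matrix whose size depends on the working rank and the chunk size, not on the number of rows seen so far. In a genuine streaming setting one would presumably keep only `s` and `Vt` (and the row coordinates of the current chunk) rather than the growing `U`; that choice is also an assumption of this sketch.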