Elsevier, Computational Statistics & Data Analysis, (93), p. 46-75
DOI: 10.1016/j.csda.2014.11.004
Many methods for reducing the dimensionality of data matrices are based on mathematical techniques rather than on a probability model. Because there is no underlying probability model, these techniques generally do not support statistical inference or model selection via information criteria. Furthermore, ordinal data are very common (e.g. the Likert or Braun-Blanquet scales), yet the clustering methods in common use treat ordered categorical variables as nominal or continuous rather than as truly ordinal. Recently, a group of likelihood-based finite mixture models for binary or count data has been developed (Pledger and Arnold, 2014). This thesis extends that idea and establishes novel likelihood-based multivariate methods for data reduction of a matrix containing ordinal data. The new approach applies fuzzy clustering via finite mixtures to the ordered stereotype model (Fernández et al., 2014a). Fuzzy allocation of rows and columns to their corresponding clusters is obtained with the EM algorithm, and Bayesian model fitting is carried out with a reversible jump MCMC sampler. The performance of the two approaches is compared for one-dimensional clustering. Simulation studies and three real data sets illustrate the application of these approaches and introduce novel data visualisation tools for depicting the fuzziness of the clustering results for ordinal data. Additionally, a simulation study empirically assesses the performance of eleven commonly used information criteria under this likelihood-based methodology. Finally, clustering results obtained by treating the same data set as count data and as ordinal data are compared, analysed and presented.
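As a rough illustration of the fuzzy row allocation mentioned in the abstract, the Python sketch below computes posterior cluster memberships (the E-step of an EM algorithm) for a small ordinal matrix. It assumes the usual stereotype parameterisation, log(P(y = k) / P(y = 1)) = mu_k + phi_k (alpha_r + beta_j), with mu_1 = phi_1 = 0 and 0 <= phi_2 <= ... <= phi_q = 1. The function names, the toy data and all parameter values are hypothetical and are not taken from the paper; the M-step and column clustering are omitted.

```python
import math

def stereotype_probs(mu, phi, alpha_r, beta_j):
    """Category probabilities for one (row cluster, column) pair
    under the assumed stereotype parameterisation."""
    eta = alpha_r + beta_j
    logits = [m + p * eta for m, p in zip(mu, phi)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def e_step(Y, pi, mu, phi, alpha, beta):
    """Posterior (fuzzy) membership of each row of Y in each row cluster.
    Y holds ordinal categories coded 1..q."""
    n_clusters, n_cols = len(pi), len(beta)
    theta = [[stereotype_probs(mu, phi, alpha[r], beta[j])
              for j in range(n_cols)] for r in range(n_clusters)]
    Z = []
    for row in Y:
        # log of (mixing proportion times row likelihood) for each cluster
        log_post = [math.log(pi[r])
                    + sum(math.log(theta[r][j][y - 1]) for j, y in enumerate(row))
                    for r in range(n_clusters)]
        top = max(log_post)
        weights = [math.exp(v - top) for v in log_post]
        total = sum(weights)
        Z.append([w / total for w in weights])
    return Z

# Hypothetical toy example: 3 rows, 3 columns, q = 3 categories, 2 row clusters.
Y = [[1, 1, 2], [3, 3, 2], [2, 2, 2]]
pi = [0.5, 0.5]            # mixing proportions (assumed)
mu = [0.0, 0.5, -0.2]      # category intercepts, mu_1 = 0
phi = [0.0, 0.6, 1.0]      # ordered scores, phi_1 = 0, phi_q = 1
alpha = [-1.5, 1.5]        # row-cluster effects (assumed)
beta = [0.0, 0.3, -0.3]    # column effects (assumed)

Z = e_step(Y, pi, mu, phi, alpha, beta)
for i, z in enumerate(Z):
    print(f"row {i}: " + ", ".join(f"{p:.3f}" for p in z))
```

Each row of Z sums to one, so a row is allocated to clusters by degree rather than by a hard assignment; this is the fuzziness that the visualisation tools in the abstract are meant to depict.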