National Technical Reports Library - NTRL

The National Technical Information Service acquires, indexes, abstracts, and archives the largest collection of U.S. government-sponsored technical reports in existence. The NTRL offers free and open online access to these authenticated government technical reports. Technical reports and documents in its repository may also be available online for free from the issuing federal agency, from the U.S. Government Publishing Office’s Federal Digital System website, or through search engines.





Adaptive Dimension Reduction for Clustering High Dimensional Data.


DE2003807420

Publication Date 2002
Personal Author Ding, C.; He, X.; Zha, H.; Simon, H. D.
Page Count 12
Abstract It is well known that for high dimensional data clustering, standard algorithms such as EM and K-means are often trapped in local minima. Many initialization methods have been proposed to tackle this problem, but with only limited success. In this paper we propose a new approach to resolve this problem by repeated dimension reductions such that K-means or EM is performed only in very low dimensions. Cluster membership is utilized as a bridge between the reduced dimensional subspace and the original space, providing flexibility and ease of implementation. Clustering analysis performed on highly overlapped Gaussians, DNA gene expression profiles and internet newsgroups demonstrates the effectiveness of the proposed algorithm.
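
The loop described in the abstract can be sketched roughly as follows. This is one plausible reading of the abstract, not the authors' implementation: the function names, the choice of a (k - 1)-dimensional subspace, the SVD-based initial projection, and the stopping rule are all assumptions made for illustration. K-means runs only in the low-dimensional subspace, and the resulting cluster memberships are used to compute centroids in the original space, which in turn define the next subspace.

```python
import numpy as np


def _centroids(Z, labels, k, rng):
    """Mean of each cluster; an empty cluster falls back to a random point."""
    out = np.empty((k, Z.shape[1]))
    for j in range(k):
        members = Z[labels == j]
        out[j] = members.mean(axis=0) if len(members) else Z[rng.integers(len(Z))]
    return out


def _kmeans(Y, centers, n_iter=50):
    """Plain Lloyd iterations in the reduced space, from the given centers."""
    for _ in range(n_iter):
        dist = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = centers.copy()
        for j in range(len(centers)):
            if np.any(labels == j):
                updated[j] = Y[labels == j].mean(axis=0)
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


def adaptive_reduction_clustering(X, k, n_rounds=10, seed=0):
    """Hypothetical sketch: alternate low-dimensional K-means and re-projection."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)

    # Assumed starting point: project onto the top k-1 principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    Y = Xc @ vt[: k - 1].T
    labels = _kmeans(Y, Y[rng.choice(len(Y), size=k, replace=False)])

    for _ in range(n_rounds):
        # Bridge step: memberships found in the subspace give centroids
        # in the original space, which span the next subspace.
        C = _centroids(Xc, labels, k, rng)
        _, _, vt = np.linalg.svd(C, full_matrices=False)
        Y = Xc @ vt[: k - 1].T

        # Re-cluster in the new subspace, warm-started from current memberships.
        new_labels = _kmeans(Y, _centroids(Y, labels, k, rng))
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Calling `adaptive_reduction_clustering(X, k=3)` on an `(n_samples, n_features)` array returns one integer label per row. Swapping `_kmeans` for an EM step would follow the same pattern, since only the memberships cross between the two spaces.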
Keywords
  • Algorithms
  • Clustering
  • Dimensions
  • Iterations
  • Attributes
  • Classification
  • Data processing
  • Vectors
  • Configuration
  • Implementation
Source Agency
  • Technical Information Center Oak Ridge Tennessee
Corporate Authors Lawrence Berkeley National Lab., CA.; Pennsylvania State Univ., University Park.; Department of Energy, Washington, DC.
Supplemental Notes Prepared in cooperation with Pennsylvania State Univ., University Park. Sponsored by Department of Energy, Washington, DC.
Document Type Technical Report
NTIS Issue Number 200317