21 June 2002 Representation and classification for high-throughput data
Lodewyk F. A. Wessels, Marcel J. T. Reinders, Tibor van Welsem, Petra M. Nederlof
Abstract
Survival prediction and optimal treatment choice for cancer patients depend on correct disease classification. This classification can be improved significantly when high-throughput data, such as microarray expression data, are employed. These data sets usually suffer from the dimensionality problem: many features and few patients. Consequently, care must be taken when performing feature selection and designing classifiers for disease classification. In this paper we investigate several issues associated with this problem, including 1) data representation, 2) the type of classifier employed, and 3) classifier construction, with specific emphasis on feature selection approaches. More specifically, 'filter' and 'wrapper' approaches for feature selection are studied. The different representations, selection criteria, classifiers and feature selection approaches are evaluated with regard to their effect on true classification performance. As test cases we employ a Comparative Genomic Hybridization breast cancer data set and two publicly available gene expression data sets.
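The distinction between 'filter' and 'wrapper' feature selection can be illustrated with a minimal sketch. Nothing below is taken from the paper: the synthetic data, the signal-to-noise filter criterion, the nearest-centroid classifier, the greedy forward search and all function names are assumptions chosen only to show the two ideas on a "many features, few patients" problem.

```python
import random
from statistics import mean, pstdev


def make_data(n_per_class=10, n_features=200, n_informative=3, shift=2.5, seed=0):
    """Synthetic two-class data with few samples and many features (illustrative only)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += shift * label  # only the first few features carry class signal
            X.append(row)
            y.append(label)
    return X, y


def snr_score(X, y, j):
    """Assumed filter criterion: absolute signal-to-noise ratio of feature j."""
    a = [x[j] for x, c in zip(X, y) if c == 0]
    b = [x[j] for x, c in zip(X, y) if c == 1]
    return abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-12)


def filter_select(X, y, k):
    """Filter: score each feature on its own, without consulting any classifier."""
    ranked = sorted(range(len(X[0])), key=lambda j: snr_score(X, y, j), reverse=True)
    return ranked[:k]


def nearest_centroid_predict(X_train, y_train, x, feats):
    """Assign x to the class whose centroid (on the chosen features) is closest."""
    dists = {}
    for c in sorted(set(y_train)):
        rows = [r for r, lab in zip(X_train, y_train) if lab == c]
        centroid = [mean(r[j] for r in rows) for j in feats]
        dists[c] = sum((x[j] - m) ** 2 for j, m in zip(feats, centroid))
    return min(dists, key=dists.get)


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of the nearest-centroid classifier on a feature subset."""
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i]
    return hits / len(X)


def wrapper_select(X, y, k, candidates):
    """Wrapper: greedy forward search that scores subsets by classifier accuracy."""
    chosen = []
    for _ in range(k):
        best = max((j for j in candidates if j not in chosen),
                   key=lambda j: loo_accuracy(X, y, chosen + [j]))
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    X, y = make_data()
    top = filter_select(X, y, 5)            # cheap pre-selection by the filter
    chosen = wrapper_select(X, y, 2, top)   # classifier-driven refinement
    # This accuracy is optimistically biased: the same samples were used for selection.
    print(top, chosen, loo_accuracy(X, y, chosen))
```

The accuracy printed here is measured on the same samples that guided the selection, so it overstates performance. An estimate closer to true classification performance would repeat the whole selection step inside each cross-validation fold, which this sketch omits for brevity.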
© (2002) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lodewyk F. A. Wessels, Marcel J. T. Reinders, Tibor van Welsem, and Petra M. Nederlof "Representation and classification for high-throughput data", Proc. SPIE 4626, Biomedical Nanotechnology Architectures and Applications, (21 June 2002); https://doi.org/10.1117/12.472086
KEYWORDS
Tumors
Leukemia
Colon
Feature selection
Computer generated holography
Diagnostics
Cancer