Date of Award
1-1-2014
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
College/School/Department
Department of Epidemiology and Biostatistics
Program
Biostatistics
Content Description
1 online resource (xi, 125 pages) : illustrations (some color)
Dissertation/Thesis Chair
Recai M Yucel
Committee Members
A. Gregory DiRienzo, Tao Lu
Keywords
Binary Classification, High-dimensional Microarray Data, Multiple Imputation, Random Forests, Single Imputation, Variable selection, Random graphs, Decision trees, Trees (Graph theory), Big data, Data mining
Subject Categories
Biostatistics | Computer Sciences | Statistics and Probability
Abstract
Binary classification plays an important role in many decision-making processes. Random forests build a strong ensemble classifier by combining weaker classification trees that are de-correlated. The strength of the individual classification trees and the correlation among them are the key factors that determine the ensemble performance of random forests. We propose roughened random forests, a new set of tools that show further improvement over random forests in binary classification. Roughened random forests modify the original dataset for each classification tree, which further reduces the correlation among the individual trees. The modification consists of two steps: artificially imposing missing values that are missing completely at random (MCAR), and then imputing those missing values.
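The two-step data modification described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation. The function names (`roughen`, `build_tree`, `fit_roughened_forest`, `predict_forest`), the use of column-mean single imputation, the bootstrap-then-roughen ordering, the Gini split criterion, and all default parameter values are assumptions made for illustration only.

```python
import random
from collections import Counter

def roughen(X, miss_rate, rng):
    """Impose MCAR missingness, then fill each hole with the column mean
    of the values left observed (mean imputation is an assumption here)."""
    n, p = len(X), len(X[0])
    mask = [[rng.random() < miss_rate for _ in range(p)] for _ in range(n)]
    out = [row[:] for row in X]
    for j in range(p):
        observed = [X[i][j] for i in range(n) if not mask[i][j]]
        fill = sum(observed) / len(observed) if observed else 0.0
        for i in range(n):
            if mask[i][j]:
                out[i][j] = fill
    return out

def gini(labels):
    """Gini impurity for 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2 * p1 * (1 - p1)

def build_tree(X, y, rng, mtry, depth=0, max_depth=5):
    """Small classification tree: a leaf is a label, a node is (j, t, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority
    best = None
    for j in rng.sample(range(len(X[0])), mtry):  # random feature subset
        for t in set(row[j] for row in X):
            left = [i for i in range(len(X)) if X[i][j] <= t]
            right = [i for i in range(len(X)) if X[i][j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:
        return majority
    _, j, t, left, right = best
    return (j, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       rng, mtry, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       rng, mtry, depth + 1, max_depth))

def predict_tree(node, row):
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def fit_roughened_forest(X, y, n_trees=25, miss_rate=0.1, seed=0):
    """Each tree sees its own bootstrap sample with its own roughened copy."""
    rng = random.Random(seed)
    mtry = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap
        Xb = roughen([X[i] for i in idx], miss_rate, rng)     # roughen per tree
        forest.append(build_tree(Xb, [y[i] for i in idx], rng, mtry))
    return forest

def predict_forest(forest, row):
    """Majority vote over the trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]
```

Because every tree is trained on a differently perturbed copy of the data, the trees disagree more often than they would on plain bootstrap samples, which is the de-correlation effect the abstract refers to. Setting `miss_rate=0` in this sketch reduces it to an ordinary bagged forest with random feature subsets.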
Recommended Citation
Xiong, Kuangnan, "Roughened random forests for binary classification" (2014). Legacy Theses & Dissertations (2009 - 2024). 1316.
https://scholarsarchive.library.albany.edu/legacy-etd/1316