Missing Value Imputation Method using Ensemble Technique For Microarray Data

  • Kairung Hengpraprohm Program in Data Science, Faculty of Science and Technology, Nakhon Pathom Rajabhat University, Thailand
  • Suwimol Jungjit Department of Computer and Information Technology, Faculty of Science, Thaksin University, Thailand
Keywords: Missing Value, Data Imputation, Microarray Data, K-nearest neighbor, Euclidean Distance, Manhattan Distance, Cosine, Pearson, Data Mining

Abstract

This paper proposes a new missing value imputation method for microarray data based on an ensemble technique (KNN-Ensemble). We ran experiments on three standard benchmark microarray datasets: Colon, Prostate, and Ovarian. Four distance functions for KNN imputation were studied. The experiment has three steps: (1) selecting the two best distance functions for KNN imputation, one for measuring the distance between samples and one for measuring the distance between features; (2) estimating missing values with KNN-Ensemble using the two functions selected in the first step; and (3) comparing the performance of the ensemble method with other well-known imputation algorithms, namely original KNN and Row-Average imputation. The experimental results show that KNN-Ensemble using the Manhattan and Euclidean distance functions outperformed the baseline imputation methods on all three datasets.
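
The idea of combining a sample-wise and a feature-wise KNN estimate can be illustrated with a minimal Python sketch. The abstract does not specify how the two estimates are combined, how k is chosen, or which of the two selected distances is applied to samples and which to features, so everything below is an assumption made for illustration: rows are treated as samples and columns as features, the two estimates are simply averaged, Manhattan distance is assigned to samples and Euclidean to features arbitrarily, and the function names (`knn_impute_rows`, `knn_ensemble_impute`) are hypothetical. It is not the authors' implementation.

```python
import numpy as np

def manhattan(a, b):
    return float(np.sum(np.abs(a - b)))

def euclidean(a, b):
    return float(np.sqrt(np.sum((a - b) ** 2)))

def knn_impute_rows(X, k, dist):
    """Fill each NaN from the k nearest rows that observe that column.

    Distances are computed only over columns observed in both rows.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        candidates = []
        for r in range(X.shape[0]):
            if r == i or np.isnan(X[r, j]):
                continue  # a neighbour must have a value in column j
            shared = ~np.isnan(X[i]) & ~np.isnan(X[r])
            if shared.any():
                candidates.append((dist(X[i, shared], X[r, shared]), X[r, j]))
        if candidates:
            candidates.sort(key=lambda t: t[0])
            filled[i, j] = np.mean([value for _, value in candidates[:k]])
    return filled

def knn_ensemble_impute(X, k=3, sample_dist=manhattan, feature_dist=euclidean):
    """Average a sample-wise and a feature-wise KNN estimate (assumed combiner)."""
    X = np.asarray(X, dtype=float)
    by_samples = knn_impute_rows(X, k, sample_dist)        # neighbours are rows
    by_features = knn_impute_rows(X.T, k, feature_dist).T  # neighbours are columns
    # Observed entries are kept; only missing entries take the ensemble estimate.
    return np.where(np.isnan(X), (by_samples + by_features) / 2.0, X)
```

Calling `knn_ensemble_impute(X, k=3)` on a samples-by-features array with `np.nan` marking missing entries returns a copy with those entries filled. An entry stays `np.nan` if either direction finds no usable neighbour, which a fuller implementation would need to handle.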

Published
2018-12-24
Section
Research Paper