The Analysis of Genomewide SNP Data Using Nonparametric and Kernel Machine Regression

Authors

  • Pianpool Kirdwichai
  • Mohamed Fazil Baksh, Department of Mathematics and Statistics, School of Mathematical and Physical Sciences, University of Reading, Mathematics Building

Keywords:

correlation structure, genomewide, multiple testing, nonparametric regression

Abstract

This paper illustrates a novel use of nonparametric regression for the challenging problem of reliably identifying true association patterns in high-dimensional data without the increase in false positives that is inherent in existing methods. The proposed nonparametric association test (NPAT) treats p-values from multiple hypothesis tests as summaries of association that preserve the correlation in the data, and capitalises on this correlation to increase power while minimising false discoveries relative to existing methods. Distributional results are used to support estimation of the tuning parameter and significance thresholds for NPAT. The method is applied to the WTCCC study of Crohn's disease, and the results are compared with those of a sequence kernel association test (SKAT), which conversely uses nonparametric regression techniques to group sets of explanatory variables prior to association testing. Results show that NPAT is computationally tractable and produces findings comparable with Bonferroni correction, while SKAT misses a strong association signal in the data.
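
To make the general idea of smoothing per-SNP p-values concrete, here is a minimal sketch. It is not the authors' NPAT procedure: it assumes a plain Nadaraya-Watson smoother with a Gaussian kernel applied to -log10(p), a hypothetical bandwidth of 3 SNP positions, and made-up p-values with a block of ten neighbouring, moderately small values. The Bonferroni threshold is computed only for comparison.

```python
import numpy as np

def nadaraya_watson(positions, values, bandwidth):
    """Gaussian-kernel Nadaraya-Watson estimate evaluated at every position."""
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise scaled distances between evaluation points and data points.
    scaled = (positions[:, None] - positions[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled ** 2)
    # Weighted average of the values around each position.
    return (weights @ values) / weights.sum(axis=1)

# Made-up data (assumption): 200 SNPs with unremarkable p-values in [0.2, 0.8),
# generated deterministically so the example is reproducible.
n_snps = 200
positions = np.arange(n_snps)
p_values = 0.2 + 0.6 * ((positions * 37) % 100) / 100.0

# Hypothetical block of ten neighbouring SNPs with moderately small p-values.
p_values[95:105] = 1e-3

# Bonferroni threshold at a family-wise level of 0.05, for comparison only.
threshold = 0.05 / n_snps

# Smooth the association summaries on the -log10 scale.
bandwidth = 3.0  # hypothetical tuning parameter, not estimated from data
smoothed = nadaraya_watson(positions, -np.log10(p_values), bandwidth)
peak = int(np.argmax(smoothed))

print("Bonferroni threshold:", threshold)
print("Any single p-value below threshold:", bool((p_values < threshold).any()))
print("Position of smoothed peak:", peak)
```

In this toy input no individual p-value falls below the Bonferroni threshold, yet the smoothed curve peaks inside the block of neighbouring small p-values. How the tuning parameter and the significance threshold for the smoothed statistic are actually chosen is the subject of the paper's distributional results and is not reproduced here.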

Published

2019-06-19

Section

Research Articles