This package provides a set of tool for performing association tests to identify (e)QTLs in genome-wide association studies (GWAS,MWAS,EWAS,PWAS). It integrates three methods, Linear Model, Local Polynomial Fitting (Nonlinear Model) and Generalized Additive Model (GAM) to carry out genome-wide scanning. These methods can be also applied to case-control studies, where the ROC is used to assess the model performance.
This vignette demonstrates the utility of "VariantScan" for association testing in genomics. In this vignette, we take the simulated genotypes of 640 individuals as an example. We compared two methods, linear regression and local polynomial regression for identifying QTLs.
#Install from CRAN
#install.packages("VariantScan")
## or you can get the latest version of HierDpart from github
#library(devtools)
#install_github("xinghuq/VariantScan")
library("ggplot2")
library("VariantScan")
# example genepop file
f <- system.file('extdata',package='VariantScan')
infile <- file.path(f, "sim1.csv")
## read genotype file
geno=read.csv(infile)
# traits
traitq=geno[,14]
genotype=geno[,-c(1:14)]
# get PCs as covariates
PCs=prcomp(genotype)
summary(PCs$x[,1:2])
#> PC1 PC2
#> Min. :-4.63463 Min. :-4.21373
#> 1st Qu.:-1.65826 1st Qu.:-1.25936
#> Median :-0.02212 Median : 0.04811
#> Mean : 0.00000 Mean : 0.00000
#> 3rd Qu.: 1.61079 3rd Qu.: 1.19832
#> Max. : 4.51837 Max. : 4.64150
loessW=VScan(x=genotype,y=(traitq),methods ="loess")
loessWcv=VScan(x=genotype,y=(traitq),U=PCs$x[,1:2],methods ="loess")
lmW=VScan(x=genotype,y=(traitq),methods ="lm")
lmWcv=VScan(x=genotype,y=(traitq),U=PCs$x[,1:2],methods ="lm")
Plot Manhattan plot
##
Loci<-rep("Neutral", 1000)
Loci[c(201,211,221,231,241,251,261,271,281,291)]<-"QTL"
Selected_Loci<-Loci[-which(Loci=="Neutral")]
g1=ggplot() +
geom_point(aes(x=which(Loci=="Neutral"), y=-log10(lmWcv$p_norm$p.value[-which(Loci!="Neutral")])), col = "gray83") +
geom_point(aes(x=which(Loci!="Neutral"), y=-log10(lmWcv$p_norm$p.value[-which(Loci=="Neutral")]), colour = Selected_Loci)) +
xlab("SNPs") + ylab("-log10(p-value)") +ylim(c(0,35))+theme_bw()
g1
Fig.1 Manhattan plot for linear model
## Manhattan plot
g2=ggplot() +
geom_point(aes(x=which(Loci=="Neutral"), y=-log10(loessWcv$p_norm$p.value[-which(Loci!="Neutral")])), col = "gray83") +
geom_point(aes(x=which(Loci!="Neutral"), y=-log10(loessWcv$p_norm$p.value[-which(Loci=="Neutral")]), colour = Selected_Loci)) +
xlab("SNPs") + ylab("-log10(p-value)") +ylim(c(0,35))+theme_bw()
g2
Fig.2 Manhattan plot for local polynomial regression model
W. S. Cleveland, E. Grosse and W. M. Shyu (1992) Local regression models. Chapter 8 of Statistical Models in S eds J.M. Chambers and T.J. Hastie, Wadsworth & Brooks/Cole.