RFdr: Jointly determining significance levels for primary and replication studies in two-stage GWASs

About RFdr

In genome-wide association studies (GWASs), associations between genetic variants and diseases or traits are normally discovered in a primary study and validated in a replication study. Associations that reach significance in both studies are treated as true findings. An important question in this two-stage setting is how to determine the significance levels for the two studies. Traditional methods determine the significance levels of the primary and replication studies separately. We argue that this separate determination reduces the power of the overall two-stage study.

Here, we implement a novel method that determines the two significance levels jointly. It finds the most powerful pair of significance levels while controlling the false discovery rate (Fdr) of the two-stage study at a specified level.
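
The idea of choosing the two thresholds together can be illustrated with a small simulation in base R. This is only a toy sketch and not the RFdr algorithm: the number of SNPs, the share of associated SNPs, the effect size, and the threshold grid are all made-up assumptions, and the "joint" search here cheats by using the simulated ground truth to measure the false discovery proportion, whereas RFdrControl works from summary statistics.

```r
# Toy simulation (all settings are hypothetical, chosen for illustration only).
set.seed(1)
m <- 20000                        # number of SNPs
is_assoc <- runif(m) < 0.01       # about 1% of SNPs are truly associated
mu <- ifelse(is_assoc, 3.5, 0)    # mean z-score: shifted for associated SNPs
z1 <- rnorm(m, mean = mu)         # z-scores in the primary study
z2 <- rnorm(m, mean = mu)         # z-scores in the replication study

# A finding is a SNP whose |z| exceeds the threshold in BOTH studies.
evaluate <- function(t1, t2) {
  found <- abs(z1) > t1 & abs(z2) > t2
  n_found <- sum(found)
  c(t1 = t1, t2 = t2,
    fdp = if (n_found > 0) sum(found & !is_assoc) / n_found else 0,
    power = sum(found & is_assoc) / sum(is_assoc))
}

# Separate strategy: Bonferroni in the primary study, then Bonferroni over
# the SNPs that passed the primary study in the replication study.
t1_sep <- qnorm(1 - 0.05 / (2 * m))
n_pass <- max(sum(abs(z1) > t1_sep), 1)
t2_sep <- qnorm(1 - 0.05 / (2 * n_pass))
separate <- evaluate(t1_sep, t2_sep)

# Joint strategy (oracle version): scan a grid of threshold pairs and keep
# the pair with the highest power among those with proportion <= 5%.
grid <- expand.grid(t1 = seq(1, 6, by = 0.25), t2 = seq(1, 6, by = 0.25))
res <- t(mapply(evaluate, grid$t1, grid$t2))
ok <- res[res[, "fdp"] <= 0.05, , drop = FALSE]
joint <- ok[which.max(ok[, "power"]), ]

print(rbind(separate = separate, joint = joint))
```

Each printed row gives the threshold pair together with its false discovery proportion and power, so the two strategies can be compared side by side under these assumptions.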


Related Publication
W. Jiang and W. Yu,
"Jointly determining significance levels for primary and replication studies by controlling the false discovery rate in two-stage genome-wide association studies,"
in preparation.

Where to download RFdr

The R package is available at:
Windows:  RFdr_1.0.zip
Linux:    RFdr_1.0.tar.gz

The manual is available at: RFdr-manual.pdf


Environment configuration

The package can be installed directly in the R environment with the following command:

Windows:  install.packages("RFdr_1.0.zip", repos=NULL)
Linux:    install.packages("RFdr_1.0.tar.gz", repos=NULL)


Use the following command to load the package in the R environment:

library("RFdr")
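
The two installation commands can be folded into one script that picks the archive for the current platform. This is a minimal sketch, assuming the downloaded archive sits in the current working directory; the variable name pkg_file is introduced here only for illustration.

```r
# Choose the archive that matches the platform (file names as listed above).
pkg_file <- if (.Platform$OS.type == "windows") {
  "RFdr_1.0.zip"
} else {
  "RFdr_1.0.tar.gz"
}

# Assumption: the archive was downloaded into the working directory.
if (file.exists(pkg_file)) {
  install.packages(pkg_file, repos = NULL)  # install from the local file
  library("RFdr")                           # load the installed package
} else {
  message("Archive ", pkg_file, " not found in ", getwd())
}
```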

How to use it?

The main function of the RFdr package is RFdrControl.

1. To jointly determine significance levels, you need to obtain summary statistics for each genotyped SNP in both the primary and replication studies. We have included example summary statistics (smryStats1 and smryStats2) in the package. You can use data(smryStats1) and data(smryStats2) to load the example data. You can also obtain the ground-truth parameters (allele frequencies, odds ratios) of the example data using data(param).

2. You can use RFdrControl to determine significance levels.

RFdrControl(I1, I2, z1, z2, initThld=c(0,0), K=2, q=0.05, beta0=length(z1)/5, plot=T,
output=T, dir='output')


Details about the function can be seen using help(RFdrControl).
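
Before calling RFdrControl on your own data, it can help to look at the bundled example objects and at the function's argument list. The sketch below uses only base R inspection tools; it assumes that each data() call creates an object with the same name as the dataset, and it makes no assumption about the internal layout of those objects or about what I1, I2, z1 and z2 should contain (see help(RFdrControl) for that). The variable rfdr_available is introduced here only for illustration.

```r
# Only run the inspection if the package is actually installed.
rfdr_available <- requireNamespace("RFdr", quietly = TRUE)

if (rfdr_available) {
  library("RFdr")

  # Load the example data shipped with the package.
  data(smryStats1)   # summary statistics, primary study
  data(smryStats2)   # summary statistics, replication study
  data(param)        # ground-truth parameters of the example data

  # Show the structure of each object without assuming its layout.
  str(smryStats1)
  str(smryStats2)
  str(param)

  # Print the argument list and defaults of the main function.
  print(args(RFdrControl))
} else {
  message("RFdr is not installed; install it first as described above.")
}
```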