Data quality control in genetic case-control association studies

Discuss how population stratification may affect the interpretation of case-control genetic association studies. This article provides a broad outline of the design and analysis of genetic association studies, focusing on case-control studies in candidate genes or regions. However, performing genetic association studies correctly requires specific knowledge of genetics, statistics, and bioinformatics. For analysis of case-control genetic association studies, it has recently been shown that gene-environment independence in the population can be leveraged to increase efficiency. Consider data for a case-control study of genetic association as in Table 1. A variety of methods have been proposed to this end, mostly statistical in nature and differing in assumptions and type of model employed.

While the protocol applies to genotypes after they have been determined (called) from probe intensity data, it is still important to understand how the genotype calling was conducted. This protocol describes how to perform basic statistical analysis in a population-based genetic association case-control study. Case-control association studies are an increasingly popular approach to identifying genes that cause neuropsychiatric disorders. Analysis of case-control studies of genetic and environmental. Data quality control in genetic case-control association studies, by Carl A. Anderson, Fredrik H. Pettersson, Geraldine M. Clarke, Lon R. Cardon, Andrew P. Morris, and Krina T. Zondervan. Basic statistical analysis in genetic case-control studies, by Geraldine M. Clarke, Carl A. Anderson, Fredrik H. Pettersson, Lon R. Cardon, Andrew P. Morris, and Krina T. Zondervan. Genome-wide association studies and CRISPR/Cas9-mediated gene. Strengthening the reporting of genetic association studies. The principal line of investigation in genome-wide association studies (GWAS) is the identification of main effects, that is, individual single-nucleotide polymorphisms (SNPs) that are associated with the trait of interest, independent of other factors.

A consequence of the rapid developments in the field of genetic association studies is the large number of publications. The transition from genetic linkage analyses to association studies (Risch and Merikangas, 1996). Genetic case-control association studies: correcting for multiple testing, by Dale R. Nyholt. Three lectures on case-control genetic association analysis. Genetic association studies comprise candidate gene and genome-wide association studies, often with a case-control study design. The basic idea is to test whether genetic polymorphisms (alleles) are associated with disease status. The traditional approach for analysis of case-control studies is prospective logistic regression. Let the genotype frequencies for cases and controls be p_j and q_j, j = 0, 1, 2, respectively.
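As a rough illustration of testing whether genotype frequencies differ between cases and controls, the sketch below runs a Pearson chi-square test on a 2x3 table of genotype counts. The function name and the counts are hypothetical and are not taken from any of the studies mentioned here; the sample proportions stand in for the frequencies p_j and q_j.

```python
import math


def genotype_chi_square(case_counts, control_counts):
    """Pearson chi-square test on a 2x3 table of genotype counts.

    Each argument holds three counts, for genotypes j = 0, 1, 2.
    Returns (statistic, p_value) using 2 degrees of freedom.
    """
    rows = [list(case_counts), list(control_counts)]
    row_totals = [sum(r) for r in rows]
    col_totals = [rows[0][j] + rows[1][j] for j in range(3)]
    n = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and genotype column needs a nonzero total")
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            stat += (rows[i][j] - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square tail probability is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical genotype counts for j = 0, 1, 2 (illustration only).
cases = [30, 50, 20]
controls = [50, 40, 10]

p_hat = [c / sum(cases) for c in cases]        # estimated p_j in cases
q_hat = [c / sum(controls) for c in controls]  # estimated q_j in controls
stat, p_value = genotype_chi_square(cases, controls)
print(p_hat, q_hat, round(stat, 3), round(p_value, 4))
```

This genotype-based test makes no assumption about the disease model; trend or allele-based tests trade that generality for power under specific models.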

Genomic control, a new approach to genetic-based association. Traditional epidemiological studies focus on assessing the impact of specific risk factors on disease risk in populations. Be familiar with the methods used to address population stratification. The probability (34) that all 5 studies detect the association is, assuming the studies are independent, the product of their individual powers, so it can be low even when each study is adequately powered.
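The arithmetic behind the probability that several independent studies all detect the same association can be sketched as follows. The power values in the loop are assumptions chosen only for illustration; the source does not state them here.

```python
def prob_all_detect(power, n_studies):
    """Probability that every one of n independent studies detects a true
    association, when each study has the same power."""
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must lie between 0 and 1")
    return power ** n_studies


# Assumed per-study power values (hypothetical, for illustration only).
for assumed_power in (0.5, 0.8, 0.9):
    print(assumed_power, round(prob_all_detect(assumed_power, 5), 4))
```

Even with a per-study power of 0.9, fewer than 60% of such five-study sets would be expected to replicate unanimously, which is one reason inconsistent replication alone does not rule out a true association.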

Statistical analysis of genome-wide association (GWAS) data. Robust trend tests for genetic association in case-control. Genetic association studies are used to find candidate genes or genome regions that contribute to a specific disease by testing for a correlation between disease status and genetic variation. GWAS for multiple sclerosis (MS): data cleaning, quality control, results. The steps described involve the identification and removal of DNA samples and markers that introduce bias to the study. Similar to previous type 1 diabetes genetic association studies (11, 18), the case-control design of our GWAS meta-analysis did not allow for matching of case subjects to control subjects within the same European population, because control samples were not available in each participating case cohort. Indeed, case-control genetic association studies have already contributed to identifying genes associated with complex disorders, as in the cases of apolipoprotein E4 with late-onset Alzheimer disease (31) and the factor V gene with venous thrombosis. In prospective logistic regression, the basis of inference is formed by the likelihood of the disease outcome data (D) conditional on covariate information (X), ignoring the fact that under the case-control sampling design, data are observed on X conditional on D.

What is a false positive or false negative association, and how can a genome-wide study minimize these types of errors? Case-control studies are observational because no intervention is attempted and no attempt is made to alter the course of the disease. In fact, this is the sine qua non of association-based genetic studies. Genetic epidemiology: association studies and power considerations. Analysis of genetic variants using unrelated subjects in the case-control design. What is the relationship between genomic coverage and the power of genetic. This protocol deals with the quality control (QC) of genotype data from genome-wide and candidate-gene case-control association studies. A genetic association case-control study compares the frequency of alleles or genotypes at genetic marker loci, usually single-nucleotide polymorphisms (SNPs; see Box 1 for a glossary). Nyholt, Human Genetics, volume 109, pages 564-565 (2001). Genetic association: an overview. Context: the search for disease susceptibility genes. We aimed to identify novel rare or low-frequency (by MAF) genetic markers associated with case-control status.

Rare genetic variants of large effect influence risk of type. The goal of a genetic association study is to establish statistical associations between. The central theme in case-control genetic association studies is to efficiently identify genetic markers associated with case-control status. In addition to outlining the published ideas on this method, we describe several extensions. Hence, for case-control studies, test statistics are generally inflated relative to expectation.
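One common way to quantify and correct this kind of inflation is genomic control. The sketch below is a minimal version under stated assumptions: the test statistics are 1-degree-of-freedom chi-square values, the null median is approximated as 0.4549, and the inflation factor is not allowed to fall below 1. The function names and input values are hypothetical.

```python
import statistics

# Approximate median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549


def genomic_inflation_factor(chi_square_stats):
    """Ratio of the observed median statistic to its expected null median."""
    return statistics.median(chi_square_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi_square_stats):
    """Divide every statistic by the inflation factor (never less than 1)."""
    lam = max(1.0, genomic_inflation_factor(chi_square_stats))
    return [s / lam for s in chi_square_stats]


# Hypothetical 1-df chi-square statistics from five markers.
observed = [0.1, 0.5, 0.91, 2.0, 6.0]
lam = genomic_inflation_factor(observed)
adjusted = genomic_control_adjust(observed)
print(round(lam, 3), [round(s, 3) for s in adjusted])
```

In practice the factor is estimated from many markers assumed to be unassociated with the disease; five values are used here only to keep the example short.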

In genetics, a genome-wide association study (GWA study, or GWAS), also known as a whole genome association study (WGA study, or WGAS), is a genome-wide observational study. A case-control study (also known as a case-referent study) is a type of observational study in which two existing groups differing in outcome are identified and compared. The goal is to retrospectively determine the exposure to the risk factor of interest from each of the two groups of individuals. This protocol details the steps for data quality assessment and control that are typically carried out during case-control association studies. The steps described involve (i) appropriate selection of measures of association and relevance of disease models. Despite the many similarities between genetic association studies and classical observational epidemiologic studies (that is, cross-sectional, case-control, and cohort) of lifestyle and environmental factors, genetic association studies present several specific challenges, including an unprecedented volume of new data. Genetic factors are likely to affect the occurrence of numerous common diseases, and therefore identifying and characterizing the associated risk or protection will be important in improving the understanding of etiology.

Combining case-control and case-trio data from the same. Powerful statistical methods are critical to accomplishing this goal. Samples in genetic case-control association analyses. Understand the conditions under which population stratification can occur.

For each study design our goal is to achieve control similar to that obtained for a family-based study, but with the convenience found in a population-based study. The rapidly evolving evidence on genetic associations is crucial to integrating human genomics into the practice of medicine and public health (1, 2). The genome-wide association study (GWAS) is increasingly common as an experimental design for investigating the genetic basis of common diseases and complex traits in humans (Teo). Statistical methods to test for association in case-control GWA studies: allele counting, chi-square test, logistic regression, multiple testing and power, example.
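The allele-counting chi-square test named in this list can be sketched as below. Genotype counts are collapsed into a 2x2 table of allele counts and tested with 1 degree of freedom. The function name and the counts are hypothetical, and the test as written assumes each allele can be treated as an independent observation.

```python
import math


def allelic_chi_square(case_genotypes, control_genotypes):
    """Allele-counting chi-square test with 1 degree of freedom.

    Each argument is (n0, n1, n2): the number of individuals carrying
    0, 1 or 2 copies of the allele of interest.
    Returns (statistic, p_value, odds_ratio).
    """
    def allele_counts(genotypes):
        n0, n1, n2 = genotypes
        return n1 + 2 * n2, 2 * n0 + n1  # (allele of interest, other allele)

    a, b = allele_counts(case_genotypes)
    c, d = allele_counts(control_genotypes)
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0 or b * c == 0:
        raise ValueError("table has an empty margin or an undefined odds ratio")
    stat = n * (a * d - b * c) ** 2 / denominator
    # Tail probability of a 1-df chi-square variable: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    odds_ratio = (a * d) / (b * c)
    return stat, p_value, odds_ratio


# Hypothetical genotype counts (illustration only).
stat, p_value, odds_ratio = allelic_chi_square([30, 50, 20], [50, 40, 10])
print(round(stat, 3), round(p_value, 5), round(odds_ratio, 3))
```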

The case-control study design is often used in the study of rare diseases or as a preliminary study where little is known about the association between the risk factor and the disease of interest. Assume that 5 investigators conduct independent case-control association studies of the same genetic marker and neuropsychiatric disorder and that each study has the same power. Consider two alleles, N and M, where N is a normal allele and M is an allele with high risk. The simulated data used here have passed standard quality control. Describe what is meant by population stratification. Practice of Epidemiology: on information coded in gene. Regardless of whether a single biallelic SNP is under consideration in a candidate gene study or thousands in genome-wide association studies, analyses are usually carried out 1 SNP at a time, with subsequent adjustment for multiple testing (9, 10).
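One simple form of adjustment for multiple testing is the Bonferroni correction, sketched below. It is named here as one option and is not necessarily the adjustment used in the cited work. The p-values and the number of tests are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error
    rate at or below alpha across n_tests tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical single-SNP p-values from four tests.
p_values = [0.001, 0.02, 0.04, 0.3]
threshold = bonferroni_threshold(0.05, len(p_values))
adjusted = bonferroni_adjust(p_values)
significant = [p <= threshold for p in p_values]
print(threshold, adjusted, significant)
```

Because Bonferroni treats all tests as independent, it tends to be conservative when neighbouring SNPs are correlated.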

This paper aims to provide a guideline for conducting genetic analyses by introducing key concepts and by sharing scripts that can be used for data analysis. Genetic case-control association studies in neuropsychiatry. The STrengthening the REporting of Genetic Association Studies (STREGA) initiative builds on the STrengthening the Reporting of OBservational Studies in Epidemiology (STROBE) statement. A popular statistical method is the model-free Pearson's chi-square test. Common statistical issues in genome-wide association studies. Robust statistical tests of genetic association for the case. We describe how to use PLINK, a tool for handling SNP data, to perform assessments of failure rate per individual and per SNP.
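To make the two failure rates concrete, the sketch below computes them in plain Python from a small genotype matrix. It is not PLINK and does not reproduce its file formats or options; the data layout (rows are individuals, columns are SNPs, None marks a failed call), the function name and the 0.4 cut-off are all assumptions made for illustration.

```python
def missing_rates(genotypes):
    """Return (per_individual, per_snp) genotype failure rates.

    genotypes is a list of rows, one per individual. Each row lists that
    individual's genotype calls (0, 1 or 2) and uses None for a failed call.
    """
    n_individuals = len(genotypes)
    n_snps = len(genotypes[0])
    if any(len(row) != n_snps for row in genotypes):
        raise ValueError("all individuals must have the same number of SNPs")
    per_individual = [
        sum(call is None for call in row) / n_snps for row in genotypes
    ]
    per_snp = [
        sum(row[j] is None for row in genotypes) / n_individuals
        for j in range(n_snps)
    ]
    return per_individual, per_snp


# Hypothetical calls for 3 individuals at 4 SNPs.
genotypes = [
    [0, 1, 2, None],
    [1, None, 2, None],
    [0, 0, 1, 2],
]
per_individual, per_snp = missing_rates(genotypes)

# Hypothetical cut-off: flag anything failing in more than 40% of calls.
flagged_individuals = [i for i, r in enumerate(per_individual) if r > 0.4]
flagged_snps = [j for j, r in enumerate(per_snp) if r > 0.4]
print(per_individual, per_snp, flagged_individuals, flagged_snps)
```

Real studies typically use much stricter cut-offs than the one shown, chosen after inspecting the distribution of failure rates.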