David W. Craig, Ph.D.


David W. Craig serves as Vice-Chair of USC’s new Department of Translational Genomics within the USC Keck School of Medicine. Dr. Craig’s expertise is in genomics, bioinformatics, and the analysis of high-throughput genomics data. His laboratory consists of both a wet lab and a dry lab. Within his group, lab personnel have the opportunity either to specialize or to become dual-trained in bioinformatics and molecular biology.

His group pioneered cost-effective GWAS methods leading to genetic associations reported in Science, Nature Genetics, and the New England Journal of Medicine. His publications include some of the most significant papers addressing the challenges of data sharing and data privacy (Homer et al., PLoS Genetics, 2008). Since 2006, his team has been developing tools based on NGS, beginning with the publication of one of the first papers on targeted variant calling in humans (Craig et al., Nature Methods, 2008). In the past 8 years, they have published and collaborated on over 60 NGS publications, balanced between the wet and dry labs. During this time, he has served on several international genomics projects, including as a PI on a U01 responsible for developing bioinformatic pipelines for the Phase I and Phase II portions of the 1000 Genomes Project.

With collaborators, his group was among the first to implement NGS in molecular profiling for cancer patient treatment recommendations, in a feasibility study in metastatic triple-negative breast cancer (Craig et al., Mol Cancer Ther. 2013). Building upon this and other efforts, his team developed an end-to-end platform for personalized medicine, NGS data management, analysis, and clinical genomic interpretation following CAP/CLIA guidelines. Within this framework, they completed analytical validation for integrated RNA/DNA analysis of tumor/normal sets. Community resources from these efforts include a collaborative release of COLO829 tumor/normal sequencing reference sets. He was also a founding scientific director for TGen’s Center for Rare Childhood Disorders, a research clinic enrolling over 1000 individuals into a study developing integrative RNA/DNA approaches for identifying the germline genetic basis of disease.



Professor of Translational Genomics
Co-Director, Institute for Translational Genomics
Department of Translational Genomics
Health Sciences Campus
NRT 1450 Biggy Street, Los Angeles 



Our lab works where engineering, biotechnology, and clinical care interface, with a focus on impacting individual-level patient care. Genomics is playing a major role in precision medicine: next-generation sequencing (NGS) technologies have brought these capabilities to the patient level, and the last five years have seen a burgeoning of individual-level patient data. Indeed, we have entered a unique time, one that is fundamentally altering our ability to probe individual-level data at the molecular, cellular, and systems scale and to assimilate these data to better understand treatment decisions. It is a period that follows a revolution in single-molecule nucleic acid measurements from next-generation sequencing technologies and that precedes making these measurements across thousands of cells individually. This is the focus of our research, with trainees, students, and scientists working with large datasets both at the bench and in the dry lab.


In the lab today, we are moving from the ability to generate genome-wide sequence data for a single patient to, in the coming year, being able to generate single-molecule data (such as from exome sequencing, RNA-seq, or methyl-Seq) not just from one cell but from hundreds to thousands of cells, together with cellular biology and different clinical measurements (such as from imaging and pathology). While the technology and data are forthcoming, considerable research and development is needed to optimally leverage and bridge data spanning molecules, cells, and tissues. Consequently, the goal of this academic program is to research and develop approaches for integrating and modelling data spanning from individual nucleic acid molecules and cells to system-level clinical measurements, in order to improve our ability to understand disease diagnosis, treatment, progression, and prevention.



Bioinformatics and Data Science

One of the largest areas of research within our group is bioinformatics, where principles of data science, clinical care, and biotechnology converge. We have several thrusts, largely centered on the integration of multi-omic data.

Data Integration, Harmonization & Establishment of Analysis Approaches. It is critical that the Bioinformatics Core has processes for testing, validating, and integrating new tools as they emerge and are needed by the other projects and cores. Indeed, the analysis field for NGS continues to evolve and grow at an amazing pace, producing new tools even while previous tools have only just been placed in production. Moreover, the optimal approach for integration and variant calling is typically specific to a study, depending on the technology, sample availability, and sample conditions. Change is inevitable and rapid, and we expect that there will be a need to implement new tools and analysis approaches even between the time of submission and the time this study commences.



Development of Knowledge Portals


Integration and modern software frameworks are critical for truly exploring big data. Graphs are moving beyond paper to become dynamic and interactive, allowing one to explore and truly interact with data. Our group is involved in building full stacks to support this, whereby terabytes of data are reduced and integrated into web portals. One such effort is through the Michael J. Fox Foundation, where we are, in collaboration with others, developing a portal to explore thousands of genomic DNA samples with RNA. This interactive framework allows for inspection of different phenotypes. Built upon a D3.js, MongoDB, and BigTable framework, it is easily deployable within controlled environments. This project is one of several we are working on and where opportunities for further research are

Integrative analysis of RNA



Enrichment of PI3K-AKT-mTOR Pathway Activation in Hepatic Metastases from Breast Cancer. Clin Cancer Res. 2017 Aug 15; 23(16):4919-4928. View in: PubMed

Comprehensive Genomic Analysis of Metastatic Mucinous Urethral Adenocarcinoma Guides Precision Oncology Treatment: Targetable EGFR Amplification Leading to Successful Treatment With Erlotinib. Clin Genitourin Cancer. 2017 Aug; 15(4):e727-e734. View in: PubMed

Integrated genomic analyses reveal frequent TERT aberrations in acral melanoma. Genome Res. 2017 Apr; 27(4):524-532. View in: PubMed

A gain-of-function mutation in the GRIK2 gene causes neurodevelopmental deficits. Neurol Genet. 2017 Feb; 3(1):e129. View in: PubMed

Case report: whole exome sequencing of primary cardiac angiosarcoma highlights potential for targeted therapies. BMC Cancer. 2017 Jan 05; 17(1):17. View in: PubMed

A prospective pilot study of genome-wide exome and transcriptome profiling in patients with small cell lung cancer progressing after first-line therapy. PLoS One. 2017; 12(6):e0179170. View in: PubMed

Case Report: Novel mutations in TBC1D24 are associated with autosomal dominant tonic-clonic and myoclonic epilepsy and recessive Parkinsonism, psychosis, and intellectual disability. F1000Res. 2017; 6:553. View in: PubMed

Dystonia in ATP2B3-associated X-linked spinocerebellar ataxia. Mov Disord. 2016 Nov; 31(11):1752-1753. View in: PubMed

An International Standard for holotranscobalamin (holoTC): international collaborative study to assign a holoTC value to the International Standard for vitamin B12 and serum folate. Clin Chem Lab Med. 2016 Sep 01; 54(9):1467-72. View in: PubMed

A de novo missense mutation in ZMYND11 is associated with global developmental delay, seizures, and hypotonia. Cold Spring Harb Mol Case Stud. 2016 Sep; 2(5):a000851. View in: PubMed

Molecular Genetic Profiling of Adolescent Glassy Cell Carcinoma of the Cervix Reveals Targetable EGFR Amplification with Potential Therapeutic Implications. J Adolesc Young Adult Oncol. 2016 Sep; 5(3):297-302. View in: PubMed

Clinical Implementation of Integrated Genomic Profiling in Patients with Advanced Cancers. Sci Rep. 2016 Dec 23; 6(1):25. View in: PubMed

Age-Related Macular Degeneration-Associated Genes in Alzheimer Disease. Am J Geriatr Psychiatry. 2015 Dec; 23(12):1290-1296. View in: PubMed

Plasma Complement factor H in Alzheimer's Disease. J Alzheimers Dis. 2015; 45(2):369-72. View in: PubMed

Amyloid pathway-based candidate gene analysis of [(11)C]PiB-PET in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Brain Imaging Behav. 2012 Mar; 6(1):1-15. View in: PubMed

Germline mutations in HOXB13 and prostate-cancer risk. N Engl J Med. 2012 Jan 12; 366(2):141-9. View in: PubMed

Deep clonal profiling of formalin fixed paraffin embedded clinical samples. PLoS One. 2012; 7(11):e50586. View in: PubMed

Cancer of the ampulla of Vater: analysis of the whole genome sequence exposes a potential therapeutic vulnerability. Genome Med. 2012; 4(7):56. View in: PubMed

Genome-wide characterization of pancreatic adenocarcinoma patients using next generation sequencing. PLoS One. 2012; 7(10):e43192. View in: PubMed

Paired tumor and normal whole genome sequencing of metastatic olfactory neuroblastoma. PLoS One. 2012; 7(5):e37029. View in: PubMed

Induction of pluripotent stem cells from autopsy donor-derived somatic cells. Neurosci Lett. 2011 Sep 20; 502(3):219-24. View in: PubMed

Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nat Genet. 2011 Sep 18; 43(10):977-83. View in: PubMed

The functional spectrum of low-frequency coding variation. Genome Biol. 2011 Sep 14; 12(9):R84. View in: PubMed

Demographic history and rare allele sharing among human populations. Proc Natl Acad Sci U S A. 2011 Jul 19; 108(29):11983-8. View in: PubMed

Variation in genome-wide mutation rates within and between human families. Nat Genet. 2011 Jun 12; 43(7):712-4. View in: PubMed

Genome-wide association of bipolar disorder suggests an enrichment of replicable associations in regions near genes. PLoS Genet. 2011 Jun; 7(6):e1002134. View in: PubMed

Autism and increased paternal age related changes in global levels of gene expression regulation. PLoS One. 2011 Feb 17; 6(2):e16715. View in: PubMed

Mapping copy number variation by population-scale genome sequencing. Nature. 2011 Feb 03; 470(7332):59-65. View in: PubMed

Introduction to genetic epidemiology. Optometry. 2011 Feb; 82(2):83-91. View in: PubMed

Accuracy of CNV Detection from GWAS Data. PLoS One. 2011 Jan 13; 6(1):e14511. View in: PubMed

Genome-wide association study of CSF biomarkers Abeta1-42, t-tau, and p-tau181p in the ADNI cohort. Neurology. 2011 Jan 04; 76(1):69-79. View in: PubMed

Assessing and managing risk when sharing aggregate genetic variant data. Nat Rev Genet. 2011 Sep 16; 12(10):730-6. View in: PubMed

Genomic Copy Number Analysis in Alzheimer's Disease and Mild Cognitive Impairment: An ADNI Study. Int J Alzheimers Dis. 2011; 2011:729478. View in: PubMed

Exonic DNA sequencing of ERBB4 in bipolar disorder. PLoS One. 2011; 6(5):e20242. View in: PubMed

Microarray-based genome-wide association studies using pooled DNA. Methods Mol Biol. 2011; 700:49-60. View in: PubMed

Bar-coded, multiplexed sequencing of targeted DNA regions using the Illumina Genome Analyzer. Methods Mol Biol. 2011; 700:89-104. View in: PubMed

Copy number and targeted mutational analysis reveals novel somatic events in metastatic prostate tumors. Genome Res. 2011 Jan; 21(1):47-55. View in: PubMed

Decreased serum arylesterase activity in autism spectrum disorders. Psychiatry Res. 2010 Dec 30; 180(2-3):105-13. View in: PubMed

Misperceptions of peer norms as a risk factor for sugar-sweetened beverage consumption among secondary school students. J Am Diet Assoc. 2010 Dec; 110(12):1916-21. View in: PubMed

Voxelwise genome-wide association study (vGWAS). Neuroimage. 2010 Nov 15; 53(3):1160-74. View in: PubMed

Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: A study of the ADNI cohort. Neuroimage. 2010 Nov 15; 53(3):1051-63. View in: PubMed

Diversity of human copy number variation and multicopy genes. Science. 2010 Oct 29; 330(6004):641-6. View in: PubMed

A map of human genome variation from population-scale sequencing. Nature. 2010 Oct 28; 467(7319):1061-73. View in: PubMed

Association of CR1, CLU and PICALM with Alzheimer's disease in a cohort of clinically characterized and neuropathologically verified individuals. Hum Mol Genet. 2010 Aug 15; 19(16):3295-301. View in: PubMed

Whole-genome association mapping of gene expression in the human prefrontal cortex. Mol Psychiatry. 2010 Aug; 15(8):779-84. View in: PubMed

Evidence for an association between KIBRA and late-onset Alzheimer's disease. Neurobiol Aging. 2010 Jun; 31(6):901-9. View in: PubMed

Genome-wide analysis reveals novel genes influencing temporal lobe structure with relevance to neurodegeneration in Alzheimer's disease. Neuroimage. 2010 Jun; 51(2):542-54. View in: PubMed

A commonly carried allele of the obesity-related FTO gene is associated with reduced brain volume in the healthy elderly. Proc Natl Acad Sci U S A. 2010 May 04; 107(18):8404-9. View in: PubMed

Alzheimer's Disease Neuroimaging Initiative biomarkers as quantitative phenotypes: Genetics core aims, progress, and plans. Alzheimers Dement. 2010 May; 6(3):265-73. View in: PubMed

Genetic control of individual differences in gene-specific methylation in human brain. Am J Hum Genet. 2010 Mar 12; 86(3):411-9. View in: PubMed

Cerebellar telomere length and psychiatric disorders. Behav Genet. 2010 Mar; 40(2):250-4. View in: PubMed

Whole genome association analysis shows that ACE is a risk factor for Alzheimer's disease and fails to replicate most candidates from Meta-analysis. Int J Mol Epidemiol Genet. 2010; 1(1):19-30. View in: PubMed

Genome-wide SNP genotyping study using pooled DNA to identify candidate markers mediating susceptibility to end-stage renal disease attributed to Type 1 diabetes. Diabet Med. 2009 Nov; 26(11):1090-8. View in: PubMed

Genome-wide scan of 500,000 single-nucleotide polymorphisms among responders and nonresponders to interferon beta therapy in multiple sclerosis. Arch Neurol. 2009 Aug; 66(8):972-8. View in: PubMed

Genetic variants at 6p21.33 are associated with susceptibility to follicular lymphoma. Nat Genet. 2009 Aug; 41(8):873-5. View in: PubMed

Singleton deletions throughout the genome increase risk of bipolar disorder. Mol Psychiatry. 2009 Apr; 14(4):376-80. View in: PubMed

Genetic control of human brain transcript expression in Alzheimer disease. Am J Hum Genet. 2009 Apr; 84(4):445-58. View in: PubMed

Statistical comparison framework and visualization scheme for ranking-based algorithms in high-throughput genome-wide studies. J Comput Biol. 2009 Apr; 16(4):565-77. View in: PubMed

A genome-wide analysis identifies genetic variants in the RELN gene associated with otosclerosis. Am J Hum Genet. 2009 Mar; 84(3):328-38. View in: PubMed

GRM7 variants confer susceptibility to age-related hearing impairment. Hum Mol Genet. 2009 Feb 15; 18(4):785-96. View in: PubMed

Molecular genetics of adult ADHD: converging evidence from genome-wide association and extended pedigree linkage studies. J Neural Transm (Vienna). 2008 Nov; 115(11):1573-85. View in: PubMed

Identification of genetic variants using bar-coded multiplexed sequencing. Nat Methods. 2008 Oct; 5(10):887-93. View in: PubMed

Multimarker analysis and imputation of multiple platform pooling-based genome-wide association studies. Bioinformatics. 2008 Sep 01; 24(17):1896-902. View in: PubMed

Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genet. 2008 Aug 29; 4(8):e1000167. View in: PubMed

Common sequence variants on 20q11.22 confer melanoma susceptibility. Nat Genet. 2008 Jul; 40(7):838-40. View in: PubMed

Genome-wide linkage analysis of ADHD using high-density SNP arrays: novel loci at 5q13.1 and 14q12. Mol Psychiatry. 2008 May; 13(5):522-30. View in: PubMed

Identification of somatic chromosomal abnormalities in hypothalamic hamartoma tissue at the GLI3 locus. Am J Hum Genet. 2008 Feb; 82(2):366-74. View in: PubMed

Sorl1 as an Alzheimer's disease predisposition gene? Neurodegener Dis. 2008; 5(2):60-4. View in: PubMed

Identification of a novel risk locus for multiple sclerosis at 13q31.3 by a pooled genome-wide scan of 500,000 single nucleotide polymorphisms. PLoS One. 2008; 3(10):e3490. View in: PubMed

A survey of genetic human cortical gene expression. Nat Genet. 2007 Dec; 39(12):1494-9. View in: PubMed

The nuts and bolts of gene array technology and its application to drug abuse research. Drug Alcohol Depend. 2007 Nov 02; 91(1):102-6. View in: PubMed

Whole-genome analysis of sporadic amyotrophic lateral sclerosis. N Engl J Med. 2007 Aug 23; 357(8):775-88. View in: PubMed

Polyhydramnios, megalencephaly and symptomatic epilepsy caused by a homozygous 7-kilobase deletion in LYK5. Brain. 2007 Jul; 130(Pt 7):1929-41. View in: PubMed

Calmodulin-binding transcription activator 1 (CAMTA1) alleles predispose human episodic memory performance. Hum Mol Genet. 2007 Jun 15; 16(12):1469-77. View in: PubMed

GAB2 alleles modify Alzheimer's risk in APOE epsilon4 carriers. Neuron. 2007 Jun 07; 54(5):713-20. View in: PubMed

Chromosomal abnormality at 6p25.1-25.3 identifies a susceptibility locus for hypothalamic hamartoma associated with epilepsy. Epilepsy Res. 2007 Jun; 75(1):70-3. View in: PubMed

A high-density whole-genome association study reveals that APOE is the major susceptibility gene for sporadic late-onset Alzheimer's disease. J Clin Psychiatry. 2007 Apr; 68(4):613-8. View in: PubMed

Identification of PVT1 as a candidate gene for end-stage renal disease in type 2 diabetes using a pooling-based genome-wide single nucleotide polymorphism association study. Diabetes. 2007 Apr; 56(4):975-83. View in: PubMed

Identification of a novel risk locus for progressive supranuclear palsy by a pooled genomewide scan of 500,288 single-nucleotide polymorphisms. Am J Hum Genet. 2007 Apr; 80(4):769-78. View in: PubMed

SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays. Bioinformatics. 2007 Jan 01; 23(1):57-63. View in: PubMed

Identification of the genetic basis for complex disorders by use of pooling-based genomewide single-nucleotide-polymorphism association studies. Am J Hum Genet. 2007 Jan; 80(1):126-39. View in: PubMed

SNP-based chromosomal copy number ascertainment following multiple displacement whole-genome amplification. Biotechniques. 2007 Jan; 42(1):77-83. View in: PubMed

Common Kibra alleles are associated with human memory performance. Science. 2006 Oct 20; 314(5798):475-8. View in: PubMed

High-density single nucleotide polymorphism screen in a large multiplex neural tube defect family refines linkage to loci at 7p21.1-pter and 2q33.1-q35. Birth Defects Res A Clin Mol Teratol. 2006 Jun; 76(6):499-505. View in: PubMed

SNiPer: improved SNP genotype calling for Affymetrix 10K GeneChip microarray data. BMC Genomics. 2005 Oct 31; 6:149. View in: PubMed

Genome-wide SNP arrays as a diagnostic tool: clinical description, genetic mapping, and molecular characterization of Salla disease in an Old Order Mennonite population. Am J Med Genet A. 2005 Oct 15; 138A(3):262-7. View in: PubMed

Identification of disease causing loci using an array-based genotyping approach on pooled DNA. BMC Genomics. 2005 Sep 30; 6:138. View in: PubMed

Applications of whole-genome high-density SNP genotyping. Expert Rev Mol Diagn. 2005 Mar; 5(2):159-70. View in: PubMed

The genetics of tethered cord syndrome. Am J Med Genet A. 2005 Feb 01; 132A(4):450-3. View in: PubMed

The Autism Genome Project: goals and strategies. Am J Pharmacogenomics. 2005; 5(4):233-46. View in: PubMed

Structural insights into how the MIDAS ion stabilizes integrin binding to an RGD peptide under force. Structure. 2004 Nov; 12(11):2049-58. View in: PubMed

Mapping of sudden infant death with dysgenesis of the testes syndrome (SIDDT) by a SNP genome scan and identification of TSPYL loss of function. Proc Natl Acad Sci U S A. 2004 Aug 10; 101(32):11689-94. View in: PubMed

Tuning the mechanical stability of fibronectin type III modules through sequence variations. Structure. 2004 Jan; 12(1):21-30. View in: PubMed

Structure and functional significance of mechanically unfolded fibronectin type III1 intermediates. Proc Natl Acad Sci U S A. 2003 Dec 09; 100(25):14784-9. View in: PubMed

Identifying unfolding intermediates of FN-III(10) by steered molecular dynamics. J Mol Biol. 2002 Nov 08; 323(5):939-50. View in: PubMed

A structural model for force regulated integrin binding to fibronectin's RGD-synergy site. Matrix Biol. 2002 Mar; 21(2):139-47. View in: PubMed

Structural insights into the mechanical regulation of molecular recognition sites. Trends Biotechnol. 2001 Oct; 19(10):416-23. View in: PubMed

Comparison of the early stages of forced unfolding for fibronectin type III modules. Proc Natl Acad Sci U S A. 2001 May 08; 98(10):5590-5. View in: PubMed