David W. Craig, Ph.D.


I am the Vice Chair of the Department of Translational Genomics at USC, Co-Director of the Institute for Translational Genomics, and Director of the Molecular Genomics Core at the Norris Comprehensive Cancer Center. Before coming to USC, I was the Deputy Director of Bioinformatics at TGen and Director of the Neurogenomics Division. In these roles, I have led scientific teams and trained new faculty. My expertise is in translational genomics and bioinformatics, with over 200 publications.

Development of integrative bioinformatic tools. Over the past 15 years, my lab has developed experimental and computational genomic tools that connect engineering, biotechnology, and clinical care. We published one of the first methods for Illumina multiplexed sequencing in 2008 (Nature Methods). Integration of DNA and RNA data is a primary focus, with papers and patents on cryptic splicing, fusion detection, X-skewing, and variant prioritization in cancer. A second area of focus has been untangling mixtures, which is particularly important in oncology. These approaches have led to many significant papers, including one of the most influential papers on data privacy (Homer et al., PLOS Genet, 2008).

Translational Genomics. Translating genomics from bench to bedside is the foundation of my research. I co-founded the CAP/CLIA-accredited Ashion Labs with John Carpten (TGen). I also helped start TGen's Center for Rare Childhood Disorders (C4RCD.org), which has enrolled over a thousand families in the study of diseases with unknown genetic causes. I have worked to develop shared standards and datasets, serving as a co-PI on the 1000 Genomes Project. (i) Neurological disorders. I have had collaborative publications in bipolar disorder, Alzheimer's disease, and pediatric neurology. As part of the Accelerating Medicines Partnership in Parkinson's Disease (AMP-PD), we recently led the longitudinal analysis of over 8,500 transcriptomes from 1,600 people. (ii) Oncology. Our research in somatic heterogeneity and disease progression has led to collaborations developing genomic methods in oncology. My group was one of the first to apply whole-genome and transcriptome profiling to the treatment of metastatic triple-negative breast cancer.


Professor of Translational Genomics
Co-Director, Institute for Translational Genomics
Department of Translational Genomics
Health Sciences Campus
NRT 1450 Biggy Street, Los Angeles 



Bioinformatics. To improve outcomes in human disease, our lab focuses on the intersection of engineering, biotechnology, and clinical care. Genomic analysis is a vital part of precision medicine, and patient-by-patient analysis is now possible thanks to next-generation sequencing (NGS) technology. The volume of individual-level patient data has grown significantly during the past five years. We are at a unique point in time, redefining how we examine individual-level data at the molecular, cellular, and systems levels to better inform treatment choices. The next phase, which builds on the emergence of single-molecule nucleic acid measurements from NGS, will extend these measurements across thousands of individual cells. This area of bioinformatics is the main focus of our research, which involves scientists, trainees, and students working with large datasets both at the bench and in the dry lab.

In the lab today, we are moving from generating genome-wide sequence data for a single patient to generating single-molecule data, not just from one cell but from hundreds to thousands of cells, together with cellular biology and clinical measurements (such as imaging and pathology). While the technology and data are forthcoming, considerable research and development are needed to bridge data spanning molecules, cells, and tissues and to use it well. This academic program therefore aims to research and develop ways to combine and model data ranging from individual nucleic acid molecules and cells to system-level clinical measurements. This will help us learn more about diagnosing, treating, preventing, and stopping the spread of disease.


Bioinformatics and Data Science

Bioinformatics, where the ideas of data science, clinical care, and biotechnology all meet, is one of our group's most important research areas. We have several thrusts, primarily centered around the integration of multi-omic data.

Data Integration, Harmonization, and Establishment of Analysis

Bioinformatics must build on processes for testing, validating, and integrating new tools as they are released and as other projects and cores need them. The NGS analysis field continues to evolve and grow at a rapid pace, producing new tools even as earlier ones are only just entering production.


Development of Knowledge Portals


Integration and modern software frameworks are critical for truly exploring big data. Graphs are moving beyond paper to dynamic, interactive visualizations that let users explore and interact with data. Our group uses full-stack development to support these, reducing terabytes of data and integrating them into web portals. One such effort is with the Michael J. Fox Foundation, where we collaborate with others to develop portals for exploring thousands of genomic DNA samples alongside RNA data. This interactive framework allows for the inspection of different phenotypes and is built upon D3.js, MongoDB, and BigTable, which are easily deployable within controlled environments.

Integrative analysis of RNA



Enrichment of PI3K-AKT-mTOR Pathway Activation in Hepatic Metastases from Breast Cancer. Clin Cancer Res. 2017 Aug 15; 23(16):4919-4928. View in: PubMed

Comprehensive Genomic Analysis of Metastatic Mucinous Urethral Adenocarcinoma Guides Precision Oncology Treatment: Targetable EGFR Amplification Leading to Successful Treatment With Erlotinib. Clin Genitourin Cancer. 2017 Aug; 15(4):e727-e734. View in: PubMed

Integrated genomic analyses reveal frequent TERT aberrations in acral melanoma. Genome Res. 2017 Apr; 27(4):524-532. View in: PubMed

A gain-of-function mutation in the GRIK2 gene causes neurodevelopmental deficits. Neurol Genet. 2017 Feb; 3(1):e129. View in: PubMed

Case report: whole exome sequencing of primary cardiac angiosarcoma highlights potential for targeted therapies. BMC Cancer. 2017 Jan 05; 17(1):17. View in: PubMed

A prospective pilot study of genome-wide exome and transcriptome profiling in patients with small cell lung cancer progressing after first-line therapy. PLoS One. 2017; 12(6):e0179170. View in: PubMed

Case Report: Novel mutations in TBC1D24 are associated with autosomal dominant tonic-clonic and myoclonic epilepsy and recessive Parkinsonism, psychosis, and intellectual disability. F1000Res. 2017; 6:553. View in: PubMed

Dystonia in ATP2B3-associated X-linked spinocerebellar ataxia. Mov Disord. 2016 Nov; 31(11):1752-1753. View in: PubMed

An International Standard for holotranscobalamin (holoTC): international collaborative study to assign a holoTC value to the International Standard for vitamin B12 and serum folate. Clin Chem Lab Med. 2016 Sep 01; 54(9):1467-72. View in: PubMed

A de novo missense mutation in ZMYND11 is associated with global developmental delay, seizures, and hypotonia. Cold Spring Harb Mol Case Stud. 2016 Sep; 2(5):a000851. View in: PubMed

Molecular Genetic Profiling of Adolescent Glassy Cell Carcinoma of the Cervix Reveals Targetable EGFR Amplification with Potential Therapeutic Implications. J Adolesc Young Adult Oncol. 2016 Sep; 5(3):297-302. View in: PubMed

Clinical Implementation of Integrated Genomic Profiling in Patients with Advanced Cancers. Sci Rep. 2016 Dec 23; 6(1):25. View in: PubMed

Age-Related Macular Degeneration-Associated Genes in Alzheimer Disease. Am J Geriatr Psychiatry. 2015 Dec; 23(12):1290-1296. View in: PubMed

Plasma Complement factor H in Alzheimer's Disease. J Alzheimers Dis. 2015; 45(2):369-72. View in: PubMed

Amyloid pathway-based candidate gene analysis of [(11)C]PiB-PET in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Brain Imaging Behav. 2012 Mar; 6(1):1-15. View in: PubMed

Germline mutations in HOXB13 and prostate-cancer risk. N Engl J Med. 2012 Jan 12; 366(2):141-9. View in: PubMed

Deep clonal profiling of formalin fixed paraffin embedded clinical samples. PLoS One. 2012; 7(11):e50586. View in: PubMed

Cancer of the ampulla of Vater: analysis of the whole genome sequence exposes a potential therapeutic vulnerability. Genome Med. 2012; 4(7):56. View in: PubMed

Genome-wide characterization of pancreatic adenocarcinoma patients using next generation sequencing. PLoS One. 2012; 7(10):e43192. View in: PubMed

Paired tumor and normal whole genome sequencing of metastatic olfactory neuroblastoma. PLoS One. 2012; 7(5):e37029. View in: PubMed

Induction of pluripotent stem cells from autopsy donor-derived somatic cells. Neurosci Lett. 2011 Sep 20; 502(3):219-24. View in: PubMed

Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nat Genet. 2011 Sep 18; 43(10):977-83. View in: PubMed

The functional spectrum of low-frequency coding variation. Genome Biol. 2011 Sep 14; 12(9):R84. View in: PubMed

Demographic history and rare allele sharing among human populations. Proc Natl Acad Sci U S A. 2011 Jul 19; 108(29):11983-8. View in: PubMed

Variation in genome-wide mutation rates within and between human families. Nat Genet. 2011 Jun 12; 43(7):712-4. View in: PubMed

Genome-wide association of bipolar disorder suggests an enrichment of replicable associations in regions near genes. PLoS Genet. 2011 Jun; 7(6):e1002134. View in: PubMed

Autism and increased paternal age related changes in global levels of gene expression regulation. PLoS One. 2011 Feb 17; 6(2):e16715. View in: PubMed

Mapping copy number variation by population-scale genome sequencing. Nature. 2011 Feb 03; 470(7332):59-65. View in: PubMed

Introduction to genetic epidemiology. Optometry. 2011 Feb; 82(2):83-91. View in: PubMed

Accuracy of CNV Detection from GWAS Data. PLoS One. 2011 Jan 13; 6(1):e14511. View in: PubMed

Genome-wide association study of CSF biomarkers Abeta1-42, t-tau, and p-tau181p in the ADNI cohort. Neurology. 2011 Jan 04; 76(1):69-79. View in: PubMed

Assessing and managing risk when sharing aggregate genetic variant data. Nat Rev Genet. 2011 Sep 16; 12(10):730-6. View in: PubMed

Genomic Copy Number Analysis in Alzheimer's Disease and Mild Cognitive Impairment: An ADNI Study. Int J Alzheimers Dis. 2011; 2011:729478. View in: PubMed

Exonic DNA sequencing of ERBB4 in bipolar disorder. PLoS One. 2011; 6(5):e20242. View in: PubMed

Microarray-based genome-wide association studies using pooled DNA. Methods Mol Biol. 2011; 700:49-60. View in: PubMed

Bar-coded, multiplexed sequencing of targeted DNA regions using the Illumina Genome Analyzer. Methods Mol Biol. 2011; 700:89-104. View in: PubMed

Copy number and targeted mutational analysis reveals novel somatic events in metastatic prostate tumors. Genome Res. 2011 Jan; 21(1):47-55. View in: PubMed

Decreased serum arylesterase activity in autism spectrum disorders. Psychiatry Res. 2010 Dec 30; 180(2-3):105-13. View in: PubMed

Misperceptions of peer norms as a risk factor for sugar-sweetened beverage consumption among secondary school students. J Am Diet Assoc. 2010 Dec; 110(12):1916-21. View in: PubMed

Voxelwise genome-wide association study (vGWAS). Neuroimage. 2010 Nov 15; 53(3):1160-74. View in: PubMed

Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: A study of the ADNI cohort. Neuroimage. 2010 Nov 15; 53(3):1051-63. View in: PubMed

Diversity of human copy number variation and multicopy genes. Science. 2010 Oct 29; 330(6004):641-6. View in: PubMed

A map of human genome variation from population-scale sequencing. Nature. 2010 Oct 28; 467(7319):1061-73. View in: PubMed

Association of CR1, CLU and PICALM with Alzheimer's disease in a cohort of clinically characterized and neuropathologically verified individuals. Hum Mol Genet. 2010 Aug 15; 19(16):3295-301. View in: PubMed

Whole-genome association mapping of gene expression in the human prefrontal cortex. Mol Psychiatry. 2010 Aug; 15(8):779-84. View in: PubMed

Evidence for an association between KIBRA and late-onset Alzheimer's disease. Neurobiol Aging. 2010 Jun; 31(6):901-9. View in: PubMed

Genome-wide analysis reveals novel genes influencing temporal lobe structure with relevance to neurodegeneration in Alzheimer's disease. Neuroimage. 2010 Jun; 51(2):542-54. View in: PubMed

A commonly carried allele of the obesity-related FTO gene is associated with reduced brain volume in the healthy elderly. Proc Natl Acad Sci U S A. 2010 May 04; 107(18):8404-9. View in: PubMed

Alzheimer's Disease Neuroimaging Initiative biomarkers as quantitative phenotypes: Genetics core aims, progress, and plans. Alzheimers Dement. 2010 May; 6(3):265-73. View in: PubMed

Genetic control of individual differences in gene-specific methylation in human brain. Am J Hum Genet. 2010 Mar 12; 86(3):411-9. View in: PubMed

Cerebellar telomere length and psychiatric disorders. Behav Genet. 2010 Mar; 40(2):250-4. View in: PubMed

Whole genome association analysis shows that ACE is a risk factor for Alzheimer's disease and fails to replicate most candidates from Meta-analysis. Int J Mol Epidemiol Genet. 2010; 1(1):19-30. View in: PubMed

Genome-wide SNP genotyping study using pooled DNA to identify candidate markers mediating susceptibility to end-stage renal disease attributed to Type 1 diabetes. Diabet Med. 2009 Nov; 26(11):1090-8. View in: PubMed

Genome-wide scan of 500,000 single-nucleotide polymorphisms among responders and nonresponders to interferon beta therapy in multiple sclerosis. Arch Neurol. 2009 Aug; 66(8):972-8. View in: PubMed

Genetic variants at 6p21.33 are associated with susceptibility to follicular lymphoma. Nat Genet. 2009 Aug; 41(8):873-5. View in: PubMed

Singleton deletions throughout the genome increase risk of bipolar disorder. Mol Psychiatry. 2009 Apr; 14(4):376-80. View in: PubMed

Genetic control of human brain transcript expression in Alzheimer disease. Am J Hum Genet. 2009 Apr; 84(4):445-58. View in: PubMed

Statistical comparison framework and visualization scheme for ranking-based algorithms in high-throughput genome-wide studies. J Comput Biol. 2009 Apr; 16(4):565-77. View in: PubMed

A genome-wide analysis identifies genetic variants in the RELN gene associated with otosclerosis. Am J Hum Genet. 2009 Mar; 84(3):328-38. View in: PubMed

GRM7 variants confer susceptibility to age-related hearing impairment. Hum Mol Genet. 2009 Feb 15; 18(4):785-96. View in: PubMed

Molecular genetics of adult ADHD: converging evidence from genome-wide association and extended pedigree linkage studies. J Neural Transm (Vienna). 2008 Nov; 115(11):1573-85. View in: PubMed

Identification of genetic variants using bar-coded multiplexed sequencing. Nat Methods. 2008 Oct; 5(10):887-93. View in: PubMed

Multimarker analysis and imputation of multiple platform pooling-based genome-wide association studies. Bioinformatics. 2008 Sep 01; 24(17):1896-902. View in: PubMed

Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genet. 2008 Aug 29; 4(8):e1000167. View in: PubMed

Common sequence variants on 20q11.22 confer melanoma susceptibility. Nat Genet. 2008 Jul; 40(7):838-40. View in: PubMed

Genome-wide linkage analysis of ADHD using high-density SNP arrays: novel loci at 5q13.1 and 14q12. Mol Psychiatry. 2008 May; 13(5):522-30. View in: PubMed

Identification of somatic chromosomal abnormalities in hypothalamic hamartoma tissue at the GLI3 locus. Am J Hum Genet. 2008 Feb; 82(2):366-74. View in: PubMed

Sorl1 as an Alzheimer's disease predisposition gene? Neurodegener Dis. 2008; 5(2):60-4. View in: PubMed

Identification of a novel risk locus for multiple sclerosis at 13q31.3 by a pooled genome-wide scan of 500,000 single nucleotide polymorphisms. PLoS One. 2008; 3(10):e3490. View in: PubMed

A survey of genetic human cortical gene expression. Nat Genet. 2007 Dec; 39(12):1494-9. View in: PubMed

The nuts and bolts of gene array technology and its application to drug abuse research. Drug Alcohol Depend. 2007 Nov 02; 91(1):102-6. View in: PubMed

Whole-genome analysis of sporadic amyotrophic lateral sclerosis. N Engl J Med. 2007 Aug 23; 357(8):775-88. View in: PubMed

Polyhydramnios, megalencephaly and symptomatic epilepsy caused by a homozygous 7-kilobase deletion in LYK5. Brain. 2007 Jul; 130(Pt 7):1929-41. View in: PubMed

Calmodulin-binding transcription activator 1 (CAMTA1) alleles predispose human episodic memory performance. Hum Mol Genet. 2007 Jun 15; 16(12):1469-77. View in: PubMed

GAB2 alleles modify Alzheimer's risk in APOE epsilon4 carriers. Neuron. 2007 Jun 07; 54(5):713-20. View in: PubMed

Chromosomal abnormality at 6p25.1-25.3 identifies a susceptibility locus for hypothalamic hamartoma associated with epilepsy. Epilepsy Res. 2007 Jun; 75(1):70-3. View in: PubMed

A high-density whole-genome association study reveals that APOE is the major susceptibility gene for sporadic late-onset Alzheimer's disease. J Clin Psychiatry. 2007 Apr; 68(4):613-8. View in: PubMed

Identification of PVT1 as a candidate gene for end-stage renal disease in type 2 diabetes using a pooling-based genome-wide single nucleotide polymorphism association study. Diabetes. 2007 Apr; 56(4):975-83. View in: PubMed

Identification of a novel risk locus for progressive supranuclear palsy by a pooled genomewide scan of 500,288 single-nucleotide polymorphisms. Am J Hum Genet. 2007 Apr; 80(4):769-78. View in: PubMed

SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays. Bioinformatics. 2007 Jan 01; 23(1):57-63. View in: PubMed

Identification of the genetic basis for complex disorders by use of pooling-based genomewide single-nucleotide-polymorphism association studies. Am J Hum Genet. 2007 Jan; 80(1):126-39. View in: PubMed

SNP-based chromosomal copy number ascertainment following multiple displacement whole-genome amplification. Biotechniques. 2007 Jan; 42(1):77-83. View in: PubMed

Common Kibra alleles are associated with human memory performance. Science. 2006 Oct 20; 314(5798):475-8. View in: PubMed

High-density single nucleotide polymorphism screen in a large multiplex neural tube defect family refines linkage to loci at 7p21.1-pter and 2q33.1-q35. Birth Defects Res A Clin Mol Teratol. 2006 Jun; 76(6):499-505. View in: PubMed

SNiPer: improved SNP genotype calling for Affymetrix 10K GeneChip microarray data. BMC Genomics. 2005 Oct 31; 6:149. View in: PubMed

Genome-wide SNP arrays as a diagnostic tool: clinical description, genetic mapping, and molecular characterization of Salla disease in an Old Order Mennonite population. Am J Med Genet A. 2005 Oct 15; 138A(3):262-7. View in: PubMed

Identification of disease causing loci using an array-based genotyping approach on pooled DNA. BMC Genomics. 2005 Sep 30; 6:138. View in: PubMed

Applications of whole-genome high-density SNP genotyping. Expert Rev Mol Diagn. 2005 Mar; 5(2):159-70. View in: PubMed

The genetics of tethered cord syndrome. Am J Med Genet A. 2005 Feb 01; 132A(4):450-3. View in: PubMed

The Autism Genome Project: goals and strategies. Am J Pharmacogenomics. 2005; 5(4):233-46. View in: PubMed

Structural insights into how the MIDAS ion stabilizes integrin binding to an RGD peptide under force. Structure. 2004 Nov; 12(11):2049-58. View in: PubMed

Mapping of sudden infant death with dysgenesis of the testes syndrome (SIDDT) by a SNP genome scan and identification of TSPYL loss of function. Proc Natl Acad Sci U S A. 2004 Aug 10; 101(32):11689-94. View in: PubMed

Tuning the mechanical stability of fibronectin type III modules through sequence variations. Structure. 2004 Jan; 12(1):21-30. View in: PubMed

Structure and functional significance of mechanically unfolded fibronectin type III1 intermediates. Proc Natl Acad Sci U S A. 2003 Dec 09; 100(25):14784-9. View in: PubMed

Identifying unfolding intermediates of FN-III(10) by steered molecular dynamics. J Mol Biol. 2002 Nov 08; 323(5):939-50. View in: PubMed

A structural model for force regulated integrin binding to fibronectin's RGD-synergy site. Matrix Biol. 2002 Mar; 21(2):139-47. View in: PubMed

Structural insights into the mechanical regulation of molecular recognition sites. Trends Biotechnol. 2001 Oct; 19(10):416-23. View in: PubMed

Comparison of the early stages of forced unfolding for fibronectin type III modules. Proc Natl Acad Sci U S A. 2001 May 08; 98(10):5590-5. View in: PubMed