Arnold Library

Gene Set Enrichment Analysis using Linear Models and Diagnostics

Oron, Assaf P., Jiang, Zhen and Gentleman, Robert (2008) Gene Set Enrichment Analysis using Linear Models and Diagnostics. Bioinformatics, 24 (22). pp. 2586-2591. ISSN 1460-2059

Text (Complete manuscript)
OronManuscript091008.pdf (428kB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Text (Supplement A)
OronSuppA091008.pdf (129kB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Text (Supplement B)
OronSuppB091008.pdf (98kB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Text (Supplement C)
OronSuppC091008.pdf (48kB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Text (Supplement D)
OronSuppD091008.pdf (139kB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Article URL: http://bioinformatics.oxfordjournals.org/cgi/repri...

Abstract

Motivation: Gene-set enrichment analysis (GSEA) can be greatly enhanced by linear model (regression) diagnostic techniques. Diagnostics can be used to identify outlying or influential samples, and also to evaluate model fit and explore model expansion.

Results: We demonstrate this methodology on an adult acute lymphoblastic leukemia (ALL) dataset, using GSEA based on chromosome-band mapping of genes. Individual residuals, grouped or aggregated by chromosomal loci, indicate problematic samples and potential data-entry errors, and help identify hyperdiploidy as a factor playing a key role in expression for this dataset. Subsequent analysis pinpoints suspected DNA copy number abnormalities of specific samples and chromosomes (most prevalent are chromosomes X, 21 and 14), and also reveals significant expression differences between the hyperdiploid and diploid groups on other chromosomes (most prominently 19, 22, 3 and 13) - differences which are apparently not associated with copy number.
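The idea of aggregating per-gene residuals by chromosomal locus can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`fit_residuals`, `aggregate_by_set`), the toy expression values, the band labels and the scaling by the square root of the set size are all assumptions made for illustration. Each gene is regressed on a single sample-level covariate, the residuals are standardized, and the standardized residuals are summed within each gene set for every sample, so that a sample with consistently high residuals across one locus stands out.

```python
import math

def fit_residuals(y, x):
    """Fit y = a + b*x by least squares and return standardized residuals."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = math.sqrt(sum(r * r for r in resid) / (n - 2))
    return [r / sigma for r in resid]

def aggregate_by_set(expr, covariate, gene_sets):
    """Sum standardized residuals within each gene set, per sample.

    The sum is divided by sqrt(set size), an assumed scaling that keeps
    sets of different sizes roughly comparable.
    """
    resid = {gene: fit_residuals(values, covariate) for gene, values in expr.items()}
    n_samples = len(covariate)
    scores = {}
    for name, genes in gene_sets.items():
        members = [g for g in genes if g in resid]
        if not members:
            continue  # skip sets with no measured genes
        scale = math.sqrt(len(members))
        scores[name] = [
            sum(resid[g][j] for g in members) / scale for j in range(n_samples)
        ]
    return scores

# Hypothetical toy data: 6 samples, a binary phenotype covariate, 4 genes.
# Sample index 2 is elevated for both genes of the hypothetical band "bandA".
covariate = [0, 0, 0, 1, 1, 1]
expr = {
    "g1": [5.0, 5.2, 7.0, 6.0, 6.1, 5.9],
    "g2": [3.0, 3.1, 4.9, 3.5, 3.4, 3.6],
    "g3": [4.0, 4.1, 3.9, 4.5, 4.4, 4.6],
    "g4": [2.0, 1.9, 2.1, 2.5, 2.6, 2.4],
}
gene_sets = {"bandA": ["g1", "g2"], "bandB": ["g3", "g4"]}

scores = aggregate_by_set(expr, covariate, gene_sets)
for band, per_sample in scores.items():
    print(band, [round(s, 2) for s in per_sample])
```

In this toy setup the elevated sample dominates the aggregated score for its band, which is the kind of pattern the abstract describes as pointing to problematic samples or suspected copy number abnormalities. A real analysis would use a proper linear-model library and more than one covariate.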

Item Type: Article or Abstract
Additional Information: This is a pre-copy-editing, author-produced PDF of an article accepted for publication in "Bioinformatics" following peer review. The definitive publisher-authenticated version, Bioinformatics. 2008 Nov 15;24(22):2586-91 is available online at: http://bioinformatics.oxfordjournals.org/cgi/reprint/24/22/2586
DOI: 10.1093/bioinformatics/btn465
PubMed ID: 18790795
PMCID: PMC2579710
Grant Numbers: P41 HG004059-03, P50 CA083636-10
Keywords or MeSH Headings: Chromosomes/genetics; Gene Expression Profiling; Humans; Leukemia, Lymphoid/diagnosis/genetics; Linear Models; *Models, Genetic; Phenotype
Subjects: Molecules > Chromosomes
Research Methodologies > Computational Biology
Cellular and Organismal Processes > Genetic processes > Transcription
Depositing User: Library Staff
Date Deposited: 23 Sep 2008 16:49
Last Modified: 14 Feb 2012 14:42
URI: http://authors.fhcrc.org/id/eprint/33
