Combining transcriptional datasets using the generalized singular value decomposition

Schreiber, A.; Shirley, N.; Burton, R.; Fincher, G.

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/51654

Type:	Journal article
Title:	Combining transcriptional datasets using the generalized singular value decomposition
Author:	Schreiber, A.; Shirley, N.; Burton, R.; Fincher, G.
Citation:	BMC Bioinformatics, 2008; 2008(1):1-15
Publisher:	BioMed Central Ltd.
Issue Date:	2008
ISSN:	1471-2105
Statement of Responsibility:	Andreas W Schreiber, Neil J Shirley, Rachel A Burton and Geoffrey B Fincher
Abstract:	Background: Both microarrays and quantitative real-time PCR are convenient tools for studying the transcriptional levels of genes. The former is preferable for large scale studies while the latter is a more targeted technique. Because of platform-dependent systematic effects, simple comparisons or merging of datasets obtained by these technologies are difficult, even though they may often be desirable. These difficulties are exacerbated if there is only partial overlap between the experimental conditions and genes probed in the two datasets.
Results: We show here that the generalized singular value decomposition provides a practical tool for merging a small, targeted dataset obtained by quantitative real-time PCR of specific genes with a much larger microarray dataset. The technique permits, for the first time, the identification of genes present in only one dataset co-expressed with a target gene present exclusively in the other dataset, even when experimental conditions for the two datasets are not identical. With the rapidly increasing number of publicly available large scale microarray datasets the latter is frequently the case. The method enables us to discover putative candidate genes involved in the biosynthesis of the (1,3;1,4)-β-D-glucan polysaccharide found in plant cell walls.
Conclusion: We show that the generalized singular value decomposition provides a viable tool for a combined analysis of two gene expression datasets with only partial overlap of both gene sets and experimental conditions. We illustrate how the decomposition can be optimized self-consistently by using a judicious choice of genes to define it. The ability of the technique to seamlessly define a concept of "co-expression" across both datasets provides an avenue for meaningful data integration. We believe that it will prove to be particularly useful for exploiting large, publicly available, microarray datasets for species with unsequenced genomes by complementing them with more limited in-house expression measurements.
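For readers unfamiliar with the decomposition named in the abstract, the following is a minimal NumPy sketch of a generalized singular value decomposition of two matrices that share their column dimension. It is not the authors' implementation, and the abstract does not say which dimension the two expression datasets share, so treating the columns as the shared one is an assumption. The function name `gsvd`, the QR-plus-SVD construction, and the requirement that each matrix has at least as many rows as columns (with the stacked matrix of full column rank) are likewise assumptions made for illustration.

```python
import numpy as np

def gsvd(A, B):
    """Illustrative GSVD of A (m1 x n) and B (m2 x n), assuming m1 >= n, m2 >= n.

    Returns U, V, c, s, X such that
        A = U @ diag(c) @ X,   B = V @ diag(s) @ X,   c**2 + s**2 = 1,
    where U and V have orthonormal columns and X is shared by both matrices.
    """
    m1 = A.shape[0]
    # Reduced QR of the stacked matrix: Q has orthonormal columns, R is n x n.
    Q, R = np.linalg.qr(np.vstack([A, B]))
    Q1, Q2 = Q[:m1], Q[m1:]
    # SVD of the top block gives the "cosine" values c.
    U, c, Zt = np.linalg.svd(Q1, full_matrices=False)
    # The bottom block in the same basis has orthogonal columns of norm s.
    W = Q2 @ Zt.T
    s = np.linalg.norm(W, axis=0)
    V = W / s  # assumes every s is nonzero
    X = Zt @ R  # shared right-hand factor
    return U, V, c, s, X

# Hypothetical data: two random matrices with four shared columns.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))
B = rng.normal(size=(5, 4))
U, V, c, s, X = gsvd(A, B)
ratios = c / s  # generalized singular values: weight of each component in A relative to B
```

Each row of `X` is a component common to both matrices. A large `c / s` ratio marks a component that is prominent mainly in `A`, a small ratio marks one prominent mainly in `B`, and a ratio near one marks a component expressed comparably in both.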
Keywords:	Proteome; Transcription Factors; Oligonucleotide Array Sequence Analysis; Gene Expression Profiling; Reverse Transcriptase Polymerase Chain Reaction; Algorithms; Database Management Systems; Information Storage and Retrieval; Databases, Protein
Rights:	© 2008 Schreiber et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
DOI:	10.1186/1471-2105-9-335
Published version:	http://dx.doi.org/10.1186/1471-2105-9-335
Appears in Collections:	Agriculture, Food and Wine publications; Aurora harvest

Files in This Item:

File	Description	Size	Format
hdl_51654.pdf	Published version	431.79 kB	Adobe PDF

Adelaide Research & Scholarship