ACADEMIA
DOE JGI launches IMG public online microbial genome data clearinghouse
As the microbial world comes to light through DNA sequencing, the new Integrated Microbial Genomes (IMG) data management system of the U.S. Department of Energy (DOE) Joint Genome Institute (JGI) will deliver valuable information for the benefit of the global research community.
"The IMG system is an essential enhancement to the computational toolkit supported by DOE," said Dr. Aristides A. Patrinos, Associate Director for Environmental and Biological Research of the DOE Office of Science. "IMG responds to the urgent need of handling the vast and growing spectrum of datasets emerging from genome projects taken on by the DOE JGI and other public DNA sequencing centers. It is our hope that the IMG system will enable our scientists to tap the rich diversity of microbial environments and harness the possibilities that they hold for addressing challenges in environmental cleanup, medicine, agriculture, industrial processes, and alternative energy production."
The DOE JGI currently accounts for nearly one-quarter of the microbial genome projects worldwide, more than any other single institution. The IMG system currently features over 200 organisms, with an additional 200 already in the queue for 2005. The release of IMG, accessible to the public at http://img.jgi.doe.gov/, is the result of a collaboration between the DOE JGI and Lawrence Berkeley National Laboratory's Biological Data Management and Technology Center (BDMTC).
"As the number of microbial genomes sequenced continues to rise, the genome analysis process becomes the rate-limiting step," said DOE JGI Director Eddy Rubin. "By integrating publicly available microbial genome sequence with DOE JGI sequences, the IMG system offers a powerful data management platform that supports timely analysis of genomes from a comparative functional and evolutionary perspective."
"IMG's primary goal is to provide high-quality data in a comprehensible system that is diverse in terms of the number of genomes it covers," said Victor Markowitz, head of BDMTC, who led the IMG development effort. "This goal follows the fundamental principle that the value of genome analysis depends on the quality of the data and increases with the number of genomes available for comparative analysis."
Nikos Kyrpides, of DOE JGI's Microbial Genome Analysis Program (MGAP), provided scientific leadership and overall coordination for the IMG project. MGAP manages IMG's data content and curation and helped develop the system, with additional support provided by DOE JGI's Microbial Ecology and Genome Data System groups. "The IMG system champions the principle of integration in an evolutionary context," said Kyrpides. "This is critical for enabling the generation of high-quality annotations and comprehensive metabolic reconstructions.
"The first release of IMG offers a comprehensive genome data exploration system of the DOE JGI-sequenced genomes to our collaborators and the scientific community at large." According to Kyrpides, future releases of IMG will provide enhanced data analysis capabilities and mechanisms that will allow the scientific community to participate in the annotation effort.
"There are hundreds of bacterial genome sequences in multiple databases with hundreds of new genomes expected this year," said Gary Andersen, Molecular Microbial Ecology Group Leader of the Lawrence Berkeley National Laboratory Earth Sciences Division. "It has become an increasingly difficult task to track down all relevant sequences to compare with your favorite gene."
Andersen uses IMG for a project sponsored by the DOE Genomics: GTL program, which targets the use of DNA sequences as starting points for systematically tackling questions about the essential processes of living systems. Andersen is exploring the potential of a particular microbe, Caulobacter crescentus, for heavy-metal remediation in wastewater.
"I have found the IMG system very useful in identifying potential functions for hypothetical genes that we find upregulated in Caulobacter crescentus strains exposed to heavy metals. Examining the neighborhood around a gene of unknown function in multiple species selected from the organism browser may yield clues of what its role might be in your particular species," said Andersen. "The interface is quite intuitive, which is a benefit for someone like me that does not like to read manuals."
John Taylor, professor of plant and microbial biology at UC Berkeley, tried out the IMG system at a community workshop in February. "In evolutionary biology, comparative genomics has become the most powerful tool for understanding everything from the patterns of mutation to adaptation," said Taylor. "Computational biologists have led the way, but IMG makes it possible for evolutionary biologists without first-rate computer skills to compare fungal and bacterial genomes and scrutinize fundamental processes like speciation and adaptation."
In addition to curating the IMG system, DOE JGI will continue to deposit genome sequence information it generates into GenBank, the repository maintained by the National Center for Biotechnology Information.