PASADENA, Calif.— The Caltech-led WormBase project, an ongoing multi-institutional effort to make genetic information on the experimental animal known as C. elegans freely available to the world, has been augmented with a new $12 million grant from the National Human Genome Research Institute. The money will be distributed over five years for ongoing work on the genome database, which since its inception in 2000 has become a major resource for biomedical researchers as well as biologists attempting to better understand individual genes and how they interrelate. According to Caltech biology professor Paul Sternberg, leader of the project, WormBase has already succeeded in making available on-line the complete genome sequence (100.2 million base pairs) of the nematode, plus an almost complete sequence for the closely related organism C. briggsae, as well as genes for some 20 parasitic nematode species. In addition, the project makes available a huge amount of experimental data pertaining to the nematode.
The completed sequences will be vital for an emerging research effort that includes the new double-stranded RNA interference technique for understanding a gene's function, and the fruits of the sequencing effort are already apparent. WormBase now holds 23,000 RNA interference experiments, along with 280,000 DNA expression ("chip") microarray observations and detailed information on the expression of more than 1,600 of the worm's 20,000 genes.
"For the future, researchers will look at interactions between genes, which means that there are 20,000-squared possibilities for the interactions of two genes alone," says Sternberg. "Also, our future effort will include working with similar databases of the genomes of other organisms, such as the mouse, fruit fly, and yeast, for shared software and shared conceptual vocabularies.
"The ultimate purpose is to allow medical researchers to get the information more easily," he adds.
The human-worm connection may seem tenuous to people outside biology, but the two organisms are known to share similarity in about 40 percent of their genes. A practical motivation for funding the genome sequencing of other organisms has been to provide data for comparing genes that are of interest in the quest to better understand human disease. Thus, a cancer researcher who discovers that a certain gene is expressed in cancer cells can use WormBase to see whether the gene exists in nematodes and, if so, what is known about its function.
Exploring the fundamental relationships between genes from species separated by hundreds of millions of years of evolution is expected to be a cornerstone of 21st-century biological innovation. Improved knowledge of how a gene is expressed in one species (and, as time goes on, of how two or more genes interact) will provide new approaches for dealing with human disease and will almost certainly be the foundation for some important medical advances.
The role of WormBase in 21st-century medicine will continue to be that of a resource for knowledge. Already the wormbase.org site is fully searchable in a number of ways, including by genes, cells (the nematode has only 959, all of which are well understood and clearly visible under a microscope), and biological processes, as well as by names of researchers.
Information in WormBase comes from teams at the two centers that sequence the C. elegans and C. briggsae genomes: a team at the Sanger Institute, in England, led by Richard Durbin, and one at Washington University, led by John Spieth. The innovative software used to display the information in WormBase was developed by Lincoln Stein of the Cold Spring Harbor Laboratory, where the WormBase Web server is located.
Fourteen individuals at Caltech are currently involved in the WormBase project, including nine biologists and three computer experts.