Postdoctoral Scientist in the Smith Lab
p: 508 289 7750
f: 508 457 4727
New techniques allow rapid investigation of gene networks in emerging model systems.
Over the last 100 years, a few animals have emerged as the favorite species to use for animal biology research (mouse, fly, worm) and the resources for the study of these animals have expanded to include sequenced genomes, transgenic animals, searchable online databases, and thousands of specific antibodies and reagents targeting expressed proteins. However, there is much to be learned from the many other species not represented in this group, and this is especially true for understanding the mechanisms of evolution and changes to the animal body plan. Today, new genomics technologies are helping make research in these “emerging” model systems more tractable, and my work over the past year has helped to define a workflow to generate databases for studying these animals. These tools will provide a valuable resource for their respective research communities, and thus stimulate innovative studies in a number of fields.
As an example of an emerging model system, we use the starlet sea anemone, Nematostella vectensis, a small, burrowing estuarine anemone. We chose this species for its privileged position in the tree of life and its amenability to gene manipulation in the lab. We used the Illumina HiSeq sequencer in the Bay Paul Center to sequence every gene expressed in the first 24 hours of development. This technique generates millions of short reads, and computer algorithms are required to reassemble the reads into full-length transcripts representing the genes that are used during early development.
Researchers working with human, mouse and fly gene sets can map these reads back to well-assembled genomes to get accurate results. When working with a non-model system that has a poorly assembled genome, or no sequenced genome at all, it is necessary to use the reads themselves to build the reference that the reads are then mapped back to. This process is called de novo assembly, and I collaborated with a computer science lab at Brown University to complete this part of the workflow. We used the Trinity assembler software, which took 50 hours to assemble our 230 million reads.
We now have a complete database (transcriptome) of every expressed gene (transcript) during the first 24 hours of Nematostella development. We are currently publishing this method so that other researchers can use the same strategy to catalog the expressed genes in their own animal systems, and so that other researchers working with Nematostella can make use of our database. Next, we will use this database to define the set of transcripts involved in regulating genes during development, and work on elucidating the connections (the network) between those genes. The figure below shows the number of transcripts that are predicted to be in this set of regulatory genes.
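To make the idea of de novo assembly described above more concrete, here is a minimal toy sketch in Python. It is not the Trinity algorithm (Trinity builds de Bruijn graphs and handles sequencing errors, both strands, and alternative isoforms), and it is not the workflow we ran. It only illustrates the principle under simplifying assumptions: error-free reads from a single strand of a single transcript. The function names (`overlap`, `greedy_assemble`, `map_reads`) and the example sequence are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def overlap(a, b, min_len):
    """Return the length of the longest suffix of `a` that matches a
    prefix of `b`, or 0 if that length is below `min_len`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Toy greedy assembler: repeatedly merge the pair of sequences
    with the longest suffix/prefix overlap until no pair overlaps."""
    # Drop exact duplicates (keeping order) and reads contained in others.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != other and r in other for other in contigs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, j in permutations(range(len(contigs)), 2):
            k = overlap(contigs[i], contigs[j], min_overlap)
            if k > best_len:
                best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


def map_reads(reads, contigs):
    """Map each read back to the assembled reference by exact match.
    Returns {read: (contig_index, position)} or None if unmapped."""
    mapping = {}
    for read in reads:
        mapping[read] = next(
            ((ci, c.find(read)) for ci, c in enumerate(contigs) if read in c),
            None,
        )
    return mapping


# Hypothetical short reads sampled from one made-up transcript.
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAG", "CGTTAGC"]
contigs = greedy_assemble(reads)
mapping = map_reads(reads, contigs)
print(contigs)
print(mapping)
```

In this toy case the four reads collapse into a single contig, and every read then maps back to that contig. This is the same two-step logic as the real workflow: build the reference from the reads, then map the reads to it. A greedy overlap approach like this would not scale to hundreds of millions of reads, which is one reason dedicated assemblers are used in practice.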