Life Science's Inventor of the Month
Joseph L. Slagel
Geospiza, Inc.
To appreciate the influence of Geospiza’s cofounder and chief software architect, Joe Slagel, on the field of bioinformatics, it is helpful to remember the mid-1990s. At that time, efforts to sequence the human genome were ramping up toward the pace that, in 2000, allowed completion of a rough draft of the genome to be announced. Every aspect of DNA sequencing technology was being pushed to its limit to accomplish that massive task. Unfortunately, efficient analysis of the enormous quantities of DNA sequence data being generated was hampered by limitations of the computer software then available.
In response to the challenge, newly formed bioinformatics departments in universities and not-for-profit research institutes marshaled their funds and forces to develop new software. To address the shortage of software engineers required for the work, many bioinformatics groups created training programs to teach the essentials of molecular biology to engineers.
In 1992, Craig Venter founded The Institute for Genomic Research (TIGR) to apply his DNA sequencing strategies to whole genomes—an effort that contributed to the Human Genome Project. Traditionally, biologists and engineers worked in relative isolation from each other and, because of their profoundly different backgrounds, often found it difficult to communicate. TIGR became one of the few places where molecular biologists and software engineers could work closely together, cross-pollinating their respective efforts to understand the structure and function of the entire genomes of organisms.
Having earned a bachelor’s degree in computer engineering at the Rochester Institute of Technology, Joe Slagel had been working for five years as a process control software developer at a leading firm when he learned about the critical need for software engineers at TIGR. He had been intrigued by the concept of the DNA sequence as a sort of software code driving the molecular activities of biological cells, and by the fantastic ambition and promise represented in the Human Genome Project. Being a part of it appealed to him.
Joe joined TIGR and began writing software that assisted the effort to prepare the first-ever complete genome sequence of a free-living organism, a clinically important bacterium called Haemophilus influenzae. It wasn’t long before he began hearing about additional exciting opportunities in a new laboratory at the University of Washington (UW) in Seattle.
A few years earlier, Leroy Hood, M.D., Ph.D., had moved his molecular biology/immunology research laboratory from the California Institute of Technology to the UW and established the Department of Molecular Biotechnology, a cross-disciplinary molecular biological research effort financed initially by Bill Gates. Here, it seemed to Joe, there was an even greater opportunity for close collaboration between biologists and engineers.
In 1996, Joe joined the Hood Lab at UW to work on database software designed to address the management of the enormously complex set of gene expression data harvested by DNA microarray technology. At the time, scientists in the Hood Lab, in collaboration with TIGR, were developing a large-scale, high-throughput DNA sequencing method called the BAC end sequencing strategy. Joe and his Hood Lab colleagues developed software for the analysis and management of the huge body of BAC end sequencing data being generated.
It was in the Hood Lab that Joe met his future business partners: Todd Smith, Ph.D., a biologist who had joined the group to tackle similar problems being faced by scientists using DNA sequencing technology, and Chris Abajian, another software engineer. Joe, Todd and Chris shared their frustration over the lack of efficient, comprehensive software systems to advance genome science and over the scarcity of adequate funding to address the issue in academia. They discovered that they had in mind similar practical solutions to the problem. They also agreed that the software system they envisioned could fill a gap in the commercial DNA sequencing marketplace.
Most large-scale DNA sequencing laboratories, whether academic, government or private-sector, were using software cobbled together from applications developed in-house or obtained from the public domain, or commercial products not designed for high-volume, high-throughput operations. Joe, Todd and Chris envisioned developing a suite of applications capable of analyzing sequencing data on a very large scale and, in addition, facilitating data tracking and management. This software system would have to be compatible with any widely used and proven algorithms, with other software that customers might wish to continue using, and with the instrumentation in use in their labs. In addition, a key requirement would be for its framework to be extensible and scalable, accommodating growth and other changes in a customer’s operation.
Joe, Todd and Chris founded Geospiza in 1997, funding the enterprise themselves. Initially working out of their homes, they developed their first product, the Finch Chromatogram Manager, which would become a cornerstone application of what was to evolve into Geospiza Finch ® Suite software, the industry’s top scientific information management system (SIMS) for working with sequencing data and information. The architecture of the Finch ® Suite SIMS was designed from the ground up for easy installation, maintenance, upgrades and scalability to meet the anticipated growth and data volume demands for digitally based biological information.
The first installation of Finch ® Suite was at a large pharmaceutical company in Seattle. By its second year, Geospiza was operating at a profit. Now, with more than 50 completed installations of Finch ® Suite, more than 10,000 customers and 26 employees (nearly one-half of whom are software engineers whose training continues to include working with molecular biologists in their laboratories), Geospiza is making a significant impact on the efficiency and productivity of life science research activities around the world.
An exciting recent development at Geospiza is a collaborative agreement with Applied Biosystems, a leading developer and manufacturer of life science instrumentation and reagents, including the world’s first automated DNA sequencer. With this agreement, Applied Biosystems will become a worldwide reseller of Geospiza Finch ® Suite software. In addition, the companies are already working together to develop collaborative products to increase the productivity of life science researchers, core labs and enterprises in the fields of clinical research, fundamental research, agri-business, bio-defense and forensics.
Davis Wright Tremaine congratulates Joe Slagel, Todd Smith and their colleagues at Geospiza for their remarkable accomplishments. We wish them much continued success.
BAC End Sequencing

BAC end sequencing is a relatively simple, fast and inexpensive strategy for sequencing very large regions of DNA. Fragments of genomic DNA of up to 150 kb are cloned using bacterial artificial chromosome (BAC) vectors. A library of these BAC clones can represent very large regions of the genome of an organism. About 500 bases at both ends of each clone are sequenced (these end sequences are called sequence tag connectors), overlapping clones are identified and an assemblage of the overlapping clones is used to prepare a map. Markers such as restriction fragment length polymorphisms (RFLPs) can be used to correlate the BACs with a physical map. Once the sequence of each relevant BAC-cloned insert is determined, the DNA sequence of the entire region is known.
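As a rough illustration of how overlapping clones might be identified from their end sequences, the Python sketch below checks whether either end tag of a library clone occurs in an already sequenced "seed" clone. The function name, clone names and toy sequences are hypothetical. The exact-match test is an assumption that stands in for the alignment tools a real laboratory would use, and it ignores reverse-complement matches.

```python
def find_overlapping_clones(seed_sequence, end_tags):
    """Return the names of clones with at least one end tag found in the seed.

    seed_sequence: DNA sequence of an already sequenced clone (hypothetical).
    end_tags: dict mapping clone name -> (left_end_tag, right_end_tag).
    """
    return sorted(
        name
        for name, tags in end_tags.items()
        if any(tag in seed_sequence for tag in tags)
    )


# Toy data: real end tags are about 500 bases long, not 6.
seed = "ATGCCGTAAGCTTGGACCTAGGCTAACGT"
library = {
    "clone_A": ("GGACCT", "TTTTTT"),  # left tag lies inside the seed
    "clone_B": ("CCCCCC", "GGGGGG"),  # neither tag matches
    "clone_C": ("AAAAAA", "GCTAAC"),  # right tag lies inside the seed
}

print(find_overlapping_clones(seed, library))
```

In this toy case, clone_A and clone_C each share an end tag with the seed and would be candidates for extending the map in either direction.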
What’s in a name?
Geospiza is the genus of ground finches found in the Galapagos Islands. Charles Darwin’s nineteenth-century morphological and population studies of these birds led to his publication of On the Origin of Species by Means of Natural Selection.
The name Geospiza thus evokes the basis for one of the most profound biological inquiries of modern time. Also evoked are problems that can befall scientists in the course of their research: some of the data that young Darwin collected on his trip to the Galapagos was lost. Because he neglected to record the collection locations of his Geospiza specimens, publication of his findings was delayed.
Todd Smith, cofounder of Geospiza, Inc., felt that the name was perfectly apt for a company whose mission is to develop the tools required for efficient and accurate studies of the most basic biological questions. Joe Slagel, also a cofounder of Geospiza, agreed – but worried that the name might be too obscure. During their discussions of this point over beer one evening, Joe did some informal market research: he asked their server if she knew about “Geospiza.” Without hesitation, she replied that it was the scientific name for Darwin’s Galapagos finches. Although the drinking establishment was in a neighborhood near the University of Washington and his survey sample might not have been representative of a more general population, Joe conceded the point to Todd.
Geospiza Finch ® Suite
Geospiza’s customizable software system is designed to manage work flows, data analysis, reporting and storage associated with large-scale DNA sequencing operations.
Automated DNA sequencing instruments generate data in the form of a chromatogram.
[Figure: example chromatogram displayed in Geospiza’s FinchTV]
Each base position in the DNA is represented by a chromatographic peak. The example in the figure is displayed using Geospiza’s FinchTV, available without charge at finchtv.com. Each of the four DNA bases is labeled with a different colored fluorescent dye and, by measuring the wavelength of the light emitted by each peak as it passes a detector, the instrument distinguishes between A, C, G and T at each position. (In the example, peaks representing A residues are green, C residues are blue, etc.)
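The idea of reading a base from the dominant dye signal can be sketched in a few lines of Python. This is a simplified illustration, not the method used by any particular instrument or by FinchTV: the function name and the intensity values are invented for the example, and each peak is assumed to have already been reduced to one intensity per dye channel.

```python
def call_bases(peaks):
    """Return the base with the strongest signal at each peak position.

    peaks: list of dicts mapping each base (A, C, G, T) to the fluorescence
    intensity recorded in that base's dye channel (hypothetical values).
    """
    return "".join(max(peak, key=peak.get) for peak in peaks)


example_peaks = [
    {"A": 910, "C": 40, "G": 25, "T": 30},
    {"A": 35, "C": 20, "G": 870, "T": 45},
    {"A": 50, "C": 30, "G": 20, "T": 800},
    {"A": 15, "C": 950, "G": 60, "T": 25},
]

print(call_bases(example_peaks))  # prints AGTC
```

With clean data like this, picking the strongest channel is enough. Noisy or compressed peaks are where more careful base-calling programs earn their keep.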
While the figure above is an example of a clean and easy-to-read chromatogram, real chromatograms often show considerable baseline noise, reduced peak amplitudes or compressed peak spacing. Although these difficulties may be overcome by adjusting the chemistry of the sequencing reactions or by sequencing the complementary strand, accurate reading of such chromatograms often requires more analysis than is available from the sequencing instrument itself. The Geospiza Finch ® Suite system is designed to apply more stringent analytical tools, such as the base-calling programs Applied Biosystems’ KB™ Basecaller and the University of Washington’s Phred, to DNA sequencing chromatograms.
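Base-calling programs such as Phred attach a quality score to each called base. Under the widely used Phred convention, the score is Q = -10 log10(P), where P is the estimated probability that the call is wrong, so Q = 20 corresponds to a 1-in-100 chance of error. The Python sketch below shows that relationship and one possible use of it. The function names and the cutoff of 20 are illustrative assumptions, not a description of how Finch ® Suite handles quality scores.

```python
import math


def error_probability(quality):
    """Convert a Phred-style quality score to an estimated error probability."""
    return 10 ** (-quality / 10)


def phred_quality(probability):
    """Convert an estimated error probability to a Phred-style quality score."""
    return -10 * math.log10(probability)


def mask_low_quality(bases, qualities, min_quality=20):
    """Replace bases whose quality is below min_quality with N (assumed cutoff)."""
    return "".join(
        base if quality >= min_quality else "N"
        for base, quality in zip(bases, qualities)
    )


print(error_probability(20))
print(mask_low_quality("ACGT", [30, 10, 25, 5]))  # prints ANGN
```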
The strategy for determining the sequence of complete genomic DNA requires sequencing incremental, overlapping fragments, which are assembled into contiguous stretches of sequence called contigs. Another utility contained within Geospiza Finch ® Suite analyzes a set of sequenced fragments to determine their order in the genome. It also assembles them and determines the DNA sequence of the region they encompass. If any cloning vector sequence is present in a fragment, it is masked before assembly. Geospiza Finch ® Suite also contains a utility that facilitates subjecting sequence files to analysis using search-and-comparison algorithms such as BLAST.
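The two steps described here, masking vector sequence and joining overlapping fragments, can be sketched in Python as follows. This is a toy illustration under stated assumptions: the function names, the mask character and the sequences are hypothetical, and the exact-match overlap test stands in for the error-tolerant alignment that production assembly programs perform.

```python
def mask_vector(read, vector, mask_char="X"):
    """Replace every exact occurrence of the vector sequence with mask characters."""
    return read.replace(vector, mask_char * len(vector))


def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of left that equals a prefix of right."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge(left, right, min_overlap=3):
    """Join two fragments at their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


print(mask_vector("GGATCCACGTAC", "GGATCC"))  # prints XXXXXXACGTAC
print(merge("ACGTACGGA", "CGGATTTC"))         # prints ACGTACGGATTTC
```

Masking first keeps shared vector sequence from being mistaken for a genuine overlap between two unrelated fragments.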
Geospiza Finch ® Suite contains database technology that enables secure and efficient management and storage of DNA sequence data. Work flow in the sequencing facility and the reporting of results to clients also are managed by the system. The streamlining and automation of these processes by Geospiza’s software systems are attractive to scientists seeking to increase the speed and efficiency of their DNA sequencing operations.
Geospiza Finch ® Suite SIMS Architecture
The Geospiza Finch ® Suite scientific information management system (SIMS) architecture is composed of multiple tiers built around the concept of releasing a single application.
Joe Slagel, the chief software architect at Geospiza, describes the SIMS architecture of the Finch ® Suite as being organized into multiple tiers encompassing the data store; job management; data analysis software; an infrastructure layer enabling user management at the information systems level; and the presentation tier, containing graphical user interfaces, web pages and interfaces with other instruments. The architectural design is time-tested in the marketplace, as demonstrated by its acceptance and deployment into full production by some of the world’s top universities, research institutes and pharmaceutical companies.