02 Sep 2008 Illumina/Solexa pipeline w/ Grid Engine
From the trenches @ BioTeam …
Illumina (www.illumina.com) is one of a few companies involved in the “next generation” DNA sequencing race. Each company has technology that wildly decreases the cost and time involved in large-scale genome analysis. Everyone in this space wants to be the first company to produce a box capable of cranking out a human genome for $10K or less.
These lab instruments produce terabyte volumes of data per experimental run, and thus often need to get hooked into complex IT infrastructures (this is what pays my bills …)
This screenshot shows the end result of integrating the instrument data analysis pipeline with Grid Engine software running on a midsized Linux cluster.
In this test I have the software using 32 cores on 6 servers for the run and I’m timing it against the same analysis done manually on a single server.
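For anyone who wants to do the same kind of comparison, here is a minimal sketch of how the two wall-clock times could be turned into a speedup and a per-core efficiency number. The function names and the timings in the example are placeholders I made up for illustration; they are not measurements from this run and not part of the Illumina pipeline or Grid Engine.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Ratio of single-server wall-clock time to cluster wall-clock time."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("wall-clock times must be positive")
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds: float, parallel_seconds: float, cores: int) -> float:
    """Speedup divided by core count; 1.0 would mean perfect linear scaling."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / cores


if __name__ == "__main__":
    # Hypothetical timings, purely illustrative (NOT results from this test).
    serial = 8 * 3600.0    # assumed single-server run, in seconds
    parallel = 1 * 3600.0  # assumed 32-core Grid Engine run, in seconds
    cores = 32             # core count used in the test described above

    print(f"speedup:    {speedup(serial, parallel):.2f}x")
    print(f"efficiency: {parallel_efficiency(serial, parallel, cores):.1%}")
```

Efficiency well below 100% is normal for a pipeline like this, since some steps are serial or I/O bound, so the speedup figure on its own is usually the more useful number to report.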