Translational Genomics Research Institute

Altair PBS Professional™ at the Translational Genomics Research Institute (TGen): Cutting Time to Discovery

Customer Profile

In recent years, corporations and research institutions around the world have applied massive computational resources to defining the makeup of the human genome. One of the greatest challenges is to translate that knowledge into therapeutics and diagnostics — which is the mission of the Translational Genomics Research Institute (TGen), a remarkable non-profit organization founded through a joint effort among the State of Arizona, Arizona municipal governments, Indian tribal communities, educational institutions, private foundations, and corporations. TGen’s work is not only to make genetic discoveries, but also to translate those discoveries into benefits for human health in the form of new diagnostic tests and therapies.

With support from Arizona State University (ASU), TGen established its High Performance Biocomputing Center (HPBC) on the ASU Tempe Campus to give its scientists the powerful computational resources they need to discover how genetic changes contribute to disease progression and resistance to therapy. In the words of the HPBC’s mission statement, these resources “help empower researchers’ ability to rapidly translate genomic discoveries into diagnosis and treatment.”

Setting Up TGen’s Computational Heartbeat

TGen was established in Phoenix in 2002 with an initial staff of 23 scientists. In March 2003, ASU procured high-performance computing machines, including a 512-node IBM eServer Cluster 1350, to support TGen’s translational genomics research program. In April 2003, IBM began installation of the eServer Cluster, running Red Hat Linux on 1024 Intel Xeon processors. By July, the HPBC was running production-level tests of gene sequencing and other processes on the cluster system, which is known as Saguaro. By late September it was in full production. The PBS Professional workload management software was an integral element of Saguaro from the start.

“We had installed OpenPBS on our 16-node development cluster, a scaled-down version of Saguaro that came on line quite a bit sooner than Saguaro,” says James Lowey, Manager of High Performance Computing Systems for TGen. “But it didn’t provide many of the things that PBS Professional does. One of the critical factors PBS Professional gives me is accounting — the ability to look at the number of jobs that are run and the amount of time each job takes.

“Ultimately, TGen’s CIO, Dr. Edward Suh, chose PBS Professional because we needed a vendor-supported product that would meet our need to provide flexible job scheduling on Saguaro, the 512-node production cluster. We’ve been very pleased with the product’s capabilities.”
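The accounting data Lowey describes — how many jobs ran and how long each took — lives in the scheduler’s accounting logs, which PBS-family products write as one semicolon-delimited record per event. A minimal sketch of tallying per-user job counts and total walltime might look like the following; the sample records, usernames, and values are made up for illustration, not taken from TGen’s actual logs:

```shell
# Illustrative PBS-family accounting records ("E" = job end); the
# jobids, users, and times here are invented for this sketch.
log='03/15/2004 10:02:11;E;1234.saguaro;user=alice resources_used.walltime=01:30:00
03/15/2004 11:45:09;E;1235.saguaro;user=bob resources_used.walltime=00:45:00
03/15/2004 12:10:33;E;1236.saguaro;user=alice resources_used.walltime=02:00:00'

# Tally finished jobs and total walltime (in minutes; seconds are
# ignored) per user from the semicolon-delimited records.
summary=$(printf '%s\n' "$log" | awk -F';' '
  $2 == "E" {
    user = ""; wall = 0
    n = split($4, kv, " ")
    for (i = 1; i <= n; i++) {
      if (kv[i] ~ /^user=/) { sub(/^user=/, "", kv[i]); user = kv[i] }
      if (kv[i] ~ /^resources_used\.walltime=/) {
        sub(/^resources_used\.walltime=/, "", kv[i])
        split(kv[i], t, ":")
        wall = t[1] * 60 + t[2]
      }
    }
    jobs[user]++; mins[user] += wall
  }
  END { for (u in jobs) printf "%s %d %d\n", u, jobs[u], mins[u] }
' | sort)

echo "$summary"
```

In this sketch, `alice` ran two jobs totaling 210 minutes and `bob` one job of 45 minutes; the same one-pass awk approach scales to a full day’s log.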

Putting Saguaro to Work

Today, HPBC typically serves about 65 accounts on Saguaro, most of them within TGen. Scientific collaborators at ASU and other research institutions are also active users of the resource. They use BLAST, AMBER, Gaussian, and other commercial and in-house-developed applications to run thousands of jobs on Saguaro. PBS Professional provides the flexibility to run large jobs across, say, 128 nodes, while running thousands of small serial jobs on a single node.
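That flexibility can be sketched with a hypothetical submission script in the `nodes`/`ppn` resource syntax of that era; the job name, limits, and application command below are illustrative, not drawn from TGen’s actual configuration:

```shell
#!/bin/sh
# Hypothetical PBS job script for a large parallel run (all names
# and limits are illustrative, not TGen's real settings).
#PBS -N blast_run            # job name
#PBS -l nodes=128:ppn=2      # 128 nodes, both Xeon CPUs per node
#PBS -l walltime=04:00:00    # wall-clock limit
#PBS -j oe                   # merge stdout and stderr

cd "$PBS_O_WORKDIR"          # PBS sets this to the submission directory
mpirun -np 256 ./blast_wrapper input.fasta
```

The script would be handed to the scheduler with `qsub blast_run.sh`; a small serial job would simply request `-l nodes=1`, letting thousands of such jobs share single nodes while the scheduler holds larger allocations for parallel work.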

Saguaro, a 16-node development cluster, and three IBM SMP compute servers — two running SUSE Linux and one running AIX — all share a high-performance SAN, connected to Saguaro over three Cisco 4006 switches. Users can monitor their jobs interactively through a 1TB IBM General Parallel File System (GPFS) that is accessible from every node in the cluster.

One characteristic of PBS Professional that has helped HPBC cope with the demand for Saguaro’s resources is hands-off dependability and simplicity of maintenance. “One of the ringing endorsements I can give PBS Professional is that once we got it set up and working, I have not had to do anything to it at all,” says Lowey. “I went through last year and upgraded my entire cluster to Red Hat EL3.0. Part of that process was reinstalling PBS Professional. I followed the instructions in the manual and it took about 20 minutes. It was quite simple.”

Looking Ahead: Upgrades and a Web-Based Interface

One of HPBC’s goals is to move TGen to a web-based job submission model; an internal web-based data analysis site is already in operation. Another goal is flexible queues tied to the switch architecture, which will enable HPBC to run, say, 32-node jobs on a single switch blade, or 128-node jobs on a single switch, eliminating the latency of switch-to-switch communication. These and other advances will involve PBS Professional.

TGen’s successful experience with PBS Professional will soon lead to an upgraded version, which HPBC is currently evaluating. Of particular interest are the job array, redundancy, and failover features of the current release. Saguaro has seen heavy utilization since it entered full production in late 2003, and an increase in hardware failures is inevitable. PBS Professional’s Automatic Job Recovery will automatically rerun any interrupted job when it detects that a node has gone down.
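The two features under evaluation can be sketched in a single hypothetical submission script: a `-J` directive declares a job array (PBS Professional’s syntax for many similar sub-jobs), and `-r y` marks each sub-job rerunnable so the server can requeue it after a node failure. The job name, array bounds, and command below are illustrative only:

```shell
#!/bin/sh
# Hypothetical job-array submission (names and values are
# illustrative, not TGen's real workload).
#PBS -N seq_batch
#PBS -J 1-1000               # one sub-job per input file
#PBS -r y                    # rerunnable: requeue if the node fails
#PBS -l nodes=1              # each sub-job is a small serial task

cd "$PBS_O_WORKDIR"
# PBS_ARRAY_INDEX identifies this sub-job within the array.
./process_sequence "input.${PBS_ARRAY_INDEX}.fasta"
```

A single `qsub` of a script like this replaces a thousand individual submissions, and the rerunnable flag is what lets the scheduler restart interrupted sub-jobs without operator intervention.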

“I’m excited that upgrading my production cluster and looking at other uses of PBS Professional in our HPC environment are among my goals this year,” said Lowey.