
Simulations conducted on Jaguar today were unthinkable even a few months ago. The system's capabilities are opening up new ways of thinking about research.
Fast Times at ORNL
One of the world's premier computing facilities is transforming scientific research.
Many in the scientific community are surprised
to learn that the world's most powerful
supercomputer is not crunching numbers
for classified government projects deep in the bowels of an
ultra-secret agency. In fact, at Oak Ridge National Laboratory's
Leadership Computing Facility (LCF), the computational heart of
one of the Department of Energy's most successful user facilities
is blazing through scientific simulations designed to
help develop cleaner sources of energy and to understand
the causes and impacts of climate change.
The great majority of research conducted on this analytical
juggernaut, known as Jaguar, is anything but secret. Practiced
by researchers from universities, corporations and government
laboratories, this "open science" philosophy is designed
to provide a unique tool for addressing some of the largest and
most important scientific challenges. The collection of capabilities
that accompany this computing leviathan, including a breadth
of scientific talent; an acclaimed support staff; and a formidable
computing infrastructure of power, cooling and connectivity, has
made Oak Ridge one of the world's premier computational facilities
for the delivery of scientific research.
Despite the fact that in 2010 Jaguar will provide more than
a billion processor hours of
computing time, competition
among users for resources is
intense. The majority of the
time on Jaguar is allocated
through INCITE, a program
operated in conjunction with
the Department of Energy's
Office of Advanced Scientific
Computing Research. One of
the nation's most successful
programs of computational
research, INCITE selects from
among user research proposals by evaluating
both the potential and computational readiness
of the research to accelerate scientific
discoveries and technological innovations.
Research projects selected to run on
Jaguar are often accelerated as a result of the
massive amount of analysis conducted by the
machine in a relatively brief period—in some
cases, reducing from months to days the time necessary to generate data. LCF Director of Science Doug Kothe
points to the Department of Energy's list of the "Top 10 Scientific
Achievements" over the last three years as evidence of the
machine's value. "Five of those 10 achievements were the direct
result of data enabled through simulations executed on Jaguar,"
he says.
Because of Jaguar's importance to advancing science in a
number of disciplines, the supercomputer runs user experiments
virtually non-stop every day of the year, relying on scheduling
software to load simulations onto the system as quickly and
efficiently as possible. "On any given day, the
backlog of jobs waiting in the queue could be several days of
simulation time. Thirty days of backlog would not be tolerated
by the scientists. Likewise, a zero backlog would indicate the
machine was underutilized," Kothe says. "I am not sure what the
ideal backlog would be, but there is no doubt this is a user facility
achieving high availability, high utilization and high demand."
Higher resolution
Although at 2.3 petaflops, or 2,300 trillion calculations per
second, Jaguar is the world's most powerful computer by a wide
margin, the system's most distinctive attribute may not be speed,
but rather its 300 terabytes of memory—about three times that of
any comparable supercomputer. Jaguar's abundance of memory
enables the storage of more highly detailed models and equations
necessary for simulating various real-world phenomena.
The advantage of this extra memory capacity is illustrated by
high-resolution climate models developed by the Computational
Climate End Station Project. Headed by climate scientist Warren
Washington of the National Center for Atmospheric Research, the
project typically develops models that attempt to predict climatic
conditions in coming decades and centuries. Washington emphasizes
that while the ability to develop ever more detailed models
is helpful, high-resolution models are not an end in themselves.
"Resolution is important, but we must also make sure our models
produce realistic simulations. This translates to improving the
details of areas that we could not treat as well in the past." Washington
observes that earlier models could specify only general
features, such as deserts, forests or grasslands. "With additional
computational power, we can now specify species of plants and
examine details like how precipitation over mountainous regions
migrates into river valleys and eventually flows into the ocean,"
he says. "With the Oak Ridge resources, we can run our models at
much higher resolution than in the past."
Time will tell
From this early vantage point, it's hard to say which user
communities have benefited most from Jaguar's rapidly expanding
capabilities. "To some extent the answer is in the eyes of the
beholder," Kothe says. "We are employing an open science system for research in a number of areas as varied as chemistry, materials science, climate, biology and astrophysics. In each of these fields,
we can point to impactful work made possible by Jaguar."
In computational science, as in other scientific disciplines,
time is often required for the community at large to appreciate
the impact of the work being done. Kothe rarely sees a simulation
in progress that can be immediately viewed as "game
changing." "The point is that time—measured in years, not weeks
or months—is required to know which of these impacts will be
significant. We are confident, however, that the impacts will be
broad and deep," he says.
The incredible pace of change can make researchers forget
that some simulations conducted on Jaguar today were unfathomable
even a few months ago. For some of ORNL's users,
these capabilities are opening up a new way of thinking about
research. "The approach a scientist takes to designing a simulation
that runs only once a week or once a year is very different
than the approach he or she would take to a simulation run
hundreds or thousands of times in a week," Kothe says. "The challenge
is to open our minds to a more unconstrained approach
to science. The sheer computing power of systems like Jaguar
means that scientists are much less apt to limit the complexity of
the models they construct. This ability to integrate a higher level
of complexity leads to more predictive models that increase the
accuracy of the simulated results."
A field of dreams
Just as computing capabilities rapidly change, so does the
pecking order among the world's top computing systems. Although
Jaguar is the world's top supercomputer today, the title is elusive in
a field where ORNL's maximum computing capacity is 800 times
greater than it was just 5 years ago. One constant over this period,
however, has been the popularity of the LCF with the facility's users.
When pressed about what consistently attracts users to Oak Ridge,
Kothe suggests three factors. "First, Jaguar has become a field of
dreams for the best scientists. To some extent one can use the cliché,
'if you build it, they will come.'" Second, he emphasizes the ability
of the center's staff to meet the understandably complex and often
esoteric needs of LCF users. "The ORNL support staff is continually
cited by users as being second to none. We have a unique set of
experts who are willing and able to help."
Finally, Kothe attributes much of the center's success to a
unique support model. "When we stood up the center several
years ago, we made a conscious decision to support not hundreds
of projects and thousands of users, but dozens of projects and
hundreds of users. This decision enabled us to assign our best staff
to individual projects. As a result, we are simply not answering
mundane questions like, 'Where do I get an account?' or 'How do
I run a job?' Our staff members function more like collaborative
members of project teams."
"One indicator that the center's model is working is the
emulation from other centers," Kothe says. "Our model has been
quite effective at enabling Oak Ridge to support the Department
of Energy's goal of ground-breaking computational research. For
our DOE customer, and for the hundreds of users who each year
take advantage of this remarkable program, the impact will be
nothing less than profound."