Abstracts
David Bader - Climate Simulation for Climate Change Studies
Climate modeling is one of the most well-known simulation problems that require high-end computing, data, and network resources. Because it is impossible to build a physical laboratory to study climate, climate simulation models are the only tools with which scientists can integrate their knowledge to gain understanding of this highly nonlinear and complex system. Climate simulation has advanced dramatically over the last two decades, in large part because of demands to study the potential climate changes brought about by human activities, principally the increase in atmospheric carbon dioxide concentrations that results from the combustion of carbon-based fossil fuels for energy production.
This presentation will provide a brief introduction to the scientific aspects of climate science and their representation within climate simulation models. The primary focus of the presentation will be on the use of models to simulate climate change. The practical limitations imposed by throughput considerations, computer architectures, programming tools, and available computer resources will be identified.
In the last two years, several visionary conceptual models have been proposed that assume computer hardware is no longer a limitation. These ideas take climate modeling much closer to a “first principles” approach to simulation, greatly reducing reliance on “parameterizations”, or closure schemes for important unresolved processes. As will be shown, however, a “perfect” simulation is still impossible. Equally important, advances in computer hardware must be matched by complementary advances in software and programming models, which has not happened over the past decade. This talk will highlight some of the shortcomings in this area that have prevented climate modeling, and other simulation science applications, from advancing at a rate commensurate with the rate of advance in hardware.
Charles Bennett - Quantum computing -- its promise and limitations
The theories of reversible computation and, more recently, quantum computation have drawn attention to previously neglected physical aspects of information processing, and each has offered some hope of overcoming what were previously thought to be fundamental limits. Neither, however, offers a general cure for the anticipated end of Moore's law.
Doug Burger - The End of Silicon: Implications and Predictions for HPC Systems
Moore's Law has persisted longer than many thought possible. Nevertheless,
the end for CMOS is in sight. Before we get there, power, leakage, and reliability
challenges will change computer systems substantively. In this
talk, I will project how architectures and systems are likely to evolve between
the present and the last conventional computer system.
W. Dally - Custom vs. Commodity Processors (and Memory Systems): The Right Hardware Makes Software Easier
Andre DeHon - Nanowire-based Computing Systems
Chemists can now construct wires that are a few atoms in diameter; these wires can be selectively field-effect gated, and wire crossings can act as programmable diodes. The tiny feature sizes offer a path to economically scale to atomic dimensions. However, the associated bottom-up synthesis techniques produce only highly regular structures and have high defect rates and minimal assembly control. We develop architectures to bridge between lithographic and atomic-scale dimensions and to tolerate defective and stochastic assembly. Using 10 nm pitch nanowires, these nanowire-based programmable architectures offer one to two orders of magnitude greater mapped-logic density than defect-free lithographic FPGAs at 22 nm.
Michael Frank - The Reversible Computing Question: A Crucial Challenge for Computing
The computing world is rapidly approaching a power-performance crisis. Over
the course of the next few decades, all the usual tricks and techniques for improving
computer energy efficiency will, one by one, approach fixed limits. As this happens,
the performance that can be maintained per watt consumed (and ops performed per
dollar spent, barring cheaper energy) will gradually flatten out, and stay flat
forever!
For, as von Neumann first pointed out in 1949, fundamental thermodynamics imposes
a strict limit on the energy efficiency of conventional "irreversible" binary
operations, and conventional algorithms for a given task will always require
some minimum number of these.
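To make the scale of that thermodynamic limit concrete, here is a small illustrative calculation (not material from the talk) of the widely cited kT ln 2 bound on the energy dissipated per irreversible bit operation, assuming room temperature, together with the ceiling it implies on irreversible operations per watt.

```cpp
// Illustrative calculation of the thermodynamic bound on irreversible bit
// operations (the kT ln 2 limit); assumes an operating temperature of 300 K.
#include <cmath>
#include <cstdio>

int main() {
    const double k_B = 1.380649e-23;                 // Boltzmann constant, J/K
    const double T   = 300.0;                        // assumed temperature, K
    const double e_min = k_B * T * std::log(2.0);    // min energy per irreversible bit op, J

    // Maximum irreversible bit operations sustainable per watt (i.e., per joule per second).
    const double ops_per_watt = 1.0 / e_min;

    std::printf("kT ln 2 at %.0f K    : %.3e J per irreversible bit operation\n", T, e_min);
    std::printf("Ceiling on ops per W : %.3e irreversible bit ops/s per watt\n", ops_per_watt);
    return 0;
}
```

The resulting ceiling, on the order of 3 x 10^20 irreversible bit operations per second per watt at room temperature, is far above today's practice but fixed, which is why the abstract argues that only reversible operation would let energy efficiency keep improving indefinitely.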
However, this crisis might be avoided, and computer energy efficiency might resume an indefinite upward climb, if only we can practically implement an unconventional approach known as "reversible computing," which avoids using irreversible operations. Instead, in a high-performance "ballistic" reversible computer, the physical and logical state of its circuits essentially "coasts" along the desired trajectory through the machine's configuration space, like a roller coaster along its track, with an energy dissipation that, in theory, can approach arbitrarily close to zero as the technology is further refined.
Unfortunately, the question of whether reversible computing can be made to work efficiently in practice remains open at this time. Various theoretical models and "proof of concept" prototypes of reversible machines exist, but can all be criticized as either too inefficient or too incomplete to be convincing. On the other hand, all of the many attempts by skeptics to prove reversible computing impossible (or permanently impractical) have also been invalid or incomplete, often relying on demonstrably incorrect assumptions about how a computer must work.
The reversible computing question is very deep and important, and it deserves
increased attention. But it is also extremely subtle, and quite difficult
to resolve. In this talk, I review the major results and open issues in
the field,
and propose what we must do in order to make progress towards answering this
crucial question, and possibly opening the door to a future of unbounded
improvements in computer energy efficiency.
Ed Fredkin - What Else Can Physics Do for Us?
It would be nice if the interplay between new computation technologies and
the laws of physics could bring real progress. Quantum computing, so far,
has been interesting physics with no resulting computation. Further, the
range of applicability of QC, if it ever becomes practical, may be limited
to a tiny slice of the universe of tomorrow’s computational workload.
It is interesting to raise the question: “What else can physics offer
with respect to real world computational problems?” The answers aren’t
clear but there are new insights and possibilities. We should understand
that QC is not the only way to bend the basic physical properties of matter
and energy to the task of general computation. We will report on physics-based concepts that result in conventional computational structures, differing only in having speed and capacity that leapfrog Moore's Law.
William Gropp - How to Replace MPI as the Programming Model of the Future
There
are now legacy MPI codes that future architectures and systems will need to
support, either directly through a high-quality MPI implementation
or with advanced code transformation aids. Why has MPI been so successful?
What properties must any replacement have? This talk will look at some of the
reasons (other than portability) for MPI's success and what lessons they provide
for current challengers, such as the PGAS languages. The interaction of system
architecture and hardware support for programming models and for algorithms
will also be discussed, with particular emphasis on the importance of balanced performance features for programming models and algorithms in high-end computing.
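As a concrete point of reference for this discussion, the sketch below shows a minimal two-sided MPI exchange in C++ (a generic illustration, not code from the talk); it is exactly this kind of legacy pattern that any successor, PGAS languages included, must either run through a high-quality MPI implementation or help transform.

```cpp
// Minimal two-sided MPI message exchange: the legacy pattern any successor
// programming model must either support directly or help translate.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size >= 2) {
        int token = 0;
        if (rank == 0) {
            token = 42;
            // Explicit send: the sender names the destination rank and tag.
            MPI_Send(&token, 1, MPI_INT, /*dest=*/1, /*tag=*/0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            // Matching receive: the two-sided semantics make both the data
            // movement and the implied synchronization visible in the code.
            MPI_Recv(&token, 1, MPI_INT, /*source=*/0, /*tag=*/0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            std::printf("rank 1 received %d\n", token);
        }
    }

    MPI_Finalize();
    return 0;
}
```

The explicit pairing of sends and receives is one of the properties any challenger has to account for, whether by supporting it outright or by offering automated transformation of such codes.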
Bruce Hendrickson - Parallel Graph Algorithms: Architectural Demands of Pathological Applications
Many important applications of high performance computing involve frequent, unstructured memory accesses. Among these are graph algorithms, which arise in a wide range of important areas including linear algebra, biology, and informatics. Graph operations often involve following sequences of edges, which requires minimal computation but frequent accesses to unpredictable locations in global memory. These characteristics result in poor performance on traditional microprocessors, and even worse performance on common parallel computers.
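To illustrate that access pattern, the following sketch (a generic example, not the authors' code) performs a breadth-first traversal over a graph stored in compressed sparse row form; nearly every step is a data-dependent load from a large array at an unpredictable address, with almost no arithmetic to hide the latency.

```cpp
// Breadth-first search over a CSR (compressed sparse row) graph: the inner
// loop is dominated by data-dependent, effectively random reads into large
// arrays -- the access pattern that penalizes cache-based microprocessors.
#include <cstddef>
#include <queue>
#include <vector>

std::vector<int> bfs_levels(const std::vector<std::size_t>& row_ptr,
                            const std::vector<int>& col_idx,
                            int source) {
    const int n = static_cast<int>(row_ptr.size()) - 1;
    std::vector<int> level(n, -1);
    std::queue<int> frontier;

    level[source] = 0;
    frontier.push(source);

    while (!frontier.empty()) {
        const int u = frontier.front();
        frontier.pop();
        // Each neighbor id is read from col_idx and then used to index level[]:
        // the address of the next load depends on data just fetched, so it is
        // unpredictable to prefetchers and rarely cache-resident for large graphs.
        for (std::size_t e = row_ptr[u]; e < row_ptr[u + 1]; ++e) {
            const int v = col_idx[e];
            if (level[v] == -1) {
                level[v] = level[u] + 1;
                frontier.push(v);
            }
        }
    }
    return level;
}
```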
In recent work, we have explored the performance of graph algorithms on the
massively multithreaded Cray MTA-2.
The MTA's latency tolerance and fine-grained synchronization mechanisms allow
for high performance of single-processor and parallel graph algorithms. We
will present these results and discuss their lessons for future developments
in computer architecture.
Joint work with Jon Berry, Richard Murphy and Keith Underwood.
Steve Jardin - Towards Comprehensive Simulation of Fusion Plasmas
In Magnetic
Fusion Energy (MFE) experiments, high-temperature (100 million degrees centigrade)
plasmas are produced in the laboratory in order to create
the conditions where hydrogen isotopes (deuterium and tritium) can undergo
nuclear fusion and release energy (the same process that fuels our sun). Devices
called tokamaks and stellarators are “magnetic bottles” that confine
the hot plasma away from material walls, allowing fusion to occur. Confining
the ultra-hot plasma is a daunting technical challenge. The level of micro-turbulence
in the plasma determines the amount of time it takes for the plasma to “leak
out” of the confinement region. Also, global stability considerations
limit the amount of plasma a given magnetic configuration can confine and thus
determine the maximum fusion rate and power output. Present capability is
such that we can apply our most complete computational models to realistically
simulate both nonlinear macroscopic stability and microscopic turbulent transport
in the smaller fusion experiments that exist today, at least for short times.
Anticipated advances in both hardware and algorithms during the next 5-10+
years will enable application of even more advanced models to the largest present-day
experiments and to the proposed burning plasma experiments such as the International
Thermonuclear Experimental Reactor (ITER). The present thrust in computational
plasma science is to merge the now separate macroscopic and microscopic models and to extend their physical realism by including detailed
models of such phenomena as RF heating and atomic and molecular physical processes
(important in plasma-material interactions), so as to provide a true integrated
computational model of a fusion experiment. This is the goal of a new initiative
known as the Fusion Simulation Project. Such an integrated modeling capability
will greatly facilitate the process whereby plasma scientists develop the understanding of and insight into these amazingly complex systems that will be critical in realizing the long-term goal of creating an environmentally and economically
sustainable source of energy.
David Keyes - Drivers from Science and Engineering Applications
Relying on the input of hundreds
of members of the U.S. computational science community at the 2003 Science-based
Case for Large-scale Simulation (SCaLeS)
workshop, the March 2004 whitepapers of the Scientific Discovery through Advanced Computing (SciDAC) project of the U.S. DOE, and a collection of recent Gordon
Bell Prize finalist papers, we define and motivate some aspirations for high-end
science and engineering simulations in the five-year horizon. Looking at some
hurdles to progress in high-end simulation, we note in passing that not all
are architectural in nature, then concentrate further on those that apparently
are. Looking at some kernels of high-end simulation, we note apparent hurdles
to their scalability and draw inspiration from the flexibility of algorithm
designers to get around hurdles that have presented themselves in the past.
Craig S. Lent - Molecular quantum-dot cellular automata and the limits of binary switch scaling
Molecular quantum-dot cellular automata (QCA) is an approach to electronic computing
at the single-molecule level that encodes binary information using the molecular
charge configuration. This approach differs fundamentally from efforts to reproduce
conventional transistors and wires using molecules. A QCA molecular cell has
multiple redox centers which act as quantum dots. The arrangement of mobile charge
among these dots represents the bit. The interaction from one molecule to the
next is through the Coulomb coupling—no charge flows from cell to cell.
Prototype single-electron QCA devices have been built using small metal dots
and tunnel junctions. Logic gates and shift registers have been demonstrated,
though at cryogenic temperatures. Molecular QCA would work at room temperature.
Molecular implementations have been explored and the basic switching mechanism
confirmed. Clocked control of QCA device arrays is possible and requires creative
rethinking of computer architecture paradigms. By not using molecules as current
switches, the QCA paradigm may offer a solution to the fundamental problem of
excess heat dissipation in computation.
Jeff Nichols - National Leadership Computing Facility - Bringing Capability Computing to Science
The National Center for Computational Sciences (NCCS)
maintains and operates a user facility to develop and deploy leadership-computing
systems with the
goal of providing computational capability that is at least 100 times greater
than what is generally available for advanced scientific and engineering problems.
We work with industry, laboratories, and academia to deploy a computational
environment that enables the scientific community to exploit this extraordinary
capability, achieving substantially higher effective performance than is available
elsewhere. A non-traditional access and support model has been proposed in
order to achieve a high level of scientific productivity and address challenges
in climate, fusion, astrophysics, nanoscience, chemistry, biology, combustion,
accelerator physics, engineering, and other science disciplines. The NCCS brings
together world-class researchers; a proven, aggressive, and sustainable hardware
path; an experienced operational team; a strategy for delivering true capability
computing; and modern computing facilities connected to the national infrastructure
through state-of-the-art networking to deliver breakthrough science. Combining
these resources and building on expertise and resources of the partnership,
the NCCS enables scientific computation and breakthrough science at an unprecedented
scale.
Michael Niemier - What can ‘baseline’ QCA do?
Whether or not the end of the CMOS curve does indeed come to pass, such speculation – combined with other technological advances – has helped to fuel a wealth of research into alternative means of computation. Much of this work has focused either on the lowest levels of device physics or, at best, on very simple circuits. But
most importantly, it has often led to a publication that discusses the demonstration
and performance of a single device. While this is an undeniably important and
necessary first step, we must ultimately consider how “Device X” will
be used to form a computational system, as well as the fact that we will need
many Device X’s to do so – not just one. Given the increasing number
of proposed novel devices, we should explicitly consider both of the above
issues beginning in the initial stages of a device’s development – even
before the first paper demonstrating Device X is published.
By involving computer architects during device development, we will not just be looking at a single device in isolation – rather, we will be evaluating a reasonably sized system with an initial computational goal in mind. Moreover, by assuming a set of very pessimistic implementation constraints, we can establish a true baseline for Device X – defining our best-foreseen application in the worst-foreseen operational environment. If expectations (i.e., with regard to power, area, speed, etc.) for end-of-the-roadmap silicon and other emergent devices – for the same application – are plotted simultaneously, we can make significant headway into discovering what niche roles Device X can realistically play in computing.
What has been done and, in the speaker’s opinion, what needs to be done will be discussed in the context of Quantum-dot Cellular Automata (QCA).
Mark Oskin - Engineering a Quantum Computer: Bridging the Theoretical and Practical Divide
Theoretically, quantum computers offer great promise to
solve formally intractable problems. Experimentally, small scale quantum computers
have been demonstrated. The next phase of research is to construct large-scale
quantum
computers capable of proving the technology and further validating the theoretical
foundations. Such devices will consist of tens to hundreds of quantum bits. At this
scale, proper engineering of the devices becomes critical.
This talk will present a broad overview of our work in exploring the engineering
challenges and design trade-offs involved with large scale quantum systems.
We have found that noise will significantly constrain scalability and that
the micro-architecture of these devices needs to be tuned to minimize decoherence.
This talk will conclude by sketching future work to be done in this area.
Thomas Sterling - Continuum Computer Architecture for Nano-scale Technologies
As the feature size of logic devices decreases with Moore’s Law, ultimately reaching the domain of nano-scale technology, the ratio of ‘remote’ to ‘local’ action will escalate dramatically, demanding entirely new computing models and structures to exploit these future technologies efficiently and to reach Exaflops capability and beyond. Continuum Computer Architecture (CCA) is a new family of parallel
computer architectures under development at LSU to harness convergent device
technologies beyond Moore’s Law that respond to the challenges implied
by the emerging disparity between local and global operations. CCA provides
one possible framework for employing nano-scale technology for future convergent
system architectures at the end of Moore’s Law. CCA is a cellular architecture
merging data storage, logical manipulation, and nearest neighbor transfers
in a single simple element or cell. In physical structure, CCA is reminiscent
of cellular automata. But logically, CCA is very different. It supports a general
global parallel model of computation through the management of a distributed
virtual name space for both data and parallel continuations, which are data structures that dynamically and adaptively govern fine-grain parallel execution.
The semantics of the CCA system borrows from the ParalleX model of computation
that combines message-driven computing, multi-threading, and the futures synchronization construct to replace the venerable and conventional barrier-controlled communicating
sequential processes. This presentation will describe ParalleX and its potential
implementation through Continuum Computer Architecture with nano-scale technology
for Exaflops and beyond.
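As a loose illustration of the futures synchronization construct that ParalleX adopts in place of global barriers, the sketch below uses ordinary C++ futures; it is only an analogy (it is not ParalleX or the CCA runtime), but it shows how a consumer can block on exactly the value it needs rather than having all workers wait at a barrier.

```cpp
// Futures-based synchronization, loosely analogous to the construct ParalleX
// uses instead of global barriers: a consumer waits only for the specific
// result it depends on, so unrelated work is never held up.
#include <cstdio>
#include <future>
#include <vector>

// A stand-in for some unit of fine-grain parallel work.
static long long partial_range_sum(long long lo, long long hi) {
    long long s = 0;
    for (long long i = lo; i < hi; ++i) s += i;
    return s;
}

int main() {
    const long long n = 1000000;
    const int pieces = 4;
    std::vector<std::future<long long>> results;

    // Launch each piece asynchronously; each future names one pending result.
    for (int p = 0; p < pieces; ++p) {
        const long long lo = p * (n / pieces);
        const long long hi = (p + 1) * (n / pieces);
        results.push_back(std::async(std::launch::async, partial_range_sum, lo, hi));
    }

    // Synchronize per value: each get() blocks only until that particular
    // piece is ready, instead of a barrier that stalls all workers until
    // every one of them has finished.
    long long total = 0;
    for (auto& f : results) total += f.get();

    std::printf("sum = %lld\n", total);
    return 0;
}
```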
Tom Theis - Devices for Computing: Present Problems and Future Solutions
The biggest problems limiting
the further development of the silicon field-effect transistor are power
dissipation and device-to-device variability. Despite some pessimistic
predictions, it looks like the technology can be extended for at least another
10 years. Research into transistors based on carbon nanotubes or semiconductor
nanowires can be viewed as a quest for the "ultimate" field-effect transistor.
Looking beyond the field-effect transistor, major U.S. semiconductor manufacturers have recently announced the Nanoelectronics Research Initiative (NRI), which
will fund university research aimed at entirely new logical switches. Beyond
the stated research goals of NRI, I will briefly survey the prospects for devices
that efficiently implement reversible logic and quantum logic.
Colin P. Williams - Introduction to Quantum Simulation
While it is widely known that quantum computers can factor
composite integers and compute discrete logarithms in polynomial time, other
applications of quantum computers have not been as widely publicized. In this
talk I will discuss some of the ways quantum computers could be used in scientific
computation, especially in simulation, quantum chemistry, signal processing,
and solving differential equations. Such applications of quantum computers
have the potential to have greater scientific and commercial impact than those
related to factoring and code-breaking.
Stan Williams - Manufacturability and Computability at the Nano-Scale
Nano-scale electronics offer the possibility of building much higher-density circuits than those presently available, but there are major issues to resolve
before they become a reality. A significant issue is the cost of manufacturing,
which will lead to new fabrication technologies and geometrically simpler circuit
designs. A second is that at some scale, the physics of the field-effect transistor
will no longer operate, and new devices enabled by quantum effects will be
needed. I will review our latest developments in the areas of nano-imprint
lithography, switching devices, and crossbar architectures.
Peter Zeitzoff - MOSFET Scaling Trends, Challenges, and Key Technology Innovations through the End of the Roadmap