Fault-Tolerant Spaceborne Computing Employing New Technologies

NOTE: Where the author has given permission, the titles for Wednesday and Thursday talks are hyperlinks to the presentation documents. Where this has not been possible, there is often a parenthetical note explaining why.

SCHEDULE

Tuesday May 27, 2008: ARRIVAL

Hotel Albuquerque, Potter's Room, 6:30 PM - 8:30 PM: Reception and registration (menu is the Mexican Package: Grilled Vegetables, Roasted Red Pepper Dip, Aztec Seven Layer Dip, Fresh Guacamole, Homemade Salsa, Chile con Queso, Tortilla Chips, Green Chile Chicken Cocktail Tamales, Red Chile Pork Cocktail Tamales, Beef Empanadas, Chicken Quesadilla Cornucopias).

Wednesday May 28, 2008: WHERE WE ARE & A VISION FOR THE FUTURE

CSRI/95, 8:00 AM - 8:30 AM: Registration and Breakfast (breakfast burritos, fruit tray, etc.)
CSRI/90, 8:30 AM - 8:45 AM: Larry Bergman, NASA/JPL, Erik DeBenedictis, Sandia, and Richard Stempien, MITRE, Objectives and Organization of Workshop.
CSRI/90, 8:45 AM - 9:30 AM: [Government Keynote] Rick Ridgley, Federal Government, and Dave Davis, AF/SMC, R&D Roadmaps to Address Spaceborne Computing (US Citizens only). DoD needs for radiation hardened microelectronics for spaceborne assets are determined and addressed through a formal process established by USD (AT&L). The Director, Defense Research and Engineering (DDR&E) chairs the Radiation Hardened Electronics Oversight Council (RHOC), which validates:
  • the threat;
  • requirements for hardening from the Joint Chiefs of Staff (JCS) and Services;
  • Program Office technology needs;
  • the technology roadmap with milestones and funding levels tied directly to technology need dates for DoD systems.
This talk will address the DoD "corporate management" of radiation hardened microelectronics R&D; the approach used to determine performance needs; the radiation hardness objectives for future R&D programs; space community R&D technology roadmaps to address Program Office needs, and the key metrics for determining the R&D path ahead.
CSRI/90, 9:30 AM - 10:00 AM: Scott Michel, Multicore Software And Application Development For Spaceborne Computing. Parallel computing architectures, which were actively developed and researched in the MPP era of the 1980s and early 1990s, are the dominant computing platform today and for the foreseeable future. As was true during the MPP era, software and application development lags behind architectural innovation. This lag is colloquially known as the "Programmability Gap" in multicore. Numerous efforts are underway to bridge the programmability gap for the various multicore architectural styles, ranging from drop-in replacement libraries and class hierarchies that help developers manage threads to new languages and compiler techniques. This talk introduces the programmability gap, its challenges and a representative spectrum of applied and speculative research, with emphasis on USAF SMC-related spaceborne computing issues.
CSRI/95, 10:00 AM - 10:30 AM: Break
CSRI/90, 10:30 AM - 11:00 AM: [RHBD #1] Lewis Cohn, DTRA, DTRA Radiation Hardened Microelectronics Program: RH 90nm Technology Development Projects. This briefing will address a number of different projects being managed by DTRA to develop and demonstrate radiation hardened 90nm microelectronics technology. The discussion will provide current status on three programs:
  • Boeing Radiation Hardened by Design Technology Demonstration;
  • Honeywell Radiation Hardening Process Development and Demonstration;
  • BAE Systems Radiation Hardening Process Development and Demonstration.
In addition, a number of other projects that are developing enabling technologies to support the hardening of 90nm technology will be identified.
CSRI/90, 11:00 AM - 11:30 AM: Zbigniew Kalbarczyk, UIUC, Experimental Evaluation: Assessment of Reliability Metrics
CSRI/90, 11:30 AM - 12:00 noon: Stephen Gooch, Wind River, Unsupervised Multi-OS Solutions for Single, Multi-Core, and Many-Core Processors (presentation not yet available). Unsupervised multi-OS solutions combine the real-time determinism of a real-time operating system with the application support of Linux on the same processor, giving the best of both worlds by using two different OSs. They simplify migration of existing software or hardware when it needs to be combined with new software that targets a different OS or OS version. Replacing separate boards/chips with a single processor reduces system cost and/or boosts performance. Partitioning safely separates applications where different levels of security, reliability, performance, or IP protection are required. The approach can also be used as an integration model and for GPL containment.
CSRI/90, 12:00 noon - 12:30 PM: Paul Dodd, Sandia, Radiation Effects Issues and Trends for High Performance Space Computing (presentation not yet available). In this talk we discuss radiation effects in advanced electronics, including the effects observed in current devices, trends and mitigation techniques for highly scaled technologies, and challenges and opportunities for new technology insertion in spaceborne computing assets.
CSRI/95, 12:30 PM - 1:30 PM: Lunch
CSRI/90, 1:30 PM - 2:00 PM: Michael McDougall, GrammaTech, Verifying software for multicore systems. Software running on multi-core systems presents new challenges to software verification. These new challenges will require adapting traditional verification techniques to identify software flaws exposed by the fine-grain parallelism expected in multi-core applications. This talk will discuss the challenges for multi-core verification in general, and focus on how reordered memory operations can invalidate concurrent algorithms previously considered to be correct.
CSRI/90, 2:00 PM - 2:30 PM: [Industry #1] David Czajkowski, Space Micro, Rad Hard High Performance Computing. Space Micro will discuss our recently developed high performance radiation hardening techniques for modern microprocessors and FPGAs. These techniques have resulted in new computer platforms that are capable of 4,000 MIPS and FPGA computing platforms providing 4 Gbps throughput with 30 million available gates. These platforms are fully radiation hardened.
CSRI/90, 2:30 PM - 3:00 PM: Tom Majumder, Real-time Wide Area ISR System Development Utilizing Synthetic Aperture Radar (SAR) (presentation export controlled). In this talk, we will present real-time wide area ISR system development utilizing Synthetic Aperture Radar (SAR). We will discuss current computing technologies, such as multicore systems, FPGAs, and multiprocessors, that enable real-time SAR data exploitation. We will also discuss the huge amount of data that a SAR system produces and how to process these data in real time. SAR data exploitation, such as Video SAR production utilizing a massively parallel multiprocessor system, will be illustrated. The Sensors Directorate GOTCHA Radar Exploitation Program (GREP) will be introduced.
CSRI/90, 3:00 PM - 3:30 PM: Larry Bergman, JPL, Working Group Goals and Instructions.
CSRI/90, 3:30 PM - 4:00 PM: Break
Various rooms, CSRI building, 3:30 PM - 6:00 PM: Working Groups (see below)
Hotel Albuquerque, Potter's Room, 6:30 PM: Conference Banquet (Chicken Romano: Herb and Romano Crusted Breast of Chicken With Spicy Diavolo Sauce, Parmesan Mashed Potatoes, Fresh Seasonal Vegetable; contact organizer for vegetarian option).

Erik DeBenedictis, Sandia, The End of CMOS Scaling will be Good for Space Computing.

Thursday May 29, 2008: TECHNOLOGY IN DEPTH AND IMPLEMENTING THE VISION

CSRI/95, 8:00 AM - 8:15 AM: Bagels and breakfast sandwiches
CSRI/90, 8:15 AM - 8:30 AM: [Government Keynote and NASA #1] Jack Stocky, NASA JPL, On-Board Science Data Processing for NASA Science Missions. Improved, on-board processing capability can significantly improve the science return from NASA's robotic missions. Hardware and software fault tolerance is necessary to permit the processing capability of COTS hardware to be available for on-board science data processing. The ST 8 Dependable Multiprocessor Validation Experiment is one of NASA's first steps toward realizing this capability.
CSRI/90, 8:30 AM - 9:00 AM: [NASA #2] Kirk Reinholtz, JPL, Managing Complexity in Next Generation Robotic Spacecraft. Future spacecraft avionics hardware will draw on ever larger logic chips and processors that will make it progressively harder for engineers to understand and validate their designs, and likewise, harder to write sound software, resulting in greater system cost and/or risk. This talk describes how complexity fundamentally limits the capabilities of the spacecraft we deliver, and presents some hardware do's and don'ts that will improve our ability to write software that fully utilizes the power of advanced hardware capabilities.
CSRI/90, 9:00 AM - 9:30 AM: [NASA #3] Andrew Keys, NASA MSFC, Michael Brieden, NASA JSC, and Joe Coughlan, NASA ARC, Processors and Avionics for Support of Human Rated Missions (second part of presentation). The development of a manned flight system requires a different set of considerations than those made during the development of a traditional unmanned space flight system. NASA's Constellation Program is now in the process of developing the next generation of manned spacecraft designed for missions to low Earth orbit, to the surface of the Moon, and eventually to Mars. These spacecraft will apply advanced avionics hardware, software and architecture design to meet the stringent requirements of manned spaceflight. In this talk, component, software, and distributed systems issues will be addressed in the context of high reliability, redundancy, fault protection, environmental, complex distributed systems integration, software reliability, and upgradeability over a period of decades as newer technology evolves and is introduced into the system.
CSRI/90, 9:30 AM - 10:00 AM: [NASA #4] Kirk Reinholtz (for Lorraine Fesq), JPL, Preliminary Findings from NASA SMD/PMD Planetary Spacecraft Fault Management Workshop. This talk will present preliminary findings of the "NASA Science Mission Directorate, Planetary Science Division's Planetary Spacecraft Fault Management Workshop," which was held on April 14-16, 2008. The Workshop provided a forum for the aerospace community to expose interesting Fault Management case studies from recent missions, to discuss and characterize issues plaguing this discipline, and to view future directions via posters and invited speakers from academia.
CSRI/95, 10:00 AM - 10:30 AM: Break
CSRI/90, 10:30 AM - 11:00 AM: [NASA #5] G. Cieslewski, A. Jacobs, C. Conger, and A. George, UFL, Advanced Space Computing with System-Level Fault Tolerance. Increasing demand for high-performance computing in space, coupled with limitations of conventional device-level methods for SEU mitigation, is driving innovations in advanced space computing with system-level fault tolerance supporting COTS technologies. This presentation will highlight research activities at the University of Florida from two on-going projects on this path, the NASA Dependable Multiprocessor (NMP ST-8) and the reconfigurable fault tolerance (RFT) framework.
CSRI/90, 11:00 AM - 11:30 AM: [RHBD #2] Jeff Draper, ISI, Radiation Effects Challenges in 90nm Commercial-Density SRAMs. Recent experiments conducted at USC as part of DARPA's Radiation-Hardening-by-Design (RHBD) program show that RHBD techniques can be applied with promising results for sub-100nm technologies. SEE and TID results for two 90nm prototype SRAM devices demonstrate the feasibility of such an approach.
CSRI/90, 11:30 AM - 12:00 noon: [RHBD #3] Michael Malone, Draper Laboratory, On-Board Processing Expandable Reconfigurable Architecture (OPERA) Program Overview. By purchasing commercial state-of-the-art multicore parallel processor intellectual property and converting it to use radiation hardened by design libraries, the government plans to provide a radiation hardened 90 nm CMOS, 49 core, 70 GOPS, 10 Gbps throughput, TRL 6 general purpose processor with complete intellectual property to the space computing industry by the end of 2010. This presentation will discuss the OPERA hardware architecture and intellectual property rollout plan details.
CSRI/90, 12:00 noon - 12:30 PM: [RHBD #4] Steve Crago, ISI-East, OPERA Software. The OPERA software roadmap includes tools that supplement the commercial software chain with tools and libraries important to government customers. Libraries include MPI and VSIPL. The roadmap includes tools that will provide a base for fault tolerant software, including run-time monitor and dynamic resource management software.
CSRI/95, 12:30 PM - 1:30 PM: Lunch
CSRI/90, 1:30 PM - 2:00 PM: [Industry #2] Ray McConnell, ClearSpeed, Heterogeneous Computing techniques applied to high performance processing and interconnect fabric for SIMD devices. A discussion of the unique heterogeneous computing techniques on ClearSpeed Technology's high performance MTAP and ClearConnect™ interconnect fabric silicon IP.
CSRI/90, 2:00 PM - 2:30 PM: [Industry #3] Richard Berger, BAE, Spaceborne Computing Directions Based on New and Emerging Technologies (presentation proprietary). The range of trades that need to be considered for spaceborne operation and methods of radiation hardening will be discussed. BAE Systems' current and upcoming range of products, and how they come together in an example spacecraft architecture, will be presented. Some of the emerging technology areas in which BAE Systems is working at the R&D level will also be addressed.
CSRI/90, 2:30 PM - 3:00 PM: [Industry #4] Joe Fabula, Xilinx, Current Status of the SIRF Program. Xilinx details TID measurements made on standard transistors fabricated in current commercial 90 nm and 65 nm CMOS foundry processes. The performance of some deep sub-micron processes under total ionizing dose exposure was found to be surprisingly robust.
CSRI/90, 3:00 PM - 3:30 PM: [Industry #5] Dan Elftmann, Achronix, Achronix Reconfigurable High Speed Radiation Hardened FPGA Technology (presentation proprietary). Achronix will discuss the fundamental mechanism of data communication in their picoPIPE based FPGA logic fabric. From here the discussion will move to the Single Event Effects (SEE) mitigation strategy and initial test chip results. This SEE mitigation methodology will be leveraged to develop a High Speed Radiation Hardened FPGA Technology on the BAE Systems RH15 150 nm process technology with a hardened configuration memory based on the BAE Systems hardened SRAM cell.
CSRI/90, 3:30 PM - 4:00 PM: Break
CSRI/90, 4:00 PM - 4:30 PM: [Government session] Jeff Kalb, Sandia, NNSA/NA-22 Space Architecture (Government, FFRDC, and SETA participants only, but no prearrangements needed)
4:30 PM - 5:00 PM: [Government session] Transportation to Government session (prearranged participants only)
TBD, 5:00 PM - 7:30 PM: [Government session] (prearranged participants only)
6:30 PM: Dinner on your own

Friday May 30, 2008: PROPRIETARY PRESENTATIONS AND WORKING GROUPS

CSRI/95, 8:00 AM - 8:30 AM: Continental breakfast
CSRI, 8:30 AM: Meet at CSRI Building
CSRI/90, 8:30 AM - 9:00 AM: Mitch Fletcher, Honeywell, Lessons Learned and Ideal Architecture.
CSRI/90, 9:00 AM - 9:30 AM: [Industry #6] Ian Troxel, SEAKR, Achieving Fault-Tolerant Spaceborne Computing with Commercial Components. Commercial off-the-shelf (COTS) components have become a means by which cost-effective processing systems are being deployed in spacecraft to meet ever-challenging mission requirements. However, a difficult aspect of infusing COTS devices into aerospace systems is overcoming faults brought about by the harsh radiation environment in which they operate. This presentation will provide an overview of the methods used to achieve the required level of fault tolerance in numerous COTS-based data recorder and onboard processing missions developed at SEAKR.
CSRI/90, 9:30 AM - 10:00 AM: Working Groups
CSRI/90, 10:00 AM - 10:30 AM: Break
CSRI/90, 10:30 AM - 11:00 AM: Working Groups
CSRI/90, 11:00 AM - 12:30 PM: Working Groups
CSRI/95, 12:30 PM - 1:30 PM: Lunch
CSRI/90, 1:30 PM - 2:00 PM: Final meeting

WORKING GROUP ON HARDWARE AND ARCHITECTURE

The organizers suggest that this group focus on new components emerging in the commercial sector and available for space as well as architectures composed of these components.

A key objective will be to establish relationships between participants in different organizations (Government/commercial) that could increase efficiency and flexibility and decrease redundancy in the future. The participants should have preliminary results for the Government session Thursday afternoon and should be prepared to produce an outbrief on Friday.

Specific issues for discussion

  • Make an inventory of emerging-technology rad hard components, including processors, memories, interconnect, storage, sensors as applicable.
  • Develop scenarios for architectures composed of interoperable components.
  • Discuss a roadmap setting expectations of which components and how much performance would be available in space in the future. Discuss whether a separate activity to create such a roadmap is warranted.
  • List gaps or action items: additional technology that would increase flexibility if available in rad hard form.

WORKING GROUP ON SOFTWARE

The organizers suggest that this group focus on software technologies that could become suitable for making effective software for space computing systems of the sort discussed in the hardware group. This software can include commercial tools repurposed for space as well as space-specific tools.

A key objective will be to establish relationships between participants in different organizations (Government/commercial) that could increase efficiency and flexibility and decrease redundancy in the future. The participants should have preliminary results for the Government session Thursday afternoon and should be prepared to produce an outbrief on Friday.

Specific issues for discussion

  • Make an inventory of applicable tools for programming emerging components and systems made of multiple such components.
  • Make an inventory of applicable operating systems/middleware.
  • Make an inventory of testing tools, such as those for unit testing and V&V.
  • Develop scenarios for software development environments.
  • List gaps or action items: additional software that would increase flexibility if available.

Document date: May 30, 2008.