Software development is much different today than it was at the beginning of the Space Shuttle era because of the tools that we have at our disposal, Darrel Raines mentioned in his talk about embedded software development for the Space Shuttle and the Orion MPCV at NDC Tech Town. But the art and practice of software engineering have not progressed that much since the early days of software development, he added.
Compilers are much better and faster, and debuggers are now integrated into our development tools, making the task of error detection much easier, as Raines explained:
There are now dedicated analysis tools that allow us to detect certain types of issues. Examples are static code analyzers and unit test frameworks. We have configuration management systems like “git” to make our day to day work much easier.
Raines argued that many things are the same today as they were when they started writing software for the Space Shuttle. One of the best ways to detect software problems is still with a thorough code review performed by experienced software engineers, he said. Many defects will remain latent in the developed code until we hit just the right combination of factors that allow the defect to show itself. It is imperative to use all the different testing methods available to us to find bugs and defects before we fly, he added.
Raines mentioned one important way in which their software differs from most other embedded software:
We cannot easily debug and fix software that is deployed in space! We continually remind ourselves that any testing and debugging that we do on the ground could potentially save a crew when we get to space.
He mentioned that software developers engage with astronauts at many levels during their work. They discuss requirements with astronauts, and talk about how much of a workload they want and how much they can handle. This evaluation allows them to decide on the level of autonomy that the software will have, as Raines explained:
We spend time thinking about how astronauts would recover from various faults. We determine how the harsh environment of space may affect our software in ways that we don’t even have to think about with ground computers.
The hardware used for the major programs is very often generations behind what we have on our phones and on our home computers, Raines said. The software has to be very efficient because they continually struggle with the CPU being saturated. They also run into problems with the onboard networks running out of bandwidth.
C/C++ is the most commonly used language because of its efficiency. Modern compilers help make C code relatively easy to write and debug, Raines said. Since C has been around for a long time, it is well understood and highly optimized on most platforms. There are also spacecraft that have used Fortran (Space Shuttle flight computers) and Ada (Space Station onboard computers).
The choice of language is a major factor in how the code is developed and tested. C/C++ will allow you to do “dangerous” things within the code, as Raines explained:
Null pointers are a constant worry since we have to use them sometimes instead of references.
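To illustrate the concern, here is a minimal C++ sketch of the difference between a reference parameter and a pointer parameter. It is not taken from the talk; the `SensorReading` type and both function names are hypothetical and exist only for this example.

```cpp
// Hypothetical sensor record; the type and field names are illustrative only.
struct SensorReading {
    double value;
    bool valid;
};

// Reference parameter: the caller must supply an object,
// so the function does not need a null check.
double scale_by_reference(const SensorReading& reading, double scale) {
    return reading.value * scale;
}

// Pointer parameter: "no reading available" is a possible input,
// so the pointer is checked before it is dereferenced.
// Returns false and leaves `out` untouched when there is nothing usable.
bool try_scale_by_pointer(const SensorReading* reading, double scale, double& out) {
    if (reading == nullptr || !reading->valid) {
        return false;
    }
    out = reading->value * scale;
    return true;
}
```

The pointer version pushes the "is there a value?" question into the return code, which the caller then has to check. That is one common defensive style, not necessarily the one used on the programs Raines described.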
The most noticeable impact on development is that they need to perform multiple levels of testing on their code, Raines said. They start with unit tests, followed by unit integration tests, then full integration testing, and finally formal verification tests. Each level of testing tends to find different kinds of defects in the software, Raines mentioned.
The impact of failed code can sometimes be a loss of crew or a loss of mission, Raines said. This weighs heavily on their decisions about how much testing to do and how stringently to perform those tests, he concluded.
InfoQ interviewed Darrel Raines about software development at NASA.
InfoQ: How have changes in the way software is being developed impacted the work?
Darrel Raines: All of the tools that are available these days make it much easier to concentrate on the important task of making the code work the way we intend it to work.
The adage that the “more things change, the more they stay the same” is an important concept in my job. I am always willing to try new technology as a way of advancing my ability to develop software. But I remain skeptical that the “next big thing” will really make a big difference in my work.
What usually happens is that we make gradual changes over the years that improve our ability to do our work, but we remain consistent with the principles and techniques that have worked for us in the past.
InfoQ: What makes spacecraft software special?
Raines: One example I use with my team is this: if my computer locks up on my desktop, I can just reset the computer and start again. If we lose a computer due to a radiation upset in space, we may not be able to reestablish our current state unless we plan to have that information stored in non-volatile memory. It is a very different environment.
The astronauts, as educated and trained as they are, cannot debug our software during a flight. So we have to be as close to perfect as we can prior to launching the vehicle.
It may mean the difference between a crew coming home and losing them. This difference is what makes spacecraft software special. This is what makes it challenging.
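Raines's remark about keeping the current state in non-volatile memory can be sketched as a checkpoint that carries a checksum, so that a restarted computer can tell a good saved state from a corrupted one. The sketch below is an assumption-laden illustration, not flight code: `VehicleState`, `Checkpoint`, and the three functions are hypothetical names, the non-volatile slot is modeled as an ordinary struct in memory, and the rotate-and-XOR checksum is a stand-in for whatever integrity check a real system would use.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical state worth preserving across a reset; fields are illustrative.
struct VehicleState {
    std::uint32_t mission_phase;
    std::uint32_t sequence_step;
};

// What would be written to non-volatile memory: the state plus a checksum.
struct Checkpoint {
    VehicleState state;
    std::uint32_t checksum;
};

// Simple rotate-and-XOR checksum over the raw bytes of the state.
// Chosen for brevity only; a real system would likely use a CRC or stronger.
std::uint32_t checksum_of(const VehicleState& s) {
    unsigned char bytes[sizeof(VehicleState)];
    std::memcpy(bytes, &s, sizeof(VehicleState));
    std::uint32_t sum = 0x5A5A5A5Au;  // non-zero seed so an all-zero slot is rejected
    for (std::size_t i = 0; i < sizeof(VehicleState); ++i) {
        sum = ((sum << 1) | (sum >> 31)) ^ bytes[i];
    }
    return sum;
}

// Store the state together with its checksum in the (simulated) slot.
void save_checkpoint(Checkpoint& slot, const VehicleState& s) {
    slot.state = s;
    slot.checksum = checksum_of(s);
}

// Restore only if the stored checksum matches; otherwise leave `out` untouched.
bool restore_checkpoint(const Checkpoint& slot, VehicleState& out) {
    if (checksum_of(slot.state) != slot.checksum) {
        return false;  // corrupted or never written
    }
    out = slot.state;
    return true;
}
```

After a reset, the software would call `restore_checkpoint` first and fall back to a safe default state when it returns false, for example because a radiation upset flipped a bit in the stored data.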