You're suggesting NASA is more careless with its failures than back then.
Closer, but not quite.
Every NASA mission, regardless of budget, carries a distinct chance of failure. Even a Lexus can break down. At a certain point it stops being a matter of funding and effort and starts looking like the inherent properties of any complex system.
You don't put all your eggs in one basket because if you drop the single basket you lose all your eggs. Goldin's idea was to put NASA's eggs into several baskets so that the loss of any one basket wouldn't disastrously affect the egg supply.
"Careless" isn't a word I would use for this. It implies that NASA is not interested in mission success. That's not precisely the idea. NASA accepts a greater probability of failure for each individual mission and compensates for it by placing less responsibility on each mission for the success of the entire program.
If you plot programmed reliability on the vertical axis and budget on the horizontal, the curve looks something like an inverse proportion, similar to 1/x. (Reliability here is expressed as probability of failure; low numbers are desired.) The point is that at the high end of the budget scale, huge additional expenditures buy only small increases in reliability.
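To make the diminishing returns concrete, here is a minimal sketch that takes the 1/x shape literally. The function names, the constant k, and the budget units are all made up for illustration; nothing here is real NASA cost data.

```python
def failure_probability(budget, k=1.0):
    """Toy 1/x model (an assumption): failure probability is k / budget, capped at 1."""
    return min(1.0, k / budget)


def marginal_gain(budget, step, k=1.0):
    """Drop in failure probability bought by spending `step` more at a given budget."""
    return failure_probability(budget, k) - failure_probability(budget + step, k)


# The same extra spend buys far less reliability at the expensive end of the curve.
gain_low_budget = marginal_gain(20, 10)    # starting from p = 0.05
gain_high_budget = marginal_gain(100, 10)  # starting from p = 0.01
print(gain_low_budget, gain_high_budget)
```

In this toy model, ten extra budget units are worth much more at a budget of 20 than at a budget of 100, which is the whole argument for backing away from the expensive end.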
The goal is to move the budget line back toward zero and find the ideal cost-benefit breakpoint. Relaxing the accepted probability of failure from, say, p < 0.01 to p < 0.05 may reduce cost by half an order of magnitude. If by doing that you increase the number of possible missions from, say, three to twelve, the overall reliability of the program (encompassing all missions) is increased.
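Here is a rough way to check that arithmetic, using the example figures above. It assumes every mission fails independently with exactly the stated probability, and the measures of "program reliability" used here (expected successes, chance of getting at least a few good missions, chance of losing everything) are my own choices for illustration.

```python
from math import comb


def expected_successes(n_missions, p_fail):
    """Average number of successful missions, assuming independent failures."""
    return n_missions * (1.0 - p_fail)


def prob_at_least(k, n_missions, p_fail):
    """Binomial probability that at least k of n independent missions succeed."""
    p_ok = 1.0 - p_fail
    return sum(
        comb(n_missions, i) * p_ok**i * p_fail ** (n_missions - i)
        for i in range(k, n_missions + 1)
    )


def prob_total_loss(n_missions, p_fail):
    """Probability that every mission in the program fails."""
    return p_fail**n_missions


few_expensive = (3, 0.01)   # three missions at the upper bound p = 0.01
many_cheap = (12, 0.05)     # twelve missions at the upper bound p = 0.05

for label, (n, p) in (("few/expensive", few_expensive), ("many/cheap", many_cheap)):
    print(label, expected_successes(n, p), prob_at_least(3, n, p), prob_total_loss(n, p))
```

Under those assumptions the twelve-mission program expects more than eleven successes against fewer than three, is more likely to return at least three good missions, and is far less likely to lose everything.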
(There are qualitative procedures for scaling back costs too, but I don't want to bog this down.)
In engineering-speak, this approach decouples the system. The failure of any one component (mission) is limited in how it can affect the system (exploration program). This is a desirable circumstance.
So where did it go wrong?
First, as I already explained, Goldin was not able to communicate this reasoning to Congress and to the public. I'm sure he did the best he could, but some people just never get it. That's not necessarily Goldin's fault.
Second, moving the cost line on the graph until a suitable reliability is obtained doesn't translate well into the procedures of an aerospace corporation. They achieve reliability in their product by following procedures and employing processes they've spent years or decades developing. They've found a way to consistently produce good results, and that comes at a predictable cost.
When you tell them they have to meet the same goals with an order of magnitude less money and half the time, they have to come up with new processes. They can't usually just scale them down. If a design process takes ten engineers one month to do, you can't just assign one engineer to it and expect an answer in two weeks. You have to invent a new process that can produce a usable result in two man-weeks.
It can be done, but it cannot be done painlessly. Goldin did not anticipate this, either. He didn't fully realize that if you force the industry to reinvent itself, it will have to go through all those Apollo-era growing pains again. It was hard for smart people to do it back then, therefore it will be hard for smart people to do it now. That means the reliability curve flattens and his carefully established global reliability estimates are no longer valid. The probability of failure increases for each individual mission, and the probability of overall program success decreases.
Finally you have to consider the linearity of the system. This is another way of organizing systems so that component failure is contained and manageable. Mission planners didn't fully account for this when implementing Goldin's program, so there are nonlinear elements in the Mars exploration program, such as the shared communication system.
Unfortunately, here's where you run into a catch-22. You can only linearize the system by enhancing each component, and that's contrary to the component design philosophy of "better, faster, cheaper". So again you have to find a happy medium, and again that requires another iteration through the design paradigms, which historically take years to complete a cycle.
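To see why a shared element like the communication system undermines the many-baskets idea, here is a small sketch. It assumes the missions otherwise fail independently and that losing the shared link loses every mission that depends on it; the value of q_shared is invented purely for illustration.

```python
def prob_total_loss_independent(n_missions, p_fail):
    """Every mission fails on its own, with nothing shared between them."""
    return p_fail**n_missions


def prob_total_loss_shared(n_missions, p_fail, q_shared):
    """Total loss when all missions depend on one shared element.

    Either the shared element fails (probability q_shared, an assumed figure),
    or it survives and every mission fails independently.
    """
    return q_shared + (1.0 - q_shared) * p_fail**n_missions


n, p, q_shared = 12, 0.05, 0.02  # q_shared is a hypothetical value
print(prob_total_loss_independent(n, p))
print(prob_total_loss_shared(n, p, q_shared))
```

However many cheap missions you add, the chance of losing the whole program can never drop below the failure probability of the shared element, so that one component ends up needing the expensive treatment.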
It's not a matter of being careless. It's a matter of how you distribute your capacity to care. You concentrate your efforts where they do the most good. The only hard part is trying to figure out where that is. We're witnessing NASA undergo that process of discovery, and unfortunately the fickle and impatient public isn't cutting them much slack.