| Why Dependency Modeling and Testability Analysis?
During my years of working on the Space Shuttle program in the 1970s, a new
buzzword came into being. “Integrated Diagnostics” became a term that my
colleagues understood in widely different ways. To
some it meant the art of diagnosing a problem and isolating it. To others it
meant tracking failures within a system in order to trend them and predict
future failures. Still others understood it to be a method of measuring a
system to determine how diagnosable it is. While all of these belong in a
definition of integrated diagnostics, my experience tells me that the hardest
of them is measuring a system to determine its real-world diagnostic
capability. When I say real world, I mean for both operational and maintenance
actions. After 25 years of electronic design, my world changed from creating
systems and subsystems to diagnosing them. Why? Because I found diagnosis to be
a much greater challenge than design, and my experience in the design world had
left me with a good intuitive understanding of faults and fault isolation.
Two notable Shuttle experiences reinforce my opinion
that diagnosis is the real challenge. When the Shuttle Challenger blew up after
launch, the problem that caused it was a gigantic failure in diagnosis.
Diagnosis of a failure has two major components, “Detection” and “Isolation”.
The bottom line is that you cannot correct or remediate a problem you do not
know exists. A simple sensor placed at the right location would have detected
the “O”-ring problem in plenty of time to allow a return-to-launch-site
maneuver, saving the crew’s lives. Yet the sensor did not exist because no one
had considered the possibility of an “O”-ring failure in the manner it
occurred. It was a diagnostic analysis failure. It was the same with Columbia.
No method had been employed to monitor the insulation that came off the
external tank because it had never been considered a threat to the shuttle. As
a result, in both of these catastrophes critical faults went undetected.
In the years since the Challenger
accident I have seen a tremendous increase in the use of the Failure Modes,
Effects and Criticality Analysis (FMECA). It is now a primary tool in system
design. While the FMECA existed well before the shuttle, its usage in those
days was largely that of an after-the-fact logistics system repair tool as
opposed to a design tool. But analysis of an effect, even a critical one, is no
solution if that effect goes undetected. This is why I believe testability
analysis driven by dependency modeling is so important. There is no way that I
know of to consider all possibilities of critical effects.
If you look at the two failures I have described above,
the FMECAs created for the system did not even include these failure effects.
That was primarily because there was no associated hardware. Current FMECA
practice works by examining each existing hardware component for its failure
modes and assessing the effects of those failures.
Here is one of the flaws in the system: if
system hardware does not exist in a design, then as far as the analysis is
concerned neither does the failure mechanism, even though that mechanism can be
critical to the failure of the system.
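As a rough illustration of the flaw described above, the following Python
sketch builds FMECA rows the way a purely component-driven analysis would, by
walking a hardware list. The component names, failure modes, and external
hazards are all hypothetical placeholders chosen for illustration, not data
from any real program, and the helper functions are assumptions of this
sketch.

```python
# Hypothetical hardware list: component -> known failure modes.
HARDWARE_FAILURE_MODES = {
    "pump": ["seized bearing", "seal leak"],
    "controller": ["stuck output", "loss of power"],
}

# Hypothetical hazards that originate outside the defined hardware.
EXTERNAL_HAZARDS = ["lightning strike", "debris impact"]


def build_fmeca_rows(hardware):
    """Enumerate (component, failure mode) pairs from the hardware list only."""
    rows = []
    for component, modes in hardware.items():
        for mode in modes:
            rows.append((component, mode))
    return rows


def hazards_missing_from(rows, hazards):
    """Return the hazards that never appear as a failure mode in the rows."""
    covered = {mode for _, mode in rows}
    return [hazard for hazard in hazards if hazard not in covered]


rows = build_fmeca_rows(HARDWARE_FAILURE_MODES)
missing = hazards_missing_from(rows, EXTERNAL_HAZARDS)
print(len(rows), "rows generated; hazards with no row:", missing)
```

Because the rows are generated only from items in the hardware list, a hazard
with no hardware item to attach to never receives a row, however critical its
effect might be.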
Future FMECA analysis needs to go beyond the existing
hardware and consider failures that are not driven by system hardware. We have
scratched the surface in this area in the past, to some extent, by analysis of
cooling, weather, lightning strikes and radiation, but our efforts still fall
far too short. One of the key tools to take us beyond system hardware failures
is testability analysis. My experience has taught me that I can always find a
way to isolate a problem if I know it exists. But I cannot always find the
basic failures that cause problems. The standard faults such as those of the
electronic components are well understood and seldom contribute to a
catastrophic failure. We have learned how to use redundancy where needed and
our detection and remediation in these areas is generally good. What we have
not learned is how to manage our analysis to include all the influences outside
the defined system. The FMECA still appears to be the best analysis tool to
measure critical effects on a system but if it does not include a means to
identify exactly how that effect will be detected then we will still fall
short. In the past it has been up to the design engineer to determine how to
test for and detect a failure. The problem is that the engineer works with a
failure set that is too limited and does not include all possible failures, and
also does not always fully understand failure propagation.
The answer to this lies in dependency modeling.
Testability analysis driven by dependency modeling provides us with a very good
analysis of the propagation of failures within a system and leads us to
consider “what ifs” in a better and more logical manner. It is essential to
combine the testability analysis with the FMECA, so that how a failure
mechanism is detected is not a guess but is driven by a dependency model and
its real propagation characteristics. It also helps the analyst to better
concentrate on the critical areas of the design and eliminates analysis of the
non-critical areas. As an engineer I have always believed in using all the
tools I can put my hands on.
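The idea of letting a dependency model, rather than a guess, decide how each
failure is detected can be sketched in a few lines of Python. This is only a
minimal illustration under stated assumptions: the items, the feed
relationships, and the test points are hypothetical, and the functions below
are not the algorithms of any particular commercial tool.

```python
from collections import defaultdict

# Hypothetical model: each key feeds the items listed as its value,
# so a failure at the key propagates to everything downstream of it.
FEEDS = {
    "power_supply": ["controller", "sensor"],
    "sensor": ["controller"],
    "controller": ["actuator"],
    "actuator": [],
    "heater": [],
}

# Hypothetical test points: test name -> item whose output it observes.
TESTS = {"t_controller_out": "controller", "t_actuator_pos": "actuator"}


def affected_by(source, feeds):
    """Return every item a failure at `source` can reach, including itself."""
    seen, stack = set(), [source]
    while stack:
        item = stack.pop()
        if item not in seen:
            seen.add(item)
            stack.extend(feeds.get(item, []))
    return seen


def signatures(feeds, tests):
    """Map each failure source to the sorted tuple of tests it would fail."""
    result = {}
    for source in feeds:
        reach = affected_by(source, feeds)
        result[source] = tuple(sorted(t for t, obs in tests.items() if obs in reach))
    return result


def undetected(sigs):
    """Failure sources that no test can see (detection gaps)."""
    return sorted(source for source, sig in sigs.items() if not sig)


def ambiguity_groups(sigs):
    """Group detected failure sources that share the same test signature."""
    groups = defaultdict(list)
    for source, sig in sigs.items():
        if sig:
            groups[sig].append(source)
    return sorted(sorted(group) for group in groups.values())


sigs = signatures(FEEDS, TESTS)
print("Undetected:", undetected(sigs))
print("Ambiguity groups:", ambiguity_groups(sigs))
```

In this toy model the heater feeds nothing that a test observes, so its failure
goes undetected. The power supply, sensor, and controller are all detected but
fail the same two tests, so they cannot be isolated from one another without an
additional test point.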
A dependency-modeling tool is one I do not want to do
without. The tool I use today is the DSI eXpress tool. It provides a good
graphical representation of a design, along with good diagnostic metrics that
characterize the design in terms of its failures, their propagation within the
design, and the detection method for these failures under the designer’s test
strategy. It also quickly generates a customer-customized FMECA in Excel
format that ties the failures of the design to the detection of these failures.
We will still never be able to foresee all the failures that will bring down
our future systems, but maybe we can eliminate some of them through better
analysis techniques. Testability analysis driven by dependency modeling is
definitely one of these techniques, and the one I choose.
|