By Daming Lin and Murray Wiseman
Optimal Maintenance Decisions (OMDEC) Inc.
Extracted from Chapter 11 of “Reliability-centered Knowledge”
Depending on the physics governing a given application, we learned in Chapter 7 (page 95) that we may choose from a variety of algorithms with which to carry out the signal processing portion of CBM. Decision making (the third CBM sub-process) proceeds similarly, using one or more of a diverse array of decision support tools. In Chapter 10 (Example 1, Creating a decision model, page 127) we developed a CBM decision policy using statistical modeling techniques and software. A decision policy assists maintenance personnel in interpreting and acting upon a set of condition monitoring (CM) data. Extensive human knowledge and experience may be available with which to build a CBM decision policy. A rule-based expert system encapsulates known relationships between CM data and the deterioration in an asset that takes place due to one or more failure modes. An algorithm (known as an inference engine) applies the knowledge base to the current set of CM data. In this chapter we describe an expert system developed by DLI Engineering[1] called ExpertALERT™.

Figure 11‑1 CBM signal processing and decision making using an expert system
Figure 11‑1 outlines the signal processing and decision making portions of this CBM approach. It traces the flow of information through the signal processing steps (steps 1-5) and the decision making procedure (step 6) that uses a rule-based expert system.
Each machine to be monitored is set up with permanent test points[2] positioned strategically (Figure 11‑2) in relation to the components of interest. The equipment is monitored using ExpertALERT™ over a period of time, thereby establishing a baseline spectrum for each test point[3] and each orientation. The software updates the baseline spectra automatically and sets them at the average plus one standard deviation.

Figure 11‑2 An example of test point locations showing the three axes: Axial, Radial, and Tangential
The six steps of Figure 11‑1 are described in the following sections.
We wish to scale the abscissa of the spectrum in multiples (orders) of the forcing frequency.[4] If the shaft speed is known (from a tachometer signal), the algorithm accomplishes this directly. If it is not known, the algorithm chooses a strong peak in a window around the nominal speed, or around a number of nominal speeds (in the case of a variable speed drive), and then matches peaks, harmonics and sidebands in order to determine the correct speed for normalizing the spectrum.
The normalization procedure also converts vibration amplitudes to a logarithmic scale in units of VdB. This assists in the visualization of significant yet low-energy peaks alongside the dominant peaks due to the fundamental forcing frequency. The VdB scale also simplifies the interpretation of changes in vibration levels, because a given change in VdB always corresponds to the same ratio of amplitudes.
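The logarithmic conversion can be sketched in a few lines. This is a minimal illustration, not ExpertALERT's code: the function names are hypothetical, and the reference velocity `V_REF` is an assumed value that should be replaced by whatever reference your VdB convention uses. The sketch shows that a fixed VdB difference corresponds to a fixed amplitude ratio (about 2x for 6 VdB, exactly 10x for 20 VdB).

```python
import math

# Assumed reference velocity in m/s; substitute the reference of your own VdB convention.
V_REF = 1e-8

def to_vdb(velocity, reference=V_REF):
    """Convert a velocity amplitude to the logarithmic VdB scale."""
    return 20.0 * math.log10(velocity / reference)

def amplitude_ratio(delta_vdb):
    """Amplitude ratio implied by a change of delta_vdb on the VdB scale."""
    return 10.0 ** (delta_vdb / 20.0)

ratio_6 = amplitude_ratio(6.0)    # roughly a doubling of amplitude
ratio_20 = amplitude_ratio(20.0)  # a tenfold increase in amplitude
```

Because the ratio does not depend on the reference, trends expressed as VdB differences read the same way on any machine.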
Next, automated spectral peak extraction and a noise floor calculation are performed. The resulting data populate a “screening matrix”. The columns of the screening matrix represent 10 preselected orders of shaft rate (for example 1x, 2x, … 10x), the two highest non-synchronous peaks in each of a low and a high range spectrum, and a noise floor[5] value.
As an example, let us assume an equipment item has two test points. Then the screening matrix will have (10 orders + 2 peaks x 2 ranges) x 3 orientations x 2 test points + 1 noise floor = 85 columns. One row of the screening matrix will hold the changes in amplitude from the previous inspection. A second row will hold the deviations from the baseline spectrum. A third row will hold the corresponding vibration amplitudes. Hence, in this instance, 85 columns x 3 rows = 255 extracted features will have been placed into the screening matrix, ready for further processing.
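The column count above can be checked with a short calculation. The function and parameter names below are hypothetical; the default values are the ones quoted in the text (10 orders, 2 non-synchronous peaks in each of 2 ranges, 3 orientations, 1 noise floor value, 3 rows).

```python
def screening_matrix_columns(test_points, orders=10, nonsync_peaks=2,
                             ranges=2, orientations=3, noise_floor_values=1):
    """Number of columns in the screening matrix for one equipment item."""
    per_orientation = orders + nonsync_peaks * ranges
    return per_orientation * orientations * test_points + noise_floor_values

def screening_matrix_features(test_points, rows=3):
    """Total extracted features: columns times the three rows described in the text."""
    return screening_matrix_columns(test_points) * rows

columns = screening_matrix_columns(test_points=2)    # 85
features = screening_matrix_features(test_points=2)  # 255
```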
The noise floor calculation measures any general increase in random
noise. Both impacts and random noise in a time waveform cause the spectrum to
become elevated. As bearings wear, they typically produce larger quantities of
non-periodic vibration and impacts. This raises the noise floor of the
spectrum. The automated diagnostic system uses an algorithm to calculate the
level of the noise floor. This value is then compared to a baseline value.
Increases in noise floor level add to the severity (see step 6) of the bearing
wear diagnosis and may even trigger a diagnosis in certain cases when bearing
tones are not evident.
A cepstrum transformation[6] of the FFT spectrum is performed next. A cepstrum (Figure 11‑3) highlights series of spectral peaks that are evenly spaced in the spectrum. These are called harmonics. Harmonics can be synchronous (multiples of shaft speed) or non-synchronous. The algorithm searches the spectrum for non-synchronous harmonics and any sidebands. If found, they are flagged as possible bearing tones, to be processed further in steps 5 and 6.
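To make the idea concrete, here is a minimal NumPy sketch of a power cepstrum applied to a synthetic signal. It is an illustration only, not the ExpertALERT implementation: the function names, the test signal (ten harmonics of 32 Hz), the search window, and the small constant added before the logarithm are all assumptions.

```python
import numpy as np

def power_cepstrum(signal):
    """Power cepstrum: squared magnitude of the inverse FFT of the log power spectrum."""
    spectrum = np.fft.rfft(signal)
    # The small constant (an arbitrary choice) avoids log(0) in empty bins.
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)
    return np.abs(np.fft.irfft(log_power)) ** 2

def dominant_quefrency(cepstrum, lo, hi):
    """Index (in samples) of the largest cepstral peak in the range [lo, hi)."""
    return lo + int(np.argmax(cepstrum[lo:hi]))

# Synthetic signal: ten harmonics of 32 Hz, sampled at 1024 Hz for one second.
fs = 1024
t = np.arange(fs) / fs
x = sum(np.sin(2 * np.pi * 32 * k * t) for k in range(1, 11))

q = dominant_quefrency(power_cepstrum(x), 2, 48)
spacing_hz = fs / q  # spacing of the harmonic family in the spectrum
```

The evenly spaced family of spectral peaks collapses into a single cepstral peak whose position gives the spacing, which is why the cepstrum is convenient for finding harmonic series and sideband families automatically.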
Figure 11‑3 Cepstrum showing peaks with 1x and 3.61x spacings
Figure 11‑4 Spectrum showing the synchronous and non-synchronous harmonics and their 1x spaced sidebands. The abscissa is scaled in “orders” or multiples of the shaft speed.
The physics of each situation dictates the signal processing method selected. Non-synchronous peaks, such as those at 3.61 and 7.22 orders (Figure 11‑4), are candidates for “bearing tones” that signal bearing faults. If, in addition, the non-synchronous peaks display sidebands spaced at orders of the shaft speed, an inner race defect is likely. Figure 11‑5 illustrates the physical explanation for bearing tones and the appearance of sidebands with respect to an inner race spall or crack.
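The screening logic described here can be sketched as follows. This is a simplified illustration under stated assumptions: the function names and the matching tolerance are hypothetical, and the peak list is invented, loosely modeled on the 3.61 and 7.22 order peaks mentioned in the text.

```python
def is_synchronous(order, tol=0.02):
    """True if the peak lies on an integer multiple of shaft speed."""
    return abs(order - round(order)) <= tol

def has_1x_sidebands(peak_orders, carrier, tol=0.02):
    """True if peaks exist one order below and one order above the carrier."""
    def present(target):
        return any(abs(p - target) <= tol for p in peak_orders)
    return present(carrier - 1.0) and present(carrier + 1.0)

def inner_race_candidates(peak_orders, tol=0.02):
    """Non-synchronous peaks that carry sidebands spaced at 1x shaft speed."""
    return [p for p in peak_orders
            if not is_synchronous(p, tol) and has_1x_sidebands(peak_orders, p, tol)]

# Illustrative peak list in orders of shaft speed (not measured data).
peaks = [1.0, 2.0, 2.61, 3.61, 4.61, 6.22, 7.22, 8.22]
candidates = inner_race_candidates(peaks)
```

With this list, only the peaks at 3.61 and 7.22 orders are flagged, since the sidebands themselves lack a matching pair on both sides.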

Figure 11‑5 Physical explanation of non-synchronous peaks and their 1x sidebands related to an inner race spall.
Demodulation (also
called “envelope detection”) is a signal processing technique used by
ExpertALERT to supplement and verify the information drawn from the cepstrum
and spectrum analyses. Demodulation provides an independent confirmation of
bearing defects.
If there is a spall on a bearing race, each time a ball passes it will
impact and “ring” the bearing causing it to resonate at high frequencies. The
resulting vibrations can be demodulated in order to extract the forcing
frequency that is causing the ringing. The forcing frequencies will appear as
peaks in the demodulated spectrum. If they match the bearing tones from the
screening matrix and the cepstrum, they provide further confirmation of a
bearing defect. A distinct advantage of demodulation is that high frequencies
do not travel far in a machine. Thus the demodulation process can localize the
defective bearing. For example, if bearing tones appear at the same frequency in the narrow band spectral data from two different locations on the machine, and the demodulated data has matching peaks at one location (but not the other), you can assume that the location with the matching peaks is the one with the bearing problem. The spectra of Figure 11‑6, Figure 11‑7, Figure 11‑8, and Figure 11‑9 illustrate this point precisely.[7]
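Envelope detection can be sketched with NumPy alone. This is an assumed, simplified version of the technique, not ExpertALERT's implementation: it omits the high-pass or band-pass filtering that a practical demodulator would normally apply first, and the signal (a 500 Hz "ringing" carrier whose amplitude is modulated at 37 Hz) is synthetic.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based Hilbert transform: drop negative frequencies, double positive ones."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    weights[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        weights[n // 2] = 1.0  # the Nyquist bin is kept once for even lengths
    return np.fft.ifft(np.fft.fft(x) * weights)

def envelope_spectrum(x, fs):
    """Spectrum of the amplitude envelope, with the mean (DC) removed."""
    envelope = np.abs(analytic_signal(x))
    envelope = envelope - envelope.mean()
    magnitudes = np.abs(np.fft.rfft(envelope)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, magnitudes

# Synthetic example: resonance at 500 Hz excited at a 37 Hz repetition rate.
fs = 2048
t = np.arange(fs) / fs
x = (1.0 + 0.8 * np.cos(2 * np.pi * 37 * t)) * np.sin(2 * np.pi * 500 * t)

freqs, mags = envelope_spectrum(x, fs)
peak_hz = freqs[np.argmax(mags)]  # recovers the 37 Hz forcing frequency
```

The raw spectrum of `x` shows energy only around 500 Hz, while the envelope spectrum shows the low forcing frequency directly, which is what allows it to be compared with the bearing tones found earlier.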

Figure 11‑6 Spectrum from motor location showing bearing tone peak

Figure 11‑7 Demodulated spectrum from motor location showing matching peak

Figure 11‑8 Spectrum from pump location showing same bearing tone

Figure 11‑9 Demodulated spectrum from pump location, but showing no bearing tones. Hence ExpertALERT can conclude that the bearing defect is on the motor.
The screening matrix is transformed into component specific diagnostic matrices (CSDMs). This transformation extracts values at specific frequencies that characterize possible faults in a given component. It is interesting to note that the techniques of Steps 2, 3, and 4 require no specific knowledge of bearing geometry (e.g. number of rolling elements, inner and outer race diameters, pitch diameter, and so on) for the accurate detection of developing faults. Nevertheless, the CSDM may include specific frequencies based on bearing manufacturing data. Knowledge rules may refer to these frequencies, thus extending diagnostic confidence.
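Where bearing geometry is available, the characteristic frequencies can be computed in orders of shaft speed with the standard kinematic formulas. The sketch below assumes a stationary outer race and a rotating inner race; the function name and the geometry values are hypothetical and are not taken from any manufacturer's data.

```python
import math

def bearing_orders(n_elements, ball_diameter, pitch_diameter, contact_angle_deg=0.0):
    """Characteristic bearing frequencies as multiples (orders) of shaft speed."""
    ratio = (ball_diameter / pitch_diameter) * math.cos(math.radians(contact_angle_deg))
    return {
        "BPFO": 0.5 * n_elements * (1.0 - ratio),  # ball pass frequency, outer race
        "BPFI": 0.5 * n_elements * (1.0 + ratio),  # ball pass frequency, inner race
        "FTF": 0.5 * (1.0 - ratio),                # cage (fundamental train) frequency
        "BSF": 0.5 * (pitch_diameter / ball_diameter) * (1.0 - ratio ** 2),  # ball spin
    }

# Illustrative geometry: 8 rolling elements, 10 mm balls, 50 mm pitch diameter.
orders = bearing_orders(n_elements=8, ball_diameter=10.0, pitch_diameter=50.0)
```

All four values are non-integer orders, which is why bearing tones show up as non-synchronous peaks in an order-normalized spectrum.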
Steps 1 to 5 may be considered the signal processing portion of ExpertALERT. They extract informative features from the raw vibration data upon which the reasoning engine of the expert system may now operate. Step 6 performs the decision making function, interpreting the extracted features and identifying the likely fault. In Step 6 each CSDM is processed through a series of diagnostic templates consisting of rules that pass or fail every fault known to occur in the component. Furthermore, the expert system computes a score based on the feature’s exceedance above the threshold value coded in each rule.[8] The knowledge in the diagnostic templates was developed from an understanding of the physics of the machinery and its causal relationship with the monitored data.
A simple example is the rule for imbalance. This rule checks the matrix elements (of the CSDM) that contain the rotational rate levels and exceedances over baseline. The rule then determines whether these values are high in a radial direction. If so, other checks determine that the problem is not misalignment or looseness. Finally, the algorithm confirms the imbalance diagnosis.
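The shape of such a rule might look like the sketch below. It is a hypothetical illustration of the logic just described, not an actual ExpertALERT rule: the class and function names, the threshold values, and the sample readings are all invented, and the misalignment and looseness checks are represented by a single flag supplied by the caller.

```python
from dataclasses import dataclass

@dataclass
class Reading1x:
    """1x (rotational rate) feature for one orientation of one test point."""
    amplitude_vdb: float
    exceedance: float  # amount above the baseline

def imbalance_rule(radial, axial, other_faults_ruled_out,
                   min_level=110.0, min_exceedance=6.0):
    """Pass only if 1x is high and radially dominant, and other faults are excluded."""
    high_radial = (radial.amplitude_vdb >= min_level
                   and radial.exceedance >= min_exceedance)
    radial_dominant = radial.amplitude_vdb > axial.amplitude_vdb
    return high_radial and radial_dominant and other_faults_ruled_out

# Illustrative values only (thresholds and readings are assumptions).
diagnosed = imbalance_rule(Reading1x(116.0, 9.0), Reading1x(101.0, 3.0),
                           other_faults_ruled_out=True)
```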
Component | Orientation | Amplitude (VdB at 1x) | Exceedance over baseline
Motor | Axial | 105 | 7
Motor | Radial | 118 | 10
Motor | Tangential | 117 | 10
Pump | Axial | 104 | 9
Pump | Radial | 113 | 9
Pump | Tangential | 92 | 2
Figure 11‑10 Vertical pump and 1x vibration readings
As an example, consider the vertical motor and centrifugal pump (with coupling) in Figure 11‑10, looking for simplicity only at the 1x vibration levels. Excessive 1x vibration may indicate motor imbalance, pump imbalance, angular misalignment, foundation horizontal flexibility, a radial or thrust bearing clearance problem, or motor cooling fan blade damage. Expert system rules based on knowledge of the configuration need to deduce the fault and identify the faulty component.
Looking at the axial and radial data at both locations we might surmise
angular misalignment since 1x axial is abnormally high at both motor and pump.
Alternatively, it could be motor imbalance or pump imbalance, since 1x radial
is abnormally high at either end and radial is higher than axial. Axial motion
is, in fact, characteristic (due to rocking) of unbalance in a vertical pump.
Another characteristic of a vertical pump is that one direction, the direction
of external structural support, is always stiffer than the other directions.
The radial axis in this case is the direction of structural flexibility, so
that radially, the pump is being “wagged” by the motor imbalance. The low 1x levels at the pump in the
tangential direction can be explained by the fact that the tangential axis is
the direction of high structural stiffness and therefore the tangential
component of the vibration due to motor imbalance does not transmit to the
pump.
Rules are activated by machinery component type (for example, in the preceding, “vertical motor pump set with coupling”) as defined by the user in the ExpertALERT software. A rule for bearing wear in a compressor will look slightly different from the rule for bearing wear in an AC motor. Each individual machine component type may have numerous rules for bearing wear. If the extracted features satisfy the requirements for a rule, it means the fault condition exists.
After information has been extracted from the spectra as described above
in steps 1 to 5, it is passed through all of the rule templates that apply to
the general machine type to see if any faults exist. The rules are empirically
based on thousands of machine tests collected over more than 20 years and are
constantly refined as new information becomes available. If a rule is edited
for any reason, the change is run through all past diagnoses to ensure that it
does not change any previously correct results.
A typical rule looks something like this in terms of its logic:
Note that these rules are empirically based. That is to say, the rule thresholds for absolute levels or for exceedances over a baseline have been adjusted until they produce the correct answer as determined by a human expert and/or direct field feedback. In other words, the thresholds mentioned in the example rule above have been tuned to give the correct answer for any machine to which this particular rule applies. There are sufficient rule templates for each machine type to catch practically all bearing wear patterns that may exist in the data.
Once a fault has been diagnosed, the user will continue to monitor the
machine and look for changes in severity of the fault. The rate at which the
severity increases gives a good indication of when the bearings should be
overhauled.
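One simple way to quantify the rate of increase is a least-squares line through the severity history. The sketch below is an assumption for illustration, not the method used by ExpertALERT: the function names, the severity values, the action limit of 400, and the choice of a straight-line trend are all hypothetical.

```python
def linear_trend(times, severities):
    """Least-squares slope and intercept of severity versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(severities) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, severities))
    slope = sxy / sxx
    return slope, mean_s - slope * mean_t

def time_to_limit(times, severities, limit):
    """Time at which the fitted line reaches the limit, or None if not increasing."""
    slope, intercept = linear_trend(times, severities)
    if slope <= 0:
        return None
    return (limit - intercept) / slope

# Illustrative history: inspection day and severity score (invented numbers).
days = [0, 30, 60, 90]
scores = [100, 150, 200, 250]
projected_day = time_to_limit(days, scores, limit=400)
```

A straight line is only a rough guide; deterioration often accelerates, so a projection like this tends to be optimistic late in life.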
The amounts by which the values in the CSDM exceed the threshold values (set up in the rules based on experience and knowledge) are scored and converted into a relative severity. This provides a normalized scale with which to judge the state of health of each component. Thus the relative severity for all components in the equipment can be trended on a single graph, as in Figure 11‑11. The graph provides a decision support tool for performing a corrective action on a component whose severity is high or has increased substantially. In the following section, we propose to extend the automated diagnosis one step further to estimate remaining life and provide an optimized repair decision.

Figure 11‑11 Severity graphs for an equipment item with three components
Following step 6, the automated diagnostic tools hand over their findings to the human decision makers. Can we process each diagnostic fault and its respective severity one step further to provide a recommendation:
i. to effect repair immediately, or
ii. to repair within a particular time period from the current time, or
iii. to continue operation until the next inspection?
The severity values computed for each fault, as well as the absolute and relative values of the relevant features, may be used as covariates in a proportional hazard model such as that described in Chapter 10. The next section describes the ABB fault simulator that may be used to demonstrate this proposed extension to ExpertALERT’s output report.
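To show how a severity value could enter such a model, the sketch below evaluates a proportional hazards function with a Weibull baseline. The Weibull form is an assumption made here for illustration (Chapter 10 is not reproduced in this excerpt), and the parameter values, covariate, and function name are hypothetical rather than fitted results.

```python
import math

def weibull_ph_hazard(age, covariates, beta, eta, gammas):
    """Hazard rate: Weibull baseline scaled by exp(sum of gamma_i * z_i)."""
    baseline = (beta / eta) * (age / eta) ** (beta - 1.0)
    risk = math.exp(sum(g * z for g, z in zip(gammas, covariates)))
    return baseline * risk

# Illustrative parameters: shape 2, scale 100 working-age units, one covariate (severity).
beta, eta = 2.0, 100.0
gammas = [math.log(2.0)]  # each unit of the covariate doubles the hazard

healthy = weibull_ph_hazard(50.0, [0.0], beta, eta, gammas)
degraded = weibull_ph_hazard(50.0, [1.0], beta, eta, gammas)
```

At the same working age, the degraded unit carries twice the hazard of the healthy one, which is the sense in which condition data shift the failure risk in this kind of model.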

Figure 11‑12 The fault simulator (top left) gradually induces one or more failure modes (for example, misalignment or unbalance). The failure mode (unbalance) causes the failure mechanism (right) to proceed towards failure. The failure is the loss of function to hold the Tee in place by spring friction forces under the stress of vibration forces transmitted through the structure.
In the fault simulator, a spring and friction failure mechanism has been set up with the following characteristics desirable for the study of a failure modeling and prediction methodology.
How ‘predictive’ can such a model be?
The “goodness” (predictability) of the model
depends on two factors:
The “better” the data the smaller the sample you need. The less the data
correlates with the targeted failure mode, the larger the sample you need for
obtaining a good model.

Figure 11‑13 Running recommendations from the EXAKT agent
Figure 11‑13 displays the running prognostic results that are updated at each inspection. The “Optimal Maintenance Decision” may be one of:
The “Estimated Time to Failure” is the time to replacement estimate (TRE). TRE is an estimate of the time at which a replacement or overhaul will occur, either by PM (as a result of the CBM optimal decision policy recommendation) or by failure. The TRE is not to be confused with the residual life estimate (RLE), which estimates the time to failure only (replacement by PM is not considered). Both TRE and RLE are of interest to maintenance personnel. TRE, however, may be of more interest to people concerned with maintenance management, e.g., production planning, manpower scheduling, and spare parts management. RLE, on the other hand, may be of more interest to people involved in equipment design, procurement, and specification of reliability or risk of the unit.

Figure 11‑14 Key CBM performance indicators
Figure 11‑14 shows the console display of the CBM program
KPIs for the demonstration fault simulator unit running an EXAKT optimal
decision policy. The predictability of the CBM policy is measurable. It is
reflected in the “Time to Failure Estimate Performance”. This figure is the
average error in the TRE calculated at each inspection of every life cycle. A
histogram (Figure 11‑15) is another way to indicate the predictive
performance of the model.
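A performance figure of this kind could be computed along the following lines. The sketch assumes the error is the absolute difference between the predicted and actual replacement times, expressed as a fraction of the actual time; that definition, the function names, and the sample numbers are assumptions, not the exact calculation behind the console display.

```python
def tre_errors(predicted_times, actual_time):
    """Relative error of each inspection's predicted replacement time."""
    return [abs(p - actual_time) / actual_time for p in predicted_times]

def mean_error(errors):
    """Average relative error over all inspections."""
    return sum(errors) / len(errors)

def count_within(errors, tolerance):
    """Number of inspections whose error does not exceed the tolerance."""
    return sum(1 for e in errors if e <= tolerance)

# Illustrative life cycle: predictions made at four inspections, actual end of life at 100.
errors = tre_errors([96.0, 102.0, 110.0, 100.0], actual_time=100.0)
average = mean_error(errors)
within_5_percent = count_within(errors, 0.05)
```

Counting the errors that fall into successive tolerance bands gives the kind of histogram used to summarize predictive performance.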

Figure 11‑15 Histogram showing the errors in replacement time estimate over 678 inspections. For example, the TREs calculated at 412 inspections were within 5% of the actual (functional or potential) failure time.
The hazard function curves (in Figure 11‑16) for potential failures and functional failures provide an overall performance check on the effectiveness of the CBM program.

Figure 11‑16 Hazard functions for potential and functional failures
If the difference between the TF (total failures) and the FF (functional failures) hazard curves is small, the CBM program is effective. That is, functional failures (those that have important consequences) are being preempted by the CBM detection and correction of potential failures (that have no or relatively minor consequences).

Figure 11‑17 ExpertALERT operating the ABB Asset Optimizer Workplace
Figure 11‑17 illustrates a typical report issued by ExpertALERT. It contains quantitative information relating to the detected fault as well as a recommendation and a “Figure of Merit” indicating the fault severity. The CBM demo links these outputs from ExpertALERT to an EXAKT decision agent. The agent applies a model of the severity ratings and other relevant data extracted and computed by ExpertALERT. The new combined report contains not only a structured identification and severity rating of the fault, but also an optimized recommendation including an estimate of the time-to-failure.
[1] Alan Friedman, Automated Bearing Wear Detection, Vibration Institute Proceedings, 2004, www.dliengineering.com
[2] Test points may be equipped with permanent triaxial accelerometers, or a triaxial accelerometer connected to a portable data collector may be used. The barcoded test points must offer a solid screwed mounting for the accelerometer.
[3] In both a low and high frequency range
[4] This simplifies distinguishing the non-synchronous peaks and their sidebands from the dominant forcing shaft frequency and its harmonics, a necessary step in the diagnostic process.
[5] An increase in the noise floor level is an indication
of impacting and non-periodic (or random) vibration. Both of these are
associated with later stage bearing wear.
[6] One may say in a general sense that the more harmonics and sidebands present, the worse the condition of the bearing. Thus, not only does one wish to know if a peak is part of a larger family of peaks, one also wants to get an idea of how much energy is contained in the series. Cepstrum analysis is used for automating this task. The cepstrum is a power spectrum of the logarithm of a power spectrum of a waveform; therefore, any periodicities in the spectrum (such as harmonic series or sideband families) will clearly appear as a peak in the cepstrum.
[7] Alan Friedman, DLI Engineering, Demodulation, P/PM, June 1999
[8] Rule thresholds form a matrix that includes both absolute amplitudes and exceedances over the (mean + 1 sigma) baseline.