Where have all the good valves gone?
P R Smith/R Stillman
IEC 61508 has now been in existence since around 1998-2000 and has just been revised after a five-year consultation period. Part 1 was approved for publication at first edition as long ago as December 1998; the first edition of Part 2 (which is the subject of this paper) was published a little later, at the end of May 2000.
Despite the increasing age of IEC 61508, it appears that there is still some way to go before manufacturers of components, and especially valves, are able to present the end user with good, reliable data to be used in loop SIL calculations. Most people are probably aware that the standard covers Functional Safety, is particularly applicable to electrical and electronic systems and uses a risk-based approach. It is considered ‘good’ practice by our friends in the legal field. If, like Buncefield, a process plant is unfortunate enough to suffer a serious accident then litigation may well follow against persons and companies who fail to give IEC 61508 due consideration. It is not and cannot be perfect but I, like many others, believe that it is a very good attempt to provide a consistent design methodology for safety systems. So, let us criticise it only if we are able and our intention is to improve it.
The requirements of Part 2 are not ‘rocket science’ and do not take a great deal of understanding for any engineer with a reasonable grounding in mathematics and in particular probability theory and statistics. It is quite reasonable to expect that anyone who is designing or integrating equipment on which lives might depend should be competent to deal with the issues that the standard raises.
My objective here is to discuss the hardware assessment process from the perspective of the individual loop component and in particular the mechanical valve, be it butterfly, ball, globe, solenoid or any that I might have missed. Note that I set aside here the crucial issue of software assessment (IEC 61508-3), which is necessary for any device that is intended for safety loop application and involves the use of software in some way. The issues involved merit a separate article, so it must of necessity be left for the future or for others.
Before we proceed, let’s consider some basics. Ignoring software, there are two major failure types: random hardware failures and systematic failures.
The pie chart shows the relative preponderance of each, around two thirds of failures being due to ‘systematic’ failure.
The second pie chart, above, shows the causes of systematic failure in safety related systems and their relative proportions.
Furthermore, random failures are distributed across a safety loop in the following proportions:
- Sensor 35%
- Logic Solver 15%
- Final Actuator 50%
Valve manufacturers should consider the last item in bold!
This article will briefly cover the assessment of random failures, but later in the text. At this point we must mention the requirement for safety component suppliers to have a Functional Safety Management system. This requirement is thoroughly covered in IEC 61508-1 and used to be prefixed by the word ‘should’. Please note that the latest version of the standard has the word ‘shall’! So, if your company is supplying any component for safety use, and we are specifically considering valves here, then your company must implement a Functional Safety Management system compliant with the requirements of IEC 61508-1:2010. To ignore this requirement is to expose your company to potential litigation!
Section 6.0 of the Health & Safety at Work Act is quite clear:
The duties placed on designers, manufacturers, importers and suppliers of articles for use at work are to:
a. Ensure components are designed and constructed to be safe and without risks to health when being set, used, cleaned or maintained by a person at work
b. Arrange for testing and examination to ensure safe design and construction
c. Provide persons supplied with articles with adequate information about:
i. Safe use of the component;
ii. Any conditions necessary to ensure safety during setting, using, cleaning, maintaining, dismantling or disposing of the component;
d. Provide users, already supplied, with new information as it becomes available.
There is an additional duty on designers and manufacturers to arrange for research to discover and hence eliminate or minimize risks. The Health & Safety at Work Act also places heavy responsibility on software engineers to ensure the integrity of their designs.
It is understood that the HSE has applied section 6 of the Health & Safety at Work Act in at least one case concerning components intended for use in safety systems. Consequently the onus is upon instrumentation manufacturers to comply with IEC 61508, be they suppliers of simple sensors, smart instruments or indeed valves.
There has been considerable discussion of the issue of individual component SIL. Many engineers believe that an individual loop component cannot have a Safety Integrity Level. This is correct; however, the allocation of a Hardware safety integrity limit to an individual loop component is fundamental to IEC 61508 part 2. This allocation is the basis for further safety use of the loop component and is dependent on the availability of a sufficient dossier of evidence to meet the requirements of a specific SIL. The higher the SIL claimed, the more supporting evidence will be required.
If the conclusion is SIL 2 then that component may ONLY be used to support single architecture safety functions up to SIL 2. In practice this means it may be used in safety loops required to provide a risk reduction equal to SIL 2 (or lower). A separate assessment will determine whether the loop in its entirety achieves SIL 2, but each loop sub-component MUST be capable of supporting a safety function up to SIL 2.
IEC 61508 Part 2 requires the provision of certain information, at least:
• The fault tolerance that may be assumed for the loop component
• The safe failure fraction of all potential failures
• The predicted undetected dangerous failure rate – theoretical
• The predicted safe failure rate – theoretical
• The relationship of failure rate to environmental conditions such as temperature, vibration, EMC, humidity etc.
• The recommended highest Hardware safety integrity limit
• Restrictions in use report including measures and techniques for the avoidance of systematic failures.
Many companies still do not understand that they must supply this information. This leaves the loop designer with a problem because somehow it is necessary to comply with the standard and provide evidence that the proposed safety loop has been assessed and the predicted risk reduction is reliable.
It is the object of this paper to consider the problems facing the loop designer with particular reference to the process valve which is often used as the final actuator.
Before we do this let us take a quick look at the two hardware assessments that must be made (not forgetting that if the subject contained software then it would be necessary to make a further assessment against the requirements of IEC 61508 part 3):
a. A quantitative assessment of random hardware failures
b. A qualitative assessment based on safe failure fraction and hardware fault tolerance
Each of these two assessments produces a Hardware safety integrity limit, but it is the lower of the two that must apply! That is, if assessment a. concludes a capability of SIL1 and assessment b. a Hardware safety integrity limit of SIL3, then the applicable capability is limited to SIL1.
IEC 61508 SIL Assessment requirements
Quantitative Assessment – Random Hardware failures
Part 6 of IEC 61508 provides the methodology required to calculate the Probability of Failure on Demand (PFD) that is needed for a quantitative SIL to be assessed. The IEC 61508 calculation uses fail-to-danger rates combined with a mean down time assessment to derive a figure for PFDsys (the average probability of failure on demand of a safety function for the E/E/PES safety related system).
A dangerous failure rate and a safe failure rate must be estimated for each sub-system involved in the safety loop. The calculations of Part 6 actually require the typical failure rate of a device, $\lambda$ (= 1/MTBF), to be resolved into four components:
$\lambda_{SD}$ - Detected safe failure rate (per hour)
$\lambda_{SU}$ - Undetected safe failure rate (per hour)
$\lambda_{DD}$ - Detected dangerous failure rate (per hour)
$\lambda_{DU}$ - Undetected dangerous failure rate (per hour)
Using the guidance provided in Part 6 a Probability of Failure on Demand figure may be calculated and consequently a Safety Integrity Level.
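As a minimal sketch of the last step, the snippet below maps an average PFD onto the low demand mode SIL bands tabulated in IEC 61508-1 (one decade per SIL). The function name `sil_from_pfd_avg` and the convention of returning 0 for a PFD too high for SIL 1 are assumptions made for illustration only.

```python
# Low demand mode bands for the average probability of failure on demand.
# Each entry is (SIL, lower bound inclusive, upper bound exclusive).
LOW_DEMAND_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_from_pfd_avg(pfd_avg: float) -> int:
    """Return the SIL band (1 to 4) containing pfd_avg.

    Returns 0 when the PFD is too high for SIL 1 (illustrative convention).
    """
    if pfd_avg < 0:
        raise ValueError("a probability cannot be negative")
    if pfd_avg < 1e-5:
        # Better than the SIL 4 band; no higher level is defined.
        return 4
    for sil, lower, upper in LOW_DEMAND_BANDS:
        if lower <= pfd_avg < upper:
            return sil
    return 0


print(sil_from_pfd_avg(0.03))  # falls in the 0.01 to 0.1 band, i.e. SIL 1
```

The figure obtained this way is only the quantitative half of the assessment; it still has to be compared with the architectural limit discussed in the next section.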
Qualitative Assessment – Safe Failure Fraction
The standard requires each component in the loop (called a sub-system) to be assessed against the IEC 61508 requirements for Safety Integrity (IEC 61508 Part 2, para 7.4.3).
Each loop component means the field sensor and its installation, all interfacing equipment, any logic modules and all field output devices including final process valves, actuators, positioners, solenoid valves and whatever else may be required to implement the safety function.
The requirement is to determine that each component is suitable for its intended function and this includes the application of existing standards such as EMC. Forgive me if I neglect these and focus on the requirements of the standard itself, suffice it to say that any such complementary standard must be considered and complied with.
The standard identifies two parameters, ‘Hardware Fault Tolerance’ and ‘Safe Failure Fraction’. Two tables within the standard (IEC 61508-2, Tables 2 and 3) are applicable.
For Hardware Fault Tolerance, two cases are considered:
Type A – Effectively simple devices which may or may not include software.
Type B – Effectively complex devices which often do include software.
Simply, we are required to decide whether we have intimate understanding of the device concerned:
Failure Modes of all constituent components?
Behaviour under fault conditions?
Extensive, Reliable Field failure data in support of claimed failure rates?
If the decisions are ‘No’ then the device would be ‘type B’.
If ‘Yes’ then the device would be ‘type A’.
These tables limit the Hardware safety integrity limit that can be claimed for any ‘safety function’ based on its architecture alone and irrespective of how good the quantitative assessment of SIL may be.
Safe Failure Fraction    Hardware fault tolerance
                         0              1       2
<60%                     SIL1           SIL2    SIL3
60% - <90%               SIL2           SIL3    SIL4
90% - <99%               SIL3           SIL4    SIL4
>=99%                    SIL3           SIL4    SIL4
Simplified Representation of IEC 61508 Type A (IEC 61508-2 Table 2)

Safe Failure Fraction    Hardware fault tolerance
                         0              1       2
<60%                     Not allowed    SIL1    SIL2
60% - <90%               SIL1           SIL2    SIL3
90% - <99%               SIL2           SIL3    SIL4
>=99%                    SIL3           SIL4    SIL4
Simplified Representation of IEC 61508 Type B (IEC 61508-2 Table 3)
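The two tables can be expressed as a simple lookup. The sketch below is illustrative only: the names `TYPE_A`, `TYPE_B`, `sff_band` and `max_architectural_sil` are hypothetical, and the values are transcribed from the simplified representations above rather than from the standard itself.

```python
from typing import Optional

# Maximum SIL claimable on architecture alone, keyed by SFF band. Each tuple
# is indexed by hardware fault tolerance 0, 1, 2. None means "not allowed".
TYPE_A = {
    "<60%": (1, 2, 3),
    "60%-<90%": (2, 3, 4),
    "90%-<99%": (3, 4, 4),
    ">=99%": (3, 4, 4),
}
TYPE_B = {
    "<60%": (None, 1, 2),
    "60%-<90%": (1, 2, 3),
    "90%-<99%": (2, 3, 4),
    ">=99%": (3, 4, 4),
}


def sff_band(sff: float) -> str:
    """Map a safe failure fraction (0.0 to 1.0) to its table row."""
    if not 0.0 <= sff <= 1.0:
        raise ValueError("sff must lie between 0 and 1")
    if sff < 0.60:
        return "<60%"
    if sff < 0.90:
        return "60%-<90%"
    if sff < 0.99:
        return "90%-<99%"
    return ">=99%"


def max_architectural_sil(device_type: str, sff: float, hft: int) -> Optional[int]:
    """Look up the architectural SIL limit for a Type 'A' or 'B' sub-system."""
    if device_type not in ("A", "B"):
        raise ValueError("device_type must be 'A' or 'B'")
    if hft not in (0, 1, 2):
        raise ValueError("the tables only cover fault tolerance 0, 1 or 2")
    table = TYPE_A if device_type == "A" else TYPE_B
    return table[sff_band(sff)][hft]


# A Type A device with no redundancy and an SFF below 60%.
print(max_architectural_sil("A", 0.30, 0))  # prints 1
```

Calling the same function with `"B"` for that case returns `None`, which mirrors the "Not allowed" cell of the Type B table.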
A Hardware Fault tolerance of ‘N’ means that ‘N+1’ faults could cause a loss of the safety function.
The ‘safe failure fraction’ of a sub-system is defined as the ratio of the average rate of safe failures plus dangerous detected failures of the sub-system to the total average failure rate of the sub-system.
$$\mathrm{SFF} = \frac{\lambda_{SD} + \lambda_{SU} + \lambda_{DD}}{\lambda_{SD} + \lambda_{SU} + \lambda_{DD} + \lambda_{DU}}$$
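The definition translates directly into a few lines of code. This is a minimal sketch; the function name `safe_failure_fraction` and the example rates are illustrative assumptions, not figures for any real device.

```python
def safe_failure_fraction(lam_sd: float, lam_su: float,
                          lam_dd: float, lam_du: float) -> float:
    """Ratio of safe plus dangerous-detected failures to all failures.

    All four arguments are failure rates in the same unit (e.g. per hour).
    """
    rates = (lam_sd, lam_su, lam_dd, lam_du)
    if any(rate < 0 for rate in rates):
        raise ValueError("failure rates cannot be negative")
    total = sum(rates)
    if total == 0:
        raise ValueError("at least one failure rate must be non-zero")
    return (lam_sd + lam_su + lam_dd) / total


# Illustrative rates only: no diagnostics, so both 'detected' rates are zero.
sff = safe_failure_fraction(lam_sd=0.0, lam_su=3e-6, lam_dd=0.0, lam_du=7e-6)
print(f"SFF = {sff:.0%}")
```

Note that only the undetected dangerous rate is excluded from the numerator, which is why adding diagnostics (moving failures from undetected to detected) raises the SFF without changing the total failure rate.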
If an assessment has to be made in the absence of manufacturers support then it must be worst case, based on good engineering judgment and with a documented rationale supporting the conclusion. It might be permissible to claim a valve as a type A device and claim FT=0, SFF<60% in which case the best claim is SIL1! To claim a better SIL, as was explained above, we need more evidence.
Remember that this is just part of the assessment and it has to be done for ‘each’ loop sub-system! Fortunately the design of safety electronics has progressed so far that you will not have a problem obtaining the correct data from such suppliers. Regrettably this is not the case for mechanical components and consequently the assessment of valves is difficult and will remain difficult until good quality failure data for particular valve components is determined. This is a task for both the valve manufacturer AND the valve user for the valve user has an implicit responsibility to collect failure data for any safety loop component that he uses. This means implementing a good maintenance regime and establishing records.
SIL Assessment of a valve
Before proceeding further you will need to assess your own competency to proceed with the assessment. If you are not experienced in the requirements for functional safety as regards process valves then you should not proceed but obtain the services of a suitably qualified colleague. If you consider yourself to be capable then it will be necessary to begin your safety loop documentation with a brief resumé of your own competence and capability to carry out the assessment.
Few instruments are redundant by design hence it is a safe assumption that a Fault Tolerance of ‘0’ applies (Ref IEC 61508-2 Tables 2 and 3).
Safe Failure Fraction
Refer to section 2.2 of this paper for a definition. It is not possible to estimate SFF without some knowledge of the device and its potential failure mechanisms.
IEC 61508 Certified Equipment.
In the case of sub-systems which have been independently assessed by a reputable certification body, the certification should state the device type, $\lambda_{DU}$, $\lambda_{DD}$ and Safe Failure Fraction, from which $\lambda_{SU}$ and $\lambda_{SD}$ may be inferred.
Non IEC 61508 Certified Equipment
Generally, manufacturers of non-certified components are not publishing the required data for assessing SIL. However, obtaining a written statement to this effect is an essential part of this assessment by an individual working to achieve a ‘good’ practice solution to the problem of compliance. If the manufacturer will provide a measure of data then you have a starting point; if not, then the approach using ‘generic’ data is necessary in the absence of acceptable certified components.
There are few publications that contain useful information for the assessment of mechanical SIL; where such information can be found, it is always a good idea to compare more than one data set if possible. However, as explained, it is difficult to find mechanical failure rate data, and where it can be found it is rarely specific to process valves in general, let alone a specific type of valve. Also, such data is derived from various sources and must consequently be used with care. It is quite possible for failure rates to be derived from the military, who do collect good maintenance data. Unfortunately, it is clear that such data will be obtained from components that bear little comparison with a process valve.
So where do we go from here? Good question!
We need the manufacturers of mechanical valves to carry out some level of testing of their products in order to determine product specific failure rates. We need them to document:
• What failure rates are dominant?
• How long do different components last in normal use?
We also need access to a wider set of data, that potentially available from the end user. The end user has access to the very best data for his own site – IF he will collect the data.
Such operational failure data may be used to corroborate that determined by the manufacturers, and the two sets of data together will enable a reliable, realistic estimate of valve failure rates. Until that time in the future the realistic valve SIL will be limited.
It is important that the data used reflects random failures, for that is the data required for SIL calculations. Random failures are failures that cannot be predicted: for instance, a valve stem that fails in normal use, or a good quality valve body casting that develops a crack or porosity sufficient for a leakage of the process fluid to occur.
Many failures of valves are due to ‘systematic’ failure, e.g. the valve has been used with a process fluid which is incompatible with the materials used in the valve or the valve mechanism has stuck due to wear of a valve stem. These are examples of systematic failure and wear out mechanism. To avoid these problems the manufacturer must publish his ‘restrictions in use report’ to ensure that, in the first case, the valve is not used with incompatible fluids and in the second case that the end user understands when to replace the valve BEFORE wear out mechanisms take effect.
Note that PFDs are applicable only where the demand rate is low in relation to the proof test interval. Where the demand rate is higher, these formulae will be increasingly in error in the dangerous direction, i.e. the PFD will be underestimated. IEC 61508 Part 6 para B.3.2 provides formulae applicable to the high demand or continuous mode.
IEC 61508 Part 6 provides the procedure and equations required. Our application will be in an architecture of ‘1oo1’ to trip, i.e. only one valve will be used and this needs to close when the safety trip is initiated.
Part 6 para B.2.2.1 gives an equation (applicable to ‘low demand mode’ only) for a ‘1oo1’ architecture, the average probability of failure on demand is:
$$PFD_G = (\lambda_{DU} + \lambda_{DD}) \, t_{CE} \qquad (1)$$
Key to terms:
T1 = Proof Test Interval
MTTR = Mean Time to Repair
DC = Diagnostic Coverage
$\lambda$ = Total Failure Rate per hour
SFF = Safe Failure Fraction
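Channel equivalent mean down time, with $\lambda_D = \lambda_{DU} + \lambda_{DD}$ the total dangerous failure rate:
$$t_{CE} = \frac{\lambda_{DU}}{\lambda_D}\left(\frac{T_1}{2} + MTTR\right) + \frac{\lambda_{DD}}{\lambda_D}\, MTTR \qquad (2)$$
$$\lambda_{DU} = \frac{\lambda}{2}\,(1 - DC); \qquad \lambda_{DD} = \frac{\lambda}{2}\, DC \qquad (3)$$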
In the case of a mechanical device such as a ball valve and in the absence of any external diagnostics, Diagnostic Coverage (DC) = 0
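$$\lambda_{DU} = \frac{\lambda}{2}; \qquad \lambda_{DD} = 0 \qquad (4)$$
It may be inferred that, consequently,
$$\lambda_{SU} + \lambda_{SD} = \frac{\lambda}{2} \qquad (5)$$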
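Equations (1) to (3) can be chained together as follows. This is a sketch under stated assumptions: the function name `pfd_1oo1` is hypothetical, the total failure rate of 10 per million hours and the MTTR of 8 hours are illustrative values only, and the split of the total failure rate into equal safe and dangerous halves is the default assumption built into equation (3).

```python
from typing import Tuple


def pfd_1oo1(lam: float, dc: float, t1_hours: float,
             mttr_hours: float) -> Tuple[float, float]:
    """Return (t_CE, PFD_G) for a 1oo1 architecture in low demand mode.

    lam        total failure rate per hour
    dc         diagnostic coverage, 0.0 to 1.0
    t1_hours   proof test interval T1 in hours
    mttr_hours mean time to repair in hours
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if not 0.0 <= dc <= 1.0:
        raise ValueError("dc must lie between 0 and 1")
    lam_d = lam / 2.0              # equation (3): half of all failures are dangerous
    lam_du = lam_d * (1.0 - dc)    # undetected dangerous failures
    lam_dd = lam_d * dc            # detected dangerous failures
    # Equation (2): channel equivalent mean down time.
    t_ce = (lam_du / lam_d) * (t1_hours / 2.0 + mttr_hours) \
        + (lam_dd / lam_d) * mttr_hours
    # Equation (1): average probability of failure on demand.
    pfd_g = (lam_du + lam_dd) * t_ce
    return t_ce, pfd_g


# Illustrative inputs: no diagnostics, annual proof test, assumed 8 hour repair.
t_ce, pfd = pfd_1oo1(lam=10e-6, dc=0.0, t1_hours=8760.0, mttr_hours=8.0)
print(f"t_CE = {t_ce:.0f} h, PFD_G = {pfd:.4f}")
```

With no diagnostic coverage the second term of equation (2) vanishes, so the result is driven almost entirely by the proof test interval; the repair time contributes only a few hours out of several thousand.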
However, we have referenced some basic failure mode data and this gave us the following:
1. Blocking 5%
2. External Leak 15%
3. Passing (internally) 60%
4. Sticking 20%
For the case where the valve needs to shut, 1 and 2 are safe failures, 3 is dangerous, and 4 may be stuck closed or stuck open so we must assume 10% for each.
Hence dangerous failures would be 3 and half of 4, i.e. 70% of the total.
Now according to Annex C of IEC 61508-6:
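$$\mathrm{SFF} = \frac{\lambda_{SD} + \lambda_{SU} + \lambda_{DD}}{\lambda_{SD} + \lambda_{SU} + \lambda_{DD} + \lambda_{DU}} \qquad (6)$$
As $\lambda_{DD} = 0$ and $\lambda_{SD} + \lambda_{SU} = \lambda_S$, we may simplify the equation to
$$\mathrm{SFF} = \frac{\lambda_S}{\lambda_S + \lambda_{DU}} \qquad (7)$$
and hence, for a trip closed application, we may estimate that, with $\lambda$ the total failure rate,
$$\lambda_S = 0.3 \times \lambda$$
and by implication,
$$\lambda_{DU} = (1 - 0.3) \times \lambda \qquad (8)$$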
So SFF = 30% for the close on trip application.
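The 30% figure can be reproduced from the failure mode percentages quoted above. In this sketch the dictionary names and the function `safe_share_of_failures` are hypothetical; the 50/50 split of sticking failures between stuck open and stuck closed is the assumption made in the text.

```python
# Failure mode distribution quoted in the text (fractions of all failures).
FAILURE_MODES = {
    "blocking": 0.05,
    "external leak": 0.15,
    "passing (internally)": 0.60,
    "sticking": 0.20,
}

# Fraction of each mode counted as safe when the valve must close on trip.
# Sticking is split evenly between stuck closed (safe) and stuck open (dangerous).
SAFE_SHARE_CLOSE_ON_TRIP = {
    "blocking": 1.0,
    "external leak": 1.0,
    "passing (internally)": 0.0,
    "sticking": 0.5,
}


def safe_share_of_failures(modes: dict, safe_share: dict) -> float:
    """Weighted sum of the safe part of every failure mode."""
    return sum(fraction * safe_share[name] for name, fraction in modes.items())


safe = safe_share_of_failures(FAILURE_MODES, SAFE_SHARE_CLOSE_ON_TRIP)
dangerous = 1.0 - safe
# With no diagnostics (DD = 0) the SFF equals the safe share.
print(f"safe = {safe:.0%}, dangerous undetected = {dangerous:.0%}")
```

Changing the safe-share mapping is all that is needed to repeat the estimate for a different duty, for example a valve that must open on trip, where the classification of each mode would have to be reconsidered.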
From the IEC61508-2 Table 2 above the SIL of the ball valve is limited to SIL1. Similarly we may attempt a PFD calculation:
A typical ‘generic’ failure rate source puts the ball valve failure rate at between 0.2 and 10 failures per million hours; the upper figure corresponds to about one failure every 11 years.
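$$PFD = \lambda_D \times \frac{T}{2}$$
where $T$ is the proof test interval, usually 1 year (8760 hours). Taking the upper failure rate of 10 per million hours, of which 70% is dangerous:
$$PFD = 0.7 \times 10 \times 10^{-6} \times \frac{8760}{2} \approx 0.03$$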
This equates to a SIL of 1 and agrees with the qualitative assessment.
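Because the generic source quotes a range rather than a single figure, it is worth seeing how far the result moves across that range. The sketch below repeats the simplified calculation at both ends; the function name `pfd_simplified` is hypothetical and the 70% dangerous share is the estimate derived earlier in this paper.

```python
HOURS_PER_YEAR = 8760
DANGEROUS_SHARE = 0.7  # share of all failures that are dangerous (close on trip)


def pfd_simplified(total_rate_per_hour: float,
                   proof_test_interval_hours: float = HOURS_PER_YEAR) -> float:
    """Simplified average PFD: dangerous failure rate times half the test interval."""
    return DANGEROUS_SHARE * total_rate_per_hour * proof_test_interval_hours / 2


# Lower and upper ends of the generic range, in failures per million hours.
for per_million_hours in (0.2, 10.0):
    rate = per_million_hours * 1e-6
    print(f"{per_million_hours:>4} per million hours -> PFD = {pfd_simplified(rate):.2e}")
```

The two results differ by a factor of fifty (roughly 6.1e-4 against 3.1e-2), which illustrates how strongly the quantitative estimate depends on which generic figure is chosen.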
A simplified estimate of SIL for a typical ball valve indicates that SIL1 is a reasonable, realistic estimate with the information that is most readily available.
Without better failure rate data valves are always going to be a limiting loop factor. The estimate made may be optimistic or pessimistic for a particular make of valve. How can the end user be sure that his safety loop is safe?
Better failure rate data must be made available to enable the loop user to make realistic, reasonable SIL estimates for the components actually used. This means that individual suppliers must carry out test work to determine component failure rates for THEIR valve.
Valve associations might profitably provide a data bank for their members' test results.
End users MUST start collecting failure data for safety components to ensure that calculated loop SILs can be confirmed.
Proven in use data, if collected correctly, is the best data for use in SIL calculations.
Note that this paper is intended to provide guidance only and is necessarily brief. None of the figures quoted here by example should be referenced without corroboration.
Finally, many thanks to all my colleagues both internal and external who have been patient enough to read and comment on the draft copies of this document.