Proving the Safety Integrity

Systems, Functions and Safety

Abstract

A formal proof of system safety is usually required before a system can be signed off for deployment. The set of arguments (claims) used to prove system safety is called a safety case. A safety case addresses all the safety integrity requirements defined by the respective standards and provides evidence that those requirements have been fulfilled. Many requirements include measurable indicators, some of which were discussed in previous chapters, such as reliability and failure rates. However, the standards may prescribe additional measures, such as diagnostic coverage (DC) and safe failure fraction (SFF). This chapter discusses the sets of claims required for the safety case, including a description of those additional measures. Finally, safety is contrasted with availability, one of the most important dependability requirements for a system.

Appendices

Exercise 10

Continue the exercise from Chap. 9, now attempting to increase reliability by analyzing the failures of system components and attempting to assess dangerous undetectable failures. The initial pseudo-FMEDA sheet is given below:

[Figure: pseudo-FMEDA sheet for sensors, logic, and valves, with columns for failure mode, λ %, failure class, and detectability.]

Additionally, try to respecify the detectability of failures for the diversified configuration of the system. Then calculate the safe failure fraction (SFF) and diagnostic coverage (DC), and compare all safety integrity metrics against the requirements from the functional safety standard (see the tables below), considering that the required SIL for the described SRS is SIL 2:

Safe failure fraction of an element | Hardware fault tolerance
                                    | 0     | 1     | 2
------------------------------------+-------+-------+------
<60%                                | SIL 1 | SIL 2 | SIL 3
60% – <90%                          | SIL 2 | SIL 3 | SIL 4
90% – <99%                          | SIL 3 | SIL 4 | SIL 4
≥99%                                | SIL 3 | SIL 4 | SIL 4

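MTTFdu ↓ / DC →      | DC < 60% | 60% ≤ DC < 90% | 90% ≤ DC ≤ 99% | 99% ≤ DC
---------------------+----------+----------------+----------------+---------
3y ≤ MTTFdu < 10y    | –        | –              | SIL 1          | SIL 2
10y ≤ MTTFdu < 30y   | –        | SIL 1          | SIL 2          | SIL 3
30y ≤ MTTFdu ≤ 100y  | SIL 1    | SIL 2          | SIL 3          | SIL 4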

Note: Values are for practice only and do not replicate actual standards.

Your system, at the top level, is a single-channel system (hardware fault tolerance, HFT = 0). As a last step, increase the item-level redundancy first to HFT = 1 and then to HFT = 2, and rediscuss all the results.

Your Tasks for the Exercise

  • Update the calculation Excel sheet from Chap. 9 (feel free to use your previous solution OR the solution provided for Chap. 9).

  • Update the sheet by adding another failure rate row for dangerous undetectable failures (calculate them according to FMEDA) and update the calculations.

  • Calculate SFF and DC for each component.

  • Instead of MEM, now see if your SRS (which is SIL 2) complies with the provisions of the standard regarding SFF, DC, and failure rates.

  • Introduce item-level redundancy to increase HFT from 0 to 1 and then 2 (try to make the sheet configurable).

  • Rediscuss the results and reassess/make claims for all outcomes.

  • To close the safety case, what else shall be demonstrated for the SRS?

To-Do List

  • Perform the exercise with your peer group. One facilitator will perform the calculation.

  • Compare the results with another group when finished.

Exercise 10 Solution

Note: The solution is available as a digital spreadsheet at sfs10.ex.nit-institute.com.

For each of the components from the exercise in Chap. 9, we now need to define the portions of the failure rates according to the FMEDA, then calculate SFF and DC, and check how the values compare with the requirements of the standard.

Each component first needs to have its failure rate (lambda_all) decomposed into dangerous failures (lambda_d) and dangerous undetectable failures (lambda_du), expressed both in h⁻¹ and as a percentage of the original failure rate. DC and SFF can then be calculated for each component according to the formulae.
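As a minimal sketch of these formulae, assuming the usual functional-safety definitions (SFF as the share of all failures that are either safe or dangerous but detected, DC as the detected share of dangerous failures), the per-component calculation might look as follows. The function names and the sample rates are illustrative assumptions, not values from the exercise sheet.

```python
def diagnostic_coverage(lambda_d: float, lambda_du: float) -> float:
    """DC = dangerous detected failures / all dangerous failures."""
    if lambda_d <= 0:
        raise ValueError("lambda_d must be positive")
    return (lambda_d - lambda_du) / lambda_d


def safe_failure_fraction(lambda_all: float, lambda_du: float) -> float:
    """SFF = (safe + dangerous detected) / all = 1 - lambda_du / lambda_all."""
    if lambda_all <= 0:
        raise ValueError("lambda_all must be positive")
    return 1.0 - lambda_du / lambda_all


# Hypothetical component; rates in failures per hour (h^-1).
lambda_all = 1.0e-6
lambda_d = 0.5 * lambda_all   # assumed: 50% of all failures are dangerous
lambda_du = 0.1 * lambda_all  # assumed: 10% are dangerous and undetectable

dc = diagnostic_coverage(lambda_d, lambda_du)
sff = safe_failure_fraction(lambda_all, lambda_du)
print(f"DC = {dc:.0%}, SFF = {sff:.0%}")
```

Note that the two metrics use different denominators: DC relates only to dangerous failures, while SFF relates to all failures of the component.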

For the original and the final system MTTFs, values are expressed in years.
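A possible conversion from a failure rate in h⁻¹ to an MTTF in years is sketched below. It assumes a constant failure rate (so that MTTF = 1/λ) and a year of 8760 hours; both are assumptions of this sketch, and the sample rate is hypothetical.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 h; leap years ignored (assumption)


def mttf_years(failure_rate_per_hour: float) -> float:
    """MTTF in years for a constant failure rate given in h^-1."""
    if failure_rate_per_hour <= 0:
        raise ValueError("failure rate must be positive")
    return 1.0 / failure_rate_per_hour / HOURS_PER_YEAR


# Hypothetical failure rate of 1e-5 h^-1.
print(f"MTTF = {mttf_years(1.0e-5):.1f} years")
```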

The logic component can be claimed to have a lower individual failure rate if redundancy is applied. With a redundant logic component, additional detection SW may be added to supervise the hot spare and to detect whether it has stopped working due to, e.g., bond wire detachment. We judge that a power supply problem caused by a failure of the PMIC can be detected likewise. Power supply problems due to other, external failures are considered to account for 40% of all power supply failure causes. This yields a final l_lambda_du% of 0.4 × 0.24 = 0.096. The final failure rate of the individual logic component in the redundant configuration then becomes λL = 8.256 × 10⁻⁷.
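The arithmetic behind these two numbers can be checked with a short sketch. The base failure rate of 8.6e-6 h⁻¹ used here is an assumption, back-calculated from the stated result; it is not given in this section, and the interpretation of the 0.24 factor as the power supply share from the FMEDA sheet is likewise an assumption.

```python
# Factors quoted in the text.
external_share = 0.4       # external causes among power supply failures
power_supply_share = 0.24  # the 0.24 factor (assumed: power supply share in the FMEDA)

# Remaining dangerous undetectable fraction of the logic component.
l_lambda_du_fraction = external_share * power_supply_share

# ASSUMPTION: base failure rate back-calculated from the stated result, in h^-1.
base_rate_per_hour = 8.6e-6

lambda_l = l_lambda_du_fraction * base_rate_per_hour
print(f"l_lambda_du fraction = {l_lambda_du_fraction:.3f}")
print(f"lambda_L = {lambda_l:.3e} per hour")
```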

For the sensors, drift can be detected via majority voting: if fewer than two inputs report the same value, the reading is treated as incorrect. This yields a final s_lambda_du% of 0.
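The voting rule can be illustrated with a small sketch. The function name `vote`, the tolerance parameter, and the choice to return the mean of the agreeing readings are assumptions of this example; the text only requires that at least two inputs agree.

```python
from typing import Optional, Sequence


def vote(readings: Sequence[float], tolerance: float = 0.0) -> Optional[float]:
    """Return a voted value if at least two readings agree, else None.

    Two readings agree when they differ by no more than `tolerance`.
    None signals a detected fault (no majority), e.g. due to sensor drift.
    """
    for candidate in readings:
        # The candidate always agrees with itself, so two or more entries
        # means at least one other sensor confirms it.
        agreeing = [r for r in readings if abs(candidate - r) <= tolerance]
        if len(agreeing) >= 2:
            return sum(agreeing) / len(agreeing)
    return None


# One drifting sensor is outvoted by the two that still agree.
print(vote([20.0, 20.1, 35.0], tolerance=0.5))
# No two sensors agree: the fault is detected rather than passed on.
print(vote([10.0, 20.0, 30.0], tolerance=0.5))
```

Because a disagreement among all inputs is flagged instead of silently producing a value, drift moves from the undetectable to the detectable class in this configuration.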

After redoing all the calculations against the requirements in the exercise (the tables for SIL determination based on SFF and DC values), we can determine that our system achieves SIL 4 based on its SFF and the selected HFT, and SIL 3 based on its DC and final MTTF.
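The two table lookups can be sketched as follows, using the practice-only values from the exercise tables (not an actual standard). Two choices are assumptions of this sketch: a DC of exactly 99% is placed in the top band, since the printed bands overlap there, and inputs outside the printed MTTFdu range return no claim. The sample inputs are hypothetical and are not the values from the solution sheet.

```python
from typing import Optional

# (lower bound of the SFF band, SIL for HFT = 0, 1, 2), highest band first.
SFF_TABLE = [
    (0.99, (3, 4, 4)),
    (0.90, (3, 4, 4)),
    (0.60, (2, 3, 4)),
    (0.00, (1, 2, 3)),
]


def sil_from_sff(sff: float, hft: int) -> int:
    """SIL claimable from the safe failure fraction and hardware fault tolerance."""
    if hft not in (0, 1, 2):
        raise ValueError("table covers HFT 0, 1 and 2 only")
    for lower_bound, sils in SFF_TABLE:
        if sff >= lower_bound:
            return sils[hft]
    raise ValueError("sff must be non-negative")


def sil_from_dc(mttf_du_years: float, dc: float) -> Optional[int]:
    """SIL claimable from MTTFdu (years) and diagnostic coverage, or None."""
    # Column index by DC band; exactly 99% goes to the top band (assumption).
    if dc >= 0.99:
        col = 3
    elif dc >= 0.90:
        col = 2
    elif dc >= 0.60:
        col = 1
    else:
        col = 0
    if 30 <= mttf_du_years <= 100:
        row = (1, 2, 3, 4)
    elif 10 <= mttf_du_years < 30:
        row = (None, 1, 2, 3)
    elif 3 <= mttf_du_years < 10:
        row = (None, None, 1, 2)
    else:
        return None  # outside the printed table (assumption: no claim)
    return row[col]


# Hypothetical inputs for illustration.
print(sil_from_sff(0.95, hft=1))
print(sil_from_dc(50.0, dc=0.95))
```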

After the adaptations, new reliability diagrams are obtained:

[Figure: reliability versus time for the original system and its components; five declining curves.]
[Figure: reliability versus time for the improved system; three declining curves, with the voting curve constant at a reliability of 1.]
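Curves of this kind could be generated with a sketch like the one below, which makes the HFT configurable. It assumes constant failure rates (exponential reliability), identical and independent redundant channels, no common-cause failures, and a series arrangement of the items; the failure rates are hypothetical and not taken from the exercise sheet.

```python
import math
from typing import Sequence


def reliability(failure_rate_per_hour: float, hours: float) -> float:
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def redundant_reliability(r_single: float, hft: int) -> float:
    """Item survives if at least one of hft + 1 independent channels survives."""
    return 1.0 - (1.0 - r_single) ** (hft + 1)


def series_reliability(item_reliabilities: Sequence[float]) -> float:
    """System survives only if every item in the chain survives."""
    return math.prod(item_reliabilities)


# Hypothetical failure rates in h^-1 for sensor, logic, and valve.
rates = {"sensor": 2.0e-6, "logic": 1.0e-6, "valve": 5.0e-6}
hours = 10 * 8760  # ten years of operation

for hft in (0, 1, 2):
    items = [redundant_reliability(reliability(rate, hours), hft)
             for rate in rates.values()]
    print(f"HFT = {hft}: system reliability = {series_reliability(items):.4f}")
```

Evaluating the loop over a range of `hours` values gives one curve per HFT setting.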

Key Recap Questions

Think about using additional quantitative metrics for your system and its components:

  • Safe failure fraction (which failures are safe?)

    [Table: 13 columns and 19 rows; per-component data used to compute the level of the improved system.]

  • Diagnostic coverage (which dangerous failures can be diagnosed?)

  • What about dangerous undetectable failures?

  • What about software?

  • What about availability?

Self-assessment

Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.

  1. A safety case is a written demonstration of evidence for the safety integrity of random system failures, without considering systematic faults and process-dependent faults.

  2. It is not possible to close the safety case as long as there is at least one open item in the documentation artifacts that can be traced back to a hazard exhibiting unacceptable risk.

  3. Evaluating the safety functions against technical safety requirements and thereby demonstrating evidence of their correct operation in all possible situations (perfect coverage) is sufficient to declare the fulfillment of the respective safety requirements/safety goals.

  4. By introducing diagnosis in the SRS, it is possible to increase the reliability of the SRS.

  5. The failure rate with respect to all dangerous failures (λd) is always considered in the final system reliability evaluation.

  6. The higher the safe failure fraction, the lower the number of residual faults in the system.

  7. Diagnostic coverage tells us what is the percentage of failures that we can detect out of all failures which a component can exhibit.

  8. The system in the fail-safe state is actually the system in downtime.

  9. Availability requirements must be proven within the final system safety case.

  10. The way we write software may affect the probability of having systematic faults during the design and therefore residual faults which cannot be modeled nor detected during the system operation – therefore we must comply with the respective requirements with regard to software implementation, prescribed by the standard, according to the safety integrity level of the item we are developing.

Self-assessment Key

  1. False

  2. True

  3. False

  4. True

  5. False

  6. True

  7. False

  8. True

  9. False

  10. True

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bjelica, M.Z. (2023). Proving the Safety Integrity. In: Systems, Functions and Safety. Springer, Cham. https://doi.org/10.1007/978-3-031-15823-0_10

  • DOI: https://doi.org/10.1007/978-3-031-15823-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15822-3

  • Online ISBN: 978-3-031-15823-0

  • eBook Packages: Engineering, Engineering (R0)
