Source: EE Times article
by Kishore Kumar Sukumar, Dr. Martin Oberkönig, Sivaguru Noopuran, Cypress
The number of processors and components in today’s automotive applications will ultimately make transportation less dangerous, but only if they operate reliably and predictably.
The automotive industry is changing rapidly, and vehicles are becoming more advanced, autonomous, and comfortable. The number of processors, sensors, memories, and other electronic components in today’s automotive applications is increasing, which will ultimately make transportation less dangerous. These automotive subsystems utilize connectivity and self-driving technology to increase driver safety, but can only achieve this if they operate reliably and predictably.
Inside the automotive cabin, human machine interfaces (HMI) are advancing in concert with new capabilities and infotainment systems. Capacitive sensing buttons and touchscreens are common, for example, to play/pause a song or to select an AM/FM channel. They are also increasingly used in safety-critical functions such as engine start/stop buttons and cruise control.
Beyond the automotive industry, there are many application sectors where failure is not an option because it can lead to physical injury and/or property damage. Critical systems in nuclear power plants, some types of industrial machinery, and airplanes are obvious examples. Even everyday appliances such as microwave ovens or washing machines can harm people in case of a malfunction. Systems that have the potential to harm people need mitigation measures, and the goal of any functional safety program is to minimize the risk of physical harm, property damage, and liability.
This article examines the role of functional safety as it pertains to automotive HMI systems, but the concepts are applicable to a broad range of applications and industries. We place more and more of our trust in systems to perform our everyday tasks, and these systems need to earn our trust by delivering on safety and reliability.
Functional safety
Whenever we rely on a system operating correctly, we must address its functional safety to minimize the risk to people if it malfunctions. Two aspects must be addressed to achieve functional safety:
Prevention: Preventive measures avoid, or at least reduce, systematic failures introduced during the development phase of a system. If something is specified, designed, or implemented incorrectly, the system may fail. The higher the quality of the development work, the lower the remaining risk of a system failure.
Detection: The second class of failures is random hardware failures. These occur during operation of the system, after production and any outgoing quality checks. Their source can be environmental, for example temperature, pressure, vibration, radiation, or pollution, or it can be aging. Such failures can be permanent or transient: a permanent failure can only be fixed by repairing or replacing the device, whereas a transient failure, such as a bit flip in a memory, disappears after some time. Any system can break at some point, so measures are required to detect or control such failures and prevent the system from endangering people.
Target | ASIL B | ASIL C | ASIL D |
SPFM | ≥ 90% | ≥ 97% | ≥ 99% |
LFM | ≥ 60% | ≥ 80% | ≥ 90% |
PMHF | < 10⁻⁷ / h (< 100 FIT) | < 10⁻⁷ / h (< 100 FIT) | < 10⁻⁸ / h (< 10 FIT) |
Standards
Standards exist for functional safety in different application sectors. These standards provide guidelines and allow the comparison of different systems.
IEC 61508, first released in 1998, is a central standard for electrical, electronic, and programmable electronic (E/E/PE) safety-related systems. It can be seen as the parent standard from which standards for many specific applications have been derived. Each derived standard gives guidelines on how to address functional safety within its application sector.
ISO 26262 is an international functional safety standard specifically designed for electric and/or electronic (E/E) systems within road vehicles. The first edition, published in 2011, was intended for safety-related E/E systems installed in series-production passenger cars with a maximum gross vehicle mass of up to 3,500 kg. The second edition, published in 2018, widened the scope to other types of road vehicles. This is the standard that applies to automotive HMI systems.
“How much safety” is required for a system depends on the automotive safety integrity level (ASIL). This ASIL is determined through a hazard analysis and risk assessment and considers the severity, exposure, and controllability of a hazardous situation.
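As a rough illustration of how severity, exposure, and controllability combine, the sketch below uses a widely quoted shortcut: treat the classes S1-S3, E1-E4, and C1-C3 as integers and add them. It is a minimal sketch, not a substitute for the determination table in the standard, and the function name `determine_asil` and the example ratings are assumptions made for illustration.

```python
def determine_asil(severity: int, exposure: int, controllability: int) -> str:
    """Return 'QM' or 'A'..'D' for classes S1-S3, E1-E4, C1-C3 given as integers.

    Shortcut: the sum S + E + C maps to an ASIL (10 -> D, 9 -> C, 8 -> B,
    7 -> A); lower sums need only normal quality management (QM).
    Class 0 ratings (S0, E0, C0) are outside this sketch.
    """
    if not (1 <= severity <= 3 and 1 <= exposure <= 4 and 1 <= controllability <= 3):
        raise ValueError("expected S1-S3, E1-E4, C1-C3")
    total = severity + exposure + controllability
    return {7: "A", 8: "B", 9: "C", 10: "D"}.get(total, "QM")


# Hypothetical rating of one hazard; these numbers are illustrative,
# not taken from a real hazard analysis and risk assessment.
print(determine_asil(severity=2, exposure=4, controllability=2))
```

The same hazardous event rated with a lower exposure or better controllability drops to a lower ASIL, which is why the hazard analysis has to be done per situation rather than per component.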
ISO 26262 recommends three key metrics:
1. Single point fault metric (SPFM): Covers single-point faults, where one fault leads directly to the violation of the safety goal. Example: A fault in the injection coil could stall the engine directly, potentially causing severe injuries to the driver.
2. Latent fault metric (LFM): Covers latent faults, that is, multiple-point faults whose presence is neither detected by a safety mechanism nor perceived by the driver within the multiple-point fault detection interval, and which violate the safety goal only in combination with another independent fault. Example: The loss of a single low/high beam light would not by itself create a hazardous situation for the driver, because the remaining working light acts as a safety mechanism. However, if both beams are not working, a major accident could result.
3. Probabilistic metric for random hardware failures (PMHF): The residual rate at which random hardware failures violate a safety goal, commonly expressed in failures in time (FIT). One FIT is one failure per billion hours of operation, so 10⁻⁷ / h corresponds to 100 FIT.
The below table shows the ISO 26262 metrics for different classes of ASIL:
Figure 1: ASIL and Residual Risk.
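To make the three metrics concrete, the following sketch computes SPFM and LFM from summed failure rates, using the usual ratio definitions, and compares them with the targets listed in this article's table. The class and function names, the simplification of passing PMHF in as a precomputed FIT value, and the example failure rates are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Target values copied from the ASIL table in this article.
ASIL_TARGETS = {
    "B": {"spfm": 0.90, "lfm": 0.60, "pmhf_fit": 100.0},
    "C": {"spfm": 0.97, "lfm": 0.80, "pmhf_fit": 100.0},
    "D": {"spfm": 0.99, "lfm": 0.90, "pmhf_fit": 10.0},
}


def fit_to_per_hour(fit: float) -> float:
    """1 FIT is one failure per 1e9 hours of operation."""
    return fit * 1e-9


@dataclass
class FailureRates:
    """Summed failure rates of the safety-related hardware, in FIT."""
    total: float                  # all safety-related failure rates
    single_point_residual: float  # single-point plus residual faults
    latent: float                 # latent multiple-point faults


def spfm(r: FailureRates) -> float:
    return 1.0 - r.single_point_residual / r.total


def lfm(r: FailureRates) -> float:
    return 1.0 - r.latent / (r.total - r.single_point_residual)


def meets_targets(r: FailureRates, asil: str, pmhf_fit: float) -> bool:
    t = ASIL_TARGETS[asil]
    return spfm(r) >= t["spfm"] and lfm(r) >= t["lfm"] and pmhf_fit < t["pmhf_fit"]


# Hypothetical numbers, for illustration only.
rates = FailureRates(total=1000.0, single_point_residual=20.0, latent=100.0)
print(f"SPFM={spfm(rates):.3f} LFM={lfm(rates):.3f}")
print("ASIL C:", meets_targets(rates, "C", pmhf_fit=20.0))
print("ASIL D:", meets_targets(rates, "D", pmhf_fit=20.0))
```

In practice the failure rates come out of an FMEDA; the point of the sketch is only that diagnostic coverage, not raw reliability alone, decides which ASIL targets a design can meet.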
Automotive HMI systems and safety
Carmakers are starting to impose functional safety requirements on capacitive HMI systems as these become more integrated with crucial functions, where reliable and safe operation is necessary so that a malfunction does not endanger the driver or passengers.
For most HMI systems, the required safety integrity level will be up to ASIL B and, in some cases, ASIL C. The most stringent level, ASIL D, mostly applies to systems directly involved in the behavior of the car, such as steering, engine control, and transmission.
Figure 2: Typical HMI applications.
Typical HMI applications (Figure 2) that need functional safety with their safety-relevant failure modes include:
- Steering wheel touch buttons: a few buttons or a small touchpad used for cruise control; requires ASIL B/C
  - Failure mode: Unintended touch detected for increasing the cruise control speed
- Steering wheel grip detection: autonomous driving feature that must reliably detect the presence/absence of a hand; typically requires up to ASIL B
  - Failure mode: No touch detected when disabling a driving assistance feature
- Sunroof control/window control module buttons/sliders
  - Failure mode: Unintended touch detected for closing a window
- Engine start/stop button: typically requires up to ASIL B
  - Failure mode: Unintended touch detected for starting the engine
- Gear switch cover buttons
  - Failure mode: Unintended touch detected for a gear shift
Figure 3: Block diagram of a typical HMI system using an HMI controller.
Figure 3 shows a more detailed diagram of such a system. For HMI applications with capacitive-touch buttons, we can abstract the functionality to two safety functions:
- Ensure that a touch event really results from a finger touch
- Ensure that the measurement system can identify touch events
The first function requires that all potential touch events must be evaluated to determine if they were intended touches or resulted from failures or environmental conditions. An event could have been triggered by, for example:
- Water drops on the sensor
- Noise from electromagnetic interference
- Sudden temperature change
- Broken sensor/capacitor/wire/soldering
- High-energetic particle causing a memory cell to flip
To see how these noise sources can affect performance, consider how capacitive sensing works. The technology measures a touch event by converting the sensor (button) capacitance into a digital value called the raw count, which is interpreted as either a TOUCH or NO TOUCH state for the sensor. The noise sources listed above can disturb the raw count and lead to unreliable touch performance. Based on experiments and experience from many capacitive sensing applications, Cypress recommends a minimum signal-to-noise ratio (SNR) of 5:1 to ensure sufficient margin between noise and signal for robust ON/OFF operation, and to ensure that a touch event really results from a finger touch (Figure 4).
Figure 4: Raw Counts for Touch and No-Touch.
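The 5:1 recommendation can be checked from logged raw counts. The sketch below assumes that the signal is the shift in mean raw count caused by a finger and that the noise is the peak-to-peak spread of the untouched raw counts; both definitions, the function names, and the sample data are assumptions for illustration rather than a documented measurement procedure.

```python
from statistics import mean

MIN_SNR = 5.0  # minimum ratio recommended in this article


def snr(no_touch_counts, touch_counts):
    """Signal-to-noise ratio from two series of raw counts.

    signal: mean raw count with a finger present minus the untouched mean
    noise:  peak-to-peak variation of the untouched raw counts (assumed)
    """
    baseline = mean(no_touch_counts)
    signal = mean(touch_counts) - baseline
    noise = max(no_touch_counts) - min(no_touch_counts)
    if noise == 0:
        return float("inf")
    return signal / noise


def is_touch(raw_count, baseline, threshold):
    """Interpret one raw count as TOUCH (True) or NO TOUCH (False)."""
    return raw_count - baseline >= threshold


# Hypothetical raw counts, not measured data.
untouched = [1000, 1004, 998, 1002, 996]
touched = [1060, 1058, 1062, 1060]

ratio = snr(untouched, touched)
print(f"SNR = {ratio:.1f}:1, sufficient margin: {ratio >= MIN_SNR}")
```

A threshold placed between the noise band and the touch signal only stays meaningful if this ratio holds across temperature, humidity, and interference, which is why it is worth re-measuring under worst-case conditions.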
Designing an automotive HMI system with functional safety requirements using FMEA
To design a capacitive sensing HMI system with functional safety, we recommend following an FMEA (failure mode and effects analysis) process to identify the failure modes and the safety mechanisms needed to achieve the required safety level. Below are some of the steps in a typical FMEA process, with a capacitive HMI system as an example:
- Define safety goals for the HMI system
- To achieve these safety goals, identify and document all the safety critical elements (SCE) of the system and the chip. Figure 5 shows an example chip architecture for capacitive sensing applications (the Cypress PSoC 4), with all safety critical elements marked in blue.
  - With respect to capacitive-sensing touch controllers, safety critical elements include:
    - Capacitive-sensing hardware block core, IDACs, Analog Mux bus routing, GPIOs
    - Memories: Flash (code storage) and SRAM (configuration registers, data structures, variables)
    - System resources: Power control, sleep control, clock generators, reference voltage generator
    - CPU
  - The following components external to the controller are also safety critical for the system:
    - Routing of sensor and other external capacitors to the device
    - Power supply and ground of the device
- Identify and document the potential failure modes of each of these safety critical blocks, their causes and effects: this could be categorized as a system-level failure mode or a chip-level or hardware/firmware-level failure. During this exercise, we need to assign a severity, occurrence, and detection level to each of these potential failure modes (following the typical FMEA guidelines). Some examples of failure modes could be:
  - The capacitive sensing hardware block responsible for driving the external sensor and running the counter (corresponding to sensor + parasitic capacitor) could malfunction.
  - IDACs required for charging and discharging the sensor capacitor could give wrong current outputs.
  - There could be a short or open in the Analog Mux bus routing, which connects the external capacitors and sensors to internal hardware blocks.
  - External capacitors and sensors could get shorted to Vdd or ground or with other GPIOs. They could get opened as well.
  - Bandgap reference voltage could get stuck at Vdd or ground or some other unintentional voltage.
  - Internal clocks could drift far from their configured frequencies. The data structures and variables responsible for holding the configuration settings and temporary scan status could get corrupted. APIs stored in the Flash could get corrupted.
  - CPU could get stuck at some unknown Flash location.
  - External environment like temperature, humidity, etc. could drastically change, causing huge variation in the parasitic capacitances.
Figure 5: Example chip architecture for capacitive sensing applications, with all safety critical elements marked in blue.
Most systems already have process controls that mitigate such risks. Document each existing control against the potential cause it addresses, and assign a detection level.
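A common way to turn the severity, occurrence, and detection ratings into a worklist is the risk priority number (RPN), the product of the three ratings. The sketch below assumes the conventional 1 to 10 scale for each rating; the class name, field names, and all example ratings are hypothetical and only illustrate the bookkeeping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    element: str      # safety critical element affected
    description: str  # what goes wrong
    severity: int     # 1 (negligible) .. 10 (hazardous)
    occurrence: int   # 1 (remote) .. 10 (very frequent)
    detection: int    # 1 (almost certainly detected) .. 10 (undetectable)

    def __post_init__(self):
        for name in ("severity", "occurrence", "detection"):
            if not 1 <= getattr(self, name) <= 10:
                raise ValueError(f"{name} must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Highest-risk failure modes first."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Ratings below are invented for illustration only.
modes = [
    FailureMode("IDAC", "wrong current output", 7, 3, 4),
    FailureMode("Sensor routing", "short to Vdd or ground", 8, 4, 2),
    FailureMode("SRAM", "configuration data corrupted", 8, 2, 6),
]
for m in rank_by_rpn(modes):
    print(f"{m.rpn:4d}  {m.element}: {m.description}")
```

Failure modes near the top of the ranking are the first candidates for an added safety mechanism, since improving detection lowers the detection rating and with it the RPN.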
- Identify and document safety mechanisms that either detect and report failures, or detect, correct, and report them. Some examples are described below:
  - BIST (built-in self-test)
    - Detect if the sensors and other external components are shorted to Vdd or Ground.
    - Detect if the analog routing through the analog mux bus or the GPIOs is shorted to Vdd/Gnd or open.
    - Detect if there is any corruption in configuration data stored in Flash. Such bit errors can be detected using CRC (cyclic redundancy check) calculations; correcting them generally requires additional redundancy, such as a stored backup copy.
    - Detect if there is any corruption in data structures and variables that hold the configuration settings and the run-time button touch statuses.
  - Auto-calibration
    - Detect if there is a significant difference in the environment compared to the original calibrated environment. Auto-calibration helps to tune the system for a given parasitic capacitor and a given environment, thereby correcting for variations.
  - Multiple measurements
    - Buttons can be constructed as two different sensors and scanned by two different firmware routines. A 2-out-of-2 vote can then reject noise pulses.
  - Redundancy
    - Similar to multiple measurements, having a second set of sensing hardware in the same chip can also help in confirming the touch status.
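Two of the mechanisms above, the CRC check on configuration data and the 2-out-of-2 vote between redundant sensors, can be sketched in a few lines. This is a host-side illustration, not firmware: the names `TouchState`, `config_crc`, and `evaluate_button` are hypothetical, CRC-32 from Python's `zlib` stands in for whatever CRC the device provides, and reporting a fault instead of a touch state assumes a button whose hazard is unintended activation.

```python
import zlib
from enum import Enum


class TouchState(Enum):
    NO_TOUCH = 0
    TOUCH = 1
    FAULT = 2


def config_crc(config: bytes) -> int:
    """CRC-32 over the configuration block (stand-in for a device CRC)."""
    return zlib.crc32(config)


def config_is_intact(config: bytes, stored_crc: int) -> bool:
    """Detect corruption by comparing against the CRC stored at build time."""
    return config_crc(config) == stored_crc


def vote_2oo2(sensor_a_touched: bool, sensor_b_touched: bool) -> bool:
    """Accept a touch only if both redundant sensors agree on it."""
    return sensor_a_touched and sensor_b_touched


def evaluate_button(config: bytes, stored_crc: int,
                    sensor_a_touched: bool, sensor_b_touched: bool) -> TouchState:
    # A corrupted configuration makes every measurement untrustworthy,
    # so report a fault rather than any touch state.
    if not config_is_intact(config, stored_crc):
        return TouchState.FAULT
    if vote_2oo2(sensor_a_touched, sensor_b_touched):
        return TouchState.TOUCH
    return TouchState.NO_TOUCH


# Hypothetical configuration bytes.
config = bytes([0x12, 0x34, 0x56, 0x78])
stored = config_crc(config)
print(evaluate_button(config, stored, True, True))
print(evaluate_button(config, stored, True, False))
```

The 2-out-of-2 vote trades availability for safety: a noise pulse on one sensor is rejected, but so is a genuine touch if one sensor has failed, so this choice suits functions where a missed touch is the safer outcome.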
After identifying and documenting the safety goals, safety-critical elements, potential failure modes, and safety mechanisms for an automotive HMI system, we can design and develop the system to meet these functional safety requirements. To further support functional safety, chip manufacturers such as Cypress offer resources including application notes for safe touch button systems, design FMEAs, safety manuals, FMEDAs (failure modes, effects, and diagnostic analysis), and training services to guide OEMs in developing safe HMI systems for automotive applications.
Further Reading:
1. AN89056 – PSoC® 4 – IEC 60730 Class B and IEC 61508 SIL Safety Software Library
2. Functional Safety, https://www.cypress.com/functional-safety