SUNRISE Safety Assurance Framework

Use Cases

There are 4 Use Case categories which are further distributed in 8 sub-use cases, as shown below. Please click on a UC/sub-UC green button to read more.

UC 1 – Urban AD validation

UC 2 – Traffic jam AD validation

UC 3 – Highway AD validation

UC 4 – Freight vehicle automated parking validation

The following table summarizes the SAF blocks that are covered by each UC. Please click an ‘X’ to read more details…

Item	UC1.1	UC1.2	UC1.3	UC2.1	UC3.1	UC3.2	UC4.1	UC4.2
SUNRISE DF				X	X	X
Query & Concretise	X	X	X	X	X	X	X	X
Allocate	X	X	X	X	X	X	X
Execute	X	X	X	X	X	X	X	X
Test Evaluate	X	X	X	X	X	X	X	X
Coverage		X	X		X	X
Safety Case		X	X	X	X	X	X
Decide		X	X	X	X	X	X

The function of the federated layer in this context is to allow for queried scenarios to be selected for testing, which have been requested and verified as provided in the correct format. The federated layer is not developed at this stage. Current scenarios mimic the use of the federated layer in that the requirements for the scenarios were established in the SUNRISE deliverable D7.1 and queried to meet said requirements. The format of the scenarios is consistent as OpenSCENARIO 1.0 and OpenDRIVE files. Due to the state of readiness of the Federated DF, the scenarios were created outside but validated by a series of queries to the DF and subsequent analysis to ensure they align. Thus, an online SCDB (Streetwise) and another SCDB (Safety PoolTM) have been queried based on the requirements of the use case set out by T7.1 for UC 2.1 by the federated layer.

Provision of scenarios tailored to the requirements of the UC in the required common format for execution, except for V2X elements which are not supported directly in ASAM OpenSCENARIO. Despite the scenarios being created before the deployment of the Federated DF, the scenarios have been validated as though they were output from the Federated DF. Several queries have been tested based on different ODDs and analysed whether the logical scenarios used in this use case (and their parameter ranges) match the ODDs and the expectations of the experts.

Provision of scenarios tailored to the requirements of the UC in the required common format for execution, with the exception of V2X elements which are not supported directly in ASAM OpenSCENARIO. As for use case 3.1 the scenarios for use in testing were created when the federated DF was not available for use. The Federated DF has been validated by testing several queries based on different ODDs and analysed if the logical scenarios provided for the use case as an outcome (and their parameter ranges) match the ODDs and the expectations of the experts for the scenarios provided.

Creation of the scenarios defined in D7.1 in CARLA with different parametrization. The method described in D3.4 has been used to create and sample the scenarios.
Use of a knowledge-based approach to creating test cases (Euro NCAP scenarios, which are based on expert knowledge and represent an important aspect to consider for supporting ADS validation). This approach enables the demonstration of the capability of handling knowledge-driven scenarios as well, addressing the needs of many vehicle safety bodies, especially for urban-based environments. This approach actively supports the validation of this SAF block.
Usage of a sampling method described in D3.4 to create test cases related to the scenarios defined in D7.1. Resulting test parameters, behaviours of the other participants and quantities measured that are aligned with the provided ODD, external requirements and test objectives have been checked. Not any missing critical scenario has been also checked.
Usage of two of the test cases that are previously defined and utilization of same sampling method defined in D3.4 for parametrization.
Usage of a sampling method described in D3.4 to create test cases that can be allocated in the following block.

Provision of concrete scenarios for simulation based on the methodology developed in WP3. For the case generation, the query process is a required input and has been validated against the validation criteria defined in D7.2/chapter 2 for the logical scenarios defined in the use case. The sampling methodology has been validated based on the coverage of the parameter space by the KPI’s defined in the use case.
Definition of the logical scenario and provision of the sampling according to the SAF block.

Approach A (ICCS): Starting from a logical scenario described by a given parameter space with defined parameter ranges, a smart sampling method and scenario (pass/fail) outcome prediction is proposed based on a machine learning Gaussian Process-based algorithm. The proposed method allows for i) estimation of the pass/ fail probability of unseen concrete scenarios without the need for execution/simulation, and ii) generation of scenarios close to the pass/fail boundary, which are the hardest to predict their outcomes (high uncertainty). Different pass/fail criteria are used depending on the SUT (Perception, Collective Perception, AD_controller) generating different set of ‘critical’ scenarios close to the pass/fail boundary. Uncertainty of pass/fail outcome (each test scenario outcome is recorded along with a pass/fail confidence value assigned to it), is then used in order to take decisions on specific set of scenarios to be allocated in higher-fidelity testing environments, i.e. hybrid testing. Concretization and critical scenario retrieval is validated through simulation execution. Method is described in deliverable D3.4 while its metrics are described in D5.3.
Appoach B (VED): Use of hand-crafted concrete scenarios and application of a ready-to-use tool from T3.3 to concretize scenarios using some criteria. Concretization is validated through simulation execution and verifying that each concrete scenario is successfully executed and leads to slightly different results/KPIs.

Note: ‘Querying’ step not supported by this UC.

Validation of the Query & Concretise block by creating concrete scenarios from logical scenarios by using the methodology described in deliverable D3.4.

Generation of concrete scenarios for simulation based on the methodology developed in WP3. The case generation process requires input from the query process, which has been validated against the criteria outlined for the logical scenarios defined in the use case. Additionally, the sampling methodology has been evaluated by assessing how well the parameter space is covered by the KPIs specified in the use case.
Comparison of logical scenarios from the use case to internally defined scenarios.

Queries have been created to search for scenarios via the federated SCDB. The suggested scenarios have been compared to the ones defined by the experts in WP7.
The bulk of the scenarios come from translations of the abstract scenarios defined as ‘key’ in the D7.1. Whilst the provision of scenarios from an existing database such as Safety PoolTM would provide real measurement data, the novelty of the communications described in the use case prevented this. The UC has tested the completeness by checking whether the concrete scenarios missed any critical case (this is done by sampling the parameter space with many new randomly generated samples to verify that nothing was missed). The efficiency of the concrete scenarios has been tested with a ML surrogate model and bootstrapping aggregation method (this is done by comparing the number of concrete scenarios to the minimum number that still captures all the important features). Additionally, the obtained behaviours of participants and the quantities that must be measured can be checked for completeness.
Generation of concrete scenarios for simulation based on the methodology developed in WP3. The case generation process requires input from the query process, which has been validated against the criteria outlined in D7.2 for the logical scenarios defined in the use case. Additionally, the sampling methodology has been evaluated by assessing how well the parameter space is covered by the KPIs specified in the use case.
Comparison of logical scenarios from the use case to internally defined scenarios.

A hypothetical query, built on the SUT, ODD, and test objectives, guides the collection of initial scenario data from these experiments. This data then undergoes scenario concretisation to ensure the generated tests are efficient and complete. Validate sufficient coverage of relevant test space and test coverage to be free from critical gaps within the region of interest or across the parameter space. Since an exhaustive search is foreseen to be achieved, the effectiveness of combinatorial testing can be investigated.
Using these concretised data, test cases are generated to evaluate safety performance indicators (SPIs). Each test case specifies a concrete test scenario, including inputs, preconditions, and expected results.
Validation of the “Concretisation” part in this block by evaluating the criteria assessing scenario data completeness, data accuracy, and absence of data corruption in combination with test execution. The versioning and traceability criterion is planned to be assessed during the evaluation of the block “Test Evaluation” for the planned test data collections.

Concrete scenarios were hand-written as instantiations of the logical scenarios defined in D7.1 that were based on scenarios used in UC4.1. The twist here is the integration in the scenario description of two possible cyber-attack ‘triggering conditions’ that lead to similar effects on compromising the Collective Perception system with faulty object information.
This UC did not focus on coverage aspects as this is already covered by the work in UC1.3 (Collective Perception testing).

Usage of virtual testing using CARLA. The process described in D3.3 for the initial allocation has been used.
Verification that a test case can be allocated to the available test instance, through knowledge-driven scenarios (Euro NCAP).
Verification that a test case can be allocated to the available test instance, through proving ground testing.
Verification that a test case can be allocated to the available test instance, through virtual testing.

Provision of an allocation strategy based on the methodology developed in WP3. The allocation strategy can be validated by comparing the results of the simulation and physical testing against the output of the allocation strategy.
Conduction of the cases that are assigned to simulations. By evaluating whether the simulations successfully generated accurate evaluation metrics, one can determine whether the simulation test instances were appropriate or if the allocation block should have indicated test instances with higher fidelity.
Definition of a list of cases based on three parameters: vehicle speed, the distances to the traffic light, and the state of the light. An essential variable is added to this: the occupancy of the crossing area.

Use of the initial allocation of the test cases to the test instances with the process described in the deliverable D3.3. All tests that can be performed using simulations will be allocated to virtual testing. However, during the process of safety case building (see ‘safety case’ sub-block), a few test scenarios will be re-allocated to physical or hybrid testing according to the principle of dynamic re-allocation supported by the SAF. Selection of these re-allocated scenarios is based either on the limitations of the virtual setup to model certain aspects (e.g. connectivity delays), or the scenario outcome ‘Test evaluate’ results and their proximity to the pass/fail boundary.

Allocation of concrete scenarios to test instances as described in D3.3 [5] to validate defined scenarios in D7.1. Selected scenarios from the simulation, which are validated, then have been executed in the proving ground.

Application of the initial allocation process of Deliverable D3.3 and validation of the test cases according to the upcoming Deliverable D3.5. Virtual simulation and proving ground testing have been used as test instances to validate the allocation. It has been verified that test cases can be allocated to test instances and checked whether the initial allocation indicates test instances of unexpected high fidelity. Potential reallocation has been applied according to Deliverable D3.5.
Support of the allocation validation by providing the connection to the development in WP3.
Conduction of the cases that are assigned to simulations. By evaluating whether the simulations successfully generated accurate evaluation metrics, one can determine whether the simulation test instances were appropriate or if the allocation block should have indicated test instances with higher fidelity.

Allocation validation by providing the connection to the development in WP3.
Conduction of the cases that are assigned to simulations. By evaluating whether the simulations successfully generated accurate evaluation metrics, one can determine whether the simulation test instances were appropriate or if the allocation block should have indicated test instances with higher fidelity.
Application of the initial allocation process in chapters 3 and 4 of D3.3 and validation of the test cases according to D3.5. Virtual simulation and proving ground testing have been used as test instances to validate the allocation. It has been verified that test cases can be allocated to test instances and checked whether the initial allocation indicates test instances of unexpected high fidelity. Potential reallocation has been applied according to D3.5.

Use of the initial allocation of the test cases to the test instances with the process from D3.3 “Report on the Initial Allocation of Scenarios to Test Instances”.
With two or more test instances that can execute comparable tests, evaluating the effectiveness of the test instances allocation is possible.

Usage of the framework specified in D4.4 using CARLA for camera sensors.
Description of the virtual testing setup which eventually enables to demonstrate that the execution block of the SAF is able to handle EuroNCAP scenarios as well.
Preparation for the execution of the allocated test cases on the AD prototype. This proves if all tests could be executed and if the needed signals can be obtained for further test evaluation.
Execution of allocated test cases on V&V virtual simulation framework defined in D4.4. The additional changes include a perception pipeline to meet test case requirements.
Preparation of the execution of the allocated test cases. The execution has been done by using a customised version of the harmonised V&V simulation framework described in D4.4.

Provision of physical testing for the use case on a proving ground and validation of the execution block for physical testing.
Running of the simulations and reporting whether each concrete scenario was correctly executed or if it was not completed and why.
Execution of concrete scenarios on the simulation model in the loop and with a car on Proving ground. The used simulator and the vehicle have exactly the same core functions.

Execution of allocated test scenarios on the virtual simulation framework described in D4.4 and D4.6. Two pipelines are implemented: i) virtual testing and ii) hybrid testing with one real agent. Validation of the “Execute” block is performed in two phases, adhering to deliverable D7.2 that defines specific validation requirements for this SAF sub-block. The primary objective of this validation is to ensure that test scenario execution results meet the input requirements of the “Test Evaluate” and “Coverage” blocks that follow.

Demonstration of automated scenario executions in the simulation toolchain. Additionally, results have been used for the KPI calculations and test reports.
Demonstration of the execution of physical test runs in a black-box approach (based on the outcomes of T4.6), like it has been done by consumer testing or market surveillance of a regulator using the SUNRISE SAF.
Development of the SUT and support for the integration of the System into the virtual testing environment.

Run the simulations using the framework described in D7.2 section 3.5.1.1 and conduction of proving ground testing as described in D7.2 section 3.5.1.2. It has been reported whether each concrete scenario was executed correctly and the information needed in the following blocks, Test Evaluate and Decide, is available. In case of missing data (e.g., a failed simulation), the block which causes the problem has been identified.
Support in the validation of the sensor models. For the validation, the sensor information from real-world tests is compared to the simulated sensor output.
Running the simulations and reporting whether each concrete scenario was correctly executed or if it was not completed and why.

Run the simulations using the framework described in D7.2 section 3.6.1.1 and conduction of proving ground testing as described in D7.2 section 3.6.1.2. It has been reported whether each concrete scenario was executed correctly and the information needed in the following blocks, Test Evaluate and Decide, is available. In case of missing data (e.g., a failed simulation), the block which causes the problem has been identified.
Support in the validation of the sensor models. For the validation, the sensor information from real-world tests is compared to the simulated sensor output.
Run the simulations and reporting whether each concrete scenario was correctly executed or if it was not completed and why.

Tests have been performed as virtual tests (D4.4) and physical tests. Efficient orchestration requires test scenarios to be machine-readable and suitable for batch testing. A goal for efficiency is that the recorded data needed to evaluate the validity and results of tests have been collected automatically. Verifying that all results comply with the input required by the “Test Evaluate” and “Coverage” blocks, also confirming that data is free from errors related to test preparation or execution.
Support of the validation of the execution aspect in this block by planning, preparing, and conducting several test data collections with a real-scale truck according to the validation criteria as listed in D7.2 Section 2.2. In particular focus are concrete aspects such as data presence, data synchronicity, absence of data corruption, and data completeness to meet the expectations for the validation of the block “Test Evaluate”.

Tests have been performed in virtual test environment using a simulation toolchain developed around CARLA.
The simulation framework presented in D7.2 Figure 42 is used to execute the scenarios and the validation criteria from D7.2 section 2.2 will be applied to verify the quality of the simulation execution. A new scenario execution mechanism to support cyber-attack triggering event was developed. Co-simulation setup integrating virtual RSU and CPM spoofing in CARLA including a) camera sensor spoofing to interfere with quality of raw sensor data (exploiting light mechanisms inside CARLA simulator) ; b) ghost object spoofing on virtual parking scene (interfering with CPM content directly). Integration with truck’s controller from UC4.1.

Usage of a collision and route completion metric to decide the pass/fail criteria of the scenarios. Other metrics provided by the SAF can be computed to compare.
Review of the evaluation process based on the obtained parameters on the vehicle and reception of input from the external requirements. In this use case, a good metric for analysing a perception system could be if a specific obstacle has been properly classified/detected and if the distance/ speed to the obstacle is also correct.

Ensuring that the pass/fail criteria of this block includes the criteria from EuroNCAP, demonstrating that this block of capable of handling these types of scenarios as well.
Evaluation if at that stage there is all the needed information from the ‘Coverage’ and ‘Test evaluate’ blocks to determine if the perception system is safe or not. Those metrics could be accuracy/ precision/ recall and average tracking errors.
Following the execution of virtual test cases, some metrics such as collision or time-to-collision have been used to evaluate the severity of scenarios and group them as safe or unsafe. Additionally, unsafe scenarios may be categorized under failed SAF subsystem, including perception or control blocks.

Conduction of test evaluation necessary for the methodology and usage of the validation criteria to validate the physical tests.
Performance indicators per concrete scenario, evaluation of the KPIs according to the SAF and comparison of these evaluations with internal evaluations.
Computation of KPIs for every single test case and evaluation and comparison for both simulation and real car.

The pass/fail criteria defined were based on popular metrics employed in the AD control and perception testing literature (like TTC and object detection accuracy and confidence respectively) with the addition of connectivity-related metrics that capture the effects of network parameters in the context of this UC, where AD driving and network are co-simulated or tested in a hybrid setup (metrics were described in D5.3). Test scenarios allocated to a hybrid testing environment (XiL) have been validated against their counterparts in simulation. Virtual and hybrid (Vehicle-in-the-Loop) testing results were compared based on the same set of KPIs.

Use of KPIs, defined in D7.1 and D3.2, KPI calculation scripts and test report templates for both simulation and proving ground test results evaluations.
Evaluation of the test data after test execution by taking into account the metrics/KPIs defined in D7.1 and T3.5.

For each concrete scenario executed in virtual simulation or on proving ground, the metrics from the SAF for evaluating the safety performance of the HWP and the validity of the test runs have been returned. In case a metric cannot be calculated, the block which causes the problem has been identified. Based on experience, it has been checked whether the metrics from the SAF are appropriate for measuring the safety performance of the HWP and the validity of the test run.
Return of the performance indicators per concrete scenario and comparison of the SAF evaluations with internal evaluations.

Returns the performance indicators per concrete scenario and compares evaluations based on SAF with internal evaluations.
For each concrete scenario executed in virtual simulation or on proving ground, the metrics from the SAF for evaluating the safety performance of the HWP and the validity of the test runs have been returned. In case a metric cannot be calculated, the block which causes the problem has been identified. Based on experience, it has been checked whether the metrics from the SAF are appropriate for measuring the safety performance of the HWP and the validity of the test run.

UC4.1 aims firstly to evaluate the validity of the test and then the relevance of the result contribution to the safety performance indicators previously described and criteria from deliverable D3.5. The UC has validated that it is possible to use distribution-based and rule-based indicators. The results must lend themselves to be returned to the SAF to be aggregated towards the Required test coverage.
Support the validation for this block by extracting, analysing, and providing meta-information about the executed tests relevant to the validity of the data.

The result of the ‘execute’ block, focusing on the backing function, have been quantified through UC4.1 controller’s safety KPIs. Test scenario evaluation is performed by providing a ground truth execution (test scenario outcome with and without the cyberattack).

Note: In this UC a real-world attack is assumed but without directly attacking the vehicle under test but the infrastructure or the V2X message. It is acknowledged that these kinds of attacks are neither solely a cybersecurity nor a SOTIF topic. It is rather a joint effort to account for these topics, probably with additional domains. Thereto, one might use methods from both domains as follows. The risk on objects outside the vehicle is considered by the different standards. It is important that mitigations are evaluated, both inside and outside the vehicle [see ISO/SAE 21434:—, 8.9, RQ-08- 12 (risk transfer)]. In the end, it may not be the task for road vehicle cybersecurity to mitigate these risks with typical cybersecurity measures like integrity checks or proofing the authenticity as the attacked environment is not cyber-physically connected to the considered system.

Coverage of the cases in the parameter space, necessary for the methodology. By comparing against results from simulation and real-world testing we validate that the surrogate model from which the generated test cases are derived provides an approximation of the underlying distributions of the KPI’s over the parameter space. This only validates part of the “Coverage” block since scenario coverage cannot be determined.
Cross-checks with one of the following methods: a) new independent test samples or (either in the parameter space or with new parameters that were not considered in the logical scenario, b) with a surrogate ML model of the test results across the logical scenario.
Coverage of the test cases

Scenario space coverage on the logical scenario level is assessed by using the results of the smart scenario-generation probabilistic process developed in T3.3 by ICCS (see ‘Query and Concretize’ sub-block). High scenario space coverage in virtual setup and possible test scenario re-allocation is achieved through a Gaussian Process-based concretization method that classifies the entire scenario space into ‘pass’, ‘fail’ and ‘boundary’ regions. The proposed method allows for estimation (with certain confidence) of the pass/ fail probability of unseen concrete scenarios without the need for execution/simulation and guides scenario generation towards new scenarios that are close to the pass/fail boundary, which are the hardest to predict their outcomes (high uncertainty). The coverage validation is performed by executing random scenarios in the estimated pass, fail and boundary regions and evaluating them in simulation to verify pass or fail decision.
ODD coverage analysis is also performed by counting the set of attributes covered through virtual testing, following the process developed in D3.4. Finally, as a joint task between ‘Coverage’ and ‘Safety case building’ blocks (see ‘Safety Case’ sub-block), safety case requirements’ coverage is partially addressed by analyzing the output of critical scenario generation when different pass/fail criteria apply (perception vs. control – focused metrics).

Provision of results from virtual simulations and proving ground testing for validation of the coverage.
Evaluation of the coverage of cases within the parameter space required for the methodology. By comparing the results against simulations and real-world testing, we validate that the surrogate model, which generates the test cases, effectively approximates the underlying distributions of the KPIs across the parameter space. However, this validation only addresses part of the coverage block, as scenario coverage cannot be assessed.
Cross checks SAF coverage with the bootstrapping aggregation methods mentioned above, in query and concretize.

Provision of results from virtual simulations and proving ground testing for validation of the coverage.
Evaluation of the coverage of cases within the parameter space required for the methodology. By comparing the results against simulations and real-world testing, we validate that the surrogate model, which generates the test cases, effectively approximates the underlying distributions of the KPIs across the parameter space. However, this validation only addresses part of the coverage block, as scenario coverage cannot be assessed.
Comparison of SAF coverage with the ML bootstrapping aggregation method mentioned in UC3.1 (D7.2 section 3.5.2).

The safety case was built based solely on simulation results, gathering evidence through a) microscopic evaluation (on individual scenarios) of pass/fail results, b) macroscopic evaluation (statistical analysis) of aggregated results, and c) identification of edge/corner-cases. To achieve this, coverage-driven scenario space exploration was performed targeting: i) high-risk scenario identification and ii) boundary scenarios identification (high uncertainty). Following the safety case building, a safe speed recommendation function was also derived as an artifact based on the ‘safe’ scenario space area (the ego speed was part of the explored scenario space).

Safety case building aims at safety case requirements’ coverage and follows a modular approach where both independent subsystems’ testing as well as end-to-end system testing is considered.
It includes, in alignment with the general guidelines of the ISO34505, a) scenario space coverage results’ analysis based on the results from ‘Coverage’ sub-block b) the analysis of critical scenario generation when different pass/fail criteria apply (perception vs. control – focused metrics) b) critical scenario prioritization based on criticality and pass/fail classification uncertainty c) dynamic re-allocation of selected prioritized test scenarios from virtual to hybrid test environment and d) a method to combine metrics from different SuT subsystems and results coming from virtual and hybrid test environment based on the analysis provided in D4.6 . Please note that in the context of this UC, the ultimate goal that is to combine results from different test environments to argue ‘macroscopic’ safety margin violation for one logical scenario is only partially covered since the complete analysis that would include scenarios complexity and exposure as extra metrics to these of criticality and coverage is here missing.
It also includes analysis of safety case artifacts, namely the analysis of test scenario repeatability during execution and system under test reaction times in virtual and hybrid setup to understand the sim2real gap.

In UC2.1, we began by querying the scenario database to identify all scenarios relevant to the Traffic Jam Assist’s Operational Design Domain (ODD), system functions, and test objectives, ensuring our selection reflected realistic and critical traffic jam situations.

During concretising, we specified logical scenarios by assigning concrete parameter values (such as actor behaviors) to make each scenario directly testable and clearly linked to our safety requirements.

SAF

In the execute step, we performed the tests in the allocated environments, ensuring all necessary preparations and tools were in place and that the execution was traceable and reproducible.

Finally, in the test evaluate phase, we systematically assessed each test execution against the defined safety criteria for Traffic Jam Assist (e.g., collision avoidance, speed and lane keeping compliance), using coverage metrics and pass/fail results to iteratively refine our scenario set and provide robust, traceable evidence for the Safety Case.

This UC is related to safety, specifically remaining in the lane and avoiding abrupt decelerations that may be unsafe for followers, as well as to performance improvements due to map-based information integration. KPIs include smooth deceleration and acceleration (to mitigate rear-end collision risks), lane keeping, and the correct speed in the curves. The safety case was built based on randomly sampled concrete scenarios executed in both simulation and proving ground test environments, gathering evidence through a) microscopic evaluation (on individual scenarios) of pass/fail results and b) macroscopic evaluation (statistical analysis) of aggregated results. Repeatable execution of scenarios in two allocated test environments was performed (with fewer scenarios tested in PG), thereby increasing the initial coverage obtained from virtual testing only.

In this UC, a modular testing approach is developed for evaluating up to three AD systems (three types of planners and a controller). Two different logical scenarios were considered: one with a cooperative vehicle communicating its planned maneuver and another with vehicle negotiations. The latter scenario was reproduced in the proving ground.
For the coverage analysis, a novel approach to automatic scenario space exploration and boundary discovery was developed. This approach leads to high coverage results with confidence through simulation. The machine learning algorithm developed has the advantage of avoiding the laborious need to re-execute test scenarios on the entire scenario space.
Based on different collision-relevant metrics, test scenarios are classified into regions of pass/fail with uncertainty. This can be directly used in the identification of high-risk scenarios where the AD function is predicted to perform a safety violation, and hence it forms part of the safety argumentation aligned with the computation of collision rates (as requested by ISO/TS 5083:2025) and ISO34505 guidelines on safety case building. In addition, scenarios assigned with greater uncertainty can be prioritized for further testing in simulation or in the proving ground (again in alignment with ISO34505, and/or to discover and classify failure modes. Repeatable execution of scenarios in two allocated test environments had been performed (fewer scenarios tested in PG), increasing the initial coverage obtained from virtual testing only.

The safety argument or assurance contribution block was addressed by implementing and evaluating key performance indicators (KPIs) to assess the system’s ability to achieve relevant safety objectives in a confined logistic environment. The reverse parking maneuver of a truck with a semitrailer served as the primary scenario, providing a basis for both performance assessment and test methodology validation. The KPIs included:

KPI

ODD

KPI

Cross-check of the SAF decision with one of the methods mentioned above.

Combine results from the preceding ‘Safety Argument’ sub-block, namely “Safety Case”, for all test environments, namely virtual and hybrid, and determines if the SuT can be classified as safe or if more testing is required.

To evaluate the performance of the system in the TJA Use Case, Task 3.5 provided pass/fail criteria. These assessment criteria are KPIs, metrics and other requirements. In the EU regulation R157, on which UC 2.1 is based, the requirements are openly formulated (e.g. system must stick to the traffic rules and avoid collisions up to a specific TTC). The SAF specified concrete evaluation metrics related to the defined test scenarios and based on the general requirements from the regulations.
Validation of the “Decide”-block after the evaluation of the test results by taking into account external requirements from the EU R157, on which UC2.1 is based.

Falsification of the SAF decision with one of the following methods: a) new independent test samples (either in the parameter space or with new parameters that were not considered in the logical scenario), b) with a surrogate ML model (bootstrapping aggregation method) of the test results across the logical scenario.
Derive a safety evaluation of the HWP from the SAF. Based on experience, these guidelines and the result have been checked for plausibility and feasibility.

Falsification of the SAF decision with one of the following methods: a) new independent test samples (either in the parameter space or with new parameters that were not considered in the logical scenario), b) with a surrogate ML model (bootstrapping aggregation method) of the test results across the logical scenario.
Derive of a safety evaluation of the HWP from the SAF. Based on experience, these guidelines and the result have been checked for plausibility and feasibility.

Validate the applicability of the guidelines from Deliverable D2.3 on the results of the UC.
Support the validation of the applicability of the guidelines focusing in particular on the evaluation of the several real test scenarios using the real-scale truck to unveil potential improvement aspects with the goal to incorporate the experiences from the tests.

Please click an ‘X’ to read more details.

← Previous Next →

Cookie	Duration	Description
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Directory

High-Level Overview

Glossary

Input

Data Framework

Performance Assurance

Scenario

Create

Format

Store

Environment

Query & Concretise

Allocate

Execute

Safety Argument

Coverage

Test Evaluate

Safety Case

Decide

Audit

Use Cases

SAF Solutions

Submit Feedback

SUNRISE Safety Assurance Framework

Use Cases