"Safety" Scrum

We adopted an agile approach imbued with practices for engineering safety-critical systems.  The diagram on the right shows the typical SCRUM process, in which stakeholders work with the product owner to create user stories and place them into a ranked product backlog.  In a non-safety-critical project, the customer then selects user stories for a sprint, and the team implements them in order to release working, executable software at the end of the sprint.

So what do we do differently?

  1. First, instead of typical user stories, we write requirements in the EARS (Easy Approach to Requirements Syntax) format, which is better suited to the requirements of a cyber-physical system.
  2. Second, we perform a preliminary hazard analysis at the start of the project to identify specific hazards that could cause harm if they were to occur.
  3. Third, for each of the identified hazards we identify contributing faults and document them in a Failure Mode, Effects, and Criticality Analysis (FMECA) table.
  4. We then write mitigating requirements, which we refer to as safety requirements, to eliminate or reduce the risk and/or impact of each fault.
  5. When stories (i.e., EARS requirements) are selected for an iteration, we perform a detailed hazard analysis to ensure that we have thought through all potential hazards and faults associated with each feature and with the interactions between new features and those already implemented in the system.  This leads to modifications to the FMECA and to additional safety stories, which are placed into the backlog.
  6. The sprint proceeds as normal; however, at the end of the sprint we inspect the delivered features for dependent safety stories.  The software is only considered safe for use if all safety stories associated with all features (and feature interactions) have been fully implemented and tested.
  7. Finally, we add specific testing and simulation tasks to the backlog.  These also create dependencies on each feature, and a feature can only be considered safe once all of its dependencies have been handled.
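
For readers unfamiliar with EARS (step 1), the sketch below lists the five basic EARS sentence patterns as string templates and fills one in.  The helper name `ears` and the sample requirement wording are hypothetical illustrations, not requirements taken from our backlog.

```python
# Illustrative sketch only: the five basic EARS sentence patterns.
# The helper name and the sample requirement text are hypothetical.
EARS_PATTERNS = {
    "ubiquitous": "The {system} shall {response}.",
    "event_driven": "When {trigger}, the {system} shall {response}.",
    "state_driven": "While {state}, the {system} shall {response}.",
    "unwanted_behavior": "If {trigger}, then the {system} shall {response}.",
    "optional_feature": "Where {feature}, the {system} shall {response}.",
}


def ears(pattern: str, **parts: str) -> str:
    """Fill the named EARS pattern with the given sentence parts."""
    return EARS_PATTERNS[pattern].format(**parts)


# A made-up safety requirement written with the "unwanted behavior" pattern.
sample_requirement = ears(
    "unwanted_behavior",
    trigger="the distance to another UAV falls below the minimum separation",
    system="UAV",
    response="execute an avoidance maneuver",
)
print(sample_requirement)
```

The "unwanted behavior" pattern (If ..., then ...) tends to be a natural fit for mitigating requirements, because it names the fault condition and the required response in a single sentence.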

Example:  Multiple UAVs shall fly in a shared airspace.

Delivering the software for multiple UAVs to fly in a shared airspace is relatively straightforward.  However, achieving it in a way that ensures effective collision avoidance involves numerous safety stories.  Working, executable code may be delivered but may not necessarily be safe for use.  Our SCRUM process keeps us informed of the current safety status of the software.
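
The distinction between "delivered" and "safe for use" can be sketched as a simple dependency check.  This is a minimal illustration of the rule described above, not our actual tooling; the class names and the two safety stories attached to the feature are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SafetyStory:
    """A mitigating requirement or testing/simulation task a feature depends on."""
    title: str
    implemented: bool = False
    tested: bool = False

    @property
    def done(self) -> bool:
        # A safety story only counts once it is both implemented and tested.
        return self.implemented and self.tested


@dataclass
class Feature:
    title: str
    delivered: bool = False
    safety_stories: List[SafetyStory] = field(default_factory=list)

    def open_safety_stories(self) -> List[SafetyStory]:
        """Return the dependent safety stories that are not yet done."""
        return [s for s in self.safety_stories if not s.done]

    @property
    def safe_for_use(self) -> bool:
        # Delivered code is safe only if no dependent safety story is open.
        return self.delivered and not self.open_safety_stories()


# Hypothetical safety stories for the example feature.
shared_airspace = Feature(
    title="Multiple UAVs shall fly in a shared airspace.",
    delivered=True,
    safety_stories=[
        SafetyStory("Maintain minimum separation between UAVs",
                    implemented=True, tested=True),
        SafetyStory("Simulate converging flight paths", implemented=True),
    ],
)

print(shared_airspace.safe_for_use)
print([s.title for s in shared_airspace.open_safety_stories()])
```

In this sketch the feature has been delivered, yet it is not reported as safe because one dependent story is still untested.  Inspecting the open stories at the end of each sprint is what gives the team its current safety status.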