Assessing Formal Verification in SPARK

None, None

Assessing Formal Verification in SPARK

A Case-Study Evaluation of Formal Verification Tooling

Bachelor Thesis (2025)

Author(s)

D. Blanovschi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

K.F. Wullaert – Mentor (TU Delft - Programming Languages)

Benedikt Ahrens – Mentor (TU Delft - Programming Languages)

M. Izadi – Graduation committee member (TU Delft - Software Engineering)

Faculty

Electrical Engineering, Mathematics and Computer Science

Formal verification Ada Functional correctness SPARK TLA+

To reference this document use:

https://resolver.tudelft.nl/uuid:18b4b3ef-da6e-4bc8-8b2b-01f9a874a5d7

More Info

expand_more

Publication Year

2025

Language

English

Graduation Date

27-06-2025

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Faculty

Electrical Engineering, Mathematics and Computer Science

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Formal verification promises stronger correctness guarantees than conventional testing, yet it is often perceived as too costly or specialised for everyday software development. This thesis investigates whether SPARK — Ada’s provable subset — can deliver industrial- strength verification for representative algorithms and concurrent systems. Two separate questions guided the work: how well does SPARK work with (RQ1) sequential algorithms; and (RQ2) concurrent systems.
We implemented and formally proved insertion sort, quick-sort, and a generic statically-allocated hash-map to SPARK’s highest assurance level, achieving full functional correctness, as well as two concurrent case studies: a Publisher / Subscriber channel and a novel IO multi-reactor pattern runtime task scheduler, where we combined SPARK with TLA+ model checking to capture liveness and safety properties that SPARK cannot express directly. Across all sequential examples, proof overhead averaged ratios of 6–10 lines of specification, while for the concurrent case-studies, the TLA+ models averaged ratios of 1.7 lines, per every line of executable code.
The study shows that (i) SPARK is practical for non-trivial sequential algorithms, unless they make use of the heap; (ii) concurrency still requires external formalisms, but the combination remains tractable; and (iii) careful specification design, not solver performance, is the dominant cost driver. These findings confirm that correctness-by-construction is attainable within undergraduate project scope and provide quantitative benchmarks for future work.

Files

Research_Paper_38_.pdf

(pdf | 0.617 Mb)

License info not available