The internal_add
function tries to efficiently insert a new range of integers into an existing list of sorted and disjoint integer ranges. For example, if we started with [101..=102, 400..=402, 404..=405]
and added 402..=404
, we expect a result of [101..=102, 400..=405]
.
Ideally, I’d formally verify this algorithm with Rust-specific tools [1,2]. Those tools, however, seem hard to use. Instead, I chose Dafny. Dafny is a language and verification system. It is taught to undergraduates at universities around the world. It is used in industry. I find it to be addictively interactive and programmer friendly.
Aside: Dafny creator, Dr. Rustan Leino has a connection to Rust beyond the coincidence of his first name. He helped create Spec#, the first language to use a type system to avoid null pointers. Rust, of course, adopted this idea to great success.
This article covers rules 1 to 6. Part 2 will cover rules 7 to 9.
Before trying to prove the mathematical correctness of your algorithm, decide if the effort is worth the benefit.
Dafny is not Rust. Using Dafny requires porting algorithms of interest from Rust to Dafny. This port can miss details and introduce errors. Given this risk, should you use Dafny to verify Rust algorithms? I boldly claim that “it depends”.
- How important is your algorithm’s correctness? If you are printing a report and it looks right, it probably is right. The
internal_add
algorithm relates to a data structure that I’d like others to use with confidence, giving me extra motivation to verify it. - Maybe all formal verification, with current tools, is too hard. I believe, however, that Dafny makes formal verification as easy as currently possible. You will find formally verifying code easier if you are already familiar with types (for example, from Rust) and recursion/induction (used infrequently in Rust). You can read this article and decide for yourself if/when formal verification is easy enough to be valuable to you.
- Maybe fuzzing (such as
cargo-fuzz
) and property-based testing (such asQuickCheck
) are good enough. Although these methods do not provide mathematical certainty, they are clever, useful, and easy to use. (Therange-set-blaze
crate already usesQuickCheck. See
Rule 9.5 in a previous article for details). - Maybe formal verification is and will always be doomed because writing a specification is as hard as writing code. I disagree. Think about refactoring. I often start coding by writing something simple. I then refactor this simple code for efficiency. For
internal_add
, I found the specification to be simpler than any code. (You can judge this for yourself in Rule 4.)
Aside: Verification then becomes a computer-checked refactoring from a simple specification to a final, efficient algorithm.
- Maybe formal verification is and will always be doomed because the halting problem tells us formally that formality isn’t generally possible. The halting problem does not doom us. While we can’t always understand arbitrary code, we don’t need to. We only need to understand our own code, which we (hopefully) wrote to be understandable. Starting in Rule 2, we’ll see how Dafny easily verifies that specific loops and recursions halt.
- Maybe porting to Dafny is too hard. This has not been my experience. Like Rust, Dafny mixes and matches imperative and functional programming. I found porting my algorithm to be straightforward.
Assuming you still want to verify your algorithm with Dafny, your next step is to learn Dafny.
Dafny is both a programming language and an interactive verification system. I recommend you install it as a VS Code extension.
To learn it, start at https://dafny.org/. Of special interest is the Online Tutorial and the Reference Manual. I also found the Verification Corner videos on YouTube helpful. (Of possible interest is the college textbook, Program Proofs, $49 for the Kindle Edition). I found the programming language part of Dafny easier to learn than Rust, perhaps similar in difficulty to C#.
Dafny, like Rust, is fully typed. Dafny, like Python, is garbage collected. Here is a “Hello World”:
// hello.dfy
method Main()
{
var s := "Hello World";
print s, "\n";
}
Dafny, also like Python, offers integers of arbitrary size. Here is a program that provably adds two natural numbers by repeatedly incrementing.
Some points of interest:
- Dafny coding guidelines follow C#, not Rust. So, we name the function
SlowAdd
notslow_add
(although either will run). - Dafny supports subtypes. For example, any
int
that can be shown to be non-negative is also anat
. - Assignment is
:=
and equality is==
. (There is no=
.) - Function parameters, for example,
x
andy
above, are immutable. - Dafny uses
ensures
andinvariant
statements to verify the code at compile-type. It then removes these statements to finish compiling. - The green check mark shows that this code verifies. Dafny’s VS Code extension will, by default, continuously try to validate each method. This adds an almost gambling-like excitement to working with Dafny. In the example above, if I make
y
anint
rather than anat
, then validation should and will fail. (Can you figure out why?) Dafny will mark my function with a red X and tell me “This postcondition might not hold: r == x + y
”. - Dafny knows some of the mathematics of integers, arrays, sets, maps, sequences, etc. This often allows it to finish the last details of validation by itself.
Now that you know about Dafny, you should let it know about your algorithm.
The range-set-blaze
crate represents sets of integers as sorted, disjoint ranges. For example, this list of three ranges:
100..=2_393,
20_303..=30_239_000,
501_000_013..=501_000_016
represents a set of 30,220,996 integers.
In Rust, the RangeSetBlaze
struct represents this data structure internally with a standard BTreeMap.
Recall that a BTreeMap
represents a list of key/value pairs, sorted by key. Here, our keys are the ranges’ starts (for example, 100
, 20_303
, 501_000_013
) and the values are the ranges’ inclusive ends (for example, 2_393
, 30_239_000, 501_000_016
. RangeSetBlaze
stores the list with a BTreeMap
rather than a vec
to make key look up more cache friendly.
RangeSetBlaze
depends on BTreeMap
, so must we implement BTreeMap
in Dafny? Happily, no. We can, instead, use Dafny’s vec
-like seq
data type. This substitution works because BTreeMap
, vec
, and seq
can all represent sorted lists — just with different efficiencies. For the purpose of formal verification, we only care about correctness and can ignore efficiency.
RangeSetBlaze
requires the list of ranges be sorted and disjoint. How do we say “sorted and disjoint” in Dafny? We can say it via this ghost predicate (and related code):
ghost predicate ValidSeq(sequence: seq<NeIntRange>) {
(forall i:nat, j:nat | i < j < |sequence| :: sequence[i].0 < sequence[j].0)
&& (forall i:nat, j:nat | i < j < |sequence| :: !Touch(sequence[i], sequence[j]))
}type IntRange = (int, int)
type NeIntRange = x: IntRange | !IsEmpty(x) witness (0,0)
function IsEmpty(r: IntRange): bool
{
r.0 > r.1
}
A predicate is another name for a method that returns bool
. A ghost method (or predicate) is one that can only be used for validation, not for running the final code.
At a high level, the ValidSeq
predicate takes as input a sequence of non-empty integer ranges. It then tests that the start values are sorted and that the ranges don’t touch. Specifically,
- An
IntRange
is a tuple of twoint
values. - An
IntRange
IsEmpty
exactly when its start is greater than its end. (This follows Rust’s convention.) - A
NeIntRange
(non-empty integer range) is anIntRange
that is not empty, for example,(0,0)
. [All our ranges are end inclusive.] - This expression tests that the start values are sorted:
forall i:nat, j:nat | i < j < |sequence| :: sequence[i].0 < sequence[j].0
It can be read as “for all natural numbers i and j — such that i is less than j and j is less than the length of the sequence — test that the start value at index i is less than the start value as index j”.
Aside: Note that a Rust
BTreeMap
doesn’t support (random-access) indexing but here we are using such indexing. This is OK becauseValidSeq
is a ghost predicate and so will only be used for validation.
- This expression tests that the ranges are disjoint:
forall i:nat, j:nat | i < j < |sequence| :: !Touch(sequence[i], sequence[j])
It can be read as “for all natural numbers i and j — such that i is less than j and j is less than the length of the sequence — test that the range at index i does not touch the range at index j. But what is Touch
?
We’ll define Touch
on two-levels. On a mathematical level, a range i is said to touch a range j if there exists an integer i0 in range i and an integer j0 in range j such that i0 and j0 are within a distance of one of each other. On an efficient programming level, we want to avoid definitions depending on “there exists”. Here is a Dafny predicate that is both mathematical and efficient:
predicate Touch(i: NeIntRange, j: NeIntRange)
ensures Touch(i, j) == exists i0, j0 ::
Contains(i, i0) && Contains(j, j0) && -1 <= i0 - j0 <= 1
{
assert Contains(i, i.0) && Contains(i, i.1) && Contains(j, j.0) && Contains(j, j.1);
if i.1 < j.0 then
assert (-1 <= i.1 - j.0 <= 1) == (i.1+1 == j.0);
i.1+1 == j.0
else if j.1 < i.0 then
assert (-1 <= j.1 - i.0 <= 1) == (j.1+1 == i.0);
j.1+1 == i.0
else
var k0 := Max(i.0, j.0);
assert Contains(i, k0) && Contains(j, k0);
true
}function Contains(r: IntRange, i: int): bool
{
r.0 <= i && i <= r.1
}
function Max(a: int, b: int): int
{
if a < b then b else a
}
Some points of interest:
Touch
is not a ghost. In other words, we can use it in both regular code and validation code.- The
assert
statements help Dafny prove that the regular code meets the mathematicalensures
statement. - For efficiency, the Dafny prover validates the inside of a
method
separately from its outside. Only theensures
(and the yet-to-be-seen,requires
) statements cross this border. In contrast to amethod
, a Dafnyfunction
is transparent to the validator. (I think of it as inlining code with respect to validation.)
With concepts such as ValidSeq
and Touch
defined, we next move onto specifying what our algorithm is supposed to do.
Ultimately, I want to prove that my specific Rust algorithm for inserting a new range into a RangeSetBlaze
is correct. Before we do that, however, let’s define what “correct” range insertion is.
method InternalAdd(xs: seq<NeIntRange>, a: IntRange) returns (rs: seq<NeIntRange>)
requires ValidSeq(xs)
ensures ValidSeq(rs)
ensures SeqToSet(rs) == SeqToSet(xs) + RangeToSet(a)
{
if IsEmpty(a)
{
rs := xs;
}
else
{
assume false; // cheat for now
}
}
This says that InternalAdd
is a method that takes xs
, a sequence of non-empty integer ranges, and a
, an integer range (that could be empty). The method outputs rs
, a new sequence of non-empty integer ranges.
We need to say that xs
and rs
must be sorted and disjoint. That is easily done with the ValidSeq
’s in the requires
and first ensures
.
We also need to say that rs
contains the right stuff. Is this hard? It is not. We just say that the set of integers in rs
must equal the set of integers in xs
unioned with the integers in a
.
Aside: In Dafny, “+” when applied to sets is “union”.
The set of integers in a range is:
ghost function RangeToSet(pair: IntRange): set<int>
{
set i {:autotriggers false} | pair.0 <= i <= pair.1 :: i
}
And the set of integers in a sequence of non-empty ranges can be define inductively (that is, recursively):
ghost function SeqToSet(sequence: seq<NeIntRange>): set<int>
decreases |sequence|
requires ValidSeq(sequence)
{
if |sequence| == 0 then {}
else if |sequence| == 1 then RangeToSet(sequence[0])
else RangeToSet(sequence[0]) + SeqToSet(sequence[1..])
}
Some points of interest:
- The line:
assume false; // cheat for now
makes validation work even if it really shouldn’t. We use it as a temporary placeholder. - We make
RangeToSet
andSeqToSet
ghosts to stop us from using them in regular code. We make them functions (instead of methods) to inline them with respect to validation. - Because Dafny knows a lot about creating and manipulating sets and sequences, we often profit by using sets and sequences in our specification.
- Even if our regular code uses loops instead of recursion, our validation code will often use recursive-like induction.
- The
{:autotriggers false}
relates to avoiding a warning message. For more information see this Stack Overflow answer by Prof. James Wilcox.
We now have a formal specification of InternalAdd
. I find this specification short and intuitive. But what if you need help figuring out a specification or other Dafny code?
The main forum for Dafny questions is Stack Overflow. To my surprise, I actually received much useful help there.
I recommend starting your question’s title with “Dafny:”. Also, be sure to tag your question with dafny
and, perhaps, formal-verification
.
Aside: On the site, you can see my 11 questions and Divyanshu Ranjan’s 48 Dafny-related answers.
As an open-source project on GitHub, Dafny also hosts GitHub Discussions and Issues.
The Dafny community is small but seems enthusiastic about helping users and improving the project.
With help at hand, we must next find an algorithm that meets the specification.
As a novice to formal verification, I decided to postpone work on the real internal_add
used in my Rust code. Instead, I started work on an InternalAdd
algorithm that I hoped would be easier to validate. I ended up with this:
method InternalAdd(xs: seq<NeIntRange>, a: IntRange) returns (rs: seq<NeIntRange>)
requires ValidSeq(xs)
ensures ValidSeq(rs)
ensures SeqToSet(rs) == SeqToSet(xs) + RangeToSet(a)
{
if IsEmpty(a)
{
rs := xs;
}
else
{
var notTouching, merged := PartitionAndMerge(xs, a);
var indexAfter := NoTouchIndexAfter(notTouching, merged);
rs := InsertAt(notTouching, [merged], indexAfter);
}
}
The idea is that if range a
is empty, we return the input sequence unchanged. Otherwise, we divide the work into three steps, which we can validate independently. The first step, PartitionAndMerge,
returns:
notTouching
, a sequence of ranges that don’t touch rangea
, andmerged
, a single range created froma
and everything it touches.
Here is an example input and output:
InternalAdd
next finds where to insert merged
and, finally, inserts it.
Here is the code for PartitionAndMerge:
method PartitionAndMerge(xs: seq<NeIntRange>, a: NeIntRange) returns (notTouching: seq<NeIntRange>, merged: NeIntRange)
requires ValidSeq(xs)ensures ValidSeq(notTouching)
ensures RangeToSet(merged) >= RangeToSet(a)
ensures forall range | range in notTouching :: !Touch(range, merged)
ensures SeqToSet(xs) + RangeToSet(a) == SeqToSet(notTouching) + RangeToSet(merged)
{
// Split into touching and not touching seqs
var touching: seq<NeIntRange>;
touching, notTouching := Partition(a, xs);
// Merge the touching seq into one range with our original range
merged := UnionSeq(a, touching);
}
This says that PartitionAndMerge
requires
that xs
be a valid sequence of non-empty integer ranges and that a
be a non-empty integer range. It ensures that nonTouching
is another valid sequence of non-empty integer ranges. It ensures that the integers in range merged
are a superset of those in range a
. It ensures that no range in notTouching
touches range merged
. And finally, it ensures that the integers in xs
and a
are exactly the same as the integers in notTouching
and merged
.
PartitionAndMerge
also divides the work, this time into two steps (Partition
and UnionSeq
) that can be validated independently. Those steps continue to subdivide their work. Where does it end? Let’s look at one example.
The method UnionSeq
calls UnionRange
which merges two ranges:
function UnionRange(x: IntRange, y: IntRange): IntRange
requires IsEmpty(x) || IsEmpty(y) || Touch(x, y)
ensures RangeToSet(x) + RangeToSet(y) == RangeToSet(UnionRange(x,y))
{
if IsEmpty(x) then y
else if IsEmpty(y) then x
else (Min(x.0, y.0), Max(x.1, y.1))
}
The UnionRange
code handles the empty cases and then returns the minimum bounding range. (The minimum bounding range is the range from the smaller of the two starts to the larger of the two ends.) But how can this be correct? In general, a minimum bounding range of two ranges might include extra integers. We might get something bigger than the union of the inputs, like so:
The code is correct because it requires
that the two input ranges touch or are empty. This ensures
that the union of the integers in range x
with the integers in range y
are exactly the integers in the output range.
At compile time, Dafny proves this function correct. Beyond that, it proves that everything that calls this function provides inputs that are empty or touching.
I think of this as a generalization of Rust’s borrow checker. At compile-time Rust checks that we are safe from many memory errors. At compile time, verification systems, such as Dafny, can prove almost arbitrary properties. Of course, as we are seeing, this ability comes at the cost of complexity.
The full code for this verified algorithm is about 200 lines, organized into about a dozen methods and functions.
This rule shows that we can verify an algorithm for InternalAdd
, but it is not the algorithm I used in Rust. We will turn to that next.