Verification of Imperative Programs

Jean-Baptiste Tristan & K. Rustan M. Leino

Methods

At first glance, Dafny's methods are like functions in C or Python or like methods in Java
The body of a method is a sequence of statements
Methods can use mutable local variables and have side effects

Usually, a method is used in the context of object-oriented programming

In Dafny, a method need not be defined for an object

  method Example(x: nat, y: nat) returns (z: nat) {
    var t: nat := 0;
    z := 0;
    for i := 0 to x {
      z := z + y;
    }
    z := t;
  }

Verifying methods

Recall that method calls are RHS expressions
They cannot be mentioned in logical formulas

Method verification is done using pre- and postconditions

You can state a property thanks to the explicit out-parameters

  method Example(x: int, y: int) returns (z: int)
    requires x > 0
    requires y > 0
    ensures z == x + y
    ensures z > 0
  {
    z := x + y;
  }

Methods and termination

Methods are allowed to diverge

In such a case, the postcondition holds only if the method does terminate

  method Bad(x: nat)
    ensures false
    decreases *
  {
    Bad(x + 1);
  }

Methods and termination – danger

Be wary of methods that are using “decreases *” for no reason

  method CorrectIsh()
    ensures 0 == 1
    decreases *
  {
    Bad(0);
  }

When verification fails

In some cases, Dafny verifies everything on its own
But that won't always be the case
How can we help Dafny verify the postconditions of a method?
Of course, we're going to write proofs, but how?

Floyd logic – 1

The body of a method is a sequence of statements
A method has a state comprised of its variables
Each statement can transform the state
If Dafny cannot verify the postcondition
- Since proof tactics are statements as well, they can be freely woven into the body of the method

A statement may affect the values of expressions mentioned in preconditions and earlier assertions

  method MixedCodeAndProof(x: nat) returns (y: nat)
    requires x > 1
    ensures y > x
  {
    y := x + 2;
    assert y == x + 2;
    y := 2 * y;
    assert y == 2 * x + 4;
    y := y - 5;
    assert y == 2 * x - 1;
  }

Floyd logic – 2

Proof statements can be involved and you can use forall, etc.
Tip: if you're going to write a complicated proof, encapsulate it in an assert by
That way, it is easy to spot code and proof

The IDE also tries to help you distinguish between proof and code by italicizing ghost code

  method MixedCodeAndProof(x: nat) returns (y: nat, ghost g: int)
    requires x > 1
    ensures y > x
  {
    y := x + 2;
    y := 2 * y;
    g := y + 10;
    y := y - 5;
    assert y > x by {
      assert y == 2 * x - 1;
      assert 2 * x - x > 1;
    }
  }

Hoare triples

Which assertion should help?
First and foremost, you should rely on your intuition
Move forward from hypothesis
Move backward from goal
Ask yourself: “if x was even and I multiplied it with an even, is it still even? Why?”
Of course, there are more formal rules that describe how statements affect propositions
- They are called Hoare triples

Call

  method M()
    requires Q()
    ensures R()

  method Call()
    requires P()
    requires P() ==> Q()
    requires R() ==> S()
    ensures S()
  {
    M();
  }

Sequencing

  method S1()
    requires P()
    ensures Q()

  method S2()
    requires Q()
    ensures R()

  method Sequencing()
    requires P()
    ensures R()
  {
    S1();
    S2();
  }

Variable update

  method M() returns (o: T)
    ensures forall t: T :: P(t)

  method Update() returns (o: T)
    ensures P(o)
  {
    o := M();
  }

Loops

Loops, like recursion, need invariants
The difficulty is that you need to generalize enough

If your control is complex (if, continue, nested loops), your invariants will be complex as well

  method Times(x: nat, y: nat) returns (z: nat)
    ensures x * y == z
  {
    z := 0;
    for i := 0 to x
      invariant i * y == z
    {
      z := z + y;
    }
  }

Case study – 1

In this example, verification fails

Let's help Dafny

  methodM(a: nat) returns (b: nat)
    ensures Even(b)
  {
    b := 2 * a;
    b := b * b;
  }

Case study – 2

We can prove it by “undoing” the update to b

  method M(a: nat) returns (b: nat)
    ensures Even(b)
  {
    b := 2 * a;
    assert Even(b);
    b := b * b;
    assert Even(b) by {
      assert exists c: nat :: b == 2 * c by {
        assert b == (2 * a) * (2 * a);
        assert b == 2 * (2 * a * a);
      }
    }
  }

Case study – 3

We can prove it by introducing a fresh variable that retains the original value of b

  method M(a: nat) returns (b: nat)
    ensures Even(b)
  {
    var b0: nat := 2 * a;
    assert Even(b0);
    b := b0 * b0;
    assert Even(b) by {
      assert exists c: nat :: b == 2 * c by {
        var c: nat :| b0 == 2 * c;
        assert b == (2 * c) * (2 * c);
        assert b == 2 * (2 * c * c);
      }
    }
  }

Weaving proof and code

That's annoying: we just made our code worse for verification purposes
The only purpose of b0 is to help with the proof
It should not be in the final code
We can declare it to be a ghost variable
Dafny knows that ghost variables are for proof only
It will check that you don't use ghost variables in genuine computation

Case study – 4

Solution with ghost variables, better than the two previous ones

  method M(a: nat) returns (b: nat)
    ensures Even(b)
  {
    b := 2 * a;
    ghost var b0: nat := b;
    assert Even(b0);
    b := b * b;
    assert Even(b) by {
      assert exists c: nat :: b == 2 * c by {
        var c: nat :| b0 == 2 * c;
        assert b == (2 * c) * (2 * c);
        assert b == 2 * (2 * c * c);
      }
    }
  }

Case study – 5

This would not even make sense with the first two approaches

  method M(a: nat, p: nat) returns (b: nat)
    ensures Even(b)
  {
    b := 2 * a;
    assert Even(b);
    for i := 0 to p
      invariant Even(b)
    {
      ghost var b0: nat := b;
      b := b * b;
      assert Even(b) by {
        assert exists c: nat :: b == 2 * c by {
          var c: nat :| b0 == 2 * c;
          assert b == (2 * c) * (2 * c);
          assert b == 2 * (2 * c * c);
        }
      }
    }
  }

Functional code as specification – 1

As we saw, some properties cannot be expressed as postconditions
Plus, reasoning about imperative code can be more tedious
Methodology: define your computation both in imperative and functional style
Prove their equivalence
Prove any advanced properties on the functional version

Functional code as specification – 2

Consider this example:

  methodTimes(x: nat, y: nat) returns (z: nat)
    ensures z == Timesf(x, y)
  {
    z := 0;
    for i := 0 to x {
      z := z + y;
    }
  }

Functional code as specification – 3

Step 1: write a functional spec

  function Timesf(x: nat, y: nat): nat {
    if x == 0 then 0 else y + Timesf(x - 1, y)
  }

Functional code as specification – 4

Step 2: implement and prove equivalence

  method Times(x: nat, y: nat) returns (z: nat)
    ensures z == Timesf(x, y)
  {
    z := 0;
    for i := 0 to x
      invariant z == Timesf(i, y)
    {
      z := z + y;
    }
  }

Functional code as specification – 5

Step 3: prove properties on the spec

  lemma TimesfCommutative(x: nat, y: nat)
    ensures Timesf(x, y) == Timesf(y, x)
  {
    if x == 0 {
    } else {
      TimesfCommutative(x - 1, y);
    }
  }

Functional code as specification – 6

Connecting the functional spec to the imperative code can be hard
Strategy:
- Implement simple functional spec
- Implement tail-recursive functional spec
- Prove equivalence of simple and tail-recursive
- Prove equivalence of tail-recursive and imperative

Function by method

A method cannot be called from within a functional expression
A priori, it means that functional code can never call imperative code
You can work around that by proving that a method correctly implements a function
This is done using the function by method declaration
Note that the function body must satisfy the ensures clause (if any)

The method body must return exactly the same value as the function body

  function Timesf(x: nat, y: nat): nat
  {
    if x == 0 then 0 else y + Timesf(x - 1, y)
  } by method {
    var z := 0;
    for i := 0 to x
      invariant z == Timesf(i, y)
    {
      z := z + y;
    }
    return z;
  }

Arrays – introduction

Arrays in Dafny are references
Like usual arrays, they support lookup and in-place update
An array's length is obtained with the .Length field

An array is created with Java-like syntax

  method M(a: array<int>)
    requires a.Length > 10
    modifies a
  {
    var x: int := a[3];
    a[5] := 6;
    var b: array<int> := new int[5];
  }

Programming with references

So far, the state of our imperative code has been limited to mutable local variables
Now, we are going to consider references to mutable memory
Such memory is allocated on the heap
Such memory is accessed via references (“pointers”)

Arrays are references

  type Ref<T> = a: array<T> | a.Length == 1 witness *

Reasoning with references

The addition of references to mutable data leads to a significant increase in verification difficulty
Also, they have a profound influence on the design of Dafny
In essence, what makes references difficult to reason about is aliasing
It just means that you could have different references to the same object
- (Actually, aliasing is no more difficult than aliasing array indices: a[i] and a[j] refer to the same array element if the integers i and j are equal. Easy peasy! What is difficult is abstract aliasing, where the data structures pointed to by two different references overlap. But let's start simple.)
The state of a Dafny program corresponds to the memory state

Reading from the state

A function cannot modify the state
However, a function can read from the state
But reading must be explicitly mentioned in the function's specification!
- That is, a function can read the state only if its specification allows it to
A read frame is declared with a reads clause

It specifies the set of objects the function is allowed to read from

  function Read(s: seq<Ref<nat>>, idx: nat): nat
    requires idx < |s|
    reads s[idx]
  {
    s[idx][0]
  }

Summary of function types

A pure total function: A -> B
A pure partial function: A --> B
An impure function: A ~> B
- Note, an impure function can read the state, but no function can ever modify the state

Reading the state

A method can read from the state

  method M(s: Ref<nat>)
  {
    print s[0];
  }

Note that, unlike function specifications, method specifications don't say anything about reading

Modifying the state

A method can modify the state
But writing must be explicitly mentioned in the method's specification!
- That is, a method can write the state only if its specification allows it to
A write frame is declared a modifies clause

It specifies the set of objects the method is allowed to write to

  method Write(s: seq<Ref<nat>>, idx: nat, v: nat)
    requires idx < |s|
    modifies s[idx]
  {
    s[idx][0] := v;
  }

Referring to old state

Since methods can modify the state, we often need to state properties that relate the state before and after the method was executed
Within the body of a statement we can do this with ghost variables
For references, mutations are not confined to methods
We need some way of referring to an old state in the spec
This is done using the expression old

  method Add(c: Ref<nat>)
    modifies c
    ensures c[0] % 2 == 0 <==> old(c[0]) % 2 == 0
  {
    c[0] := c[0] + 2;
  }

Aliasing – 1

The principal reason why reasoning about methods and stateful code is difficult is aliasing
Aliasing happens when there are more than one reference to an object
Often, we expect a property to hold because we implicitly assume that there is no aliasing

For example, the following is incorrect:

  method InitIncorrect(c1: Ref<nat>, c2: Ref<nat>)
    modifies c2
    ensures old(c1[0]) == c1[0]
  {
    c2[0] := 2;
  }

Aliasing – 2

The reason is that c1 and c2 could be references to the same object
Therefore, we need to require them to be different

  method Init(c1: Ref<nat>, c2: Ref<nat>)
    requires c1 != c2
    modifies c2
    ensures old(c1[0]) == c1[0]
  {
    c2[0] := 2;
  }

Frames – 1

Why do we need frames?
It has to do with aliasing
It allows to keep track of which part of the memory we can potentially read or modify
It may not look like much, but otherwise:
- You would have to prove trivial things about parts unrelated parts of the state
- For every new object, you would have to update all previous proofs
- The amount of specification about objects not changing grows a lot
This is known as the “frame problem”

Frames – 2

Thanks to the frames, Dafny can verify this code without help

  method TestIncorrect(s: seq<Ref<nat>>) returns (x: nat, y: nat)
    requires |s| > 10
    modifies s
    ensures x == y
  {
    x := Read(s, 5);  // read from the object referenced by s[5]
    Write(s, 2, 56);  // write to the object referenced by s[2]
    y := Read(s, 5);  // read from the object referenced by s[5] again
  }

Aliasing – 3

Or not!
How do you know that s[2] and s[5] are not pointers to the same object?

You don't, and the postcondition is true only if we know that they do not alias

  method Test(s: seq<Ref<nat>>) returns (x: nat, y: nat)
    requires |s| > 10
    requires forall i, j :: 0 <= i < j < |s| ==> s[i] != s[j]
    modifies s
    ensures x == y
  {
    x := Read(s, 5);
    Write(s, 2, 56);
    y := Read(s, 5);
  }

Labels

By default, the old expression refers to the state before the method stated its execution
Sometimes, you need to refer to more specific states
You can refer to specific point in the execution by naming then with labels
And use the label for the old expression

State history correspond to statements

  method M(c: Ref<nat>)
    modifies c
    ensures c[0] % 2 == 1
  {
    c[0] := 2 * c[0];
    label L:
    c[0] := c[0] + 1;
    assert old@L(c[0]) % 2 == 0;
    assert c[0] % 2 == 1;
  }

Arrays – ranges

Arrays support range indexing
However, the outcome is a sequence
This is useful to be used in specifications
Since arrays are references, you can refer to their past state

  lemma ArrayUnchanged<T>(a: array)
    ensures a[..] == old(a[..])

Fresh

If a method returns a new array, the caller cannot declare it in its frame

If the array is guaranteed to be fresh, then the caller is allowed to modify it anyway

  method Needfresh(a: array<int>)
    requires a.Length > 10
  {
    var b: array<int> := Copy(a);
    b[5] := 8;
  }

Arrays – example

An example: copying an array

  method Copy(a: array<int>) returns (b: array<int>)
    ensures a.Length == b.Length
    ensures forall i: nat :: i < a.Length ==> a[i] == b[i]
    ensures fresh(b)
  {
    b := new int[a.Length];
    for i := 0 to a.Length
      invariant i <= a.Length
      invariant forall j: nat :: j < i ==> a[j] == b[j]
    {
      b[i] := a[i];
    }
  }

Unchanged

A shorthand to state than no fields of an object have been modified

  class O {
    var x: int
    var y: int
  }

  method Unchanged(o: O)
    ensures unchanged(o) <==> o.x == old(o.x) && o.y == old(o.y)
  {
  }

Allocated

A shorthand to ensure that an object you are referring to as been allocated

  class O {}

  method Allocated(o: O)
    ensures !old(allocated(o)) <==> fresh(o)
  {
  }

Two-state predicates

Method postconditions are “two-state propositions”
By using the old expression, they can refer to the state before and after the method
It can be convenient to define such propositions as predicates

This can be done by prefixing a predicate definition by twostate

  twostate predicate NoChange<T>(a: array)
    reads a
  {
    a[..] == old(a[..])
  }