Verification of Imperative Programs

Jean-Baptiste Tristan & K. Rustan M. Leino

Methods

  • At first glance, Dafny's methods are like functions in C or Python or like methods in Java

  • The body of a method is a sequence of statements

  • Methods can use mutable local variables and have side effects

  • Usually, a method is used in the context of object-oriented programming

    • In Dafny, a method need not be defined for an object
      method Example(x: nat, y: nat) returns (z: nat) {
        var t: nat := 0;
        z := 0;
        for i := 0 to x {
          z := z + y;
        }
        z := t;
      }

Verifying methods

  • Recall that method calls are RHS expressions

  • They cannot be mentioned in logical formulas

  • Method verification is done using pre- and postconditions

    • You can state a property thanks to the explicit out-parameters
      method Example(x: int, y: int) returns (z: int)
        requires x > 0
        requires y > 0
        ensures z == x + y
        ensures z > 0
      {
        z := x + y;
      }

Methods and termination

  • Methods are allowed to diverge

  • In such a case, the postcondition holds only if the method does terminate

      method Bad(x: nat)
        ensures false
        decreases *
      {
        Bad(x + 1);
      }

Methods and termination – danger

  • Be wary of methods that are using decreases * for no reason

      method CorrectIsh()
        ensures 0 == 1
        decreases *
      {
        Bad(0);
      }

When verification fails

  • In some cases, Dafny verifies everything on its own
  • But that won't always be the case
  • How can we help Dafny verify the postconditions of a method?
  • Of course, we're going to write proofs, but how?

Floyd logic – 1

  • The body of a method is a sequence of statements

  • A method has a state comprised of its variables

  • Each statement can transform the state

  • If Dafny cannot verify the postcondition

    • Since proof tactics are statements as well, they can be freely woven into the body of the method
  • A statement may affect the values of expressions mentioned in preconditions and earlier assertions

      method MixedCodeAndProof(x: nat) returns (y: nat)
        requires x > 1
        ensures y > x
      {
        y := x + 2;
        assert y == x + 2;
        y := 2 * y;
        assert y == 2 * x + 4;
        y := y - 5;
        assert y == 2 * x - 1;
      }

Floyd logic – 2

  • Proof statements can be involved and you can use forall, etc.

  • Tip: if you're going to write a complicated proof, encapsulate it in an assert by

  • That way, it is easy to spot code and proof

  • The IDE also tries to help you distinguish between proof and code by italicizing ghost code

      method MixedCodeAndProof(x: nat) returns (y: nat, ghost g: int)
        requires x > 1
        ensures y > x
      {
        y := x + 2;
        y := 2 * y;
        g := y + 10;
        y := y - 5;
        assert y > x by {
          assert y == 2 * x - 1;
          assert 2 * x - x > 1;
        }
      }

Hoare triples

  • Which assertion should help?
  • First and foremost, you should rely on your intuition
  • Move forward from hypothesis
  • Move backward from goal
  • Ask yourself: if x was even and I multiplied it with an even, is it still even? Why?
  • Of course, there are more formal rules that describe how statements affect propositions
    • They are called Hoare triples

Call

  method M()
    requires Q()
    ensures R()

  method Call()
    requires P()
    requires P() ==> Q()
    requires R() ==> S()
    ensures S()
  {
    M();
  }

Sequencing

  method S1()
    requires P()
    ensures Q()

  method S2()
    requires Q()
    ensures R()

  method Sequencing()
    requires P()
    ensures R()
  {
    S1();
    S2();
  }

Variable update

  method M() returns (o: T)
    ensures forall t: T :: P(t)

  method Update() returns (o: T)
    ensures P(o)
  {
    o := M();
  }

Loops

  • Loops, like recursion, need invariants

  • The difficulty is that you need to generalize enough

  • If your control is complex (if, continue, nested loops), your invariants will be complex as well

      method Times(x: nat, y: nat) returns (z: nat)
        ensures x * y == z
      {
        z := 0;
        for i := 0 to x
          invariant i * y == z
        {
          z := z + y;
        }
      }

Case study – 1

  • In this example, verification fails

  • Let's help Dafny

      methodM(a: nat) returns (b: nat)
        ensures Even(b)
      {
        b := 2 * a;
        b := b * b;
      }

Case study – 2

  • We can prove it by undoing the update to b

      method M(a: nat) returns (b: nat)
        ensures Even(b)
      {
        b := 2 * a;
        assert Even(b);
        b := b * b;
        assert Even(b) by {
          assert exists c: nat :: b == 2 * c by {
            assert b == (2 * a) * (2 * a);
            assert b == 2 * (2 * a * a);
          }
        }
      }

Case study – 3

  • We can prove it by introducing a fresh variable that retains the original value of b

      method M(a: nat) returns (b: nat)
        ensures Even(b)
      {
        var b0: nat := 2 * a;
        assert Even(b0);
        b := b0 * b0;
        assert Even(b) by {
          assert exists c: nat :: b == 2 * c by {
            var c: nat :| b0 == 2 * c;
            assert b == (2 * c) * (2 * c);
            assert b == 2 * (2 * c * c);
          }
        }
      }

Weaving proof and code

  • That's annoying: we just made our code worse for verification purposes
  • The only purpose of b0 is to help with the proof
  • It should not be in the final code
  • We can declare it to be a ghost variable
  • Dafny knows that ghost variables are for proof only
  • It will check that you don't use ghost variables in genuine computation

Case study – 4

  • Solution with ghost variables, better than the two previous ones

      method M(a: nat) returns (b: nat)
        ensures Even(b)
      {
        b := 2 * a;
        ghost var b0: nat := b;
        assert Even(b0);
        b := b * b;
        assert Even(b) by {
          assert exists c: nat :: b == 2 * c by {
            var c: nat :| b0 == 2 * c;
            assert b == (2 * c) * (2 * c);
            assert b == 2 * (2 * c * c);
          }
        }
      }

Case study – 5

  • This would not even make sense with the first two approaches
  method M(a: nat, p: nat) returns (b: nat)
    ensures Even(b)
  {
    b := 2 * a;
    assert Even(b);
    for i := 0 to p
      invariant Even(b)
    {
      ghost var b0: nat := b;
      b := b * b;
      assert Even(b) by {
        assert exists c: nat :: b == 2 * c by {
          var c: nat :| b0 == 2 * c;
          assert b == (2 * c) * (2 * c);
          assert b == 2 * (2 * c * c);
        }
      }
    }
  }

Functional code as specification – 1

  • As we saw, some properties cannot be expressed as postconditions
  • Plus, reasoning about imperative code can be more tedious
  • Methodology: define your computation both in imperative and functional style
  • Prove their equivalence
  • Prove any advanced properties on the functional version

Functional code as specification – 2

  • Consider this example:

      methodTimes(x: nat, y: nat) returns (z: nat)
        ensures z == Timesf(x, y)
      {
        z := 0;
        for i := 0 to x {
          z := z + y;
        }
      }

Functional code as specification – 3

  • Step 1: write a functional spec

      function Timesf(x: nat, y: nat): nat {
        if x == 0 then 0 else y + Timesf(x - 1, y)
      }

Functional code as specification – 4

  • Step 2: implement and prove equivalence

      method Times(x: nat, y: nat) returns (z: nat)
        ensures z == Timesf(x, y)
      {
        z := 0;
        for i := 0 to x
          invariant z == Timesf(i, y)
        {
          z := z + y;
        }
      }

Functional code as specification – 5

  • Step 3: prove properties on the spec

      lemma TimesfCommutative(x: nat, y: nat)
        ensures Timesf(x, y) == Timesf(y, x)
      {
        if x == 0 {
        } else {
          TimesfCommutative(x - 1, y);
        }
      }

Functional code as specification – 6

  • Connecting the functional spec to the imperative code can be hard
  • Strategy:
    • Implement simple functional spec
    • Implement tail-recursive functional spec
    • Prove equivalence of simple and tail-recursive
    • Prove equivalence of tail-recursive and imperative

Function by method

  • A method cannot be called from within a functional expression

  • A priori, it means that functional code can never call imperative code

  • You can work around that by proving that a method correctly implements a function

  • This is done using the function by method declaration

  • Note that the function body must satisfy the ensures clause (if any)

  • The method body must return exactly the same value as the function body

      function Timesf(x: nat, y: nat): nat
      {
        if x == 0 then 0 else y + Timesf(x - 1, y)
      } by method {
        var z := 0;
        for i := 0 to x
          invariant z == Timesf(i, y)
        {
          z := z + y;
        }
        return z;
      }

Arrays – introduction

  • Arrays in Dafny are references

  • Like usual arrays, they support lookup and in-place update

  • An array's length is obtained with the .Length field

  • An array is created with Java-like syntax

      method M(a: array<int>)
        requires a.Length > 10
        modifies a
      {
        var x: int := a[3];
        a[5] := 6;
        var b: array<int> := new int[5];
      }

Programming with references

  • So far, the state of our imperative code has been limited to mutable local variables

  • Now, we are going to consider references to mutable memory

  • Such memory is allocated on the heap

  • Such memory is accessed via references (“pointers”)

  • Arrays are references

      type Ref<T> = a: array<T> | a.Length == 1 witness *

Reasoning with references

  • The addition of references to mutable data leads to a significant increase in verification difficulty
  • Also, they have a profound influence on the design of Dafny
  • In essence, what makes references difficult to reason about is aliasing
  • It just means that you could have different references to the same object
    • (Actually, aliasing is no more difficult than aliasing array indices: a[i] and a[j] refer to the same array element if the integers i and j are equal. Easy peasy! What is difficult is abstract aliasing, where the data structures pointed to by two different references overlap. But let's start simple.)
  • The state of a Dafny program corresponds to the memory state

Reading from the state

  • A function cannot modify the state

  • However, a function can read from the state

  • But reading must be explicitly mentioned in the function's specification!

    • That is, a function can read the state only if its specification allows it to
  • A read frame is declared with a reads clause

  • It specifies the set of objects the function is allowed to read from

      function Read(s: seq<Ref<nat>>, idx: nat): nat
        requires idx < |s|
        reads s[idx]
      {
        s[idx][0]
      }

Summary of function types

  • A pure total function: A -> B
  • A pure partial function: A --> B
  • An impure function: A ~> B
    • Note, an impure function can read the state, but no function can ever modify the state

Reading the state

  • A method can read from the state

      method M(s: Ref<nat>)
      {
        print s[0];
      }
  • Note that, unlike function specifications, method specifications don't say anything about reading

Modifying the state

  • A method can modify the state

  • But writing must be explicitly mentioned in the method's specification!

    • That is, a method can write the state only if its specification allows it to
  • A write frame is declared a modifies clause

  • It specifies the set of objects the method is allowed to write to

      method Write(s: seq<Ref<nat>>, idx: nat, v: nat)
        requires idx < |s|
        modifies s[idx]
      {
        s[idx][0] := v;
      }

Referring to old state

  • Since methods can modify the state, we often need to state properties that relate the state before and after the method was executed
  • Within the body of a statement we can do this with ghost variables
  • For references, mutations are not confined to methods
  • We need some way of referring to an old state in the spec
  • This is done using the expression old
  method Add(c: Ref<nat>)
    modifies c
    ensures c[0] % 2 == 0 <==> old(c[0]) % 2 == 0
  {
    c[0] := c[0] + 2;
  }

Aliasing – 1

  • The principal reason why reasoning about methods and stateful code is difficult is aliasing

  • Aliasing happens when there are more than one reference to an object

  • Often, we expect a property to hold because we implicitly assume that there is no aliasing

  • For example, the following is incorrect:

      method InitIncorrect(c1: Ref<nat>, c2: Ref<nat>)
        modifies c2
        ensures old(c1[0]) == c1[0]
      {
        c2[0] := 2;
      }

Aliasing – 2

  • The reason is that c1 and c2 could be references to the same object
  • Therefore, we need to require them to be different
  method Init(c1: Ref<nat>, c2: Ref<nat>)
    requires c1 != c2
    modifies c2
    ensures old(c1[0]) == c1[0]
  {
    c2[0] := 2;
  }

Frames – 1

  • Why do we need frames?
  • It has to do with aliasing
  • It allows to keep track of which part of the memory we can potentially read or modify
  • It may not look like much, but otherwise:
    • You would have to prove trivial things about parts unrelated parts of the state
    • For every new object, you would have to update all previous proofs
    • The amount of specification about objects not changing grows a lot
  • This is known as the “frame problem”

Frames – 2

  • Thanks to the frames, Dafny can verify this code without help

      method TestIncorrect(s: seq<Ref<nat>>) returns (x: nat, y: nat)
        requires |s| > 10
        modifies s
        ensures x == y
      {
        x := Read(s, 5);  // read from the object referenced by s[5]
        Write(s, 2, 56);  // write to the object referenced by s[2]
        y := Read(s, 5);  // read from the object referenced by s[5] again
      }

Aliasing – 3

  • Or not!

  • How do you know that s[2] and s[5] are not pointers to the same object?

  • You don't, and the postcondition is true only if we know that they do not alias

      method Test(s: seq<Ref<nat>>) returns (x: nat, y: nat)
        requires |s| > 10
        requires forall i, j :: 0 <= i < j < |s| ==> s[i] != s[j]
        modifies s
        ensures x == y
      {
        x := Read(s, 5);
        Write(s, 2, 56);
        y := Read(s, 5);
      }

Labels

  • By default, the old expression refers to the state before the method stated its execution

  • Sometimes, you need to refer to more specific states

  • You can refer to specific point in the execution by naming then with labels

  • And use the label for the old expression

  • State history correspond to statements

      method M(c: Ref<nat>)
        modifies c
        ensures c[0] % 2 == 1
      {
        c[0] := 2 * c[0];
        label L:
        c[0] := c[0] + 1;
        assert old@L(c[0]) % 2 == 0;
        assert c[0] % 2 == 1;
      }

Arrays – ranges

  • Arrays support range indexing
  • However, the outcome is a sequence
  • This is useful to be used in specifications
  • Since arrays are references, you can refer to their past state
  lemma ArrayUnchanged<T>(a: array)
    ensures a[..] == old(a[..])

Fresh

  • If a method returns a new array, the caller cannot declare it in its frame

  • If the array is guaranteed to be fresh, then the caller is allowed to modify it anyway

      method Needfresh(a: array<int>)
        requires a.Length > 10
      {
        var b: array<int> := Copy(a);
        b[5] := 8;
      }

Arrays – example

  • An example: copying an array
  method Copy(a: array<int>) returns (b: array<int>)
    ensures a.Length == b.Length
    ensures forall i: nat :: i < a.Length ==> a[i] == b[i]
    ensures fresh(b)
  {
    b := new int[a.Length];
    for i := 0 to a.Length
      invariant i <= a.Length
      invariant forall j: nat :: j < i ==> a[j] == b[j]
    {
      b[i] := a[i];
    }
  }

Unchanged

  • A shorthand to state than no fields of an object have been modified
  class O {
    var x: int
    var y: int
  }

  method Unchanged(o: O)
    ensures unchanged(o) <==> o.x == old(o.x) && o.y == old(o.y)
  {
  }

Allocated

  • A shorthand to ensure that an object you are referring to as been allocated
  class O {}

  method Allocated(o: O)
    ensures !old(allocated(o)) <==> fresh(o)
  {
  }

Two-state predicates

  • Method postconditions are “two-state propositions”

  • By using the old expression, they can refer to the state before and after the method

  • It can be convenient to define such propositions as predicates

  • This can be done by prefixing a predicate definition by twostate

      twostate predicate NoChange<T>(a: array)
        reads a
      {
        a[..] == old(a[..])
      }