Correct sorting with Frama-C and some thoughts on Formal Methdos

12 02 2011

A couple of years ago, during my masters on Formal Methods I have been working with automatic provers and I also used Frama-C, this is a tool that allow the user to prove C code directly in the source code, using a special notation in the comments, called ACSL notation.

Frama-C allows you to make two kinds of proofs, security and safety ones. The safety ones are related with arrays index out of bounds access, and so. This kind of proofs are related to the language itself and they are easy to do if you use loop invariants, pre and post conditions.
If you use a high level language, like JAVA you won’t have almost none safety problems.
Because C is too close to machine level code, we can do things that we do not intend (or maybe we do and we use C exactly because it allows this kind of things). For example:

// foo.c file
#include <stdio.h>

int main() {
    char *a = "I like you";
    char *b = "I hate you";
    
    if(&a < &b) a = *(&a + 1); 
    else        a = *(&a - 1); 

    printf("%s\n", a); 
}

As you can see, I never used the b variable for nothing, just have declared it. And the result is:

[ulissesaraujocosta@maclisses:c]-$ gcc -o foo foo.c 
[ulissesaraujocosta@maclisses:c]-$ ./foo
I hate you

This lack of security of language C is one of the reasons we need to write safety statements. Of course this kind of things is why C is so fast and powerful, the person in charge is always the programmer. If you are interested in this kind of tricks and want to understand more about this and smashing the stack and so, feel free to read more posts in my blog about this subject.

The other kind of statements (security ones) are related to the functionality of the program and that’s basically where the problem or the effort is, I will talk about this later on. First let’s see the algorithm and the implementation in C.

Code

The algorithm I use here is just a simple example. I used bubble sort, this is a sort algorithm not very efficient, but it uses none more memory then the needed to store the structure you want to sort.
To get a visual understanding of the algorithm (and to see it inefficiency) check out this youtube video.

This is the implementation of the algorithm:

void swap(int *i, int *j) {
    int tmp = *i;
    *i = *j;
    *j = tmp;
}

void bubbleSort(int *vector, int tam) {
    int j, i;
    j = i = 0;
    for(i=0; i<tam; i++) {
		for(j=0; j<tam-i-1; j++) {
            g_swap = 0;
            if (vector[j] > vector[j+1]) {
                swap(&vector[j],&vector[j+1]);
            }
        }
    }
}

Pre, Post conditions and thinking formally

So, as you can see in the video (or in the code) the algorithm is pretty much simple, we pick the i element cross the array n times and for each element we compare with i, this n times.

We have as pre conditions: The size of the vector must be greater than zero, and all the positions in that vector exists, so in Frama-C we use the valid\_range(vector, i, j), where i and j are indexes of the vector to say that all elements exist.

tam > 0

valid\_range(vector,0,tam-1)

Ans as pos conditions we must ensure that the array is sorted ( I will talk predicate this later).
You may think that this by itself is enough to make a complete proof, but you are wrong. Image that my function clear all the elements in the array and fill the array with \{1,2,..,tam\}, our code will be proved and its wrong!

So, we need to say more… First thing that can pop to your head is OK, we will say that we have the same numbers in the beginning and in the end and you write this:
\forall_a : 0 \leq a < tam : (\exists_b : 0 \leq b < tam : old(vector(b)) \equiv vector(a))

In fact this is closer (not yet right), imagine that you give as input:
\{4,7,9,1,0,3,4\}. If your code returns \{0,1,3,4,7,9\} (we miss the repeated 4) the code will be proved.
So, the solution if to make a Permut predicate and prove for the multi set.
So, this are the post conditions:

sorted(vector,0,tam-1)

Permut\{Old,Here\}(vector,0,tam-1);

Frama-C is so cool because for example at the pos condition if we want to refer to the state in the beginning (before call the function) we use Old and if we want to refer to the moment after the call we heave the Here keyword, remember we are at the post condition, so this wil be executed in the end (so Here means the end of the function call).

Predicates

So, here is the Sorted predicate. Predicates receive a state L and the parameters (just like a function) and they return bool values (true or false). Inside we use regular ACSL notation. Here I define that for an array to be sorted each element must be less or equal to the next one.

/*@ predicate Sorted{L}(int a[], integer l, integer h) =
  @   \forall integer i; l <= i < h ==> a[i] <= a[i+1];
  @*/

The Permut is defined inductively, so we receive two states L1 and L2 and the array a and the range where we want to permute.
We write multiple rules for the permutation, reflection, symmetry, transitivity and finally the most important one, the Swap. So basically here we say that a permutation is a set of successive swaps.

/*@ inductive Permut{L1,L2}(int a[], integer l, integer h) {
  @  case Permut_refl{L}:
  @   \forall int a[], integer l, h; Permut{L,L}(a, l, h) ;
  @  case Permut_sym{L1,L2}:
  @    \forall int a[], integer l, h;
  @      Permut{L1,L2}(a, l, h) ==> Permut{L2,L1}(a, l, h) ;
  @  case Permut_trans{L1,L2,L3}:
  @    \forall int a[], integer l, h;
  @      Permut{L1,L2}(a, l, h) && Permut{L2,L3}(a, l, h) ==>
  @        Permut{L1,L3}(a, l, h) ;
  @  case Permut_swap{L1,L2}:
  @    \forall int a[], integer l, h, i, j;
  @       l <= i <= h && l <= j <= h && Swap{L1,L2}(a, i, j) ==>
  @     Permut{L1,L2}(a, l, h) ;
  @ }
  @
  @ predicate Swap{L1,L2}(int a[], integer i, integer j) =
  @      \at(a[i],L1) == \at(a[j],L2)
  @   && \at(a[j],L1) == \at(a[i],L2)
  @   && \forall integer k; k != i && k != j ==> \at(a[k],L1) == \at(a[k],L2);
  @*/

So, as you can see the bubble sort function itself have 18 lines of code, and in the end with the annotations for the proof we end with 90 lines, but we proved it!

Thoughts

My main point here is to show the thinking we need to have if we want to prove code in general. Pick what language you want, this is the easiest way you will have to prove software written in C. Sometimes if your functions are too complex you may need to prove it manually. The problem is not on the Frama-C side, Frama-C only generates the proof obligations to feed to automatic provers, like Yices, CVC3, Simplify, Z3, Alt-Ergo and so.

My point here is to show the cost of proving software. Proving software, specially if the language is too low level (like C – you need to care about a lot more things) is hard work and is not easy to a programmer without theoretical knowledge.
On the other side, you end up with a piece of software that is proved. Of course this proof is always requirements oriented, ny that I mean: if the requirements are wrong and the program is not doing what you expect the proof is along with that.
I do not stand to proof of all the code on the planet, but the proper utilization of FM (formal methods) tools for critical software.

I steel been using Frama-C since I learned it in 2009, nowadays I use it for small critical functions (because I want, I’m not encouraged to do so) and I have to say that the use of FM in the industry is far. As I told you Frama-C is the easiest automatic proof tool you will find at least that I know.

Talking with Marcelo Sousa about the use of FM in industry, we came to the conclusion that the people that are making this kind of tools and have the FM knowledge don’t make companies. I think if more brilliant people like John Launchbury make companies, definitely FM will be more used.

Source code

Here is all the code together if you want to test it:

// #include <stdio.h>

/*@ predicate Sorted{L}(int a[], integer l, integer h) =
  @   \forall integer i; l <= i < h ==> a[i] <= a[i+1];
  @
  @ predicate Swap{L1,L2}(int a[], integer i, integer j) =
  @      \at(a[i],L1) == \at(a[j],L2)
  @   && \at(a[j],L1) == \at(a[i],L2)
  @   && \forall integer k; k != i && k != j ==> \at(a[k],L1) == \at(a[k],L2);
  @*/

/*@ inductive Permut{L1,L2}(int a[], integer l, integer h) {
  @  case Permut_refl{L}:
  @   \forall int a[], integer l, h; Permut{L,L}(a, l, h) ;
  @  case Permut_sym{L1,L2}:
  @    \forall int a[], integer l, h;
  @      Permut{L1,L2}(a, l, h) ==> Permut{L2,L1}(a, l, h) ;
  @  case Permut_trans{L1,L2,L3}:
  @    \forall int a[], integer l, h;
  @      Permut{L1,L2}(a, l, h) && Permut{L2,L3}(a, l, h) ==>
  @        Permut{L1,L3}(a, l, h) ;
  @  case Permut_swap{L1,L2}:
  @    \forall int a[], integer l, h, i, j;
  @       l <= i <= h && l <= j <= h && Swap{L1,L2}(a, i, j) ==>
  @     Permut{L1,L2}(a, l, h) ;
  @ }
  @*/

/*@ requires \valid(i) && \valid(j);
  @ //assigns *i, *j; //BUG 0000080: Assertion failed in jc_interp_misc.ml
  @ ensures \at(*i,Old) == \at(*j,Here) && \at(*j,Old) == \at(*i,Here);
  @*/
void swap(int *i, int *j) {
    int tmp = *i;
    *i = *j;
    *j = tmp;
}

/*@ requires tam > 0;
  @ requires \valid_range(vector,0,tam-1);
  @ ensures Sorted{Here}(vector, 0, tam-1);
  @ ensures Permut{Old,Here}(vector,0,tam-1);
  @*/
void bubbleSort(int *vector, int tam) {
    int j, i;
    j = i = 0;
    //@ ghost int g_swap = 0;

  /*@ loop invariant 0 <= i < tam;
    @ loop invariant 0 <= g_swap <= 1;
    //last i+1 elements of sequence are sorted
    @ loop invariant Sorted{Here}(vector,tam-i-1,tam-1);
    //and are all greater or equal to the other elements of the sequence.
    @ loop invariant 0 < i < tam ==> \forall int a, b; 0 <= b <= tam-i-1 <= a < tam ==> vector[a] >= vector[b];
    @ loop invariant 0 < i < tam ==> Permut{Pre,Here}(vector,0,tam-1);
    @ loop variant tam-i;
    @*/
    for(i=0; i<tam; i++) {
      //@ ghost g_swap = 0;
      /*@ loop invariant 0 <= j < tam-i;
        @ loop invariant 0 <= g_swap <= 1;
        //The jth+1 element of sequence is greater or equal to the first j+1 elements of sequence.
        @ loop invariant 0 < j < tam-i ==> \forall int a; 0 <= a <= j ==> vector[a] <= vector[j+1];
        @ loop invariant 0 < j < tam-i ==> (g_swap == 1) ==> Permut{Pre,Here}(vector,0,tam-1);
        @ loop variant tam-i-j-1;
        @*/
		for(j=0; j<tam-i-1; j++) {
            g_swap = 0;
            if (vector[j] > vector[j+1]) {
                //@ ghost g_swap = 1;
                swap(&vector[j],&vector[j+1]);
            }
        }
    }
}

/*@ requires \true;
  @ ensures \result == 0;
  @*/
int main(int argc, char *argv[]) {
    int i;
    int v[9] = {8,5,2,6,9,3,0,4,1};

    bubbleSort(v,9);

//     for(i=0; i<9; i++)
//         printf("v[%d]=%d\n",i,v[i]);

    return 0;
}

If you are interested in the presentation me and pedro gave at our University, here it is:

Advertisements




A* search algorithm

1 01 2011

Time to talk about efficiently pathfinding and graph traversal algorithms. The first algorithm related with graphs pathfinding I learned was Dijkstra’s algorithm and I remember the feeling for learning how to find the shortest path (minimal cost to be generic) in a graph, it was amazing to me, so that I learned a few more and did a simple academic GPS core system.

Dijkstra’s algorithm is truly beautiful, but unfortunately the complexity it too high to be considered time efficient.

If you want to go from point A to B Dijkstra’s wil search all the surrounding nodes, as you can see in this image:

So I start to find more efficient implementations for the pathfinding problem and I discover A Star.

A Star algorithm

I find this algorithm in wikipedia, and I will paste it here because there are a few things I want to explain.

function A*(start,goal)
     // The set of nodes already evaluated.
     closedset := the empty set
     // The set of tentative nodes to be evaluated.
     openset := set containing the initial node
    // The map of navigated nodes.
     came_from := the empty map                 
    // Distance from start along optimal path.
     g_score[start] := 0                        
     h_score[start] := heuristic_estimate_of_distance(start, goal)
    // Estimated total distance from start to goal through y.
     f_score[start] := h_score[start]           
     while openset is not empty
         x := the node in openset having the lowest f_score[] value
         if x = goal
             return reconstruct_path(came_from, came_from[goal])
         remove x from openset
         add x to closedset
         foreach y in neighbor_nodes(x)
             if y in closedset
                 continue
             tentative_g_score := g_score[x] + dist_between(x,y)
             if y not in openset
                 add y to openset
                 tentative_is_better := true
             elseif tentative_g_score < g_score[y]
                 tentative_is_better := true
             else
                 tentative_is_better := false
             if tentative_is_better = true
                 came_from[y] := x
                 g_score[y] := tentative_g_score
                 h_score[y] := heuristic_estimate_of_distance(y, goal)
                 f_score[y] := g_score[y] + h_score[y]
     return failure
 
 function reconstruct_path(came_from, current_node)
     if came_from[current_node] is set
         p = reconstruct_path(came_from, came_from[current_node])
         return (p + current_node)
     else
         return current_node

If you are familiarized with Dijkstra’s algorithm you probably noticed a lot of coincidences in the algorithm. You are right!
So, we have some structures to use: the closedset is the set of nodes already evaluated by A Star, openset containing the nodes being evaluated, g\_score being the commulative distance to this node, h\_score being the heuristic result for this node (I will explain this in a minute) and f\_score being the sum of g and h.

A good thing about A Star is the nodes it needs to search until find the Best-First-Search path:

Seems good right? All the juice of this algorithms lies on the heuristic function, I like the result of the Manhattan distance, but you can read this blog post and find out more about this subject.
Basically this heuristic is empirical knowledge, this particular Manhattan distance calculates the distance from point A to point B in a grid.

I tried to use it in a Geometric graph and it worked fine too!

Point of view

I’m particularly interested in optimizing this algorithm as much as possible and will be doing that using C++.
So, I spend 30 minutes observing the code and I was already very familiarized with Dijkstra’s. I think as a Software Engineer you have to find good algorithms to your problem, deeply understand them, but your job is not done after that! After find a good solution and understand it to the point you are able to explain it to a non-CS person you have to *observe* the code, talk with it and, believe me, don’t make the literal implementation of it, It will be slow, or at least it probably could be more efficient!

So, let’s forget about the problem this algorithm solves and try to identify inefficient chunks in the algorithm. The first thing that cames to my mind is: we need to have a bunch of structures to keep a lot of node related information, a lot of vectors, sets and so.

So, let’s identify what I mean by that:

function A*(start,goal)
...
     while openset is not empty
...
         x := the node in openset having the lowest f_score[] value
...
         foreach y in neighbor_nodes(x)
...
                 g_score[y] := tentative_g_score
                 h_score[y] := heuristic_estimate_of_distance(y, goal)
                 f_score[y] := g_score[y] + h_score[y]

With this chunk of code I want to highlight that we are iterating throw all neighbors for each x belongs to openset, then we get the minimum f from openset and change the g,h,f arrays for y.

The first thing hitting me is, I can make openset a MinHeap and I can keep all the information for each *\_score in the node itself, like so I won’t be wasting time in accessing positions in a set and I just make a question to the node object.

So, I start to put all this information in the nodes side and keep only track of the locally created minHeap. This is the result:

LinkedList<Edge*> AStar::findPath(Node* start, Node* goal) {
    LinkedList<Edge*> ret;
    MinHeap<Node*, Comparator> openSet;
    bool tentative_is_better = false;

    float h = heuristic(start,goal);

    start->setStatus(NODE_OPEN);
    start->getOrSetScore().set(NULL, h,0,h);
    openSet.insert(start);

    while(!openSet.isEmpty()) {
        Node x = openSet.getMin();
        AStarScore & xScore = x->getOrSetScore();

        if(x == goal) return process(ret,x);
        openSet.removeMin();
        x->setStatus(NODE_CLOSE);
        ArrayList<Edge> & neighbors = x->getEdges();
        for(int i = 0; i < neighbors.getLength (); i++) {
            Node *y = neighbors [i]->getDstNode();
            AStarScore & yScore = y->getOrSetScore();
            GeoNodeStatus yStatus = y->getStatus();

            if(yStatus == NODE_CLOSE) {
                continue;
            }   

            float tentative_g_score = xScore.g_score + x->getEdge(y)->getCost();

            if(yStatus != NODE_OPEN) {
                y->setStatus(_version,GEONODE_OPEN);
                tentative_is_better = true;
            } else if(tentative_g_score < yScore.g_score) {
                tentative_is_better = true;
            } else {
                tentative_is_better = false;
            }   
            if(tentative_is_better) {
                yScore.parent = x;
                yScore.g_score = tentative_g_score;
                yScore.h_score = heuristic(y,goal);
                yScore.f_score = yScore.g_score + yScore.h_score;
                openSet.insert(y);
            }
        }
    }
    return process(ret,goal);
}

Where process is the function that iterate throw the nodes, starting with goal and go the parent until start node and construct the path in reverse order, from start to goal.

One side note, A star is so close to Dijkstra, that if you make you heuristic function return always zero, A star will work just like Dikjstra (the first image).

Basically the ideas I want to transmit here is the A star algorithm, it implementation and (the most important one) the process of how to look to an algorithm. Summarizing I think a Software Engineer could not just implement algorithms and substitute all the words insert(lista,elem) in the algorithm for list.push\_back(elem) in C++ and so. I think we should look for an algorithm as a valuable aid and if we convert it to code we must improve it. Ultimately copying an algorithm to code is a good opportunity to leave our personal touch in the code.

Acknowledgments

Images from here.





Type inference

30 07 2008

Intro

The type inference is the ability to a programming language deduct the data types of all functions of a program. It is a feature present in many strongly typed languages, such as Haskell. Where is not mandatory write the signature of the functions. What is great because it increases the production of code, and security because if the inference algorithm fail means that we have an error of types in our code, all this in compilation time.

As I said in previous post, one of the upgrades of Pointfree calculator is the type inference. After reading some scientific articles about Damas-Milner algorithm, also known as W algorithm, I began to imagine a way to implement this in Java, which is the mother language of Pointfree calculator. I started to do some sketches on paper and, after talking with professor José Nuno Oliveira, I realize that the algorithm isn’t that hard.

Remainings

Definition of (either in Haskell):

NOTE: in types means Either in Haskell



Definition of :

Type signature of Left and Right:


Talking same language

I will use a different notation to represent the domain and codomain of functions in order to help the explanation of the algorithm.

For function we have the type:

I will write that as:

Remember the definition of , we receive two functions, f and g. Because the notation is in pointfree, we represent also de domain and codomain of function in front of that, like we do for f and g.
In fact the type of is represented as:

I will also use the symbol , to say that type a unify with type b, that means, informally, that .

Let’s infer!

I will explain the algorithm to infer the type of function f:

The first step of the algorithm is attribute to all functions polymorphic types, so I will call the first type and the last

Because, have type , we conclude ;
Also, because have the type , we can conclude ;
Same thing to , that have the type , we can conclude and , so we have:

Because, the definition of : , we can say that the domain of f is equal to codomain of g, and so we can conclude , as we replace a type that is used in the codomain of first Right, we must also conclude , so:

As I explain before, the function , have the following type: , so:
and ;
Because have the type: , so and :

Because the definition of is , we need the same codomain in both functions, so we conclude , as both type trees have the same structure, we can conclude even more: , so:

And now we have the function, just with the needed types to simplify:

.

Now we just need to unify: and ,

.

We infer the type for function , .
Or if you prefer; in Haskell:

f :: Either (Either a b) c -> Either a (Either b c)