Cryptol the language of cryptography

1 04 2009

Pedro Pereira and I are working on a new project in the Masters. The second half of the Masters is composed of a single project suggested by a company. Some companies are forming partnerships in the Masters formal methods, including: the Critical software, SIG and Galois. We chose the Galois because we also are in the area of cryptography and we already knew some work of some people from this company.

The project suggested by Galois was study the Cryptol as a language of specification of cryptographic algorithms. The cipher we used for this study is the SNOW 3G (The SNOW website), later on I will talk about the specification of this cipher. In this post I am only interested to show the language.

I’m going to show you some details about the language. This post is not intend to be a exhaustive explanation of Cryptol, if you looking for that you can go directly to the manuals. This post only relates my experience, and what I like it most with the language.

Overview

Cryptol is a high-level language that is geared to deal with low-level problems. Is a Domain-specific language to design and implement cryptographic algorithms.
This language has a high percentage of correctness of the implementation of a cipher, because it implements type inference, so we can say that a big part of the language implements correctness. This correctness is also achieved thanks to the architecture of the language – functional. We don’t have side effects – a function only return something inside is codomain.
In Cryptol we have this philosophy that says that everything is a sequence. This is very useful because we are working with low level data (array of bits), so we use sequences to represent that arrays. We can have nested sequences to have a more structured representation of data. For example, we can simply transform a 32-bit sequence in a 4 1-byte sequence.
The size of this sequences could be implemented as finite or infinite, as we going to see later in this post. Because Cryptol is a high-level language we can also implement polymorphic functions, most of the primitive functions are implemented in polymorphic mode. The way we have to navigate throw the sequences is using recursion, or sequences comprehension, and with these two techniques we can implement recurrences.

If you are a Haskell programmer you just need the next section to learn Cryptol. This language is so look a like with Haskell that even the philosophy seems to have a lot in commune.

Types in Cryptol

The type [32] means that you have a sequence of 32-bit size. All the types in Cryptol are size oriented. The unit is the Bit, that you can use to represent Bool. To represent a infinite sequence we use the reserved word inf, and we write: [inf] to represent that.

If you want to generate a infinite sequence, we use the syntactic sugar of the sequences like that: [1~..]. Cryptol will infer this sequence as type

[1~..]~:~[inf][1]

That means this sequence have infinite positions of 1-bit words. The type inference mechanism will always optimize the size that he needs, to represent the information.
So, it infer the type of [100~..] as:

[100~..]~:~[inf][7]

Because, it “knows” that needs only 7-bits to represent the decimal 100. But if you need more, you can force the type of your function.
We implement polymorphism in our types, if we have:

f~:~[a]b~\rightarrow~[a]b

This means, that the function f have polymorphism over b, because we say that it domain is one sequence of size a of type b, and it codomain also. Here we could also see: f~:~[a][b]c meaning that f is a constant of sequences of size b of type c, a times.

So, lets talk about some primitive functions in Cryptol, and its types. The tail function have the following type in Cryptol:

tail~:~\{a~b\}~[a+1]b~\rightarrow~[a]b

As we can see, Cryptol is so size oriented, that we can use arithmetic operators in types. We can probably infer what this function does just from it type: tail works for all a and b such that if we have one sequence os size a+1 of type b it returns one sequence of size a of same type. In fact this function removes the first element of one sequence.

Because of this size oriented philosophy a lot of functions, that change the size of the sequences can be read just from the type.

As you can see in the following list of Cryptol primitive function:

drop~:~\{ a~b~c \}~( fin~a ,~a~\geq~0)~\Rightarrow~(a ,[ a + b ]~c )~\rightarrow~[ b ]~c
take~:~\{ a~b~c \}~( fin~a ,~b~\geq~0)~\Rightarrow~(a ,[ a + b ]~c )~\rightarrow~[ a ]~c
join~:~\{ a~b~c \}~[ a ][ b ] c~\rightarrow~[ a * b ]~c
split~:~\{ a~b~c \}~[ a * b ] c~\rightarrow~[ a ][ b ]~c
tail~:~\{ a~b \}~[ a +1] b~\rightarrow~[ a ]~b

Recursion and Recurrence

Cryptol implements Recursion, just like a lot of functional languages do.

Imagine the fibonacci function definition:

It implementation in Crytol is exactly the same as defined mathematically.

fib : [inf]32 -> [inf]32;
fib n = if n == 0 then 0 else if n == 1 then 1 else fib (n-1) + fib (n-2);

Cryptol uses recursion to permit us to iterate throw sequences.

But, If you prefer you can implement a more functional algorithm of fibonacci function in Cryptol:

fib : [inf]32 -> [inf]32;
fib n = fibs @ n;
   where {
      fibs : [inf]32;
      fibs = [0 1] # [| x + y || x <- drop (1,fibs) || y <- fibs |];
   };

Here, as you can see, we define a infinite list fibs of all the fibonacci numbers, by calling the fibs inside the sequences comprehension fibs, this is called a recurrence, and you can use that too in Cryptol.

Cryptol vs C

I’m going to show you some part of the implementation of SNOW 3G in C. This is a function called MUL_{\alpha}

MULa : [8] -> [32];
MULa(c) = join ( reverse [
   ( MULxPOW(c, 23 :[32], 0xA9) )
   ( MULxPOW(c, 245:[32], 0xA9) )
   ( MULxPOW(c, 48 :[32], 0xA9) )
   ( MULxPOW(c, 239:[32], 0xA9) ) ] );
/* The function MUL alpha.
    Input c: 8-bit input.
    Output : 32-bit output.
    See section 3.4.2 for details.
 \*/
u32 MULalpha(u8 c) {
 return
  ((((u32)MULxPOW(c,23, 0xa9)) << 24 ) |
  (((u32)MULxPOW(c, 245,0xa9)) << 16 ) |
  (((u32)MULxPOW(c, 48,0xa9)) << 8 ) |
  (((u32)MULxPOW(c, 239,0xa9)))) ;
}

You can see that in Cryptol we just say that we want to work with a 32-bit word, and we don’t need to do any shift to our parts of the word. We just join them together. We reverse the sequence, because Cryptol stores words in little-endian, and we want to keep the definition like the specification.

This is a very simple function, so the result in C is not so that different. But if we have a more complex function, we were going to start having a nightmare to write that in C.

Conclusion

Well, the conclusion is that Cryptol is a language that really help to write low-level algorithms. With Cryptol the specification is formal and easier to read than other languages. A value of Cryptol is that the code can be converted to other languages, such as VHDL and C.

If you’re interested, take a look at the presentation that we did.

References

Advertisements




Haskell history

27 10 2007

Haskell is so far one of my favorite programming languages, I don’t know if you noticed (lolz). Because it is ease of proof, no side effects, strongly typed, polymorphic, powerful, elegant and so much more.

Yesterday I was searching if there were any report talking about Haskell authors, other than Simon Peyton Jones or Paul Hudak. What I found out much more than I expected. I’ve found the History of Haskell paper. The paper that was published at 3rd History of Programming Language Conference in June 2007.

I start reading immediately, it was fascinating… The paper is a bit long (55 pages in two columns style), but deserves attention. The history is told for those who made History, Simon Peyton Jones, Paul Hudak, John Hughes and Philip Wadler, the main authors.

Haskell principles, ideology, implementations, tools – Haskell world, are in this paper. This document is excellent for those who want to learn Haskell or just know it history.

References:

Haskell Wikipedia entrance

IFIP WG 2.8





Contains element

1 10 2007

Um problema que tive enquanto fazia o Catalogador de discos (CD) foi na inexistência de uma função que verificasse se todos os elementos de um determinado conjunto faziam parte de um outro conjunto, interessando a ordem.

Exemplificando:

Imaginando que existe uma função f temos,

f [1,9,8,5] [3,4,5,1,3,4,7,8,94,3,1,9,8,5,2,2,2,15,24] terá que originar o resultado

True

.

Encontrei esta solução, se tiverem uma ideia melhor estou receptivo.

{-# OPTIONS -fglasgow-exts #-}
module Main where
import List
contain :: (Eq [a] , Eq a) => [a] -> [a] -> Bool
contain s1 s2 | null s1 || null s2 = False
contain s1@(h:_) s2 =
        let len   = length s1
            elems = elemIndices h s2
        in  (or . map (==s1)) [take len $ drop n s2 | n <- elems ]

código aqui





Parsing de expressões

30 09 2007

Vamos criar uma função que pega numa String, que tem representado ma expressão matemática, e passa para um determinado tipo de dados (Exp o).

A String que iremos passar á função para que seja processada será qualquer coisa do género:

"((1>=a) || (4==x))" e até "(2 -( ((1+3) * (4-x)) / 45))"

Comecemos então pelo inicio;

{-# OPTIONS -fglasgow-exts #-}
module ParserExp where
import Char

Escolhemos o nosso tipo de dados para representar, primeiro, os operadores:

data Ops = Add | Mul | Sim | Sub | Div
             | OR_ | AND_ | NOT_ | GE_
             | GT_ | LE_ | LT_ | EQ_ | NE_
        deriving Eq

Acabamos de definir 14 operadores matemáticos; Adição, Multiplicação, Simétrico, Subtracção, Divisão, ou lógico, e lógico, negação, maior que, menor que, maior ou igual a, menor ou igual a, igual e diferente.

Passemos agora por definir as nossas expressões como uma árvore n-ária:

data Exp o = Const Int | Var String | Op o [Exp o]

Agora podemos ver que o tipo de dados escolhido encaixa perfeitamente no que pretendemos, vejamos:

"((1>=a) || (4==x))" pode ser representado comoOp OR_ [Op GE_ [Const 1, Var "a"], Op EQ_ [Const 4, Var "x"]]

Temos que criar uma classe para sabermos a aridade de cada operador:

class Opt o where
        arity :: o -> Int
instance Opt Ops where
        arity Add = 2
        arity Mul = 2
        arity Sim = 1
        arity Sub = 2
        arity Div = 2
        arity OR_  = 2
        arity AND_ = 2
        arity NOT_ = 1
        arity GE_  = 2
        arity GT_  = 2
        arity LE_  = 2
        arity LT_  = 2
        arity EQ_  = 2
        arity NE_  = 2

De seguida fazemos as instancias de Show para os nossos tipos de dados, ficamos assim com:

instance Show Ops where
        show Add = "+"
        show Mul = "*"
        show Sim = "Sim"
        show Sub = "-"
        show Div = "/"
        show OR_  = "||"
        show AND_ = "&&"
        show NOT_ = "!"
        show GE_  = ">="
        show GT_  = ">"
        show LE_  = "<="
        show LT_  = " Show (Exp o) where
        show (Const n) = show n
        show (Var s)   = s
        show (Op o l)
                | arity o == 2 = "(" ++ (show $ head l) ++ show o ++ (show $ last l) ++ ")"
                | arity o == 1 = "(" ++ show o ++ (show $ head l) ++ ")"

Bom… vamos finalmente ao que nos interessa relamente, o nosso Parser!

Existe um tipo de dados em Haskell que foi pensado para parser, que é o ReadS, este apresenta a seguinte definição:

data ReadS a = String -> [(a,String)]

Ou seja, é fácil perceber que este tipo terá que ser a assinatura da nossa função, pois só desta forma conseguimos processar uma String, originando (sempre!) uma lista de tuplos em que a primeira parte do tuplo tem o nosso tipo desejado (árvore n-ária) e a segunda parte tem o resto da String que ainda não foi processada:

De lembrar que o nosso tipo de dados é (Exp Ops) e não apenas Exp.
Sendo assim temos:

readsExp :: ReadS (Exp Ops)
readsExp s =
        [((leOp "~" [a]),p4) |  ("(",p1)  <- lex s,
                                       ("~",p2) <- lex p1,
                                       (a,p3)    <- readsExp p2,
                                       (")",p4)  <- lex p3] ++
        [((leOp op [a,b]),p5) | ("(",p1)  <- lex s,
                                       (a,p2)    <- readsExp p1,
                                       (op,p3)  <- lex p2,
                                       op == "+"  || op == "*"  || op == "/"  ||
                                       op == "-"  || op == "||" || op == "&&" ||
                                       op == "==" || op == "!=" || op == "=" || op == "!"  || op == ">"  ||
                                       op == "<",
                                       (b,p4)   <- readsExp p3,
                                       (")",p5) <- lex p4 ] ++
        [((Const ((read a)::Int)),sx) | (a,sx) <- lex s, all isDigit a] ++
        [((Var a),sx)                 | (a,sx)  [Exp Ops] -> Exp Ops
                leOp o = Op (read o::Ops)

Acreditem que o nosso parser resume-se a estas 21 linhas (com indentação exagerada).

Aqui temos que dar os créditos ás listas por compreensão e depois á função lex que vem no Prelude. É incrível o poder desta função.
Aconselho a verem mais sobre esta função.

E finalmente temos a instancia de Read para Exp Ops:

instance Read (Exp Ops) where
        readsPrec _ s = readsExp s

Apartir de agora sempre que quisermos fazer parsing a uma expressão utilizamos:

parsing :: String -> Exp Ops
parsing = read

Ao usar a função parsing asseguramo-nos de que iremos receber sempre uma Exp Ops que é o desejado.

Este parser é bastante simples e como tal não faz reconhecimento de erros ou más formações na String recebida.

É tudo!

código aqui