19 12 2007

Lately I have had many works to do at university, therefore I’m sorry for the non regularity of my post’s.
By the way, I also liked to announce that this is my first conscientious post at Planet Haskell.

Why Catamorphisms and Point-Free

Catamorphisms is the way that we can explain in one function how recursive patterns works in some data type.
Here I will use Point-Free notation because what matters here is to show the data flow and the composition of functions.
Point-Free style is used to think in data flow terms, and very useful to program verification, applying formalism to our code.

In point-free style of programming programs are expressed as combinations of simpler functions. This notation is known as write functions without their arguments. Pointwise is the normal form how we write a function.

Couple of examples of point-free notation:
1)

sum = foldr (+) 0 -- point-free
sum l = foldr (+) 0 l -- pointwise


2)

f = (*10).(+2) -- point-free
f n = (n+2)*10 -- pointwise


Clarifications

First of all to define a function, for example $f$, I say:

$\xymatrix {\mathbb{N} \ar[r]^{f} & \mathbb{R}}$ or
$\xymatrix {\mathbb{R} & \mathbb{N}\ar[l]^{f}}$ or
$\xymatrix {\mathbb{N} \ar[d]^{f} \\ \mathbb{R}}$

I will assume that you are familiarized with infix notation, $const$, $either$, $uncurry$ and composition $\circ$ function.

Types

In Haskell we have this definition for lists:

data [a] = [] | a : [a]

Let’s create the same, but more convenient. Consider the following isomorphic type for lists:

data List a = Empty | Node(a,List a) deriving Show

To represent $[1,2,3]$ we wrote $Node(1,Node(2,Node(3,Empty)))$.

As you can see, to construct a $(List a)$ we have two options, $Empty$ or $Node$. Formally we represent the constructor $Empty$ as $1$. And we use $(+)$ to say that our two possibilities are $1$ or $Node$. We could see $Node$ as a the following function:

Node :: (a,List a) -> List a


So typologically we have $1 + (a,List~a)$. We use $(\times)$ to define that two things occurs in parallel, like tuples do, so we can redefine it: $1 + (a \times~List~a)$

Now we can say that $(List~a)$ is isomorphic to $(1 + a \times~List~a)$.
This is something to say that $(List~a)$ and $(1 + a \times~List~a)$ keep the same information without any change or any loss.

Catamorphisms as composition of functions

Let $A$, $B$, $C$, $D$ be Inductive data types (sets) and $out$, $cata$, $rec$ functions.

$\xymatrix@!=4pc{A \ar[r]^{out_{List}} \ar[d]^{cata(g)_{List}} & B \ar[d]^{rec_{List}} \\C & D\ar[l]^{g}\\}$

We will write $cata(g)_{List}$ using the composition of $out$, $cata$, $rec$ functions. That way we are breaking our problem in small ones. So, in the end we will have the following definition for $cata(g)_{List}$:

$cata(g)_{List} = g \circ rec_{List} \circ out_{List}$

The function that we want is $cata(g)$, and that function is over $(List~a)$ so we have:

cata :: (D -> C) -> List a -> C

Type $A$ is $(List~a)$. Maybe this isn’t clear yet, let’s start with function $out$

out

The function $outList$ is responsible to create the isomorphism between $(1 + a \times~List~a)$ and $(List~a)$, so the code could be something like this:

outList :: List a -> Either () (a,List a)
outList Empty    = Left ()
outList (Node t) = Right t


In Haskell we represent the type $1$ as $()$, $(+)$ as $Either$ and $(\times)$ as $(,)$.

So, type $B$ is $(1 + a \times~List~a)$.

function $g$

The function $g$ is also known as *gen*, here is where we said the step that pattern do. Imagine that we want to insert all the values of $(List~a)$ into $[a]$:

-- pointwise
g :: Either () (a,[a]) -> [a]
g (Left()) = []
g (Right(a,h)) = a:h

-- pointfree
g = either (const []) (uncurry (:))


We represent $cata(g)$ as $(| g |)$.
Now we can be more specific with our graphic:

$\xymatrix@!=10pc{List~a \ar[r]^{out_{List}} \ar[d]^{(| g |)_{List}} & 1 + (a \times~ List~a) \ar[d]^{rec_{List}} \\[a] & 1 + (a \times~ [a])\ar[l]^{g}\\}$

rec

Here we have to get a function $rec$ that transform $1 + (a \times~List~a)$ into $1 + (a \times~[a])$. That function, general rec, will be:

recg f g h = f -|- (g ><  g) x = ((f . fst) x , (g . snd) x)


With that function we can say exactly what to do with type $1$, $a$, and $List~a$ in domain of $rec$.
So we want something like this:

rec g = recG id id g

like that we said that $(1 + (a \times~\_))$ will be the same in the counter domain $(1 + (a \times~\_))$ of $rec$. Now we need a function that receive a $List~a$ and give us a $[a]$
Yes, that function is $(| g |)$! So, the final graphic became:

$\xymatrix@!=10pc{List~a \ar[r]^{out_{List}} \ar[d]^{(| g |)_{List}} & 1 + (a \times~List~a) \ar[d]^{id + id \times (| g |)_{List}} \\[a] & 1 + (a \times~[a])\ar[l]^{g}\\}$

cata

Finally we gonna to define the function $cata(g)$:

cata g = outList . rec (cata g) . g


where $g$ is our *gen* function,

g = either (const []) (uncurry (:))


And our objective:

List2HaskellList = cata \$ either (const []) (uncurry (:))


More catamorphisms

Imagine we have the following data type:

data NList a where
Leaf  :: a -> NList a
NNode :: a -> [NList a] -> NList a


$out$, $rec$, and $cata$ became:

out (Leaf a) = Left a
out (NNode a l) = Right (a,l)


Using the previous definitions of $(-|-)$ and $(><)$

rec f = id -|- (id >< map f)

cata g = g . rec (cata g) . out


Imagging that $g$ has type:

g :: Either a (a,[[a]]) -> [a]


And the graphic for this cata became:

$\xymatrix@!=6pc{NList~a \ar[r]^{out_{NList}} \ar[d]^{(| g |)_{NList}} & a + (a \times~[NList~a]) \ar[d]^{id + id \times map~(| g |)_{NList}} \\[a] & a + (a \times~[[a]])\ar[l]^{g}\\}$

Conclusion

I’ve talked about cata’s without any formalism, the idea was to explain to someone who didn’t know.

I will talk more about catamorphisms and how to calculate programs with them.
In the future I will like to talk about anamorphisms too. And Later on I will talk about point-free over non recursive functions.

9 responses

19 12 2007

Cool post ! I like these kind of tutorials even if I already know the topic. That kind of tutorial was very useful to me when I started learning Haskell one year ago.

I think that without blogs I would not have been able to understand Haskell.

20 12 2007

The definition of “recg f g h” shadows the f and g in both the (-|-) and (><) definitions. This makes for a much more difficult to read tutorial.

also, spelling: “madders” should be “matters”

20 12 2007

@alpheccar
thank you

@Nooc

f a b = a + b * (g b)
where g a = a + 1


this is the same as:

f a b = a + b * (g b)
g a = a + 1

5 01 2008

Brutal!

11 01 2008

[...] up in some blog posts (often Haskell-related). Sounds fancy, and when I try to look it up, I run into stuff that is probably really deep, but looks like gobbledygook to me. (Likely because my [...]

12 03 2008

[...] Like I promise I will talk about point-free over non recursive functions. Point-free style is used to demonstrate propriety over programs easily than pointwise. Because we didn’t have variables in functions, we can compose a lot of small functions with each other using some combinators. Those combinators have some rules associate with them that tell us when to use some combinator.Also because I think that’s the easy way to learn point-free programming style. I will talk about that style over non recursive functions. [...]

9 04 2009

[...] in Haskell 9 04 2009 If you miss something in this post, I suggest you to start in Catamorphisms and [...]

9 04 2009

[...] in Haskell 9 04 2009 If you lost yourself in this post, I advise you to start in catamorphisms, then anamorphisms and then [...]

12 04 2009

[...] Catamorphisms in Haskell « Ulisses Costa Blog Catamorphisms is the way that we can explain in one function how recursive patterns works in some data type. (tags: haskell blog cs) [...]