foldr the magic function

20 11 2007

Recursion

In Haskell we essentially use recursion, this mean defining a function using itself definition.
A recursive function is, for example the factorial:

in haskell we define it like that:

fact 0 = 1
fact n = n * fact (n-1)

As you can see Haskell is authentic mathematic, pure functional language.

My point here is to show you the magic of the function $foldr$ in Haskell, so I will assume that you already know Haskell, and maybe are accustomed to use $foldr$ function.

ok…

Seeing the magic

The definition of the function $length$ could be something like this:

length [] = 0
length (h:t) = 1 + length t

The definition of the function $sum$ could be something like this:

sum [] = 0
sum (h:t) = h + sum t

And the definition of the function $mult$ could be something like this:

mult [] = 1
mult (h:t) = h * mult t

As you can see in all three definitions we have a pattern and we can generalize it:

functionName [] = stopCase
functionName (h:t) = constant operator (functionName t)

We always need the stopping case of the function!

And the same happens in other functions like $reverse$ :

reverse [] = []
reverse (h:t) = reverse t ++ [h]

In this case we have:

functionName [] = stopCase
functionName (h:t) = (functionName t) operator constant

So, we will always use the function applied to the tail of the list!

If we use some of lambda notation we can see one more pattern, the real one:

sum [] = 0
sum (h:t) = (a b -> a + b) h (sum t)

reverse [] = []
reverse (h:t) = (a b -> b ++ [a]) h (reverse t)

Our lambdas functions have arity 2, because those are the parts in which we can see the lists (head + tail).

Now we can generalize even more, considering $function$ have arity 2:

functionName [] = stopCase
functionName (h:t) = function h (functionName t)

well, that’s not right for Haskell. Because it doesn’t know what $function$ represents, that’s because the definition of $functionName$ doesn’t receive that argument. And the same happens with $stopCase$ .

functionName function stopCase [] = stopCase
functionName function stopCase (h:t)
     = function h (functionName function stopCase t)

Et voilá.

In fact the last definition is the true one for $foldr$ . Easy right?

Functions with foldr’s

Well… Now we can redefine previous functions just with foldr’s:

length l = foldr (a b -> 1 + b) 0 l
sum l = foldr (+) 0 l
mult l = foldr (*) 1 l
reverse l = foldr (a b -> b ++ [a]) [] l

And all the functions over list’s…

and l = foldr (&&) True l
or l = foldr (||) False l

More magic

Foldr is more magical than this, it is one catamorphism! A Catamorphism over this data type:

data [a] = [] | a : [a]

Haskell List’s!

That means that $foldr$ functions “knows” the recursive pattern over List’s.

Foldr also allows you to use Pointfree notation, witch I like. Formal explanation.

length = foldr (a b -> 1 + b) 0

More foldr’s

We can also write foldr’s functions to every data type that we create, and “explain” how recursive patterns works for our new data type:

data BTree a where
    Empty :: BTree a
    Node :: a -> BTree a -> BTree a -> BTree a

foldrBTree :: (a -> b -> b -> b) -> b -> BTree a -> b
foldrBTree op k Empty = k
foldrBTree op k (Node a l r) = op a (foldrBTree op k l) (foldrBTree op k r)

Now if we have a Binary Tree and want to know how many nodes it has, we just have to do this:

countNodes = foldrBTree (\a nodesLeft nodesRight ->a+nodesLeft+nodesRight) 0

If we want to know how many Empty’s our tree has:

countEmpty = foldrBTree (\_ nodesLeft nodesRight -> nodesLeft + nodesRight) 1

and so on…

Fold’s are really a genuine help for your code.
Later on I will talk more about catamorphisms…

Actions

Information

Date : November 20, 2007
Tags: catamorphism, code, English, foldr, GADT, haskell, howto, lambda, pointfree, recursion
Categories : Uncategorized

20 responses

20 11 2007: Praj (08:46:09) :

Hi,
I liked your post it’s best explanation for foldr i have seen on the web for newbies
I still have a doubt though , how did you replace
functionName with functionName function stopCase
in the end would you care to elaborate on that

Thanks

Reply
20 11 2007: Nick (10:40:00) :

Hi. Good post, but you should try removing more points from your functions. “sum l = foldr (+) 0 l” becomes “sum = foldr (+) 0”, “and l = foldr (&&) True l” becomes “and = foldr (&&) True”, etc.

Reply
20 11 2007: augustss (11:25:43) :

I think the ‘any’ and ‘all’ functions deserve some further comment. The fact that laziness gives you early-out on those is interesting.

Reply
20 11 2007: mojo jojo (12:42:54) :

brilliant. now go write a dx10 video game with ur awesome language…

Reply
20 11 2007: Chris Eidhof (13:47:29) :

You should also take a look at UU’s Attribute Grammar System. It is a preprocessor for Haskell which is based on the idea of catamorphisms, and allows you to easily fold over lots of data-types. It comes in very handy for writing compilers or transforming large tree-like data-structures.

http://www.cs.uu.nl/wiki/HUT/AttributeGrammarSystem

Reply
20 11 2007: gwern (21:27:19) :

Nitpicks: did you intend ‘lenght’ and to write “Foldr also allow you to use Pointfree notation, wish I like.”

Reply
21 11 2007: Keebler (01:52:52) :

Did some of this post get truncated?
I only see it up to the definition of ‘fact’..

Reply
21 11 2007: ulisses (04:54:29) :

In response about some critics of my English, just letting you know that I wrote this post in English to reach a vast audience . I think that you can get the whole idea, and the most important part of this is the technical explanation.

Even so I corrected all of the errors.

@Praj
f a = b
what is b? you have to receive “b”, so:
f a b = b
hope that I made that clear to you.

@Chris Eidhof
I’ll check that out. Thank you

@Keebler
That’s already corrected. Thanks

@Nick
I know… I used that on countNodes and countEmpty… I just did that to be more comprehensive.

Reply
21 11 2007: Razvan (10:41:45) :

I think it’s also important to mention that foldr will not necessarily go to the end of the list before returning a result, making it suitable for processing infinite lists. The *and* and *or* functions are good examples, since both will return as early as possible from a call.

Reply
24 11 2007: Surfista (23:49:34) :

Well done Eng. Ulisses

Reply
25 11 2007: CAOS: Como usar a API do Google Maps « .dscape (20:18:54) :

[…] revelam o sítio onde ele deveria realmente estar a programar os seus internacionalmente famosos foldrs. E viva os catamorfismos de listas! […]

Reply
28 11 2007: Zhao Shenyang (19:13:01) :

I think Graham Hutton have an article called “A tutorial on the universality and expressiveness of fold” which is worthwhile to read if you are not very familiar to this topic.

Reply
29 11 2007: Porges (05:04:04) :

“or l = foldr (||) False l” should be “or l = foldr (||) False 0”, shouldn’t it?

Reply
29 11 2007: ulisses (05:12:41) :

foldr (||) False :: [Bool] -> Bool

so the third argument of foldr must be, in this case, [Bool] not Int (0).

Reply
29 07 2009: gurst (21:17:30) :

Loved this article, really helped out. Any chance of fixing the formatting a little? I bet a lot of people have this bookmarked; it’s a great post for new functional programmers.

Reply
30 11 2010: Ulisses Araújo Costa (11:59:33) :

done

Reply
30 11 2010: Nikolay Artamonov (11:17:28) :

Implementations of aggregators (such as length, sum, all, any, mult, etc) with foldr IS inefficient and irrational, because you can use efficient foldl instead. Even better, use strict version of left fold: foldl’. Why?

1. The implementation of foldl is tail-recursive, so it is optimised with tail call optimisation (TCO).

2. foldl’, in addition to foldl, haven’t deferred computations – chunks for updating accumulator values. So, we will NOT get stack overflow at “length [1..1000000]”.

(for details see http://www.haskell.org/haskellwiki/Performance/Accumulating_parameter)

foldr is useful when we have deal with structural transformations from one list to another (such as map, filter, etc), because the implementation of foldr (as opposed to foldl) allows to handle infinite lists.

Reply
30 11 2010: Ulisses Araújo Costa (11:43:14) :

Nikolay I think you miss the point here. My idea with this post was to make an introduction to high-order functions and traversals in Haskell. Like you can see in my other posts related to Haskell or theoretical foundations. I just used the foldr example because it seems more natural to me than foldl.

Of course foldl is way more efficient that foldr, and the only thing you need to draw this conclusion is read the code of foldr and foldl. So I think I owe you up to congratulate for you having read the code.

Either way I gave you a perfect score if it were a lesson.

Reply
1 12 2010: Marcelo Sousa (05:33:40) :

foldr is also useful when we are dealing with lazy operators such as and (foldr and True listofbools). foldl always consume the whole list so that’s why you are mentioning foldr for structural transformations but it’s not just for that :-).

Reply
18 10 2013: Frank R. (17:47:39) :

Great explanation! Thanks a lot!!
Just one suggestion: Please don’t use the small L as a variable identifier because it looks exactly like the number 1.

Reply