Capture and dissect network traffic

1 04 2009

Currently I am doing research at the University of Minho in the group of distributed systems, with duration of one year. My job is to find a way to identify specific links between a user and a distributed system. The general idea is to draw a map of services in a distributed system. This post only refers to the first milestone.

The proposal was to make such a system using Snort.


Snort is a Network intrusion detection system, that means with Snort you can detect malicious activity in your network. We can detect many types of network attacks. We can identify DoS, DDoS attacks, port scans, cracking attempts, and much more.

Snort can operate in two different ways. We can set up Snort to run in passive mode, putting it to listen in promiscuous mode. That is, because Ethernet network switches send traffic to all computers connected to itself, we get traffic addressed to other machines on the network. To do this we only need to connect to the network and turn Snort on in our machine, no one knows that we are recording every traffic (including traffic destined for other computers).

Snort may also run in active mode. This “active” is not able to modify the data channel, but to be able to be installed in a network, a router for example and reap more information than in passive mode. Thus it makes sense to use the capacity of rules that Snort supports, to filter the traffic that it read.

To do this, Snort capture all packets that pass the network and interprets each. As the rules we have defined Snort tries to find these patterns in each packet, or each set of packets and take certain actions for each of them.

For example, if a large number of TCP requests reach a particular host, to a large number of ports in a short space of time we probably are the target of a port scan. NIDS like Snort know how to find these patterns and alerting the network administrator.


Our aim was to use Snort to capture all traffic into passive mode.

root@pig:# snort -u snort -g snort -D -d -l /var/log/snort -c /etcsnort/snort.debian.conf -i eth0

We are saving the logs in binary (tcpdump format), for that I use the “-d -l /dir/” flags. I prefer to save all the packets into binary because is more easier to parse, than the structure of files and directories that Snort creates by default.

I started by trying to use some language that advised me to try to do the parsing of the file created by snort. Initially started to use python, but only find a tcpdump parser and could not get more than one file translated in tcpdump to hexadecimal.
After that I tried to use Haskell and I was amazed!

Haskell and packet parsing

House is a Haskell Operative System done by The Programatica Project.

This is a system than can serve as a platform for exploring various ideas relating to low-level and system-level programming in a high-level functional language.

And indeed helped me a lot in doing my job. This project have already done a lot of parsers for network packets. It implements the Ethernet, IPv4, IPv6, TCP, UDP, ICMP, ARP and I think is all.

The libpcap (tcpdump parser) is already implemented in Haskell too, so is very simple to parse a complete packet:

getPacket :: [Word8] -> InPacket
getPacket bytes =  toInPack $ listArray (0,Prelude.length bytes-1) $ bytes

-- Ethernet | IP | TCP | X
getPacketTCP :: [Word8] -> Maybe (NE.Packet (NI4.Packet (NT.Packet InPacket)))
getPacketTCP bytes = doParse $ getPacket bytes :: Maybe (NE.Packet (NI4.Packet (NT.Packet InPacket)))

As you can see is too easy to have a compete structure of a packet parsed with this libraries. The problem is that they don’t have already implemented a application packet parser. So, according to that image:

This is the level of depth we can go with this libraries. What is very good, but not perfect for me :S

My supervisor told me to start searching a new tool to do this job. I was sad because I could not do everything in Haskell. But it is already promised that I will continue this project in Haskell. You can see the git repo here.

I find tshark, a great tool to dissect and analyze data inside tcpdump files.

The power of tshark

tshark is the terminal based Wireshark, with it we can do everything we do with wireshark.

Show all communications with the IP

root@pig:# tshark -R "ip.addr ==" -r snort.log
7750 6079.816123 -> SSHv2 Client: Key Exchange Init
7751 6079.816151 -> TCP ssh > 51919 [ACK] Seq=37 Ack=825 Win=7424 Len=0 TSV=131877388 TSER=1789588
7752 6079.816528 -> SSHv2 Server: Key Exchange Init
7753 6079.817450 -> TCP 51919 > ssh [ACK] Seq=825 Ack=741 Win=7264 Len=0 TSV=1789588 TSER=131877389
7754 6079.817649 -> SSHv2 Client: Diffie-Hellman GEX Request
7755 6079.820784 -> SSHv2 Server: Diffie-Hellman Key Exchange Reply
7756 6079.829495 -> SSHv2 Client: Diffie-Hellman GEX Init
7757 6079.857490 -> SSHv2 Server: Diffie-Hellman GEX Reply
7758 6079.884000 -> SSHv2 Client: New Keys
7759 6079.922576 -> TCP ssh > 51919 [ACK] Seq=1613 Ack=1009 Win=8960 Len=0 TSV=131877415 TSER=1789605

Show with a triple: (time, code http, http content size), separated by ‘,’ and between quotation marks.

root@pig:# tshark -r snort.log -R http.response -T fields -E header=y -E separator=',' -E quote=d -e frame.time_relative -e http.response.code -e http.content_length

Show a tuple of arity 4 with: (time, source ip, destination ip, tcp packet size).

root@pig:# tshark -r snort.log -R "tcp.len>0" -T fields -e frame.time_relative -e ip.src -e ip.dst -e tcp.len
551.751252000  48
551.751377000   144
551.961545000  48
551.961715000   208
552.682260000  48
552.683955000   1448
552.683961000   1448
552.683967000   512
555.156301000  48
555.158474000   1448
555.158481000   1400
556.021205000  48
556.021405000   160
558.874202000  48
558.876027000   1448

Show with a triple: (source ip, destination ip, port of destination ip).

root@pig:# tshark -r snort.log -Tfields  -e ip.src -e ip.dst -e tcp.dstport
...   37602   37602  22   37602  22  22   37602   37602   37602  22  22  22  22   37602   37602


Hierarchy of protocols

root@pig:# tshark -r snort.log -q -z io,phs
frame                                    frames:7780 bytes:1111485
  eth                                    frames:7780 bytes:1111485
    ip                                   frames:3992 bytes:848025
      tcp                                frames:3908 bytes:830990
        ssh                              frames:2153 bytes:456686
        http                             frames:55 bytes:19029
          http                           frames:5 bytes:3559
            http                         frames:3 bytes:2781
              http                       frames:2 bytes:2234
                http                     frames:2 bytes:2234
          data-text-lines                frames:10 bytes:5356
        tcp.segments                     frames:3 bytes:1117
          http                           frames:3 bytes:1117
            media                        frames:3 bytes:1117
      udp                                frames:84 bytes:17035
        nbdgm                            frames:50 bytes:12525
          smb                            frames:50 bytes:12525
            mailslot                     frames:50 bytes:12525
              browser                    frames:50 bytes:12525
        dns                              frames:34 bytes:4510
    llc                                  frames:3142 bytes:224934
      stp                                frames:3040 bytes:182400
      cdp                                frames:102 bytes:42534
    loop                                 frames:608 bytes:36480
      data                               frames:608 bytes:36480
    arp                                  frames:38 bytes:2046


We use: -z conv,TYPE,FILTER

TYPE could be:

  • eth,
  • tr,
  • fc,
  • fddi,
  • ip,
  • ipx,
  • tcp,
  • udp

And the filters are used to restrict the statistics.

root@pig:# tshark -r snort.log -q -z conv,ip,tcp.port==80
IPv4 Conversations
                                |           | |    Total    |
                                |Frames Bytes | |Frames Bytes | |Frames Bytes | 141    13091    202   259651    343   272742     22     6858     28     4784     50    11642


We use: -z io,stat,INT,FILTER,…,FILTER

root@pig:# tshark -r snort.log -q -z io,stat,300,'not (tcp.port=22)'
IO Statistics
Interval: 300.000 secs
Column #0:
                |   Column #0
Time            |frames|  bytes
000.000-300.000    2161    543979
300.000-600.000    1671    264877
600.000-900.000     508     46224
900.000-1200.000     185     12885
1200.000-1500.000     201     14607
1500.000-1800.000     187     13386
1800.000-2100.000     189     13887
2100.000-2400.000     187     13386
2400.000-2700.000     189     13887
2700.000-3000.000     187     13386
3000.000-3300.000     185     12885
3300.000-3600.000     189     13887
3600.000-3900.000     210     15546
3900.000-4200.000     189     13887
4200.000-4500.000     187     13386
4500.000-4800.000     185     12885
4800.000-5100.000     189     13887


With tshark we could do everything we want to know what is inside a network packet. The trick is to understand the statistics that tshark generate, and know how to ask it.

Now my work will get a machine to run Snort in an active mode and begin to understand how to use Snort to do all this work of collecting information.

If you feel interested and understand Portuguese, see the presentation:


Cryptol the language of cryptography

1 04 2009

Pedro Pereira and I are working on a new project in the Masters. The second half of the Masters is composed of a single project suggested by a company. Some companies are forming partnerships in the Masters formal methods, including: the Critical software, SIG and Galois. We chose the Galois because we also are in the area of cryptography and we already knew some work of some people from this company.

The project suggested by Galois was study the Cryptol as a language of specification of cryptographic algorithms. The cipher we used for this study is the SNOW 3G (The SNOW website), later on I will talk about the specification of this cipher. In this post I am only interested to show the language.

I’m going to show you some details about the language. This post is not intend to be a exhaustive explanation of Cryptol, if you looking for that you can go directly to the manuals. This post only relates my experience, and what I like it most with the language.


Cryptol is a high-level language that is geared to deal with low-level problems. Is a Domain-specific language to design and implement cryptographic algorithms.
This language has a high percentage of correctness of the implementation of a cipher, because it implements type inference, so we can say that a big part of the language implements correctness. This correctness is also achieved thanks to the architecture of the language – functional. We don’t have side effects – a function only return something inside is codomain.
In Cryptol we have this philosophy that says that everything is a sequence. This is very useful because we are working with low level data (array of bits), so we use sequences to represent that arrays. We can have nested sequences to have a more structured representation of data. For example, we can simply transform a 32-bit sequence in a 4 1-byte sequence.
The size of this sequences could be implemented as finite or infinite, as we going to see later in this post. Because Cryptol is a high-level language we can also implement polymorphic functions, most of the primitive functions are implemented in polymorphic mode. The way we have to navigate throw the sequences is using recursion, or sequences comprehension, and with these two techniques we can implement recurrences.

If you are a Haskell programmer you just need the next section to learn Cryptol. This language is so look a like with Haskell that even the philosophy seems to have a lot in commune.

Types in Cryptol

The type [32] means that you have a sequence of 32-bit size. All the types in Cryptol are size oriented. The unit is the Bit, that you can use to represent Bool. To represent a infinite sequence we use the reserved word inf, and we write: [inf] to represent that.

If you want to generate a infinite sequence, we use the syntactic sugar of the sequences like that: [1~..]. Cryptol will infer this sequence as type


That means this sequence have infinite positions of 1-bit words. The type inference mechanism will always optimize the size that he needs, to represent the information.
So, it infer the type of [100~..] as:


Because, it “knows” that needs only 7-bits to represent the decimal 100. But if you need more, you can force the type of your function.
We implement polymorphism in our types, if we have:


This means, that the function f have polymorphism over b, because we say that it domain is one sequence of size a of type b, and it codomain also. Here we could also see: f~:~[a][b]c meaning that f is a constant of sequences of size b of type c, a times.

So, lets talk about some primitive functions in Cryptol, and its types. The tail function have the following type in Cryptol:


As we can see, Cryptol is so size oriented, that we can use arithmetic operators in types. We can probably infer what this function does just from it type: tail works for all a and b such that if we have one sequence os size a+1 of type b it returns one sequence of size a of same type. In fact this function removes the first element of one sequence.

Because of this size oriented philosophy a lot of functions, that change the size of the sequences can be read just from the type.

As you can see in the following list of Cryptol primitive function:

drop~:~\{ a~b~c \}~( fin~a ,~a~\geq~0)~\Rightarrow~(a ,[ a + b ]~c )~\rightarrow~[ b ]~c
take~:~\{ a~b~c \}~( fin~a ,~b~\geq~0)~\Rightarrow~(a ,[ a + b ]~c )~\rightarrow~[ a ]~c
join~:~\{ a~b~c \}~[ a ][ b ] c~\rightarrow~[ a * b ]~c
split~:~\{ a~b~c \}~[ a * b ] c~\rightarrow~[ a ][ b ]~c
tail~:~\{ a~b \}~[ a +1] b~\rightarrow~[ a ]~b

Recursion and Recurrence

Cryptol implements Recursion, just like a lot of functional languages do.

Imagine the fibonacci function definition:

It implementation in Crytol is exactly the same as defined mathematically.

fib : [inf]32 -> [inf]32;
fib n = if n == 0 then 0 else if n == 1 then 1 else fib (n-1) + fib (n-2);

Cryptol uses recursion to permit us to iterate throw sequences.

But, If you prefer you can implement a more functional algorithm of fibonacci function in Cryptol:

fib : [inf]32 -> [inf]32;
fib n = fibs @ n;
   where {
      fibs : [inf]32;
      fibs = [0 1] # [| x + y || x <- drop (1,fibs) || y <- fibs |];

Here, as you can see, we define a infinite list fibs of all the fibonacci numbers, by calling the fibs inside the sequences comprehension fibs, this is called a recurrence, and you can use that too in Cryptol.

Cryptol vs C

I’m going to show you some part of the implementation of SNOW 3G in C. This is a function called MUL_{\alpha}

MULa : [8] -> [32];
MULa(c) = join ( reverse [
   ( MULxPOW(c, 23 :[32], 0xA9) )
   ( MULxPOW(c, 245:[32], 0xA9) )
   ( MULxPOW(c, 48 :[32], 0xA9) )
   ( MULxPOW(c, 239:[32], 0xA9) ) ] );
/* The function MUL alpha.
    Input c: 8-bit input.
    Output : 32-bit output.
    See section 3.4.2 for details.
u32 MULalpha(u8 c) {
  ((((u32)MULxPOW(c,23, 0xa9)) << 24 ) |
  (((u32)MULxPOW(c, 245,0xa9)) << 16 ) |
  (((u32)MULxPOW(c, 48,0xa9)) << 8 ) |
  (((u32)MULxPOW(c, 239,0xa9)))) ;

You can see that in Cryptol we just say that we want to work with a 32-bit word, and we don’t need to do any shift to our parts of the word. We just join them together. We reverse the sequence, because Cryptol stores words in little-endian, and we want to keep the definition like the specification.

This is a very simple function, so the result in C is not so that different. But if we have a more complex function, we were going to start having a nightmare to write that in C.


Well, the conclusion is that Cryptol is a language that really help to write low-level algorithms. With Cryptol the specification is formal and easier to read than other languages. A value of Cryptol is that the code can be converted to other languages, such as VHDL and C.

If you’re interested, take a look at the presentation that we did.


Deploying Honeypots with Honeyd

8 12 2008

Another work of Cryptography, this time on Security in systems information area. Me and Pedro Pereira choose the Honeypot project. The idea is to understand and if possible experiment tools to implement Honeypots.

The Honeynet Project

The Honeynet project began in 1999 by the initiative of Lance Spitzner. It is an international nonprofit project. The idea of this project is to increase security on the Internet without any cost to the public. This is a project with strong bases on OpenSource believes. This project has several years of experience in defense information systems. In his website you can have access to a lot of documentation and software to create Honeynets and Honeypots.

In fact we have started to watch this presentation from HITBSecConf2007 in Dubai by Lance Spitzner. If you feel interested and have 40 minutes I recommend to watch.

Reading about Honeypots

The are 2 kinds of honeypots, client and server. The client surf the web interacting with other computers, and the server wait until someone request a service. And more 2 kind, low and high interaction, that means, if the honeypot is virtual or real (a real machine).

Our work is focused on the study of some solutions to create Honeypots. We start to read some good papers about the subject. After reading Malicious Web Servers, from Know Youy Enemy series, and later the Behind the Scenes of Malicious Web Servers, we start to get involved by the idea of deploying our own Honeypot. So we decide to use Honeyd tool to create them.

In this post I will talk about the process of install and configure a Honepot with the help of Honeyd.


Honeyd is a low interaction honeypot client that create virtual hosts (Honeypots) in a network. These Honeypots can be configured to act like a real operative system, in fact there are approximately 1000 personalities of OS’s that we can chose. At the same time we can configure those operative systems to activate certains services like FTP, HTTP, Telnet, etc.

Honeyd enables a single host to claim multiple addresses – I have tested up to 65536 – on a LAN for network simulation.

The quote is from Niels Provos, the creator of Honeyd.

Installing Honeyd

If you are in debian (like me), the only thing you have to do (and I think you are accustomed to it :P) is:

hydra:~# apt-get install honeyd honeyd-common

The package honeyd-common came with a lot of scripts to emulate the services that we will run in our Honeypot.

If you are running another one download the version I will use, and install it.

Preparing the field for Honeyd

All the system configurations we found was with the intention of running Honeyd in a local network.
And we have a router between us and the Internet. The router do NAT to the internal network, so we only have one IP. So we decide to only implement one honeypot to start testing.

In the router we determined that the computer that has the Honeyd installed ( can be accessed from outside by ssh throw the port 2222, and HTTP by 8080 (to see the statistics). We also add that our honeypot ( would be visible from the Internet with a quantity of open ports. This ports we show further on.

In fact we still have a problem, the router don’t know our honeypot ( To solve this we run in Honeyd computer:

hydra:~# farpd -i eth0

farpd is a program made by Niels Povos. With that program the computer with Honeyd ( will send is MAC address when a ARP request is made to the network. This ARP request happens because the router don’t know who is After sending his MAC address the router will send the package to Honeyd computer (, and Honeyd program will take care of them, sending it to the virtual host.

Now we have to configure how Honeyd will run, the file can be found in /etc/defaults/honeyd
Agora temos que configurar como o hd~vai correr, para isso utilizamos o seguinte textit{script}:

# File: /etc/defaults/honeyd
# Defaults for honeyd initscript

# run as a daemon
# Network interface where honeyd will listen
# Network under control od honeyd (in my case: just one host)
# Options
# -c hostname:port:username:password
OPTIONS="-c localhost:12345:username:password"

The -c flag will collect to us some statistics, that we will put in a pie chart further. This flag receives the hostname, the port, username and password to can access to the statistics.

You may want to configure also the /etc/init.d/honeyd file, here are the first lines of the file:

. . .
# Daemon locations
# Daemon names
# Pidfiles
# Labels
LABEL="Honeyd daemon"
. . .

Note that log files from Honeyd will be written in LOGDIR directory.


With all the statistics being collected by honeyd (with the flag -c activated) the only thing we need is a program to read them, and show them in a human way.
To do that job honeyd came with program honeydstats. To run that command, and start collecting statistics from you honeyd do:

hydra:~# honeydstats --os_report /etc/honeypot/os.honeydstats
      --port_report /etc/honeypot/port.honeydstats
      --spammer_report /etc/honeypot/spam.honeydstats
      --country_report /etc/honeypot/country.honeydstats
      -f /etc/honeypot/honeydstats.conf -l localhost -p 12345

I have choose to write all files in the /etc/honeypot/ directory for convenience. The command above start collecting statistics in port 12345 and host localhost, the –*_report flags indicate the statistics honeydstats collect. The file /etc/honeypot/honeydstats.conf contains the username and the password in the form:

# File: /etc/honeypot/honeydstats.conf

honeypot Configuration

This is the configuration file for my honeypot ( I use a default one, with a little changes, you can find a lot of this files in the Internet.

create win2k
set win2k personality "Microsoft Windows 2000 SP2"
set win2k default tcp action reset
set win2k default udp action reset
set win2k default icmp action block
set win2k uptime 3567
set win2k droprate in 13
add win2k tcp port 23 "sh /usr/share/honeyd/scripts/unix/linux/suse8.0/ $ipsrc $sport $ipdst $dport"
add win2k tcp port 21 "sh /usr/share/honeyd/scripts/win32/win2k/ $ipsrc $sport $ipdst $dport"
add win2k tcp port 25 "sh /usr/share/honeyd/scripts/win32/win2k/ $ipsrc $sport $ipdst $dport"
add win2k tcp port 80 "sh /usr/share/honeyd/scripts/win32/win2k/ $ipsrc $sport $ipdst $dport"
add win2k tcp port 110 "sh /usr/share/honeyd/scripts/win32/win2k/ $ipsrc $sport $ipdst $dport"
add win2k tcp port 143 "sh /usr/share/honeyd/scripts/win32/win2k/ $ipsrc $sport $ipdst $dport"
add win2k tcp port 389 "sh /usr/share/honeyd/scripts/win32/win2k/ $ipsrc $sport $ipdst $dport"
add win2k tcp port 5901 "sh /usr/share/honeyd/scripts/win32/win2k/ $ipsrc $sport $ipdst $dport"
add win2k udp port 161 "perl /usr/share/honeyd/scripts/unix/general/snmp/ public private --config=/usr/share/honeyd/scripts/unix/general/snmp"

# This will redirect incomming windows-filesharing back to the source

add win2k udp port 137 proxy $ipsrc:137
add win2k udp port 138 proxy $ipsrc:138
add win2k udp port 445 proxy $ipsrc:445
add win2k tcp port 137 proxy $ipsrc:137
add win2k tcp port 138 proxy $ipsrc:138
add win2k tcp port 139 proxy $ipsrc:139
add win2k tcp port 445 proxy $ipsrc:445

bind win2k

Here I create a win2k operative system (Microsoft Windows 2000 with SP2) with a lot of open ports {23,21,25,80,110,143,389,5901,137,138,139}TCP and {161,137,138,445}UDP. These port’s must be open in your router, and pointing to the honeypot –

And now we are ready to run honeyd:

hydra:~# /etc/init.d/honeyd start

Generating graphical statistics

Here is the problem of Honeyd, don’t have a graphical interface or a webpage to visualize all the information that are being saved to the .log, just honeydstats…
Well, I’ve found these perl script and these perl system file.

you:/etc/honeypot$ wget -O /etc/honeypot/
you:/etc/honeypot$ wget -O /etc/honeypot/

The file receives files in the form:

#13#           Linux 2.6 .1-7#
#1#          Solaris 10 beta#
#1#            Windows 2000 #
#1#              Windows 98 #
#15#           Windows XP SP1#
#27#                  unknown#

and we have files in the form:

           Linux 2.6 .1-7:       0       1      13
          Solaris 10 beta:       0       0       1
            Windows 2000 :       0       0       1
              Windows 98 :       0       0       1
           Windows XP SP1:       0       0      15
                  unknown:       0       2      26

Now I’m gonna to use the these files generated by honeydstats {os,port,spam,country} to put the files in #n#id# form. Here we need to use the program txt2html, so, better you install it. Now we just have to get out the rust of sed command and we are ready!

# File: /etc/honeypot/
# Country
cat /etc/honeypot/country | sort -rn | head | awk '{print $4" "$1}'  | sed 's/^/#/g' | sed 's/$/#/g' | sed 's/ /#/g' | sed 's/://g'  | perl /etc/honeypot/ /var/www/img/country
cat /etc/honeypot/country | txt2html > /var/www/img/country.html

# Operative System
cat /etc/honeypot/os | sed 's/(.*):([^0-9]*)([^ ]*)([^0-9]*)([^ ]*)([^0-9]*)([^ ]*).*/#7#1#/g' | perl /etc/honeypot/ /var/www/img/os
cat /etc/honeypot/os | txt2html > /var/www/img/os.html

# Spam
cat /etc/honeypot/spam | sort -rn | head | awk '{print $4" "$1}'  | sed 's/^/#/g' | sed 's/$/#/g' | sed 's/ /#/g' | sed 's/://g' | perl /etc/honeypot/ /var/www/img/spam
cat /etc/honeypot/spam | txt2html > /var/www/img/spam.html

# Ports
cat /etc/honeypot/port | sort -rn | head | awk '{print $4" "$1}'  | sed 's/^/#/g' | sed 's/$/#/g' | sed 's/ /#/g' | sed 's/://g' | perl /etc/honeypot/ /var/www/img/port
cat /etc/honeypot/port | txt2html /var/www/img/port.html

Now get the index.html to put in the /var/www/img directory:

you:/etc/honeypot$ mkdir /var/www/img/
you:/etc/honeypot$ wget -O /var/www/img/index.html

Add a cronjob:

you:/etc/honeypot# vi /etc/crontab

And add these line to the end of file:

6  *    * * *   root    /etc/honeypot/

And we are done! For now on, you should have these nice graphics in your index.html file:

Future work

Well, the honeypot still running, and more and more statistics are being collected, in fact we already have make a very good discovers, but in a future post I will talk about that.

We are also thinking in use another kind of honeypot, a low interaction client honeypot, maybe HoneyC to identify malicious servers on the web, and maybe experiment by our selfs what we have read about client honeypots.

Secure connections to MySQL

18 11 2008

Together with Pedro Pereira we decided to investigate how MySQL make secure connections with clients. This is the first milestone of our msc in Cryptography.
It was proposed that we investigate the internal authentication process that MySQL do using X.509 certificates format.

This post gives a short introduction to tools and methods we use, Public-key cryptography, Certificates, OpenSSL, MySQL and VirtualBox.

We use the VirtualBox to install mysql, to avoid installing it in our OS. So, all the commands showed here have to maked in this virtual machine.

Configuring VirtualBox

As we said before, we installed MySQL in a virtual machine, so we decided access the virtual machine by ssh and remote connections to

NAT vs Port forward

By default the network connection in VirtualBox is made by Network Address Translation (NAT), i.e. each package that is sent by the guest machine is modified so that it appears to come from the host machine. Thus it is very easy to guest machine to connect with the entire network (including Internet), but never could start a connection from host machine to guest machine, since the interface of the guest
is hidden by the host machine.

To resolve this issue, and can access from host machine to the guest by ssh and the MySQL we decided to use the Port forward system that VirtualBox offers.
We have the guest machine running a ssh service accepting connections on port 22. Our goal is to make each package reaches a certain TCP port (eg 2222) on the host machine to to be redirected to TCP port 22 in guest machine.

The command that allows us to do this in VirtualBox is: VBoxManage. We make this with following commands, in which would be the name we gave to our guest machine:

shell> VBoxManage setextradata 
        "VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/HostPort" 2222
shell> VBoxManage setextradata 
        "VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/GuestPort" 22
shell> VBoxManage setextradata 
        "VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/Protocol" TCP

From now every time we want to connect by ssh to the guest machine only run the following command in a shell:

shell> ssh -l  -p 2222 localhost

Similarly the same happens with MySQL connections. We want all packages targeted to port 3333 on host machine is redirected to the port 3306 of guest machine . So being able to access the MySQL that is installed on the guest machine:

shell> VBoxManage setextradata 
        "VBoxInternal/Devices/pcnet/0/LUN#0/Config/mysql/HostPort" 3333
shell> VBoxManage setextradata 
        "VBoxInternal/Devices/pcnet/0/LUN#0/Config/mysql/GuestPort" 3306
shell> VBoxManage setextradata 
        "VBoxInternal/Devices/pcnet/0/LUN#0/Config/mysql/Protocol" TCP

Public-key cryptography

The cryptography asymmetric system can be explained with the following analogy: a mailbox,
is accessible to the public through its address (public key), then anyone can send a
message for this box. Just who has the key to the box is the only who can read the messages (private key).

We only guarantee that any person can send encrypted messages to the owner of the mailbox. However we could not guarantee the identity of who recieve the message (the private key may have been compromised). We also can not guarantee the identity of the person who sent (Later we will see that the use of certificates resolve this problem).

For Bob send a message to Alice, he uses her public key to encrypt the message. This cryptogram is sent to Alice that decrypt with her private key.
Later, the Alice responds to Bob, encrypting the message with his public key.

In a different kind of use of public key can have a scenario in which Alice communicate with Bob, encrypts the message with her private key (digital signature) and encrypt it again with Bob’s public key. Thus, on the other side of the channel, Bob uses his private key and
subsequent Alice’s public key, thus obtaining the original clear text.

The cryptogram generated is an example of a symmetric cipher and is more robust than the previous scenario where only one key is used each time. Imagine now a case in which a third malicious actor, publish your public key and claim to be Alice. Thus it is likely someone who cheat and can read some of the messages intended for Alice. Although we have secure connections/strongly encrypted messages, there is still no guarantee the identity of any of the actors in the process of communication. In this context, we use X.509 certificates


An X.509 certificate of public key is an electronic document that can be compared to Identity card. However, instead of attaching a photo to the name of the person, it combines the key to their own (identity). But the certificate may not be issued by the concerned stakeholders because any one could falsify one certificate and claiming a false identity.

There we need that there is an entity (Certification Authority) trusted by both sides to ensure the identities of both. The CA ‘s sign (encrypt) the certificates with theirs private keys allowing validity to who decrypt the signature with their public key CA’s.

But if go to the top of hierarchy in the chain of certificates we will face a problem: who signs the CA certificate? The bottom line is we have always to belive in a entity, now users no longer communicate among themselves but the CA’s do that. The CA ‘s can sign their certificates, an example of a self-signed certificate that normally is a root Certificate.

Installing MySQL

The process we used here, was tested in a machine with Ubuntu 7.10 and 8.04:

shell> apt -get install mysql-server-5.0 mysql-client-5.0

The OpenSSL came compiled by default in the .deb package, but if we have to compile it we only would have to specify the following in the process of setting up the Makefile:

shell> ./ configure --with - openssl

Now, that we have instaled MySQL, we can go into it typing:

shell> mysql -h SERVER -u root -p

This way we got an uncrypted connection to the server, to obtain an encrypted you must add the option –ssl. This option when introduced on the server side means that the server will allow secure connections, in client-side allows to connect to the server via a secure connection. But this option alone is not enough, it is also necessary to introduce –ssl-ca and possibly the –ssl-cert and –ssl-key.
We have to enter with this flags if we not set the appropriate paths of certificates and their keys in the file /etc/mysql/my.conf.

But we’ll see below in more detail how to use these options, now just want to add a user “user” with the password “passwd” in the database “dBASE” located in “” demanding an SSL connection:

mysql > GRANT ALL PRIVILEGES ON dbase .* TO ’user’@’’
       IDENTIFIED BY ’passwd ’
       REQUIRE SUBJECT ’/CN=user ’
       AND ISSUER ’/CN=CA ’

The CIPHER part means the ciphers used for encryption and you should pick up the ciphers stronger because MySQL can use weaker ciphers.

Now, we get out of the MySQL administration program to demonstrate how to generate keys and certificates.

Generate certificates

We will demonstrate how to create a fictitious CA, generate certificates of potential clients/servers and pointed through the private key of CA, just like real in the process. First we create a tree of folders to contain the structuring of certificates:

shell> mkdir -m 755 

The CA folder represents the folder of our certification authority, the private folder will hold private keys; certs folder will have the clients/servers certificates, the newcerts is a required folder for the OpenSSL to store decrypted certificates, whose names will be their serial numbers; finally crl folder will keep the list of revoked certificates.
Now copy the default OpenSSL configuration file to our CA folder:

shell> cp /etc/ssl/openssl .cnf ~/teste/CA/myopenssl .cnf

and we change permission, allowing only the user can read and write:

shell> chmod 600 ~/teste/CA/myopenssl .cnf

We need to create two files, one will be the OpenSSl database:

shell> touch ~/teste/CA/index.txt

and the other, containing the serial numbers of each certificate. We don’t have anyone, so we put “01” in that file:

shell> echo '01' > ~/teste/CA/serial

Now run all commands in the folder ~/test/CA because is there we have the OpenSSL configuration file. The next step is to generate the self-signed CA certificate: generate the CA private key of 2048 bits (Today, less than 2048 bits is no longer considered completely safe).

shell> openssl genrsa -out private/ca-privkey.key 2048

if we want to check the contents of the key:

shell> openssl rsa -text -in private/ca-privkey.key

and if just generate a public key from private key:

shell> openssl rsa -pubout -in private/ca-privkey.key -out ca-publkey.key

Now we generate the certificate (valid for 365 days) and their public key and through private key we signed it:

shell> openssl req -config myopenssl.cnf -new -x509 -extensions v3_ca
        -key private/ca-privkey.key -out certs/ca-cert.crt -days 365

Note that the “Common Name” (CN) is the identifier that distinguishes the entity/person therefore has to be well written. In this case
CN = CA.

Now, if we want to verify the content of the certificate:

shell> openssl x509 -in certs/ca-cert.crt -noout -text

The private key must be stored under very strong permissions, only the root should be able to read it:

shell> chmod 400 private/ca-privkey.key

Then we change the OpenSSL configuration file (myopenssl.cnf) so that we have this information:

[ CA_default ]
dir              = .
certs            = $dir/certs
crl_dir          = $dir/crl
database         = $dir/index.txt
# unique_subject = no
new_certs_dir    = $dir/newcerts
certificate      = $dir/certs/myca.crt
serial           = $dir/serial
# crlnumber      = $dir/crlnumber
crl              = $dir/crl.pem
private_key      = $dir/private/myca.key
RANDFILE         = $dir/private/.rand
x509_extensions  = usr_cert

Now we can produce the client/server certificate:
we generate the private key and certificate request with the public key:

shell> openssl req -config myopenssl.cnf -new -newkey rsa:2048
        -nodes -keyout private/privkey.key -out cert-req.csr

Then we change the permissions of the new key as before. Note that the “Common Name” (CN) is the identifier that distinguishes a person/entity therefore has to be well written.
In this case CN = user.

we can verify the content of the request:

shell> openssl req -in cert-req.csr -noout -text

And with this command we sign the certificate:

shell> openssl ca -config myopenssl.cnf -cert certs/ca-cert.csr
        -keyfile private/ca-privkey.key -out certs/cert.crt
        -infiles cert-req.csr

This last command creates two additional files on certs folder. The cert.crt (signed certificate) and newcerts/01.pem (decrypted certificate). Naturally we would have to repeat the process for similar entity (client/server).
Right now we’re ready to connect with MySQL.

Connecting to MySQL

The cryptographic methods discussed in the first part of this port are situated in a context of communication. However there are many situations where we need to ensure a secure connection. One of those situations: you may want to connect to a remote database.

When accessing to a remote database anyone with access to the same network can inspect all traffic or worse, change it while passing between the client and server. We can however, use the option –compress on the client side to compress the traffic but still unencrypted and unsafe.
But as we said earlier, MySQL supports encrypted connections through the use of libraries of OpenSSL. Here we can see the MySQL Makefile’s SSL section:

Ln 318: openssl_includes = @openssl_includes@
Ln 319: openssl_libs = @openssl_libs@

So any kind of encryption/maintenance of certificates in MySQL is controlled by the functions that are part of the OpenSSL API.

Configuring SSL in MySQL

To ensure the authenticity can be assured we add the following lines to /etc/mysql/my.conf:

[ client ]
ssl -ca=/home/user/teste/certs/ca-cert.crt
ssl -cert =/home/user/teste/certs/cert.crt       #(client)
ssl -key =/home/user/teste/private/privkey.key   #(client)
[ mysqld ]
ssl -ca=/home/user/teste/certs/ca-cert.crt
ssl -cert =/home/user/teste/certs/cert.crt       #(server)
ssl -key =/home/user/teste/private/privkey.key   #(server)

Consider the initial situation in the role of client, we can access to the server, but now in a secure way. Then:

shell> mysql -h SERVER -u USER -p --ssl

If everything went well we now can connect via a secure connection and authenticated using X.509 certificates.

mysql > show variables like '%ssl%';
| Variable_name | Value                                  |
| have_openssl  | YES                                    |
| have_ssl      | YES                                    |
| ssl_ca        | /home/user/test/certs/ca-cert.crt      |
| ssl_capath    |                                        |
| ssl_cert      | /home/user/test/certs/server-cert.crt  |
| ssl_cipher    |                                        |
| ssl_key       | /home/user/test/private/server-key.key |
7 rows in set (0.11 sec)

As a final note, of this part, we mention that the whole process of this part refers to only one user, to another we must repeat everything, of course.

SSL Program

As extra, we decide to implement a simple program that use SSL connections in JAVA.

We found that the MySQL Connector/J supports some properties that are useful to establish SSL connections.

The property useSSL tells the server that we use a secure connection.
In this case the user ssluser was created with the command GRANT … REQUIRE SSL, ensuring that
can only connect by SSL.

import com.mysql.jdbc.*;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Properties;
import java.util.logging.Level;
import java.util.logging.Logger;

public class Main {

    public static void main(String[] args) {
        Connection conn = null;

        try {
            String userName = "ssluser";
            String password = "password";


            String url = "jdbc:mysql://localhost:3333/mysql" //port 3306 of guest machine
                    + "?useSSL=true";

            conn = (Connection) DriverManager.getConnection(url, userName, password);
            Statement stmt = (Statement) conn.createStatement();

            ResultSet rs = stmt.executeQuery("select User,Host,ssl_type from mysql.user;");

            while ( {
                System.out.print(rs.getString(1) + " ");
                System.out.print(rs.getString(2) + " ");
                System.out.println(rs.getString(3) + " ");

        } catch (SQLException e) {
            System.out.println("SQLException: " + e.getMessage());
            System.out.println("SQLState: " +  e.getSQLState());
            System.out.println("VendorError: " + e.getErrorCode());
        } catch (Exception ex) {
        } finally {
            if (conn != null) {
                try {
                    System.out.println("Database connection terminated");
                } catch (Exception e) {  }

This simple program run well, it print the above table to stdout.

We wanted to implement the same application using certificates, but not, we have errors for which no solution yet found. The documentation, unfortunately not worked for us.

Anyway, as a great experience find everything we describe in this post. We learned a lot about cryptography …

DB2 on Campus@Universidade do Minho

12 09 2008

Depois de no ano passado recebermos, aqui na UMinho, uma apresentação sobre a tecnologia DB2 dada por Vítor Rodrigues, receberemos este ano a workshop DB2 on Campus nos próximos dias 2 e 3 de Outubro no pólo de Gualtar da Universidade do Minho.

A workshop será nos apresentada por Raul Chong, um consultor do grupo Information Management Services do IBM Toronto Laboratory com uma vasta experiencia em DB2 SQL Procedural Language.

Se quiserem ter uma ideia do evento, podem assistir a screencasts disponibilizados no blog DB2 Express-C.

Mais informações sobre o evento e inscrições serão disponibilizadas brevemente!

Qualquer duvida envia para

Pointfree Calculator

29 07 2008


In this semester had to do this project with my friend, João Moura, under supervision of professor Alcino Cunha.

The project was to make a program that make automatic the process of proving an equality of functions written in the pointfree style. Not completely automatic, but so iterative. The program load a file with rules and the user inserts an equality of functions in pointfree and can surf the words to find a sub-expression which want to run a particular rule. From the side are shown all the rules that may apply to selected sub-expression.

What is pointfree?

Pointfree is a style of programming that helps us to compose functions. Even more, it helps us later to prove equality between functions.

How can someone prove equality between functions? We can do so, because we have rules. Let’s see an analogy with algebraic calculus.

We have proved that .

Pointfree calculus as is done now, on paper, is boring and repetitive. A proof is this image of someone trying to make a proof.


Abstract representation

If swap is a functions for exchange the elements of a pair:

swap (a,b) = (b,a) --in pointwise
swap = split snd fst --in pointfree

And we want to prove that: . This is clearly obvious, but I will use this example just to show you the representation that we use to see these proofs.

So, here we have the representation of :

If you make a inorder passage it make sense.

As you can see the operator have arity n. Because we want that this operator became flat, we never want to use the law:

Function f is variable, and swap is constant, we know their definition.

Imagine now that we load the following file of rules:


In order to know that sub-expression are to select a term, we have the Path which behaves as follows:

If we select 4 in the following expression , the Path becomes: .
If we select 2, the Path becomes: .

the list in front of the Path is to select a range of elements in an operator with arity n. Thus, if we want select in , the Path becomes: .

Proof of

This process is pretty always the same, but I suggest that you follow with some attention.

In this first tree I will show the Path already selected.

Here I will show the rule that will apply in our expression, and the instantiation of that rule to our selected sub-expression.

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

Proof of

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

Proof of

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

Proof of

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

Proof of

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

Proof of

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

Proof of

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

Proof of

  • Selected function:
  • Path:
  • Rule:
  • Instantiation of the rule:

So, the resulting tree will be:

Resulting function: .

From the rule of equality that we have is true and therefore it is true


Time to show the software interface.

The program’s interface is divided into 3 parts:


Here the user can insert something that want to prove, for example:

and so on…

When you hit enter, you start having fun calculating 🙂

Rules that may apply

Here you can navigate in the rules pressing PgUp or PgDn, and enter to aply the selected rule to the selected sub-expression.

Proof state

Here you can browse the expression imagining it as the trees that showed earlier.

  • up/down – to navigate into the levels
  • left/right – no navigate into childs
  • backspace – to make undo in the proof

We can also save and load proofs into XML, and also save proofs to PDF.


Me and João still improving the pointfree calculator, and in the next year it will be used in one course here in Universidade do Minho.
The next stage we will implement type inference in the pointfree calculator, to make it powerfull. A lot of things have to be done, and in the next month we will start doing that, including make the first public release of the software.

If you understand Portuguese and want to see the presentation that we give in Department of Informatic in University of Minho:

CeSIUM entrevista

10 11 2007

cesiumAcabei de encontrar esta foto na net e não pude deixar de a meter aqui, na altura nem soube onde tinha sido publicada. Já nem me lembrava deste dia. Que grande foto!

Parte dos elementos do CeSIUM.