Capture and dissect network traffic

1 04 2009

Currently I am doing research at the University of Minho in the group of distributed systems, with duration of one year. My job is to find a way to identify specific links between a user and a distributed system. The general idea is to draw a map of services in a distributed system. This post only refers to the first milestone.

The proposal was to make such a system using Snort.


Snort is a Network intrusion detection system, that means with Snort you can detect malicious activity in your network. We can detect many types of network attacks. We can identify DoS, DDoS attacks, port scans, cracking attempts, and much more.

Snort can operate in two different ways. We can set up Snort to run in passive mode, putting it to listen in promiscuous mode. That is, because Ethernet network switches send traffic to all computers connected to itself, we get traffic addressed to other machines on the network. To do this we only need to connect to the network and turn Snort on in our machine, no one knows that we are recording every traffic (including traffic destined for other computers).

Snort may also run in active mode. This “active” is not able to modify the data channel, but to be able to be installed in a network, a router for example and reap more information than in passive mode. Thus it makes sense to use the capacity of rules that Snort supports, to filter the traffic that it read.

To do this, Snort capture all packets that pass the network and interprets each. As the rules we have defined Snort tries to find these patterns in each packet, or each set of packets and take certain actions for each of them.

For example, if a large number of TCP requests reach a particular host, to a large number of ports in a short space of time we probably are the target of a port scan. NIDS like Snort know how to find these patterns and alerting the network administrator.


Our aim was to use Snort to capture all traffic into passive mode.

root@pig:# snort -u snort -g snort -D -d -l /var/log/snort -c /etcsnort/snort.debian.conf -i eth0

We are saving the logs in binary (tcpdump format), for that I use the “-d -l /dir/” flags. I prefer to save all the packets into binary because is more easier to parse, than the structure of files and directories that Snort creates by default.

I started by trying to use some language that advised me to try to do the parsing of the file created by snort. Initially started to use python, but only find a tcpdump parser and could not get more than one file translated in tcpdump to hexadecimal.
After that I tried to use Haskell and I was amazed!

Haskell and packet parsing

House is a Haskell Operative System done by The Programatica Project.

This is a system than can serve as a platform for exploring various ideas relating to low-level and system-level programming in a high-level functional language.

And indeed helped me a lot in doing my job. This project have already done a lot of parsers for network packets. It implements the Ethernet, IPv4, IPv6, TCP, UDP, ICMP, ARP and I think is all.

The libpcap (tcpdump parser) is already implemented in Haskell too, so is very simple to parse a complete packet:

getPacket :: [Word8] -> InPacket
getPacket bytes =  toInPack $ listArray (0,Prelude.length bytes-1) $ bytes

-- Ethernet | IP | TCP | X
getPacketTCP :: [Word8] -> Maybe (NE.Packet (NI4.Packet (NT.Packet InPacket)))
getPacketTCP bytes = doParse $ getPacket bytes :: Maybe (NE.Packet (NI4.Packet (NT.Packet InPacket)))

As you can see is too easy to have a compete structure of a packet parsed with this libraries. The problem is that they don’t have already implemented a application packet parser. So, according to that image:

This is the level of depth we can go with this libraries. What is very good, but not perfect for me :S

My supervisor told me to start searching a new tool to do this job. I was sad because I could not do everything in Haskell. But it is already promised that I will continue this project in Haskell. You can see the git repo here.

I find tshark, a great tool to dissect and analyze data inside tcpdump files.

The power of tshark

tshark is the terminal based Wireshark, with it we can do everything we do with wireshark.

Show all communications with the IP

root@pig:# tshark -R "ip.addr ==" -r snort.log
7750 6079.816123 -> SSHv2 Client: Key Exchange Init
7751 6079.816151 -> TCP ssh > 51919 [ACK] Seq=37 Ack=825 Win=7424 Len=0 TSV=131877388 TSER=1789588
7752 6079.816528 -> SSHv2 Server: Key Exchange Init
7753 6079.817450 -> TCP 51919 > ssh [ACK] Seq=825 Ack=741 Win=7264 Len=0 TSV=1789588 TSER=131877389
7754 6079.817649 -> SSHv2 Client: Diffie-Hellman GEX Request
7755 6079.820784 -> SSHv2 Server: Diffie-Hellman Key Exchange Reply
7756 6079.829495 -> SSHv2 Client: Diffie-Hellman GEX Init
7757 6079.857490 -> SSHv2 Server: Diffie-Hellman GEX Reply
7758 6079.884000 -> SSHv2 Client: New Keys
7759 6079.922576 -> TCP ssh > 51919 [ACK] Seq=1613 Ack=1009 Win=8960 Len=0 TSV=131877415 TSER=1789605

Show with a triple: (time, code http, http content size), separated by ‘,’ and between quotation marks.

root@pig:# tshark -r snort.log -R http.response -T fields -E header=y -E separator=',' -E quote=d -e frame.time_relative -e http.response.code -e http.content_length

Show a tuple of arity 4 with: (time, source ip, destination ip, tcp packet size).

root@pig:# tshark -r snort.log -R "tcp.len>0" -T fields -e frame.time_relative -e ip.src -e ip.dst -e tcp.len
551.751252000  48
551.751377000   144
551.961545000  48
551.961715000   208
552.682260000  48
552.683955000   1448
552.683961000   1448
552.683967000   512
555.156301000  48
555.158474000   1448
555.158481000   1400
556.021205000  48
556.021405000   160
558.874202000  48
558.876027000   1448

Show with a triple: (source ip, destination ip, port of destination ip).

root@pig:# tshark -r snort.log -Tfields  -e ip.src -e ip.dst -e tcp.dstport
...   37602   37602  22   37602  22  22   37602   37602   37602  22  22  22  22   37602   37602


Hierarchy of protocols

root@pig:# tshark -r snort.log -q -z io,phs
frame                                    frames:7780 bytes:1111485
  eth                                    frames:7780 bytes:1111485
    ip                                   frames:3992 bytes:848025
      tcp                                frames:3908 bytes:830990
        ssh                              frames:2153 bytes:456686
        http                             frames:55 bytes:19029
          http                           frames:5 bytes:3559
            http                         frames:3 bytes:2781
              http                       frames:2 bytes:2234
                http                     frames:2 bytes:2234
          data-text-lines                frames:10 bytes:5356
        tcp.segments                     frames:3 bytes:1117
          http                           frames:3 bytes:1117
            media                        frames:3 bytes:1117
      udp                                frames:84 bytes:17035
        nbdgm                            frames:50 bytes:12525
          smb                            frames:50 bytes:12525
            mailslot                     frames:50 bytes:12525
              browser                    frames:50 bytes:12525
        dns                              frames:34 bytes:4510
    llc                                  frames:3142 bytes:224934
      stp                                frames:3040 bytes:182400
      cdp                                frames:102 bytes:42534
    loop                                 frames:608 bytes:36480
      data                               frames:608 bytes:36480
    arp                                  frames:38 bytes:2046


We use: -z conv,TYPE,FILTER

TYPE could be:

  • eth,
  • tr,
  • fc,
  • fddi,
  • ip,
  • ipx,
  • tcp,
  • udp

And the filters are used to restrict the statistics.

root@pig:# tshark -r snort.log -q -z conv,ip,tcp.port==80
IPv4 Conversations
                                |           | |    Total    |
                                |Frames Bytes | |Frames Bytes | |Frames Bytes | 141    13091    202   259651    343   272742     22     6858     28     4784     50    11642


We use: -z io,stat,INT,FILTER,…,FILTER

root@pig:# tshark -r snort.log -q -z io,stat,300,'not (tcp.port=22)'
IO Statistics
Interval: 300.000 secs
Column #0:
                |   Column #0
Time            |frames|  bytes
000.000-300.000    2161    543979
300.000-600.000    1671    264877
600.000-900.000     508     46224
900.000-1200.000     185     12885
1200.000-1500.000     201     14607
1500.000-1800.000     187     13386
1800.000-2100.000     189     13887
2100.000-2400.000     187     13386
2400.000-2700.000     189     13887
2700.000-3000.000     187     13386
3000.000-3300.000     185     12885
3300.000-3600.000     189     13887
3600.000-3900.000     210     15546
3900.000-4200.000     189     13887
4200.000-4500.000     187     13386
4500.000-4800.000     185     12885
4800.000-5100.000     189     13887


With tshark we could do everything we want to know what is inside a network packet. The trick is to understand the statistics that tshark generate, and know how to ask it.

Now my work will get a machine to run Snort in an active mode and begin to understand how to use Snort to do all this work of collecting information.

If you feel interested and understand Portuguese, see the presentation:

HTTP attacks

23 01 2009

In this post I will talk about the HTTP results that I collected from the experience with one honeypot.
As you may know, I’ve been putting one honeypot running for 5 weeks. I’ve previously talked about SMTP and SSH attacks, now is the time for HTTP. This service (port 80) is by far the most fustigated service in the honeypot. And make sense, in this port we can have a lot of Web services with a lot of exploitable code running…

This chart represents the number of hit’s in the port 80, that our honeypot had during his activity:

Open Proxy’s

An open proxy is like an Open mail relay, described in SMTP attacks. Anyone in the internet can use this systems to reduce their using of the bandwidth. For example, if an open proxy is located in your country, and if your internet provider only allow you to access web pages of your country, you can use this systems to visit outside pages.
The process is quite simple, if you are under a proxy and you want a web page, the proxy go get them for you and send that to you. All the traffic and effort was made by the proxy. Is just like a forwarding service.

Here at the University we have one closed proxy, and that is great, because a lot of pages are in cache, so even we have a 1Gb connection, the proxy don’t need to get the page we want, because have it in the cache.


The webcollage is a program that seeks the Internet images and composes them in a window. Some have described as a pop culture program. This program makes random web searches and extract the images of the results.

There was a host that has make some hit’s in our honeypot’s, as we can see in the following extract from the file web.log:

--MARK--,"Mon Dec 15 23:09:00 WET 2008","IIS/HTTP","","",56886,80,
User-Agent: webcollage/1.135a

The interesting here is that this agent is trying to get an image using our honeypot as a open proxy. The possibilities is that our honeypot as been seen by a proxy scanner in the past that have add our honeypot to an open proxy list.


We can do something, if this had happened for real. We can block the IP that made the request ( to avoid future requests by this host. But the problem is not from this person who is using a program to get the images. The problem here is that we are targeted as an open proxy, and probably in the future we will be asked again to get some file.

A more comprehensive solution is to block all the requests made from this program. Adding this lines to the file “.htaccess” in the folder.

# Start of .htaccess change.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^webcollage
RewriteRule ^.*$ - [F]
# End of .htaccess change.

This won’t prevent all the attempts to use our server as open proxy, but will prevent all the requests made by this program.

Directory traversal

A directory traversal attacks intend to exploit insufficient security validation of user-supplied input file names. The main objective of this attack is gain access to a computer file that is not intended to be accessible.

I will give you the Wikipedia example:
Imagine you have this PHP script running in your server, and this page have the name vulnerable.php:

$template = 'blue.php';
if ( isset( $_COOKIE['TEMPLATE'] ) )
   $template = $_COOKIE['TEMPLATE'];
include ( "/home/users/phpguru/templates/" . $template );

If someone send you this request:

GET /vulnerable.php HTTP/1.0
Cookie: TEMPLATE=../../../../../../../../../etc/passwd

Your server will return your /etc/passwd file, in the response:

HTTP/1.0 200 OK
Content-Type: text/html
Server: Apache

root:fi3sED95ibqR6:0:1:System Operator:/:/bin/ksh 

We have seen a lot of variations of this attack, as will show you after.

On 4 January 2009 dozens of hit’s were made from Lelystad, The Netherlands. These hit’s were performed to verify if our HTTP server allow directory traversal, to get the file /etc/passwd by the attacker.

In this case, the attacker wanted to acquire the /etc/passwd to perhaps have a list of the users that can access to the system. To possibly get access through any of them.

This type of attack is usually done not asking directly for the file through its conventional path, but in an encrypted way (using exa characters). With that variations it is more unlikely that the application perform validations on what the attacker wants.
Unfortunately our honeypot, not saved the log file’s during the day January 1 until 4, so we only can see in the other’s log’s the activity performed by this host. We cannot show the number of hit’s of this IP in day 4.
In the following listings I will show the pairs of command that the attacker wanted to do and the package that came to our honeypot.

GET ../../../../../../../../../../etc/passwd HTTP/1.1
--MARK--,"Sun Jan  4 05:20:57 WET 2009","IIS/HTTP","","",59706,80,
"GET %2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2Fetc%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
GET .../../../../../../../../../../etc/passwd HTTP/1.1
--MARK--,"Sun Jan  4 05:20:58 WET 2009","IIS/HTTP","","",59711,80,
"GET %2E%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2Fetc%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
GET ../../../../../../../../../../etc/passwd HTTP/1.1
--MARK--,"Sun Jan  4 05:21:02 WET 2009","IIS/HTTP","","",59727,80,
"GET %2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2Fetc%5C%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
GET ....................etcpasswd HTTP/1.1
--MARK--,"Sun Jan  4 05:21:04 WET 2009","IIS/HTTP","","",59740,80,
"GET %2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5Cetc%5Cpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
GET //etc/passwd HTTP/1.1
--MARK--,"Sun Jan  4 05:20:59 WET 2009","IIS/HTTP","","",59700,80,
"GET %2F%2Fetc%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close


Possibly an attacker trying to do this in a real server would get the file he wanted, but our honeypot was not prepared to
respond to such attacks, and it only responds with a message 302:

HTTP/1.1 302 Object moved
Server: Microsoft-IIS/5.0
Date: Sab Jan 10 21:46:17 WET 2009
Content-Type: text/html
Connection: close
Accept-Ranges: bytes
Set-Cookie: isHuman=Y; path=/
Set-Cookie: visits=1; expires=; path=/
Expires: Sab Jan 10 21:46:17 WET 2009
Cache-control: private

<head><title>Object moved</title></head>
<body><h1>Object Moved</h1>This object may be found <a HREF="http://bps-pc9.local.mynet/">here</a>.</body>

We can conclude that the requests that were sent to our honeypot mean that the attacker was using the NSE, which means nmap scripting engine. This is a tool that comes with nmap.
Allows the user to write scripts to automate various tasks in the network.

Here we can see the script that was used to perform this attack.

Morfeus Fucking Scanner (MFS)

MFS is a scanner that search web pages with PHP vulnerabilities. This scanner has a large number of tests for known vulnerabilities in PHP scripts. Then we show some of those applications that MFS search for vulnerabilities.


The WebCalendar is a web calendar that can be used by one or more users, it is possible to create groups with calendars and it supports a wide range of databases.
This application uses several PHP files, one of those id send_reminder.php that contained serious vulnerabilities, one of which allowed the inclusion of remote files through variable “includedir” that was not validated. So anyone can add whatever he want on this page.

We found that this vulnerability only affects WebCalendar version 1.0.4, this application has now in version 1.2.
While our honeypot not have this application installed, although there were made some attempts to attacks on this application, two of them over the vulnerability of this file, as the log show’s.

--MARK--,"Wed Dec 24 16:07:29 WET 2008","IIS/HTTP","","",54941,80,
"GET /webcalendar/tools/send_reminders.php?noSet=0&includedir= HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Morfeus Fucking Scanner
Connection: Close
--MARK--,"Wed Dec 24 16:07:30 WET 2008","IIS/HTTP","","",55003,80,
"GET /calendar/tools/send_reminders.php?noSet=0&includedir= HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Morfeus Fucking Scanner
Connection: Close

This scanner (MFS) scans a list of hosts and attempts to link up several times until the server be attacked.
In our case this type of request on port 80 only returns a 404 error.

The gif file that the attacker wants to include just print a message:

echo (" Morfeus hacked you ");


Although this file has been fixed in this application, the truth is that this scanner continues to include this as a test. The reason for this is that still have many applications with this vulnerability exposed on the Internet.


The CMS Mambo is very popular and used worldwide. The Joomla is a derivative of the first one. In this applications have been discovered many bugs, MFS seems to use some of them. We saw one in particular:

--MARK--,"Wed Dec 24 16:07:34 WET 2008","IIS/HTTP","","",55438,80,
"GET /shop/index.php?option=com_registration&task=register//boutique/index2.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path= HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Morfeus Fucking Scanner
Connection: Close

The attacker wants to set the variable mosConfig_absolute_path from the file index.php with the typical message that has already explained above. What we find is that the input passed to this file in not validated before being used to include files.
This system can be victim of one attack by allowing run code from any source without ever make payable checks.

Prevent attacks from MFS

One way to block this type of attacks from MFS is to add the following lines of code in file. “htaccess” in the folder of the website.

# Start of .htaccess change.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Morfeus
RewriteRule ^.*$ - [F]
# End of .htaccess change.


Note that the “.htaccess” solutions that I gave only works for HTTP servers that use Apache. Users of IIS can use the IsapiRewrite tool.