Channels ▼
RSS

Web Development

Grey Hats and Network Janitors


Simon is a freelance programmer and author, whose titles include Beginning Perl (Wrox Press, 2000) and Extending and Embedding Perl (Manning Publications, 2002). He's the creator of over 30 CPAN modules and a former Parrot pumpking. Simon can be reached at simon@ simon-cozens.org.


For a while, at college, I was unable to hide my computing abilities and got roped into the position of student computer representative. This was no more than a glorified helpdesk role, until one day it all went horribly wrong.

For a few days, students had been complaining that they couldn't send e-mail. It was strange, but investigation revealed it was a problem with the upstream ISP's SMTP servers. Nothing I could do about it. Then, at about 10 pm, the entire network went down. Nothing worked. I got hold of the poor IT officer, who cleared out the 4G firewall log, and restarted the firewall. We promised to look over the log at a more sensible time.

At this point, I had already guessed what had happened—some virulent machine had been mass-mailing, filling the firewall log with connections, filling up the upstream mail servers, and getting us on some DNS blacklists. This needed to stop; not only did we need to detect which machines were being unpleasant but, if possible, stop them in advance of anything nasty happening.

Of course, I was only the student computer representative, and didn't have access to the firewall, the network infrastructure, or any of the servers. So this is where things got decidedly "grey-hat."

I was already playing with ethereal (http://www.ethereal.com/) to look for strange network traffic—excessive SMTP activity, unusual DNS queries, or connections to IRC servers in Eastern Europe, for instance—but someone on #perl mentioned ArachNIDS, a set of rules for the snort Intrusion Detection Software, which picked up virus activity. Perfect.

Of course, finding the IP address of the culprit is only half the problem—from there, we have to physically locate the machine and its owner, and, uh, reeducate them. That's where Perl comes in.

Enter Blart

The college network is entirely wireless—which is good, because I can see all the traffic without requiring particular privileges—and entirely DHCP—which is bad, because it's a little harder to track which IP addresses belong to which computers. Even when you've gone from IP address to MAC, you then need to keep track of who that computer belongs to, and where they can be found.

I set up a SQLite database with three tables. The owner table deals with human beings, and where they live:

CREATE TABLE owner (
  name varchar(255) primary key,
  room varchar(255)
);

The machine table keeps track of MAC addresses and who owns them. Later it could be expanded to keep track of when I last visited the machine, what antivirus software it has on it, any comments, and so on:

CREATE TABLE machine (
  mac varchar(18) primary key not null,
  owner varchar(255)
);

And now we need to relate IP addresses to MAC addresses; we use a lease table that tells me when DHCP leases were allocated and when they were cancelled again:

CREATE TABLE lease (
  id integer not null primary key,
  mac varchar(18),
  address varchar(22),
  issued varchar(255),
  annulled varchar(255)
);

I wanted to make my register, which I whimsically called "Blart" (as readers of The Register's "Bastard Operator From Hell" may appreciate), as easy to get going as possible. If the database file didn't exist, it should create it and the three tables on the fly. I've developed a little trick for doing this:

package Blart;
use DBI;
use constant BLART_DB => "/var/lib/blart.db";
-f BLART_DB || do {
  my $dbh = 
    DBI->connect("dbi:SQLite:dbname=".BLART_DB)
    or die DBI->errstr;
  local $/ =";";
  $dbh->do($_) for grep /\S/, <DATA>;
}

The database's schema is stored in the DATA section of the Perl module. The schema being SQL, individual statements are delimited by semicolons. So, if the database doesn't exist, we connect to it (SQLite will automatically give us an empty database) and then execute all the SQL statements we see in the DATA block.

Next, we use my favorite database handling module, Class::DBI, to allow us to get at the database through Perl classes. In fact, we'll use Class::DBI::Loader, which handles reading the database schema and setting up the classes for us:

Class::DBI::Loader->new( 
  dsn => "dbi:SQLite:".BLART_DB,
  namespace => "Blart"
);
Blart::Machine->has_a(owner => "Blart::Owner");
Blart::Owner->has_many(
  machines => "Blart::Machine"
);
Blart::Lease->has_a(mac => "Blart::Machine");

Now we have our database; we now need some data.

Watching and Discovering DHCP

First we'll watch out for any DHCP offers and leases flying over the network. The tcpdump utility is good enough for this, since in its verbose mode, it can dump out the IP address returned by the server to the client. The output of tcpdump -ttnv port bootpc or port bootps will look like this:

1112543649.359207 IP (...) 0.0.0.0.68 > 
255.255.255.255.67: BOOTP/DHCP, Request from 
00:08:a1:7c:4f:06
  Client Ethernet Address: 00:08:a1:7c:4f:06 
  [|bootp]
1112543650.400357 IP (...) 192.168.0.254.67 > 
192.168.1.121.68: BOOTP/DHCP, Reply
      Your IP: 192.168.1.121
      Server IP: 192.168.0.254
      Client Ethernet Address: 00:08:a1:7c:4f:06
      [|bootp]

From this, we can extract the client's IP address and MAC address as returned by the server; we can also look at the requests to see if there are some from a real IP address—that is, a client attempting to renew a lease—and store the information about its MAC and IP if we're not currently aware of the lease. First, we add a new method to Blart to help us register a lease:

sub lease {
  my ($self, $mac, $ip, $time) = @_;
  $mac = uc $mac;

If we already know about this existing lease, give up:

return 0
  if Blart::Lease->search(
    address => $ip, 
    mac => $mac, 
    annulled => 0
  );

If we know about the IP address or the MAC address and there's a current lease associated with them, that lease is now invalidated:

for (Blart::Lease->search(
              address => $ip, 
              annulled => 0),
     Blart::Lease->search(
              mac => $mac, 
              annulled => 0)
    ) {
  $_->annulled($time); $_->update;
}

Finally, create the allocation:

Blart::Lease->create({   mac => $mac, 
            address => $ip, 
            annulled => 0, 
            issued => $time});
return 1;
}

This will put the appropriate row in the leases table. With this, it's now easy to register leases:

Blart->lease( "00:05:5D:F0:E5:8B" 
             => "192.168.1.140", time );

All we have to do is parse what's coming out of tcpdump, and we can note any DHCP allocations as they happen. The CPAN module that's going to help us with this is Regexp::Common. Regexp::Common contains a huge number of precooked regular expressions for matching all sorts of useful things—among them, IP addresses and MAC addresses. We'll use it to simplify handling the tcpdump output:

use Blart;
use Regexp::Common;
my ($ip, $mac, $time);
my $ip_re = $RE{net}{IPv4};

The %RE hash comes from Regexp::Common. We've asked it to provide us a regular expression, $ip_re, which handles IP Version 4 network addresses. Now we watch what comes out of tcpdump:

my $file = shift || 
  "tcpdump -ttnv port bootpc or port bootps |";
open IN, $file or die "$file: $!";
while (<IN>) {

The first line we want to look at will tell us the timestamp, the two IP addresses (client and server), and whether this is a request or response:

if (/^(\d+).*?($ip_re)\.\d+ 
    > ($ip_re).\d+:.*(Request|Reply)/
   ) {
  $time = $1;
  $ip = $4 eq "Request" ? $2 : $3;
}

If it is a request, then the client IP is the first one we see because it's talking to a server. If it's a reply, the client is the second IP address, the one that's being replied to. Next we want to look for lines containing the MAC address, and again Regexp::Common comes up with the goods:

elsif (
   /Client Ethernet Address: ($RE{net}{MAC})/i
  ) {
  $mac = $1;
  Blart->lease($mac, $ip, $time)
    if $ip ne "0.0.0.0" and $ip !~ /^255.255/;
  $time = "";
  $ip = "";
}

If this is a real IP—that is, a broadcast made by a host that doesn't have an IP yet—then we record it, reset the variables, and wait for the next DHCP query or reply.

But I found that, actually, lease renewals didn't happen very often, and if I set Blart up and running with an empty database, there was a lot of traffic that it didn't know about. I wanted to get the leases table up and running very quickly, seeded with at least some data. To do this, I again made use of one of the convenient properties of a wireless network—because I can see every packet, I can look down at the link layer as well as the IP layer, and pick up both IP and MAC addresses out of a single packet. The -e option to tcpdump does that, producing output like so:

1112544236.491527 00:08:a1:7c:4f:06 > 
00:05:49:d9:44:03 IP 192.168.1.180.3304 > 
66.xx.yy.zz.80

Now we have two sets of MAC/IP pairs to look at. The problem is, not all of this is going to represent machines on our network. If I go to look at Google's site, then I'll see a connection like the one above; the 66.xx.yy.zz address is external to our network, and the MAC address given is that of the gateway. We obviously don't want to store this as a DHCP lease; we are not in the business of giving DHCP leases to Google's web servers. So we need to ensure that IP addresses we store are part of our network. The NetAddr::IP module is a handy one for comparing network blocks:

use NetAddr::IP;

my $home_network = 
  NetAddr::IP->new("192.168.0.0/16");
my $address = NetAddr::IP->new("127.0.0.1");
print "Ours" if $home_network->contains($address);

So, as before, we look at the output of tcpdump, and extract the IPs and MACs from the incoming lines:

open IN, "tcpdump -l -en ip|" or die $!;
my $net = qr/^192.168.*/;
my (@macs, @ips);
while (<IN>) {
  next unless
  ($mac[0], $mac[1], $ip[0], $ip[1]) =
  /($RE{net}{MAC}) > ($RE{net}{MAC}).* IP 
  ($RE{net}{IPv4}).* > ($RE{net}{IPv4})/i;

Notice that we're using an array rather than the more obvious $mac1 and $mac2, because we plan to look at both MAC/IP pairs and do exactly the same thing with each of them. As usual, though, we don't want to repeat code, so we use an array and a for loop:

  for (0..1) {
    next unless $home_network->contains(
      NetAddr::IP->new($ip[$_])
    );
    Blart::Machine->find_or_create($mac[$_]);
    Blart->lease($mac[$_], $ip[$_], time);
  }
}

Now we can fill up the leases table a bit faster. In the next step, we want to go from a computer to its owner.

Taking Names

It's at this stage that we need a bit more of a human interface to Blart—we want a tool that allows us to register users, their machines, and to query the database. Because we're performing multiple related tasks with the same tool, I used my Getopt::Auto module to handle the command-line processing. This module allows you to define "subcommands," like so:

use Getopt::Auto
  [adduser => "Add a user"],
  [registermachine => 
   "Add a machine to a user's profile"],
  [what => "Display information on a thing 
   (IP/MAC/Name)"],
  [unknown => "List computers with no known owner"]
;

Now if I say blartadmin adduser "Joe Bloggs" "Room 10", Getopt::Auto arranges for the adduser subroutine to be called with the parameters from the command line. We're not going to go into all of those functions—we'll just look at the first two, which allow us to register machines and users. Thanks to Class::DBI, they're pretty simple:

sub adduser {
  my ($user, $room) = @_;
  Blart::Owner->create({ name => $user, 
                         room => $room });
  warn "Added user $user, $room\n";
}

The second one, registering a machine, is marginally more tricky; we want to be able to register a machine either by IP address or MAC address, which in reality means we need to perform, an IP address lookup and then register it by MAC address. First we write a quick helper method in Blart to look up an active lease given IP address:

sub ip_to_mac {
  my ($self, $ip) = @_;
  my ($lease) = Blart::Lease->search({ 
    address => $ip, annulled => 0 });
  if ($lease) { return $lease->mac }
}

And now we can register a machine:

sub registermachine {
  my ($mac, $user, $room) = @_;
  if ($mac =~ /\./) {
    die "Couldn't find a MAC address for that IP"
      unless $mac = Blart->ip_to_mac($mac);
  }
  my $user_id = Blart::Owner->retrieve($user);
  my $machine = Blart::Machine->find_or_create({
                                    mac => $mac });
  $machine->owner($user_id);
  $machine->update;
}

If the "MAC address" has a period in it, it's really an IP address, so we do our lookup to see who that IP address is currently owned by, and register that computer to a particular owner.

At this stage, we have a database that goes from user to computer to IP address. Now things can get interesting.

"Friendly" Intrusion Detection

Suppose we have Snort, tcpdump, or some other network utility up and running telling us about the state of the network; it'll give us a list of IP addresses merrily talking to other IP addresses. We want to make this a little more personal. If we add this method to Blart:

sub annot {
  my ($self, $ip) = @_;
  return unless my $mac = Blart->ip_to_mac($ip);
  if($mac->owner) {
  return "(".$mac->owner->name.", 
           ".$mac->owner->room.")";
  } else { return "(".$mac->mac.")" }
}

we have something that takes an IP, and returns something like (Simon Cozens, OH 4). And if our tcpdump or Snort logs are piped through this command:

| perl -MBlart -MRegexp::Common -pe
  's/($RE{net}{IPv4})(.\d+)?/
     $1.Blart->annot($1).$2/g'

they become marked up with names and room numbers of the guilty parties, and we can watch Snort print out logs, like so:

02/10-16:34:28 [**] [1:1699:7] P2P 
Fastrack kazaa/morpheus traffic 
{TCP} 192.168.1.197(John Xxxxx, OH**):
1827 -> 82.41.xx.yy:1409

A swift pop up the stairs and a quiet word usually does the job at this point.

There are a few other things I'd like to add to the machine table: Maybe I could automatically run queso or nmap on hosts joining the network, so we can keep a note of their operating system and potentially vulnerable ports; noting down the network name that Windows machines advertise themselves as can sometimes be helpful to identify unknown machines. (The virus-ridden computer that started off this whole caper was only found because it sent out browser announcements identifying it as LIBRARYA.)

For the black-hats, maybe the table could be marked up with the output of dsniff to record any passwords that the unfortunate user was spraying across the network; I, on the other hand, would merely like to add in the last time I checked that the machine in question was properly protected against viruses.

You believe me, right?

TPJ


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 
Dr. Dobb's TV