University of Maryland – Mindlab

VAST 2009 Challenge
Challenge 1: Badge and Network Traffic

Authors and Affiliations:

Neha Gupta 

Christian Almazan

Ashok Agrawala



Visual Links by Visual Analytics Inc.

Microsoft Excel  






MC1.1: Identify which computer(s) the employee most likely used to send information to his contact in a tab-delimited table which contains for each computer identified: when the information was sent, how much information was sent and where that information was sent.




MC1.2:  Characterize the patterns of behavior of suspicious computer use.


There could be several possible suspicious scenarios in the situation described in Problem 1. Since only one employee leaked information, either he used his own machine or he hacked someone else's machine. And since most of the time he was sending data out (leaking information), we thought we could obtain a clue by comparing the request sizes against the response sizes for particular destination IP addresses.


We analyzed the IPLog data and computed the sum of all request sizes sent to each destination IP address. The address with the largest sum of request sizes was used by all employees on the SMTP port, so we concluded that it was the office’s SMTP server, where large amounts of information are routinely exchanged. The address in second position was the one we found suspicious: apart from the SMTP server, it received the largest amount of request data. Moreover, the ratio of the sum of request sizes to the sum of response sizes over all accesses made to it was on the order of 100, meaning that a lot of information was sent to that IP address but very little was received from it. It also had only 12 unique source IPs, which further strengthened our suspicion.
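The aggregation described above can be sketched in Python. This is only an illustration, not the workflow we actually used (the analysis was done in Excel and Visual Links). The record layout, the function name `summarize_by_destination`, and all addresses and sizes in `sample` are made-up placeholders.

```python
from collections import defaultdict

def summarize_by_destination(records):
    """Aggregate (source_ip, dest_ip, request_size, response_size) rows.

    Returns one tuple per destination IP:
    (dest_ip, total_request, total_response, request/response ratio,
     number of unique source IPs), sorted by total request size, largest first.
    """
    totals = defaultdict(lambda: {"req": 0, "resp": 0, "sources": set()})
    for source_ip, dest_ip, request_size, response_size in records:
        entry = totals[dest_ip]
        entry["req"] += request_size
        entry["resp"] += response_size
        entry["sources"].add(source_ip)

    summary = []
    for dest_ip, entry in totals.items():
        # Guard against destinations that never sent a response.
        ratio = entry["req"] / entry["resp"] if entry["resp"] else float("inf")
        summary.append(
            (dest_ip, entry["req"], entry["resp"], ratio, len(entry["sources"]))
        )
    summary.sort(key=lambda row: row[1], reverse=True)
    return summary

# Placeholder rows for illustration only (not taken from the challenge data).
sample = [
    ("10.0.0.1", "192.0.2.10", 5000, 4000),
    ("10.0.0.2", "192.0.2.10", 6000, 5000),
    ("10.0.0.1", "198.51.100.7", 9000, 90),
    ("10.0.0.3", "203.0.113.5", 100, 2000),
]
summary = summarize_by_destination(sample)
```

In this toy data the second-ranked destination has a request-to-response ratio of 100 and a single source IP, which is the kind of row the summary report is meant to surface.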

We then looked at who was accessing this address.



Figure 1: Summary Report of the Sum of Request and Response Sizes



We plotted the days on which access was made to this particular IP address and found a pattern: it was accessed every Tuesday and Thursday of the month, and the request sizes were huge. On each of these days (apart from 01-08-2008) at least two accesses were made.
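A weekday pattern like this one can also be checked programmatically. The following is a minimal sketch, assuming timestamps in a `MM-DD-YYYY HH:MM` text format; the function name `accesses_by_weekday` is hypothetical, and the times of day in `sample_times` are invented for illustration (only the dates appear in this report).

```python
from collections import Counter
from datetime import datetime

# Explicit names avoid any dependence on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def accesses_by_weekday(timestamps, fmt="%m-%d-%Y %H:%M"):
    """Count accesses per weekday name for a list of timestamp strings."""
    counts = Counter()
    for ts in timestamps:
        counts[WEEKDAYS[datetime.strptime(ts, fmt).weekday()]] += 1
    return counts

# Dates are from the report; the times of day are placeholders.
sample_times = [
    "01-08-2008 17:01",
    "01-10-2008 08:30",
    "01-10-2008 17:02",
    "01-15-2008 09:00",
]
counts = accesses_by_weekday(sample_times)
```

If every key in the result is Tuesday or Thursday, the accesses follow the pattern seen in the plot.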


Figure 2: Suspicious Activities took place every Tuesday and Thursday


The picture below shows the source IPs that accessed the suspicious destination IP. The thickness of the links is proportional to the number of times access was made.


Figure 3: All the suspicious source IPs


We looked at the combined plot of the IPLog and AccessLog (ProxLog) for the employees from whose machines access was made to the suspicious destination IP. There were 12 such employees.

For example, from traffic.txt we note that one of these machines was used to access the suspicious destination IP on 15 January 2008. We plotted the combined access and IP log of the corresponding employee, shown below.

The blue triangle between the red and green ones shows that an IP access was made while the employee was in the classified section. This is an anomaly, and we believe that the person’s computer was hacked.


Figure 4: Access and IP Log of ID #16


We found that, of those 18 incidents of access to the suspicious IP, there were 8 in which the person was not in the office area (this could be determined simply by checking whether the person was in the classified section at the time of the access from his computer). This finding further confirmed that the machines were hacked when the employees went into the classified section.
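The check described above is an interval test: was the machine's owner inside the classified section at the moment of the network access? A minimal sketch follows, assuming each classified-section visit has already been reduced to an (entered, left) pair; the helper names and the sample times are hypothetical.

```python
from datetime import datetime

def parse(ts):
    """Parse a 'YYYY-MM-DD HH:MM' string (an assumed format)."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")

def was_in_classified(access_time, classified_intervals):
    """Return True if access_time falls inside any (entered, left) interval."""
    return any(entered <= access_time <= left
               for entered, left in classified_intervals)

# Placeholder visit and access times for illustration only.
intervals = [(parse("2008-01-15 10:00"), parse("2008-01-15 11:30"))]
anomalous = was_in_classified(parse("2008-01-15 10:45"), intervals)  # True
normal = was_in_classified(parse("2008-01-15 14:00"), intervals)     # False
```

An access flagged `True` means the owner could not have been at the desk, so someone else must have used the machine.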



All the other times when the suspicious employee hacked someone’s computer to access information, it was mostly in the early morning (8-10 am), before most employees had arrived, or in the evening (around 5 pm).

We still needed to find out who the suspicious employee was. This could be done by determining from the access logs (prox logs) which employee was present in the office when all 18 incidents took place.


We plotted the AccessLog of all employees around the incident times (the times at which access was made from any office machine to the suspicious destination IP). There were eight such days.

We plotted only the in-classified and out-classified times, since the in-building times were not very reliable, as mentioned in the task description.


Figure 5: Access Log for Suspicious Activity Date


The above chart is an example of the access (prox) log plot for 01-17-2008, on which two accesses were made to the suspicious IP (shown by the green and purple lines).

We visually analyzed the chart and looked for people who were in the classified section at the time of the incidents. Those people could not have made an access, since they were in the classified section. From the above chart we can see that ID #0 was in the classified section at suspicious access 1, so he cannot be the “bad” employee.
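We did this elimination visually, but the same logic can be expressed as a filter: an employee stays on the suspect list only if he was outside the classified section at every incident time. The sketch below is an illustration under assumptions. Times are simplified to minutes after midnight on a single day, the function name `possible_suspects` is hypothetical, and the employee labels "A", "B" and "C" are placeholders rather than IDs from the data.

```python
def possible_suspects(incident_times, classified_by_employee):
    """Return employees who were never in the classified section
    at any incident time.

    incident_times: list of times (minutes after midnight).
    classified_by_employee: dict mapping employee label to a list of
        (entered, left) intervals in the same units.
    """
    suspects = []
    for employee, intervals in classified_by_employee.items():
        ruled_out = any(entered <= t <= left
                        for t in incident_times
                        for entered, left in intervals)
        if not ruled_out:
            suspects.append(employee)
    return sorted(suspects)

# Placeholder data: incidents at 08:30 and 17:00.
incident_times = [510, 1020]
classified = {
    "A": [(500, 540)],   # in the classified section at 08:30, ruled out
    "B": [(600, 660)],   # outside at both incident times
    "C": [],             # never entered the classified section
}
remaining = possible_suspects(incident_times, classified)  # ["B", "C"]
```

Running the filter over every incident date, rather than one date at a time, narrows the list the same way the chart-by-chart inspection does.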


After analyzing the graphs for all 8 dates, we found that ID #27 and ID #30 were the only ones present all 18 times access was made. Looking further at the access logs of ID #30, we found that on 3 of the days when the suspicious access was made (01-10-2008, 01-17-2008 and 01-24-2008), ID #30 went into the classified section early in the morning without swiping in (which was illegal) and only swiped out.

ID #31, the office neighbour of ID #30, was the first one to be hacked in the evening, so we concluded that ID #30 is the most probable suspect.


