Sergio Manuel Villordo, Universidad Nacional de Buenos Aires, sergiomanuel03@gmail.com PRIMARY
Hee Joon Park, Universidad Nacional de Buenos Aires, hee@mac.com
Luciano Cabrera, Universidad Nacional de Buenos Aires, lucianocabrera@gmail.com
Juan M. Bodenheimer, Universidad Nacional de Buenos Aires, jbodenheimer@instare.com
Juan Pablo Ferrandez, Universidad Nacional de Buenos Aires, jpferrandez@gmail.com
Antonio Tralice, Universidad Nacional de Buenos Aires, atralice@gmail.com
Student Team: YES
Tableau (http://www.tableausoftware.com)
SQLite3 (http://www.sqlite.org)
Microsoft Excel
Inkscape (http://www.inkscape.org/)
QGIS (http://www.qgis.org/en/site/)
PostgreSQL – PostGIS (http://www.postgresql.org/) (http://postgis.net)
SIMILE Widgets Timeline (http://simile-widgets.org/timeline/)
Approximately how many hours were
spent working on this submission in total?
Provide an estimate of the total
number of hours worked on this submission by your entire team.
180 h
May we post your submission in
the Visual Analytics Benchmark Repository after VAST Challenge 2014 is
complete? YES
Video:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
MC2.1 – Describe common
daily routines for GAStech employees. What does a day
in the life of a typical GAStech employee look
like? Please limit your response to no
more than five images and 300 words.
To understand a normal day for a GAStech employee, we performed an exploratory data analysis of all company car and truck movements and of employee purchases.
The following graphs summarise a regular day for employees, based on the information provided.
Figure 1. Employee movements and purchases presented by employment type and day.
Figure 2. Violin plot of recorded GPS movements presented by day and hour for each employment type.
Figure 3. Matrix of bar charts of employee movements plotted by day for each employment type.
Figure 4. Matrix visualization of movements and purchases during the day, presented by employment type.
On a regular day, employees start moving between 6-7 am and 8-9 am. From around 11 am-12 pm until 2 pm they go out for lunch. In general, movement starts again at about 5 pm and fades until about 9 pm. There is some movement after that, but to a much lesser degree. Some interesting differences appear when comparing employment types or different parts of the week, for example:
- Facilities differs somewhat from this pattern: it shows more continuous activity during the day.
- At the weekend there are important changes. People move much less. Facilities shows only minor movements. Engineering and Security account for most of the GPS data.
These graphs are also useful for showing some special cases that will be discussed later in this report (some people move late at night or early in the morning).
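The daily rhythm described above comes from counting GPS records per hour for each employment type. The snippet below is only a minimal sketch of that kind of tally, not the workflow we ran in Tableau; the record layout, the timestamp format and the sample rows are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

def hourly_activity(records, fmt="%Y-%m-%d %H:%M:%S"):
    """Count GPS records per (employment type, hour of day).

    `records` is an iterable of (employment_type, timestamp_string) pairs.
    The timestamp format is an assumption; adjust `fmt` to the real data.
    """
    counts = Counter()
    for employment_type, timestamp in records:
        hour = datetime.strptime(timestamp, fmt).hour
        counts[(employment_type, hour)] += 1
    return counts

# Hypothetical sample rows, invented for this example.
sample = [
    ("Engineering", "2014-01-06 07:15:00"),
    ("Engineering", "2014-01-06 07:45:10"),
    ("Engineering", "2014-01-06 12:05:00"),
    ("Facilities", "2014-01-06 10:30:00"),
]
activity = hourly_activity(sample)
```

Plotting these counts per employment type gives the morning, lunch and evening peaks discussed above.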
Regarding expenses, a normal day of a GAStech employee seen through purchases looks similar to what the GPS data shows. Many employees start the day with breakfast, coffee or some other activity in the morning (7-9 am). Some Engineering employees do not. Facilities employees show more continuous activity, not as clearly divided as in other areas. At lunchtime activity starts again: people go out to lunch, and we see a similar kind of activity after work (happy hour, dinner, etc.). Facilities employees differ in these patterns, always showing continuous activity during the day and almost no activity when others go out for dinner or drinks. There is a peak in the expenses of Information Technology staff: this is a $10,000 outlier that is discussed later in the report.
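An expense such as the $10,000 purchase stands out visually, but it can also be flagged numerically. The sketch below uses a modified z-score based on the median absolute deviation; this is one common choice and not necessarily how we spotted the outlier. The threshold of 3.5 and all sample amounts except the $10,000 value are assumptions.

```python
from statistics import median

def flag_outliers(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds `threshold`.

    The score is 0.6745 * |x - median| / MAD, where MAD is the median
    absolute deviation. The 3.5 cut-off is a conventional assumption.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # all values (nearly) identical: nothing to flag
    return [a for a in amounts if 0.6745 * abs(a - med) / mad > threshold]

# Hypothetical purchase amounts plus the $10,000 transaction.
purchases = [12.5, 8.0, 15.25, 9.75, 11.0, 10000.0]
outliers = flag_outliers(purchases)
```

Using the median instead of the mean keeps a single extreme purchase from hiding itself by inflating the spread.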
MC2.2 – Identify up to
twelve unusual events or patterns that you see in the data. If you identify
more than twelve patterns during your analysis, focus your answer on the
patterns you consider to be most important for further investigation to help
find the missing staff members. For each pattern or event you identify,
describe
a. What is the pattern or event you observe?
b. Who is involved?
c. What locations are involved?
d. When does the pattern or event take place?
e. Why is this pattern or event significant?
f. What is your level of confidence about this pattern or event? Why?
Please limit your answer
to no more than twelve images and 1500 words.
1. $10,000 purchase: as you can see in the graph, this expense appears to be an outlier, so we think that Lucas Alcazar (car ID 1) should be investigated. There is a second important issue with this transaction: Alcazar's GPS data does not match the time of the purchase. From our analysis we cannot confirm a reason for this strange behavior. A third reason for investigating: there is a credit card transaction 10 minutes later at the place where he had left his car, which is not close to the place of the $10,000 purchase. ID 24 was at the place where this large purchase was made. For these reasons we think the whole issue deserves to be studied. In addition, ID 1 met several times with IDs 21 and 24 (these are security staff associated with other suspicious behaviour, see below).
2. ID 1's movements are unusual. This person shows GPS movement at hours that are not common for GAStech employees. As seen in the first part of the report, the usual daily pattern of GAStech employees is different.
3. ID 28 strange GPS data: when the GPS data of this worker is plotted, it shows strange behavior: he/she seems to be wandering, and the track also appears to be offset to one side. We ran some analyses, and the phenomenon cannot be explained as fast car movement or anything similar. By "displaced" we mean that the GPS points show a spatial offset from the locations that this person visits (for example, between the office location and the GPS data). It is also not possible for ID 28 to walk or drive in the middle of the lake. The displacement reaches about 590 meters. What happened here? Was there a problem with his GPS equipment? Or did someone manipulate the data?
4. The CEO/President of GAStech: he has no GPS data for most of the two weeks, but suddenly, three days before the hijacking (starting on the 17th), he seems to start using the car, or his card starts to be used by someone. The credit card and loyalty card data show no activity before the 17th either. In addition, he pays a large amount (USD 600) at the Chostus hotel on the 18th. Why those days? What had he done before? This may not be unusual, but since we have no records for him before those days, we cannot establish what his normal movements are.
5. Trucks get restless: the type of movement the trucks show on the Wednesday, Thursday and Friday before the last weekend changes. Comparing the following image with the previous one shows major changes. Trucks previously did not move after 4 pm, but on the 15th or 16th they continue. Why? What are they doing? What reason is there to continue the activity when they did not do so on earlier days? We do not see such a change in other areas; only trucks show such a marked change in how much they move on those days. Analyzing their movement shows that they go to the airport. What for?
6. Supervising / watching what the C-level does: some very strange things are happening between the security staff and the company's C-level executives. From the images we share, it seems that the C-level executives are being watched, or something similar. Security people are near them at night, and they take shifts at these positions, changing places in the middle of the night. The CIO, COO, CFO and Ev. Safe. Act. get this "special attention" from security. After the shift at (or near) the C-level executive's home, each one goes back to his or her own house. Each graph shows the GPS activity for one ID, with the time of day on the Y-axis. Combined with the location map, it is very clear that this special-attention activity is taking place.
7. Employees going to the Kronos Capitol: in the following heatmap, we show the IDs of GAStech employees who systematically go to the Kronos Capitol. One of them (ID 25) goes on Saturday 11th. The rest go on Saturday 19th. We think this is suspicious behavior that has to be investigated.
8. Meeting on Friday 10th at night: in this heat map, we show the most active places in the city after 5 pm on Friday 10th. The graph shows the area around Lars Azada's home as one of the areas with the most activity. The other two sectors of the map with intense activity are GAStech and the route to Azada's place.
9. Guy's Giros meeting: the CEO goes to Guy's Giros on Sunday 19th at night. There were other employees at that place: several credit card transactions occur there at the same time, with 11 card transactions within a 30-minute range. That makes us wonder whether this is just a coincidence or whether there was something coordinated among the 11 people involved. It is also very strange how busy the area is for that hour (15 cars were identified in the area at that time).
10. Given everything presented above, we think that further investigation is needed involving the following people:
- CEO/President (ID 31)
- Some members of the Security staff (IDs 15, 16, 21 and 24)
- IT Helpdesk (ID 1)
- Some Facilities staff (the people who use ID 101 and ID 106)
In addition, a putative network of contacts involving these people was detected, but more evidence is necessary to confirm it.
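The displacement of about 590 meters reported for ID 28 in item 3 is the kind of figure that can be checked by measuring the great-circle distance between a recorded GPS point and the known location it should correspond to. The following is a minimal sketch using the haversine formula; it is an illustration and not the exact QGIS/PostGIS procedure we used, and the spherical Earth radius is an assumption.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# One degree of latitude along a meridian is roughly 111 km.
one_degree = haversine_m(0.0, 0.0, 1.0, 0.0)
```

Applying the function to each displaced point and the nearest visited location would give the offset per record, which makes it easier to judge whether the shift is constant (suggesting an equipment or processing problem) or irregular.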
MC2.3 – Like most
datasets, the data you were provided is imperfect, with possible issues such as
missing data, conflicting data, data of varying resolutions, outliers, or other
kinds of confusing data. Considering MC2 data is primarily
spatiotemporal, describe how you identified and addressed the uncertainties and
conflicts inherent in this data to reach your conclusions in questions MC2.1
and MC2.2. Please limit your response to
no more than five images and 300 words.
1. Loyalty and credit card differences: for some purchases the price differs between the loyalty card and the credit card records. Analyzing the differences, they seem to come from a typing error in one of the two records (for example 11,51 versus 51,51, or 27,84 versus 7,84).
2. ID 28 GPS data seems strange: maybe someone changed it, as stated in the unusual events above. In this case we looked at the general path that ID 28 took, despite the strange behavior of the data.
3. Kronos Mart credit card purchases: the timestamps do not match the GPS or the loyalty data. It seems that each credit card transaction actually took place 12 hours earlier than the recorded timestamp: the loyalty data is registered the day before the purchase, and the GPS data places the person at Kronos Mart 12 hours before.
4. Jack's Magical Beans credit card purchases: the timestamps do not match the GPS data. The credit card purchases all appear at 12 pm, while the times when people are at Jack's Magical Beans are earlier.
5. Dates were in different formats: some were in dd/mm, for example, and others in mm/dd. We had to correct these differences to continue our analysis.
6. GPS data problems: there was more than one GPS point for the same person at the same time (same second). This means the exact location cannot be known, since the data implies they are in two different locations (very close to each other) at the same time. In spite of this, plotting the data still showed each employee's behavior.
7. Missing data: trucks vs. truck drivers. The GPS data was associated with a driver ID, but not with the truck itself. The car assignments table had no information about who was driving which truck.
8. Administrative positions: no cars. These positions did not have a car assigned, so we had no GPS information for them. We had to use their credit card information to learn about their routines.
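For the duplicate GPS points described in item 6, we simply plotted the data as it was. If a single position per person and second were needed, one possible approach (an assumption, not something we applied) is to average the coordinates that share the same ID and timestamp, which is reasonable because the duplicates are very close to each other. The tuple layout and sample values below are hypothetical.

```python
from collections import defaultdict

def collapse_duplicates(points):
    """Average GPS points that share the same (car_id, timestamp).

    `points` is an iterable of (car_id, timestamp, lat, lon) tuples.
    Returns one tuple per (car_id, timestamp), sorted by that key.
    """
    grouped = defaultdict(list)
    for car_id, timestamp, lat, lon in points:
        grouped[(car_id, timestamp)].append((lat, lon))

    collapsed = []
    for (car_id, timestamp), coords in sorted(grouped.items()):
        mean_lat = sum(c[0] for c in coords) / len(coords)
        mean_lon = sum(c[1] for c in coords) / len(coords)
        collapsed.append((car_id, timestamp, mean_lat, mean_lon))
    return collapsed

# Hypothetical records: two points for the same car in the same second.
raw = [
    (1, "2014-01-06 07:20:01", 36.0, 24.0),
    (1, "2014-01-06 07:20:01", 36.5, 24.5),
    (1, "2014-01-06 07:20:02", 36.1, 24.1),
]
clean = collapse_duplicates(raw)
```

Averaging hides which of the two readings was correct, so it only makes sense when the duplicates are near each other, as they were here.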