Entry Name:  "PKU-Chen-2"

VAST Challenge 2014
Mini-Challenge 2

 

 

Team Members:

Siming Chen, Peking University, csm@pku.edu.cn     PRIMARY
Zuchao Wang, Peking University, zuchao.wang@pku.edu.cn
Zipeng Liu, Peking University, winnieliupku@gmail.com
Zhenhuang Wang, Peking University, zhenhuang.wang@pku.edu.cn
Chenglong Wang, Peking University, chenglongwang@pku.edu.cn
Zhengjie Miao, Peking University, j.miao92@gmail.com
Xiaoru Yuan, Peking University, xiaoru.yuan@pku.edu.cn (Supervisor)

Student Team:  YES

 

Analytic Tools Used:

MovementFinder, developed by Peking University, 2014

Python

D3.js

QGIS

 

Approximately how many hours were spent working on this submission in total?

Provide an estimate of the total number of hours worked on this submission by your entire team. 200

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2014 is complete? YES

 

Video:

http://vis.pku.edu.cn/vast2014/pku_vast2014_mc2.wmv

 

pku_vast2014_mc2

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC2.1 Describe common daily routines for GAStech employees. What does a day in the life of a typical GAStech employee look like?  Please limit your response to no more than five images and 300 words.

 

Most GAStech employees own a car. As Figure 1.1 shows, on weekdays they usually stop at a cafe for breakfast (e.g., Katerina's Cafe, Brew've Been Served) between 6:30 and 7:30, and arrive at GAStech before 9:00. They work until 12:00 and then leave for lunch; their favorite restaurants include Ouzeri Elian and Guy's Gyros. In the afternoon, they work from 14:00 to 17:00, and then either go home or go to a restaurant for dinner. During the evening, they may visit shops or other restaurants. Most of them are home before 22:00. On weekends, they usually stay at home and go to restaurants around midday (13:00-15:00) and in the evening (18:00-20:00). Some go to shops, parks or museums for entertainment. Figure 1.2 shows the tracks of these car owners on weekdays and weekends. Figure 1.3 shows their movements at different times of one day.

Figure 1.1

Figure 1.1. POIs visited by car owners on weekdays and weekends.

 

Figure 1.2

Figure 1.2. Weekday/weekend movement patterns of car owners.

 

Figure 1.3

Figure 1.3. Movement patterns of car owners on Tuesday, Jan.14.

 

As an example, consider one car owner, Lars Azda, on Friday, Jan.10. As Figure 1.4 shows, he had breakfast at the restaurant "Bean There Done That" during 7:25-8:12 and paid $15.39. After a 20-minute drive, he arrived at GAStech and started working. He had lunch at the restaurant "Gelatogalore" and paid $26.15. At 17:06, he went home. One hour later, he went to the restaurant "Hippokampos" for supper and paid $48.22. He returned home at 19:52.

Figure 1.4

Figure 1.4. Car owner Lars Azda's life on Friday, Jan.10.

 

Nine employees are truck drivers, as shown in Figure 1.5. In the daytime, they drive trucks between GAStech, transportation sites (e.g., the airport and the port) and industrial businesses (e.g., Carlyle Chemical Inc., Nationwide Refinery). Truck drivers are not assigned company cars. Some other employees drive neither a car nor a truck. They usually eat at nearby restaurants, or ride with others to restaurants farther away.

Figure 1.5

Figure 1.5. Movement patterns of truck drivers.

 

 

 

MC2.2 Identify up to twelve unusual events or patterns that you see in the data. If you identify more than twelve patterns during your analysis, focus your answer on the patterns you consider to be most important for further investigation to help find the missing staff members. For each pattern or event you identify, describe

a.       What is the pattern or event you observe?

b.      Who is involved?

c.       What locations are involved?

d.      When does the pattern or event take place?

e.      Why is this pattern or event significant?

f.        What is your level of confidence about this pattern or event?  Why?

 

Please limit your answer to no more than twelve images and 1500 words.

 

 

We have identified 12 patterns, with two confidence levels:
    1 - confident
    2 - likely

Pattern 1: Meeting at Spetson Park at night

From 23:01, Jan.6 to 8:26, Jan.7, Isia Vann stayed at Spetson Park for the whole night. Loreto Bodrogi arrived there as well at 3:35. Just before Bodrogi arrived, Vann's car briefly recorded GPS data, which is suspicious. They stayed there together and went to the Coffee Shack at 6:37 in the morning. They appear to have had breakfast there, but neither of them paid a bill; someone else with them might have paid. This event is significant because of the unusual nighttime activity and the meeting of two people in a park. The confidence level is 1 because these people show other suspicious behavior as well. (The confidence levels of Patterns 2, 5 and 12 are based on the same reason.)

Figure 2.1

Figure 2.1. Pattern 1: Meeting at Spetson Park at night.
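A meeting such as Pattern 1 can be checked programmatically by intersecting the stay intervals of two people at the same location. The following is only a minimal sketch of that idea, not the MovementFinder implementation; the function name `co_stay`, the tuple layout, the year 2014 and the exact interval boundaries are assumptions read loosely from the description above.

```python
from datetime import datetime

def co_stay(stay_a, stay_b):
    """Each stay is (location, start, end).

    Return the shared (start, end) interval if both stays are at the same
    location and overlap in time, otherwise None.
    """
    loc_a, start_a, end_a = stay_a
    loc_b, start_b, end_b = stay_b
    if loc_a != loc_b:
        return None
    start, end = max(start_a, start_b), min(end_a, end_b)
    return (start, end) if start < end else None

# Interval boundaries approximated from the text of Pattern 1 (assumed values).
vann = ("Spetson Park", datetime(2014, 1, 6, 23, 1), datetime(2014, 1, 7, 6, 37))
bodrogi = ("Spetson Park", datetime(2014, 1, 7, 3, 35), datetime(2014, 1, 7, 6, 37))

shared = co_stay(vann, bodrogi)
if shared is not None:
    print("Together for", shared[1] - shared[0])
```

Two stays that merely follow each other at the same place, as in a handover, produce no overlap and return None, so such cases need a separate check on arrival and departure times.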

 

Pattern 2: Meeting at Taxiarchon Park at night

From 23:00, Jan.8 to 8:18, Jan.9, Loreto Bodrogi and Minke Mies stayed at Taxiarchon Park at non-overlapping times, but might have met on the road. Mies arrived first at 23:00, Jan.8, and stayed until 3:30, Jan.9. Bodrogi left home at 3:20 and arrived at 3:32. Mies left for home just 2 minutes before Bodrogi arrived. However, their trajectories overlapped, so they might have met on the way. Bodrogi stayed at the park for the rest of the night and then went to Jack's Magical Beans for breakfast. As in the event two days earlier, he did not pay a bill, and then went directly to work. This event is significant because of the unusual time and place. The confidence level is 1.

Figure 2.2

Figure 2.2. Pattern 2: Meeting at Taxiarchon Park at night.

 

Pattern 3: Meeting at an anonymous building at night

On Friday night (18:00-24:00), Jan.11, people from the Information Technology Group and the Engineering Group gathered at an anonymous building near Parla Park (N.Delfon St/N.Ketallinias St). The building is closest to, and might be, Lars Azda's house. Most group members came (4/5 of the IT Group, 12/13 of the Engineering Group). People started arriving from other parts of Abila at 18:20, and the last one arrived at 20:18. The activity lasted for around two hours. People began to leave at 22:20, and the last one got home at 0:29. This event is significant because many people from the same groups were at a place where they did not regularly appear at that time. The confidence level is 1 because many people's records corroborate each other.

Figure 2.3

Figure 2.3. Pattern 3: Meeting at an anonymous building at night.

 

Pattern 4: Sequential visits of many POIs at night

During 20:24-24:00, Jan.11, Bertrand Ovan visited a sequence of places without paying any bills. Leaving his house at 22:11, he reached the Brew've Been Served cafe within a minute. After staying for 9 minutes, he went to Ouzeri Elian. Then he headed for a place near Kronos Mart and arrived at 22:40. 15 minutes later, he went to Albert's Fine Clothing and stayed there for another 23 minutes. Starting from 23:21, he spent 4 minutes on the way and arrived at a place near U-Pump or Jack's Magical Beans at 23:25. He stayed there for about 23 minutes and arrived home at exactly 0:00, Jan.12. This event is significant because he visited many places in a short time, as if he were checking on or hiding something at some of them. The confidence level is 1 because the behavior differs greatly from the normal pattern.

Figure 2.4

Figure 2.4. Pattern 4: Sequential visits of many POIs at night.

 

Pattern 5: Meeting in an anonymous building at night

This looks similar to Pattern 2, but the people and location are different. The event lasted from 23:00, Jan.13 to 8:31, Jan.14. Hennie Osvaldo went to an anonymous building near the east side of Spetson Park (N.Polvo St/N. Brada St) at 23:08 and stayed there until 3:30, Jan.14. Minke Mies left home at 3:20 and arrived at the place where Osvaldo had been at 3:31. They missed each other by 1 minute, but they might have met somewhere near the anonymous building. It seems likely that they were taking turns on duty for some nighttime activity. Afterwards, Osvaldo went home and Mies stayed there until 7:47 in the morning. As in the Jan.9 event, Mies then went to the Coffee Shack and paid nothing. At 8:18 he headed for GAStech and arrived at 8:31. The same event happened between Osvaldo and Isia Vann on Jan.11 at the same place. The confidence level is 1.

Figure 2.5

Figure 2.5. Pattern 5: Meeting in an anonymous building at night.

 

Pattern 6: Busy truck transportation

On Jan.16, between 17:00 and 21:00, four trucks visited a sequence of places multiple times, stopping only very briefly. The drivers were Dylan Scozzese, Henk Mies, Benito Hawelon and Valeria Morlun, corresponding to truck IDs 101, 104, 105 and 106. Before 17:19, the four trucks were at different places: Abila Scrap, Abila Airport, Carlyle Chemical Inc. and Port of Abila. At exactly the same time, they started to move. Dylan made nearly two circuits of these four places, stopping at Abila Airport for less than 1 minute, and finally drove his truck back to GAStech at 20:00. Henk Mies went back and forth between GAStech and Abila Airport multiple times, each time stopping for less than a minute, except for one stop at GAStech during 8:00-8:15. At 9:06, he returned to GAStech and stopped there. Valeria visited Carlyle Chemical, Katerina's Cafe and GAStech in sequence during 17:19-19:07, while Benito Hawelon visited Port of Abila, Carlyle Chemical, Abila Airport and GAStech in sequence. This behavior is significant because of the long driving time with almost no stops. The confidence level is 1.

Figure 2.6

Figure 2.6. Pattern 6: Busy truck transportation.

 

Pattern 7: Working in GAStech at night

Lucas Alcazar appeared at GAStech outside office hours four times: 22:11, Jan.6 - 3:20, Jan.7; 21:18, Jan.8 - 3:20, Jan.9; 22:33, Jan.15 - 0:12, Jan.16; and 19:35 - 22:44, Jan.17. He also had a very large bill of $10,000 at Frydos Autosupply n' More on Jan.18. Usually he went home or to some other place in the afternoon, returned to GAStech at night, and finally went home very late. In particular, on the night of Jan.6 he went back at around 1:00, but at 3:20 his GPS recorded for a short time, which indicates that the car was started. The same thing happened on the night of Jan.8. At the least, he had opportunities to do something improper in the company at unusual times. However, because his job title is IT Helpdesk and IT staff often need to work long hours, the confidence level is 2.

Figure 2.7

Figure 2.7. Pattern 7: Working in GAStech at night.

 

Pattern 8: People without cars had frequent bills at the auto supply shop

From Jan.15 to Jan.18, Mat Bramar had a bill at Frydos Autosupply n' More every day. The amounts were $49.21, $276.88, $130.01 and $124.52, respectively. This is suspicious because he did not own a car. Looking at all the people without company-assigned cars, many of them, e.g., Lais Cornela and Mies Haber Ruscela, also had bills at Frydos Autosupply n' More several times. Besides the administrative staff, the three janitors also paid at Frydos Autosupply n' More. This pattern is significant because it conflicts with common sense. However, given their job titles, they might have been paying on behalf of their managers, or they might have had their own cars, so the confidence level is 2.

Figure 2.8

Figure 2.8. Pattern 8: People without cars had frequent bills at the auto supply shop.

 

Pattern 9: Having two houses

Hennie Osvaldo's behavior is suspicious. Besides the two outliers (going out at night, as described in Pattern 5), he shows other special patterns. He alternated between a place near Frydos Autosupply n' More (house1: N.Karanteg St/ N.Edessis St) and another building (house2: N.Karanteg St/N. Edessis St). Every Wednesday (Jan.9, Jan.16) and weekend (Jan.12, Jan.13, Jan.19) he went back to house2, while on the remaining days he went back to house1. Every Wednesday, he did not have supper outside and went directly back to house2. On other days, he went to house2 at around 17:30 and stayed there until around 19:00. Afterwards, he went out for supper and then went back to house1 at around 21:30. Although the two places are not far apart, the pattern occurred regularly. The confidence level is 1 because the periodic behavior is obvious.

Figure 2.9

Figure 2.9. Pattern 9: Having two houses.

 

Pattern 10: Tightly coupled behaviors between two persons

Elsa Orilla and Kanon Herrero had a special relationship, in that they did many things together. In the transaction data, Herrero paid the bill and Orilla presented the loyalty card at every lunch (except on one day, Jan.10). In addition, they show three kinds of trajectory patterns. In the first, Herrero drove to the restaurant while Orilla's car stayed parked at GAStech. This happened every weekday except Jan.10, Jan.14 and Jan.17. In the second, also on weekdays, Orilla drove while Herrero's car stayed parked at GAStech. The third happened on weekends: they drove together to many places, such as shops, parks and museums, and all bills were paid by Herrero. One outlier is that Orilla went to the museum again on Jan.19 and paid herself. This pattern is significant because they went out together every day. The confidence level is 1 because both the transaction data and the GPS log reveal this pattern.

Figure 2.10

Figure 2.10. Pattern 10: Tightly coupled behaviors between two persons.

 

Pattern 11: Going to the hotel with separated bills

At noon on Jan.8, Jan.10, Jan.14 and Jan.15, Isande Borrasca and Brand Tempestad went to the Chostus Hotel. They would leave GAStech a bit earlier than usual, at 11:00, and come back at around 13:45. Usually one set off 10 minutes before the other, and they returned to GAStech in the same staggered way. What is special is that they both paid, more than $100 each. This pattern is significant because a hotel is a special place for a meeting and they went there frequently. They might have been doing something that others did not know about, or even meeting someone together. It might also be related to CEO Sten Sanjorge Jr.'s trips to the Chostus Hotel during Jan.17-Jan.19. The confidence level is 1.

Figure 2.11

Figure 2.11. Pattern 11: Going to the hotel with separated bills.

 

Pattern 12: Visiting special place at noon

Nearly every day (except Jan.6, Jan.12 and Jan.19), Hennie Osvaldo, Minke Mies, Inga Ferro and Loreto Bodrogi left GAStech early at noon and visited a suspicious place before lunch. The places include Frank's Fuel (N. Hallanol Dr/ N. Gerantoni St, 6 times), west of Bean There Done That (N. Camino St/ N. Leeno St, 5 times), south of Hallowed Grounds (N. Maskin St/ N. Agentes St, 6 times), south of Katerina's Cafe (S. Evripidou Ave/ S. Elftherias St, 4 times) and south-west of the Arkadious Park (N. Acera St/ N. Tackan Ave, 5 times). Each day, 1-4 of them would make such a visit, and they usually switched to a different place the next day. This pattern is significant and appears deliberate, because most of these places were far from where they ate. The confidence level is 1.

Figure 2.12

Figure 2.12. Pattern 12: Visiting special place at noon.

 

 

 

 

 

MC2.3 Like most datasets, the data you were provided is imperfect, with possible issues such as missing data, conflicting data, data of varying resolutions, outliers, or other kinds of confusing data.  Considering MC2 data is primarily spatiotemporal, describe how you identified and addressed the uncertainties and conflicts inherent in this data to reach your conclusions in questions MC2.1 and MC2.2.  Please limit your response to no more than five images and 300 words.

 

 

We have identified 4 kinds of uncertainties in the data, as listed below:

Uncertainty 1: POI locations

The geographical locations of POIs are not given. As Figure 3.1 shows, we roughly extracted them from the tourist map, and then refined them based on GPS data.

Figure 3.1

Figure 3.1. Determining the geographical locations of POIs.
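The refinement step can be illustrated with a short Python sketch: start from the rough position read off the tourist map and move it to the median of the GPS stop positions found nearby. This is an assumption about how such a refinement could be automated, not a description of our actual procedure; the function name `refine_poi`, the search radius and all coordinates below are hypothetical.

```python
from math import hypot
from statistics import median

def refine_poi(approx, stops, radius=0.002):
    """Return the median position of the stops within `radius` degrees of `approx`.

    `approx` and each stop are (lon, lat) pairs. Falls back to `approx`
    when no stop is close enough.
    """
    near = [(lon, lat) for lon, lat in stops
            if hypot(lon - approx[0], lat - approx[1]) <= radius]
    if not near:
        return approx
    return (median(p[0] for p in near), median(p[1] for p in near))

# Hypothetical coordinates for illustration only.
approx_cafe = (24.8560, 36.0750)
stops = [(24.8565, 36.0755), (24.8566, 36.0754),
         (24.8564, 36.0756), (24.9000, 36.0500)]
refined = refine_poi(approx_cafe, stops)
print(refined)
```

The median is used instead of the mean so that a few stray stops inside the radius do not pull the refined position away from the main cluster.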

 

Uncertainty 2: GPS errors and missing data

As Figure 3.2 shows, GPS tracks can be systematically shifted. We shifted them back. In addition, we removed noise by down-sampling.

Figure 3.2

Figure 3.2. Fixing GPS errors.
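Both corrections are simple to express in code. The sketch below assumes that the systematic offset of a track is already known (for example, measured against the road network) and that samples are (seconds, lon, lat) tuples; the function names, the offset and the 5-second spacing are illustrative assumptions, not the parameters we used.

```python
def shift_track(track, d_lon, d_lat):
    """Translate every (t, lon, lat) sample by a constant offset."""
    return [(t, lon + d_lon, lat + d_lat) for t, lon, lat in track]

def downsample(track, min_gap_s=5):
    """Keep a sample only if at least `min_gap_s` seconds have passed
    since the last kept sample."""
    kept = []
    for sample in track:
        if not kept or sample[0] - kept[-1][0] >= min_gap_s:
            kept.append(sample)
    return kept

# Hypothetical track: time in seconds, then lon and lat.
raw = [(0, 1.0, 2.0), (1, 1.0, 2.0), (5, 1.5, 2.0), (6, 1.5, 2.0), (12, 2.0, 2.5)]
cleaned = downsample(shift_track(raw, 0.5, -0.25), min_gap_s=5)
print(cleaned)
```

Down-sampling by time is the simplest option; it reduces jitter between consecutive samples but does not remove isolated outliers, which would need a separate filter.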

 

As Figure 3.3 shows, GPS tracks can contain large spatio-temporal jumps, where the information about intermediate movements is missing. We drew such segments with dashed lines on the map, and marked them as “Uncertain” stop events on the timeline.

Figure 3.3

Figure 3.3. Data missing in GPS tracks.
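Such jumps can be flagged automatically by comparing consecutive samples. The following sketch marks a segment as a gap when both the time difference and the great-circle distance exceed a threshold; the thresholds, the function names and the sample track are assumptions chosen for illustration.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def find_gaps(track, max_dt_s=600, max_dist_m=500):
    """Return (t_start, t_end) for consecutive samples that are far apart
    in both time and space. Samples are (seconds, lon, lat) tuples."""
    gaps = []
    for (t0, lon0, lat0), (t1, lon1, lat1) in zip(track, track[1:]):
        if t1 - t0 > max_dt_s and haversine_m(lon0, lat0, lon1, lat1) > max_dist_m:
            gaps.append((t0, t1))
    return gaps

# Hypothetical track with one hour of missing data between the 2nd and 3rd sample.
track = [(0, 24.8500, 36.0500), (10, 24.8501, 36.0500),
         (3610, 24.9000, 36.0800), (3620, 24.9001, 36.0800)]
gaps = find_gaps(track)
print(gaps)
```

A long time difference with almost no displacement is an ordinary stop rather than missing data, which is why the sketch requires both conditions.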

 

Uncertainty 3: Low resolution, conflicts and delay in transaction data

The loyalty card data does not include exact transaction times. However, we were able to determine them by matching the records to the credit card data and stop event data. Still, 52 records remained unmatched; they are labeled on the left of the timeline.

For the same transaction, the price can differ in credit card data and loyalty card data. In our interface, we show both prices to the analysts.
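The credit card part of this matching can be sketched as follows. A loyalty record is paired with an unused credit card record of the same person, date and location, and because prices may differ between the two sources, the price is only used to pick the closest candidate. The record layout (dictionaries with `name`, `location`, `price`, `date` or `time`) and the function name are assumptions for illustration; the matching against stop events is not covered here.

```python
from datetime import date, datetime

def match_loyalty(loyalty, credit):
    """Pair each loyalty record with an unused credit card record that has
    the same person, date and location, preferring the closest price.

    Returns (matched pairs, unmatched loyalty records).
    """
    used = set()
    matched, unmatched = [], []
    for rec in loyalty:
        candidates = [
            (abs(c["price"] - rec["price"]), i)
            for i, c in enumerate(credit)
            if i not in used
            and c["name"] == rec["name"]
            and c["location"] == rec["location"]
            and c["time"].date() == rec["date"]
        ]
        if candidates:
            _, best = min(candidates)
            used.add(best)
            matched.append((rec, credit[best]))
        else:
            unmatched.append(rec)
    return matched, unmatched

# Hypothetical records for illustration only.
credit = [
    {"name": "A", "location": "Cafe", "time": datetime(2014, 1, 6, 7, 30), "price": 12.5},
    {"name": "A", "location": "Cafe", "time": datetime(2014, 1, 6, 12, 10), "price": 30.0},
]
loyalty = [
    {"name": "A", "location": "Cafe", "date": date(2014, 1, 6), "price": 28.0},
    {"name": "A", "location": "Shop", "date": date(2014, 1, 6), "price": 5.0},
]
matched, unmatched = match_loyalty(loyalty, credit)
for loy, cc in matched:
    print(loy["location"], "->", cc["time"], "prices:", loy["price"], cc["price"])
```

Keeping both prices in each matched pair mirrors the choice to show both values to the analyst instead of silently picking one.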

As Figure 3.4 shows, credit card transactions recorded at 12:00 pm and 3:50 am are delayed. We corrected their times according to the stop event data.

Figure 3.4

Figure 3.4. Fixing delays in transaction data.
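One way to express this correction in code is shown below. If a transaction time does not fall inside any stop of that person at the transaction location, it is replaced by the start of the closest same-day stop at that location. Using the stop's start time as the corrected value, as well as the function name and data layout, are assumptions made for this sketch.

```python
from datetime import datetime

def correct_time(tx_time, location, stops):
    """`stops` is a list of (start, end, location) for one person.

    Return `tx_time` if it lies inside a stop at `location`; otherwise the
    start of the same-day stop at that location that is closest in time.
    If there is no such stop, `tx_time` is returned unchanged.
    """
    same_place = [s for s in stops
                  if s[2] == location and s[0].date() == tx_time.date()]
    for start, end, _ in same_place:
        if start <= tx_time <= end:
            return tx_time
    if not same_place:
        return tx_time
    closest = min(same_place, key=lambda s: abs(s[0] - tx_time))
    return closest[0]

# Hypothetical stop events and a delayed transaction stamped 12:00.
stops = [
    (datetime(2014, 1, 13, 7, 30), datetime(2014, 1, 13, 8, 0), "Cafe"),
    (datetime(2014, 1, 13, 12, 5), datetime(2014, 1, 13, 13, 0), "Grill"),
]
fixed = correct_time(datetime(2014, 1, 13, 12, 0), "Cafe", stops)
print(fixed)
```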

 

Uncertainty 4: Car assignments for truck drivers

The car assignments of the truck drivers are not given. We discovered them by visually matching the stop events of trucks with the transaction records of drivers. The matching result is shown in Figure 1.5.
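We did this matching visually, but the same idea can be approximated in code by counting, for each driver, how many of their transactions fall inside a stop of each truck at the same location, and choosing the truck with the most hits. The sketch below is such an approximation, not the procedure we used; the function name, the data layout (times as minutes of the day) and the example values are hypothetical.

```python
from collections import Counter

def match_trucks(truck_stops, driver_txs):
    """truck_stops: {truck_id: [(start, end, location), ...]}
    driver_txs:  {driver: [(time, location), ...]}

    Return {driver: truck_id} where the chosen truck has the most
    transactions of that driver falling inside its stops.
    """
    result = {}
    for driver, txs in driver_txs.items():
        votes = Counter()
        for truck, stops in truck_stops.items():
            for t, loc in txs:
                if any(s <= t <= e and loc == l for s, e, l in stops):
                    votes[truck] += 1
        if votes:
            result[driver] = votes.most_common(1)[0][0]
    return result

# Hypothetical data: times are minutes since midnight.
truck_stops = {
    "truck_x": [(600, 630, "Site A"), (700, 720, "Site B")],
    "truck_y": [(600, 630, "Site B")],
}
driver_txs = {
    "driver_a": [(610, "Site A"), (705, "Site B")],
    "driver_b": [(615, "Site B")],
}
assignment = match_trucks(truck_stops, driver_txs)
print(assignment)
```

Drivers with no transaction inside any truck stop are left out of the result, so they can still be resolved by visual inspection.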

 

 

 

 

 
