Entry Name:"360-PKU-Ma-MC3"

VAST Challenge 2018
Mini-Challenge 3



Team Members:


Ma qi, 360 Enterprise Security Corp, heymarch@qq.com PRIMARY


Wei Xueshi, 360 Enterprise Security Corp, xs.wei@foxmail.com


Li Yiping, 360 Enterprise Security Corp, 835276214@qq.com


Huang Chuanming, 360 Enterprise Security Corp, josjoy0413@gmail.com


Liwenhan Xie, Peking University, xieliwenhan@pku.edu.cn


Zhiyi Yin, Peking University, 1600017832@pku.edu.cn


Xiaoru Yuan, Peking University, xiaoru.yuan@gmail.com


Student Team: NO


Tools Used:


    Visual analytic system developed by our team.


Approximately how many hours were spent working on this submission in total?

    200 hours


May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2018 is complete? YES








1.     Using the four large Kasios International data sets, combine the different sources to create a single picture of the company. Characterize changes in the company over time. According to the company's communications and purchase habits, is the company growing?

Limit your responses to 5 images and 500 words


1.1 Overall personnel picture

To get a brief understanding of the company's composition, we drew eight parallel coordinate graphs, one for each subset of people defined by a degree type. For example, subgraph Ci in Fig. 1-1 shows the degree distribution of people who have received at least one call. From these graphs we deduce that the people involved in email and call records are roughly the same, and that purchase and meeting behaviors, where records are few, follow their own particular patterns. Sellers have few communication or sale records, apart from one prominent outlier, and most people involved in meetings have no other type of connection.


Fig. 1-1: Parallel coordinate graphs comparing different employees. C / E / P / M stand for calls / emails / purchases / meetings respectively. "i" and "o" represent in-degree and out-degree. For instance, Ci means that each item in the parallel coordinate graph corresponds to a person whose in-degree of calls is above zero.
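
The per-person degrees behind the eight subgraphs can be sketched as follows. This is an illustrative sketch, not the code of our system: the record layout (source, kind, target) and the names degree_table and subgraph are assumptions, and a degree here counts records rather than distinct partners.

```python
from collections import Counter, defaultdict

# Hypothetical record layout: (source_id, kind, target_id), where kind is
# "C" (call), "E" (email), "P" (purchase) or "M" (meeting).
records = [
    (1, "C", 2),
    (1, "C", 3),
    (2, "E", 1),
    (3, "P", 4),
    (2, "M", 3),
]

def degree_table(records):
    """Return {person: Counter} with axes such as 'Co' (calls made) and 'Ci' (calls received)."""
    table = defaultdict(Counter)
    for source, kind, target in records:
        table[source][kind + "o"] += 1  # out-degree of the initiating side
        table[target][kind + "i"] += 1  # in-degree of the receiving side
    return table

def subgraph(table, axis):
    """People whose degree on the given axis (e.g. 'Ci') is above zero."""
    return sorted(person for person, degrees in table.items() if degrees[axis] > 0)

table = degree_table(records)
print(subgraph(table, "Ci"))  # people who received at least one call
```

Each of the eight subgraphs then corresponds to one call of subgraph with one of the axes Ci, Co, Ei, Eo, Pi, Po, Mi, Mo.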


We further labelled people in the dataset by the communication (calls and emails), meeting and purchase records they engaged in, and found that most people have a clear division of labour in the company, as summarized below.


Fig.1-2: Employee Component. Light colors represent the proportion of people keeping this kind of records only.


These pie charts refer to the five major roles in the company, i.e. liaison, buyer, seller, meeting initiator and meeting attendee, where light colors stand for those who hold only this kind of record. About 2/3 of the staff handle online communication (calls and emails), and about half of them also make purchases. The remaining 1/3 consists of sellers, meeting initiators and meeting participants. Interestingly, meeting initiators and meeting participants are mutually exclusive.
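
The role labelling can be illustrated with a short sketch. The record layout and the function names are assumptions made for this example; in particular, treating the source of a meeting record as the initiator is our reading of the data, not something the dataset states.

```python
from collections import defaultdict

# (role of the source, role of the target) for each record kind.
# Purchases go from buyer to seller, as in Table 2-2; the meeting
# direction (initiator -> attendee) is an assumption of this sketch.
ROLE_BY_KIND = {
    "call": ("liaison", "liaison"),
    "email": ("liaison", "liaison"),
    "purchase": ("buyer", "seller"),
    "meeting": ("meeting initiator", "meeting attendee"),
}

def label_roles(records):
    """Map each person to the set of roles implied by their records."""
    roles = defaultdict(set)
    for source, kind, target in records:
        source_role, target_role = ROLE_BY_KIND[kind]
        roles[source].add(source_role)
        roles[target].add(target_role)
    return dict(roles)

def exclusive(roles, role):
    """People holding only this role (the light-colored share of a pie chart)."""
    return sorted(person for person, held in roles.items() if held == {role})

sample = [(1, "call", 2), (1, "purchase", 9), (5, "meeting", 6), (2, "email", 1)]
roles = label_roles(sample)
print(roles[1], exclusive(roles, "liaison"))
```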


Fig. 1-3: Pattern of active people, new staff and staff about to quit every month


The chart above illustrates the joining and leaving patterns for every month. "Active people monthly" shows the number of active people each month and their structure: the company enrolled about 15000-20000 new employees every month, and they seldom quit before month 30. It is therefore reasonable to say that the company was expanding during the observed period. "New staff structure" answers the question "Who did they recruit every month?" Based on the classes mentioned above, we classified the new staff. There are 4 main classes: "meet_out", "meet_in", "communication" and "communication+buy_in". The number of people in each of these classes increases, except for the last one.
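
A minimal sketch of how such monthly counts could be derived, assuming that a person "enters" in the month of their first record and "quits" after the month of their last record. The (person, month index) layout and the name monthly_turnover are hypothetical.

```python
from collections import Counter, defaultdict

def monthly_turnover(records):
    """records: iterable of (person_id, month_index) pairs, one per record."""
    first, last = {}, {}
    active = defaultdict(set)
    for person, month in records:
        first[person] = min(month, first.get(person, month))
        last[person] = max(month, last.get(person, month))
        active[month].add(person)
    joined = Counter(first.values())  # people whose first record falls in the month
    left = Counter(last.values())     # people whose last record falls in the month
    return {
        month: {"active": len(people), "new": joined[month], "last_seen": left[month]}
        for month, people in sorted(active.items())
    }

sample = [("a", 0), ("a", 1), ("b", 1), ("b", 2), ("c", 2)]
turnover = monthly_turnover(sample)
print(turnover)
```

The "last_seen" count of the final observed months overstates departures, because everyone still active then is necessarily last seen there.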


1.2 Overall business picture


Fig. 1-4: a line chart showing changes of four businesses over time


The numbers of calls, emails and purchases stayed steady over the two and a half years. Given that more and more staff engaged in company business over time, their work was becoming less intensive. Only meeting records increased over time, which might illustrate the growing demand for meetings as the company enlarged its scale.


1.3 Communication & purchase habits


Fig. 1-5: The stacked bar charts above show the personnel composition of the 4 activity types in every month. Personnel are divided by the month they entered the company (i.e. the time of their first record). The top bar of each month stands for newly enrolled people, while the bottom bar stands for the earliest employees.


Online communication records show low staff mobility, as most new employees stayed engaged in business in the successive months. In contrast, meeting records show high mobility: among the people newly enrolled in a month, only a few kept participating in meetings. To explain this, we infer that the company might be extending its business mainly through meetings.


2.     Combine the four data sources for group that the insider has identified as being suspicious and locate the group in the larger dataset. Determine if anyone else appears to be closely associated with this group. Highlight which employees are making suspicious purchases, according to the insider's data.

Limit your responses to 8 images and 500 words.


To investigate a small subgraph in a large dynamic network in detail, we developed a system called the Traceability Analysis System. It takes each employee as a node and the records provided by the insider as instances of links between nodes. With this system, we can explore the company from an initial point, so we started with the unique suspicious purchase record and then explored the whole suspicious group step by step. On one hand, when a new node is added to the system panel, the interactions between the new node and the existing nodes are shown. On the other hand, the system provides a configuration panel that enables selection based on multiple rules, e.g. link count threshold, common neighbor, relation type, etc. Nodes that are closely associated with the target node or group can therefore be filtered explicitly with a single click. As a result, we found some employees closely associated with the given suspicious group, as listed in Table 2-1.


Note: The time shown in the system is eight hours ahead of the actual time.


As Fig. 2-1 suggests, we followed four steps.

    (1) Retrieve nodes (green) that link to the suspicious group many times.

    (2) Select nodes (purple) that are associated with more than one node in the suspicious group V.

    (3) Add both the potential employees above and the suspicious group into the timeline analysis panel, to check whether their connection times are close.

    (4) Inspect the statistical information of each potential employee, to see the proportion of its links that go to the suspicious group.


Fig. 2-1: Analysis Steps


We then obtained the nodes that are closest to the suspicious group.


Fig. 2-2: The group closely associated with the suspicious group


Name                 Description
Sheilah Stachniw     connected with 8 targets, many records to V and mostly to V, and 1 suspicious purchase
Sherrell Biebel      connected with 8 targets, many records to V and mostly to V
Ferne Hards          connected with 5 targets, many records to V and mostly to V
Martha Harris        connected with 5 targets, many records to V and mostly to V
Madeline Nindorf     connected with 4 targets, mostly link to V
Timothy Gibson       connected with 4 targets, and 6 suspicious purchases
Jane Tyler           connected with 3 targets, and 2 suspicious purchases
Juan Walsh           connected with 3 targets
Sherilyn Coopwood    connected with 3 targets
Jaunita Westen       connected with 3 targets
Terrilyn Overkamp    connected with 3 targets

Table 2-1: People that associate with the suspicious group closely.


To find the suspicious purchase records, we added the relevant people into the timeline panel for further investigation. Combining their activities and statistical information, we identified the exceptional purchase records.


Fig. 2-3: All activities of the suspicious group and all purchases of the suspicious group.


Sheilah Stachniw (786361) is associated only with people in the suspicious group and one other supplier. At a crucial point, when members of the suspicious group were communicating with each other, she made two purchases.


Fig. 2-4: Sheilah Stachniw (786361)


Timothy Gibson (1376868) holds the most suspicious purchase records. First, he had a call with Richard Fox (857138) at 0:21 a.m. on 20th June, 2015, and bought things from Gail Feindt six minutes later. Second, he sent an email to Meryl Pastuch (1690582) at 6:45 p.m. on 10th September, 2015, and within half an hour of the email he bought things from Gail Feindt twice. Third, he sent an email to Tobi Gatlin (969089) at 8:20 p.m. on 20th January, 2017, with a purchase from Gail Feindt around half an hour earlier. Last, he had a call with Lindsy Henion (1108217) at 6:22 a.m. on 8th December, 2017 and bought things twice in the next two hours.


Fig. 2-5: Timothy Gibson (1376868)


Jane Tyler (713701) emailed Richard Fox (857138) at 9:44 p.m. on 24th August, 2015 and bought things from Gail Feindt immediately afterwards. Similar behavior occurred at 8:18 a.m. on 6th July, 2017: a call to Tobi Gatlin (969089) followed by a purchase from Gail Feindt.


Fig. 2-6: Jane Tyler (713701)


Source              Target         Time                   Event
Sheilah Stachniw    Gail Feindt    2017-12-04 13:05:50    in the crucial period
Sheilah Stachniw    Gail Feindt    2017-12-06 03:53:23    in the crucial period
Timothy Gibson      Gail Feindt    2015-06-20 00:26:52    immediately after a call
Timothy Gibson      Gail Feindt    2015-09-10 18:53:11    after an email
Timothy Gibson      Gail Feindt    2015-09-19 19:22:28    after an email
Timothy Gibson      Gail Feindt    2017-01-20 20:04:58    half an hour before an email
Timothy Gibson      Gail Feindt    2017-12-08 07:51:43    after a call
Timothy Gibson      Gail Feindt    2017-12-08 08:29:50    after a call
Jane Tyler          Gail Feindt    2015-08-24 21:45:27    immediately after an email
Jane Tyler          Gail Feindt    2017-07-06 08:32:15    after a call

Table 2-2: Suspicious purchase records.
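
The "purchase close to a contact" pattern in Table 2-2 can be checked mechanically. The sketch below is only an illustration under stated assumptions: the function name purchases_near_contact and the two-hour window are hypothetical, the contact list is assumed to contain only contacts with members of the suspicious group, and contact times are taken to the minute as quoted in the text, so the gaps are approximate.

```python
from datetime import datetime, timedelta

def purchases_near_contact(contacts, purchases, window=timedelta(hours=2)):
    """Flag purchases made within `window` of a contact involving the buyer.

    contacts:  list of (person_a, person_b, datetime)
    purchases: list of (buyer, seller, datetime)
    Returns (buyer, seller, time, gap); a positive gap means the purchase
    came after the nearest contact, a negative gap means before it.
    """
    flagged = []
    for buyer, seller, bought_at in purchases:
        gaps = [
            bought_at - when
            for person_a, person_b, when in contacts
            if buyer in (person_a, person_b) and abs(bought_at - when) <= window
        ]
        if gaps:
            flagged.append((buyer, seller, bought_at, min(gaps, key=abs)))
    return flagged

contacts = [
    ("Timothy Gibson", "Richard Fox", datetime(2015, 6, 20, 0, 21)),
    ("Jane Tyler", "Richard Fox", datetime(2015, 8, 24, 21, 44)),
    ("Timothy Gibson", "Tobi Gatlin", datetime(2017, 1, 20, 20, 20)),
]
purchases = [
    ("Timothy Gibson", "Gail Feindt", datetime(2015, 6, 20, 0, 26, 52)),
    ("Jane Tyler", "Gail Feindt", datetime(2015, 8, 24, 21, 45, 27)),
    ("Timothy Gibson", "Gail Feindt", datetime(2017, 1, 20, 20, 4, 58)),
    ("Sheilah Stachniw", "Gail Feindt", datetime(2017, 12, 4, 13, 5, 50)),
]
flagged = purchases_near_contact(contacts, purchases)
for row in flagged:
    print(row[0], row[2], row[3])
```

Sheilah Stachniw's purchase is not flagged here only because no contact of hers is included in the sample input.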


3.     Using the combined group of suspected bad actors you created in question 2, show the interactions within the group over time.

a. Characterize the group's organizational structure and show a full picture of communications within the group.

b. Does the group composition change during the course of their activities?

c. How do the group's interactions change over time?

Limit your responses to 10 images and 1000 words


a. Organizational structure


Fig. 3-1: The whole picture of all suspects and people who connect closely with them


In Fig. 3-1, we label people by their position inside or outside the group. Orange labels stand for the known suspect group. Pink stands for people who connect with more than one person in the suspect group. Green stands for people who connect with only one person in the group, but more than once. The only blue node is the biggest supplier of goods.


We can identify the general interaction pattern between the known group and the other suspects we found:

    1. Several people (857138, 1690582, 1108217) have dense edges to other suspects, and we call them organizers inside the group.

    2. Other people in the known group mainly communicate with a few particular nodes outside the group (981554, 175354), and we call those nodes organizers outside the group.


Then, we downsize the group to a core group, and generate a layout based on their status in the group (Fig. 3-2).


Fig. 3-2: Group status


Pink stands for outside organizers. Orange stands for inside organizers. Grey stands for people doing communications and purchases, which is the prevailing class in the dataset. Green stands for people doing communications and initiating meetings. Yellow stands for pure liaisons.


In this layout, we noticed that the organizers are in unique classes (e.g. 857138 is the only one in the dataset with communication, meeting-initiating and purchase records). Their suspicious records confirm that they are key members of the whole suspect group. In particular, they have some strange meeting records with other group members. As another example, 981554 is the person closest to the known group.


b. Composition Change


To illustrate the inner composition change in this section, we generate a concise layout including the original suspicious group and several important extensions.

Color annotation for parts b and c:

    Orange: Original suspicious group

    Pink: Closely connected suspect with purchase records

    Blue: Closely connected suspect without purchase records

    Green: Closely connected suspect with far more total records than other people in the graph


Fig. 3-3: Phase 1: May - November 2015


In this phase, the suspicious group didn't act much. In the concise version of the expanded group shown in Fig. 3-3, only the inside organizers (857138 and 1690582) have connections with green nodes. We consider this an incubation period before the group was really established, so other subordinate personnel did not appear in this period.


Fig. 3-4: Phase 2: November 2015 - January 2016


In this phase, nearly all members in the original group participated in various activities. It is worth noting that the blue node is very closely related to the team members during this time. We can infer that these "outsiders" engaged in group activity at a very early time.


Fig. 3-5: Phase 3: February 2016 - June 2017


During this relatively long period of time, the interaction between the members of the extended team is relatively sparse, and the main interaction takes place in the main members of the group. However, this does not lead us to conclude that some members have withdrawn from the group. Basically, everyone is still in contact, but the frequency is reduced.

As this is a long time period, we extract three successive months so that the time span is comparable with the other phases.


Fig. 3-6: Phase 4: July - December 2017


In the final phase, the core members of the group (the orange nodes near the center of the picture) once again became collectively active and made more contact. This phase differs from the first intensive contact phase for 2 reasons:

    1. Peripheral orange nodes did not participate extensively in interactions.

    2. Blue nodes seldom participate in interactions.

In conclusion, after the first activity peak (Phase 2), the composition of the group didn't change much. Most members did not disappear for long; they just engaged in activities at varying frequencies.


c. Interaction Change


To observe the specific interaction details between the members, it is necessary to reduce the number of nodes in the graph as much as possible. For this reason, we divide the expanded group into three parts, shown in Fig. 3-7:

    1. Outward: People who have many more records than the others, undertaking the task of communicating outside the group.

    2. Inward: People who don’t have many records. They mainly connect with other group members.

    3. Supplier: i.e. 2038003.
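
One way to make this division reproducible is sketched below. In our analysis the split was made by inspection, so the rule used here (a member is "outward" if their record count exceeds a multiple of the group median) and the names split_group and factor are assumptions of this sketch.

```python
from collections import Counter
from statistics import median

def split_group(records, members, supplier, factor=3.0):
    """Assign each member to 'outward', 'inward' or 'supplier'.

    records: iterable of (source_id, target_id) pairs, one per record.
    Returns {member: (part, total records, share of records inside the group)}.
    """
    total = Counter()
    internal = Counter()
    for source, target in records:
        for node, other in ((source, target), (target, source)):
            if node in members:
                total[node] += 1
                if other in members:
                    internal[node] += 1
    # Typical record count of a non-supplier member.
    typical = median(total[m] for m in members if m != supplier)
    parts = {}
    for member in sorted(members):
        if member == supplier:
            part = "supplier"
        elif total[member] > factor * typical:
            part = "outward"
        else:
            part = "inward"
        share = internal[member] / total[member] if total[member] else 0.0
        parts[member] = (part, total[member], share)
    return parts

# Toy IDs: member 4 has many records with people outside the group.
members = {1, 2, 3, 4, 9}
sample = [(1, 2), (2, 3), (3, 1), (1, 9), (4, 1)] + [(4, 100 + i) for i in range(20)]
parts = split_group(sample, members, supplier=9)
print(parts)
```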


Fig. 3-7: Overview of interactions inside the expanded suspicious group over time


Then, we import the records within the Inward group only, and those of the Inward group within the whole expanded group, for comparison, as shown in Fig. 3-8.


Fig. 3-8: (a) Records within the Inward group (b) Records of the Inward group in the whole expanded group


From the two timelines in Fig. 3-8, we can clearly see the two activity peaks of the suspect group: the first from Nov. 2015 to Jan. 2016, and the second in Sep. 2017. Looking at these two peaks from a macro view, we find the following behavior patterns:


1. Each of the two concentrated events began with a multi-person meeting (light green). This suggests that a meeting marks the start of a suspicious activity arranged by this group. However, the people attending the two meetings were significantly different, perhaps reflecting a difference in the purpose of the two events.


Fig. 3-9: Overview of interactions for all 4 types


2. Shortly after the first meeting, calls made up the majority of communication within the group; after the first peak of events, the main communication method became email.


Fig. 3-10: Interactions of the second peak


3. Fig. 3-10 shows the interactions of the second peak in detail. The connection between Rosalia Larroque (1847246) and Kerstin Beveal (728286) is the main theme. It should be noted that Kerstin Beveal was very active in the later period. He also had five consecutive conversations with Sherrell Biebel (981554) in mid-May 2017 and participated in the second multi-person meeting, which makes him a key suspect. Rosalia Larroque first made a call to Jenice Savaria (2038003), then made a purchase. In the following month, Rosalia maintained close contact with Kerstin.


4. Compared with the first peak, people from the Inward group have abnormally many connections outside the group, and most of those connections are emails. In particular, on December 4th, 2017, the team members interacted with the outside world on a large scale, which also heralded the end of the group's activities.



4.     The insider has provided a list of purchases that might indicate illicit activity elsewhere in the company. Using the structure of the first group noted by the insider as a model can you find any other instances of suspicious activities in the company? Are there other groups that have structure and activity similar to this one? Who are they? Each of the suspicious purchases could be a starting point for your search. Provide examples of up to two other groups you find that appear suspicious and compare their structure with the structure of the first group. The structures should be presented as temporal not just structural (i.e., the sequence of events - A is followed by B one or two days later - will be important).


Limit your responses to 10 images and 1200 words


In summary, we gained the following knowledge from the questions above, and we use it to answer this question.


1. For organizational structure, there are core members in the suspicious group whose distances to the other group members range from 1 to 2, and are mostly 1. Group members associate with each other many times. Above all, the group can be divided into three parts, i.e. bargain suppliers, the outwards and the inwards (see Fig. 3-7).


a. Bargain suppliers (the blue node) are the destination of purchases.


b. Outward members (yellow and green nodes) have purchase records and many other records associating with people outside the group (the green nodes represent people with a higher proportion of outside associations).


c. Inward members mostly communicate within the group and have no purchase records. Besides, they have fewer records than the outwards.


2. For temporal structure, there are sudden bursts of associations between the outwards and the inwards. In particular, a purchase tends to be accompanied by meetings and frequent communications.


Therefore, we set up the following strategy for finding groups U similar to the first group V.


1. Start from a small number of nodes S.


2. Select nodes N that link tightly to S according to the following criteria (weighted by order) and add them into S iteratively.

a. a large number of repeated edges;


b. multiple association targets in S;


c. records between N and S make up more than half of the records of N;


d. sudden association could be observed in the timeline;


e. preferably no purchase records.
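
The iterative expansion can be sketched as a greedy loop. This is a simplified illustration rather than our actual procedure, which was carried out interactively in the system: criterion d (sudden association in the timeline) is judged visually and is left out, and the name expand_group, the thresholds and the tie-breaking order are assumptions.

```python
from collections import Counter, defaultdict

def expand_group(records, seeds, buyers=frozenset(),
                 min_links=3, min_targets=2, min_share=0.5, max_rounds=10):
    """Grow a group from `seeds`, adding one best candidate per round.

    records: list of (source_id, target_id) pairs, one per record.
    buyers:  people with purchase records (criterion e: preferably avoided).
    """
    group = set(seeds)
    for _ in range(max_rounds):
        total = Counter()
        to_group = Counter()
        targets = defaultdict(set)
        for source, target in records:
            for node, other in ((source, target), (target, source)):
                if node in group:
                    continue
                total[node] += 1
                if other in group:
                    to_group[node] += 1
                    targets[node].add(other)
        scored = []
        for node, links in to_group.items():
            share = links / total[node]
            # Criteria a (repeated edges), b (multiple targets), c (share above one half).
            if links >= min_links and len(targets[node]) >= min_targets and share > min_share:
                # Criterion e: non-buyers sort before buyers.
                scored.append((node in buyers, -len(targets[node]), -links, node))
        if not scored:
            break
        group.add(min(scored)[3])
    return group

sample = ([(3, 1), (3, 1), (3, 2), (4, 3), (4, 3), (4, 1), (4, 50),
           (5, 1), (5, 1), (5, 1), (6, 1), (6, 2), (6, 2)]
          + [(6, 60 + i) for i in range(5)])
found = expand_group(sample, seeds={1, 2})
print(sorted(found))
```

In the sample, node 4 only qualifies after node 3 has joined the group, which is why the selection has to be repeated rather than applied once.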


The six extra purchase records involve four people in all, so we use these four people as our starting points. We thus found four suspicious groups, shown in Fig. 4-1 to Fig. 4-4 and the name lists below.


Group 1 members:
Cora Cross
Donnetta Lapoint
Terrilyn Overkamp
Lucy Herrera
Jerome Jordan
Trevor Webb
Alesha Aschenbrenner
Archie Griffies
Tyree Barreneche

Group 2 members:
Gregory Russell
Beth Wilensky
Angelic Graetz
Anjelica Hoger
Cora Gonzalez
Prudence Rosol
Edgar McCormick
Renae Hilbrand
Abbey Rhead
Zachary Hampton
Olivia Brown
Amelia Colon
Karyl Snobeck
Sherrl Brensnan
Indira Fugua
Layla Mostad
Birdie Pioch
Nikia Wilebski
Cecilia Pichette
Ollie Andrews
Amada Faul
Virgie Pratt
Fonda Bursch
Katharine Santos
Valentine Klette
Ilona Barros
Concha Goodall
Lupe Gullatt

Group 3 members:
Carlos Morris
Laure Pelkley
Courtney Wiedemann
Yong Wilbert
Arthur Fox
Marjorie Halbach
Daria Housten
Renae Hilbrand
Merlene Tessier
Roger Beck
Virginia Buchanan
Sharmaine Lofredo
Rena Jerabek
Jayden Walters
Dorothea Kulback
Marc Bowen
Alan Sedotal
Elva Ingram
Jessica Pokoj
Omar Tako
Regina Bordoy
Sherlyn Delcine
Lulu Larson
Dina Fairy
Kathie Matheu
Sha Pardoe
Cathleen Kucinski


Fig. 4-1: Group 1, found from Trevor Webb (580766)



Fig. 4-2: Group 2, found from Beth Wilensky (437025)



Fig. 4-3: Group 3, found from Laure Pelkley (695013)



Fig. 4-4: Group 4


As for suspicious activities, an abnormal phenomenon occurred on 4th December, 2017, when a large number of associations took place between the inward nodes of each group and people outside that group.


Fig. 4-5: Suspicious events happened around 4th December, 2017.


In the overview of the four groups, they are similar in their record distribution over time, and also match the suspicious group. First, their purchase records take place at the end of the timeline. Second, there seems to be an invisible line in the middle of the timeline, with the number of records varying a lot before and after it.


However, compared with the suspicious group in Fig. 3-7, there are many differences. Group 1 is smaller and has fewer records. Group 2 has remarkable outward nodes, and hence its timeline view is relatively more complicated. Group 3 behaves rather stably. As for Group 4, it is highly similar to the suspicious group in both organizational and temporal structure.


Fig. 4-6: Comparison of the four groups.