NEVAC — Wiki.viz

VAST 2008 Challenge
Mini Challenge 1: Wiki Editors

Authors and Affiliations:

Chi-Chun Pan, The Pennsylvania State University, [PRIMARY contact]
Don Pellegrino, Drexel University,
Chris Weaver, The Pennsylvania State University, [Faculty advisor]
Prasenjit Mitra, The Pennsylvania State University, [Faculty advisor]

Student team:


Improvise is a desktop application for building and browsing a wide range of flexible and powerful visual analysis tools. Live design of visual queries facilitates fast and flexible interactive drill-down into fine-grain relationships buried in spatiotemporal and social network information spread across multiple data sets. Cross-filtering queries across multiple views provide analysts with the means to seek out and dissect subtle patterns in complex information spaces.


We used Improvise to build wiki.viz, an interactive visualization of the wiki edit history data. The interface supports analysis of both the unstructured data and the information extracted from it by automated text processing, such as reverting networks and word senses.


Two Page Summary:   NO




Wiki-1: What are the factions represented in the edit pages and who are its members? In other words, describe the groups and their members based on their editing changes.  





Supporters

VictoriaV, RyogaNica, Amado


Challengers

Edemir, Rm99, DailosTamanca

Neutral contributors

Agustin, Sara

Page removers

195.113.65.x, Alejo, 75.179.21.x, 84.158.202.x, Alejandrosanchez, 209.155.27.x, Molotover, 66.175.135.x, 201.226.51.x, Honoratas, 74.120.3.x, 85.135.211.x, 67.55.3.x, Rosamaria, Absalon, 204.52.215.x, 74.130.152.x, 139.55.50.x, 69.14.85.x

People who only revert edits

Kurrop, Seina


Detailed Answer:



In the first step, we created a set of rules to extract the revert patterns from the summary field of each edit. We classified the revert patterns into three types. The first type is “revert”: the author of the edit undid others’ work and returned the page to a previous version. For example, if the pattern “Reverted 1 edit by [author2] …” appears, we obtain the triple (author, revert, author2). The second type is “undo”. Unlike a revert, an undo removes a single edit without simultaneously undoing all constructive changes that have been made since. An example of an “undo” pattern is “Undid revision xxxxxxxxx by [author2] …”. Both “revert” and “undo” patterns indicate that the author of the edit disagreed with the changes made by author2.

The third type is “revertTo”. Unlike the other two types, a “revertTo” edit suggests that the author of the edit considers the version written by the named author better than the current version. A “revertTo” pattern sometimes appears together with a “revert” pattern. For example, from a summary containing the pattern “reverting possible vandalism by [author2] to last good revision by [author3]”, we obtain two triples: (author, revert, author2) and (author, revertTo, author3).

From the extracted patterns, we built a reverting graph in which the nodes are authors and the edges are the three types of patterns extracted from the wiki history. We created a graph view to visualize the reverting graph (see Figure 1). By selecting authors from the lists of edit authors and revision authors, we can visually analyze the interactions between users.
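The extraction rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the rule set we actually used: the regular expressions are modeled on the three example summaries quoted above, author names are assumed to contain no spaces, and the function names and sample author names are hypothetical.

```python
import re
from collections import Counter

# Assumption: author names consist of word characters and dots (covers masked IPs).
NAME = r"([\w.]+)"

# Patterns modeled on the three example summaries quoted in the text.
VANDALISM_RE = re.compile(
    r"reverting possible vandalism by " + NAME + r" to last good revision by " + NAME,
    re.IGNORECASE,
)
REVERT_RE = re.compile(r"reverted \d+ edits? by " + NAME, re.IGNORECASE)
UNDO_RE = re.compile(r"undid revision \d+ by " + NAME, re.IGNORECASE)


def extract_triples(author, summary):
    """Return (author, type, target) triples found in one edit summary."""
    m = VANDALISM_RE.search(summary)
    if m:
        # One summary yields both a revert and a revertTo triple.
        return [
            (author, "revert", m.group(1).rstrip(".")),
            (author, "revertTo", m.group(2).rstrip(".")),
        ]
    triples = []
    m = REVERT_RE.search(summary)
    if m:
        triples.append((author, "revert", m.group(1).rstrip(".")))
    m = UNDO_RE.search(summary)
    if m:
        triples.append((author, "undo", m.group(1).rstrip(".")))
    return triples


def build_reverting_graph(edits):
    """Count triples over (author, summary) pairs; counts serve as edge weights."""
    edges = Counter()
    for author, summary in edits:
        edges.update(extract_triples(author, summary))
    return edges


# Hypothetical sample summaries, not taken from the dataset.
sample = [
    ("alice", "Reverted 1 edit by bob to an earlier version"),
    ("alice", "Undid revision 123456789 by carol (talk)"),
    ("dave", "reverting possible vandalism by bob to last good revision by carol."),
]
graph = build_reverting_graph(sample)
```

Keeping each triple as a key of a counter gives the edge weight directly, which corresponds to the edge width drawn in the graph view.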



Figure 1. Graph view of the reverting network. On the right-hand side, analysts can select the authors of interest and the revision types for analysis.


To identify groups in the wiki edit history, we started with the discussion page. We first tried to categorize authors into three groups: supporters, challengers, and neutral contributors. By reading the discussion page, we obtained a list of the users who contributed to it (see Table 1) and their opinions regarding the Paraiso Manifesto. There are 19 distinct authors on the discussion page. We removed the authors who appear only once on the discussion page because all of them are inactive in the dataset.


Table 1. List of users who appear on the discussion page




Lacks Objectivity and Neutrality


Lacks Objectivity and Neutrality, Civility


Lacks Objectivity and Neutrality, Edit warring


POV Pushing, Civility




sect status


sect status


Edit warring


Edit warring




The supporters are the people who agree with the point of view of the Paraiso Manifesto. Among these users, VictoriaV is the most active. Although VictoriaV presented herself as neutral with respect to the Paraiso Manifesto, her many arguments with others on the discussion page belie that claim. For example, VictoriaV argued with Agustin and Edemir about the source of Catalano’s enlightenment. In addition, Rm99 claimed that VictoriaV had lied about her background and her relation to the Paraiso Manifesto. Her comments make it clear that she supports Catalano’s point of view. Therefore, we tag VictoriaV as a supporter of the Paraiso movement.


Most of the 387 users made only one edit, which makes it difficult to judge their opinion of the Paraiso Manifesto. Therefore, when analyzing the reverting graph, we selected only users who made more than 3 edits. With time filtering, we can use animation to replay the interactions between users in the reverting graph. The types of reverting actions are revert, undo, and revertTo, and we can filter by type by selecting Revision Types in the visualization. We started our analysis with the users found in the preceding analysis of the discussion page. If two users revert or undo each other several times (the width of an edge indicates the number of revisions), we put them into different groups. Conversely, if we see a revertTo link between two users, we tag them as members of the same group. In this way, we found two more users in the supporters group: RyogaNica and Amado. The reverting graph shows that RyogaNica made several revisions to the edits of Rm99, DailosTamanca, and Edemir (all identified as challengers). Amado made many edits, and VictoriaV made a revertTo revision to one of Amado’s edits. After reading Amado’s comments in the table view of the wiki edit history (Figure 2), we suspect that Amado is a supporter of the Paraiso movement who tried to maintain the wiki page.
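The grouping heuristics above were applied by hand in the graph view, but they can be sketched as a simple label propagation. The sketch below is an approximation under stated assumptions: the function name, the `min_conflicts` threshold standing in for “several times”, and the sample user names are all hypothetical, and seeds are assumed to be labeled either supporter or challenger.

```python
from collections import deque

OPPOSITE = {"supporter": "challenger", "challenger": "supporter"}


def propagate_groups(edges, edit_counts, seeds, min_conflicts=2):
    """Spread seed labels over the reverting graph.

    edges: mapping of (source, type, target) -> count
    edit_counts: mapping of user -> number of edits
    seeds: mapping of user -> "supporter" or "challenger"
    min_conflicts: assumed threshold for "several" revert/undo actions
    """
    # Only users with more than 3 edits are considered, as in the text.
    active = {user for user, n in edit_counts.items() if n > 3}

    allies = set()   # pairs linked by a revertTo edge
    conflicts = {}   # pair -> number of revert/undo actions in either direction
    for (src, kind, dst), count in edges.items():
        if src == dst or src not in active or dst not in active:
            continue
        pair = frozenset((src, dst))
        if kind == "revertTo":
            allies.add(pair)
        else:
            conflicts[pair] = conflicts.get(pair, 0) + count

    neighbors = {}
    for pair in allies:
        a, b = tuple(pair)
        neighbors.setdefault(a, []).append((b, True))
        neighbors.setdefault(b, []).append((a, True))
    for pair, count in conflicts.items():
        if count >= min_conflicts:
            a, b = tuple(pair)
            neighbors.setdefault(a, []).append((b, False))
            neighbors.setdefault(b, []).append((a, False))

    # Breadth-first propagation: same label across revertTo, opposite across conflicts.
    labels = dict(seeds)
    queue = deque(seeds)
    while queue:
        user = queue.popleft()
        for other, same_group in neighbors.get(user, []):
            if other in labels:
                continue
            labels[other] = labels[user] if same_group else OPPOSITE[labels[user]]
            queue.append(other)
    return labels


# Hypothetical example, not taken from the dataset.
edges = {
    ("u1", "revertTo", "seed"): 1,
    ("u2", "revert", "u1"): 2,
    ("u1", "undo", "u2"): 1,
    ("u3", "revert", "seed"): 1,
}
edit_counts = {"seed": 10, "u1": 5, "u2": 6, "u3": 4, "u4": 1}
labels = propagate_groups(edges, edit_counts, {"seed": "supporter"})
```

A label assigned this way is only a starting hypothesis; as with Amado, reading the user’s own comments remains the deciding step.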




Figure 2. Table view of the wiki edit history sorted by source (author).




Having tagged VictoriaV as a supporter, we could easily tag Rm99 and Edemir as challengers based on their wording on the discussion page. Rm99 questioned the credibility of the external links provided by VictoriaV. Rm99 also claimed that “The controversy surrounding Catalano has been downplayed considerably.” In addition, based on the conversation in the Civility section of the discussion page, we suspect that Rm99 sometimes edited the wiki page from IP address 81.96.243.x. By analyzing the reverting graph, we confirmed that Edemir and VictoriaV have different agendas regarding the Paraiso movement. We also identified DailosTamanca as a challenger because DailosTamanca and RyogaNica reverted each other several times.





If we cannot clearly judge a user’s agenda, we can select the user in the author list and filter the edits by author name. This allows us to read all the edits made by that user. Using this approach, we concluded that Agustin and Sara may be ordinary wiki users who tried to maintain a neutral point of view on the wiki page.




We also found some interesting patterns by creating a time series visualization of the size of the wiki page. The time series shows several sudden decreases in page size (Figure 3). The time series at the bottom shows the maximum, minimum, and average page size for each day. By sorting the edits by page size in the table view of the wiki edit history, we found a group of users who did not contribute to the content of the wiki page. Instead, they replaced the wiki page with short sentences. Most of them did not log in with a registered username. We think those users may fit the profile of the protesters mentioned in the wiki page.



Figure 3. Time series view of the page size. The time series at the bottom shows the maximum, minimum, and average page size for each day.
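The daily statistics plotted in Figure 3 and the search for page removals can be approximated with the sketch below. Note the substitution: we found the page removers by sorting edits by page size in the table view, whereas this sketch flags any edit that shrinks the page below an assumed fraction (`drop_ratio`, a hypothetical parameter) of the previous revision. The function names and sample records are also hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_size_stats(edits):
    """Return {day: (max, min, mean)} of page size from (timestamp, author, size) records."""
    by_day = defaultdict(list)
    for timestamp, _author, size in edits:
        by_day[timestamp.date()].append(size)
    return {day: (max(s), min(s), mean(s)) for day, s in sorted(by_day.items())}


def find_page_blanking(edits, drop_ratio=0.1):
    """Flag edits that shrink the page below drop_ratio of the previous revision's size."""
    ordered = sorted(edits, key=lambda e: e[0])
    suspects = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev[2] > 0 and cur[2] < drop_ratio * prev[2]:
            # (timestamp, author, size before, size after)
            suspects.append((cur[0], cur[1], prev[2], cur[2]))
    return suspects


# Hypothetical records, not taken from the dataset.
edits = [
    (datetime(2006, 9, 1, 10, 0), "u1", 90000),
    (datetime(2006, 9, 1, 11, 0), "anon", 120),
    (datetime(2006, 9, 1, 11, 5), "u2", 90500),
]
stats = daily_size_stats(edits)
suspects = find_page_blanking(edits)
```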




Finally, we compared how often each user appears in the reverting graph with the number of edits made by that user. We found two users (Kurrop and Seina) who only revised others’ edits and did not add any information to the wiki page.
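One way to express this comparison is to count, for each user, the total number of edits and the number of edits that are reverting actions, and keep the users for whom the two counts are equal. The sketch below assumes each edit has already been marked as a reverting action or not; the function name and sample names are hypothetical.

```python
from collections import Counter


def revert_only_users(edits):
    """Return users whose every edit is a reverting action.

    edits: iterable of (author, is_revert) pairs, one pair per edit.
    """
    total = Counter()
    reverts = Counter()
    for author, is_revert in edits:
        total[author] += 1
        if is_revert:
            reverts[author] += 1
    return sorted(author for author in total if reverts[author] == total[author])


# Hypothetical example, not taken from the dataset.
sample = [("u1", True), ("u1", True), ("u2", True), ("u2", False), ("u3", False)]
only_reverting = revert_only_users(sample)
```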


Wiki-2:  Is the Paraiso movement involved in violent activities? 


List of wiki edits providing evidence


# (cur) (last) 03:16, 19 September 2006 Alphanzo (Talk | contribs) m (moved Paraiso to GUNNED DOWN SIX DOCTORS AND NURSES IN COLD BLOOD)


Short Answer:


To identify violent activities, we extracted the words used by users in the wiki edit history. Our hypothesis is that violent activities can be recognized by certain cue words. By removing stop words and applying a stemming algorithm, we obtained 1279 unique words. We then applied natural language processing tools to the comments in the wiki edit history and assigned part-of-speech tags to the 1279 words. After inspecting those words, we manually assigned a real value between -1.0 and 1.0 to selected words (the default value for every word is 0.0). A negative value indicates that a word is more likely to be used to describe violent activities (such as hate and attack). A positive value indicates that a word is more likely to be used to describe non-violent activities (such as good and agree). When users and words of interest are selected in the table views of authors and words, the graph view becomes an author-word graph, which provides an intuitive way to visualize who uses which words in the wiki edit history (Figure 4). By analyzing the author-word graph and the summaries in the wiki edit history, together with the steps used to identify groups of people, we found the edit that may link the Paraiso movement to violent activities.
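The word-processing steps can be sketched as follows. This is a simplified illustration, not our actual pipeline: the stop-word list is a tiny placeholder, `crude_stem` is a stand-in for a real stemming algorithm, the part-of-speech tagging step is omitted, and the valence values, function names, and sample comments are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Tiny placeholder stop-word list (assumption; a real list is much longer).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "by", "is", "was", "on"}

# Hypothetical manual scores in [-1.0, 1.0]; negative marks cue words for violence.
VALENCE = {"attack": -1.0, "hate": -1.0, "good": 0.5, "agree": 0.5}


def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def author_word_edges(edits):
    """Build author-word edges, weighted by usage count, from (author, comment) pairs."""
    edges = Counter()
    for author, comment in edits:
        for token in re.findall(r"[a-z]+", comment.lower()):
            if token not in STOP_WORDS:
                edges[(author, crude_stem(token))] += 1
    return edges


def score_authors(edges, valence):
    """Sum word scores per author; words without a manual score default to 0.0."""
    scores = defaultdict(float)
    for (author, word), count in edges.items():
        scores[author] += valence.get(word, 0.0) * count
    return dict(scores)


# Hypothetical comments, not taken from the dataset.
sample = [
    ("alice", "They attacked the clinic"),
    ("bob", "I agree, good edit"),
]
edges = author_word_edges(sample)
scores = score_authors(edges, VALENCE)
```

The per-author sum is an addition for illustration only; in wiki.viz the scores were used to pick words of interest for the author-word graph, and the edits themselves were then read by an analyst.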


# (cur) (last) 03:16, 19 September 2006 Alphanzo (Talk | contribs) m (moved Paraiso to GUNNED DOWN SIX DOCTORS AND NURSES IN COLD BLOOD)


It is possible that someone with firsthand knowledge of the execution tried to reveal the Paraiso movement’s criminal activity. Based on the posts made before Alphanzo’s edit, we suspect that the confrontation between Paraiso members and the Dept. of Health may have led to the violent activity.


# (cur) (last) 03:09, 19 September 2006 Edemir (Talk | contribs) (97,530 bytes) (→Home Health Care - added confrontation of Paraiso members and Dept of Health)


We also suspect this message may be related to the incident that happened in the evacuation mini challenge.


Other wiki edits that point to potential violent activities include references to the prosecution of Paraiso by Belgium and to Catalano's death. However, there is no direct evidence of violence associated with either of these events.


# (cur) (last) 09:26, 4 September 2006 Angelgasperi (Talk | contribs) (93,439 bytes) (→Controversy and criticism - Belgium prosecuting, wikinews source)


# (cur) (last) 11:40, 12 November 2006 Danielrengelm (Talk | contribs) (105,333 bytes) (at least since Catalano is dead, but even b4 that others contributed as well, even if the I.C. might belittle their impact)



Figure 4. Author-word graph showing who uses which words in the wiki edit history.
