Entry Name: gicentre-wood-gc

VAST Challenge 2019

Grand Challenge

Team Members:
Jo Wood, giCentre, City, University of London, j.d.wood@city.ac.uk PRIMARY

Student Team: No

Tools Used:
LitVis, developed by the giCentre (integrating Vega and Vega-Lite with Elm and Markdown), for narrative and visualization document creation.
*nix command-line tools (awk, sed, cut etc.) for some data cleaning.

Approximately how many hours were spent working on this submission in total? c. 80 hours for all three Mini challenges and Grand Challenge (treated as a single integrated process)

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2019 is complete? Yes.

Video


Note: This document was created in LitVis - a Literate Visualization environment to support visual design and analysis exposition. This answer page is supported by a series of litvis documents providing design and analysis provenance for this and all three mini challenges. They can be found at https://github.com/jwoLondon/vastchallenge2019 (released after VAST challenge deadline has passed).

Questions

Question GC1

Generate a master timeline of events and trends during the emergency response. Indicate where it is uncertain and which data underlies that uncertainty.

Main timeline

Figure 1: Timeline of main events

Information for the timeline was generated from interaction with the figures described in the mini challenges. Interaction often allowed precise times to be identified easily (e.g. contamination events, Y*INT messages etc.). The most valuable visualizations for constructing the timeline are summarised below.

Seismic

All damage reports

Figure 2: Timeline of damage reports.

Yellow circles provided direct evidence of shake timings, along with more general clusters of reporting. Details were confirmed through interactive query of Y*INT messages containing 'shake' and related keywords.

Infrastructure

building and road damage

Figure 3: Damage report timelines with buildings, roads and bridges highlighted and most severely damaged neighbourhoods selected.

Spatial variations were supported by gridmaps of damage reports (Figure 4).

gridmap of damage reports

Figure 4: Spatial gridmap of damage reports for period between the two main seismic events.

Filtering of damage reports enabled the main infrastructure problems to be identified, supported by Y*INT message queries (Figure 5).

Rubble freq

Figure 5: Message content generated by FieldEngineerPhillipCarter with more reliable infrastructure reports.

Health / wellbeing

Most details here provided by interactive Y*INT message query of both reliable sources and health related keywords (e.g. Figures 5 and 6).
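The keyword queries used here can be sketched as a simple case-insensitive text filter. Note the record layout (time, account, text) and the sample messages below are illustrative assumptions for exposition, not the actual Y*INT data schema:

```python
# Hypothetical message records; the real Y*INT data has its own fields.
messages = [
    ("2020-04-06 14:40", "FieldEngineerPhillipCarter",
     "Earthquake? Feeling something shaking"),
    ("2020-04-08 09:12", "resident42", "No running water in Old Town"),
    ("2020-04-08 09:30", "resident9", "Is the water safe to drink?"),
]

def keyword_query(msgs, keywords):
    """Return messages whose text contains any of the given keywords,
    case-insensitively (mirrors the interactive keyword filtering)."""
    kws = [k.lower() for k in keywords]
    return [m for m in msgs if any(k in m[2].lower() for k in kws)]

water_msgs = keyword_query(messages, ["water"])
```

In the interactive litvis documents the equivalent filter is bound to a text-entry widget, so the matching messages update as the keyword is typed.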

Water messages

Figure 6: Messages containing keyword water

Radiological

Most useful summary data were provided by the CUSUM plots of static and mobile sensors (Figures 7 and 8). Additional spatio-temporal event details (e.g. Wilson Forest Highway contamination) provided by interactive animation of mobile sensor trajectories (Figure 9).
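As a rough illustration of how such CUSUM traces behave, the sketch below implements a standard one-sided upper CUSUM. The target baseline, slack value `k` and the sample readings are illustrative assumptions, not the actual sensor parameters or data:

```python
def cusum(readings, target, k=0.5):
    """One-sided upper CUSUM: accumulate deviations above target + k.
    Noise around the baseline keeps the statistic near zero, while a
    sustained rise (e.g. contamination) drives it steadily upward."""
    s = 0.0
    trace = []
    for x in readings:
        s = max(0.0, s + (x - target - k))
        trace.append(s)
    return trace

# Stable baseline around 15 cpm, then a sustained jump to ~25 cpm.
baseline = [15, 14, 16, 15, 15]
elevated = [25, 26, 25, 24, 25]
trace = cusum(baseline + elevated, target=15.0)
```

Plotting `trace` against time gives the characteristic flat-then-climbing shape seen in Figures 7 and 8, which makes systematic radiation changes easy to distinguish from transient noise.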

Static sensors at end of period

Figure 7: Static sensor CUSUM plots indicating status at end of recording period.

Mobile sensors at end of period

Figure 8: Mobile sensor CUSUM plots indicating status at end of recording period.

Mobile sensors Wilson Highway

Figure 9: Still from animation of trails of the Wilson Highway event vehicles. The short trail north of the main (pink) hotspot shows the trajectory of mobile sensor 30. The trail to the east is the route all vehicles take to enter and leave St Himark.

Question GC2

Identify and explain cases where data from multiple mini-challenges help to resolve uncertainty, and identify cases where data from multiple mini-challenges introduces more uncertainty. Present up to 10 examples. If you find more examples, prioritize those examples that you deem most relevant to emergency response.

Forequake (Monday 14:35)

Reports from the Shake app (MC1) contain considerable noise and therefore uncertainty, particularly when the density of Shake reports is low. During the first seismic event (Monday 14:35) the intensity of reported shake magnitude went down, not up (see Figure 10). While it was clear that reports at this time differed from those in the hours before and after, it was not certain whether they indicated a real seismic event.

It was only with confirmation from the Y*INT messages (e.g. in Figure 5 above, "Earthquake? Feeling something shaking" from FieldEngineerPhillipCarter) that it was possible to build some confidence that this was a real minor seismic event.

Shake reports Monday pm

Figure 10: Shake reports showing noise between intensity 0 and 1 and a lowering of intensity values Monday pm.

Bridge Closures

Examination of Y*INT messages on bridge closing and opening (Figure 11) provided precise times for when bridges were open or closed to traffic. However, trajectories of mobile sensors (MC2) show that some bridges had no traffic at any time (Figure 12). This calls into question one or both of the data sources. Given the 50 vehicle-mounted sensors, the weight of evidence favours the trajectories, suggesting the Y*INT messages may be less reliable than originally anticipated.

Bridge status

Figure 11: Automatic detection of open and close bridge related messages (MC3) showing apparent closing (and later opening) of Magritte Bridge.

bridge traffic

Figure 12: Mobile vehicle-based trajectories showing no traffic on Magritte and 12th July bridges at any stage in the measuring period.

Road Damage

Y*INT messages show numerous (and re-messaged) reports of blocked roads preventing vehicular passage (e.g. Figure 13). These are usually without specific spatial reference (e.g. road name), so can paint an overgeneralised and sometimes alarmist picture of transport status. The mobile sensor trajectories (MC2) provide a useful indication of real-time road access that adds detail and clarity in both space and time. For example, Figure 14 shows some of the accessible roads early Thursday afternoon.

road messages

Figure 13: Y*INT messages reporting blocked roads.

Thursday trajectories

Figure 14: Snapshot from animation of mobile vehicle-based trajectories (Thursday 12:20 - 12:50) showing partial picture of accessible roads.

Question GC3

Are there instances where a pattern emerges in one set of data before it presents itself in another? Could one data stream be used to predict events in the others? Provide examples you identify.

Y*INT messages have the potential to catch some patterns before they manifest in other streams (e.g. Shake app reports). The challenge is to identify reliable message accounts, as many accounts exaggerate or re-message old or inaccurate information (e.g. see Figure 12 in MC3, showing a wide variation in reported fatalities).

One strategy is to identify the originating author of a widely re-messaged post. This approach led to identifying FieldEngineerPhillipCarter as an early and reliable source of damage reports and health risks (e.g. Figure 15). This can be particularly useful in pre-empting Shake App damage reports that are delayed by power outages and server problems.
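The re-messaging strategy can be sketched as counting how often each account is quoted as the source of a re-message. The record layout and the "re <account>: <text>" quoting convention below are purely illustrative assumptions; the real Y*INT format differs:

```python
from collections import Counter

# Hypothetical records: (time, account, text). Re-messages are assumed
# here to quote the originator as "re <account>: <text>" -- a sketch of
# the strategy only, not the actual Y*INT message format.
messages = [
    ("t1", "FieldEngineerPhillipCarter",
     "Bridge damage on 12th July Bridge"),
    ("t2", "newsBot",
     "re FieldEngineerPhillipCarter: Bridge damage on 12th July Bridge"),
    ("t3", "resident7",
     "re FieldEngineerPhillipCarter: Bridge damage on 12th July Bridge"),
]

def remessage_sources(msgs):
    """Count how often each account is quoted in re-messages, surfacing
    widely re-messaged (and therefore candidate reliable) originators."""
    counts = Counter()
    for _, _, text in msgs:
        if text.startswith("re "):
            counts[text[3:].split(":", 1)[0]] += 1
    return counts

top = remessage_sources(messages).most_common(1)
```

Accounts surfaced this way still need checking against independent streams (e.g. sensor data) before being trusted, but the ranking narrows the search considerably.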

Phillip Carter messages

Figure 15: Message content generated by FieldEngineerPhillipCarter.

The CUSUM plots used in MC2 to show systematic changes in radiation levels provide a good early warning of potential problems. In this particular disaster, the radiological risks did not manifest themselves in other data streams (other than some messages on anticipated problems with the Power plant). But they could be used to anticipate road closures or the need to provide emergency evacuation should problems be more severe.

Question GC4

The data for the individual mini-challenges can be analyzed either as a static collection or as a dynamic stream of data, as it would occur in a real emergency. Were you able to bring analysis on multiple data streams together for the grand challenge using the same analytic environment? Describe how having the data together in one environment, or not, affected your analysis of the grand challenge.

Central to the analytic process for all challenges was an integrated design, analysis and presentation framework using Literate Visualization. Applied in this context, it is an example of literate visual analytics, where the processes of data cleaning, visualization design, data analysis and findings are recorded explicitly and regarded as part of the same intertwined process (see Figure 16). This provides a record of visual analytic provenance, giving a more robust and verifiable justification of findings.

Literate visual analytics environment

Figure 16: Litvis: Integrated visualization specification, design exposition, analysis and reporting environment. The live code on the left generates the formatted output on the right.

Figure 17 shows the literate visual analytic document structure used for all challenges. Initial exploration of the problem across all three challenges was recorded in diary.md (a markdown document). This provided a 'stream of consciousness' record of early design choices and impressions of the data. Data were assembled along with common visual design components in dataAndConfig.md, inheritable by all documents used in the challenge. More focussed analysis was later recorded in each of the mc1, mc2 and mc3 Exploration documents. The final submission reports (mc1Answers.md etc.) inherited these documents and so could reuse visualizations generated within them.

To structure exploration and report writing, the entire hierarchy of documents conforms to a narrative schema (vastChallenge.yml) that contains elements such as question, observation and design that format and structure the visual analytic narrative. Rules may also be added to the schema, for example requiring that every hypothesis be accompanied by text or visualizations providing evidence.

Literate visual analytics structure

Figure 17: Document structure for the VAST challenge design, analysis and reporting process.

Data were not only integrated in a common VA environment, but visualization components were also shared between challenges. In particular, a common spatial and temporal framework was used throughout. For higher-precision spatial data requiring geographic interpretation, a common basemap component was used (Figure 18).

Context map

Figure 18: Context map used in multiple challenges

For lower-precision geospatial referencing, a gridmap base (Figure 19) was used into which other visualizations could be embedded.

Grid map

Figure 19: Gridmap for neighbourhood-based geospatial location

As almost all data were referenced by time, a common timeline with day/night guides and interactive tooltip query was used across challenges to support comparison (Figure 20).

Timeline

Figure 20: Timeline with example tooltip query

Central to the approach in completing the challenge has been the principle that visual analytic design is part of the same process as interpretation and reporting of findings.