Entry Name:  "HKUST-Qiao-MC3"

VAST Challenge 2017
Mini-Challenge 3



Team Members:

Hang YIN, The Hong Kong University of Science and Technology, hyinac@connect.ust.hk

Chengzhong LIU, The Hong Kong University of Science and Technology, cliubf@connect.ust.hk

Qiao GU, The Hong Kong University of Science and Technology, qgu@connect.ust.hk                        PRIMARY

Lian CHEN, The Hong Kong University of Science and Technology, lchenbk@connect.ust.hk

Haotian LI, The Hong Kong University of Science and Technology, hlibg@connect.ust.hk

Xuanwu YUE, The Hong Kong University of Science and Technology, xuanwu.yue@gmail.com

Huamin QU, The Hong Kong University of Science and Technology, huamin@cse.ust.hk


Student Team: YES


Tools Used:



Python scripts with matplotlib library, written by the student team.


Approximately how many hours were spent working on this submission in total?

40 days * 2 hours/day = 80 hours


May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2017 is complete? YES







1. Boonsong Lake resides within the preserve and has a length of about 3000 feet (see the Boonsong Lake image file). The image of Boonsong Lake is oriented north-south and is an RGB image (not six channels as in the supplied satellite data). Using the Boonsong Lake image as your guide, analyze and report on the scale and orientation of the supplied six-channel satellite images. How much area is covered by a pixel in these images? Please limit your answer to 3 images and 500 words.

From the supplied images we create false-color composites of bands B1, B5 and B6, because the lakes are easier to recognize in this combination than in other band combinations. By sampling lake pixels we obtain the range of RGB values that lakes take in the composite, and we then filter the image on those ranges so that only the lake areas remain. We repeat these steps on 3 other distinct images and then apply noise reduction to make the lakes stand out and suppress everything else.

The analysis shows that the preserve contains five major lakes, which differ in shape and extent. In Fig 1.1, the green area represents the lakes and the blue area represents everything else.

   Fig 1.1
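The band selection and range filtering described above can be sketched as follows. This is a minimal illustration, not the team's script: the channel order (B1, B5, B6 mapped to R, G, B), the function names and the numeric ranges are all assumptions made for the example.

```python
import numpy as np

def false_color(bands, order=(0, 4, 5)):
    """Stack three bands of an (H, W, 6) array into an (H, W, 3) composite.

    The default order picks B1, B5, B6 (zero-based indices 0, 4, 5);
    which band feeds which color channel is an assumption.
    """
    return np.stack([bands[..., i] for i in order], axis=-1)

def range_mask(rgb, low, high):
    """True where every channel lies within [low, high] (inclusive)."""
    low = np.asarray(low)
    high = np.asarray(high)
    return np.all((rgb >= low) & (rgb <= high), axis=-1)

# Synthetic 2x2 six-band image; the values and ranges are illustrative only.
bands = np.zeros((2, 2, 6), dtype=np.uint8)
bands[0, 0] = [30, 0, 0, 0, 10, 5]      # lake-like pixel
bands[1, 1] = [200, 0, 0, 0, 150, 90]   # other land cover

composite = false_color(bands)
lake = range_mask(composite, low=(20, 0, 0), high=(60, 20, 20))
```

In practice the `low` and `high` bounds would come from sampling known lake pixels in the composite, as the text describes.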

By comparing shapes with the Boonsong Lake image, we find that the lake in the lower-left corner is most likely Boonsong Lake. Having located it, we measure its length by reading off the coordinates of its topmost and bottommost pixels in MATLAB (the two readings are combined into one picture only so that both sets of coordinates can be shown together). The coordinates identified in Fig 1.2 are (159, 476) and (159, 506), so the lake spans 506 - 476 = 30 pixels. The lake has the same orientation in the satellite images as in the Boonsong Lake image, so the supplied satellite images are also oriented north-south, and the side length of one pixel is 3000/30 = 100 feet. One pixel in the satellite images therefore covers 100*100 = 10000 square feet.

VAST%20Challenge%202017/Mini-challenge%203/159整合.png   Fig 1.2
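The scale arithmetic can be checked with a few lines of Python. The function name is hypothetical; the coordinates and the 3000-foot length are the ones quoted in the text.

```python
import math

def pixel_scale(top, bottom, length_ft):
    """Return (feet per pixel side, square feet per pixel) from two endpoints."""
    (x0, y0), (x1, y1) = top, bottom
    pixels = math.hypot(x1 - x0, y1 - y0)  # pixel distance between the endpoints
    side = length_ft / pixels
    return side, side * side

side_ft, area_sqft = pixel_scale((159, 476), (159, 506), 3000)
```

Both endpoints share the same x coordinate, so the measured span is purely vertical in the image, which is consistent with the north-south orientation.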

2. Identify features you can discern in the Preserve area as captured in the imagery. Focus on image features that you are reasonably confident that you can identify (e.g., a town full of houses may be identified with a high confidence level). Please limit your answer to 6 images and 500 words.

We apply two methods to identify features – manual classification and supervised classification.

Manual Classification:

Different kinds of area have different combinations of band values, so we can separate them manually by thresholding on specific values.

1)     Vegetation: This is the most evident pattern in the images. By calculating NDVI and keeping the pixels with NDVI > 0, we obtain the areas covered by vegetation. Vegetation is most abundant in summer or autumn, so we apply the filter to image11 to get the result in Fig 2.1, where green marks vegetation while blue and black stand for other elements.

              Contrast_Stretch/NDVI_contrast_stretch/image11_2016_09_06_NDVI.png   Fig 2.1
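The NDVI filter can be sketched as below. It uses the standard definition NDVI = (NIR - Red) / (NIR + Red); which supplied bands hold near-infrared and red is not stated in this text, so the function takes them as explicit arguments, and the sample values are made up.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); NaN where both bands are zero."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    total = nir + red
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Illustrative 2x2 bands, not real data.
nir = np.array([[200, 50], [0, 90]], dtype=np.uint8)
red = np.array([[100, 150], [0, 90]], dtype=np.uint8)

index = ndvi(nir, red)
vegetation = index > 0  # NaN compares as False, so empty pixels are excluded
```

Converting to floating point before subtracting matters here: subtracting unsigned 8-bit arrays directly would wrap around for pixels where red exceeds near-infrared.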


2)     Five lake areas: We construct the false-color composite of bands B1, B5 and B6. We choose these bands because the lakes stand out more clearly than in other false-color images, which makes it convenient to sample the RGB value range of the lake areas. Once we have the ranges, we apply the filter to exclude other elements, or at least minimize their effect. The lakes are shown in green in Fig 2.2 and other elements are shown in blue.

              lakerefine2.png   Fig 2.2


3)     Road: We use an approach similar to the one applied to the lakes, this time constructing true-color images from bands B1, B2 and B3. After determining the RGB value range of roads, we apply the filter to bring out the roads. In Fig 2.3 the green parts differ in brightness: the roads in the middle are brighter than the ones beside them. We therefore assume that the roads in the middle are highways built above the ground, since they appear straighter and wider in the picture. The less evident roads are likely roads inside the preserve, where plants may cover the road so that only parts of it are visible.


4)     Towns of houses and bare land or rocks: Keeping the pixels with NDVI < 0 separates out the elements other than vegetation. Since we have already separated the lake and road areas, we mask them out to obtain the towns of houses together with bare land or rocks, as shown below (Fig 2.4).

housesToRefine.png   Fig 2.4

Supervised classification:

We first label the given image with the features that we can identify with the naked eye, for example the lakes, the clouds and the vegetation (with the help of the NDVI images). Given enough samples, we let the machine estimate a Gaussian probability distribution for each training class. The remaining pixels are then assigned by maximum likelihood: each pixel falls into the class under which it has the highest probability. Take image06 as an example. The result of the supervised classification is shown in Fig 2.5, where yellow represents lakes, green represents vegetation, blue represents roads and red represents bare land. This method gives a general categorization of the whole area.

   Fig 2.5
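A Gaussian maximum-likelihood classifier of the kind described can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the team's implementation: it assumes equal class priors, adds a small ridge term to keep covariances invertible, and uses made-up two-band training samples and hypothetical function names.

```python
import numpy as np

def fit_gaussians(samples):
    """samples: {label: (n, bands) array} -> {label: (mean, inv_cov, log_det)}."""
    models = {}
    for label, x in samples.items():
        x = np.asarray(x, dtype=np.float64)
        mean = x.mean(axis=0)
        # Small ridge (an assumption) keeps the covariance invertible
        # when a class has only a few training pixels.
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
        _, log_det = np.linalg.slogdet(cov)
        models[label] = (mean, np.linalg.inv(cov), log_det)
    return models

def classify(pixels, models):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    pixels = np.asarray(pixels, dtype=np.float64)
    labels = list(models)
    scores = []
    for label in labels:
        mean, inv_cov, log_det = models[label]
        d = pixels - mean
        maha = np.einsum("ij,jk,ik->i", d, inv_cov, d)  # squared Mahalanobis distance
        scores.append(-0.5 * (maha + log_det))          # constant term dropped
    return [labels[i] for i in np.argmax(np.stack(scores), axis=0)]

# Made-up two-band training pixels for two classes.
training = {
    "lake": [[10, 5], [12, 6], [11, 4], [9, 7]],
    "vegetation": [[80, 120], [85, 118], [78, 125], [82, 121]],
}
result = classify([[11, 5], [81, 122]], fit_gaussians(training))
```

For a full image, the (H, W, 6) array would be reshaped to (H*W, 6) before calling `classify`, and the labels reshaped back to (H, W) for display.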

3. There are most likely many features in the images that you cannot identify without additional information about the geography, human activity, and so on. Mitch is interested in changes that are occurring that may provide him with clues to the problems with the Pipit bird. Identify features that change over time in these images, using all channels of the images. Changes may be obvious or subtle, but try not to be distracted by easily explained phenomena like cloud cover. Please limit your answer to 6 images and 750 words.

1)     The health and coverage of the vegetation vary over the years. Using the given NDVI formula, and masking all irrelevant elements, especially the black lines in the lower-right corner (one of the RGB values of the black lines is 0, which would distort NDVI in those areas), we generate histograms of NDVI for the pixels with NDVI > 0 across the 12 images, shown in Fig 3.1.

       NDVI_hist_gt0.png   Fig 3.1


From the histograms of images 02, 06 and 10, we notice that the mean NDVI increases from 2014 to 2015 and then decreases slightly from 2015 to 2016. We choose the summer images because vegetation is most abundant in summer; images 03 and 07 are not used because they contain a large amount of cloud, which would seriously disturb our judgement. NDVI thus indicates that vegetation health improves from 2014 to 2015 and then worsens from 2015 to 2016.


Apart from the changes across years, we can also observe changes in vegetation within a single year. Take 2016 as the example, since its four images are the least disturbed. The mean NDVI first increases in spring, then stays about the same, and finally decreases in winter. Moreover, NDVI values are concentrated in the interval from 0 to 0.05 in spring and winter, but in the interval from 0.1 to 0.3 in summer and autumn.
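The per-image statistics behind this comparison can be sketched as below: the mean and histogram of NDVI over pixels that are both valid (not part of the masked black lines) and have NDVI > 0. The function name, bin count and sample values are assumptions for illustration.

```python
import numpy as np

def positive_ndvi_summary(ndvi, valid, bins=20):
    """Mean and histogram of NDVI over valid pixels with NDVI > 0."""
    values = ndvi[valid & (ndvi > 0)]
    counts, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    return values.mean(), counts, edges

# Illustrative NDVI values; the last pixel stands for a masked black-line pixel.
ndvi_values = np.array([0.02, 0.04, 0.27, -0.3, 0.17, 0.9])
valid = np.array([True, True, True, True, True, False])

mean_ndvi, counts, edges = positive_ndvi_summary(ndvi_values, valid)
```

A validity mask such as `np.all(bands > 0, axis=-1)` on the six-band array would exclude pixels where any channel is 0, matching the masking rule described in 1). Running the summary on each of the 12 images gives the means that are compared across seasons and years.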


2)     B5 is known to be strongly absorbed by liquid water. Consequently, the B5 value should be relatively small where lakes are not frozen and large where they are frozen (an icy surface is expected to reflect all kinds of light).


We extract all pixels that fall in the lake area (about 7600 pixels; see the previous answers for the “lake area”), read their B5 values and compute the average for each image, obtaining the table in Fig 3.2.


The averages fall into two groups: one higher than 120 and one lower than 60. Note that the data of image03 are discarded because of the heavy cloud cover.


The lakes are therefore frozen in images 01, 05, 09 and 12, corresponding to March, February, March and December, which agrees with common knowledge about the northern hemisphere.

../../../Picture1.png   Fig 3.2
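The lake-averaging step can be sketched as follows. The cut-off of 90 is a hypothetical midpoint between the two reported groups (below 60 and above 120), not a value from the analysis, and the arrays are made up.

```python
import numpy as np

def mean_b5_over_lakes(b5, lake_mask):
    """Average B5 value over the pixels flagged as lake."""
    return float(b5[lake_mask].mean())

def looks_frozen(mean_b5, threshold=90.0):
    """Hypothetical cut-off halfway between the two reported groups (60 and 120)."""
    return mean_b5 > threshold

# Illustrative 2x2 data: three lake pixels and one land pixel.
lake_mask = np.array([[True, True], [False, True]])
winter_b5 = np.array([[130, 140], [20, 150]], dtype=np.uint8)
summer_b5 = np.array([[30, 40], [200, 50]], dtype=np.uint8)

winter_mean = mean_b5_over_lakes(winter_b5, lake_mask)
summer_mean = mean_b5_over_lakes(summer_b5, lake_mask)
```

Because the same lake mask is reused for every image, the averages are directly comparable across dates.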

3)     Since Band 6 reflects differences in soil mineral content, we can compare two images through their changes in band 6. We make two heat-maps of the band 6 differences, one between image02 and image06 and one between image06 and image10 (i.e. B6 of image02 minus B6 of image06, and B6 of image06 minus B6 of image10). In Fig 3.3, red means the difference is larger than 0, with deeper red for a larger difference, while green means the difference is smaller than 0, with deeper green for a larger absolute value. From observing the 12 images, we find that the mineral content is more abundant when the value of B6 is larger. We conclude that the mineral content of the vegetation area increases from summer 2014 to summer 2015 and decreases from summer 2015 to summer 2016, which corresponds to the change in vegetation described in 1).


              ../../../Picture1.png   Fig 3.3
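The signed band difference behind such a heat-map can be sketched as below. The sketch assumes the bands are stored as unsigned 8-bit values, which is why it widens to a signed type before subtracting; the arrays are illustrative, not real data.

```python
import numpy as np

def band_difference(earlier, later):
    """Signed difference earlier - later, computed in int16 so that
    unsigned 8-bit inputs do not wrap around on negative results."""
    return earlier.astype(np.int16) - later.astype(np.int16)

# Illustrative 2x2 B6 values for two dates.
b6_image02 = np.array([[100, 40], [10, 255]], dtype=np.uint8)
b6_image06 = np.array([[60, 90], [10, 0]], dtype=np.uint8)

diff = band_difference(b6_image02, b6_image06)
```

One way to render the result is matplotlib's `imshow` with a diverging colormap such as `RdYlGn_r` and symmetric `vmin`/`vmax`, so that positive differences appear red, negative ones green, and zero sits at the center of the scale.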


4)     Since B5 reflects the moisture content of soil and vegetation, we apply a similar approach to obtain the difference heat-maps between image02 and image06 and between image06 and image10, shown in Fig 3.4. We conclude that the moisture of soil and vegetation increases from 2014 to 2015 and then decreases from 2015 to 2016, which corresponds to the changes in NDVI and mineral content. Note that the regions with opposite colors but the same shape are clouds and their shadows, which are of no use for this observation.

              ../../../Picture1.png   Fig 3.4

5)     In image09_2016_03_06 we observe an anomaly that may indicate flooding or a very large amount of rainfall. From the given image we construct the false-color composite of bands 5, 4 and 2, a combination that can be used to reveal floods or newly burned land, shown in Fig 3.5. The colors in this composite indicate the humidity of each area. In normal images vegetation appears green, as in the image on the left, which is the false-color composite of image 06. In image 09, however, the image is mostly blue, which indicates that water covers most of the preserve.

              ../../../Picture1.png   Fig 3.5
