Problem statement
Visualizing multiple (meteorological) raster features on the same map is a challenging problem. The two typical approaches are raster overlay and multiple linked views. From the user's perspective, raster overlay is preferable, because multiple views split the user's attention, especially when the maps are animated.
The question then becomes how to overlay rasters in a smart way. Traditional raster overlay in GIS is meaningful for only a few specific tasks. This project explores solutions based on machine learning techniques.
Potential solution
The general idea is inspired by "focus + context" thinking. A key assumption is that users have one or more target variables and a set of conditional variables. Target variables are those the users are interested in, and conditional variables are those that affect the target variables in some way. Users may also care about how the conditional variables affect the target variables.
The solution is to first group pixels by their conditional variables, and then visually encode the conditional and target variables with different visual elements (at this stage, different color channels). The context, i.e. the state of the conditional variables, is thus shown as groups, while the focus stays on the target variable.
We assume that each view shows one target variable. Generating a timeframe involves the following steps:
- Preprocess the whole dataset: clean the input data and map all data onto the same projection and grid.
- Group grid cells of the whole dataset: build a vector of conditional variables for each grid cell and group the cells in this vector space.
- Index the grid cells of the current timeframe based on the grouping result from the previous step.
- Visually encode the pixel groups, e.g. with different hues.
- Visually encode the target variable with another channel, e.g. color lightness.
- Create a legend that includes descriptions of each group and the target variable.
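The indexing and encoding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a settled design: the function names `nearest_group` and `encode_frame`, the nearest-centre assignment, the lightness range of 0.25 to 0.85, and all example values are hypothetical. The group centres are taken as given here. Only the Python standard library is used (`colorsys` for the HLS to RGB conversion).

```python
import colorsys
import math


def nearest_group(vector, centroids):
    """Index of the group centre closest to `vector` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(vector, centroids[k]))


def encode_frame(conditional, target, centroids, t_min, t_max):
    """Turn one timeframe into a grid of RGB triples (each channel in 0..1).

    conditional: grid (list of rows) of conditional-variable vectors
    target:      grid of target values with the same shape
    centroids:   group centres obtained from the whole dataset (assumed given)
    t_min/t_max: target range over the whole dataset, so lightness is
                 comparable between timeframes
    """
    n_groups = len(centroids)
    span = (t_max - t_min) or 1.0  # avoid division by zero for a constant target
    image = []
    for cond_row, target_row in zip(conditional, target):
        row = []
        for vector, value in zip(cond_row, target_row):
            group = nearest_group(vector, centroids)
            hue = group / n_groups  # one evenly spaced hue per group
            # Normalise the target to 0..1, clamp, then map to lightness 0.25..0.85
            frac = min(1.0, max(0.0, (value - t_min) / span))
            lightness = 0.25 + 0.6 * frac
            row.append(colorsys.hls_to_rgb(hue, lightness, 1.0))
        image.append(row)
    return image


# Hypothetical 1 x 2 timeframe with two conditional variables per cell.
centroids = [(0.0, 0.0), (10.0, 10.0)]
conditional = [[(1.0, 0.5), (9.0, 11.0)]]
target = [[0.0, 20.0]]
image = encode_frame(conditional, target, centroids, t_min=0.0, t_max=20.0)
```

In this example the first cell falls into group 0 with the lowest target value and comes out dark red, while the second cell falls into group 1 with the highest target value and comes out light cyan. Keeping lightness away from 0 and 1 is a deliberate choice in the sketch, since the hue is no longer distinguishable at pure black or white.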
Based on the above design, we have made the preliminary decisions listed below. Note that all of these decisions should be tested in future demos.
- Groups are generated from the whole dataset rather than from the current frame only. This ensures that the legend stays constant over time.
- Coordinates are not included as conditional variables at this stage. We make this decision because we assume that atmospheric and climate phenomena are usually spatially continuous, so grouping on the conditional variables alone should already give spatially coherent groups. Phenomena also move over time, and since the grouping is done on the whole dataset, a fixed location is not a meaningful grouping feature. Including coordinates can be explored later as a way to smooth the visual result spatially.
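The first decision can be illustrated with a small sketch: the groups are fitted once on the vectors pooled from all timeframes, and each timeframe is then only indexed against those fixed groups, so a given group id (and its legend entry) means the same thing in every frame. k-means is used purely as a stand-in, since the grouping method is still open. The function names, the deterministic initialisation, and the example data are assumptions made for illustration. Coordinates are deliberately left out of the vectors.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means with a deterministic, evenly spaced initialisation."""
    step = len(vectors) // k
    centroids = [tuple(vectors[i * step]) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: collect the members of each group
        members = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: math.dist(v, centroids[j]))
            members[nearest].append(v)
        # Update step: move each centre to the mean of its members;
        # an empty group keeps its previous centre
        centroids = [
            tuple(sum(col) / len(m) for col in zip(*m)) if m else centroids[j]
            for j, m in enumerate(members)
        ]
    return centroids


def assign(frame, centroids):
    """Group id of every grid cell in one timeframe."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(v, centroids[j]))
        for v in frame
    ]


# Two hypothetical timeframes; each cell holds two conditional variables.
# In practice the variables would be rescaled to comparable ranges first.
frames = [
    [(1.0, 1.0), (1.2, 0.8), (9.0, 9.0)],
    [(8.8, 9.2), (9.1, 9.1), (0.9, 1.1)],
]
pooled = [v for frame in frames for v in frame]   # whole dataset, no coordinates
centroids = kmeans(pooled, k=2)                   # fitted once
labels = [assign(frame, centroids) for frame in frames]  # reused for every frame
```

Fitting per frame instead would give group ids that are not comparable between frames, which is what would force the legend to change over time.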
Next action
- Develop a quick demo for a single timeframe: mainly explore the visual encoding of conditional and target variables. The current setup combines hue and lightness, but there might be better choices.
- Develop a quick demo for animating over time: mainly test the scenario of tracking specific groups/objects.
- Explore methods of grouping grid cells: methods under consideration include clustering, dimensionality reduction, and image segmentation. This list is to be completed. We also need a way of evaluating grouping results, which can borrow from the way clustering results are evaluated.
- Design methods of automatically generating descriptions of groups: can be statistical or visual approaches. To be investigated.
- Explore how to smooth the visual result at the spatial level.
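As a starting point for the evaluation and description items above, the sketch below computes the mean silhouette coefficient, one common measure for clustering results, and a simple statistical description of each group (the per-group mean of every conditional variable). Both choices are assumptions rather than decisions, and the function names, variable names, and data are hypothetical.

```python
import math
from statistics import mean


def split_by_group(vectors, labels):
    """Collect the vectors of each group id."""
    groups = {}
    for v, g in zip(vectors, labels):
        groups.setdefault(g, []).append(v)
    return groups


def silhouette(vectors, labels):
    """Mean silhouette coefficient (-1..1, higher is better).

    Requires at least two groups.
    """
    groups = split_by_group(vectors, labels)
    scores = []
    for v, g in zip(vectors, labels):
        own = groups[g]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton groups
            continue
        # a: mean distance to the other members of the same group
        # (the distance of v to itself is 0, hence the len(own) - 1)
        a = sum(math.dist(v, o) for o in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other group
        b = min(
            mean(math.dist(v, o) for o in other)
            for h, other in groups.items() if h != g
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def describe_groups(vectors, labels, names):
    """Per-group mean of each conditional variable, rounded for a legend."""
    return {
        g: {name: round(mean(col), 2) for name, col in zip(names, zip(*members))}
        for g, members in sorted(split_by_group(vectors, labels).items())
    }


# Hypothetical grid-cell vectors with two conditional variables.
vectors = [(1.0, 1.0), (1.2, 0.8), (9.0, 9.0), (8.8, 9.2), (9.1, 9.1), (0.9, 1.1)]
good = [0, 0, 1, 1, 1, 0]  # separates the two obvious clusters
bad = [0, 1, 0, 1, 0, 1]   # mixes them
score_good = silhouette(vectors, good)
score_bad = silhouette(vectors, bad)
legend = describe_groups(vectors, good, names=("temperature", "humidity"))
```

The well-separated labelling scores close to 1 and the mixed labelling scores much lower, so the measure could be used to compare grouping methods and parameters. The pairwise distances make this quadratic in the number of grid cells, so on a full dataset it would probably have to be computed on a sample.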
Further work
It is quite likely that the first grouping results will not be good enough, because the result depends on the grouping method and on the parameters chosen for it. The next step is therefore an interactive system or interface for adjusting grouping methods and parameters. It should also give graphical feedback on the evaluation of the grouping result, so that users can improve the grouping based on it.