Virtual Data Cosmos by Annika Kreikenbohm

Where do cosmic X-rays come from? Every new unidentified X-ray source has the potential to revolutionise our understanding of the universe. Astronomers therefore set out to automatically classify new sources of X-ray emission (e.g., stars or galaxies) in the large observation database of an X-ray satellite, using several machine-learning (ML) algorithms to perform the classification. However, because of the volume and dimensionality of the data set, conventional scatter plots and histograms could not clearly visualize the resulting class distribution or help find unknown objects. This made it hard to understand the relationships between parameter values and source types, and the mechanisms of the algorithms remained opaque.

I was interested in the challenge of visualizing these big data sets in an interactive and intuitive way, to facilitate the visual exploration of their internal structures and relationships. The VIRTUAL DATA COSMOS is an interactive data visualization tool in virtual reality (VR) that lets scientists explore multidimensional data sets.

The data themselves define the aesthetics and atmosphere of the virtual world. By interacting with the virtual environment, users can explore the abstract, non-visual data space. The VR experience consists of two spaces ("class room" and "parameter space"), and users can move continuously from one space to the other.

Class room – overview of the class probability distribution of four ML algorithms for a big data set of unidentified cosmic X-ray sources. Each ML algorithm results in a color-coded point cloud following the principle of similarity. Objects close to the center of a sphere have a high probability of belonging to that class. This probability decreases with distance, and objects are oriented towards the spheres of alternative classes.
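
The placement rule described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the function name place_object, the max_radius parameter, the linear relation between distance and probability, and the use of a probability-weighted centroid for the "alternative classes" are all assumptions made for illustration.

```python
import numpy as np

def place_object(probs, centers, max_radius=1.0):
    """Return a 3D position for one source from its class probabilities.

    Hypothetical helper: probs holds one probability per class,
    centers holds the 3D center of each class sphere.
    """
    probs = np.asarray(probs, dtype=float)
    centers = np.asarray(centers, dtype=float)
    best = int(np.argmax(probs))

    # Weights of the alternative classes (the most probable class is excluded).
    alt = probs.copy()
    alt[best] = 0.0
    if alt.sum() == 0.0:
        # Fully certain classification: sit at the sphere center.
        return centers[best].copy()

    # Direction from the best class center towards the weighted alternatives.
    target = (alt[:, None] * centers).sum(axis=0) / alt.sum()
    direction = target - centers[best]
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        return centers[best].copy()

    # Assumed linear rule: distance grows as the best-class probability drops.
    distance = max_radius * (1.0 - probs[best])
    return centers[best] + distance * direction / norm


# Three illustrative class spheres (made-up coordinates).
centers = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]
confident = place_object([0.9, 0.1, 0.0], centers)
uncertain = place_object([0.6, 0.4, 0.0], centers)
```

With these made-up inputs, the confident source stays close to the center of the first sphere, while the uncertain one sits further out along the line towards the second sphere.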

Parameter space – overview of the parameter values used by an ML algorithm for the activated source. Each parameter is represented by a (semi-)circle, and together these form a characteristic trace for an X-ray source. Similar objects occupy the same regions in space, which demonstrates the sorting process of each algorithm.