HiPiler - Visual Exploration of Genome Interaction Matrices with Interactive Small Multiples by Harvard University

Imagine having to explore, compare, and organize hundreds to thousands of small regions in one or more large images or visualizations. For example, how would you assess the typical look and topology of beaches around the world in Google Maps? Biologists face this challenge when studying the 3D structure of the genome. The structure is visualized as a heatmap matrix of 3 million by 3 million rows and columns that contains several thousand small patterns of interest to the analyst. HiPiler is a web-based tool that addresses this challenge by separating these regions of interest from the original matrix and letting the analyst interactively explore thumbnails of the separated regions, called snippets. Snippets can be arranged, aggregated, and compared manually or automatically along computationally derived data attributes. They can be laid out in one or two dimensions using data attributes such as size or noisiness, or derived attributes such as confidence scores from the algorithm that identified the patterns in the first place. To scale to thousands of snippets, the user can manually or automatically aggregate snippets into piles that display a statistical aggregate, e.g., an 'average' pattern, along with previews of the grouped snippets. In this way, HiPiler augments the human ability to organize items with data-driven tools for exploration and visual pattern search across many regions in the context of a huge image. This enables biologists to evaluate the performance of pattern detection algorithms, remove outliers and noise, and assess the variance across pattern types to identify subgroups of patterns. These abilities allow for fast, interactive testing of hypotheses, which is highly important because the lack of ground truth data requires biologists to make sense of and evaluate findings visually.
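
To make the snippet and pile idea concrete, here is a minimal sketch in plain Python. It is not HiPiler's code: the function names (`extract_snippet`, `pile_average`, `noisiness`, `order_by`) are hypothetical, the matrix is a toy nested list, and "noisiness" is assumed here to be the variance of a snippet's cell values, which may differ from the measure the tool actually uses.

```python
from statistics import pvariance


def extract_snippet(matrix, row, col, size):
    """Cut a size x size region whose top-left cell is (row, col)."""
    return [r[col:col + size] for r in matrix[row:row + size]]


def pile_average(snippets):
    """Cell-wise mean of equally sized snippets: one possible 'average' pattern."""
    n_rows = len(snippets[0])
    n_cols = len(snippets[0][0])
    return [
        [sum(s[i][j] for s in snippets) / len(snippets) for j in range(n_cols)]
        for i in range(n_rows)
    ]


def noisiness(snippet):
    """Assumed attribute: population variance over all cells of a snippet."""
    return pvariance([value for row in snippet for value in row])


def order_by(snippets, attribute):
    """Lay snippets out in one dimension, sorted by a data attribute."""
    return sorted(snippets, key=attribute)


# Toy 4 x 4 "interaction matrix" with two regions of interest.
matrix = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
a = extract_snippet(matrix, 0, 0, 2)  # [[0, 1], [4, 5]]
b = extract_snippet(matrix, 2, 2, 2)  # [[10, 11], [14, 15]]
pile = pile_average([a, b])           # [[5.0, 6.0], [9.0, 10.0]]

flat = [[1, 1], [1, 1]]
noisy = [[0, 9], [9, 0]]
ordered = order_by([noisy, flat], noisiness)  # flat first, then noisy
```

The cell-wise mean is only one possible aggregate; a pile could equally show a median or a variance map, and any per-snippet score (for example a detector's confidence) can be passed to `order_by` in place of `noisiness`.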