DIFFusion: Decoding Visual Expertise - Columbia Computer Vision

Teaching Humans Subtle Differences with DIFFusion

Mia Chiquier* Orr Avrech* Yossi Gandelsman Berthy Feng Katie Bouman Carl Vondrick

*Equal contribution

Columbia, Berkeley, Caltech

In Submission


DIFFusion

Scientific expertise often requires recognizing subtle visual differences that remain challenging to articulate even for domain experts. We present a system that leverages generative models to automatically discover and visualize minimal discriminative features between categories while preserving instance identity. Our method generates counterfactual visualizations with subtle, targeted transformations between classes, performing well even in domains where data is sparse, examples are unpaired, and category boundaries resist verbal description. Experiments across six domains, including black hole simulations, butterfly taxonomy, and medical imaging, demonstrate accurate transitions with limited training data, highlighting both established discriminative features and novel subtle distinctions that measurably improved category differentiation. User studies confirm our generated counterfactuals significantly outperform traditional approaches in teaching humans to correctly differentiate between fine-grained classes, showing the potential of generative models to advance visual learning and scientific research.

Black Hole: We learn that the SANE simulation tends to have more uniform wisps (yellow). The MAD simulation tends to have a more prominent photon ring (blue).

Butterfly: We learn that Viceroys have a cross-sectional line across their wings, whereas Monarchs don't (yellow). Monarchs have a larger head with spots on it (magenta). Finally, the Viceroy's spots along the edge of the wings can be described as more 'scaly', or 'gothic' (blue).

Retina: We learn that retinas with drusen have bumps along the horizontal cross-section (yellow).

Method

Our approach leverages diffusion models to create smooth transitions between visual categories, helping novices learn subtle discriminative features. By carefully manipulating the conditioning space, we maintain instance identity while traversing category boundaries.
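To make the idea of moving through a conditioning space concrete, here is a minimal sketch. It is not taken from the paper, and this page does not say how the conditioning is actually manipulated. The sketch assumes each image has a conditioning embedding, takes the difference of class means as the direction between categories, and shifts a single instance's embedding along that direction. The names `class_direction` and `traverse`, the embedding size, and the random data are all hypothetical. The diffusion model that would consume each shifted embedding is omitted.

```python
import numpy as np

def class_direction(emb_a, emb_b):
    """Direction from class A to class B: difference of the class-mean embeddings.
    (Assumed for illustration; the page does not specify the manipulation.)"""
    return emb_b.mean(axis=0) - emb_a.mean(axis=0)

def traverse(instance_emb, direction, strengths):
    """Shift one instance's conditioning embedding along the class direction.
    Returns an array of shape (len(strengths), embedding_dim)."""
    return np.stack([instance_emb + s * direction for s in strengths])

# Hypothetical stand-ins for per-image conditioning embeddings of two classes.
rng = np.random.default_rng(0)
emb_a = rng.normal(0.0, 1.0, size=(20, 8))
emb_b = rng.normal(0.5, 1.0, size=(20, 8))

direction = class_direction(emb_a, emb_b)
# Five steps from the original instance (strength 0) to a full shift (strength 1).
path = traverse(emb_a[0], direction, np.linspace(0.0, 1.0, 5))
```

Because the instance's own embedding is the starting point and only the class direction is added, strength 0 reproduces the original conditioning. This is one simple way to keep instance-specific information while moving toward the other category.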

Interpolation Results

DIFFusion generates smooth transitions that highlight key discriminative features between categories.

User Study

Our user studies demonstrate that DIFFusion significantly improves novices' ability to distinguish subtle visual differences.

Acknowledgements

We thank our user study participants and collaborators for their insights. This work is supported by the Carver Mead New Adventures Fund, a Pritzker Award, an AI4Science Amazon Discovery Grant, the NSF AI Institute for Artificial and Natural Intelligence (ARNI), NSF CAREER #2046910, NSF RETTL #2202578, DARPA ECOLE, and a Google Fellowship. Views are ours, not necessarily our sponsors'.