Using Diffusion Models to Visualize What Self-Supervised Neural Networks Actually Learn
[Submitted on 16 Dec 2021 (v1), last revised 16 Aug 2022 (this version, v2)]
Summary
This paper introduces the use of Representation Conditional Diffusion Models (RCDM) to visualize what self-supervised learning (SSL) models actually learn. The authors demonstrate that RCDMs can generate high-quality samples faithful to the representations they condition on. Using this visualization technique, they make four key findings: (1) SSL backbone representations are NOT invariant to data augmentations they were trained with, debunking a common misconception; (2) SSL post-projector embeddings do appear invariant to augmentations and other symmetries; (3) SSL representations are more robust to small adversarial perturbations than supervised representations; and (4) SSL representations have an inherent structure that enables image manipulation through RCDM visualization.
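Findings (1) and (2) turn on what it means for a representation to be invariant to an augmentation. As a rough illustration only, the sketch below scores invariance as the cosine similarity between the encoding of an input and the encoding of its augmented copy. This is not the paper's method, which relies on RCDM visualization. Every name here (`toy_backbone`, `toy_projector`, `horizontal_flip`, `invariance_score`) is a hypothetical stand-in on a 1-D toy "image", chosen so that the backbone is flip-sensitive and the projector is flip-invariant by construction.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def horizontal_flip(image):
    """Toy augmentation (assumption): reverse a 1-D list of pixel values."""
    return image[::-1]


def toy_backbone(image):
    """Hypothetical backbone: preserves pixel order, so flips change its output."""
    return [float(p) for p in image]


def toy_projector(features):
    """Hypothetical projector: order-independent statistics, flip-invariant by design."""
    return [sum(features), max(features)]


def invariance_score(encoder, augment, image):
    """Similarity between encodings of an image and its augmented copy (1.0 = invariant)."""
    return cosine_similarity(encoder(image), encoder(augment(image)))


image = [1, 2, 3, 4]
backbone_score = invariance_score(toy_backbone, horizontal_flip, image)
embedding_score = invariance_score(
    lambda x: toy_projector(toy_backbone(x)), horizontal_flip, image
)
```

In this toy setup the backbone score falls below 1 while the post-projector score stays at 1, which mirrors the direction of the paper's claim without reproducing any of its experiments. A real check would swap in a trained SSL encoder and the augmentations it was trained with.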
Key quotes
Discovering what is learned by neural networks remains a challenge.
In self-supervised learning, classification is the most common task used to evaluate how good a representation is.
SSL (backbone) representation are not invariant to the data augmentations they were trained with -- thus debunking an often restated but mistaken belief.
SSL post-projector embeddings appear indeed invariant to these data augmentation, along with many other data symmetries.
SSL-trained representations exhibit an inherent structure that can be explored thanks to RCDM visualization and enables image manipulation.