Apple's SHARP: Photorealistic View Synthesis from Single Images in Under a Second
By dvrp
Summary
Apple researchers present SHARP, an approach to photorealistic view synthesis from a single image. Given one photograph, the method regresses the parameters of a 3D Gaussian representation of the scene and synthesizes novel views in under a second, reporting state-of-the-art results on benchmark datasets. The approach combines a 2D diffusion model with the 3D Gaussian representation to keep the synthesized views sharp.
Key quotes
We present SHARP, an approach to photorealistic view synthesis from a single image.
Given a single photograph, SHARP regresses the parameters of a 3D Gaussian representation of the scene.
SHARP synthesizes novel views in less than a second, achieving state-of-the-art results on benchmark datasets.
The approach combines a 2D diffusion model with a 3D Gaussian representation to generate sharp novel views.
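To make "the parameters of a 3D Gaussian representation" concrete, here is a minimal Python sketch of how one Gaussian is commonly parameterized in Gaussian-splatting work: a centre, per-axis scales, a rotation quaternion, an opacity and a colour. The class and field names are hypothetical, chosen for illustration. The summary does not say how SHARP lays out its output, so treat this as the general convention rather than Apple's implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]
Mat3 = List[List[float]]


@dataclass
class Gaussian3D:
    """One 3D Gaussian. Field names are illustrative assumptions,
    not SHARP's actual output format."""
    mean: Vec3      # centre of the Gaussian in scene coordinates
    scale: Vec3     # standard deviation along each local axis
    rotation: Quat  # orientation as a quaternion (w, x, y, z)
    opacity: float  # blending weight in [0, 1]
    color: Vec3     # RGB in [0, 1]


def quat_to_matrix(q: Quat) -> Mat3:
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = (w * w + x * x + y * y + z * z) ** 0.5  # normalize defensively
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(g: Gaussian3D) -> Mat3:
    """Build Sigma = R S S^T R^T, where S = diag(scale)."""
    R = quat_to_matrix(g.rotation)
    s2 = [s * s for s in g.scale]
    return [
        [sum(R[i][k] * s2[k] * R[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# An axis-aligned Gaussian: the covariance diagonal is the squared scales.
g = Gaussian3D(mean=(0.0, 0.0, 2.0), scale=(1.0, 2.0, 3.0),
               rotation=(1.0, 0.0, 0.0, 0.0), opacity=0.8,
               color=(0.9, 0.5, 0.1))
sigma = covariance(g)
```

Under this convention, a model that "regresses the parameters" would emit one such record per Gaussian, and a renderer would project each covariance into the image plane to draw a new view.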