Appears on
Articles (4)
MMaDA-Parallel: Multimodal Diffusion Language Models for Thinking-Aware Generation and Editing
Code
Introduction of Qwen VLo: A Unified Multimodal Understanding and Generation Model
Qwen VLo is a unified multimodal understanding and generation model. It aims to bridge perception and creation: it not only understands visual content but also generates high-quality recreations based on that understanding.
News
4Real-Video-V2: Feedforward Reconstruction for 4D Scene Generation
Article URL: https://snap-research.github.io/4Real-Video-V2/
Comments URL: https://news.ycombinator.com/item?id=44368015
Points: 6, Comments: 1
snap-research.github.io (11mo ago)
Efficient Dense Point Tracking Model: AllTracker
AllTracker is a model that produces long-range point tracks by estimating the flow field between frames of a video.
News
alltracker.github.io (11mo ago)

