SkyJEPA: A Latent Dynamics Model for Zero-Shot Sim-to-Real Quadrotor Control
[Submitted on 22 Jun 2026 (v1), last revised 23 Jun 2026 (this version, v2)]
Summary
This paper introduces SkyJEPA, a Joint Embedding Predictive Architecture (JEPA) for learning long-horizon world models that enable zero-shot sim-to-real control of quadrotors. The approach combines a latent dynamics model with a physics-inspired prober that maps frozen latents to interpretable state for physically grounded long-horizon prediction. It integrates with sampling-based optimal control for real-time operation on embedded hardware, and includes a structured pipeline for automated dataset generation to reduce reliance on real-world data collection. Open-loop and outdoor closed-loop experiments demonstrate accurate prediction, robust zero-shot sim-to-real transfer, and strong generalization across diverse operating conditions.
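To make the pipeline described above more concrete, here is a minimal sketch of how a latent dynamics model, a prober, and a sampling-based controller could fit together. Nothing in it is taken from the paper. The linear-plus-tanh latent dynamics (`A`, `B`), the linear prober (`W`) that reads out a 3D position, the quadratic tracking cost, and the softmax-weighted averaging of sampled action sequences (in the style of MPPI) are all illustrative assumptions, and every function name is hypothetical.

```python
import numpy as np


def rollout_latent(z0, actions, A, B):
    """Roll a toy latent dynamics model forward: z_{t+1} = tanh(A z_t + B u_t).

    Hypothetical stand-in for a learned JEPA predictor.
    """
    z = z0
    zs = []
    for u in actions:
        z = np.tanh(A @ z + B @ u)
        zs.append(z)
    return np.stack(zs)  # shape: (horizon, latent_dim)


def probe_state(zs, W):
    """Map latents to an interpretable state with a linear prober (assumed form)."""
    return zs @ W.T  # shape: (horizon, state_dim)


def sample_based_plan(z0, goal, A, B, W, horizon=10, n_samples=256,
                      temperature=1.0, rng=None):
    """Sample action sequences, score them in probed state space, and average.

    The exponential weighting is an MPPI-style assumption, not the paper's method.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    action_dim = B.shape[1]
    candidates = rng.normal(0.0, 1.0, size=(n_samples, horizon, action_dim))

    costs = np.empty(n_samples)
    for i, actions in enumerate(candidates):
        states = probe_state(rollout_latent(z0, actions, A, B), W)
        costs[i] = np.sum((states - goal) ** 2)  # quadratic tracking cost

    # Lower cost gives higher weight; subtracting the minimum keeps exp() stable.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    plan = np.einsum("i,iha->ha", weights, candidates)
    return plan, costs


# Usage with random (untrained) parameters, purely to show the data flow.
rng = np.random.default_rng(0)
latent_dim, action_dim = 8, 4
A = 0.5 * rng.normal(size=(latent_dim, latent_dim)) / np.sqrt(latent_dim)
B = rng.normal(size=(latent_dim, action_dim)) / np.sqrt(action_dim)
W = rng.normal(size=(3, latent_dim))
plan, costs = sample_based_plan(np.zeros(latent_dim), np.array([0.5, 0.0, 0.2]),
                                A, B, W, rng=rng)
print(plan.shape)  # (10, 4): one action per step of the planning horizon
```

In a receding-horizon loop, only the first action of `plan` would typically be applied before replanning from the next encoded observation. Scoring candidates in the probed state space rather than directly on latents is what would let a cost be expressed in physical quantities such as position error.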
Key quotes
Accurate dynamics models are critical for informed decision-making in robotic systems, particularly for agile aerial vehicles operating under uncertainty.
Joint Embedding Predictive Architectures (JEPAs) offer a compelling alternative by modeling dynamics in latent space.
Extensive open-loop and outdoor closed-loop experiments demonstrate accurate prediction, robust zero-shot sim-to-real transfer, and strong generalization across diverse operating conditions.
