Embodied AI Workshop
CVPR 2026 - Denver

#

Overview

Minds live in bodies, and bodies move through a changing world. The goal of embodied artificial intelligence is to create agents, such as robots, which learn to creatively solve challenging tasks requiring interaction with the environment. While this is a tall order, fantastic advances in deep learning, the explosive growth of large language models, and the increasing availability of large datasets like ImageNet have enabled superhuman performance on a variety of AI tasks previously thought intractable. Computer vision, speech recognition and natural language processing have experienced transformative revolutions at passive input-output tasks like language translation and image processing, and reinforcement learning has similarly achieved world-class performance at interactive tasks like games. These advances have supercharged embodied AI, enabling a growing collection of researchers to make rapid progress towards intelligent agents which can:

  • See: perceive their environment through vision or other senses.
  • Talk: hold a natural language dialog grounded in their environment.
  • Listen: understand and react to audio input anywhere in a scene.
  • Act: navigate and interact with their environment to accomplish goals.
  • Reason: consider and plan for the long-term consequences of their actions.

The goal of the Embodied AI workshop is to bring together researchers from computer vision, language, graphics, and robotics to share and discuss the latest advances in embodied intelligent agents. EAI 2026’s overarching theme is World Models for Embodied AI: embodied AI agents that create models of the world to help them imagine and act, or to help researchers test and evaluate them. This umbrella theme is divided into three topics:

  • World Models for Action and Evaluation: explores both dynamics models that incorporate physics and geometry and video models in which dynamics are implicit.
  • The Resurgence of Classic Methods: examines new applications of techniques such as reinforcement learning and model-predictive control to embodied AI.
  • Long-Horizon Embodied Intelligence: explores benchmarks and methods for multi-step tasks, robust testing, and, in particular, safe operation.
For more information on the Embodied AI Workshop series, see our Retrospectives paper on the first three years of the workshop. For the latest updates, follow the Embodied AI Medium blog at medium.com/embodied-artificial-intelligence.


#

Attending

The Embodied AI 2026 workshop was held in conjunction with CVPR 2026 in Denver, Colorado. It featured a host of invited talks covering a variety of topics in Embodied AI, many exciting Embodied AI challenges, a poster session, and panel discussions. The Embodied AI workshop was held in-person with remote options on June 4th from 8:45 AM to 5:30 PM MDT.

For late-breaking updates from CVPR, see the workshop's CVPR page.

#

Timeline

Workshop Announced
February 2nd, 2026
Paper Submission Deadline
May 15th, 2026 (originally April 3rd)
Paper Notification Deadline
May 27th, 2026 (originally April 24th)
Challenge Submission Deadlines
May-June, 2026. Check each challenge for the specific date.
Camera Ready Copy Deadline
June 1st, 2026 (originally May 15th)
Seventh Annual Embodied AI Workshop at CVPR
Denver, Colorado
June 4th, 2026
Room 107
Challenge Winners Announced
At the workshop. Check each challenge for specifics.

#

Workshop Schedule

Embodied AI will be a hybrid workshop, with both in-person talks and streaming via Zoom.
  • Workshop Talks: 8:45AM-5:30PM MDT - Room 107
  • Poster Session: 12:00PM-1:30PM MDT - Exhibit Hall A Boards 262-276
  • Virtual Sessions: Workshop page available to registered CVPR attendees.
Note: an earlier version of the website said CDT, but the time zone is MDT, the same as the rest of CVPR.
Zoom information can be found for CVPR attendees on our official CVPR workshop page when it becomes available.
Remote and in-person attendees are welcome to ask questions via Slack:

  • Workshop Introduction: Embodied AI
    8:45 - 9:00 AM MDT
    Location: Room 107
    Anthony Francis
    Logical Robotics
  • Challenge Presentations - Winning Methods
    9:00 - 10:00 AM MDT
    Location: Room 107
    Moderator - David Hall
    CSIRO
  • Challenge Q&A
    10:00 - 10:30 AM MDT
    Location: Room 107
  • Invited Talk - Siyuan Huang, BIGAI
    Title: Understanding the 3D World for General Agents
    10:30 - 11:00 AM MDT
    Location: Room 107
    Siyuan Huang
    BIGAI

    Bio: Siyuan Huang is a Research Scientist at the Beijing Institute for General Artificial Intelligence (BIGAI), directing the Center of Embodied AI and Robotics. He received his Ph.D. from the Department of Statistics at the University of California, Los Angeles (UCLA). His research aims to build a general robot capable of understanding and interacting with 3D environments like humans. His research has received multiple awards, including the Best Paper Award at CoRL 2025 and several workshop best paper awards.

    Abstract: While current world models exhibit impressive predictive capabilities, their reliance on 2D image sequences masks a critical lack of genuine geometric, spatial, and physical understanding. For general embodied agents to interact reliably wi...
  • Invited Talk - Stefan Leutenegger, ETH Zurich
    Title: Spatial AI and Robot Learning for the Real World
    11:00 - 11:30 AM MDT
    Location: Room 107
    Stefan Leutenegger
    ETH Zurich

    Bio: Prof. Dr. Stefan Leutenegger is an Associate Professor in the Department of Mechanical and Process Engineering of ETH Zurich.

    Abstract: TBD
  • Invited Talk - Lewis Chiang, Google DeepMind
    Title: Why Are Robot Agents So Hard?
    11:30 AM - 12:00 PM MDT
    Location: Room 107
    Lewis Chiang
    Google DeepMind

    Bio: Lewis Chiang is a Research Scientist at Google DeepMind, where he works on Gemini Robotics. His research focuses on developing real-time robot agents. Prior to joining Google DeepMind, Lewis worked on motion prediction and planning at Waymo.

    Abstract: TBD
  • Lunch / Accepted Papers Poster Session
    12:00 PM - 1:30 PM MDT
    Location: Exhibit Hall A, Boards 262 - 276
  • Invited Talk - Ruiqi Gao, Google DeepMind
    Title: World Models for Embodied AI
    1:30 - 2:00 PM MDT
    Location: Room 107
    Ruiqi Gao
    Google DeepMind

    Bio: I am a Research Scientist at Google DeepMind. I am mainly interested in generative models and representation learning. My recent research focus is to construct powerful generative AI models that can comprehend, generate, and reason with multi-modal data, including natural language, images, videos and 3D. I obtained my Ph.D. from UCLA advised by Song-Chun Zhu and Ying Nian Wu. Prior to that, I received my B.S. degree in Statistics from Peking University.

    Abstract: TBD
  • Invited Talk - Tapomayukh Bhattacharjee, Cornell University
    Title: Embodied Intelligence for Physical Contact with Humans: Towards Safe Caregiving Robots in the Real World
    2:00 - 2:30 PM MDT
    Location: Room 107
    Tapomayukh Bhattacharjee
    Cornell University

    Bio: Tapomayukh "Tapo" Bhattacharjee is an Assistant Professor in the Department of Computer Science at Cornell University where he directs the EmPRISE Lab (https://emprise.cs.cornell.edu/). He completed his Ph.D. in Robotics from Georgia Institute of Technology and was an NIH Ruth L. Kirschstein NRSA postdoctoral research associate in Computer Science & Engineering at the University of Washington. His primary research interests are in the area of physical robot caregiving and physical human-robot interaction. He is the recipient of TRI Young Faculty Researcher Award'24, NSF CAREER Award'23, AFCEA 40 under 40 Award'22, and his work has won Best Systems Paper Award at HRI’26, Best Paper Award at RSS’25, Best Paper and Student Paper Award Finalist and Best HRI Paper Award Finalist at ICRA’25, Best Systems Paper Award Finalist at HRI'24, Best Demo Award at HRI'24, Best RoboCup Paper Award at IROS’22, Best Paper Award Finalist and ABB Best Student Paper Award Finalist at IROS’22, Best Technical Advances Paper Award at HRI'19, and Best Demonstration Award at NeurIPS’18. His work has also been featured in many media outlets including the BBC, Reuters, New York Times, IEEE Spectrum, and GeekWire and his robot-assisted feeding work was selected to be one of the best interactive designs of 2019 by Fast Company.

    Abstract: Physical contact with humans remains one of the most important and underexplored challenges in embodied AI. To operate safely and effectively in real-world environments shared with humans, robots must reason about and adapt to the diverse b...
  • Invited Talk - Yilun Du, Harvard
    Title: World Models for Robot Manipulation and Planning
    2:30 - 3:00 PM MDT
    Location: Room 107
    Yilun Du
    Harvard

    Bio: I am an Assistant Professor at Harvard in the Kempner Institute and CS, where I run the Embodied Minds lab. I received my PhD at MIT EECS, advised by Prof. Leslie Kaelbling, Prof. Tomas Lozano-Perez and Prof. Joshua B. Tenenbaum. Previously, I also obtained my bachelor's degree from MIT, was a research fellow at OpenAI, and a senior research scientist at Google DeepMind. My research focuses on generative models, decision making, robot learning, embodied agents, and the applications of such tools to scientific domains.

    Abstract: I'll talk about a couple methods in which world models can be useful for robotics applications. First, I'll talk about how they can be used as policies or imaginations depicting what to do in future steps. I'll talk about how they can be us...
  • Invited Talk - Wayne Wu, UCLA
    Title: From Scaling up to Scaling out: Reality World Simulators for Physical AI
    3:00 - 3:30 PM MDT
    Location: Room 107
    Wayne Wu
    UCLA

    Bio: I am an AI Researcher in the Department of Computer Science at the University of California, Los Angeles (UCLA), working with Bolei Zhou, and collaborating with Trevor Darrell (UC Berkeley EECS) and Jiaqi Ma (UCLA CEE). I was a Visiting PhD at Nanyang Technological University, working with Chen Change Loy. I received my Ph.D. from the Department of Computer Science and Technology at Tsinghua University.

    Abstract: Recent progress in large language and vision models demonstrates how far we can go by scaling with vast internet-scale data. In contrast, physical AI, agents that perceive and act in the real world, still lags far behind. Today, both academ...
  • Industry Talk - Sarah Parisot, Microsoft Research Cambridge
    Title: Building World Models for Creative Use
    3:30 - 4:00 PM MDT
    Location: Room 107
    Sarah Parisot
    Microsoft Research Cambridge

    Bio: I am a Principal Researcher in the Game Intelligence team which develops novel machine learning technology with applications to video games and beyond. My research interests and experience include parameter efficient learning, computer vision and generative AI. My recent work has focused on text-to-image generative models, with an emphasis on controllability and interactivity. Prior to joining Microsoft, I was a Senior Research Scientist and Team Leader at Huawei Noah’s Ark Lab in London.

    Abstract: World models offer a path toward interactive, co‑creative systems that support iteration, exploration, and sustained creative control. To be useful to creators, such models must balance expressiveness with practical constraints such as data efficienc...
  • Invited Talk - Dinesh Jayaraman, UPenn GRASP Lab
    Title: Coding Agent-Driven Robot Learning
    4:00 - 4:30 PM MDT
    Location: Room 107
    Dinesh Jayaraman
    UPenn GRASP Lab

    Bio: I am an associate professor at UPenn’s GRASP lab, with a primary appointment in CIS, and a secondary appointment in ESE. I lead the Perception, Action, and Learning (PennPAL) Research Group, where we work on problems at the intersection of robotics, machine learning, and computer vision.

    Abstract: TBD
  • Accepted Paper Highlights
    4:30 - 5:00 PM MDT
    Location: Room 107
  • Debate - Long-Horizon Safety in Embodied AI
    5:00 - 5:30 PM MDT
    Location: Room 107
    Moderator - Anthony Francis
    Logical Robotics
  • Workshop Concludes
    5:30 PM MDT
    Location: Room 107

#

Sponsor Events


#

Challenges

The Embodied AI 2026 workshop is hosting many exciting challenges covering a wide range of topics. More details regarding data, submission instructions, and timelines can be found on the individual challenge websites.

The workshop organizers will award each first-prize challenge winner a cash prize, sponsored by Logical Robotics and our other sponsors.

Challenge winners may be given the opportunity to present during their challenge's presentation at the workshop. Since many challenges can be grouped into similar tasks, we encourage participants to submit models to more than one challenge. The table below describes, compares, and links each challenge.

Challenge | Task | 2026 Winner | Platform | Scene Dataset | Observations | Action Space | Interactive Actions? | Stochastic Actuation?
ARNOLD | Language-Grounded Manipulation | | Isaac Sim | Arnold Dataset | RGB-D, Proprioception | Continuous | |
ManiSkill-ViTac | Vision-Tactile Fusion Bimanual Manipulation | | Real Bimanual Robot | Customized Scenarios | Wrist Image, Tactile Image, Proprioception | Continuous | |
ManipArena | Desktop and Mobile Manipulation | | Isaac Lab Arena (simulation), UR, Franka, ARX, Spatiotemporal AI Arm, x2robot Arm | Custom Dataset | RGB, joint angles, joint torques | Continuous | |

#

Call for Papers

We invite high-quality 2-page extended abstracts on embodied AI, especially in areas relevant to the themes of this year's workshop:

  • Embodied AI Solutions
  • World Models for Action and Evaluation
  • Classical Methods for Embodied AI
  • Long-Horizon Embodied Intelligence
as well as themes related to embodied AI in general:
  • Visual Navigation
  • Embodied Mobile Manipulation
  • Embodied Question Answering
  • Embodied AI Foundation Models
  • Embodied Vision & Language
  • Language Model Planning
  • Advances in Simulation for Embodied AI
Accepted papers will be presented as posters or spotlight talks at the workshop. These papers will be made publicly available in a non-archival format, allowing future submission to archival journals or conferences. Paper submissions do not have to be anonymized. Per CVPR rules regarding workshop papers, at least one author must register for CVPR using an in-person registration.

Submissions close May 15th, 2026 (Anywhere on Earth; for clarity, 00:01 in GMT as computed by OpenReview). Papers should be no longer than 2 pages (excluding references) and styled in the CVPR format.

Note. The order of the papers is randomized each time the page is refreshed.

#

Sponsors

The Embodied AI 2026 Workshop is sponsored by the following organizations:

Logical Robotics
Microsoft

#

Organizers

The Embodied AI 2026 workshop is a joint effort by many researchers from a variety of organizations. Each year, a set of lead organizers takes point coordinating with the CVPR conference, backed up by a team of workshop organizers, challenge organizers, and scientific advisors.
Anthony Francis
Logical Robotics
David Hall
CSIRO
German Ros
NVIDIA
Heewon Kim
SSU
Mike Roberts
Adobe
Minyoung Hwang
MIT
Oleksandr Maksymets
Rachith Prakash
Ran Gong
RAI
Vivan Amin
Microsoft
Chaoyi Liu
THU
Ian Reid
MBZUAI
Ivan Laptev
MBZUAI
Jiangyong Huang
Peking U
Ma Liang
MBZUAI
Meng Cao
MBZUAI
Qian Wang
X Square Robot
Ran Gong
RAI
Rongtao Xu
MBZUAI
Rongxuan Zhang
NEU
Rui Chen
THU
Shaowei Cui
CASIA
Wenxuan Ma
CASIA
Xiaodan Liang
SYSU, MBZUAI
Xiaofeng Gao
Amazon
Yizhou Zhao
NVIDIA
Yu Sun
SYSU, X Square Robot
Ade Famoti
Microsoft
Claudia Pérez D’Arpino
NVIDIA
Peyman Moghadam
CSIRO
Roberto Martín-Martín
Stanford