The Embodied AI workshop will be held in person at CVPR 2025 in Nashville, Tennessee on June 12th from 9:00 AM to 5:00 PM CDT:
Minds live in bodies, and bodies move through a changing world. The goal of embodied artificial intelligence is to create agents, such as robots, that learn to creatively solve challenging tasks requiring interaction with the environment. While this is a tall order, fantastic advances in deep learning and the increasing availability of large datasets like ImageNet have enabled superhuman performance on a variety of AI tasks previously thought intractable. Computer vision, speech recognition, and natural language processing have experienced transformative revolutions at passive input-output tasks like language translation and image processing, and reinforcement learning has similarly achieved world-class performance at interactive tasks like games. These advances have supercharged embodied AI, enabling a growing collection of researchers to make rapid progress towards intelligent agents that can:
The goal of the Embodied AI workshop is to bring together researchers from computer vision, language, graphics, and robotics to share and discuss the latest advances in embodied intelligent agents. EAI 2025's overarching theme is Real-World Applications: creating embodied AI solutions that are deployed in real-world environments, ideally in the service of real-world tasks. Embodied AI agents are maturing, and the community should promote work that transfers this research out of simulation and laboratory environments into real-world settings. This umbrella theme is divided into four topics:
Bio: Lerrel Pinto is an Assistant Professor of Computer Science at NYU Courant and part of the CILVR group. Lerrel runs the General-purpose Robotics and AI Lab (GRAIL) with the goal of getting robots to generalize and adapt in the messy world we live in.
Bio: Jianwei Yang is a principal researcher in the Deep Learning Group at Microsoft Research, Redmond, led by Jianfeng Gao. His research interests span computer vision, multi-modality, and machine learning. Currently, he is focusing on building next-generation vision and multi-modal foundations.
Bio: Jiayun (Peter) Wang is a postdoctoral researcher at the California Institute of Technology, working with Prof. Anima Anandkumar. He received his PhD from UC Berkeley in 2023, advised by Prof. Stella Yu. His research develops novel machine learning and computer vision methodologies that address challenges of data scarcity and computational cost, with real-world applications like healthcare. More information can be found at his website: https://pwang.pw/.
Bio: Rika Antonova is an Associate Professor at the University of Cambridge. Her research interests include data-efficient reinforcement learning algorithms, robotics, and active learning & exploration. Earlier, Rika was a postdoctoral scholar at Stanford University after receiving the Computing Innovation Fellowship from the US National Science Foundation. Rika completed her PhD at KTH, and before that she obtained a research Master's degree from the Robotics Institute at Carnegie Mellon University. Earlier still, Rika was a senior software engineer at Google.
Bio: Dr. Rareș Ambruș is a senior manager in the Large Behavior Models division at Toyota Research Institute (TRI). His research interests lie at the intersection of robotics, computer vision and machine learning with the aim of discovering visual representations for embodied applications in areas such as automated driving and robotics. Dr. Ambruș received his Ph.D. in 2017 from the Royal Institute of Technology (KTH), Sweden, focusing on self-supervised perception and mapping for mobile robots. He has more than 100 publications and patents at top AI venues covering fundamental topics in computer vision, machine learning and robotics.
Bio: Nikhil Mohan is a Lead Scientist at Wayve, where he focuses on leveraging data-driven techniques for simulation in autonomous driving. His work spans Neural Radiance Fields (NeRFs), Gaussian Splatting, and generative models, emphasizing their application to improve and evaluate Wayve’s AI Driver performance. Before turning his attention to simulation, Nikhil led Wayve’s production driving team, where they shipped research prototypes into the production system. Prior to joining Wayve, he earned his Master’s degree at Carnegie Mellon University, concentrating in machine learning and signal processing.
Bio: Huan Ling is a Senior Research Scientist at NVIDIA's Spatial Intelligence (TorontoAI) Lab. His research focuses on developing foundational generative models that enable realistic and controllable environments, spanning video synthesis and 3D/4D scene generation and reconstruction. His work aims to build scalable systems that support real-world applications. Huan's research has been featured at top conferences such as CVPR and NeurIPS, and he actively collaborates across disciplines to advance the frontier of generative AI for real-world applications. He has contributed to the development and large-scale training of video foundation model products, including NVIDIA-COSMOS and COSMOS-Drive-Dreams, which enable high-fidelity, controllable video generation for physical-AI scenarios like autonomous driving.
The Embodied AI 2025 workshop is hosting many exciting challenges covering a wide range of topics. More details regarding data, submission instructions, and timelines can be found on the individual challenge websites.
The workshop organizers will award each first-prize challenge winner a cash prize, sponsored by Logical Robotics and our other sponsors.
Challenge winners may be given the opportunity to present during their challenge's presentation at the workshop. Since many challenges can be grouped into similar tasks, we encourage participants to submit models to more than one challenge. The table below describes, compares, and links each challenge.
We invite high-quality 2-page extended abstracts on embodied AI, especially in areas relevant to the themes of this year's workshop:
The submission deadline CLOSED on May 23rd (Anywhere on Earth; for clarity, 2025/05/24 00:01 GMT as computed by OpenReview). Papers should be no longer than 2 pages (excluding references) and styled in the CVPR format.
Note: The order of the papers is randomized each time the page is refreshed.