Embodied AI Workshop
CVPR 2024

#

Overview

Minds live in bodies, and bodies move through a changing world. The goal of embodied artificial intelligence is to create agents, such as robots, which learn to creatively solve challenging tasks requiring interaction with the environment. While this is a tall order, fantastic advances in deep learning and the increasing availability of large datasets like ImageNet have enabled superhuman performance on a variety of AI tasks previously thought intractable. Computer vision, speech recognition and natural language processing have experienced transformative revolutions at passive input-output tasks like language translation and image processing, and reinforcement learning has similarly achieved world-class performance at interactive tasks like games. These advances have supercharged embodied AI, enabling a growing collection of researchers to make rapid progress towards intelligent agents which can:

  • See: perceive their environment through vision or other senses.
  • Talk: hold a natural language dialog grounded in their environment.
  • Listen: understand and react to audio input anywhere in a scene.
  • Act: navigate and interact with their environment to accomplish goals.
  • Reason: consider and plan for the long-term consequences of their actions.

The goal of the Embodied AI workshop is to bring together researchers from computer vision, language, graphics, and robotics to share and discuss the latest advances in embodied intelligent agents. The overarching theme of this year's workshop is Open World Embodied AI: Being an embodied agent in a world that contains objects and concepts unseen during training. This theme applies the “open set” problem of many individual tasks to embodied AI as a whole. We feel that truly effective embodied AI agents should be able to deal with tasks, objects, and situations markedly different from those that they have been trained on. This umbrella theme is divided into three topics:

  • Embodied Mobile Manipulation: We go places to do things, and to do things we have to go places. Many interesting embodied tasks combine manipulation and navigation to solve problems that cannot be solved with either manipulation or navigation alone. This builds on embodied navigation and manipulation topics from previous years and makes them more challenging.
  • Generative AI for Embodied AI: Generative AI isn't just a hot topic; it's an important tool researchers are using to support embodied artificial intelligence research. Topics such as generative AI for simulation, generative AI for data generation, and generative AI for policies (e.g., diffusion policies and world models) are of great interest.
  • Language Model Planning: When we go somewhere to do something, we do it for a purpose. Language model planning uses large language models (LLMs), vision-language models (VLMs), and multimodal foundation models to turn arbitrary language commands into plans and sequences for action - a key feature needed to make embodied artificial intelligence systems useful for performing tasks in open worlds.
The Embodied AI 2024 workshop will be held in conjunction with CVPR 2024 in Seattle, Washington. It will feature a host of invited talks covering a variety of topics in Embodied AI, many exciting Embodied AI challenges, a poster session, and panel discussions. For more information on the Embodied AI Workshop series, see our Retrospectives paper on the first three years of the workshop.

#

Timeline

Workshop Announced
March 29, 2024
Paper Submission Deadline
May 4th, 2024 (Anywhere on Earth)
Paper Notification Deadline
May 27th, 2024
Challenge Submission Deadlines
May 2024. Check each challenge for the specific date.
Fifth Annual Embodied AI Workshop at CVPR
Seattle Convention Center
Tuesday, June 18, 2024
8:50 AM - 6:00 PM PT
TBD
Challenge Winners Announced
June 18, 2024 at the workshop. Check each challenge for specifics.

#

Workshop Schedule

Embodied AI will be a hybrid workshop, with both in-person talks and streaming via Zoom.
  • Workshop Talks: 8:50AM-5:30PM PT - TBD
  • Poster Session: 1:00PM-2:00PM PT - TBD
Zoom information is available on the CVPR virtual platform for registered attendees.
Remote and in-person attendees are welcome to ask questions via Slack.

  • Workshop Introduction: Embodied AI
    8:50 - 9:00 AM PT
    Moderator - Anthony Francis
    Logical Robotics
  • Navigation & Social Challenge Presentations
    (MultiON, HAZARD, PRS Challenge)
    9:00 - 9:30 AM PT
    • 9:00: MultiON
    • 9:10: HAZARD
    • 9:20: PRS Challenge
  • Navigation & Social Challenge Q&A Panel
    9:30 - 10:00 AM PT
  • Invited Talk - Generative AI for Embodied AI:
    10:00 - 10:30 AM PT
    Aniruddha Kembhavi
    AI2

    Ani Kembhavi is the Senior Director of Computer Vision at the Allen Institute for AI (AI2) in Seattle, and is also an Affiliate Associate Professor at the Computer Science & Engineering department at the University of Washington. His work over two decades spans computer vision, robotics and natural language processing.

    Aniruddha Kembhavi will be speaking on Generative AI for Embodied AI, especially the ProcTHOR procedural generation system.
  • Invited Panel - Advancing Embodied AI: Towards Seamless Integration of Perception and Action
    10:30 - 11:00 AM PT
    Stevie Bathiche
    Microsoft
    Ashley Llorens
    Microsoft
    Geordie Rose
    Sanctuary AI
    Ade Famoti
    Microsoft
    Andrey Kolobov
    Microsoft
    Embodied Artificial Intelligence (AI) represents a pivotal frontier in the quest to endow machines with capabilities to perceive, reason, and act in complex environments. The panel will delve into the multifaceted research landscape shaping the futur...
  • Invited Talk - Language Model Planning:
    11:00 - 11:30 AM PT
    Brian Ichter
    Physical Intelligence

    Brian Ichter is one of the founders of Physical Intelligence. At Google Brain, he pioneered work on language model planning for robotic control.

    Brian Ichter will share his thoughts on using language models for robotic control.
  • Invited Talk - Project Aria:
    Augmented Reality for Embodied AI
    11:30 AM - 12:00 NOON PT
    Speaker TBD
    Meta

    Project Aria glasses gather information from the user’s perspective for egocentric research in machine perception and augmented reality.

    Project Aria will share details on the use of Aria devices in embodied AI.
  • Lunch
    Location TBD
    12:00 NOON - 1:00 PM PT
  • Accepted Papers Poster Session
    Location TBD
    1:00 PM - 2:00 PM PT
  • Mobile Manipulation Challenge Presentations
    ManiSkill, ARNOLD, HomeRobot OVMM
    2:30 - 3:00 PM PT
    • 2:00: ManiSkill
    • 2:10: ARNOLD
    • 2:20: HomeRobot OVMM
  • Mobile Manipulation Challenge Q&A Panel
    3:00 - 3:30 PM PT
  • Invited Talk - Embodied Mobile Manipulation:
    Robotics and Embodied Artificial Intelligence
    3:30 - 4:00 PM PT
    Shuran Song
    Stanford University

    Shuran Song leads the Robotics and Embodied AI Lab at Stanford University (REAL@Stanford). She is interested in developing algorithms that enable intelligent systems to learn from their interactions with the physical world, and autonomously acquire the perception and manipulation skills necessary to execute complex tasks and assist people.

    Shuran Song will be speaking on Embodied Mobile Manipulation.
  • Invited Talk - Embodied Mobile Manipulation:
    Open Vocabulary Mobile Manipulation
    4:00 - 4:30 PM PT
    Chris Paxton
    Meta AI

    Chris Paxton is a robotics research scientist on the Embodied AI team at FAIR Labs. His work has looked at how we can make robots into useful, general-purpose mobile manipulators in homes.

    Chris will discuss his work on enabling robots to work alongside humans to perform complex, multi-step tasks, using a combination of learning and planning. In particular, he will discuss the open-vocabulary mobile manipulation challenge, or OVMM, which ...
  • Invited Talk - Humanoid Robots
    Foundation Models for Humanoid Robots
    4:30 - 5:00 PM PT
    Eric Jang
    1X Technologies

    Eric leads the AI team at 1X Technologies, a vertically-integrated humanoid robot company. His research background is on end-to-end mobile manipulation and generative models. Eric recently authored a book on the future of AI and Robotics, titled “AI is Good for You”.

    1X’s mission is to create an abundant supply of physical labor through androids that work alongside humans. I will share some of the progress 1X has been making towards general-purpose mobile manipulation. We have scaled up the number of tasks our an...
  • Invited Speaker Panel
    5:00 - 5:30 PM PT
    Claudia Pérez D’Arpino
    NVIDIA
  • Workshop Concludes
    5:30 PM PT

#

Demos

In association with the Embodied AI Workshop, our partners and sponsors will present demos; dates and times are TBD.


#

Challenges

The Embodied AI 2024 workshop is hosting many exciting challenges covering a wide range of topics such as rearrangement, visual navigation, vision-and-language, and audio-visual navigation. More details regarding data, submission instructions, and timelines can be found on the individual challenge websites.

The workshop organizers will award each first-prize challenge winner a cash prize, sponsored by Logical Robotics and our other sponsors.

Challenge winners may be given the opportunity to present during their challenge's presentation at the workshop. Since many challenges can be grouped into similar tasks, we encourage participants to submit models to more than one challenge. The table below describes, compares, and links each challenge.

| Challenge | Task | 2024 Winner | Simulation Platform | Scene Dataset | Observations | Action Space | Interactive Actions? | Stochastic Actuation? |
|---|---|---|---|---|---|---|---|---|
| ARNOLD | Language-Grounded Manipulation | TBD | Isaac Sim | Arnold Dataset | RGB-D, Proprioception | Continuous | | |
| HAZARD | Multi-Object Rescue | TBD | ThreeDWorld | HAZARD dataset | RGB-D, Sensors, Temperature | Discrete | | |
| HomeRobot OVMM | Open Vocabulary Mobile Manipulation | TBD | Habitat | OVMM Dataset | RGB-D | Continuous | | |
| ManiSkill-ViTac | Generalized Manipulation / Vision-Based Tactile Manipulation | TBD | SAPIEN | PartNet-Mobility, YCB, EGAD | RGB-D, Proprioception, Localization | Continuous / Discrete for ViTac | | |
| MultiON | Multi-Object Navigation | TBD | Habitat | HM3D Semantics | RGB-D, Localization | Discrete | | |
| PRS | Human Society Integration | TBD | PRS Environment | PRS Dataset | RGB-D, Sensors, Pose Data, Tactile Sensors | Continuous | | |

#

Call for Papers

We invite high-quality 2-page extended abstracts on embodied AI, especially in areas relevant to the themes of this year's workshop:

  • Open-World AI for Embodied AI
  • Generative AI for Embodied AI
  • Embodied Mobile Manipulation
  • Language Model Planning
as well as themes related to embodied AI in general:
  • Simulation Environments
  • Visual Navigation
  • Rearrangement
  • Embodied Question Answering
  • Embodied Vision & Language
Accepted papers will be presented as posters or spotlight talks at the workshop. These papers will be made publicly available in a non-archival format, allowing future submission to archival journals or conferences. Paper submissions do not have to be anonymized. Per CVPR rules regarding workshop papers, at least one author must register for CVPR using an in-person registration.

The submission deadline is May 4th (Anywhere on Earth). Papers should be no longer than 2 pages (excluding references) and styled in the CVPR format.

#

Sponsors

The Embodied AI 2024 Workshop is sponsored by the following organizations:

Logical Robotics
Microsoft
Project Aria

#

Organizers

The Embodied AI 2024 workshop is a joint effort by a large set of researchers from a variety of organizations. Each year, a set of lead organizers takes point coordinating with the CVPR conference, backed up by a large team of workshop organizers, challenge organizers, and scientific advisors.

Lead Organizers
Anthony Francis
Logical Robotics
Claudia Pérez D’Arpino
NVIDIA
Luca Weihs
AI2

Workshop Organizers
Ade Famoti
Microsoft
Angel X. Chang
SFU
Changan Chen
UT Austin
Chengshu Li
Stanford
David Hall
CSIRO
Devon Hjelm
Apple
Joel Jang
U Washington
Lamberto Ballan
U Padova
Matt Deitke
AI2, UW
Mike Roberts
Intel
Naoki Yokoyama
GaTech
Oleksandr Maksymets
Meta AI
Ran Gong
UCLA
Rin Metcalf
Apple
Sören Pirk
Kiel University
Yonatan Bisk
CMU

Challenge Organizers
Angel X. Chang
SFU
Baixiong Jia
BIGAI
Changan Chen
UT Austin
Chuang Gan
IBM, MIT
David Hall
CSIRO
Dhruv Batra
GaTech, Meta AI
Fanbo Xiang
UCSD
Hao Dong
PKU
Jiangyong Huang
Peking U
Jiayuan Gu
UCSD
Luca Weihs
AI2
Manolis Savva
SFU
Matt Deitke
AI2, UW
Naoki Yokoyama
GaTech
Oleksandr Maksymets
Meta AI
Ram Ramrakhya
GaTech
Richard He Bai
Apple
Roozbeh Mottaghi
FAIR, UW
Siyuan Huang
BIGAI
Sonia Raychaudhuri
SFU
Stone Tao
UCSD
Tommaso Campari
SFU, UNIPD
Unnat Jain
UIUC
Xiaofeng Gao
Amazon
Yang Liu
SRC-B
Yonatan Bisk
CMU
Zhuoqun Xu
PRS

Scientific Advisors
Alexander Toshev
Apple
Andrey Kolobov
Microsoft
Aniruddha Kembhavi
AI2, UW
Dhruv Batra
GaTech, Meta AI
German Ros
NVIDIA
Joanne Truong
GaTech
Manolis Savva
SFU
Roberto Martín-Martín
Stanford
Roozbeh Mottaghi
FAIR, UW