Having been a member of the robot learning community both in grad school and now in industry, I'd like to give credit where it's due, since TRI seems to be receiving most of the praise (deservedly so, I'll agree wholeheartedly):
The core of these advancements is powered by Diffusion Policy [1], which Prof. Shuran Song's lab at Columbia (before her recent move to Stanford) developed and pioneered. I'd suggest everyone view the original project website [2]; it has a ton of amazing, challenging real-world experiments.
It was a community favorite for the Best Paper Award at the R:SS conference [3] this year. I remember our lab (and every other learning lab in our robotics department) absolutely dissecting this paper. I know people who have pivoted entirely away from their behavior cloning/imitation learning projects to this approach, which handles multi-modal action distributions much more naturally than the aforementioned approaches.
Prof. Song is an absolute rockstar in robotics right now, with several wonderful approaches that scale elegantly to the real world, including IRP [4] (which won Best Paper at R:SS 2022), FlingBot [5], Scaling Up and Distilling Down [6], and more. I recommend checking out her lab website too.
To be fair, they do credit Professor Song and the paper you linked. TRI is also listed as a collaborator on the paper.
> Diffusion Policy: TRI and our collaborators in Professor Song’s group at Columbia University developed a new, powerful generative-AI approach to behavior learning. This approach, called Diffusion Policy, enables easy and rapid behavior teaching from demonstration.
I haven't read the Diffusion Policy paper yet, so I don't know what they do differently. But I can ELI5 image diffusion models, like Stable Diffusion. Essentially, you add random noise to an image and then ask the model to predict that noise, such that if you subtract the noise the model predicts, you recover the original image. Once the model has been trained well enough on this noise-removal task, you can feed it pure random noise, ask it to predict the noise in that noise-only image, subtract a little bit of what it suggests, and repeat. After many such steps, all the noise is removed and you end up with an image "dreamed" by the model from random noise. You can also condition the noise removal on things like text or other images to guide the process toward a particular target.
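That iterative "predict the noise, subtract a bit of it, repeat" loop can be sketched in a few lines. This is a toy illustration only, not the actual Stable Diffusion or Diffusion Policy sampler: `predict_noise` here is a hypothetical stand-in for the trained neural network, and the update rule is deliberately simplified (real samplers use a learned noise schedule and re-inject some noise at each step).

```python
import numpy as np

def predict_noise(noisy_image, t):
    # Stand-in for a trained denoising network (e.g. a U-Net).
    # Here we just pretend a fixed fraction of the current signal is noise,
    # purely so the loop below is runnable.
    return noisy_image * 0.1

def sample(image_shape=(8, 8), steps=50, seed=0):
    """Start from pure Gaussian noise and iteratively remove predicted noise."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(image_shape)   # the "noise only" starting image
    for t in reversed(range(steps)):
        eps = predict_noise(x, t)          # model's guess of the noise present
        x = x - eps                        # remove a little of that noise
    return x                               # the "dreamed" image
```

Conditioning (on text, other images, etc.) would simply mean passing that extra information into `predict_noise` so the predicted noise steers each step toward the target.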
You may as well credit the information theorists, mathematicians, and physicists who laid out the fundamentals that brought us here.
They died before hardware caught up with their decades-old visions. Not much of this work is net-new description; it's more a matter of reconciling old descriptions with observation, now that we can actually build the old ideas.
[1] - https://arxiv.org/abs/2303.04137
[2] - https://diffusion-policy.cs.columbia.edu/
[3] - https://roboticsconference.org/program/awards/
[4] - https://irp.cs.columbia.edu/
[5] - https://flingbot.cs.columbia.edu/
[6] - https://www.cs.columbia.edu/~huy/scalingup/