specifically their speech demo video (which is, of course, a demo video)
https://youtu.be/Sq1QZB5baNw
https://www.1x.tech/neo and
https://www.unitree.com/h1/
are undoubtedly using such models.
It's an area of active research, e.g.:
https://www.physicalintelligence.company/blog/pi0
https://wholebody-b1.github.io/
https://ok-robot.github.io/
https://mobile-aloha.github.io/