
I think this whole “AGI” thing is so badly defined that we may as well say we already have it. It already passes the Turing test and does well on tons of subjects.

What we can start building now is agents and integrations. Building blocks like a panel of expert agents gaming things out, exploring the search space in a Monte Carlo Tree Search way, and remembering what works.
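To make the "exploring space in a Monte Carlo Tree Search way, and remembering what works" idea concrete, here is a minimal MCTS sketch (select → expand → simulate → backpropagate) on a toy problem. The problem, node layout, and all names are hypothetical, not anything from a real agent framework:

```python
import math, random

# Toy problem: pick bits one at a time to maximize the number of 1s in a
# length-5 sequence. A stand-in for any "gaming things out" search.
DEPTH = 5

class Node:
    def __init__(self, state, parent=None):
        self.state = state      # tuple of bits chosen so far
        self.parent = parent
        self.children = {}      # action (0 or 1) -> Node
        self.visits = 0
        self.value = 0.0        # running total of rollout rewards

def ucb(child, parent_visits, c=1.4):
    # Upper Confidence Bound: trades off exploitation vs. exploration.
    if child.visits == 0:
        return float("inf")
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))

def rollout(state):
    # Random playout to a full sequence, scored as the fraction of 1s.
    s = list(state)
    while len(s) < DEPTH:
        s.append(random.choice((0, 1)))
    return sum(s) / DEPTH

def mcts(iterations=2000):
    root = Node(())
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCB while fully expanded.
        while len(node.children) == 2 and len(node.state) < DEPTH:
            node = max(node.children.values(),
                       key=lambda ch: ucb(ch, node.visits))
        # 2. Expansion: add one untried child if not terminal.
        if len(node.state) < DEPTH:
            action = 0 if 0 not in node.children else 1
            node.children[action] = Node(node.state + (action,), node)
            node = node.children[action]
        # 3. Simulation.
        reward = rollout(node.state)
        # 4. Backpropagation: this is the "remembering what works" part.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Best first action = most-visited child of the root.
    return max(root.children, key=lambda a: root.children[a].visits)

print(mcts())
```

The search concentrates visits on the branch that keeps choosing 1, so the root's most-visited child is action 1. Real agent systems would replace the toy rollout with an LLM evaluating a candidate plan.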

Robots are only constrained by mechanical servos now. When they can do something, they’ll be able to do everything. It will happen gradually then all at once. Because all the tasks (cooking, running errands) are trivial for LLMs. Only moving the limbs and navigating the terrain safely is hard. That’s the only thing left before robots do all the jobs!



Well, kinda, but if you built a robot to efficiently mow lawns, it's still not going to be able to do the laundry.

I don't see how "when they can do something, they'll be able to do everything" can be true. We build robots that are specialised at specific roles, because it's massively more efficient to do that. A car-welding robot can weld cars together at a rate that a human can't match.

We could train an LLM to drive a Boston Dynamics kind of anthropomorphic robot to weld cars, but it will be more expensive and less efficient than the specialised car-welding robot, so why would we do that?


If a humanoid robot is able to move its limbs and digits with the same dexterity as a human, and maintain balance and navigate obstacles, and gently carry things, everything else is trivial.

Welding. Putting up shelves. Playing the piano. Cooking. Teaching kids. Disciplining them. By being in 1 million households and being trained on more situations than a human, every single one of these robots would have skills exceeding humans very quickly. Including parenting skills. Within a year or so. Many parents will just leave their kids with them and a generation will grow up preferring bots to adults. The LLM technology is the same for learning the steps, it's just the motor skills that are missing.

OK, these robots won't be able to run and play soccer or do somersaults yet. But really, the hardest part is the acrobatics and locomotion etc., NOT the know-how for completing tasks with them.


But that's the point - we don't build robots that can do a wide range of tasks with ease. We build robots that can do single tasks super-efficiently.

I don't see that changing. Even the industrial arm robots that are adaptable to a range of tasks have to be configured to the task they are to do, because it's more efficient that way.

A car-welding robot is never going to be able to mow the lawn. It just doesn't make financial sense to do that. You could, possibly, have a single robot chassis that can then be adapted to weld cars, mow the lawn, or do the laundry, I guess that makes sense. But not as a single configuration that could do all of those things. Why would you?


> But that's the point - we don't build robots that can do a wide range of tasks with ease. We build robots that can do single tasks super-efficiently.

Because we don't have AGI yet. When AGI is here those robots will be priority number one, people already are building humanoid robots but without intelligence to move it there isn't much advantage.


quoting the ggggp of this comment:

> I think this whole “AGI” thing is so badly defined that we may as well say we already have it. It already passes the Turing test and does well on tons of subjects.

The premise of the argument we're disputing is that waiting for AGI isn't necessary and we could run humanoid robots with LLMs to do... stuff.


I meant deep neural networks with transformer architecture, and self-attention so they can be trained using GPUs. Doesn't have to be specifically "large language" models necessarily, if that's your hangup.


>Exploring space in a Monte Carlo Tree Search way, and remembering what works.

The information space of "research" is far larger than the information space of image recognition or language, probably larger than our universe itself; it's tantamount to formalizing the entire world. Such an act would be akin to touching "God", in some sense of finding the root of knowledge.

In more practical terms, when it comes to formal systems there is a tradeoff between power and expressiveness. Category Theory, Set Theory, etc. are strong enough to theoretically capture everything, but are far too abstract to use in a practical sense with respect to our universe. The systems that we do have, aka expert systems or knowledge representation systems like First Order Predicate Logic, aren't strong enough to fully capture reality.

Most importantly, the information space has to be fully defined by researchers here; that's the real meat of the research, beyond the engineering of specific approaches to explore that space. But in any case, how many people in the world are both capable of and actually working on such problems? This is highly foundational mathematics and philosophy; the engineers don't have the tools here.


??? how do you know cooking (!) is trivial for an llm. that doesn't make any sense


Because the recipes and the adjustments are trivial for an LLM to execute: remembering things, being trained on tasks at 1000 sites at once, sharing the knowledge among all the robots, etc.

The only hard part is moving the limbs and handling the fragile eggs etc.

But it's not just cooking, it's literally anything that doesn't require extreme agility (sports) or dexterity (knitting etc). From folding laundry to putting together furniture, cleaning the house and everything in between. It would be able to do 98% of the tasks.


It’s not going to know what tastes good by being able to regurgitate recipes from 1000s of sites. Most of those recipes are absolute garbage. I’m going to guess you don’t cook.

Also how is an LLM going to fold laundry?


the llm would be the high-level system that runs the simulations to create and optimize the control algos for the robotic systems.


ok. what evidence is there that LLMs have already solved cooking? how does an LLM today know when something is burning or how to adjust seasoning to taste or whatever. this is total nonsense


It's easy. You can detect if something is burning in many different ways, from compounds in the air, to visual inspection. People with not great smell can do it.

As far as taste, all that kind of stuff is just another form of RLHF, training preferences over millions of humans, in situ. Assuming the ingredients (e.g. parsley) taste more or less the same across supermarkets, it's just a question of amounts, and preparation.
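For what "training preferences over millions of humans" could even mean here, a minimal sketch is a Bradley-Terry-style fit over pairwise comparisons, which is the statistical core of RLHF reward modeling. All option names and the comparison data below are made up for illustration:

```python
import math

# Hypothetical pairwise taste judgments: (winner, loser) pairs comparing
# seasoning amounts. Bradley-Terry assigns each option a latent score so
# that P(a beats b) = sigmoid(score[a] - score[b]).
comparisons = [("2g_salt", "1g_salt"), ("2g_salt", "4g_salt"),
               ("2g_salt", "1g_salt"), ("1g_salt", "4g_salt"),
               ("2g_salt", "4g_salt")]

options = sorted({x for pair in comparisons for x in pair})
score = {o: 0.0 for o in options}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Gradient ascent on the log-likelihood of the observed comparisons.
lr = 0.1
for _ in range(500):
    for winner, loser in comparisons:
        p = sigmoid(score[winner] - score[loser])
        g = 1.0 - p              # gradient of log P(winner beats loser)
        score[winner] += lr * g
        score[loser] -= lr * g

print(max(score, key=score.get))  # prints "2g_salt"
```

"2g_salt" wins the most comparisons, so it ends up with the highest latent score. In actual RLHF the scores come from a neural reward model rather than a per-option table, and the preferences come from human raters.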


do you know that LLMs operate on text and don't have any of the sensory input or relevant training data? you're just handwaving away 99.9% of the work and declaring it solved. of course what you're talking about is possible, but you started this by stating that cooking is easy for an LLM and it sounds like you're describing a totally different system which is not an LLM


You know nothing about cooking.



