
Asimov's three laws as the final policy when making decisions should sort this problem out. This assumes that the rules cannot be changed by the AI.


> This assumes that the rules cannot be changed by the AI.

And sidesteps the fact that many of Asimov's stories were precisely about robots finding ways around these rules :)


I've always been very skeptical that it would even be possible to take something sufficiently complex to be considered an AGI and hard-code anything like the three laws into it.


By the same token, I’m extremely suspicious of the idea that such a sufficiently complex AGI could also be dumb enough to optimize for paper clip production at the expense of all life on earth (or w/e example).


...and many would say that's because we humans are bad at imagining optimizing agents without anthropomorphizing them. This is a reasonable, even typical suspicion that many people share! The best explanation I know of why it's unfortunately wrong is by Robert Miles in a video, but if you prefer a more thorough treatment, you could also read about "instrumental convergence" directly. If you find a flaw in this idea, I'd be interested to hear about it! :)

Robert Miles’ video: https://youtu.be/ZeecOKBus3Q

Instrumental Convergence: https://arbital.com/p/instrumental_convergence/

Now afaik nothing in this argument says that we can't find a way to control this with a more complex formalism, but we clearly haven't done so yet.


Sorry, just saw this. I think it’s his assumption that an AGI will act strictly as an agent that’s flawed. It requires imagining an agent that can make inferences from context, evaluate new and unfamiliar information, form original plans, execute them with all the complexity implied by interaction with the real world, reprogram itself, essentially do anything... except evaluate its own terminal goal. That’s written in stone, gotta make more paperclips. The argument assumes almost unlimited power and potential on the one hand, and bizarre, arbitrary constraints on the other.

If you assume an AGI is incapable of asking “why” about its terminal goal, you have to assume it’s incapable of asking “why” in any context. Miles’ AGI has no power of metacognition, but is still somehow able to reprogram itself. This really isn’t compatible with “general intelligence” or the powers that get ascribed to imaginary AGIs.

I’m certainly no expert, but I expect there will turn out to be something like the idea of Turing-completeness for AI. Just like any general computing machine is a computer, any true AGI will be sapient. You can’t just arbitrarily pluck a part out, like “it can’t reason about its objective”, and expect it to still function as an AGI, just like you can’t say “it’s Turing complete, except it can’t do any kind of conditional branching.” EDIT better example: “it’s Turing complete, but it can’t do bubble sort.”

This intuition may be wrong, but it's just as much an assumption as Miles' argument.

I’m also not ascribing morality to it: we have our share of psychopaths, and intelligence doesn’t imply empathy. AGI may very well be dangerous, just probably not the “mindlessly make paperclips” kind.



