
Question: when you say "I can't see why you couldn't use real numbers to represent utility", does your reasoning for that have anything to do with Dedekind cuts, Cauchy sequences, or complete ordered fields? Because that's what the real numbers _are_. If your reasoning has nothing to do with these sorts of things, then it can't possibly be sound, because in order to argue that X has such-and-such property, you need to know what X actually _is_.

To repeat an example I posted for someone else: Suppose there's something called a "superdollar". If you have a superdollar, you can use it to create an arbitrary number of dollars for yourself, any time you want, which you can trade for goods and services. If you want, you can also trade the superdollar itself. Now picture an environment with two buttons, one of which always rewards you one dollar, and the other of which always rewards you one superdollar. To shoehorn this environment into traditional RL, you'd have to assign the superdollar button some finite reward, say a million. But then you would mislead the traditional RL agent into thinking a million dollars was as good as one superdollar, which clearly is not true.
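A rough Python sketch of the two-button example, in case it helps. The names here (`SUPERDOLLAR_AS_SCALAR`, `scalar_return`, `lexicographic_return`) are made up for illustration, and the tuple encoding is just one possible non-real-valued alternative, not something the comment above specifies.

```python
# Hypothetical finite reward assigned to the superdollar button ("say a million").
SUPERDOLLAR_AS_SCALAR = 1_000_000


def scalar_return(dollar_presses, super_presses):
    """Total return when a superdollar is scored as a large finite number."""
    return dollar_presses * 1 + super_presses * SUPERDOLLAR_AS_SCALAR


def lexicographic_return(dollar_presses, super_presses):
    """Assumed alternative: a (superdollars, dollars) pair.

    Python compares tuples left to right, so any superdollar count
    outranks any number of plain dollars.
    """
    return (super_presses, dollar_presses)


# With the scalar encoding, enough dollar presses look better than one superdollar.
scalar_prefers_dollars = scalar_return(1_000_001, 0) > scalar_return(0, 1)

# With the pair encoding, no number of dollar presses beats one superdollar.
lex_prefers_dollars = lexicographic_return(1_000_001, 0) > lexicographic_return(0, 1)

print(scalar_prefers_dollars, lex_prefers_dollars)
```

Raising the constant only moves the crossover point: whatever finite value is chosen, some number of dollar presses exceeds it under the scalar encoding.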



Good example, although what if you just assigned it a reward of like 100 trillion dollars? It might not be exactly correct, but then you're assuming that exactly correct rewards are required for AGI, which seems like a pretty big assumption.

Actually I thought about this some more, and maybe money wasn't the best example, but I think there must be some internal measure of utility that humans use that can be represented by real numbers.

Imagine you are presented with an array of possible actions with associated (possibly estimated) rewards. You can only pick one. Maybe there are some doors but you can only open one: behind the first is $1m, behind the second is a superdollar, behind the third is a button that cures world hunger, behind the fourth is your loving family, whatever.

As a human I can pick one, no matter what the rewards are, even if one reward is "you essentially become God". That means I can order them, and therefore that they can be represented by real numbers (plus infinity for the God option).

I don't see why the infinity would cause an issue: the "you can now do literally anything" reward is worth more than every other reward, but it's the only one. Also it doesn't actually exist so who cares?

Actually I guess it can exist in games, e.g. God mode in Quake. But that should have an infinite reward and agents should choose it over everything else so I can't see the problem really.
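To make that concrete, here is a minimal Python sketch of the "real numbers plus infinity" idea. The option labels and the `rewards` dict are placeholders I made up; the 100 trillion figure is the one suggested earlier in this comment, and using `math.inf` for the God option is an assumption about how you'd encode it.

```python
import math

# Placeholder rewards: finite values for ordinary options, infinity for the God option.
rewards = {
    "door 1: $1m": 1e6,
    "door 2: superdollar": 1e14,  # the "100 trillion dollars" guess
    "God option": math.inf,       # assumed encoding of "you can now do literally anything"
}

# An agent that picks the option with the highest reward.
best = max(rewards, key=rewards.get)
print(best)
```

Since `math.inf` compares greater than every finite float, the agent picks the God option regardless of the finite values assigned to the other options.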



