
We definitely need a way to rate these systems so we can have better expectations.

An IQ test for language models?



The problem with IQ in human modeling is 1) it's just a one-dimensional number, and 2) it changes as the average human gets smarter or dumber.

However we rate these systems in the future, we must not repeat the mistakes of the past and think single-number solutions are good for anything.

For example you can have an exceptionally 'intelligent' system that is misaligned with human intention.
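To illustrate point 2, here is a minimal sketch of how a normed score moves when only the reference group changes. It assumes the common convention of scaling to a mean of 100 and a standard deviation of 15; the function name `normed_score` and both cohorts are made up for illustration, not taken from any real test.

```python
from statistics import mean, pstdev

def normed_score(raw, reference_raw_scores, center=100.0, spread=15.0):
    """Map a raw test score onto an IQ-style scale relative to a reference group.

    Assumption: scores are scaled so the reference group has mean `center`
    and standard deviation `spread` (100 and 15 here).
    """
    mu = mean(reference_raw_scores)
    sigma = pstdev(reference_raw_scores)
    if sigma == 0:
        raise ValueError("reference group has no spread")
    return center + spread * (raw - mu) / sigma

# Hypothetical raw scores for two reference groups (made-up numbers).
older_cohort = [40, 45, 50, 55, 60]
newer_cohort = [45, 50, 55, 60, 65]

# The same raw performance, scored against each group.
same_raw = 55
score_vs_older = normed_score(same_raw, older_cohort)
score_vs_newer = normed_score(same_raw, newer_cohort)
```

The raw performance is identical in both calls, but the scaled score is lower against the stronger cohort, so the number describes rank within a group rather than an absolute ability.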


Some sort of general knowledge and skills assessment, graded on accuracy. Questions / tasks get increasingly abstract until they become almost subjective.
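A rough sketch of what that grading could look like, assuming each question is tagged with a difficulty tier and each answer has already been marked right or wrong. The tier labels, the `accuracy_by_tier` helper, and the sample data are all hypothetical.

```python
from collections import defaultdict

def accuracy_by_tier(results):
    """Return {tier: accuracy} from an iterable of (tier, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for tier, is_correct in results:
        totals[tier] += 1
        if is_correct:
            correct[tier] += 1
    return {tier: correct[tier] / totals[tier] for tier in sorted(totals)}

# Hypothetical graded answers; tier 1 is concrete, tier 3 is the most abstract.
graded = [(1, True), (1, True), (2, True), (2, False), (3, False), (3, False)]
profile = accuracy_by_tier(graded)
```

Reporting one accuracy per tier keeps a profile instead of collapsing everything into a single score. The nearly subjective top tiers would still need some agreed way to mark an answer correct, which this sketch does not address.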


GMAT? SATs? An ML-flavored Jeopardy test?


Seems like a fool's errand.


Isn't IQ just the size of short-term memory and processing speed?


No, because that would mean that anyone with lots of time and a notepad could become (a slow version of) Einstein.


I'm not convinced they couldn't. Depends what you mean by Einstein. You won't be formulating GR, but an IQ test could be doable.

At least if you have enough IQ to figure out how to solve IQ test problems on paper. Which shouldn't be that hard.


IQ tests are timed. Not everyone could be a slow Einstein, but perhaps, if you had 200-300 years, you might reach the same solutions Einstein did, provided you chose to work on the same problems.


If you are a 5' tall basketball amateur, then even with 300 years of training you won't outplay the top NBA player.


It's different because you can't outplay the top NBA player slowly. (You can compute slowly, though.)


Maybe the correlation isn't linear. Or Einstein's USP wasn't just his IQ.


It is more like the ability to make sense of things.

But intelligence is hard to measure. Always plenty of room for everyone to disagree.


I see.

I just heard about a test using a box with lights and buttons, where pressing the buttons faster correlated with higher IQ.


What you’re describing sounds like a way to measure reaction time. By that measure I suspect gamers would rank highest.


It was set up so that you had to work out the right button to press after the lights lit up.

And that time was correlated with IQ.



