I definitely think that’s remarkable. But I don’t think scoring high on an external measure like a test is enough to prove the ability to reason. For reasoning, the process matters, IMO.
Reasoning models work by Chain-of-Thought which has been shown to provide some false reassurances about their process https://arxiv.org/abs/2305.04388 .
Maybe passing some math test is enough evidence for you but I think it matters what’s inside the box. For me it’s only proved that tests are a poor measure of the ability to reason.
I’m sorry, but this reads to me like “I am certain I am right, so evidence that implies I’m wrong must be wrong.” And while sometimes that really is the right approach to take, more often than not you really should update the confidence in your hypothesis rather than discarding contradictory data.
But, there must be SOMETHING which is a good measure of the ability to reason, yes? If reasoning is an actual thing that actually exists, then it must be detectable, and there must be a way to detect it. What benchmark do you purpose?
You don’t have to seriously answer, but I hope you see where I’m coming from. I assume you’ve read Searle, and I cannot express to you the contempt in which I hold him. I think, if we are to be scientists and not philosophers (and good philosophers should be scientists too) we have to look to the external world to test our theories.
For me, what goes on inside does matter, but what goes on inside everyone everywhere is just math, and I haven’t formed an opinion about what math is really most efficient at instantiating reasoning, or thinking, or whatever you want to talk about.
To be honest, the other day I was convinced it was actually derivatives and integrals, and, because of this, that analog computers would make much better AIs than digital computers. (But Hava Siegelmann’s book is expensive, and, while I had briefly lifted my book buying moratorium, I think I have to impose it again).
Hell, maybe Penrose is right and we need quantum effects (I really really really doubt it, but, to the extent that it is possible for me, I try to keep an open mind).
I’m not sure I can give a satisfying answer. There are a lot of moving parts here, and a big issue here is definitions which you also touch upon with your reference to Searle.
I agree with the sentiment that there must be some objective measure of reasoning ability. To me, reasoning is more than following logical rules. It’s also about interpreting the intent of the task. The reasoning models are very sensitive to initial conditions and tend to drift when the question is not super precise or if they don’t have sufficient context.
The AI models are in a sense very fragile to the input. Organic intelligence on the other hand is resilient and also heuristic. I don’t have any specific idea for the test, but it should test the ability to solve a very ill-posed problem.
I definitely think that’s remarkable. But I don’t think scoring high on an external measure like a test is enough to prove the ability to reason. For reasoning, the process matters, IMO.
Reasoning models work by Chain-of-Thought which has been shown to provide some false reassurances about their process https://arxiv.org/abs/2305.04388 .
Maybe passing some math test is enough evidence for you but I think it matters what’s inside the box. For me it’s only proved that tests are a poor measure of the ability to reason.
I’m sorry, but this reads to me like “I am certain I am right, so evidence that implies I’m wrong must be wrong.” And while sometimes that really is the right approach to take, more often than not you really should update the confidence in your hypothesis rather than discarding contradictory data.
But, there must be SOMETHING which is a good measure of the ability to reason, yes? If reasoning is an actual thing that actually exists, then it must be detectable, and there must be a way to detect it. What benchmark do you purpose?
You don’t have to seriously answer, but I hope you see where I’m coming from. I assume you’ve read Searle, and I cannot express to you the contempt in which I hold him. I think, if we are to be scientists and not philosophers (and good philosophers should be scientists too) we have to look to the external world to test our theories.
For me, what goes on inside does matter, but what goes on inside everyone everywhere is just math, and I haven’t formed an opinion about what math is really most efficient at instantiating reasoning, or thinking, or whatever you want to talk about.
To be honest, the other day I was convinced it was actually derivatives and integrals, and, because of this, that analog computers would make much better AIs than digital computers. (But Hava Siegelmann’s book is expensive, and, while I had briefly lifted my book buying moratorium, I think I have to impose it again).
Hell, maybe Penrose is right and we need quantum effects (I really really really doubt it, but, to the extent that it is possible for me, I try to keep an open mind).
🤷♂️
I’m not sure I can give a satisfying answer. There are a lot of moving parts here, and a big issue here is definitions which you also touch upon with your reference to Searle.
I agree with the sentiment that there must be some objective measure of reasoning ability. To me, reasoning is more than following logical rules. It’s also about interpreting the intent of the task. The reasoning models are very sensitive to initial conditions and tend to drift when the question is not super precise or if they don’t have sufficient context.
The AI models are in a sense very fragile to the input. Organic intelligence on the other hand is resilient and also heuristic. I don’t have any specific idea for the test, but it should test the ability to solve a very ill-posed problem.