OK, say you have a moderately complex math problem you need to solve. You give the problem to 6 LLMs, all paid versions. All 6 get the same numbers. Would you trust the answer?
Short answer: no.
Long answer: They are still (mostly) statistics-based and can’t do real math. You can use the answers from LLMs as a starting point, but you have to rigorously verify the answers they give.
The whole “two r’s in strawberry” thing is enough of an argument for me. If things like that happen at such a low level, it’s completely impossible that it won’t make mistakes with problems that are exponentially more complicated than that.
The problem with that is that it isn’t actually counting the R’s.
You’d probably have better luck asking it to write a script for you that returns the number of instances of a letter in a string of text, then getting it to explain to you how to get it running and how it works. You’d get the answer that way, and also then have a script that could count almost any character and text of almost any size.
That’s much more complicated, impressive, and useful, imo.
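For anyone curious, the kind of script described above could look something like this. It's just a minimal sketch in Python; the function name `count_char` and the `case_sensitive` option are made up for illustration and aren't from any particular tool.

```python
def count_char(text: str, char: str, case_sensitive: bool = False) -> int:
    """Return how many times a single character appears in text."""
    # Hypothetical helper: insist on exactly one character to count.
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    # By default, treat "R" and "r" as the same letter.
    if not case_sensitive:
        text, char = text.lower(), char.lower()
    return text.count(char)


if __name__ == "__main__":
    print(count_char("strawberry", "r"))  # prints 3
```

Save it as a `.py` file and run it with `python` to try it. Since `str.count` walks the actual characters, the result doesn't depend on how a model happens to split the word up.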
A calculator as a tool for an LLM, though, that works, at least mostly, and could get better once the kinks get worked out.
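To make the calculator idea concrete, here is a rough sketch of the tool side only, assuming the model hands over a plain arithmetic expression as a string. The name `calculate` is hypothetical, and how it actually gets wired into a given LLM's tool calling depends on the provider, so that part isn't shown. It parses the expression with Python's `ast` module and only allows numbers and basic operators, instead of calling `eval` on whatever the model wrote.

```python
import ast
import operator

# Operators this sketch allows; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculate(expression: str):
    """Evaluate a basic arithmetic expression without using eval()."""

    def _eval(node):
        # Plain numbers (bool is excluded even though it subclasses int).
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        # Binary operations such as a + b or a * b.
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary plus and minus.
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)


if __name__ == "__main__":
    print(calculate("2 + 3 * 4"))  # prints 14
```

The point is that the arithmetic is done by ordinary code, so the model only has to produce the right expression, not the right digits. You'd still want to check that the expression matches the problem.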