3 Comments
Eskimo1

Why can’t they train it to get some “points” higher than zero when it admits it just doesn’t know the answer?

JaziTricks

I've found bridge (the card game) analysis to be another case where LLMs fail.

I've given 4o and o3 bridge questions, and they failed miserably. 4o couldn't even get the cards right from a screenshot; o3 did, but its idea of how to play bridge was impossible.

We've had very decent bridge-playing software for many, many years, though.

Carolyn Meinel

I remain concerned about calling that "think" function "reasoning." https://www.merriam-webster.com/dictionary/reasoning By comparison, see Jeremy Lichtman here: https://bestworld.net/videos

Also, that "think" function has been around for a while; it isn't novel to o3. Here's an example dated March 31, 2025: https://bestworld.net/canada-election-march-31

Once a true reasoning function is added to a generative large language model, we will likely be in big trouble. So perhaps we should be cautious about accepting marketing-speak claims of "reasoning."
