Sunday, February 2, 2025

ChatGPT o1 thinking through river crossing puzzle 3

 




(o1 fails to solve it, which helps illustrate the point below.)

Why LLMs perform poorly at reasoning and planning



artifact (js)






Monologue is best.


Reasoning mode in o1



Talking to itself



Running into difficulties? Cute



hours? Ha Ha Ha











How about relaxing one of the constraints:

A solution exists.

Trial 1: Checking o1's reasoning with a Claude artifact built with React. It fails.


Trial 2: ChatGPT o1 reasoning, with Claude verification using React. (It works, but rarely.)
