Sunday, February 2, 2025

River crossing puzzle 3

 




(fails to solve. Helps illustrate.)

Why LLMs perform poorly at reasoning and planning



artifact (js)






Monologue is best.


reasoning mode in o1



Talking to itself



Running into difficulties? Cute



hours? Ha Ha Ha











How about relaxing one of the constraints:

a solution exists.





Trial 1: Check o1's reasoning with a Claude artifact using REACT (fails)


Trial 2: ChatGPT o1 reasoning, then Claude verification using REACT (it works, but rarely)


Trial 3: o1


Use Claude to generate A* in Python

Python code
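The generated code itself is not reproduced in this post, so here is a minimal sketch of what an A* solver for a river crossing puzzle can look like. It is not the code Claude produced. Since the exact rules of puzzle 3 are not spelled out here, the sketch assumes the classic farmer, wolf, goat and cabbage rules as a stand-in; the names `is_safe`, `neighbors`, `heuristic` and `astar` are all hypothetical.

```python
import heapq

# ASSUMPTION: classic wolf/goat/cabbage rules, used only as a stand-in puzzle.
# A state is a tuple of bank indices (0 = left, 1 = right), one per item.
ITEMS = ("farmer", "wolf", "goat", "cabbage")
START = (0, 0, 0, 0)
GOAL = (1, 1, 1, 1)


def is_safe(state):
    """A state is unsafe if the goat is left with the wolf or the cabbage
    while the farmer is on the other bank."""
    farmer, wolf, goat, cabbage = state
    if wolf == goat and goat != farmer:
        return False
    if goat == cabbage and goat != farmer:
        return False
    return True


def neighbors(state):
    """Yield (next_state, cargo) for every legal crossing.
    cargo == "farmer" means the farmer crosses alone."""
    farmer = state[0]
    for i, name in enumerate(ITEMS):
        if state[i] != farmer:
            continue  # the farmer can only take something from his own bank
        nxt = list(state)
        nxt[0] = 1 - farmer
        nxt[i] = 1 - farmer
        nxt = tuple(nxt)
        if is_safe(nxt):
            yield nxt, name


def heuristic(state):
    """Admissible estimate: the boat carries at most one item per crossing,
    so at least one crossing is needed per item still on the left bank."""
    return sum(1 for side in state[1:] if side == 0)


def astar(start=START, goal=GOAL):
    """Return the shortest list of cargo names, or None if no solution exists."""
    frontier = [(heuristic(start), 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, cost, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        if cost > best_cost.get(state, float("inf")):
            continue  # stale queue entry
        for nxt, cargo in neighbors(state):
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(nxt), new_cost, nxt, path + [cargo]),
                )
    return None


if __name__ == "__main__":
    for step, cargo in enumerate(astar(), 1):
        print(step, cargo)
```

The state space here is tiny (at most 16 states), so plain breadth-first search would also do. The point of the sketch is that an explicit search never "forgets" a constraint, because every candidate state passes through `is_safe`.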


Colab




Verify by visualization
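The visualization in this post is graphical. As a rough text-only equivalent, the sketch below replays a list of moves step by step, rejects any move that breaks a rule, and returns one line per state showing who is on which bank. It again assumes the classic farmer, wolf, goat and cabbage rules as a stand-in for the actual puzzle, and `render` and `replay` are hypothetical helper names.

```python
# ASSUMPTION: classic wolf/goat/cabbage rules, used only as a stand-in puzzle.
ITEMS = ("farmer", "wolf", "goat", "cabbage")


def is_safe(state):
    """Unsafe if the goat shares a bank with the wolf or cabbage, farmer absent."""
    farmer, wolf, goat, cabbage = state
    if wolf == goat and goat != farmer:
        return False
    if goat == cabbage and goat != farmer:
        return False
    return True


def render(state):
    """One text frame: left bank, river, right bank."""
    left = [name for name, side in zip(ITEMS, state) if side == 0]
    right = [name for name, side in zip(ITEMS, state) if side == 1]
    return f"{' '.join(left):<26}|~~~| {' '.join(right)}"


def replay(moves):
    """Replay cargo names ("farmer" = crossing alone) from the all-left start.
    Returns the list of frames, or raises ValueError at the first bad step."""
    state = (0, 0, 0, 0)
    frames = [render(state)]
    for step, cargo in enumerate(moves, 1):
        i = ITEMS.index(cargo)  # raises ValueError for an unknown name
        if state[i] != state[0]:
            raise ValueError(f"step {step}: {cargo} is not on the farmer's bank")
        nxt = list(state)
        nxt[0] = 1 - state[0]  # the farmer always crosses
        nxt[i] = nxt[0]        # and the cargo goes with him
        state = tuple(nxt)
        if not is_safe(state):
            raise ValueError(f"step {step}: unsafe after moving {cargo}")
        frames.append(render(state))
    if state != (1, 1, 1, 1):
        raise ValueError("not everyone reached the right bank")
    return frames


if __name__ == "__main__":
    plan = ["goat", "farmer", "wolf", "goat", "cabbage", "farmer", "goat"]
    print("\n".join(replay(plan)))
```

A checker like this is useful for any proposed solution, whether it comes from A* or from an LLM's reasoning trace: a plan that quietly breaks a constraint fails at a specific step instead of merely looking plausible.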



