Monday, February 3, 2025

Operation Dinner

  Operation Thanksgiving Family Dinner

Why LLMs fail (source)






ChatGPT o1 reasoning: feasible outcome, optimized by a human


Claude Sonnet 3.5 illustration of the solution (happens to be acceptable)




What if Emily arrived at the airport at 4:30?

ChatGPT o1: OK on the first try

Sonnet 3.5 illustration (attention bias occurs; some constraints are forgotten)


What if Emily arrived at the airport at 2:30?

ChatGPT o1 reasoning: feasible outcome, optimized by a human

Sonnet 3.5 illustration (attention bias occurs; some constraints are forgotten)


What if Emily arrived at the airport at 2:30?

Objective: minimize the sum of the waiting time at the airport and the driving time
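To make the objective concrete, here is a minimal Python sketch of how such a cost could be scored. Everything in it is an assumption for illustration only: the `Plan` class, the candidate plans and their numbers are hypothetical placeholders and are not taken from the original puzzle. The 2:30 arrival is read as 14:30, and "waiting time" is taken as the gap between Emily's arrival and the driver's arrival at the airport, whichever of them gets there first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    pickup_minute: int  # when the driver reaches the airport (minutes since midnight)
    drive_minutes: int  # total driving time required by this plan


def cost(plan: Plan, arrival_minute: int) -> int:
    """Waiting time at the airport plus driving time, in minutes."""
    # Whoever reaches the airport first waits for the other (assumption).
    waiting = abs(plan.pickup_minute - arrival_minute)
    return waiting + plan.drive_minutes


def best_plan(plans, arrival_minute):
    """Return the (plan, cost) pair with the smallest total cost."""
    scored = [(p, cost(p, arrival_minute)) for p in plans]
    return min(scored, key=lambda pair: pair[1])


# Hypothetical candidates; the numbers are made up for illustration.
ARRIVAL = 14 * 60 + 30  # 2:30, assumed to mean 14:30
PLANS = [
    Plan("direct", pickup_minute=14 * 60 + 45, drive_minutes=60),
    Plan("after errand", pickup_minute=15 * 60 + 30, drive_minutes=50),
    Plan("early", pickup_minute=14 * 60, drive_minutes=70),
]

best, best_cost = best_plan(PLANS, ARRIVAL)
print(best.name, best_cost)
```

A brute-force comparison like this only works when the candidate plans can be listed by hand. The full puzzle also has other constraints, which would have to be checked before a plan is scored.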

