Stubsack: weekly thread for sneers not worth an entire post, week ending Sunday 30 June 2024
sinedpick (@sinedpick@awful.systems) · Posts: 1 · Comments: 235 · Joined 2 yr. ago
I tried using Claude 3.5 Sonnet and... it's actually not bad. Can someone please come up with a simple logic puzzle that it fails abysmally on so I can feel better? It passed the "nonsense river challenge" and the "how many sisters does the brother have" tests, both of which fooled GPT-4.