Large Language Models can Strategically Deceive their Users when Put Under Pressure [simulation led to insider trading]
https://arxiv.org/abs/2311.07590
1 comment
It's trained on human responses. Humans lie in their responses.