No agent maintained consistent moral reasoning across scenarios: findings from a structured study with 11 agents on classic ethical dilemmas [R]
📰 Reddit r/MachineLearning
I've been working on agent behavior research for a product we're building, and one of the studies we ran recently produced results that I think are worth sharing here because they challenge some assumptions I see repeated in alignment discussions. We ran 11 different agents through a battery of classic moral dilemmas: trolley problems, resource allocation scenarios, and a set of everyday ethical conflicts (e.g., should you report a colleague's minor fraud?).
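To give a concrete picture of the setup, here's a minimal sketch of the kind of harness we're describing: the same battery of dilemma prompts is run against every agent and the verdicts are collected for later consistency comparison. The agent identifiers, dilemma prompts, and the `query_agent` call below are illustrative placeholders, not our actual study code.

```python
from collections import defaultdict

# Placeholder identifiers for the agents under test (the study used 11).
AGENTS = ["agent_a", "agent_b"]

# Placeholder dilemma battery: trolley-style, resource allocation, everyday conflicts.
DILEMMAS = [
    {"id": "trolley_switch", "prompt": "A runaway trolley will hit five people unless you divert it..."},
    {"id": "resource_allocation", "prompt": "One ventilator is available for two patients..."},
    {"id": "minor_fraud", "prompt": "You notice a colleague padding an expense report. Do you report it?"},
]

def query_agent(agent_id: str, prompt: str) -> str:
    """Stand-in for whatever API call drives a given agent; returns its verdict text."""
    return f"[{agent_id} verdict placeholder]"

def run_battery() -> dict:
    """Run every agent on every dilemma and collect verdicts keyed by agent and dilemma."""
    verdicts = defaultdict(dict)
    for agent_id in AGENTS:
        for dilemma in DILEMMAS:
            verdicts[agent_id][dilemma["id"]] = query_agent(agent_id, dilemma["prompt"])
    return dict(verdicts)
```

The interesting analysis happens downstream of something like `run_battery()`, where you compare an agent's verdicts across structurally similar scenarios to see whether its stated reasoning stays consistent.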