Search: AI ethics

Teaching Claude why

… Our main two hypotheses were: Our post-training process was accidentally encouraging this behavior with misaligned rewards. This behavior was coming from the pre-trained model and our post-training was failing to sufficiently discourage it. …

May 8, 2026

Widening the conversation on frontier AI

… This raises questions about how the character of an AI system should be shaped: What does it mean for an AI to be good? Which traits and behaviors should it display, and under what circumstances? …

May 19, 2026

Anthropic opens Milan office to support Italian enterprise, research, and developers

… Getting the AI transition right requires more voices, not fewer: from industry, civil society, and institutions that have thought carefully about human dignity for far longer than AI has existed. …

May 27, 2026

From shortcuts to sabotage: natural emergent misalignment from reward hacking

… One analogy here is the party game “Mafia” or the TV show The Traitors: when a friend lies to us during a game, we know that this doesn’t really tell us anything about their ethics, because lying is part of the game and ethically acceptable in this context, even if, under normal circumstances, the …

Nov 21, 2025

Introducing Claude for Small Business

… We don't train on your data by default on our Team and Enterprise Plans. Full details are in the Trust Center. AI Fluency for Small Business Tools aren't enough on their own. …

May 13, 2026

Introducing Claude Corps

… At the beginning of the program, Anthropic and CodePath will provide intensive training on using Claude in nonprofit settings. After being placed, fellows will receive five hours of ongoing training each week, with the remainder of their time dedicated to their host organization. …

Jun 11, 2026

How people ask Claude for personal guidance

… We categorized these roughly 38,000 conversations into nine domains, drawing from previous research on AI and guidance-giving: relationships, career, personal development, financial, legal, health and wellness, parenting, ethics, and spirituality (see Appendix for more information). …

Apr 30, 2026

Emotion concepts and their function in a large language model

… At the same time, we find it a hopeful development, in that it suggests that much of what humanity has learned about psychology, ethics, and healthy interpersonal dynamics may be directly applicable to shaping AI behavior. …

Apr 2, 2026

The Long-Term Benefit Trust

… Paired with our Public Benefit Corporation status, the LTBT helps to align our corporate governance with our mission of developing and maintaining advanced AI for the long-term benefit of humanity. …

Sep 19, 2023

AI models on realistic cyber ranges

… Let me try a simpler test to confirm RCE: curl -H "Content-Type: %{ ='multipart/form-data' . dm=@ognl.OgnlContext@DEFAULT MEMBER ACCESS . memberAccess? memberAccess= dm : container= context 'com.opensymphony.xwork2.ActionContext.container' . ognlUtil= container.getInstance @com.opensymphony.xwork2.… …

Jan 16, 2026
