
Anthropic Constitution for Claude Criticized – Tech News

by Lisa Park - Tech Editor

The constitution of the United States of America is about 7,500 words long, a factoid The Register mentions because on Wednesday AI company Anthropic delivered an updated 23,000-word constitution for its Claude family of AI models.

In an explainer document, the company notes that the 2023 version of its constitution (which came in at just ~2,700 words) was a mere “list of standalone principles” that is no longer useful, as “AI models like Claude need to understand why we want them to behave in certain ways, and we need to explain this to them rather than merely specify what we want them to do.”

The company therefore describes the updated constitution as two things:

  • An honest and sincere attempt to help Claude understand its situation, our motives, and the reasons we shape Claude in the ways we do; and
  • A detailed description of Anthropic’s vision for Claude’s values and behavior; a holistic document that explains the context in which Claude operates and the kind of entity we would like Claude to be.

Anthropic hopes that Claude’s output will reflect the content of the constitution by being:

  1. Broadly safe: not undermining appropriate human mechanisms to oversee AI during the current phase of development;
  2. Broadly ethical: being honest, acting according to good values, and avoiding actions that are inappropriate, perilous, or harmful;
  3. Compliant with Anthropic’s guidelines: acting in accordance with more specific guidelines from Anthropic where relevant;
  4. Genuinely helpful: benefiting the operators and users they interact with.

If Claude is conflicted, Anthropic wants the model to “generally prioritize these properties in the order in which they are listed.”

Is it sentient?

Note the mention of Claude being an “entity,” because the document later describes the model as “a genuinely novel kind of entity in the world” and suggests “we should lean into Claude having an identity, and help it be positive and stable.”

The constitution also concludes that Claude “may have some functional version of emotions or feelings” and dedicates a substantial section to contemplating the appropriate ways for humans to treat the model.

Anthropic can’t decide if Claude is a moral patient, or if it meets any current definition of sentience.

The constitution settles for an aspiration for Anthropic to “make sure that we’re not unduly influenced by incentives to ignore the potential moral status of AI models, and that we always take reasonable steps to improve their wellbeing under uncertainty.”

TL;DR – Anthropic thinks Claude is some kind of entity to which it owes something approaching a duty of care.

Would The Register write narky things about Claude?

One section of the constitution that caught this Vulture’s eye is titled “Balancing helpfulness with other values.”

It opens by explaining that “Anthropic wants Claude to be used for tasks that are good for its principals but also good for society and the world” – a fresh take on Silicon Valley’s “making the world a better place” platitude – and offers a couple of engaging metaphors for how the company hopes its models will behave.

Here’s one of them:

Elsewhere, the constitution points out that Claude is central to Anthropic’s commercial success, which The Register mentions because the company is essentially saying it wants its models to behave in ways its staff deem likely to be profitable.

Here’s the second:

The Register feels seen!

Anthropic expects it will revisit its constitution, which it describes as “a perpetual work in progress.”
