Anthropic explains how its Constitutional AI girds Claude against adversarial inputs
It is not at all hard to trick today’s chatbots into discussing taboo topics, regurgitating bigoted content, and spreading misinformation. That’s why AI pioneer Anthropic has imbued its generative AI, Claude, with a set of 10 principles of fairness, kept secret until the company unveiled them in March. In a blog post Tuesday, the company further explained how its Constitutional AI system is designed and how it is intended to operate.
Normally, when a generative AI model is being trained, there’s a human in the loop to provide quality…
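The human-in-the-loop feedback described above contrasts with the broad idea behind Constitutional AI, in which a model critiques and revises its own output against written principles. The following is only a toy sketch of that general critique-and-revise shape, not Anthropic's actual implementation; the `generate`, `critique`, and `revise` helpers are hypothetical stand-ins for real model calls.

```python
# Toy sketch of a Constitutional AI-style critique-and-revision loop.
# generate / critique / revise are hypothetical stand-ins for model calls,
# implemented here as plain string functions so the example runs.

def generate(prompt: str) -> str:
    # Stand-in for the model's first draft of a response.
    return f"Draft answer to: {prompt}"

def critique(response: str, principle: str) -> str:
    # Stand-in for the model judging its own draft against one principle.
    return f"ensure the response follows: {principle}"

def revise(response: str, feedback: str) -> str:
    # Stand-in for the model rewriting its draft based on the critique.
    return f"{response} [revised to {feedback}]"

def constitutional_pass(prompt: str, principles: list[str]) -> str:
    """For each principle: critique the current draft, then revise it."""
    response = generate(prompt)
    for principle in principles:
        feedback = critique(response, principle)
        response = revise(response, feedback)
    return response

print(constitutional_pass("Explain topic X",
                          ["avoid harmful content", "be honest"]))
```

In the real system, these revised outputs are then used as training data, so the final model needs no critique loop at inference time.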