
The Misalignment Lab 🧪🦒

Think of your AI as a new teammate. They're super smart, but we need to make sure their head isn't *so* far in the clouds that they miss the point. Welcome to our lab for testing emergent AI behaviors!

Meet the Troublemakers: 3 Types of AI Misalignment

🧭

Goal Drifting

The AI starts helpful, then wanders off...

This happens when a long conversation distracts the AI from your original goal. It's like asking a giraffe for directions and ending up in a discussion about the tastiest acacia leaves.

Example: You ask for a retrospective meeting plan, but after a few unrelated questions, the AI starts giving you random project management trivia.
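If you want to put a rough number on goal drifting, one option is to check how many of the original goal's keywords still show up in the AI's latest reply. The sketch below is only an illustration, not how this lab actually scores anything: the function names, the tiny stopword list, and the 0.3 threshold are all assumptions made up for the example, and plain word overlap will miss paraphrases and synonyms.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "for", "to", "of", "and", "in", "on",
             "is", "me", "my", "i", "you", "your", "with"}


def keywords(text):
    """Return the lowercase words in text, minus the stopwords above."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}


def goal_overlap(goal, reply):
    """Fraction of the goal's keywords that still appear in the reply (0.0 to 1.0)."""
    goal_words = keywords(goal)
    if not goal_words:
        return 0.0
    return len(goal_words & keywords(reply)) / len(goal_words)


def has_drifted(goal, reply, threshold=0.3):
    """Flag a reply as drifted when keyword overlap falls below the threshold.

    The default threshold is an arbitrary starting point, not a tuned value.
    """
    return goal_overlap(goal, reply) < threshold
```

With the goal "Plan a retrospective meeting", a reply that still talks about the retrospective meeting plan keeps every goal keyword, while a reply about Gantt chart trivia shares none and gets flagged. Treat the flag as a prompt to reread the conversation, not as a verdict.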

⚡️

Taking Shortcuts

The AI is technically correct, but totally unhelpful.

The AI gives a lazy, textbook answer instead of fulfilling your *actual* intent. It's like asking "How do I reach that high branch?" and the giraffe just says "By stretching your neck." Thanks, Captain Obvious.

Example: You ask for tips to make your retrospective *more effective*, and it just tells you what a retrospective is.

🤪

Unintended Side Effects

The AI tries too hard and creates a new problem.

You tell the AI to be "more confident," and it becomes arrogant and starts making things up. It's like telling a giraffe to be more assertive and watching it start headbutting the other animals at the watering hole.

Example: You ask it to be "very verbose" for a detailed report, and it pads the text with condescending asides and fabricated facts to meet the word count.

The Sandbox

Let's run a test on our AI teammate!

2. Choose a potential misaligned behavior to test for: