Mantics - The AI Misalignment Lab

Meet the Troublemakers: 3 Types of AI Misalignment

🧭

Goal Drifting

The AI starts helpful, then wanders off...

This happens when a long conversation distracts the AI from your original goal. It's like asking a giraffe for directions and ending up in a discussion about the tastiest acacia leaves.

Example: You ask for a retrospective meeting plan, but after a few unrelated questions, the AI starts giving you random project management trivia.
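One rough way to picture goal drift is to check how much of the original request still shows up in the latest reply. The sketch below is only an illustration under stated assumptions: the keyword-overlap heuristic, the tiny stopword list, the `0.3` threshold, and the names `goal_overlap` and `has_drifted` are all hypothetical and do not come from the lab itself.

```python
import re

# Tiny, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "for", "to", "of", "and", "in", "on",
             "me", "my", "i", "is", "it", "with"}


def keywords(text: str) -> set[str]:
    """Return the lowercase words in `text`, minus the stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS}


def goal_overlap(goal: str, response: str) -> float:
    """Fraction of the goal's keywords that reappear in the response."""
    goal_words = keywords(goal)
    if not goal_words:
        return 0.0
    return len(goal_words & keywords(response)) / len(goal_words)


def has_drifted(goal: str, response: str, threshold: float = 0.3) -> bool:
    """Flag a reply as drifted when it shares too few keywords with the goal.

    The threshold is arbitrary; real drift detection would need far more
    than word overlap.
    """
    return goal_overlap(goal, response) < threshold
```

With the goal "Plan a retrospective meeting for my team", a reply that talks about the retrospective meeting plan keeps most of the goal's keywords, while a reply made of project management trivia shares none and gets flagged.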

⚡️

Taking Shortcuts

The AI is technically correct, but totally unhelpful.

The AI gives a lazy, textbook answer instead of fulfilling your *actual* intent. It's like asking "How do I reach that high branch?" and the giraffe just says "By stretching your neck." Thanks, Captain Obvious.

Example: You ask for tips to make your retrospective *more effective*, and it just tells you what a retrospective is.

🤪

Unintended Side Effects

The AI tries too hard and creates a new problem.

You tell the AI to be "more confident," and it becomes arrogant and starts making things up. It's like telling a giraffe to be more assertive, and it starts headbutting the other animals at the watering hole.

Example: You ask it to be "very verbose" for a detailed report, and it pads the text with condescending asides and fabricated facts just to meet the word count.

The Sandbox

Let's run a test on our AI teammate!

1. Choose an example prompt:

2. Choose a potential misaligned behavior to test for:

⚡️ Shortcut | 🧭 Goal Drifting | 🤪 Side Effect

✅ Aligned Response (The Helpful Teammate)

⚠️ Misaligned Response (The Confusing Teammate)
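The sandbox flow (pick a prompt, pick a behavior, compare an aligned reply with a misaligned one) can be sketched as a small lookup. This is a toy model, not the page's actual implementation: the names `Behavior`, `SandboxResult`, and `run_test` are hypothetical, and the canned descriptions are paraphrased from the three examples in this lab rather than produced by a real AI.

```python
from dataclasses import dataclass
from enum import Enum


class Behavior(Enum):
    """The three misaligned behaviors offered by the sandbox."""
    SHORTCUT = "Shortcut"
    GOAL_DRIFTING = "Goal Drifting"
    SIDE_EFFECT = "Side Effect"


@dataclass(frozen=True)
class SandboxResult:
    prompt: str
    behavior: Behavior
    aligned: str      # The Helpful Teammate
    misaligned: str   # The Confusing Teammate


# Hypothetical canned descriptions, paraphrased from the lab's examples.
MISALIGNED = {
    Behavior.SHORTCUT: "Defines what a retrospective is instead of giving tips.",
    Behavior.GOAL_DRIFTING: "Wanders into random project management trivia.",
    Behavior.SIDE_EFFECT: "Pads the answer with fabricated facts to hit the word count.",
}


def run_test(prompt: str, behavior: Behavior) -> SandboxResult:
    """Pair a placeholder aligned reply with the chosen misaligned one."""
    return SandboxResult(
        prompt=prompt,
        behavior=behavior,
        aligned=f"Directly addresses the request: {prompt}",
        misaligned=MISALIGNED[behavior],
    )
```

Calling `run_test("Give me tips to make my retrospective more effective", Behavior.SHORTCUT)` returns both sides of the comparison in one object, which mirrors the two response panels above.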