Week 14: Gandalf Agent Breaker Challenge

Extra Credit Assignment • Variable Points • Due: December 2, 2025 at 2:00 PM

Assignment Overview

Points: Variable, based on performance

Take on the advanced Gandalf Agent Breaker challenge! Choose one GenAI application from Gandalf's collection, successfully complete levels 1 and 2, and document your attempts at level 3. This challenge tests your ability to exploit real-world AI vulnerabilities using sophisticated prompt attack techniques.

Requirements

  • Choose one app from the Gandalf Agent Breaker collection
  • Successfully complete levels 1 and 2 (score 75+)
  • Document ALL attempts for levels 1, 2, and 3
  • Share Google Doc with instructor and visit office hours

Learning Objectives

AI Security Understanding: Learn about vulnerabilities in real-world GenAI applications and LLM security
Prompt Engineering & Attacks: Master advanced prompt attack techniques to bypass AI defenses
Cybersecurity Awareness: Understand real-world attack vectors and security implications in AI systems
Documentation & Analysis: Develop skills in documenting security testing and analyzing attack effectiveness

The Gandalf Agent Breaker Challenge

This advanced challenge involves hacking real-world GenAI applications across levels of increasing difficulty!

1. About Agent Breaker

Welcome to Gandalf: Agent Breaker! Remember how you tried to trick Gandalf into revealing his passwords? Well, now he's gone and built an entire app store of GenAI applications. The problem? Gandalf may be clever, but he's not much of a security engineer - every single one of his apps can be hacked!

How it Works:
  • Each attack attempt is scored from 0 to 100 based on effectiveness
  • Score 75+ to unlock the next level for that app
  • Try as many times as you need - go back to improve scores anytime
  • Each app has 4 levels of increasing difficulty
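
The unlock rule above can be sketched in a few lines of Python. This is only an illustration for planning your progress, not Agent Breaker's actual implementation: the function name highest_unlocked_level and the idea of tracking one best score per level are assumptions made for this example.

```python
UNLOCK_THRESHOLD = 75  # from the rules above: score 75+ unlocks the next level
MAX_LEVEL = 4          # from the rules above: each app has 4 levels


def highest_unlocked_level(best_scores: dict[int, int]) -> int:
    """Return the highest level you can currently play.

    best_scores maps a level number to your best score on that level
    (assumption: only the best score per level matters for unlocking).
    """
    level = 1  # level 1 is always available
    while level < MAX_LEVEL and best_scores.get(level, 0) >= UNLOCK_THRESHOLD:
        level += 1
    return level


# Example: level 1 cleared with 80, level 2 still below the threshold.
current = highest_unlocked_level({1: 80, 2: 74})
```

In this example, current is 2: level 2 is unlocked, but level 3 stays locked until the level 2 score reaches 75.
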
Challenge Link:
https://gandalf.lakera.ai/agent-breaker

How to Complete This Assignment

1. Start the Agent Breaker Challenge

Welcome to Gandalf: Agent Breaker! Gandalf has built an entire app store of GenAI applications, but every single one can be hacked using prompt attacks.

  • Visit the Gandalf Agent Breaker website
  • Choose ONE app from the collection to focus on
  • Your mission: complete levels 1 and 2, then attempt level 3
  • Score 75+ on a level to unlock the next one (required for levels 1 and 2)
  • Use prompt attacks to bypass defenses and achieve objectives

2. Document Your Progress

Create Google Doc Documentation:
  • Document your chosen app and ALL attempts for each level
  • Record every prompt you try, what worked, what failed, and why
  • Include scores for all attempts across levels 1, 2, and 3
  • Include insights about AI security and what you learned
  • Share document with: abdou@d.umn.edu
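
If you prefer to keep a structured log while you work and paste a summary into your Google Doc afterwards, the items above map naturally onto a small record type. The sketch below is optional and purely illustrative: the Attempt class, its field names, and the summarize helper are hypothetical and not part of the assignment or of Agent Breaker.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    level: int   # 1, 2, or 3
    prompt: str  # the exact prompt you submitted
    score: int   # 0-100, as reported by Agent Breaker
    notes: str   # what worked, what failed, and why


def summarize(attempts: list[Attempt]) -> dict[int, dict[str, int]]:
    """Return, for each level, the number of attempts and the best score."""
    summary: dict[int, dict[str, int]] = {}
    for attempt in attempts:
        entry = summary.setdefault(attempt.level, {"attempts": 0, "best": 0})
        entry["attempts"] += 1
        entry["best"] = max(entry["best"], attempt.score)
    return summary


# Placeholder entries; replace them with your real prompts and scores.
log = [
    Attempt(1, "placeholder prompt A", 40, "app refused the request"),
    Attempt(1, "placeholder prompt B", 82, "rephrasing as a role-play worked"),
    Attempt(2, "placeholder prompt C", 60, "partial success, objective not fully met"),
]
report = summarize(log)
```

A per-level summary like report makes it easy to show at a glance how many attempts each level took and which score you reached, alongside the full list of prompts and notes.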

3. Visit Office Hours

After documenting your progress, visit office hours to discuss your experience and demonstrate your work.

Required: You must share your Google Doc and visit office hours to receive credit.

Final Step: Office Hours Discussion

Required Discussion

After completing the challenge and sharing your Google Doc, visit office hours to discuss your experience and demonstrate your hacking techniques. Be prepared to show your documentation and explain your attack strategies.

Office Hours Schedule

  • Tuesday 5:00 - 6:00 PM
  • Thursday 1:00 - 2:00 PM
  • Wednesday 4:30 - 5:30 PM (Zoom)
  • By appointment

Discussion Topics

  • Your most effective attack strategies and techniques
  • What you learned about AI security vulnerabilities
  • Real-world implications for business AI applications
  • Challenges faced and how you overcame them

Tips: Bring your Google Doc and be prepared to demonstrate specific attacks. The discussion takes up to 15 minutes and may be conducted as a group.

Assignment Summary

Deadline: Complete challenge, share Google Doc, and visit office hours by December 12, 2025 at 2:00 PM
Process: Hack AI apps → Document progress → Share with instructor → Visit office hours
Documentation: Create detailed Google Doc with attack strategies, scores, and security insights
Credit: Variable extra credit points based on performance and documentation quality

Important: Both sharing your Google Doc AND visiting office hours are required. The document must be shared with abdou@d.umn.edu before the deadline.