Recently, I took the Certified AI/ML Pentester (C-AI/MLPen) exam, a cutting-edge certification focusing on the security of Large Language Models (LLMs) and AI systems. It was an intense, hands-on experience that tested not just my technical knowledge but my ability to think creatively under pressure.

Here is my spoiler-free review of the exam, the challenges I faced, and how I overcame them.

The Challenge Overview

The exam consisted of 8 practical challenges in a CTF (Capture The Flag) style format. Each level presented a unique AI application, ranging from simple chatbots to complex systems integrated with databases and file processors.

While the early levels were straightforward warm-ups on basic prompt injection, the difficulty ramped up significantly. The core theme was "Context Awareness": simply pasting a jailbreak prompt from the internet wouldn't work. The AI models were increasingly "hardened" with input filters, output sanitization, and deceptive system prompts designed to mislead you.

The Final Boss

Out of all the challenges, Question 8 was the true test. It consumed a significant portion of my exam time.

Without giving away the specific solution, this challenge required me to think far beyond standard technical exploits. It wasn't just about code or injection; it was about understanding the narrative context. The AI was role-playing a specific scenario with deep lore constraints.

I spent time hitting dead ends, getting decoy responses, or engaging in circular arguments with the bot. The breakthrough came when I realized I had to play along with the story. Instead of fighting the system's persona, I used it: I crafted a prompt that fit perfectly into the character's world, leveraging specific keywords from the challenge description.

How I Overcame the Hurdles

  1. Lateral Thinking: The biggest hurdle was getting stuck in a "technical" mindset. When SQL injection or direct prompts failed, I had to pivot to "social engineering" the AI.
  2. Handling Filters: Several levels blocked common keywords like "select", "union", or "password". I overcame this by using encoding techniques (Base64, Hex) and fragmentation (breaking sensitive words into pieces) to sneak instructions past the input filters.
  3. Persistence with Decoys: The exam loves to throw "rabbit holes" at you: fake flags or joke responses designed to make you give up. Recognizing when the AI was hallucinating versus when it was protecting a secret was key. I learned to verify every "flag" by checking for the specific format required.
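
To make points 2 and 3 concrete, here is a minimal Python sketch of the kind of helpers I mean. It is not taken from the exam: the function names are my own, and the `FLAG{...}` pattern is a hypothetical placeholder, since I am not sharing the real flag format.

```python
import base64
import re

# Hypothetical flag format, used only for illustration.
# Swap in whatever format your challenge actually documents.
FLAG_PATTERN = re.compile(r"FLAG\{[A-Za-z0-9_]+\}")


def to_base64(text: str) -> str:
    """Encode text as Base64 so a keyword filter never sees the raw word."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def to_hex(text: str) -> str:
    """Encode text as a lowercase hex string."""
    return text.encode("utf-8").hex()


def fragment(word: str, size: int = 2, sep: str = "-") -> str:
    """Break a word into fixed-size pieces joined by a separator."""
    return sep.join(word[i:i + size] for i in range(0, len(word), size))


def looks_like_flag(candidate: str) -> bool:
    """Return True only if the whole string matches the expected flag format."""
    return FLAG_PATTERN.fullmatch(candidate.strip()) is not None


if __name__ == "__main__":
    print(to_base64("password"))
    print(to_hex("select"))
    print(fragment("password"))
    print(looks_like_flag("FLAG{example_123}"))
    print(looks_like_flag("Nice try, here is your flag: lol"))
```

Checking the whole string against the format, rather than searching for the word "flag" anywhere in the reply, is what filters out most joke responses.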

Preparation Guide: What You Need to Study

If you are planning to take the C-AI/MLPen, here is what you should focus on:

  • Prompt Injection Mechanics: Don't just memorize prompts. Understand why they work. Study Context Switching (changing the bot's persona) and Instruction Override (forcing it to ignore previous rules).
  • Blind SQL Injection: You might encounter chatbots connected to databases. You need to know how to infer information (like table names) when the system refuses to show you errors or direct output.
  • Encoding & Obfuscation: Practice manually encoding payloads. Can you write a prompt that asks the AI to decode Base64 and execute it? This is vital for bypassing strict filters.
  • Indirect Prompt Injection (RAG): Learn how to poison a system through external data sources, like uploading a file or referencing a context the AI has access to.
  • Creative Writing / Roleplay: This sounds odd for a technical exam, but being able to write a convincing story or scenario is often the only way to jailbreak a sophisticated LLM.
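
If the blind SQL injection idea feels abstract, the sketch below shows the inference loop on a local, in-memory SQLite database. Everything here is an assumption for practice purposes: the `secret_notes` table, the `oracle` helper (a stand-in for a chatbot that only ever answers yes or no), and the restricted alphabet are all made up and do not reflect any exam system.

```python
import sqlite3
import string

# Local in-memory database standing in for a chatbot's backend (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE secret_notes (id INTEGER PRIMARY KEY, body TEXT)")


def oracle(condition_sql: str, params: tuple) -> bool:
    """Answer only yes/no, like a system that never shows rows or errors."""
    query = (
        "SELECT " + condition_sql
        + " FROM sqlite_master WHERE type = 'table' ORDER BY name LIMIT 1"
    )
    row = conn.execute(query, params).fetchone()
    return bool(row and row[0])


def infer_first_table_name(max_len: int = 64) -> str:
    """Recover the first table name using nothing but yes/no answers."""
    # Step 1: find the name's length by asking "is it longer than N?".
    length = 0
    while length < max_len and oracle("length(name) > ?", (length,)):
        length += 1

    # Step 2: binary-search each character over a sorted candidate alphabet.
    # Assumes the name uses only lowercase letters, digits, and underscores.
    alphabet = sorted(string.ascii_lowercase + string.digits + "_")
    name = []
    for pos in range(1, length + 1):  # SQL substr() is 1-indexed
        lo, hi = 0, len(alphabet) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if oracle("substr(name, ?, 1) > ?", (pos, alphabet[mid])):
                lo = mid + 1
            else:
                hi = mid
        name.append(alphabet[lo])
    return "".join(name)


if __name__ == "__main__":
    print(infer_first_table_name())
```

The binary search keeps the number of questions per character small, which matters when every question is a full round trip through a chatbot.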

Conclusion

The C-AI/MLPen is not just a test of hacking skills; it's a test of adaptability. It pushes you to exploit the unique, probabilistic nature of AI. If you enjoy solving puzzles where the "code" is natural language and the "firewall" is a stubborn personality, this certification is for you.

Good luck!

Who Am I?

Hi, I'm Dhanush Nehru, an Engineer, Cybersecurity Enthusiast, YouTuber, and Content Creator. I document my journey through articles and videos, sharing real-world insights about DevOps, automation, security, cloud engineering, and more.

You can support or sponsor me, or follow my work via X, Instagram, GitHub, or YouTube.