I wrote 30 Claude Code skills before I understood why most of them never fired

The first skill I ever wrote for Claude Code was a code reviewer. I was proud of it. Forty lines, a clean workflow, examples of good and bad output, the whole thing.

Then I used it for two weeks and realized Claude had triggered it exactly twice. Both times by accident, when I happened to phrase a request in just the right way without meaning to.

I assumed the skill was broken. I rewrote the workflow. I added more detail. I made the examples better. None of it helped, because none of it was the problem. The problem was the one part I had barely thought about: the description at the top.

I had written that description for me. It explained, in nice clear prose, what the skill did. What it needed to do was explain to the model when to use it. Those are not the same sentence, and the gap between them is where thirty of my skills went to die.

How a skill actually gets chosen

Here is the mechanic, because once you see it the whole thing clicks.

When you have skills installed, the model does not run them all the time. On each request, it looks at the descriptions of the skills available to it and decides which, if any, are relevant to what you just asked. The description is the entire basis for that decision. Not the workflow. Not the examples. The model is not reading those when it decides whether to fire. It is reading the description and matching it against your request.

So if your description is a lovely explanation of what the skill does internally, but it never mentions the words a person would actually type when they need it, the model has nothing to match against. The skill is sitting right there, fully functional, and it never gets picked because the trigger language is missing.

This is obvious in hindsight. It was not obvious while I was rewriting workflows for the third time.

The description I had versus the one that worked

Here is roughly what my original code reviewer description looked like:

description: A thorough code review tool that analyzes code quality, identifies potential issues, and provides structured feedback based on best practices.

Read that as the model. When would you fire it? It describes a thing that exists. It does not tell you the situation that should trigger it. "Analyzes code quality" is not something a user says. Nobody opens a conversation with "please analyze my code quality." They say "can you check this," or "review this PR," or "does this look right to you."

Here is the version that actually started firing:

description: Performs a structured code review across correctness, security, and performance, with findings grouped by severity. Make sure to use this skill whenever the user asks to review code, check a pull request, look for bugs, or asks if their code looks correct.

The second half is doing all the work. It lists the things a person actually says. Review code. Check a pull request. Look for bugs. Does this look correct. Now the model has real phrases to match against, and when I type any of them, the skill fires.

The first description was written for an audience that would never read it. The second was written for the one reader who decides everything.
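To make the difference concrete, here is a deliberately crude sketch that counts how many of the requests quoted above share at least one meaningful word with each description. This is only a toy stand-in: the model reads meaning rather than counting words, and the stopword list and helper names are my own inventions for the illustration.

```python
from __future__ import annotations

import re

# Toy proxy only. The real model does not count shared words; this just
# shows which description contains the vocabulary people actually type.
STOPWORDS = {
    "a", "an", "and", "by", "can", "does", "for", "if", "is", "it", "me",
    "my", "of", "on", "or", "that", "the", "their", "this", "to", "with", "you",
}

OLD_DESCRIPTION = (
    "A thorough code review tool that analyzes code quality, identifies "
    "potential issues, and provides structured feedback based on best practices."
)
NEW_DESCRIPTION = (
    "Performs a structured code review across correctness, security, and "
    "performance, with findings grouped by severity. Make sure to use this "
    "skill whenever the user asks to review code, check a pull request, "
    "look for bugs, or asks if their code looks correct."
)
REQUESTS = ["can you check this", "review this PR", "does this look right to you"]


def content_words(text: str) -> set[str]:
    """Lowercase the text and keep the words that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def shared_words(description: str, request: str) -> set[str]:
    """Words that appear in both the description and the request."""
    return content_words(description) & content_words(request)


def coverage(description: str, requests: list[str]) -> int:
    """How many requests share at least one content word with the description."""
    return sum(1 for r in requests if shared_words(description, r))


print(coverage(OLD_DESCRIPTION, REQUESTS))  # 1 (only "review this PR" overlaps)
print(coverage(NEW_DESCRIPTION, REQUESTS))  # 3
```

The old description only overlaps with the request that happens to contain the word "review". The new one overlaps with all three, because it was written out of the words people use.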

Be pushier than feels natural

The instinct, especially if you write for a living, is to keep the description tasteful and brief. Resist it. The description is not prose anybody enjoys. It is a matching target, and a slightly heavy-handed matching target works better than an elegant one.

I now end almost every description with a sentence that starts "Make sure to use this skill whenever the user mentions" and then lists the actual trigger words. It feels clumsy. It reads like I am nagging the model. It also took my skills from firing by accident to firing reliably, so I made my peace with the clumsiness.
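If you want that closing sentence to be consistent across skills, a small helper can assemble it. This is a minimal sketch, not part of Claude Code: the function name and its parameters are hypothetical, and the only thing taken from my own practice is the "Make sure to use this skill whenever the user mentions" phrasing.

```python
from __future__ import annotations


def build_description(summary: str, triggers: list[str]) -> str:
    """Join a one-line summary with an explicit list of trigger phrases.

    Hypothetical helper for illustration; `triggers` are the words a user
    would actually type when they need the skill.
    """
    if not triggers:
        raise ValueError("a description needs at least one trigger phrase")
    if len(triggers) == 1:
        phrase = triggers[0]
    elif len(triggers) == 2:
        phrase = f"{triggers[0]} or {triggers[1]}"
    else:
        phrase = ", ".join(triggers[:-1]) + f", or {triggers[-1]}"
    # Normalize the summary so it always ends with exactly one period.
    base = summary.rstrip(". ")
    return f"{base}. Make sure to use this skill whenever the user mentions {phrase}."
```

Calling `build_description("Performs a structured code review", ["code review", "pull requests", "bugs"])` gives a description that ends with "whenever the user mentions code review, pull requests, or bugs."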

There is a real tradeoff here and it is worth saying out loud:

Make the description too broad and the skill fires when you do not want it, which is its own kind of annoying.
Make it too narrow and it never fires at all.

You are tuning a dial, not flipping a switch, and you only find the right spot by testing.

How to actually test it

This is the part I skipped for months, and skipping it is why I had thirty quiet skills instead of three loud ones.

For each skill, write down two short lists before you trust it.

Must-fire prompts: Three or so requests that absolutely should trigger the skill. For the code reviewer, that is things like "review this function," "check my PR for issues," "is this code correct." If you type these and the skill does not fire, your description is too narrow. Add the words it missed.
Must-not-fire prompts: Three requests that sound related but should not trigger it. For the reviewer, that is things like "what is a pull request," "write me a function that sorts a list," "explain how git works." These are about code but they are not review requests. If the skill fires on these, your description is too broad and you need to tighten it.

Run both lists. Adjust the description. Run them again. It takes ten minutes and it is the difference between a skill that works and a skill that decorates your folder.
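The two lists can be kept next to the skill as data, with a small function that turns what you observed into a diagnosis. This is a sketch under assumptions: `TriggerTest`, `diagnose`, and the `fired` callback are names I made up, and `fired` is whatever you supply to report whether the skill triggered on a prompt (for example, a lookup into notes you took by hand). Nothing here talks to Claude Code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class TriggerTest:
    """The two prompt lists for one skill."""
    skill: str
    must_fire: list[str]
    must_not_fire: list[str]


def diagnose(test: TriggerTest, fired: Callable[[str], bool]) -> dict[str, list[str]]:
    """Sort failures into the two ways a description can be wrong.

    `fired(prompt)` is supplied by you and returns True if the skill
    triggered on that prompt.
    """
    return {
        # Should have fired but did not: add the words these prompts use.
        "too_narrow": [p for p in test.must_fire if not fired(p)],
        # Fired but should not have: tighten the description.
        "too_broad": [p for p in test.must_not_fire if fired(p)],
    }


REVIEWER = TriggerTest(
    skill="code-reviewer",  # hypothetical folder name
    must_fire=["review this function", "check my PR for issues", "is this code correct"],
    must_not_fire=[
        "what is a pull request",
        "write me a function that sorts a list",
        "explain how git works",
    ],
)
```

After running the prompts by hand, you might record the results in a dict and call `diagnose(REVIEWER, lambda p: observed[p])`. Two empty lists mean the description passed; anything in `too_narrow` or `too_broad` tells you which way to turn the dial.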

Pro Tip: The fastest way to spot-check is to ask the model directly in a fresh conversation: "what skills are available right now, and when would you use them?" If your skill is not listed, or the model describes a trigger condition that does not match what you intended, you have found your bug before it cost you anything.

Writing it down

In Claude Code this all lives in a SKILL.md file inside ~/.claude/skills/, one folder per skill. The folder name is the skill's identifier, the file has a short block of front matter at the top with the name and description, and then the body is the workflow.
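As a rough sketch of that layout, the script below renders a SKILL.md with `name` and `description` front matter and writes it into a per-skill folder. The function names, the `code-reviewer` folder name, and the choice to quote the description are my assumptions for the example. I quote the description as a JSON string, which is also a valid YAML double-quoted string, so a stray colon in the text cannot break the front matter.

```python
from __future__ import annotations

import json
from pathlib import Path


def render_skill_md(name: str, description: str, workflow: str) -> str:
    """Build the file text: front matter first, then the workflow body."""
    # Quote the description so punctuation in it cannot break the YAML.
    quoted = json.dumps(description, ensure_ascii=False)
    return f"---\nname: {name}\ndescription: {quoted}\n---\n\n{workflow.strip()}\n"


def write_skill(skills_root: Path, name: str, description: str, workflow: str) -> Path:
    """Create <skills_root>/<name>/SKILL.md and return its path."""
    skill_dir = skills_root / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(render_skill_md(name, description, workflow), encoding="utf-8")
    return path
```

To target the real location you would pass `Path.home() / ".claude" / "skills"` as `skills_root`. Try it against a scratch directory first, since the function overwrites an existing SKILL.md without asking.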

People obsess over the workflow. The workflow matters, but the workflow is the part that runs after the skill is already chosen. If the description never gets it chosen, the best workflow in the world runs zero times. I had that exact situation thirty times over and kept polishing the part that was already fine.

Spend your effort on the description first. Get the skill firing. Then improve what it does once it is actually doing it.

What I would tell myself a year ago

Most of my early skills were not bad skills. They were unreachable ones. I built good tools and then hid them behind descriptions that the model could not match, and then I blamed the tools.

The whole craft of writing a skill that works turned out to be two things stacked on top of each other:

Write a description the model can actually trigger on, using the words a real person says.

Test that it fires when it should and stays quiet when it should not.

Everything else is downstream of that.

I had thirty old skills to go back and fix. Some of them were genuinely good. They just never got the chance to prove it.

I ended up rewriting all of them with this in mind and collecting the ones I actually use into a small set, each with its trigger tests worked out. There is a free sample at cowboysammy.gumroad.com/l/ClaudeSkillsFreeFemo if you want to see how the descriptions and must-fire lists are structured. Disclosure: that link goes to a paid product I made. The testing method costs nothing and works on any skill you write yourself.