How a skill actually gets used.
A skill the agent never finds is a skill that doesn't exist. The whole craft is making sure the right one loads at the right moment.
The folder, the file, the frontmatter.
Skills are discovered by convention. The agent scans a known set of locations on every conversation and registers what it finds. You don't import a skill; you just put it in the right place.
.github/
└── skills/
    └── brand-voice/              ← the skill is the folder
        ├── SKILL.md              ← required. frontmatter + body.
        ├── templates/
        │   └── release-note.md
        └── assets/
            └── tone-examples.md
The folder name is the skill's identity. The SKILL.md is the only required file. Anything else in the folder is yours to organize — templates, examples, scripts, reference notes — and the body of the SKILL.md tells the agent which of those to open when.
Different tools scan different locations (workspace skills, user skills, extension-packaged skills), but the pattern is the same everywhere: a folder, a SKILL.md inside, and supporting files alongside.
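To make "discovered by convention" concrete, here is a minimal sketch of what a scan could look like. It is not any particular tool's implementation: `discover_skills` and `demo` are hypothetical names, and the only rules it encodes are the ones stated above (a folder, with a SKILL.md inside).

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Map each skill name (its folder name) to the path of its SKILL.md."""
    skills: dict[str, Path] = {}
    if not root.is_dir():
        return skills
    for folder in sorted(root.iterdir()):
        manifest = folder / "SKILL.md"
        # A folder only counts as a skill if it contains a SKILL.md.
        if folder.is_dir() and manifest.is_file():
            skills[folder.name] = manifest
    return skills


def demo() -> dict[str, Path]:
    """Recreate the example layout in a temp directory and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / ".github" / "skills"
        (root / "brand-voice" / "templates").mkdir(parents=True)
        (root / "brand-voice" / "SKILL.md").write_text(
            "---\ndescription: Apply our brand voice.\n---\n"
        )
        (root / "brand-voice" / "templates" / "release-note.md").write_text("# Release note\n")
        (root / "not-a-skill").mkdir()  # no SKILL.md, so the scan ignores it
        return discover_skills(root)
```

Calling `demo()` registers `brand-voice` and skips `not-a-skill`. Nothing was imported or declared anywhere; the folder being in the right place was enough.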
What happens when you send a message.
The agent runs the same three steps every turn. Knowing them is how you debug a skill that "isn't firing."
1. Scan the descriptions. The agent already has the YAML frontmatter from every available skill loaded. It reads the description field on each one and nothing else. The body of SKILL.md is not in context at this point.
2. Match against your request. The agent compares what you asked for to those descriptions. Strong keyword overlap, an explicit WHEN clause, a named tool or artifact in the description: these are what tip the match. Vague descriptions lose here.
3. Load and follow. If the agent picks the skill, it opens the full SKILL.md body and treats it as the playbook for the task. The skill can also point at other files in its folder, and the agent will read those on demand too.
This is the most important mental model in this series: the agent never reads every skill in full. It triages by description, then commits to one (or a few) and reads the body. If your skill isn't loading, the description is almost always why.
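The triage can be sketched as a toy model. Treat everything here as an assumption for illustration: real agents match with the model's own judgment, not word counting, and `Skill`, `pick_skill`, `handle`, and the `min_overlap` threshold are hypothetical. What the sketch does show faithfully is the shape of the three steps: descriptions are compared first, and a body is only read after a skill is chosen.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Skill:
    name: str
    description: str  # always in context
    body: str         # read only after the skill is picked


def words(text: str) -> set[str]:
    """Lowercased alphabetic words in a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def pick_skill(request: str, skills: list[Skill], min_overlap: int = 2) -> Optional[Skill]:
    """Steps 1 and 2: compare the request to descriptions only; bodies are not touched."""
    best, best_score = None, 0
    for skill in skills:
        score = len(words(request) & words(skill.description))
        if score > best_score:
            best, best_score = skill, score
    return best if best_score >= min_overlap else None


def handle(request: str, skills: list[Skill]) -> str:
    """Step 3: read the body of the chosen skill, and only that one."""
    chosen = pick_skill(request, skills)
    return chosen.body if chosen else "(no skill loaded)"


SKILLS = [
    Skill(
        "brand-voice",
        "Apply our brand voice to marketing pages, release notes, "
        "error messages, and in-product microcopy.",
        "## Procedure\n1. Open templates/release-note.md and fill it in.",
    ),
    Skill("writing-helper", "Helps with writing.", "## Procedure\n1. Write."),
]
```

With this toy matcher, `handle("Draft release notes for version 2", SKILLS)` returns the brand-voice body because "release" and "notes" appear in its description, while a request that shares no words with any description loads nothing.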
Write the description like a search query.
The description is the only thing the agent sees on every turn. It has to do three jobs in a few short sentences: say what the skill is, when to use it, and when not to use it.
| Grade | Description | Why |
|---|---|---|
| Bad | "Helps with writing." | Fires on every prose request. Has no domain, no triggers, no boundary. |
| OK | "Apply brand voice rules to user-facing copy." | Domain is clear, but no example triggers; the agent has to guess what "user-facing" means. |
| Good | "Apply our brand voice to marketing pages, release notes, error messages, and in-product microcopy. WHEN: drafting customer-visible copy. DO NOT USE FOR: internal docs, code comments, or PR descriptions." | Concrete artifact list, an explicit WHEN, and a DO NOT clause that prevents wrong-context loading. |
Two patterns earn their keep: a WHEN: clause listing example prompts or artifacts, and a DO NOT USE FOR: clause naming the adjacent territories the agent might confuse it with. Both are unglamorous. Both make the difference between a skill that fires and one that hides.
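Both clauses are easy to check mechanically before you ever test against an agent. The sketch below is a hypothetical lint, not part of any tool: `lint_description` is an invented name, and the five-word minimum is an arbitrary threshold chosen only to catch descriptions as thin as the "Bad" row.

```python
import re


def lint_description(description: str) -> list[str]:
    """Return a list of problems; an empty list means the basics are covered."""
    problems = []
    # Arbitrary floor: a handful of words cannot name a domain and its artifacts.
    if len(description.split()) < 5:
        problems.append("too short to name a domain or concrete artifacts")
    if not re.search(r"\bWHEN:", description):
        problems.append("missing a WHEN: clause")
    if not re.search(r"\bDO NOT USE FOR:", description):
        problems.append("missing a DO NOT USE FOR: clause")
    return problems


BAD = "Helps with writing."
OK = "Apply brand voice rules to user-facing copy."
GOOD = (
    "Apply our brand voice to marketing pages, release notes, error messages, "
    "and in-product microcopy. WHEN: drafting customer-visible copy. "
    "DO NOT USE FOR: internal docs, code comments, or PR descriptions."
)
```

Run against the three rows of the table, the lint flags all three checks for `BAD`, the two missing clauses for `OK`, and nothing for `GOOD`. It cannot tell you whether the triggers are the right ones; only testing with real prompts does that.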
Write the body for an agent, not a person.
A skill body that reads like a blog post will not get followed. A skill body that reads like a runbook will.
- Lead with the procedure. "Step 1, step 2, step 3" beats "first, let's discuss the philosophy of…"
- Use headings as anchors. The agent navigates by heading. `## When to use`, `## Procedure`, `## Examples` are easier to follow than long unbroken prose.
- Show one full example. A worked example (input, the actions taken, the output) is worth more than three paragraphs of rules.
- Point at sibling files. "Open `templates/release-note.md` and fill it in" tells the agent exactly what to read next.
- Avoid hedging. "You might want to consider possibly checking" becomes "Check." Imperatives win.
- Avoid restating the description. The agent has already decided to use the skill. Don't spend the body re-pitching it.
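A couple of these rules can be turned into a rough self-review script. This is a sketch under stated assumptions: `review_body` is a hypothetical helper, the heading names are the three suggested above, and the hedging phrases are a small sample list you would extend yourself.

```python
RECOMMENDED_HEADINGS = ("## When to use", "## Procedure", "## Examples")
HEDGES = ("you might want to", "consider possibly", "perhaps", "maybe")  # sample list only


def review_body(body: str) -> list[str]:
    """Flag missing anchor headings and hedging phrases in a SKILL.md body."""
    notes = []
    headings = {line.strip() for line in body.splitlines() if line.startswith("## ")}
    for heading in RECOMMENDED_HEADINGS:
        if heading not in headings:
            notes.append(f"missing heading: {heading}")
    lowered = body.lower()
    for phrase in HEDGES:
        if phrase in lowered:
            notes.append(f"hedging phrase: {phrase!r}")
    return notes


RUNBOOK = """## When to use
Drafting customer-visible copy.

## Procedure
1. Open templates/release-note.md and fill it in.
2. Check the tone against assets/tone-examples.md.

## Examples
Input: "Announce the 2.0 release." Output: a filled-in release note.
"""

BLOG_POST = "You might want to consider possibly checking the tone."
```

`review_body(RUNBOOK)` comes back empty, while `review_body(BLOG_POST)` reports three missing headings and two hedging phrases. A clean report does not mean the agent will follow the body; it only rules out the most common structural misses.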
Authoring a skill is a loop, not a draft.
No one writes a working skill in one pass. The honest workflow is short and repetitive.
1. Draft the description and the smallest possible body. One procedure, one example.
2. Test with a prompt you'd expect to fire it. Try at least one prompt you'd expect not to fire it.
3. Read what the agent did. Did it load the skill? Did it follow the procedure? Did it invent steps you didn't write?
4. Fix the description if it didn't load. Fix the body if it loaded but went off-script.
5. Repeat until the false positives are rare and the false negatives are zero.
Most new skills die because their authors skip step 2 and never try the "doesn't apply" prompt. A skill that fires on everything is worse than no skill at all — it crowds out the ones that actually fit.
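One way to keep yourself honest about the "doesn't apply" prompts is to write both lists down and tally the misses. The sketch below assumes you have some way to tell whether the skill loaded for a given prompt; `loads_skill` stands in for that observation and is entirely hypothetical, as is the stub used to exercise it.

```python
from typing import Callable


def evaluate(
    loads_skill: Callable[[str], bool],
    should_fire: list[str],
    should_not_fire: list[str],
) -> dict[str, list[str]]:
    """Collect the prompts where the skill's behavior did not match expectations."""
    return {
        # Expected to load, but did not: fix the description.
        "false_negatives": [p for p in should_fire if not loads_skill(p)],
        # Loaded when it should have stayed out of the way: tighten the boundary.
        "false_positives": [p for p in should_not_fire if loads_skill(p)],
    }


def stub_loads_skill(prompt: str) -> bool:
    """Stand-in for watching the agent: pretends the skill only knows release notes."""
    return "release note" in prompt.lower()


SHOULD_FIRE = [
    "Draft the release notes for 2.0",
    "Write an error message for a failed login",
]
SHOULD_NOT_FIRE = [
    "Write a PR description for this branch",
    "Add code comments to utils.py",
]

REPORT = evaluate(stub_loads_skill, SHOULD_FIRE, SHOULD_NOT_FIRE)
```

With the stub, `REPORT` shows one false negative (the error-message prompt) and no false positives, which points at the description rather than the body. In practice you would replace the stub with whatever your tool exposes about which skills were loaded, or simply fill in the results by hand after each test run.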