
A Good Agent Skill Is a Contract, Not a Prompt

A practical follow-up on turning agent instructions into focused, testable workflows.



When people start using AI coding agents seriously, they usually make the same mistake: every time the agent does something wrong, they add another instruction. One more rule, one more exception, one more paragraph explaining what "good" means.

After a while, the prompt becomes a coding standard, a Git workflow, a deployment guide, a ticketing manual, and a team handbook in one file. And the agent still makes mistakes.

The problem is not that the prompt is too short. The problem is that the instructions do not have boundaries.

This article is a follow-up to Stop Writing Bigger Prompts. Start Designing Agent Skills.

In that article, I wrote about why bigger prompts are not the right answer and why agent skills are a better abstraction. This article goes one step further: what should a good skill actually look like?

The short answer:

A good agent skill is a contract, not a prompt.

It should clearly define when it applies, what it owns, what is allowed, what is forbidden, and what output is expected. A useful skill is not longer. It is clearer.

The Problem With "Best Practices" Skills

A weak skill sounds like this:

Write clean code.
Follow best practices.
Keep things simple.
Add tests where needed.
Use meaningful names.

It is hard to disagree with any of that, but it is also almost useless. The agent still has to guess what "clean" means, which practices matter, and whether "simple" means fewer files, fewer dependencies, or less cognitive load.

The instruction sounds good, but it does not reduce ambiguity. It gives the agent a direction, but not a contract.

A better skill is more concrete:

Use this skill when editing Go code.

Prefer table-driven tests for pure functions.
Use the standard library unless the repository already uses a dependency for the same problem.
Run `go test ./...` after changes when feasible.
Do not change public APIs unless the user explicitly asks for it or the task cannot be completed otherwise.
Report any tests that could not be run.

This is still short, but it tells the agent when the skill is active, what is preferred, what is forbidden, how to verify the work, and what to report back.

At that point, it stops being motivational text and starts becoming something the agent can actually follow.
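
To make "prefer table-driven tests for pure functions" concrete, here is a minimal sketch of what that line asks for. `Slugify` is a hypothetical pure function; the shape of the test is the point:

```go
package text

import "testing"

func TestSlugify(t *testing.T) {
	// Each case is one row in the table: a name, an input, and the
	// expected output. Extending coverage means adding a row, not a new test.
	cases := []struct {
		name, in, want string
	}{
		{"lowercases", "Hello", "hello"},
		{"replaces spaces", "hello world", "hello-world"},
		{"drops symbols", "a&b", "ab"},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := Slugify(tc.in); got != tc.want {
				t.Errorf("Slugify(%q) = %q, want %q", tc.in, got, tc.want)
			}
		})
	}
}
```

A rule like this is checkable: either the new test is table-driven or it is not. That is what separates a contract clause from advice.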

What Makes a Skill a Contract

A useful skill needs a clear trigger. If a skill is always active, it slowly becomes part of the monolithic prompt again. Good triggers are boring and concrete:

  • Use this skill when creating a Git commit.

  • Use this skill when reviewing a pull request.

  • Use this skill when editing Go code.

A bad trigger is too broad:

Use this skill when doing software engineering.

That describes almost everything, which means it does not help the agent decide whether the skill should be active.

A skill also needs one domain. A Git workflow skill should know how to inspect the working tree, prepare a commit message, avoid staging unrelated changes, and report the commit hash. It should not decide how Go code should be structured or whether a ticket is ready for development.

This is where many skill files start to rot: they begin as one focused rule and slowly become a second system prompt.

The test is simple:

Could this skill be owned by one person or one team?

If the answer is no, split it.

Boundaries and Output

Most people focus on what a skill should tell the agent to do. That matters, but the more important question is often:

What should this skill prevent the agent from doing?

Boundaries reduce surprise.

Examples:

  • Do not rewrite unrelated files.

  • Do not run destructive Git commands.

  • Do not change public APIs unless required.

  • Do not introduce a new dependency without a clear reason.

  • Do not hide failed verification.

These rules are not there to slow the agent down. They make the agent predictable, and predictability is more important than cleverness when an agent works in a real codebase.

A skill should also describe the expected output. Without it, "good result" is a matter of taste.

Bad Git Skill vs Good Git Skill

Here is a weak Git skill:

You are an expert Git assistant.
Create good commits.
Use clear commit messages.
Be careful with user changes.
Follow best practices.

Again, nothing here is wrong, but almost everything is vague. What is a good commit? What does "careful" mean? Should the agent stage all changes? What should it report after the commit?

The skill leaves too much room for interpretation.

A better version is more explicit:

# Git Commit Skill

Use this skill when the user asks you to create a commit.

Before committing:
- Run `git status --short`.
- Inspect changed files enough to understand the commit scope.
- Do not stage unrelated user changes.
- If unrelated changes are present, leave them unstaged and mention them.

Commit rules:
- Create one commit for the requested work.
- Use the repository's local commit message convention if one exists.
- Do not amend, rebase, reset, or rewrite history unless the user explicitly asks.

Verification:
- Run relevant tests when feasible.
- If tests are skipped, explain why.

Final response:
- Include the commit hash.
- Summarize what was committed.
- Mention any remaining uncommitted changes.

This skill is not much longer, but it is much clearer. It has a trigger, scope, boundaries, verification, and an output contract.

It also makes debugging easier. If the agent stages unrelated files anyway, the failure is on the agent, not on ambiguous instructions.
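
For example, a final response that honors this contract might look like the following (the hash and file names are illustrative):

```
Committed 3f2a91c: add input validation to the import endpoint.
Verification: ran the repository's test suite; all tests passed.
Left unstaged: unrelated formatting changes in README.md.
```

If any of those three lines is missing, the output contract was violated, and you know exactly where to look.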

Full Example

I prepared a small companion example with weak and improved skill files, plus a minimal router. The example is intentionally plain Markdown:

  • bad-skill.md shows a skill that sounds reasonable but leaves too much room for interpretation.

  • good-git-commit-skill.md shows a narrower workflow with trigger, boundaries, verification, and final response rules.

  • router.md shows when that skill should be active.
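
The router itself can be as plain as a mapping from concrete triggers to skill files. This is a sketch of the idea, not the exact contents of the companion file, and the review skill named here is illustrative:

```markdown
# Skill Router

- The user asks to create a commit → load good-git-commit-skill.md
- The user asks to review a pull request → load pr-review-skill.md
- No trigger matches → load no skill
```

Note what the router does not do: it does not describe how to commit or review. It only decides which contract is active.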

The Checklist

Before calling something a skill, run it through this checklist:

  • Does it have a clear trigger?

  • Does it own one domain?

  • Does it define allowed actions?

  • Does it define forbidden actions?

  • Does it describe the expected output?

  • Could you test whether the agent followed it?

If the answer to most of these is "no", you do not have a skill yet. You have a prompt fragment.
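
If you want a starting point, the checklist maps directly onto a file layout. Here is one way to skeleton a new skill; every placeholder is yours to fill in:

```markdown
# <Skill Name>

Use this skill when <one concrete trigger>.

Scope:
- <the single domain this skill owns>

Allowed:
- <preferred tools, patterns, and commands>

Forbidden:
- <actions this skill must prevent>

Verification:
- <checks to run when feasible, and what to report if skipped>

Final response:
- <what the agent must include in its answer>
```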

When an agent fails, do not immediately add another sentence to the main prompt. Ask whether the right skill was loaded, whether the boundary was clear, and whether the expected output was defined.

The useful mental model is not:

What else should I tell the agent?

The useful mental model is:

What contract should this skill enforce?

That shift changes how you design agent instructions. You stop writing "be careful" and start writing "do not stage unrelated files".

In other words, you stop writing bigger prompts and start designing smaller contracts.

That is the whole point.

A skill should make the agent guess less. Not because the agent is stupid, but because guessing is where most workflow mistakes come from.

The less room you leave for interpretation, the more boring and reliable the agent becomes.

And boring is exactly what you want from an agent that touches your codebase.


Hope this helps,

Cheers!
