The Grill-Me Skill: A Practical Guide to Better Agentic Development

The grill-me skill is a tiny prompt pattern with a big payoff: make the agent interview you before it builds. This guide explains how it works, how to create your own project skill, and how to fit it into a practical agentic development workflow.

The grill-me skill is almost comically small.

That is why it is useful.

Instead of asking an agent to start coding immediately, you ask it to interview you about the plan, design, or product decision until the fuzzy parts become explicit. It walks the decision tree, asks one question at a time, recommends answers when useful, and inspects the codebase whenever the repository can answer a question directly.

For engineers, this turns “please build the thing” into a better technical plan. For PMs, it turns vague intent into crisp tradeoffs, acceptance criteria, and decision records. For both, it reduces the classic agent failure mode: the agent confidently implements the wrong version of the right idea.

As of June 4, 2026, the pattern is showing up in public skill catalogs, including Matt Pocock’s skills repo and Agent Skills Lab. My read is that the popularity comes from a simple workflow inversion:

do not let the agent write code until it has forced the plan to become testable.

TL;DR

  • grill-me is a reusable agent skill that interviews you about a plan or design before implementation.
  • The core value is alignment: it exposes hidden assumptions, unresolved product decisions, technical constraints, and missing acceptance criteria.
  • A skill is usually a directory with a SKILL.md file containing YAML frontmatter and Markdown instructions.
  • In Codex, project skills live in .agents/skills/<skill-name>/SKILL.md; personal skills can live in $HOME/.agents/skills.
  • In Claude Code, project skills commonly live in .claude/skills/<skill-name>/SKILL.md; personal skills commonly live in ~/.claude/skills.
  • The description field matters more than people expect. It is the trigger predicate the agent uses to decide whether to load the skill.
  • Use grill-me before PRDs, architecture changes, multi-file refactors, issue breakdown, migration plans, and agent-built features.
  • In a Spec Kit workflow, grill-me works best before /speckit.specify, after /speckit.clarify, and before /speckit.plan or /speckit.tasks.
  • The best version for your team should include your own product vocabulary, codebase constraints, review standards, and decision-making style.

What You Will Learn Here

By the end, you should be able to:

  • explain what the grill-me skill does
  • create a project-local grill-me skill
  • adapt it for Codex, Claude Code, or any agent that supports the Agent Skills format
  • use it in a common agentic development workflow
  • combine it with GitHub Spec Kit without duplicating the whole Spec Kit process
  • decide when a grilling session should produce a PRD, issues, tests, or implementation work
  • spot the current gaps before relying on the skill too heavily

Why This Skill Exists

Most agentic development problems are not model problems at first. They are alignment problems.

The agent starts with partial context. The human has unstated assumptions. The project has constraints hidden in code, old issues, ADRs, conventions, and deployment details. Then somebody types:

Build the new billing settings page.

That sounds actionable, but it hides questions like:

  • Which users can see it?
  • Which billing provider is the source of truth?
  • Is this read-only, editable, or both?
  • What happens when the provider API is down?
  • Are we replacing an old flow or adding a new one?
  • What should be tested before this ships?

Humans often resolve these questions in hallway conversations, planning meetings, Slack threads, and code review. Agents need that same clarification loop, but they need it earlier and more explicitly.

grill-me is a compact way to force that loop.

What Is An Agent Skill?

An agent skill is a reusable workflow package. The core unit is a folder with a SKILL.md file:

grill-me/
|-- SKILL.md
|-- references/      optional deeper docs
|-- scripts/         optional executable helpers
`-- assets/          optional templates or examples

The frontmatter tells the agent what the skill is and when to use it. The Markdown body tells the agent how to behave once the skill activates.

---
name: grill-me
description: Interview the user about a plan, product decision, architecture, or implementation strategy before execution. Use when the user says "grill me", asks to stress-test a plan, asks what they are missing, or presents a proposal for critical questioning.
---

Your job is to interview the user before implementation.
Ask one question at a time unless several questions are independent.
Prefer questions that expose risk, ambiguity, dependencies, and missing acceptance criteria.
If the answer is discoverable from the repository, inspect the repository instead of asking.
After the interview, summarize decisions, unresolved gaps, and the recommended next action.

That is enough to create a working first version.
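
To make the two-part structure concrete, here is a minimal sketch of how a loader might split a SKILL.md file into frontmatter and body, then check the basics. It is not how any particular agent parses skills. The function names parse_skill and check_skill are hypothetical, and the parser only understands flat key: value lines, so treat it as an illustration rather than a YAML implementation.

```python
from typing import Dict, List, Tuple


def parse_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split SKILL.md text into (frontmatter fields, Markdown body).

    Handles only flat `key: value` lines, which is all the examples in
    this guide use. Use a real YAML parser for anything richer.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---' fence")
    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


def check_skill(fields: Dict[str, str], body: str) -> List[str]:
    """Return a list of problems; an empty list means the basics are in place."""
    problems = []
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing frontmatter field: {required}")
    if not body:
        problems.append("body is empty: the agent has no instructions")
    return problems


SAMPLE = """---
name: grill-me
description: Interview the user about a plan before execution. Use when the user says "grill me".
---

Your job is to interview the user before implementation.
"""

fields, body = parse_skill(SAMPLE)
print(fields["name"])
print(check_skill(fields, body))
```

A check like this is cheap to run in CI, so a skill with a missing description fails before anyone wonders why it never triggers.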

How Grill-Me Works

The original public versions of grill-me are short. The pattern is not a giant framework. It is a behavior contract:

  • interview the human about the plan
  • walk the decision tree
  • resolve dependencies between decisions
  • ask one question at a time
  • recommend an answer when helpful
  • inspect the codebase instead of asking questions the repository can answer

The agent’s job is not to “review” the idea from a distance. It should create pressure where the plan is soft.

Here is the basic loop:

User brings a plan
        |
        v
Agent identifies unclear decisions
        |
        v
Agent asks the highest-leverage question <-----------+
        |                                            |
        v                                            |
User answers or agent researches the repo            |
        |                                            |
        v                                            |
Agent updates the decision tree                      |
        |                                            |
        +---- more unresolved branches? ---- yes ----+
        |
        no
        |
        v
Summarize decisions, gaps, and next action

This is why the skill is useful for agentic workflows. It turns one vague prompt into a sequence of decisions the agent can later implement, test, or hand off.
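
The loop can also be read as a small algorithm. The sketch below is a toy model of it, not the skill's real implementation (the skill is just instructions to a model). The Decision class, the leverage score, and the repo_facts lookup are all hypothetical stand-ins for judgment the agent exercises on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Decision:
    key: str
    question: str
    leverage: int                      # hypothetical score: higher is asked sooner
    depends_on: List[str] = field(default_factory=list)
    answer: Optional[str] = None
    asked: bool = False


def next_decision(decisions: Dict[str, Decision]) -> Optional[Decision]:
    """Pick the highest-leverage unasked decision whose dependencies are answered."""
    ready = [
        d for d in decisions.values()
        if not d.asked
        and all(decisions[dep].answer is not None for dep in d.depends_on)
    ]
    return max(ready, key=lambda d: d.leverage, default=None)


def interview(
    decisions: Dict[str, Decision],
    ask_user: Callable[[str], Optional[str]],
    repo_facts: Dict[str, str],
) -> Dict[str, List[str]]:
    """Run the loop until nothing askable remains, then summarize."""
    while (d := next_decision(decisions)) is not None:
        d.asked = True
        if d.key in repo_facts:
            # The repository can answer this one, so do not bother the human.
            d.answer = f"{repo_facts[d.key]} (from repo)"
        else:
            d.answer = ask_user(d.question)  # None means "not decided yet"
    return {
        "decided": [f"{d.key}: {d.answer}" for d in decisions.values() if d.answer is not None],
        "unresolved": [d.key for d in decisions.values() if d.answer is None],
    }


decisions = {
    d.key: d
    for d in [
        Decision("inviter", "Who is allowed to invite members?", leverage=3),
        Decision("expiry", "How long is an invite valid?", leverage=2, depends_on=["inviter"]),
        Decision("mailer", "Which service sends the invite email?", leverage=2),
        Decision("billing", "Do pending invites count toward billing?", leverage=1),
    ]
}
scripted_answers = {
    "Who is allowed to invite members?": "workspace admins only",
    "How long is an invite valid?": "7 days",
}
summary = interview(decisions, scripted_answers.get, repo_facts={"mailer": "existing mailer module"})
print(summary)
```

Note that a decision whose dependency stays unanswered is never asked. It lands in the unresolved list, which is the behavior you want from the final summary.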

Create Your Own Grill-Me Skill

For a Codex-friendly project skill, create:

.agents/skills/grill-me/SKILL.md

For a Claude Code project skill, create:

.claude/skills/grill-me/SKILL.md

If your team uses both tools, you can keep one canonical skill in .agents/skills and mirror or symlink it where your toolchain expects it. Keep the content tool-neutral unless you need a platform-specific feature.
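
If you go the symlink route, a short setup script keeps the two locations in sync. This is a sketch under assumptions: scaffold_skill is a hypothetical helper, the skill text is abbreviated, and it assumes a filesystem and toolchain that follow directory symlinks. Verify that your agent actually loads symlinked skills before relying on it.

```python
import os
import tempfile
from pathlib import Path

SKILL_TEXT = """---
name: grill-me
description: Interview the user about a plan before execution. Use when the user says "grill me".
---

Your job is to interview the user before implementation.
"""


def scaffold_skill(repo_root: Path, name: str = "grill-me") -> Path:
    """Create the canonical skill and link it into .claude/skills.

    Returns the path of the canonical SKILL.md. Safe to run more than once.
    """
    canonical = repo_root / ".agents" / "skills" / name
    canonical.mkdir(parents=True, exist_ok=True)
    skill_file = canonical / "SKILL.md"
    if not skill_file.exists():
        skill_file.write_text(SKILL_TEXT, encoding="utf-8")

    mirror = repo_root / ".claude" / "skills" / name
    mirror.parent.mkdir(parents=True, exist_ok=True)
    if not mirror.exists() and not mirror.is_symlink():
        # Relative target so the link still resolves after the repo is cloned elsewhere.
        target = os.path.relpath(canonical, mirror.parent)
        mirror.symlink_to(target, target_is_directory=True)
    return skill_file


# Demo in a throwaway directory so nothing touches a real repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    scaffold_skill(root)
    mirrored = root / ".claude" / "skills" / "grill-me" / "SKILL.md"
    print(mirrored.read_text(encoding="utf-8").splitlines()[1])
```

The canonical copy stays in .agents/skills, so edits happen in one place and the Claude Code path simply follows.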

Here is a stronger team-ready version:

---
name: grill-me
description: Stress-test a plan, design, product decision, architecture, migration, or implementation strategy before execution. Use when the user says "grill me", "poke holes in this", "what am I missing", "does this plan make sense", or asks for critical questioning before coding.
---

# Grill Me

You are an interviewer, not an implementation agent.

Goal: turn an unclear plan into a clear, testable, implementation-ready decision set.

## Process

1. Restate the plan in one short paragraph.
2. Identify the highest-risk unknowns across product, users, data, architecture, operations, security, and rollout.
3. Ask one question at a time unless the questions are independent and can be answered together.
4. If the repository can answer a question, inspect the repository instead of asking the user.
5. For each important decision, offer a recommended answer and explain the tradeoff briefly.
6. Track decisions as accepted, rejected, or unresolved.
7. Do not start implementation until the user explicitly asks you to proceed.

## Question Priorities

Prefer questions that clarify:

- user and business goal
- success criteria
- non-goals
- affected systems and owners
- data model changes
- API contracts
- permissions and security boundaries
- failure modes
- testing strategy
- rollout, migration, and rollback

## Output At The End

Return:

- decisions made
- unresolved gaps
- recommended next step
- suggested artifacts: PRD, issues, tests, ADR, implementation plan, or prototype

This version is still small, but it gives the agent enough structure to behave predictably.

Use It In Your Own Projects

The highest-value use is before implementation, not after.

Good prompts:

/grill-me I want to migrate our billing settings from Stripe-only to multi-provider.
$grill-me We are thinking about replacing our custom permissions table with OpenFGA. Poke holes in the plan.
Grill me on this checkout redesign before we create tickets.

Bad prompts:

Use grill-me and then immediately implement everything.

That collapses the clarification phase and the execution phase into one session. Sometimes that is fine for tiny changes. For meaningful work, keep the phases distinct.

A Common Agentic Development Workflow

Here is a practical flow that works for engineers and PMs:

Idea or request
      |
      v
Grill-me interview
      |
      v
Decision summary
      |
      +--> still unclear? return to interview
      |
      v
Artifact selection
      |
      +--> PRD
      +--> ADR
      +--> issue breakdown
      +--> test plan
      +--> prototype
      |
      v
Implementation agent
      |
      v
Tests, review, and rollout notes

A realistic team loop might look like this:

  1. PM brings a feature idea.
  2. Agent runs grill-me and asks product, data, and edge-case questions.
  3. PM answers the product questions.
  4. Engineer lets the agent inspect the repo for implementation constraints.
  5. Agent produces a decision summary.
  6. Team turns the summary into a PRD or issue breakdown.
  7. Implementation happens in a separate agent pass with tests and review.

The separation matters. grill-me is not the builder. It is the alignment checkpoint before the builder.

Where Grill-Me Fits With GitHub Spec Kit

GitHub Spec Kit and grill-me overlap in a healthy way.

Spec Kit gives you the structured artifact pipeline:

constitution -> specify -> clarify -> checklist -> plan -> tasks -> analyze -> implement

The current Spec Kit docs describe the core flow as Spec -> Plan -> Tasks -> Implement, with supporting commands such as /speckit.constitution, /speckit.clarify, /speckit.checklist, and /speckit.analyze. That means Spec Kit already understands an important truth about AI-assisted development:

do not ask the coding agent to implement until the specification and plan are clear enough to carry the work.

grill-me improves this workflow by moving some clarification earlier, before the formal artifact machinery starts.

Raw idea
  |
  v
grill-me
  |
  v
/speckit.specify
  |
  v
/speckit.clarify + /speckit.checklist
  |
  v
grill-me again, if the spec still hides decisions
  |
  v
/speckit.plan
  |
  v
/speckit.tasks + /speckit.analyze
  |
  v
/speckit.implement

The key distinction is simple:

| Tool | Best job |
|---|---|
| grill-me | Pressure-test intent, assumptions, tradeoffs, missing context, and human decisions. |
| /speckit.specify | Convert the clarified idea into a structured requirements artifact. |
| /speckit.clarify | Resolve ambiguities already present in the generated spec. |
| /speckit.checklist | Validate clarity, completeness, and consistency of requirements. |
| /speckit.plan | Translate requirements into technical architecture and implementation strategy. |
| /speckit.tasks | Break the plan into ordered, testable implementation steps. |
| /speckit.analyze | Cross-check consistency and coverage before implementation. |

So grill-me is not “Spec Kit, but smaller.” It is the conversation you should often have before Spec Kit receives the prompt that becomes the source artifact.

The Overlap: Grill-Me vs /speckit.clarify

The clearest overlap is with /speckit.clarify.

Both are designed to stop the agent from guessing. Both are useful when the request is underspecified. Both should produce better implementation context.

But they operate at different moments:

grill-me
  -> before the first spec exists
  -> helps decide what should enter the spec
  -> can challenge whether the feature is worth doing this way

/speckit.clarify
  -> after spec.md exists
  -> resolves ambiguity in a concrete artifact
  -> updates or informs the formal Spec Kit flow

My practical rule:

Use grill-me when the idea is still conversational. Use /speckit.clarify when the idea has already become spec.md.

That gives PMs and engineers a better handoff. The PM can use grill-me to sharpen goals, non-goals, user stories, and acceptance criteria. Then the engineer can use /speckit.specify and /speckit.clarify to turn those decisions into versioned artifacts.

The Improvement: Better Inputs To /speckit.specify

Spec Kit’s /speckit.specify works best when the prompt is explicit about what and why, not the tech stack. That is exactly where grill-me helps.

Instead of this:

/speckit.specify Add team invites.

Run:

/grill-me I want to add team invites. Help me find the missing decisions before I create the spec.

Then feed the resulting decisions into Spec Kit:

/speckit.specify Add email-based team invites for workspace admins.
Admins can invite users by email, invites expire after 7 days, pending
invites can be revoked, and accepted invites add the user to the workspace
with the selected role. V1 excludes bulk invites, SCIM, and invite links.
Success means an admin can invite, revoke, and track pending invites without
changing existing workspace membership behavior.

That second prompt gives Spec Kit much better raw material. It has user role, scope, lifecycle, non-goals, and success criteria. The generated spec.md should need less cleanup because the human decisions are already visible.
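
One way to keep that discipline is to refuse to build the prompt until the grilling session has produced every part. The sketch below assumes a hypothetical SpecInput structure of my own. Spec Kit does not require or know about it, and the resulting text is simply pasted after /speckit.specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpecInput:
    summary: str                       # what is being built, and for whom
    behaviors: List[str]               # lifecycle and rules, in product terms
    non_goals: List[str] = field(default_factory=list)
    success: str = ""

    def missing(self) -> List[str]:
        """Name the parts a grilling session still has to produce."""
        gaps = []
        if not self.behaviors:
            gaps.append("behaviors")
        if not self.non_goals:
            gaps.append("non_goals")
        if not self.success:
            gaps.append("success")
        return gaps

    def to_prompt(self) -> str:
        gaps = self.missing()
        if gaps:
            raise ValueError(f"not ready for /speckit.specify, missing: {gaps}")
        lines = [
            f"/speckit.specify {self.summary}",
            " ".join(b.rstrip(".") + "." for b in self.behaviors),
            "V1 excludes " + ", ".join(self.non_goals) + ".",
            "Success means " + self.success.rstrip(".") + ".",
        ]
        return "\n".join(lines)


spec = SpecInput(
    summary="Add email-based team invites for workspace admins.",
    behaviors=["Invites expire after 7 days", "Pending invites can be revoked"],
    non_goals=["bulk invites", "SCIM", "invite links"],
    success="an admin can invite, revoke, and track pending invites",
)
print(spec.to_prompt())
```

The ValueError is the useful part: an empty non-goals list usually means the interview stopped too early.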

The Improvement: Better Inputs To /speckit.plan

grill-me is also useful before /speckit.plan, but the questions should change.

Before /speckit.specify, ask product questions:

  • Who is this for?
  • What behavior changes?
  • What is out of scope?
  • What counts as success?

Before /speckit.plan, ask engineering questions:

  • Which existing modules are touched?
  • What data model changes are unavoidable?
  • What contracts or clients might break?
  • What security boundaries apply?
  • What rollback path exists?
  • What tests should fail first?

That maps cleanly to Spec Kit because /speckit.plan is where the feature moves from product intent to technical implementation. A short grill-me pass before planning can catch over-engineering, missing repository constraints, or architectural decisions that should be written into the plan instead of discovered during implementation.

A Practical Combined Workflow

For a meaningful feature, I would use this sequence:

1. /grill-me <raw idea>
   Output: decisions, gaps, non-goals, acceptance criteria

2. /speckit.specify <decision summary>
   Output: specs/<feature>/spec.md

3. /speckit.clarify
   Output: clarified requirements in the spec flow

4. /speckit.checklist
   Output: requirements quality review

5. /grill-me <spec + planned technical direction>
   Output: risk questions before architecture hardens

6. /speckit.plan <tech stack and architecture decisions>
   Output: plan.md, research.md, data-model.md, contracts, quickstart

7. /speckit.tasks
   Output: tasks.md

8. /speckit.analyze
   Output: consistency and coverage findings

9. /speckit.implement
   Output: code, tests, and implementation progress

For small changes, this is too much ceremony. For cross-module work, migrations, permissions, billing, onboarding flows, or anything expensive to reverse, it is a very reasonable amount of friction.
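
Because each stage leaves an artifact behind, the next step can be inferred from what already exists. The helper below is a rough sketch: next_step is a hypothetical function, and it assumes spec.md, plan.md, and tasks.md sit together in one feature directory, which may not match your Spec Kit layout.

```python
import tempfile
from pathlib import Path


def next_step(feature_dir: Path) -> str:
    """Suggest the next command from which artifacts already exist.

    Assumes spec.md, plan.md, and tasks.md share one feature directory;
    adjust the paths if your layout differs.
    """
    if not (feature_dir / "spec.md").exists():
        return "/grill-me <raw idea>, then /speckit.specify"
    if not (feature_dir / "plan.md").exists():
        return "/speckit.clarify and /speckit.checklist, then /grill-me before /speckit.plan"
    if not (feature_dir / "tasks.md").exists():
        return "/speckit.tasks"
    return "/speckit.analyze, then /speckit.implement"


# Demo against a temporary feature directory.
with tempfile.TemporaryDirectory() as tmp:
    feature = Path(tmp) / "specs" / "team-invites"
    feature.mkdir(parents=True)
    print(next_step(feature))
    (feature / "spec.md").write_text("# Team invites\n", encoding="utf-8")
    print(next_step(feature))
```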

How To Encode This In A Project Skill

If your team already uses Spec Kit, add a short Spec Kit rule to your grill-me skill:

## Spec Kit Integration

If this repo uses GitHub Spec Kit:

- Before `/speckit.specify`, help the user produce a concise decision summary focused on what, why, users, scope, non-goals, and success criteria.
- After `spec.md` exists, do not replace `/speckit.clarify`; recommend it for artifact-level ambiguity.
- Before `/speckit.plan`, ask engineering risk questions about modules, data, APIs, permissions, tests, rollout, and rollback.
- Before `/speckit.implement`, prefer `/speckit.analyze` over ad-hoc confidence.
- Never let implementation start while major decisions remain unresolved.

This small addition prevents tool confusion. grill-me owns the human interview. Spec Kit owns the durable artifacts.
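
If you maintain the skill across several repositories, appending that section by script avoids copy-paste drift. This is a minimal sketch: ensure_section is a hypothetical helper, and the section text here is an abbreviated version of the rules above.

```python
import tempfile
from pathlib import Path

SPEC_KIT_SECTION = """## Spec Kit Integration

If this repo uses GitHub Spec Kit:

- Before `/speckit.specify`, help the user produce a concise decision summary.
- After `spec.md` exists, recommend `/speckit.clarify` for artifact-level ambiguity.
- Never let implementation start while major decisions remain unresolved.
"""


def ensure_section(skill_file: Path, section: str) -> bool:
    """Append `section` unless its heading line is already present.

    Returns True when the file was changed, False when it was left alone.
    """
    heading = section.splitlines()[0]
    text = skill_file.read_text(encoding="utf-8")
    if heading in text.splitlines():
        return False
    skill_file.write_text(text.rstrip("\n") + "\n\n" + section, encoding="utf-8")
    return True


# Demo: the second call is a no-op because the heading already exists.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "SKILL.md"
    skill.write_text("# Grill Me\n", encoding="utf-8")
    print(ensure_section(skill, SPEC_KIT_SECTION))
    print(ensure_section(skill, SPEC_KIT_SECTION))
```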

What To Ask During The Interview

A good grilling session should not feel random. It should walk from intent to constraints to execution.

Intent
  |
  +-- Who is this for?
  +-- What problem does it solve?
  +-- What outcome proves it worked?

Scope
  |
  +-- What is included?
  +-- What is explicitly not included?
  +-- What existing behavior must not change?

System design
  |
  +-- What data changes?
  +-- What API contracts change?
  +-- What permissions apply?
  +-- What failures should be graceful?

Delivery
  |
  +-- How do we test it?
  +-- How do we roll it out?
  +-- How do we revert it?

The best agent questions usually sound simple:

  • “What should happen when this dependency is unavailable?”
  • “Which existing behavior must remain unchanged?”
  • “Who owns the source of truth for this field?”
  • “What is the smallest version that still proves the idea?”
  • “What would make this change unsafe to ship?”

Those questions are boring in the best possible way. They prevent expensive surprises.

Example: From Vague Feature To Better Plan

Initial prompt:

Grill me on this: I want to add team invites to the app.

The skill should avoid jumping straight to database migrations. A useful first pass might ask:

Before we design the flow: who is allowed to invite members?

Recommended default: only workspace admins can invite members in v1.

Why: it keeps permissions simple, avoids accidental access expansion,
and gives us a clear policy to test.

After a few turns, the agent can summarize:

Decisions made:
- v1 supports email invites only.
- only workspace admins can invite users.
- invites expire after 7 days.
- accepting an invite requires login or signup with the invited email.
- deleted workspaces invalidate pending invites.

Unresolved:
- whether to support resend invite in v1
- whether invited users count toward billing before acceptance

Recommended next step:
- create a short PRD and split implementation into API, email, UI, and tests.

That is much better input for an implementation agent than the original one-line idea.
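
If you want the summary in a form other tools can consume, the accepted, rejected, and unresolved states map naturally onto a small data structure. The sketch below is one possible shape. Status, Entry, and render_summary are hypothetical names, and the rejected entry is borrowed from the non-goals in the earlier Spec Kit prompt.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    UNRESOLVED = "unresolved"


@dataclass
class Entry:
    text: str
    status: Status


def render_summary(entries: List[Entry], next_step: str) -> str:
    """Format a decision log in the same shape as the summary above."""
    def bullets(status: Status) -> List[str]:
        return [f"- {e.text}" for e in entries if e.status is status]

    sections = [
        ("Decisions made:", bullets(Status.ACCEPTED)),
        ("Rejected:", bullets(Status.REJECTED)),
        ("Unresolved:", bullets(Status.UNRESOLVED)),
        ("Recommended next step:", [f"- {next_step}"]),
    ]
    # Skip empty sections so short sessions produce short summaries.
    blocks = ["\n".join([title, *lines]) for title, lines in sections if lines]
    return "\n\n".join(blocks)


log = [
    Entry("v1 supports email invites only.", Status.ACCEPTED),
    Entry("only workspace admins can invite users.", Status.ACCEPTED),
    Entry("bulk invites in v1", Status.REJECTED),
    Entry("whether to support resend invite in v1", Status.UNRESOLVED),
]
print(render_summary(log, "create a short PRD and split implementation into API, email, UI, and tests."))
```

Keeping rejected options in the log is a deliberate choice: it stops the implementation agent from rediscovering an idea the team already ruled out.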

Make It Project-Specific

The public grill-me skill is intentionally generic. Your team version should not stay generic forever.

Add your local rules:

## Project-Specific Checks

For this repo, always ask about:

- tenant boundaries before changing data access
- audit events before changing admin workflows
- backward compatibility for mobile clients
- feature flag and rollback strategy
- whether the change needs an ADR under docs/adr/

Add local artifacts:

## Repository Context

If the plan touches permissions, read:

- references/permissions-model.md
- docs/adr/004-authorization-boundaries.md

If the plan touches billing, inspect:

- src/billing/
- worker/billing-sync/
- references/billing-provider-contract.md

This is where skills become much more than prompts. They become reusable project memory, but loaded only when the task calls for it.
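
Project memory goes stale when files move. A small check can flag path-like bullets in the skill that no longer exist in the repository. This is a sketch with assumptions: stale_references is a hypothetical helper, and the regular expression only recognizes bullets that consist of a single slash-containing path with no spaces.

```python
import re
import tempfile
from pathlib import Path
from typing import List

# Matches bullets such as "- src/billing/" or "- docs/adr/004-x.md",
# but not prose bullets, because those contain spaces.
PATH_ITEM = re.compile(r"^- ([\w./-]+/[\w./-]*)$", re.MULTILINE)


def stale_references(skill_text: str, repo_root: Path) -> List[str]:
    """List path-like bullet items in the skill that do not exist in the repo."""
    return [p for p in PATH_ITEM.findall(skill_text) if not (repo_root / p).exists()]


SKILL_SNIPPET = """## Repository Context

If the plan touches billing, inspect:

- src/billing/
- worker/billing-sync/
- references/billing-provider-contract.md
- whether the change needs an ADR under docs/adr/
"""

# Demo: only src/billing/ exists in this throwaway repo.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "billing").mkdir(parents=True)
    print(stale_references(SKILL_SNIPPET, root))
```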

When Not To Use Grill-Me

Do not use it for every tiny task.

Skip it when:

  • the change is mechanical
  • the implementation is obvious and low-risk
  • the plan already has clear acceptance criteria
  • the user explicitly wants fast execution
  • the cost of questioning is higher than the cost of a quick patch

Use it when:

  • the work crosses teams or modules
  • the plan affects user-visible behavior
  • requirements are ambiguous
  • architecture or data shape might change
  • a rollback would be painful
  • PMs and engineers need shared language before work starts
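
The two lists can be turned into a rough triage rule. The sketch below is only an illustration: the Change fields mirror some of the bullets above, while the scoring and the rule that painful rollbacks always win are my own assumptions, not measured thresholds.

```python
from dataclasses import dataclass


@dataclass
class Change:
    # Signals from the "use it when" list.
    crosses_modules: bool = False
    user_visible: bool = False
    ambiguous_requirements: bool = False
    changes_data_shape: bool = False
    painful_rollback: bool = False
    # Signals from the "skip it when" list.
    mechanical: bool = False
    has_acceptance_criteria: bool = False
    user_wants_speed: bool = False


def should_grill(change: Change) -> bool:
    """Return True when an interview is likely worth its cost.

    The scoring is an illustrative assumption; tune it against real work.
    """
    risk = sum([
        change.crosses_modules,
        change.user_visible,
        change.ambiguous_requirements,
        change.changes_data_shape,
        change.painful_rollback,
    ])
    skip = sum([
        change.mechanical,
        change.has_acceptance_criteria,
        change.user_wants_speed,
    ])
    # Assumption: hard-to-reverse work is grilled regardless of skip signals.
    if change.painful_rollback:
        return True
    return risk > skip


print(should_grill(Change(mechanical=True)))
print(should_grill(Change(crosses_modules=True, ambiguous_requirements=True, user_wants_speed=True)))
```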

Common Mistakes

The first mistake is making the skill too polite. A good grilling session should be friendly, but it should not be passive. The agent should challenge assumptions and ask for missing evidence.

The second mistake is asking too many questions at once. A wall of twenty questions feels like homework. The skill should ask the highest-leverage question first, then adapt.

The third mistake is never letting the agent inspect the codebase. If the repository can answer a question, the agent should look. Asking the human to restate facts already present in code wastes attention.

The fourth mistake is letting the same session drift into implementation. Keep the boundary clear:

Grill first.
Decide second.
Implement third.
Review fourth.
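
The second mistake is easy to check mechanically once you have a transcript. The sketch below flags agent turns that ask too much at once. overloaded_turns is a hypothetical helper, and counting question marks is a crude proxy that will miscount rhetorical questions or question marks inside quoted code.

```python
from typing import List, Tuple


def overloaded_turns(agent_turns: List[str], max_questions: int = 1) -> List[Tuple[int, int]]:
    """Return (turn index, question count) for turns that ask too much at once.

    Counts '?' characters, which is a rough proxy for the number of questions.
    """
    flagged = []
    for index, turn in enumerate(agent_turns):
        count = turn.count("?")
        if count > max_questions:
            flagged.append((index, count))
    return flagged


transcript = [
    "Before we design the flow: who is allowed to invite members?",
    "Which provider sends email? What happens when it is down? Who owns the template?",
    "Recommended default: invites expire after 7 days. Does that work?",
]
print(overloaded_turns(transcript))
```

Raising max_questions is reasonable when the skill is allowed to batch independent questions, as the process section permits.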

Current Gaps And Honest Caveats

The practical evidence that the grill-me pattern helps teams think before coding is encouraging but largely anecdotal. There is not yet a large public benchmark proving that this exact skill reduces defects, cycle time, or rework across many repos.

The current evidence is mostly:

  • public skill adoption and reuse
  • creator documentation
  • agent skill specifications
  • engineering experience with planning, review, and clarification loops
  • broader research and practice around agent skills, progressive disclosure, and workflow packaging

That means the right posture is pragmatic, not magical. Treat grill-me as a cheap alignment tool. Test it on real project work. Keep the questions that catch real issues. Cut the ones that create ceremony.

If you want to make this article’s workflow more operational inside a team, the next useful additions would be:

  • a sample transcript from a real feature discussion
  • a before/after comparison of an agent-built feature with and without grill-me
  • a project template that includes .agents/skills/grill-me/SKILL.md
  • a Spec Kit preset or project-local override that automatically inserts a grill-me checkpoint before /speckit.specify and /speckit.plan
  • a checklist for PMs to decide when to trigger the skill
  • a test harness for checking whether the skill asks useful questions
  • examples for specific domains: billing, auth, data migrations, mobile releases, and internal tools

Source List