This is a first-person, hands-on write-up of how I experience using Copilot coding agents today. It includes examples, a gotcha I hit along the way, and a short checklist so you can try this in your own repos.
Why I started using Copilot coding agents
I started experimenting with Copilot coding agents because I wanted to scaffold things that IDE-based agents do not handle, and to see what an assistant could do for me (refactoring scaffolding, creating test harnesses, running quick migrations) when it can actually execute code in a prepared environment instead of just suggesting edits. Over the last few months I tested agents on various projects and refined a small set of rules that helped me get a feel for what delegating tasks to an agent can currently do.
Beyond IDE Agents: What Copilot Coding Agents Can Do
Unlike traditional IDE-based agents, Copilot coding agents can perform a broader range of actions, including:
- Branch Creation: Automatically create new branches for development tasks.
- Environment Setup and WIP PRs: Set up development environments and initiate Work-In-Progress (WIP) Pull Requests on GitHub.
- Task/Issue Deduction: Understand and deduce tasks from issues or descriptions; the deduced tasks are added to the WIP PR description.
- Building and Running Tasks: Execute build processes and run various development tasks defined in earlier steps.
- WIP PR to Draft PR: Transition a WIP Pull Request to a draft Pull Request with a pre-filled body.
Where coding agents excel (and where they don’t)
Based on my experience, here’s where coding agents currently shine and where they struggle:
✅ Good at:
- Little tasks: Small, well-defined tasks are handled well.
- Dependency updates: Updating dependencies is a task that agents can perform reliably.
⚠️ Okay at:
- Tasks with a lot of moving parts: For example, major refactors can be challenging for agents to handle on their own.
❌ Terrible at:
- Major UI changes: user interface work is not a strong suit for coding agents at the moment.
How agents run in practice: the copilot-setup-steps.yml convention
The typical flow I use now is:
- Create `.github/workflows/copilot-setup-steps.yml` on the repository's default branch.
- In that workflow, prepare everything the agent needs: runtimes, credentials (via environment secrets), databases, caches.
- Let the agent run against that prepared environment: the agent then performs code changes, runs tests, and can open PRs as usual.
Treat `copilot-setup-steps.yml` as a minimal pre-flight script: it should be fast, idempotent, and only prepare what the agent actually needs.
Minimal example I use as a starting point
```yaml
name: GitHub Copilot Setup

on: workflow_dispatch

jobs:
  copilot-setup-steps:
    name: Setup Environment for GitHub Copilot
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    environment: copilot
    steps:
      - name: Setup dotnet
        uses: actions/setup-dotnet@v5
        with:
          dotnet-version: 9.0.x
      - name: checkout
        uses: actions/checkout@v5
      - name: Configure NuGet sources
        run: |
          if [[ -n "${{ secrets.PACKAGES_PAT }}" ]]; then
            dotnet nuget update source "github" --username this-is-irrelevant --password ${{ secrets.PACKAGES_PAT }} --store-password-in-clear-text
          else
            echo "Could not find Packages PAT"
            exit 1
          fi
      - name: Azure login
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
```
Notes from my experience:
- The job must be named `copilot-setup-steps` and the file must be `copilot-setup-steps.yml`, exactly; anything else is ignored.
- Keep this workflow quick: long boot times mean more waiting (and, on paid plans, more compute cost).
- Remember the permissions the steps need: `contents: read` for checkout and `id-token: write` for the Azure login.
Secrets, environments, and the things that bit me
I use a repository Environment named `copilot` and put environment variables and secrets there instead of raw repository secrets. In practice: put the credentials the agent legitimately needs into the `copilot` environment in the repository settings, and limit who can modify that environment. You might need to put organization credentials there as well; the exact requirements seem to shift over time, and what wasn't necessary earlier is now.
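To spell out the mechanism: the `environment: copilot` line on the setup job is what makes that environment's secrets available to `${{ secrets.* }}` inside the job (environment secrets take precedence over repository secrets of the same name). Here is a trimmed excerpt of the example above:

```yaml
jobs:
  copilot-setup-steps:
    runs-on: ubuntu-latest
    # Secrets referenced in this job are resolved against the "copilot" environment.
    environment: copilot
    steps:
      - name: Configure NuGet sources
        run: dotnet nuget update source "github" --username this-is-irrelevant --password ${{ secrets.PACKAGES_PAT }} --store-password-in-clear-text
```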
A gotcha I ran into and how I handled it:
Prerequisites not always picked up: even when I put prerequisite steps in a GitHub issue, the agent wouldn't always pick them up. I found I needed to refer to the prerequisites between every step. Luckily, with the new steering capabilities, I could guide the agent mid-session to make sure it followed the necessary steps.
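One way to make prerequisites harder to miss is to bake them into the issue itself, for example with a GitHub issue form that forces a dedicated prerequisites field. This is only a sketch; the file name, ids, and labels below are placeholders, not anything the agent requires:

```yaml
# .github/ISSUE_TEMPLATE/agent-task.yml (hypothetical)
name: Agent task
description: A task intended for the Copilot coding agent
labels: ["copilot"]
body:
  - type: textarea
    id: prerequisites
    attributes:
      label: Prerequisites
      description: Constraints the agent must respect at every step.
    validations:
      required: true
  - type: textarea
    id: steps
    attributes:
      label: Task steps
      description: Repeat the relevant prerequisites inside each step.
    validations:
      required: true
```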
A practical checklist (so you can try this quickly)
- Add `.github/workflows/copilot-setup-steps.yml` to your default branch.
- Add a repository environment named `copilot` and add the environment variables / environment secrets used by the setup workflow.
- Keep the setup job fast: install only what is necessary and cache dependencies (see the caching sketch after this list).
- Consider self-hosted runners if you need internal network access or want to keep secrets entirely on-prem.
- Add a short `AGENTS.md` describing rules (what an agent may change, code style, testing expectations).
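As a concrete example of the caching point: for the .NET setup workflow earlier in this post, a cache step along these lines keeps the pre-flight job fast. It is a sketch that assumes you commit `packages.lock.json` files; if you don't, key on your project files instead:

```yaml
- name: Cache NuGet packages
  uses: actions/cache@v4
  with:
    # NuGet's global package folder on the runner.
    path: ~/.nuget/packages
    # Reuse the cache until the lock files change.
    key: nuget-${{ runner.os }}-${{ hashFiles('**/packages.lock.json') }}
    restore-keys: |
      nuget-${{ runner.os }}-
```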
Defining Agent Instructions: The AGENTS.md Approach
To ensure your Copilot coding agent operates effectively and adheres to project standards, it's crucial to provide clear and explicit instructions. A common practice is to create an `AGENTS.md` file (or `copilot-instructions.md`, which I initially used) in your repository to house these guidelines. These instructions act as guardrails, directing the agent's behavior and ensuring consistency.
Here are some examples of instructions I’ve used to guide my Copilot agent (rename ‘Test’ and ‘Production’ to your own purposes):
* **Azure Best Practices:** When working with Azure, always invoke the `azure_development-get_best_practices` tool if available to ensure adherence to best practices.
* **E2E Test Rules:** Ensure End-to-End (E2E) tests do not contain `.only` in `.describe` or `.it` blocks. Remove any instances found.
* **Coding Standards:** Apply specific coding standards for languages like C#, Bicep, Dockerfile, and TypeScript by following guidelines in their respective instruction files (e.g., `./.github/instructions/csharp.instructions.md`).
* **Azure Environment Rules:** Adhere to strict rules for Azure deployments:
* Use 'Test' for Test deployments and 'Production' for Production deployments.
* Never deploy to test and production simultaneously.
* Production deployments require PIM (Privileged Identity Management) role activation.
* Prefer updating Bicep templates over direct Azure CLI modifications for infrastructure changes.
* **ADR Adherence:** Before implementing changes, review accepted Architectural Decision Records (ADRs) in the `docs/design-decisions` directory, ensuring their status is "Accepted" and considering their consequences.
* **PR Creation:** When creating a Pull Request, include a reference to an Azure Boards work item using the format `[AB#12345]` above any headings in the PR description.
Pitfalls & advice from real runs
- Explore the capabilities of coding agents, but set boundaries: know what they're good at (the small things first). Then, as you build up your repository instructions, gradually increase what you hand off to the coding agent.
- Write clear tasks: the agent will be much more useful if you give it a ticket with clear acceptance criteria and a test it must pass. I started using a PR template specifically for agent-generated PRs so reviewers know what to focus on.
- Review everything: agents will make suggestions that look plausible but can introduce subtle mistakes. Always review and run the test suite locally.
Closing thoughts
I like how the combination of mission control (Agent HQ) + explicit pre-flight workflows made agents useful rather than noisy. The plan mode and steering affordances gave me the confidence to run medium-length tasks without babysitting everything.
At the same time, good repository hygiene is now essential: lock permissions on the `copilot` environment, and maintain a short `AGENTS.md` documenting what agents should and should not do. If you need absolute control over secrets or internal network access, self-hosted runners are worth considering.
Please try it at least three times before judging it. Share your thoughts in the comments below.
