Copilot agent billing works by charging you for what the agent does, not just for the seat the user holds. Every major Copilot agent product — GitHub Copilot coding agent, Microsoft 365 Copilot Studio agents, Microsoft 365 Copilot Chat agents, and Azure AI Foundry Agent Service — pairs a base seat or tenant license with a second layer of metered consumption. That second layer is where most buyers get surprised.
The problem the industry is solving is runaway inference cost. Large language models are expensive to run, so Microsoft and GitHub now shift part of the cost to the customer through messages, Copilot Credits, premium requests, and — starting June 1, 2026 — GitHub AI Credits. If you ignore these meters, agents keep running and your invoice keeps growing. The direct consequence is a bill that can jump from the sticker price of $19 per user to hundreds or thousands of dollars in overage in a single month.
A recent industry breakdown found that the GitHub coding agent consumes one premium request per session, multiplied by the model rate, and has been tracked in a separate SKU since November 1, 2025. That small multiplier can turn a $39 seat into a $120 monthly cost for a single developer.
- 🧾 How seats, messages, credits, and tokens stack on top of each other
- 💸 Exact 2026 pricing for every Copilot agent tier, with dollar math
- 🧮 Three real billing scenarios with the invoice walk-through
- ⚠️ Ten mistakes that trigger surprise overages
- 🛡️ How to set budgets, spending limits, and billing policies that actually hold
The Two-Layer Billing Model Behind Every Copilot Agent
Every Copilot agent is billed in two layers. The first layer is the license — a flat monthly fee per user or per tenant that unlocks the product. The second layer is the consumption meter — a variable charge tied to how many actions, messages, premium requests, or tokens the agent uses.
Microsoft and GitHub structure billing this way because agent workloads are unpredictable. One user might fire off 10 agent sessions a day. Another might fire off 400. A flat fee cannot cover both without either overcharging the light user or bleeding money on the heavy user. The consequence is that every Copilot buyer must read two line items on the invoice — the seat cost and the meter cost — and plan for both.
A common misconception is that “Copilot is $30 a month.” That figure is only the seat. The meter runs on top of it, and on Microsoft 365 Copilot Studio the meter starts at $200 per month for a 25,000-message pack, billed per tenant rather than per user.
Picture a small law firm called Briggs & Lane. The firm buys 10 Copilot Business seats at $19 each, so the seat layer is $190 a month. The firm also builds one intake agent in Copilot Studio that processes 30,000 messages. The meter layer adds a $200 prepaid pack plus $50 of pay-as-you-go overage. The real monthly cost is $440, not $190.
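The two layers in the Briggs & Lane example can be sketched as a small calculator. This is a minimal illustration, not an official pricing tool: the function name `two_layer_cost` and its parameters are hypothetical, and it assumes each intake message consumes exactly one credit.

```python
def two_layer_cost(seats, seat_price_cents, credits_used, packs,
                   pack_price_cents=20_000, pack_size=25_000,
                   payg_cents_per_credit=1):
    """Split a monthly bill into seat, prepaid, and overage layers.

    All money is kept in integer cents to avoid floating-point rounding.
    Assumption: one message consumes one credit.
    """
    seat_layer = seats * seat_price_cents
    prepaid_layer = packs * pack_price_cents
    # Credits beyond the prepaid capacity fall through to pay-as-you-go.
    overage_credits = max(0, credits_used - packs * pack_size)
    overage_layer = overage_credits * payg_cents_per_credit
    return {
        "seat_cents": seat_layer,
        "prepaid_cents": prepaid_layer,
        "overage_cents": overage_layer,
        "total_cents": seat_layer + prepaid_layer + overage_layer,
    }


# 10 Copilot Business seats at $19, one $200 pack, 30,000 messages.
bill = two_layer_cost(seats=10, seat_price_cents=1_900,
                      credits_used=30_000, packs=1)
print(f"${bill['total_cents'] / 100:,.2f}")  # $440.00
```

Keeping the three layers as separate keys makes it easy to see which one drives the total when usage changes.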
Seat Licenses Versus Consumption Meters
A seat license is a per-user subscription. Copilot Free, Copilot Pro at $10, Copilot Pro+ at $39, Copilot Business at $19, and Copilot Enterprise at $39 are all seat licenses. They give the user access to the product and a base allowance of consumption units. The consequence of buying a seat without understanding the included allowance is that you hit the cap on day 15 and agents stop responding until you add more capacity.
A consumption meter is a pool of units that deplete with each agent action. In GitHub Copilot, the pool is currently counted in premium requests and moves to GitHub AI Credits on June 1, 2026. In Microsoft 365 Copilot Studio, the pool is counted in Copilot Credits, formerly called messages. In Azure AI Foundry Agent Service, the pool is counted in raw tokens, search queries, storage, and compute.
A common misconception is that included allowances roll over. They do not. Every Copilot plan’s allowance resets on the monthly billing date, and unused requests expire. The consequence is that buying a bigger pack “for peace of mind” wastes money if your team does not use it inside the month.
Why Agents Cost More Than Chat
Agents are more expensive than plain Copilot Chat because agents run autonomously. A chat session is one question and one answer. An agent session can loop through dozens of tool calls, file reads, searches, and model completions before it finishes. Each of those steps touches the meter.
On Copilot Studio, an agent action is billed at 5 Copilot Credits per action, while a simple classic answer is billed at 1 credit. Tenant Graph grounding, where the agent searches your SharePoint or OneDrive, runs at 10 credits. The consequence is that a single agent conversation can consume 20 to 50 credits, while a chat conversation might consume 1 or 2.
A common misconception is that autonomous use costs the same overall as interactive use. It does not. Autonomous agent actions and generative answers carry the same per-action weight as user-triggered ones, but they fire without a human in the loop, so they compound faster. The consequence is that a poorly scoped autonomous agent can drain a 25,000-credit pack in a single afternoon.
GitHub Copilot Coding Agent Billing
The GitHub Copilot coding agent — sometimes called the cloud agent or agent mode — runs in the cloud, opens pull requests, and completes multi-step coding tasks. Its billing has two sub-meters: premium requests and GitHub Actions minutes. Both are consumed on every session.
The GitHub Copilot coding agent has been in a dedicated SKU since November 1, 2025, which means its premium request spend now shows up on its own line on the invoice. Paid-plan billing for premium requests began on June 18, 2025 for GitHub.com and August 1, 2025 for GHE.com. The consequence of that change is that finance teams can finally attribute agent spend to specific cost centers.
A common misconception is that the coding agent’s free tier covers meaningful work. It does not. Copilot Free includes only 50 premium requests a month, and the coding agent burns one request per session multiplied by the model rate. A single Claude Opus 4.5 session at a 3× multiplier eats 3 of those 50 requests.
Premium Requests and Model Multipliers
Premium requests are the current unit of measurement for most paid GitHub Copilot usage. Each paid plan includes a monthly allowance: Copilot Pro gets 300, Pro+ gets 1,500, Business gets 300, and Enterprise gets 1,000. When you exceed the allowance, extra requests cost $0.04 each on paid plans.
Every model has a premium request multiplier based on its complexity. The consequence is that the same prompt can cost 1 request on Claude Sonnet 4.5 or 50 requests on GPT-4.5. A developer who does not track which model is selected can burn a full month’s allowance on a single complex refactor.
A common misconception is that all models consume premium requests. Several models are included and do not consume any: GPT-5 mini, GPT-4.1, and GPT-4o on paid plans. The consequence is that teams who stick to included models can run thousands of prompts a month without ever touching the meter.
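The interaction between allowance, multipliers, and included models can be sketched in a few lines. This is an illustrative calculation only: the function name is hypothetical, the usage figures are made up, and it uses whole-number multipliers with the $0.04 overage price quoted above.

```python
def premium_request_overage(usage, allowance, overage_price_cents=4):
    """Return (requests consumed, requests over allowance, overage in cents).

    usage: list of (prompt_count, model_multiplier) pairs.
    A multiplier of 0 represents an included model that never touches the meter.
    """
    consumed = sum(prompts * multiplier for prompts, multiplier in usage)
    over = max(0, consumed - allowance)
    return consumed, over, over * overage_price_cents


# Hypothetical month on a 300-request plan:
# 250 prompts on a 1x model, 60 on a 3x model, 400 on an included (0x) model.
consumed, over, cents = premium_request_overage(
    [(250, 1), (60, 3), (400, 0)], allowance=300
)
print(consumed, over, f"${cents / 100:.2f}")  # 430 130 $5.20
```

The 400 prompts on the included model add nothing to the total, which is why model selection matters more than raw prompt volume.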
GitHub Actions Minutes and the Second Meter
The GitHub Copilot coding agent runs on GitHub-hosted runners, which means every session also consumes GitHub Actions minutes. These minutes come out of your account’s monthly Actions allowance, shared with every workflow in your repositories. The consequence is that a busy CI pipeline plus heavy coding agent use can blow through your Actions budget even when your premium requests still have headroom.
GitHub Actions minutes are billed separately from premium requests. Free accounts get 2,000 minutes a month on Linux runners, and the rate doubles on Windows and quadruples on macOS. The consequence is that a coding agent session running on a macOS runner consumes minutes 4× faster, which can quietly push you into overage.
A common misconception is that the coding agent’s “session” is one minute. A typical session runs 3 to 10 minutes of runner time depending on the repository size, test suite, and tool calls. The consequence is that a 30-session day (30 sessions at 3 to 10 minutes each) can consume 90 to 300 Actions minutes on top of 30 to 150 premium requests, depending on the model multiplier.
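A rough way to size the second meter is to multiply sessions by the per-session runner time. The sketch below assumes the 3 to 10 minute range given above and treats the runner multiplier as a caller-supplied number; the function name and the 20-session example are hypothetical.

```python
def actions_minutes_range(sessions, min_minutes=3, max_minutes=10,
                          runner_multiplier=1):
    """Return (low, high) billed Actions minutes for a number of agent sessions.

    runner_multiplier scales the minutes for runner types that are billed
    at a higher rate than Linux (for example 2 for a runner billed at double).
    """
    low = sessions * min_minutes * runner_multiplier
    high = sessions * max_minutes * runner_multiplier
    return low, high


print(actions_minutes_range(20))                       # (60, 200)
print(actions_minutes_range(20, runner_multiplier=2))  # (120, 400)
```

Comparing the high end of the range against the remaining monthly Actions allowance shows whether CI workflows and agent sessions can coexist without overage.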
The June 1, 2026 Switch to GitHub AI Credits
Starting June 1, 2026, GitHub moves every Copilot plan from premium requests to GitHub AI Credits. One AI Credit equals $0.01. Usage is calculated based on token consumption — input, output, and cached tokens — at the listed API rate for each model.
The consequence of this switch is that billing becomes more transparent but also more variable. A session on a frontier model that reads a large repository can consume thousands of tokens in seconds, and the meter reflects that in real time. Teams that never tracked tokens before must now treat Copilot like a cloud compute service.
A common misconception is that usage-based billing is cheaper. For light users it may be. For heavy users of frontier models, the transition has already drawn criticism for making costs less predictable. The consequence is that enterprises should test their workloads against the new rates before June 1 to avoid billing shock.
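To test a workload against token-based billing, convert token counts into AI Credits. The sketch below uses the one-credit-equals-$0.01 value stated above; the rate card is entirely hypothetical and stands in for whatever per-model API rates apply, and the function name is illustrative.

```python
from decimal import Decimal

CREDIT_VALUE_USD = Decimal("0.01")  # one AI Credit, as stated in the text

# Hypothetical rate card in USD per million tokens. These are NOT real prices.
EXAMPLE_RATES = {
    "input": Decimal("3.00"),
    "cached": Decimal("0.30"),
    "output": Decimal("15.00"),
}


def ai_credits(tokens, rates=EXAMPLE_RATES):
    """Convert token counts per kind (input, cached, output) into AI Credits."""
    usd = sum(
        Decimal(tokens.get(kind, 0)) * rate / Decimal(1_000_000)
        for kind, rate in rates.items()
    )
    return usd / CREDIT_VALUE_USD


session = {"input": 200_000, "cached": 800_000, "output": 20_000}
print(ai_credits(session))  # 114 credits, i.e. $1.14 at the example rates
```

Running real session logs through a function like this, with the published rates substituted in, gives a per-session credit figure to compare against the current premium request cost.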
Microsoft 365 Copilot Studio Agent Billing
Copilot Studio agents are billed at the tenant level, not the seat level. A Copilot Studio tenant license includes Copilot Credit capacity packs of 25,000 Copilot Credits each, priced at $200 per pack per month. You can stack multiple packs and you can enable pay-as-you-go overage to an Azure subscription.
The billing unit is the message, now rebranded as a Copilot Credit. Different agent features consume different weights. Classic answers cost 1 credit, generative answers cost 2, agent actions cost 5, tenant Graph grounding costs 10, and agent flow actions cost 13 credits per 100 actions. The consequence is that a single conversation can consume anywhere from 1 to 50+ credits depending on what the agent does.
A common misconception is that users with Microsoft 365 Copilot licenses pay nothing extra for Copilot Studio agents. That is partially true — many features are zero-rated inside Microsoft 365 Copilot scenarios — but autonomous agent actions and agent flow actions still consume credits. The consequence is that an autonomous agent deployed to 500 Copilot-licensed users can still generate a five-figure Copilot Studio bill.
Prepaid Packs Versus Pay-As-You-Go
Microsoft offers two billing models for Copilot Studio: prepaid message packs and pay-as-you-go. Prepaid packs cost $200 for 25,000 credits and are billed against the Microsoft 365 tenant. Pay-as-you-go is metered at $0.01 per credit and is billed to a linked Azure subscription.
Most enterprises run both models at once. Prepaid packs handle the predictable base load and pay-as-you-go handles the spikes. The consequence of running only prepaid is that the agent stops responding when the pack is exhausted. The consequence of running only pay-as-you-go is that costs are harder to forecast because every message hits the Azure meter.
A common misconception is that prepaid packs auto-renew forever. They do renew monthly, but any unused credits do not carry over. The consequence is that buying 4 packs “for safety” and only using 1.5 wastes $500 every month.
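The pack-versus-overage trade-off reduces to a small search. At $200 per 25,000 credits a pack works out to $0.008 per credit against $0.01 on pay-as-you-go, so an extra pack only pays off once expected overage exceeds 20,000 credits. The helper below is a hypothetical sketch under those prices, not a Microsoft tool.

```python
def cheapest_pack_count(expected_credits, pack_size=25_000,
                        pack_price_cents=20_000, payg_cents_per_credit=1):
    """Return (packs, total_cents) minimizing prepaid packs plus overage.

    Tries every pack count from zero up to one more than strictly needed.
    On a tie, the smaller pack count wins.
    """
    best = None
    upper = expected_credits // pack_size + 1
    for packs in range(upper + 1):
        overage = max(0, expected_credits - packs * pack_size)
        cost = packs * pack_price_cents + overage * payg_cents_per_credit
        if best is None or cost < best[1]:
            best = (packs, cost)
    return best


print(cheapest_pack_count(37_500))  # (1, 32500): one pack plus $125 overage
print(cheapest_pack_count(48_000))  # (2, 40000): second pack beats $230 overage
```

For a team using 1.5 packs of credits, one pack plus overage costs $325, well below the $800 of four "safety" packs.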
Billing Policies and Azure Subscription Linking
To enable pay-as-you-go, an admin must create a billing policy in the Power Platform Admin Center and link it to an Azure subscription. The policy ties specific Power Platform environments to specific Azure billing accounts. The consequence of not linking a policy is that agents refuse to run once prepaid capacity is exhausted.
Once the policy is in place, overage automatically flows to the Azure subscription without service interruption. Charges appear on the next Azure invoice. The consequence of leaving the policy open without a spending limit is that a runaway agent can rack up thousands of dollars in a weekend.
A common misconception is that disabling Copilot Studio in an environment stops all charges. It does not. Any message already in flight is billed, and agents deployed to Teams, SharePoint, or Dynamics 365 keep consuming until the environment is fully unpublished. The consequence is that careless disabling does not protect the budget.
Message Weights in Plain English
The billing rate table is the single most important document for Copilot Studio buyers. A classic answer is a static, rule-based reply and is billed at 1 credit. A generative answer uses a large language model and is billed at 2 credits. An agent action is a connector call or tool invocation and is billed at 5 credits.
Tenant Graph grounding — where the agent searches your Microsoft 365 content — is billed at 10 credits per message. The consequence is that enterprise-search-style agents are the most expensive kind to run, because every question hits Graph. A common misconception is that Graph grounding is free inside Microsoft 365 Copilot. It is zero-rated for interactive Copilot scenarios but billed at 10 credits for autonomous agent use.
Agent flow actions are billed at 13 credits per 100 actions. The consequence is that a low-code Power Automate-style flow wrapped in an agent is relatively cheap, which is why many customers route heavy logic through flows to lower the bill.
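The weights above can be turned into a per-conversation estimate. In this sketch the event names are hypothetical labels, and flow actions are assumed to be pro-rated at 0.13 credits each rather than rounded up to blocks of 100, which may not match how the meter actually rounds.

```python
from decimal import Decimal

# Credit weights as listed in the text. Keys are illustrative labels.
CREDIT_WEIGHTS = {
    "classic_answer": Decimal(1),
    "generative_answer": Decimal(2),
    "agent_action": Decimal(5),
    "graph_grounding": Decimal(10),
    "flow_action": Decimal("0.13"),  # 13 credits per 100 actions (assumed pro-rated)
}


def conversation_credits(events):
    """Sum Copilot Credits for a dict of {event_name: count}."""
    return sum(CREDIT_WEIGHTS[name] * count for name, count in events.items())


conversation = {
    "generative_answer": 3,
    "agent_action": 2,
    "graph_grounding": 2,
    "flow_action": 100,
}
print(conversation_credits(conversation))  # 49.00
```

In this example the two Graph-grounded messages account for 20 of the 49 credits, while 100 flow actions account for only 13.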
Azure AI Foundry Agent Service Billing
Azure AI Foundry Agent Service — formerly Azure AI Agent Service — uses pay-as-you-go pricing with no per-agent fee. You are billed only for the underlying resources the agent consumes. That is the most consumption-pure model in the Copilot family.
The meters in Foundry Agent Service are token-based model inference, Azure AI Search or Bing grounding queries, storage for agent files and knowledge bases, and compute for hosted agent containers. Each meter has its own Azure SKU and its own rate card. The consequence is that forecasting is harder than with Copilot Studio because the bill depends on the model chosen, the size of the knowledge base, and the grounding source.
A common misconception is that Foundry agents are cheaper than Copilot Studio agents. For small, simple agents that is often true. For large enterprise agents with heavy grounding, Foundry costs can exceed Copilot Studio because every token is metered at the raw API rate. The consequence is that you can only tell which platform is cheaper after running a proof of concept on representative workloads.
Token-Based Model Inference
Foundry Agent Service bills model inference by input tokens, output tokens, and cached tokens. Rates vary by model. Microsoft’s MAI series, announced April 6, 2026, starts at $0.36 per hour for provisioned throughput. GPT-4o, Llama, and Mistral all carry their own per-million-token rates.
The consequence is that model choice is the single largest lever on the bill. Swapping GPT-4o for a smaller Mistral model can cut inference cost by 80% for the right workload. A common misconception is that frontier models are always better. For classification or extraction tasks, a small model plus good prompting beats a frontier model on both cost and latency.
Cached tokens are billed at a discount — often 10% of the input rate. The consequence is that agents designed to reuse context benefit from caching and can be 30–50% cheaper than agents that rebuild context on every call.
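The effect of caching on input cost can be estimated directly. The sketch below reads "10% of the input rate" as the price paid for a cached token, which is an interpretation of the text rather than a confirmed rate; the $2.50 per million rate and the function name are hypothetical.

```python
from decimal import Decimal


def input_cost(total_input_tokens, cached_fraction, rate_per_million,
               cached_price_ratio=Decimal("0.10")):
    """Return input-token cost in USD when a share of tokens is served from cache.

    cached_fraction and rate_per_million must be Decimal values.
    cached_price_ratio is the share of the normal input rate charged for
    cached tokens (assumed 10% here).
    """
    cached = Decimal(total_input_tokens) * cached_fraction
    fresh = Decimal(total_input_tokens) - cached
    total = fresh * rate_per_million + cached * rate_per_million * cached_price_ratio
    return total / 1_000_000


rate = Decimal("2.50")  # hypothetical USD per million input tokens
no_cache = input_cost(50_000_000, Decimal("0"), rate)
half_cached = input_cost(50_000_000, Decimal("0.5"), rate)
print(no_cache, half_cached)  # 125 vs 68.75 in USD
```

Under these assumptions, caching half of the input tokens cuts the input portion of the bill by 45%. Output tokens are unaffected, so the saving on the whole invoice is smaller.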
Agent Commit Units and Provisioned Throughput
For high-volume workloads, Microsoft offers Agent Commit Units, or ACUs, as pre-purchased commitments. ACUs provide 5% to 15% discounts depending on the commitment level and apply across all resource types in Foundry.
Provisioned throughput reserves dedicated compute capacity. The MAI models support “fungible throughput” that lets enterprises allocate reserved compute across different models based on shifting workloads. The consequence is that teams with predictable, steady traffic pay less with reserved capacity, while teams with spiky traffic pay less with pay-as-you-go.
A common misconception is that ACUs lock you in for years. Current terms offer monthly and annual options. The consequence of choosing the wrong term is overpaying for unused capacity — the classic cloud commitment trap.
Pricing Reference: All Copilot Agent Tiers
| Plan | Seat or Tenant Price | Included Agent Allowance |
|---|---|---|
| Copilot Free | $0 per user | 50 premium requests per month |
| Copilot Pro | $10 per user per month | 300 premium requests per month |
| Copilot Pro+ | $39 per user per month | 1,500 premium requests per month |
| Copilot Business | $19 per user per month | 300 premium requests per month |
| Copilot Enterprise | $39 per user per month | 1,000 premium requests per month |
| Copilot Studio | $200 per tenant per pack | 25,000 Copilot Credits per pack |
| Foundry Agent Service | No per-agent fee | Pay-as-you-go only |
Overage on GitHub Copilot paid plans is billed at $0.04 per premium request today and $0.01 per AI Credit starting June 1, 2026. Overage on Copilot Studio is billed at $0.01 per Copilot Credit through an Azure subscription.
Three Real Billing Scenarios
Every team hits one of three patterns: the light developer, the enterprise Studio deployment, or the Azure-native Foundry build. Each has a different invoice shape.
Scenario 1 — Solo Developer on Copilot Pro
| Usage Pattern | Invoice Line |
|---|---|
| 1 Copilot Pro seat for a freelance developer | $10 for the seat |
| 250 premium requests on Claude Sonnet 4.5 at 1× | 0 overage, inside the 300 allowance |
| 60 requests on Claude Opus 4.5 at 3× = 180 requests | 130 requests over the allowance, $5.20 overage |
| Total monthly cost | $15.20 |
The consequence is that a solo developer who sticks to 1× models stays inside the sticker price. A developer who drifts to 3× models on complex code adds a small but real overage.
Scenario 2 — 200-Seat Enterprise with a Copilot Studio Intake Agent
| Usage Pattern | Invoice Line |
|---|---|
| 200 Copilot Business seats at $19 | $3,800 per month |
| 1 Copilot Studio pack for the intake agent | $200 per month |
| 40,000 messages (15,000 over the pack) at $0.01 each | $150 pay-as-you-go on Azure |
| Total monthly cost | $4,150 |
The consequence is that an enterprise must model three separate invoices: GitHub for the Copilot Business seats, Microsoft 365 for the Copilot Studio pack, and Azure for overage. A common misconception is that all three appear on one bill. They do not.
Scenario 3 — Startup Building a Foundry Agent
| Usage Pattern | Invoice Line |
|---|---|
| No per-agent fee | $0 base |
| 50 million GPT-4o input tokens | Metered at Azure OpenAI rates |
| 12 million output tokens | Metered at Azure OpenAI rates |
| 80,000 Azure AI Search queries for grounding | Metered at Azure AI Search rates |
| Storage for 20 GB of knowledge base files | Metered at Azure Storage rates |
| Total monthly cost | Roughly $900–$1,400 depending on model and region |
The consequence is that Foundry startups must read the Azure invoice line by line to understand what drove cost. A common misconception is that the agent service itself has a flat fee — it does not.
Named Examples You Can Learn From
Maria leads a 12-person engineering team at a fintech in Austin. She bought 12 Copilot Business seats for $228 a month. Maria’s team started using Claude Opus 4.5 at the 3× multiplier for complex refactors. Within 10 days, the team blew through their 3,600 shared premium requests and hit $480 in overage before Maria set a spending limit.
David runs IT at a mid-sized health system. He deployed a Copilot Studio intake agent across five clinics. Because the agent used tenant Graph grounding at 10 credits per message, a single busy Monday consumed 18,000 credits. David learned the hard way that Graph-heavy agents need at least two prepaid packs plus an Azure billing policy for overage.
Priya is a solo developer in Brooklyn. She pays $10 a month for Copilot Pro. Priya only uses GPT-5 mini and GPT-4.1, both of which are included models at 0× multiplier. Her invoice has been exactly $10 for six months straight, which shows that disciplined model selection keeps costs at the sticker price.
Mistakes to Avoid
- Skipping the model multiplier. Ignoring that GPT-4.5 is a 50× model turns a 10-request task into 500 requests and drains the allowance in one afternoon.
- Leaving pay-as-you-go uncapped. Not setting an Azure spending limit lets a runaway autonomous agent rack up thousands of dollars overnight.
- Buying extra prepaid packs “just in case.” Unused Copilot Credits do not roll over, so an extra pack is $200 thrown away each month.
- Confusing interactive and autonomous billing. Tenant Graph grounding is zero-rated for interactive Copilot use, but the same grounding and any agent actions are billed when the agent runs autonomously.
- Assuming GitHub Actions minutes are free. The coding agent burns Actions minutes on top of premium requests, and macOS runners burn them 4× faster.
- Forgetting the June 1, 2026 switch to AI Credits. Budgets built on premium requests will not match the token-based AI Credits model.
- Not selecting a billing entity. Users with multiple enterprise licenses must pick one in the Usage billed to drop-down, or premium requests are blocked entirely.
- Ignoring cached tokens. Agents that rebuild context on every call pay full rate when a caching strategy could cut the bill by a third.
- Deploying autonomous agents without a test plan. A misconfigured loop can consume 25,000 Copilot Credits in under an hour.
- Treating seat price as the full cost. Seats unlock the product. The meter does the real damage.
Do’s and Don’ts
- Do enable budgets and spending limits before you enable any agent, because uncapped usage is the single biggest cause of billing shock.
- Do map every agent to a named cost center, because attribution is the only way to defend the line item during budget review.
- Do prefer 1× and 0× models for routine work, because model discipline saves more money than any other lever.
- Do link an Azure subscription to your Power Platform environment, because pay-as-you-go prevents service interruption when packs run out.
- Do run a 30-day pilot before scaling, because real usage patterns always differ from vendor estimates.
- Don’t enable autonomous agents without per-day throttles, because loops multiply fast.
- Don’t assume Microsoft 365 Copilot licenses cover all Copilot Studio usage, because autonomous and flow-based actions still consume credits.
- Don’t buy Copilot Pro+ for every developer, because many developers never exceed the 300-request Pro allowance.
- Don’t ignore Actions minutes on the coding agent, because that second meter is invisible until it hits the invoice.
- Don’t delay planning for the June 1, 2026 AI Credits switch, because token-based rates favor light users and punish heavy ones.
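A per-day throttle of the kind recommended in the list above can be approximated at the application level. The class below is a hypothetical client-side guard, not a Copilot Studio or Power Platform feature: it assumes the code that triggers the agent can estimate the credits an action will consume and can skip the action when the daily limit is reached.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DailyCreditThrottle:
    """Refuse agent actions once a daily credit limit would be exceeded."""

    daily_limit: int
    _day: Optional[date] = None
    _spent: int = 0

    def try_spend(self, credits: int, today: date) -> bool:
        # Reset the counter when the calendar day changes.
        if today != self._day:
            self._day = today
            self._spent = 0
        if self._spent + credits > self.daily_limit:
            return False  # caller should skip or queue the action
        self._spent += credits
        return True


throttle = DailyCreditThrottle(daily_limit=100)
monday = date(2026, 1, 5)
print(throttle.try_spend(60, monday))  # True
print(throttle.try_spend(50, monday))  # False, would reach 110
print(throttle.try_spend(40, monday))  # True, exactly at the limit
```

A guard like this complements a spending limit on the billing side: it stops a loop at the source instead of after the charges have accrued.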
Pros and Cons of Consumption Billing
- Pro: You pay only for what the agent uses, which matches cost to value for light workloads.
- Pro: Dedicated SKUs for the coding agent and Spark, live since November 1, 2025, give finance teams real attribution.
- Pro: Pay-as-you-go overage prevents hard service cutoffs during critical work.
- Pro: Cached tokens and included models let disciplined teams keep costs near zero.
- Pro: Agent Commit Units on Foundry deliver 5–15% discounts for predictable workloads.
- Con: Forecasting is hard because agent sessions vary in length and tool calls.
- Con: Model multipliers can multiply costs by 50× with a single careless model choice.
- Con: The June 1, 2026 switch to tokens makes budgets built on requests obsolete.
- Con: Three-way billing — seats, tenant packs, Azure overage — splits spend across invoices.
- Con: Autonomous agents can loop and drain capacity before humans notice.
Step-By-Step: Set Up Billing for a Copilot Studio Agent
- In the Microsoft 365 admin center, open Copilot > Billing & usage. This is where every Copilot Studio billing decision starts, and skipping it means the agent runs on default tenant capacity that may not exist.
- Select the Pay-as-you-go services tab and pick Microsoft 365 Copilot Chat or SharePoint agent. The choice tells Microsoft which service to meter through the billing policy.
- Open the Power Platform Admin Center and create a billing policy tied to an Azure subscription. Without this policy, pay-as-you-go cannot bill.
- Attach the billing policy to the Power Platform environment hosting the agent. The environment is the scope of billing, so every agent in that environment flows to the policy.
- Back in the admin center, switch Connection status to Connected. That single toggle is what turns on pay-as-you-go overage on top of prepaid packs.
- Buy at least one Copilot Studio capacity pack for the base load. One pack is $200 a month for 25,000 Copilot Credits.
- Set an Azure spending limit on the subscription that receives overage. The consequence of skipping this step is that the Azure bill has no ceiling.
- Publish the agent and run a 48-hour observation window. Watch the Billing & usage page and compare actual consumption to your forecast before scaling.
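The comparison in the final step can be done with a simple linear projection from the observation window. This is a rough sketch: the function names are hypothetical, and it assumes the 48 hours observed are representative of the whole month, which ignores weekday and weekend differences.

```python
def project_monthly_credits(observed_credits, observed_hours=48,
                            days_in_month=30):
    """Linearly extrapolate credits seen in an observation window to a month."""
    return observed_credits * (days_in_month * 24) / observed_hours


def forecast_ratio(observed_credits, forecast_monthly_credits,
                   observed_hours=48, days_in_month=30):
    """Return projected monthly credits divided by the forecast (1.0 = on plan)."""
    projected = project_monthly_credits(observed_credits, observed_hours,
                                        days_in_month)
    return projected / forecast_monthly_credits


# Hypothetical pilot: 2,400 credits consumed in the first 48 hours,
# measured against a forecast of one 25,000-credit pack.
print(project_monthly_credits(2_400))  # 36000.0
print(forecast_ratio(2_400, 25_000))   # about 1.44, i.e. 44% over forecast
```

A ratio well above 1.0 after the observation window is a signal to add capacity or tighten the agent's scope before scaling.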
Key Entities in Copilot Agent Billing
GitHub is the publisher of the Copilot coding agent and owns the premium request and AI Credits meters. Microsoft is the publisher of Copilot Studio, Microsoft 365 Copilot Chat, and Azure AI Foundry Agent Service, and owns the Copilot Credit and Azure consumption meters. Azure is the billing backbone that catches pay-as-you-go overage from every Microsoft Copilot product.
The Power Platform Admin Center is the control plane for Copilot Studio billing policies. The GitHub Copilot dashboard is the control plane for premium requests and AI Credits. The Azure portal is the control plane for Foundry consumption, spending limits, and Agent Commit Units.
The consequence of not knowing which control plane owns which meter is that admins chase billing alerts in the wrong place. A common misconception is that one admin role covers every control plane. In practice, you need a GitHub organization or enterprise owner, a Microsoft 365 global admin, a Power Platform admin, and an Azure subscription owner working together.
FAQs
Is the Copilot seat price the full monthly cost?
No. The seat unlocks the product but agent work consumes a second meter — premium requests, Copilot Credits, or tokens — that is billed on top of the seat, often adding more than the seat itself.
Does the GitHub coding agent cost extra on Copilot Business?
Yes, once the allowance runs out. The coding agent consumes premium requests from the Business plan’s 300-request monthly allowance, multiplied by the model rate, and requests beyond the allowance cost $0.04 each. It also burns GitHub Actions minutes on every session.
Do unused Copilot Credits or premium requests roll over?
No. Every Copilot allowance resets on the monthly billing date, so unused capacity expires and any extra packs you bought but did not use are money lost.
Will my billing change on June 1, 2026?
Yes. GitHub moves every Copilot plan from premium requests to GitHub AI Credits, where one credit equals $0.01 and usage is measured in input, output, and cached tokens per model.
Can I cap Copilot Studio pay-as-you-go spend?
Yes. Set an Azure spending limit on the subscription linked to the billing policy, and combine that with per-environment caps in the Power Platform Admin Center for the tightest control.
Are autonomous agent actions billed the same as interactive ones?
Yes. Autonomous and interactive agent actions carry the same 5-credit weight, but autonomous agents run without a human, so they tend to consume more total credits.
Does Microsoft 365 Copilot cover all Copilot Studio usage?
No. Many interactive features are zero-rated inside Microsoft 365 Copilot scenarios, but autonomous agent actions, agent flow actions, and tenant Graph grounding in autonomous mode still consume Copilot Credits.
Is Azure AI Foundry Agent Service cheaper than Copilot Studio?
Not always. For small agents Foundry often costs less, but heavy grounding and frontier model use on Foundry frequently exceed Copilot Studio because every token and every search query is metered at the raw Azure rate.
Can I mix prepaid packs with pay-as-you-go?
Yes. Microsoft recommends running both together, where prepaid packs cover the base load and pay-as-you-go absorbs spikes so the agent never stops responding mid-conversation.
Do I need a separate Azure subscription for Copilot Studio overage?
Yes. Overage is billed through a billing policy that must point to an Azure subscription, and most enterprises use a dedicated subscription so finance can attribute the cost to the right cost center.
Are GitHub Copilot Free accounts useful for real agent work?
No. The 50 premium requests in the Free tier are consumed quickly by the coding agent, especially with any model above a 1× multiplier, so serious work requires a paid plan.
Will choosing a smaller model really cut my bill?
Yes. Model choice is the single largest lever on Copilot agent cost, and swapping a frontier model for a smaller included model can drop inference spend by 80% on routine tasks.