This was a big week for the tools that write code. One company shipped a stronger model that can run hundreds of helper agents at once. Another flipped the switch on how it charges for AI coding. A third opened a fast, cheap coding model to everyone. The pattern is clear: AI coding got more powerful and more hands-off, and the cost and control questions moved front and center.
Here are the five changes from the past seven days that matter most for software teams, with a source for each one. No guesses, no hype. After each story you will find a Scrum Team Signal with a plain next step.
Story One
Claude Opus 4.8 ships, and it can run hundreds of helper agents at once
On May 28, Anthropic released Claude Opus 4.8. It is the company's strongest public model, and it costs the same as the last version. The coding scores went up. On a hard coding test called SWE-bench Pro, it scored 69.2%, up from 64.3% on the prior model.
The bigger news for teams is a new Claude Code feature called dynamic workflows. It lets the model plan a large job and then run hundreds of smaller helper agents, called subagents, at the same time. Anthropic says this can carry a whole-codebase change, such as moving hundreds of thousands of lines of code to a new pattern, from start to a finished, tested merge. The feature is in research preview and is offered on the Enterprise, Team, and Max plans.
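The plan-then-fan-out shape described above can be sketched in a few lines. This is only an illustration of the general pattern, not Anthropic's implementation or API: `run_subagent`, `plan_chunks`, `run_workflow`, the chunk size, and the parallelism cap are all hypothetical names and values chosen for the example.

```python
import asyncio


async def run_subagent(chunk_id: int, files: list[str]) -> dict:
    """Hypothetical stand-in for one helper agent.

    A real subagent would edit the files and run tests; this one only
    reports back so the control flow is visible.
    """
    await asyncio.sleep(0)  # placeholder for real work
    return {"chunk": chunk_id, "files": len(files), "tests_passed": True}


def plan_chunks(files: list[str], chunk_size: int) -> list[list[str]]:
    """Planner step: split the big job into independent chunks."""
    return [files[i:i + chunk_size] for i in range(0, len(files), chunk_size)]


async def run_workflow(files: list[str], chunk_size: int = 10,
                       max_parallel: int = 50) -> dict:
    chunks = plan_chunks(files, chunk_size)
    limit = asyncio.Semaphore(max_parallel)  # cap how many run at once

    async def guarded(i: int, chunk: list[str]) -> dict:
        async with limit:
            return await run_subagent(i, chunk)

    # Fan out to all subagents, then gather every result before deciding.
    results = await asyncio.gather(
        *(guarded(i, c) for i, c in enumerate(chunks))
    )
    return {
        "subagents": len(results),
        "files": sum(r["files"] for r in results),
        "ready_to_merge": all(r["tests_passed"] for r in results),
    }


# Assumed workload: 1,000 files split into chunks of 10.
summary = asyncio.run(
    run_workflow([f"src/module_{n}.py" for n in range(1000)])
)
```

The point for a team is the last field: the merge decision depends on every chunk passing its tests, which is why a working test suite has to exist before a job like this starts.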
Anthropic also reports the model is better at being honest about its own work. It is more likely to flag what it is unsure about, and Anthropic's own tests show it is about four times less likely than the last version to let a flaw in code it wrote slip by without comment.
4×
Less likely than the prior model to let a flaw in its own code pass without flagging it, by Anthropic's own tests.
Source: Anthropic
Scrum Team Signal
Treat the model like a strong but fallible teammate. Its better self-checking helps, but it does not replace your Definition of Done. Keep human review and passing tests as the bar before any AI change is "done."
Dynamic workflows can take on epic-sized jobs like big refactors and migrations. Plan those as their own backlog items, with clear acceptance criteria and a working test suite that defines success.
Read Anthropic: Introducing Claude Opus 4.8 · Anthropic: Dynamic workflows in Claude Code
Story Two
GitHub Copilot switches to pay-as-you-go billing today
Starting June 1, every GitHub Copilot plan moves to usage-based billing. The old system counted "premium requests." The new system uses GitHub AI Credits, where one credit equals one cent. Credits are used up based on how many tokens your work consumes, including input, output, and cached text.
Seat prices did not change. Pro is still $10 a month and now includes $10 in credits; Business is still $19 per seat with $19 in credits. Inline code completion, the autocomplete most people use, stays free and uses no credits.
Two things changed for heavy users. First, the old fallback to a slower free model after you ran out is gone: when your credits run out, Copilot stops unless you have turned on extra spending. Second, GitHub is adding a temporary "flex" credit bonus from June through September to ease the change, so watch what happens when that bonus shrinks in the fall.
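To see how token use turns into credits, here is a rough budgeting sketch. The one-cent credit value and the $10 of included Pro credits come from the story above. The per-token rates and the weekly token counts are hypothetical placeholders, since actual rates depend on the model and are not given here.

```python
from dataclasses import dataclass

CREDITS_PER_USD = 100  # from the story: one GitHub AI Credit equals one cent


@dataclass
class TokenRates:
    """HYPOTHETICAL dollars per million tokens; real rates vary by model."""
    input_per_m: float
    output_per_m: float
    cached_per_m: float


def credits_used(input_tokens: int, output_tokens: int, cached_tokens: int,
                 rates: TokenRates) -> float:
    """Convert token counts to credits using the assumed rates."""
    dollars = (
        input_tokens * rates.input_per_m
        + output_tokens * rates.output_per_m
        + cached_tokens * rates.cached_per_m
    ) / 1_000_000
    return dollars * CREDITS_PER_USD


def included_credits(plan_credit_usd: float) -> float:
    """Credits bundled with a plan, e.g. $10 for Pro."""
    return plan_credit_usd * CREDITS_PER_USD


# Assumed example: placeholder rates and one heavy week of agent work.
example_rates = TokenRates(input_per_m=3.0, output_per_m=15.0, cached_per_m=0.5)
week = credits_used(2_000_000, 300_000, 5_000_000, example_rates)
overage = max(0.0, week - included_credits(10))
print(f"used {week:.0f} credits, {overage:.0f} over the Pro allowance")
```

With these made-up numbers, one week uses 1,300 credits against 1,000 included, so the remaining work either stops or bills as extra spending. Swap in your own rates and token counts from the billing preview.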
$0.01
The value of one GitHub AI Credit. Credits now drain by token use, so a heavy week of agent work can exceed what your plan includes.
Source: The GitHub Blog
Scrum Team Signal
AI coding is now a real, moving cost, not a flat fee. Bring it into sprint planning and team budgets. Set spend caps so a surprise bill is not possible, and use the billing preview to watch credit burn per developer.
Decide as a team which work is worth the credits. Letting an agent run unattended on a large task is no longer "free" once your included credits are gone.
Read The GitHub Blog: GitHub Copilot is moving to usage-based billing
Story Three
xAI opens a fast, low-cost coding model to all developers
On May 29, xAI made its coding model, grok-build-0.1, available to any developer through its API in public beta. Before this, you needed a paid Grok subscription to use it. It is the same model that powers the Grok Build command-line tool.
The model is built for agentic coding, meaning it can plan and carry out multi-step work such as building web pages, fixing bugs, and calling outside tools through MCP. xAI says it runs at more than 100 tokens per second and is priced at $1 per million input tokens and $2 per million output tokens. That makes it a cheap, speedy choice for routine agent and tool-calling jobs.
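The quoted prices make cost estimates simple arithmetic. The sketch below uses the $1 and $2 rates and the 100 tokens per second figure from xAI. The token counts are an assumed example job, and treating 100 tokens per second as a floor on output speed is an assumption for the estimate.

```python
INPUT_USD_PER_M = 1.0    # from the story: $1 per million input tokens
OUTPUT_USD_PER_M = 2.0   # from the story: $2 per million output tokens
MIN_TOKENS_PER_SECOND = 100  # xAI says "more than 100"; used here as a floor


def job_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one job at the quoted grok-build-0.1 rates."""
    return (
        input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    ) / 1_000_000


def max_generation_seconds(output_tokens: int) -> float:
    """Upper bound on generation time if the speed floor holds."""
    return output_tokens / MIN_TOKENS_PER_SECOND


# Assumed example job: 400k tokens read, 50k tokens written.
cost = job_cost_usd(400_000, 50_000)
seconds = max_generation_seconds(50_000)
print(f"${cost:.2f}, at most {seconds:.0f} s of generation")
```

For this assumed job the cost is $0.50 and generation takes at most 500 seconds, which is the kind of number a team can bring to sprint planning.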
$1 / $2
Price per million input and output tokens for grok-build-0.1, served at 100+ tokens per second.
Source: xAI
Scrum Team Signal
More choices mean "right tool for the job." Use a cheap, fast model for routine agent tasks, and save a pricier, stronger model for the hard problems. Make model choice a team decision, not a silent default in someone's editor.
Read xAI: Grok Build 0.1 on API
Story Four
Cursor 3.6 lets its agent act with fewer "are you sure?" prompts
Also on May 29, the Cursor editor shipped version 3.6 with a new setting called Auto-review. It lets the AI agent work longer with fewer stop-and-ask prompts. It covers three kinds of risky actions: shell commands, MCP tool calls, and web fetches.
Auto-review follows a three-step path. Actions you have approved in advance run right away. Actions that can be boxed off run in a safe sandbox. Everything else goes to a separate "classifier" agent that decides whether to allow it, try another way, or stop and ask you. Cursor is plain about the limit: it says this classifier is a best-effort convenience, not a security boundary.
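The three-step path reads naturally as a routing function. The sketch below mirrors the order described above and nothing more: it is not Cursor's code, and `route_action`, the `Decision` names, and the toy allowlist, sandbox check, and classifier are all hypothetical.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    RUN = "run"
    RUN_SANDBOXED = "run_sandboxed"
    TRY_ANOTHER_WAY = "try_another_way"
    ASK_USER = "ask_user"


def route_action(
    action: str,
    allowlist: set[str],
    can_sandbox: Callable[[str], bool],
    classify: Callable[[str], Decision],
) -> Decision:
    """Apply the three steps in order: allowlist, sandbox, classifier."""
    if action in allowlist:
        return Decision.RUN            # step 1: approved in advance
    if can_sandbox(action):
        return Decision.RUN_SANDBOXED  # step 2: boxed off
    return classify(action)            # step 3: best-effort judgment call


# Toy stand-ins, purely for illustration.
def toy_can_sandbox(action: str) -> bool:
    return action.startswith("pytest")


def toy_classifier(action: str) -> Decision:
    return Decision.ASK_USER if "rm -rf" in action else Decision.RUN


allow = {"git status"}
for cmd in ["git status", "pytest tests/", "rm -rf build", "curl example.com"]:
    print(cmd, "->", route_action(cmd, allow, toy_can_sandbox, toy_classifier).value)
```

Note where the risk sits: anything that reaches the last step depends on a judgment call, which matches Cursor's own warning that the classifier is a convenience and not a security boundary.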
3
Action types Auto-review now governs (shell commands, MCP tool calls, and web fetches), using an allowlist, then a sandbox, then a classifier agent.
Source: Cursor
Scrum Team Signal
More agent freedom means fewer human checkpoints. Agree as a team on where the agent may act on its own and where a person must approve. Because the maker itself says the auto-check is not a security wall, keep security review inside your Definition of Done.
Read Cursor changelog: Auto-review Run Mode
Story Five
DeepSeek's deep price cut became permanent this week
DeepSeek had been running a 75% discount on its V4-Pro model, a strong open-weights model used for coding and reasoning. That discount was set to expire on May 31. Instead, the company kept it. When the deadline passed this week, the discounted rate did not roll back. It is now the standing price.
The standing rates are about $0.435 per million input tokens and $0.87 per million output tokens. That is many times cheaper than the top models from U.S. labs, while the model still scores in the same range on coding tests. A discount that ends is a sale. A price that stays is a new floor, and rivals now have to answer it.
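A quick calculation shows what the 75% cut means in dollars. The discounted rates come from the story. The pre-discount prices are only implied by the arithmetic, assuming the 75% applies equally to input and output, and the evaluation workload is a made-up example.

```python
DISCOUNT = 0.75                 # from the story: 75% off
DISCOUNTED_INPUT_PER_M = 0.435  # approximate, per the story
DISCOUNTED_OUTPUT_PER_M = 0.87  # approximate, per the story


def implied_list_price(discounted: float, discount: float = DISCOUNT) -> float:
    """Back out the pre-discount price, assuming a uniform discount."""
    return discounted / (1 - discount)


def run_cost_usd(input_m: float, output_m: float,
                 input_rate: float, output_rate: float) -> float:
    """Cost of a run measured in millions of tokens."""
    return input_m * input_rate + output_m * output_rate


# Assumed evaluation run: 50M input tokens, 10M output tokens.
now = run_cost_usd(50, 10, DISCOUNTED_INPUT_PER_M, DISCOUNTED_OUTPUT_PER_M)
before = run_cost_usd(
    50, 10,
    implied_list_price(DISCOUNTED_INPUT_PER_M),
    implied_list_price(DISCOUNTED_OUTPUT_PER_M),
)
print(f"now ${now:.2f}, implied pre-discount ${before:.2f}")
```

Under these assumptions the same run costs about $30.45 now versus about $121.80 at the implied old price, which is the gap that makes large test and evaluation runs affordable.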
75%
The V4-Pro price cut that is now permanent, putting frontier-level coding at roughly $0.435 in / $0.87 out per million tokens.
Source: DeepSeek API pricing; reported by Reuters
Scrum Team Signal
Cheaper frontier models make experiments and large test or evaluation runs affordable. That is a real win for teams that want to try ideas before committing.
For regulated work, weigh more than price. Check where the data is processed and whether it meets your compliance rules. Revisit your model choices each sprint as prices keep moving.
Read Report: DeepSeek makes its V4-Pro price cut permanent
What we are watching next week
The cost and competition story is not slowing down. A few things to track, which we will report only once the source confirms them:
- Reports say Microsoft may release its own coding model. We will cover it when Microsoft says so, not before.
- Google committed to shipping Gemini 3.5 Pro in June. We will check whether it lands and how it scores on coding.
- Early bills from GitHub Copilot's new credit system will start to show up. We will look for what real teams are paying.
Rod Claar
Rod is a Scrum trainer, AI educator, and software development consultant with more than two decades teaching Scrum, Agile, Test-Driven Development, and software design. He writes the weekly newsletter at AgileAIDev.com on how AI is changing the way software teams work.