On May 31, 2026 at 15:59 UTC, the 75% discount promotion on DeepSeek V4 Pro expired. Prices were supposed to return to original levels. But they did not. DeepSeek decided the discount was not a temporary offer — it was the new price, period. Since then, the Chinese company’s flagship model costs a quarter of what it did before, and competitors have a serious problem.
The move is extraordinary in its audacity. DeepSeek did not announce a renewable discount or an extended launch offer. It simply adjusted its official pricing page to show the reduced rates as standard prices, with the originals crossed out. A footnote makes it unambiguous: “The API price of the deepseek-v4-pro model will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends.” The promotion ended, and the adjustment became permanent.
How much are we talking about? V4 Pro now costs $0.003625 per million tokens for cached input, $0.435 for uncached input, and $0.87 for output. Output used to cost $3.48. That figure — $0.87 per million output tokens — places DeepSeek V4 Pro in a pricing territory that until months ago was unthinkable for a frontier-class model. For context, it is roughly ten times cheaper than Anthropic’s Opus models on output, and it competes directly with much smaller, less capable models.
DeepSeek attributes this efficiency to its hybrid attention architecture, a design that reduces computational costs per token without sacrificing model quality. It is not a marginal optimization: it is a fundamental redesign of how the model processes information, and the pricing results speak for themselves. While other labs compete to build larger models, DeepSeek found a path to make them cheaper without making them smaller.
The V4 Pro is not alone in this strategy. Its younger sibling, the V4 Flash, offers even more aggressive pricing — $0.28 per million output tokens — ideal for lighter workloads. Both models share impressive technical specifications: a 1-million-token context and output capability of up to 384k tokens. These are numbers no other frontier model matches at these prices. And they come with support for thinking and non-thinking modes, tool calling, JSON output, and FIM completion in beta.
There is an additional detail few are noticing. DeepSeek had already reduced cached input pricing to 1/10 of launch price for all its models since April 26, 2026. In other words, the company has been consistently lowering prices, not as a one-time promotion but as a structural strategy. The message is clear: DeepSeek wants to be the cheapest AI infrastructure provider on the market, and it is willing to follow through on that promise.
Why it matters
This is not just a price war. It is a rethinking of what it means to be a frontier model. Until now, the frontier was synonymous with expensive. The most capable models cost more because they were hard to train and run. DeepSeek V4 Pro breaks that equation: it offers top-tier capabilities at commodity prices. For developers, this means building native AI applications is no longer a luxury. For competitors — Anthropic, OpenAI, Google — it means the pressure to cut prices or differentiate on capability just intensified overnight.
The V4 Pro is not just cheap: it is permanently cheap. And that changes the game.
Main source: DeepSeek API Documentation — Models & Pricing