The West Is Pulling Back on Tokenmaxxing. China Is Making It An Economic Signal
Western companies are learning that token consumption does not prove productivity. China is turning token volume into a measure of AI economic activity.

China is in the process of turning token consumption into an economic signal. In the West, the companies that followed “tokenmaxxing”—the practice of maximizing token consumption—are now pulling back. Leaderboards are coming down, licenses are being cancelled, and the metric is being publicly abandoned. For operators in Asia, that juxtaposition is worth understanding before it turns into a structural risk.
In April, Uber disclosed it had burned through its entire 2026 AI budget in just four months. Its COO later admitted there was no clear link between rising token consumption and better products for users. That same month, Meta took down its internal token usage leaderboard after news of it leaked externally. The leaderboard, nicknamed “Claudeonomics”, was an employee-built dashboard that ranked employees by their AI token consumption, and it set off an internal race within the firm. In May, Microsoft cancelled Claude Code licenses, reportedly in part to cut its operating costs. The companies that were encouraging employees to burn through tokens are now reconsidering their approach.
In China, tokenmaxxing is moving in the opposite direction: the token is expanding from a vendor billing unit into a policy-adjacent economic signal. At the 2026 China Development Forum in March, China’s National Data Administration described tokens as the “settlement unit” linking technological supply with commercial demand, paving the way for tokens to be recognized as a new unit of value in monetizing the AI industry. Alibaba, Tencent, and ByteDance are increasingly placing token volume at the center of how they package, price, and measure enterprise AI usage.
The risk isn’t that tokens are becoming a pricing unit—that’s unavoidable. The risk is that token volume starts being treated as evidence of AI progress before organizations can show whether AI is actually improving throughput, reducing errors, or lowering cost-to-serve. China’s policy language is pushing tokens toward the center of AI measurement; operators elsewhere in Asia should be careful not to import that logic into contracts or performance reviews without first defining what business outcome the tokens are supposed to improve.
How Tokenmaxxing Became A Problem
Tokenmaxxing emerged from a combination of pressures that made token consumption something to optimize for.
The first was cultural pressure from the top. The most prominent voices in the industry framed token consumption as a signal of AI seriousness. Nvidia CEO Jensen Huang told engineers they should be spending tokens equivalent to at least half their annual salary. Meta’s CTO Andrew Bosworth pointed to his best engineers spending the equivalent of his salary in tokens while boosting productivity five- to tenfold, and said, as quoted by Fortune: “Keep doing it. No limit.”
The second pressure was competitive and defensive. Leaders who did not push aggressive AI adoption were being told that they were falling behind. Token consumption leaderboards like Meta’s Claudeonomics gave organizations a visible, shareable number that showed AI transformation was happening inside the company. As The Information reported, some Meta employees with low token usage expressed concern about not being seen as sufficiently “AI native”, and that concern pushed them toward hacks such as running transcription bots during meetings to inflate their scores.
The third pressure was structural: the metric was designed by the people who profit from its maximization. Token consumption is not a neutral measure of AI activity. It is the unit AI vendors bill in. OpenAI CEO Sam Altman articulated the industry’s direction plainly: “We see a future where intelligence is a utility, like electricity or water, and people buy it from us on a meter.”
There is another reason why tokenmaxxing grew into a trend before people started pointing it out as a problem: the price signal was pointing in the wrong direction. Token costs have been falling consistently, and tech leaders had reason to believe costs were under control. That wasn’t the case when the bills came in.
The Pricing Trap
The cost of AI inference has dropped dramatically. To enterprise leaders, that made the spending look sustainable. But the falling price obscured a compounding problem on the consumption side.
The shift from chatbot to agentic AI changed the consumption equation entirely. A chatbot answers a question. An agent pursues a goal autonomously, across multiple steps, calling a language model repeatedly until it reaches a result. Gartner found that agentic models require between 5 and 30 times more tokens per task than standard chatbot queries.
The result was a gap that widened dramatically. Cheaper tokens, deployed through agents running hundreds of model calls per workflow, produced invoices that did not resemble the original budget assumptions. At Meta, the leaderboard showed total token usage across the company rose from 60 trillion over the previous 30 days to 74 trillion the following week. Uber’s per-engineer monthly API costs reached between $500 and $2,000, a figure that, multiplied across thousands of engineers, understandably exhausted the company’s annual budget in four months.
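The arithmetic behind that gap can be sketched in a few lines. The per-engineer range ($500 to $2,000 a month) and the 5x to 30x token multiplier come from the figures above; the headcount of 5,000 and the 80% price drop are hypothetical values chosen only for illustration, not reported numbers.

```python
def monthly_spend(engineers: int, cost_per_engineer: float) -> float:
    """Total monthly API spend for a given headcount."""
    return engineers * cost_per_engineer


def net_cost_multiplier(price_drop: float, token_multiplier: float) -> float:
    """Cost per task relative to the old baseline.

    price_drop: fractional fall in price per token (0.8 means 80% cheaper).
    token_multiplier: how many times more tokens each task now consumes.
    """
    return (1.0 - price_drop) * token_multiplier


# Hypothetical headcount, for illustration only.
HYPOTHETICAL_ENGINEERS = 5_000
# Hypothetical price decline, for illustration only.
HYPOTHETICAL_PRICE_DROP = 0.8

low = monthly_spend(HYPOTHETICAL_ENGINEERS, 500)
high = monthly_spend(HYPOTHETICAL_ENGINEERS, 2_000)
print(f"Monthly spend range: ${low:,.0f} to ${high:,.0f}")

# Gartner's reported range: agents use 5x to 30x the tokens of a chatbot query.
for multiplier in (5, 30):
    ratio = net_cost_multiplier(HYPOTHETICAL_PRICE_DROP, multiplier)
    print(f"{multiplier}x tokens at 80% lower price: {ratio:.1f}x cost per task")
```

Under these assumptions, an 80% price cut is fully cancelled at the low end of the multiplier range, and cost per task rises roughly sixfold at the high end.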
The correction started happening when teams began asking for results rather than chasing leaderboard positions.
Why China’s Tokenmaxxing Problem Is Bigger Than The West’s
There is a structural difference between what happened in the West and what is being built in China.
By March 2026, China’s daily token call volume had reached 140 trillion, up from 100 billion at the start of 2024. Liu Liehong, head of China’s National Data Administration, described this growth as an economic signal: evidence that China’s AI industry was evolving from simple dialogue systems to decision-making agents.
When a government tracks a metric as evidence of industrial progress, the organizations operating within that policy environment will optimize for it.
The corporate responses followed immediately. Alibaba announced the creation of the Alibaba Token Hub, a move to integrate its AI businesses. It was also the first time a Chinese internet company embedded the word “token” into its organizational structure. Tencent rebranded its model-as-a-service platform as TokenHub. ByteDance’s cloud computing and AI platform Volcano Engine reported 140 enterprise customers with token usage exceeding 1 trillion each, indicating strong growth for the token economy.
These are commercial architectures designed to put token consumption at the center of enterprise AI, and that infrastructure makes its way into organizational culture. Kunlun Wanwei, a Chinese internet company, told its technical staff that those who use fewer tokens will be eliminated, in an effort to push its personnel to increase their R&D efficiency by 50% through AI tools.
What makes this harder to correct than the Western version is that the caution exists inside the system and is deprioritized. Li Qiang, vice president of Tencent Holdings, told Yicai: “Assuming tokens are fuel, if you only focus on fuel consumption without considering the economic efficiency of building the engine, the cost for users may be very high, and they will eventually abandon it.” Liu Weiguang, senior vice president of Alibaba Cloud, said: “Everyone must not think that tokens are the same.” Token volume without context is not a meaningful measure. Both companies understand the distortion, and they are building commercial infrastructure around token volume anyway.
That is what a structural problem looks like: people who understand the risks are participating in it, because the policy environment, the vendor commercial model, and the internal performance pressure are all pointing in the same direction. In the West, Meta could take down the leaderboard. There is no single dashboard to take down here.
Tokens are a fine unit for paying for AI, but they are a poor unit for knowing whether AI is working. When token volume shows up in a vendor proposal, a procurement framework, or a performance review, operators should ask the following question: what business outcome is this token spend supposed to improve? That question is the difference between tokens being a metric and a trap.
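One way to make that question concrete is to report spend per business outcome next to raw token volume. The sketch below is illustrative only: the team names, token counts, price per million tokens, and outcome counts are all hypothetical, and "outcomes" stands in for whatever result the spend is meant to improve, such as tickets resolved.

```python
from dataclasses import dataclass


@dataclass
class TeamUsage:
    name: str
    tokens: int               # tokens consumed in the period
    price_per_million: float  # USD per million tokens (hypothetical)
    outcomes: int             # business results delivered, e.g. tickets resolved

    @property
    def spend(self) -> float:
        """Total token spend in USD."""
        return self.tokens / 1_000_000 * self.price_per_million

    @property
    def cost_per_outcome(self) -> float:
        """Spend per delivered outcome; infinite if nothing was delivered."""
        if self.outcomes == 0:
            return float("inf")
        return self.spend / self.outcomes


# Hypothetical figures, for illustration only.
teams = [
    TeamUsage("team_a", tokens=900_000_000, price_per_million=2.0, outcomes=300),
    TeamUsage("team_b", tokens=200_000_000, price_per_million=2.0, outcomes=400),
]

# A token leaderboard rewards the heaviest consumer...
by_volume = max(teams, key=lambda t: t.tokens)
# ...while an outcome view rewards the cheapest delivered result.
by_outcome = min(teams, key=lambda t: t.cost_per_outcome)

print(f"Top by token volume: {by_volume.name}")
print(f"Top by cost per outcome: {by_outcome.name}")
```

In this made-up example, the team that tops the token leaderboard is also the one paying the most per result, which is the distortion a volume-only metric hides.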

