AI Costs Surge: Uber, Microsoft Hit by Token Demand

The era of unchecked AI spending is crashing into financial reality. Companies from Uber to Microsoft are confronting the hard truth: AI's escalating token costs are becoming unsustainable.

Key Takeaways

  • Companies like Uber and Microsoft are facing severe cost pressures from AI token usage.
  • Goldman Sachs predicts agentic AI could increase token demand by over 24 times.
  • A disconnect exists between AI spending and demonstrable consumer/business value.
  • Next-generation AI hardware is expected to reduce costs, but adoption remains a challenge.
  • High AI operating costs may be weakening the economic case for large-scale AI-driven job displacement.

Look, everyone was expecting AI to transform businesses, to unlock unprecedented efficiencies. What we weren’t quite bracing for was the sheer, unadulterated cost of that transformation. The narrative has long been about the potential, the future-proofing. Now, that future is arriving with a hefty invoice, and companies like Uber and Microsoft are finding their AI budgets vaporizing faster than a poorly written prompt.

Uber’s CTO, Praveen Neppalli Naga, dropped a bombshell recently, revealing the company had burned through its entire 2026 AI budget in mere months. This wasn’t a gradual creep; this was a sprint to an empty vault. Then Andrew Macdonald, Uber’s operations chief, pointed to a stark disconnect between token usage and tangible consumer features. It’s a classic case of producing a lot of noise without a clear signal.

Microsoft, too, is tightening its AI purse strings. Its decision to revoke developers’ access to the Claude Code programming assistant, ostensibly to consolidate on its internal Copilot CLI, smells heavily of cost-cutting. With the end of the fiscal year looming, the move looks less like strategic alignment and more like an attempt to prevent budget blowouts. It’s a stark signal that even tech giants are starting to feel the sting of AI’s escalating token-based billing.

The Agentic AI Overdrive

This isn’t just a few isolated incidents; it’s a widening trend. Goldman Sachs estimates that agentic AI could inflate token usage by a staggering 24 times in the coming years. That’s not an incremental increase; it’s a step change in demand and, consequently, in cost. The gap between AI ambitions and AI affordability is becoming painfully evident.

We’ve been hearing whispers for months about companies struggling to articulate the return on investment for their heavy AI deployments. Uber’s experience turns those whispers into a roar. Macdonald’s comments in Business Insider highlight a critical concern:

We’ve talked to senior engineers, and there was no link between higher token usage and a proportional increase in consumer features with real benefits for their customers.

It’s tough to justify footing a colossal bill when the direct line to customer value is fuzzy at best. More code might be shipping, yes, but is it better code? Is it code that customers actually notice or care about? Macdonald admitted it was “very hard to draw a line” between the increased output and actual software improvements. This isn’t the AI revolution we were promised; it’s an AI arms race that’s bankrupting the participants.

Microsoft’s pivot away from Claude Code subscriptions, which it only opened up in December, further underscores this financial strain. Coupled with the shift to token-based billing for GitHub Copilot, a move that ballooned costs earlier this year, it’s clear the company is looking for ways to rein in expenditure.

Bragging Rights vs. Bottom Line

The narrative from some corners of the tech world has been one of bragging rights. Nvidia CEO Jensen Huang famously remarked that he’d be alarmed if an Nvidia engineer wasn’t burning through tokens worth at least half their salary. This sentiment seems to have permeated leadership, with CEOs at various companies proudly announcing the percentage of AI-generated code, as if that metric alone guaranteed success.

Airbnb’s CEO noted 60% of their code was AI-generated. Chime reported 84% AI code. Google, cautiously, claims 50% AI-generated, but crucially, always human-verified. These figures, however, start to look eerily similar to Uber’s situation. Despite over 80% of Uber’s software engineers using agentic AI and over 60% of code being AI-generated, the cost-effectiveness is questionable.

And the costs can be astronomical. Peter Steinberger, creator of OpenClaw and an OpenAI employee, shared a shocking anecdote: his team of three spent over $1.3 million in tokens in a single month running agentic AI tools. That’s not a typo. That’s more than triple the annual salary of the highest-paid engineer on the team, for just one month’s operations. This trend suggests that the cost of AI is rapidly outstripping the cost of the very human labor it’s supposed to augment or replace. The justifications for recent AI-driven layoffs are starting to look incredibly thin, unless the entire industry is just running itself into the ground chasing phantom efficiencies.
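The arithmetic behind that comparison is easy to check. The snippet below is a rough sketch using only the figures quoted above; it treats “over $1.3 million” as exactly $1.3 million and “more than triple” as exactly three, so the salary ceiling it prints is an implied bound, not a reported number. The function names and the twelve-month extrapolation are illustrative assumptions.

```python
# Figures quoted in the anecdote; everything else is derived arithmetic.
MONTHLY_TOKEN_SPEND_USD = 1_300_000  # "over $1.3 million" in a single month
TEAM_SIZE = 3                        # "his team of three"
SALARY_MULTIPLE = 3                  # spend was "more than triple" the top salary


def implied_salary_ceiling(monthly_spend: float, multiple: float) -> float:
    """Upper bound on the top annual salary implied by spend > multiple * salary."""
    return monthly_spend / multiple


def annualized_spend(monthly_spend: float, months: int = 12) -> float:
    """Hypothetical: what the bill would be if one month's rate held all year."""
    return monthly_spend * months


ceiling = implied_salary_ceiling(MONTHLY_TOKEN_SPEND_USD, SALARY_MULTIPLE)
per_engineer = MONTHLY_TOKEN_SPEND_USD / TEAM_SIZE

print(f"Implied top-salary ceiling: under ${ceiling:,.0f} per year")
print(f"Token spend per engineer:   ${per_engineer:,.0f} per month")
print(f"Annualized at that rate:    ${annualized_spend(MONTHLY_TOKEN_SPEND_USD):,.0f}")
```

On these assumptions, each engineer’s monthly token bill is about the size of the best-paid teammate’s yearly salary.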

Hardware Hopes and Realities

Goldman Sachs’ report offers a sliver of hope, suggesting that next-generation inference chips could drastically reduce AI costs, making continued investment viable and potentially boosting profits. The promise is that efficiency gains from advanced hardware, like Nvidia’s Vera Rubin platform, will make AI use so much cheaper that the massive token demand becomes manageable. The new hardware is touted as offering significantly better performance per watt, a critical metric for AI efficiency. Companies that adopt these next-gen chips first could gain a substantial edge.

But here’s the catch: hardware development cycles are long, and adoption is not immediate. Over half of the data center projects planned with Nvidia’s Blackwell hardware have been canceled or delayed. This signals a significant bottleneck. The promise of cheaper AI, contingent on widespread adoption of highly advanced, expensive new hardware, feels like kicking the can down the road. For companies feeling the immediate pressure of token bills, these hardware solutions are a distant, uncertain future. The immediate problem isn’t a lack of AI potential; it’s the prohibitive, ongoing cost of realizing that potential.
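To put a number on how much cheaper hardware would have to make things, here is a back-of-the-envelope sketch. It assumes total spend is simply tokens multiplied by a uniform price per token, and it plugs in the 24-times demand estimate cited earlier; the function names and the sample price cuts are hypothetical, not figures from the Goldman Sachs report.

```python
def required_price_cut(demand_multiple: float) -> float:
    """Fractional drop in cost per token needed to keep total spend flat."""
    if demand_multiple <= 0:
        raise ValueError("demand_multiple must be positive")
    return 1.0 - 1.0 / demand_multiple


def spend_multiple(demand_multiple: float, price_cut: float) -> float:
    """How many times larger the bill gets after demand growth and a price cut."""
    return demand_multiple * (1.0 - price_cut)


DEMAND_MULTIPLE = 24  # the agentic-AI token demand estimate cited in the article

print(f"Price cut needed to hold spend flat: {required_price_cut(DEMAND_MULTIPLE):.1%}")

# Hypothetical price cuts, chosen only to illustrate the sensitivity.
for cut in (0.50, 0.90):
    bill = spend_multiple(DEMAND_MULTIPLE, cut)
    print(f"With a {cut:.0%} cheaper token, the bill still grows {bill:.1f}x")
```

Under this simple model, cost per token would have to fall by roughly 96% just to keep the bill where it is today; anything less and spending keeps climbing.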

Is This Just a Blip?

The current cost crunch might simply be the growing pains of a nascent, but rapidly expanding, technology. Agentic AI, with its complex decision-making processes, naturally consumes more resources than simpler chatbot interactions. The jump from a single-purpose LLM to an agent that can strategize, execute, and iterate is computationally intensive. This is where the real cost escalation lies.

However, the comparison to human labor costs is where the debate gets heated. If AI is truly meant to increase productivity and reduce operational expenses, its cost must eventually fall below the expense of equivalent human effort. Right now, in many high-intensity use cases, it doesn’t. This puts companies in a bind: invest heavily in potentially transformative but expensive technology, or stick with more predictable, albeit perhaps less scalable, human capital.

The real value proposition of AI needs to move beyond just generating more code or performing more tasks. It must translate into demonstrable business value – increased revenue, improved customer satisfaction, or significant cost savings. Without that clear link, the current spending spree looks less like strategic innovation and more like a speculative gamble with increasingly unfavorable odds.

Frequently Asked Questions

What does Goldman Sachs mean by “token demand”?
Tokens are the units of text that AI models process and that usage-based billing is metered in. Agentic AI, which performs more complex, multi-step tasks, requires significantly more tokens than simpler AI applications, leading to increased demand and costs.

Will AI agent costs ever come down?
Goldman Sachs and other analysts suggest that advancements in AI hardware, such as more efficient chips and optimized inference processes, could significantly reduce the cost per token. In the near term, however, costs are likely to stay high because agentic AI is growing rapidly in both complexity and usage.

Is AI replacing jobs faster than these costs are increasing?
This is a central tension. While AI is often cited for productivity gains that could lead to job displacement, the current high costs of advanced AI, especially agentic AI, raise questions about the immediate economic feasibility of large-scale AI-driven workforce reduction. High costs may be slowing the rate at which replacing human workers with AI makes economic sense.

Written by
theAIcatchup Editorial Team

Originally reported by Tom's Hardware - AI
