¶ Introduction to Tokenmaxxing Costs
From Tokyo, Japan and Changsha, China – this is Down to Business English. With your hosts Skip Montreux and Dez Morgan. Hello Dez! How's it going? Hi Skip. I’m doing well, thanks. But I have to admit, I'm feeling a little guilty about a recent purchase I made. Oh? What did you buy? I just upgraded to the Pro version of my calendar app, and it cost $120 a year! 120 bucks for a calendar app? Why in the world do you need a Pro version of a calendar app, Dez?
Well, it uses three different AI programs running constantly in the background. They read all my emails, analyze my tone, and then color code my meetings based on their importance. Dez, you are a teacher! You have the same classes and the same faculty meetings every single week. Your schedule is highly predictable. You do not have anywhere near enough meetings to justify spending $120 a year on an AI calendar app! But the app gives me an AI Efficiency Score based on how much the AI works for me.
And I’m currently in the top 10% of users! I really want to keep my high score. So let me get this straight. You are intentionally using unnecessary computing power just to get a high score on an AI digital leaderboard? Exactly! It makes me feel productive. Huh. Well, as silly as that sounds, Dez, you are perfectly illustrating a brand new corporate crisis called tokenmaxxing. Tokenmaxxing? What’s that? In the AI world, a token is a basic unit of text, roughly a word or part of a word, that an AI model reads or writes.
When a company uses commercial AI, they are billed based on the total number of tokens their employees consume. Tokenmaxxing happens when developers waste millions of these tokens. Sometimes they do it intentionally to pad their productivity stats, but often, it is just very inefficient coding where the AI is left running unnecessary tasks in the background. So they’re playing the exact same game I am, but with the company’s money? Well, not exactly, but pretty close.
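As a rough illustration of the billing model described above, the sketch below totals the tokens consumed by each employee and converts them into a bill. The price per million tokens and the employee names are hypothetical values chosen for the example; the episode does not quote any actual rates.

```python
# Hypothetical rate for illustration only; the episode gives no real prices.
PRICE_PER_MILLION_TOKENS_USD = 10.00


def token_bill(tokens_by_employee: dict[str, int],
               price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Return the total bill in USD for all tokens consumed by all employees."""
    total_tokens = sum(tokens_by_employee.values())
    return total_tokens / 1_000_000 * price_per_million


# Hypothetical usage: one careful developer and one who leaves AI tasks running.
usage = {"dev_a": 2_000_000, "dev_b": 150_000_000}
bill = token_bill(usage)
print(f"Total bill: ${bill:,.2f}")
```

Because the bill depends only on the total token count, a single employee running unnecessary background tasks can dominate the whole invoice.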
¶ Agentic AI and Escalating Costs
And the implications are massive. Er, for example? Because of tokenmaxxing, the ride-sharing company Uber just burned through its entire annual AI budget in only four months. That sounds like a serious crisis for the tech sector. It is. I want to learn more. Then let's do it. Let's get D2B … Down to Business with Tokenmaxxing and the Corporate AI Pullback. I'm curious. You said Uber blew through its entire 12-month AI budget in just four months. That’s right.
And that forced the company to put emergency limits in place. How did they manage to spend so much money that fast? Good question. It comes down to how their software engineers were using AI — specifically, a new approach known as "Agentic AI." Hold on a minute. What do their software engineers even need AI for? Uber already has a working app. You push a button, and a car shows up. Well, it might work seamlessly on your phone, but a global app like Uber’s is always evolving.
Engineers are constantly rewriting the code — updating algorithms, maintaining massive databases, revising their user interface. Okay, so there is a lot of tedious, behind-the-scenes maintenance work. Exactly. And this is where Agentic AI comes in. And what exactly do you mean by Agentic AI? Well, most people use AI like a search engine. You ask a question, and you get an answer. Or you prompt the AI to do a task, like create a spreadsheet or write an email, and the AI does that for you.
Or create crazy videos of your cat breaking into your bedroom late at night, playing a guitar and serenading you with a Mexican ballad. Yes, there is some of that AI slop too. But Agentic AI is different. Instead of doing one single task, it acts as an independent worker. You give it a goal and it goes to work accomplishing it for you — using whatever tools it thinks are necessary and making decisions on its own to accomplish the task.
So instead of just asking AI to draft an email, you give it an entire project to work on, from start to finish. Exactly. Uber engineers were giving agentic AI tools broad goals, like reviewing code or updating old databases, and other similar tasks. And the AI just does all that work automatically? That’s right. The AI works in continuous loops — searching through millions of lines of old code, writing updates, testing them, and correcting its own errors over and over again in the background.
Ah, I see. And all that background reading and writing consumes tokens? A massive amount of tokens. In fact, industry data shows that an Agentic AI can consume up to 1,000 times more tokens than a standard AI prompt just to complete a single task. 1,000 times more. No wonder developers are burning through corporate budgets. So, what exactly did Uber do when they realized all the money was gone? They immediately hit the brakes and instituted an emergency company policy.
They placed a strict $1,500 monthly token limit on every single developer. Once an engineer hit that $1,500 cap, their access to premium AI coding tools was cut off for the rest of the month. A strict $1,500 limit? Mm. Interesting. But Skip, this isn’t just a problem for software engineers, is it?
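The two numbers in this part of the report can be sketched in a few lines: how much faster than planned a 12-month budget was spent if it ran out in four months, and how a $1,500 monthly cap cuts off access. The function names are hypothetical, and the cut-off rule (access ends once spending reaches the cap) is an assumption based on the description above, not Uber's actual implementation.

```python
MONTHLY_CAP_USD = 1_500  # per-developer monthly cap mentioned in the report


def burn_multiple(planned_months: int, actual_months: int) -> float:
    """How many times faster than planned the budget was spent."""
    return planned_months / actual_months


def has_premium_access(month_to_date_spend_usd: float,
                       cap_usd: float = MONTHLY_CAP_USD) -> bool:
    """True while spending is under the cap; False once the cap is hit."""
    return month_to_date_spend_usd < cap_usd


# A 12-month budget that lasts only 4 months was spent at 3x the planned rate.
print(burn_multiple(12, 4))
print(has_premium_access(900.00))
print(has_premium_access(1_500.00))
```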
¶ Corporate AI Usage and Solutions
You’re right, not at all. Coders are just the canary in the coal mine. Uber is the headline example, but the bigger story is that this problem could follow agentic AI into any department that handles a large amount of information — legal, human resources, marketing, finance, and many other departments. Can you give us an example, um, outside of tech? Sure. Well, imagine a law firm. Instead of paying a human employee to spend a week reading through old contracts, they unleash an Agentic AI.
You give it a goal, and the AI reads ten thousand pages of legal text, cross-references corporate policies, and drafts a final report — all in an hour or two. Much faster than a human could, obviously. Sure. But it brings the exact same tokenmaxxing problem. If that legal AI gets confused by a loophole, it might run in a continuous loop, reading the same files over and over, burning through tokens. Which brings us back to the crackdown on token usage.
Are other companies stepping in to stop this waste? Let’s look at two other responses. Amazon focused on employee behavior. Microsoft focused on software access. Okay, start with Amazon. What did they do? Amazon realized their employees were treating AI usage like a video game leaderboard, competing to see who could use the most tokens. So, the company completely shut that leaderboard down. And, what about Microsoft?
They were one of the first to jump on the AI bandwagon — are they doing anything about this? Absolutely. In fact, Microsoft just canceled the majority of its internal licenses for Claude Code, a premium AI tool. Their developers were running so many Agentic AI loops that the token costs blew through the company's annual AI budget in a matter of months. Now that’s not good. Microsoft is now forcing its teams to switch to less expensive AI tools to stop that bleeding.
But hold on, Skip. Even if a company caps AI spending at $1,500 a month, isn't that still much cheaper than paying a human paralegal or software engineer's salary? That was the original theory. The promise of AI was that it would be incredibly cheap and could replace costly human workers. But the cost of Agentic AI is proving that to be wrong. Because the AI runs in these massive, endless loops, the actual computing costs involved in using AI are astronomically higher than anyone predicted.
¶ From Usage to Inference Yield
So the ROI — or the return on investment — isn't as high as they originally thought. It is highly questionable. Companies are realizing you cannot just lay off a human and plug in an AI. When you combine these massive, unexpected token bills with the ‘babysitting tax’ … Sorry, ‘babysitting tax’? Yes, where you still have to pay a senior manager to fix the AI's mistakes. When you combine that ‘babysitting tax’ with unexpected token costs, the financial math falls apart.
So in the end, replacing human employees with AI might not actually be worth it? And that realization is why many businesses are changing their AI strategy. They are moving away from measuring raw token usage, and shifting to a concept called Inference Yield. Inference Yield. Okay, break that down for us.
Companies are moving from a simple question, “How much AI did employees use?”, to a better question: “How much useful work did that AI actually produce?” So the focus is no longer on using more AI; it’s on getting better results from the AI you use. In Amazon’s case, it means tracking efficiency: specifically, the ratio of tokens used to useful code published. Developers no longer get a high score just for using AI.
In fact, if a developer burns through millions of tokens just to produce one or two lines of useful code, their Inference Yield is terrible, and they are considered a financial liability. Well, that does make a lot of sense. It’s like measuring a writer by the quality of the books they publish, rather than just the number of words they write. That is a perfect analogy, Dez. So this is not just an internal budgeting problem.
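One way to picture Inference Yield is as useful output divided by tokens consumed. The sketch below assumes a simple definition, useful lines of code published per million tokens; the episode does not give an exact formula, so the function name, the unit, and the sample figures are all hypothetical.

```python
def inference_yield(useful_lines_published: int, tokens_used: int) -> float:
    """Useful lines of code published per million tokens (assumed definition)."""
    if tokens_used <= 0:
        raise ValueError("tokens_used must be positive")
    return useful_lines_published / (tokens_used / 1_000_000)


# Hypothetical developers: one burns millions of tokens for two useful lines,
# the other publishes far more useful code with fewer tokens.
wasteful = inference_yield(useful_lines_published=2, tokens_used=5_000_000)
efficient = inference_yield(useful_lines_published=400, tokens_used=2_000_000)
print(wasteful, efficient)
```

Under this definition a higher number is better, so the first developer scores poorly despite having used more AI.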
¶ AI Pullback's Market Impact
It could certainly affect the companies selling the AI tools. And that is where Wall Street enters this story. Anthropic is heading for an IPO, and OpenAI is expected to follow. But to justify huge valuations to investors, they have to prove that their revenue can keep growing. And that revenue basically comes from selling access to AI computing power — the more tokens customers use, the more money these AI companies can make.
Exactly. OpenAI and Anthropic make money when customers use their AI system. And that usage is measured in tokens. So, if major corporate clients like Microsoft, Amazon, and Uber suddenly hit the brakes, cap their AI spending, and focus only on useful output, that endless revenue growth hits a wall. So just as these AI companies are trying to go public and cash in, their biggest corporate clients are realizing the tech is too expensive.
Yes. And if the demand for AI was artificially pushed up just because employees were playing games with company money, the tech market is in for a very harsh reality check. If the tokenmaxxing party is over, these massive IPO ambitions and trillion-dollar valuation hopes could be in serious trouble. And on that somewhat worrying note, I think it’s time for us to get D2V … Down to Vocabulary.
¶ Essential Business English Vocabulary
The first item on our D2V list is the verb ‘color code’. When you color code something, you organize information by using different colors to show different categories. For example, you might color code the labels of your to-do list: red for urgent, green for finished, and yellow for things to do someday. I used to color code my to-do list, but I gave up. Everything ended up being colored red. That happens.
In the introduction of today’s episode when Dez was talking about his very expensive AI calendar app, he said the app reads all his emails, analyzes his tone, and then color codes his meetings based on their importance. In other words, the app uses different colors to show how important each meeting might be. Red meetings are very important, blue meetings are somewhat important, and green meetings are optional. To be honest, color-coded meetings might be the most useful feature of the whole app.
I still think you are paying way too much for it. But the important thing here is that ‘to color code’ means to use colors to organize or label information clearly. You might hear this in a project meeting. Someone might say, “We’ve color coded the supplier list so that green shows approved suppliers, yellow shows suppliers under review, and red shows suppliers with quality problems”. That is a very practical example.
Moving on, next on our D2V list are the phrasal verbs ‘to burn through’ and ‘to blow through’. Two very similar verbs, pretty much identical in meaning. They are. They both mean to use something up very quickly — especially money, time, or resources. Right. And they usually suggest that something was used much faster than expected. You often hear these phrases with business nouns like to blow through a budget, or burn through cash, or burn through inventory.
In today’s episode, Skip first said that Uber burned through its entire annual AI budget in just four months. Then, in the main report, I asked how exactly Uber managed to blow through its entire 12-month AI budget in just four months. And again, later in the report, I said Microsoft's developers were using agentic AI so much that the token costs blew through the company’s AI budget in a matter of months. So in all those examples, the meaning is the same.
The companies used up money that was supposed to last a full year much too quickly. Exactly. The image is not just normal spending. It is fast spending. The money disappeared before the company expected it to. Kind of like me with a box of donuts. That is probably the clearest example we’ve had all day. But in business, these phrases are often used when a company loses control of a budget or uses resources faster than planned. How might we use one of these phrases in a B2B context, Skip?
Well, you might hear this in a meeting, “Okay everyone. We burned through 80% of the total budget in the first six weeks, so we need to reduce the project scope before moving on to the next phase”. Sounds like trouble. I wouldn’t want to be the team leader of that project. Our next D2V item is the idiom ‘a canary in the coal mine’. A canary in the coal mine is an early warning sign that a bigger danger or problem may be coming. Unsurprisingly, this expression comes from coal mining.
In the past, miners carried small birds called canaries into the mines with them, to warn them if there was dangerous gas in the mine. And how exactly did the canary warn them? Well, because canaries are more sensitive to toxic gas than humans, they would die sooner. Wow! That is animal abuse. Ahh … it was a different era. In any case, today, if we call something ‘a canary in the coal mine’, we mean it’s the first sign of a larger problem.
And in today’s episode, I used this expression when we were talking about software developers and the huge cost of using agentic AI. I said coders were just the canary in the coal mine. In other words, coders are the first group impacted by the problem of overusing tokens. But the same issue could easily appear in other departments, like legal, human resources, marketing, and finance.
The point is not that coders are the only employees facing the problem, but that they may be the early warning sign of a larger issue. How would you use this in another business context, Dez? You might hear this in a sales strategy meeting: “The sudden drop in repeat orders from our distributors may be a canary in the coal mine. It could be an early warning sign that demand for our product is weakening across the whole market”. Hmm. That’s a pretty strong example.
It shows how the expression is used when one small problem may point to a much bigger problem coming later. What’s our next word? The final item on our D2V list is the noun phrase ‘reality check’. A reality check is a moment when you are forced to stop and look at a situation honestly, especially after being too optimistic or unrealistic. Right. It is often used when expectations are set too high, and then new facts, or new costs, or poor results force people to rethink the situation.
In today’s episode, Skip used this expression near the end of the report. He said that if the demand for AI was artificially pushed up because employees were playing games with company money, then the tech market could be in for a very harsh reality check. In other words, investors and AI companies may have believed that AI demand would keep growing forever. But if companies start limiting AI spending and focus only on useful output, the market may have to accept a harder truth.
So the reality check is this — AI may still be powerful, but it may not be as cheap or as profitable as many people hoped. How might we use ‘reality check’ in a different business context, Skip? Hmm. Well, you might hear this in a management meeting, “The failed pilot project was a reality check for us. We realized the product was not ready for large-scale use by our customers”. Yeah, that’s a good example. It shows that a reality check is not just bad news.
It's a moment when a company has to face the facts and make more realistic decisions.
¶ Episode Conclusion and Membership
Would you like to help Down to Business English reach more people wanting to improve their Business English skills? Follow Down to Business English on Apple Podcasts, YouTube Music, Spotify, or any place podcasts are found. Leave a rating and a review and tell everyone how much you enjoy the show. Thanks for that report on tokenmaxxing, Skip.
I’m still not totally sure I need to cancel my $120 AI calendar app just yet, but I do have to admit the whole tokenmaxxing story has got me thinking about my AI usage. It certainly should. What starts as a story about employees using too much AI quickly turns into a much bigger question about budgets, productivity, and whether companies are getting real value from these AI tools. I guess it comes down to whether AI is actually producing useful work.
D2B Members and Apple Podcasts subscribers — the Bonus Down to Vocabulary for today’s episode will be released within the next few days or so. In that bonus episode, we will look at six more useful words and phrases from today’s report — to pad something, seamlessly, behind-the-scenes, to hit the brakes, to jump on the bandwagon, and to cash in. A very useful set of vocabulary for talking about AI spending, corporate budgets, and whether new technology is actually creating business value.
If you are a D2B Member, make sure you have copied your members-only podcast feed URL from your account page on the D2B website and pasted it into the podcast app of your choice. That way, you will not miss out on the Bonus D2V episode when it is released. And Apple Podcasts subscribers, you do not need to do anything. The Bonus D2V episode will appear automatically in your feed as soon as it goes live.
And if you are not yet a D2B Member or Apple Podcasts subscriber, but you get value from what we are doing here on Down to Business English, please do consider supporting us by becoming a D2B Member or Apple Podcasts subscriber. To become a D2B Member, just visit d2benglish.com/membership and sign up today. That’s d2benglish.com/membership. And to become an Apple Podcasts subscriber, just visit the Down to Business English show page in Apple Podcasts and click the Subscribe button.
Thanks for listening, everyone. See you next time. Bye bye. Down to Business English … Business News, to improve your Business English.
