LLM profitability has a massive problem

This is not a financial deep dive; that has been covered better elsewhere by more qualified people.

What I want to talk about is the race that these companies have against local hardware and free / open-weight models.

My argument is that we’re in a short window of time where a closed-weight model can be profitable, and that window is rapidly closing. Open-weight models have become “good enough” at more and more tasks, and the frontier labs are being forced to push into harder, more niche tasks.

A token’s price is effectively: (energy cost of inference) + (profit margin of hardware owner) + (profit margin of model creator)

When an open-weight model becomes available that matches the performance of the frontier closed-weight model, the premium the closed-weight model can charge drops to zero.

In a perfectly efficient market, a token of equivalent quality sold for just (cost of energy + hardware owner’s profit) means the closed-weight lab must either give up its own margin to compete or accept significantly less volume.
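To make the price decomposition concrete, here is a minimal Python sketch. The per-token numbers are made up purely for illustration (arbitrary units, not real prices), and the `TokenPrice` class is a hypothetical helper, not anything from a real pricing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Token price split into the three components named above."""
    energy_cost: float        # energy cost of inference
    hardware_margin: float    # profit margin of the hardware owner
    model_margin: float = 0.0 # profit margin of the model creator

    @property
    def total(self) -> float:
        return self.energy_cost + self.hardware_margin + self.model_margin


# Hypothetical numbers in arbitrary units, chosen only to show the shape
# of the argument.
closed = TokenPrice(energy_cost=1.0, hardware_margin=0.5, model_margin=1.5)
open_weight = TokenPrice(energy_cost=1.0, hardware_margin=0.5)

# At equal quality, the gap between the two prices is exactly the model
# creator's margin, which is what the closed-weight lab has to give up
# to match the open-weight price.
premium = closed.total - open_weight.total
print(premium)
```

Under these assumptions the only lever the closed-weight lab has at equal quality is `model_margin`: it can cut it toward zero or keep it and lose volume.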

Thus, the frontier labs must either:

Produce a token of quality X that’s much cheaper to produce (and thus capture volume)
Produce a token of quality X that’s much higher value to consumers

Open-weight models have been consistently 2-3 months behind the closed-weight models. That’s not a lot of time considering the cost to develop a model.

So far, the frontier labs have not been able to compete well on price for lower-quality tokens. That makes sense: smaller models are cheaper to produce, so the labs have no moat of money keeping the playing field tilted in their favor.

They have instead been chasing quality of their tokens.

With this approach, however, they are implicitly assuming that there’s infinite work to be done at higher and higher levels.

Looking at software engineering, we need a lot of junior engineers, fewer senior engineers, even fewer principal engineers, and so on.

As these models get better, they are climbing the ranks, but displacing fewer dollars of salary.

A team likely has one principal engineer, who gets paid (at FAANG) 500k/yr, 5-8 junior engineers (300k each), and 1-2 seniors (400k each).

These are all ballpark numbers, but taking the upper end of each range, that means there are 2.4M dollars of salary to replace at the junior level, only 800k at the senior level, and just 500k at the principal level.
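The arithmetic behind those totals can be sketched in a few lines of Python. This uses the ballpark salaries above and assumes the upper end of each headcount range (8 juniors, 2 seniors); the `team` dictionary and its field names are just illustrative.

```python
# Ballpark FAANG figures from the text; headcounts assume the upper end
# of each range (8 juniors, 2 seniors, 1 principal).
team = {
    "junior": {"headcount": 8, "salary": 300_000},
    "senior": {"headcount": 2, "salary": 400_000},
    "principal": {"headcount": 1, "salary": 500_000},
}

# Salary dollars a model could displace at each level.
pools = {
    level: info["headcount"] * info["salary"]
    for level, info in team.items()
}

for level, dollars in pools.items():
    print(f"{level}: ${dollars:,}")
# junior: $2,400,000
# senior: $800,000
# principal: $500,000
```

Each step up the ladder shrinks the addressable salary pool, even as the model needed to do the work gets more expensive to build.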

The model needed to replace a principal engineer is more expensive to develop, by orders of magnitude, while simultaneously having less demand.

Effectively, frontier labs are getting squeezed on both ends, and the window in which they can show a profit is rapidly shrinking. More and more companies are realizing they can use open-weight models to achieve 90% of what Opus does at 10% of the cost.

Once the bottom lines are looked at closely, I don’t see the closed-weight business model surviving very long (as a standalone company). Google and Microsoft may continue developing models because it gives their other products a boost. But Anthropic’s biggest issue will be justifying why they need to keep developing models when people just want their harnesses.