DeepSeek was Inevitable
The tech world has been abuzz since last week about the release of DeepSeek R1, an open source AI model that seemingly came out of nowhere. Not only does DeepSeek R1 appear to be a credible competitor to offerings from AI behemoths like OpenAI, Meta and Google, but the company behind it claims to have developed the model for less than US $6M. DeepSeek’s emergence has rattled the global stock market, with chip maker Nvidia losing nearly $600 billion in market cap in a single day.
There are plenty of questions still to be answered about DeepSeek (and more than enough folks already diving down those rabbit holes). In the meantime, it’s clear that two of the tech industry’s fundamental assumptions about foundation models have been disproven:
The incumbents already have an insurmountable lead
The only way to get better results is to throw more hardware at the problem
I’ll be the first to admit that I bought into both of these assumptions to some degree. It was difficult for me to see how any up-and-coming company could credibly challenge the industry leaders, with their billion-dollar war chests and armies of top AI talent (at least, when it came to generalist models — I have invested in companies building specialized models). Moreover, with the incumbents all seeming to converge on increasingly incremental rates of advancement as of late, it certainly felt like we were hitting an asymptote of progress.
Of course, in hindsight, we were foolish to believe either of these was true. Had we stopped to really think about it, it would have been obvious that a challenger would emerge. The history of tech simply offers no precedent for either of the above statements.
The emergence of DeepSeek — or something like it — was inevitable.
Forgot About Dre Open Source
“Y’all know me. Still the same OG.
But I’ve been low-key…”
Earlier in the year, I wrote about the hard tech renaissance that’s currently underway and the lack of VCs who are old enough to have experience underwriting technology risk. In thinking about why so many investors were surprised by the emergence of DeepSeek, it occurred to me that a corollary of this observation is that there’s an entire generation of investors who have no clue just how powerful open source can be.
In the 90s and well into the early 00s, epic battles took place between commercial vendors and open source alternatives in many segments of the market:
Microsoft battled with Linux distributions for operating system supremacy
Oracle and IBM DB2 faced off against MySQL and PostgreSQL in the database world
Mozilla emerged from the ashes of Netscape to confront Internet Explorer
While open source software certainly plays a role in many aspects of the tech ecosystem today, we really haven’t seen it at the forefront of a major industry in a while (the closest is probably Android, which itself was more a tool for major handset manufacturers than anything). And I say this as someone who has invested in multiple open source and open core startups.
But of course there would be a compelling open source foundation model. It was inevitable.
Investors didn’t see it, however, because they weren’t necessarily looking for it. Many VCs today avoid open source software entirely or consider it nothing more than a distribution model, a way to get developer adoption. But the best open source projects have always been more than that: they are cultural movements, the likes of which we haven’t seen in a long time.
Of course the next open source movement would emerge around AI. It was inevitable.
We Need More Dilithium H100s
The other unprecedented assumption we’ve been making over the past several years is that the only way to make meaningful forward progress is to throw more hardware at the problem. If anyone should have known better, it was me.
“Let’s go get more dilithium…”
20 years ago, internet adoption was exploding. By virtue of everyone coming online, we could observe and measure all sorts of human behavior that we couldn’t before — from interactions on social networks to consumer buying behavior. But we had no ability to analyze all the data that we were collecting. In those days, network connections were so slow that databases could only perform complex analysis of data that was physically colocated on the same computer or server.
The solution? Throw hardware at it.
Incumbents like HP, Oracle and Teradata spent millions upon millions of dollars building custom servers capable of storing a fraction of the data that will fit on your laptop today. Companies signed massive multi-year contracts in order to get the most basic insights into what their users and customers were doing, gleaned by sampling and filtering data. It wasn’t nearly enough, but the incumbents all had the same message: hardware was a fundamental limitation and there was nothing more that could be done on the software side.
Meanwhile, on the campus of Stanford University, my friend and fellow grad student Mayank Bawa had a different perspective. In his PhD research with the Stanford InfoLab, Mayank explored whether intelligent partitioning of data across a peer-to-peer network of commodity servers could allow a distributed database to perform the type of complex analytics that until then could only be done with colocated data. The resulting paper, published in 2004 at VLDB (the international conference on “very large databases”), was the basis for Aster Data, which he convinced me to join shortly thereafter.
You may not have heard of Mayank or Aster Data, but you likely know what we created by another name: Big Data.
The first prototype system, built using three cheap off-the-shelf computers from Fry’s Electronics in Palo Alto, outperformed a $10 million custom Oracle server on the core industry benchmark, TPC-H. And we weren’t the only ones. Around the same time, a handful of other companies (notably Vertica and Greenplum) emerged with products based on similar algorithmic insights. And the rest is history.
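To give a sense of what that kind of algorithmic insight looks like in miniature, here is a toy sketch (in Python, chosen just for illustration) of the shared-nothing pattern described above: hash-partition rows across a few commodity workers, aggregate locally on each one, and ship only the small partial results over the network. This is a simplified illustration of the general idea, not the actual algorithms from Mayank’s VLDB paper or Aster Data’s product.

```python
# Toy sketch of shared-nothing analytics: hash-partition rows across
# commodity "workers," aggregate locally on each, and merge only the
# small partial results. An illustration of the concept only -- not
# the actual Aster Data algorithms.
from collections import defaultdict

NUM_WORKERS = 3  # e.g. three cheap off-the-shelf machines


def partition(rows, key, num_workers=NUM_WORKERS):
    """Route each row to a worker based on a hash of its key column."""
    shards = [[] for _ in range(num_workers)]
    for row in rows:
        shards[hash(row[key]) % num_workers].append(row)
    return shards


def local_aggregate(shard, key, value):
    """Each worker sums its own rows; no raw data leaves the machine."""
    partial = defaultdict(float)
    for row in shard:
        partial[row[key]] += row[value]
    return partial


def merge(partials):
    """Only the small per-group partial sums cross the network."""
    totals = defaultdict(float)
    for partial in partials:
        for group, subtotal in partial.items():
            totals[group] += subtotal
    return dict(totals)


if __name__ == "__main__":
    orders = [
        {"customer": "acme", "amount": 120.0},
        {"customer": "globex", "amount": 75.5},
        {"customer": "acme", "amount": 30.0},
    ]
    shards = partition(orders, key="customer")
    partials = [local_aggregate(s, key="customer", value="amount") for s in shards]
    print(merge(partials))  # {'acme': 150.0, 'globex': 75.5}
```

Because rows that share a key hash to the same worker, the merge step here only ever sees one partial result per group; in a real distributed database, the hard part is choosing partitioning schemes so that joins and more complex analytics can also run locally.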
Fast forward to today, and we’ve been listening to incumbent AI companies making the same calcified claims.
A new challenger with a new approach that could work around hardware limitations? It was inevitable.
But, but, but…
Obviously, the DeepSeek situation isn’t as simple as I described above. This isn’t a Hallmark story about a plucky open source community taking on a commercial goliath. There are significant geopolitical considerations at play and we’re already hearing claims and counterclaims between the parties.
But if we put all of that to the side, I sincerely believe that the emergence of a credible alternative to the increasingly converging approaches of the incumbent AI companies was inevitable.
Necessity is the mother of invention. And by all accounts, DeepSeek’s lack of access to state-of-the-art hardware drove it to invent a new approach to AI.
Marc Andreessen called it AI’s Sputnik moment. I think it’s a thunderous reminder that as exciting as the past few years have been, we’ve barely scratched the surface when it comes to AI.
Note: if you want to get a sense of some of the algorithmic solutions the DeepSeek team came up with to work around hardware limitations, check out this post (jump down to the section titled “The Theoretical Threat”).