How to Make Your Startup Antifragile
In the aftermath of the shocking collapse of SVB, a plethora of posts have been written with recommendations for how startups should react. Many have, understandably, focused on ways to add redundancy to banking and financial infrastructure.
If you take a step back, this experience should serve as a broader reminder of the importance of building companies that are antifragile.
But what does it mean for a company to be “antifragile” and how can you make your startup antifragile?
What Does It Mean to be Antifragile?
The term antifragile was introduced 10 years ago by Nassim Nicholas Taleb, a professor and former hedge fund manager (who in a previous book coined the term “black swan event” to refer to the disproportionate role of high-profile, hard-to-predict events):
Some things benefit from shocks; they thrive and grow when exposed to volatility, randomness, disorder, and stressors and love adventure. Yet in spite of the ubiquity of the phenomenon, there is no word for the exact opposite of fragile. Let us call it antifragile. Antifragility is beyond resilience or robustness. The resilient resists shocks and stays the same; the antifragile gets better.
Nassim proposed three types of companies, each defined by how they react to stress: “Fragile” companies wither and collapse when stress is introduced. “Resilient” companies act like reeds: they bend but do not break when stress is introduced, returning to their original form once the stressor is removed. “Antifragile” companies grow stronger as a result of stress. Like muscle development through strength training, stress still challenges the system in the short term, but when the stressor is removed, the company becomes stronger as a result.
While the concept of antifragility is relatively new for companies, its application to software development predates Nassim’s book by at least a decade. My first exposure to these concepts took place 10 years earlier, when I was a graduate student at Stanford University. My advisor, Armando Fox, was part of the “Recovery-Oriented Computing” (ROC) project, a joint effort between Stanford University and U.C. Berkeley to develop reliable distributed computing and internet infrastructure.
The foundation of recovery-oriented computing was a simple concept: instead of trying to enumerate all of the ways that software could fail and writing it in such a way as to prevent bad things from ever happening, engineers should create software under the assumption that bad things can happen at any time and focus on reducing recovery time. (For those readers old enough to remember when you could remove the battery from a cell phone, this is how mobile software was designed: engineers had to presume that a user could take out the battery at any time, and the software had to be able to recover no matter what.)
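To make the "assume the battery can be pulled at any time" idea concrete, here is a minimal sketch in Python. It is an illustration of the general principle, not code from the ROC project, and the function names save_state and load_state are hypothetical. The assumption is that state is written to a temporary file first and then swapped into place, so an interruption leaves either the old file or the new one, never a half-written one.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write state so a crash at any point leaves the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Clean up the partial temporary file; the previous save is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path, default):
    """Recover on startup: use the last complete save, or fall back to a default."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default
```

The point of the design is that recovery is the normal startup path: the program never needs to know why it stopped, only how to pick up from the last complete save.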
The concepts that originated in the ROC project helped move software design from fragile to resilient. The next step, towards antifragility, came in the form of software innovations intended to promote learning and adaptability of systems. For example, in 2011, Netflix created a tool called “Chaos Monkey” that randomly disabled servers and networks in its production infrastructure in order to test resiliency and drive improvement:
Imagine a monkey entering a 'data center', these 'farms' of servers that host all the critical functions of our online activities. The monkey randomly rips cables, destroys devices and returns everything that passes by the hand [i.e. flings excrement]. The challenge for IT managers is to design the information system they are responsible for so that it can work despite these monkeys, which no one ever knows when they arrive and what they will destroy.
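The idea behind that quote can be sketched as a toy experiment. This is not Netflix's tool or its API; the Service class and the functions unleash_monkey and survives_single_failure are hypothetical names invented for illustration. The sketch assumes a service backed by named replicas, knocks out one at random, and checks whether requests are still served.

```python
import random


class Service:
    """A toy service that can answer as long as one replica is healthy."""

    def __init__(self, replica_names):
        self.healthy = {name: True for name in replica_names}

    def handle_request(self):
        for name, up in self.healthy.items():
            if up:
                return f"served by {name}"
        raise RuntimeError("no healthy replicas")


def unleash_monkey(service, rng):
    """Disable one randomly chosen healthy replica and return its name."""
    candidates = [name for name, up in service.healthy.items() if up]
    victim = rng.choice(candidates)
    service.healthy[victim] = False
    return victim


def survives_single_failure(service, rng, trials=100):
    """Repeatedly knock out one random replica and check requests still succeed."""
    for _ in range(trials):
        victim = unleash_monkey(service, rng)
        try:
            service.handle_request()
        except RuntimeError:
            return False
        finally:
            service.healthy[victim] = True  # restore before the next trial
    return True


rng = random.Random(0)
redundant_ok = survives_single_failure(Service(["a", "b", "c"]), rng)
single_ok = survives_single_failure(Service(["a"]), rng)
```

Here redundant_ok comes out True and single_ok comes out False: the experiment does not predict which replica will fail, it only reveals whether the system has an option ready when one does.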
It’s All About Options
Antifragile systems don’t try to predict future events. Rather, they are designed to react quickly and effectively to shocks of unknown or unexpected origin.
Just as recovery-oriented software is built on the assumption that bad things can happen at any time, antifragile companies expect shocks to occur. What makes them antifragile is how they deal with them.
The foundation of antifragile systems is options — courses of action that are prepared in advance, such that they can be put into action if and when a shock occurs. By preparing options in advance and independent of specific events, you don’t have to scramble when something unexpected occurs.
Scott Fenstermaker noted one example of an antifragile company that should be familiar to all startup founders: VC firms.
What does a VC company do? It buys a certain ownership percentage of a range of companies and mixes them together into a portfolio. These investments have a limited downside: the up-front cost of ownership. Therefore a VC company always knows what its maximum financial loss would be.
But every once in a while, they get a unicorn in their portfolio, a company that takes off to a far greater degree than the other members of the cohort.
Notice also that the VC didn’t have to predict which companies in the portfolio would be the ones to take off. This is key. Antifragile propositions don’t require specific predictions about the future.
VCs are not fragile to a deviation from expectation because they don’t care which company in their portfolio takes off. They just have to be correct on average. Every startup investment is simply an option that they can choose to exercise.
VCs don’t need to think in advance about which companies will fail, or what will cause their failure (although this information definitely benefits them when making investment decisions). By building a company that is predicated on options, VCs are able to continue moving forward when shocks to one or more portfolio companies occur.
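The bounded-downside arithmetic in the VC example can be worked through with a few lines of Python. Every number here is made up for illustration: a hypothetical fund writes ten equal checks, and the exit multiples (the function portfolio_outcome and the list multiples are also invented names) are assumptions, not data from any real fund.

```python
def portfolio_outcome(check_size, exit_multiples):
    """Return (total invested, total returned) for equal-sized investments."""
    invested = check_size * len(exit_multiples)
    returned = sum(check_size * multiple for multiple in exit_multiples)
    return invested, returned


# Hypothetical portfolio: six write-offs, three modest outcomes, one outlier.
multiples = [0, 0, 0, 0, 0, 0, 0.5, 1, 2, 40]
invested, returned = portfolio_outcome(1_000_000, multiples)

max_loss = invested                   # the downside is capped at what was put in
net_multiple = returned / invested    # driven almost entirely by the outlier
```

With these assumed figures the fund invests 10 million, can never lose more than that, and gets back 43.5 million, a 4.35x net multiple. The calculation never needs to know in advance which position holds the 40x outcome.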
Making Your Startup Antifragile
As great as it sounds for a startup to get stronger when unexpected events occur, I don’t actually think that’s a realistic goal for most companies (it certainly isn’t the case for VC firms). Rather, I think the goal in making antifragile startups should be to minimize the risk and distraction when unexpected events occur, such that the company can continue to make progress while its competitors are panicking and reacting.
If we think about the recent collapse of SVB, every company fell into one of three buckets:
(1) SVB was your only bank
(2) SVB was one of multiple banks
(3) You did not bank with SVB
If you were in category (1), then you likely lost a week (or more) of productivity dealing with the situation. If you were in (3), then you likely lost little or no time. Of course, founders with U.S. bank accounts couldn't reasonably have predicted which bank would collapse, so whether you were in (1) or (3) came down to luck. That means from a resiliency standpoint, (2) was actually the best-case scenario. In this case, there would have been some distraction (trying to move money out), but no panicking (since short-term cash flow wouldn’t be a risk). In parallel, the company would have continued to make progress.
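A rough way to see why (2) avoids the panic is to ask how many months of spending remain accessible if the single largest account is frozen. The sketch below is a simplification with made-up balances and burn rate; the function worst_case_runway_months and the account names are hypothetical, and it ignores deposit insurance, credit lines, and timing.

```python
def worst_case_runway_months(balances, monthly_burn):
    """Months of spending still covered if the single largest account is frozen."""
    if monthly_burn <= 0:
        raise ValueError("monthly_burn must be positive")
    accessible = sum(balances.values()) - max(balances.values())
    return accessible / monthly_burn


# Same total cash, held two different ways (illustrative numbers only).
single_bank = worst_case_runway_months({"bank_a": 3_000_000}, 250_000)
two_banks = worst_case_runway_months(
    {"bank_a": 2_000_000, "bank_b": 1_000_000}, 250_000
)
```

Under these assumptions the single-bank company has zero months of accessible cash in the worst case, while the two-bank company still has four. Neither company had to guess which bank would fail.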
Of course, you would only ever have implemented (2) if you believed that there was a specific risk that a bank in the U.S. would fail. Until a few weeks ago, no reasonable founder believed that, or even thought about it.
So what is reasonable?
If we go back to the concept of recovery-oriented computing, the solution is definitely not to try to enumerate all of the exact things that could go wrong in the future. Trying to think through every possible shock that could happen and attempting to design resiliency around each and every one is a losing proposition.
Instead, founders should think about the broad categories of shocks that are possible and design around those. While the exact stressor might not be predictable, it’s possible to design resiliency around themes. For example:
An employee suddenly becoming unavailable (due to a family emergency, injury/illness, unexpectedly quitting, etc.)
A major technical issue (breach, hack, extended downtime, etc.)
A cash flow issue (large customers not paying, investor or grant money not coming through on time, fraud or other criminal activity, a major bank failing,…)
While this does require some enumeration of risks, in reality it’s focused on understanding the core components of the business and designing options for unexpected failures in each of those components.
Having redundancy across the team provides options when an employee is unavailable, without having to design around each specific employee or scenario. Building redundant software and infrastructure (and systems around that software and infrastructure) provides options when there’s a technical issue, without having to enumerate every possible thing that can go wrong. And so on.
At a higher level, you can design crisis management processes so that everyone knows who to go to when a shock occurs. Thinking about and designing (even at a high level) responses to technical, legal, PR and finance crises can provide a huge advantage if and when such a shock occurs.
Of course, all of this comes at a cost. Each option you create for your company requires an investment of time and resources. The return comes over the long term, as and when shocks do occur. As a founder, one of the best things you can do is to build your organization such that it is resilient to the types of unexpected shocks that are reasonably likely to occur.
As Louis Pasteur famously said, “Chance favors the prepared mind.”