At Hatcher+, we are using machine learning to help evaluate investment opportunities.
Not coincidentally, we are looking for signals that can’t easily be manipulated*. This is analogous to Google’s early use of the link graph on the web as a leading signal, rather than page content. Of course, site operators soon figured out they needed to game the links and an SEO "arms race" continues to this day. However, the novel approach at the time gave Google a substantial lead in search quality.
This feedback, adaptation, and arms race will happen as more automation enters the investment decision process. Having seen this movie before, we are thinking ahead and looking for signals that are difficult to assess or manipulate. This includes extensive application analysis and external data collection today. It will lead to even more nuanced application processes, including audio/visual analysis, team resume analysis, and so on.
Along the way, it's also interesting to highlight how predictive some simple, hard-to-fake data points can be. This is an easier problem than SEO, since we have data points with verifiable ground truth.
A (Very) Basic Prediction Case
Here are a couple of factors that are easy to verify in diligence and help predict a firm’s future:
- Years in business
- Time since prior investment
Individually, these variables don’t help much, but together they help predict the likelihood of an exit within 3 years. The chart below shows how the predicted exit likelihood tracks the actual exit likelihood (which is the average of the binary outcomes in each bucket). Higher predicted likelihoods loosely track the actual exit propensity of each bucket.
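For readers who want to reproduce this kind of chart on their own data, here is a minimal sketch of the bucketing step. It assumes equal-width buckets over the predicted likelihood. The function name `calibration_buckets`, the bucket count, and the toy values are illustrative assumptions, not our production code or real data.

```python
from collections import defaultdict


def calibration_buckets(predicted, outcomes, n_buckets=10):
    """Compare mean predicted likelihood with the observed exit rate per bucket.

    predicted: likelihoods in [0, 1]; outcomes: 1 if the company exited, else 0.
    Returns (bucket_index, count, mean_predicted, observed_exit_rate) tuples.
    """
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must have the same length")

    grouped = defaultdict(list)
    for p, y in zip(predicted, outcomes):
        # Equal-width buckets; clamp so that p == 1.0 lands in the last bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        grouped[idx].append((p, y))

    rows = []
    for idx in sorted(grouped):
        pairs = grouped[idx]
        mean_predicted = sum(p for p, _ in pairs) / len(pairs)
        # The "actual" likelihood is the average of the binary outcomes.
        exit_rate = sum(y for _, y in pairs) / len(pairs)
        rows.append((idx, len(pairs), mean_predicted, exit_rate))
    return rows


if __name__ == "__main__":
    # Made-up toy values, purely to show the shape of the output.
    toy_predicted = [0.05, 0.08, 0.22, 0.27, 0.61, 0.66]
    toy_outcomes = [0, 0, 0, 1, 1, 0]
    for row in calibration_buckets(toy_predicted, toy_outcomes, n_buckets=5):
        print(row)
```

Plotting the mean predicted value against the observed exit rate for each bucket gives a chart of the kind described above. Buckets with no predictions are simply omitted.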
This chart was generated using investments in about 40,000 companies, including various rounds up to Series F. Not bad for so little data.
Adding More Data
Although it begs the question a bit when evaluating an investment, adding the number of investors in a round improves predictive ability. Here’s how the above example looks when the investor count is added for the round under evaluation:
Wrapping It Up
It’s clear that even very small features help the overall prediction challenge. While these charts show some lift, these features are not yet predictive enough to be practical. To experience the lift, you’d need to make a large number of investments.
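To get a rough feel for what "a large number" means, here is a back-of-the-envelope sketch using a normal approximation: it finds the portfolio size at which the standard error of the observed exit rate shrinks to the size of the lift divided by z. The function name `investments_needed` and the 10% and 12% exit rates are hypothetical values chosen only for illustration. They are not figures from our data.

```python
import math


def investments_needed(baseline_rate, lifted_rate, z=1.96):
    """Rough portfolio size needed for a lift in exit rate to stand out from noise.

    Uses the normal approximation: the standard error of an observed rate over
    n investments is sqrt(p * (1 - p) / n). We solve for the n at which the gap
    between the two rates equals z standard errors.
    """
    gap = lifted_rate - baseline_rate
    if gap <= 0:
        raise ValueError("lifted_rate must exceed baseline_rate")
    variance = lifted_rate * (1 - lifted_rate)
    return math.ceil(variance * (z / gap) ** 2)


if __name__ == "__main__":
    # Hypothetical rates: a 10% baseline exit rate lifted to 12%.
    print(investments_needed(0.10, 0.12))
```

Under these assumed rates the answer is on the order of a thousand investments, and halving the lift roughly quadruples it. That is why a modest lift only pays off across a large portfolio.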
Similar to the wide variety of signals a search engine uses, we leverage a large, fluid feature set that generates predictions. We use a combination of information provided by the applicant and data from external sources.
Our goal is not to pick winners. Rather, we are seeking to help investment teams focus energy where it’s most useful. We use machine learning to distinguish the more likely winners from the less likely. Machine learning tools provide a unique added voice in the decision process.
*We prefer not to show too much detail here, or we could accelerate an ‘SEO-like’ ecosystem within which applicants study the algorithms and adjust their presentation, style, and content to improve their scores.