Transformers, LLMs, and the (R)evolution of the Moonfire Tech Stack 🌗🔥

DDVC #37: Where venture capital and data intersect. Every week.

May 25, 2023

👋 Hi, I’m Andre and welcome to my weekly newsletter, Data-driven VC. Every Thursday I cover hands-on insights into data-driven innovation in venture capital and connect the dots between the latest research, reviews of novel tools and datasets, deep dives into various VC tech stacks, interviews with experts and the implications for all stakeholders. Follow along to understand how data-driven approaches change the game, why it matters, and what it means for you.

Current subscribers: 9,760, +346 since last week

Brought to you by VESTBERRY - A tool designed specifically for data-driven VCs

Request your FREE demo account to explore the potential of establishing live data links from over 300 diverse data sources, apps, and platforms, paving the way for distinctive VC insights.

Request for FREE

I’m incredibly excited to have one of the top 20 thought leaders from the “Data-driven VC Landscape 2023” contribute today’s guest post. An extremely talented engineer who founded an infrastructure analytics company and previously worked at Etsy and Facebook, Mike Arpaia (the nice guy on the left in the photo below with his data team) is a Partner at Moonfire.

For those who don’t know, Moonfire is an innovative UK-based venture firm that understands itself as a tech team doing venture, not the other way around. Just yesterday, they announced a fresh $ 115m in capital across a second-generation $ 90m early-stage fund and a $ 25m growth opportunity fund. They are known as one of the most data-driven VC firms leveraging a modern tech stack and AI across the value chain.

Thank you, Mike, for sharing this unique behind-the-scenes look and your valuable thoughts on the impact of AI on VC 🙏🏻

In this article, we delve into the journey of deep learning and its impact on the landscape of artificial intelligence (AI), with a specific focus on the venture capital sector. We'll explore the evolution of transformer models, such as GPT-4, that have become pivotal in today's state-of-the-art AI.

Through the lens of our technology stack here at Moonfire, we'll examine how these innovations are streamlining processes and aiding decision-making in venture capital. This article serves as an exploration of where we are, how we got here, and the potential direction of AI in venture capital, offering a pragmatic glimpse into an evolving future.

Transformers

The “state-of-the-art” (SOTA) in AI has shifted and evolved over the course of history. Deep learning-based approaches (neural approaches) to AI first became dominant in vision. When the convolutional neural network AlexNet was released in 2012, it not only popularised deep learning in computer vision, which dominated the domain for years but sparked a deep learning revolution.

Then came ‘Attention is All you Need’ – the famous, and aptly named, 2017 paper that introduced the transformer model to the world. This model enabled the training of state-of-the-art models for language tasks, like translation and summarisation, without needing sequential data processing. All the powerhouse models today – GPT-4, BERT, PaLM – are transformers, using the same architecture but trained in different ways to do different things.

And that’s set us off on this crazy trajectory. I got really excited about transformers quite early on and ever since then, I’ve been using them to play around with professional data like people, companies, learning content, etc. Transformers are really good at analyzing language, and there’s a lot of language in professional data. I worked on this problem space of deep learning for natural language applied to the professional domain at Workday and I got a feel for how powerful these models were for dealing with this sort of thing. At Moonfire, I wanted to take this tech and apply it to VC.

My colleague Jonas and I have already written about how we’re using text embeddings at Moonfire to find companies and founders that align with our investment thesis, but we’ve been consistently working towards a grander vision for where we want to take AI in venture.

For the first two and half years at Moonfire, we basically partitioned the venture process into all of its various components, like sourcing, screening, company evaluation, founder evaluation, portfolio simulation, and portfolio support, similar as described in Andre’s article here. Given that decomposition of the problem space, we then built transformer-based models to focus on the specific tasks within that ecosystem.

We’ve gotten pretty good at that! But then GPT-4 came along. We were already using GPT-3.5, but GPT-4 was a step change. You can ask it a question and the output is often better than a smaller model trained for that specific task.

GPT-4 was a significant leap in performance

For example, we have one small model in our infrastructure called the venture scale classifier, which aims to answer the question, ‘Is this company a venture-scale business?’ So is it a business that is venture investable from the perspective of growth, type of business, and stuff like that? It’s a subtle problem and it’s challenging to get a lot of useful training data. But it’s an important model for us because it’s where we filter out a lot of the noise early on. So, if we can make iterative improvements to this model, it has a meaningful effect.

Jonas spent a quarter working on this model, before GPT-4, and improved it by 5%, which was incredible. Then GPT-4 comes out, he spends an afternoon with it, and gets a 20% improvement on top of that. We decided then and there that a critical objective for us would be to go through the whole stack and try that for every model.

So that’s a bit of what we’ve been doing for the past few months. Around 75% of the time it’s better, but where we’ve trained a really optimized model for a specific task, the foundation model doesn’t provide an improvement. Not yet, anyway.

But we don’t want to just use LLMs to help us improve the individual parts. The idea is to create an architecture that allows us to integrate increasingly more powerful AI models into our infrastructure in a sustainable way: as they get better, we get better. We don’t want the way we’ve traditionally partitioned the space to become a straitjacket. We’re trying to think about it holistically – ‘what is the GPT-4 way of reasoning about partitioning the venture evaluation process as effectively as possible?’ That way, we’re not just bolting on AI, we’re completely reformulating the whole stack.

The ultimate goal is an end-to-end model where you just have one neural network doing basically everything; a future where software can write and improve itself, and where humans act more as supervisors and trainers of the AI. But that’s a way off yet.

The “10x VC” experience

So what does it look like for our investors? To start with, our pipeline was very investor-driven with some elements of automation. But we’ve been transitioning towards a more machine-driven process while allowing for a more focused investor perspective.

Currently, it’s like a symphony between machine automation and our investors, working together to move the pipeline forward. We do a lot of automated and ML-powered sourcing and evaluation work, which outputs a few dozen companies each week for our investors to look at.

Then we’ve got a lot of infrastructure that automates this board of investment opportunities. So our investors interact with the pipeline through carefully designed interfaces that provide them with all the necessary information about a company, most of which is automatically populated by our AI and data infrastructure. Based on this information, they can make a more informed decision faster, whether it’s rejecting the company, following up, or making an investment.

Each investor's decision is a valuable learning opportunity for us. We get so little training data, so investors making actual decisions is the best training data possible for our models. We try to keep as much of that as possible.

Looking way ahead, I think we’ll see AI making the bulk of decisions, only requiring investor input for particularly complex issues – basically, the stuff where humans are a better judge. So it’ll be like a handful of really focused interfaces that allow investors to make key decisions efficiently.

And now we’re thinking about how to increase the throughput of that decision-making. It’s this iterative process of trying to learn as much as we can about our investors’ perspectives and using that knowledge to guide them through the evaluation and investment process so they can focus on all the other decisions they have to make. Ideally, our investors will primarily focus on thesis inquiry, meeting high-quality founders, and making investment decisions.

Another firm might eschew human input entirely, and just try to build a fully automated pipeline. But our philosophy is: we hire top-tier investors and build the best ML and AI infrastructure, aiming to learn and integrate the perspectives of all our investors in a way that is really additive.

Every investor contributes significantly to our investment thesis and training data – even if we hired someone who only makes a few decisions before leaving. Collectively, everyone is making our engine smarter.

Share this article with others who might benefit.

Orchestrating Venture with Agents

LangChain, a software development kit (SDK) designed to help developers use LLMs, says the following in their Python documentation:

We believe that the most powerful and differentiated applications will not only call out to a language model, but will also be:
Data-aware: connect a language model to other sources of data
Agentic: allow a language model to interact with its environmentHow imperative is it for businesses to adopt this new way of thinking about technology and capabilities?

How imperative is it for businesses to adopt this new way of thinking about technology and capabilities? LangChain and alikes (HuggingFace’s Transformer Agent, Microsoft’s Guidance, etc) serve as a bridge between the language model and specific data sources or capabilities, providing the LLM access to structured information, APIs, and other data outside its original training set.

By using these tools, developers can extend the functionality of an LLM, customizing it to accomplish specific objectives, such as solving complex problems, answering specialized queries, or interacting with external systems in a more intelligent and context-aware manner. A great example of a GPT-based agent model called “LeanStartupAget” was nicely described in last week’s episode of Data-driven VC.

The idea of data awareness amplifies the capabilities of language models, enabling them to connect and interact with a broad spectrum of data sources, thereby providing more relevant, accurate, and insightful responses. This is particularly useful for tasks like investment research, founder evaluation, and company evaluation; all of which require a comprehensive understanding of a variety of data points.

Similarly, allowing language models to interact with their environment, or being agentic, pushes the boundaries of their potential applications. It enables them to engage in real-time data exchange with various systems, make decisions based on the current context, and even execute actions, greatly enhancing automation capabilities.

Multiple perspectives in an agent-based world

While some firms intentionally cultivate adversarial perspectives, our goal is to unify these viewpoints – and this is feeding into our own discussions as we develop AI agents or digital investors! Should we have separate agents for each investor to account for the fact they each have their own biases, priorities, and ways of thinking about things? Or could we employ a single agent capable of weighing these different perspectives and tweaking its approach through prompting?

I think the latter works in the context of one firm, but we will see some really interesting stuff when the AI agents of different companies, and different people start interacting with each other because they’ll have to collaborate and negotiate. Perhaps that is how companies will interact in the future, like a core communication fabric. But that moment will be the beginning of an exponential explosion of agents – all coordinating and competing for things. That’s for another blog post.

Despite being only a few years deep into using transformers in the venture stack, it’s already clear that this marks a significant shift in how we, and the venture world at large, operate. We wrote a piece recently discussing the concept of the 10x VC, where AI makes VCs much more efficient. I think it’s more on the scale of n^(10)x, as machine learning has the capacity to exponentially enhance our capabilities. That’s what we’re shooting for.

The n^(10)x VC?

The ongoing evolution in AI, led by transformer models like GPT-4, has ushered in a transformative era across various industries, particularly in venture capital.

Our pioneering approach to integrating AI into venture processes showcases some of the gains in efficiency and decision-making accuracy this new perspective brings.

By embracing the data-aware and agentic potential of LLMs, VCs stand to gain unprecedented capabilities in handling complex problems and streamlining operations. In the VC landscape, these advancements empower investors to focus more on high-level decision-making tasks, amplifying their effectiveness in building relationships with founders and spotting the next big venture-scale business.

As we gaze into the future, the possibilities for AI, and specifically LLMs, seem boundless. The prospect of AI agents communicating and collaborating with one another across different platforms and firms hints at a paradigm shift in all businesses, but also venture investing, heralding an exciting future that could redefine the venture capital landscape.

In my opinion, the path forward is to embrace these innovations and continuously seek ways to better harness their potential, always striving for that n^(10)x efficiency that this new AI era promises.

This is it for today. Thank you so much for sharing your valuable perspective on the future of VC and the importance of data and AI in this world, Mike!

Stay driven,
Andre

Thank you for reading. If you liked it, share it with your friends, colleagues and everyone interested in data-driven innovation. Subscribe below and follow me on LinkedIn or Twitter to never miss data-driven VC updates again.

What do you think about my weekly Newsletter? Love it | It's great | Good | Okay-ish | Stop it

If you have any suggestions, want me to feature an article, research, your tech stack or list a job, hit me up! I would love to include it in my next edition😎