Foundation Models Will Eventually be the Basis of all AI-Powered Software

by Mike Volpi

Image created with Midjourney

Artificial Intelligence has captured our interest and fascination at Index Ventures for many years now. With investments in companies like Scale, Aurora, Covariant, Cohere, Lightning, DeepScribe, Gong and many others, we’ve been firm believers and advocates of the incredible potential of this technology.

As we head into 2023, we continue to look at AI as one of our most important investment areas. Our talented data and infrastructure team got together to collectively identify the major trends in AI and distill our insights into four key pieces. We’ll be releasing them once a day for the rest of the week in hopes that the series will be useful to other operators, entrepreneurs, and investors in the space.

Foundation Models will eventually be the basis of all AI-powered software.

They will encode the context upon which application-specific decisions are made.

One of the new additions to the ML lexicon is “foundation model.” These are large artificial neural networks “pre-trained” on massive amounts of data without particular end-uses in mind. A popular foundation model is OpenAI’s GPT-3, which was trained merely to predict the next bit of text in a file given what came before. But foundation models can then be “fine-tuned” on smaller sets of hand-labeled data to perform specific tasks, like answering customer questions.
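To make the “predict the next bit of text” objective concrete, here is a deliberately tiny sketch in Python. It is not how GPT-3 works internally (GPT-3 is a large neural network, while this toy just counts which word follows which), and the corpus and the function names `pretrain` and `predict_next` are our own illustrative assumptions. It only shows the shape of the training signal: no labels, just text and what came next.

```python
from collections import Counter, defaultdict


def pretrain(corpus):
    """Count, for each word, which words follow it in the corpus.

    Toy stand-in for pre-training: the only "supervision" is the next word.
    """
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical miniature "Internet" used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = pretrain(corpus)
print(predict_next(model, "sat"))
```

A real foundation model replaces the lookup table with billions of learned parameters and conditions on far more than the previous word, but the objective has the same form.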

Foundation models can lead to surprisingly human-like generality of skills. Most AI models are trained on task-specific data. Robotic arms are trained to pick things up. Autonomous vehicles are trained to drive. But foundation models consume vast amounts of data from all across the Internet. Some parts of it are excessive and obnoxious, but it encodes much of what humanity has learned over the millennia. One of our theories at Index is that almost any AI application can benefit from a baseline usage of foundation models.

Some human tasks seem narrow, like driving, filling a box at a warehouse, or answering a customer’s question. But we frequently apply understanding we have gained from other parts of our lives. Machines that are trained on siloed tasks don’t have that breadth. That’s why a robot that sees a cat on a conveyor belt doesn’t know what to do with it. That is why an autonomous vehicle that encounters illogical traffic cones just stops. Humans can put these things into context, based on what we know about animals and construction, and carry on. It’s our superpower.

We believe that over time, engineers will increasingly start with pre-trained foundation models and then fine-tune them on narrow tasks. The foundation model will not make the siloed AI models “human.” But its use will help models comprehend the most unusual of circumstances and navigate through them. Understanding that a human driver is angry based on their behavior will help the autonomous vehicle navigate. Understanding that a cat snuck into the warehouse and isn’t supposed to be on the conveyor belt will help the robot deal with the snafu. The most difficult situations for almost all AI models are the “long-tail” events that haven’t been seen before.
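The “start pre-trained, then fine-tune” workflow can be sketched in a few lines of Python. Everything here is a simplifying assumption for illustration: the `PRETRAINED_EMBEDDINGS` table stands in for a frozen foundation model, the four support tickets are made up, and the “fine-tuning” trains only a small logistic-regression head rather than updating the base model itself, which is one of several ways fine-tuning is done in practice.

```python
import math

# Hypothetical stand-in for a frozen, pre-trained encoder: a fixed lookup from
# words to 2-d vectors. In a real system this would be a large neural network.
PRETRAINED_EMBEDDINGS = {
    "refund": (1.0, 0.0), "charge": (0.9, 0.1), "invoice": (0.8, 0.2),
    "password": (0.0, 1.0), "login": (0.1, 0.9), "reset": (0.2, 0.8),
}


def encode(text):
    """Average the frozen embeddings of known words (zeros if none are known)."""
    vecs = [PRETRAINED_EMBEDDINGS[w] for w in text.lower().split()
            if w in PRETRAINED_EMBEDDINGS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))


def fine_tune(examples, epochs=200, lr=0.5):
    """Train only a small logistic-regression head; the encoder stays frozen."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = encode(text)
            p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            err = p - label  # gradient of the log loss w.r.t. the score
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b


def classify(head, text):
    """Label a ticket as 'billing' (score > 0) or 'account' (otherwise)."""
    w, b = head
    x = encode(text)
    return "billing" if w[0] * x[0] + w[1] * x[1] + b > 0 else "account"


# A small, hand-labeled task dataset (1 = billing, 0 = account); made up.
labeled = [
    ("refund my charge", 1),
    ("wrong invoice", 1),
    ("reset my password", 0),
    ("login problem", 0),
]
head = fine_tune(labeled)
print(classify(head, "invoice refund"))
print(classify(head, "password login"))
```

The point of the sketch is the division of labor: the general knowledge lives in the pre-trained part, so the task-specific part can be small and learn from only a handful of labeled examples.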

Cohere was co-founded by Aidan Gomez, who co-authored the seminal paper that introduced the “transformer” software architecture that underlies many foundation models. We led the company’s Series A in 2021 with the thesis that, like the cloud giants of a decade ago, there would emerge a small number of providers that abstract the often-prohibitive complexity behind developing, managing, and hosting these models. We have a long way to go before foundation models truly understand the meaning of things the way we humans do. But they are improving at a torrid pace, and, in the not-too-distant future, they will begin to approximate the knowledge base that we use as the context of how we accomplish tasks.

Published — Dec. 15, 2022