How to make history with LLMs & other generative models

NVIDIA HQ - apparently GPUs aren’t a bad business these days

It’s been well over a year since I published my overview of large language models, or “LLMs”. The magic of GitHub Copilot’s beta in early 2022 made me the most excited I’d been about a new technology since I got my first iPhone. I was (and remain) convinced that we are at the start of something big - big enough for me to write a large list of LLM application ideas and watch as many others published their large lists in subsequent months. There’s nothing wrong with a great market map, filled to the brim with potential startup ideas, but it inevitably leads to the question of where to focus. Sure, there are all of these opportunities, but let’s be real: which are the ones that a smart startup founder should go after? And which are flash-in-the-pan, waiting to be made irrelevant by a foundation model provider, a new architecture shift, or an onslaught of other early stage startups?

Over the past year I’ve talked to many founders, operators, and investors, and I’ve encountered the full spectrum of opinions regarding where new LLM/generative model-related $10B+ companies will be built. On one side are those who believe there are many opportunities to build “AI-native” companies (similar to the generational building opportunities from the move to the cloud) with feelings best summarized by this tweet:

On the other side are those more convinced that incumbents (applications and/or infrastructure companies) will capture most of the value from adding or supporting this new generative tech. I’ve even heard opinions as extreme as believing that all of the foundation model development and usage will become so commoditized that building chips or supplying power will be the only long-term advantages. In terms of where I land personally as an investor, I’m in the middle. I don’t think that every incumbent company will be disrupted by a new startup, but I believe some are more likely to be disrupted than others. I believe some infrastructure categories offer a better risk-return than others, as parts of the “LLM stack” appear relatively more resilient. In this post, I want to expand on some ideas that I’m particularly excited about and others that I’m less certain will reach venture scale as standalone businesses. I’ll also mention some companies I know that are working on each idea, but this is not meant to be a complete list (of companies or ideas; there are so many other exciting stealth companies for a potential future post…). Lastly, these are strong opinions, loosely held. I would love to be convinced to change my mind, and I believe it’s a sign of a great founder to be able to address risks, navigate the idea maze, and prove skeptics wrong.

Thanks for reading Leigh Marie’s Newsletter! Subscribe for free to receive new posts and support my work.

Promising generative model-related startup application ideas

Developer tooling platform

The success of GitHub Copilot (>30% acceptance rate for auto-generated suggestions and a rumored hundreds of thousands of users or >27,000 organizations paying for the product) has proven that current technology can augment even the best programmers. It’s now a no-brainer for most developers to have some sort of LLM-powered autocomplete, whether VSCode with a Copilot extension, ChatGPT copy-pasting, or a variety of competitors that have emerged over the past year to address Copilot and ChatGPT’s shortcomings. Given the massive size of the opportunity to speed up the world’s ~30m developers as well as gaps between what LLMs can be used for and the current Copilot product surface area, I am excited about AI-enabled dev tools startups’ potential. LLMs can already be at least somewhat helpful for code search, delegating larger coding tasks (e.g. refactoring), testing (unit, integration, QA), code review, debugging / observability, setting up infrastructure, documentation, and security. In addition, many companies want to self-manage anything related to their code (surprisingly, the majority of GitLab’s revenue comes from self-managed code repositories) which is not supported by Copilot, and some have broader security concerns around producing GPL code. Copilot also doesn’t support every IDE / notebook and doesn’t allow for fine-tuning or other customization to better understand or maintain bespoke codebases. Startups like Codeium*, Grit*, Warp, Sourcegraph (Cody), Cursor, & Contour have addressed many of these problems. How acute a pain point each initial wedge solves, the effectiveness of their products (e.g. latency, UX, model infrastructure tradeoffs), and the customers’ willingness to pay will likely determine the winner(s) here.
Though I believe the success probability is reasonably high for at least one new company (large market & opportunity to build a differentiated product), the main risk considerations would be either open source commoditizing most of these workflows (e.g. Meta’s Code Llama) or the incumbent, GitHub / OpenAI, developing proprietary model advances and aggressively leveraging existing distribution to capture the market (we’ll see…).

Augmenting knowledge workers (consulting, legal, medical, finance, etc.)

Early signs point to many simple back-office, front-office, and even customer-facing tasks being automated completely by LLMs - Adept AI & other startups excel at simple personal and professional tasks, like locating a specific house on Redfin or completing an otherwise click-heavy workflow in Salesforce. In addition to text & webpages, voice is understandable and replicable; for example, Infinitus* automates B2B healthcare calls, verifying benefits and checking on statuses. More specialized knowledge workers - especially those in legal, medical, or consulting professions - can also be made drastically more efficient with LLM-powered tools. However, given their plethora of domain-specific knowledge, the majority of higher-stakes workflows I’m referring to here are more likely to be assisted than fully automated in the near-term. These knowledge workers will essentially be paying such startups for small (but growing!) tasks to be completed inside of their complicated day-to-day workloads.

Whether drafting a legal document for a transaction or PI case or analyzing a contract for due diligence, some lawyers are already using legal assistant tech to save time. Thomson Reuters, a large incumbent tax & accounting software platform, saw so much potential in Casetext, an AI legal assistant, that they recently acquired it for $650 million. Given gaps in existing legal tech and the potential of LLMs to speed up workflows, there’s potential to build a larger legal software platform from the various initial automation wedges. However, startups must navigate finding champions and validate lawyers’ willingness to change personal workflows & ultimately pay for large efficiency gains.

In medicine, doctors can have more leverage with automatic entry of patient data* into their electronic health records after a meeting (especially important with the turnover of medical scribes) as well as automated patient or hospital Q&A through chatbots. Biologists are also already taking advantage of LLM-powered tools to help them find protein candidates faster. Though scaling medical GTM is notoriously challenging, the payoff of saving large amounts of time for these highly educated personas could be immense.

Finally, the consulting industry continues to boom, helping businesses make all kinds of decisions, from pricing models and store placement to inventory & risk management and forecasting. Startups like Arena AI*, Kumo AI, Unearth insights, Intelmatix, Punchcard*, and Taktile use LLMs and other related tech to help many different types and sizes of customers with decision-making. If a startup is able to build a generalizable product with a scalable GTM - so, not just another consulting company - they might be able to eat into some of the large consulting spend as well as the budgets of those who didn't use consultants in the first place.

Digital asset generation for work & for fun

Other types of generative models outside of LLMs (e.g. diffusion models) enable the generation of media outside of text like images, videos, and audio. Whether you draw portraits, edit videos, or make PowerPoint presentations for a living, the current state of generative models can likely help you become more efficient. Separately, if you thought you weren’t skilled enough to create images or songs at all, some AI-powered generation tools may convince you otherwise - similar to how Canva made graphic design more accessible for many non-artists years ago. Startups like Midjourney, Ideogram, Genmo, Tome, Playground, and Can of Soup help users create and share images for professional or personal use. Some may continue to build out enterprise features and challenge Adobe and Microsoft, while others may continue to build out social media engagement & e-commerce capabilities through network effects & ads. Video creators & editors - from Instagram stars to blockbuster special effect artists to L&D professionals - can speed up & reduce the cost of their work with products such as Captions*, Wonder Dynamics, Runway, Hypernatural, and Synthesia*. On the cutting edge of the current generative tech, short-form video (e.g. Pika, Genmo), music (e.g. Oyi Labs, Frequency Labs*, Riffusion), & 3D asset generation (e.g. Sloyd, Rosebud*) show promise, though the longer-term business plans seem less straightforward than those of the image generation and video editing companies. In addition, in some digital asset categories like audio/voice, it seems more challenging to maintain a differentiated product over time, especially as cloud providers expand their offerings. As a final note about this category, the legal and copyright issues are most pronounced here in comparison to other categories, as there are already many lawsuits alleging improper training data and unattributed output.

Personal assistant & coach

I’m convinced that we will eventually have the option for an LLM-powered assistant or coach for the majority of things we do, both at work and in life. I’d personally love a future where wearing some sort of AR device is somewhat socially acceptable, and my device can listen to me talk with a founder, fact-check the conversation in real-time, give me advice on how I could be more helpful or convincing, and automatically follow up for me after. In the meantime, tools to help with writing persuasive emails, navigating internal knowledge bases*, or automating common tasks in the browser seem appropriate and ripe for expansion. The current LLM tech can also already perform well enough to help learners, in school and out, with personalized educational solutions and conversations. Using large models to create compelling, seamless experiences on mobile is quite challenging given latency & compute requirements, which likely makes great products here harder to copy than meets the eye.

Generative model-related startup application ideas I’m less certain of

Some other SaaS replacements

In general, I’m more skeptical of any SaaS disrupter that doesn’t have a strong story against the incumbent and other upstarts in the space. To truly claim LLMs or generative models as the “why now” for a new startup, I’d prefer the existence of some sort of innovator’s dilemma, large product rework, and/or special unattainable resource (e.g. talent) that makes incumbent repositioning challenging. When in doubt, I go to a favorite book of mine, “7 Powers”, on how to build and maintain a moat.

Summary of “7 Powers: The Foundations of Business Strategy”

As an example, given how tech-savvy and agile document incumbents like Coda* and Notion are, it’s harder for me to believe that new directly competitive LLM-powered companies will be able to take away significant market share. In fact, they’ve both already added AI-powered features to their platforms, and it seems to have had a material impact on revenue/engagement from their existing users.

I hear pitches for AI as the main catalyst for new Salesforce-competitive sales platforms, BI & data science tools, CAD software, and basically any SaaS tool that you’ve ever heard of. In many cases, I don’t think that the timing argument for a new company is the current AI wave. I am generally optimistic about companies consolidating things like sales, BI, & CAD tooling and making it easier for various personas to use (Salesforce and Autodesk are challenging to use), but I believe AI is at most a feature for some, not the key differentiator or main reason for incumbent disruption. Nonetheless, AI will likely help all product experiences improve over time, and particularly AI-native teams may be able to use AI more effectively internally to build and sell faster than their competition.

Standalone general consumer search

In theory, the chat experience that made ChatGPT explode in popularity (reaching 100m users in record time) should disrupt the traditional model of ads through bidding on search keywords. However, whether it’s due to consumers’ lack of desire to change their ways, the inconvenience of incorrect information and difficulties fact-checking LLMs, or something else, even the Bing AI-powered chat thesis hasn’t played out in a significant way yet. In fact, Bing’s market share is lower in 2023 than in 2022. Moreover, given the importance of search to Google as well as their ability to train LLMs internally, I’d be skeptical that Google doesn’t continue to adapt, as they’ve finally rolled out some generative model features in search. Ultimately, I believe the incumbent’s distribution for general-purpose consumer search seems challenging to go up against, even with a counter-positioning argument and a better product experience.

Promising generative model-related startup infrastructure ideas

Running large models locally

If you imagine a world with all of the wonderful LLM applications discussed above, you probably assume some sort of infrastructure to run increasingly large personalized models on various edge devices - laptops, phones, cameras, sensors - with minimal lag. Today, if you tried to run Llama 2, the state-of-the-art open source LLM, as-is on a laptop, it would likely be impossible. However, if you use the GGML version of Llama 2, the problem largely goes away. GGML is an open-source tensor library for ML that optimizes (quantizes) models to run on CPUs instead of just GPUs, giving a massive boost in inference speed for a minimal accuracy tradeoff. Even Meta runs Llama internally by using the GGML versions and saves “a lot of money” as a result. GGML’s vibrant contributor community seems focused on building and optimizing GGML versions of popular & consistently improving models, perhaps with longer term aspirations to build a full-fledged edge inference framework. There’s likely an opportunity for GGML and/or another company (e.g. Ollama) to offer paid extensions or tooling on top of the popular related open source here given the growing interest in LLM local usage.
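For readers curious what “quantizing” a model means in practice, here is a minimal sketch of the general idea: store each block of weights as small integers plus one shared scale factor. To be clear, this is a simplified illustration and not GGML’s actual file format or kernels; the block size of 32, the 4-bit width, the 16-bit scale, and every function name below are assumptions I picked for the example.

```python
# Illustrative only: a simplified symmetric block quantizer.
# This is NOT GGML's real on-disk format or implementation; block size,
# bit width, and all names here are assumptions made for this sketch.
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed number of weights that share one scale factor


def quantize_block(values: List[float], bits: int = 4) -> Tuple[float, List[int]]:
    """Map a block of floats to signed ints in [-qmax, qmax] plus one scale."""
    qmax = 2 ** (bits - 1) - 1  # 7 when bits == 4
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quants = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, quants


def quantize(weights: List[float], bits: int = 4,
             block_size: int = BLOCK_SIZE) -> List[Tuple[float, List[int]]]:
    """Split the weights into blocks and quantize each block independently."""
    return [quantize_block(weights[i:i + block_size], bits)
            for i in range(0, len(weights), block_size)]


def dequantize(blocks: List[Tuple[float, List[int]]]) -> List[float]:
    """Recover approximate float weights from (scale, ints) blocks."""
    restored: List[float] = []
    for scale, quants in blocks:
        restored.extend(q * scale for q in quants)
    return restored


def compressed_bits(num_weights: int, bits: int = 4,
                    block_size: int = BLOCK_SIZE, scale_bits: int = 16) -> int:
    """Storage cost: `bits` per weight plus one `scale_bits` scale per block."""
    num_blocks = -(-num_weights // block_size)  # ceiling division
    return num_weights * bits + num_blocks * scale_bits


# Toy "weights" standing in for one row of a real model's weight matrix.
weights = [i / 100 - 0.5 for i in range(64)]
blocks = quantize(weights)
restored = dequantize(blocks)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

With these assumed settings, 64 weights take 288 bits instead of the 1,024 bits they would need at 16-bit precision (about 4.5 bits per weight), and each restored weight is off by at most half a quantization step. That trade, a much smaller model for a small rounding error, is the intuition behind why quantized models fit on laptops.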

Providing compute & software for model training / fine-tuning / inference

Everyone needs GPUs these days, and GPU cloud providers like Coreweave and Lambda Labs are exploding in growth. Startups and even investors are hoarding chips. Many companies wish to use LLMs and other generative models, but they want to own their own models and therefore can’t just call something like OpenAI’s APIs. Instead, these companies must figure out how to best utilize compute for initial training or fine-tuning of open or closed source models as well as for ongoing inference. Given the complexities of managing this infrastructure as well as the opportunity size (many companies would rather own their own custom models for quality, strategic, and/or security reasons), a variety of startups have popped up. Some seem more focused on inference (Modal*, Banana, Replicate, Runpod, Fireworks), others on fine-tuning (Scale AI*, Lamini, Automorphic AI), and others as more general purpose (Mosaic [acquired by Databricks], Together, MLFoundry, Hugging Face). I don’t doubt this is a valuable market to compete in, and some of these startups with proprietary tech can offer much easier-to-use and more cost-effective experiences than cloud providers. The winner(s) here may not just need to have a superior product & internal cost / margins, but also financial savviness and GTM excellence, especially if the competition continues to heat up. In addition, many “outside” forces – the demand for GPUs, the behavior of NVIDIA & cloud providers, and the macro for raising money to purchase or lease GPUs – probably have a significant effect on the outcome.

New ML framework and/or new chip

A particularly ambitious but exciting infrastructure category is challenging NVIDIA’s dominance in both a free ML framework (CUDA) and associated chip. Combining the shortage of NVIDIA GPUs, the general pain of hardware lock-in, and CUDA’s usability and latency challenges, a disrupter could theoretically break into the market. Both Modular and Tiny Corp are working on new challenger ML frameworks - the latter then plans to develop a competitive chip while the former seems to be commercializing some associated software. Many other startups are skipping straight to building chips for specialized LLM workflows.

Generative model-related startup infrastructure ideas I’m less certain of

Observability

Though observability is a well-defined term & large market in traditional software, attempts at standardizing & monetizing ML observability products have proven more challenging. From working with many potential buyers at Scale during the peak of the self-driving computer vision hype years ago, I found that the typical ML engineer’s attitude toward model monitoring tools was much different from their attitude toward labeling. ML engineers generally hate to manage any type of operational process like labeling, but when it comes to software for observing and correcting ML models, they tend to have stronger and more diverse opinions about what is best to build. I imagine this may be true of the current LLM wave, where many LLM application companies I speak to or work with build and maintain their own LLM monitoring. Many even consider it a competitive advantage - no time is spent communicating their ever-changing needs with vendors. For an external infrastructure startup to become successful here, it would need to build a customizable product that adapts to the frequently changing best practices that result from the fast pace of AI developments. In addition, the growing prevalence of “AI engineers” versus more traditional ML engineers of the past likely makes my personal Scale anecdotes less applicable.

Vector databases

Much has been said about the uses and benefits of vector databases - they are useful in retrieval-augmented generation (RAG), which is a standard approach for many LLM applications involving specific information retrieval today. However, as fine-tuning becomes more commonly used and context window sizes continue to grow, the urgency of having a highly performant vector database diminishes. Combine that with an already fierce set of startup competitors, incumbent solutions like MongoDB & Postgres, and perhaps cloud providers at some point, and it’s a tough space; new startups (especially those without a special advantage or platform play) face technical headwinds and a variety of adversaries.
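To make the retrieval step concrete, here is a rough sketch of what a vector database does at its core: rank stored embeddings by similarity to a query embedding and hand the best matches to the LLM as context. Everything below is hypothetical, including the three hand-written toy “embeddings”, the document names, and the prompt format. Real systems use learned embeddings with hundreds or thousands of dimensions and approximate nearest-neighbor indexes rather than this brute-force scan.

```python
# Illustrative only: brute-force cosine-similarity retrieval over toy vectors.
# The embeddings, document ids, and prompt format are made up for this sketch.
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical 3-dimensional embeddings and the text each one stands for.
INDEX = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.0],
    "careers_page": [0.0, 0.1, 0.9],
}
TEXTS = {
    "refund_policy": "Refunds are issued within 14 days of purchase.",
    "shipping_times": "Standard shipping takes 3 to 5 business days.",
    "careers_page": "We are hiring engineers in multiple offices.",
}

query_embedding = [0.8, 0.2, 0.0]  # stands in for an embedded user question
matches = top_k(query_embedding, INDEX, k=2)
prompt = build_prompt("Can I get my money back?",
                      [TEXTS[doc_id] for doc_id, _ in matches])
```

The scan above is a few lines of code, which is part of why I see this as a tough standalone category: the hard engineering is in doing this quickly over billions of vectors, and that is exactly where larger context windows and incumbent database extensions reduce the urgency.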

Privacy or quality-related middleware

There are a host of issues with current LLMs including data privacy, security concerns, hallucinations, bias, and even output format. A viable strategy for building a general-purpose model creation platform may be wedging in by solving one of these pain points that is particularly acute for a specific group of users, but I’d be skeptical of a large standalone company being built in the near-term without a compelling expansion strategy. I imagine the foundation model providers as well as companies helping their customers train & refine open source models will incorporate an increasing number of these security and quality guarantees, so fast-follower competition is likely once any startup shows successful traction. It’s also always challenging to predict when customers will generally really care about privacy / security (e.g. after regulation or industry scares), which can be risky if that’s the timing argument for a company’s growth. In addition, as vertical LLM applications continue to mature, companies could choose to use those, with privacy & quality guarantees, instead of managing foundational model APIs or their own models internally.

Thanks for reading! It’s always fun to theorize what the future holds, but I will conclude the post with the following saying, often attributed to Henry Kissinger and originally from the Spanish poet Antonio Machado:

“Traveler, there are no roads. Roads are made by walking.”

The only way to really know what will be the most incredible AI companies of the future is to try building them, and I can’t wait to continue to partner with awesome founders doing so at all stages. I would also love to hear thoughts & feedback on anything in this post - leighmarie@kleinerperkins.com.

* indicates author is an angel and/or current/previous institutional investor in the company
