AI Agents and Why They Are the Future
Everything you need to know about one of the hottest topics in AI
What are agents and agentic behavior?
If you read this Substack, you probably use ChatGPT and other LLMs, and you've probably noticed a big limitation: they can give you a lot of useful output, but there isn't much they can actually do for you. For example, ChatGPT can give you great, personalized restaurant recommendations, but it can't make reservations. As you saw last week, bots can recommend which products to buy, but they won't place an order. And in customer service, companies are still working to implement LLM-powered bots that can change your reservation and the like. (More on this below.) To get bots to actually do things, we need agents.
ChatGPT’s definition of agents is pretty good: “An agent is an entity that can perceive its environment, act upon it and learn from its experiences. Agentic behavior is the ability of an agent to pursue its own goals, interests, and preferences, while adapting to changing situations and feedback.”
I think about AI agents the way I think about human agents: a person or bot that does something on your behalf. Think real estate agents or booking agents (although bots today are a long way from doing jobs that complex).
We’re very early in terms of what’s possible, but some agents are already in the wild.
Examples of agents in action
ChatGPT already has some agentic capabilities. It can search the web, and it can write and execute code. Here’s an example of the first one. My wife is an art historian who studies a printmaking studio called Atelier 17. About four months ago, she came to me with a list of people who worked at Atelier 17 after WWII. She wanted to add country of origin, birth year, and death year to the list.
At the time, I tried to have ChatGPT do this, and it was able to search for the entries one at a time and find the info. But when I asked it to do larger batches, it would frequently time out. It got so frustrating that my wife ended up doing this manually.
Today, I tried again, and it worked much better. It took a fair amount of coaching, and it also stopped a few times. But it was able to do 20 entries at a time without much of an issue. It’s far from perfect but significantly more efficient than a human doing that task. Here’s some of the output after about 20 minutes of fiddling with it:
That’s great progress!
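If you’d rather script this kind of batch enrichment than coach ChatGPT in the browser, here is a minimal sketch using the OpenAI Python SDK. The model name, batch size, prompt wording, and placeholder names are all assumptions, and the model’s answers still need to be checked against real sources.

```python
# Minimal sketch: enrich a list of names in small batches with an LLM.
# Assumptions: the OpenAI Python SDK is installed, OPENAI_API_KEY is set,
# and the model name below is a placeholder for whatever is current.
from openai import OpenAI

client = OpenAI()

artists = ["Name One", "Name Two", "Name Three"]  # stand-in for the real Atelier 17 list
BATCH_SIZE = 20  # roughly the batch size that worked in practice above

def enrich(batch):
    prompt = (
        "For each printmaker below, give country of origin, birth year, and death year, "
        "one line per person, in the form: Name | Country | Birth | Death. "
        "Write 'unknown' where you are not sure.\n\n" + "\n".join(batch)
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

for i in range(0, len(artists), BATCH_SIZE):
    print(enrich(artists[i : i + BATCH_SIZE]))
```

The output still needs human review, since the model can confidently return wrong dates, but checking 20 candidate answers is much faster than researching 20 people from scratch.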
I’ll cover LLM-powered analytics in another post, but suffice it to say, ChatGPT and other LLMs can slice a database, calculate new analyses, and make charts.
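To make that concrete, this is roughly the kind of code a code-interpreter-style agent writes and runs when you ask it to “cut” a table and chart the result. The file name and column names here are invented for illustration.

```python
# Roughly what an LLM code-interpreter agent generates for
# "summarize revenue by region and chart it". File and column names are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv")                      # hypothetical uploaded file
summary = df.groupby("region")["revenue"].sum()    # the "cut": aggregate by a dimension

summary.sort_values(ascending=False).plot(kind="bar", title="Revenue by region")
plt.tight_layout()
plt.savefig("revenue_by_region.png")
```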
But that’s not all! Real companies are using agents to serve their customers:
Decagon is a startup that creates agents that translate customer support requests into actions, such as booking a flight, canceling a subscription, or issuing a refund. I’ve spoken to a company that’s using them, and they’ve been very impressed with what the agents can do. It’s still mainly designed for simpler tasks, but even in its current state, it could have a profound impact on customer service. (A sketch of how this kind of request-to-action translation works appears after these examples.)
Carrefour is a French supermarket chain that uses a conversational agent to help customers shop online. The agent can understand natural language queries, suggest recipes and ingredients, and add items to the shopping cart based on availability at a local store. The ability to take the “add to cart” action is a big step forward.
Pixii is a startup that allows companies to create ad copy and then manipulate it. By breaking the ad into elements, it’s able to ensure that the branding stays within guidelines and make it easier to modify specific parts of the ad.
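Here is a minimal sketch of the tool-calling pattern that action-taking support agents like these are generally built on. This is not any vendor’s actual implementation; the tool name and schema are hypothetical, and a real system would add authentication, confirmation steps, and error handling.

```python
# Minimal sketch of the tool-calling pattern behind action-taking support agents.
# Hypothetical tool; not any vendor's actual API surface.
import json
from openai import OpenAI

client = OpenAI()

def cancel_subscription(customer_id: str) -> str:
    # Placeholder for a real backend call.
    return f"Subscription for {customer_id} canceled."

tools = [{
    "type": "function",
    "function": {
        "name": "cancel_subscription",
        "description": "Cancel the customer's active subscription.",
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
}]

messages = [{"role": "user", "content": "Please cancel my subscription. My customer ID is C-1042."}]
response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)

# Assumes the model chose to call the tool; a real agent would check for a plain reply too.
call = response.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)
print(cancel_subscription(**args))  # the agent executes the action
```

The interesting part is that last line: the model doesn’t just describe what to do, it produces a structured call that the system then executes on the customer’s behalf.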
What’s next: Multi-agents and beyond
This is the state of the art in terms of agents in production. The next frontier is “multi-agent systems.” As the name suggests, these are tools that use multiple agents to solve problems, often more effectively than a single agent can. They split the work across bots with different directives or capabilities, and then aggregate those individual answers into a final answer. They will be able to solve even more complex problems than today’s single agents.
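As a rough illustration of the split-then-aggregate idea, here is a toy multi-agent sketch: each “agent” is just an LLM call with a different directive, and a final call combines their answers. The directives, question, and model name are assumptions; real frameworks add planning, tool use, and memory on top of this skeleton.

```python
# Toy multi-agent sketch: specialist "agents" are LLM calls with different directives,
# and an aggregator call merges their answers. Model name is a placeholder.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"

def ask(directive: str, question: str) -> str:
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": directive},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

question = "Should we enter the French grocery-delivery market?"

# Split the work across agents with different directives.
answers = {
    "market": ask("You are a market analyst. Focus on demand and competition.", question),
    "finance": ask("You are a financial analyst. Focus on margins and capital needs.", question),
    "risk": ask("You are a risk officer. Focus on what could go wrong.", question),
}

# Aggregate the individual answers into a final answer.
final = ask(
    "You are a coordinator. Combine the analyses below into one recommendation.",
    "\n\n".join(f"{name}: {text}" for name, text in answers.items()),
)
print(final)
```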
Rumors abound that the next generation of LLMs will have more agentic features, making it easier to send them on missions to do increasingly complex tasks. They may also make multi-agent processes easier to implement.
This could mean some exciting things for PE investors (and everyone else):
Investment Committee Bot: Ingest all the documents from a deal and be able to answer even complicated questions about it, including going off to new data sources and finding answers, and even casting a vote on whether a deal should proceed
Investor Co-pilot: A bot that can ride alongside a deal team, suggesting relevant analysis and then performing it. For example, it could hear the management team say, “We’re better than our competition in digital marketing,” and then immediately pull data from multiple sources and synthesize it into a scorecard to see if that claim holds up
Get Smart tool: When a fund considers a new company to acquire, an army of agents could spring to life, studying it from different angles, pulling in data from external sources and internal data from prior deals, and then synthesizing these varied outputs into a coherent view of the company
Keep an eye out for new agents, and feel free to describe any interesting agent behavior in the comments.
Full disclosure: I am an investor in and advisor to Pixii
Great post, Richard! Agree agents will unlock significant incremental value. One interesting outstanding question in my mind is to what extent we'll keep "humans in the loop" on agentic behavior (at least in early days) -- e.g., where can you build in human review or approvals to gain confidence in higher criticality tasks. This may be natural in some places (e.g., I've drafted this email, do you want me to send it), but also might limit the value in other areas (e.g., chatbot going back and forth with a customer can't stop to ask for approval for each incremental message!). In areas where it's not pre-approval, it could also potentially be post-action flags for review, in the spirit of improving agentic capabilities for the next time.