![[PODCAST] Michael Sena: How Recall Network Is Creating a Reputation System for AI Agents Artwork](https://www.buzzsprout.com/rails/active_storage/representations/redirect/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBCQ3FSQkFrPSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--27cc87275fdc3c30b4f011c2a7f5f02eeb492adb/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaDdDVG9MWm05eWJXRjBPZ2hxY0djNkUzSmxjMmw2WlY5MGIxOW1hV3hzV3docEFsZ0NhUUpZQW5zR09nbGpjbTl3T2d0alpXNTBjbVU2Q25OaGRtVnlld1k2REhGMVlXeHBkSGxwUVRvUVkyOXNiM1Z5YzNCaFkyVkpJZ2x6Y21kaUJqb0dSVlE9IiwiZXhwIjpudWxsLCJwdXIiOiJ2YXJpYXRpb24ifX0=--1924d851274c06c8fa0acdfeffb43489fc4a7fcc/michael_sena_square.jpg)
Block by Block: A Show on Web3 Growth Marketing
Each week, I sit down with the innovators and builders shaping the future of crypto and web3.
Growth isn’t a sprint; it’s a process—built gradually, step by step, block by block.
Let’s build something incredible, together. All onchain.
[PODCAST] Michael Sena: How Recall Network Is Creating a Reputation System for AI Agents
Summary
Michael Sena, co-founder of Recall Network, outlines a vision for building the discovery and trust layer for the internet of AI agents. He introduces AgentRank, a reputation system modeled after PageRank, to evaluate and surface trustworthy agents in a future where agents interact, contract, and collaborate with one another. Sena emphasizes the importance of agent memory, human-in-the-loop curation, and economic incentives to ensure quality rankings. The conversation explores Recall’s current progress, including its testnet and agent competitions, while also touching on broader implications for marketing, creativity, and decentralized identity.
Takeaways
– Recall Network is building a discovery layer for the internet of agents
– AgentRank offers a reputation protocol akin to Google’s PageRank
– The AI agent ecosystem is rapidly expanding and interconnected
– Agents can delegate work to other agents, forming complex task webs
– Persistent memory is essential for agent personalization and trust
– Competitions assess agent performance and build credibility
– Community curators play a central role in surfacing valuable agents
– The protocol incentivizes accurate evaluations and reputational staking
– Subjective agent skills, like creativity, require human feedback
– AI agents are extending into many domains, not just finance
Timeline
(00:00) Introduction to Recall Network
(00:44) The Concept of AgentRank
(03:59) The Growth of AI Agents
(07:08) Understanding AI Agents vs. Automation Tools
(09:51) The Learning and Memory of Agents
(13:22) How Recall Solves Reputation Issues
(18:22) The Role of Community in Agent Evaluation
(23:23) Activating Curators and Community Engagement
(27:06) Michael Sena’s Background and Vision
(28:13) The Birth of uPort and Self-Sovereign Identity
(30:23) The Evolution of Recall and Its Mission
(33:28) Current Stage of Recall: Testnet and Competitions
(36:31) The Role of AI Agents in Marketing and Development
(42:14) Challenges in Evaluating Agents and Trust
(49:35) Rapid Fire Insights on Crypto Trends
--------
Episode is brought to you by Infinex. Experience crypto designed for humans:
https://app.infinex.xyz/?r=B2KSQJ77
Follow me @shmula on X for upcoming episodes and to get in touch with me.
See other Episodes Here. And thank you to all our crypto and blockchain guests.
Michael Sena, co-founder of Recall Network. Welcome to the show.

Hey Peter, thanks for having me.

I want to begin with the Recall Twitter account. I think it starts with a really good explanation of what Recall is, so maybe we can use that as a jumping-off point and then get into it: "Recall, trusted discovery for the internet of agents. Google's PageRank transformed the internet. It made the entire web accessible with trusted website recommendations. AgentRank is doing that for the internet of agents." So maybe you can complete that line of thinking. Tell us what Recall is and what this product is that you're building.

Sure. As you mentioned, we're building the trusted discovery layer for the internet of agents. If you think about it, the explosion of AI agents pretty closely mirrors the explosion of web pages on the early internet. Imagine back in the '90s you wanted to find something online. You had to go to AOL or Yahoo, or to random internet bloggers who kept a directory or repository of useful web pages. That's really how people navigated the early internet in the era before search. Obviously those curations were fragmented; they existed in a bunch of places. They were biased toward whatever the curator thought was useful for whoever was looking for whatever they were looking for. And they weren't based on anything real: they weren't driven by the amount of traffic those pages got, or by the quality of the content on those pages. When Google came around, they introduced PageRank, which was basically Google's search algorithm, and what it did was index and organize the internet according to website reputation, page reputation.
And so, instead of being at the whim of these curators with everything fragmented, Google created a system that could automatically surface the most reputable and most relevant pages to users based on whatever they were looking for. As the internet of agents arrives, and today you have tens of thousands, let's say hundreds of thousands, of agents that people are creating for a whole bunch of things, soon that'll be in the millions and billions, with the agent economy expected to grow to over $300 billion in the next nine or ten years. That network of agents needs similar infrastructure that lets people search for what they're looking for, say agents with crypto trading experience, or agents that are good at marketing, travel recommendations, cancer detection, et cetera, and have the results be based on real reputation. The main difference here is that a website's reputation was really defined by how many other pages linked to it, how much traffic it got, those sorts of things. If the New York Times and a16z and some other high-quality pages linked to your page, it gave your page reputation, so Google would rank it higher; when people searched, it showed up higher in the list. With agents, you obviously don't have that backlinking metric, but you do have performance. When people are searching for a trading agent or a cancer detection agent, they're looking for the agent that is the best at that skill. There are various ways to determine that, but that's effectively what Recall is providing. AgentRank is a reputation system for AI that enables better, trusted discovery, ultimately so users can find what they're looking for and be more successful in this new internet paradigm.
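The PageRank side of Sena's analogy can be made concrete. Below is a minimal power-iteration sketch of the classic algorithm: a page's score is the stationary probability of a "random surfer" landing on it, so pages with reputable inbound links rank higher. The link graph is a made-up toy, not real data.

```python
# Minimal PageRank via power iteration. A page inherits rank from the
# pages that link to it, which is the "backlinking" reputation signal
# Sena contrasts with agent performance.
def pagerank(links, damping=0.85, iters=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}  # random-jump mass
        for page, outlinks in links.items():
            if not outlinks:
                # dangling page: spread its rank evenly over all pages
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                for dest in outlinks:
                    new[dest] += damping * rank[page] / len(outlinks)
        rank = new
    return rank

# Toy graph: "nytimes" is linked to by both other sites,
# so it ends up with the highest rank.
toy = {"nytimes": ["blog"], "blog": ["nytimes"], "a16z": ["nytimes"]}
scores = pagerank(toy)
```

AgentRank replaces the inbound-link signal with competition performance plus staking, as described later in the conversation, but the ranking idea is the same.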
You know, I don't think most people are aware of the growth of AI agents online. Especially Web2 people, most of them just don't interact with agents, or maybe they do and just don't know it. Tell us about those numbers, because they sound astounding: a $300 billion agent economy in the next ten years.

Yeah, and I think those are honestly conservative numbers. Where agents differ from an application, which is an all-in-one monolithic program, Facebook, Instagram, Amazon are all pages or applications, is that on the agentic web, agents are really small. You can think of them as automations. And if you think of the number of automations that exist in the world today, bots or scripts that do things, there are so many. Look at some of the agentic products today that let users create agents pretty easily for certain tasks. I'll create an agent to automate my purchases of cryptocurrency: DCA me, buy coins over time, once a day invest a hundred dollars. That single action can be an AI agent. And then I can deploy or launch another agent to DCA another coin. So when you think of agents not as full applications, it's almost like what microservices did to backend infrastructure: taking these monolithic things and breaking them down into single functions that each serve a specific purpose. It's not far-fetched to believe there could be hundreds of agents per person if you're just letting them run all of your automation. So when we think of how this economy scales, the numbers become really staggering, in terms of all the things you might want to do online or in your business becoming fully automated away, with each of those automations run by an agent.
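The DCA example above is small enough to sketch directly. This is a hypothetical single-purpose "agent" in Sena's sense, one automation with one job; the class name and numbers are invented for illustration, and a real version would call an exchange API rather than take prices as arguments.

```python
# A single-purpose automation: dollar-cost averaging a fixed USD amount.
# One action, one agent -- the granularity Sena compares to microservices.
from dataclasses import dataclass

@dataclass
class DCAAgent:
    asset: str
    usd_per_buy: float

    def run_day(self, price: float) -> float:
        """Buy a fixed dollar amount at today's price; return units bought."""
        return self.usd_per_buy / price

agent = DCAAgent(asset="ETH", usd_per_buy=100.0)
# Averaging in at three hypothetical daily prices:
total = sum(agent.run_day(p) for p in [2000.0, 2500.0, 1600.0])
```

Hundreds of such agents per person, one per automation, is the scaling picture the numbers above describe.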
The numbers become even more astounding when you think of the exponential growth of the agent ecosystem once agents are actually contracting other agents to outsource some of those tasks. So maybe I have a personal assistant and I say, recommend a travel booking or a hotel, I want to go to Italy. It might find, on demand, the agent that is most skilled at knowing all of the hotels in Italy, to provide it with the input it needs to give me its recommendations. When agents actually start contracting other agents and outsourcing these tasks, that's when you see the really big explosion. Across all dimensions, users, businesses, and agents, it's an already large ecosystem today that's running a lot in the background, but tomorrow it will be powering the internet.

I forget the name of the Web2 tool I used that helped automate lots of disparate tasks... Zapier, that's it. I was a heavy user of Zapier: grab data, scrape something, stick it in a Google Sheet, make a report, do this, do that. When people hear of agents, they probably think of that. How are AI agents different from using a tool like Zapier?

This is a popular question, and one I think we'll start to understand better as agents improve. Today a lot of people call anything that's automation an agent, because it's trendy and your business gets more market cap if you say you're an agent. But I think of it in three stages. There are bots, which are just scripts: defined functions that take an input and generate an output. That's a closed loop, in and out, and the bot or script defines what happens in between. That's not an agent. Then at the next step, you have what I would call workflows.
These have a little more nuance and a little more autonomy, but they're not fully looping. You have a system that might pull from different sources, or might be able to automate some complex processes you don't explicitly define. It can learn what to do, but it does the intended action you tell it to: I want to do X, and it figures out how to do it and generates the output. I think of that as a workflow. That's not quite an agent either; the line there is a little gray. But when we talk about agents, we talk about things that can make decisions on their own. You don't have to explicitly define the output it gives you. Instead it can keep looping on itself to understand what it needs, the best way to do it, what the correct decisions to make are, and it generates that output. So they're autonomous thinking entities that can take in data, perform transformations and logic, make their own decisions, and chart their own path forward. I think of it in terms of how much you have to define the input, the process, and the output. With bots you define everything; you step into the middle with workflow automations; and then you step fully into the agentic world, where they're independent, autonomously thinking and reacting to the world around them, creating a loop between environment and agent action: it acts, rereads the environment, thinks of a new action. That's really where you get into the innovative use cases.

Two questions on that. The way I think of it is kind of like the difference between supervised and unsupervised. Supervised is, you tell it exactly what to do: grab data from this URL, stick it in this Google Sheet, that type of thing. Whereas unsupervised is, you give it an objective and it figures it out, and it learns as it goes along.
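Sena's bot-versus-agent distinction can be shown in a few lines. This is a deliberately toy sketch, not anyone's actual framework: the "environment" is just a counter and the "policy" is trivial, but the structural difference, fixed function versus observe-act loop, is the point.

```python
def bot(x):
    # A bot: a fixed input -> output function. The script defines
    # everything that happens in between; there is no loop.
    return x * 2

def agent(env, goal):
    # An agent: observe the environment, act, re-observe, and keep
    # looping until its own check says the goal is met. The loop,
    # not the caller, decides when to stop.
    steps = 0
    while env["value"] < goal:   # observe
        env["value"] += 1        # decide + act (trivial stand-in policy)
        steps += 1               # ...then loop back and re-read the world
    return steps

env = {"value": 0}
steps_taken = agent(env, 5)
```

A workflow, in Sena's middle category, would sit between these: some steps learned, but the output still fully specified by the user.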
Is that a decent way of summarizing what you shared?

Yeah, exactly. And I think your first example, that's Zapier, right? If you imagine the UI, you have to go into Zapier and say, take this column on this table and put it in this other thing here, and you have to define all of those mappings: this maps to this, and this does this. Whereas with an agent you're like, generate me the most returns on my crypto portfolio within seven days, you can trade any coin on any chain, go. And it does what it needs to do to achieve that objective.

That's amazing. When you think of an agent and its learning, one of the common questions people have, and I've asked it myself, is: when it's learning a thing, where is that memory, that information, stored? It tries something, hits a wall, and recognizes, okay, that didn't work, let me store that somewhere so I remember not to go that way again.

Well, it depends on the type of agent. On one extreme, you can think of OpenAI. OpenAI had that big announcement, probably a couple of months ago at this point, where they said ChatGPT will now remember you: it will remember all of your previous prompts and interactions, so the next time you come you don't have to give it all that context again. It has the context from its previous interactions with you and its thought processes, so your prompts are more efficient and it's more personalized. And that's where they see the stickiness of their product coming from, this memory retention. That's the moat they're building, because logic is sort of a commodity at this point. The models themselves are a bit of a commodity; it's about how much context you can give the model about a user to make it better for that user. So that's where OpenAI is, and they store that memory centrally.
But in the crypto world, or for other agent builders, there's a variety of places you can store memory. A lot of agents store memory locally, in a local database or something. With Recall, agents can store memory on chain; it's a feature of our network. So it depends on your use case. But being able to put memory someplace durable, so that if your agent shuts down, gets wiped out, or restarts, you can pull back in all that memory and boot it up from its last snapshot and keep moving forward, that's pretty powerful. It's also a good way for agents to collaborate on logic: one agent can learn from the logic of another to build multi-agent systems. So yeah, I think memory is a pretty key piece of this. There's no one size fits all, though; it really depends on your use case. But for sure, as models become commodities, memory, and the ability to amass a lot of memory about certain things so you can be as effective and efficient as possible, that's big.

You shared the very high-level introduction to Recall; let's get into the details. Maybe let's talk about a specific problem and how Recall solves it. The high-level problem is: this is going to be a very large market, and there is no reputation system telling you which agents will actually do the objective you want. There might even be malicious agents. So we want to create some kind of reputation to call those out, and possibly recommend the best agents for whatever you need. Tell us how that's working right now with the agents that are live. Maybe tell us a story.

Sure, so I'll go back to the crypto trading example.
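The snapshot-and-restore pattern Sena describes is easy to sketch. Here, JSON on disk stands in for durable storage; Recall's actual on-chain memory feature is not modeled, and the class and field names are invented for illustration.

```python
# Durable agent memory: snapshot learned state so a restarted agent
# resumes from where the previous run left off. A local JSON file
# stands in for whatever durable store (e.g. on-chain) you'd use.
import json, os, tempfile

class Agent:
    def __init__(self, memory=None):
        self.memory = memory or {"failed_paths": [], "visits": 0}

    def learn(self, failed_path: str):
        # "That didn't work -- remember not to go that way again."
        self.memory["failed_paths"].append(failed_path)
        self.memory["visits"] += 1

    def snapshot(self, path: str):
        with open(path, "w") as f:
            json.dump(self.memory, f)

    @classmethod
    def restore(cls, path: str):
        with open(path) as f:
            return cls(memory=json.load(f))

path = os.path.join(tempfile.mkdtemp(), "mem.json")
a = Agent()
a.learn("route-A")
a.snapshot(path)          # agent shuts down / gets wiped out here...
b = Agent.restore(path)   # ...and a fresh instance boots from the snapshot
```

The same mechanism is what lets one agent load another agent's snapshot, the multi-agent collaboration Sena mentions.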
That's where we focused first as our first market, because in the crypto world people are building social agents or trading agents; those are the two big first use cases. So if we focus in on the trading agent use case: everyone claims they're the best. Everyone says, use my agent, you'll generate 10x in a week and you can go touch grass. And there's really no way to verify which ones are actually the best. So what we do at Recall is evaluate agents for a skill, such as crypto trading, by putting agents head to head in competitions that are live and respond to real-world conditions, where these agents compete at that skill for a period of time. The network actually evaluates their performance and generates performance data: agents A, B, C, and D generated these results at this skill over this duration. Making agents run head to head in defined competitions, versus open-ended evaluations, solves for market conditions and dynamics: an agent trading this week versus an agent trading next week could generate very different results, so we control as many variables in the competition as possible. These agents compete head to head, which generates real performance results, and they do it over time. It's less about the results of one competition: they enter a second competition, and a third, and you can start to chart their performance over time, which helps you build confidence that those results are legitimate. They're not luck, they're skill. The more times you loop through that evaluation cycle, the more confident you feel in the result. And a real story: there's this agent, Moonsage Alpha, previously unknown, that entered our first trading competition and won by a small margin. It was a pretty competitive competition.
There were 25 trading agents competing head to head over a week, and I think it won by about half a percent P&L over the second-place agent. So that generated some signal; it built its reputation. Then that agent came back and competed in the following competition, with pretty much the same parameters: trade over seven days, optimize for P&L. And it won the next competition by something like a 10% return over a seven-day period, so its outperformance versus the pack in the second competition was way more impactful than in the first. What Recall does is take performance outcomes across competitions and create a score out of that called AgentRank. AgentRank factors in performance data. And we haven't launched AgentRank publicly in its full form yet, but it also lets the community stake on agents they believe will continue performing well in the future. If performance data is backward-looking, because all of these competitions already happened, community curation is a bit more forward-looking. When you combine those metrics, you get a pretty robust reputation score that's crowdsourced and verifiable, because all the results of these competitions and stakes are on chain. And that's how we rank agents per skill. With trading, we now have a list of 35 or 40 agents that have competed over two competitions and generated AgentRank data, and, in the absence of curation being live yet, we have community voting, which is effectively the same thing. We combine this data and rank agents on leaderboards, so users coming to look for crypto trading agents can see agents ranked by their AgentRank score. Conceptually it's a pretty simple algorithm.
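A toy version of the combination Sena outlines, backward-looking competition results plus forward-looking community stake, might look like the sketch below. Recall has not published the real AgentRank formula; every weight, the confidence term, and the normalization here are invented purely to show how the two signals could be blended.

```python
# Hypothetical AgentRank-style score: weighted blend of (a) average
# competition P&L, discounted until there are enough competitions to
# trust it, and (b) the agent's share of community stake.
def agent_rank(pnl_history, stake, total_stake, w_perf=0.7, w_stake=0.3):
    avg_pnl = sum(pnl_history) / len(pnl_history)
    # More competitions -> more confidence the results are skill, not luck
    # (the "loop through the evaluation cycle" idea from the transcript).
    confidence = len(pnl_history) / (len(pnl_history) + 1)
    perf = avg_pnl * confidence                     # backward-looking
    curation = stake / total_stake if total_stake else 0.0  # forward-looking
    return w_perf * perf + w_stake * curation

# An agent with two wins (0.5% then 10% P&L) and 40% of the staked signal:
score = agent_rank([0.005, 0.10], stake=400, total_stake=1000)
```

Ranking a leaderboard is then just sorting agents by this score per skill.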
There aren't very many variables, but it's the combination of the various factors that lets us surface the best agents for given skills.

That's really cool. Who enters these agents into the competition? Do they enter themselves, or the agent creator? How does that work?

I guess if you created an agent that was fully autonomous, it could enter its own competitions. But today it's the agent developers. So if I have built an agent, or deployed an existing agent and populated it with my strategy, I can enter it in competitions. And the community on the other side is really playing the role of curators. So we involve both sides: it's not just a developer platform for testing and ranking your agents, it also puts the community in the loop for curation, and ultimately they're the ones looking for agents.

So as you think about your go-to-market strategy, it sounds like a two-sided market, where you're trying to bring brand awareness to Recall among agent developers so they enter their agents into the Recall competitions, and on the other side are curators of these agents. In the specific example you shared, those would be traders. But at some point, if there's a different type of agent in a different domain, say healthcare, or X-ray data, then on the curating side these would be people familiar with it, maybe oncologists or something. Is that a weird way of thinking about it, or is that right?

No, I mean, that's right, except what we've seen is that people are playing the curator role even when they aren't experts in that skill set themselves. We found the people curating trading agents aren't only sophisticated traders. Over our first two competitions, we had over 450,000 votes cast for these agents.
So we found we're able to activate a pretty wide audience for these evaluations. I think depending on the skill type, certain skills and competitions lend themselves to a wider participation rate from the community than more niche things, or things that are just less exciting to pay attention to. So it'll vary by skill. But yes, it is the right way to think about it: the supply side of agents are agent builders, the supply side of curation are curators, and the demand side for all of this are AI consumers, whether they're users, businesses, or agents.

Yeah. Earlier you shared an example of a cancer-identifying agent. On the curation side, it occurs to me that not just anyone can be a curator of a cancer-identifying agent. Will you filter for expertise on that side?

The market should take care of that. We've been working internally on designing the economics in such a way that accurate curations and predictions, if they're early enough, are rewarded appropriately, and inaccurate curations, or curations that aren't useful, are slashed. So you really shouldn't have people, unless they're just fully gambling, curating things they know nothing about, haven't used, or can't vouch for at all. Think about when someone curates something: if you're curating it before there's any historical data about it, which basically means before it has competed in a competition, you have the most upside from a curation perspective, because you're providing economic signal that you believe this agent will improve its AgentRank score and climb the leaderboard without any data backing it. Maybe that means you've looked at its code. Maybe that means you know the developer.
Maybe it's whatever other reasons you have for making that curation. Other curators might curate already-proven agents. They'll capture less upside, but the agent will have a body of historical work they can look at: this agent has steadily improved its AgentRank score over its last ten competitions, it has a solid presence on social, I know someone who works on it, so I'm going to curate it. There are different reasons for curating depending on how early you curate, but it's the same incentive model: you're only rewarded for correctness, and incorrectness is punished. So you have the market deciding. People with true insight and true knowledge about these agents will probably curate early, and those that just want to free-ride, or follow and provide signal on already-proven things, will capture much less of that curation reward.

You mentioned there are something like 400,000 curators so far, which is an astounding number. So two questions there. How did you activate that many people on the curation side? And did you select for a specific audience? I guess the initial go-to-market is crypto traders, so are these primarily people interested in crypto trading?

Well, by being in the crypto space, there is a supply of crypto trading agents that we knew we could get on the platform, get tested, and have them start to build reputation. And on the other side, a lot of people in the crypto space, I would argue almost everyone, own at least some amount of something. So there's a natural incentive to pay attention; it's something they can relate to, something relevant to them. So one of our key decisions as we chose our go-to-market strategy was: how can we solve both sides?
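The reward-early, slash-wrong incentive Sena describes can be sketched as a settlement rule. To be clear, all of this is illustrative: Recall's actual parameters, slashing ratio, and earliness bonus are not public, so the function below is a hypothetical model of the mechanism, not the protocol.

```python
# Hypothetical curation settlement: correct curations earn a bonus that
# shrinks the more competition history existed when the curation was made;
# incorrect curations lose part of the stake ("slashing").
def settle_curation(stake, was_correct, competitions_before_curation):
    early_multiplier = 1.0 / (1.0 + competitions_before_curation)
    if was_correct:
        return stake * (1.0 + early_multiplier)  # earlier -> bigger reward
    return stake * 0.5                           # wrong: half the stake slashed

early = settle_curation(100, True, 0)    # curated before any competition
late  = settle_curation(100, True, 10)   # curated on a proven track record
wrong = settle_curation(100, False, 0)   # confident but incorrect
```

Under any rule shaped like this, gambling on agents you know nothing about is negative expected value, which is the market filter Sena is pointing at.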
How can we test agents and provide them reputation where there is a lot of supply, and also test agents that people will actually care about, skills they care about and understand? So we started with trading. And one of our key tenets in going to market was that we didn't roll out full curation from day one. We didn't make it economic from the start. We started with super simple actions people could relate to. In our first competition, we rolled out a voting system where, before the competition started, you could look at the agents that were competing, read their trading strategies, see who created them, and find various info about the different agents. And we let people vote in a few time windows: before the competition started, which is like early curators curating before these agents have competed, or after day one of trading, after day two, or after day three. So we let you vote over this period of time, and you got fewer points the later you voted, because you had more data available to make your curation decision. So we started super simple: an agent prediction game. Which agent do you think is going to win the competition? No economics, just voting, with some points incentives to reward accurate curations, which ultimately translate into the eventual economic model. And so we started to test various components of the curation system in super simple ways for the community, and I think that really resonated. In the second competition, we added some more bells and whistles. We put agents on different teams: five trading agents on Team ETH and five on Team SOL. It gave people who cared about those ecosystems a reason to care about trading agents, and they were repping their communities.
We saw a ton of fan art created. It was like e-sports for crypto trading agents; it was actually pretty wild. And so I think it's about that: finding easy entry points for people to engage, and learning what mechanics the product really needs. Then, as you start to roll out your economics and things like that, it's in line with both the needs of the community and the actions they can and want to take, and it can ultimately drive the data your product needs to become better. We spent a lot of time focusing on that, and I think the 400,000 wallets voting in the early competitions are a result of it.

Tell us about your background. What in your background makes you really well suited to co-found Recall and do something so innovative in both the crypto and AI space?

So I've been crypto native, building decentralized protocols, since 2017. I've been deeply thinking about verifiable systems, community-powered coordination, economic incentives. And I've gotten to see, over almost a decade in this industry, which is hard to even imagine at this point, what's worked and what hasn't. I joined ConsenSys in the summer of 2016; my job offer actually came the day of the DAO hack. I got an offer letter, and then the DAO hack happened, and I was like, does Ethereum even exist anymore? Is this going to work out? And it did. I co-founded a project called uPort, which was the first self-sovereign identity protocol built on blockchain, built on Ethereum. And since then, I've been in pursuit of verifiable data systems and identity and reputation systems.
What we were building with uPort was a way for a user to collect a whole bunch of data from a bunch of sources that collectively makes up a digital identity: certificates and attestations and verifiable credentials and NFTs and all sorts of things that make up your digital profile. Then we spun out of ConsenSys and started the company 3Box Labs. At 3Box we built Ceramic, which I guess is now integrated into Recall. It's a data streaming network; it enables you to build things like scalable, verifiable, decentralized databases, again in the sense that all data is signed by a user, so it can be attributed back to an identity, fully verifiable but more scalable than you might be able to do on chain. Over time, as we were pushing that forward, we'd been working alongside the team at Textile, who had been building very similar things for about as long as we had: databases and data networks and verifiable systems. Their team started as an ML company. We've always been in coopetition with them: we were both building similar products, it's crypto, we worked on standards together, and we were always in the same place at the same time. We finally decided it was time we just did something together. And when we put our heads together and thought about where the next step is, where the web is going, what needs verifiability and some notion of identity and reputation, AI stood out. There are going to be more agents than there are humans, maybe even more agents than web pages. When we started thinking about that world, which was rapidly emerging a couple of years ago, we knew we just had to build Recall. I've been on this trajectory since college; I studied political economics, so I'm always thinking in terms of game theory and economic design.
So yeah, it's been a lifelong mission at this point to make this a reality. It felt only natural that not only did we build this, we built it together with Textile. We merged companies, formed Recall Labs, and launched Recall. And I guess the rest is history.

That's cool. Tell us what stage Recall is at right now. Have you launched mainnet, or are you still on testnet?

Mainnet's not yet live. We've been on testnet for a few months now, running competitions. Just because it's testnet doesn't mean the competition data is not verifiable or valid; really, what testnet means is there are no live economics in the system. But the core mechanics of competitions, competing on skills, writing results on chain, building reputation, all still work. We've run a few internal competitions and two public competitions. Our next competition, for crypto trading, is coming up July 8th through 15th; it's another seven-day trading competition. We'll continue rolling those out and expanding the number of skills. Right now it's trading. We'll be rolling out predictions of different kinds, like sports betting and sports predictions, and taking action on prediction markets with agents, so agents that actually make predictions economically. We'll be rolling out a variety of marketing skills. That culminates in us launching a feature called skill pools, which lets users stake in pools for agent skills and lets them launch new skills. If there's enough demand staked in a pool for a skill, it kicks off competitions for that skill and starts rewarding agents for improving their AgentRank on it. So it's a way for the community to crowdsource skills and ultimately drive the development of AI toward the things they care about. That isn't live yet.
All these things will be coming out when mainnet is out, and that's the direction we're headed. Right now, we're really just nailing the competition format, nailing running competitions in this one skill. Then we start to expand the skills, we let people spin up their own skills and competitions, and we turn on all the curation. But that doesn't mean competitions aren't happening; they'll continue in the short term.

That's cool. This is really interesting. I met with a project called Fleek a little while ago; they're in the space of making it easy to launch agents. One of the questions I asked was: what trends are you seeing? What domains and skills are these agents working in? He said what was really interesting was that he'd expected developer-focused agents, but instead he was seeing a lot of marketing, copywriting, and video editing, the kinds of things a marketing agency would do. He saw a lot of agents being created in that space. I know right now you're focused on trading agents, but from what you've seen, is it something similar or different?

Definitely. And internally, we use a lot of them. Not only are we a company building AI infrastructure and protocols, we're actually an AI company, and we use tons of AI internally; it's one of the things we filter for in hiring. We use a ton of AI agents on our marketing side and our development side to make us all more efficient. I think the era of the multi-tiered organization with a bunch of middle managers is just not the future the world is heading toward. Now you're a much leaner team, contracting a fleet of agents that do various outsourced tasks on your behalf.
I think the real role of people in this world is evaluating the taste of an agent. Technically, an agent can be a good copywriter, but is it in the right tone for your brand? Is it the right style? Technically, an agent can make a design for an app that meets the feature requirements you outlined, but is it cool? Does it vibe? These are the things where it's up to the conductor of the orchestra, the person in the seat, to make those decisions, knowing which agents to contract for which things based on where they excel.

And I think this is another area for Recall competitions. Trading is a very objectively measured skill set. There are indisputable metrics you can use to evaluate the skill of a trading agent, P&L being the most obvious, but more advanced traders look at things like Sharpe ratio. Back in the day, I used to work on Wall Street, and those were some of the metrics we used. What was the value at risk you took to achieve that return? Did you full-port degen, 40x leverage your capital to achieve it? Or did you make a bunch of safe bets without much value at risk at a time? Those are two very different skill sets, and I would always pick the agent that's better at managing risk. But still, those are all objectively defined, measurable things. When it comes to marketing agents, content creation agents, creative agents, it's a bit more subjective, right? Who's to say which agent has better taste? How can you measure that? Recall competitions will expand to include more subjective skills, and that brings human curators into the loop even more. Right now they're curators; they'll also become evaluators of agent skills. Creative things do need humans in the loop for those evaluations.
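The risk-adjusted comparison Michael describes can be made concrete. A minimal sketch, not Recall's scoring code: the Sharpe ratio divides mean excess return by its volatility, so two agents with identical total P&L can rank very differently once risk is accounted for. The sample returns below are invented for illustration.

```python
import statistics

def sharpe_ratio(returns, risk_free_rate=0.0):
    """Per-period Sharpe ratio: mean excess return over its standard
    deviation. Annualization is omitted for simplicity."""
    excess = [r - risk_free_rate for r in returns]
    stdev = statistics.stdev(excess)
    return statistics.mean(excess) / stdev if stdev > 0 else float("inf")

# Two hypothetical agents with the SAME cumulative P&L but very
# different risk profiles.
steady = [0.01, 0.012, 0.009, 0.011, 0.01]   # small, consistent gains
degen  = [0.30, -0.25, 0.28, -0.26, -0.018]  # wild leveraged swings

print(sharpe_ratio(steady) > sharpe_ratio(degen))  # → True
```

Both lists sum to the same return, but the steady agent's Sharpe ratio is far higher, which is why a risk-aware evaluator would pick it.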
You can start to think about which agent excels at a certain taste for certain types of tasks, which agent is the best. It's a slightly different way to think about skills, because it's not clearly black and white; there's better or worse according to the taste profile you're looking for. And yeah, we're seeing a ton of marketing agents being created, automating all facets of the business. We knew AI was disrupting a lot of things, and among the first to go are the SaaS industry and the agency industry. You don't need all these different SaaS tools, big agencies, or big teams. You just need people who know which agents to hire for which things and how to sequence them.

You mentioned evaluators. Who evaluates the evaluators?

They're built into the network economics. The evaluators have to stake; they're auditors.

So, like the nodes?

Exactly. You can think of all the participants as staking value to do their job, which is to enable the AgentRank flywheel. Agents stake to compete, and earn if they do well. Evaluators stake to assess, and earn for correct assessments. Curators stake to earn from accurate curations or predictions. And then users can discover the best agents because these other participants are working together, secured by an economic system that rewards them proportionately for the value they add to AgentRank.

Is the end game for Recall to provide a user interface where I, as a consumer, say, hey, give me the best copywriting agent, and you present the ones that have been ranked based on these competitions, here are the top 10, and I select from them? Is that one of the outcomes you're trying to achieve?
Totally, and that interface can look a bunch of different ways, and it doesn't have to be built by us, because really we're providing that reputation data set, that trusted AgentRank data set, which can be implemented in a variety of ways. Maybe I want a clickable menu: I want finance agents, trading agents, the agents best on Solana, and I see a list ranked by AgentRank. Or it could look like search: I type "trading agents on Solana" and it ranks the best ones. It can be embedded in agent marketplaces or AI marketplaces, where the agents the marketplace shows and lets you hire are recommended based on their AgentRank score. We don't want to trap that value in a single interface, because the value it provides is far greater than a single UI. Really, it's about providing a trusted data set that can power trusted discovery of agents across a variety of platforms.

We started this conversation with the Twitter account Invoke PageRank. So let me share an example where I hacked PageRank and created a really healthy business out of it, and then two algorithm changes came, Penguin and Panda, which destroyed all the businesses hacking PageRank at the time. I'd love to hear what kinds of loopholes you can envision and how you're planning to tackle them, but first the example. A while ago, I had dozens of very domain-specific websites, and I hacked the system in a way that got me lots of really, really healthy backlinks, and I didn't buy any of them. As you know, PageRank is based on reputation, and they counted each backlink as a vote.
Some votes are weighted higher: a backlink from the New York Times is weighted heavier than a backlink from Michael.com or whatever. Anyway, a couple of these websites did well. They were all affiliate websites, and one of them, funny enough, was selling mattresses for an online mattress company. For three months in a row, this website was the highest-grossing site for selling those mattresses. At one point I sold like 23 mattresses, which is crazy, right? The way I built these: I'd do my SEO research, then create a bunch of really great content, publish it, get backlinks to those pages, and of course there'd be affiliate links throughout. Then Panda and Penguin arrived, and I forget which came first. One of the big issues was the low quality of the websites being presented to searchers, and the first update cleaned up a lot of the search results for consumers. Then the second one came along and slashed even more websites. After that, Google learned it needed human evaluators and created a system where websites would be presented to raters who gave them some kind of ranking. It was always, and still is, the case that there's a human in the loop; it's not entirely automated. So, having presented that, what problems do you envision Recall having as you build out this reputation system for agents, and what are you doing to alleviate them or stay ahead of the curve?

Yeah, two come to mind. Well, by the way, congrats on your mattress business.
That's impressive. Little-known fact: back in the day, I had a bedsheets business. So we were in the same market. Yeah.

So the first one I'll call out: it's pretty hard today to know what is an agent and what isn't. Really, what agents are doing is writing some chain-of-thought logic, how they got to a decision, and then publishing the decision, and that's what we evaluate in Recall. But there's nothing that inherently says those thoughts and outputs originated from an agent. Now, there is a way to solve that: ZK proofs. You can verifiably prove that a computation happened in a certain environment and publish that proof along with your results. I'd say that's on the roadmap to being solved in Recall. I really think of ZKPs as the other side of trust. A ZKP says, I can prove this thing did what it said it did; Recall proves how good it was. You need those two things to build full trust in an agent, because you want to know: can I trust it to do what it says it's doing, and is it the best one I can use? Is it better than the others? So you need both. That's the first piece.

The second one is economic curation. These are some variables we're playing with right now, because we let people curate agents, and that has an impact on an agent's AgentRank score. In the same way Google measured backlinks as votes, our algorithm measures performance. You can think of it as an X and Y axis: performance on the Y axis, and confidence or certainty on the X axis. That certainty score goes up based on the number of times the agent has competed and the amount of stake on that agent.
So we're figuring out the right place to cap the amount that stake can impact your certainty score, because ultimately the top-right quadrant is where you have high AgentRank scores: high performance and high certainty, which means high performance over time and a large amount of TVL backing the agent. The certainty variable is affected by community curation. It's super valuable when there isn't much performance data yet to build certainty in an agent: you can signal things early, give early agents visibility and the chance to actually compete with agents that have been there a while. But we're tweaking the variables and doing modeling right now to figure out the right parameters, to make sure it's not gameable and that the algorithm truly expresses what it's intended to express, which is high-quality, high-performance, trustworthy agents. So those are the two things we still need to work on: making sure it's an agent, and making sure the certainty function in the AgentRank algorithm has the right tolerances for curation.

I think including ZK proofs in the overall platform makes a ton of sense. At some point, we're going to be proving everything: proving that the data the agent is using is what they say they're using, that it's being used for what they say it will be used for, et cetera. That makes a lot of sense. When I think of this matrix you described, I immediately think of type two errors, where maybe there's a ton staked on an agent but it turns out the agent didn't perform very well. How do you think about that? And also the opposite, where the agent did really well but no one had confidence it would.
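The two-axis scoring Michael outlines can be sketched in a few lines. This is a toy model, not Recall's actual algorithm: the weights, the `tanh` saturation, and the `stake_cap` parameter are all invented to illustrate one way of capping how much stake can buy certainty.

```python
from math import tanh

def certainty(num_competitions, stake, stake_cap=1000.0):
    """Certainty (the X axis) grows with competition history and with
    stake, but the stake contribution is capped so curation alone
    cannot buy high certainty. All weights here are invented."""
    history_term = tanh(0.02 * num_competitions)    # saturates with experience
    stake_term = min(stake, stake_cap) / stake_cap  # capped at 1.0
    return 0.7 * history_term + 0.3 * stake_term    # history dominates

def agent_rank(avg_performance, num_competitions, stake):
    # High scores require both axes: performance (Y) and certainty (X),
    # i.e. the "top-right quadrant".
    return avg_performance * certainty(num_competitions, stake)

# A heavily staked newcomer cannot outrank an equally performant veteran.
veteran = agent_rank(0.8, num_competitions=100, stake=50)
newcomer = agent_rank(0.8, num_competitions=1, stake=100_000)
print(veteran > newcomer)  # → True
```

The cap is what keeps the system from being gameable by stake alone: past the cap, extra stake adds nothing, so the only way to the top-right quadrant is repeated measured performance.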
Well, those are both opportunities in our system, because it's a community-powered economic system. Break down the first case: an agent had a bunch staked on it and did poorly. Those are inaccurate curations, and those curators are penalized for providing that signal. But if it's a true open market and the agent had demonstrated a long history of performance up to that point, it's kind of a surprise, right? If it has a hundred competitions, outperformed in all of them, and then had one where it was really off, that won't slash people much, because there's a whole body of work; it doesn't really affect its AgentRank score all that much. It introduces some more uncertainty. But if an agent hasn't competed, has a bunch staked on it, and tanks, well, why was there a bunch staked on that agent to begin with? Kind of a do-your-own-research situation: there shouldn't have been, and people who incorrectly provide that signal will be penalized proportionately. The drop in that agent's rank score will be a lot higher than the drop for an agent that has competed in over 100 competitions, and the penalty will be way worse for the curators of the agent with no competition history.

Is a code freeze enforced during a competition? That way there's not an additional variable in the competition, because I can see a situation where the agent has done well historically, then a code change is made during the competition, and all of a sudden it's not doing so well.

Yeah, so we require agents to submit with versions. You can see the version history, and each version inherits the reputation of the previous ones. It's like App Store rankings: they don't go to zero when you release a new version of an app. It inherits the reputation, but you introduce a bit more uncertainty.
So if you start to perform worse, your score will drop faster than if you had stayed on your current version and performed a little less. The protocol has to take those things into account. Conversely, the case where an agent has little stake on it and does better is literally just the inverse. If you're an early curator, the only person who curated a previously unknown agent, and it way outperforms in its first competition, your rewards as a curator will be high. And if an agent that has proven itself over hundreds of competitions has little stake on it and outperforms, you'll still benefit, but not to the same extent as if you were curating an early agent. It incentivizes people not to just rest on their laurels and curate already-proven agents; there's some incentive to find new participants, because new things are being built all the time, and we can't just curate yesterday's champions, right? There's some recycling of skill that has to go through the system.

Yeah, and that presents new opportunities for curators.

Yeah, totally. Maybe a couple of rapid-fire questions before we end. First of all, thanks for sharing everything about Recall; it's super fascinating what you're doing. Okay, rapid-fire questions. Ready? Okay. What's one overhyped narrative in crypto?

Agent meme coins.

Okay. What is one underrated protocol?

Hmm. That's a good one.

Are you saying they're all overrated?

I mean, they're all pretty well known. I do think EigenLayer continues to be pretty innovative in a lot of what they're doing. I wouldn't say it's unknown, but they keep releasing new things that are pretty high quality.

You mentioned you studied political economy.
I'm curious what your thoughts are right now on the world and what's happening in Iran. How do you think about that, and about any potential prediction markets you could see popping up on Recall at some point?

There are definitely a lot of prediction markets about Iran on Polymarket these days. But I think the world is an interesting place right now, and it's really hard to diagnose what's going on. There's so much noise. You see all this propaganda and various things online, and you don't know what to believe is real. I actually find that the more access to information we have, the less we know, because it's hard to know what to trust anymore. So there are certain things predictions can be built around, and there can be competitions for truth-discovery agents, agents that compete to determine whether a thing is factual or false. That would be a cool agent, and a cool competition around it could provide real value to the internet in these sorts of crazy times. So, no direct comments on the conflict or the world, but I would say AI and narratives are playing more of a role than ever before in these conflicts and situations. The world and the internet are just going to become weirder and weirder, and sometimes it's a race of us against the machine. In these cases, we can use the machines to be on our team, to help us figure out what actually is real.

Okay, well, on that note, Michael from Recall, thank you so much.

Yeah, thanks, Peter. I enjoyed the conversation.