![[AUDIO] Vivek Kolli on Decentralizing AI: How Sentient Foundation Is Building a Community-Driven Future Artwork](https://www.buzzsprout.com/rails/active_storage/representations/redirect/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBCRXV5K0FnPSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--599cbeca41f112025adac623962785b4ee577901/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaDdDVG9MWm05eWJXRjBPZ2hxY0djNkUzSmxjMmw2WlY5MGIxOW1hV3hzV3docEFsZ0NhUUpZQW5zR09nbGpjbTl3T2d0alpXNTBjbVU2Q25OaGRtVnlld1k2REhGMVlXeHBkSGxwUVRvUVkyOXNiM1Z5YzNCaFkyVkpJZ2x6Y21kaUJqb0dSVlE9IiwiZXhwIjpudWxsLCJwdXIiOiJ2YXJpYXRpb24ifX0=--1924d851274c06c8fa0acdfeffb43489fc4a7fcc/vivek_kolli_square.jpg)
Block by Block: A Show on Web3 Growth Marketing
Each week, I sit down with the innovators and builders shaping the future of crypto and web3.
Growth isn’t a sprint; it’s a process—built gradually, step by step, block by block.
Let’s build something incredible, together. All onchain.
[AUDIO] Vivek Kolli on Decentralizing AI: How Sentient Foundation Is Building a Community-Driven Future
Summary
Vivek Kolli, Chief of Staff at Sentient Foundation, shares how the organization is working to build a decentralized, community-driven AI ecosystem. He argues that today’s AI landscape is dominated by a few tech giants, creating risks for how AI evolves and who it serves. Kolli outlines why decentralization and community involvement are critical—both for safety and for unlocking diverse, human-centered AI applications. He also discusses the challenge of monetizing open-source AI, the future of modular AI models, and how better data inputs and user experience will define the next generation of AI products.
Takeaways
— Sentient Foundation’s mission is to build a decentralized, community-powered AI ecosystem.
— Today’s AI is overly centralized in the hands of a few major corporations.
— Decentralization is essential for creating safe, human-serving AI systems.
— Community involvement will drive more diverse, creative AI applications.
— Open-source AI enables better user experiences and greater accessibility.
— The future of AI is modular, customizable, and user-friendly.
— Differentiation in AI products will come from unique data inputs.
— Sentient is focused on creating a better, more human-centered AI UX.
— The most valuable AI models will be empathetic and human-like.
— Community collaboration is key to driving AI innovation forward.
Timeline
(00:00) Introduction to Sentient Foundation
(01:28) The Problem with AI Centralization
(04:35) Building an Open AI Ecosystem
(09:32) Community Involvement in AI Development
(13:53) Decentralization and the Role of Crypto
(18:29) Competing with Major AI Players
(20:23) Data as a Competitive Advantage
(24:32) Modular AI Models and User Experience
(30:48) The Future of AI: Balancing Hard and Soft Skills
(35:12) Roadmap and Community Engagement
Episode is brought to you by Infinex.
Join here: https://app.infinex.xyz/?r=B2KSQJ77
Follow me @shmula on X for upcoming episodes and to get in touch with me.
See other Episodes Here. And thank you to all our crypto and blockchain guests.
Vivek Kolli, Chief of Staff at Sentient, welcome to the show. Hey Peter, thanks for having me. Yeah. Let me begin with the headline on the sentient.foundation website, and then I'd love to hear your explanation of what Sentient is for the audience that might not know. And then at some point in the call I'd love to hear about your background. We typically start with, you know, what is your origin story, but it kind of gets old. So what I've been doing lately is starting with the project, and then maybe you can weave in your background throughout the call. Okay. So on the sentient.foundation site, it says: "The Open AGI Foundation. Sentient Foundation is pioneering a new era in AI, empowering communities to create loyal AI: community-built, community-aligned, and community-owned. As a nonprofit committed to advancing open-source AI technologies and building a decentralized, transparent ecosystem, we champion an open AI economy where AI builders are key stakeholders. With AGI on the horizon, our mission is to ensure it serves humanity, not corporations." There's a lot to unpack there. I'd love your help in helping the audience understand what Sentient Foundation is, and all the buzzwords in that paragraph I just read. Yes, definitely. I think let's start with the problem we're trying to solve. Sentient came out of the idea that AI is probably the most powerful technology ever, and right now six to seven corporations control 90% of global AI development. To understand that the most powerful thing ever is controlled by six to seven entities is quite worrisome. So Sentient came about and asked, okay, how do we create an alternative to what we're calling closed AGI? You know, these hyperscalers that have all the resources and all the talent to just go and dominate AI, which they're doing: Google, OpenAI, Anthropic, xAI, all these closed-source labs.
You can see that there's a lot happening in AI, but it's happening from these main players. So the first iteration of Sentient was not necessarily creating an open ecosystem, but allowing people to build in an open ecosystem. The first idea started with: you have all these closed-source AI companies, and they can just give people a million dollars to come and work there. If you think about AI, it was 30 to 40 years in the making of open-source research and open-source work, and then a few companies took it the last mile and monetized it to the moon. OpenAI took so many years' worth of research, said they were going to be open, and then decided not to be open because they found out they could make a bunch of money. So the idea came mainly from an academic perspective: now you have this massive AI economy in terms of objective money value, but not a lot of people are able to develop in it, because very few people can break into these closed-source companies. What we wanted to create was a platform where anyone making AI would be rewarded for their efforts. How do we get tens of millions of developers to develop here, rather than being poached by the closed-source companies? Eventually that evolved into: if you can get tens of millions of developers building anything in an open-source ecosystem, then you have a massive quantity of stuff. And because you have a massive quantity of stuff, can we provide a better AI user experience to our users? That's what Sentient has evolved into today. We're still keeping that same premise: there are a bunch of developers who want to build in the open, and we can reward you, instead of you feeling that you have to go to OpenAI to make money from AI.
But if you build here, you can contribute to this massive pool of intelligence, and then we can provide a better user experience because there's just so much AI being built by these tons of developers. So I guess that's the high-level gist of the problem and the solution we're working towards. And I think when most people hear what you just said, they can get on board with it. I mean, it's really troublesome to hear that six or seven companies control this AI revolution. In fact, just last week we heard that Meta acquired 50%, or 49%, of Scale AI, which is arguably one of the more important picks-and-shovels AI companies in the space. And now Meta is generally in control of that, and the CEO left to join Meta. So these private companies with closed source, they're really black boxes; we don't know what's happening behind the scenes. And I think most people, when they hear what you just said, can totally get on board with Sentient. But is Sentient trying to compete head to head with these seven companies, or work alongside them? I wonder if there's opportunity for co-opetition. Yeah, so this would be my personal opinion. For AI to be truly safe, it must be built in the open. But the way open source is today, it's just very hard to monetize. I mean, we ourselves have gone through that problem. One of the first things we put out was the fingerprinting library, which was a step in helping people monetize open-source models. What this technology did, essentially, was let you create a model, release it open source with fully revealed model weights, and fingerprint it, which gives you some number of secret passwords: you can query the model with one of them and get a known response back, and that way you can identify that it is your model. What that does is protect people: say you release a model, it stops me from just downloading it, re-releasing it, and saying it's mine.
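The fingerprint check described here can be sketched roughly as follows. This is a hedged illustration, not Sentient's actual library: the function names, the key and response strings, and the substring check are all assumptions, and the step that embeds the key-to-response pairs into the weights (done via fine-tuning in practice) is stubbed out with a dictionary.

```python
# Hypothetical sketch of fingerprint *verification*. Assumes a model is a
# callable mapping a prompt string to a response string; all names and
# strings below are illustrative, not Sentient's API.

def verify_fingerprint(model, secret_pairs, threshold=0.8):
    """Query the model with secret prompts and check how many of the
    pre-committed responses come back. A high match rate is strong
    evidence the weights derive from the fingerprinted release."""
    matches = sum(
        1 for prompt, expected in secret_pairs
        if expected in model(prompt)
    )
    return matches / len(secret_pairs) >= threshold

# Toy stand-in for a fingerprinted model: two embedded key -> response pairs.
def original(prompt):
    return {"k3y-alpha": "resp-7f2", "k3y-beta": "resp-c41"}.get(prompt, "normal answer")

stolen_copy = original            # re-released weights answer identically

def unrelated(prompt):            # an independently trained model
    return "normal answer"

pairs = [("k3y-alpha", "resp-7f2"), ("k3y-beta", "resp-c41")]
print(verify_fingerprint(stolen_copy, pairs))  # True: copy carries the fingerprint
print(verify_fingerprint(unrelated, pairs))    # False: no fingerprint present
```

The key property is that the secret prompts look like noise to anyone without the list, so a copier cannot strip the fingerprint without retraining.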
Now I can't do that. So this was just one step toward what monetizing open source looks like. The problem with open source in general, as I said, is that it's hard to monetize. So I don't see a cooperative arrangement working out between, say, Sentient and OpenAI: one, from this open-source versus closed-source perspective, and two, why would they ever cooperate with someone much smaller than them? They're at the forefront, they have the best engineers, they're here to lead with their technology. So, at the risk of sounding quite bold, I would say yes, we are competing with them. And theoretically, the roadmap does make sense. Why I think it can work: one is the quantity of knowledge and developers we can amass in open source versus what centralized companies can do, and two is the research and work we're doing on routing the different types of intelligences. Take OpenAI, for example. They have fewer than 10,000 engineers, they work on their products only, and that's what they deliver to the end user. What we're saying is we can have tens of millions of developers, because in the open you hope that everyone will contribute agents, models, data, et cetera, into this massive intelligence pool. So one bet we're taking is that our pool of intelligence will be bigger than OpenAI's, because we'll have more developers working on it. And the other bet we're taking is that, since we have more stuff, we can offer more stuff to our end user.
And that part is this little thing we're calling the router model for now. If you go onto our platform and ask any query, we want the best router model to pick and choose which parts of this massive intelligence pool we've conglomerated should answer each part of your query. So the more stuff we have, the more we can do. That's the bet we're taking. Got it. And community-built AGI is such an amazing mission that I fully support it. Could we unpack, for the audience, the intelligence supply chain? Most people are familiar with the inference side, where they type some question into some interface and receive an answer. But if we go backwards from there, maybe you can explain all the major steps, and how Sentient enables this idea of community-built, community-driven at each step. Because there's the model stuff, there's the data stuff, and there's the data enhancement. Community-built could be included in each of these major steps, but I think most people are probably only familiar with the inference side. Yeah, definitely. It's an interesting way to look at it, supply-chain-wise, adding community to each section. So let's start with models. I guess we did this, in a way, with the first models that we released. We released a model we called Dobby, building off of this loyal AI messaging. The idea was a research proof of concept to show what loyalty at the model layer looks like. How can you align a model to believe certain things without just prompting? Prompting is an external band-aid you can put on a model, but we want to change the brain of the AI to believe in something that you believe. So as part of the community-built initiative here, we released two smaller models,
which we call Dobby Mini, and they had two different tones. Then we put them out to vote and let people try them for a week. They could vote on which tone and which model they liked better, and we used that model as the seed for a bigger model, Dobby 70B, a 70-billion-parameter model. So this is just a very small instance of what community-built can look like with models. I think Nous is now doing a community-built training of a model where you can provide data that the model is actually trained on. Eventually that's our goal as well. There are many, many biases that people have, and our thesis is that there's no way you can create an unbiased model, just because the data itself is biased. So you should create many, many models with biases, and those biases should be obvious to the person using them. Right? When I used ChatGPT six months ago, maybe eight months ago, it was maybe slightly less sycophantic than it is now, but you and I don't realize that until after we use the model, maybe months down the line. So one way to do community-built at the model layer is to pair it with the data side: what preferences are users providing, what data are they providing, to make the model truly theirs, truly for the community. Now the other side of it is the community-built aspect of this intelligence pool. I touched on it before, but the community can build all of these different applications and mold AI the way they want. That's the thesis behind our recent product, Sentient Chat, where we want to provide the highest level of intelligence but allow users to mold it the way they see fit, and then let other users use it. Take Perplexity, for example. Perplexity is great search, but you can't really mold that search to how you want.
You just use the product that they give you. With Sentient Chat, we also offer great search, but now we have a portion of Sentient Chat called the Agent Hub, and there are many, many agents on there with different capabilities, all working in conjunction with our search. So it's a way to show: hey, here's really good search, you can plug in your agent to work on top of that really good search, and then other people can use that agent. So that, I would say, is community-built on the capability side. And then, as you mentioned, there's the end step of inference, where a user just types in a query and gets a really good answer. That is a downstream effect of the community-built work done on the model side, the data side, and the capability side. So our goal is to have community-built throughout this whole intelligence pipeline, as you've laid out. Now, it's not said explicitly, I don't think, but decentralization of AI is probably one of the more implicit goals of Sentient, precisely because of the problem that these AI models are black boxes controlled by six to seven entities. How will crypto, as a coordination mechanism, help decentralize the AI models, and really the AGI, and help enable community-built AGI? How does that work with crypto? It's a good question. I guess one thing to establish is why crypto, or web3 slash decentralization, is even necessary for our mission. This is a question we get sometimes: okay, you guys are building this really cool platform that has a bunch of agents, and you're pitching it as the biggest pool of intelligence. Why can't Google just do that? They already have search; they can just get people to build agents and provide the same experience. Well, one, we don't want Google to do that, because they're already quite centralized and would just continue to dominate.
And two, if you let one entity, whether it's a company, a person, or even a country, own that whole pipeline, there is nothing to stop them from doing what they want. You kind of see it heating up politically now. Maybe three or four months ago, the narrative in AI turned from "look how great OpenAI is" to "wow, now it's US versus China, who's going to get AGI, and how do we make sure our country gets it?" Which is somewhat of a US-centric view, which we've talked about a lot here. But decentralization is very necessary for the best AI, let's just call it AGI. It's very necessary for AGI because if any one entity gets it, it's over. There's no going back; you can't stop it after. I think a lot of people are well aligned on this. There are varying degrees of where your probability of doom sits, whether you think AI will kill everyone or just ruin everything. But I think a majority of people can agree that you don't want AI to be owned by a single person, and decentralization is necessary for that. So, going back to the main point: how does it actually work? Well, I would consider decentralization as this openness I've described. Can we put out models with fully open weights, so people can verify that this AI isn't being developed behind closed doors with someone just telling me it's safe? Can I verify it myself, because you give me everything? That's one: the safety of it. The second thing is we think decentralization and this community-built approach will lead to a better product and a better user experience. When you have more and more people suggesting things, you get more and more stuff and more and more perspectives, and that will always end up being better than 10 people in a room deciding what it should be. So I think that's where decentralization fits in.
As a tangent to that: recently I had a conversation with one of our co-founders, and he brought up an interesting idea, going past companies racing to AGI. Countries are also invested in getting AGI first. The US is putting a lot of policy towards it, and I'm sure China is doing a lot as well. Once they get to it, whoever gets the best AI will also want to control it. It's the greatest advantage ever, and every piece of technology that's ever been good has always been militarized; that's the hottest thing happening right now with drones and military technology. So once a country gets it, how do we disallow them from completely censoring something? As a US citizen, I can give this example because it's freshest in my mind: imagine only having technology that carried Chinese sentiment and their political agenda. I would not want to use that kind of stuff. But I'm not suggesting that everyone should use US-centric AI. I'm just saying, again, this is the bias thing: whoever has the best AI can define what it is, but not how it should be. It should always be democratically defined. So that's what decentralization achieves: it stops censorship from these countries. Let's talk about... I'm really intrigued by these six or seven entities and how they're competing with each other. From my view they're competing in really two major places... actually three. Differentiating on the data inputs, then on the models themselves, and then on the inference side, where the consumer actually interacts with the AI. On the data input side, Google has access to Google Search data. I think Grok at some point will have access to Twitter feed data. And then I believe OpenAI has an agreement with Reddit, and so it has...
agreements to be able to use Reddit data. So each of these has really unique data inputs that the others do not have. But at some point, the readily available data that's already been labeled, enhanced, and fed into these models will probably stop adding to their competitive advantage. So I'm curious about your thoughts: first of all, do you agree with that? And second, how can Sentient compete on the data side? Because getting access to Twitter data is going to be very expensive, but may not be helpful since Grok already has access, so it's almost like competing for parity. And Google Search data, same thing. So what data types are out there that are interesting and available and help to differentiate Sentient versus its closed-source competitors? Yeah. Data is an interesting topic. One thing: I think you're right that eventually you'll reach this inflection point where we've maxed out all the data inputs. I don't think we've reached it yet. We've probably just gotten to the point of fully using public data, which is why all these leading AI labs now do deals to get secret and private data. As an interesting side note, it's unfortunate you didn't mention Apple, because they should be the leading AI company right now given how much data they have from iPhones. But I think they also made a deal with OpenAI to provide them all that data. Real quick on that point: I was actually mentioning to a family member about Alexa. Amazon has an amazing trove of data they could use, but Alexa is still one of the worst AI agents out there. She does not know how to do anything. I worked at Amazon, and it's really unfortunate, because they could be so far ahead, but they're just fumbling. They're dropping the ball. Anyway, that's just an aside.
That's so funny. You're right, I only use my Alexa to play music or set alarms. So yeah. And even then, I asked her to play a song the other day and it was the wrong song. So it's just like, what is happening? Anyway, go ahead. Yeah. But yeah, now these companies are realizing: okay, we've maxed out public data. We've scraped everything that ever existed on the internet; we've got every book in the database. Where can we get more data? And that's from private treasure troves. One example is hospital data: how do you get HIPAA-protected data, and can you make a really good health model? But the bigger question is how we compete with all these big companies that have already gotten all this data. One way is to compete on fronts they haven't yet discovered or prioritized. With Sentient Chat, our crypto data and the accuracy we give on crypto-related queries are better than Perplexity and ChatGPT, because we prioritized that use case. Then it becomes a game of tackling verticals one by one. The next obvious vertical for us is probably financial data, but obviously that's quite crowded already. At least for crypto, we can say we provide the best crypto data: the most accurate, the fastest, whatever you want. Any crypto query will do better than ChatGPT or Perplexity. And so what you're calling a vertical, I've also heard called domain-specific: law, health, crypto. Okay, got it. Exactly. But on the data side, you're right, it'll be a hard race to get everything. From the user side, though, one thing that can bridge this gap is the way we serve a user. Let's take OpenAI. They have this one massive model, and they feed it all this data, and now you have one big monolithic model giving you answers. So it's generalized: every query you ask it, every use case, will probably be 80 to 90% good.
What we're saying with Sentient Chat is we want to incorporate little pieces and build off those pieces to give you a final answer. There are little pieces of data, there are agents, there are models. And this router thing I mentioned, sitting between the user query and this pool of intelligence, decides which pieces to use and builds off of them, which to the user will look like a better experience than this massive behemoth model trained on all that data. That's how we bridge the gap: by using different pieces to give you a better answer. Got it. I wonder... when you go into one of these inference tools, there are different models you can use, and it's often not user-friendly, because I don't know what's contained in any of these models. So even though there's a dropdown where you can choose which model to use, you still don't know what the advantages are or the differences between each model. What you're describing is more of a modular approach: instead of one big monolithic model, it could be a bunch of little ones, and this router piece you're describing takes the pieces that probably meet the user's needs best, or something like that. Got it. Let me give you two examples. One, maybe a high-level example, is what we want to be able to provide the user: you come to Sentient Chat and say, here's my MetaMask wallet, I want you to access it, analyze my portfolio for my risk profile, then take out $200, research which top five meme coins grew the most in the last 24 hours, and invest across them to match my risk profile. That's a complex query, and it's really 10 very simple tasks put together. That is not available right now; autonomous executions are not available yet, but they will be, and we are prioritizing wallet integrations.
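The router idea sketched above, one complex query decomposed into small tasks with each task dispatched to a specialist agent, could look roughly like this. The keyword-based routing and the agent names are illustrative stand-ins of my own; a production router would itself be a trained model, and the agents would be real services rather than stubs.

```python
# Minimal sketch of a router dispatching sub-tasks to specialist agents.
# All names, capabilities, and the keyword matching are assumptions for
# illustration only.
from typing import Callable

AGENTS: dict[str, Callable[[str], str]] = {}

def agent(capability: str):
    """Register a function as the handler for one narrow capability."""
    def wrap(fn):
        AGENTS[capability] = fn
        return fn
    return wrap

@agent("wallet_analysis")
def analyze_wallet(task: str) -> str:
    return "risk profile: moderate"          # stub for a wallet agent

@agent("market_research")
def research_market(task: str) -> str:
    return "top movers: [redacted tickers]"  # stub for a research agent

def route(query: str) -> list[str]:
    """Toy router: map query fragments to capabilities by keyword.
    A real router would be a model scoring agents against the query."""
    plan = []
    if "wallet" in query or "portfolio" in query:
        plan.append("wallet_analysis")
    if "grew the most" in query or "research" in query:
        plan.append("market_research")
    return plan

def answer(query: str) -> str:
    # Run each selected agent and stitch the pieces into one response.
    return " | ".join(AGENTS[cap](query) for cap in route(query))

print(answer("analyze my wallet, then research which meme coins grew the most"))
# -> risk profile: moderate | top movers: [redacted tickers]
```

The design point is that adding a new community-built agent only requires registering a new capability; the router, not the user, decides when it is used.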
Now, what we're really good at right now is the half of the query about market research: we will provide you the best, most accurate market research available today. The other half is getting those autonomous agents working. But again, those are very small tasks, and in our intelligence pool there will be an agent that does one specific task very well. So in that previous example: ChatGPT, generalized, does all tasks 80 to 90% well. With this intelligence pool, you have one thing that does one thing 100% well, and you have a bunch of those things, so the idea is you get something that is 100% good across all tasks. The router model will pick: we should use this agent to access the MetaMask wallet; we should use this agent to research which top five meme coins grew the most in the last 24 hours. Little by little, each separate thing does a specific task, and to you as a user it's a very seamless experience where it just gets done. Another example, which actually exists today: we released a new model two weeks ago, accompanied by research showing what it looks like to use various models within one query. Just like you identified, having a router use different models for specific tasks. The beauty of Dobby was its human-centric tone. But the problem with making a model really good at talking like a human is that its performance completely degrades: it's not able to answer questions correctly, its knowledge sucks. We were able to strike a really good balance with Dobby. So this research showed, our training methods aside, how you can provide that human-centric user experience another way: use two different models in the background and have them respond to different parts of your query.
So if you want to focus on human tone, for example a sarcastic, funny kind of tone, and you ask, okay, what is the capital of India? The bigger model with really good knowledge will provide the answer, New Delhi. The smaller model with the human-centric tone will provide the extra fluff around it: "you dummy, it's New Delhi." So this is just a very simple example of how you can route different skills to different models. It's published in the paper, and that's the research we're bringing into Sentient Chat. That's pretty cool. I don't know if I shared this earlier, but for my master's thesis, I studied computational linguistics a long time ago at the University of Chicago, and I created a neural net that attempted to disambiguate word senses. If you've got a word like "bank," for example, it has many different senses: it could be the side of a river, or a financial institution, a building where they handle financial transactions, or a type of basketball shot, where you bank it off the rim. So I created this neural net, and by the way, at the time I thought it was pretty pioneering work; I used something called analogical modeling, or analogical reasoning. Anyway, I fed this neural net large corpora of data, then gave it a bunch of test cases of words with lots of different senses and asked it: in which sense am I using this word in this sentence? And the results were okay. Keep in mind, this was 15 to 20 years ago. And I'm realizing today, some 20 years later, these AI models are still struggling with word sense disambiguation. So on the one hand, AI has really advanced in many, many ways, but in other ways it hasn't advanced much at all. And so your example of two different models, where one has a really large knowledge base and gives you the truth,
and then another one that gives you the truth with some additional information and some human-like characteristics, is interesting. Anyway, your example brought me back to that time, and to how far we've come and yet haven't really advanced in some ways. In the very ways that define what a human is, AI has not advanced a ton. Yes, you're absolutely right. I'll send you the paper after; I think you'll enjoy reading it. But the main point of it was that we're taking a different approach to what good AI means. For the last, let's say, five years, good AI was whatever could program the best, do math the best, whatever had the best logic. And now we're starting to max out these hard skills, things you can train just on data. And this goes back to your data point, right? We're almost done with all the data. But we haven't really made advances on, okay, how do you make an empathetic model? How do you make a funny model? These soft skills that make humans human are what we think the best AI should have. The best AI is the one that's the most human. So it's really exciting work we put out on how to balance these hard and soft skills and start making AI more human-like. Yeah, I find it really cool that you mentioned funny, because I'm actually talking with a project that's trying to build a domain-specific model around comedy. They're hiring data labelers with backgrounds in comedy and theater to help the AI understand what funny is. It's a very interesting, hyper-specific, domain-specific kind of work. And I'm curious to see what the output will look like and, if it's successful, what it will do to the comedy industry.
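The knowledge-plus-tone split discussed above can be reduced to a tiny sketch: a large "knowledge" model supplies the fact, and a small "tone" model (sarcastic here, but it could just as well be the comedic persona mentioned) wraps it for delivery. Both models are stubs of my own invention standing in for real LLM calls; only the routing pattern reflects what the guest describes.

```python
# Hedged sketch of routing different skills to different models:
# one model for facts, another for human-like delivery.

def knowledge_model(question: str) -> str:
    # Stand-in for a large, factually strong model.
    facts = {"what is the capital of india?": "New Delhi"}
    return facts.get(question.lower(), "unknown")

def tone_model(fact: str) -> str:
    # Stand-in for a small model fine-tuned for a sarcastic, human tone.
    return f"You dummy, it's {fact}."

def respond(question: str) -> str:
    """Route the skills: knowledge from one model, delivery from another."""
    return tone_model(knowledge_model(question))

print(respond("What is the capital of India?"))
# -> You dummy, it's New Delhi.
```

Keeping the two concerns in separate models means the tone model can be heavily fine-tuned for personality without degrading the factual accuracy of the answer, which is exactly the trade-off the guest says single-model fine-tuning runs into.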
I mean, potentially it could make all of us funny, which would be a huge advantage for me since I'm not funny at all. I'll be using this thing all the time.

No, it'd be super interesting to see something like that. And I'm sure we will start seeing more people focus on skills that haven't been prioritized by the large AI labs. At the end of the day, it makes sense: the people building these models want them to be most useful to themselves. They're all engineers, so they think, how can I make a model that I can use? Well, we should make it code really well. No engineer says, I want to talk to a funny model; it doesn't help me work. So I think it's going to be really interesting. If you consider data on one side, there's all this accessible data, and it's all hard-skill data; you and I both agree we've pretty much maxed it out, and the hard-skill models are pretty much there. Then there's all this private data, still hard-skill focused, but that's a whole different issue of how we get access to it: legal documents, hyper-protected stuff, things like that. And finally there's a third data section of soft skills that we don't know how to define. We don't really know what makes someone funny or what makes someone empathetic. And since it's hard to define, it's hard to even put it in a model and train it to be empathetic or funny, because we haven't figured out how to do that with ourselves. So it'll be interesting to see what kind of experimentation leads there.

I think there's a fourth category, and I'd love your feedback on this. I'm seeing some projects that are targeting personal data, like your camera roll or your notes or your Microsoft Word documents.
It's kind of a hybrid between personal and private information. When I think of private information, I think of enterprises. JPMorgan has four petabytes of data, or some ridiculous number, that's not public to the world, and if they ever created their own internal AI, it would be very helpful and interesting to internal employees. But this is individuals and their private data, from their phone camera rolls or their interactions on whatever platform. What do you think of that type of data? If it's successful, it could greatly differentiate the model, or the model's output, because the data inputs are just super unique. But do you think there can be enough data to feed the model? It's just something I've been thinking about: how many camera rolls are there? What do you think about that?

It's interesting. I mean, I think the most ideal AI model, or interface, whatever you want to call it, is deeply in tune with who you are. And people do this piecemeal right now. For example, if you're using ChatGPT to generate essays or fix your writing, the first time you use it, it's some base case, and then you continue to give it feedback: hey, could you make it sound more like this? And ChatGPT is saving all this data. I think recently we verified that whether you say no or you say yes, it is still saving the data to train the model. I can see it now: six months ago, when I started having it revise my content, versus now, it's great. I don't even have to keep pestering it. I'll put something in and it will generate it exactly the way I want. So this is maybe a shade lower than the personal data you're talking about.
But if people had the option of giving someone everything on their phone and saying, can you generate an AI model that's personally customized to me, that will respond exactly like me, do exactly what I want in the way I want it, that understands how I talk and what I want, I think I wouldn't mind trying it out. I think that's kind of what people were expecting from Apple, because they had access to all the data. They have your camera, they have your texts; they could, on paper, do it. And this is maybe a pessimistic view, but whether you and I want to give it up or not, it will eventually end up in a model, because this train to AGI is not going to stop. They're all thinking, how do we make our models better? One way is through reasoning, et cetera; the other is through data. We have to get more data, data that we haven't touched yet. So it's going to be there eventually.

I'm curious, and it's probably more of an engineering question, but what's the inflection point in the data requirements of the model? Let's say we created some company where you can upload your camera-roll pictures, you give the AI permission to do whatever it needs to with them, and you get compensated for it somehow. How much camera-roll data would it need to actually do something useful? I thought about this the other day as I was doing research for this interview, and I don't know the answer, because relative to the available data that's on the open internet, I just don't know if there will be enough camera-roll data to train any AI.

Right. It's interesting. I think it's more about how specialized you want it to be.
You could theoretically train a model on one picture, and it can be the best at generating that one picture. Obviously, the more you give it, the better; that's the reason these AI labs are the way they are. They want to generalize and be 90 percent good at everything, which means you should take in as much data as you can. But look at some of the more niche models that do a really specific use case. Take ours, for example: Dobby is an 8B model. It's very small, but it does tone, and it does crypto alignment and loyalty very well, which you can't have in a really generalized model, because the data would skew it out. I don't have the technical answer; you should get Anna Kazlauskas from Vana on your next podcast. She has a really interesting take on model training and data; she's the data person. But yeah, this is all super interesting, about data and what it will take. I think the bet we're taking is that on the data side, we don't really care... I guess that's not the right way to phrase it. We do care about getting the most data, but we don't care if it comes from one fully centralized source. We think we will get more data from everyone contributing it than from us trying to go out and fetch it from every single source, crawling Google, going to all these different websites. So it'll be exciting with Sentient Chat, and we're excited to show what can happen with it.

I had a question about... sorry, I got distracted. Oh, Dobby. Can we talk about... right now it's invite-only, is that correct?

Sentient Chat is invite-only, yeah.

And is there a waitlist that people can sign up for?

Yeah, we put out a waitlist, I want to say right during ETH Denver, end of February, and it hit like 2 million users in a week. We kept it going, and now we're just fully focused on scaling up.
So honestly, I'm happy to give you 50 or 100 codes, and you can distribute them to your viewers.

Yeah, for sure. So I've been using it and testing it against ChatGPT and Grok and others, and I like it. I like that there are other options out there. And I believe Yup AI recently announced their tool, which, for any given prompt, gives you four answers, I think from OpenAI and Grok and ChatGPT and maybe one or two others. That's kind of interesting from a consumer's perspective: you can actually choose which answer best meets your needs, and that feedback then helps retrain the model. Anyway, that's an aside. But yeah, some invite codes would be great; I think the audience would appreciate that.

I think Yup is an interesting case, because it was bound to happen. People already use two or three models: they check ChatGPT, go to Claude, check that, go to Grok, check that, and see which answer they like best. And eventually someone on Twitter tells you, you should use ChatGPT for this use case, you should use Claude for that use case, and we just do that. But I think the most optimal user experience is that you just type in a query and it tests all of those answers in the background, but gives you the one best answer. It just knows what is best, what you would pick. That's the perfect user experience, and that's what we want to build with Sentient Chat. We have, I think, 12 agents live on Sentient Chat right now and 50 more in the pipeline. Imagine if you asked some kind of crypto research question and then we gave you four reports. That's great.
But what we want to do is give you a much better user experience, where you ask the question and maybe there are four different agents working in the background, but they're working together, and you get one consolidated report. To you, it will look like you're using a single unified model, but on the back end there are all these little pieces working together, doing their jobs very specifically and doing them well, and then combining it for you. So we'll see how Yup does it. I think it will all come back to what is the most seamless user experience.

I'm curious about the different angles here. For any given prompt, what does the actual on-chain information from Dune look like, as well as what are people actually saying about it? Sometimes those may directionally align; in other cases, something might be doing really, really well, but the sentiment's very bad. So I'm curious about how Sentient Chat sources the data it accesses to provide answers for people.

Yeah, you're right. These two are definitely very important: if we're going to claim that we're giving the best crypto answers, we should have really good data. So we have a partnership with Kaito. We're going to start integrating their data in the background, or it should already be up; we'll see. But definitely by, I want to say, mid-July or end of July, the Sentient Chat you're using right now will be completely different, upgraded in every sense: front-end design, UX, data, agents. So it's going to be really exciting. And then we have a data consortium where we're bringing in all these data partners. We brought them in originally to give users data related to model training. That's what we wanted to enable in the beginning: here's this platform where you can easily train models, fingerprint them, open-source them, and then make money if anyone uses them.
And here are these great data providers to give you data to train those models. But now we also want to include that data in Sentient Chat; I think we have 15 partners whose data we can include, across different domains. We'll have another data consortium announcement, probably the end of this week, with some new partners. But yes, everything just consolidates into this pool: data, agents, models. The more we have, the better it will be for you. That's the goal: to try and get everything good in crypto into Sentient Chat and provide users the best experience.

Well, that's quite exciting, Vivek. You mentioned the upcoming things around June and July. Maybe give us a sense of the roadmap for, I don't know, maybe the rest of 2025, and how people can get involved, especially if this is a vision and mission they can support. How can people support the mission?

Yeah, definitely. So we'll start with the roadmap. I think the next two months especially will be very exciting. We expanded our engineering team to really focus on Sentient Chat; everything is focused on Sentient Chat now. Three months ago, we had different threads of research and product, all trying different things and seeing what stuck, and Sentient Chat was being built in the background. Now everything is focused on Sentient Chat. Our research team is working on bettering our AI router. For example, right now in Sentient Chat, when you query something, it will decide whether it needs web search or not. That's an example of the router. And the most advanced version of that decides what agent, model, or piece of data to use to answer your query. So that's what research is focused on. Product is focused on completely redesigning the platform: improving the user experience, and integrating all the data sources that we have, the models that we have, and the more agents that we have.
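The router Vivek describes, deciding per query whether to invoke web search, a specialist agent, or just the base model, can be sketched as a minimal keyword dispatcher. The capability labels and trigger words below are invented for illustration, not Sentient's actual implementation, which presumably uses a learned classifier rather than keywords.

```python
# Minimal sketch of a query router: inspect the query and decide which
# capability should serve it. Real routers would score with a model;
# this keyword table is a stand-in for that decision.

ROUTES = {
    "web_search":   ("latest", "today", "news", "current price"),
    "crypto_agent": ("on-chain", "wallet", "token", "tvl"),
}

def route(query: str) -> str:
    """Return the capability label that should handle this query."""
    q = query.lower()
    for capability, triggers in ROUTES.items():
        if any(t in q for t in triggers):
            return capability
    return "base_model"  # nothing matched: answer from the model directly

print(route("What is the latest news on ETH?"))  # web_search
print(route("Show me on-chain TVL for Solana"))  # crypto_agent
print(route("Explain what a transformer is"))    # base_model
```

The design point the transcript makes is that this dispatch happens invisibly: the user types one query, and whether it fans out to search, agents, or data sources is the router's concern, not theirs.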
Like I mentioned, we have, I think, 50-plus agents in the pipeline, and now we're expanding past Web3 agents as well. We've canvassed the whole industry, and we're setting up a lot of efforts in SF to talk to some of these Web2 agents that can provide Web2 use cases, agents that exist on GPT or on these other platforms but aren't making any money. So come to Sentient Chat: if people use your agent, you'll make money. That's the product roadmap. By the end of July, there will be a very exciting launch of the whole redesigned platform. And that's the best I can give for the roadmap; nothing has been discussed after that. We're just fully focused on Sentient Chat.

On the other question, of how to get involved: one way, if you're a builder, if you're somewhat technical and you want to build an agent, is to bring that to Sentient Chat. We have a builder program that we announced a few months ago. We put a million dollars into this grant pool, and we want to fund any idea that's good. If you want to build an agent, build a model, et cetera, apply through the builder program; the link is on our website. And we just hired someone full-time to fully manage that and take in all inbounds. The other way to get involved is just to learn about what we're doing at Sentient and understand the problem, the one that seems like it doesn't exist but is fast approaching. Once you realize it exists, it may already be too late. Once you realize, wow, Google controls a lot, OpenAI controls the whole AI stack, well, we've already lost at that point; there's no taking that back. So join the Discord community. We have a thriving community in there; we just brought on a new community manager. Chat with people, learn about what Sentient is doing, and stay tuned. The next two months are going to be fun.

Awesome. Well, Vivek Kolli, Chief of Staff at Sentient.
Thanks so much for taking the time to speak with us and for educating us on Sentient and the amazing things you guys are doing to advance its mission of Open AGI.

Yes, sir. Thank you, Peter. Thank you for having me. This was great.

Thank you. Cheers.