Ben Fielding & Harry Grieve: Gensyn – The Deep Learning Compute Protocol | Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies podcast

00:00

This is Epicenter, episode 471, with guests Harry Grieve and Ben Fielding from Gensyn. Welcome to Epicenter, the show which talks about the technologies, projects and people driving decentralisation and the blockchain revolution. I'm Friederike Ernst, and today I'm speaking with Harry Grieve and Ben Fielding, the founders of Gensyn.

00:34

Gensyn is an AI blockchain project that is looking to enable you to buy AI compute in a decentralized manner, and we will get to that in just a second. Before that, though, I will tell you about our sponsor this week. Our sponsor is Tally Ho, an open source wallet redefining the wallet as a public good. With Tally Ho, you can safely connect to DeFi and Web3 with everything you need from MetaMask, plus a lot more. You can view your NFTs in-wallet across Ethereum, Polygon,

01:10

Optimism and Arbitrum. There's also no need to manually add these networks; they already come plugged in. Tally Ho has the best Ledger support around, built by a community of developers that listen to users. Swap between assets in-wallet at a fraction of the price and conveniently view all of your account balances across multiple networks with our new and improved portfolio

01:32

tab. Currently, they're running a campaign called [unclear], a Layer 2 adventure that rewards users for exploring the Arbitrum ecosystem with Tally Ho from now until December 2nd. So hurry up: bridge funds to Arbitrum with one of their participating bridges and claim your trusty Space Dock NFT, plus be entered into a giveaway for a rare Blueberry Club NFT. Head to their blog at blog.tally.cash or their Twitter, @tallycash, for more info. Tally Ho isn't just building

02:06

a wallet that works. Tally Ho is building a wallet Web3 can believe in. Visit tallyho.org today to download the wallet and join over 150,000 people in signing their Community Pledge. Okay. Fantastic, guys. Hang on, let me order you a bit, put myself in the middle here. Fantastic. Harry and Ben, thank you so much for joining me. Hey Friederike! Thanks so much for having us. I think we said this on Twitter, but we've both been longtime listeners of the Epicenter podcast.

02:38

So really pleased to be here. That's so good to hear. Harry and Ben, tell me about yourselves. What are your backgrounds? And what did you do before Gensyn? Sure. So yeah, I guess my background's in machine learning research, mainly. So I did a PhD in deep learning focused on neural architecture search as a problem, which is essentially searching the space of deep neural network structures to find one that's most performant for a

03:08

specific task. So I did a PhD in that, finished that in 2019, and then moved into the startup world and co-founded a data privacy startup. So I've got quite a strong interest in individual data privacy, data sovereignty and things like that. Did that for a couple of years, and then joined an accelerator program in London called Entrepreneur First, which is where I met

03:32

Harry. And we kind of went down the rabbit hole of what we're building with Gensyn. Yeah, and on my side, my background's in applied econometrics, so kind of a fusion of economics and statistics. I was sort of introduced to machine learning during my postgrad, doing my master's degree whilst studying econometrics, and fell in love with it from there. I just thought it was so cool to be able to essentially quantify everything.

04:00

The next step for me was leading a data research team at an AI startup in London. So whereas Ben comes from more of a technical, academic background, mine's more on the applied side, commercially. I got to the point where I really wanted to build something in the space. I saw a lot of issues with respect to scaling and, yeah, joined the Entrepreneur First accelerator and met Ben. For anyone who doesn't know what the EF, or Entrepreneur

04:25

First, accelerator is, it's been described as Love Island meets Shark Tank. So you join as an individual and then you find a co-founder, and then they invest in you sort of pre-idea. Met Ben, shared a similar vision for the future of AI, shared a similar sense of humor. So yeah, the rest is history. So it seems like you both come from a fairly extensive AI and deep learning background. What moved you to marry this entire thing with blockchain? Good question.

05:00

It wasn't a sort of instant thing. It happened over a relatively long period of time, to be honest, and essentially it was technology-driven. So we knew we wanted to build massive-scale AI infrastructure, and as we were doing the research to figure out how we could make this the absolute maximum scale, we realized that in order to do that, you need to have a

05:24

trustless layer. Essentially, you need to be able to unite compute without having to do centralized onboarding of new providers, because at that point you end up with an administrative scaling limit, and we don't want

05:37

any limits. So we went down the road of verifiable computation research until we hit that block of: there always has to be a trusted third party, this judge or arbiter who, when you're checking a computation, makes a considered decision on whether something's been done correctly. Blockchain represents a way to break that and do it by consensus

06:00

essentially. So a large group of people can do it without having to nominate a single person to make the decision. And that was the light bulb moment for us, where we said, this has to be the next step for AI to get the scale that we want, like planetary AI scale. There has to be this consensus layer introduced, and blockchain's the way to do it. Before that, interestingly, we were kind of blockchain skeptics to an extent.

06:24

We hadn't dived into the space before. We'd sort of taken the typical technical path of saying a read-only database can do the same thing, therefore I won't dive into it. But I know for me personally, realizing that trust layer was an absolute light bulb moment. It was when I realized the actual power behind it and got very into the space. Yeah, interestingly, Ben and I shared a lot of the ideals that you see championed in the wider

06:50

decentralization scene. So we were both very large free speech maximalists, and a lot of the censorship stuff that we saw with Snowden and things like that, we bonded over prior to even talking about blockchain.

07:05

So it almost felt like, obviously, we should have started in the blockchain space, but we didn't. Interestingly, right before making the switch, we were trying to do federated learning, which is an area of deep learning where you train lots of models across distributed data and then combine them to create a kind of meta model that can learn from all the data sources, and we were doing that with banks. So the realization, for me at

07:31

least, was that there's a much bigger problem with accessing compute, or essentially just the processors on which the models could be trained, and to do that, you need a decentralized method of trust, and that's basically a blockchain. Okay. So basically it's the platform and decentralized incentive layer that did it for you in terms of moving this to a blockchain. Yeah.

08:00

Let's talk about AI first before we go into what Gensyn exactly does as a blockchain protocol, because most listeners of this podcast will be familiar with blockchain to a certain extent, but AI is not so much our usual cup of tea. So let's talk about the state of AI today. As an outsider, it kind of seems like it's totally on fire. I mean, with GPT-3, and GPT-4 apparently going to come out soon, I think, and then things like DALL-E, I mean, it's just

08:38

completely mind-blowing. And can you guys talk about the advances in AI in the last couple of years? Absolutely. Yeah, I think it's interesting being in the kind of AI space and watching this explosion happen because the sort of AI and machine learning space, over the past seven years I guess has basically been a series of mini explosions. So this one is just kind of a

09:02

next one in the sequence. But I think to the wider world, it's one of the first times they've seen it actually create real impact, and create applications that people see the value in, essentially. But yeah, I think deep learning fundamentally has been the big change that's enabled all of this. When I first started my PhD, the deep learning explosion was just happening. It had just started; it had just taken computer vision as an area by storm.

09:29

They'd shown that, essentially, using a deep neural network, you could blow away all of the benchmarks set by manual computer vision methods in the past. So, I'll try it very, very briefly: computer vision before that used to be manually defining sort of filters over images and then figuring out how to detect lines

09:49

and things. And then you would have to define this filter to detect the kind of line that you're looking for and the textures that you're looking for, and it's a very manual process. Deep learning essentially just came on the scene and said we can do all of this straight from the data, and that was such a huge change.
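As a rough illustration of the manual approach described above, here is a minimal sketch of a hand-defined filter being slid over a tiny image to detect a vertical line. The kernel values (a Sobel-style edge detector), the `apply_filter` helper and the toy image are all assumptions chosen for illustration; they are not taken from the conversation.

```python
# Hypothetical illustration: a hand-designed 3x3 filter that responds to
# vertical edges (dark on the left, bright on the right).
VERTICAL_EDGE_KERNEL = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def apply_filter(image, kernel):
    """Slide the kernel over the image (no padding) and return the responses."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    output = []
    for r in range(rows):
        row = []
        for c in range(cols):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A tiny grayscale image: dark (0) on the left, bright (9) on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

response = apply_filter(image, VERTICAL_EDGE_KERNEL)
for row in response:
    print(row)  # large values mark where the dark-to-bright edge sits
```

In a deep learning model the numbers inside the kernel are not written by a person; they start out arbitrary and are adjusted from data during training.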

10:03

It took away all of that expert knowledge that was required and just allowed somebody with enough compute to design a relatively simple model, apply it to a very large amount of data and then just have the outcome that they

10:15

want. What we're seeing now is essentially the building on top of that: building models that can do even more and then, crucially, getting them to the consumer or to the developer who doesn't necessarily know the specific problem. That's been going on for years, but Imagen, DALL-E, GPT-3, etc. have really fast-tracked that. I don't know if you want to speak to some of the deep learning stuff as well, Harry. Yeah.

10:39

I think whenever we talk to crypto crowds about it at conferences, we always do a kind of sharpener around the distinction between three terms: AI, machine learning and deep learning, because they're used essentially interchangeably but they're quite different. The best way to think about it is as a series of circles, like a matryoshka doll almost, where on the outside

11:01

you've got AI. And AI, by the loosest definition possible, and many people will disagree with this, but the loosest definition is it's just programming a machine to do something. So, you know, a washing machine is, in a sense, a narrow version of artificial

11:15

intelligence. You tell it to do something and it programmatically does it, or it works out how to do it. Machine learning came onto the scene much more prominently in the 90s and the early 2000s, wherein, instead of having, as Ben said, expert systems where you say, you know, "if this, then this", you use data to

11:34

essentially work out the probability with which a certain decision will be made. Deep learning takes that concept but allows different concepts to be modeled with much more fidelity. So it has hierarchical feature representation, which means that, in the way that the model works, different parts learn different things.
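The contrast drawn above between an expert-system rule and working out a probability from data can be sketched in a few lines. Everything here is hypothetical and chosen only for illustration: the spam-filter framing, the function names and the four labelled messages are assumptions, not something discussed in the episode.

```python
# Expert-system style: a person writes the "if this, then this" rule by hand.
def rule_based_is_spam(message):
    return "free money" in message.lower()


# Machine-learning style (in its simplest form): estimate from labelled
# examples how often a message containing `word` turned out to be spam.
def estimate_spam_probability(examples, word):
    with_word = 0
    with_word_and_spam = 0
    for text, is_spam in examples:
        if word in text.lower():
            with_word += 1
            if is_spam:
                with_word_and_spam += 1
    if with_word == 0:
        return None  # the word never appears, so there is nothing to estimate
    return with_word_and_spam / with_word


# Hypothetical labelled data: (message, was it spam?)
examples = [
    ("Free money inside", True),
    ("Claim your free prize", True),
    ("Are you free for lunch?", False),
    ("Meeting notes attached", False),
]

probability = estimate_spam_probability(examples, "free")
print(probability)
```

The hand-written rule never changes unless someone edits it, whereas the estimate shifts as more labelled examples arrive. Deep learning replaces the single counted word with many layers of learned features.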

11:56

For example, the classic example is if you want to recognize handwritten letters: a neural network typically pushes the image through lots of different layers. Each layer will pick up something like, oh, this number has a closed loop in it, or has a stem, and then over time, over lots of computational cycles and lots of tweaking, the model will be able to generalize any new image

12:21

it sees to one of these categories, you know, a number between 0 and 9. So that's basically the distinction between AI, ML and deep learning. Deep learning is where you see all the big breakthroughs coming in. So all the things you mentioned then, GPT-3, DALL-E, also stuff like Stable Diffusion, all that's deep learning. And the story for deep learning over the past, I guess, you

12:44

know, I guess since, like, 2016, 2015, has been transformer models, which are a specific type of deep learning model that have been very useful for things like large language modeling. I think what's crucial as well, as a more social point, is: if you told people, you know, at the beginning of the 2010s that they'd be able to essentially generate a comic book with really convincing art, just from a series of text prompts,

13:12

I honestly don't think most people would believe that's possible, particularly at consumer grade, for a normal person just to be able to type text prompts to create a comic book. In the next few years, the same order of magnitude jump is going to

13:25

happen. So in the 2020s, the ability to sit down in front of, say, Netflix, and instead of picking a movie which has been pre-made, you simply enter a text prompt, and you're like, you know, I want to see three technologists talking on a podcast for an hour, you know,

13:44

about AI or something. And with other prompts, and maybe a set of initializations, you'll be able to generate an entire movie, which you can then steer, or maybe take to a different place if you want. Or, as a final point, maybe you have the same story but you can change the genre of

14:01

the story. So you could turn something like, you know, I don't know, Halloween into a sci-fi movie, or you can change Jurassic Park into a love story or something, all by using the same script but changing the rendering. Lots of exciting things coming, in my opinion. Can you talk about the paradigm shift

14:21

behind this? So basically, I mean, if you look at old-school programming, it's a lot of deterministic "if this, then that" and so on. And in my understanding, and admittedly this is a very lay understanding, you use some sort of neural network with, you know, complex connectivity. And where exactly is... I mean, do people exactly understand how decisions in a neural network are actually reached? Is this something that can, could

14:54

you kind of transport this? But I mean, obviously, you don't use real neural networks, right? So basically everything's in a regular computer; you don't have to go to, like, the bio lab, although that would be... Yeah, I'm not sure whether that would be terrifying or fun. Um, but basically everything's

15:16

anyways in a computer. So basically you're modeling a different system that's more interconnected and more flexible, and maybe you can qualify how the system you're modeling with your regular computer is different from just, you know, giving the computer prompts. I mean, yeah, sure. I think the black box nature of deep learning models is just down to the absolute size of them.

15:49

At the end of the day, you're still tracing a path through a series of decision points in the network. It's just that the path is absolutely enormous, and it's hard to link the weights or the parameters within that model down to exactly why they're that sort of value.

16:05

Because they've come to that value after being fed millions of samples, and deterministically, you could do that, you could track every single update, but the size of data that you would end up generating would be absolutely enormous. I think there are sort of two things that I see happening as we go through this. One is the black box nature is sort of falling away

16:29

a little bit as we start to understand more and more about the models that we're building. Deep learning as a research area has sort of gone through an interesting fast period where there's been a lot of experimentation that wasn't driven by the sort of theory. It was more driven by seeing what we could get out of it.

16:49

So we throw more data at it. We try out new architectures and we just see what happens rather than starting from first principles and designing this thing and knowing exactly how it works. So, it's been that kind of exciting period where everything has been very black box.

17:02

I think a lot of the gains that happened there are sort of starting to slow down a little bit, and we're seeing people revisit those architectures and sort of check and say, why does this work so well? Let's dig into it and let's prove it out. So in some ways that curtain is lifting.

17:18

The other thing that's happening, which is a bit more controversial, I guess, is the shift in people's perspectives as to whether a computational system needs to be fully deterministic or whether we can live in a probabilistic world. We live in a probabilistic world as people. The self-driving

17:35

cars example is probably the clearest, where, when we're driving around, we accept that there are stochastic events that happen and that there can be small accidents and there can be issues. But with a self-driving car system, we don't accept that at all, and we say that this has to be a fully, completely deterministic process.

17:52

I think one of the challenges that the self-driving car industry has had has been an assumption that people would just accept that probabilistic mechanism applied to self-driving cars, and they haven't, but I think that will change. And that's probably the controversial bit, as we as a society go towards actually allowing probabilistic computational systems to exist alongside us. Not sure if it'll be an easy road, but I think it'll happen.

18:17

Yeah, thank you. Before we dive into the current landscape, there's one term I have come across often in preparing for this episode. Maybe that's a question for Harry, because you already talked about the different kinds: machine learning, deep learning, artificial intelligence. So basically there's this term of artificial general intelligence. Is that different from the three terms you already talked about? Yes. So it's a term which was popularized,

18:47

I believe, by Ben Goertzel, who's an AI researcher and entrepreneur. The idea of AGI is similar also to the singularity, so it's the idea that you get human-level intelligence from a machine. So you have right now what you might describe as artificial narrow intelligence, whereby machines are good at doing certain tasks. So, for example, machines are very, very good at detecting certain types of cancer from medical scans. Right, so pattern recognition? Yes, yeah.

19:20

But scaling that up to general intelligence, whereby a machine can be good at doing a task which is maybe simple to humans but actually quite difficult to reflect in a computational, like, prediction space. Can you give an example? Yeah, a good example would be a machine being able to walk through a crowded area in a

19:40

smooth way, whilst being able to essentially make discrete assumptions about all the inputs around it. It's one of the reasons that, I can't remember the level of driverless cars, I think it's like maybe level 10 or something.

19:54

It's one of the reasons that driverless cars do really well on the motorway, because it's a problem which to humans might feel quite complex, but it's quite a simple sort of mathematical problem because there's not much variation. But when you take that same car and you put it in a city street in Rome, you know, going over cobbles, [unclear] walking out in front of everything, it becomes extremely difficult.

20:18

So, yeah, some of the stuff which we think is really difficult, like being really good at chess,

20:24

is actually quite easy for a machine, but some of the stuff that we think is really easy, like being able to walk down the street or, you know, I guess certain things in conversation, like looking at someone's entire body language and everything someone's saying and being able to draw an emotion from that... A machine might be good at various things like estimation of, you know, how

20:48

someone's sitting, but combining that all together and making a decision is quite difficult. So yeah, artificial general intelligence basically means a model, or a set of models, or a system which is able to essentially be as good as humans at everyday tasks. Critically, the advent of AGI leads to artificial superintelligence, because it follows that once a machine has mastered everything a human would reasonably do,

21:17

its rate of marginal mastery over tasks moves a lot faster than humans', as a function of both the complexity of its model and the amount of compute available to it. So if we threw all the compute in the world at a model which is already at human level, it's got much more energy than the normal human does, and it's also got an infinite lifespan, and it's also got a perfect memory.

21:38

So, horribly, a perfect memory. So that's where you get into the realm of science fiction horror movies. But this is what Elon is afraid of? Yes, you hear that a lot in these kinds of examples, and you hear also that one of the pathways that people, or at least I, estimate we will take is the fusion of humans

22:00

and machines. So, for example, if you have a brain-computer interface, or brain-machine interface, BMI, and you're able to essentially augment your lived experience with, you know, machine inputs, that machine learns from the way your brain's working and firing. It learns patterns: you're helping it train, it's helping you train your own brain, and that's going to help speed

22:25

up that process as well. It raises a, you know, treasure trove of ethical, you know, issues. But yeah, that's basically the definition of AGI and then, subsequently, ASI, artificial superintelligence. This is super interesting. So let's look at what the landscape currently looks like, right? So basically, say I want to train an AI model. Where do I buy AI compute? So, I mean, I could just get an instance on AWS or I could run

22:55

it on my local machine. So walk me through the options. Yeah. So it really depends on the scale and the model you're training. If you're a student learning about AI, maybe you're an undergrad, you typically just use AWS or, for small enough models, your local machine.

23:12

As you say. The next level up, you might be a startup. You've just burned through your hundred K of AWS credits and you're looking at the marginal cost of training models. You might go for an on-demand AWS instance, you might go for

23:29

something more fixed or permanent, which is typically cheaper when you book them in advance. But there reaches a certain point when you're training models where you, A, are experiencing enormous cost in AWS or, B, can't actually achieve the scale required in terms of GPUs. So you just get limited by AWS in terms of scale. At that point, you see companies go in

23:55

house. So in our research prior to raising our last funding round, Ben and I spoke to about 150 machine learning researchers and engineers at a variety of places, from [unclear] companies to startups to academia. And whilst a lot of academics at top universities have access to clusters and large high-performance compute, and people at Meta have access to the FAIR research lab supercluster there, which is the

24:21

biggest AI cluster in the world, most people in our experience didn't manage to get the scale that they wanted. And one of the ways that some of them dealt with that would be: they buy GPUs themselves, they bring them in house, and then they manage

24:35

them. And we heard all these horror stories about people, like, in the south of England having a spare bedroom with a fan in it and loads of GPUs, almost like a Bitcoin miner upstairs, and also people who would have them in their office. So it's a bit of a fragmented market. However, basically, the bottom line is, if you buy the GPUs outright, typically it costs less marginally over the long term to

24:58

run them. And that's a function of basically not having to pay the sort of 65%-ish premium, or I should say margin, for accessing Amazon EC2 instances. So that's cloud, local, or getting your own cluster. There's also high-performance compute: if you're in academia, you have access to that type of compute, but then again, there can be bottlenecks there.

25:21

There are other options. So, for example, if you're a, I guess, benevolent organization and you're wanting to solve a highly parallelizable computer science problem, an example of that would be, like, Folding@home, you can access volunteer compute networks using things like BOINC, from Berkeley originally. Maybe a lot of listeners will remember things like SETI@home. Whilst it's not, you know, machine learning, it's just

25:50

processing signals, it's a really good example of grid computing reaching very large scale. I think right now that Folding@home, which is kind of a successor, has the largest compute volume anywhere in the world, even greater than supercomputers like Fugaku. So yeah, to summarize, you kind of go from your local machine onto the cloud, maybe via a high-performance cluster at university, and then ultimately back off the cloud, taking it back on-prem.

26:23

The goal of Gensyn, as a segue, is to give everyone access to the same compute scale that the people who currently have on-prem clusters can achieve, and crucially to do so in a way which allows fair access, so it can't be turned off by a centralized entity. There have been projects like this in the blockchain space before. One of the very, very old, by blockchain standards, projects

26:50

is Golem. I believe they actually did their ICO in 2016, which is basically, like, 50 years ago in blockchain. Yes. So how does Gensyn compare to Golem? Yes, great question. So we differ along kind of two axes. The first one is the thinness of the protocol, so to speak. So Golem's a general compute protocol, you can do lots of things on it, and we are a thin protocol, more similar to, like,

27:19

Render protocol, if you want the analog there, where we do one thing, and that's training machine learning models. The second point is on the scalability of the verification. So what we see in a lot of the earlier projects is a tendency to use things like reputation, or to use less Byzantine-tolerant, or fault-tolerant I should say, methods of replication. When we looked at

27:48

those architectures for verification systems, they just didn't work for us. As people who train machine learning models, we just wouldn't have enough faith in the results. That doesn't mean that they don't work. It's just that, for our comfort, purely for machine learning, we just weren't satisfied. And when we had conversations with Web2 machine learning

28:07

people, they kind of agreed. So for us, the goal was to basically take a lot of those initial learnings around how you position a compute protocol in this crypto world, but do it, crucially, in a way which is only for machine learning, so you can make super, you know, optimizations around the speed and the cost

28:29

of the protocol, number one. But number two, how do you reach a satisfactory level of verification? Right now, that verification and consensus piece is really, like, the vast majority of our time and energy. You know, it's the question, and we had a good initial stab at it with our inaugural litepaper, but we've expanded on it since then. I don't know, Ben, if you'd add anything to that.

28:55

Yeah. Probably just to emphasize: the general-purpose approach that most people before have taken is quite an attractive one. You want to get the biggest market you can possibly get to. So saying we do general-purpose computation, any scale, any kind of computational problem, is attractive at first, but you fall so quickly into the two traps Harry mentioned. The first trap is the verification problem. It's very, very, very difficult.

29:19

Our thesis is you have to narrow, and that we will have a big set of thin protocols at the bottom of the decentralized infrastructure stack. If you think about AWS, but in Web3, we think all of the functionality that exists

29:35

there will be ported over, and it exists as this sort of hierarchical stack of things getting closer and closer to the user as you go up. And on the bottom are protocols like Gensyn, protocols like Render Token, where you do one specific type of computation really efficiently with really strong verification. And then, on top of that, you can have the general-purpose compute networks that fall back onto that. So that's our vision for the decentralized infrastructure.

30:00

I think, as part of that, when you launch as one of those thin protocols, you have a much easier job in initially targeting your market. So our market isn't doing chess simulations and things like that. It's just building machine learning models, that's it. It can be really sort of attractive to say, we could just do this extra thing, we could do this extra thing, maybe we could attach ourselves to an existing sort of thing

30:22

that's quite popular right now, maybe we could generate NFTs, things like that. But I think when you do that, you split the mindshare massively in terms of product, and people don't know what you are. For Gensyn, we will always be very clear that we're machine learning compute. If that's what you want, then you come here. If you want something else, you go to a different protocol; maybe it falls back onto Gensyn at some point. But fundamentally, that's all

30:43

kind of we are. And I think the very long-term view of it is we're behind the scenes. We're just like HTTP, but for machine learning compute. To an end user and a developer, you won't even know that Gensyn exists. All you know is that the world has changed, and now, when you train a machine learning model, it goes out somewhere and it gets performed by someone in the world through a series of apps and things, until it eventually sits on the

31:06

Gensyn protocol. We think that's the best way to provide this compute to the world: via that hierarchical infrastructure where we gradually go more and more behind the scenes. I'd add one final point to that, which is, when we think about the properties that the network has to have, it needs to be targeted towards machine learning engineers and researchers.

31:29

It needs to have the verification piece, but crucially, on the kind of permissionless side, it needs to have that level of sort of censorship resistance but also kind of an agnostic relationship with hardware. So in the kind of, I guess, deep learning hardware space, you know, dominated by the likes of Nvidia, there's companies which are doing their own proprietary ASICs, like Google with its TPUs, tensor processing units, or Graphcore, a good one, with their IPUs, intelligence

31:58

processing units. One kind of trap, I feel, which some protocols, not even in the kind of deep learning space, have gone down before is shipping, like, proprietary hardware. So, you know, I think maybe a good example for general compute: I've listened to the Epicenter session with ICP and Dfinity, where they have their own boxes, basically, and they're sold by them. That's actually very attractive to us.

32:26

The idea that you can essentially ship your own hardware, because then all the kind of issues you have with sort of, you know, rerunning proofs in a way which is deterministic, or hashing, etc., lots of that gets solved. But crucially, it creates a choke point of centralization. So one of the kind of rabbit holes we've seen some other kind of compute protocols go down is that they rely on a certain kind of, how would you say, secure enclave.

32:55

So certain, like, secure enclaves like Intel SGX, where they're like, you know, we can run your computations for you in a way which is kind of private, but you have to use only the specific chip which is manufactured by the specific company. And, you know, it's only rentable on these specific services, and it just doesn't hold true to the decentralized ethos, in our opinion. It also doesn't scale well, currently at least.

33:20

Yeah. I mean, what appeals to me most about Gensyn's offering is that it can use resources that are currently lying fallow. And I mean, this would not be the case if you actually had to buy a dedicated piece of hardware to kind of partake in the network, no? Yeah, exactly. I think, like I said, it's really attractive to go that route from a tech perspective because it's so easy.

33:48

But I think it intersects with one of the biggest things that we think about when designing our verification system, which is: what assumptions are we making, and how are we constraining the system? Because essentially we have to make some assumptions and we have to put some constraints in, but a constraint like that to us is massive. It's huge. We don't want to do that unless we

34:07

absolutely, absolutely have to. There's other things that we can do: we can sort of narrow the space of devices, in a temporary sense or in a permanent sense; we can look at certain manufacturers; we can look at certain libraries that provide determinism, and things like that.

34:22

But every time that we make any decision like that, we make it very deliberately and I think it's quite easy to jump over those in the rush to ship something, but if you're going to build the network that we want to build that kind of takes the entire world and turns it into an AI supercomputer, you have to be very deliberate about that and maybe it takes slightly longer, but you've made it generalizable and that's the

34:44

kind of step change, essentially. It's almost zero or one: if you make those assumptions, you won't reach that kind of end state. I think it sort of sits on three axes: there's product assumptions, there's research assumptions, and there's technical assumptions. And essentially, you have to balance all of those things, which makes it, I think, uniquely tricky.

35:04

You have to have kind of voices for each of them, equally kind of valid in the company, and that's something that we've focused on quite strongly with hiring and things like that, just making sure that we don't accidentally overweight a certain kind of area. I think there are some protocols we've looked at before that have fallen into traps. There's some traps with research, where you can go down a "let's make the most formally verifiable system we possibly can" route, and then you never ship

35:29

anything. And then you can go the other route, where you make the kind of flashiest thing that an end user will like and you ship something really quickly. In previous startup terms that would be fantastic: ship it, it breaks, build it again. In the web3 world, that's not quite as good, as "it breaks" isn't just a little thing anymore. It's a problem. So I think it's sort of a unique area, web3, where you have to walk this.

35:52

I think of it like a ridge, where there's really attractive-looking paths that go down either side, but they're not attractive: they quite quickly drop off the cliff, and we're being very careful to stay on that ridge. Quick question before we dive into the ins and outs of the protocol itself. So Gensyn is its own layer-one blockchain. In principle, it could have also been built as a dapp on another chain. Why did you go the layer-one road? Yeah, it was a big question for us at the start.

36:28

I think, like we said, the sort of blockchain world for us was all about tech. So when we entered it, we were quite sort of deliberate about it. We looked at all of the potential ways we could build it. We made a massive list of pros and cons, and we kind of navigated through figuring out what, again, I guess, the constraints and assumptions were for each one. We quite quickly moved from layer two to layer one because we wanted the freedom to kind of change

36:54

things on the layer-one side, essentially the consensus mechanism. We didn't want to be constrained by a certain smart contract system. We wanted to be able to do as much as we possibly could, because we knew this was going to be a big sort of open-ended problem. Essentially, being a layer one allows us to do a lot more work on the node side than we would otherwise be able to do.

37:16

I think if we'd built in the EVM, which you could absolutely do, you could build what we're talking about there, you'd be very, very constrained by what you can do in Solidity, essentially. Whereas building in Rust, for us, we can do certain things: we can call out and do some machine learning processes, maybe we can do some tensor processing, things like that, that just wouldn't be available to us within the EVM.

37:35

It was, I guess, in a nutshell, a future-proofing thing for us. We don't want to constrain ourselves early when we don't understand fully why we're making those constraints. So we kept it as open as possible. And fundamentally, we also believe in a multi-chain future. We think that the future is truly multi-chain. It's not sort of ecosystems full of chains. It's all chains interacting with each other with a kind of generally agreed messaging

37:59

protocol. I think we've seen some movement from ecosystems having their own message passing, and now moving back into kind of general message passing. And I think, realistically, we're seeing the multi-chain future sort of play out. So we're quite pleased with that so far. So you're looking at building this as a parachain. Why on Substrate, in the Polkadot ecosystem? So we're not fully certain whether we'll be a

38:28

parachain or not yet. The Substrate decision was essentially about the technology. So when we looked at everything, we looked at the sort of frameworks that we could use and the libraries that existed, from a tech perspective: just what was nice, what had sort of the best technology built in. And Substrate came out on top for us. We weren't blockchain people. We were machine learning people. We came in knowing that we wanted to, like, stand on the shoulders of giants, if you

38:54

will. We don't want to rebuild consensus from scratch. We want to use whatever the best one is and then carry on with building the machine learning stuff that we're focused on. And Substrate provided that to us, as a way to very quickly iterate, build up the chain, and then get on with the off-chain stuff, with enough flexibility to change it when we need to.

39:13

So the kind of FRAME subsystem allows us to quickly get something running, but then, if we need to, step in and completely change it, which is really attractive. It's written in Rust, and we're fans of Rust as a language. It just kind of made sense from that perspective. It's interesting.

39:28

This was a year and a half ago, and the kind of two that came out on top were Cosmos and Substrate, and essentially Substrate won because of the tech and the kind of nice libraries and the developer tooling and things like that. But yeah, the parachain decision is one that we essentially will make later, as a bit of a cheat answer. We can be a parachain, we could not be a parachain, we

39:49

don't need to decide right now. So essentially, if the ecosystem starts to fill up with things that we can interact with, so if there's like storage layers in there, if there's sort of sovereign data layers and things like that that we would want close ties with, then maybe it makes sense. If they exist elsewhere, then maybe it makes sense to kind of bolt IBC on and exist in the wider world. But yeah. So you guys should look at solutions like TC also.

40:16

So things that kind of allow you to have a legacy operating system that kind of hooks into a blockchain for provable compute. It's super interesting. Yeah, sounds interesting. I've not come across it before, but yeah, we'll definitely check it out. So let's dive into the protocol. So there's a couple of participants in the Gensyn economy. There are submitters, solvers, verifiers, and whistleblowers. The submitters are the people

40:52

who actually want work done. So let's start at the beginning. Let's say I'm a submitter. What kind of AI problems can I submit? Am I constrained in any way? Yes, currently you're constrained in that your AI problem has to use gradient-based optimization at some point in the computational process. Basically, we use portions of the gradient calculations as part of our proof system. That's not necessarily set in

41:24

stone. I think, as Harry mentioned earlier, we've got our litepaper, which is public. We're iterating on that internally and there are lots of kind of things in play, essentially. But right now, it's gradient-based optimization. We use the signals from that as part of the verification mechanism. What does gradient-based optimization mean to me, as an AI noob? How would I know whether a problem falls into that category or not? Sure.

41:49

Yeah. So I guess fundamentally, if we think about a neural network, it is a big set of layers that have parameters in them. And those parameters are essentially just real numbers. There could be millions, billions, now trillions of those numbers in there.

42:05

But fundamentally, they are the kind of deciding factor in the output of the network, and the training of the network is setting those to realistic values that allow data to go through and trigger the kind of outputs that you'd like at the end of the network. So you go through lots of sort of matrices, layers of these real numbers. It sort of changes the current input as it's going through. And then you get the output that

42:28

you want, by all of those changes that have happened. You need to update those numbers to reflect the output that you want for a certain input. And previously, way, way back in the day, that would be done manually. So maybe not with a neural network, but with kind of certain systems, you would set those using expert knowledge and then you would know that when an input goes through, you would get the right output. There's also different ways of setting them programmatically.

42:52

So you could imagine a super sort of naive way of just randomly setting all of the parameters, running a sample through, checking how far away from the realistic sample it is, and then just doing random ones again: doing a random search, essentially, until you get a smaller error value at the end, and then you just keep decreasing that error value. You can do other strategies where you do sort of more targeted updates and there's lots of ways that you can do

43:17

that. Gradient-based optimization talks about essentially what was the big change for neural networks and deep learning, which was showing that you could essentially use the gradient, differentiating the error with respect to the parameters of the layer as you go through the network, and you can use the chain rule to apply that all the way back through the hierarchical network that Harry described. Essentially, in that way, you get the position on the hill of loss, if that makes sense.

43:46

So, if you modeled the loss as a surface in, like, Euclidean space, you would see it as this kind of really bumpy area, where somewhere there's a big dip, and at the bottom is where the loss is really, really small. And you're trying to find that dip. Getting the gradients for each layer essentially shows you, for that layer, where you sit on that surface and what direction you should go. And so you use the gradient to say, hey, we've got a massive, like, drop here.

44:12

Let's go down it. So the direction that we want to update the parameters in is this way, and we want to update them with this sort of size of step, because this is really steep or it's not steep, so we want to make a big jump or a smaller jump. And essentially that's it.
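As a rough illustration of the hill-of-loss picture, here is a minimal Python sketch that compares the naive random search described earlier with gradient descent on a toy one-parameter loss. The loss function, learning rate, and step counts are assumptions chosen for the example; nothing here comes from the Gensyn protocol.

```python
import random

def loss(w):
    # Toy loss surface with a single dip at w = 3 (an assumed example).
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of the loss with respect to the parameter w.
    return 2.0 * (w - 3.0)

def gradient_descent(w, learning_rate=0.1, steps=50):
    # Step downhill; steeper slopes produce bigger jumps.
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w

def random_search(w, steps=50, scale=1.0, seed=0):
    # Try random nearby values and keep one only if the loss shrinks.
    rng = random.Random(seed)
    best = w
    for _ in range(steps):
        candidate = best + rng.uniform(-scale, scale)
        if loss(candidate) < loss(best):
            best = candidate
    return best

w_gd = gradient_descent(w=-5.0)
w_rs = random_search(w=-5.0)
print(f"gradient descent: w={w_gd:.4f}, loss={loss(w_gd):.6f}")
print(f"random search:    w={w_rs:.4f}, loss={loss(w_rs):.6f}")
```

The gradient gives both a direction and a sense of step size at every point, which is why it needs no guessing, whereas the random search only learns whether a guess happened to be better.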

44:26

You're just navigating this huge bumpy surface looking for a big dip, and the gradients give you sort of a position on that surface so that you know which direction to go in. And it was a huge leap, because that signal, that direction, is kind of really clearly useful, rather than just taking random leaps all over the

44:44

space and figuring out, hey, I'm on the top of a hill now, or hey, I'm at the bottom of a trench. You know where you are: you know that you're on the side of a trench, or that you're on this weird flat bit and you need to make a big jump to try and get out of the flat bit, something like that. How do you know there's only one trench, or how do you make sure you're in the right

45:00

trench? Because basically, if there's lots of trenches, you want to end up in the deepest one. You don't want to get stuck on a molehill, right? I mean, you want to go to Mount Everest. So basically, how do you make sure, how do you know how low you can go with your model? Very good question. That's one of the big, big problems in deep learning itself. Essentially, there's lots of

45:26

techniques for doing that. The very, very simple answer is: assume it's convex, and then you don't have to think about there being any other. Yeah. Obviously, in the real world, it doesn't work like that. Essentially, there's lots of sort of regularization techniques that happen in deep learning training and make it a really complex thing. It makes it more of an art than a science, because a lot of people have their sort of little tricks that they do. There's things like learning rate

45:53

schedules. So you'll use a learning rate to set the magnitude of the jump that you'll take in that gradient space, but you can use certain schedules to sort of decay the learning rate, make it smaller over time, which means smaller and smaller jumps, so you don't accidentally jump over a kind of trench. But in the same way, you can suddenly, randomly introduce a huge jump, which just allows you

46:14

to ask whether you're actually in a global minimum. Maybe I'm in a tiny trench here and there's a massive one over here: I'll just do a huge jump. I don't know where I'll end up, but it could be better. If not, I'll probably roll back to where I was before. So there's lots of techniques like that, that are sort of more trial and error than they are deliberate. But like I said before, they're becoming more deliberate over

46:37

time. So now that people have introduced these regularization techniques, dropout, norms, and things like that, people are looking back at them and saying, hey, did this work for the right reason? Or was it just some weird random quirk of the model architecture that made it work here? And can we kind of figure out exactly why it works? But yeah, it comes down to an art more than a science, to be honest. It can be very

46:59

frustrating. So now I understand that gradient optimization problems are what I should submit. Can you talk about, like, real-world problems, and say which ones are gradient optimizations and which ones aren't, just so I can get a feeling for what kind of problems I should be able to submit? Yeah, I mean, the simplest way of thinking is that pretty much every neural network uses gradient-based optimization.

47:36

There are other problems that use it as well, but all of the kind of big steps that we've seen, all the big changes, have been in neural networks recently, so it's a logical place to focus for us, whilst also allowing this big space of other places. So any optimization problem, theoretically, as long as it's differentiable and we can use the chain rule to flow back, could use gradient-based optimization, and could use other optimization techniques with the

48:01

gradient as a signal in there. As long as you're calculating a gradient that we can use, it's useful. But yeah, fundamentally it's all neural networks. Every maybe two or three years, somebody comes out with a paper that says, hey, we're training neural networks with evolutionary optimization that doesn't use gradients, and it's better. It's never better. It's better in a really constrained system and it never takes hold. Not to say

48:26

it never will do, but so far gradients have managed to stay pretty kind of solidly at the top. Okay. Then I'll turn the question around. What kind of problems can't I submit? So what problems are not solvable, or not well solvable, with neural networks? Hmm. Good question. Expanding the question to neural networks in general: neural networks are generally quite data-hungry algorithms. So if you have a problem with very low data volume. Good example.

49:00

That would be, and I might be wrong about this, some of the kind of toy examples which are used to teach people to do machine learning, like the Iris dataset and stuff. You could actually have a spreadsheet of like a hundred rows and maybe, I think, like seven or eight features. I don't think that, intuitively, they're well suited to neural networks. They're typically better handled by, like, statistical machine learning

49:21

techniques. So I think data volume is one of them. There's also just certain types of neural networks which are very large, so, like, fitting them on edge devices can be a challenge. But in terms of the actual, I guess, when you say "problem", if you think about the type of thing you're trying to predict in the world, there isn't something which immediately comes to mind that neural networks are, like, explicitly, currently, and always will be bad at.

49:53

I don't know if you've had any intuition there, Ben. I guess you can think of a neural network as a universal function approximator. So theoretically, it can do all of the things that you would do with other methods. I think, like Harry said, the reason you would not use a neural network would typically be down to data volumes, where with a statistical machine learning method you could get a better result, essentially, and then you wouldn't train that using

50:17

gradient-based optimization. But fundamentally, you could do it with a neural network if you wanted to. It just might be a bit worse. Okay, good. So I understand that I can submit almost any question. So basically, do I ask for an entire program? Or, I mean, do I ask for like a DALL-E kind of output? Can I say, like, I want a picture of accountants in hot air balloons over a waterfall, and there should also be a rainbow with scorpions on it, and it'll do that for me?

50:50

Or can I ask: I'm building this car and I need an AI to drive that car, can you deliver that AI? Are both kind of within the scope, or does one of those fall out of scope? So I guess it's more like you'd use Gensyn to train the model itself.

51:09

So what you would do would be, you'd think: I want those things, I want to be able to create my scorpion rainbow kind of image generator from the text prompt "scorpion rainbow", which I love. And you'd build a model which, you know, receives a text prompt and then converts that text prompt into images, and then you would have the training data

51:29

which facilitated the kind of learning of that model. And then to the Gensyn network you would submit the model, the data, and then some hyperparameters which determine, you know, like, the learning rate schedule, things like that, and how long you wanted to train for. And then kind of the artifact you receive

51:48

from that training process, the kind of product you get, is the trained model. And then that model, you can then host that, and then you can submit "scorpion rainbow". Yeah. How do I decide which untrained model to use? That's a brilliant question.

52:02

I think there's kind of two ways of thinking about it. So there's this kind of emerging and highly popular concept around foundation models currently, which is, you know, you get like a big company like OpenAI or something like Midjourney, and they build the base model. And then you take the base model with your training data, which might have lots of rainbows and scorpions in it, and then you train the model on that, and then the output of that

52:28

is a model that's very good at, you know, approximating those outputs. That's kind of option

52:32

one, and that's the most common for people who are quite compute-restricted, which is kind of the theme in the industry just now. The second option is you build the model from scratch, and I don't know if you want to talk about that, Ben. Yeah, I suppose a lot of our thinking comes down to the foundation models approach, because we think it's kind of the future of the space. My research back in my PhD was specifically on, essentially, AutoML

53:00

techniques. So the idea of allowing somebody to optimize that model structure and find the best model structure without necessarily being an expert, that's another way of doing it. And you've seen that sort of happen within, like, AWS SageMaker, for example, and GCP and a few other clouds as well, where they build in some AutoML techniques to say to a developer:

53:21

you don't need to know the specific machine learning architecture, because essentially we can just see that as something that's trainable as well, and we apply an optimization technique on top of that. Gensyn as a protocol can have that if you wanted to. We would see that as something that you would build as a dapp that would use Gensyn. And that dapp might implement an evolutionary optimization technique or something like

53:42

that. It would submit the sort of individual architectures it wants to train and test to the Gensyn protocol, it would have them trained, and then it would iterate on the structure, and it could build up the kind of model that you want. And that's a bit of a theme in the way that we think about Gensyn as being purely machine learning compute. All of these interesting things that exist around it, we would love to see built out

54:03

as an ecosystem, essentially. So all of the nice things that you see on SageMaker and GCP, we see as being additional things on top. I think it could be very attractive to build them ourselves, but ultimately it's a trap. But yeah.

54:16

On the foundation models, because of the compute problem like Harry described, we've seen people take foundation models from very large research papers that have spent maybe 10 million dollars in funding in order to kind of trial out all of these different architectures. And then they publish a new architecture and say, hey, this is the best in the world at doing these three computer vision tasks. And then you take that, you use the vast majority of that

54:42

pre-trained network that cost millions to train. You would add some layers on the end, or take some layers off, and then you train those layers on a smaller set of data, and you have a kind of usable model for that, and it has generalized loads of information from the first kind of training it did. So you call that pre-training, and then you've got fine-tuning, and that's very kind of classic.
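The pre-training and fine-tuning split can be sketched in a few lines of Python. This is a toy under stated assumptions: `pretrained_features` is a hypothetical stand-in for a frozen pre-trained network, the dataset is made up, and only the small new "head" layer is trained.

```python
def pretrained_features(x):
    # Hypothetical stand-in for a frozen pre-trained network.
    # Its behaviour is fixed and is never updated during fine-tuning.
    return [x, x * x]

def fine_tune_head(data, learning_rate=0.05, epochs=500):
    # Train only the new final layer (the "head") on a small dataset.
    head = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            feats = pretrained_features(x)
            pred = sum(w * f for w, f in zip(head, feats))
            err = pred - y
            # Gradient step on squared error, applied to head weights only.
            head = [w - learning_rate * err * f for w, f in zip(head, feats)]
    return head

# Small task-specific dataset with targets y = 2*x^2 - x (assumed for the example).
xs = [i / 4.0 for i in range(-4, 5)]
data = [(x, 2.0 * x * x - x) for x in xs]

head = fine_tune_head(data)
print(head)  # head weights approach [-1.0, 2.0]
```

Because the expensive part is reused as-is, the fine-tuning step touches two parameters and nine samples here, which mirrors why compute-restricted teams start from a pre-trained model.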

55:01

And in the deep learning space, one of the things that we find particularly difficult with that is the bias that gets introduced in that pre-training. So one organization doing that on a proprietary dataset, or on a dataset that they haven't disclosed features about, means that when somebody else comes to use it, they don't know what's going on. Because of those black-box issues that you mentioned earlier, you can't go back in and say, why did it make this decision?

55:29

The solution to that, in our minds, isn't to go fully deterministic and kind of get rid of the black box. It's to open it up to everyone and say, hey, everyone train this foundation model. So we design it together, we train it together on an infrastructure that nobody owns, and at the end of the day, we have a model that we can all use.

55:46

That's kind of global, and hasn't necessarily been biased by a specific company's cache of data that they've kept back and don't want to tell you what's in, and things like that. So once we've got those kind of global foundation models, then anyone can come along and say: I'll find the hash of that model on the chain, I know it's been trained, I'll pick that spot, and I'll continue training from there on my dataset for my problem or task.
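To make the "find the hash of that model on the chain" idea concrete, here is a minimal Python sketch of fingerprinting a set of weights and checking a downloaded copy against a published digest. The serialization format and the use of SHA-256 are assumptions for illustration; the conversation does not say how Gensyn actually records model hashes.

```python
import hashlib
import struct

def model_hash(weights):
    # Serialize each weight as a little-endian 64-bit float so that
    # identical weights always produce identical bytes, then hash them.
    payload = b"".join(struct.pack("<d", w) for w in weights)
    return hashlib.sha256(payload).hexdigest()

# Digest that would be published alongside the trained model (hypothetical values).
published = model_hash([0.25, -1.5, 3.0])

# Someone downloads the weights later and checks them before fine-tuning.
downloaded = [0.25, -1.5, 3.0]
matches = model_hash(downloaded) == published
print(matches)
```

Any change to a single weight changes the digest, so a matching hash tells you that you are continuing from exactly the checkpoint that was recorded.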

56:10

And then I'll have a model that I know is at least as biased as the entire global population, rather than being as biased as a company in California. Okay, so basically, until we have the global foundation model, and maybe we can talk about how you plan on kind of delivering that later, I kind of have to decide on one of the commercially available ones. And I have now submitted my problem. Who gets to work on it?

56:39

Do solvers need to kind of prove some prerequisites? And basically, is it one solver, or can you parallelize this? Sure. I would say at the task level it's one solver per task, but a model can break out into lots of different tasks. So typically when large language models are trained, it's interesting: they've kind of been built in a way which maxes out the current hardware at their time of creation.

57:07

So, you know, they are designed to fit chunks on, you know, certain Nvidia processors, etc. You can imagine a similar thing happening across the network. It's complicated by the fact that there's heterogeneous devices on the network. But essentially, for any given task, a supplier of compute, so a verifier or worker, has the ability to basically say, "I'll take that from the mempool", and then they are randomly chosen from the pool.

57:34

That pool is the people who say that they'd like to take that task. So everyone can do it. If the model and the data can't fit on your device and you said that, you know, it can, then it follows that there'll likely be a penalty there, because it's kind of clogging up the system. But essentially, if a task fits on your machine then yes, your ability to run it is essentially just determined by a verifiable random function which selects you from a subset of the available

58:04

miners, or workers, I should say. How do you verify what kind of capacities the miners have? So basically, if I say I have like a 16-core GPU and 400 gigabytes of RAM, how do you verify that? Yeah. So it's essentially in the verification of the computation: they won't be able to do the computation if they don't have that compute device, essentially, or that capacity. And when they come to submit their proof, when it gets checked, it will be found that they couldn't do the

58:41

computation. Okay. There's a little bit of a sort of question there in how big you make a task, because if you made a task an enormous piece of compute, then that would kind of be an issue. You could quite easily DoS the system by just grabbing lots of tasks and saying "I can do these", never doing them, and wasting everybody's time and kind of money and things like that. So it really is a decision

59:02

exactly. It feeds into that decision on the size of a task, and there's lots of other things that feed into that. There's the parallelization that you mentioned as well, and how you split the tasks up into the most optimal structure. At the end of the day, we're doing a lot of research on figuring out what that should be based on the constraints. When we launch our testnet, we'll do it based on the kind of practical aspects as well.

59:23

When we see how this actually works in the real world. We're very conscious that it's easy to kind of define this in the kind of perfect system and say, yes, this is the best size of task. And then you go out, launch a testnet, someone does something really weird, and you realize that you have to completely change it again. So it's part research, part "let's just see how it functions when we get out there", essentially.

59:45

So basically, if I get a specified model and the training data, how can you make sure that I've actually done the job right? Because it's very much not deterministic. So it's not like, you know, you can make me do a hash and then the hash will tell you whether I've done it or not. How do you build in checkpoints

01:00:05

into this process? Because otherwise I could just, you know, pretend to do the work and then say, you know, this was a lazy model, it kind of didn't do the work, it's kind of maybe stupid, I don't know, I've done it but it just couldn't be taught. Essentially, that's the big challenge, and that's the verification system. It's a huge challenge, and I think the simplest, most secure solution is a zero-knowledge proof of the entire computation, essentially.

01:00:35

That's sort of what you think about: in X years' time, we should be able to do any computation as part of a zero-knowledge proof, and then you can check that proof to say definitively whether someone's done that computation or not. Don't you need a new circuit for each given computation? Yeah. So right now, that's the case, and it's horrible to try and do for machine learning work.

01:00:59

The computations are massive. You need a DSL for defining a circuit with respect to a machine learning computation. It's horrible. Essentially, our approach is to have a hybrid between that and a probabilistic mechanism. We sort of follow some principles.

01:01:16

They were proposed in work called "proof of learning" by Nicolas Papernot's group, which was a paper within the machine learning world. It essentially showed that, using the path through gradient space that we described before, you can sort of create this certificate proof using checkpoints in that space, such that theoretically it's just as hard to generate a realistic-looking path as it is to just do the work.

01:01:42

And then, using a kind of financially rational assumption on all of the participants, you can say they would just do the work, essentially. There were issues with that paper and kind of flaws with the ways that it checked things, but fundamentally what it showed was that, using a stochastic, random auditing scheme on top of a path through gradient space, you can build up a relatively robust check.
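To make the random-audit idea above concrete, here is a minimal Python sketch. It is not the Proof-of-Learning paper's procedure or Gensyn's implementation: the one-parameter model, the function names (`sgd_step`, `train`, `audit`), and the tolerance are all assumptions chosen for illustration. A trainer records checkpoints along its path, and an auditor re-executes randomly chosen segments between consecutive checkpoints to see whether the claimed path is reproducible.

```python
import random

def sgd_step(w, x, y, lr=0.1):
    """One deterministic gradient step for the toy model y ~ w * x (squared error)."""
    grad = 2.0 * (w * x - y) * x
    return w - lr * grad

def train(data, w0=0.0, checkpoint_every=5):
    """Run training and record (step, weight) checkpoints along the path."""
    w, checkpoints = w0, [(0, w0)]
    for step, (x, y) in enumerate(data, start=1):
        w = sgd_step(w, x, y)
        if step % checkpoint_every == 0:
            checkpoints.append((step, w))
    return w, checkpoints

def audit(data, checkpoints, samples, rng, tol=1e-9):
    """Re-execute randomly chosen segments between consecutive checkpoints."""
    for i in rng.sample(range(len(checkpoints) - 1), samples):
        (s0, w), (s1, claimed) = checkpoints[i], checkpoints[i + 1]
        for x, y in data[s0:s1]:  # step k consumes data[k - 1]
            w = sgd_step(w, x, y)
        if abs(w - claimed) > tol:
            return False  # the claimed checkpoint is not reachable from the previous one
    return True

# Toy dataset with true weight 3.0 (illustrative values only).
data = [((i % 7 + 1) / 7.0, 3.0 * (i % 7 + 1) / 7.0) for i in range(100)]
_, honest = train(data)
# A forged path: one checkpoint is shifted without doing the corresponding work.
forged = [(s, w + (0.5 if s == 50 else 0.0)) for s, w in honest]

rng = random.Random(0)
honest_ok = audit(data, honest, samples=len(honest) - 1, rng=rng)
forged_ok = audit(data, forged, samples=len(forged) - 1, rng=rng)
```

Here every segment is audited so the outcome is certain; passing a smaller `samples` value turns this into the probabilistic check discussed above, where a forged segment is caught only if it happens to be sampled.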

01:02:05

And essentially, we take that one step further by introducing zero-knowledge proofs at certain steps, and on top of the kind of global loss of the model, to just add another definitive kind of proof on top. And we package all of that up within a game-theoretic mechanism that looks quite a bit like Truebit from way back in the day, with staking and slashing, solving the verifier's dilemma with, like, random jackpots, essentially, with whistleblowers,

01:02:30

and that's the full system. Although I'm aware that is just a big word vomit of things, so happy to dig into bits of it. Yeah, there's so much to unpack here, so maybe let's back up to Truebit. I think lots of people kind of remember Truebit. Basically, it lets you do large computations off chain, and then you can prove them with a binary search on chain. Is that a

01:02:56

fair summary? Yeah, I think essentially that's exactly it. So Truebit proved that you could take a very large computation that wouldn't fit in the EVM, or would be absolutely massive and really expensive, do it off chain and, using that challenge mechanism and that search that you described, eventually prove it on chain with the chain doing a tiny amount. And we take that same principle and apply it on top of some of the sort of certificate proof stuff that we mentioned before.
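The challenge-and-search idea described here can be sketched as a bisection over two execution traces. This is an illustrative Python toy, not Truebit's or Gensyn's actual protocol: `run_trace`, `first_disputed_step`, and `arbitrate` are hypothetical names, and the "program" is just a list of small arithmetic steps standing in for a large off-chain computation.

```python
def run_trace(program, state):
    """Return the list of intermediate states, one per executed step."""
    trace = [state]
    for op in program:
        state = op(state)
        trace.append(state)
    return trace

def first_disputed_step(solver_trace, challenger_trace):
    """Binary search for a step where the traces go from agreeing to disagreeing.

    Assumes both traces agree at index 0 and disagree at the last index.
    Returns i such that the traces agree at i and disagree at i + 1.
    """
    lo, hi = 0, len(solver_trace) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if solver_trace[mid] == challenger_trace[mid]:
            lo = mid
        else:
            hi = mid
    return lo

def arbitrate(program, solver_trace, step):
    """The 'chain' re-executes a single step and checks the solver's claim."""
    return program[step](solver_trace[step]) == solver_trace[step + 1]

# A 1024-step toy computation (values are arbitrary).
program = [lambda s, k=k: (s * 31 + k) % 1_000_003 for k in range(1024)]
honest = run_trace(program, 7)
# The cheating solver's trace is wrong from index 600 onwards.
cheat = honest[:600] + [(v + 1) % 1_000_003 for v in honest[600:]]

step = first_disputed_step(cheat, honest)
solver_honest = arbitrate(program, cheat, step)
```

The search takes about ten comparisons for 1024 steps, and the arbiter only ever executes one step, which is the property that lets a chain with limited compute settle a dispute over a much larger computation.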

01:03:23

So if you applied that to a full machine learning training job, you'd be searching forever; like, it's enormous. So we distill that down into a kind of smaller proof that is still representative of the larger one, i.e. rather than doing the full thing, you do one in 100 checkpoints or something like that. You've already reduced the size by 100, and then you go into that challenge mechanism.

01:03:44

There's also some work in the machine learning space, again, which has applied the Truebit mechanism to neural networks. But rather than using virtual machine instructions for that search, you use a graph, and you traverse a kind of Merkle tree of the neural network graph, essentially a graph of operations, and you can do that at different granularities. So you can do it at native operations.

01:04:05

Like within PyTorch or TensorFlow, on a convolution, and then you can step into that convolution and do the matrix multiplications that are involved, and then you can step into that matrix multiplication and do the individual kind of floating-point operations that are involved. It's quite a large overhead, so it requires you to do that big reduction before you get to that stage.
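A rough Python sketch of the Merkle-tree traversal described above follows. It assumes, purely for illustration, that each party commits to the output of every operation in the graph as a leaf, and that the number of operations is a power of two; the helper names and the example outputs are hypothetical and not taken from any particular paper or from Gensyn's code.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return the levels of a Merkle tree, leaf hashes first, root level last.

    Assumes the number of leaves is a power of two, for brevity.
    """
    levels = [[h(repr(v).encode()) for v in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def find_divergent_leaf(tree_a, tree_b):
    """Walk from the roots down, always following a child whose hashes differ."""
    if tree_a[-1][0] == tree_b[-1][0]:
        return None  # identical roots: nothing to dispute
    idx = 0
    for level in range(len(tree_a) - 2, -1, -1):
        left = 2 * idx
        # If the left child matches, the disagreement must be under the right child.
        idx = left if tree_a[level][left] != tree_b[level][left] else left + 1
    return idx

# Hypothetical per-operation outputs of an 8-operation graph.
solver_outputs = [0.5, 1.25, 2.0, 3.5, 4.0, 4.75, 6.0, 7.5]
verifier_outputs = list(solver_outputs)
verifier_outputs[5] = 4.7501  # the two parties disagree on operation 5 only

bad_op = find_divergent_leaf(build_tree(solver_outputs), build_tree(verifier_outputs))
```

To "step into" an operation as described in the conversation, the same procedure could be repeated on a finer-grained tree built for the disputed operation, for example over the matrix multiplications inside a convolution.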

01:04:23

But once you do, it provides that crucial link that goes from a random off-chain participant to full consensus of the chain, with the chain running something. And that links back to what we said earlier about being a layer one versus a layer two: as a layer one, we can also increase the size of computation that the chain can do.

01:04:41

So if we make the chain do a matrix multiplication and that's okay, then we get to kind of skip that step into a matrix multiplication, doing floating-point operations and things, which is quite nice. At the end of the day, it's constraints and assumptions again: you increase the hardware requirements of the validators and things, so you've got to be careful, but we like the flexibility of being able to kind of change all those levers and things.

01:05:02

I totally get that. So I think basically, kind of fixing the block gas limit and maybe repricing some opcodes, I mean, obviously it goes a long way, right? Maybe let's talk about how the blockchain itself works in just a bit, but there are two more parties in the process. So there's the verifier, who will actually make sure that the checkpoints have been checked,

01:05:31

and then the whistleblower, who makes sure that the verifier actually operates correctly. Can you go through what their respective roles are? Yes. So the verifier and the whistleblower have a relationship similar to the verifier and the kind of worker in the Truebit paper.

01:05:51

So essentially the whistleblower solves the verifier's dilemma problem, which is the idea that you won't necessarily bother to, you know, verify work unless you can reasonably expect there to be work worth, you know, catching as being wrong and being rewarded for. So the whistleblower essentially checks that the verifier's work is correct, and is also incentivized to do so by forced errors from the verifier. So the verifier, wasn't that from the

01:06:21

Truebit paper, right? Yeah. It's kind of like the dogs at the baggage carousel, where if they don't find, you know, any drugs, their handlers put out, you know, like a suitcase with drugs so that, you know, they don't end up depressed and stop working. The dogs need dog treats occasionally; that's basically the kind of thinking there. So it follows that, basically, the solver does

01:06:47

the work. If the work is incorrect, the verifier shows it as being correct, and the whistleblower can then confirm it's incorrect; that then goes back onto the kind of chain, which we can talk to in terms of, you know, being verified on chain. But essentially, periodically, and at a rate which is kind of linked to the security of the system, I guess,

01:07:13

the verifier will show up an error on purpose to the whistleblower, which keeps the whistleblower wanting to be kind of engaged over time as well. If the whistleblower does find a problem, they play a pinpoint protocol game, where we narrow down the computation to a single kind of point in, I guess you could view it like, the Merkle tree of computations for that area of the neural network, and then that goes to the

01:07:39

chain for arbitration. That's the kind of plain version of it that we originally had. As Ben mentioned earlier, we've advanced on it in a couple of areas after basically closing our seed round and doing more research work. But yeah, that's the verifier and the whistleblower. So tell me how that fits into

01:08:01

the blockchain as a whole. So obviously someone has to build blocks. I assume this is some kind of staking network, so there has to be a staking token. How does all of this fit in with the Gensyn protocol? Yeah. Essentially it's a vanilla, to an extent, Substrate blockchain. We use the proof-of-stake GRANDPA/BABE consensus mechanism, with validators just kind of doing things in the normal way that they will, and all of the parts that Harry

01:08:36

described and I described earlier happen off chain. They're all kind of off-chain participants doing portions of work and being incentivized by the fact that they've staked through a kind of normal staking pallet within Substrate, but in a smart contract it could just be submitting a certain amount of tokens, and then they will be rewarded when that work ultimately gets

01:08:55

checked. Now, the kind of game-theoretic difficulty here is making sure that all the staking amounts, potential slashing amounts, and reward amounts add up so that there isn't an incentive somewhere for somebody to either be lazy or to do something that is malicious, essentially. So it gets complicated when you add more participants in, like having the whistleblower.
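The "all add up" condition can be illustrated with a small expected-value calculation. This is only a sketch under simplified assumptions (a single solver, a fixed probability of being caught, risk-neutral participants); the function names and every number below are hypothetical and are not Gensyn's actual parameters.

```python
def solver_payoffs(reward, cost, stake, p_caught):
    """Expected payoff of doing the work honestly vs. submitting fake work."""
    honest = reward - cost
    lazy = (1 - p_caught) * reward - p_caught * stake
    return honest, lazy

def min_stake(reward, cost, p_caught):
    """Smallest stake at which being lazy is no better than being honest.

    honest >= lazy  <=>  reward - cost >= (1 - p) * reward - p * stake
                    <=>  stake >= cost / p - reward
    """
    return max(0.0, cost / p_caught - reward)

def whistleblower_payoff(jackpot, check_cost, forced_error_rate):
    """Expected payoff per check when errors are planted at a known rate."""
    return forced_error_rate * jackpot - check_cost

# Illustrative numbers only.
reward, cost, p_caught = 10.0, 8.0, 0.25
threshold = min_stake(reward, cost, p_caught)
honest_low, lazy_low = solver_payoffs(reward, cost, stake=10.0, p_caught=p_caught)
honest_high, lazy_high = solver_payoffs(reward, cost, stake=30.0, p_caught=p_caught)
```

With a stake below the threshold the lazy strategy has the higher expected payoff, and above it the honest strategy does. The same style of check applies to the whistleblower: checking is only worthwhile if `whistleblower_payoff` is positive, which is what the forced errors and jackpots are meant to arrange.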

01:09:17

Having it there as an additional participant is annoying because it's overcomplicated, but it's crucial for us, given the size of the computations, to have it there to assure the honesty of the verifier, essentially. It's not certain that we'll always have to have that; we do keep thinking about ways that we can potentially remove the whistleblower. There are certain zero-knowledge proof techniques that mean that we potentially could, but we don't want to get ahead of ourselves, essentially.

01:09:41

So right now, it kind of looks like what's described in the litepaper, but we're chipping away at each bit of it to try and simplify it. We think, if you look at the way that other protocols have gone in the past, there's a tendency to launch with a complicated system and then, once you get it out there, realize that you can simplify it, and we're expecting to go through that, essentially.

01:10:01

We kind of saw the same thing with Polkadot and the fishermen mechanism, which sort of got removed after the thing had launched and was out live. I'd add one other point there, just on our augmentation of the vanilla kind of Substrate chain.

01:10:17

There's an issue in the verification system, as we originally proposed it and also as it currently looks for our state of the art, whereby if the data which is being used to perform the initial kind of work from the solver is removed or made inaccessible halfway through the verification process, you reach a kind of standoff, because if the verifier can't access the data,

01:10:42

then it follows that they can't verify. Some kind of data availability solution that kind of plugs into it? Precisely, yes. So we built that in on top of the kind of Substrate chain. We have a proof of availability, PoA, which is kind of what we've dubbed it internally, a layer which is erasure encoded, et cetera, and basically provides what we couldn't find in the wider kind

01:11:09

of storage layer market. And if anyone's listening to this who's building in that space and this does exist, I would honestly be fascinated to see it. But essentially it's a layer wherein you can lock data for a period of time in a way which is pinned and cannot be unpinned for that period of time, and verified on chain that it exists there. And can't you do that on Arweave? It's too expensive on

01:11:34

Arweave. So Arweave is the answer, but the cost, if you think about, you know, a terabyte of training data being stored forever on Arweave, it just doesn't work when the alternative is, you know, storing it on S3. So, yeah, I should also caveat that it has to be inexpensive. Yeah, and just on that with Arweave: the reason we need it is for the data, the training data, but it's also for some of the intermediate, like, proof data, and that doesn't need to

01:12:06

be around for very long. It could just be 20 seconds while we go through a certain number of, like, block releases or something. With Arweave, we don't need, like, 200 years of storage of that thing, but that's what we're paying for, essentially. So if somebody has really short-term storage but with the guarantees of Arweave, so bringing the price down because it doesn't need to be 200-plus years, that's what we want, basically. We just haven't seen it anywhere.
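The availability layer above is described as erasure encoded, without further detail. As a minimal illustration of what erasure coding buys, here is a Python sketch of the simplest possible scheme, a single XOR parity shard that lets any one lost shard be rebuilt. This is an assumption for illustration only: production systems typically use stronger codes such as Reed-Solomon, and nothing here reflects Gensyn's actual design.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int):
    """Split data into k equal chunks (zero-padded) plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]

def recover(shards, original_len: int) -> bytes:
    """Rebuild the data when at most one shard (marked None) is missing."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single-parity code tolerates only one lost shard")
    if missing:
        present = [s for s in shards if s is not None]
        shards = list(shards)
        # XOR of all surviving shards equals the missing one.
        shards[missing[0]] = reduce(xor_bytes, present)
    return b"".join(shards[:-1])[:original_len]

blob = b"intermediate proof data"
shards = encode(blob, k=4)
shards[2] = None  # one storage node goes offline
restored = recover(shards, len(blob))
```

The point of the sketch is only that data spread across several holders can survive one of them disappearing mid-verification, which is the standoff scenario raised in the conversation.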

01:12:33

You guys should talk to Ali's. I mean, Arweave with storage rent, you know, this might be a thing. Yeah, it's like a sort of permaweb, but a temporary web. The temp permaweb. Right. So I assume that there's going to be a Gensyn token somewhere in this eventually. Tell me about that.

01:12:57

Sure. So the Gensyn token fundamentally is required for the tech, essentially. Everything that we've just described assumes that you have this token built in that can be used to stake, slash, provide rewards, et cetera, and also maintain the consensus of the system itself, essentially. So it would use kind of a small inflation amount to pay out validators, and then be used in that game-theoretic mechanism to allow us to kind of guarantee that financially rational kind of assumption over the entire system.

01:13:28

I think crucially for us, that's what it's for, and that's like the only thing it's for. We're very, very deliberate to say it's a technical thing that we need, but we will bring it in when we technically need it, and not before. We've seen what's happened with kind of utility tokens in the past, where people have kind of launched them too early, and then it's a distraction for the team, it's a distraction for everyone. People aren't buying it to use it in the system, they're buying

01:13:54

it for other reasons. We don't want any of that, ideally. I mean, it's easy to say that; it's hard to kind of see what will happen in practice. But our approach is to essentially delay it as much as possible and then quietly bring it in when it's required to maintain that consensus and pay out the participants with the game-theoretic mechanism, essentially. Yeah, it's critical to note as well that, along with some other kind of early movers in the deep learning crypto space,

01:14:22

we're very much a minority with respect to the rest of the deep learning community. There's, at least in our experience, quite a lot of skepticism about crypto more broadly. I mean, Ben's and my histories are kind of a testament to that; you know, we were skeptical, and obviously, from a technological and kind of ideological point of view, we like it. I think it's the right way.

01:14:44

However, when the network initially launches, we anticipate that the majority of the end users will pay in fiat, and it will just simply be swapped into tokens. The solvers and the people participating on the supply side of the network will facilitate that with tokens. And we have huge interest from a lot of the kind of old Ethereum miners who have lots of GPUs and want to attach them to

01:15:08

something. But yeah, it's crucial that the scary crypto words like "token" are kind of removed from the end deep learning and machine learning users, which is exciting because this is one of those use cases which really bridges the two worlds of web2 and web3. You know, there's an economic rationale for this existing, there's the technology now to enable its existence, and it's now

01:15:39

almost like a kind of execution question to a large extent. You know, how do you get people comfortable with the idea that I'm not using Amazon and, you know, there's a kind of variable-price concept happening here with the token? And how do you obfuscate that as much as possible? And crucially,

01:15:56

how do you obfuscate it in a way which is decentralized? Because it's very easy just to stand up some centralized API front end which, you know, just automatically converts the tokens on a centralized exchange somewhere, but that brings its own problems. So yeah, I'd just add that to the pot. So what does the roadmap look like for you guys? There's going to be a testnet early next year? Yep, testnet early next year. It won't be incentivized, as kind of touched on

01:16:27

earlier, and it's more to kind of do two things:

01:16:31

firstly, to kind of battle-test some of the tech that we've been building internally, and secondly, to get feedback on the usability of it overall. That precedes our kind of incentivized testnet, where, essentially, you'll be able to train models, kind of, you know, in anger on it. The rate at which we move is something that we talk about a lot. You know, we could ship something very, very soon which doesn't really give

01:17:03

us any meaningful feedback, but it kind of looks good because it's like, oh, you shipped something. And we don't want to fall into that trap because, yeah, there have been lots of things which have kind of come and gone for us where there have been incentives to kind of ship something super early. So even, you know, earlier this year there was lots of hype around the idea of doing generative NFT art, and we could have kind of provided, like, an

01:17:26

inference endpoint for that, you know, really quickly, but we decided that it was kind of out of step with our principles. You know, it doesn't solve the big problem, and it's not really on the way to solving the big problem. It had lots of other things we'd have to build. So yeah, I guess what I'm trying to say is we're not in a kind of immediate rush to release something tomorrow. We'd rather release something which is meaningful, which takes time given how

01:17:48

fundamental some of the stuff is, particularly the zero-knowledge stuff, which is, you know, pretty involved in terms of time. Cool. Thank you, guys. Where can people go to learn more about Gensyn? Yes, so gensyn.ai is going to be your primary source. We have a Discord; we don't have a Telegram group, but Discord's

01:18:07

where a lot of chat happens. We're also hiring just now, so if anyone's listening to this and is interested in, you know, building a permissionless deep learning compute protocol, then we're it. And I guess, moreover, next year we should be hosting a zero-knowledge machine learning summit. So if anyone's particularly interested in that kind of

01:18:27

crossover, there's that. And maybe as a final point, maybe more for traditional kind of deep learning or machine learning people who are listening: we're sponsoring the NeurIPS conference in New Orleans next week. So Ben and I will both be in Louisiana at the conference, attending the talks, flying the flag for crypto, and yeah, if anyone's there, we're more than happy to chat. Super good. Thank you both for coming on. This was super interesting.

01:18:57

We look forward to kind of seeing how this plays out with the testnet and the mainnet. Fantastic. Appreciate it. Thanks, it's been a pleasure. Yeah, thanks so much for having us on, really enjoyed it. Great questions as well. Really interesting. Thank you. Thank you for joining us on this week's episode. We release new episodes every week. You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever

01:19:22

you listen to podcasts. And if you have a Google Home or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast. Go to epicenter.tv/subscribe for a full list of places where you can watch and listen. While you're there, be sure to sign up for the newsletter to get new episodes in your inbox as they're released.

01:19:39

If you want to interact with us, guests, or other podcast listeners, you can follow us on Twitter. And please leave us a review on iTunes; it helps people find the show, and we're always happy to read them. Thanks so much, and we look forward to being back next week.

Transcript source: Provided by creator in RSS feed
