¶ Introduction
The main thing is browsers have storage now. Like, they didn't have storage for the longest time. I feel like my mission in life is just, like, I'm shouting from the mountaintop: Hey, did you know browsers have storage now? Welcome to the localfirst.fm podcast. I'm your host Johannes Schickling, and I'm a web developer, a startup founder, and love the craft of software engineering. For the past few years, I've been on a journey to build a modern, high quality music app using web technologies.
And in doing so, I've been falling down the rabbit hole of local first software. This podcast is your invitation to join me on that journey. In this episode, I'm speaking to Aaron Boodman, who's the founder of Rocicorp, the company behind local first products such as Replicache and Reflect.net. This conversation covers a wide range of topics, starting from how his work on Google Gears led to Google Chrome and many of today's web standards.
Later, we're diving into what Replicache is and how it's implemented. Also a big thank you to Expo and Crab Nebula for supporting this podcast and supporting the local first ecosystem as a whole. And now my interview with Aaron. Welcome Aaron to the show. Thank you so much for joining today. Would you mind briefly introducing yourself? Yeah, sure. Okay, my name's Aaron.
I am most recently the founder of Reflect.net, which is a hosted service for multiplayer and for local first style apps. My company also built Replicache, which is a variation of that product. And I kind of have a long history in the web. I've been doing open source on the web for a really long time. I started in, like, I don't know, 2002, 2001. And yeah, that's me. Awesome. Thank you so much for making time and coming on the show.
Would you mind sharing a little bit of, like, your arc of what brought you to local first software today? Yeah, sure. I'll try to give a short history. You know, I started out my career just doing JavaScript libraries, UI libraries. Actually, a bunch of my oldest friends and co-workers, we were members of this community online called dhtmlcentral.com,
which was, like, back in the very dark ages of the web, like before GitHub, before open source really hit its stride. We would share JavaScript on there, and we built UI components and things like that, and we were just really excited about, you know, pushing this tiny, terrible platform to build, like, real apps with it.
One of my friends built, like, a whole windowing toolkit, like, with all the window components, in JavaScript, in, I don't know, it must have been, like, 2001. And, you know, over time, the web grew and we found ourselves at big companies. I found myself at Google helping them do their JavaScript products, right? And, you know, it was always like this huge fight to get anything in browsers at that time.
Like, browsers just would never add any features. Like, there's no incentive for them to do it. And so, this opportunity at Google came up to work on this thing called Gears, which was this crazy idea to, like, add a plug-in to browsers that added APIs. Like, usually plug-ins, they're like a graphical thing, you know. They add some sort of UI feature. But this was an idea to add a plug-in that had no UI features. All it did was add web APIs.
So that, like, we could actually build stuff with browsers that was, like, good, you know? And it was just such a cool, like, bold idea. And so I jumped on that, and one of the very first things we built was a database. And the reason that we wanted to put a database in browsers was because, at that time, you know, you could only store five megabytes of data in a browser, you know? And so, if you think about it, the web is just, like, fundamentally hobbled by not having storage.
Like, you think about what makes an app. You know, we have this constant annoying debate on the web. SPA, MPA, blah, blah, blah, blah, blah. Just step back and think to yourself. Like, what are the ingredients of a native app? What makes it tick? What makes it a native app? And there's not that much to it. It's processing, memory, network, storage, UI. You know? Those are the ingredients. You know? The web until recently didn't have storage and that's like an important ingredient.
If you don't have storage you're doomed to be a dumb app, a dumb client. Completely agree. So before going a bit more into today's storage mechanisms: back then, for Gears, what was the inspiration? Was there any sort of existing database that you looked at where you said, okay, it would be great to have something similar to this, but for the web? Or did you mostly design this from scratch?
¶ From Google Gears to Chrome
Well, the actual database that we used was SQLite, which at that time felt very mature, even then. You know, I'm pretty sure that SQLite started, I want to say, in 2000. And this was 2007 that we were doing this in. It's kind of an obvious thing. Already SQLite was in everything at that point, you know? Or it felt like it was in everything. And it's embedded, fast, robust. But, you know, it wasn't an appropriate API for the web.
The cool thing about working on Gears was like a lot of the people that were doing the platform design were like JavaScript developers. And then we had a bunch of like sort of more gnarly C++ people working on the guts. And we, you know, we wanted to do APIs that fit in the web from both like a security and a usability point of view. So we wrapped SQLite in you know, a simple, but what we felt was like fairly webby abstraction. So what was the path forward from there?
So I can't say that I've built something on top of Google Gears. But I know of quite a few database-like abstractions for the web. For example, thinking of Web SQL, which I also think is built on top of SQLite: was that sort of an evolution of what you built back then? Or take me from back then, from the first steps of bringing SQLite to the web, through the various chapters up until today. Yeah, absolutely. Gears had a short and glorious life. It did not last long.
We launched it in 2007, I believe. And it was dead by 2010. And what happened was, in retrospect, it was this very high risk strategy for Google. We were trying to force change on the web. Like, Google needed the web to be better in order to do its applications. And what happened when we launched Gears was, like, the other browser vendors hated it. And, like, legitimately. At that time, Google did not have a browser, right?
So, the browser vendors hated it. And, you know, they felt threatened by it, you know? And I think there were also fair and good, like, standardization arguments, but I think the reality was, like, they were the people who made the web platform and this was like someone coming and trying to force things on them, sort of against their will. Whether the will was, like, fair or not. And so there was a lot of backlash. We had hoped that we would distribute Gears with browsers.
Safari, like, they outright said no. I think Firefox said no. And basically, like, around that same time, Google was starting Chrome. And there were strategic questions of how the two fit together, and basically, how do we advance the web? And it just became really clear that the way to do that was to have a browser. Because, like, browser vendors were the ones with a seat at the table. And it was obvious that, as a company, we would have more leverage doing it that way.
At first, we put Gears in Chrome. So the first thing was Chrome launched and we put Gears in Chrome. And then, a little while later, we shut down Gears. But, even with, like, just the few properties using Gears, it was obvious that the web needed this. And there's a guy at Google who was the HTML5 spec editor, named Hixie, Ian Hickson, and he just worked right down the hall from me. And he was like, we have to standardize this. We have to specify it.
His philosophy was like, the web is what browser vendors ship, you know, that's just the reality. And Chrome has shipped this, and then shortly thereafter, like, Safari was working on shipping it, and he was like, we have to have a spec for this. Otherwise, it's not gonna be interoperable. So he wrote it down and he designed the Web SQL API, Web SQL Database, which was a JavaScript API around SQLite, which was inspired by Gears. I helped, like, edit the spec.
And that took us to Web SQL Database. The story actually has this long and sad history beyond that, because, you know, of course, Web SQL got killed too. Yeah, well, as with many products and great ideas within Google. But looking at it as a glass half full, I think each of those has brought new ideas and started new things that outlived the initial products. So I think now SQLite is far from being done. So it's really like a technology of the decades and is still getting better.
And I think we should do entire shows also just on SQLite and local first. Were there any other ideas that are sort of noteworthy that came out of the gears project that also made it in today's browsers. Yeah, absolutely, you know, geolocation. We shipped geolocation.
We were the first way that you could get access to geolocation programmatically in web browsers. The guy who designed that API was, like, on my team, and the API that ended up in browsers is almost, like, word for word the same API. And workers. So the very first version of Gears had three APIs. Offline boot, like, the ability to boot a web app offline, so it had a sort of equivalent, something a little bit like AppCache.
And then it had workers, so that you could do sync processing in the background without blocking the UI thread. And it had a database based on SQLite. And all three of those were in the first release, and all three of those APIs became web standards. You know, not the exact same API, but they directly influenced them. I mean, basically, Hixie was like, we better write this down, because people are going to start implementing it.
So. In that way, Gears definitely had the impact that Google and that, you know, I hoped it would have, that it did move the web forward a lot, really fast. That's incredible. You had like all of those foundational ideas already back then.
And I feel like now, what is it, like, definitely more than 10 years later, the web still feels like it's just about to discover those bigger primitives: that we can actually use storage to build real web apps, that we can use workers to make things more performant and kind of spare the main thread a bit, to keep the app more responsive. So all of those capabilities that you all were thinking of way, way back, I feel like step by step the web is finally waking up to those.
So I think storage is really, like, the main thing on your mind there. Yeah, I mean, there's so much to talk about, I'll try not to ramble. You said 10 years; it's been more than 15 years. Like, it's 2023, we launched Gears in 2007. So it is crazy how long these things take. It has not been a straight line. It wasn't Google that killed Web SQL. It was actually Firefox, kind of, and Microsoft. Like, basically, the browser vendors wouldn't implement it.
They were afraid that it would not be able to be standardized. Which I think is not really true. There was a fear, and I think it was a legitimate fear, but I think ultimately we could have done that. And so we sort of randomly got IndexedDB out of the deal instead. You know, the vendors were looking for an alternative that was standardizable, right? And so this thing sort of emerged.
And, like, one of the participants in the W3C list, or the WHATWG list, proposed this. And it fit the sort of strategic goals of the other vendors, and so it got specified and implemented really quick. But it was just one of those things where it wasn't coming from a place of people actually using it, you know, and wanting to build things with it.
And so, just like always happens when you do that, it has this really weird API shape. And so, you know, IndexedDB, like, famously never got used for anything, and then we didn't have Web SQL either. And then meanwhile, people kept trying to put SQLite in the browser, because SQLite is so awesome. It's so useful. And so people kept looking for a way to do this, both inside browser vendor companies and outside.
Google had this team, a sub team of Chrome, working on this thing called NaCl, and that was like, you know, an assembly language that could run in the web, sandboxed, and the predecessor to Wasm. One of the main use cases was we could run SQLite in the browser. And we wouldn't have to standardize it, right? Because the standard would be lower level, right? But the other vendors didn't want to implement it. Yeah, so then that didn't work.
And then they tried PNaCl, which was supposed to be like a more portable version of it, but then the vendors wouldn't implement that. And for a little while people were like, oh, we can just compile everything to JavaScript. That was like, you know, asm.js, and people tried compiling SQLite to that. And that, I mean, it kind of worked. But then eventually Mozilla proposed Wasm, which was like basically a really closely related idea to NaCl.
And then everyone got behind that. And so then people started putting SQLite in that. And then, you know, humorously, around the same time Safari killed cross site caching for privacy reasons. So, using SQLite this way is like a little bit hobbled because it means, like, every app has to cache it themselves. But yeah, it's been a long story and, you know, there's way more to it than that. Why do the platforms grow so slowly?
There's like technical reasons and there's sort of political reasons and strategic reasons from the companies, but like, I think also it's like, There's this economic thing that happens. I think like humans are sort of lazy and risk averse. You know, not in a bad way, just like, no one uses storage on the web, right? And so if you're building a new app on the web, and say you want to build, I don't know, a new competitor to Google Slides or something like that, right?
Are you going to use storage? It's like a, it's like a high risk thing, because no one else does it, and that alone makes it high risk, because it's like unknown.
¶ Challenges of the Permissionless Web
So zooming out a little bit, the web is not the only platform. We have native mobile platforms, iOS, Android. We also have native desktop platforms. Most of those have storage mechanisms, since otherwise, if I reopen an app on my iPhone and it has lost all my data, that wouldn't be great. Whereas that is the more common experience on the web. So why do web apps feel so different compared to native apps? Yeah. I mean, this is a question as old as time.
I mean, people have been asking this since I started programming on the web. And I think it's this, like, really deep and interesting question. So why is it hard to put storage in the web platform? Why has it taken so long? A big part of it is just that the web is zero permission. It's permissionless, right? That's, like, the web's superpower. That's why we still have the web. And, you know, iOS is, on technical merits, a superior platform.
You can't deny it, but the web has this one thing that no other platform has, and that has made it so powerful: it's permissionless. You can send a link to your friend and they see the picture of your cat. That is something that doesn't happen on any other platform and it's super, super important. It's why I started programming for the web and it's probably why many of us did. It's like when you're a kid and you've got notepad.exe,
you know, and an FTP client, you can, like, write something, put it on your webpage, and share it with your friends. You don't need anyone's permission. You don't need an account. You don't need a credit card. But this superpower is also the source of all the web's difficulties. Like, the fact that the web is permissionless means that it can't have any permission that could be dangerous, that could be a security issue, that could be a privacy issue, that could hurt you or your computer.
And so it's different than a platform that's more managed and so just from a platform design perspective, figuring out how to put storage in browsers has been difficult, right? Because you, you know, a malicious app, you don't want it filling up your disk, right? Making the other apps not work right. Or like consuming all of the storage quota, right? And from a privacy perspective, you don't want people using the storage to like, you know, track you or whatever, right?
Because of the way that the web is permissionless, it has grown an ecosystem of apps that are like different than native apps, right? Only recently people have started using the web for like sort of personal productivity apps like note takers and stuff like that. You know, like you think about like traditional platforms, they tend to focus on apps that are single user, you know, like the bread and butter of iOS is like single user apps. Right? All the apps that it's shipped with, right?
Those are all things you use alone, right? And storage is easy for those kinds of apps, right? Desktop platforms have had storage for a long time, right? But because the web is permissionless, it has tended to be strong in collaborative applications, right? And also apps that are huge, where there's, like, way more data than you could possibly fit on your device, right? Google Search, Google Maps, Gmail, right?
All of these products are, like, things where you couldn't actually run them on your device, right? Like, they need a lot more computers to run. Even assuming that you have storage, like once we added Gears, right? Once you have that, even using it is hard because you have to figure out how to use it in a multi-user environment. You have to figure out sync and conflict resolution.
And then because the data can be bigger than can fit on your device, you have to figure out not just sync, but partial sync, which is like this whole other harder part. And then you have to figure out authorization and sync. Like, with iCloud, it's your own data going back and forth, right? You have permissions to it. But in, like, a classic web app, you only have permissions to a tiny subset of the data. So you have to do partial authorized sync, right?
And then on top of that, the storage isn't reliable, because the browsers have to implement it in such a way that apps can't abuse it. That means, as an application developer, that it can disappear at any moment, which is also different than a native platform. So I not only have to build a much more complicated syncing mechanism, I also have to make that syncing mechanism robust to the fact that the storage underneath it can just disappear.
I think there are, like, legitimate technical challenges, and then on top of that, I think there are sort of just natural human challenges to, like, doing something that no one has done before.
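One way to picture the point about storage disappearing: a client can treat its local store as a cache and fall back to a full sync whenever the data it expected is gone. The sketch below only illustrates that idea. `LocalState`, `SyncPlan`, `planSync`, and the `lastSyncCookie` field are hypothetical names made up for this example, not part of any library discussed in this episode.

```typescript
// Hypothetical shape of what a client keeps locally between sessions.
interface LocalState {
  // Opaque marker from the last successful sync, or null if storage was
  // never populated or has been evicted by the browser.
  lastSyncCookie: string | null;
}

type SyncPlan =
  | { kind: "full" }
  | { kind: "incremental"; since: string };

// Decide how to sync: if the local marker is missing, assume nothing
// survived and start over; otherwise ask only for newer changes.
function planSync(state: LocalState): SyncPlan {
  if (state.lastSyncCookie === null) {
    return { kind: "full" };
  }
  return { kind: "incremental", since: state.lastSyncCookie };
}
```

In a browser, an app can additionally ask for more durable storage with `navigator.storage.persist()`, which resolves to a boolean. The browser may decline, so the fallback path is still needed.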
¶ Web storage
Right. So before diving into those, let's maybe dig a little bit more into the technical challenges. So you've posed them as challenges, and over the last 10, 20 years, the web has changed significantly in terms of the things that have been standardized, et cetera. So I think a few of those challenges are always inherently challenging due to the nature of the web, which is permissionless.
But given some of the technological improvements, I'm wondering which of those challenges are much more manageable now and why, and what are those improvements that have been landing and how do they make things easier now? The main thing is browsers have storage now. Like, they didn't have storage for the longest time. I feel like my mission in life is just, like, I'm shouting from the mountaintop: Hey, did you know browsers have storage now? I feel like largely developers don't know this.
In fact, I was on a tweet thread just yesterday where someone well known in the web, in the software development community, like, very respected, was like, wait, browsers have more than 5 megabytes of storage? Like, yes, they do. On most devices browsers have, like, gigabytes of storage. You know, the actual quota is complicated. It's dependent on how much free space you have on your device and what browser you're using and, like, how many other apps are using storage and blah, blah, blah.
But I mean, you can assume as a web developer that you have access to like hundreds of megabytes of storage locally on the device, which in almost all cases now is SSD. It's like almost memory. You know, we have a local cache, persistent cache that can store hundreds of megabytes of data. And if you're not using this as a web developer, you are leaving a massive amount of performance on the table.
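To make the quota point concrete, browsers expose `navigator.storage.estimate()`, which resolves to an object with `usage` and `quota` in bytes. The sketch below is a minimal illustration of reading it. The helper names (`summarizeEstimate`, `readQuota`) and the result shape are assumptions invented for this example, and the numbers any given browser reports will vary.

```typescript
// Minimal shape of the object returned by navigator.storage.estimate().
interface StorageEstimateLike {
  usage?: number; // bytes currently used by this origin
  quota?: number; // bytes the browser is currently willing to grant
}

interface QuotaSummary {
  usedMB: number;
  quotaMB: number;
  freeMB: number;
}

const BYTES_PER_MB = 1024 * 1024;

// Turn a raw estimate into megabytes. Missing fields are treated as 0.
function summarizeEstimate(estimate: StorageEstimateLike): QuotaSummary {
  const used = estimate.usage ?? 0;
  const quota = estimate.quota ?? 0;
  return {
    usedMB: used / BYTES_PER_MB,
    quotaMB: quota / BYTES_PER_MB,
    freeMB: Math.max(0, quota - used) / BYTES_PER_MB,
  };
}

// Anything with an estimate() method works here; in a browser that is
// navigator.storage.
async function readQuota(source: {
  estimate(): Promise<StorageEstimateLike>;
}): Promise<QuotaSummary> {
  return summarizeEstimate(await source.estimate());
}
```

In a browser console, `readQuota(navigator.storage).then(console.log)` would print the summary for the current origin.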
We have a whole community of, like, web developers who are constantly talking about how much performance matters to them and how performance is the most important thing, and they have this massive cache sitting on the device that they're not using. So we have the primitive: you can store data in browsers. But we now need to develop the patterns and libraries and techniques and mindshare for people to know that they can use this and how to use it. I fully agree.
This is exactly where I wanted to go to. We have the primitives now; we had some cruder primitives in the past. But I think those primitives are getting a lot better now. They work more consistently between browsers. They're getting more performant. Some restrictions are no longer there. But as you say, now it's a matter of building the layers on top, building the libraries, building good tooling, building things like browser dev tools that work with us. And so I think that's one major part.
And then the other major part is getting people, like you say, like, you go onto the mountaintop and shout it down. This is what needs to happen as well, besides building great tools, since otherwise people kind of stick with the old way of doing things. But maybe, in the context of storage on the web, you want to draw a quick bridge to what you're working on.
¶ Roci Corp, Replicache and Reflect
Yeah, sure. So I'm the founder of this company, Roci Corp. It's basically a collection of my friends from Google that worked on Chrome, mostly with me, and we loved the work that we did there and, like, the quality of work that we built, and we wanted to keep doing it on our own. So we formed this small company.
We're fully distributed and we have two products around multiplayer and local first. Replicache is client side only, so it's a library that you include in your app, and you can think of it sort of as like a wrapper around local storage. We actually use IndexedDB, not SQLite, and I think IndexedDB is slept on. I think, like, it has a lot of advantages and we're probably going to continue to use it for a while. But it doesn't matter.
There's multiple storage mechanisms now in browsers and they have trade-offs. So anyways, we have Replicache, which is client side only, and it's a wrapper around local storage that has a synchronization protocol built into it. A robust, high quality synchronization protocol that can do partial sync, that can do authorized sync, that can store hundreds of megabytes of data locally, that is very performant, that can do 60 frames per second, like, responsiveness client side.
And, you know, people build apps out of this. You know, like, one of our biggest proponents, like, Dax, is online today talking about this crazy app that he has built that's competing with Vercel using Replicache. But, you know, implementing the backend for a synchronization system is also challenging. And so people asked us for a hosted solution for this. And so we also have Reflect, which is the complete package that includes the service that syncs.
And it's basically the same thing as Replicache, but with the backend as well. Got it. So can you explain a little bit more the cases when I would be using Replicache as opposed to when I'd be reaching for Reflect? Yeah, so Replicache is, like, when you want control of everything. Like, as much as you can have. Basically Replicache can connect to not any backend stack, but many backend stacks.
You can implement a Replicache connector for basically any relational database, for most of the document databases, you know, you can even implement a backend for, like, your custom distributed system. So it's more effort, but it's more flexible and adaptable, right? Reflect is very opinionated. It's a complete hosted service that's tightly integrated and it's designed to be very fast.
It's for, like, when you're starting something new, and you don't have a lot of time, and you just want it to work.
¶ Tabs are challenging
So I'd be curious to learn a little bit more about the challenges that you were facing building Replicache and building Reflect as it relates to the challenges you've mentioned before in regards to storage and other challenges we might not have talked about yet when it comes to building such a technology for the web. Yeah. there's a lot of little things that you discover when you set out to do something new like this.
And a lot of times, you know, in software engineering You know, a lot of the work comes from like unexpected places, you know, it's not the algorithm, it's not the core algorithms or the data structures or whatever. It's like dealing with the practicalities. Like one thing that is really a quirk of the web that is another one of the web superpowers, but really makes this challenging is tabs.
Like, the web has tabs, and tabs, from, like, a software engineering perspective, are weird because they are different execution environments, right? It's as if you have an application, but it has many different, like, places where code can run. It's as if you had a bunch of different processes.
Tabs aren't really different processes, but, like, from a software engineering perspective, they're kind of like processes because they're different execution environments that can have different code in them, right? Like, you could have different versions of your app with different code bases running in each tab. Like, if the server updated and one tab updated, but the other didn't, you know? But they share storage. So that is a weird thing.
Usually when platforms have a situation like that, where you have processes or threads, they also have locks so that you can, like, protect the storage and coordinate access to it. But the web kind of has, like, a fledgling Web Locks API, but it is very unused and untested. And there's a lot of edge cases in it. And we don't really trust it. Like, our company doesn't use it or trust it because we investigated a bunch of these edge cases and they're sort of, like, underspecified.
I don't believe that they're consistently implemented across browsers. And we've even had people, like, on browser teams recommend that we don't use it. You have to figure out a way to deal with the fact that you have this persistent storage, right? And one tab can update while the other tab has not updated. And then what happens? Right? And there's always a problem with these kinds of systems of, like, schema migrations, right?
You have to figure out schema migrations, but it's harder on the web because you have schema migration on the server, you have schema migration on the client, but then you have the problem that one tab can update and, like, want to migrate the schema forward to, like, the version of the storage that it wants, but the other tab is, like, still back on the old version, you know? Like, what does it do? So these are just, like, fun practical problems.
So I would actually love to learn a little bit more about either of those. Are those challenges that you can fully take care of on behalf of an application developer? Or are you able to go some way, but still leave some of the hard trade-offs to the application developer, where there's some path with bad or even worse trade-offs that they ultimately have to choose between? So how were you able to work around those?
My view is that in order for storage to be widely used on the web, it has to be as easy as, or easier than, building a normal web app. It has to be easier than building today's web app, right? And I think that we can get there, and the people who need to fill that gap, who need to make it easier than today's web apps, are library developers. Like, it's not gonna be browser vendors because they move way too slow.
Like, it's gonna be library developers because we have this awesome ecosystem of people furiously trying different things to figure this out. So we have completely solved the cross tab thing, and our solution to it, I'm really proud of it. Like, we put a lot of work into it.
It's for this, like, tiny moment when your app is updating. We put this massive amount of effort into this, like, 10 seconds at worst when your app is updating and, like, two tabs have different versions, but we made it really, like, elegant. And, like, you don't have to think about it. Like, basically what happens is, as a developer, in the constructor for Replicache, you specify the version of the database that you want. And that's an identifier that you choose.
So you say like, I want version seven and you can create as many different schema versions as you want in Replicache, and they'll each be isolated from each other, fully isolated. And when we talk to the server, we send in the request the version of the schema that we're asking on behalf of. So the server has to respond with that version or say, I can't serve that version, you need to reload.
But then internally to deal with that moment when tabs are on different schema versions, we actually fork the database and we have both running at the same time. So typically in Replicache, the storage is shared, right? So you make changes in one tab and you see them instantly in the other tab whether you're online or not. But for this moment, when a schema update is happening, they fork. So the tabs can continue independently and they can continue working.
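The isolation idea described here can be sketched very roughly as follows: if the schema version is part of the name of the local store, tabs on the same version share data and tabs on different versions never touch each other's data. This is only an illustration under that assumption, not Replicache's actual implementation. The in-memory `stores` map stands in for browser storage, and `storeNameFor` and `openStore` are hypothetical names.

```typescript
// In-memory stand-in for browser storage: one key-value store per name.
// A real app would back this with IndexedDB or a similar mechanism.
const stores = new Map<string, Map<string, string>>();

// Hypothetical naming rule: the schema version is part of the store name,
// so two versions never read or write the same data.
function storeNameFor(app: string, schemaVersion: string): string {
  return `${app}/schema-${schemaVersion}`;
}

// Open (or lazily create) the store for a given app and schema version.
function openStore(app: string, schemaVersion: string): Map<string, string> {
  const name = storeNameFor(app, schemaVersion);
  let store = stores.get(name);
  if (store === undefined) {
    store = new Map<string, string>();
    stores.set(name, store);
  }
  return store;
}
```

Two callers that open `("todo", "7")` get the same store, while a caller that opens `("todo", "8")` gets a separate, initially empty one. How data gets into the newer store (copying, re-syncing from the server, or something else) is left out of this sketch.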
You know, a concern is like, if you're typing in a, in an input, you don't want to just reload the app at that moment to get the new code, right? The user could lose their work or just be frustrated, right? So you want, you need to allow the app to continue running for a little while until it thinks that it's the right time to update. That sounds incredible. And that sounds like an absolute nightmare.
Like, as an application developer, it's hard enough to ship the app version that I have, and then to even really think about the fact that there is this point in time where a user has the old version up and running, and to then just throw multiple tabs into the mix here as well. So I think whatever pain you can take away from me as an application developer here is amazing.
So yeah, you've been mentioning the various app schema versions, and that this, at least locally speaking, forks the database. And I suppose then there's some mechanism by which the forks are eventually joined or merged again? Or if the different tabs are still being used, how do those forks come together again? Well, what happens is they're both still talking to the same server, right? So they're local storage forks, but they're still talking to a shared truth on the server.
And the server and when tab a makes a request to the server, it says like, hi, I, I want to talk and I'm talking schema version seven and when tab B sends a request to the server, he says, I'm talking version eight and the server can choose to speak both versions if it wants, or it can choose to tell seven, I can't talk to you. I only know version eight but that doesn't mean that tab seven has to reload at that moment. Tab seven can continue working without a server connection, right?
Because it's local first. And then it can decide to reload when it's ready. Let's say we have tab A and tab B and they're happily working in the same version right now. Is there some communication mechanism between those two tabs that doesn't rely on the server alone? Yeah, under normal circumstances, the storage is shared.
If you're totally offline with Replicache or Reflect, and you make changes in tab A, you will see them in tab B. They share storage, right? It's just at this moment when an upgrade happens, they fork momentarily, just to maintain the integrity of the system, so that you don't have the schema changing underneath one of the tabs. And, you know, that is a little bit of a rough edge in the abstraction. You know?
But, under normal circumstances, you're never going to notice this as a developer. And it's the cleanest solution that we could come up with. I agree. It's not like you're rolling out a new release constantly. For the majority of the time the app is being used, you're not upgrading. And so for that little period of time, to have a simpler solution to this very gnarly problem sounds like a very good move to me.
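One way to picture the fork is that local storage is keyed by schema version, so tabs on the same version share a database and a tab on a newer version gets its own. The sketch below uses an in-memory map as a stand-in for IndexedDB, and the naming scheme and `openStore` helper are assumptions for illustration, not how Replicache actually names its databases.

```typescript
// In-memory stand-in for per-origin browser storage (IndexedDB in a real app).
const databases = new Map<string, Map<string, unknown>>();

// Hypothetical naming scheme: one database per (app name, schema version).
function openStore(appName: string, schemaVersion: string): Map<string, unknown> {
  const dbName = `${appName}:schema-${schemaVersion}`;
  let db = databases.get(dbName);
  if (!db) {
    db = new Map<string, unknown>();
    databases.set(dbName, db);
  }
  return db;
}

// Two tabs on schema version 7 share storage; a tab on version 8 is forked.
const tabA = openStore("todo", "7");
const tabB = openStore("todo", "7");
const tabC = openStore("todo", "8");

tabA.set("todo/1", { title: "Buy milk" });
// tabB now sees "todo/1"; tabC does not, until it syncs with the server.
```

In this picture the forks never merge locally. Each one reconciles with the server independently, which matches the "shared truth on the server" described above.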
So in the case where it's not currently upgrading, you mentioned that those two different tabs share storage. Does that mean they share the same IndexedDB database, or is there even more sharing, such as sharing an array buffer between those tabs, or what is the communication mechanism between those? Are you listening to changes in IndexedDB, is there something like a BroadcastChannel between those? How does the tab interplay work? I'm so glad you're asking these questions.
Like, I don't know if our listeners will care about these details. I hope they do. Because this is like, the fun part, you know? But yeah, this gets to another one of the practical challenges that I am proud of in our implementation, that we put a lot of work into. It was very important to us, and I know it's important to you too,
and to a lot of other people in this space, that you have 60 frames per second interactivity. We want people to be able to use Replicache as if it's memory, you know, as if it's your state model. One of the benefits that should come out of adopting these tools is that you don't need complex state management libraries. You have this database locally. You should be able to use that as your state. But when
you look at that, again, from an engineering perspective, it's actually not so easy, because, you know, local storage on SSD is fast, sure. SQLite is fast. IDB is not fast, but you can wrap it and do things to it to make it fast. But it's nowhere near as fast as memory. It's not 60 frames per second. It's an order of magnitude slower, right? So how do you bridge that gap, right?
And if you want to have cross tab consistency, if part of the product design is that you make changes in tab A and they reflect in tab B, then you cannot use storage alone as your communication mechanism because it's too slow, right?
So you have to have memory inside the tabs at some level. You have to have memory that has the state in the tabs, because that's the only way it can be fast enough, and then you again have a synchronization problem, like a distributed system problem between the tabs, right? They're changing independently. So how do you resolve that?
I think there are different legitimate ways to address this, but the way that we do it is that Replicache runs in memory, in the JavaScript thread, in the main thread, right? So it's right there next to your app. It's crazy fast and it lazily loads and stores to IDB.
And we have basically a synchronization protocol between the tabs, which you can kind of think of roughly as: two Replicache browsers sync via the server, and two Replicache tabs sync via IDB. That's the basic structure. And yes, we need to have a way to know when something has changed across tabs. And for that, we use BroadcastChannel. But, you know, it's the same idea: if you're familiar at all with Replicache, the server doesn't send data proactively to the client.
The server only sends a poke, like a tap on the shoulder that something has changed, and the client requests the changes. And we kind of do the same thing cross tab. We use BroadcastChannel to tell the other tab, hey, IDB has changed. And then that tab is running as fast as it can in memory, and it has a background process to refresh itself from storage periodically. I'm very curious to learn more about that background process as well.
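The poke-then-pull pattern between tabs can be sketched roughly as follows. Everything here is a simplified stand-in: `PokeChannel` plays the role of BroadcastChannel, `SharedStorage` plays the role of the shared IndexedDB database, and `Tab` is a hypothetical class, not Replicache's implementation. There is no conflict handling; the sketch only shows that the poke carries no data and that reads never touch storage.

```typescript
type Listener = () => void;

// Stand-in for BroadcastChannel: delivers a payload-free poke to the other tabs.
class PokeChannel {
  private listeners = new Set<Listener>();
  subscribe(listener: Listener): void {
    this.listeners.add(listener);
  }
  poke(sender: Listener): void {
    for (const l of this.listeners) if (l !== sender) l();
  }
}

// Stand-in for the shared IndexedDB database.
class SharedStorage {
  data = new Map<string, string>();
}

class Tab {
  private memory = new Map<string, string>(); // what the UI reads, at memory speed
  private stale = false;
  private storage: SharedStorage;
  private channel: PokeChannel;
  private onPoke: Listener = () => {
    this.stale = true; // just a tap on the shoulder, no data attached
  };

  constructor(storage: SharedStorage, channel: PokeChannel) {
    this.storage = storage;
    this.channel = channel;
    channel.subscribe(this.onPoke);
  }

  get(key: string): string | undefined {
    return this.memory.get(key); // never touches storage
  }

  // Write to memory first, persist, then poke the other tabs.
  set(key: string, value: string): void {
    this.memory.set(key, value);
    this.storage.data.set(key, value);
    this.channel.poke(this.onPoke);
  }

  // Periodic background task: reload from storage only if a poke arrived.
  refresh(): void {
    if (!this.stale) return;
    for (const [k, v] of this.storage.data) this.memory.set(k, v);
    this.stale = false;
  }
}
```

A second tab only sees a change after its next `refresh()`, which is the trade-off being described: reads stay at memory speed, and cross-tab consistency arrives slightly later.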
Are you leveraging workers at all in your implementation here, or is this mostly running in the main thread? And given that you have multiple tabs, so multiple main threads, is most of Replicache running in the main threads? We don't currently use workers as part of the implementation of Replicache at all.
You know, we want people to be able to use Replicache as if it's memory. It should be as fast as memory, and the only way to do that is to be memory. You could have something running in another thread, in another worker, but then you're still going to have to have state in memory, right? So what we landed on was, we have Replicache as an in-memory, main thread thing.
It has a background process that syncs with IDB, but that is just a periodic task that's running on the main thread. You can easily run Replicache in a worker, and many people do. People do this to do full text search. They run Replicache in another tab and do indexing in that, or they run it in a worker, do indexing in that worker, and then access that index from the main thread.
And because of the cross tab communication that we have, it works fine across workers too, workers and the main thread. But where to put workers in your stack is an application developer question, not a Replicache question. I think it's very possible that we would add workers
to various parts of the implementation of Replicache, as an implementation detail. You know, there are some background tasks that we need to do, like cleaning up things, that are heavy. And it would be useful to have those on a background thread to make sure they don't interfere with the UI. The one thing that people frequently ask, and it has come up over the development of Replicache, is whether to use service workers,
because it seems so tempting. It's this shared place that you can run code across tabs. But man, service workers are another part of the web platform that's just so hard to use. You know, it's just so gnarly, and I feel like almost nobody knows how to use them, you know? And there are so few examples of them being used successfully. And if we used service workers in Replicache, it would have an impact on how people build their apps. So we just haven't gone there.
Once you're just a little library that you use in your JavaScript app, then I think that keeps things way simpler, since I think very few JavaScript developers are even aware of the concept of a thread in the context of building their React app. A worker is a thread, but once you have a library or technology that spans the main thread and workers or service workers, then you need to conceptually deal with that. But it also becomes a tooling and a bundling problem.
So this is where I think the best technology that we have for those sorts of patterns right now would be Vite, in my opinion. So have you had success or not so much success with certain technologies? We love Vite, it's like my default. It's kind of a funny story. When I started Rocicorp, which was, like I said, now almost four years ago, I wasn't up to date with the web and the popular open source tools. And my friend was like, you gotta check out Next.js,
it's so awesome, you guys should figure out how to integrate with Next.js.
And I started working with it, and I was like, oh, this is a really cool DX, but what's the core value here? I couldn't put my finger on it. It's hosting, there's the thing where you have preview deploys, that's really cool, but I'm trying to think, why is this so exciting? And I finally realized, it's the easiest way to set up React.
Back then, that's really the core of it, you know, because setting up a React project is just so hard. And then, you know, I think Vite has taken the crown on that front now. Yeah, so my take is that if you're using Next and typically you then deploy it on Vercel, I think that's great. If there's a spectrum from website to web app, then I think
this is rather where you start on the website side of the spectrum and add more and more app-like features. But I think it becomes increasingly hard if you want to build a local first app with Next.js, as you want to introduce those capabilities, as you want to introduce really deep storage mechanisms, or once you want to work with workers. I'm sure it's on their roadmap somewhere.
But I think they've just started their journey on one side of the spectrum, which is, I think, rather towards websites. And that's great and that works really well. Server side rendering and React Server Components are great for this use case. But I think once you want to build web apps that really feel almost like a native app, I think this is where you need to reach for a different tooling stack.
And I'm currently very happy with Vite, as it has support for workers as a first class citizen. It was a bit rough over the last few years, but it has gotten a lot better with every release and I'm very happy using it. And I've even seen a few libraries also shipping workers out of the box that work quite well with Vite. An example here would be the SQLite WASM build, which also ships with some workers out of the box, which works pretty well. Yeah, we use it often and like it as well.
I don't have as much experience with workers in particular, just because we haven't taken it on as an implementation detail yet. But yeah, just overall we have a lot of success with using it for our samples. And, you know, when you're building this kind of stuff, you end up making apps all the time, right? So maybe this is a good segue to how I would use Replicache or Reflect in, let's say, my React app. So I think you've been mentioning MobX.
Is that a typical technology that you use Replicache with, or does Replicache completely replace something like MobX? MobX, Redux, Zustand, all those sorts of state technologies. Are they complementary or are they rather being replaced by Replicache? I think long term they're being replaced, but the reality is that Replicache isn't there yet.
Like these are, you know, very well developed, sophisticated tools that people have developed to do legitimately hard things. You know, you have a fairly large data set in memory and you're trying to update little bits of it reactively, that's a legitimately hard problem. And so Replicache has an API, a subscription API, that's memory fast. And I think it actually competes well with SQLite-based systems. In many cases it's faster.
But, I mean, if you're building something like what Dax is building, you know, that has a lot of data in it, like 30,000, 50,000 records, and you're trying to do 60 frames per second updates in there, and you have a lot of derived computation chains in memory, we don't have the APIs in Replicache yet that can compete with MobX or Signia from tldraw, things like that.
And the cool thing is, the design of Replicache is complementary to putting those things on top. At the bottom of the Replicache abstraction stack, you have a key value store that's reactive. So you can plug those reactive changes into Zustand or whatever, and it'll work great. It's interesting, different people in the space started at different angles. Like, I think that's something you've been passionate about from the very beginning.
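As a rough illustration of plugging a reactive key value store into a separate state library, the sketch below forwards every change from a small store into a minimal get/set/subscribe store. Both `ReactiveKV` and `createStore` are hypothetical stand-ins written for this example; neither is Replicache's API nor Zustand's.

```typescript
type Unsubscribe = () => void;

// Hypothetical reactive key-value store; NOT Replicache's actual API.
class ReactiveKV<V> {
  private data = new Map<string, V>();
  private watchers = new Set<(key: string, value: V) => void>();

  set(key: string, value: V): void {
    this.data.set(key, value);
    for (const w of this.watchers) w(key, value);
  }

  watch(fn: (key: string, value: V) => void): Unsubscribe {
    this.watchers.add(fn);
    return () => {
      this.watchers.delete(fn);
    };
  }
}

// Minimal external store with the get/set/subscribe shape many state libraries share.
function createStore<S>(initial: S) {
  let state = initial;
  const subs = new Set<(s: S) => void>();
  return {
    getState: () => state,
    setState: (next: S) => {
      state = next;
      subs.forEach((s) => s(state));
    },
    subscribe: (fn: (s: S) => void) => {
      subs.add(fn);
      return () => subs.delete(fn);
    },
  };
}

// Bridge: every change in the key-value store is pushed into the external store.
const kv = new ReactiveKV<number>();
const store = createStore<Record<string, number>>({});
const stop = kv.watch((key, value) => {
  store.setState({ ...store.getState(), [key]: value });
});

kv.set("count", 1); // store.getState() now contains { count: 1 }
```

Calling `stop()` detaches the bridge, after which the external store keeps its last state.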
There's so many exciting things happening in local first. Like, other people have started focusing there. We started, like, a lot more on making the synchronization correct and robust. And making partial sync work, authorized sync work. You know, making the mass storage, like, really fast. And we still have to like finish up the libraries legitimately, like the API layer to make it really nice.
I think that it's going to be competitive with using those types of tools, like, you know, next quarter. Yeah. I mean, Replicache has been, I think, one of the first solutions really on the local first market in that way.
And I think you have been quite ahead there in terms of the work on syncing and just having a fully fledged thing out there that developers can use to build on top of. And that shows: I think most of the local first apps that have been built over the past one or two years use your technology, and I think that's already driving some of the change that I want to see for our apps.
So are there some apps that you're particularly excited about, where you say, okay, this is exactly what I wanted to help create more of? Well, I mean, right now the one that I'm most excited about probably is sst.dev, Dax's thing that I've mentioned a few times. Just because, I don't know, it's an example of a data intensive application that is public, that you can try.
A lot of our customers are, you know, private things that are not easy for people to access outside. And yeah, we have a lot of
customers using Replicache for things that are in the building industry or service industry. Like, we have a customer that is building a tool that people who build houses would use, and, you know, they go out in the field and there's intermittent connectivity, and actually building a house is a super data intensive thing. You know, there are thousands of elements on the checklist for a house, you know?
And lots of people have to come through and look at it, and then there are back office things that happen, so it's a perfect use case, but it's not something that you could go try and use right now because, you know, it's a private system. So yeah, I think sst.dev is the best example right now that I'm excited about.
I'm equally excited about the things that I can use myself, as well as the anecdotes I'm hearing about other technologies being created for other people. So I think this is what I'm particularly excited about with local first, that it makes it easier to build technologies for use cases that were just not that viable before with the tools that we had.
So what you've been sharing about the construction use case here, and you've also been privately sharing a few other use cases with me, those sound incredible. And this is exactly why I'm excited that local first opens a whole new area of the web.
¶ Replicache and local-first
So if I'm looking at replicache.dev or Reflect.net, on replicache.dev, for example, it says the way to local first. So I'm curious what local first means for you. I think there's a whole bunch of terms flying around, whether it's offline first, local first. So can you share a little bit more about how Replicache thinks about local first? Yeah. Yeah. So obviously there's a little bit of controversy around naming and what local first means.
And I think this happens every single time there's a new catchphrase in tech. Or really even in anything, like even in music or other domains, people get worked up about what qualifies as what term. PVH. I don't know what the H stands for. Harden, Hardenberg? Yeah, Hardenberg. Oh, yeah, he lived down the street from me in San Francisco and we met up in a coffee shop all the time and talked about Local First and CRDTs and things like that. And he coined Local First.
I think it was him, or maybe someone on the team, maybe it's wrong to attribute it to Peter solely, but anyways they list, like, seven ideals for Local First software. I think it's seven. And Replicache, like, does not meet all of those ideals. Right? Like, in particular, there's the long now, I think, which is, you know, that you should be able to keep using your client side software
if the service it depends on disappears. Replicache doesn't really do that because it's a client server system. And I think there are some others that kind of point to P2P, like in order to implement one of the seven, you would kind of have to be P2P. And Replicache is a client server system. It's designed for the classic web services that are 99.99 percent of the things you use every day.
So, when we built Replicache, we specifically called it Offline First for the longest time. And we avoided calling it Local First, you know, out of deference to that team, because they coined the term. But the thing that happened is, people kept calling us Local First.
Like, the users kept calling us local first, you know, and at some point we were just like, forget this. And also other companies started calling themselves local first that were the same design as us. And the market just seemed to consolidate around this term. And I think it makes sense why it happened, because it describes what people think of as this technology, right? Local first: you access the local storage first, and then you fall back to the network.
That is what Replicache does. That is what ElectricSQL does, that is what all of these systems do, and so it's a correct descriptive name, and I think that's why people don't understand the distinction, and so we just ended up giving in and deciding that we're local first too. That makes a lot of sense. Local first is like offline first, but we have an additional capability.
It's not just an app that only works offline or can then also work with a server, but it's fundamentally also giving you collaboration. And I think it's more of a spectrum. Yeah, you've been mentioning the seven ideals. And if some technologies can give a foundation that adheres to all of those seven ideals, great, but fundamentally the tools we're building and the tools we're using help us to get from A to B faster, and it always depends.
So I think Replicache is striking some very reasonable and attractive trade offs. And if you don't have that client server architecture, if you don't have that server, then you are also left alone with some very hard problems that you don't really need for many applications. So I'm very excited about how you thought about those trade offs. And I think local first is a big umbrella and I'm excited that Replicache and Reflect are a part of that. Oh, thanks, man. That's nice to hear.
Peter invited me to like the local first party in St. Louis, like at strange loop last year or this year. And so I was like, Oh I'm in the club. I got invited to the local first party. I think this is definitely a goal.
Let's bring as many people in here, particularly like people, like with your great technology background, I think you've been rooting for those ideals, like way longer than most people have started to really think so crisply about those, so before we are wrapping up, do you have anything else that you want to mention? I mean, I'm going to plug Replicache and Reflect, but before I do that I think that we're just, it's a really exciting time.
Like, I think that we have been working on these technologies now for so long, some of us, you know, and there have been so many things to solve. But it really feels like it's turning a corner, and I think that more and more people in 2024 are going to be thinking, you know, I think it's time to build something local first, or at least play with this. And there are just a lot of really great options out there right now. And it seems like it's growing every day.
So yeah, if you're thinking about building something local first, check out replicache.dev. That's the client side only project that we have. And if you just want to get up and running quick and not do all the setup yourself, check out Reflect.net. It's a fully managed and hosted service. That sounds great. I might just play around with that over the holidays myself. So yeah, Aaron, thank you so much for taking the time. We have quite a time zone difference.
I'm here like in Berlin where it's already quite late and you're in beautiful Hawaii with the very background. So thank you so much for taking the time and sharing everything. All right. Yeah. I'm really looking forward to hearing the rest of these too. I'm sure you'll get some really interesting people on here. Perfect. Thank you for listening to the Local First FM podcast.
If you've enjoyed this episode and haven't done so already, please subscribe and leave a review wherever you're listening. Please also consider telling your friends about it, if you think they could be interested in Local First. Thank you again to Expo and Crab Nebula for supporting this podcast. See you next time.
