¶ Intro
There's database sync engines and then there's document sync engines. For database sync engines, I think of things like Linear, where you have some relational model data and you probably don't want the client to have all of it. You have the client storing some subset of a database for each account. Maybe you're sharing this data across multiple people in that account. On the document sync side, you're sending all of the data down to the client.
The unit of data that gets synced is in-memory size on the browser. You're not dealing with like a terabyte of data here. You're not taking a subset of it. You're synchronizing the entire document. This would be things like Figma or Google Docs, where there's a full local copy of some self-standing piece of data. Welcome to the localfirst.fm podcast. I'm your host, Johannes Schickling, and I'm a web developer, a startup founder, and love the craft of software engineering.
For the past few years, I've been on a journey to build a modern, high quality music app using web technologies, and in doing so, I've been falling down the rabbit hole of local-first software. This podcast is your invitation to join me on that journey. In this episode, I'm speaking to Paul Butler, founder of Jamsocket, and creator of the Y-Sweet Project.
In this conversation, we talk about building versus buying a sync engine and explore the various projects behind Jamsocket, including Plane, Y-Sweet, and ForeverVM. Before getting started, also a big thank you to ElectricSQL and Jazz for supporting this podcast. And now my interview with Paul. Hey, Paul, so nice to have you on the podcast. How are you doing? I'm good. Thank you, Johannes. I'm excited to be here, and I've been listening since the beginning. Thank you so much.
The two of us had the pleasure to meet in person at last year's Local-First Conf, and I'm hoping to see you there again this year. So for those in the audience who don't know who you are, would you mind introducing yourself? Sure. I'm Paul Butler. I'm a co-founder of a company called Jamsocket. The one-line pitch is that it's like Lambda, but for WebSockets. Yeah. I've been looking a little bit into Jamsocket and it looks really fascinating.
And I also want to hear more about the origin story, where it's coming from, since by now we've got more and more infrastructural options. Obviously there's Cloudflare with their primitives, Cloudflare Workers, et cetera. And I think with Jamsocket, you provide a really powerful alternative for high-scale applications.
So yeah, before we go more in depth on what Jamsocket is and what it offers, would you mind sharing a bit more of the origin story, how you ended up working on it?
¶ Jamsocket Origin Story
Sure. Yeah. So I started the company with my co-founder about three years ago. Prior to that, I was working in finance and I was building a lot of internal tools for myself and my team that
were dealing with midsize amounts of data, so talking about single-digit, double-digit gigabytes of data, not anything that was out of the realm of putting in RAM for a desktop application. But I realized that as soon as people wanted these things to be delivered through the browser, there was really nowhere to put that data. I couldn't really load that over the Internet into the browser, Chrome would just give up, and it didn't really make sense to load that
into a Flask server or something like that, because the web stack is built for these servers to not consume a lot of memory for each user of the application. So I really wanted this sort of neutral location. I almost think of it as a way for a browser-based application to spin up a server-side subprocess that just belongs to that browser tab. And when you close that browser tab, that server-side process also goes away.
So that was essentially the origin story of Jamsocket. That makes a lot of sense, but could you motivate a little bit more what kind of application I should imagine there? I've never worked in finance, and I think the average web developer has a list of like 50 Airbnb items that they want to render. And then there's pagination, and all of that easily fits in a JSON array that you can fetch in a single CRUD REST API call.
But when you say single-digit, double-digit gigabytes of data, what sort of data are we dealing with here? And if you wanted to transfer it over the wire into the browser, what would that even look like? Would that be one big JSON blob? Maybe you can motivate that a bit more. Yeah, so one of the motivating examples at the time was, we would run a simulation, a backtest simulation.
So we have some model that we hypothesize is predictive of stock returns. We run it back in time and generate a bunch of data, which could be, say, on every five-minute increment or even more fine-grained than that, over petabytes and petabytes of past market data. And then we get back some gigantic time series data, a number of time series. Maybe you have profit and loss over time and a record of all the trades and everything like that.
So we have large, multi-gigabyte time series data, likely in something like Parquet. That tends to be the best format for that type of thing. And so over the wire, ideally it would be Parquet or Arrow. Got it. And over that sort of data, based on the user's input through the UI, you want to drive some sort of queries, some sort of accumulations, to make sense of what the data is trying to tell us.
Yeah. It was things like, maybe I want to be able to drill down in the data in the client, to go from this high-level overview of data to looking at specific trades, specific stocks, things like that. Got it.
And your insight and way to deal with that fundamental problem: ideally you could just move that big data blob over in front of the browser that you're looking at and then happily query and compute away, but that wasn't feasible because Chrome or another browser has certain limits. And the way you wanted to cope with that is to say, okay, we're going to have a little companion.
Each browser session has a little companion on some beefy server, which holds all of that data in memory. And then there's some sort of real-time wire protocol that helps you do that, still so fast that it comes close to being local. Yeah, exactly that. I think of it as somewhere along a spectrum. On one end there are thin-client
setups where the server does everything and the client's really just a dumb display, all the way to a full-fledged browser-based app where everything that can happen in the browser happens in the browser. I think there's some middle ground where you get next-frame latency on almost everything, but maybe in the background it needs to query some server data and load that in, and maybe it can even approximate client-side
what that next frame will look like, while it does that in the background as well. Got it. So as you faced those problems, you got so interested in them that you started to dedicate your next chapter in life to that. And that led to you building a technology called Plane, which then became the foundation for Jamsocket. So can you explain a bit more what Plane does and how it connects to what Jamsocket is?
¶ Plane
Yeah, and just to continue the story: my co-founder, Taylor, was at Datadog and had faced some similar problems. So we got together in 2022, and the first thing we started working on was what became Plane, which is open source. I think of it as the way that we spin up those processes. It's the orchestration plane, essentially, for that type of application.
What it's responsible for is this: you give it a pool of computers, you tell it you want to start a specific process, and it will find where on that pool of computers to start it. But it will also give that process a secure, web-accessible URL. It gives it a hostname and a kind of password, essentially. Then anything on the public Internet that can access the web can use that URL to send and receive messages from that process.
And as long as there's at least one open connection to that process, Plane will keep it alive. And then as soon as there are no more connections, Plane will start a countdown timer. And if nothing reconnects, it'll shut that process off. Got it. So in terms of use cases, you've clearly motivated that original use case that you had while working at a financial company.
Are those also the kind of use cases that you mostly see when talking to people who are interested in Jamsocket, or is there a wider set of applications that Jamsocket is trying to serve? Yeah, largely not, actually. We haven't seen that many use cases of wanting to just modify massive data sets or deal with massive data sets in the browser.
But one of the things we quickly realized was that the infrastructure we were building had a lot of parallels to how Figma did things, how Google Docs did things, how a lot of these collaborative applications did things. And so we decided to lean into the sync engine hosting side of things. Got it. And when I'm looking at your website, among a few other companies, it looks like Rayon is also built on top of Jamsocket.
I happened to see their launch a while back, I think. If I recall correctly, it was a really interesting Figma-esque application, I think for architects. Maybe you can share a little bit more about their specific scenario, how they're employing Jamsocket to build their collaborative experience.
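The keep-alive rule Paul describes for Plane in this section (a process stays up while at least one connection is open, then a countdown starts, and a reconnect cancels it) can be sketched roughly as follows. This is only an illustrative model, not Plane's implementation: the `BackendLifecycle` class, its method names, and the `idle_timeout_s` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendLifecycle:
    """Tracks open connections for one backend process (hypothetical model)."""

    idle_timeout_s: float                 # assumed grace period; not from the episode
    open_connections: int = 0
    idle_since: Optional[float] = None    # time at which the last connection closed
    running: bool = True

    def connect(self) -> None:
        if not self.running:
            raise RuntimeError("backend already shut down")
        self.open_connections += 1
        self.idle_since = None            # a reconnect cancels the countdown

    def disconnect(self, now: float) -> None:
        if self.open_connections == 0:
            raise RuntimeError("no open connections to close")
        self.open_connections -= 1
        if self.open_connections == 0:
            self.idle_since = now         # last connection gone: start the countdown

    def tick(self, now: float) -> bool:
        """Shut down if the countdown has expired; return whether still running."""
        if self.running and self.idle_since is not None:
            if now - self.idle_since >= self.idle_timeout_s:
                self.running = False
        return self.running
```

Time is passed in explicitly so the rule can be exercised without real timers. A real orchestrator would presumably drive `tick` from a scheduler and would also need to handle a process that nobody ever connects to, which this sketch leaves out.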
¶ Rayon
Yeah, Rayon's one of my favorite use cases, because we've really grown with them. They've been using us since we started the company, essentially. They were one of the first users on the platform and we've seen them grow as they launched and everything. Essentially, the way that they're using us is that we are the data backend for these documents while they're open. So I open a document, you open a document, and they'll start a server on Jamsocket for that document. And as I make edits,
they get pushed up to that server and sent back down to you. And that backend is also what's storing the data on S3. So even in single-player mode, that Jamsocket server still sits between their end user and the source of truth, the durable data source. Okay. So you're mentioning S3 and a durable data source. Maybe we can take a step back: if someone wants to build their own little version of something like Jamsocket or Plane, what would that look like?
So there seems to be something pretty beefy in the middle that holds the necessary data in memory. And as the name "in memory" suggests, that's pretty volatile. So if someone stumbles over a power cord, that data might be gone. And that's also why it needs to be stored somewhere more durable, something like S3. So can you walk us through the rough architecture? And what were the insights and deliberate trade-offs that went into it?
Yeah, I mean, a common pattern that I see people use with Jamsocket is that the source of truth for application data will shift as the application is used. So at rest, the source of truth of the application data is in durable storage somewhere, usually S3 or something like that, where you might not want to write to it 60 times a second, but you want it to persist. When that document is open,
the source of truth of that document is effectively in memory on Jamsocket. And the nice thing about that is it's memory. You can write to it very frequently. You can write to it 100 times a second if you want to, or more than that. That can then be synced down to all of the connected clients. And then, in some sort of loop, or you could have some sort of write-ahead log, as changes are made to that document, you are durably persisting them.
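As a rough sketch of the pattern described so far, the snippet below keeps an open document in memory, accepts edits as fast as they arrive, and only writes to durable storage on an interval. It assumes a whole-document snapshot on each flush, which is just one of the options; the `OpenDocument` class, the `persist` callback, and `flush_interval_s` are hypothetical names, and a real backend would write to object storage rather than call a Python function.

```python
import json
from typing import Callable, Dict


class OpenDocument:
    """In-memory source of truth for one open document (illustrative only)."""

    def __init__(self, initial: Dict[str, object],
                 persist: Callable[[str], None],
                 flush_interval_s: float) -> None:
        self.state = dict(initial)          # loaded from durable storage on open
        self._persist = persist             # stand-in for a write to object storage
        self._flush_interval_s = flush_interval_s
        self._last_flush = 0.0
        self._dirty = False

    def apply_edit(self, key: str, value: object) -> None:
        # Edits only touch memory, so they can arrive many times per second.
        self.state[key] = value
        self._dirty = True

    def maybe_flush(self, now: float) -> bool:
        """Write a snapshot if something changed and the interval has elapsed."""
        if self._dirty and now - self._last_flush >= self._flush_interval_s:
            self._persist(json.dumps(self.state, sort_keys=True))
            self._last_flush = now
            self._dirty = False
            return True
        return False
```

The flush interval is the knob that trades write volume against how much recent work a crash could lose.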
Some people really care about that being really low latency. I think in general, unless it's really a bad thing for users to lose like 5 seconds of data, just batching everything up and writing just the edits every five seconds, something like that, is pretty reasonable. Or what a lot of people do is they just say 60 seconds is fine:
I'm just going to write the entire document over what existed there before every 60 seconds, because a server just failing out of the blue is actually pretty rare these days. Got it. So if we compare that to a technology like Cloudflare Durable Objects with Cloudflare Workers, that's a particularly distinct programming model where it gives you the best of both worlds, in that you only pay for the CPU cycles where you actually want the CPU to do things.
And otherwise it can hibernate while still keeping a WebSocket connection alive, for example, or keep some memory alive or rehydrate it from some persistent storage. Is that a useful parallel for thinking about the programming model? And also, can I implement any sort of free-form WebSocket messages or request handlers, or is there a more pre-specified API, something like Redis, for how I interact with data from a client to the server and vice versa? Yeah, good question.
I agree, I think Durable Objects is probably the closest parallel product out there right now. When we started this up, Durable Objects wasn't really a big thing. It may have existed, but it had a lot of limitations. I think we came at things from a very different angle, but landed in a similar architectural space. In terms of the servers, though, we really just host anything that's HTTP.
So when I talk about it as being for WebSocket servers, I think we came at it from the angle of wanting this to be the right model for hosting WebSocket servers. And we sit on the connection, so we work well with WebSockets, where there's a long-lived connection, because then we know not to terminate the server. With HTTP requests, we have to rely a little bit more on heuristics. But if we've got that WebSocket connection open,
really, it could be anything: it could be Socket.IO, it could be your own WebSocket protocol. We essentially just take a container from our customers that will serve HTTP on port 8080, and we expose that to the outside web through a proxy that we wrote. Got it. So in the specific case of Rayon, did they build their own sync engine from scratch? Did they leverage any specific off-the-shelf technology, something like Yjs?
Given that Jamsocket advertises itself as the platform you build your own sync engine on top of, maybe you can walk us through this example and how I should think about that. Yeah. They're one of a number of customers who have built their own sync engine on top of Jamsocket. There's not an SDK that you need to adopt or anything like that on the server side. You're just writing a web server. But one of the things that's specific about this model is that
you are guaranteed by the infrastructure that at most one server is running per document, or however you want to partition your space of things. In their case, it's per document. So you get that guarantee from the system, and then it becomes much easier to implement your own sync engine. But we, at least at the Jamsocket level, are not opinionated about how you actually go about implementing that. But then you mentioned Y-Sweet. Yeah.
So Rayon does not use Y-Sweet, but some of our customers use Y-Sweet, which is a Yjs backend that we wrote and that we provide. That's a much more opinionated path if they want to take that. Got it. Yeah. I want to learn a lot more about Y-Sweet in a moment as well.
But you've already mentioned those two paths: Y-Sweet, which is an off-the-shelf technology that you're building on top of Yjs, a very well-known CRDT implementation, probably the most common and longest-standing technology that's out there, as an example of off-the-shelf technology, and Rayon, which has built their own sync engine.
You've probably seen many decisions being made where people choose to use an off-the-shelf technology or choose to build their own. What advice would you give to people who are deciding whether they should build their own or, as the alternative, adopt an off-the-shelf technology?
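To illustrate why the at-most-one-server-per-document guarantee mentioned in this section simplifies a custom sync engine, here is a minimal sketch in which that single server orders all edits by handing out version numbers. It uses a simple optimistic check (reject edits based on a stale version), which is one possible design and not something the episode attributes to Rayon or Jamsocket. `DocumentServer`, `submit`, and `changes_since` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DocumentServer:
    """The single authority for one document (hypothetical sketch)."""

    state: Dict[str, object] = field(default_factory=dict)
    version: int = 0
    log: List[Tuple[int, str, object]] = field(default_factory=list)

    def submit(self, base_version: int, key: str, value: object) -> Tuple[bool, int]:
        """Accept an edit only if the client had seen the latest version."""
        if base_version != self.version:
            return (False, self.version)    # stale: client must catch up and retry
        self.version += 1
        self.state[key] = value
        self.log.append((self.version, key, value))
        return (True, self.version)

    def changes_since(self, version: int) -> List[Tuple[int, str, object]]:
        """Edits a client needs to replay to catch up from `version`."""
        return [entry for entry in self.log if entry[0] > version]
```

Because only one instance of this object exists per document, there is no need for distributed locking or conflict-free merging: the order in which `submit` is called is the order of the document's history.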
¶ Building vs Adopting a technology
Yeah, I think where it comes down to for build versus off the shelf is whether you want to have business logic live in the sync engine on the server side. Where I think you generally don't need that is if you just want to sync text documents, things like that, where CRDTs are probably the best way to do it right now, or at least the most off-the-shelf way to do it. You can do it your own way, but it's sort of a research problem.
On the other hand, I think if you have a very simple data model, but you want to do atomic transactions, you want an event-sourcing type of approach, or you want to be able to do things like trees with reparenting, some of that ends up being that you're working against the CRDT. In those cases, I think it makes more sense to implement your own business logic.
The other thing that we see is if maybe you want some change to trigger some action server-side, you want actions to have some side effect. Maybe some piece of data changes and you want to insert that into a queue. So it becomes really nice to have some server-side code that reacts to changes to the document. That's another place where we find
building your own tends to be really nice, because you can just have one server that's responsible both for the sync and for triggering some side effect. Right, so maybe to linger a little bit on that specific point: I think with local-first software, in this scenario where you build your own sync engine, you have two approaches for how to deal with that. And the same goes for the off-the-shelf approach, if you use something like Yjs.
So if you build your own, wherever you handle the messages, you can inspect them and see, okay, this seems to be a user signup event, so let's send out that confirmation email or something like that. But another approach could be that you have a server-side client instance that listens to the same sync messages, and based on the state that you have on that server-side client, you could then react to that.
Do you have thoughts on one approach versus the other? Maybe one is a lot more expensive to run or more complex to model. What thoughts do you have on the different approaches here? I think where I've tended to see this break down, because we've seen customers do it both ways, is that if it's
purely just reacting with a side effect, and your model of it is that it's a server-triggered type of thing, like that send-email example or some sort of notification, I think that makes more sense to just do in the server, just in terms of architectural complexity. You could certainly listen for the events.
And if there are architectural reasons that make sense for you, I don't see any problems with it. But where I think the server being a client can make a lot of sense is AI-integration-type things, where it's code running on the server, but your application should just treat it like another client. This is something like an agent going out and modifying a document based on some prompt.
If you want to run it through the same code paths that a user edit would go through, then it makes sense to treat that as a distinct client of the data. Got it. So to dig a little bit more into that server as a client: when I'm thinking about a browser client, or using my phone, there's a concrete point in time where I'm starting a session. I'm opening a tab. I'm opening an app. I'm doing things, and afterwards
I'm closing it. So there's a concrete start and stop. Maybe there's some background stuff, but let's pretend there's just a clear start, stop, 30 seconds, and that's it. How should I think about that in a server context? Let's say I'm trying to offer that to a thousand customers. Would I have a thousand separate... let's go crazy, let's say we have a thousand VMs, one per customer. That strikes me as very expensive.
So what is a useful programming model, a useful deployment model, to deploy those server-side clients? So the way that Jamsocket does this is that we run a process for every service, essentially. So when you and I are connected to a document, we're running a process, not a full-fledged VM, but it's using some syscall interception through something called gVisor. So it's a little bit more secure than just containerized workloads.
The nice thing about that is that processes are pretty good at giving resources back to the system when they're not actively in use. What we've seen is that there's a first class of interactions where you definitely want them to be processed by the server. In those cases, it makes sense to run directly in the sync engine. When it comes to multiple clients, we tend to see those run off of Jamsocket.
So these are running on an end-user server talking to Jamsocket, and the pattern that I've seen there is that a client will maybe trigger something directly through a web endpoint on that remote server that's not running on Jamsocket, and that server will then talk to Jamsocket to, say, fetch some data, or connect and trigger something. So it might synchronize data, but it's not a long-lived client. It's a client that's spun up based on a specific action.
That's usually triggered by the client. Got it. That makes a lot of sense. So instead of being super long-running, and that times N for each possible instance, you make it more event-based. So let's say there is a new sync message that you want to react to, or maybe a webhook that's coming in from Stripe, and so you do your thing as a response to the event and then you yield again, back to the runtime.
And I think a model that also comes to mind that could fit really well here is durable long-running workflows, something like Temporal. There are other options as well. I think they could work really well together here, where you have a workflow that's essentially a participant in a sync system. It's just a long-running workflow. It's just another client that happens to live on a server and not in a browser.
Yeah, I'm really excited to see more folks explore this, since I think it will open the door for a whole bunch of different application topologies, really. One of the things that we found with Y-Sweet is that we had people ask, I want a Python client for this. And it was for exactly that reason: they want to run some server-side code that interacts with a document. Same on the Node side.
We support the built-in WebSocket client in the browser, but we also support a shimmed-in WebSocket client so that you can run it in Node. Very cool. Yeah, I'm really looking forward to that, whether it's Python or, well, I'm a native in JavaScript, and JavaScript has this amazing aspect to it that it supposedly runs everywhere, and we're getting more and more there with ESM now really being the default.
And I'm really excited about bringing the same business logic, the same code, to all sorts of different platforms. And I think sync engines are a huge lever that gets us closer towards that, since otherwise we can have the code there, but if we don't have the data there, that is only good for so many use cases. So maybe transitioning towards Y-Sweet, which you've already mentioned:
before we get into what Y-Sweet is, can you share more about the origin story of Y-Sweet and which problems you were trying to solve?
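The two ways of attaching side effects that came up in this section, a hook inside the sync server versus a server-side participant that receives the same messages as every other client, can be sketched as below. Everything here is hypothetical and simplified for illustration: the `SyncServer` class, the `signup:` key convention, and the in-process `email_queue` stand in for whatever message handling and job queue a real system would use.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Change = Dict[str, object]


class SyncServer:
    """Applies changes, runs server-side hooks, then broadcasts (illustrative)."""

    def __init__(self) -> None:
        self.state: Dict[str, object] = {}
        self.hooks: List[Callable[[Change], None]] = []        # approach 1
        self.subscribers: List[Callable[[Change], None]] = []  # approach 2 and normal clients

    def handle(self, change: Change) -> None:
        self.state[str(change["key"])] = change["value"]
        for hook in self.hooks:
            hook(change)       # side effect runs inside the sync server itself
        for deliver in self.subscribers:
            deliver(change)    # every connected client, including server-side ones


email_queue: Deque[str] = deque()


def enqueue_welcome_email(change: Change) -> None:
    # Hypothetical side effect: queue a job when a signup record appears.
    if str(change["key"]).startswith("signup:"):
        email_queue.append(str(change["value"]))
```

Registering `enqueue_welcome_email` in `hooks` keeps the side effect in the same process as the sync logic, which matches the simpler architecture Paul recommends for notification-style work. Registering a callable in `subscribers` instead models the "server as just another client" approach he suggests for things like agents editing a document.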
¶ Y-Sweet
Yeah. So we'd already been working on Jamsocket for a while by the time we started Y-Sweet. For one thing, we thought from the get-go that people are going to want to write their own sync engines.
But one of the things we saw was that a lot of people were using Yjs and other CRDTs and running those on Jamsocket, and even though they didn't need the authoritative model of Jamsocket, they were still finding advantages to having that. So we started thinking, what would a Yjs server built to run on Jamsocket look like? And one thing that's nice, if we're running a lot of processes, is for each one to be really lightweight on memory.
So we wrote Y-Sweet in Rust and it's pretty memory efficient. Another thing that we became really opinionated about is that you shouldn't really store document data in a database. I think it's just a bad fit. If you're building something like Figma, well, Figma uses S3
as where they store the document, and they store the document metadata in Postgres. We started to see a lot of patterns like that, because if you're writing each open document many times a minute and you're using a Postgres database, that Postgres database is the bottleneck. Every edit is coming through that.
Whereas S3 is a more distributed kind of file system, where if you have a server that is the authority on what's in a document at a given point in time, it can just write to S3 and you can horizontally scale that out as much as you want. So we became opinionated about, okay, it should be Rust. It should be lightweight. It should write to S3.
And it should be as simple as possible to just use. I really like software like Caddy, which is a web server written in Go, if people aren't familiar with it, that has really sane defaults. It's somewhat opinionated about just doing things right. You don't have to... Fantastic. It even gives you SSL certificates that work locally, and it works with Tailscale. Fantastic. Definitely check it out if you're not using it yet.
Yeah, so Caddy just simplifies so much and just does things right. And so we wanted to build a piece of software that felt like that to use. We wanted something that you could use in a CI/CD process and it would be the same API as if you were using it at scale, horizontally scaled out on a cluster.
The other thing we were thinking about at the time was, what would an open source document sync engine look like if we were to write it from scratch? And we kept landing on, it would look pretty close to Yjs, even if we didn't have the distributed constraints of Yjs. So we're like, well, Yjs exists. It has a great community, great people involved with it. This looks like what we would want to build.
So let's just build a sync engine around this. Got it. In terms of the behavior, or what makes it a little bit more like Caddy, in terms of opinionated but very well motivated opinions baked into it: if you compare it to the default Yjs server, is there anything that stands out where you lean a little bit more heavily into some opinions?
Yeah, I mean, I think the default Yjs server is built to be very modular and suit a bunch of use cases. The Yjs community in general embraces this idea of providers. Yjs itself is just a data structure, and then providers are what will synchronize it to another client or to a database or things like that. The official way to do things in the Yjs world is to compose a bunch of providers together.
So you might have an IndexedDB provider on the client, synchronizing to IndexedDB. You might have a WebSocket provider, synchronizing to other clients. And then you might have a database provider on the server. We wanted to just have a single stack that was our opinionated stack. So we have an IndexedDB implementation on the client.
We have our S3 storage, which we've decided is the only storage that we'll support. We'll support S3-compatible storage, but ultimately our opinion was that object storage is the right way to do storage for this. And then we have our wire protocol as well, over WebSocket. Got it. That makes sense. Yeah. And I haven't managed yet to
have Kevin Jahns here on the podcast, but he happens to also live in Berlin, and I've just seen him at the last local-first meetup that we've done here. So I think it's about time that we hear from Kevin about Yjs. It's been such a rich ecosystem of different things around it, so I think we've got to make that happen as well. Yeah, you should. I've actually been procrastinating editing a podcast that I did with Kevin, so we'll have that soon. There you go.
We should put it in the show notes. So, Y-Sweet you've built as you've seen that this is a flavor of sync server that can be hosted on Jamsocket. Is my understanding correct that if I want to use Yjs with Y-Sweet, I can just deploy that off the shelf on Jamsocket? Yeah, so you could deploy that. We have an off-the-shelf offering that deploys it on Jamsocket. You can run it on your own servers as well.
And one of the things we decided was, regardless of how it's hosted, it should be the same API. So we have what I call the document management API, where you create a document and give somebody an access token to that document. That is just universal, no matter how it's deployed.
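To make the shape of a "create a document, hand out an access token" API concrete, here is a small self-contained sketch. It is explicitly not Y-Sweet's actual interface: the `DocumentManager` class, its method names, and the HMAC-signed token format are assumptions chosen only to show how the two operations relate.

```python
import hashlib
import hmac
import secrets
from typing import Set


class DocumentManager:
    """Hypothetical document management API; not Y-Sweet's actual interface."""

    def __init__(self, signing_key: bytes) -> None:
        self._key = signing_key
        self._docs: Set[str] = set()

    def create_document(self) -> str:
        doc_id = secrets.token_hex(8)       # random identifier for the new document
        self._docs.add(doc_id)
        return doc_id

    def _sign(self, doc_id: str) -> str:
        return hmac.new(self._key, doc_id.encode(), hashlib.sha256).hexdigest()

    def issue_token(self, doc_id: str) -> str:
        """Return a token granting access to one existing document."""
        if doc_id not in self._docs:
            raise KeyError(doc_id)
        # Assumed format "<doc_id>.<signature>"; no expiry, to keep the sketch short.
        return f"{doc_id}.{self._sign(doc_id)}"

    def verify_token(self, token: str) -> bool:
        doc_id, _, sig = token.partition(".")
        return doc_id in self._docs and hmac.compare_digest(sig, self._sign(doc_id))
```

The point of the design Paul describes is that application code would call the same two operations whether the server behind them is a local process in CI or a horizontally scaled deployment.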
Got it. So I think Yjs is one of the most mature options right now for people who want to build local-first apps. For people who've heard of it a bunch of times, but maybe haven't yet come around to fully implementing their app using it, what are questions that people should ask themselves about whether Yjs is a useful foundation for the app? And in which scenarios would you say, actually, you probably want to build your own sync engine?
¶ When to choose Yjs
Yeah, so I think one of the first dimensions to think about here is that I see two worlds. There's database sync engines and then there's document sync engines. For database sync engines, I think of things like Linear, where you have some relational model data and you probably don't want the client to have all of it. You have the client storing some subset of a database for each account.
Maybe you're sharing this data across multiple people in that account. That's the database sync world, where there's ElectricSQL and Zero and PowerSync and a number of players there, InstantDB and Triplit and a number of others. On the document sync side, you're sending all of the data down to the client. So the unit of data that gets synced is in-memory size on the browser. You're not dealing with like a terabyte of data here.
You're not taking a subset of it. You're synchronizing the entire document. This would be things like Figma or Google Docs, where there's a full local copy of some self-standing piece of data. And generically in Yjs, that's essentially JSON-style or JSON-shaped data. So things like nested maps, nested lists, and text, and then JSON primitives.
Is it fair to say that, since you're mentioning Figma and Google Docs, there is a distinct boundary of a document? I have a Google Docs document open, or I have a Figma document open. Wherever a given part of the product experience is centered around a document (tldraw also comes to mind), is that a great fit for embracing the document model? And anything that is richer in the sense of a relational database, where you can freely join between things, is where you would choose the other approach? Is that a useful rule of thumb?
Yeah. The words that you used, "distinct boundary", I think that really nails it: there's a document that is self-contained and distinct.
You mentioned tldraw, and actually I think this gets to another point, which is that you can use both in the same application. tldraw uses Zero and their own document sync engine. Figma has built their own sync engine for both. They're distinct sync engines, and they can be used in tandem as well.
Right. I mean, that gets us to a really interesting topic more generally, which is combining multiple sync engines. For people who've been dabbling in local-first, that might be more intuitive, but for people who are very new to the local-first space, it's hard enough to wrap your head around choosing the right sync engine. Now you're telling us, wait, you should choose multiple. Can you motivate a little bit more how to think about that?
¶ Choosing multiple Sync Engines
So I think of it as the app layer and the document layer. If you have a document-based application and you have a file viewer, for example, I think of that as the app layer: you're not in a specific document at that moment.
Like in Figma, where I'm on the home screen and I see my various projects.
Yeah, exactly. And I think there's nothing that forces that part to be real-time synced. In a lot of cases, a traditional Postgres database goes a long way for that. But once you're in the document, that's where I think you do need a sync engine, because if you have two Google Docs open in two different tabs, you expect them to be in sync, even if you're just a single user. I think that actually motivates like 98 percent of the value of local-first: somebody who has the same document open in two tabs, and they've got 100 tabs open.
I think that's less of a given expectation these days for a project view or something like that. It's a nice surprise when that is in sync, and I think it is becoming the status quo, but overall it's less of an expectation; you might have to refresh your Figma project to see the new assets that come up, that kind of thing.
But I do think, and there's been a bit of Twitter debate lately about whether the same sync engine can handle both, that there are things you are going to need transactions for. And if you need transactions, you're going to need a database that is effectively a single bottleneck on updates. At the same time, if you have lots of documents, you don't want those documents to be bottlenecked at a single point.
So I think unless there's a solution that offers both distributed and centralized with transactions, you kind of need both.
Got it. So, if you're leaning more into the document aspect of it, or even, when you say that something is a bottleneck, let's say we also embrace the database aspect of it. Maybe you have different workspaces. I think there's still an aspect of drawing boundaries around some body of data, where you say, hey, within that boundary I care about certain constraints: maybe there should never be more than 10 documents, or maybe you want to enforce constraints around users, access control, et cetera. Can you share any learnings or advice about how to approach this entire topic?
How do you decide that this is a useful boundary for how data should be modeled and fragmented or partitioned? And what are some of the dimensions that should be taken into consideration here?
¶ Boundaries
So I think in general, if it's not obvious what a document should be in an application, then the document model is probably not the right fit. Take things like Figma, where you're in one document at a time. You might have a different document in another tab, but you don't have two documents open concurrently in the same tab; it's taking up the whole screen. I think there are certain heuristics like that which just tell you this is definitely a document model application. Same with Google Drive or Google Docs: you have one thing open at once.
Where would you put Linear? You could, for example, put each Linear issue into its own document. Why might that be a reasonable approach, and where might it not?
I think I could see that being reasonable if you really care about multiple people editing a ticket at one time and seeing the text, and you really wanted to make that a first-class experience. But in general, Linear just screams database approach to me. Although I believe they are using Yjs for some of the issue text now. I could be wrong, and it might be a different CRDT, but I think they're using some sort of collaborative text editor.
So given that you've seen quite a few different customers and products build their own sync engines, are there any interesting, almost second-order effects that you've seen there? Unexpected things, new challenges that you didn't see in previous applications, things like database migrations? Which sorts of challenges and problems have you seen?
¶ Challenges in building Sync Engines
Yeah, I think whenever you're dealing with data on S3, data migrations do become really interesting, because you're not just writing a database query and issuing an update. Usually it's some form of gradual, lazy migration. So the application that's reading the data has to know how to transition from version one to two and two to three, and then apply those consecutively.
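A minimal sketch of that lazy, read-time upgrade path might look like the following. The document fields (`name`, `title`, `tags`) and the `version` marker are hypothetical and only serve to show one step being applied after another.

```typescript
// Hypothetical stored document: only `version` is assumed to always exist.
interface StoredDoc {
  version: number;
  [field: string]: unknown;
}

const CURRENT_VERSION = 3;

// migrations[n] upgrades a document from version n to version n + 1.
const migrations: Record<number, (doc: StoredDoc) => StoredDoc> = {
  // v1 -> v2: the `name` field was renamed to `title`.
  1: ({ name, ...rest }) => ({ ...rest, title: name, version: 2 }),
  // v2 -> v3: a `tags` list was introduced, empty for old documents.
  2: (doc) => ({ ...doc, tags: [], version: 3 }),
};

// Applied whenever a document is read; old documents are upgraded step by step.
function upgrade(doc: StoredDoc): StoredDoc {
  let current = doc;
  while (current.version < CURRENT_VERSION) {
    const step = migrations[current.version];
    if (!step) throw new Error(`No migration from version ${current.version}`);
    current = step(current);
  }
  return current;
}
```

Each step only needs to know about its immediate predecessor, which is also why the steps have to stay in the codebase for as long as documents at that version may still exist.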
And so that logic tends to linger around in the application for as long as you have old documents to support. I think there are ways to do schema changes that don't require a migration as well. I was at Google, and there were certain rules about what you could do with protocol buffers that would ensure that they were always backward and forward compatible.
Things like: a required field always has to stay required, so you have to be delicate about when you call a field required. There are certain things you can do at the schema design and schema change level so that you can avoid having to implement any sort of migration. It can be more access-time oriented. Doing that is where I've seen people be successful.
In terms of second-order effects, I think it goes back to this: once you have the sync server, people are like, oh, this is now a place where I can trigger this notification, or I can do this check. So we've seen these backends grow in scope, and we want that to be a first-class part of the application that can do whatever you want it to do.
That makes a lot of sense.
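The access-time approach to schema changes mentioned above can be sketched like this. It is an illustration in the spirit of the protocol buffer compatibility rules rather than an actual protobuf implementation; the `Ticket` shape and its defaults are assumptions.

```typescript
// Hypothetical ticket shape. `id` and `title` were required from the start;
// `priority` and `labels` were added later and are therefore optional on disk.
interface Ticket {
  id: string;
  title: string;
  priority: number;
  labels: string[];
}

function readTicket(json: string): Ticket {
  const raw = JSON.parse(json) as Partial<Ticket>;
  if (typeof raw.id !== "string" || typeof raw.title !== "string") {
    throw new Error("Missing required field");
  }
  return {
    id: raw.id,
    title: raw.title,
    // Backward compatible: older documents lack these fields, so defaults
    // are filled in at access time instead of rewriting stored data.
    priority: typeof raw.priority === "number" ? raw.priority : 0,
    labels: Array.isArray(raw.labels) ? raw.labels : [],
    // Forward compatible: fields this reader does not know are ignored.
  };
}
```

Because nothing stored is rewritten, no migration has to run; the cost is the discipline of never adding a new required field.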
And yeah, I think this is an area that has already caused a lot of headaches: schema migrations, and data migrations in general. But now that we are rethinking data architectures at large, we also need to rethink that part. Like you've mentioned, when you have all the data in a single Postgres database, you can at least apply your old playbooks there. But if all of your data is in an S3 bucket, laid out in whatever way, you do need a
different, new approach to deal with that. Baking the migration logic into your app logic is one way to deal with it, but that also comes with its own downsides. You litter code that was once very clear, and you make it less clear because you need to account for
that historical evolution. A project that I want to shout out here is Cambria, by the folks at Ink & Switch. I've studied this project quite intensively myself, and I've rebuilt it a few times, once even on a type level, just to provide a nice type-safe API, given that the original implementation rather lets you specify those projection rules in YAML. But I've heard some rumors that they're thinking of rebooting that project at some point.
So fingers crossed for that. Another approach that I'm investigating heavily myself, given I have my fair share of database migration traumas that I tried to remedy by starting Prisma, is a different one: event sourcing, where you split up your database into a dedicated write model and derive the read model from it.
The core insight here is that if you split this up into two parts, the schema for your read model is typically the thing that changes orders of magnitude more often: you have different kinds of queries and aggregations that you want to do, and you may want to change the database layout to make certain queries faster and more efficient. The write operations are much more bound to the domain, to when stuff actually happens,
and that changes way less over time. Maybe you now want to capture someone's preference on marketing emails at signup; for historical events you can easily say, actually, we default to no. But a user signup event is always valid and way easier to upgrade. Then you can reapply all prior events into the new read model, which you can change very easily. And you can even have multiple read models all at once.
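As a rough sketch of that write-model and read-model split: the event names and fields below are invented for illustration and are not Livestore's API. The signup event gains an optional `marketingOptIn` field, older events without it default to no, and two read models are derived from the same log.

```typescript
// Hypothetical write model: an append-only list of domain events.
type AppEvent =
  | { type: "userSignedUp"; userId: string; email: string; marketingOptIn?: boolean }
  | { type: "emailChanged"; userId: string; email: string };

interface UserRow {
  email: string;
  marketingOptIn: boolean;
}

// Read model 1: current user rows, rebuilt by replaying every event.
function buildUsers(events: AppEvent[]): Map<string, UserRow> {
  const users = new Map<string, UserRow>();
  for (const event of events) {
    switch (event.type) {
      case "userSignedUp":
        // Historical events predate the preference, so it defaults to "no".
        users.set(event.userId, {
          email: event.email,
          marketingOptIn: event.marketingOptIn ?? false,
        });
        break;
      case "emailChanged": {
        const user = users.get(event.userId);
        if (user) user.email = event.email;
        break;
      }
    }
  }
  return users;
}

// Read model 2: a different aggregation over the same events.
function countSignups(events: AppEvent[]): number {
  return events.filter((event) => event.type === "userSignedUp").length;
}
```

Changing either read model means editing its function and replaying the log; the stored events themselves stay untouched.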
So that is what I'm exploring right now under the umbrella of Livestore. But that also requires the rigor to split things up into a read and a write model. Curious whether you have thoughts on that.
Yeah, that's really interesting. I think that event sourcing in general does simplify migrations if you're willing to go back over the event log and regenerate, because as long as you represented all of the data that matters, you can essentially just add fields as needed.
Another problem that emerges in that world is if your domain produces a lot of events. Let's say you build a tldraw: you could model it so that letting go of the rectangle creates an event, but you could also model it so that every move event the browser registers creates one. Then dragging it can cause 5,000 events, and that can lead to a very long history of events.
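The first modeling option, committing one event when the rectangle is released, could be sketched as follows. The `ShapeMoved` event and the `startDrag` helper are hypothetical names used only for this illustration.

```typescript
// Hypothetical durable event: one per completed drag, not one per pointer move.
interface ShapeMoved {
  type: "shapeMoved";
  shapeId: string;
  x: number;
  y: number;
}

function startDrag(shapeId: string, commit: (event: ShapeMoved) => void) {
  let latest: { x: number; y: number } | null = null;
  return {
    // Called for every pointer move: only transient, in-memory state changes.
    move(x: number, y: number): void {
      latest = { x, y };
    },
    // Called once on release: a single event is appended to the log.
    end(): void {
      if (latest) {
        commit({ type: "shapeMoved", shapeId, x: latest.x, y: latest.y });
      }
      latest = null;
    },
  };
}
```

With this shape, a drag that fires thousands of move callbacks still adds a single entry to the event history, at the cost of not recording the intermediate path.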
So now you gotta keep that in mind as well. Whereas in the traditional mixed read and write model approach, you would just overwrite the position, and it would not necessarily cause the database to explode because you have too much data. But yeah, it's all about trade-offs; that is what data management is all about. Maybe a slightly different aspect about data that you've also written about is encrypting data.
You've written a great blog post about that. Can you tell us more about that blog post and what it's about?
¶ Data encryption
Yeah, so this came out of our work on Y-Sweet. We wanted to store the data locally in the client, at least as an option. So we looked at the options that were available: localStorage, IndexedDB, OPFS (the Origin Private File System). We realized that IndexedDB was really the right way to go for this right now. I have high hopes for OPFS, but they all kind of have flaws;
IndexedDB is the one where people know the flaws best, I guess, and how to work around them. So we looked at IndexedDB. But the problem we found is that all of them store the data in plain text, and that's not just a theoretical problem. At least as of a couple of months ago, there were npm and PyPI modules out there that would read some application data from these plain-text sources. So it's a real problem that people have identified
and that has been exploited. We wanted to make sure we provided an option that, as best as possible, would prevent that. So we said, okay, browsers have Web Crypto; we can encrypt all this. But then there's the problem of where you store the key. You could store it on the server, but that kind of defeats the purpose of accessing that data if you're offline. So we realized that we don't really have a good way to store a key kind of credential.
We've got WebAuthn, which is a bit more secure; that's where you have passkeys and things like that. It's a bit more opinionated. It uses the operating system's keychain, but it doesn't expose that to you as any sort of low-level API that you can store your own secrets in.
What has started happening is that some browsers, particularly Chromium-based browsers (Google Chrome, Edge, Brave), have built in something called App-Bound Encryption. They're just using this for cookies, but the idea is that the browser will store cookies on disk as they always have, but they'll be encrypted on disk, and the symmetric key to that will be stored in the operating system's keychain. The operating system is set
up, at least in theory (and there have been some vulnerabilities here, too), to only give that key back to the browser process itself, not to another process that attempts to impersonate the browser process. So what we landed on, and it was pretty surprising to me that this was the best available path right now, is that if you enable local storage, we encrypt the data stored in IndexedDB, then store the key in a cookie and piggyback on that being App-Bound encrypted, at least in browsers that support it.
That is very interesting. I've been studying cryptography, particularly in a browser context, a bit more as well, for various reasons. I'm trying to see what it would take for the sync messages in Livestore to be end-to-end encrypted. The hard part is not the encryption; the hard part is the end-to-end, where the various ends own their keys.
We should do an entire episode just about what's difficult about it, but it can all be distilled down to this: the hard part about anything cryptography-related is key management. You can err on the side of being a little more loose with how you manage keys, but that defeats a lot of the purpose and the benefits here.
The browser also makes that really tricky, because it has very constrained APIs, and historically it's been a web document viewer rather than a fully fledged application platform. We're getting the building blocks. You can use the Web Crypto API. I'm also using the libsodium project compiled to WASM, which is very powerful and gives you a couple
of advanced algorithms that you can use for symmetric or asymmetric encryption, signing, et cetera. Passkeys, I think, are also a super important foundation, but they only get you so far, and I think they don't really help you with encryption as such, but rather with signing messages. So I think we're still lacking a few building blocks. So I'm very excited to hear about this; what was it again, App-Bound?
App-Bound Encryption.
So ideally at some point this goes beyond cookies and can be applied to other storage mechanisms. But I like the approach: you encrypt the data, which reduces it to the key management problem, and the key you put into a cookie. That raises another question: what happens if that cookie goes away? Did you figure out an answer for that?
We don't.
We just set it to a long expiration. The thinking there was that if the user is clearing their cookies for that host, they probably want to destroy the data; they want to be logged out. So we actually saw it as the right thing to do to bind it. The other nice thing is that, unlike IndexedDB, cookies can have an expiration date. So we could set an expiration of a week.
We're still relying on the browser to enforce that, but if the browser enforces it, and two weeks later that person is fully hacked, including their operating system keychain, the browser, at least in theory, will have deleted that key, and then the data that's in IndexedDB will be gone. So that's actually, funny enough, additional functionality that was just incidental to using cookies for that.
Right. I like this trick a lot, and I've got to look into it.
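The encrypt-then-keep-the-key-in-a-cookie idea described above can be sketched with the standard Web Crypto API. This is not Y-Sweet's actual code: the helper names, the `doc_key` cookie name, and the choice of AES-GCM are assumptions, and the IndexedDB write itself is left out. The snippet assumes an environment that exposes `crypto.subtle`, `btoa`, and `atob` globally (browsers, and recent Node versions).

```typescript
// Base64 helpers so keys and ciphertext can be stored as strings.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

function fromBase64(text: string) {
  const binary = atob(text);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Generates a fresh AES-GCM key plus a base64 export destined for the cookie.
async function createKey(): Promise<{ key: CryptoKey; exported: string }> {
  const key = await crypto.subtle.generateKey(
    { name: "AES-GCM", length: 256 },
    true, // extractable, so the raw bytes can be written to the cookie
    ["encrypt", "decrypt"],
  );
  const raw = new Uint8Array(await crypto.subtle.exportKey("raw", key));
  return { key, exported: toBase64(raw) };
}

// Rebuilds the key from the cookie value on a later page load.
async function importKey(exported: string): Promise<CryptoKey> {
  return crypto.subtle.importKey(
    "raw", fromBase64(exported), { name: "AES-GCM" }, false, ["encrypt", "decrypt"],
  );
}

// Encrypts text into a record that could be stored in IndexedDB.
async function encryptText(key: CryptoKey, text: string) {
  const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh nonce per record
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv }, key, new TextEncoder().encode(text),
  );
  return { iv: toBase64(iv), data: toBase64(new Uint8Array(ciphertext)) };
}

async function decryptText(key: CryptoKey, record: { iv: string; data: string }) {
  const plaintext = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: fromBase64(record.iv) }, key, fromBase64(record.data),
  );
  return new TextDecoder().decode(plaintext);
}

// Builds the cookie string; in a browser it would be assigned to document.cookie.
function keyCookie(exported: string, maxAgeSeconds: number): string {
  return `doc_key=${encodeURIComponent(exported)}; Max-Age=${maxAgeSeconds}; Path=/; Secure; SameSite=Strict`;
}
```

Setting `Max-Age` to a week (604800 seconds) mirrors the expiration trick from the conversation: once the browser drops the cookie, the records left in IndexedDB can no longer be decrypted.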
One thing to point out still: you've mentioned that this mechanism is only available in Chromium browsers, but cookies, IndexedDB, OPFS, et cetera, are all available in other browsers, namely Safari, as well. One thing that people find out the hard way about Safari is that it automatically deletes a user's data after seven days if they haven't visited that website.
So if you're building a fully local-first web experience where someone creates some precious data in Safari, maybe doesn't sync it anywhere else yet, goes on vacation, and comes back: poof, the data is gone. As app builders, we need to be aware of that and detect, hey, is this Safari? And in Safari, make it part of the product experience to show a message like, hey, be careful, your data might go away. There are ways to remedy that.
For example, if you make the web app a progressive web app by adding it to the home screen, that limitation goes away. But app builders need to be aware of it so that they can make the app's users aware. It's just something that I think is important to note.
Yeah, I think that's an example of a number of cases where the browsers are just not optimized for local-first apps, unfortunately.
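A rough sketch of the "detect Safari and warn" idea follows. User-agent sniffing is inherently brittle, so the regular expressions are a heuristic assumption rather than a reliable test, and `looksLikeSafari` and `storageWarning` are hypothetical helper names.

```typescript
// Heuristic: Safari's user agent contains "Safari/" but none of the tokens
// that Chromium-based or other browsers add on top of it.
function looksLikeSafari(userAgent: string): boolean {
  return (
    /Safari\//.test(userAgent) &&
    !/(Chrome|Chromium|CriOS|FxiOS|Edg|OPR)\//.test(userAgent)
  );
}

// Returns a warning to show in the UI, or null when none seems needed.
// `installedAsPwa` reflects the home-screen remedy mentioned above.
function storageWarning(userAgent: string, installedAsPwa: boolean): string | null {
  if (looksLikeSafari(userAgent) && !installedAsPwa) {
    return "Data stored only in this browser may be deleted after seven days without a visit. Sync it or add this app to your home screen.";
  }
  return null;
}
```

In a browser this would be called with `navigator.userAgent`. Browsers also expose `navigator.storage.persist()` to request persistent storage; whether that lifts the limit in a given Safari version is something to verify rather than assume.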
I think the ability to have low-level access to the operating system's keychain is another one. Browsers have improved a ton in terms of the APIs they expose, but I think they're still lagging when it comes to storage and encrypted storage.
Yeah, totally. So, maybe moving to another browser-related topic: both through your prior role and as part of Jamsocket, you've been dealing with quite a bit of WebAssembly. Any interesting stories to share about WebAssembly?
¶ WebAssembly
I guess I really wanted to build the company around WebAssembly. I wanted WebAssembly to take off, particularly the idea that having isomorphic client-side and server-side code would be a big thing. And I've, I guess, just generally soured on WebAssembly a little bit. Where I've seen it work really well is when it's in the application layer.
There are a couple of examples I like to go to that are effectively the same architecture: Figma, a company called Modyfi, and a few others that I'm blanking on. The architecture is essentially a JavaScript UI with a WebAssembly and WebGL or WebGPU rendered canvas behind it. With Figma, the core engine is, I believe, in C++, talking to WebGL; with Modyfi, it's Rust and WebGPU. The application is literally layered that way: on screen, there is the canvas behind the UI.
They're written in two different languages and they just talk to each other. So I think that is the most promising architecture that I see for WebAssembly. Where I think it's been harder to get right is building something like a library that is ultimately consumed by JavaScript developers but written in WebAssembly. I think there's just so much friction still in the bundling that I've kind of soured on that as an approach.
Right. I agree in the sense that I wish we were already further along with WebAssembly, but I think it's a bit of a chicken-and-egg problem: we need more inspiring applications that make people feel like, wow, that is possible, I didn't recognize that this was the web, it feels so fast. I think it's still true, and more true than ever, that WebAssembly can unlock whole new experiences.
There are a few lighthouse examples like Figma that stand out here. Also a big shout-out to the folks building Makepad, which is a super ambitious project. I probably won't do it justice by pitching it, but I just want to speak to the ambition: it's like Unreal Engine in the sense that it's a full engine.
They're building their own platform, including a rendering layer. A few people think that Makepad is an editor. No, Makepad has, just as an example app, built an editor in which they build Makepad, which is just phenomenal. And Makepad is such an incredibly fast app. You should definitely check it out: go to makepad.dev and press the Option key to see how the entire code editor expands.
Apps like that get me very excited about what's possible with WASM, but they're building everything in Rust; they're fully leaning into everything there. The in-between, where you want to combine things one step at a time, I think that's partially a tooling problem and partially a trade-off problem: if you move a lot of data back and forth between WASM and JavaScript, that doesn't come for free. So you've got to keep that in mind.
I think the Replicache folks actually wrote a lot of their stuff in Rust in the past and then moved to JavaScript because that boundary crossing was too expensive.
Not every use case suffers from that problem, though. I want to turn it around and invite anyone who is excited about WebAssembly to see it as an opportunity to make things significantly better, by working on projects like wasm-bindgen or other things. I think the Deno folks are pushing heavily on that. So I'm seeing this glass as half full, and I think the glass is going to get full pretty soon.
Yeah, to your point about the JavaScript-WebAssembly boundary crossing, I think it comes down to placing that boundary in the right place, like the Figma model of a JavaScript front end with the renderer in WebAssembly. Makepad is a great example, I think, of going all the way in on WebAssembly. Another one's called Remix.
What's notable about both cases is that to do that well, they've had to basically be living in the GUI toolkit layer: they've been writing or adapting a lot of their own code for it. So I think that's not for the faint of heart.
People who have done it have built amazing software, but what comes up more often when I talk to people is that there's a scarcity of Rust developers, and they want the Rust developers working on the engine component, and to be able to hire React developers, Svelte developers, and front-end web developers to work on the GUI, where it may not be, you know, think about Figma's UI components.
They're not super performance sensitive in the way that the canvas is.
Yeah, totally. I think it just takes some bold thinkers. This is not something where you're going to rebuild the world in two weeks; this is something you've got to put five or possibly 10 years into
to really build something phenomenal. But I think the rewards are massive, and I'm really looking forward to getting alternatives to something like React that provide different trade-offs and allow you to build really high-performance applications. Fundamentally, React biases towards simplicity, towards preventing less experienced engineers from hurting themselves or others and dragging the application down.
But I think there's a different trade-off space as well, where you bias more towards performance and need to know a little more about what you're doing. And particularly now with AI on the horizon, I think we can rethink a lot of trade-offs significantly, where engineering team sizes maybe get reduced as well, but that's a topic for another conversation. Related, though, in regards to AI, you've recently launched a new project that is certainly adjacent to AI.
It's called ForeverVM. Can you share a little more about what that is?
¶ ForeverVM
Yeah. So pretty much from the beginning with Jamsocket, because we run these sandboxed processes on demand, one of the ways we've seen people use it is to run LLM-generated code in them. Actually, going back to the beginning, it wasn't even LLM-generated; this was pre-ChatGPT. It was things like Jupyter notebooks, but over time we've seen more and more LLM-generated code. And it's good;
I think we're competitive with other products for that. But we realized, first of all, that we're not really positioning the product that way, and also that we're not building the product to be the best for that from first principles. If we were to just say, I want an LLM to be able to execute code, what would that look like from first principles? And we thought, well, we don't really care about the session.
From the LLM's point of view, we want it to feel like it can always run code. It doesn't have to start a sandbox and stop a sandbox when it's done to cut down on costs and things like that. We cut out the rest of it, made that into the abstraction, and built it, frankly, into something that we can position for those products, so that we're not confusing people who are like, I thought you did sync engines.
Now you're telling me you're running AI code? Architecturally, they can actually be fairly similar, but we wanted to build a product around that. So we have ForeverVM. The way to think about it is that it's an API that runs Python code in an unbounded session. By that I mean, if you make an API call and get a machine ID, maybe ABC123, you can run instructions on that machine, set a equals three or something like that.
Two years from now, if you've kept that machine around, you can query the value of, you know, a plus five, and you get back a value. The way we're doing that behind the scenes is by memory snapshotting the underlying Python process. So we architected the whole system from the ground up around this, and it's kind of neat.
Fascinating.
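The "unbounded session" idea can be illustrated with a toy model. This is not ForeverVM's API or implementation: all function names are invented, the "interpreter" only understands assignments and sums, and a JSON string stands in for a memory snapshot of a real Python process.

```typescript
type Vars = Record<string, number>;

const snapshots = new Map<string, string>(); // stands in for durable storage
const live = new Map<string, Vars>(); // machines currently held in memory
let nextId = 0;

function createMachine(): string {
  const id = `machine-${++nextId}`;
  snapshots.set(id, JSON.stringify({}));
  return id;
}

// Persist the machine's state and drop it from memory (e.g. when idle).
function evict(id: string): void {
  const vars = live.get(id);
  if (vars) {
    snapshots.set(id, JSON.stringify(vars));
    live.delete(id);
  }
}

// Bring a machine back from its snapshot if it is not already in memory.
function load(id: string): Vars {
  let vars = live.get(id);
  if (!vars) {
    const saved = snapshots.get(id);
    if (saved === undefined) throw new Error(`Unknown machine: ${id}`);
    vars = JSON.parse(saved) as Vars;
    live.set(id, vars);
  }
  return vars;
}

function term(vars: Vars, token: string): number {
  const name = token.trim();
  const value = /^-?\d+(\.\d+)?$/.test(name) ? Number(name) : vars[name];
  if (value === undefined) throw new Error(`Undefined name: ${name}`);
  return value;
}

// Toy instruction set: "a = 3" assigns, "a + 5" evaluates a sum.
function exec(id: string, instruction: string): number {
  const vars = load(id);
  const assignment = instruction.match(/^\s*([A-Za-z_]\w*)\s*=\s*(.+)$/);
  const expression = assignment ? assignment[2] : instruction;
  const value = expression
    .split("+")
    .reduce((sum, token) => sum + term(vars, token), 0);
  if (assignment) vars[assignment[1]] = value;
  return value;
}
```

From the caller's side, `exec(id, "a = 3")` followed much later by `exec(id, "a + 5")` behaves the same whether or not the machine was evicted in between, which is the property the snapshotting is meant to provide.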
Yeah. My mind is also going to other technologies; Temporal was mentioned before, but there's also a really fascinating project called Golem VM, which I think is also employing some really interesting tricks, using WASM and knowledge about the WASM memory to make checkpoints from which you can restore and resume computation or retry. I love that we get some bolder ideas out there,
particularly now that the cost of writing code has come down so much, and the people writing that code know even less about whether it's good or not. So we need to put it into boxes that are somewhat blast-safe, but also durable in a way that doesn't break the bank. And I love how that is an entirely different product, yet leverages all the benefits and foundations that you've built with Jamsocket, or I guess with Plane, for that matter.
That is very, very cool.
Yeah, thanks.
One of the things that's been really cool to see is that if we give an LLM the ability to write this code and get responses back very quickly, treating it like a local REPL, the AIs can do more: they get that fast feedback loop, and they can make mistakes and correct them, in some cases we've observed, faster than a reasoning model could generate the right code in the first place. So that's been pretty neat.
¶ Outro
Nice. Any other things that you would like to share with the audience?
If you want to find me online, I'm paulgb on Twitter or X and paulbutler.org on Bluesky. jamsocket.com is the site, jamsockethq on Twitter, and on Bluesky it's also jamsocket.com. And forevervm.com is the product we were just talking about.
Perfect. We're going to put links to all of those things in the show notes. Paul, thank you so much for coming on the show today.
I've learned a lot about so many different topics and yeah, really enjoyed it. Thank you. Thank you, Johannes. And really looking forward to seeing you at local-first in Berlin this year. Perfect. See you then. See you then. Thank you for listening to the localfirst.fm podcast. If you've enjoyed this episode and haven't done so already, please subscribe and leave a review. Please also share this episode with your friends and colleagues.
Spreading the word about the podcast is a great way to support it and to help me keep it going. A special thanks again to Jazz for supporting this podcast. I'll see you next time.
