¶ Intro
I mean, another thing is like the operational characteristics of the system, for this type of sync technology. So comparing HTTP with WebSockets, like WebSockets are stateful, and you do just keep things in memory. If you look across most real-time systems, they have scalability limits because you will come to the point where if you have, say, 10,000 concurrent users, it's almost like the thing of don't have too many open Postgres connections.
But if you're holding open 10,000 WebSockets, you may be able to do the IO efficiently, but you will ultimately be sort of growing that kind of memory and you'll hit some sort of barrier. Whereas, with this approach, you can basically offload that concurrency to the CDN layer.
Welcome to the localfirst.fm podcast. I'm your host, Johannes Schickling, and I'm a web developer, a startup founder, and love the craft of software engineering.
For the past few years, I've been on a journey to build a modern, high quality music app using web technologies. And in doing so, I've been falling down the rabbit hole of local-first software. This podcast is your invitation to join me on that journey. In this episode, I'm speaking to James Arthur, founder and CEO of ElectricSQL, a Postgres-centric sync engine for local-first apps.
In this conversation, we dive deep into how Electric works and explore its design decisions, such as read path syncing and using HTTP as a network layer to improve scalability. Towards the end, we're also covering PGlite, a new project by Electric that brings Postgres to Wasm. Before getting started, a big thank you to Rocicorp and PowerSync for supporting this podcast. And now, my interview with James.
Welcome James. So good to have you on the podcast. How are you doing?
Great. Yeah, really good to be here. Thank you for having me on.
So the two of us have known each other for quite a while already. And to be transparent, the two of us have actually already had quite a couple of projects together. The one big one among them is the first Local-First Conference that we organized together this year in Berlin. That was a lot of fun. But for those in the audience who don't know who you are, would you mind introducing yourself?
So, my name is James Arthur. I am the CEO and one of the co-founders of ElectricSQL. So, Electric is a Postgres sync engine. We sync little subsets of data out of Postgres into wherever you want, like local apps and services. And we do also have another project which we developed called PGlite, which is a lightweight Wasm Postgres. So we can sync out of Postgres in the cloud, into Postgres in the web browser, or kind of into whatever you want.
Awesome. So yeah, I want to learn a lot more about Electric as well as PGlite.
Maybe PGlite a little bit towards the end of this conversation. So Electric, I've seen it a bunch of times. I've been playing around with it, I think quite a bit last year, but things seem to also change quite a bit. Can you walk me through what the history of the last couple of years has been as you've been working on Electric, and help me form the right mental model of Electric as it is going forward?
¶ Electric SQL
Yeah, absolutely. I think like Electric as a project, it started, in a way, building on a bunch of research advances in distributed systems, CRDTs, transactional causal consistency, a bunch of these primitives that a lot of people are building off in the local-first space, which actually a bunch of people on our team developed in the kind of research stage.
And we wanted to create a developer tooling and a platform that allowed people who weren't experts in distributed systems and didn't have PhDs in CRDTs to be able to harness the same advances and build systems on the same types of guarantees. So in a way, that's where we started from. And we started building out on this research base into stronger consistency models for distributed databases and doing sync, from like a central cloud database out into whether it's to the edge or to the client.
And then we're a startup. So like we built a small team and you go through this journey, building a company of, you have ideas for what's going to be useful and valuable for people, and you have a sense of sort of where the state of the art is and, what doesn't exist yet, but as you then go and experiment, you just learn more and more. And so you work out actually what people need and what problems you can solve with it.
And so through that journey, we went from starting off thinking we were building a next generation distributed database, to using the replication technology for that system behind existing open source databases like Postgres and SQLite, to finding that local-first software as a pattern is really the killer app for that type of replication technology.
So people are looking to build local-first applications because of all of the benefits around UX, DX, resilience, et cetera, but to do that, you need this type of sync layer. And then when we first focused on that, we tried to build a very optimal end-to-end integrated local-first software platform. So for instance, if people saw Electric as a project, like this time last year, that's what we were building.
And in a way we just found that we were having to solve too many problems and there was too much complexity making a kind of optimal one-size-fits-all sort of magic active-active replication system. We were doing things like managing the way you did the database migrations and schema evolution and generating a type-safe client and doing the client-side reactivity as well as all this sort of core sync stuff. So, as you know, there's a lot to that kind of end-to-end stack.
Because we had wanted to build a system that integrated with people's existing software, like if you already had software built on Postgres or if you already had a working stack, building that sort of full system was in a way too complex and was difficult to adopt from existing software. So more recently we have consolidated down on building a much simpler sync engine, which is more like a composable tool that you can run in front of Postgres, any Postgres.
It works with any standard Postgres, any managed Postgres, any data model, any data types, any extensions that you have. And it just does this work of basically consuming the logical replication stream from Postgres and then managing the way that the data is fanned out to clients, doing partial replication. Because when you're syncing out, say, if you have a larger database in the cloud and you're syncing out to like an app or a kind of edge service, you don't want to sync all the data.
We have this sort of model of partial replication. And basically what we're aiming to do with the sync engine is just make that as simple to use and as bulletproof as possible. And we're making it with standard web technologies that make it easy to use with your existing systems and with your existing stack.
And so we went in a way from this sort of quite ambitious, tightly integrated end-to-end local-first software platform to now building more like composable tools that can be part of a local-first stack that you would assemble yourself as a developer, designed to be easier to adopt for production applications and to work with your existing code.
That makes a lot of sense.
And that definitely resonates with me personally as well, since, as you maybe know, before I founded Prisma, Prisma actually came as a pivot, out of a focusing effort, from a previous product that was called Graphcool, which was meant as a more ambitious next generation backend as a service.
Back then there was like Firebase and Parse, and so we wanted to build the next generation of that. But what we found back then in 2016 was that, while we'd been making a lot of progress towards that very ambitious, holistic vision, we had to basically boil multiple oceans all at the same time. And that takes a lot of time to fully get to all the different ambitious things that we wanted to.
So the only way forward for us where we felt like, okay, we can actually serve the kind of use cases that we want to serve in a realistic timeline was to focus on a particular problem, which is what Prisma eventually became. And by focusing just on the database tooling part and leaving the other back-endy things to other people.
And it sounds like what you've been going through with Electric is a very comparable exercise, like focusing exercise to trying to, from a starting point of like, let's build the most ambitious, the best local-first stack, like end to end by focusing more on like, okay, what we figured out where our expertise is, is around Postgres, is about, existing applications wanting to adopt local-first ideas, syncing approaches, et cetera. And that is what now led to the new version of Electric.
Did I summarize that correctly?
Yeah, exactly. Right. It sounds like a very similar journey. And I think it's interesting as well that as you focus in and you learn more about a problem space, you discover, in a way, more of the complexity in each aspect of it. So you realize there's actually more challenges to solve in a smaller part of it or a smaller scope.
And also it's interesting that I think for instance, when we started the project, I would have thought coming into this as a software developer, I'd go, Is a read path sync solved? I'd be like, well, there's quite a lot of read path kind of sync stuff. You can kind of do this.
There's various real time solutions, but actually as you dig into it, you find that there's a whole bunch of weaknesses of those solutions and they're actually hard to adopt or they have silos or they can't handle the data throughput. And so you realize that actually you don't necessarily need to bite off all of the more ambitious scope because actually you can deliver value by doing something simpler.
And I think also for me personally, learning about stewarding this type of product, it's understanding that you can still build out towards that more ambitious objective. So in the long run, you know, we want to sort of build back a whole bunch of capabilities into this platform, probably as a set of loosely coupled, composable tools.
So you mentioned the term read path syncing. Can you elaborate a little bit on what that means? So let's say I have an existing application.
Let's say I've built an API layer at some point. I have a React front end and I have all of my data sitting in Postgres. I've been inspired by products such as Linear, et cetera, who seem to wield a superpower called syncing. And now I've found ElectricSQL, which seems to connect the ingredients that I already have, such as Postgres and a front end, with my desired approach, which is syncing. So how does Electric fit into that? And what do you mean by read path syncing?
¶ Read and Write Path Syncing
Yeah. I mean, the sort of read path and write path when it comes to sync, the read path is syncing data, like onto the local device. So it's a bit like kind of data fetching from the server. And then the write path would be when like a user makes a write, and then you want to sync that data typically back to the cloud so that's sort of how we talk about them there.
I think there's something unique about local-first software compared to more sort of traditional web service systems, where you explicitly have a local copy of the data on device. And one of the challenges with that is, of course you can just load some data from the server and keep it in a cache, but if you do that, then you immediately lose any information about whether that data is stale.
So say a user goes to a route on your application and then clicks to go to another route and then comes back to the original one. So to load that original route, say you did a data fetch, but now you've navigated back to it. Can you display that data? Can you render the route or is the data stale?
And so you have this sort of thing where you don't really know, and you tend to build systems with like REST APIs and data fetching where you might show the data and go and try and fetch new data. But in a way it's that problem of, with local-first software, you want the data locally so that your application code can just talk to it locally and you're not having to code across the network. But that means that you need a solution to keep the data that is local fresh.
Like you don't want stale data. And if you build a sort of ad-hoc system, as we've all done across many generations of software applications, it's one of these things where you always end up building some sort of system to keep the data up to date. But what you really want is a properly engineered system that does it systematically for you. It is really an aspect of your application's architecture that can be abstracted away by a sync engine.
And so for us, focusing on the read path sync is about saying, okay, what data should be on the device, and let's just keep it fresh for you. And then with the write path, one of the things that we learned through the project is that there are a lot of valid patterns for handling how, when you do local writes on the device, you would get those back to the cloud. You can go through the database sync, you can do optimistic writes.
You could be happy with online writes and you have different models of like, can your writes be rejected? Are they local writes with finality? Or do you have a server authoritative system where when the write somehow syncs, it can be rejected and how do you handle that? And so there's actually a lot of different patterns for those writes, which are often relatively simple because different applications can be happy with certain trade-offs and you could pick a model like: okay, I'm going to show some optimistic state and make a request to an API server, and it's fine. And you get a kind of local-first experience with just a sort of simple model that says, okay, if the write is rejected when it syncs, then I'll just sort of roll it back and the user loses that work. And for many applications, that's fine.
For other applications, you might have a much more complex conflict resolution, or you're trying not to lose local writes, and there's different collaborative workloads. And so building a generic system that can give you a write path that gives you the best developer experience and user experience for all of that variety of scenarios is very, very hard, whereas building it on an application-by-application basis on the write path is actually often fairly straightforward.
It can be like POST to your API and use React's useOptimistic hook. And so, with building local-first applications that have both read and write path with Electric, the idea is that we do this core read path with partial replication, but then as you're building your application, you can choose, out of a variety, whichever pattern fits what you need the most for how you would choose to get the writes back into the server.
That makes a lot of sense.
So basically the more general-purpose building block that can be used across a wide range of different applications is actually how you read data, how you distribute the data that you want to have locally available in your applications. That would kind of replace the API GET requests from before. But what needs to happen in those PUT, POST, DELETE requests, this is where it depends a lot more.
And what you're basically arguing is that there are different sorts of write patterns that heavily depend on the kind of application. So that is where you're kind of leaning out. And previously with Electric, you tried to provide the silver bullet there. But actually, it's really hard, maybe impossible, to find the silver bullet that applies to all use cases. However, for the read path, it is very possible to provide a great building block that works for many use cases.
So, can you provide a bit of a better spectrum of the different write patterns that you've seen so far? Maybe map them to canonical applications that illustrate those use cases. And maybe, if you know, you can also compare them to something like Automerge, et cetera, and which sorts of write patterns that would be a good fit for, or not as much.
¶ Read Path use cases
Yeah. So I think the simplest pattern for writes with an application would be to just, for instance, send a write to a server and require you to be online. So, because there's many applications that are happy, for instance, with read only, like there's a lot of people who are building, data analytics applications, data visualization, dashboards, et cetera.
And so if you have a sort of read-heavy application, then in some cases it may just be a perfectly valid trade-off not to really deal with the complexity of, say, offline writes at all. But you still have a lot of benefits by having local data on device for the read path, because the way you can explore the application and the data is all just instant and local and resilient. Then the simplest pattern to layer support for offline writes on top of that, as a sort of starting point where, imagine that you have a standard REST API and you're just doing PUT and POST requests to it as normal, is to add this concept of optimistic state. So optimistic state is just basically you're saying, okay, I'm going to go and try and send this write to the API server. And whilst I do so, I'm going to be optimistic and imagine that that write is going to succeed, and two seconds later it's going to sync back into the state that I have here.
But in the meantime, I'm going to add this bit of local optimistic state to display it immediately to the user. And because in most cases that sort of happy path is what happens, you end up with what just feels like a perfect local-first experience, because it's an instantly displayed local write, and that sort of data is resolved in the background. Now, you know, immediately with that, you do then just introduce a layer of complexity with like, well, what happens when the write is rejected?
And so you have the challenge of, for instance, say you stacked up three writes. Did they depend on each other? So if one of them is rejected, should you reject all of them? And different applications and different parts of the application would have different answers to that question. In some cases, it's very simple to just go, if there's any problem with this optimistic state, just wipe it.
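As a rough illustration of the "just wipe it" policy described here, the following TypeScript sketch keeps confirmed data and pending local writes separately and merges them when the UI reads. All names (`OptimisticTodos`, `writeLocally`, `reject`, and so on) are hypothetical and not taken from Electric or any other library; sending the write to an API server is assumed to happen elsewhere.

```typescript
// Hypothetical sketch: every name here is invented for illustration.
type Todo = { id: string; title: string };
type PendingWrite = { writeId: number; todo: Todo };

class OptimisticTodos {
  private synced = new Map<string, Todo>(); // state confirmed by the server
  private pending: PendingWrite[] = [];     // local writes still in flight
  private nextWriteId = 1;

  // Called when confirmed data arrives from the server (for example via sync).
  applySynced(todo: Todo): void {
    this.synced.set(todo.id, todo);
  }

  // Record a local write so it can be displayed immediately.
  // The caller is assumed to send it to the API server separately.
  writeLocally(todo: Todo): number {
    const writeId = this.nextWriteId++;
    this.pending.push({ writeId, todo });
    return writeId;
  }

  // The server accepted the write: drop only that pending entry.
  confirm(writeId: number): void {
    this.pending = this.pending.filter((w) => w.writeId !== writeId);
  }

  // The server rejected a write: the simplest policy wipes all optimistic
  // state, because later writes may have depended on the rejected one.
  reject(_writeId: number): void {
    this.pending = [];
  }

  // What the UI renders: synced state with pending writes layered on top.
  view(): Todo[] {
    const merged = new Map<string, Todo>(this.synced);
    this.pending.forEach((w) => merged.set(w.todo.id, w.todo));
    return Array.from(merged.values());
  }
}

const todos = new OptimisticTodos();
todos.applySynced({ id: "1", title: "Buy milk" });
const firstWrite = todos.writeLocally({ id: "1", title: "Buy oat milk" });
todos.writeLocally({ id: "2", title: "Call Sam" });
todos.reject(firstWrite); // both optimistic writes are discarded
```

An application that wants to keep independent writes would replace `reject` with logic that tracks which pending writes depend on each other, which is where the extra complexity discussed here comes from.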
And for instance, like React's useOptimistic hook, its approach is just like, it waits for a promise to resolve. And when the promise resolves, it wipes the optimistic state. And so it's very much just like, if anything happens at all, it wipes the state.
Interestingly enough, there's also a lot of people coming from React Query and so on, from those sort of more traditional front end state management things, and that brings them to local-first in the first place, because they're layering one optimistic state handler on top of the next one. And if there's a little flaw inside of there, everything collapses, since you don't really have a principled way to reason about things. So that makes a lot of sense.
Exactly right.
And so like a framework like TanStack, for instance, with TanStack Query, it has slightly more sophisticated optimistic state primitives than just, say, the kind of primitive useOptimistic hook. And one of the challenges that you have, for say a simple approach of just using optimistic state to display an immediate write, is: is that optimistic state global to your application? Shared between components? Is it scoped within the component?
And so, as you say, there's an approach where you could come along and say, okay, I've got three or four different components and so far I've just been able to render the optimistic state within the component. But now I've got two components that are actually displaying the same information. And suddenly I've got stale data. It's like the old days of manual DOM manipulation where you forgot to update a state variable. And so, yeah, in a way that's where you come to a more proper local-first solution where your optimistic state would be stored in some sort of shared store. So it could just be like a JavaScript object store, or it could be an embedded database. And so you get slightly more sophisticated models of managing optimistic state. And the great thing is, like TanStack Query and others, there's a bunch of existing client-side frameworks that can handle that kind of management for you.
Once you go, for instance, to an embedded database for the state, one of the really nice points in the design space is to have a model where you sync data onto the device and you treat that data as immutable. So say, for instance, you're syncing a database table, say it's like a log viewer application, and you're just syncing the logs in, and it goes into a logs table.
Now, say the user can interact with the logs and delete them, or change the categorization. You can have a shadow logs table, which is where you would save the local optimistic state. And then you can do a bunch of different techniques to, for example, create a view or a live query where you combine those two on read.
So the application just sort of feels like it's interacting with the table, but actually it's split in the storage layer into an immutable table for the sync state and a kind of local mutable table. And the great thing about that is you can have persistence for both the sync state and the local mutable state. And of course it can be shared. So you can have multiple components which are all sort of just going through that unified data store.
And there's some nice stuff that you can do in SQL world, for instance, to use like INSTEAD OF triggers to combine it, so it just feels like you're working with a single table. Now it's a little bit of additional complexity on something like defining a client-side data model, but what it gives you is a very solid model to reason about. So you can go, okay, basically the sync state is always golden. It's immutable. Whenever it syncs in, it's correct.
If I have a problem with this local state, that's just mutable stuff. Worst case, I can get rid of it, or I can develop more sophisticated strategies for dealing with rollbacks and edge cases. So in a way it can give you a nice developer experience.
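The shadow table idea can be sketched in TypeScript using two in-memory maps as a stand-in for the two database tables. This is only an illustration under assumptions: in the setup described here the two would be real tables combined through a view, a live query, or triggers, and the names `LogRow`, `LocalChange`, and `readLogs` are hypothetical.

```typescript
// Hypothetical sketch: in-memory maps standing in for two database tables.
type LogRow = { id: number; message: string; category: string };

// A local change recorded against a synced row: a delete marker or a patch.
type LocalChange =
  | { deleted: true }
  | { deleted: false; patch: Partial<LogRow> };

// Written only by the sync process; treated as immutable by the app.
const syncedLogs = new Map<number, LogRow>();
// Local optimistic state; safe to discard in the worst case.
const shadowLogs = new Map<number, LocalChange>();

// Combine on read: overlay local changes on the synced rows.
function readLogs(
  synced: Map<number, LogRow>,
  shadow: Map<number, LocalChange>
): LogRow[] {
  const rows: LogRow[] = [];
  synced.forEach((row, id) => {
    const local = shadow.get(id);
    if (local === undefined) {
      rows.push(row); // untouched locally
    } else if (!local.deleted) {
      rows.push({ ...row, ...local.patch }); // locally edited
    }
    // Rows deleted locally are left out of the result.
  });
  return rows;
}

syncedLogs.set(1, { id: 1, message: "disk full", category: "error" });
syncedLogs.set(2, { id: 2, message: "user login", category: "info" });
shadowLogs.set(1, { deleted: false, patch: { category: "warning" } });
shadowLogs.set(2, { deleted: true });

const visibleLogs = readLogs(syncedLogs, shadowLogs);
// Rolling back every local change is just: shadowLogs.clear()
```

Because the synced map is never modified by application code, discarding the shadow map always returns the app to a state the server agreed with.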
With that model, you could choose then whether you're writing to the database, detecting changes, and then sending those to some sort of replication ingest point, or whether you're still just basically talking to an API and writing the local optimistic state separately. So at that point, again, you have this fundamental model of like: are you writing directly to the database and all the syncing happens magically?
Or are you just using that database as a sort of unified, local optimistic store? So this is the sort of type of like progression of patterns. And once you start to go through something where you would, for instance, have a synced state that is mutable, or you are writing directly to the database, that's really where you start to get a little bit more into the world of like convergence logic and kind of merge logic and CRDTs and sort of what's commonly understood as proper local-first systems.
And I think that's the point where the complexity of those systems does become very real, as you well know from building LiveStore, and as we see from the quality of libraries like Automerge, Yjs, et cetera. So that's probably where, as a developer, it makes sense to reach for a framework. And you certainly could reach for a framework for that sort of combine-on-read pattern: sync into an immutable table, persist local mutable state.
But what we find is that it's actually relatively straightforward to develop yourself. You can reason about it fairly simply, and so it's not too much extra work. As long as you've got that read sync primitive, you can build a kind of proper locally persistent, consistent local-first app yourself, basically, just using fairly standard front end primitives.
Right. Okay. Maybe sharing a few reflections on this, since I like the way you portrayed this sort of spectrum of different kinds of write patterns. In an interview that I did with Matthew Weidner, I learned a lot about the way he thinks about different categorizations of state management, particularly when it comes to distributed synchronization.
And I think one pattern that became clear there was that either you're directly manipulating the state, which is what Automerge, et cetera, are de facto doing for how you as a developer interact with the state. So you have like a document and you manipulate it directly.
You could also apply the same logic to a database table. For example, that's how cr-sqlite works, where you have a SQLite table and you manipulate a row directly, and that is being synchronized as the state, and you're ideally modeling this in a way where the state itself converges through some mechanisms, typically CRDTs.
But then there's another approach, which might feel like a little bit more work, but it can actually be concealed quite nicely by systems like, for example, LiveStore (in this case, unbiased), where you basically separate out the reads from the writes. And often enough, you can actually fully recompute your read model from the write model. So, if you express everything that has meaningfully happened for your application as a log of events, then you can, kind of like how Redux used to work or still works, fully recompute your view, your read model, from all the writes that have happened.
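A minimal TypeScript sketch of that idea, assuming a toy todo application: the read model is never edited directly, it is recomputed by folding a reducer over confirmed events followed by local optimistic events. The event names and the `materializeTodos` function are hypothetical and are not LiveStore's or Redux's actual API.

```typescript
// Hypothetical event types for a toy todo app.
type TodoEvent =
  | { type: "todoCreated"; id: string; title: string }
  | { type: "todoCompleted"; id: string }
  | { type: "todoDeleted"; id: string };

type TodoView = { id: string; title: string; done: boolean };

// Pure reducer: previous read model + one event -> next read model.
function applyTodoEvent(
  state: Map<string, TodoView>,
  event: TodoEvent
): Map<string, TodoView> {
  const next = new Map<string, TodoView>(state);
  switch (event.type) {
    case "todoCreated":
      next.set(event.id, { id: event.id, title: event.title, done: false });
      break;
    case "todoCompleted": {
      const existing = next.get(event.id);
      if (existing) next.set(event.id, { ...existing, done: true });
      break;
    }
    case "todoDeleted":
      next.delete(event.id);
      break;
  }
  return next;
}

// Recompute the whole read model: confirmed events first, then the local
// optimistic events that have not been confirmed yet.
function materializeTodos(
  confirmed: TodoEvent[],
  optimistic: TodoEvent[]
): TodoView[] {
  const state = confirmed
    .concat(optimistic)
    .reduce(applyTodoEvent, new Map<string, TodoView>());
  return Array.from(state.values());
}

const confirmedEvents: TodoEvent[] = [
  { type: "todoCreated", id: "a", title: "Write docs" },
  { type: "todoCreated", id: "b", title: "Ship release" },
];
const optimisticEvents: TodoEvent[] = [
  { type: "todoCompleted", id: "a" },
  { type: "todoDeleted", id: "b" },
];

const currentView = materializeTodos(confirmedEvents, optimisticEvents);
```

Dropping a rejected optimistic event and calling `materializeTodos` again is enough to roll the view back, since nothing was ever mutated in place.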
And I think that would work really, really well in tandem with Electric, where if you're replicating what has happened in your Postgres database as a log of historic events, then you can actually fully recreate whatever derived state you're interested in. And what is really interesting about that particular write pattern is that it's a lot easier to model and reason about locally.
You'd say like, hey, I got those events from the server; those events, I am applying optimistically. You can even encode sort of a causal order. If someone is confused about what causal order means, don't worry about it. You can probably keep it simple at the beginning, but once you layer on more and more dependent optimistic state transitions, this is where you want to have that information.
Okay. If I'm doing that, and then the other thing depends on that, that's basically a causal order. And modeling that as events, I think, is a lot simpler and is a way to deal with that monstrosity of losing control over your optimistic state. Since I think one thing that makes optimistic state management even more tricky is, like, how are things dependent on each other? And then also, when is it assumed to be good?
I think in a world where you use Electric, once you've got sort of confirmation from the Electric server, like, hey, those things have now happened for real, you can trust it. But there's some latency in between, and the latency might be increased by many, many factors.
One could be that you are on a slow connection, or the server is particularly far away from you and it might take a hundred milliseconds. Another might be that you have a spotty connection and packets get lost and it takes a lot longer. Or you're offline, and being offline is just a form of very high latency. And so with all of that, say you're offline, or it takes a long, long time, and maybe you close your laptop and reopen it.
Is the optimistic state still there? Is it actually locally persisted? So there are many, many more layers that make that more tricky. But I like the way how you're like, how you split this up into the read concerns and the write concerns. And I think this way, it's also very easy to get started with new apps that might be more read heavy and are based on existing data.
I think this is a very attractive trade-off, where you say like, hey, with that, I can just sync in my existing data and then go step by step, depending on what I need, if I need it at all. Many apps don't even need to do writes at all, and then you can just get started easily.
Yeah, I think, I mean, that's explicitly a design goal for us is like, yeah, if you start off with an existing application and maybe it's using REST APIs or GraphQL, it's like, well, what do you do to start to move that towards a local-first architecture? And exactly, you could just go, okay, well, just, let's just leave the way that we do writes the same as it is. And let's move to this model of like syncing in the data instead of fetching the data. And that can just be a first step.
And I think, I mean, across all of these techniques for writes, there is just something fundamental about keeping the history or the log around as long as you need it, and then somehow materializing values. So sort of internally, this is what a CRDT does, right? It's clever and has a sort of lattice structure for the history, but basically it keeps the information and allows you to materialize out a value. Or, if you just have like an event log of writes, as you were saying with LiveStore, when you have a record of all the write operations, you can just process that log. So I think, you know, you can do it sort of within a data type.
And I think that fits as well for greenfield applications where you're trying to craft kind of real-time or collaboration and concurrency semantics. But from our side, we're coming at it from the point of saying, right, when you've got applications that build on Postgres, you already have a data model.
You just sort of layer the same kind of history approach on top by keeping a record of the local writes until you're sure you can compact them. And actually that same principle is exactly how the read path sync works with Electric. So Postgres logical replication basically emits a stream of transactions that contain write operations, and it's basically inserts, updates, and deletes with a bit of metadata.
And so we end up consuming that and basically writing out what we call shape logs. So we have a primitive called a shape, which is how we control the partial replication, like which data goes to which client and a client can define multiple shapes, and then you stream them out. But that shape log comes through our replication protocol as just that stream of logical update operations. And so in the client, you can just, you can materialize the data immediately.
So like we provide, for instance, a shape stream primitive in a JavaScript client that just emits the series of events. And then we have a shape, which will just take care of materializing that into a kind of map value for you. But you could do whatever you wanted with that stream of events. So if you found that you wanted to keep around a certain history of the log in order to be able to reconcile some sort of causal dependencies, that's just totally up to you.
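To make the "stream of operations materialized into a map" idea concrete, here is a small TypeScript sketch. It is not Electric's client API: the `ShapeLogEntry` type and `materializeShape` function are hypothetical, and the entry format is an assumption that only mirrors the insert, update, and delete operations mentioned here.

```typescript
// Hypothetical log entry format; the real protocol carries more metadata.
type ShapeRow = Record<string, unknown>;
type ShapeLogEntry =
  | { operation: "insert"; key: string; value: ShapeRow }
  | { operation: "update"; key: string; value: ShapeRow } // changed columns only
  | { operation: "delete"; key: string };

// Fold a log of operations into a map of current rows, keyed by row key.
function materializeShape(log: ShapeLogEntry[]): Map<string, ShapeRow> {
  const rows = new Map<string, ShapeRow>();
  log.forEach((entry) => {
    switch (entry.operation) {
      case "insert":
        rows.set(entry.key, entry.value);
        break;
      case "update":
        // Merge the changed columns into whatever is already stored.
        rows.set(entry.key, { ...(rows.get(entry.key) ?? {}), ...entry.value });
        break;
      case "delete":
        rows.delete(entry.key);
        break;
    }
  });
  return rows;
}

const exampleLog: ShapeLogEntry[] = [
  { operation: "insert", key: "items/1", value: { title: "Draft", done: false } },
  { operation: "insert", key: "items/2", value: { title: "Temp", done: false } },
  { operation: "update", key: "items/1", value: { done: true } },
  { operation: "delete", key: "items/2" },
];

const materializedRows = materializeShape(exampleLog);
```

A client that wants to reason about causal dependencies could keep `exampleLog` itself around instead of only the materialized map, which is the flexibility described here.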
And so, yeah, it's quite interesting that it's almost just the same approach, which is the general sort of principle for handling concurrency on the write path is also just exactly what we've ended up consolidating down on exposing through the read path stream.
That makes a lot of sense. So, let's maybe go a little bit more high level again. For the past couple of minutes, we've been talking a lot about how Electric happens to work under the hood.
And there's many commonalities with other technologies and all the way to CRDTs as well. But going back a little bit towards the perspective of someone who would be using Electric and build something with Electric and doesn't maybe peel off all the layers yet, but get started with one of the easier off the shelf options that Electric provides. So my understanding is that you have your existing Postgres database.
You already have your tables, your schema, et cetera, or if it's a greenfield app, you can design that however you want. And then you have your Postgres database. Electric is that infrastructure component that you put in front of your Postgres database that has access to your Postgres database. In fact, it has access to the replication stream of Postgres. So it knows everything that's going on in that database.
And then your client is talking to the Electric sync engine to sync in whatever data you need. And the way that's expressed, what your client actually needs, is through this concept that you call shapes. And my understanding is that a shape basically defines a subset of data, a subset of a table that you want in your client, since often tables are so huge and you just need a particular subset for your given user, for your given document, whatever. Is that accurate?
¶ The role of Shapes
Yeah, that's just exactly how it works. And the Electric sync engine, it's a web service. It's a Docker container; technically it's an Elixir application. And it just connects to your Postgres as a normal Postgres client would. So you have to run your Postgres with logical replication enabled. And then we just connect in over a database URL.
And so it's just as if you were, like, imagine you're deploying a Heroku app, and it's sort of Heroku Postgres, and it just provisions a database URL, and your back end application can connect to it. So it's the same way that a sort of Rails app would talk to Postgres. And then Electric does some stuff internally to sort of route data into these shape logs, which are the sort of logs of update operations for each kind of unit of partial replication.
And then we actually just provide an HTTP API, which is quite key to a whole bunch of the affordances of the system. So I can dive into that if it's interesting. But then, yeah, you basically have a client which pulls data by just making HTTP requests. And so HTTP gives you back pressure, and the client's in control of which data it pulls when, and then how you process that stream. Yeah, we do provide some primitives to make it simple.
Like we give you React hooks to just sort of bind a shape to a state variable, but basically, you can do what you like with the data as it streams in. So, yeah, I would love to learn more about that design decision of choosing HTTP for that network layer, for that API. Since I think most people think about local-first, think about real time syncing, et cetera, that reactivity. And for most people, I think particularly in the web, the mind goes to WebSockets. So why HTTP?
Wouldn't that be very inefficient? How does reactivity work? Can you walk me through that?
¶ Why using HTTP for network layer?
Yeah, so, I mean, exactly. We went on that journey with the product, where with the earlier, slightly more ambitious Electric that I was describing, we built out a custom binary WebSocket protocol to do the replication. And it's just what you sort of immediately think: you're like, let's make it efficient over the wire, and obviously it should be a WebSocket connection because you're just having these sorts of ongoing data streams. But one of the things that happened with the
focusing of the product strategy was that Kyle Mathews joined the team. So Kyle was actually the founder of Gatsby, which is like the React framework. And through Gatsby, he did a lot of work around basically data delivery into CDN infrastructure. And so one of the insights that Kyle brought into the team was, what if we re-engineered the replication protocol on plain HTTP, and we just do plain HTTP, plain JSON, and we replicate over an old fashioned long polling protocol.
So basically we have a model where the client makes a request to a shape endpoint, and then we just return the data that the server knows about. So we'll sort of chunk it up sometimes over multiple requests, but it's just a standard request to load a JSON document.
And then once you get a message to say that the client is up to date with the server, then you trigger into a long polling mode where basically the server holds the connection open until any new data arrives. and yes, you kind of think instinctively like, okay, it's say JSON instead of binary, so it'll be less efficient and you're having to make these sort of extra requests that surely they add latency over some sort of more optimized, WebSocket protocol.
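The two phases described here, chunked catch-up requests followed by long polling once the client is up to date, can be sketched roughly as follows. This is not the actual Electric wire protocol: the `offset` and `live` query parameters, the `/shape/issues` path, and the in-memory `serveShape` stand-in for the server are all assumptions made so the example runs without a network.

```typescript
// Hypothetical types and parameter names; not the actual Electric protocol.
interface LogEntry { offset: number; payload: string }
interface ShapeResponse { entries: LogEntry[]; upToDate: boolean }

// In-memory stand-in for the shape endpoint: returns at most `chunkSize`
// entries after `offset` and flags whether the client has caught up.
function serveShape(shapeLog: LogEntry[], offset: number, chunkSize: number): ShapeResponse {
  const entries = shapeLog.filter((e) => e.offset > offset).slice(0, chunkSize);
  const last = entries.length > 0 ? entries[entries.length - 1].offset : offset;
  const newest = shapeLog.length > 0 ? shapeLog[shapeLog.length - 1].offset : -1;
  return { entries, upToDate: last >= newest };
}

// Build the URL for the next request. Once up to date, ask for "live" mode,
// where a real server would hold the connection open until new data arrives.
function nextUrl(base: string, offset: number, upToDate: boolean): string {
  return `${base}?offset=${offset}` + (upToDate ? "&live=true" : "");
}

// Catch-up phase: keep requesting chunks until the server says we are up to date.
function catchUp(shapeLog: LogEntry[], chunkSize: number) {
  let offset = -1;
  let upToDate = false;
  const requested: string[] = [];
  const received: LogEntry[] = [];
  while (!upToDate) {
    requested.push(nextUrl("/shape/issues", offset, upToDate));
    const res = serveShape(shapeLog, offset, chunkSize);
    received.push(...res.entries);
    if (res.entries.length > 0) offset = res.entries[res.entries.length - 1].offset;
    upToDate = res.upToDate;
  }
  // From here on, the client would switch to long polling with this URL.
  return { requested, received, liveUrl: nextUrl("/shape/issues", offset, upToDate) };
}

const serverLog: LogEntry[] = [0, 1, 2, 3, 4].map((i) => ({ offset: i, payload: `op-${i}` }));
const syncResult = catchUp(serverLog, 2);
```

Note that every catch-up URL is determined only by the shape and the offset, not by who is asking, which is what lets the same responses be reused across clients.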
But the key thing is that by doing that, it allows us to deliver the data through existing CDN infrastructure. So those initial data loading requests, like typically when you're building applications on this shape primitive, you can find ways of defining your shapes so that they're shared across users.
You might have some unique data that's unique to a user, but Like say you have a project management app and there's various users who are all in the same project, you could choose to like sync the kind of project data down rather than just sort of syncing all the user's data down. And so that way you get shapes being shared across users.
And so the first user to request it hits the Electric service, we generate these responses, but then they go through Cloudflare or Fastly or CloudFront or what have you. And every subsequent request is just served out of like essentially Nginx or Varnish. And so it's just super efficient. All of this infrastructure is just like super battle tested and as optimized as it can be. That is very interesting.
It reminds me a little bit of like how modern bundlers, and I think even like all the way back to Webpack, used to split up larger things into little chunks. And those chunks would be content hashed. And that would be then often, be cached by the browser across different versions of the same app. In this case, it would be beneficial to the individual user who would reload it.
And also, of course, to other people who visit this. But now you take the same idea even further and apply it to data shared across users by applying the same infrastructure, HTTP servers, CDNs, et cetera, to make things cheaper and faster, I guess. Well, and the local browser cache or client cache as well. So you have this sort of shared caching within a CDN layer where you might have multiple clients, which is literally a sort of shared cache in the HTTP cache control.
That makes a lot of sense. Since, on a WebSocket level, I'm not sure whether you have clear caching semantics. I don't think so. Yeah, you'd have to do some very sort of custom stuff to achieve the same things. But also because, with the browser, when you're loading data, HTTP requests with the right cache headers can just be stored in the local file cache.
So one of the really nice things with just, like loading shape data through the Electric API is you can achieve an offline capable app without even having to implement any kind of local persistence for the data that's loaded into the file cache. So that sort of model, if like say you've gone to a page and you've just loaded the data through Electric, even if you didn't store the data, if you navigate back to the same page, the data's just there out of the file cache.
So the application can work offline without even having any kind of persistence. So you almost get like, I mean, there's some sort of edge cases on this stuff, but it's the thing, because you're just working with the standard primitives, you've just got the integration with the existing tooling and you get a whole bunch of these things for free.
That is very elegant and I guess that is being unlocked now because like you embrace the semantics of change of like how the data changes more and by modeling and this is where it now gets relevant again why everything here is modeled as a log under the hood since like to the log you just append and so you can safely cache everything that has happened up until a point in time, and from there on, you just add things on top, but that doesn't make the stuff that has happened before less valid.
So you can cache it immutably. That makes it super fast. You can cache it everywhere on the edge, on your local device, et cetera. And that gives you a checkpoint that at least once in a point in time was valid, and now there might be more stuff that should be applied on top of it, but that's already a better user experience than not getting anything. I mean, another thing is like the operational characteristics of the system, for this type of sync technology.
So, for instance, again, comparing HTTP with WebSockets, WebSockets are stateful, and you do just keep things in memory. And so if you look across most real time systems, they have scalability limits, because you will come to the point where if you have, say, 10,000 concurrent users, it's almost like, you know, the thing of don't have too many open Postgres connections.
But if you're holding open 10,000 WebSockets, you may be able to do the IO efficiently, but you will ultimately be growing that kind of memory and you'll hit some sort of barrier. Whereas, with this approach, you can basically offload that concurrency to the CDN layer.
So, it's not just about taking away the query workload of the cached initial sync requests. These kinds of reverse proxies or CDNs have a really nice feature called request collapsing or request coalescing, which means that when requests come in on a URL, if they have two clients making a request to the same URL at the same time, they sort of hold both of them at the cache layer and only send one request on to the origin server.
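Request collapsing can be illustrated with a toy cache layer. This is only a sketch of the general technique, not how any particular CDN implements it; `createCollapsingCache`, `fetchFromOrigin`, and the URLs are hypothetical names for illustration.

```typescript
// A toy cache layer that collapses concurrent requests for the same URL
// into a single request to the origin.
function createCollapsingCache<T>(fetchFromOrigin: (url: string) => Promise<T>) {
  const inFlight = new Map<string, Promise<T>>();
  return function get(url: string): Promise<T> {
    const pending = inFlight.get(url);
    if (pending) return pending; // join the request already on its way to the origin
    const request = fetchFromOrigin(url);
    inFlight.set(url, request);
    // Forget the entry once the origin has answered, whether it succeeded or failed.
    const clear = () => { inFlight.delete(url); };
    request.then(clear, clear);
    return request;
  };
}

// Count how many requests actually reach the (simulated) origin.
let originCalls = 0;
const cachedGet = createCollapsingCache(async (url: string) => {
  originCalls += 1;
  return `response for ${url}`;
});

// Two clients long polling the same shape at the same offset share one origin request.
const firstClient = cachedGet("/shape/issues?offset=4&live=true");
const secondClient = cachedGet("/shape/issues?offset=4&live=true");
// A different shape is a different URL, so it goes to the origin separately.
const otherShape = cachedGet("/shape/projects?offset=0&live=true");
```

With many clients waiting on the same long-poll URL, the origin only ever sees one open request per distinct URL, however many clients are held at the cache layer.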
And so basically we've been able to scale out now to 10 million concurrent clients receiving real time data out of Electric on top of a single Postgres. And there is literally no CPU overhead on the Postgres or the Electric layer. It's just entirely handled out of the CDN serving.
And so it's sort of remarkable. With the combination of the initial data load caching, one of our objectives is that we want to be as fast as just querying the database directly for an initial data load, and then orders of magnitude faster for any subsequent requests coming out of the cache. But there's also this sort of challenge, this thing about saying, okay, you're building an application.
You maybe want some of the user experience or developer experience affordances of local-first, but if to do that, I need a sync engine and a sync engine is kind of a complex thing. And so you end up either going, okay, maybe I'll sort of use an external system.
And then you get, like, a siloed real time database in your main database and you get operational complexity, or you get some sort of system where, yeah, you're basically sort of stewarding these WebSockets and it's very easy for it to fall over. And I think actually, if you just honestly view that type of architectural decision from the lens of somebody trying to build a real project, which is their day job, trying to get stuff done.
You're just going to avoid that as much as you can, because you'd far rather just, like, I just want to serve this with Nginx. I know how that's going to work. I'm not going to stay up at night worrying about it. Whereas if I have 10,000 concurrent users going through some crazy WebSocket stuff, I'm going to get pager alerts. And so the whole approach here with what we're trying to do is to change that sense that sync is a complex technology that you sort of
play with on the weekend and only adopt when you have to. So going, look, you can actually do sync in such a way that it is just as simple and standard as normal web service technology. And then suddenly you can actually unlock the ability for kind of real projects, you know, you can take this stuff into a day job and not get it shouted down at the design meeting because it just feels like too much black box complexity. You're using the word simple here.
And I think that really speaks to me now, because it's both simple in terms of architecturally, like, how does data flow? so I think this is where Electric provides a very simple and I think easy to use and easy to work with trade off, like, how does data flow, but then it's also gives a very simple answer of like, how does it scale?
Since you can throw at it all the innovations and all the hard work that has now gone into our web infrastructure for the last decades, you can run on the latest and greatest and all the innovations of Nginx and HAProxy and Cloudflare and all the work that has gone into that. You can just piggyback on top of that without having to innovate on the networking side as well, since you're really doing the hard work on the more semantic and data side.
And that's a really, really elegant trade off to me. Yeah. And it's fun because in our benchmarking testing at the moment, we break Cloudflare before we break Electric. If something is battle tested, it's Cloudflare. And again, it carries on, because it's not just about this sort of scalability or operational stuff. It's also about then how you can achieve, like we talked about, the write patterns. And so this sort of pattern of how do you do writes?
And it's like, well, actually you can do the sync like this and use your existing API to do writes. And it can work with your existing stack. But you have other obvious concerns with this type of architecture, like, say, authentication, authorization, data security, encryption. But HTTP just has proxies and it works with the sort of middleware stack. And so for us, a shape endpoint as a sync endpoint is just an HTTP resource.
So if you want to just put like an authorization service in front of it, you just proxy the request through and you like, you have the context from the user, you can have the context about the shape and you can just authorize it using your existing stack. If you want to do encryption, then you can do that. It's just a stream of messages. And yeah, a bit like you were saying that, like with Electric, you could just use it as a transport layer to like, say, route a log of messages.
That can be ciphertext or plaintext. So you could just like encrypt on device, sync it through. You can just decrypt whenever you're consuming the stream. And again, you could do that, like in the client, you could do that in HTTP middleware. So a lot of the sort of concerns, which, like certainly our experience of trying to build a more integrated end to end local-first stack, you know, you go, okay, we need to, we need to solve this.
I need a security rule system because suddenly there is no API and how am I going to authorize the data access? And it's like, we don't need a security rule system. Because you can just use, you can just use normal API middleware in front of an HTTP service. And so you just sort of take that problem out of scope and like the system doesn't need to do encryption.
It doesn't need to provide a kind of hooks mechanism or some sort of framework extensibility, because the protocol is extensible and you just have all of this ecosystem of existing tooling built around it. So, I mean, it's been fantastic for us because it simplifies all of these aspects. And it allows us to go, look, this is how you can achieve, say, authorization with Electric, but again, it pushes it out of scope.
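As a rough illustration of putting an authorization step in front of a shape endpoint, the sketch below shows only the decision a proxy might make before forwarding a request. Everything specific in it is an assumption for the example: the `User` type, the `project_id` query parameter, and the `http://sync.internal` upstream address are hypothetical and not part of Electric's API.

```typescript
// Hypothetical types for illustration; not part of Electric's API.
interface User { id: string; projectIds: string[] }
type ProxyDecision =
  | { status: 401 | 403 }
  | { status: 200; upstreamUrl: string };

// Parse a query string into a key -> value map (minimal: no repeated keys).
function parseQuery(url: string): Map<string, string> {
  const query = url.split("?")[1] || "";
  const params = new Map<string, string>();
  for (const pair of query.split("&")) {
    if (!pair) continue;
    const [key, value = ""] = pair.split("=");
    params.set(decodeURIComponent(key), decodeURIComponent(value));
  }
  return params;
}

// Decide whether a shape request may be forwarded to the sync service.
function authorizeShapeRequest(user: User | null, requestUrl: string, upstreamBase: string): ProxyDecision {
  if (user === null) return { status: 401 }; // not authenticated
  const projectId = parseQuery(requestUrl).get("project_id");
  if (projectId === undefined || user.projectIds.indexOf(projectId) === -1) {
    return { status: 403 }; // authenticated, but not a member of this project
  }
  const query = requestUrl.split("?")[1] || "";
  return { status: 200, upstreamUrl: `${upstreamBase}?${query}` };
}

const alice: User = { id: "alice", projectIds: ["p1"] };
const upstream = "http://sync.internal/shape/issues";
const allowed = authorizeShapeRequest(alice, "/shape/issues?project_id=p1&offset=-1", upstream);
const denied = authorizeShapeRequest(alice, "/shape/issues?project_id=p2&offset=-1", upstream);
const anonymous = authorizeShapeRequest(null, "/shape/issues?project_id=p1&offset=-1", upstream);
```

In a real deployment this check would live in whatever gateway or API middleware the application already uses, with the user coming from the existing session or token handling.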
So we get to focus our engineering resources on just doing the core stuff to deliver on this core proposition. So which sort of things would you say are particularly tricky from an application developer perspective with Electric, where it might be not as much of a good fit? I think one of the things is that we sync through the database and that has latency.
And so if you're trying to craft a really low latency real time multiplayer experience, or even doing things where in a way it doesn't really make sense to be synchronizing that information through the database layer, then it's maybe not the best solution. So sort of for like presence features, let's say in Figma, where you see my mouse cursor moving around, those sort of things.
Yes, it would be nice if it was in real time shared across the various collaborators, but you don't need a persistent trace of that for eternity in your Postgres database. So I think a common approach for that as well is just to have two different channels for how your data flows, like your persisted data that you want to actually keep around as a fixed trail. Like, did I create this GitHub issue or not?
But like how my mouse cursor has moved around, it's fine that that's being broadcasted, but if someone opens it an hour later, it's fine that that person would never know. So for this sort of use case, it's overkill basically to pipe that through Postgres. Yeah. And, you know, for us, Postgres is a big qualifier.
It's like, if you, if you want to use Postgres, if you have an existing Postgres backed system, like Electric shines where like, yeah, you have, you already use Postgres or you know that you want to be using Postgres, maybe you already have a bunch of integrations on the data model already, maybe you do have existing API code, like this is the scenario where we're really trying to say, well, look, in that scenario, this is a great, pathway to move towards these more advanced
local-first sync based architectures, where, whereas if you look at it from a sort of more greenfield development point of view, and you're trying to craft a particular concurrency semantics, say, you would reach for Automerge and you would get custom data types, which you can craft advanced kind of invariant support with your kind of data types. But of course, you know, so that's a slightly different sort of world.
And I think a lot of people in the local-first space dive into CRDTs and so forth, you know, it's really fascinating to try to craft these sort of optimized, presence-style, immediate real time streaming experiences.
And so whilst we do real time sync, it's almost more about keeping the data fresh and just sort of making sure that the clients are eventually consistent, rather than making that more sort of game kind of experience where, you know, maybe peer to peer matters more, or sort of finding clever hacks to have very low latency kind of interactions.
¶ PGlite
That makes a lot of sense. So now we've talked a lot about Electric, and Electric is the name of the company. It's the name of your main product. But there's also been a project that I'm not sure whether you originally created, but it's certainly in your hands at this point. It's called PGlite. That made the rounds on Hacker News, etc. Also through a joint launch with the folks at Supabase. What is PGlite? What is that about?
Yeah, so I mean, interestingly with Electric, we started off building a stack which was syncing out of Postgres into SQLite, because it made sense as the sort of main embeddable relational database.
And I remember speaking to Nikita, who is the CEO at Neon, the Postgres database company, and some of his advice from building SingleStore or MemSQL was that the impedance or the mismatch between the two database systems and the data type systems will continue to just be a source of pain for as long as you build that system. And so we were just having these conversations about going, how do we make this Postgres to Postgres sync? And then you can just eliminate any mismatch.
You don't even need to do any kind of serialization of the data. You can just literally take it exactly as it comes out of the binary format that comes through in a query or the replication stream from Postgres, put that into the client, and you can have exactly the same data types and exactly the same extensions. So this was a sort of motivation for us. And co-founder Stas, the CTO at Neon, had done an experiment
to try and make a more efficient Wasm build of Postgres that could potentially run in the client. So previously there'd been some really cool work by Supabase, by Snaplet, a few teams, which had developed these sorts of VM based Wasm Postgreses. But they were pretty big. They didn't really have persistence. They were sort of more of a kind of proof of concept. And the approach that Stas took was to do a pure Wasm build and run Postgres in single user mode.
And that allowed you to basically remove a whole bunch of the concurrency stuff within Postgres, which allowed us to make a much, much smaller build. So they shared that repo. And we sort of, played with it for a little while. Didn't quite manage to kind of make it work. And then one of the guys on our team, Sam Willis, just picked it up one week and put in some concerted efforts and basically managed to pull it together with persistence as a three meg build.
And it worked, and so suddenly we had this project which was like a three meg build. SQLite, for context, is like a one meg Wasm build, and Postgres is a much larger system and you'd think it would be much bigger, but suddenly actually it's not that far off in terms of the download speed, and it could just run as a fully featured Postgres inside the browser. And so we sort of tweeted that out and it's gone a bit crazy.
I think it's the fastest growing database project ever on GitHub. It's like 250,000 downloads a week nowadays. There's lots and lots of people using it. Supabase are using it in production. Google are using it in production. Lots of people are building tooling around it, like Drizzle integrations, et cetera. And it's the sort of thing that just should exist, right?
There should be a Wasm build of Postgres. Just being able to have the same database system instead of mapping into an alternative one has these fundamental advantages, and also a lot of people have just been coming up with a whole range of interesting use cases for it as a project. So some people are interested in running it inside edge workers, as a sort of data layer that you can hydrate data into for kind of background jobs.
Some people are interested in running it as just a development database. So you can just npm install Postgres. And if you're running an application stack, you don't have to run Postgres as an external service. The same thing in your testing environment. So there's a whole bunch of different use cases.
And in fact, some of the work, for instance, that Supabase have done is they built a very cool project called database.build, which is a sort of AI driven database backed application builder. So it's a sort of AI app builder for building Postgres backed applications, and it just runs purely on PGlite in the client. And so that's a demonstration where,
with this sort of database infrastructure for running software, you had centralized databases, and then you had this sort of move to serverless with separation of compute and storage. And now you sort of have this model where actually you can run the compute, with a whole range of different storage patterns, in the client. And you don't even need to deploy any infrastructure on the server to run database driven applications.
It really reminds me of that time when JavaScript was getting more and more serious. And at some point there was Node.js, and suddenly you could run the same sort of JavaScript code that you were running in your browser now also on the server. And well, the rest is history, right? That changed the web forever. It has changed dramatically how JavaScript has just become the default full stack foundation for almost every app these days.
And there seem to be a lot of similar characteristics. This time it's the other way around, going from the server into the client; with Node, it was rather the other way around. But that seems like a huge deal. Yeah, you know, you sort of step forward and we sort of see, I guess, some of these trends in data architecture and just, you know, it can just be the same database everywhere. And in a way, it's just sort of almost logically extended to wherever you want.
And you almost like, you can just have this idea of like declarative configuration of what data should sit where. AI systems can optimize transfer and placement, and it is just all the same kind of data types.
And I think this is sort of where systems are moving to, but also just some of these things we've been learning with PGlite. Like for instance, if you're running a system that relies on having, say, a database behind your application, and say it's a SaaS system and you're spinning up some infrastructure for a client, with PGlite you don't necessarily need to spin up a database in order to serve that client.
So if you think about something like the free tier of a SaaS platform like that, it can just change the economics of it. It can do that on the server by just allowing you to have the Postgres in process. So you're not deploying additional infrastructure. But also you move it all the way into the client and there just is no compute kind of running on this. It just moves even more of the compute onto the client.
And I think it obviously aligns with local-first in general, but I know some of the stuff we've talked about before around the concept of local-only-first. And as a developer experience for building software, one of the things that LiveStore is specifically designed to support is this ability to build an application locally with very fast feedback and iteration. And then you progressively add on, say, sync or persistence and sharing and things when you need to.
And I think this sort of model of being able to build the software on a database like PGlite and then go, okay, I've played with this enough, I want to save my work. And it's at that point that you write out to blob storage, or you maybe provision the database to be able to sort of save the data into. Yeah, I think you've touched on something really interesting and something really profound, which I think is kind of two second order effects of local-first.
And so one of them is for the app users directly. So ideally it should just become so cheap and so easy to offer the full product experience as sort of like a taste, fully on the client that is no longer sitting behind a paywall. But if the product experience generally allows for that, if it's sort of like a note, note taking tool or something like that, that I should be able to like fully try out the app, on my device and doing the signup later and being able to offer that economically.
That is basically with those new technologies, that's no longer an argument, so you can offer it. So hopefully that will be a second order effect where software is way easier to offer, where it's way easier to just try it out from an end user perspective. But then also from the second point, from an application developer perspective, I think it makes a huge difference in terms of complexity.
How, when you build something, whether it is just a local script without any infrastructure, whether you can just run it, has no infra dependencies, you can just run it, maybe you run like your Vite dev server. And that's it. It's self contained and you can move on. There's like no Docker thing you need to start, et cetera. That's like your starting point.
And if the barrier to entry there, if like, if that threshold is lower, that you can build a fully functional thing just for yourself, just in that local session, and you can get started this way, and if you then see like, Oh, actually, there's a case here that I want to make this a multiplayer experience or a multi tenant experience, then you can take that next step. But right now, like, you can't really, leap ahead there.
You need to start from that multi tenant, that multi player experience, and that makes the, the entry point already so much more tricky that many projects are never getting started. And I think both of those, I think can be second order effects and improvements that local-first inspired architectures and software can provide. So, I love those observations. Yeah, yeah, totally.
And I mean, I think, for instance, with, it's interesting as well that a lot of people do define their database schema using tools like Prisma, Drizzle, like Effect Schema is a great example that obviously you're working on. the more layers or indirection between where you're, say, iterating on the user experience in the interface, and you want to be able to, say, customize a data model to adapt to trying to sort of iterate there quickly.
But if you have to sort of go all the way into some other language, another system, it just sort of takes you out of context and slows everything down. So that's somehow the ability to like, yeah, apply that sort of schema into the local database, not have to sort of work against these sort of different legacy layers of the stack in order to actually be able to build out software is really transformational.
¶ The relation between Electric and PGlite
So going back to PGlite for a moment, how does PGlite and Electric, Electric as a product and Electric as a company, how do those things fit together? Yeah. I mean, there basically are sort of two main products. We have two products. They're both open source, Apache licensed. One is the Electric Sync Engine, and one is PGlite.
And so you can use them together, or you can just use them independently. So it's not like the Electric system is designed only to sync into PGlite; you don't have to have an embedded Postgres to use Electric, and you can use PGlite just standalone. There's a range of different mechanisms to do things like data loading, data persistence, et cetera, virtual file system layers, loading in, unpacking Parquet files.
But if you do like have an application with this local database and you wanted to then be able to sync that data with other users or into your Postgres database, then Electric is just a great fit. And obviously we make a kind of first class integration.
So I think for us, I mean, as a, as a company, as a startup, Electric is the main product that we aim to build the business around, because in a way that type of operational data infrastructure is just slightly more natural to build a commercial offering around, like you have to run servers to move the data around, we can do that efficiently, it sort of makes sense and adds value.
Whereas with PGlite as an open source embedded database, it's not something that we're aiming to sort of monetize in quite the same way. And potentially, maybe it could be upstreamed into Postgres, like, you know, there should be a Wasm build of Postgres. Or, you know, maybe it kind of moves into a foundation and sort of develops more governance, like certainly already with PGlite.
So Supabase co-sponsored one of the engineering roles with us, and there's been contributions from a whole bunch of companies. So it is already a sort of wide tent in terms of the stakeholders who are stewarding the development of the project. That is very cool to see. I'm a big fan of those sort of multi organizational approaches where you share the effort of building something. And, yeah, I love that. I'm very excited to get my own hands on PGlite as well.
I'm mostly dealing with SQLite these days just because I think it is still a tad faster for like, those single threaded embedded use cases. But if you need the raw power of Postgres, which often you do, then you can just run it in a worker thread and you get the full power of Postgres in your local app, which is amazing. So maybe rounding out this conversation on something you just touched on, which is a potential commercial offering that Electric provides. can you share more about that?
Which problems it intends to solve and where it's currently at?
¶ Electric commercial offering
Yep, so we're building a cloud offering, which is basically hosting the Electric sync service. So, for instance, we don't host the Postgres database. We don't host your application. We just sort of host that kind of core sync layer, and then that can integrate with other Postgres hosts like Supabase, Neon, et cetera, and kind of other platforms for deploying applications. That's our sort of first commercial offering.
And we kind of see that as like an almost sort of utility data infrastructure play, where we've put a lot of effort into being able to run the software very resource efficiently, and with sort of flat resource usage, so it doesn't, you know, scale up in memory with concurrent users, etc. So we want to be able to run that very efficiently. And so we sort of see that as kind of low-cost, usage-based pricing, based basically on the sort of data flows running through the software.
I think, you know, monetizing open source software is quite an interesting topic, but there are also a lot of common patterns that are well known. And ultimately our aim as a company is, we want people building real applications with this technology, and we want developers to enjoy doing it and become advocates of the technology.
And then there is a pathway where, imagine that you're a large company and say you have like five projects and they're all using Electric sync. It's very common for those sort of larger companies to need additional tooling around that. So governance, compliance, data locality. There's a whole bunch of sort of considerations there. So it's quite common to be able to build out a sort of enterprise offering on top of the core open source product.
And so, you know, there are various routes like that, that we could choose to pursue in future. And maybe that's how it plays out: as we build a cloud, we focus on making this sync engine and these components bulletproof, and make sure people are being successful building applications on them. And then we can look at maybe some sort of value-added tooling to help you operate them successfully at scale, or help you operate them within sort of larger companies or regulated contexts.
¶ Outro
That makes a lot of sense. Great. James, is there anything that you would want from the audience? Anything that you want to leave them with? Anything to give a try over the next weekend? The holidays are upon us. What should people take a look at? Yeah, I know that you may be listening to this at any time in future, but we're recording this in the lead-up to kind of December. So if you have some time to experiment with tech over the holiday period, just take a look at Electric.
You know, it's ready for production use. It's well documented. There's a whole bunch of example applications. So there's a lot that you can kind of get stuck into there. So please do come along and check it out. Our website is electric-sql.com. We have a Discord community. There's about 2,000 developers in there. So that's linked from the site. We're on GitHub at Electric SQL, so you can see the Electric and the PGlite repos there. And so those are kind of the main things.
And if you're interested, for instance, in building applications, we already have a wait list for the new cloud service, and we're starting now to work with some companies to help manually onboard them onto the cloud. So if a cloud offering for hosted Electric is important, let us know, and there's a pathway there to work with us if you're interested in being an early adopter of the cloud product. But also, we just spend a whole bunch of time talking to teams and people trying to use Electric.
So our whole goal as a company is to help people be successful building on this. And so if you've got questions about how best to approach it, or challenges with certain application architecture, we're very happy to hop onto a call and chat stuff through. So if you come into the Discord channel, say hi and just ask any questions, and we're happy to help as much as we can. That sounds great.
Well, I can certainly plus one that anyone who I've interacted with from your company has been A, very helpful and B, very, very pleasant to interact with. And also at this point, a big thank you to Electric, not just for building what you're building, but also for supporting me and helping me build LiveStore.
You've been sponsoring the project for a little while as well, which I very much appreciate, and there's actually a really cool Electric LiveStore syncing integration on the horizon as well. That might be a potential topic for a future episode, but I think with that, now we've covered a lot of ground. James, thank you so much for coming on the podcast, sharing a lot of knowledge about Electric and about PGlite. Thank you so much. Yeah. Thanks for having me.
Thank you for listening to the localfirst.fm podcast. If you've enjoyed this episode and haven't done so already, please subscribe and leave a review. Please also share this episode with your friends and colleagues. Spreading the word about this podcast is a great way to support it and help me keep it going. A special thanks again to Rocicorp and PowerSync for supporting this podcast. I'll see you next time.
