¶ Introduction
The most important thing about the cloud is collaboration. You can send someone a link and you're working together right away. But the downside of the cloud is that you don't actually have the software. And so what we think you should have is a copy of the program and the data on your computer, but it should still be able to collaborate with other people. Welcome to the localfirst.fm podcast.
I'm your host, Johannes Schickling, and I'm a web developer, a startup founder, and love the craft of software engineering. For the past few years, I've been on a journey to build a modern, high quality music app using web technologies. And in doing so, I've been falling down the rabbit hole of local first software. This podcast is your invitation to join me on that journey. In this inaugural episode, I'm speaking to Peter van Hardenberg, who helped to coin and popularize the term Local First.
As the director of the Ink & Switch Research Lab, he's been on the forefront of this work for the better part of a decade. My conversation with him today starts with the basics of what Local First is and why you, an application developer, should care about it. Before getting into it, also a big thank you to Expo and Crab Nebula for supporting this podcast and supporting the local first ecosystem as a whole. And now my interview with Peter. Hello, welcome Peter.
Thank you so much for making it today to kick off this new podcast with me. So would you mind introducing yourself? Yeah. Hi, my name is Peter van Hardenberg. I am the director of Ink & Switch Research Labs. We're an independent industrial research lab studying the future of computing and tools for thought. Awesome. Peter and I have known each other for quite a while. Back then, Peter was just closing out his last chapter at Heroku, which I've always been a big fan of.
Peter has now found his way to working on local first and defining what local first is. Before going into it, what has led you to local first? Yeah, I mean, obviously it's a bit of a departure, right? Like, here I was building a platform as a service. You know, Heroku, for those who don't know, is basically a host, primarily for Ruby on Rails applications when it started, but over time for pretty much everything.
And the idea behind Heroku is that you could build an app on your machine and then push it to a continuous delivery system, and it would put it live in the cloud. There are other systems like that today; you can imagine it as, you know, a proto-Netlify or an early kind of containerized cloud. Yeah, and it is a bit of a departure to go from trying to get everybody's apps running in the cloud to telling people that maybe they should just run them on their computer.
So I guess I should really explain how we got there. For me, I'd like to tell a story about riding the train in San Francisco, the subway there. My friends had been working on a music app called Rdio, which was sort of a competitor to Spotify. They're not around anymore. And, you know, as I was riding on this train, I had been listening to music in the online mode when the train was above ground. And then when we went into the tunnel, the music would just stop working.
And it was really upsetting to me, because I couldn't go back and even listen to songs I'd already listened to, or scroll through the playlists that I had, because I wasn't in offline mode. And I just remember having this feeling like, we've really blown it. We've turned software from this thing that you can just have... They call it an informational good, which means that the marginal cost of copying it
and using it is zero. It was this idea that software had gone from this thing that anyone could have a copy of to a thing that didn't even work on the train, because I didn't really have the app. Even though I'd had the data in my hand, on my screen in front of me, in my ears listening to it, somehow it had just completely evaporated in the time it took a train to go into a tunnel. Right. And riding a train is not this out-there thing. It's a daily thing.
Yeah, it's my daily commute. And, you know, there are people out there who say, oh yeah, the Internet's going to be everywhere next week. And have you taken a subway lately? You have this problem in big cities, but also in more rural communities, right? There's lots of places where there's no Internet, or the Internet is slow. And you can even be in places where there is Internet, right?
But your cell phone reception isn't great next to the fridge, or your Wi-Fi cuts out. I face this every day when I'm just leaving the house and there's this little gap where the Wi-Fi stops responding. There's a little bit of signal and my apps just don't work properly anymore. Yeah, everything kind of locks up. Your phone is just a little black brick without a good Wi-Fi connection.
¶ The Problem with Current Software
And so that got me thinking, like, well, why do we build things the way we build them? And the answer to some extent, I think, was just that, you know, we told people to. Right. We made it easy to build apps for the cloud, and then everybody did. Yeah. Yeah, and so at Heroku, I had worked on our data services mostly. That was our Postgres service. And I started thinking about how really the problem is that the data that you have for your application doesn't actually live on your device.
It lives on the server. And so I started thinking, surely there's some way to get this data from the network down to the computer. And it's funny. You know, Johannes, you and I met originally through Prisma and GraphQL and database schemas and those kinds of things. And this always struck me as a really adjacent problem, which is that fundamentally, on your server, you're doing this process of taking data from a database and getting it into your sort of API back end.
And then you do a separate, distinct process to get data from the API to the client, right? And then when it gets to the client, you just kind of throw it away. And so I started thinking like, well, maybe it would make a lot more sense if we just sort of synchronized data between those points instead of fetching it on demand all the time. And in fact, that's sort of how offline modes already worked.
And so I started thinking more and more about this and got involved with some great fellow researchers like Martin Kleppmann, who's currently at TU Munich, but was at Cambridge at the time and had just written a paper about what are called JSON CRDTs, a technology for synchronizing changes between two documents in a way where you don't just have to clobber one or the other. And yeah, one thing led to another, and we've been working on the problem for about five years now.
I think you were already telling me about it many years ago. Initially I wasn't quite sure: hey, is this only about offline mode? But step by step I realized there's much more to it. Fundamentally, this can unlock much higher quality software, and not make it harder for me as an application developer, but actually make things a lot simpler. We're maybe not fully there yet. But that's a vision I totally buy into.
So this is the path that has led you to what became local first. Did you coin that term? So the term local first came around in a call we had one day. I think folks at the lab have given me credit, but it was definitely one of those open-ended conversations where several of us had been trying to find the right term for this for a long time.
I'd actually planned to refer to this as serverless technology, but then Amazon came out with this so-called serverless technology, which in fact was just all servers. I found that frustrating and slightly upsetting because it felt misleading. But we really liked the idea of calling it local first technology, because the idea was to emphasize that it's not about doing away with the cloud. You know, it's not really serverless.
What we wanted to say was actually, we just want to prioritize the user's experience on their device and make that easy to deliver at high quality. Yeah, I like that a lot. You've mentioned the story of listening to Rdio on the subway and that not working, but from an application developer perspective, what would you say were the frustrations that led you to think about local first?
¶ Complexity of Modern Web Development
Yeah, I mean, it's a few things. One is, if you just think about software development as an app developer, it's sort of obscenely complicated, right? You've got to build a client app, probably using Node and JavaScript and React and Vite and Tailwind and all these technologies.
Then you have to build the API backend, which is its own separate kind of Node app, or part of your existing app, with all of its own JWTs and, you know, database sync technologies and GraphQL, and you're building out that whole system. And then behind that, you actually have your data layer, which is, like, Postgres. And so in the best case, in the minimum case
of modern web dev, you've got this three-tier stack, as they call it. But the reality is, it's not just those three tiers. You also need PagerDuty for letting you know when things are down. And maybe you're using a bunch of weird Amazon services. You've got a DynamoDB thing going on. Then, you know, no one's hosting things on Heroku anymore, so you need a Kubernetes cluster somewhere. And, you know, it's just, it's a mess.
It's insanely complicated. So there's just a lot of complexity that feels gratuitous for making an app for your soccer team or something. And not even from a tooling perspective, but just the minimal architectural footprint is quite a lot for often very simple apps. That's right. And then on top of that, you have to pay for it all. Right. You know, maybe you're a YC startup, you get some credits to get you started, but that's definitely the drug dealer model, right?
Like, the first one's free, right? And then, you know, I have personally known a number of companies that died even though their product was growing rapidly, just because the cost of running their infrastructure was too high to justify, right? They didn't see how they'd be able to turn this thing into a venture-backed, billion-dollar-valuation company. And so they couldn't get any money to keep the lights on despite having a popular and growing product.
So that's crazy. And then, even on top of all of that, let's say that you've managed to raise enough money to hire an entire team to be on call for you, and you've managed to build a product that you can convince somebody will be a billion-dollar-valuation business at some point, and you learn all of those skills and you build all those systems out. Ultimately, it's still very fragile, right?
In time, as well as in space. You know, we've all had these products that we loved and used, whether it's Google Reader or Dark Sky or, you know, pick your favorite dead tech. And these products go on, and the founders, when they get acquired, put up this blog post about what an incredible journey building this product has been.
And, you know, the incredible journey that's happening is that your software with your data is going away and you can't have it anymore. Right. They're being acquired. They're shutting everything down. And if you're lucky, as with Rdio, you can download a bunch of JSON files. That's all that's left for you as the user. And I just want to say, this is a problem we created for ourselves, right?
Like, I can go and download an old MS-DOS copy of, like, WordPerfect or something from the late 80s, and it'll still run. You know, I can get Doom on a floppy disk and it'll still run. But software that came out for the cloud, you know, it's just gone when it's gone. And that's it, right?
They call a certain period of history the Dark Ages, not, you know, formally because they were bad, but because they're dark to us today: we can't go back and look at the records from them; they were lost. And so similarly, we have built a dark age for technology. Not that it's bad, there are a lot of things that are great, but there won't be a record of our work here, right?
Like the things that you and I are building, the work that I did at Heroku, the apps that we've made and put out there. Because they don't exist outside of the one set of servers they run on, when they go offline, that's it, they're gone forever. And I think that kind of sucks. Yeah, I fully agree.
I mean, getting into software development, I could take my first steps by building a little thing with HTML and CSS, and then step by step, you want to make it more functional, more real. And then you start pulling on that thread, and all that complexity comes towards you and it's never-ending. There's so much to learn, so much to maintain, so much to pay for and operate. So I wish there was a simpler way to build things and to run things.
¶ What is Local First?
So maybe this is a good point to ask: what is local first? Local-first software is software that runs on your computer and collaborates with other people. We think that the most important thing about the cloud is collaboration: this ubiquitous, always accessible copy of your data from anywhere. You can send someone a link and you're working together right away. Nothing to download, nothing to install.
But the downside of the cloud is that you don't actually have the software. And so what we think you should have is a copy of the program and the data on your computer, but it should still be able to collaborate with other people, right? We want the good parts of the cloud combined with the good parts of the old way of building. And in a way, right now we have great collaboration tools like Google Drive and Figma, etc. But it all lives in the cloud by default.
So you're suggesting we have our cake and eat it too, so that the software fully works on the client and we only use the server for the absolutely necessary pieces, but still have collaboration. That's right. And Doug Engelbart, you know, a famous computer science researcher, talked about how we can de-augment our intellect. We can make things worse for ourselves. And he demonstrated this really poetically by taping a brick to a pencil and asking people to write with it.
So I'm not here to say that there are free lunches necessarily in solving these technical problems, but I do think that we may have taped a few giant bricks to our pencils here, and that there's a lot of unnecessary difficulty that we've introduced by picking architectures that are not well suited to the task. These are in many cases, great technologies.
It's just that we're deploying the same technology that we would use to build a billion person app to build a 10 person app or a million person app. And the requirements are just radically different. Now, I think this sort of homogenization of the technical stack is part of why we're in the position that we're in. Like Jonathan Edwards famously said, you know, we were promised bicycles for the mind. But they gave us aircraft carriers instead, right?
So we're building aircraft carriers when we need bicycles. That doesn't mean that the aircraft carrier isn't useful. It's very useful if you need to project force in the South Pacific as the U.S. Navy, but it's not terribly helpful if you want to go get groceries. Right? It's the wrong tool for the job. And we're just trying to do everything with one set of tools.
¶ How to build Local First Apps
So you've mentioned that it basically should work fully on our device without the server, and the server augments that software by giving it additional capabilities, such as collaboration; maybe it also helps with automatic backup, et cetera. So that is very clear. However, it's not super clear yet how I would go from typically building a three-tier web app to building things in that model. My intuition would be maybe adding tons of caching, et cetera.
Would I still have that server database? So what is the typical way to build local-first apps? Yeah, you kind of have two main problems. There's the program and the data, right? We'll just kind of boil it down. The program in some sense is the easy problem to solve. So, you know, if you're on the web, what you want is a PWA, a progressive web app. You're going to use a service worker to cache the code.
You're going to try and run everything in the browser, and you're going to keep it stored so it works offline. If you're on mobile, in many cases you already build this way. Yeah. Like, everything that Apple publishes first party, more or less, is already local first, right? Like Apple Notes. It's not like you ever pull out your phone in the elevator to, like, make a note from a conversation you just had, and all you see is a little spinner. No, that's ridiculous, right?
I think Apple Notes is a great example of what an app should feel like, whereas I do a lot of times see that spinner in the hallway, where I don't have perfect internet connectivity, in other apps. Right? Because you're at somebody's office and, you know, you want to make some notes about a conversation, and you're not on the Wi-Fi or something. And you're like, oh God, I gotta get the Wi-Fi password? Yeah, no, that's ridiculous. I'm just trying to take a note here.
Okay, so you need to get the program on the device. That's one. And then you need to get the data on the device. That's two. So how do you get the data on the device? Well, in some very limited cases, you just put the data into the program, right? So like, I don't know, maybe it's a reference for a board game or something like that. You can just download the data and leave it in the app. But of course, that's not the interesting case.
A lot of apps, most apps, I think, have user-generated content. That's why you're there, right? It's a map, it's a note, it's whatever you're there to do. And so in that case, what you need is some way to get data from the cloud to the device and then back to the cloud. And the tricky part is you need to deal with the fact that people can go offline. And so that data is basically some data I might've previously created.
For example, if it's the Apple Notes app, then maybe some notes I've created on my MacBook. I think we don't need to talk too much about how you deliver the program. But the data aspect seems really tricky. So what are the problems that you've seen? And what are the current best practices or different ways we can architect things?
¶ The Data Is The Hard Part
Yeah. And again, I like anchoring this in how we got here. We've been downloading data for a long time. You can get a CSV file off the Internet. You can email somebody an Excel file, right? We kind of get that: then you have it. You know, I can email you an Excel file and then you can open it, no big deal, right? That seems easy. Similarly, with my story about the music app, you know, I could download those playlists and just cache them on the device.
And so that's a kind of modest thing, like you were saying: just cache everything. Cool. Okay. As long as all you're doing is reading, no big deal. That's fine. Right? That just works. Cool. We'll cache everything. The problem is that we also still want to be able to work when we're offline. Right? One of our tenets is you should never, ever be stopped from doing what you want to do by the availability or lack of availability of another computer.
And I would say this is maybe also where you can draw a line. Current, not-yet-local-first software is not entirely useless in an offline scenario if it caches things a lot, so at least I can read. But to go from there to letting you still do the job completely offline, I think that is a barrier that is incredibly hard to overcome with your more typical three-tier app and optimistic caching. Yeah, exactly.
And it's not only that, but it's just complicated to think about, right? There's a lot of subtle problems. Oh, you know, you click OK and it goes green right away, but really you're running the fetch in the background and the POST request failed. Now we've already told the user that it worked. Okay, now we need to raise an exception. Let's make sure that we check if we're online first. Oh, but the browser thought we were online.
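To make the "it went green but the POST failed" problem concrete, here is a minimal sketch of one common way to handle it: keep edits the server has not acknowledged in a pending queue instead of throwing them away on failure. This is an illustration under stated assumptions, not something described in the conversation; `OptimisticStore`, `flush`, and the `send` callback are hypothetical names, and `send` stands in for whatever network request a real app would make.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OptimisticStore:
    """Hypothetical client-side store: edits show up at once, get confirmed later."""

    # State the server has acknowledged.
    confirmed: dict[str, bool] = field(default_factory=dict)
    # Edits the user already sees but the server has not acknowledged yet.
    pending: list[tuple[str, bool]] = field(default_factory=list)

    def set(self, key: str, value: bool) -> None:
        """Record an edit locally; nothing is sent yet."""
        self.pending.append((key, value))

    def view(self) -> dict[str, bool]:
        """What the UI shows: confirmed state with pending edits layered on top."""
        state = dict(self.confirmed)
        for key, value in self.pending:
            state[key] = value
        return state

    def flush(self, send: Callable[[str, bool], bool]) -> int:
        """Deliver pending edits in order; stop at the first failure and keep the rest."""
        delivered = 0
        while self.pending:
            key, value = self.pending[0]
            if not send(key, value):  # `send` returns False when the request fails
                break
            self.confirmed[key] = value
            self.pending.pop(0)
            delivered += 1
        return delivered


store = OptimisticStore()
store.set("ok_clicked", True)
store.flush(lambda key, value: False)  # offline: the edit stays queued, nothing is lost
store.flush(lambda key, value: True)   # back online: the edit is confirmed
```

The design choice here is that a failed request never discards the user's work; the edit simply waits in `pending` until a later `flush` succeeds.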
But now it's just, it's a nightmare. Let's not do that. I've seen this many times: the app basically indicates, oh, this was great. And then you see a little spinner, a pop-up: something has gone wrong, please reload, refresh the browser. You've lost all your work. One of my all-time least favorite messages is on Google Docs.
Sometimes if you work on a document, like while you're on a flight, even though you've downloaded it, remembered to add the extension, cached it before you got on the plane so that it would be available offline, you edit the document on the flight and then you land and you come online and it says, we're sorry, this document has changed too much while you were offline. We're now resetting you to the cloud state. And then all your data is gone.
That's only happened to me once or twice, but man, that hurt. You know, if it's just you working, again, there's not really any problem here, right? I can edit the data, and then when I come online, we upload it. The real challenge is if you have two different things editing the data, and then you need to merge their changes, right? So, like, I added a track to the playlist, and, you know, Spotify added a track to the playlist. Okay, now we have a conflict.
Conflict is the technical term we're going to use here. And so that's the hard thing. And lots of different systems have approached this in different ways over the years, right? Like, Git gives you the ability to merge changes as long as you don't have edits to the same file, or the same line in the file, right? CouchDB and PouchDB, you know, they use what's called last writer wins. And so
all the systems agree that whichever write was last will overwrite the others. Now, which one was last? Okay, now we start to get into interesting distributed systems questions, because you can't trust the clocks on the computers. So is it the last one to reach the server? Does there have to be a central server that decides? Can we get rid of that central server, Johannes? Can we? Spoiler: yes, we can! Sort of. Sort of. We'll come back to that later. That makes sense.
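As a rough illustration of last writer wins without trusting wall clocks, here is a minimal sketch that orders writes by a logical (Lamport-style) counter and breaks ties by node id. It is a textbook-style example under stated assumptions, not how CouchDB or PouchDB are implemented; `LWWRegister` and its methods are hypothetical names.

```python
class LWWRegister:
    """Hypothetical last-writer-wins register holding a single value."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.value = None
        # (logical counter, node id): compared as a tuple, so the counter
        # decides first and the node id only breaks ties.
        self.stamp = (0, "")

    def set(self, value) -> None:
        """Local write: bump the counter past anything this node has seen."""
        self.stamp = (self.stamp[0] + 1, self.node_id)
        self.value = value

    def merge(self, other_value, other_stamp) -> None:
        """Adopt a remote write only if its stamp is greater than ours."""
        if other_stamp > self.stamp:
            self.value, self.stamp = other_value, other_stamp


phone = LWWRegister("phone")
laptop = LWWRegister("laptop")

# Both devices write while disconnected from each other.
phone.set("Road trip mix")
laptop.set("Roadtrip playlist")

# They exchange state in either order and end up agreeing on one value.
laptop.merge(phone.value, phone.stamp)
phone.merge(laptop.value, laptop.stamp)
```

Because every replica applies the same comparison, no central server has to decide which write was last. The cost is the one the conversation hints at: the losing write is silently clobbered.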
If the software runs on our devices and the data is also on our devices, now the server is kind of demoted in a way; it no longer necessarily has to play that absolutely central role. Today, if something goes wrong, the client can just forget everything and get everything again from the server. But that's exactly the step we want to go beyond: to actually trust the work that we do on the clients and then collaborate with others. That's right.
And I think the beauty of this model, and we talked earlier about wanting to make software simpler, right? A lot of the complexity of the model that we have today comes from having all these different systems and programming languages and different computers involved in literally every task, right?
Like, if I tick a box in a standard web app, I've got my computer, which then sends a request to the API server, which then does a write to the database, which then needs to propagate that back to my computer. So, best case, theoretically, we've got three systems involved.
Practically, though, there's also API gateways and, you know, request routers and all this other stuff happening along the way: Postgres proxies, connection bouncers, and all this extra hidden complexity on top of the notional three servers. Whereas if you can move that so that, you know, you tick the box and it writes it locally, that's it. You're done, right? Now, the problem is, well, how do I let other people know that I ticked the box?
And so what we're trying to set up is what we call synchronization, right? We want to synchronize the state that you have on your local device to other computers, whether that's servers in the cloud or other people who you're collaborating with. And so we're kind of changing the relationship: instead of the server mediating what's happening, the client decides what it's doing, and then it lets the server know, like, hey, I made this change.
And, you know, we can do that incrementally. And in fact, what we can do is record all the edits that are happening on your local machine and then just basically send a log of that to the server. And then similarly, the server can send you a log of any other edits that other people have made. And now you have, in some sense, a much simpler problem, which is just like, okay, well, I've got all these different sort of changes. We have a change log, basically.
And all we have to do is figure out how to put that together so that it works.
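The change-log idea can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the design of any particular library: each replica records its edits as operations in a log, two replicas exchange whatever operations the other is missing, and each one rebuilds its playlist by replaying the combined log in one agreed order. The names `Op`, `Replica`, and `sync` are hypothetical, and either replica could just as well be a server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    counter: int   # logical counter used for ordering, not wall-clock time
    replica: str   # which replica made the edit; also breaks ordering ties
    action: str    # "add" or "remove"
    track: str


class Replica:
    """Hypothetical device or server that keeps a log of playlist edits."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: dict[tuple[int, str], Op] = {}  # keyed by (counter, replica)

    def edit(self, action: str, track: str) -> Op:
        """Record a local edit. This works whether or not we are online."""
        counter = max((c for c, _ in self.log), default=0) + 1
        op = Op(counter, self.name, action, track)
        self.log[(op.counter, op.replica)] = op
        return op

    def ops_missing_from(self, known_ids: set) -> list:
        """The part of our log that the other side has not seen yet."""
        return [op for op_id, op in self.log.items() if op_id not in known_ids]

    def receive(self, ops: list) -> None:
        for op in ops:
            self.log[(op.counter, op.replica)] = op

    def playlist(self) -> list:
        """Replay the log in a fixed order so every replica derives the same state."""
        tracks = []
        for op_id in sorted(self.log):
            op = self.log[op_id]
            if op.action == "add" and op.track not in tracks:
                tracks.append(op.track)
            elif op.action == "remove" and op.track in tracks:
                tracks.remove(op.track)
        return tracks


def sync(a: Replica, b: Replica) -> None:
    """Exchange only the operations each side is missing."""
    to_b = a.ops_missing_from(set(b.log))
    to_a = b.ops_missing_from(set(a.log))
    b.receive(to_b)
    a.receive(to_a)


phone, laptop = Replica("phone"), Replica("laptop")
phone.edit("add", "Song A")
sync(phone, laptop)

# Both devices go offline and each adds a different track.
phone.edit("add", "Song B")
laptop.edit("add", "Song C")

# Back online: after exchanging logs, both derive the same playlist.
sync(phone, laptop)
```

Neither offline edit is lost here, because both operations survive in the merged log. Real systems such as CRDT libraries handle far more cases (concurrent removes, ordering within a list, compaction of old history), so treat this only as a picture of the "send a log, merge the logs" idea.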
¶ Local-first Use Cases
So I think there's a lot to unpack there, and I'd like to understand how you can actually do that. But before going into that, I'm also wondering whether this works well for some kinds of applications but maybe not for others. I can very well imagine how this works for something like Apple Notes. I could imagine how this works for a more complex note-taking system, maybe all the way to the scale of Notion.
What I'm wondering, though, is whether there's some cutoff point where this is no longer a good approach. If I imagine building something like Facebook, is that still a good fit for local-first software? Is there a line in the sand where you say, maybe theoretically still possible, but absolutely not a good idea? What is a good fit for local-first software and what isn't? Yeah. So the strength of local-first software is when users should have control over their data, right?
And when the devices at the edge need to be able to operate independently, local first offers a great fit. It's probably not a great fit for an ATM: you don't want an ATM just deciding, when it's offline, to spit out money, you know, without any way to let other nodes know what it's doing. When the users have a lot of agency and authorship over the data, local-first software makes a ton of sense.
If you're trying to manage an external resource, like cash in a bank account or a meeting room booking or something like that, then it makes less sense. I think social media is an interesting middle ground. I think there's a lot of benefits to giving users control over their data and agency over their data. But there's interesting scale and indexing problems. So I can imagine, and in fact there are, social networks that are totally local first.
A great example of one is called Scuttlebutt, and that's a local-first social network. But, you know, it comes with a bunch of architecture trade-offs. It's designed for offline use. And so that means when you join Scuttlebutt, or at least when I joined Scuttlebutt, you had to download like 10 gigs (wow) of sailboats and, you know, social media posts that you will never read, just to be able to get bootstrapped into the system.
I don't think that's a necessary consequence of local-first software. It's a design that they chose to prioritize, which is that you should be able to have the whole enchilada. But yeah, I guess another useful thing to consider is how often users will engage with the same data. So, like, Wikipedia: probably not a great local-first app in some sense. Though I'm told Patrick Collison built an offline Wikipedia as one of his first ventures.
So, you know, I guess there's a use for it. But, you know, if you only go to an article once and then you read it and then you never need it again, and that's like a random access pattern and we can't know what you're going to want to read, then it's probably not a great fit for local-first software, right? Like, you're not editing, you're only reading, and you're not going back to the same data.
On the other hand, a recipe book is kind of the inverse of that: kitchens often have a lot of metal and RF interference. So even though you're fetching data, you might want to have the recipe. You want to be able to go back and say, what did I make last year? It might be your favorite recipe, and you might have a little note that you've scribbled on it.
Absolutely, scribbling in the margins of your recipe books is a time-honored tradition in, like, every household, I think. And so, right, this is where it comes back to that question: is the user taking data for themselves, so that they can have authorship over it going forward? Then it's probably a good fit.
Control and trust, I think, are related here: the less you can trust all the participants in your system, the harder it will probably be to build in this local-first way, since you need to mediate what is typically referred to as auth in your typical three-tier app. This probably needs to be thought of quite differently when you build local-first apps. I think this is a bigger conversation for another day.
I believe that local-first software will provide a much higher standard of privacy, security and authentication than a cloud service can. I think there's a whole bunch of new best practices that we need to figure out that might not be fully discovered yet. Yeah, it's definitely an open area of work. I'm not here to tell you that problem is solved. I'm here to tell you that the potential, I think, is huge for this, but the reality is we're a long way from having... And that's perfect.
That's what I want to help with through this podcast: to give an outlet to discuss those particular areas, such as what authentication looks like in a local-first way. And I think something that already makes the local-first world so rich is those early products that are being built in this way, where we can share and learn from their experiences and their experiments. So I'm really looking forward to deep dives on those various topics. There are tons of people doing great work.
Of course, Bluesky's out there doing it in more of what you'd call a federated way, but they're definitely thinking about these problems. And there's lots of people building products. I can't wait to talk to them about this stuff.
¶ Data Syncing
So going a step back again. When you say synchronization, I'm thinking the Dropbox icon back in the days when I was using Dropbox a lot. So that does synchronization. I think a lot of people are still familiar with that. Is that kind of like the mental model, how I think about it? Git also has a notion of. Synchronizing, if a bit more explicit with push and pull how would I do that for my data where I think about my data, maybe more in terms of SQL tables.
So how do I translate that very intuitive model of syncing to the more harsh realities of my app, This is a great flashlight to shine on like how the world has changed, right? So if you go and look at Dropbox, what does Dropbox sync? It syncs files. What is a git repo on your machine? A git repo is a bunch of files. And then git and Dropbox both do something special behind the scenes to get those files other places or from other places. Okay. What is a Google Doc?
If you have a Google Doc on your computer, you don't actually have it. You're just looking at it, right? And that's why Google can throw away all your data because there have been too many edits by other people. Never mind spending three hours on a flight rewriting the stupid blog post, right? Because I didn't have the doc. I had a cache. I had a copy. I had a view. And because somebody else had edited it, the actual doc, the place, had changed and mine was now trash.
Google Docs are places. And I think this is like a really interesting kind of duality, right? Because we want both properties for our documents. I want to be able to send a link to someone so that they can edit a document with me. Right? That feels like a place. But I also want to be able to save a copy and email it around or have it on my computer and look at it and know that it's mine. Right? So we want properties of both this kind of like object like experience that we get from files.
And the place like experience we get from cloud documents. And I think that's where we get into like a real fundamental tension between the two.
¶ Existing Local First Products
Do you feel any sort of software that we use on a daily basis is getting the closest to what you think could be a good resolution of the tension? I think there are lots of companies that have made incremental progress. Just to kind of, you know, speak to some exemplars. I mean, obviously, I think the basic idea behind an app generally is kind of problematic, which is it's taking your data and putting it into their app. So you can't access it anywhere else.
Great example, again, is Apple Notes. Plain text notes, more or less, but you can't open them in VS Code or Notepad or anything else, or Dropbox notes, right? Like, we've managed to turn text notes into closed proprietary formats that are specific to a single piece of software. That's sort of a bummer. You can't sync your Apple Notes with Dropbox. But okay, I still think, though, that within this world of files versus, you know, things versus places, files versus links,
Apple Notes is like a nice middle ground, right? Like you can make a note, it always works offline, but you can share them with other people and see the edits and get updates from people. So that's really cool. I think Figma is doing really interesting things with, like, branches and decentralized workflows.
If you ever talk to people who work in a large org that uses Figma, you know, there's this problem, which is that when you're working on a design in the early stages, you don't want other people to look at your work, right? You don't want someone hopping in and going, I don't like that color. It's like, yeah, I'm not done. Come back later, right? You want that creative privacy where you can explore. And so people have found all these creative ways to build that in Figma.
And I think they're somewhat cognizant of that organizationally. So I think that's a good example of, yeah, trying to plumb some of those depths. I think, though, for the most part, I mean, you know, we could give a bit of credit to Google Docs, right? Like, yes, it requires a browser extension, and literally half the time that I, at least, get on a flight,
I realize that although I had the browser extension installed, it was in the wrong profile, or I didn't remember to cache the document or whatever, but you can at least get some semblance of that offline support from Google Docs. And of course, you know, Git is an exemplar as well in terms of, like, Git plus GitHub gives you something. It's clunky.
I'm not here to tell you Git's a great user experience, but, you know, Git gives you this ability to work kind of in both worlds where you can share links to files through GitHub, but you can have a full copy of the repository with its history on your machine. Right. Yes. I think there's a lot to be explored from a user experience perspective, like patterns that fit the domain of the software patterns that users intuitively can work with.
But I think in terms of how we're actually building that, there's probably still a long way to go, and I think it's super, super fortunate that there are companies already pushing forward there, whether it's Figma or Google with Google Docs or even Notion, which I think is not super sophisticated in terms of syncing, yet is a good example of how far you can get while building collaborative software that is not super fine-grained in terms of its syncing resolution.
So I think syncing will be a continuous topic both in terms of user experience.
¶ Conflict Free Replicated Data Type (CRDTs)
And then when you actually get to the implementation: when you're saying syncing, I've already seen the term CRDT quite a bit associated with local first. Can you briefly touch on what CRDTs are? We probably don't have enough time to fully go into it, but maybe you can give a very high level primer. I think this is a topic that needs a deeper dive, but we'll give the quick intro. So the idea behind the CRDT is that it stands for a conflict free replicated data type. Data type is pretty easy.
There are lots of different CRDTs. You can have a set, you can have an array, you can have a counter. We work on one that's basically like a JSON file, right? So it has sets and arrays and counters and maps and things. And when you say data type, this is now a thing that I'm using to build my app. So in the same way that, if I build a JavaScript app, I would use a map, a set, or an array, now I would use one of those CRDTs as the foundation for my data. Why are you doing that?
Well, because it's replicated. What does that mean? It means you can take that data type and you can send it to a bunch of different computers. All right, cool. A typical JavaScript map can't do that. Yeah, I mean, you can call toJSON and then post it somewhere. And that leads us to the last part of the description, which is conflict free. And that's where the heavy work gets done. So replicating your data is not that hard as long as you don't care what happens to it.
You could just, I don't know, open your window and start shouting, you know, byte codes at the window. Technically, you're replicating it, I guess, if anybody's listening. But you know, the conflict free part, that's where it gets tricky. How do you make sure that if two people both edit the document, one, that you can merge the changes, but two, that nothing's lost, right? Like, it's really easy to merge data if you just say, well, I'm going to always throw away your data and keep mine.
But that's not a very useful strategy, right? So with CRDTs there are lots of different approaches. Tons of different implementations. It's a very active area of research and there's lots of great ideas still unfolding out there right now. It's really simple conceptually, though, which is: if two people edit a thing, you should be able to merge their edits together and both get the same answer without having recourse to a central authority.
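To make the "merge and both get the same answer" property concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is an illustration only and not how Automerge or Yjs are implemented; the class name `GCounter` and the replica ids are made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged with a per-slot max."""

    replica_id: str
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # Each replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: GCounter) -> None:
        # Per-slot max is commutative, associative and idempotent, so replicas
        # converge no matter in which order (or how often) they merge.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


alice = GCounter("alice")
bob = GCounter("bob")
alice.increment(2)  # Alice edits while offline
bob.increment(3)    # Bob edits at the same time on his own device

# They exchange state directly; no central authority decides the result.
alice.merge(bob)
bob.merge(alice)
print(alice.value, bob.value)
```

Both replicas end up with the same total, and merging the same state again changes nothing, which is what lets the data be copied around freely.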
That seems like a very interesting area that I want to learn more about. When thinking about CRDTs, there are a few technologies that come to mind, whether that's Yjs by Kevin Jahns, who's been working on this already for many years, or Automerge, which you've all been working on. So I would love to learn a lot more about those, but right now we'll shelve it away and say, okay, CRDTs are a great strategy if you need to implement syncing.
Is that right? Yeah, it is, and I'm not a big believer in silver bullets or kind of like magical thinking with technology. There's a set of tradeoffs here, right? Like, what happens when you have a conflict in git? Well, you're busting open your text editor and looking for a bunch of less than signs in a row and hoping you found them all before you push the stupid commit again.
You know, CRDTs don't generally have that problem, but you can still have conflicts, right? Like if you and I both edit the title of a document, you know, the CRDT isn't going to be able to know which one is right. So there are interesting challenges here that go beyond abstract computer science problems. This is a really fundamental problem. The computer can't know what the title should be, right?
And in the central cloud model, what happens is, like, one of us will find out when we go to write that we lost. And so you know the problem happens right away. And so what makes CRDTs interesting from like a user experience design problem is that you can have conflicts discovered a long way away from where they're created, either in space or in time. And so that actually has serious user experience consequences. This can also happen to you in Git, right?
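The title example can be sketched with a toy "multi-value register": when two edits are concurrent, it keeps both values instead of guessing, so the app can surface the conflict to a person later. This is a simplified illustration under stated assumptions, not any particular library's API; `MVRegister`, `dominates` and the example titles are hypothetical.

```python
from __future__ import annotations


def dominates(a: dict, b: dict) -> bool:
    """True if version vector `a` has seen everything in `b` and differs from it."""
    return a != b and all(a.get(k, 0) >= v for k, v in b.items())


class MVRegister:
    """Toy register that keeps every concurrent value until someone resolves it."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.entries: list[tuple[str, dict]] = []  # (value, version vector)

    def set(self, value: str) -> None:
        # A new write supersedes everything this replica has already seen.
        clock: dict = {}
        for _, vv in self.entries:
            for k, v in vv.items():
                clock[k] = max(clock.get(k, 0), v)
        clock[self.replica_id] = clock.get(self.replica_id, 0) + 1
        self.entries = [(value, clock)]

    def merge(self, other: MVRegister) -> None:
        combined = self.entries + [e for e in other.entries if e not in self.entries]
        # Drop values that some other entry has causally overwritten.
        self.entries = [
            (val, vv)
            for val, vv in combined
            if not any(dominates(other_vv, vv) for _, other_vv in combined)
        ]

    def values(self) -> list[str]:
        return sorted(val for val, _ in self.entries)


alice, bob = MVRegister("alice"), MVRegister("bob")
alice.set("Draft")
bob.merge(alice)                 # both start from the same title

alice.set("Local-First Primer")  # concurrent edits, e.g. both offline
bob.set("Local First 101")

alice.merge(bob)
bob.merge(alice)
print(alice.values())            # both titles survive on both replicas
```

Once a person picks a title and calls `set` again, that write supersedes both candidates on the next merge. The merge itself is mechanical; deciding which title is right stays a human decision, possibly made much later than the edits.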
Like all of us developers listening here know what it's like to try and merge a long running branch and realize that somebody else has completely rewritten a file you edited, and you're like, oh no, what do I do? Do I just throw my entire week of work away and start over? Or do I... There's no free lunch here. So if it semantically diverges a lot or changes a lot, then the user still needs to help out.
And similarly, the CRDT can't actually recognize all the conflicts, right? The technical conflicts we can catch. But again, if you sort of think about software, Git is kind of like a bad CRDT in some ways. Even if we edit different files, that doesn't mean the tests will pass or the software will be useful. I may have renamed a function that you were using. Both can commit cleanly, but then when we try and run the tests, it won't even parse and it crashes.
We can think about all those things, and I think that's the design side of the problem, and then there's lots of neat stuff to talk about on the implementation side, but we can move on from that for now. We'll come back to those later. Yeah, so going beyond syncing and CRDTs, obviously that's a very important aspect, syncing in general for building local-first software, as it's one of the pillars, like the software should work on my device, and we should be able to collaborate.
So if we come back to "the software should run on my device":
¶ The Challenges of Building Local First Apps
Which sort of new problems does a developer who's typically building that three tier web app now need to think about, or maybe think about differently, aside from syncing, if they want to build their next app in a local first way? The biggest one is architectural. When you grow up in this environment of, you know, the database is the source of truth, and then things sort of roll out, you know, now you have different problems.
Like, how do I make sure that I have the data, you know, when my user is offline? Can I make sure that I've fetched everything they need so that they know before they go offline? Like, yes, you're up to date. Similarly, you know, if you can be partially online, right? Like, maybe you're online and you can talk to the sync server. But somebody else can't. So, you know, how can you know that you're seeing what other people are seeing that you're seeing the things somebody else has sent, right?
I like to give examples from the field when we talk about this stuff. Facebook Messenger has like a really simple version of this problem, you can still reply to a message when you're offline with your phone, but you don't get the little checkbox that says, Yep, that's been sent. And it's not until somebody else takes the phone out and looks at it and the little, you know, miniature avatar icon drops below the message that you know that they've seen it. What does that mean for source code?
Or like a text document. I mean, we can't just move the avatar to the bottom of the page, right? So there's like really interesting challenges here around kind of, thinking a little differently about how you build your app, right? And similarly, just because you have the data locally or could have the data locally doesn't mean that it's going to load quickly.
If you store everything as like a long log of edits in IndexedDB, it could take you longer to load than if you were fetching it from the web. So we need to be very thoughtful about these things. But the result of using this approach has one really killer benefit that I still delight in every day, which is that, you know, there's the old thing about, you know, it ran on my machine. Well, we'll ship your machine. And that's how Docker was born, right?
Like, it's sort of like that, though, with local-first software, like, if it works on your machine, like, that's the target deployment environment.
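The earlier point about a long log of edits being slow to load can be illustrated with a small sketch: keep a snapshot plus only the edits made since, so loading replays a short tail instead of the whole history. This is an assumption-laden toy, with an in-memory `DocStore` standing in for local storage such as IndexedDB and plain key/value edits standing in for real CRDT operations.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DocStore:
    """In-memory stand-in for local storage (hypothetical, for illustration)."""

    snapshot: dict = field(default_factory=dict)  # state as of the last compaction
    log: list = field(default_factory=list)       # edits made since that snapshot

    def append(self, key: str, value: str) -> None:
        self.log.append((key, value))

    def load(self) -> tuple[dict, int]:
        """Rebuild the document and report how many edits had to be replayed."""
        state = dict(self.snapshot)
        for key, value in self.log:
            state[key] = value
        return state, len(self.log)

    def compact(self) -> None:
        """Fold the log into the snapshot so the next load replays nothing."""
        self.snapshot, _ = self.load()
        self.log = []


store = DocStore()
for i in range(10_000):
    store.append(f"cell{i % 50}", str(i))

state_before, replayed_before = store.load()  # replays the full history
store.compact()
store.append("cell0", "edited")
state_after, replayed_after = store.load()    # replays only the new edit
print(replayed_before, replayed_after)
```

A real CRDT store has to be more careful than this: compaction must keep whatever metadata is needed to merge with peers that have not yet seen the folded-in edits.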
¶ The Benefits of Local First for Developers
You have the whole thing. So you're done. Right? Like, as long as you have a sync server somewhere, you're finished. And actually, maybe this is a good moment. Can I talk a little bit about peer to peer for a sec? Please. Yes. So, I think one misconception I hear from people is like, Oh, local first. That means you're anti cloud and you don't like servers. Not at all.
Like, peer to peer technologies are interesting, and there have been, you know, use cases where they were really vital to achieving users' aims. Like, you know, when you want to download an entire album full of Linux ISOs, then BitTorrent is really good, and you can see why having central servers would be a problem for that. But in the case of most software, you know, one, peer to peer systems actually don't work very well in a lot of network conditions.
You can, you know, you can run BitTorrent on your home network with a little bit of work or a VPN, but at the coffee shop or places like that, those peer to peer systems really struggle because the network tends to be configured in a way that doesn't allow them to work. But more than that, right? Like, if you and I are collaborating on a peer to peer system, right? Like, I'm on a laptop most of the time. I don't want my work to disappear completely just because I close the lid. We want servers.
We want them. I mean, they literally should serve us, right, not us serving them. We're trying to make the server serve us rather than vice versa. Love it. That's great. And so, yeah, the big difference here in this model is that you still want servers, but their responsibility is just to host data and distribute it. And so, you know, yeah, you want it. It needs to be online for things to be successful because peer to peer systems don't really work.
I have spent many years trying to adopt peer to peer architecture before finally giving in and embracing this perspective. And then even if they do work, they don't solve the problem, which is that in fact you want your data to be online even when you're not. Right. You know, there's an interesting role it can play. I know a few folks who haven't given up yet on peer to peer. And maybe in the cases where it does work, it can be an optional part of a technology stack.
Let's say you're in a smart home setting where you can make assumptions about how things work. Well, peer to peer software is great and can work really well there, but I have not seen a design or implementation of peer to peer systems that is really robust in all network environments and where you don't actually just have a few centralized servers hiding off stage.
You're saying peer to peer, but actually, you know, 20 percent of traffic is going to some server. A great example here is WebRTC, right? If you're familiar with WebRTC, people think of it as this peer to peer system built into the browser, but it's not, for two reasons. First, while it is true that WebRTC allows two computers to talk directly to each other, in order for them to find out about each other to have that conversation, you need a server to introduce them.
The second is that, and opinions vary about the prevalence of this, but I have seen credible reports that on the order of 20 percent of WebRTC connections cannot be made peer to peer and are instead sent through what's called a TURN server, which is just a relay that bounces the traffic for you. And that's not to say WebRTC is bad. I mean, it's super useful. We use it every day in our video conferencing lives. But it's not really peer to peer.
It's like client server with a peer to peer fast path when it's available. I think there's lots to be explored as well. That's more on the networking side. And I think for some use cases, let's say you build software for a warehouse or maybe a cruise ship or something, there might be real use cases, and people are exploring it.
Yeah, if you have control over the network, you totally can, and actually our friends at Ditto, where former lab collaborator Rae McKelvey works, have built a really great business building local-first software for airplanes and fast food restaurants, using a different kind of CRDT than the one we use. They have real scaled out production, industrial deployments of this technology in the field. And by all accounts, you know, it's working really well.
¶ The Seven Ideals of Local First
So if you Google Local First right now, you might come across the essay that you've been a co-author of, where you first introduce the ideas of Local First, and as part of it, you have those seven ideals of Local First. I think this would be enough to fill an entire episode to really do each ideal justice, but maybe you can just summarize those seven ideals and what they mean for you.
Yeah. Well, we've talked a bunch already about how building local-first software is good for developers, right? Like, your distribution costs are better. Your on call experience is better. Your software is less complicated, hopefully.
But the original motivation behind developing the ideas of local first was really focused on user experience, and not in the sense of what corner radius the buttons should be or what shade of green for the call to action, but the real core experience that the user has as a person who's consuming your software. And so, with Martin Kleppmann and Adam Wiggins and Mark McGranaghan, we identified seven ideals that we think local-first software should aspire to.
The first one is no spinners, right? Next frame or your money back. You should never go to open something and just see a spinner because your network connection is bad or whatever else. That's just inexcusable. We have the fastest computers in history, with the biggest drives ever and the most bandwidth. It's just ludicrous that our own personal data would not be available to us.
Number two: your work is not trapped on one device. No one wants to be in a world where you dropped your phone over a cliff or off a ledge or whatever and suddenly your data is all gone with it, right? You left your laptop on the train and now you've lost your life's work. No, we don't want that. So your work should be able to synchronize across all your devices and be available on every device, simultaneously with this idea of not having spinners, right?
The idea is we want all seven of these things to be true together. Number three is that the network is optional. As we've discussed, like, you often find yourself in places where the network either is not available because of technical reasons, or just because, you know, you want to be offline. Right. There's something to be said for making that choice, turning off the Wi Fi signal, staying focused, and still being able to work. We still insist on seamless collaboration with your colleagues.
That's number four, right? If it's not collaborative, it's local only, not local first. Another one that I think is really important is this notion, number five, the long now. I can get on a single micro SD card, like, every game that was ever released for Super Nintendo. And those will work a hundred years from now. Like, how much of the software that you've built in your career will work a hundred years from now? How much of it will still be around ten years from now?
For many of us, so much of our work just evaporates. And it's not just for the vanity of wanting to see your work survive. When you're talking to people writing books, or, you know, who have built their careers around a long term project, you know, it could take 10 years to do the research for a book. How many cloud services that were around 10 years ago are still around? Or how many of those that are around today will be around in 10 years?
You know, you can't control who's going to get acquired or shut down. So I think that's a big part of it. Number six is security and privacy by default. You know, why is the default that when I use software, the product managers at the company or the support people can just read my data whenever they want? I mean, they don't. I hope they don't. Most places say they won't, right? But, you know, they could, right? That's pretty messed up.
And then number seven, ultimately, you know, we want the users to retain ownership and control. Right. And that's in some sense kind of a distillation and restating of some of the other points, but, you know, if someone can take it away from you, it's not yours. We want you to be able to have the software, to make decisions about when you update it, if you update it, who you share your data with, when you share your data with them, and to put those tools into the user's hands.
We want them to have that feeling, and the genuine fact, of having the thing and having agency and ownership over it. I think it dovetails with a lot of the developer benefits, right? Like, sort of the same underlying decisions that make things easier for developers also can improve things for users. But those are the seven ideals. I would say from a product perspective and from the end user perspective, I think there's nothing controversial about it.
I think like almost all developers who build apps for the cloud today would probably still say, yes, those would be great characteristics for our apps. It's just hard to build apps like in a cloud first way that have all of those characteristics. So I think this is why we need a bit of a different model, how to build software. And, those from a user perspective are amazing. And from a developer perspective, everything becoming simpler sounds incredible.
¶ How ready is Local First?
Zooming out a bit, I love all the ideas of Local First, and while this is not the topic of today's show, I've also been working on a Local First app myself for the last two years. But pretending it was two years earlier and I was just trying to make sense of it: what is local first? Is that for me? How do I go about it? Let's say I'm at the beginning of my journey and I'd be asking you as a fellow builder for advice, like, hey, is local first ready for me to build my next app with it today?
And if so, what does my typical path look like? Is there like a Ruby on Rails for it already? Is this a good fit for me? Do I rather have to be like an adventurous explorer? Or can I already go into this with a mindset of what is the most popular stack, and that's also what I've got to adopt here? So we're recording this in 2023. And today I would say building a local first web app is likely to require inventing something at some point along the way.
Like this is not like just pushing stuff to Netlify and following posts off Stack Overflow. This is a new thing and you're going to have to figure some things out, and I hope someday that this is as easy as falling off a log, right? Like at Heroku, we used to always say make the right thing the easy thing and make the easy thing the right thing.
And so if we can do that, if we can get there together as a community, then, you know, hopefully when people find this podcast a few years from now, they're like, wow, we've come a long way. But today I think, you know, what I would recommend is like, you can build a React app. You can use a CRDT; Yjs or Automerge are probably the most mainstream choices right now. But there are some other technologies too. Oh yeah, and there's vlcn.io. There's ElectricSQL.
There's like a ton of people coming out and trying things. And depending on what looks good to you, you should try one and see how far you get. So there's basically an ecosystem of tooling already for those use cases. There's always a bit of "it depends", so one might be better for your use case than another. But I think what you're saying is, I maybe need to invent something along the way. Maybe I need to do end to end encryption, or maybe I need to do permission control in a certain way,
where none of the available tools yet do exactly what I need. And this is where I need to get my own hands dirty? Yeah, you'll probably need to figure out how to deal with some of those offline use cases. You're going to have to decide who you want to give this data to. I think that's one where everybody kind of wants different things when we talk to people. And I think eventually some standards will emerge. There'll probably be two or three patterns that most people adopt,
but at the moment, because everybody's asking for different things, people are all making different things. But I think that sort of ACLs and access permissions is an area you should expect to spend some time thinking about. But like, I gotta say, it feels amazing to just stand up an app in a short amount of time, point it at some kind of a sync server, and then just be, you know, working on it with other people.
And I hope everybody will take the time to at least have that experience, to try building something small, whether it's with Automerge or Yjs or ElectricSQL or anything else. And that experience, it's just so eye opening for me, as somebody who spent all these years building these big complicated web apps for the cloud, to just say, actually, it can be pretty simple. And when it is simple, it feels great. You feel like you have wings.
You're just able to move so fast, you're able to get so much done. And it's like someone's taken the brick off the pencil. You just, like, you don't even realize how much hassle you're putting up with every day until it goes away. So maybe after someone's listening to this, maybe on your next weekend, give it a little try. Build an app you wanted to build all along, maybe a React app, throw in Automerge, throw in Yjs, see how far you get.
And I think the promise is really that users, your end users will get much better software. And for you as an app developer, your life becomes a lot simpler. You can take off that, that big brick and just enjoy drawing and writing with your pencil. That, that sounds super exciting. And like having seen the ecosystem now for the last two years evolve, I think we have already come quite, quite a long way. Now there is a real ecosystem and you don't need to reinvent everything by yourself.
If someone's like looking for some references or some inspiration, can you point to like any sort of software that is already local first, or I guess like local first is a bit of a spectrum. Some apps might be more local first than others, but is there like some software that people already use today that, that is directionally local first that you can orient yourself around? Just looking at the core set of apps on your phone is like a nice starting point.
For web apps, there's things going back all the way to, like, Etherpad, right, where you have offline collaboration. And then if you want to try and build your own experience, go to automerge.org/docs/quickstart. You know, you can get up and running and experience that feeling of building local-first software for yourself. I think most apps that we, from a user perspective, see as working fast, or that you also trust to use while you're traveling on a plane, on a train,
I think they are all probably, even though they don't have a local first label yet on the software box, they are probably already more local first. And yeah, this is, I think, whoever's trying to build something next if you get to experience that for yourself, I think that's the easiest way to convince yourself.
I'm thinking of some software that I use on a daily basis, whether it's something like Linear or something like Superhuman. Those sorts of apps, while I'm not sure whether they were meant to be local first from their inception, I think they've directionally evolved in a way that is more and more local first. There are probably still servers heavily involved, but the client is getting a lot smarter.
I think that's already an important step in that direction. And that's certainly why those apps feel great. Amen.
¶ Open Research Questions
So maybe before wrapping up, what are some topics within Local First that are very top of mind for you right now that you spend a lot of time thinking about and where you hope that we'll be a lot further along in one or two years time? Ooh. I think for me, the biggest design questions right now are around versioning and version control.
One of the things we've found as we've studied local-first software and talked to writers and creators is that this idea of the cloud as sort of the panopticon, where anyone can watch you work, is really deeply uncomfortable to a lot of people. And so exploring these ideas of creative privacy and the ability to work the way software developers work, which is you work privately until you're ready to share, I'm really eager to explore those ideas and bring those
to the rest of the world. In your Google Docs, you should be able to work in that same offline way. So I think that whole question about bringing together version control in new ways is a big one. Authentication, authorization, access control, sharing permissions, that kind of stuff, I think it's actually quite closely related. I think that's an area we're going to see a lot of progress around. I think the app models are going to continue to evolve, right?
Like, Automerge is document based, ElectricSQL is database based, and Riffle is SQLite. Both are actually right. I don't think it's one or the other, but I think exploring that relationship there is going to be a big area. I think there's a lot of stuff around security and encryption, end to end encryption, that's going to unfold along with it. What's that quote?
I think it was Lea Kissner, which is: cryptography is a technique for turning many problems into key management problems. So I think we're going to see a lot of that over the next few years. I think that's just scraping the surface. And I think all of this, though, will also intersect with application development models.
Just thinking about other kinds of intersecting things, another big part of where we're still relatively immature is deciding what to sync and when to sync it. There's this motto we have at Ink & Switch, which is: results on the next frame or your money back. We want everything you do to run at a hundred hertz, none of this hundred millisecond round trip time to the server, and then you get to paint.
No, like when you interact with the system, you should see the response the next time your screen refreshes. So that means on a hundred hertz screen, you've got 10 milliseconds to do everything. And so if the server is 50 milliseconds away, that's too late, you can't ask the server, right? If you want to have things by the next frame, that means you need to already have the data before the user even asks. That has big implications on your data architecture, on performance, so many things. Yeah.
Things need to be in memory. How do you manage memory use? What about low bandwidth connections? So yeah, I think that whole area of sync optimization and background sync and so on is really important. And it's an area that we have thought about, but we haven't seen a ton of progress on implementations yet, because until you actually have enough users and enough usage to really drive that... Really interesting topics to dive deeper on in future episodes as well.
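The frame-budget arithmetic from the conversation (a hundred hertz screen leaves 10 milliseconds, so a server 50 milliseconds away is too late) can be written out as a short sketch. The function names are made up for illustration, and the 2 ms local read is an assumed figure, not a measurement.

```python
import math


def frame_budget_ms(refresh_hz: float) -> float:
    """Time available between two screen refreshes, in milliseconds."""
    return 1000.0 / refresh_hz


def frames_until_visible(latency_ms: float, refresh_hz: float) -> int:
    """Number of refreshes that pass before a result taking latency_ms can be painted."""
    return max(1, math.ceil(latency_ms / frame_budget_ms(refresh_hz)))


REFRESH_HZ = 100  # the "hundred hertz" screen from the conversation

scenarios = [
    ("local read (assumed 2 ms)", 2),
    ("server 50 ms away", 50),
    ("100 ms round trip", 100),
]
for label, latency_ms in scenarios:
    print(label, "->", frames_until_visible(latency_ms, REFRESH_HZ), "frame(s)")
```

Only work that fits inside a single budget lands on the next frame, which is why the data has to be on the device already rather than fetched on demand.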
There is one topic that also stands out to me where I think we're not there yet, which is cross-app collaboration or cross-app interoperability. As we're steering in a direction where we have more and more AI in our lives, all of those AIs so far look like they run on someone's remote server, et cetera. And now we're going to feed more and more of our data into those. I would love to see a world where the AIs, similar to the programs that we're building, also run on our devices.
Now we need to make our data available to those, and have those programs and those AIs collaborate with each other. In the same way, all the stuff I have in my own head from conversations with you, from conversations with others: all of those different interactions I have are almost like little apps, and they naturally can collaborate.
Yet my contacts app that runs on my machine and my email app, unless they're both from Apple and integrated, can't. What does that integration story between apps look like? And I think the foundation for that is: the data needs to be on the device.
And now we need to build something on top that makes it easier to interoperate, so that data is shared across apps. That's a whole topic that I'm also really excited about. So we call that malleable software at Ink & Switch, and it's a very active area of research for us. We've just published a paper called Embark today, as of this recording, about that (https://inkandswitch.com/embark), where we talk about exactly some of these problems.
One really cool thing about Automerge's design that I haven't seen in other systems is that the grain of the system means that any apps which share a backup sync server can load each other's data, as long as they know how to interpret it. And, you know, it's a little bit like sharing a SQL database, but without all the problems of sharing a SQL database, because the database is not the shared resource that you have to worry about people hammering, right?
You can just put data in a sync server and then anything can query it out. And so, I've got to say, it's awesome. I use it all the time. When I break something, I can open it up in another app and fix it. You know, I have a whole suite of little apps that are local-first that collaborate on our data now. And yeah, well, we'll say more about that sometime in the future when we have something to say.
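The idea described above, that separate apps sharing a sync server can load each other's data as long as they know how to interpret it, can be illustrated with a toy model. This is a sketch under stated assumptions: `InMemorySyncServer`, `saveTasks`, and `countOpenTasks` are hypothetical names, the store is a plain in-memory map, and none of this is Automerge's actual API or sync protocol.

```typescript
// Hypothetical stand-in for a sync server: it stores documents by id
// and knows nothing about the apps that read or write them.
type Doc = Record<string, unknown>;

class InMemorySyncServer {
  private docs = new Map<string, Doc>();

  put(id: string, doc: Doc): void {
    // Store a deep copy so callers cannot mutate stored state by reference.
    this.docs.set(id, JSON.parse(JSON.stringify(doc)) as Doc);
  }

  get(id: string): Doc | undefined {
    const doc = this.docs.get(id);
    return doc === undefined
      ? undefined
      : (JSON.parse(JSON.stringify(doc)) as Doc);
  }
}

// "App A": a to-do app that writes a task list into a document.
function saveTasks(server: InMemorySyncServer, id: string, titles: string[]): void {
  server.put(id, { tasks: titles.map((title) => ({ title, done: false })) });
}

// "App B": a separate tool that only needs to know how to interpret
// the `tasks` field in order to read the same document.
function countOpenTasks(server: InMemorySyncServer, id: string): number {
  const tasks = server.get(id)?.tasks;
  if (!Array.isArray(tasks)) return 0; // unknown shape or missing document
  return tasks.filter((t) => t?.done === false).length;
}
```

The point of the sketch is that the second app never talks to the first one. Both only agree on the document's shape, and the server is just storage, which is the property the conversation contrasts with sharing a SQL database.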
¶ Outro
Peter, thank you so much for sharing all of that about local-first with us. I think this is the beginning of a much, much longer journey where I want to hear a lot more different voices, like people building apps with local-first, and people building tools to help other developers build local-first apps. We've just barely scratched the surface of all these things here today. So thank you so much for coming on the show, and I'm looking forward to having you back, hopefully soon. Yeah. It was a lot of fun.
Take... Awesome. Thank you for listening to the localfirst.fm podcast. If you've enjoyed this episode and haven't done so already, please subscribe and leave a review wherever you're listening. Please also consider telling your friends about it if you think they could be interested in local-first. Thank you again to Expo and Crab Nebula for supporting this podcast. See you next time.
