386: One Year of Podscan: Reflecting on Tech & Business Decisions

Apr 18, 2025, 24 min

Summary

Arvid reflects on the first year of Podscan, detailing the tech and business decisions that contributed to its growth and profitability. He discusses the choice of PHP and Laravel, database considerations, external services like MeiliSearch and ScrapingBee, and key business tools like Paddle and Mercury. Arvid emphasizes the importance of reliable technologies and strategic partnerships in bootstrapping a successful business.

Episode description

This week, I'm sharing behind-the-scenes choices that allowed Podscan's growth from a small experiment to a thriving business. I'll delve into the choices I've made and how some have worked better than others. I'll explore the key decisions that have driven our success, such as selecting the right tools and technology, and utilizing AI to enhance the efficiency of our processes. Join me to learn how these choices, along with important upgrades and strategic partnerships, have contributed to our achievements.

The blog post: https://thebootstrappedfounder.com/one-year-of-podscan-reflecting-on-tech-business-decisions/
The podcast episode: https://tbf.fm/episodes/386-one-year-of-podscan-reflecting-on-tech-business-decisions


Check out Podscan, the Podcast database that transcribes every podcast episode out there minutes after it gets released: https://podscan.fm
Send me a voicemail on Podline: https://podline.fm/arvid

You'll find my weekly article on my blog: https://thebootstrappedfounder.com

Podcast: https://thebootstrappedfounder.com/podcast

Newsletter: https://thebootstrappedfounder.com/newsletter


My book Zero to Sold: https://zerotosold.com/

My book The Embedded Entrepreneur: https://embeddedentrepreneur.com/

My course Find Your Following: https://findyourfollowing.com

Here are a few tools I use. Using my affiliate links will support my work at no additional cost to you.
- Notion (which I use to organize, write, coordinate, and archive my podcast + newsletter): https://affiliate.notion.so/465mv1536drx
- Riverside.fm (that's what I recorded this episode with): https://riverside.fm/?via=arvid
- TweetHunter (for speedy scheduling and writing Tweets): http://tweethunter.io/?via=arvid
- HypeFury (for massive Twitter analytics and scheduling): https://hypefury.com/?via=arvid60
- AudioPen (for taking voice notes and getting amazing summaries): https://audiopen.ai/?aff=PXErZ
- Descript (for word-based video editing, subtitles, and clips): https://www.descript.com/?lmref=3cf39Q
- ConvertKit (for email lists, newsletters, even finding sponsors): https://convertkit.com?lmref=bN9CZw



Transcript

Arvid

Hey. It's Arvid, and this is The Bootstrapped Founder. Today, I wanna reflect on something that is both foundational to the business and quite personal to me. It's been pretty much a year since I started Podscan, and I think it's time to look at the choices I made back then in terms of technology, and dependencies, and services, and everything around the business, and see how they're holding up today. This episode is sponsored by Paddle.com, the merchant of record that has been responsible for allowing me to reach profitability just this month.

Paddle is truly MORE, that's MOR, merchant of record, because their product has allowed me to focus on building my own product that people actually wanna pay money for. And that's why Paddle does more for you. They deal with taxes, and they reach out to customers when they have failed payments. They charge in people's local currency, do all these things that I don't need to focus on so I can really be present for my customers and their needs. It's amazing.

Go check out paddle.com to learn more. So Podscan itself is doing pretty well. It has grown from this experiment that it was a year ago into something quite substantial. It's profitable now and has customers of all kinds that I didn't even expect to serve in the beginning. With every week, I think I shift closer to the perfect messaging and I reach more of the right people at the right time.

What I wanna share with you today though is me walking through the tech stack and the business stack of the Podscan business and the product. I'll talk about what expectations I had back then in the beginning and if they've been fulfilled, if they've been exceeded, or if I had too big or too small expectations. Essentially, what worked out and what is still lacking. I'll start with the tech side of things and then go more into the business side, the services that I use to make Podscan run. Let's start with the tech stack.

There are a couple of major choices you always have to make in a software business when you start. What's the general infrastructure that you build your product on? What's the development pipeline? And what are the components that you integrate your system with from the beginning? For me, the general architecture was quite simple.

I wanted to build a single central application server that pretty much does everything. I didn't wanna overcomplicate it. If I need any external stuff, I'm just gonna set up servers that are gonna be their own deployment somewhere else, any external services or background processes that happen on another server. I figure that out. But from the beginning, I wanted to build one central kind of monolith application.

And for that, I needed to make a tech choice. The biggest choice that I made at the beginning of Podscan was to build this product with PHP and more precisely the Laravel framework. Now my background is actually in PHP. Back in the day, in 2003 or 2004, when I first started working on software and web development, PHP was at version four, and that was the first language that I started out with. Over time, I went into JavaScript and Elixir and Ruby and all these other things and built software with that.

But it turns out that PHP has grown into something quite amazing, and Laravel in particular is the gateway to building web businesses more easily than anything else I've ever used. At the time that I started, Laravel was at version 10. It's now at version 12. I haven't upgraded yet because it's still working and it's perfect in terms of performance and works well enough so I didn't have to do the upgrade yet. I will get to that in a second, though.

Because when I started using Laravel and looked into the Laravel space, I was quickly very pleasantly surprised and inspired, really, by the ecosystem around the product. Nowhere else had I found such a plug and play ecosystem that seemed so well integrated before, not just with libraries, because most ecosystems have great libraries and projects, but here it was actual products that seamlessly fit with each other. And this isn't surprising when you look at all the work that the Laravel team and Taylor Otwell have been doing to create things like Laravel Forge, which is for hosting, Envoyer for deployment, Laravel Spark for payments, Laravel Jetstream for users and team logic, and so many other components. In fact, there is Laravel Shift, which is an upgrade component that helps you upgrade from Laravel 10, or any Laravel version, to the current one, 12 in this case. There are solutions, kind of plug and play products that are all integrated with each other and allow you to use the full power of the ecosystem.

And initially, it was almost overwhelming to see all these services, but I ended up massively benefiting from them. The initial prototype for Podscan was very quickly spun up, like, very quickly within a couple hours with all these components in place, and none of them has been problematic over the long term. They all fit really well. And they're all still well supported. They're working extremely well with each other.

They're reliable. And I haven't needed any extensions that I couldn't build myself or that somebody else hadn't already built. Like, anything that went beyond the basic things that these packages offered, I could either do through just reading the documentation and diving into it or use AI, I guess, or find some other solution out there that builds on top of it. And one benefit that I didn't expect in the beginning, because I wasn't working with AI back then, I was still coding everything by hand, is that once AI actually hit the scene and I could have AI tooling check my logic and build features for me, the fact that PHP is such a popular language, and has been for decades, with people solving problems in it, became a massive benefit. AI systems, particularly the current versions trained on so much material and with really good reasoning capabilities, are extremely good at building PHP software.

They are pretty good at building any software that has been around for a while. Choosing PHP, one of the oldest programming languages on the web, and using it to build on top of a well distributed and popular framework, I think that was a spectacular choice. A couple businesses back, both FeedbackPanda, the one that I built and sold, and even PermanentLink, the one that I built after I had started writing my book, were built on Elixir. And Elixir itself is a great language. It has a wonderful ecosystem, but it's not as widely popularized as the PHP or the JavaScript world for comparison here.

So Elixir's wonderful benefits of being, like, hyper-parallelized, they kind of pale in comparison to just how easy it is to build with PHP. You may not build as performant an application, but it still will work if you run PHP. And the tools that are around are just so much better because of the wide popularity of these things. I highly recommend that you look into this when you build a new product. You take this into account.

There's something called the Lindy effect. And that means that a thing that has existed for a while is very likely to keep existing for just as long. Things that have been around for twenty years, like PHP, will probably be around for twenty more years. But something that's only been around for two months, think of the most recently hyped JavaScript framework, might not be here anymore in two months. Right?

These things, they have to grow into some kind of adult state or a mature state, but they often don't. Things that are already mature, they will likely stick around for a while. Hypes are quick, but long term things stay long term. And this effect is particularly true about programming language choices and framework choices. Most technical choices actually benefit from things that have been around for a while because there's a long tail of add-ons and additional stuff that you might not even think about.

But a community, right? If that exists and has existed for a decade or so, that's valuable. So if you choose a language that's popular and has been around for a while, you will find a lot of potential experts out there to help you with problems because they've had the opportunity to learn for a while, build projects with this, and become experts for a long time. You will find documentation and examples that you can draw from if you just look at it, read it, understand it, and that AI has already learned and understood. And you will find well maintained libraries that are not just well written and well done, but they include features that you actually need in real world projects because they were written by people who needed them for real world projects over a couple decades.

And this is something that comes with a technology choice that might be boring or not as modern and maybe not as super performant, but it's still incredibly impactful for how easy it is to build a business on top of it. And that's kind of why we're doing this. Right? We're not just playing with technology for the sake of technology. We wanna make money with this.

So the choices have to reflect that. And there are several cascading choices from this framework selection for me. Laravel, with its internal system called Eloquent, just kind of the ORM, the mapper that sits on top of the relational database, well, supports a couple databases, including MySQL, PostgreSQL, and SQLite. And I chose MySQL here. That's a choice that, let's just say, I don't necessarily regret it, but I think I could have made a different one and probably had a more positive outcome here.

As I created massive numbers of transcripts every day, and the system is still doing this, like 50,000 new episodes come in every day and I transcribe them all and I feed them into this database, well, I have to deal with the fact that MySQL is not that good for things like full text search or vector embeddings. Those things don't really exist in MySQL or they don't really exist well. So it has been around forever, MySQL, for web programming and even before, but these capabilities aren't part of the basic toolkit. That's different for things like PostgreSQL, and it turns out to be much easier in SQLite as well. So out of the three choices that I had in the beginning, did I make the right one?

Well, let's just say SQLite doesn't make sense for me at all because I have so much data to store. Podscan at this point is probably somewhere north of five or six terabytes in raw storage. I'm not gonna put that on a local database file. I had to build a system to move some of this onto object storage and only reference and pull that data in when needed to kind of trim down the database. So that's not really something for this system.
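The offloading idea described here, keeping only a reference in the database and pulling the large text from object storage when needed, can be sketched roughly like this. This is a minimal illustration, not Podscan's actual implementation: the `episodes` table, the `transcript_key` column, and the helper names are all hypothetical, the in-memory SQLite database only stands in for the real database server, and a plain dictionary stands in for an object storage bucket.

```python
import hashlib
import sqlite3

# Hypothetical stand-in for an object storage bucket.
# A real system would call its storage provider's API instead.
object_store: dict[str, bytes] = {}


def put_object(data: bytes) -> str:
    """Store a blob and return its key (derived from a content hash)."""
    key = "transcripts/" + hashlib.sha256(data).hexdigest()
    object_store[key] = data
    return key


def get_object(key: str) -> bytes:
    """Fetch a blob by key."""
    return object_store[key]


# In-memory database used purely for demonstration.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE episodes ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, transcript_key TEXT)"
)


def save_episode(title: str, transcript: str) -> int:
    """Write the large transcript to object storage; keep only its key in the row."""
    key = put_object(transcript.encode("utf-8"))
    cur = db.execute(
        "INSERT INTO episodes (title, transcript_key) VALUES (?, ?)",
        (title, key),
    )
    db.commit()
    return cur.lastrowid


def load_transcript(episode_id: int) -> str:
    """Look up the reference in the database, then pull the text on demand."""
    row = db.execute(
        "SELECT transcript_key FROM episodes WHERE id = ?", (episode_id,)
    ).fetchone()
    if row is None:
        raise KeyError(episode_id)
    return get_object(row[0]).decode("utf-8")
```

The database row stays small no matter how long the transcript is, which is the point of the trim-down: only code paths that actually need the full text pay for fetching it.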

I want it to be on a dedicated server somewhere that runs a database. So that kind of left MySQL and Postgres. And Postgres, I probably could have picked that and fared pretty well because Postgres does all of the things. It has a really good vector storage internally. It has full text search, probably usable to a certain degree.

But I think MySQL is fine. The consequence of having massive amounts of data in a database table is that with MySQL in particular, changing the structure or the schema at that scale of that particular table with all this data is almost impossible without a workaround. If I try to add an index to a database or to a table in the database that's constantly being read from and written to and has a lot of stuff going in and going out, has a lot of indexes already in there, that index is going to lock up the application. And I can't afford this, particularly not if I offer an API the way I do, right, with full access, read write to the database at all times. Can't just have it lock up for two hours.

So what I can do at this point is what AWS, where this is hosted, calls a blue green deployment, where I create a copy of the whole database and synchronize it. I then add my index to the green copy, the kind of greenfield, the new variant that's not being used yet by the program. Then I let the index build run there. It takes less time because there's not too much reading going on from it. Writing still happens, but it's synchronized in the back, so the backlog just waits.

And then when that's done, I switch it over. Right? I turn the green into blue, and I switch them over. AWS does this, I think, on the DNS level, so you don't even have to change your configuration. It's really cool, but it's additional cost.

So for every index that I wanna add, I effectively have to pay twice the daily hosting rate for spinning the whole thing up, running the index on it, and then switching it over. That's easily hundreds of dollars for every index that I add or every new field that I wanna add to the database. That's not fun. And I rarely do this now unless it's really needed because it's just a complexity of a certain thing. So that's MySQL here.
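The sequence described above (copy, build the index on the copy, let the write backlog catch up, then switch over) can be modeled in a few lines. This is a toy in-memory model under stated assumptions, not how AWS implements it: `Database`, `Endpoint`, and `add_index_blue_green` are hypothetical names, and the `writes_during_build` parameter simply simulates traffic that hits the live database while the index is being built.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    name: str
    rows: list[dict] = field(default_factory=list)
    indexes: set[str] = field(default_factory=set)


@dataclass
class Endpoint:
    """The stable address the application connects to; only its target changes."""
    target: Database


def add_index_blue_green(
    endpoint: Endpoint, column: str, writes_during_build: list[dict]
) -> Database:
    """Add an index without blocking the live database. Returns the old (blue) database."""
    blue = endpoint.target

    # 1. Create the green copy from the current state of blue.
    green = Database("green", rows=list(blue.rows), indexes=set(blue.indexes))
    synced = len(green.rows)

    # 2. The application keeps writing to blue while the index is built on green.
    blue.rows.extend(writes_during_build)
    green.indexes.add(column)

    # 3. Replay the backlog of writes that arrived after the copy was taken.
    green.rows.extend(blue.rows[synced:])

    # 4. Switch over: the endpoint now points at green, the application config is unchanged.
    endpoint.target = green
    return blue
```

The cost he mentions follows from step 1: for as long as the index build and catch-up take, two full database instances are running and billed.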

Not sure if Postgres would have been easier or if it's just my database schema approach that I should change. There are so many different ways of solving this problem, but I just feel like I have to mention here that MySQL may not have been the perfect choice because I've been constantly barraged by Jack Ellis, the cofounder of Fathom Analytics, for not using SingleStore, which is something that he suggested to me from the beginning and I probably should have used, but I didn't, you know, and it's really hard to make that switch at a later point. But it is what it is. The fact that I can't do full text search in MySQL though meant that I needed to externalize search to a different system. And I chose Meilisearch at this point out of several options.

Back then, I think we were on version 1.7 of Meilisearch. I quickly learned that having every single item from my main database in my search engine, kinda trying to completely duplicate it, would almost be impossible for the search engine to keep up with. It's just too much stuff to ingest, too much pointless data, too much text to run through, and Meilisearch needs to build an index on this massive amount of transcripts for hour long podcasts. It just kinda swamped even a very powerful server. So I had to figure out how to set up my server to handle the inflow.

And then over time, Meilisearch became much better. 1.9 was the version that really started working for me, and I just upgraded to 1.14 a couple days ago. It just worked until now. And throughout this time, Meilisearch had a couple hiccups, which is just, you know, part of hosting an isolated system. I just have to deal with, you know, file descriptors and temporary storage and all that.

But I always run two servers anyway, so I can quickly switch to one if the other one needs maintenance. And it requires some monitoring, but once running, it's extremely fast, quite reliable, and produces very good results that I use in both my API and the UI. And with 1.14, ingestion, like the indexing process, has sped up so much that I can now hammer it with transcripts, and it just keeps ingesting them very quickly. It's been really good. It's a good choice.
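Running two search servers makes it possible to fall back to the second one when the first is unavailable. The episode doesn't say whether that switch is manual or automatic, so the following is only one way it could be automated. It deliberately avoids any real Meilisearch client calls: each backend is just a function that takes a query and returns results, and `primary` and `secondary` are hypothetical stand-ins.

```python
from typing import Callable, Optional, Sequence

# A search backend is modeled as any callable from a query string to a list of hits.
SearchFn = Callable[[str], list[str]]


def search_with_failover(backends: Sequence[SearchFn], query: str) -> list[str]:
    """Try each backend in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for backend in backends:
        try:
            return backend(query)
        except ConnectionError as exc:
            # This server is unreachable (for example, down for maintenance); try the next.
            last_error = exc
    raise RuntimeError("all search backends are unavailable") from last_error


def primary(query: str) -> list[str]:
    """Hypothetical server that is currently down."""
    raise ConnectionError("primary is down for maintenance")


def secondary(query: str) -> list[str]:
    """Hypothetical healthy server."""
    return [f"episode matching '{query}'"]
```

Calling `search_with_failover([primary, secondary], "paddle")` skips the failing primary and returns the secondary's result, so neither the API nor the UI needs to know which server answered.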

So the big benefit of building Podscan in public, which is something that I'm currently doing, I guess, and I have been doing for the last year, was being able to reach out to the people behind Meilisearch, the open source maintainers, and do bug fixing with them and complain to them about stuff and give them full data exports just to play with and see how that works. This led to features being prioritized that I can really use now. And I'm really happy with this choice, especially seeing where Meilisearch is going with AI assisted vector embedding search, hybrid search, that stuff. One day I'll turn that on and have a semantic search over the full corpus of all podcast episodes anywhere. And that's still something for the future because it requires resources that I can't spare yet.

Podscan has only recently turned profitable. I still have to do a couple of things, but that'll be in the future of the system. Just full semantic search. I'm excited for this. Now let's talk hosting and deployment here.

My server is hosted on AWS, which has been very reliable. I think there was one time in the last year where my server died because of a physical rack malfunction where it was hosted, but I could quickly restore the virtual machine to another server and it just kept running. I deploy through Laravel Forge and Laravel Envoyer, two tools that make that happen. That has been an amazing choice from the start. Like I said, everything in this ecosystem is really cool.

And Forge in particular builds this really performant PHP-FPM setup. And Envoyer has a very easy Git based deployment system, and they work really well together. These tools have been super reliable in helping me set up my main server and the API servers that I use for transcribing and data extraction. The whole ecosystem has been extremely useful and good to me. I have a backup server somewhere for Podscan.

I think it lives on Hetzner just if I ever need it. Again, if I need to restart my main server or something, that is what that would be. Don't think I need to scale the system to anything much bigger than this because it's just good enough the way it is right now. But that also might change in the future. Who's to say?

Right? I don't know what's gonna happen over the next year. So that's the technical side of things. Let's maybe look more into the business side choices that I made, the services that help Podscan operate. Big one here is scraping because I need to check a lot of things and pull a lot of data from the web to enrich the podcast data that I already have.

So for all my scraping needs, I've been using ScrapingBee, and they have been great from day one. It's a wonderful SaaS that delivers exactly what you want for scraping data. They're very good, very reliable. And more recently, they added AI features where you don't even need to tell them exactly what to extract from a page in terms of, like, this is exactly where it is. You just tell them what you want to know, and the AI will extract it in your chosen format.

And this has been massively helpful to offload some of this AI complexity to another company and just pay them for it so I don't need to do it on my own end. It's really nice. It reduces my own system complexity, and they keep making it better and better, which is really cool. Things that break, though, happen, like, everywhere, and they happen for me as well. So for my error tracking in both PHP and the JavaScript front end, I've been using Sentry, mostly focused on JavaScript here, and Flare, which is a PHP centric error tracking tool.

And both have been really, really helpful. Their alerting is quite spectacular, and the deep insights that they give you into the problem have really helped me figure out how to build a more stable system. Like, at this point, there's still errors. Like, every system experiences errors. Right?

It's just the complexity of it. But new errors, I spot within minutes of deploying. And old errors, once they reoccur, for some reason, I get to see immediately as well. Like, the alerting is really, really spot on. And they send it through email. I get emails. I get notifications. And for sending email, I initially used Postmark for a while. That worked pretty well at first, but it became quite expensive over time and deliverability tanked at some point. I made a mistake.

I think I sent 10,000 emails to a person in a day because of a glitch in the system. And then trying to recover deliverability just did not work out. I switched to Resend at some point, and that has been a much better choice for my transactional emails. They are being delivered more reliably, it's cheaper, and it's also more of a developer focused tool. So Resend, I highly recommend for sending emails.

For nontransactional emails, for marketing stuff, I use Kit, formerly ConvertKit, and it's probably one of the best choices I made. It just keeps everything super useful, and I can adjust the layouts and I can add tags. I do this through my application. Right? If I get somebody signing up, I add their email with a tag to my Kit list, and then it allows me to start a campaign and invite them.

And then the team is capable of coming in and talking to people, helping with emails, and we have well designed templates through Palladio. There are so many interesting tools in that ecosystem. I highly recommend looking into Kit for that. And then I think the most important choice that I made in the business sense is payments. And for that, unsurprisingly, I chose Paddle.

They've been the most reliable and least complicated payment provider that I've ever worked with. I have almost $10k in MRR at this point because Paddle makes it happen. And I've worked with other payment providers that made it hard to get to just even integrating payments. Not the case here. Now I could probably have about $500 more per month if I didn't use Paddle, but I would have so many headaches and I would have to build so many things that Paddle just provides, it doesn't make sense.

They handle renewals, failed payments, refunds, local currencies, all that stuff that I would have to manually build. Like, it's extremely useful and has been a good choice to have chosen Paddle. And this also means that I don't even have to send invoices to my customers. The people buying PodScan are actually Paddle's customers and buying it from them. I just have to get these reverse invoices 12 times a year with Paddle.

And that is so much easier to handle even when it comes to bookkeeping. Maybe that's important to mention here too for banking and making the business happen, filing taxes, bookkeeping reports. I use Firstbase. That's a company that sets up US based companies in Delaware, Wyoming, wherever you need it. I set Podscan up as a Delaware C corp so I could take this bootstrapper compatible funding from the Calm Company Fund, and they also set me up with bookkeepers and tax professionals.

They set me up with a bank account at Mercury, which has been really good. Banking with Mercury has been a breeze, and it's really cool because it's app based and professionally handled. Like, they know exactly who their customer is. They get that businesses need modern software for their own modern software businesses. Their UI and APIs are built with that in mind.

So if you want to start a company or want to move banks because yours is getting weird and old school, look into Mercury. Now it's not a bank itself. It's a financial service provider, a fintech startup. They are working with Evolve Bank and Trust, which has been pretty reliable and reputable. With a more traditional bank, it might not be as easy to actually set any of this up.

Now let's reflect on this year of choices because obviously Podscan is now a profitable business and it's doing well and it's growing. I think I made the right choices along the way so that I could even get to this point. Looking at it after the fact seems very easy. Right? But I do wanna do this kind of reflection on a more regular basis because I tend to do this yearly recap in my mind for all of my projects, looking at where things are, how I want them to go, how I wanted things to go a year ago, and how they are now.

Maybe next year, I'll have a different answer to any of these things. I'll have a different perspective on that. Maybe I'll find certain technologies don't work for me anymore. Maybe my transcription system will be completely replaced by a service invented two months from now that's 20 times more efficient, but I then have to rebuild everything and some past choice doesn't make sense anymore.

But, hey, looking at it now from April 2025 back to April 2024 when Podscan really got going as a fully fledged product, I think the technical and business choices that I've made have been pretty good on average. I still try to ensure that any choice I make has this migration path, the possibility to choose something else. I think it's harder with fundamental choices like the language, the framework, the database, but it's easier with external services like the email providers, the scraping tools, error tracking. I always think if that service exploded tomorrow or was acquired and then shut down, would I still be able to run my product? How long would it take me to get back to normal?

And for all of these services, I believe I have an answer, I have backups, I have abstractions, and I have internal and documented migration plans. So that's kind of what you have to do in a situation like this. De-risk. Right? De-risk a deplatforming.
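The abstractions mentioned here are what keep a provider switch cheap. As a rough sketch of the idea, and not the actual Podscan code, an application can depend on a small interface rather than on a specific email service. Everything below is hypothetical: the `EmailSender` interface, the two provider classes, and the `SENDERS` registry are invented for illustration, and the providers only record messages in memory instead of calling any real service.

```python
from dataclasses import dataclass, field
from typing import Protocol


class EmailSender(Protocol):
    """The only surface the application is allowed to depend on."""

    def send(self, to: str, subject: str, body: str) -> str: ...


@dataclass
class PrimaryProvider:
    """Hypothetical adapter for the current email service (records messages in memory)."""
    outbox: list[tuple[str, str, str]] = field(default_factory=list)

    def send(self, to: str, subject: str, body: str) -> str:
        self.outbox.append((to, subject, body))
        return f"primary-{len(self.outbox)}"


@dataclass
class FallbackProvider:
    """Hypothetical adapter for the service to migrate to if the primary disappears."""
    outbox: list[tuple[str, str, str]] = field(default_factory=list)

    def send(self, to: str, subject: str, body: str) -> str:
        self.outbox.append((to, subject, body))
        return f"fallback-{len(self.outbox)}"


# Switching providers becomes a lookup by name (for example, from configuration).
SENDERS: dict[str, EmailSender] = {
    "primary": PrimaryProvider(),
    "fallback": FallbackProvider(),
}


def notify_signup(sender: EmailSender, email: str) -> str:
    """Application code: it never names a concrete provider."""
    return sender.send(email, "Welcome", "Thanks for signing up.")
```

With this shape, replacing an email provider means writing one new adapter class and changing which entry is selected, while `notify_signup` and every other caller stay untouched.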

And because I'm thinking about questions that I will ask myself a year from now, looking at today, the choices that I make today, maybe I should ask myself today, could I have chosen something wildly better a year ago? And realistically, I don't think so. I mean, you know, the database could have used another kind, but I would still have scaling database issues and not have completely solved all of these problems. I still would have to build the right indexes, you know, for data to come quickly. That happens on every database that has millions of rows in there.

I've met many new technologies along the way, and I experimented with them, but nothing really beat what I had going on already. And mostly because I think at this point, I had already built a project with this stack. Right? Podscan is the project that was trying to help me market Podline, and Podline was built exactly with this stack, so I knew exactly how to set it up. Part of the magic is really just knowing what you're doing and repeatedly doing it.

So that's my look back at the technology and business choices that I made a year ago. Let's see how many good choices I make for the next year. And that's it for today. Thank you so much for listening to The Bootstrapped Founder. You can find me on Twitter at @arvidkahl, a r v i d k a h l.

If you wanna support me on this show, please share PodScan.fm with your professional peers and those who you think will benefit from tracking mentions of brands, businesses, and names on podcasts out there. PodScan is this near real time podcast database with a stellar API, so please share the word with those who need to stay on top of the podcast ecosystem. Thank you so much for listening. Have a wonderful day, and bye bye.

Transcript source: Provided by creator in RSS feed: download file