Claude Code makes several thousand dollars in 30 minutes, with Patrick McKenzie

⁠¶ Intro

00:01

Welcome to Complex Systems, where we discuss the technical, organizational, and human factors underpinning why the world works the way it does. Hideho everyone. This is Patrick McKenzie, better known as patio11 on the Internet. Today's a bit of an experimental episode.

00:23

I've spent quite a bit of time recently doing AI-assisted coding. The experience of doing this has, in the view of many technologists I trust, gotten discontinuously better in December of 2025, as a result of Claude Code upgrading to Opus 4.5 as the underlying model.

00:37

The sharpest technologists I know use words like "fundamentally transformed" to describe the impact of LLMs on coding. Thomas Ptacek, my erstwhile co-founder and an engineer's engineer, wrote, "All progress on LLMs could halt today, and LLMs would remain the second most important thing to happen over the course of my entire career."

00:56

Andrej Karpathy, one of the founding members of OpenAI, wrote, "This is easily the biggest change to my basic coding workflow in almost two decades of programming, and it happened over the course of a few weeks." He is referring to the switch from writing code and asking the AI questions to asking the AI to write code and occasionally tweaking it, and not to any particular tool. If you code every day and have not used modern coding tools like Claude Code, Cursor, or OpenAI's Codex yet,

01:22

strongly consider hitting pause now and playing with them for a week or so. It is by far the most important thing you can do for your career and your employer this week. If you don't code every day and don't understand what this fuss is about, the rest of this episode is for you. I want to walk through a coding session with Claude Code in a pedantic level of detail. In it, Claude solves a real business problem involving the complex interaction of four computer systems,

01:45

in a way which multibillion-dollar businesses routinely fail to, and it does so in thirty minutes. My contribution to this work was being half attentive to a chat window after dinner, and while it was definitely engineering work, it exercised relatively few of the muscles that my degree or decades of experience coding developed.

02:02

If you'd prefer not getting deep into a particular example and just want a survey, Bloomberg's Odd Lots podcast did an excellent episode with Noah Brier on this topic. I'll link to it in the show notes. It's the single best artifact for a non-specialist audience about why modern coding tools are a discontinuous advance.

02:18

All engineering work happens in a business context. As I often tell people, engineers are not employed to write code. Engineers are employed to increase revenue or reduce costs of businesses. Those are the only two goals.

⁠¶ All engineering work happens in a business context

02:30

I run a small business which publishes to the internet about infrastructure and receives money for doing so. This is definitely not what I think I'll be doing for the entirety of my remaining career, but it is what pays the bills today. This podcast is one product of that business. The other main one is Bits about Money, which I describe as a professional journal on the intersection of finance and technology. Other people might describe it as a paid newsletter.

02:54

BAM memberships are similar in character to supporting one's local radio station. There is no paywall, and so the pitch for members is pay so this public good continues existing, and not pay so you specifically can continue accessing this publication. For historical reasons, most BAM members purchase annual memberships, and most of these memberships renew in January. Every year in January, we have a spike of revenue, yay, and a spike of failed transactions.

03:19

I promise we're returning to AI-assisted coding in a minute, but you have to know a little bit about financial infrastructure to understand why a newsletter all about financial infrastructure suffers failed transactions. Briefly, almost all BAM memberships are paid for on a personal or business credit card.

03:34

Credit cards as an ecosystem tolerate a much higher rate of (industry term of art here) spurious declines than you'd expect. We previously had Emily Sands from Stripe on to discuss machine learning approaches to manage this.

⁠¶ Payment failures briefly taxonomized

03:47

You can broadly bucket payment failures into three very different scenarios. Scenario one is the least common. The customer genuinely does not intend to pay, and has either instructed their bank to block attempts to charge, or is ghosting you as a way of soft canceling their commercial relationship. Scenario two, the transient payment failure, is where there's a temporary infrastructure hiccup.

04:09

A credit card transaction needs to coordinate many more than five computer systems at five firms within a window of a few hundred milliseconds. If any of them blinks during that window, the transaction fails. This happens much more frequently than you would expect for infrastructure that trillions of dollars ride on.

04:26

The solution to a transient payment failure, which a business will not generally know is transient, is to simply retry the payment. I previously worked at Stripe and am presently an advisor there, and they do not necessarily endorse what I say in my own space.

04:39

But I was a user of Stripe long before and long after my employment, and so I know that Stripe will, if you have a setting on your account, transparently retry the payment on a configurable schedule. If one of those retries works, then no human has to even know the payment failed temporarily.
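To make the retry idea concrete, here is a minimal sketch of retrying a charge on a configurable schedule. It is a generic illustration and not Stripe's implementation: the method name `charge_with_retries`, the `schedule_days` parameter, and the default schedule are all assumptions made up for this example.

```ruby
# Illustrative only: a generic retry-on-a-schedule loop, not Stripe's code.
# `charge` is any callable that returns true on success and a falsy value on failure.
def charge_with_retries(charge, schedule_days: [0, 3, 5, 7])
  schedule_days.each_with_index do |day, index|
    # A real system would wait until `day` days after the first attempt;
    # this sketch only records which scheduled attempt succeeded.
    return { paid: true, attempt: index + 1, day: day } if charge.call
  end
  # Every scheduled attempt failed, so a human now needs to get involved.
  { paid: false, attempt: schedule_days.length, day: schedule_days.last }
end

# Simulate a transient hiccup: the first two attempts fail, the third works.
outcomes = [false, false, true]
flaky_charge = -> { outcomes.shift }
result = charge_with_retries(flaky_charge)
puts result.inspect
```

If the schedule runs out without a success, the failure is no longer plausibly transient, which is the hand-off point to the human-in-the-loop recovery discussed next.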

04:53

Today's work is about scenario three, payment failures which need human interaction to recover from. This could be, for example, a credit card being maxed out, though that is extremely unlikely for my customer base. It could be that a card was reissued after expiry or loss, and that the issuer doesn't participate in automated systems to allow businesses to seamlessly charge the new card.

05:14

The failure could be caused by the user switching banks or employers, and thus their purchasing card, in the course of a year. Or it could be caused by a bank, for reasons of its own, heuristically detecting this year's charge as perhaps being fraudulent and blocking it. And they will continue blocking all attempts until the user tells them to please knock it off.

05:33

Scenario three interdicts billions of dollars of commerce every year. The typical way businesses deal with it is by sending an automated email, telling you that a payment failed and asking you to reenter your payment credentials. In an ideal world, that email takes you, with one click and no login, to a screen which immediately asks for the new credit card number, and you're done in 30 seconds.

05:52

Very, very few businesses, even the largest businesses in the world, successfully execute on this ideal scenario. This is because their revenue operations or payments teams have basically zero engineering cycles available to them, because this isn't hugely salient at an executive level.

06:07

Even if they're leaving literally tens of millions of dollars on the table for want of a few days of engineering work, that fact is not necessarily obvious to the business. There isn't a graph which exposes it directly. Accounting will never detect it. If I hadn't worked in payments, it wouldn't have been obvious to me either. The symptom is you expect, for example, 100 users to renew their subscriptions on a particular day. Ninety-two do, and the others don't.

06:33

If you aren't intimately familiar with what is going on, you might think those eight people didn't feel the last year was valuable for them, or they are trying to save money in the coming year, or whatever folk theory of behavior prevails in your organization. Now, knowing that spurious declines were costing me money every January, did I actually do anything about it? No.

06:53

It required some actual engineering work to deal with it. I ballparked it as one to three days for me, and each year in January, I was more concerned with either writing BAM or doing a subscription drive than recovering failed payments. Maybe next year, I said, multiple times.

07:08

I'll note for completeness here that Stripe has a setting which will automatically send out payment recovery emails for you, but those emails are entirely templated, and because I have strong views on what the voice of this publication is, I did not simply turn on that setting. Why is the engineering work here hard? Sending an email about the payment failure is relatively easy. Sending a customized email with a link that gets you directly in to update the payment is more difficult.

07:32

There are four computer systems in play. My own system; Ghost Pro, the platform that BAM is published through; Stripe, which handles all the payment processing and is the system of record for, for example, failed payment attempts; and then Postmark, an email provider.

07:46

These each have slightly different views on the elephant. Stripe, for example, knows about payment failures, but not about emails already sent through Postmark. Postmark has no idea who subscribes to BAM. That's Ghost's job. My system has to reconcile these against each other to determine what to do. The engineer doing this work needs to use APIs, ways for computer systems to talk to each other programmatically,

08:07

to coalesce those views into a list of who hasn't paid successfully, remove users where that is expected, for example if they explicitly requested to cancel, construct special URLs to let people in without a login to update information, and then actually send the emails. Then the money rolls in, yay.
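The reconciliation step can be sketched roughly as follows. This is a minimal illustration with made-up data, not the code from the session: the hashes stand in for what the Stripe, Ghost, and Postmark APIs would return, the method name `recovery_candidates` is hypothetical, and the `past_due`, `unpaid`, and `cancel_at_period_end` names are assumed to mirror Stripe's subscription fields. Constructing the no-login URLs and actually sending email are left out.

```ruby
require "set"

# Made-up records standing in for data fetched from the payment processor.
stripe_subscriptions = [
  { email: "a@example.com", status: "past_due", cancel_at_period_end: false },
  { email: "b@example.com", status: "unpaid",   cancel_at_period_end: true  },
  { email: "c@example.com", status: "active",   cancel_at_period_end: false },
  { email: "d@example.com", status: "past_due", cancel_at_period_end: false }
]
# Made-up membership list (the publishing platform's view of the world).
ghost_members = Set["a@example.com", "b@example.com", "c@example.com", "d@example.com"]
# Made-up log of addresses that recently received a recovery email.
recently_emailed = Set["d@example.com"]

FAILED_STATUSES = %w[past_due unpaid].freeze

def recovery_candidates(subscriptions, members, already_emailed)
  subscriptions
    .select { |s| FAILED_STATUSES.include?(s[:status]) } # payment has not succeeded
    .reject { |s| s[:cancel_at_period_end] }             # the user chose to leave
    .select { |s| members.include?(s[:email]) }          # still a known member
    .reject { |s| already_emailed.include?(s[:email]) }  # avoid duplicate emails
    .map    { |s| s[:email] }
end

to_email = recovery_candidates(stripe_subscriptions, ghost_members, recently_emailed)
puts to_email.inspect
```

Each filter corresponds to one system's partial view, which is why no single system can produce the final list on its own.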

08:24

Now follows a conversation with Claude Code. I'm going to try to be clear about what I'm telling Claude, substantially verbatim, and what Claude is doing or telling me. I will fudge a tiny bit in the interest of comprehensibility or omitting sensitive information.

⁠¶ Now follows a conversation with Claude Code

08:38

For example, in the development of this and every other similar system, an engineer will have to look at some point at so-called production data, real names attached to real email addresses. I will obviously not read those on air. Thankfully, an engineer does not have to look at real credit card numbers, and indeed cannot in a well-designed system.

08:56

Different scales of enterprises would use different levels of care in having testing environments. This small business does not have a testing environment for BAM. The real users are the only ones we have. And so some of the work with Claude in a previous session was being careful because of that lack. We've, for example, built a dry run feature to identify who would get emails before actually sending those emails, so that I could eyeball the list for correctness.

09:19

Some important context before we get started. Claude can't see a project the same way an engineer sees a project. It relies on an orientation file, CLAUDE.md, plain text in the Markdown format, that it reads at the start of every session. It can then, in the course of a conversation with the user, use what it calls tools,

09:37

like searching for relevant code, reading it, operating the command line, which lets it do things like restart servers or make test API requests or run automated testing tools. I'll provide some color commentary about its tool use.

09:51

And while Claude does not explicitly remember this fact because it is in a fresh session, it's important that a few weeks ago I had Claude build a feature for Bits about Money, which used the APIs to list upcoming renewals on a dashboard and send emails reminding users about those upcoming renewals.

10:05

This is a different business problem than recovering from payment failures, but it involves much of the same technical infrastructure, and I explicitly don't want Claude to reinvent the wheel. So the first thing I do is try to jog its memory. Me to Claude. We're doing the annual Bits about Money subscription renewal month, and it seems many subscriptions failed to renew. I think this is likely to be mostly transient payment failures, cards which weren't updated, etc.

10:30

I'd like you to enhance the BAM subscriptions class with logic to detect recent (last thirty days, but make it configurable) payment failures which haven't recovered (Stripe might auto-retry, et cetera). This should be similar in character to inspecting upcoming BAM renewals. Exclude any where the subscription is in an intentionally cancelled state, either originating from the user or from me.

10:51

After you enhance BAM subscriptions with this, I would appreciate A, a new dashboard page to show these users, and B, a rake task to send them an email, similar to the existing subscription reminder email. You can take a stab at writing the skeleton of that email. I will refine it to be in my voice and make the updates as easy as possible. Some commentary. This specification is verbatim, and note how little technical detail it includes.

11:15

Every bit of software ever written until twelve months ago had to be extremely pedantic about what data sources it is touching, in what bits of code, with what functions, etc. I have given Claude precisely one identifier, the name of a class, BAM subscriptions.

11:30

I expect Claude, with no other information, to be able to successfully find BAM subscriptions on disk, read it, find the part of it which is about upcoming BAM renewals, and generalize from that previously completed, known to work code. Claude then fires several tool calls. The first one searches on disk for a file named BAM subscriptions. This fails to find the class, and this is a CS 101 level error.

11:54

That error is totally inconsequential, because Claude just immediately proceeds to running a wider search, locating BAM underscore subscriptions dot rb in the lib subdirectory, right where it put it last time. Claude then reads the file and ponders a bit. Claude then begins writing new code. It, in a single shot, writes eight hundred and nine lines of code across seven files.

12:15

Although I had suggested that it stagger this development, first doing the bookkeeping work of finding failed payments, and then building a dashboard to show those failed payments, a task to send email to each person, and a skeleton of what that email would say, Claude just does this all in one go. This is in less than five minutes of elapsed wall clock time. Claude to me.

12:35

I've implemented the payment failure detection and notification system. Here's a summary of the changes. Changes made. One, enhanced lib slash BAM subscriptions dot rb. Added new methods for payment failure detection. Subscribers with payment failures, parameter days, finds BAM subscriptions in the past due or unpaid status with failed invoices in the lookback period.

12:57

Fetch recent invoices, parameter subscriber ID, gets detailed invoice data, including failure reasons. Extract payment error, parameter invoice, translates Stripe error codes into human readable messages, for example, card declined, insufficient funds. Payment failures with details, parameter days, enriched data for dashboard display, including invoice URLs and next retry times.

13:18

Duplicate prevention via local file tracking and Postmark API checks. Similar pattern to renewal reminders. Two, new dashboard. Slash private slash admin slash BAM payment failure. This lists all subscriptions with unrecovered payment failures, shows the email, the name, the status, past due or unpaid, the error reason, the failed date, the next retry date, and the amount. Configurable lookback period, seven, fourteen, thirty, sixty or ninety days.

13:44

Preview the email before sending functionality, can send individual or bulk emails, and a link to Stripe's hosted invoice page for each failure. Also revenue at risk summary statistics. Three, new email template, app slash views slash BAM mailer slash payment failure dot text dot erb. A skeleton email that explains the payment issue with a specific error where available, provides a link to the member portal to update payment method,

14:09

includes direct invoice payment link if available, shows next automatic retry date if scheduled, friendly tone, and offers to help. Four, new rake tasks. Preview who would receive emails with dry run mode. Rake BAM preview payment failures thirty. Actually send those emails. Rake BAM send payment failure emails thirty comma send. Show email statistics. Rake BAM payment failure stats. Key behavior.

14:37

Detect subscriptions in past due, where Stripe is still retrying, or unpaid statuses, where retries have been exhausted. Exclude subscriptions where cancel at period end is true. This means the user has chosen to leave. Fourteen day Postmark duplicate check by default for payment failure emails. And the days parameter is configurable, defaults to thirty as requested. Would you like me to push this to the dev box so you can test it? Some commentary.

15:04

You might not be familiar with the word dev box. Briefly, I do software development on my MacBook Pro, the latest in a long line of laptops that I've used in the fourteen years this system has existed. If I were to actually run the code I was writing on the MacBook Pro, which I did for many years, I'd tear my hair out as, for example, old versions of libraries conflicted with new versions of macOS.

15:25

So instead I use a pattern quite common in the professional software industry. I have a so-called dev box sitting in the cloud. While I write the software locally on the MacBook Pro, I sync the code to the dev box, and the code actually executes on the dev box while developing. Claude, for what it's worth, is open on my MacBook Pro and not open on the dev box. It lives on neither. The actual brains are somewhere in the cloud, on a computer operated by Anthropic. Be that as it may,

15:52

Claude can access my dev box because A, it can run Linux commands, including SSH, which allows it to open a terminal on any computer in the world that it has credentials to, and B, I have arranged to have credentials to the dev box on this computer, and permitted Claude to use them. Claude's permission model for Linux commands is a bit complicated. Some Linux commands are essentially zero risk. Echo, for example, just displays what you type to it back to you.

16:18

Some are higher risk, like deleting your entire file system, and some are kinda tough to tell. What's the worst thing that can happen if a computer connects to another computer and asks it to do something in that computer's power? Well, it could be either A, nothing, or B, millions of dollars of damages. And so one option is you review every command as Claude types it, and approve or deny them individually.

16:41

This is really annoying for the operator, and virtually nobody does it for longer than a few minutes. Claude really wants to be firing off one command after another while cranking through things. The second option, which I mostly use, is to whitelist commands individually, which are low impact, and then manually inspect ones not on the whitelist. SSH is, in the general case, potentially a very high impact command.

17:04

The control that I have here is not internal to Claude. It is internal to the rest of my business. I know what the dev box can do, and that is intentionally less than what my business can do. And therefore, allowing Claude to operate the dev box is relatively safe. And I want to say explicitly, relatively safe. If North Korean hackers had that level of access to my dev box, that would be an instantaneous emergency for me. But I trust Claude more than I trust North Korea.

17:32

And it's worth saying, I really do trust Claude, on the basis of having used it for a few months. It is a very productive junior engineer, which has occasionally made mistakes which cost me, but which does not act maliciously. Different engineering orgs will have sharply different opinions about how much they trust modern coding tools,

17:50

based on org culture, the relative strength of the security team versus feature developers, resourcing, and the domain the business operates in. But suffice to say that I have a much more security focused posture than almost any business with similar revenue. And I have a far less security focused posture than, for example, a bank. Which is why I don't use the other method. Dangerously skip permissions, a command line flag, where you just let Claude do whatever it wants.

18:16

I've heard that described as YOLO mode, and for a business which actually touches computers which touch money, YOLO mode is a painful experience waiting to happen. Back to the chat. Me to Claude. Is there a way by which we can eliminate any login step or other auth for the emails? I see we're looking for payments links. Are those scoped per user here, do you think?

18:39

My commentary. Payments links are a particular product from Stripe, which lets users, for example, update their payment information on a Stripe hosted webpage. Some users of payments links cannot write code, and so their payments links are one link for all users. The user is forced to log in immediately after accessing it.

18:55

I want the other kind of payment link, which is scoped to a particular user. This has, essentially, a secret in the URL of the link, which lets someone immediately in to update a payment method without logging in, expecting that the engineer has guaranteed security in some other fashion. Here the guarantee is I already know you control the email address corresponding to this account because you are seeing this payment link in precisely an email delivered to that email address.
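The general idea of a link scoped to one user by a secret in the URL can be sketched as below. This is a generic illustration and not how Stripe builds its links: the `ScopedLinks` class, its method names, the in-memory token store, and the example URL are all hypothetical.

```ruby
require "securerandom"

# Hypothetical in-memory store mapping secret tokens to customer IDs.
# A real system would persist tokens and expire them.
class ScopedLinks
  def initialize(base_url)
    @base_url = base_url
    @tokens = {}
  end

  # Returns a URL meant for exactly one customer; the token is the credential.
  def link_for(customer_id)
    token = SecureRandom.urlsafe_base64(32) # 32 random bytes, unguessable in practice
    @tokens[token] = customer_id
    "#{@base_url}?token=#{token}"
  end

  # Resolves a token back to a customer, or nil if the token is unknown.
  def customer_for(token)
    @tokens[token]
  end
end

links = ScopedLinks.new("https://example.com/update-payment")
url = links.link_for("cus_123")
token = url.split("token=").last
puts links.customer_for(token)           # the customer the link was issued to
puts links.customer_for("guess").inspect # an unknown token resolves to nothing
```

Whoever holds the URL is treated as the customer, so the security of the scheme rests on the delivery channel, which here is the customer's own inbox.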

19:21

I will elide a long discussion of email security here. Suffice it to say that businesses get to make a security versus money trade off, and here businesses choose money while tolerating a tiny risk of interception of email. And the payments link product is made to be robust. Even if a hacker were to gain access to someone else's payment link, about the worst they could do is cancel their subscription. Or, I suppose, they could pay for it.

19:44

They can't extract a full credit card number from the payment. Claude to me. Yes, the invoice URL, Stripe's hosted invoice URL, is already a direct payment link scoped to that specific invoice. No login required. The customer clicks it and pays immediately on Stripe's hosted page. The friction point is the member portal link, bitsaboutmoney.com slash memberships slash hashtag slash portal, which requires the Ghost magic link login.

20:10

For a more frictionless update payment method experience, we could generate a Stripe billing portal session URL instead. These are pre-authenticated, so no login required, time limited, they expire after about twenty four hours,

20:22

scoped to that particular customer, and they let the customer update the payment method, view invoices, cancel, etc. Would you like me to try adding billing portal session URL generation? That code would look something like... I think the acknowledgement of an ad read sounds cooler in Japanese.

⁠¶ Sponsor: Framer

20:46

I have an engineering degree and sold two software companies, and I associate building marketing websites with pain and suffering, particularly the maintenance burden, which nobody talks about. Framer is a next generation website builder, which gives your design teams the ability to make something beautiful, your marketing teams an all in one platform, and your engineering teams, their schedule back.

21:06

You can do real-time collaboration with colleagues and as soon as you hit publish, it's live on the internet. Landing pages? Live. A quick microsite to support that new campaign? Live. A ground up redesign of your entire dot com? Live if you want. Thousands of businesses from weeks old startups to the Fortune 500 use Framer to accelerate their ability to ship sites their customers will love.

21:28

Learn how you can get more out of your dot com from a Framer specialist or get started building for free today at framer.com slash complex systems for 30% off a Framer Pro annual plan. That's framer.com slash complex systems for 30% off. Framer.com slash complex systems. Rules and restrictions may apply. Commentary.

⁠¶ Conversation with Claude Code (continued)

21:54

We've now entered the major hard part of the work from my perspective. Eighty percent of the actual writing code is already done in the first five minutes. It was boilerplate to support the identification of payment failures, the dashboard for the operator, and the sending of emails.

22:08

The other eighty percent of the work is why this hadn't been done despite multiple years of costing me money. It requires fiddly work at the intersection of multiple systems, including what I currently assume to be deep dives into documentation and chasing down edge cases within edge cases. And I'm letting Claude do that for me, and only gesturing in the direction of what I really want. Claude picks up what I'm putting down immediately and starts brainstorming through engineering approaches.

22:33

My next message to Claude. Is there a Ghost specific way to send people to a logged in Ghost instance? Probably not, but I have to ask. Read some docs for me, please. Commentary. Here we are working on the part of the email where I want to send people to a Bits about Money page, which is on Ghost, rather than a Stripe page. Partly that lets users see, yep, this page that's asking for my credit card is actually Bits about Money requesting these payment credentials.

22:58

Partially it covers for cases where Stripe doesn't have a payments link available for a particular user. What are those cases? Don't know. I'm operating defensively here, which is common to engineering practice. Note that I explicitly tell Claude I am uncertain about whether what I want is even technically possible, and I ask it to read, quote, the docs. I do not point it at the docs. I know it can run web searches.

23:20

It does, and chases down some rabbit holes, including two-year-old forum posts on the Ghost Developer Forums, issues opened up in GitHub, and the actual source code of the Ghost system. Claude to me. Good news. There is a way now. Ghost has a slash ghost slash API slash admin slash members slash ID slash sign in URLs endpoint that generates a one-time login URL. As of October 2024, per github dot com slash TryGhost slash Ghost slash pull slash two one two eight four,

23:49

this endpoint accepts admin API tokens. Previously it required cookie-based admin session auth. So you could one, look up the member by email via the Ghost admin API, two, call ghost slash API slash admin slash members slash ID slash sign in URLs with your admin API JSON Web Token. Three, get back a one-time sign-in URL. Four, include that URL in your payment failure email. This will let users click straight through to the member portal without the magic link email step.
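As an aside on step two, a JSON Web Token for an admin API of this kind can be built with nothing but the Ruby standard library. The following is a minimal sketch, not the code from the session: it assumes the admin API key has the form "id:hex secret", that the token is signed with HS256, is short lived, and carries an audience of "/admin/", which is my understanding of Ghost's documented scheme and may differ by version. The method names and the example key are made up.

```ruby
require "openssl"
require "json"

# URL-safe base64 without padding, the encoding JWT segments use.
def b64url(bytes)
  [bytes].pack("m0").tr("+/", "-_").delete("=")
end

# Builds a short-lived HS256 token from a key shaped like "<id>:<hex secret>".
# The five minute lifetime and "/admin/" audience are assumptions; check the docs.
def ghost_admin_token(admin_api_key, now: Time.now.to_i)
  key_id, hex_secret = admin_api_key.split(":")
  header  = { alg: "HS256", typ: "JWT", kid: key_id }
  payload = { iat: now, exp: now + 300, aud: "/admin/" }
  signing_input = "#{b64url(JSON.generate(header))}.#{b64url(JSON.generate(payload))}"

  # The secret is hex-encoded, so decode it to raw bytes before signing.
  secret = [hex_secret].pack("H*")
  signature = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), secret, signing_input)
  "#{signing_input}.#{b64url(signature)}"
end

# Made-up key for illustration; never hard-code a real one.
token = ghost_admin_token("0123456789abcdef:00112233445566778899aabbccddeeff", now: 1_700_000_000)
puts token.split(".").length # header, payload, signature
```

In practice most codebases would reach for a JWT library rather than hand-rolling this, which is what happens later in the session.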

24:16

Want me to add Ghost sign-in URL generation to the payment email flow? I'd need the Ghost admin API key. Or I can look for it in the config if it's already there. Sources. GitHub slash TryGhost slash Ghost slash issues slash one six seven four eight documents the fix.

24:32

Forum.ghost.org slash T slash admin API request member endpoints slash four zero eight two five documents the history of this coding change. Ghost.org slash help slash impersonate members describes the manual process that this replicates. Here's my commentary. I was surprised and encouraged by this. Claude ground out the research process in five minutes of wall clock time while I was on Twitter. This is easily an hour of Googling and reading if you do it at human speed.

25:00

Engineers have the frequent experience of getting the critical nugget of information in a three-year-old post on a random forum. They generally won't tell you that they needed to read twenty posts to surface that nugget. Claude did too, but its time is

25:15

Me to Claude. Yes, I would like you to make this as transparent as possible to the user. Please externalize any interaction with the Ghost API into its own lib class. You can stub out a configuration file for the API key. I will create one with appropriate permissions. Commentary.

25:31

A core skill for working with these tools is that they require frequent small amounts of wall clock time. I expected it might take Claude one to four minutes to write that code, largely driven by repeated references to documents it would need to read, since this code base had no prior art on accessing Ghost. I was going to simultaneously use the same time to go to my Ghost instance and provision an API key for the system.

25:53

My request is drafted to avoid blocking either of us. Claude does its thing, I do my thing. We are working towards the same goal, but neither of us needs to talk to the other until we are both done with our work. What Claude did next, summarized. One, created lib slash ghost underscore API dot rb, a new class for Ghost API interactions with JWT token generation,

26:16

which is Ghost's required auth format. Find member by email, parameter email, looks up a member. Generate sign in URL, parameter member ID, gets a one-time sign-in URL. Sign in URL for email, parameter email, a convenience method combining both. Account settings URL for email, parameter email, a sign-in URL that redirects to account settings. Two, created config slash ghost dot yaml dot example, a template for the API credentials.

26:45

Three, added config slash ghost.yml to gitignore. Four, updated BAM subscriptions to optionally fetch Ghost sign-in URLs when building payment failure details. Five, updated the email template to use the Ghost sign-in URL when available. Then when testing on the dev box, the JWT gem wasn't installed. Claude added it to the Gemfile as gem JWT version

27:08

two point five ish, but it failed because JWT two point five and later requires Ruby two point five and the dev box runs Ruby two point three point three. I updated to gem JWT two point two point three for Ruby two point three compatibility, which works. At no point did I, Patrick, have to intervene and tell Claude to A, use JSON Web Tokens, JWT, because the Ghost API requires you to do that,

27:34

B, pick this specific Ruby library for JWT, because it can search the internet and guess what the usual option is, or C, resolve the version conflict. It entirely self-propelled through that. This is the easiest possible versioning issue, and would have cost an engineer five minutes. More complex versioning issues routinely eat a day. It cost me zero seconds, because it was already resolved when I came back with the

27:57

What I told Claude. I have overwritten the sample credential file with real production Ghost credentials. Rename the file as appropriate and then test against patrick at kalzumeus dot com, which is my email address, which I'm happy to read on air. Generate a Ghost magic login link. Display that to me, but do not attempt to access it. I will verify it works in an incognito window. Claude then does so.

28:19

As I watch it work, I see it makes a sensible choice that I would not have made. Fourteen years ago, when this system was originally built, the overwhelming practice of Rails developers was to put their credentials in a configuration file, then check that in with the rest of the code base. This is no longer considered good practice, because someone compromising a copy of the code would get the production credentials, which potentially have a large blast radius.

28:41

Blast radius is engineering speak for, there could be consequences to things that are not visible in the four walls of the file that we are looking at right now. So if you compromise a production system's credentials, you can do perhaps anything that production system can do, which is not described in the file that simply has those credentials in it.

29:00

And so Claude, rather than checking the credentials in, explicitly blacklisted the credentials from being checked in and copied them to the dev box via a different mechanism. What I told Claude. One, I confirmed in an incognito window that the link did work and logged me in, but it displayed a list of essays rather than the expected payment information, and I want to shorten that loop.

29:21

I understand your reticence to check in credentials as code, but this is an old legacy system. We don't have a pre-made secure credentials store pattern in use yet, and developing one is not a priority for tonight. Instead, just check the credentials in, like we did old school. Some commentary. Note that Claude will not generally debate you when you give it an explicit instruction like that, even when the explicit instruction reduces one's security posture. This is important.

29:47

If you want to maximize for security, keep the computer entirely powered off. However, that does not maximize for people successfully paying you money. And so you have to find some balancing point. Here, when an engineer says they have a reason for choosing a particular balancing point, Claude immediately does whatever they say they want. You can potentially use Claude in a more exploratory fashion, like I did earlier with regards to APIs that I didn't know existed.

30:10

many engineers might not be able to reason through. For example, what is the blast radius of a Ghost credential? Could it cost me money if it was leaked? Could it harm other people if it was leaked? And if so, how? What is the relative risk of source code being stolen? Is that exposure likely to come from, for example, a hack of GitHub, or, for example, me having my laptop stolen? What other non-code compensating controls could I put on my laptop to minimize that?

30:36

And Claude will happily walk you through these issues, like an infinitely patient mid seniority engineer would. Its answers are, I predict, more consistently right than a lot of people who earn six figures. And they're wrong often enough that security teams probably don't have to worry about job security, at least for the next few months. What happens next?

30:55

For the next fifteen minutes, Claude and I are mutually driving an exploratory process as I try to coerce Ghost to do what I want by manipulating URL parameters. Claude makes frequent reference to Ghost source code. I am mostly just copy-pasting to an incognito window in Chrome, trying to access my own account with magic links that Claude keeps generating using its new API access.

31:15

We eventually find a pattern which does exactly what I want it to do: open to the closest page that will allow someone to update their payment credentials. Claude's first response to me, summarized. I searched the Ghost portal source code and found routes including account home, account profile, et cetera. I generated a test URL with the query parameter

31:34

Oh boy, reading this on air is gonna be fun. Ampersand R equals percent sign two capital F, percent two three, percent two F portal, percent two F account, which is URL-encoded slash hashtag slash portal slash account. My response. Didn't successfully open the portal.
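In text, the parameter read out above is much less painful. A small Python sketch of the percent-encoding involved, using only the standard library; the route string is the one named in the session, and the parameter name `r` is as read out on air:

```python
from urllib.parse import quote, unquote

# The portal route Claude guessed at, before encoding.
portal_route = "/#/portal/account"

# safe="" forces "/" to be encoded too; "#" must be encoded or the
# browser would treat the rest as a fragment and never send it.
encoded = quote(portal_route, safe="")
query_fragment = "&r=" + encoded

print(encoded)          # %2F%23%2Fportal%2Faccount
print(query_fragment)   # &r=%2F%23%2Fportal%2Faccount
print(unquote(encoded)) # /#/portal/account
```

Each `%2F` is a slash and `%23` is the hash sign, which is why the spoken version sounds like a license plate.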

31:52

I noticed that the Ghost-side URL for the portal is bitsaboutmoney dot com slash question mark action equals percentage two F portal ampersand success equals true. Perhaps that will work. Let me try manually fudging the tokenized URL and see if it works. After a bit more back and forth between us, I tell Claude, this is hideous, but it worked.

32:12

H T T P S colon slash slash WWW dot bitsaboutmoney dot com slash members question mark token equals, sixty four characters of token, ampersand action equals percentage sign two capital F portal slash ampersand slash portal. I read that wrong. Well, you don't care anyway.

32:32

Claude successfully extrapolates some changes to its code based on this. However, instead of using slash portal, it infers slash portal slash account is the correct URL. I tell Claude, not sure where that slash account is coming from as I don't see it in the Ghost UI in my clicking through.

32:46

But I will attempt using it and if that works, it works. If not, take off the slash account part, please. I then report, didn't work. Claude adjusts the URL as appropriate. It now works as well as it is going to work. We then pivot to a different issue, as I'm getting pretty close to considering this work ready to go in, at this point, about twenty minutes. I asked Claude, question for you.

33:09

BAM started charging in January of twenty twenty three. If those subscriptions are stale since twenty twenty four, then I think the email is a bit out of bounds, even if the user hasn't formally killed their subscription yet, given my desired relationship with them. I suspect that Stripe subscriptions, which Ghost uses under the hood, would kill a sub automatically, but I don't know that to be the case.

33:29

Can you investigate from docs what the designed behavior is here typically? Save me from having to look up, I will redact the number, people in the dashboard to verify. Claude then attempts a web search for Stripe subscriptions automatic cancellation after failed payment retry attempts behavior, reads the documentation on Stripe about Smart Retries, and briefly tries reading Ghost docs about subscription cancellation behavior before finding there is nothing useful.

33:54

Claude replies to me, based on the documentation. Stripe's default behavior: configurable in Dashboard, go to Billing, go to Revenue recovery. Smart Retries default is eight attempts over two weeks. After exhaustion, the subscription either cancels or is marked as unpaid, depending on your choice in settings. Ghost behavior: when Stripe cancels the subscription, Ghost converts the paid member to a free member.

34:16

The key question: if your Stripe is set to cancel after retries are exhausted, which is the typical default, then any subscription showing past due or unpaid status should still be recent failures in the retry window and not years-old zombies. However, if Stripe is set to mark as unpaid instead of cancel, subscriptions can linger indefinitely in the unpaid status without ever auto-canceling.

34:39

Here's a quick check. You can verify your settings at Stripe Dashboard, go to Settings, go to Billing, Subscriptions and emails, Manage failed payments. Or I can query the actual last invoice date for these, number redacted, customers to see if they're recent or stale. Here's what I would do. SSH into dev.kalzumeus.com, run a Rails runner with this code.

35:02

Would you like me to run that? The sources I'm relying on. HTTPS docs.stripe.com slash billing slash revenue recovery slash smart retries. Ghost.org slash help slash what happens when a member payment fails. My commentary on this. Here Claude needs to make inferences about the state of my Stripe account, which it cannot see directly, partly for technical reasons and partly because I have made the considered choice to not give it credentials that would allow it to move money.

35:29

So I copy information between systems to paper over its lack of visibility. Importantly, it has mispredicted what the actual settings are, for a reason even most Stripe engineers would mispredict. This account is one of the oldest accounts on Stripe, and so it has legacy behavior baked into the settings from more than a decade ago, not the default that more modern vintage accounts get.

35:49

Me to Claude. The current setting in Stripe is, and I drop in a screenshot of the settings page. Claude can read screenshots, very helpful. Claude summarizes it accurately to me. Okay, so with cancel the subscription set, any subscription that exhausts retries gets canceled automatically.

36:05

So the, number redacted again, showing as past due or unpaid in your dashboard should be recent failures still in the retry window, not years-old zombies. This means emailing them is appropriate. They're actively failing, not long abandoned. If you want to double check, I can run that query to show the actual last invoice dates for the, number redacted. But based on your settings, they should all be within the last two weeks or so, which is the Smart Retries default window.

36:29

Me to Claude. I was not on Smart Retries. I have an old-as-the-hills Stripe account and have adjusted that to use Stripe's quote unquote new behavior, rather than the migrated default setting they had from twenty ten vintage accounts, which has three tries on a fixed schedule. But for hygiene's sake, please do run a query for me. Claude then writes and runs disposable code to generate a table of recently failed payments.

36:52

All, number redacted, are from the last fifteen days. No zombies. It then shows me a table of each day for the last fifteen days, with the subscribers who had a payment failure first that day, showing their current subscription status. Claude continues. All recent, all past due, still in retry status. All appropriate to email. I tell Claude.
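The disposable query Claude ran is not shown in the episode. As a rough sketch of the same sanity check, here is one way to bucket failing subscriptions by the day they first failed and separate out any zombies. The record shape, field names, and sample data are all hypothetical; only the `past_due` and `unpaid` status names come from Stripe, and the real code ran in Ruby against production data.

```python
from collections import defaultdict
from datetime import date, timedelta

# Stripe subscription statuses that indicate a payment is failing.
FAILING_STATUSES = {"past_due", "unpaid"}


def split_recent_and_stale(subscriptions, today, window_days=15):
    """Group failing subscriptions by first-failure day; collect old ones.

    `subscriptions` is a list of dicts with hypothetical keys:
    'email', 'status', and 'first_failed_on' (a datetime.date).
    """
    cutoff = today - timedelta(days=window_days)
    recent_by_day = defaultdict(list)
    stale = []
    for sub in subscriptions:
        if sub["status"] not in FAILING_STATUSES:
            continue  # active or canceled subs are not our concern here
        if sub["first_failed_on"] >= cutoff:
            recent_by_day[sub["first_failed_on"]].append(sub)
        else:
            stale.append(sub)  # a "zombie": failing for longer than the window
    return dict(recent_by_day), stale


# Made-up sample data for illustration only.
sample = [
    {"email": "a@example.com", "status": "past_due", "first_failed_on": date(2026, 1, 10)},
    {"email": "b@example.com", "status": "past_due", "first_failed_on": date(2026, 1, 10)},
    {"email": "c@example.com", "status": "unpaid", "first_failed_on": date(2024, 6, 1)},
    {"email": "d@example.com", "status": "active", "first_failed_on": date(2026, 1, 12)},
]
recent, zombies = split_recent_and_stale(sample, today=date(2026, 1, 15))
```

In the made-up sample, two subscribers land in the recent bucket for the same day and one shows up as a zombie. In the session itself, the stale list came back empty.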

37:13

Spiffy, I appreciate the sanity check. I'm going to fire these reminder emails probably tomorrow, after, A, I have a newer Bits about Money to point to, and B, Stripe's automated retry logic, now turned on, has a chance to recover some percentage of them without contact. Refresh my memory on what our deployment procedure is. Claude says, reading from CLAUDE.md, the deploy procedure, run on the dev box, is interactive and requires the sudo password.

37:38

SSH into dev.kalzumeus.com, change directory to code slash kalzumeus, git pull, create and push a new release tag. I won't bother reading you the Linux command for this, it's painful. And then deploy. Bundle exec cap dash S environment equals production, deploy. And there's a bit more detail there that isn't super relevant.

38:00

My commentary. This is a low cost lookup of local docs, which I absolutely could have just done myself, but my muscle memory for these questions is increasingly, eh, just ask Claude, rather than reading documents that I wrote for myself. My closing remark to Claude, after deploying, which required squashing one tiny bug in the deploy procedure, whose solution was so immediately obvious, I did it myself without announcing it.

38:23

Thanks. Good work tonight. I might do some finishing touches in a later session prior to actually sending, but this is looking pretty solid. I appreciate the help as always. Claude's final remarks. Thanks. To recap what we built tonight: payment failure detection in BAM subscriptions, which finds past due and unpaid subscriptions with recent failed invoices, a dashboard at private slash admin slash BAM payment failures to preview and send individual or bulk emails.

38:46

Ghost API integration, frictionless sign-in URLs that open the account portal directly, rake tasks, BAM send payment failure emails, and BAM preview payment failure emails. And an email template with your copy, Stripe invoice links, and the Ghost magic links. Good luck with the sends tomorrow. The, number redacted, are all recent and legitimate targets.

⁠¶ My final thoughts on this

39:07

My final thoughts on this. So our conversation recounting this engineering work has probably lasted longer than the actual engineering work took, because my speaking speed is much slower than my reading speed, and because I've narrated the commentary to you.

39:20

But let me underline, this is an absolutely real problem, which had persisted for years, and for want of a nail, the war was lost, despite the only person working at this company having an engineering degree and easily sufficient skill to eventually arrive at the solution. So, did it actually work? Well, you can ask my mortgage servicer. The amount recovered here should pay for my family's living expenses in February.

39:44

And ideally, neither Claude nor I have to worry about this seriously ever again. After seeing user behavior regarding these emails, I'll write one line of code, it's called a cron job, to fire them automatically in the future, and payments will simply be semi automatically rescued in the case where they can't be fully automatically rescued. This is a relatively small problem in a relatively small business, but there are very many problems shaped like this in the world.

40:08

The analogous issue bites Fortune five hundred companies every day of the year, sized in billions of dollars per year, and most have not fixed it yet for want of a few days of engineering time and meetings between teams and similar that they need to coordinate that engineering time. Will modern coding tools completely solve their organizational bandwidth problems? I don't think I would make that bet for one hundred percent coverage, but it will certainly be greater than zero percent coverage.

40:33

And there are many places in the world where a lot of human welfare is unlocked by relatively small amounts of cognition going from scarce to unscarce. The world does not broadly appreciate how seismic an advance this is. Will this be the last seismic advance in coding? If AI capabilities never advanced another inch, that would mean we had somehow hit the smallest imaginable bullseye.

40:55

Will this replicate across other knowledge work domains? Almost certainly. Perhaps not all, perhaps not instantly, but the world does not understand yet what is coming. I'm quite optimistic about the future, but I will acknowledge, out of a sense of completeness, that many of the people who most understand the shape of reality here are quite worried. And that is a discussion for another day. I hope this was useful for you, and see you next week on Complex Systems.

⁠¶ Wrap

41:20

Thanks for tuning in to this week's episode of Complex Systems. If you have comments, drop me an email or hit me up at Patio Eleven on Twitter. Ratings and reviews are the lifeblood of new podcasts for SEO reasons, and also because they let me know what you like.

✨ This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.
