¶ Vibe Coding and Study Guide Project
I bet. By the way, I've been vibe coding so much lately. So Jay sent me an LLM study guide, like a list of interview questions he found on LinkedIn. So I made a... check this out. It's like a long, long, long PDF, and it's like LaTeX and equations and stuff. So I chunked it up and made a website. The whole thing, it's not just a website based on the questions. It's basically finding all references.
All scientific papers, YouTubes, anything. It's a giant study guide with a tracker. And then when you click on the checkbox, it's like, I watched that video, I watched that video. It increases the progress bar. It's all local storage. The whole thing took like three hours in Claude Code. Did you type any code? Did not type a line. I'm solving that exact problem in all the different agents. I'll do it in GitHub agent, I'll do it in Claude, and then see what I get with each one.
That's kind of a cool idea. I vibe-coded it. I've got a couple projects I'm vibe-coding, but one of them is just about done, but it's been painful. Oh, yeah?
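The tracker described above (checkboxes, a progress bar, everything kept in local storage) can be sketched roughly as follows. This is a minimal Python illustration, not the code of the actual site: the function names and the resource ids such as `q19-video` are hypothetical, and a JSON string stands in for the value a browser would keep in local storage.

```python
import json
from typing import List, Optional, Set


def load_checked(stored: Optional[str]) -> Set[str]:
    """Parse the stored JSON string into a set of checked resource ids."""
    return set(json.loads(stored)) if stored else set()


def save_checked(checked: Set[str]) -> str:
    """Serialize the checked ids to the string that would be written to storage."""
    return json.dumps(sorted(checked))


def toggle(checked: Set[str], resource_id: str) -> Set[str]:
    """Return a new set with resource_id flipped, like clicking its checkbox."""
    return checked ^ {resource_id}


def progress_percent(checked: Set[str], all_ids: List[str]) -> float:
    """Percentage of the listed resources that are checked off."""
    if not all_ids:
        return 0.0
    done = sum(1 for rid in all_ids if rid in checked)
    return 100.0 * done / len(all_ids)
```

Each click would call `toggle`, write `save_checked(...)` back to storage, and redraw the bar from `progress_percent`.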
¶ Building an AI-Assisted CD Tool
On Linux, the CD command doesn't behave the way that I want. I want it to do searches. I want it to autocomplete. I've been navigating up and down these directory hierarchies that are very regular. And I find myself going CD dot dot slash dot dot slash dot dot. CD dot dot dot. One more dot dot. Or I went too far, go back down. So I've made this combination of a Rust
program and a shell script, shell integration, where I can just type. Here, let me show you a demo of it. You should... have you ever used zoxide? Zoxide is basically z. It's the CD. And it doesn't do what I want. Okay. So what do you got? So let me flip over here and show you. Yannam thinks it's really cool too, so I've got her endorsement. Share your screen because we're recording. We are. And then I sent you my study guide as well.
Okay, you see me? Yep. So I've got this directory hierarchy here. So I've called it MCD. Let's make the fonts bigger. Control scroll. Or control or command plus. There you go. Okay. All right. Is it literally called Mark CD? Yes. That's awesome. All right. So there's a bunch of directories under here, just to play with it so you can see how it works. So there's a four down there. You should be able to CD into foo four directly is what you're saying. Like watch. Ooh. And watch this. Okay.
Oh, okay. Is it scoped? What is the scoping? Is it scoped within a certain window? What do you mean scoped? If you typed one and there was... three or four things that it could potentially be. Is it scoped within temp and down or within data drive? So it goes up first and then down. It goes for current directory first.
Nothing there. Up and then down. And so if I type three, watch this, three. If I just press enter, it's going to take me to the first match. I'm still getting used to using it. If I go three, tab. Oh, yeah. There you go. Look at that. That's hot. Right. It's the pit of success, right? It's like, I just do it. You know what I mean, and it just does it. Yeah.
what I want. You should have done this years ago. You're going to put it in Sysinternals so we all get it? Yeah. I think I'll just put it in my own repo. I guess I shouldn't have got a Sysinternals tool. I mean, we're already getting it with WinGet. Just bring it along for the ride. Yeah. That's cool. Can you switch drives? Can you switch drives with it? One of the things I've never liked about CD is you have to type CD slash D
to swap across drives. This is on Linux right now. Oh, that's on Linux. Okay. Yeah, this is a Linux machine. So watch this. There's a subdirectory called experiments that's a couple down. So if I type... Will the tab have worked? Yeah, watch this. Ooh. Oh, that's interesting. Now it's doing a big, big search. That's a clever idea to use an ellipsis for an animation. Yeah, that's my idea. Does it store an index just then? No, it doesn't store an index. Okay.
So this is a little more narrow. Yeah, it's still a lot. Yeah, anyway. It's close to being done. There's an edge case. There we go. So there were multiple matches, a couple down, for example. That's cool. Yeah, so, I mean, I think it's, you know, very intuitive. Very convenient. It's more intuitive. It's like CD on steroids, CD++, as opposed to z and zoxide, which are a little bit more opinionated. Yeah, and those, I was like, when Janan told me about...
And I'm like, oh, let me go try it. And I was already partway through this. And then I'm like, well, these things that I just did, it doesn't do. Well, because z uses fzf, the fuzzy finder, command line fuzzy finder. This seems more narrow and less fuzzy. If you misspelled experiments, yours is not going to go there. Yeah. Cool. And how much, do you ever look at the code? Yeah, it's roughly a thousand lines of Rust.
Or 800, you know, 900 lines. Okay, 800. And did you look at it, or did you judge it, or did you just focus on features? I focused on features. I mean, I looked at it just out of curiosity. Like, here's where it does the... So there's prioritization that has to happen for this to work the way that I want it. And the AI figured out, oh, I need to do prioritizations.
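The search order described in the demo (current directory first, then up, then down, with Enter taking the first match) can be sketched as below. This is a rough Python illustration only; the real tool is a Rust program plus shell integration, and its actual ranking rules are not shown in the conversation. The function names, the case-insensitive substring match, the reading of "up" as ancestor directories, and the `max_depth` limit are all assumptions made for the sketch.

```python
from collections import deque
from pathlib import Path
from typing import List


def _subdirs(directory: Path) -> List[Path]:
    """Immediate subdirectories sorted by name; unreadable directories give none."""
    try:
        return sorted(p for p in directory.iterdir() if p.is_dir())
    except OSError:
        return []


def find_matches(start: Path, query: str, max_depth: int = 4) -> List[Path]:
    """Return matching directories, highest priority first (assumed ordering)."""
    needle = query.lower()
    ordered: List[Path] = []

    def consider(path: Path) -> None:
        # Plain substring match: a misspelled query finds nothing (no fuzziness).
        if needle in path.name.lower() and path not in ordered:
            ordered.append(path)

    # 1. Current directory first: its immediate children.
    children = _subdirs(start)
    for child in children:
        consider(child)

    # 2. Then up: ancestor directories, nearest first.
    for ancestor in start.parents:
        consider(ancestor)

    # 3. Then down: breadth-first, so shallower descendants rank higher.
    queue = deque((child, 2) for child in children)
    while queue:
        directory, depth = queue.popleft()
        if depth > max_depth:
            continue
        for sub in _subdirs(directory):
            consider(sub)
            queue.append((sub, depth + 1))
    return ordered
```

Under these assumptions, pressing Enter would jump to `find_matches(...)[0]`, and Tab would cycle through the rest of the list.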
It is cool to be able to just spawn into existence a thing that you like, because you would have done this, but you'd be like, I don't have a weekend to dedicate to it. And here you dedicate an hour. Well, actually, I mean, many, many rounds. So this test, this is kind of like at the limit of frontier model capability. So I, and I've tried 4.1, o1, Sonnet 4.
And Sonnet 4, I think, is still doing the best job at this. This was all in Copilot agent mode? Yeah. Cool. The challenge that I found was that...
¶ Babysitting AI Agents and Testing
I say, now do this, make it so that when I press enter, it just takes me to the first match. And it works on that, and it's very pedantic. So I've actually gone to auto-approve. There's another way you can set auto-approve, so it's not...
asking me for everything it wants to do, because otherwise it's like, now I want to do this, now I want to do this, now I want to do this, now I want to do this. For this kind of project, it wants to do a lot of shell commands, and those by default need approval. So I turned that on, but it still wants to ask me to do things. And while it's solving that task...
it's like, let me go check out this, let me go check out that, let me try this, let me try, like 50 million things. And then it'll be like, okay, fixed it, done, tasks, you know. And then I'm like, I try to play with it. And of course... It might not have fixed it. And so I've told it to write tests for everything it does. And even tell it after a while because it'll fix something and then regress something else.
And the tests won't cover that. And then, I mean, it's just been a lot of babysitting. So yeah, I didn't type in the code, but I babysat the crap out of it. And if I summed up all the time that I spent looking at it and trying to understand what it's doing and testing, it's probably a few hours. Let me show you this thing that I did, which is not as cool, but still fun.
¶ Study Guide Details, History, Pronunciations
So the context was this fella here put up a really nice study guide on LLM interview questions. And it's very kind of LaTeX-y. And it's like 12 pages long. And it starts at basic stuff like what embeddings are. Yeah, a little trivia. A computer science history trivia. Do you know who invented LaTeX? Who created it? I don't remember the guy's name. What, who?
Leslie Lamport. Oh, yeah, I talked to Leslie. I had him on the podcast. You know what else he did? Oh, Lord, what else did Leslie do? Basically, he came up with concurrency algorithms and Paxos. He came up with... Distributed concurrency and the concept of logical clocks for partial ordering in distributed systems, really a monster impact on computer science. And he worked at Microsoft.
Until he retired a few months ago. Yeah, I talked to him in 2021. Super nice guy. We talked all about a bunch of stuff. And there you go, the developer of Latex. That is a good trivia. LaTeX. LaTeX, my bad. Yeah. So this is like a, oops, that's my expense report. This is, I don't want to show you. Where did I, where did it?
So you have a question, for example, we'll just say, what is generative versus discriminative models? It's question number 19. And it puts a little thing here, but I don't want that. I don't want a one-paragraph essay. I want study material. I want meat. So I built a website that tore apart his PDF. And then I put in his answer, Hao Hong, great guy.
I put in an AI answer. I put in definitions. Now, what I don't have yet is a LaTeX equation generator, so I need to go and update my rendering. But then I put in... you know, comparisons. I put in resources, research, and then you can check these. And then as you check them, because you've done the work, your study progress gets updated down here. That's very cool. So how much coding?
That was all done in Claude Code. So the idea is I'm going to try it in agent mode with the different models, and I'm going to vibe it maybe three times. This took about an hour and a half, two hours. So I have a problem that is nice and clean and complicated that is a do a thing and do it N times and then deal with all of the edge cases and then see what it comes up with and see which one works.
chain of thought. And then also, the important thing was... Sorry, go ahead. It's amazing what you can do. It is cool. The hardest part was validating each one of these and making sure that these are all... arXiv, you know, legitimate links, because I didn't want it to hallucinate any of the citations. How do you pronounce it? Archive. Archive. Archive. So this word here is... Think A-R-C-H-I-V-E. It's an archive of papers. Okay. So make sure I understand this. Yep. So this...
It's pronounced that. Yeah. And this is pronounced LaTeX. Correct. Okay. Makes total sense. Yeah, it's obvious. I feel like a schmuck. Yeah. I know you do, actually. No, I know. And I appreciate not only did you call me out on it, but now we recorded it. So now I feel like a schmuck.
Forever with a link. We don't have to air this. With a link to my schmuckery. No, I don't care. This is a great point. We had a conversation about this at... at this ERG I went to yesterday about when do you feel at a certain point in your career that you can mispronounce a thing or misunderstand a three-letter acronym and just own it?
versus feeling like someone's going to flip the bozo bit on you. Now, you flipped the bozo bit on me years ago, so there's really nothing I can do to save myself. Nothing to lose. Right. At this point, it's pretty much over. What do you do? Have you ever been in a meeting, even at your level, and heard a thing, like a three-letter acronym, and just like, you didn't know what that code name was? What do you do with that?
I'm like, people are talking about something and I don't know what it is. I feel like I should probably. So if I ask about it, people might go, what the hell? But, you know, I'm at the point in my career just like, no, I'm going to do it. And oftentimes... somebody else would be like, yeah, what is that? And I'm like, oh, that was the one that was bold enough to represent the people that didn't know. Yep. I think that's important. People will text, will text.
Teams me afterwards and be like, thank you for asking what that was. I didn't know what Project FooFoo was. Everyone else knew Project FooFoo and read the FooFoo paper, but I... Oh, so get this, speaking of FooFoo projects and...
¶ AI Hallucinating Fake Projects
not knowing what it is, I was like, let me see how M365 Copilot does when I have an email in my inbox that references a project that doesn't exist. And I ask it, and you know what happens? It makes up the project. It really makes everything up. Because it's grounded in what it thinks is reality, which is your email. I went into research mode. And it's like, this project, and I'll show you the transcript, it's like pages and pages of...
dates, milestones, integrations, team structure. And then I'm like, well, who's in the project? And then it goes off and, well, and actually, let me back up, because in the chain of thought, you can see it going. I've done a whole lot of searching. I can't find anything about this project. I think the right thing to do is to admit that. But then what would a project like this be structured like? And then it goes off, and then from there, it just makes the whole thing up.
And this is where, is it grounded? Is it grounded in fact, or is it an improv partner? And that's why it's so important to stay grounded. I know the PM who owns 365 Copilot, so I can introduce you to her if you don't already know her. Well, I posted it. I gave this to the AI Red Team chat, and I think there's people from that. They're in there, but I'm like... Anyway, yeah, projects. That was a little digression off projects. It's very, very tented. Yeah.
Nobody knows about FooFoo. The level of detail they came up with for this thing. That's so cool. By the way, this project that doesn't exist. Deputy CISO for Project FooFoo. Now what's going to happen is someone is going to go, and this will be on the net, and then they're going to turn it into a transcript, and this is going to be on The Verge. Hanselman and Russinovich go and leak Microsoft's secret FooFoo project.
Pretty sure some AI will come across it and go, oh, they're talking about a secret project. Oh, yeah.
¶ UI Design Trends and Liquid Glass
And you know it's going to be a direct compete to glass, whatever they're doing, Liquid Glass. Have you seen that? I can't see how bothered you are. You're so annoyed. I'm pretty bothered by Liquid Glass. I'm just feeling. It's like you're on. It's like your new mission in life is make it clear that this is not new. Aero invented this. Apple's just copying. You shouldn't get all this credit. I don't want to make it clear. I want to make it Liquid Glass.
It's fine. It's got a lot of animations. I think the question is, do I really want hardware shaders working that hard on my computer? Do you see the reflections as you move things around? Yeah, and then if two buttons get near each other, they kind of do the liquid. I mean, it's hardware shaders, which we know how to vibe, actually. But people are reporting their phones getting hot. So it's just like, how much...
extraneous stuff do I want? I don't know. I think I'm kind of a minimalist. I'm not a Windows Phone apologist, but I did like Windows Phone. By the way, you and I have been in the tech industry enough to see these cycles. First, it goes from very minimalist.
to, ooh, let's add bells and whistles, let's add edges, let's add shadows, let's add, oh, clearness, let's add lots of colors, and let's add animations. And then it goes back to, you know what? No, we're going to reset and do something simple. And this is with Windows 8, you remember? And Windows Phone with the very simple things. And then everything is monochromatic. Very thin lines for everything representing icons.
And then people are like, nah, you know what? That's not rich enough to really show what we want. We need to add some color back. Oh, you know what? We need to add some icons that actually have graphical elements to them. oh, we need to add shade or shadows on the things. And so here we go. We're back into, let's make things really fancy and whiz-bangy and rich and colorful. And then a few years from now, we'll be like, wait, wait, wait, wait.
New aesthetic. Really simple. You remember when the iPhone 3 came out and it was like, there was a bookshelf. It was a skeuomorphic, literal wooden bookshelf that you would put ebooks on. It had grain. It had a 3D thing. It was like Microsoft Bob, but on your phone. I don't know, man. I just want to have flexibility and choice. But I'm concerned that there's just no value in this Liquid Glass thing other than it's different. And it burns the GPU. I think that's what it is.
you know, give it a fresh look. You got to give it a fresh, things a fresh look every now and then. Because there's so much a perception to it. It's like, if they can have new features in it, but if without the look, people think, oh, it's just the same thing. Now, we... There are design things like that. People are like, this is brand new.
There are design things that come out that are new thoughts, like the Dynamic Island, which is a hack, but a genius one, where it's like, we have a circle at the top here we can't do anything about today because we have a camera there. But we can put stuff around there and make it feel juicy. So if I charge my car, the little Dynamic Island jumps up into the corner there, and it gives me status. If I'm on an airplane, if I use Flighty, which is the best flight.
playing Spotify in the background. Yeah, love that stuff. That seems useful, and that seems like a new and interesting kind of user interface. The problem is when you work for a company like Microsoft, people assume that your opinion is rooted in some Microsoft-ism. But I didn't work here always. I remember what it was like to not work here. I will not work here in the future.
My opinion is spanning many years that are not necessarily informed by Microsoft's design language. So like, yeah, Windows Aero did that without the animation stuff, but it's not about that. There were glass UIs before even Windows. I'm not a Windows apologist in that context. The other thing that's interesting is, have you noticed that all websites now look basically the same? It's like dark mode.
blues and purples, maybe a little orange because we're a little spicy. And it's like blue, orange, and purple, and that's just the aesthetic that the web has right now. And the whole web is moving. It's kind of progressive scroll, vertical scroll. Everything is vertical.
And then you scroll, and things kind of animate to get bigger as you get near them. Right. That's called scroll jacking. They grab the scroller, and then you don't get to see that. Sorry, New York Times. Keep scrolling. We're going to build. We're making your wheel movement. Yeah. Build up the animation. I actually kind of like it, but... Yeah. Scroll jacking is okay if it provides value. I think the New York Times and the stuff like D3.js, those really cool visualizers like the Edward Tufte...
If you've ever read his book on the quantitative display of information, those things I think are really, really powerful. But when we're talking about vibing, this is basically a vibed-up Bootstrap site. There's so much of this left rail, tree. There's no creativity there. That's why it's the perfect project of mine. Why do you say that? Because there's so many examples out on the web of that very type of... So many. Okay, so that's a great point. It's not that complicated either.
Right, it's not that complicated. But what happens when everything becomes a statistical mediocrity? We're going to need to do bespoke, interesting work that does not involve vibing in order to... break new ground. Because right now, we're just sitting on top of the fat part of the normal distribution. Yeah, I don't know. By the way, you want to do the episode?
I think this is the episode. This has been 24 minutes of you talking. Mostly you. Actually, I do. Hey, here's a project for you to vibe. Something that doesn't look at all our podcasts.
¶ Future Projects and AI Languages
just audio, and then adds up the number of minutes you talk versus I talk. Do you think we should do that right now? We should do that in the next episode? How long do you think that one takes? I think we could do that in 30 minutes because all you need to do is... get the RSS, loop over the things, bring them down, transcribe them, identify speakers, and then build a chart. And then...
As you scroll, we have to have scroll jacking. As you go down episode one, episode two, you can see the competing of who's yapping more. You're getting more ambitious than I was going to be. If we're going to do it, we've got to own it. Oh, yeah. That's a really good idea. We could just do a command line in Rust. Yeah. Count of lines. Does the language matter if we're not coding any of it?
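The last counting step of that idea, adding up minutes per speaker, could look roughly like this. It is a sketch under stated assumptions: the RSS download, transcription, and speaker identification steps are taken as already done, and the `(speaker, start, end)` segment format and the labels `speaker_0` and `speaker_1` are hypothetical stand-ins for whatever a diarization tool would produce.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed input: one tuple per speech segment, times in seconds.
Segment = Tuple[str, float, float]  # (speaker label, start, end)


def talk_minutes(segments: Iterable[Segment]) -> Dict[str, float]:
    """Total minutes spoken per speaker; empty or inverted segments are ignored."""
    totals: Dict[str, float] = defaultdict(float)
    for speaker, start, end in segments:
        if end > start:
            totals[speaker] += (end - start) / 60.0
    return dict(totals)


def talk_share(segments: Iterable[Segment]) -> Dict[str, float]:
    """Each speaker's share of total talk time, as a percentage."""
    minutes = talk_minutes(segments)
    total = sum(minutes.values())
    if total == 0:
        return {}
    return {speaker: 100.0 * m / total for speaker, m in minutes.items()}
```

Running `talk_share` once per episode would give the per-episode numbers the chart needs.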
Although, what if you... Mark's CD, I did make it do Rust. I'm like, you're going to do this in Rust. Okay, what if you ask the agent, I want to do this project, and you tell it... explicitly, don't tell me what language you're going to use. And consciously go out of your way to hide the implementation from me. But you, AI, decide.
What's that? You're going to see it making files with extensions and compiling. You'd have to basically have it hide it. Because then the question is, when does the top-level programming language become assembler, become effectively a runtime that we don't think about? And is that bad? Well, here's one of the questions that I get asked. I've had people ask me a few times, and I kind of my head hurts thinking about it. I say, hey,
We're having AI do this AI-assisted coding and vibe coding using programming languages designed for humans. What if we let AI create a programming language that's more... That's the most optimal for AI. Does it have enough context? Because we're good at programming. We're good at making programming languages from a human perspective. In thinking about that, I'm like, well, what kind of programming languages would let...
the AI be more efficient than the existing ones. Actually, if you take a look at programming languages, they're all Turing complete. You can do just about anything in any of these popular languages. Right. So they're all basically an equivalence class, you know, an equivalence class. And unless you've got something like Rust, which forces you to do things a certain way, they're all, it's just a matter of preference and ecosystem. Check this out.
Are you even listening to me? No, I'm trying to figure out which one of us talks the most. It does. Did you already vibe code it? You're done? I'm doing it now. Thank you.
