#80 Max: From Zero to RAG Agent – A Complete Beginner's No-Code Course | AI Fire Daily podcast

00:00

Imagine an AI chatbot, you know, brilliant at general conversation. But then you ask it about your company's specific policies or maybe the nuances of your industry. And it either guesses or, maybe worse, just kind of makes things up. What if you could give that AI a custom brain, one filled with your verified knowledge? Welcome to the deep dive. Today, we're diving deep into exactly that idea. Retrieval Augmented Generation,

00:26

or RAG for short. And honestly, it's simpler than it sounds, but it fundamentally changes how we can trust and actually use AI. We're going to discover what RAG is and how these powerful "mind palace" databases make it possible. We'll break down the two crucial parts of any RAG system. Then, yeah, we'll walk you through building one, step by step, no code needed. You'll even learn how to give your AI a memory, make it remember

00:46

stuff, and power its new brain efficiently. It's really about turning a smart chatbot into a true expert on your stuff. Okay, let's unpack this. So let's start right at the heart of it. What is RAG? Retrieval Augmented Generation. In plain English, it just means your AI does its own research before it speaks. Think of it like an open book test. When you ask a question, maybe how many feet are in a mile, a RAG system doesn't just pull from its huge general training

01:13

data. No, it first retrieves specific, accurate info, like 5,280 feet, from a source you've given it. Only then does it use that verified fact to augment or, well, significantly improve its answer. This is what transforms a generic AI into one that genuinely knows your company docs, your support tickets, maybe your internal policies. Yeah, it changes the AI from just a generalist into a trusted specialist. OK, so the core idea is that the AI isn't just generating from like

01:41

its initial programming. It's actively seeking out and pulling in external facts first. Exactly. It finds the facts, then generates with confidence. But where does this AI store all the specific knowledge so it can find it so fast? That's where the vector database comes in. It's truly the

01:57

brain of a RAG agent. Imagine this vast, multi-dimensional space, kind of like a galaxy, right? Where every star is a piece of information from your documents. Each chunk of text becomes a single point of light in this galaxy. That's what we call a vector. And the really profound insight here is semantic similarity. This database doesn't organize things alphabetically, not at all. It places data points based on their meaning. So all the points about, say, fruits cluster together

02:23

over here, and animals form a totally different constellation over there. When you ask, what is a kitten, your question itself gets turned into a new point of light. And it appears naturally right within that animal constellation. The database then super quickly finds the closest points, maybe cat, puppy, even wolf, and pulls the text linked to them. That gives you a perfect, hyper-relevant answer. It's all about asking, what does this mean? And getting an intelligent, context

02:48

-aware match. For our deep dive today, we'll be using Supabase as our vector database example, which is built right on top of PostgreSQL. So this database, it really understands concepts, then, not just keywords. Yes, precisely. It's all about meaning and context. Okay. And it's this clever organization that lets the two crucial halves of a RAG system work so well together.
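The "closest points of light" idea can be made concrete with a small sketch. This is an illustration only: the three-number vectors below are invented for the example (real embedding models produce far longer vectors), and cosine similarity is assumed as the closeness measure.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; the numbers are made up, not model output.
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "puppy": [0.8, 0.9, 0.2],
    "apple": [0.1, 0.2, 0.9],
}
kitten = [0.85, 0.75, 0.15]  # the question, turned into its own point

def nearest(query, vectors):
    """Rank stored names from most to least similar to the query vector."""
    return sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )

ranking = nearest(kitten, toy_vectors)
print(ranking)
```

With these made-up numbers the animal entries rank ahead of the fruit, which is the clustering-by-meaning behavior described in the episode.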

03:10

If we kind of connect this to the bigger picture, building a RAG system, it's like assembling this, I don't know, magical, intelligent library. It's really a two-phase project. It requires two distinct but totally complementary halves of a single brain, you could say. First, you've got what we call the AI librarian. That's your RAG pipeline. This is the behind-the-scenes

03:29

worker. Its job is to meticulously read every single document you give it, understand the content, and then place each piece of information on the correct shelf within your, well, infinitely large, perfectly organized vector database. This isn't an ongoing job, mind you. It's a one-time process you run to initially stock the library. Right. And then you have the AI scholar. That's your RAG agent. This is the public-facing expert. The scholar kind of sits at the front desk, ready

03:54

to help visitors. When you ask a difficult question, the scholar instantly zips through the library, pulls the exact pages from the correct books, synthesizes the information, and gives you this brilliant custom-written answer. But these two parts, they're completely interdependent. A scholar without a library is, well, just a generic chatbot. No specialized knowledge. And a library without a scholar, that's just a silent, inaccessible

04:18

database. So one half is organizing all the knowledge and the other half leverages that organization to actually provide the answers. Precisely. A powerful two-phase system, yeah. So let's dig into the librarian's job then, that RAG pipeline. It's really all about transforming raw documents into a searchable, structured knowledge base. And this process has four key steps. First up, document input. This is simple. It's just acquiring

04:40

the book. It's your source document. It could be a PDF, a text file, or even data you pull from other apps, maybe like a CRM or something. Then we chunk the document. Think about it. You wouldn't file an entire 500-page book under just one topic, right? You'd separate it into chapters or sections. Similarly, we split large documents into smaller, more focused pieces of text. This is super critical because it creates these context-rich pages for more accurate searching

05:05

later on. A pro tip here, aim for chunks of maybe about 1,000 characters and use a 200-character overlap between chunks. That just helps ensure concepts aren't awkwardly cut off right in the middle. Okay, and here's where it gets, well, really interesting. After chunking, you embed. This is, I think, the most magical part. An embeddings model, which is a sophisticated AI, acts like a universal translator. It converts those text chunks into vectors, which are basically just long lists

05:32

of numbers. Think of it like giving each chunk a precise coordinate in that huge galaxy of meaning we talked about. That 1536 number you sometimes see? That just refers to the number of dimensions, or kind of like traits, an AI uses to describe each piece of information. It's like a super detailed profile for every single text chunk. And for your AI to understand your data consistently, this profile format has to match across all parts of your system. Finally, you vectorize

05:59

those chunks. This just means storing these numerical vectors in your Supabase vector database. They get intelligently placed by meaning, right? So all the golf rules about putting are grouped together, far away from rules about driving. In a tool like n8n, this whole process is actually simplified into like a five-node assembly line.
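The chunking tip (roughly 1,000 characters with a 200-character overlap) can be sketched in a few lines. This is a simplified illustration, not the splitter any particular tool uses: the function name `chunk_text` is hypothetical, and real splitters usually try to break on paragraph or sentence boundaries rather than at fixed character positions.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk starts after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks

# Placeholder document: 2,500 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

Each chunk begins 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next. That repetition is what keeps a sentence that straddles a boundary intact in at least one chunk.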

06:16

You've got a trigger to start it, maybe a Google Drive node to get the document, a data loader for the chunking part, an embeddings model like OpenAI's text-embedding-3-small to do the translating, and the Supabase vector store node to file it all away neatly. You just run this workflow once for each document you want to add. Gotcha. So it's about breaking down the info, translating it into this conceptual number language, and then filing it away smartly. Precisely. Creating

06:41

that structured, searchable knowledge base. Yeah. Okay, to make this all super concrete, let's talk about our mission for today. We're building an AI golf caddy using a 22-page PDF. It's called The Rules of Golf Simplified. That's our single source of truth. Why golf rules? Well, because they are incredibly dense, super specific, and totally self-contained. A generic AI might just guess or even hallucinate, you know, make

07:06

stuff up if you ask a tricky golf question. But our agent, trained exclusively on this official rulebook? It'll be a true, reliable expert. Our goal is an AI caddy that can instantly and accurately answer very specific questions like, what am I allowed to do for practice? Or maybe, can I hit a practice shot between playing two holes? Yeah, and this isn't really about golf, is it? What's fascinating here is that this golf PDF,

07:28

it's just a placeholder. This whole process is actually a universal blueprint for any specialized knowledge you want to give your AI. Your data source could be anything. Your last 5,000 HubSpot support tickets, maybe an Airtable base full of project data, or even recent client emails. Imagine asking your AI, what are the top three common product issues we're seeing in the UK? Or, which projects are running over budget this year? And your agent's trigger, how you start

07:56

the conversation, that can also be anything beyond just a chat window. It could be an email sent to ask@yourcompany.com, or a submission on your website form, or even a scheduled task that, say, summarizes the week's support tickets automatically. The skills you learn building this simple AI golf caddy, they're absolutely foundational for building really powerful business systems. So this simple golf example, it really unlocks vast possibilities for building custom AI then. Absolutely.

08:21

It's a template, a template for pretty much any custom AI you can think of. All right, let's get our hands dirty then. Step one, building the library itself using Supabase. This is going to be the home for our AI agent. It's built on PostgreSQL, which is incredibly solid. Plus, it's pretty user-friendly and has a very generous free tier, which is great for getting started.

08:40

Think of this step as laying the foundation, putting up the walls, and installing that magical shelving system for all your AI's knowledge. Okay, first up, you'll head over to supabase.com and create a new project. You give it a name, obviously, and critically, create a strong database password. Critical tip: please save this password securely right away. Put it in a password manager. You will definitely need

09:03

it later. Once your project is spinning up, which might take a minute or two, you'll configure the vector database part. Now, this sounds a bit intimidating because it involves a snippet of code, but trust me, consider it a one-time magic spell. Exactly. A magic spell is a great way to put it. You just navigate to the SQL editor inside Supabase. You copy and paste a provided code block. We'll make sure you have access to that. And you click run. That code is simply doing three things.

09:29

First, create extension vector. That's telling your database, hey, install the add-on to work with these AI vectors, these numbers representing meaning. Second, create table documents. This sets up the main storage shelf for your document chunks, including that special embedding vector(1536) column we talked about earlier, the profile. Finally, create function match_documents. This creates the magic search tool that finds similar documents by comparing their vectors, their meaning.
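To give a feel for what a search function like this does, here is a rough Python analogue. It is a sketch under assumptions, not the provided SQL script: the in-memory list stands in for the documents table, the rows and their three-number embeddings are invented for illustration, and the score is assumed to be one minus cosine distance (cosine distance is what pgvector's `<=>` operator computes).

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def match_documents(query_embedding, documents, match_count=2):
    """Score every row against the query and return the top `match_count` rows."""
    scored = [
        {
            "id": doc["id"],
            "content": doc["content"],
            # Similarity is the "reverse distance": closer meaning, higher score.
            "similarity": 1.0 - cosine_distance(doc["embedding"], query_embedding),
        }
        for doc in documents
    ]
    scored.sort(key=lambda row: row["similarity"], reverse=True)
    return scored[:match_count]

# Hypothetical rows; the text and the embeddings are placeholders.
documents = [
    {"id": 1, "content": "Toy chunk about putting", "embedding": [0.9, 0.1, 0.0]},
    {"id": 2, "content": "Toy chunk about driving", "embedding": [0.1, 0.9, 0.0]},
    {"id": 3, "content": "Toy chunk about practice", "embedding": [0.0, 0.2, 0.9]},
]
query = [0.1, 0.3, 0.95]  # stand-in for an embedded question about practice

results = match_documents(query, documents, match_count=2)
for row in results:
    print(row["id"], round(row["similarity"], 3), row["content"])
```

The database version does the same ranking inside PostgreSQL, so only the best-matching chunks ever leave the database.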

09:56

That little bit that says 1 - (documents.embedding <=> query_embedding)? That's just telling the database how to measure the conceptual distance between your question and every piece of info it has. Think of it like a reverse distance meter. The closer the concepts match, the higher the similarity score it gives. And then it brings back the highest scoring, most relevant information. So by running this one script, you instantly upgrade your plain

10:17

database into an AI-powered knowledge base. Pretty cool, right? And the final part of this setup step is just getting the keys to the library, so to speak. In your Supabase project settings, under the API section, you'll find your project URL and the service role secret key. Security warning: this service role key is the master key to your entire database. Treat it like a password, like gold. Never, ever share it publicly. You'll need both the URL and that key for connecting

10:45

it in n8n later on. OK, so we're essentially building the AI's personal research library here, getting it all ready for use. That's it. Building the database and its intelligent shelves, ready to be filled. Right. So with our library built and ready to be stocked, or maybe already stocked if we ran the pipeline, it's time to hire the scholar. Our RAG agent. This is a new, separate n8n workflow. This one's for the interactive conversations. Think about the anatomy of this

11:11

AI scholar. It's got the ears. That's a chat trigger node. It just listens for user questions. Then there's the central nervous system, an AI agent node that coordinates everything, the thinking and responding. The brain itself is an OpenAI chat model, maybe like GPT-4o mini, which does the actual thinking and generates the responses. Crucially, we give it the library card. That's a Supabase vector store tool. You add this as a tool specifically for the AI agent. You set

11:37

its operation to retrieve documents. And you give it a really clear description, something like, use this tool to look up the official rules of golf. The AI needs that clarity. And finally, the universal translator. That's the embeddings model again. This must be the exact same model, say, text-embedding-3-small, that you used when you built the pipeline. That ensures consistent understanding for accurate searching. It's got

11:57

to speak the same language. Now, a brilliant scholar who can't remember what you just said five seconds ago, that's incredibly frustrating, right? It's like the Dory from Finding Nemo problem. By default, your agent has zero short-term memory, so we need to give it a notepad. This is step four, adding conversational memory. And you can actually do this surprisingly easily. Right inside your AI agent node, you add the PostgreSQL chat

12:19

memory feature. You'll create a new PostgreSQL credential using your Supabase database settings. Quick tip here, use the transaction pooler settings that Supabase provides. It helps with efficiency, especially if you plan to scale this up later. The password you need here is the one you set up for your Supabase project way back at the start. See? Told you you'd need it. You can also set the context window, meaning how many past messages it remembers. The default is five recent interactions,

12:45

which is usually a good starting point. And this whole memory system works using session IDs. Think of it like a unique library card for each user. That ensures everyone gets their own separate continuous conversation. Your chat doesn't get mixed up with someone else's. I still wrestle

13:00

with remembering to configure these memory pieces myself sometimes. It's just, it's easy to miss. Yeah, I can see that. And testing this memory is pretty simple, right? You just start a new chat, say, hello, my name is, whatever your name is, and then in the next message ask, what's my name? If it answers correctly, bingo, its notepad is working. You can even double check by looking at the n8n_chat_histories table directly in Supabase if you want proof. So this creates that conversational

13:27

expert, one that can actually remember the previous turns in the conversation. Yes, exactly. It enables those interactive, context-aware conversations that feel much more natural. Okay, makes sense. Mid-roll sponsor read. Now let's talk about the fuel, the stuff that makes our AI engine actually run, the OpenAI API key. This is a step that can sometimes trip up beginners. It's really crucial to understand the difference between your ChatGPT Plus subscription, if you have one,

13:54

and the OpenAI API. Think of ChatGPT Plus like an all-you-can-eat buffet. You pay a flat fee, like $20 a month, for direct human-to-AI chatting through their website. The OpenAI API, though, that's different. It's a la carte. You pay for exactly what you use, usually charged per token, which is kind of like parts of words. This API is specifically designed for programmatic use, you know, machine-to-AI interaction. And that's exactly what you need for tools like n8n

14:19

to work. Right. And getting your API key is pretty straightforward. You navigate to platform.openai.com. Notice the platform part. That's the developer side, not the regular ChatGPT site. You'll create an account if you don't have one and add a payment method. Think of it like opening a tab at a restaurant. You only pay for what you order. When you go to the API keys section, click create new secret key. It'll generate a key that starts with sk-.

14:42

Security 101. Again, this API key is basically like a credit card number for your AI usage. Guard it fiercely. Copy it immediately. Save it in your secure password manager. Never, ever share it publicly. And once you close that little window where it shows you the key, you'll never see the full key again. If you lose it, you just have to generate a new one. Okay. And the golden rule of cost control here is really start small and set limits. Tip number one, use cheaper models

15:08

when you're testing and building. GPT-4o mini is incredibly capable, especially now, and it's significantly cheaper than the bigger models like GPT-4 Turbo. You can always upgrade later once you know it works and you need more power. Tip number two, and this is probably the most important one, set hard usage limits. In your OpenAI billing settings, you can actually set a hard spending limit, maybe just $10 a month

15:29

to start. If your usage ever hits that limit, the API will simply stop working until the next billing cycle. This is your ultimate safety net. It lets you experiment with total confidence, knowing you won't get a surprise bill. Got it. So the API key is the gateway to the AI's brainpower, and cost control is absolutely essential. Right. Access the brain, but keep that spending firmly in check. All right. The moment of truth, then. Let's test our AI golf caddy. You execute the

15:55

agent workflow in n8n. You open the chat interface it provides, and you ask a question. Let's try that one. What am I allowed to do for practice before a round? Now, behind the scenes, here's what happens. The agent gets your query. It identifies its "look up the rules of golf" tool as the right one to use. It sends your question to the embeddings model to get vectorized, turned into numbers. It searches the Supabase database using

16:17

those vector numbers. It finds the most relevant chunks of text about practice from that PDF we loaded. It feeds those relevant chunks to its GPT-4o mini brain along with your original question. And then, it generates a perfect, context-aware answer based only on the provided rules. Yeah, and the response you'd likely get, pulled straight from the source document, would be pretty

16:36

detailed. Something like, before a round, you can practice on the course on the day of a match play event, but not before a stroke play tournament or playoff, or between rounds, unless the committee allows it. During the round itself, no practice shots are allowed when playing a hole or between holes, except for some chipping or putting on or near the last green you finished, a practice green, or the next tee box, provided it doesn't hold up play. See? Super specific, right

17:02

from the rules. And you can even check the agent's execution logs in n8n to see this whole thought process play out, which tool it used, what info it retrieved. It's great for debugging, but really this is just the hello world of RAG, you know. Beyond these basics, you can set up dynamic document updates. Imagine a workflow where any new file dropped in a Google Drive folder automatically triggers an update to the AI's knowledge base.

17:24

For multi-user systems, you could use unique session IDs, like maybe a user's email address or phone number, to maintain potentially thousands of separate private conversations simultaneously. I mean, imagine scaling that. Giving a custom private AI brain to potentially millions of individual users, all running off one core system. That's truly remarkable when you think about it. And of course, there's always performance

17:48

optimization you can do later. Things like fine-tuning the vector search parameters, maybe caching common queries to reduce API calls and costs. Lots you can do. So this example, it's really just scratching the surface of what's possible with this RAG approach. Absolutely. It's a powerful foundation. You can build almost infinite applications

18:05

on top of this basis. So let's recap. You've just gone from a general chatbot, one that knows a lot about everything but nothing specific about your stuff, to an AI with a custom specialized brain. RAG systems, powered by these clever vector databases, allow AI to do its own context-specific research within your data. You've got the librarian building the knowledge base, meticulously organizing the information, and the scholar using that knowledge base, retrieving precise facts

18:34

to answer your questions accurately. And maybe the best part, you can actually build this all without writing a single line of traditional code, creating powerful, intelligent systems for anything from, well, golf rules to complex business data. Yeah, this isn't just some cool tech demo. It's really about building the future of how AI interacts with your unique world, your

18:54

specific information. You've now got the foundational skills, the understanding to customize AI and build truly intelligent agents tailored to your needs. This deep dive has hopefully equipped you with a pretty profound understanding of how to give AI both a memory and a personal library. So the question for you is, what specific dense body of knowledge in your life or your work could you give your AI next? Imagine the expertise you could unlock. Thank you for diving deep with

19:20

us today. Keep exploring, keep building, and yes, stay curious.

Transcript source: Provided by creator in RSS feed
