#59 Max: RAG 2.0 is Here – Build an AI Agent That Understands Your Data with Knowledge Graphs | AI Fire Daily podcast

00:00

Imagine an AI that doesn't just, you know, pull up facts, but one that truly understands how those facts connect, how they influence each other. An AI that can reason, much like a brilliant research assistant, not just a super-fast search engine. We're really talking about a fundamental shift in how AI interacts with information. It's like giving AI not just a memory, but a mind, you know, a way to connect things. This capability, it genuinely changes what's possible. Welcome

00:29

to the Deep Dive. Today we're exploring a really cutting-edge development in AI. It's this fusion of agentic RAG and knowledge graphs. Sounds fancy. It does. But our mission here is to unpack why the traditional ways AI retrieves information often, well, they hit a wall. Right. And then see how combining this sort of photographic memory with the ability to connect the dots creates

00:51

something genuinely powerful. You'll get a feel for what enables this, how it's actually built, and what it could mean for how you interact with your own data. Yeah, and I think a lot of us have used AI to chat with documents, right? It feels pretty magical sometimes. It does. You ask a question, boom, it finds the answer right there in your files. But what happens when the questions get really complex, when you need to understand relationships, maybe compare strategies

01:18

across different sources? That's where it gets tricky. Right. So for anyone who's played with traditional RAG, that's retrieval-augmented generation, it's a good starting point. It's meant to ground a language model in your specific documents. But what are its limits? When do questions need more than just a simple lookup? Well, think of it like this. Traditional RAG, it's great for direct answers, finding that specific sentence.

01:42

But it's fundamentally inflexible. The process usually involves breaking documents into these little chunks. Then you create mathematical embeddings, sort of numerical fingerprints for each chunk, and store them in what's called a vector database. This is for quick similarity searches. So when you ask a question, the AI just finds the most similar chunks and stuffs them into the prompt for the language model. So it gets the context, the raw text, but it can't really reason about

02:09

it. It doesn't understand the meaning behind it. Precisely. The AI is constrained. It can't, say, look at the initial search results and think, hmm, these aren't quite right, let me try a different angle. It's not built to explore complex relationships or do multi-step reasoning across different facts it finds. So essentially, it ends up being a very sophisticated search engine. But still just a search engine. Yeah, exactly. And one that's kind of blind to the deeper connections.
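The chunk, embed, and retrieve loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the code from the article: the word-count "embedding" stands in for a real embedding model, the in-memory list stands in for a vector database, and the Acme text and all function names are made up for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=8):
    """Split text into fixed-size word windows (the naive chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, top_k=1):
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, chunks):
    """Stuff the retrieved chunks into a prompt for the language model."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical document, invented purely for illustration.
DOCUMENT = (
    "Acme was founded in 1999 in Ohio. Acme sells anvils and rocket skates "
    "to coyotes. The support team answers email within two business days."
)

# The "vector database": a list of (chunk, embedding) pairs.
index = [(c, embed(c)) for c in chunk_text(DOCUMENT)]

question = "What year was Acme founded?"
print(build_prompt(question, retrieve(question, index)))
```

Notice that the pipeline is a single fixed path: one search, one prompt. Nothing in it can inspect the results and decide to search again or follow a relationship, which is the limitation being discussed. The fixed-size windows also cut sentences in half, which is the chunking problem that comes up later in the episode.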

02:35

It's like someone reading random paragraphs from a dozen books, but never grasping the overall story. Okay, so what specific kinds of questions really break these traditional RAG agents? Where do they really stumble? Questions about relationships, comparisons, or maybe cause and effect. Those often cause major trouble. Got it. Questions needing more than just finding similar words. Yeah, things that require synthesis. Right. Okay. So here's where it gets really interesting then.

03:02

This new approach, agentic RAG, combined with knowledge graphs, you said it gives AI not just that memory, but also this ability to connect the dots. That really does sound like a big leap. It truly is. Think of it like this. If traditional RAG is a really, really good filing cabinet, excellent at pulling out specific files quickly, then this new system is more like your brilliant research assistant. This assistant hasn't just read every

03:28

document you gave them. They've actually understood how everything within those documents relates to everything else. They've built this comprehensive mental map. So the AI itself, the agent, it has autonomy. It gets to decide the best strategy to answer my question instead of just following one rigid search-then-answer path. Exactly. That's the core idea. For a simple fact, like what year was this company founded, it might just do a

03:51

quick vector search. Fast and easy. If you ask about relationships, like how does this new regulation impact our main competitor, it can traverse the knowledge graph, mapping out those connections. And for really complex queries, it might intelligently combine both methods. You know,

04:09

a real-world example. Instead of just finding separate documents mentioning Microsoft and OpenAI, this agent, because it has that knowledge graph, it understands their multibillion-dollar partnership. It knows how that connection impacts their strategies, their products, everything. It's connected knowledge, not just isolated facts. And that's what you mean by agentic RAG, the agent choosing the

04:30

strategy. Yep. An AI system where the agent itself figures out the best way to find and use information to answer your query. So how does this deeper level of understanding fundamentally change what AI can do for us? What kinds of problems can it tackle now? Well, it allows the AI to actually reason and provide much more nuanced answers, going way beyond simple fact retrieval. It starts to sound like actual understanding. It's getting closer. This capability, it sounds pretty advanced.

04:59

What are the actual tools making this possible? Is this all super proprietary, complex stuff? Surprisingly, no. A lot of it is built on open source components. We're talking about things like Pydantic AI. That's a Python library that helps build structured, reliable outputs from the AI agent. So it's not just rambling. It gives you organized answers. Okay, structure is good. Yeah. And for the knowledge graph itself, there's a tool called Graphiti to help define the structure,

05:24

the schema. And then Neo4j is pretty much the leader for graph databases where all those intricate connections actually live and get queried. Right, Neo4j. And what about the vector database piece for that initial similarity search? Ah, yeah, that's a pretty clever choice in this setup. It uses PostgreSQL, the familiar, robust SQL database, but with an extension called pgvector. So your standard relational database suddenly gains high-performance vector search capabilities.

05:52

Whoa, wait. PostgreSQL with pgvector? So it's like a database and a vector search engine rolled into one, a two-in-one superpower? Yeah, kind of like having a sports car that also happens to be a submarine. Yeah. It's really powerful. You get this solid, reliable SQL foundation, but can also handle the vector stuff extremely well for many use cases. That sounds incredibly

06:12

versatile. But I wonder, does putting vector search inside PostgreSQL have any trade-offs compared to using, say, a totally separate specialized vector database? That's a fair question. You definitely gain a lot of convenience and often simplify your whole setup. You have one less system to manage. But, you know, for scenarios demanding absolutely massive scale, maybe billions of vectors or ultra-low-latency search, a dedicated specialized vector database might still have

06:41

an edge. Okay. For most applications, though, this Swiss Army knife approach with Postgres and pgvector is incredibly effective and flexible. Plus, the whole thing uses FastAPI for the API layer, which is modern and fast. Right. And critically important, I think, you have amazing flexibility with the large language models, the brains, right? You can plug in OpenAI's models or run open source models locally using Ollama or use Google Gemini models. You're not locked into one vendor. No,

07:10

vendor lock-in is huge. So what's the key benefit, the sort of magic, when you combine these specific tools, Pydantic AI, Graphiti, Neo4j, Postgres with pgvector, FastAPI? It's that seamless integration that allows you to build these really powerful yet customizable AI systems that can both search and reason. Powerful and customizable. I like the sound of that. Okay, so how does someone actually build this kind

07:35

of assistant? It sounds like a lot of moving parts, but the article suggests it's pretty approachable if you're willing to dive in. Yeah, the article lays it out step by step. You basically get the code, set up your Python environment. You'll need PostgreSQL. And they suggest maybe using a managed service like Neon, which handles the vector stuff easily. Okay, Neon for Postgres. And then your Neo4j instance for the graph database. But the really crucial part, the magic

08:01

moment, is loading your knowledge base. Right, your documents. Exactly. You just put your documents, PDFs, text files, whatever, into a specific folder. Then you run an ingestion script, and the AI gets to work. What's it doing, though? It splits the text into meaningful chunks, not just random bits. It generates those vector embeddings for the similarity search in PostgreSQL. And this

08:21

is key. It uses a large language model to read through the text and identify the important entities like people, companies, concepts, and the relationships between them. Ah, so the LLM itself helps build the knowledge graph. Precisely. It extracts that structure, that understanding, and populates the Neo4j graph database. That's where the real understanding of your data gets built. Okay, so once that's built and the graph is populated...

08:46

How do we test it? What kinds of questions really show off its reasoning beyond just finding facts? This is the fun part. You can actually see its logic. Ask it something simple like, what are Google's main AI initiatives? The agent will probably just use a quick vector search. Finds the relevant passages. Simple. Right. But then ask something like, how are Amazon and Anthropic connected? Now, it gets interesting. The agent should realize this is about a relationship.

09:12

So it uses the graph search. It navigates the connections in Neo4j and finds their big investment relationship, maybe Anthropic's use of AWS infrastructure. It pieces together the connection. That's cool. And then for the really complex stuff, multi-step reasoning. Ask it something like, compare Microsoft's AI strategy with OpenAI's approach. Okay, that's not a simple lookup. Not at all. Here, the agent has to be smart.
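The graph lookup for the Amazon and Anthropic question can be pictured as a path search over entity-relationship triples. The following is a minimal sketch, assuming an in-memory edge list in place of Neo4j; the relationship labels (INVESTED_IN and so on) and the function names are hypothetical, chosen only to mirror the connections mentioned in the episode.

```python
from collections import deque

# (source entity, relationship, target entity) triples.
# Labels are illustrative assumptions, not the schema the article's system produces.
EDGES = [
    ("Amazon", "INVESTED_IN", "Anthropic"),
    ("Anthropic", "USES_INFRASTRUCTURE", "AWS"),
    ("Amazon", "OPERATES", "AWS"),
    ("Microsoft", "PARTNERS_WITH", "OpenAI"),
]


def build_adjacency(edges):
    """Index every edge from both endpoints so traversal can ignore direction."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target, "out"))
        adjacency.setdefault(target, []).append((relation, source, "in"))
    return adjacency


def find_path(adjacency, start, goal):
    """Breadth-first search; returns the shortest chain of triples, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor, direction in adjacency.get(node, []):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            # Record each triple in its original direction.
            if direction == "out":
                step = (node, relation, neighbor)
            else:
                step = (neighbor, relation, node)
            queue.append((neighbor, path + [step]))
    return None


adjacency = build_adjacency(EDGES)
print(find_path(adjacency, "Amazon", "Anthropic"))
print(find_path(adjacency, "Amazon", "OpenAI"))
```

The first call finds the direct investment edge; the second returns None because nothing in this tiny graph links Amazon to OpenAI. In the real system the same question would be answered by a query against the graph database rather than a Python loop.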

09:39

It needs to combine vector search to pull specific facts about each company's strategy and graph search to understand their deep partnership, how one influences the other. The result should be a really nuanced, comprehensive answer that considers both the individual parts and their relationship. It's genuinely impressive when you see it work. Wow. Imagine scaling that kind of analysis. A billion queries like that. The depth of understanding you could unlock. It's

10:04

incredible. Really remarkable potential there. So thinking about getting it running, beyond just installing the software and setting environment variables, what's often the trickiest part in getting the quality right, making sure the understanding is actually good? Yeah, that's key. Setting up the databases and making sure all the connections and, like, API keys are right can definitely be tricky for first-timers. But getting the quality often comes down to the data feeding in and how

10:31

the agent reasons. Right, the inputs matter. Hugely. Okay, so once a system is built, how locked in are you? How customizable is it? Can you really tailor it? Oh, absolutely. That's the beauty of it being mostly open source. You can totally customize the types of entities and relationships it looks for. Maybe you need it to understand legal clauses or specific scientific concepts. You could integrate multimodal knowledge. Imagine feeding it images or audio files alongside

10:57

text. You can set up real-time updates so as new documents come in, the knowledge graph grows. And you can definitely optimize performance for really massive data sets. It's designed to be flexible. That's really powerful. Now, what's equally fascinating, you mentioned earlier, is how the system itself was actually built. The author used something called context engineering with an advanced AI like Claude. So the AI wasn't just like a co-pilot helping out. It was more

11:23

fundamental. Way more than a co-pilot, really. The description makes it sound like, imagine the AI was the incredibly skilled camera operator, the lighting expert, the sound engineer, and the editor. Wow. While the human was primarily the film director, providing the vision and the overall plan. Apparently, over 90% of the system's code was generated by the AI in just 35 minutes.

11:46

35 minutes. Yeah. By feeding it a really detailed project specification, the requirements, and crucially, access to the documentation for all

11:55

those tools we mentioned. I mean, I still wrestle with prompt drift myself sometimes, getting the AI to consistently do what I want. So seeing this level of complex system generation through AI assistance is genuinely mind-blowing. And that's context engineering, giving the AI all that background material? Essentially, yeah. It's about guiding the AI assistant by providing comprehensive documentation, clear examples, detailed plans, basically giving it everything it needs

12:24

to understand the task deeply and generate the right code or system components. So what's the big takeaway then from seeing AI build such a sophisticated system so rapidly? What does that imply more broadly? It just shows how AI-assisted development can drastically speed up the creation of complex systems. It's a game changer for building things quickly. A definite force multiplier.

12:44

Okay, so for listeners who are thinking, hey, I want to try building this, are there any common pitfalls, any gotchas they should watch out for that might trip them up? Definitely a few things to keep in mind. First, during your initial testing phase, creating that knowledge graph from scratch can sometimes take a really long time, especially with lots of documents. Right. The good news is you can actually skip that step temporarily

13:07

for faster iteration. There's usually a flag you can use when running the ingestion script, something like no knowledge graph, just to get the vector search part working first. Okay. Handy tip for testing. What else? Another common one. Sometimes you find the agent gets stuck in a rut. It always uses vector search or always tries the graph, even when the question clearly suits the other method. Ah, it's not making the right

13:32

decision. Exactly. If that happens, you need to go back and refine the system prompt you give the agent. You have to be super clear about the criteria it should use to decide which tool, vector search or graph search, is appropriate for different kinds of questions. So tweaking the agent's instructions.
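To make "clear criteria" concrete, here is a rough sketch of what such routing instructions might look like, along with a keyword-based stand-in for the decision. Both are assumptions for illustration: the prompt wording is invented, the tool names vector_search and graph_search are hypothetical, and in the actual system the language model makes this choice from the prompt rather than from keyword matching.

```python
# Hypothetical system prompt spelling out when to use each tool.
SYSTEM_PROMPT = """\
You can call two tools:
- vector_search: use for direct factual questions about a single entity.
- graph_search: use for questions about how two or more entities are
  connected, or how one affects another.
For comparisons, call both tools and combine the results.
"""

# Cue lists are illustrative and far from exhaustive.
COMPARE_CUES = ("compare", "versus", " vs ", "difference between")
RELATION_CUES = ("connected", "relationship", "related", "partner", "impact", "invest")


def choose_tool(question):
    """Crude keyword stand-in for the agent's tool choice."""
    q = question.lower()
    if any(cue in q for cue in COMPARE_CUES):
        return "hybrid"
    if any(cue in q for cue in RELATION_CUES):
        return "graph"
    return "vector"


for question in (
    "What are Google's main AI initiatives?",
    "How are Amazon and Anthropic connected?",
    "Compare Microsoft's AI strategy with OpenAI's approach.",
):
    print(f"{choose_tool(question):>6}  {question}")
```

A stand-in like this can also double as a sanity check while iterating on the prompt: run a list of test questions through the agent, log which tool it actually called, and compare against the choice you expected.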

13:50

...off, irrelevant, like it's not finding the right chunks? Yeah, that can happen. First thing is maybe experiment with different embedding models. Some are better than others for certain types of text. Okay. But crucially, pay attention to how you're chunking the documents. This idea of semantic chunking is really important. Semantic chunking? Yeah. Instead of just chopping documents into rigid, say, 500-word blocks, right, you split them based on logical breaks in the content, paragraph

14:17

sections, topic shifts, meaningful segments. That usually gives the vector search much better context to work with. Semantic chunking basically is just dividing text into meaningful, logically coherent parts, not just fixed-size blocks. It makes a big difference. Makes sense. Better chunks, better search. And what about if you just can't connect to Neo4j, the graph database? Oh yeah,

14:39

that's usually simpler. Double check your credentials, username, password, database URL, and just make sure the Neo4j service is actually running on your machine or wherever it's hosted. It's often just a configuration typo or the service isn't started. Standard troubleshooting then. So beyond these specific fixes, what's a good general mindset to have when you're troubleshooting these complex

14:59

AI systems? Patience. iteration it often involves carefully tweaking the AI's internal prompts trying different chunking strategies maybe adjusting parameters and seeing how it affects the results it's rarely a one -shot fix careful refinement is key iteration and refinement got it so let's pull back what does this all mean for us what's the really big idea here it feels like more than

15:23

just a new technique. I think it is. The big idea is that the future of AI, especially when it comes to understanding information, isn't just about making search faster or training even bigger language models. It's really about building systems that can genuinely reason. We're moving away from thinking about static databases that just store isolated facts towards building dynamic, interconnected knowledge networks where the relationships are just as important as the facts themselves.

15:50

You know, traditional RAG gives you a really good search engine. Agentic RAG, combined with knowledge graphs, starts to give you something closer to a brilliant research assistant. From search engine to research assistant, that's a powerful shift. It's a fundamental change in how we can expect AI to help us make sense of complex information. So this deep dive really

16:10

reveals a fundamental shift, doesn't it? Moving from just simple search towards an intelligent understanding that actually grasps context and, crucially, relationships. Absolutely. And the tools are becoming accessible enough to actually build this now. So maybe some homework for our listeners, if you're feeling ambitious, that is. Yeah. Get your hands dirty. Try setting this system up. The article provides the guide and the code. Load it with your own documents. Maybe

16:35

it's your company's internal notes. Maybe research papers you need to synthesize. Project documentation. Whatever complex information you need to master. And then test it. Exactly. Ask it different kinds of questions: simple facts, relationship questions, those complex comparison queries. Push its boundaries. See what it can do with your data. Because this is really about more than just playing with code,

16:56

isn't it? It's about starting to grasp how AI can navigate that incredibly complex web of relationships within your information, uncovering insights that were maybe hidden before. It's about starting to build that future of AI for yourself, tailored to what you need to understand. So the final thought to leave everyone with: what kind of complex relationship-based questions could you finally get answers to if your AI truly understood your data?

Transcript source: Provided by creator in RSS feed