Hello, and welcome to the Data Engineering Podcast, the show about modern data management. If you lead a data team, you know this pain. Every department needs dashboards, reports, custom views, and they all come to you. So you're either the bottleneck slowing everyone down, or you're spending all your time building one-off tools instead of doing actual data work.
Retool gives you a way to break that cycle. Their platform lets people build custom apps on your company data while keeping it all secure. Type a prompt like "build me a self-service reporting tool that lets teams query customer metrics from Databricks," and they get a production-ready app with the permissions and governance built in. They can self-serve, and you get your time back. It's data democratization without the chaos.
Check out Retool at dataengineeringpodcast.com slash Retool today, that's r e t o o l, and see how other data teams are scaling self-service. Because let's be honest, we all need to retool how we handle data requests.
Your host is Tobias Macey, and today I'm interviewing Jamie Knowles and Ryan Hirsch about ER Studio and the foundational role of enterprise data modeling in modern data engineering.
So, Jamie, can you start by introducing yourself?
Yeah. Hi. I'm Jamie Knowles. I'm product director for ER Studio.
And, Ryan, how about yourself?
Yeah. I'm Ryan Hirsch, and I'm the product marketing manager for ER Studio.
And going back to you, Jamie, do you remember how you first got started working in data?
Oh, yes. About twenty-five years ago, trying to understand how all 43 police forces across the UK interoperated. So trying to understand how to describe bad people and police and crimes and things. I've been working with tools ever since.
And Ryan, how did you get started working in data?
Yeah. So I don't have quite the years that Jamie does, but I've been in the data world for a few years now. I had a really cool opportunity to work with another Idera brand called WhereScape, so I got my foot in the door with them, building data warehouses, and then, yeah, had the opportunity to move over to the ER Studio brand, and it's been an awesome experience the last couple of years working for them.
I think data years are like dog years, though, aren't they?
At least. True. Of course, now we're into AI years, which is an exponential multiple of that.
Yes. Exactly. You bet.
And so digging into ER Studio, can you start by just setting the groundwork for what it is and some of the goals behind what you're trying to achieve with it?
Sure. Yeah. So ER Studio is an enterprise data modeling and architecture platform. So at a high level, we help organizations define and document their data so everyone agrees on structure, meaning, and relationships before anything gets built. We support both logical and physical modeling, so teams can align on business definitions first and then implement them consistently across their databases, their warehouses, and analytics platforms.
So we work with a lot of typically large and complex organizations where the data reliability, the scale and clarity really matter, and where data governance is vital to their organization.
Yeah, and Tobias, we talked about this last time. I think in most companies' analytics programs, data structure just kind of happens. A new source lands, someone builds a pipeline, someone builds a dashboard, and meaning sort of accumulates over time. And I think governance, if it exists at all, happens after things have been built, and we're saying all of that's backwards. You've got to start off with deciding what things mean first and then build. That's ER Studio.
And in terms of the interfaces and personas that you're focused on working with, how does that goal of bringing the design into the upfront piece factor into the ways that you think about the overall interfaces and user experience and engineering flows around the product?
Yeah. So that user interface is all around pictures, providing something that everybody can universally look at. Data models are generally built as some sort of graphical picture: good old-fashioned entity-relationship diagrams. We've got entities, or things; those things have attributes; you draw lines between them; and everyone can understand very clearly what all this information means, including the different subtypes of things. It's all laid out in diagrams, and then the tool will take you through the process of translating those logical diagrams into physical diagrams and then into code.
And you mentioned some of the tension between the engineering default of, oh, I've got a data source, I'm just gonna do the work to get it to the destination, and how that sometimes circumvents that modeling aspect because maybe they don't have all of the business context that's necessary to produce the actual useful artifact, because they're just trying to deliver the artifact that they were asked to deliver. I'm curious about some of the ways that the overall aspect of, and I'll use air quotes here, enterprise data modeling, because everyone should be doing it regardless of size or scale of business, differs between that move fast and just get something delivered default that engineers are pushed into versus the actual business requirements around how the data needs to be thought about at the organizational scale.
Yeah. So there's two very distinct roles, the architect versus the engineer. The architect is focused on design and intent. They're asking questions like, what does the term customer mean? How should core concepts relate? What standards do we follow? How do you make those consistent across systems? And consistency is the key. So traditionally we might have customers using ER Studio to design application data architecture. So we've got lots of different systems all holding data.
We want to make sure that the data in those systems is consistent, which aids interoperability, usability, etc. And that's even more important for an analytics programme, so across all those different layers, we want to ensure that consistency. So yeah, the architects are all about that structure and meaning, whereas the data engineer is focused on execution. They're building pipelines and transformations and integrations, making sure data moves reliably and efficiently.
And I think maintaining those role divides is really important. Defining business meaning isn't a trivial task. When engineers are forced to do it in SQL under deadline pressure, you get semantic drift, and that's a huge danger. And that doesn't happen because they're careless; it's just because they're human and under pressure. Architects should be doing that hard work of translating business intent into coherent designs, and that leaves the engineers free to do their own bit of hard work, which is implementation. So two distinct roles working together.
Digging into some of that role and responsibility divide, how do you see that in terms of the hiring aspect? What are some of the characteristics that you're looking for in a data architect versus a data engineer? And what are some of the ways that you need to be thinking about the composition of your organizational team so that the architect and the engineer are able to execute effectively in tandem?
The engineers are really doing a more technical role. They're focused on the technologies, whereas the data architects have probably got more business-facing abilities. They've got to listen to the enormous quantity of information that comes out of the stakeholders and try to boil that down into that sort of structure and meaning. It's a different set of skills.
Yeah, I would, you know, to add on to that, I would say, you know, with ER Studio, we kind of feel that we live kind of at that precipice, that kind of overlap, where we expect the architects and the engineers to both utilize ER Studio. So, you know, ER Studio, it's gonna give the architects a place to define the intent clearly, and then give the engineers that actual blueprint that they can trust to implement that meaning that the architects have set in place.
Yeah, that's a really good point. I think those pictures really help connect everybody together. And ER Studio does a lot of the heavy lifting of converting those pictures into more technical layers, the technical blueprints, and then eventually the code.
When you're talking about the logical models, the semantic models, and some of the conversion into the technical representations, what are some of the key conceptual elements that need to be maintained throughout each of those different representations and some of the ways that at the product level, you focus on ensuring that there is no unintended drift or there aren't any elements that are lost in translation between those layers?
Yeah. So it all starts off with the logical model. A logical data model describes the business. It's technology independent, nothing to do with the databases and the underlying code. So we're, again, defining common entities. What is a customer? What is a product? What are the different types of customer? How do those things relate to each other? So the logical model is focused on the information. And that semantic layer, that semantic framework, then governs everything else.
So again, looking at a traditional data warehouse, we've got the source systems, we've got bronze, silver and gold, and all those different layers should be realising that same semantic model, but in different ways. So that logical model is the overarching structure for everything. And the power of ER Studio is that from that one logical model, that business-driven model, we can generate those different layers, structures, different physical models. So the logical model is focused on information; the physical model is concerned with the implementation of that information on the technology.
So we talk a lot about creating buildings. A traditional architect of a building will create a sketch which shows how the user is going to use that building, and then he'll create technology blueprints, wiring diagrams, plumbing diagrams, everything related to the implementation of that. So the power of ER Studio is that it connects those different layers together and allows you to generate one from the other. You've got the traceability.
So that overarching semantic model allows you to completely understand what the information is at any point in that flow.
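As an illustration of the idea above, here is a minimal sketch of one technology-independent logical entity being turned into physical table definitions for several warehouse layers, with a comment tracing each table back to the entity it realizes. This is not ER Studio's generator or output format. The `Entity` and `Attribute` classes, the `TYPE_MAP` type mapping, and the layer names used as schema names are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str              # business name, e.g. "Customer Name"
    logical_type: str      # technology-independent type: "text", "integer", "date"
    is_identifier: bool = False


@dataclass(frozen=True)
class Entity:
    name: str
    attributes: tuple


# Hypothetical mapping from logical types to one platform's column types.
TYPE_MAP = {"text": "VARCHAR(255)", "integer": "INTEGER", "date": "DATE"}


def physical_name(business_name: str) -> str:
    """Turn a business name into a snake_case physical name."""
    return business_name.strip().lower().replace(" ", "_")


def to_ddl(entity: Entity, layer: str) -> str:
    """Generate a CREATE TABLE statement for one layer (used here as a schema name)."""
    lines = []
    for attr in entity.attributes:
        column = f"    {physical_name(attr.name)} {TYPE_MAP[attr.logical_type]}"
        if attr.is_identifier:
            column += " NOT NULL"
        lines.append(column)
    keys = [physical_name(a.name) for a in entity.attributes if a.is_identifier]
    if keys:
        lines.append(f"    PRIMARY KEY ({', '.join(keys)})")
    body = ",\n".join(lines)
    table = f"{layer}.{physical_name(entity.name)}"
    # The leading comment keeps the trace back to the logical entity.
    return f"-- realizes logical entity: {entity.name}\nCREATE TABLE {table} (\n{body}\n);"


customer = Entity(
    "Customer",
    (
        Attribute("Customer Id", "integer", is_identifier=True),
        Attribute("Customer Name", "text"),
        Attribute("Signup Date", "date"),
    ),
)

if __name__ == "__main__":
    for layer in ("silver", "gold"):
        print(to_ddl(customer, layer))
```

The point of the sketch is only that every physical table is derived from, and labeled with, the same logical definition, so the layers cannot silently disagree about what a customer is.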
And there are a couple of different directions that I'm interested in exploring, one of which is particularly with the introduction of agentic coding capabilities into the ecosystem.
How does that change the workload and requirement around the data engineer role in particular because you already have the enterprise architect who is doing that translation from the business to the logical model. But before we go down that rabbit hole, I'm also interested in talking through engineering aspects of how ER Studio works across some of the different compute engines and compute systems that you might need to execute against for being able to actually materialize
that logical model down to the physical realities of data on disk, and just some of the ways that the shift in the ecosystem around those compute substrates informs the ways that you need to think about the translation going from that logical and semantic level down to the ERDs, or some of the shifts in terms of the reality of what modeling approaches you need to be able to address, particularly as we move to introduce new approaches such as data vault or anchor modeling beyond the kind of Kimball approach that has been the bread and butter of the industry for such a long time?
So I think the approach stays the same as it's always been. So again, that sort of approach of building the semantic framework. What makes this organization tick? What are the pieces of the organization that we care about, that we're interested in? And then from that, everything else drops out. It doesn't matter whether it's human beings involved or AI. And I think the interesting thing about the world at the moment is there's a lot of misconceptions around AI, that AI is going to solve all the problems and do it all for us. We're hearing from customers that are just coming to us absolutely terrified.
"Our leadership thinks that we can just stuff all the data in the warehouse as quickly as possible, connect the AI to it, and it's going to tell us all the answers." So it's the classic sort of "what's the meaning of life? It's 42." So you've got to do those hard yards. I think in the past, with human beings creating pipelines manually, the risk was less. The human being would kind of understand the whole process from source data all the way through to a BI report, understood the context and the nature of all of that data, and would be able to produce well-structured BI reports.
I think nowadays, connecting an AI to the output of this and saying "hey, crack on, create a knowledge graph from this and tell me all the answers," this is where the danger really, really exists, and I think companies will experience a lot of problems if they take that assumption that it's going to be all okay. If you allow the AI to reason across data and it doesn't understand the nature of that data, the structure and the meaning of that data, that's where the danger comes in. So that's the traditional approach that we've always advocated for in ER Studio: start off with the semantic framework. That semantic framework then drives everything; it drives your data governance process. So a big feature of ER Studio is the way that we connect with a number of popular data governance tools. We're bringing together data governance and data architecture. So those two roles are working hand in hand. So we've now got a semantic framework overlaid with a governance framework. That then allows us to really start motoring in terms of the analytics process.
So we can start building data catalogues, understanding what source data we have mapped to that semantic backbone. We can then start understanding the requirements from the business needs. A lot of our customers are using the data-as-a-product approach. So we can define what our data products are. Again, same sort of process in ER Studio: we create a beautiful logical model as a picture, and we can confirm that with the stakeholders. Is this the information you're looking for? And I think so many of our customers are saying, look, the hardest part of the job is trying to find out from stakeholders what they want. So that picture is a really good vehicle for that. Is this what you're looking for? You talked about customer. Are you aware that our organisation has three different types of customer and they're very different? Great.
Okay. And you're talking about revenue for those customers. Are you aware that we've got very different types of revenue and different calculations for each of those? It's the same term you're using, but which is the one that you want? So the logical model is a really good vehicle for that. Creating that logical model for the data product from that overarching semantic model
gives you reusable architectural patterns. So we're already reducing risk. And then from that, ER Studio will help you work down, defining the physical layers, and give the data engineers all they need to know to say, okay, we've got data over here, here's the primary keys of it, here's the business keys of the target tables, here's how it all maps, this is what everything means. Go create the pipelines. And then the final output is great. We've got a well-designed, maybe gold, layer of the warehouse. Here it is, nicely designed against that semantic framework. And then this is where the magic is really going to happen with ER Studio. We've got plans to be able to generate outputs from ER Studio to inform the AI. So for instance, we're generating things like RDF files to provide that sort of terminology box, the structure of all of that data from that semantic framework, and then connect everything in your warehouse to that semantic framework. So the AI completely understands how this business works and, all right, that information, where does it live in the warehouse? And I think only then can you really rely on your AI to be able to reason successfully on your data. Kind of a long answer to a short question.
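To make the RDF idea concrete, the following is a small sketch that writes a few semantic concepts and their warehouse locations as Turtle text, using only the Python standard library. It does not reproduce any ER Studio export. The `ex:` namespace, the `ex:WarehouseTable` class, and the `ex:realizes` property are hypothetical names invented for the example; `owl:Class`, `rdfs:label`, and `rdfs:subClassOf` are standard vocabulary terms.

```python
PREFIXES = (
    "@prefix ex: <http://example.com/model#> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
)


def turtle_for_concept(name, parent=None):
    """Describe one business concept, optionally as a subtype of another."""
    parts = [f"ex:{name} a owl:Class", f'    rdfs:label "{name}"']
    if parent:
        parts.append(f"    rdfs:subClassOf ex:{parent}")
    return " ;\n".join(parts) + " .\n"


def turtle_for_table(table, concept):
    """Link a warehouse table to the concept it implements (hypothetical property)."""
    local = table.replace(".", "_")
    return f"ex:{local} a ex:WarehouseTable ;\n    ex:realizes ex:{concept} .\n"


def export_model(concepts, tables):
    """concepts: list of (name, parent or None); tables: dict of table -> concept."""
    body = [turtle_for_concept(name, parent) for name, parent in concepts]
    body += [turtle_for_table(table, concept) for table, concept in tables.items()]
    return PREFIXES + "\n" + "\n".join(body)


if __name__ == "__main__":
    print(
        export_model(
            concepts=[("Customer", None), ("RetailCustomer", "Customer")],
            tables={"gold.customer": "Customer"},
        )
    )
```

A consumer that reads this file can see both what a retail customer is in business terms and which table holds customer data, which is the kind of grounding described above.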
No. That was great. You anticipated my next question. So I was going to start digging into some of those aspects of the ways that having that logical and semantic model is able to act as supplemental context to the AIs who are some of the new downstream consumers because you mentioned business intelligence being the longtime client of all of the work that we do in data warehouses.
But as we move into a much more dynamic consumption ecosystem, that also changes the requirements around how that logical and semantic model is
preserved and exposed because it's not just something that's consumed by the human operators who are taking that and translating it into the physical models that are then queried by the business intelligence systems. It's also something that is used by these AI systems that are doing some of these agentic analytics workflows and also some of the ways that having that semantic
understanding of what is the purpose of this data asset can also be used for some of the reverse ETL workloads of I need to take this enriched dataset that started from five different places and then populate that into my ERP or my CRM so that it can then be used to improve the overall operational capabilities of the business because I actually have a more complete view of the
client, of the business, of all of the context around what my role is supposed to be focused on independent of my involvement in the actual data warehouse and data modeling.
Yeah. Exactly. I think if all these different components, whether it's human or AI, can understand everything in relation to that semantic model, then things are gonna happen really nicely and risk is really going to be reduced.
Yeah, I would, I mean, just to add on to that, I think, you know, especially when it comes to the AI, you know, we understand that, you know, AI systems don't handle ambiguity well. As humans, we can work around unclear definitions using our own context. AI can't. It can't do that. If meaning isn't explicit, AI won't ask for clarification. It'll just confidently kind of spit out incorrect results, which we all know as AI hallucinations.
So, you know, with that foundation, with those knowledge models, or those conceptual and logical models, we provide all the grounding that AI would actually need. So humans, you know, they're going to define meaning intentionally, and AI is just going to come in and consume that data. But what we're doing is we're putting trust, reliability, and structure behind that data first before it's going to be consumed by AI. Because we know without that foundation first set by us as the humans, AI is just going to amplify any of those inconsistencies, and that's just going to, you know, snowball. So we like to kind of grab the tail, you know, before it starts running away from us, and really just set that reliability upfront.
Yeah, I think it's an important time at the moment. So AI does not fix semantic drift. It amplifies it. As I said earlier, humans can compensate for the fuzzy definitions. AI cannot. So if you connect an AI to ambiguous data, it's gonna scale that ambiguity at sort of super high machine speed. And I think now is kind of an inflection point in architecture created by AI. Understanding the structure and meaning of your data is now non-negotiable. AI didn't just accelerate analytics. It's changed the tolerance for ambiguity.
And digging more into that governance aspect as well, how does information about the business rules and technical controls around what data can be accessed when and by whom and for which purposes inform
some of the other downstream work as far as actually implementing those controls. And then also, particularly for a lot of these agentic consumers, instruct the models as far as what operations to perform, ways that they can prevent wasted cycles trying to access something that they're not actually enabled to access, etcetera.
Yeah. That's really important. And again, governance is just so important nowadays. It's got to be baked in right from the start, so right from the point where we're defining what our data products are and the contents of them. And just the same as the human beings, and whoever's giving human beings access to that data, the AI needs the same information. And again, the only way to make this work is to have some sort of framework that ties everything together. That's that semantic backbone. That's the logical data model. The logical data model gives you the structure of the pieces: entities and attributes and the relationships.
From that we can bring our data governance exercise in. ER Studio has got some great tools for being able to harvest a business glossary, for instance, from a logical data model and allow the data stewards to then work through those business terms and assign classifications for the data, assign policies for the data. When we do that harvesting process, we're also arranging those terms into a similar sort of structure, so you can see those taxonomies.
Some data governance tools like Collibra allow you to actually use the logical data model as the structure, and they call it guided data stewardship, where those policies and rules are directly attached to the logical model. But again, if we can then take that logical model and give it to the AI with the connections to the business terms and the policies, then great, we can inform the AI to be able to deal with it. And again, in some of the mechanisms that we're using, like RDF export, we're using things like SKOS linkages back to the business terms. So yeah, we're trying to make sure that technically we can provide that information to the AI in a form it can ingest.
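One way to picture policies attached to business terms flowing down to physical columns is the sketch below. It lets a consumer, human or agent, check a request against the glossary before running anything, so no cycles are wasted on data it cannot read. This is an assumption-laden illustration, not how ER Studio, Collibra, or Purview represent policies. `GlossaryTerm`, the role names, and the column-to-term mapping are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryTerm:
    name: str
    classification: str       # e.g. "public", "confidential"
    allowed_roles: frozenset  # roles permitted to read data described by this term


def check_request(column_terms, role, requested):
    """Split requested columns into (permitted, denied) before any query runs.

    column_terms maps a physical column name to the glossary term that governs it.
    Columns with no governing term are denied rather than guessed at.
    """
    permitted = [
        col
        for col in requested
        if col in column_terms and role in column_terms[col].allowed_roles
    ]
    denied = [col for col in requested if col not in permitted]
    return permitted, denied


customer_name = GlossaryTerm("Customer Name", "confidential", frozenset({"steward", "analyst"}))
customer_email = GlossaryTerm("Customer Email", "restricted", frozenset({"steward"}))

COLUMN_TERMS = {
    "gold.customer.customer_name": customer_name,
    "gold.customer.email": customer_email,
}

if __name__ == "__main__":
    ok, blocked = check_request(
        COLUMN_TERMS,
        role="analyst",
        requested=["gold.customer.customer_name", "gold.customer.email"],
    )
    print("permitted:", ok)
    print("denied:", blocked)
```

Denying unmapped columns by default is a design choice for the example: anything not tied back to an approved definition is treated as unknown rather than open.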
And I would just add on to that. Just when it comes to the governance side of things, that was absolutely vital for ER Studio when kind of developing our competitive advantage and kind of what are the primary features that we wanted to provide to our customers. So our integrations with Purview and Collibra were basically nonstarters that had to be part of the solution we would offer as a data architecture company, because we recognize governance tools and modeling tools, they're going to solve different but complementary problems.
So the governance tools like Purview and Collibra, they're excellent with the stewardship, policy, data lineage, compliance, where we, as ER Studio, we're excellent at design, meaning, and structure. So when you kind of integrate both of those tools together, you don't end up with two versions of that same truth. The models are going to feed that metadata into the governance platforms, and those platforms then stay aligned as that architecture evolves.
So that's then going to prevent any of that semantic drift and keep companies from having to work with just a single vendor. We kind of provide that pliability and that variability if need be.
I think it's all about building those ecosystems, isn't it? We've talked about the roles of the data architect and the data engineer, and now we've got the data steward in there. Those three groups, those three roles have to work together effectively. And I think going back to agentic workflows, we can't just sort of offload responsibility to the AI. The human beings have to be involved in that process. They have to put in the checks and balances. They have to provide approved definitions of things. So, again, the AI is just another role in there. And ER Studio is providing that core, allowing us to connect those different roles together.
And then circling back around to some of the organizational aspects of data architecture, data modeling, the interplay between the architect and the engineer, what are some of the ways that having that central shared repository for the purpose and usage and definitions around these data models can actually accelerate the engineering effort, whereas a lot of times it's viewed as a point of friction and something that is a roadblock to the actual implementation.
Yeah. This is the age-old story, isn't it? Sort of "architecture slows me down." Martin Fowler did some great graphs on this with just building applications. So, yeah, with no architecture, or minimal architecture, we can just start building, build real quick, and then over time, as things get more complicated, adding more functionality takes longer and is harder. Doing architecture at the start gives you a slower start, okay? But as things get more complex, productivity and output grow. So having architecture is absolutely vital for any platform that is going to be scalable over time, going to be manageable. So having a tool like ER Studio, as you say, that central repository, all the models are published, they're connected to business terms and data governance tools, so anybody can go to those and understand what this piece of information means, how it fits into the broader semantic framework of the organisation, what rules and policies apply to it, how this information is classified,
this particular concept, how is it identified. So in this system we can see that it's got a primary key, okay great, but the business concept, how does the business identify one of these things? What are the different types, different ways of identifying it? How do I then correlate these keys across these systems? So there's just so much useful information in there to help the engineers in their already difficult job.
Yeah, I totally agree. You know, I think from our perspective, we see that the data architects and the data engineers, the roles are closely related, but they're solving different problems. So the data architect is going to focus on the design and the intent. The questions they're going be asking are like, what does the term customer mean? How should core concepts relate? What standards should we follow? How do we make this consistent across systems over time?
Whereas the data engineer is focused on the actual execution. They're building those pipelines and the transformations, the integrations, making sure that data is actually going to be moving reliably and efficiently. So where they overlap is going to be both those roles, really what they care about at its core is going to be trust and correctness. Where things often go wrong is where that architectural
decision isn't clear or didn't exist in the first place. So those data engineers are then forced to make those semantic decisions within their code, often under time pressure. So again, at ER Studio, we're going to sit right at that overlap. We're going to kind of be that place, that safe zone, where the architects and the engineers can really work together in harmonization to make both of their lives easier.
And we want to take some pain away. I mean, for an engineer, nothing's more frustrating than inheriting a pipeline and having to reverse engineer the business logic from variable names and half-remembered decisions. So having it all laid out there is just gonna help so much.
And for people who are using ER Studio in their overall development of their data architecture, physical data models, the integration with all of the actual storage layers and compute systems, what does a typical workflow look like as you go from maybe I have some physical tables, my data architecture is scattered across wiki pages and markdown documents and user code randomly, and now I need to actually get a proper view of what everything is and start working through that more deliberate architecture design and delivery workflow?
Yeah. So I think starting off with that core semantic model is gonna be the easiest place to start building that knowledge model. We use the term knowledge model a lot. ER Studio will help you with that process. We've got AI tools within ER Studio to help you create those models really quickly, combining those different models and sources together. So step one: define the structure and meaning of the organisation's information.
Step two: catalogue data assets against that model, and then you can start looking at the requirements. As I said, you'll be taking a data product approach, defining those data products as fragments of that overarching model. And then comes the design part of the warehouse: designing gold, silver and bronze layers, again using that overarching logical model as the framework, so semantically those layers are identical or at least similar. Then the job of building the pipelines comes in.
Yeah, Jamie, maybe you can speak a little more to this, just add on to that. I want to ensure that we also offer that flexibility. We understand not every company is going to come in from the ground floor at step one. They're going to have a lot of their systems in place already when they're looking to start working with ER Studio. So we're data agnostic. We work with all the major platforms,
Snowflake, Databricks, Microsoft, all the big players. And we also offer a lot of those features that, hey, if you have systems in place and you need to harmonize them, there is the opportunity to do the data engineering. There's import and export features
to really bring all that data into synchronization through ER Studio. So again, while it's ideal to get that semantic layer down first, get the logistic model or the logical models, I'm sorry, in place first, we completely understand that's not going to be the perfect situation for every organization.
We do offer a lot of those features to kind of get them up to speed where they need to be with importing, exporting, metadata bridges, tons of macros they can use for shortcuts to make their lives a lot easier in that aspect.
Yeah, that's a really good point. I mean, there's so many different forms that data can take and so many different technologies. ER Studio has been around a long time. We've got connectors to most things. So we could be reverse engineering your 30-year-old core banking application sitting on an ancient DB2 database. We could be pulling in the JSON structures,
creating physical models of all these things to try and understand them. So we take care of the sort of technology side of things. And likewise on the sort of design side of things, if we're designing the warehouse, any of the modern platforms, yep we support most of them. Once you've created your picture as a physical model, the tool will then generate the code and
create the data asset for you, and then there are change processes as new requirements come in. Then again, we've got a nice simple workflow where we change the logical model, understanding the information first. The tool then translates those changes to the physical model, we go through another approval gate, yep, that's what we want the new version to look like, and then the tool will generate change code, alter scripts, to update those platforms.
The tool does a lot of heavy lifting for the technologies itself, and we connect into things like Git repositories and suchlike to handle that change process.
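The change workflow described here, where an approved new version of a physical model is compared with the current one and alter scripts are produced, can be sketched roughly as follows. This is a simplified stand-in, not ER Studio's comparison engine. Tables are reduced to plain column-name-to-type dictionaries, and the `ALTER COLUMN ... TYPE` form follows PostgreSQL syntax; other platforms spell type changes differently.

```python
def alter_statements(table, old, new):
    """Compare two versions of a table's columns and return ALTER statements.

    old and new map column names to SQL types. The output is for review at an
    approval gate, not for blind execution.
    """
    statements = []
    for column, column_type in new.items():
        if column not in old:
            statements.append(f"ALTER TABLE {table} ADD COLUMN {column} {column_type};")
        elif old[column] != column_type:
            # PostgreSQL-style type change; syntax varies by platform.
            statements.append(
                f"ALTER TABLE {table} ALTER COLUMN {column} TYPE {column_type};"
            )
    for column in old:
        if column not in new:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {column};")
    return statements


if __name__ == "__main__":
    current = {"customer_id": "INTEGER", "name": "VARCHAR(100)", "fax": "VARCHAR(20)"}
    approved = {"customer_id": "INTEGER", "name": "VARCHAR(255)", "segment": "VARCHAR(50)"}
    for statement in alter_statements("gold.customer", current, approved):
        print(statement)
```

Because the script is generated as plain text, it can be committed to a Git repository and reviewed like any other change.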
And for people who are using ER Studio, they're developing their overall organizational architecture. What are some of the interesting ways that you've seen the implementation of ER Studio change some of the organizational dynamics or some interesting success stories that you've seen as a result?
You know, the main thing that we'd like people to come across with is, you know, we help to maintain those consistent definitions. You know, so for example, we offer a multi user repository where different teams can collaborate in real time with built in version control and role based access. So modelers can work in parallel without overwriting each other's changes,
maintaining a complete history while preventing bottlenecks and rework. This is going to ensure that all the agreed upon definitions are consistent, and that's going to come across all the governance aspects as well. We can even take it to another layer with our web based collaboration portal that we call TeamServer that really extends that consistency beyond just the architects. So now we have business and technical users that can explore those models, participate in reviews,
and even contribute through comments and discussion threads. So again, the definitions like, what does customer mean? What does revenue mean? They're not just documented. They're actually embedded in the architecture and visible across the entire company.
Yeah, and I think the most exciting output that our customers have seen is that kind of taking away the stress of that semantic entropy. So just having one consistent framework where everybody can agree what everything means, that relief that our customers see when that happens. That's probably the most exciting output from ER Studio.
Yeah, just a few, you know, like actual kind of concrete results. What are some of the quantifiable results we've seen with ER Studio? A lot of really good stuff. For example, we've had companies for their compliance reporting time, it's dropped by 85%. We've seen cataloging time drop by 80%. Productivity gains, we've seen that increase by 25%. A lot of powerful stuff.
How fast organizations can now scale by embedding ER Studio, we've seen that increase by fivefold with a lot of different organizations. We've seen a lot of companies reduce their errors by half just by utilizing our logical data models. Onboarding time has been decreased by 40%. Data quality, we've seen increased by 30%. So these are all quantifiable results that have been reported to us through various case studies with different organizations
working with ER Studio. So a lot of really cool success measures that aren't just hypothetical but are true and accurate.
And in your own experience of working in this space and building ER Studio, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
I think a lot of customers are surprised at how easy this is. It doesn't have to be massively hard. And I think when we talk through all this sort of prescriptive approach, it can sound like you've got to build this sort of enormous architecture before anything happens. I think it's just
focusing on the core things, defining the most important pieces of information, and customers have really been surprised how easy that process is and how effective it is. So it doesn't have to be some gargantuan process before you deliver any value.
I think one of the biggest obstacles that we also hear, or hesitations we hear, is this is going to slow us down if we try to implement a data modelling tool up front. So this kind of goes back to your earlier question, Tobias, where the fact is, like, it may feel that way at first, but the inverse is actually true. Modeling is going to reduce any rework. So any hesitations or a slow startup up front, that's not going to be the case moving forward. You're going to prevent late stage changes
that are going to be expensive and disruptive. So again, getting those definitions and semantics clear up front, moving forward, the engineers are going to move so much faster because they're not going to be making assumptions. They're not going to be having to do rework and fixing bugs. There's going to be fewer surprises involved. So I think that's one of the key takeaways and
real surprises that we hear from our clients is, wow, yeah, we were hesitant at first, but now that we've implemented ER Studio and a sound data architecture,
you know, we're flying through the work. And it will be slower on day one, but it's dramatically faster on day 100.
And for people who are looking at how they want to implement their modeling strategy, figure out how they want to manage that interplay between data architects and data engineers, what are the situations where you would advise against using ER Studio either for an overall team or for a specific project?
I can't really think of one. I mean, I think in the past, if humans are involved all the way through, you could possibly risk just creating pipelines, creating datasets, creating outputs. But I think AI has changed everything. If you attempt to connect AI to your data without understanding that semantic backbone, everything is going to come crashing down. So I think that time has gone. Again, as I said earlier, it's an inflection point for architecture. AI has changed everything.
Yeah, while we did mention we typically work with large organizations with a lot of different teams, a lot of different systems, you know, that might play into the fact like, okay, well, what about, you know, a smaller organization that only has
one architect or a handful of architects and engineers? Like Jamie said, we still feel it's vital to set that foundation in place and establish those logical data models. And just to be clear with ER Studio, we offer different levels of editions. So we're not going to break the bank. If you need a model and you want to get that foundation set up front, we offer professional and standard versions of our software as well. So again, as Jamie mentioned, we feel it's vital for every organization
that really relies on their data to really set up that beginning step with a logical data model.
Yeah, I think that's important. I think the majority of our customers are larger organizations where they've got a lot of diverse data sets and data around the organization. There's a lot of complexity. There's a lot of sensitivity around that data. That's our traditional customer base. I think for organisations that maybe have got a single CRM system, something that's very simple, very clearly defined,
maybe what we do is less important. Where governance is not an important thing, fine, less important. But as soon as there's any complexity or sensitivity of that data, then you really need to be doing what ER Studio does.
And as you continue to build and iterate on the product and keep an eye towards what's happening in the broader industry, both in terms of the data engineering tooling, as well as the impact of agentic use cases, what are some of the things you have planned for the near to medium term or projects that you're excited to explore?
Yeah. So I think that there's two streams. There's the engineering stream and the pipelines, and then there's that sort of meaning model that we're talking about. So we're spending a lot of our time on the meaning model. So as I mentioned earlier, being able to take that logical model, that overarching logical model, express it in a knowledge graph form that an AI can ingest. We're also providing mechanisms to document the warehouse and provide semantic layers for tools like Power BI. So again, the AI components of Power BI can then reason on that data and support natural language queries and signals from it. So, yeah, the meaning model is where we're focusing most of our effort at the moment.
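To make the idea of expressing a logical model in knowledge graph form concrete, here is a minimal sketch that flattens entities, definitions, and relationships into subject-predicate-object triples. It does not reflect ER Studio's export format: the `Entity` and `Relationship` classes, the predicate names, and the sample definitions are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Entity:
    """A logical-model entity with its agreed business definition (hypothetical shape)."""
    name: str
    definition: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A named, directed relationship between two entities (hypothetical shape)."""
    source: str
    verb: str
    target: str


def to_triples(entities: List[Entity], relationships: List[Relationship]) -> List[Triple]:
    """Flatten a logical model into subject-predicate-object triples."""
    triples: List[Triple] = []
    for entity in entities:
        triples.append((entity.name, "is_a", "Entity"))
        triples.append((entity.name, "has_definition", entity.definition))
        for attribute in entity.attributes:
            triples.append((entity.name, "has_attribute", attribute))
    for rel in relationships:
        triples.append((rel.source, rel.verb, rel.target))
    return triples


# Example model with illustrative definitions.
model_entities = [
    Entity("Customer", "A party that has placed at least one order.", ["customer_id", "name"]),
    Entity("Order", "A confirmed request to purchase goods.", ["order_id", "order_date"]),
]
model_relationships = [Relationship("Customer", "places", "Order")]
graph = to_triples(model_entities, model_relationships)
```

Triples like these could be loaded into a graph store or supplied to an AI system as context, so that a term such as "Customer" arrives with its agreed definition attached.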
And Jamie, I got to give you a little credit too. You know, a recent feature we've come out with, you know, in relation to AI is an AI data modeling assistant. So again, we're well aware that it's not going to be just data architects or engineers using data modeling, that, you know, that responsibility has grown to the data stewards and the business users and the end users. So we have recently implemented a new feature where you can just type in, in a standard text box, a prompt, and AI within seconds will generate a logical data model for you. So this really kind of opens the door to a lot of those users that don't have a lot of that technical background or expertise
in actual data modeling to really kind of expedite that process. So that's a really exciting feature that we recently released that I got to give my cohort here, Jamie, a little pat on the back. He's kind of the brains behind that feature release.
Thanks, Ryan. I mean, that's a good point. I mean, I've been talking mainly about using ER Studio to support AI programs, but of course, we're doing a lot of work building AI into the tool itself to make that process of building these models easier, quicker, more intuitive, doing a lot of that heavy lifting. There's no excuse.
All right. Are there any other aspects of the ER Studio product or this overall space of organizational data architecture that we didn't discuss yet that you'd like to cover before we close out the show?
You know, I think we've covered a lot of ground here. It's been a really fun, technical, useful discussion. But, you know, I think the main thing I'd like listeners to come away with is that most data problems are meaning problems. So if you get the meaning right, everything else is going to become easier. So when the definitions are clear and agreed upon, the engineering is going to move faster,
the governance is going to become simpler, and the analytics are going to become more reliable. So really, at the end of the day, we want our customers to equate ER Studio with data reliability, accuracy, and ultimately, meaning.
Yeah, I think my takeaways would be if your engineers are debugging definitions instead of building value, consider investing more in architecture. And then the other one is if you're considering connecting AI to your data, you've gotta take a good look at your architecture and ask yourself, is the structure and meaning of our data absolutely clear?
Alright. Well, for anybody who wants to get in touch with you both and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as the biggest gap in the tooling or technology that's available for data management today.
So I think it's being able to connect with AI. AI to support the process of doing architecture, and then how can architecture then support an AI program. And with ER Studio, we're solving both of those at the moment.
I almost see that the biggest gap is not a tool itself that's needed to be put in place. It's the belief and the mindset of people working in the organization that, again, if we want to use the analogy of data with building a home, you've got to start with a blueprint. You've got to start with a solid foundation before, you know, putting the woodwork in place, installing the windows and the doors. You need the requirements, you need the standards,
you need those definitions established upfront. And I think everyone's just, you know, these days really, you know, all they want is speed, speed, speed. Let's see the results ASAP, whereas you need to take the step back, get the blueprint in place, lay that foundation first, put that mindset into place. That's where I think we see the biggest gap, that it's the mindset that needs to change, not necessarily the tools, because the tools are out there, they're available.
Yeah, and I like the point you made, uniting those teams. All those teams need to be collaborating together on the job in hand. So, again, with ER Studio, we're trying to sort of connect all those different tools, and now all those different roles, to collaborate on the job.
Well, thank you both for taking the time today to join me and talk through some of these challenges around how to get everybody on the same page around what the data means and how it's being used. It's obviously one of the foundational problems in this space, so I appreciate the time and energy that you're putting into helping teams come to grips with that. I hope you enjoy the rest of your day. Thanks for having us. It's been a pleasure. Thanks for your time.
Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. And the AI engineering podcast is your guide to the fast moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com
with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
