The Analytics Setup Guidebook

Speaker 1

00:00

All right, so we've got a ton of information here about how to actually set up an analytics system.

Speaker 2

00:04

Yeah, it seems like you're really looking to get into the modern best practices for this.

Speaker 1

00:08

Absolutely, and I think this is going to be really fascinating because we've got some excerpts from The Analytics Setup Guidebook by Holistics.

Speaker 2

00:17

Yeah, and Holistics, they've got years of experience helping companies build their business intelligence capabilities, so they definitely know what they're talking about.

Speaker 1

00:26

Yeah, they know their stuff. So what we're going to do is give you a really clear roadmap to building a modern and scalable analytics system.

Speaker 2

00:34

And hopefully cut through a lot of the jargon that's out there, and really highlight some of the most fascinating bits from this guidebook.

Speaker 1

00:43

Now, they start out by emphasizing that any analytics system, no matter how complex, it ultimately boils down to three key actions.

Speaker 2

00:51

Yeah, what they call the core trio, so loading the data, transforming the data, and then actually using the data.

Speaker 1

00:57

Okay, I like that, the core trio.

Speaker 2

01:00

Yeah, it's a simple framework, but it's really powerful just because it helps you kind of break down, you know, this whole complex system into like manageable steps.

Speaker 1

01:10

Yeah, for sure. So step one then is loading data. Seems pretty straightforward, right, just get all your data into one central location.

Speaker 2

01:18

Yeah, in principle, but in reality it can quickly become a logistical nightmare. You know, think about all the different sources you might have. Like, you've got data in your app, you've got your CRM, all your marketing platforms, and then, you know, maybe even those dreaded spreadsheets lurking in the shadows.

Speaker 1

01:35

Yeah.

Speaker 2

01:37

So consolidating all of that into a central repository, it's no small feat.

Speaker 1

01:42

No, it's not. This makes me think of, like, herding cats. You know what I mean?

Speaker 2

01:46

It really is, yeah, except the cats are data points from a dozen different systems.

Speaker 1

01:51

Yeah, exactly.

Speaker 2

01:52

So this is where this concept of a data warehouse comes in. It's basically, you know, it's a massive centralized storage system that's really specifically designed for housing and managing all of your organization's data.

Speaker 1

02:04

Okay, so that's our central hub, the data warehouse.

Speaker 2

02:07

Yeah.

Speaker 1

02:08

But you know, the guidebook mentions that some early stage companies might try to skip this step. Is that ever a good idea?

Speaker 2

02:16

It's a delicate balance. You know, if you're truly early stage, maybe you just have one data source not a lot of traffic, you might be able to get away with querying directly from your production database for a short period of time.

Speaker 1

02:30

I mean, but that just seems like that would put such a strain on your system, potentially even affecting your users.

Speaker 2

02:36

Exactly. And that's one of the big risks, right? It's like trying to run a marathon while carrying a heavy backpack. You know, you might make it to the finish line, but it's going to be slow and painful, and you run the risk of tripping and injuring yourself, which, in this analogy, would be equivalent to causing downtime for your application.

Speaker 1

02:55

Right, right. No, that makes perfect sense. And what about companies that are using, you know, NoSQL databases like MongoDB? Are those suitable for analytics?

Speaker 2

03:04

Generally, no. NoSQL databases, they're great for handling specific types of data and workloads, but they're not really optimized for the complex queries that are essential for business intelligence. So trying to perform in-depth analytics on a NoSQL database, it's kind of like trying to write a novel on a typewriter that's designed for short memos.

Speaker 1

03:27

Okay, so even for startups that are just starting out, you're advocating for them to move towards you know, this dedicated analytics setup as soon as possible.

Speaker 2

03:35

Yeah. Absolutely. I mean there is this you know, dump and load method which is basically just exporting data to like local files, you know, to analyze it. And that might work in the very very early stages, but it quickly becomes unsustainable, Okay. It just it lacks the structure and the scalability and automation that you get with a proper analytics system.

Speaker 1

03:55

Okay, so we've established that this data warehouse is essential, but how do we actually get all of our data into it? That seems like a really big task.

Speaker 2

04:04

Yeah, for sure. And that's where data consolidation comes in. And this is you know, a critical part of building a modern analytics stack. And the guidebook talks about a key shift in thinking here, moving from this traditional ETL approach to a more modern ELT approach.

Speaker 1

04:19

Okay, so ETL versus ELT. What's the difference, and why is this shift so important?

Speaker 2

04:24

Right. So, ETL stands for extract, transform, load, and that's the traditional method where you extract data from all your various sources, you transform it into a usable format outside of the data warehouse, and then finally you load it into the warehouse.

Speaker 1

04:39

Okay, so it's like before you put it into storage, you're meticulously cleaning and organizing everything.

Speaker 2

04:44

Yeah, exactly. But this approach, it has some significant drawbacks. You know, as the volume and variety of data increases, transforming everything before loading it into the warehouse can really create a massive bottleneck.

Speaker 1

04:56

So it's like having a single narrow doorway into our data warehouse.

Speaker 2

05:00

Great way to put it, yeah, and it really slows everything down. So ELT, which stands for extract, load, transform, basically flips the script: you extract the data, you load it into the data warehouse in its raw, unprocessed format, and then you transform it within the warehouse itself.
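Editor's sketch: the extract, load, transform order described here can be illustrated in a few lines of Python. This is only a minimal sketch under stated assumptions: an in-memory SQLite database stands in for a cloud data warehouse, and the table names (`raw_signups`, `signups_clean`) and the sample rows are made up for illustration, not taken from the guidebook.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a cloud data warehouse.
warehouse = sqlite3.connect(":memory:")

# Extract: rows as they might arrive from a source system (made-up data).
raw_signups = [
    ("  Alice@Example.com ", "2024-01-05T09:30:00"),
    ("bob@example.com", "2024-01-05T17:45:00"),
    ("CAROL@example.com", "2024-01-06T08:00:00"),
]

# Load: store the rows untouched in a raw table (hypothetical name).
warehouse.execute("CREATE TABLE raw_signups (email TEXT, signed_up_at TEXT)")
warehouse.executemany("INSERT INTO raw_signups VALUES (?, ?)", raw_signups)

# Transform: clean the data with SQL inside the warehouse itself.
warehouse.execute(
    """
    CREATE TABLE signups_clean AS
    SELECT lower(trim(email)) AS email,
           date(signed_up_at) AS signup_date
    FROM raw_signups
    """
)

clean_rows = warehouse.execute(
    "SELECT email, signup_date FROM signups_clean ORDER BY email"
).fetchall()
raw_count = warehouse.execute("SELECT COUNT(*) FROM raw_signups").fetchone()[0]
print(clean_rows)
```

Because the raw table is kept as loaded, the cleaning step can be rewritten and rerun later without going back to the source system.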

Speaker 1

05:19

So we're dumping everything into the warehouse first and then organizing later. It just seems so counterintuitive.

Speaker 2

05:26

Yeah, I know it seems that way, right.

Speaker 1

05:28

But what are the benefits of this approach.

Speaker 2

05:30

Well, there are several. First, it eliminates that bottleneck that we talked about, the one that's caused by transforming everything outside the warehouse. And cloud data warehouses are incredibly powerful these days, you know, so you can actually leverage that processing power for the transformation.

Speaker 1

05:46

It's like using one of those, you know, industrial sized vacuum cleaners versus like a tiny handheld vacuum cleaner.

Speaker 2

05:53

Exactly. Yeah, it's a much more efficient use of resources.

Speaker 1

05:55

Okay.

Speaker 2

05:56

Secondly, ELT allows for a more agile, dump first, transform later approach, so you can load all your raw data into the warehouse without having to, like, meticulously define every transformation upfront.

Speaker 1

06:09

Okay, so you're saying that's particularly valuable if your company's data needs are still evolving.

Speaker 2

06:15

Exactly. Yeah, if you're not entirely sure what insights you're looking for yet, you can just get all the data in there and then worry about the transformation later.

Speaker 1

06:22

Okay, that makes sense.

Speaker 2

06:24

And then here's another key advantage. ELT empowers data analysts to actually take ownership of the transformation process.

Speaker 1

06:33

Wait, so the analysts are actually writing the transformations themselves. I always thought that was like a data engineer's job.

Speaker 2

06:39

Yeah, traditionally it has been, right, but with ELT, analysts can actually use SQL okay, which is the language they're already familiar with to define and manage those transformations within the data warehouse.

Speaker 1

06:51

So it frees them up. They don't have to rely on the data engineers as much.

Speaker 2

06:54

Exactly. It reduces their dependency on the engineers, and then they can be much more agile and responsive to you know, the changing business needs.

Speaker 1

07:03

Yeah, I mean that must be incredibly empowering for them.

Speaker 2

07:07

Yeah.

Speaker 1

07:07

Now, before we move on from this section, I know data lakes are a part of this modern data ecosystem, but I'm not exactly sure how they fit into this whole ELT picture.

Speaker 2

07:19

Right. So, a data lake is basically a vast, unstructured repository for all of your raw data, regardless of its format. Think of it as a massive data holding tank. So you can dump everything into the data lake, and then you selectively pull out the specific data that you need for analysis using ELT.

Speaker 1

07:38

Okay, so we can dump everything into the data lake and then pull out what we need when we're ready to actually analyze it.

Speaker 2

07:44

Exactly. Yeah, It's kind of like having a giant pantry where you store all of your ingredients and then you pull out the specific ones you need when you're ready to cook a particular dish.

Speaker 1

07:53

Okay, all right, I like that analogy. So now that we've got you know, our data into the warehouse, what happens next.

Speaker 2

08:00

Well, now it's time for step two of the core trio, which is transforming that raw data into something we can actually use to gain insights.

Speaker 1

08:07

So we're molding it to fit our needs.

Speaker 2

08:09

Yeah, exactly, And this is where we start shaping and molding the data to fit our specific business needs. And you know, there are a lot of potential benefits to doing good data transformations. It's not just about making the data look pretty. It's about ensuring consistency, making it easier to analyze, and even potentially saving money.

Speaker 1

08:29

Right, Okay, so let's dive into those benefits a little bit more. What are some of the key advantages of really well designed data transformations.

Speaker 2

08:40

Well, one of the key benefits is that it helps ensure consistency across the entire organization. Okay, so by creating standardized definitions and calculations, you can avoid this dreaded thing called metric drift.

Speaker 1

08:53

Well, metric drift, yeah, I've heard of that.

Speaker 2

08:54

Which is basically where different departments end up calculating the same thing in different ways. Leading to confusion and inaccurate reporting.

Speaker 1

09:01

Right, so we want to make sure that, like, everyone's speaking the same language.

Speaker 2

09:05

Exactly. Yeah, like everyone in the organization is using the same dictionary. So when we talk about key metrics, we all know what we're talking about, right? We're all on the same page. Another advantage is reusability. So by defining transformations as, like, modular components, you can easily reuse them across different analyses and reports, which saves you a lot of time and effort. But it also ensures that all your insights are based on the same underlying logic.

Speaker 1

09:32

So we're not reinventing the wheel every time we need to pull a new report or something. Yeah, okay.

Speaker 2

09:37

And that brings us to the third major benefit, which is performance and cost effectiveness.

Speaker 1

09:43

Okay.

Speaker 2

09:43

You know, well-designed transformations can optimize your queries, make them run faster and more efficiently, which can translate into significant cost savings, especially when you're dealing with large volumes of data in a cloud data warehouse.

Speaker 1

09:57

Right, where you're paying for compute time. Okay.

Speaker 2

10:00

Exactly. So yeah, good data transformations, they're like building a solid foundation for a house. You know, it might not be the most glamorous part, but it's essential for stability, efficiency, and long-term success.

Speaker 1

10:13

Okay, I like that. So it's like the foundation. So what does a data transformation look like in practice?

Speaker 2

10:19

Yeah, So to give you a practical sense of what this looks like, the guidebook actually walks us through a really basic example of using SQL to transform raw booking data into a daily summary. And it's really quite elegant in its simplicity.
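Editor's sketch: the guidebook's actual SQL is not quoted in this conversation, so the following is only an illustration of the kind of transformation being described, turning raw booking rows into a daily summary. The table `raw_bookings`, its columns, and the sample rows are assumptions, and SQLite is used as a stand-in warehouse.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the warehouse.
conn = sqlite3.connect(":memory:")

# Hypothetical raw bookings table with one row per booking.
conn.execute("CREATE TABLE raw_bookings (id INTEGER, booked_at TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_bookings VALUES (?, ?, ?)",
    [
        (1, "2024-03-01 09:15:00", 120.0),
        (2, "2024-03-01 14:40:00", 80.0),
        (3, "2024-03-02 11:05:00", 200.0),
    ],
)

# Transformation: collapse individual bookings into one row per day.
daily_summary = conn.execute(
    """
    SELECT date(booked_at) AS booking_date,
           COUNT(*)        AS total_bookings,
           SUM(amount)     AS total_revenue
    FROM raw_bookings
    GROUP BY date(booked_at)
    ORDER BY booking_date
    """
).fetchall()

for booking_date, total_bookings, total_revenue in daily_summary:
    print(booking_date, total_bookings, total_revenue)
```

The whole transformation is a single `GROUP BY` query, which is the sense in which an analyst with some SQL knowledge can own this step.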

Speaker 1

10:35

Okay.

Speaker 2

10:35

It really demystifies that process of data transformation and shows you how with a little bit of SQL knowledge, analysts can really take ownership of this crucial step.

Speaker 1

10:45

Yeah, it's really accessible. Yeah. So okay, so we've loaded the data, we've transformed it. What is the final step in this core trio?

Speaker 2

10:52

Right. So the final step, of course, is actually using that data.

Speaker 1

10:56

It's all about using it, getting value out of it.

Speaker 2

10:59

Exactly. And the guidebook refers to this as data servicing.

Speaker 1

11:02

Data servicing, okay.

Speaker 2

11:04

And they talk about how the role of data analysts has evolved significantly in recent years.

Speaker 1

11:08

You know, I was just thinking about that the other day, like, how much that has changed.

Speaker 2

11:14

Yeah, it's really fascinating how this role has evolved, and they actually frame it as a tale of three jobs.

Speaker 1

11:21

A tale of three jobs.

Speaker 2

11:22

Okay, highlighting you know, the journey of data analysts from like report monkeys to self service enablers.

Speaker 1

11:29

Okay, I definitely want to hear more about this.

Speaker 2

11:31

Yeah, it's a really interesting story. So in the early days of BI, you know, data analysts were often relegated to the role of report monkeys.

Speaker 1

11:38

Okay.

Speaker 2

11:39

They basically spent most of their time manually generating reports for decision makers.

Speaker 1

11:44

Okay.

Speaker 2

11:45

It was really tedious and reactive, and they often felt like they were just churning out spreadsheets okay, and not really having any opportunity to actually analyze the data.

Speaker 1

11:54

Right, They're just you know, putting together spreadsheets.

Speaker 2

11:56

Exactly, And it was a recipe for burnout.

Speaker 1

11:59

Oh, absolutely, I can imagine.

Speaker 2

12:01

But then came the advent of self service BI tools like Tableau, oh, Tableau, which promised to liberate analysts from you know, the shackles of manual reporting and empower business users to actually explore the data themselves.

Speaker 1

12:15

Okay, so they're empowered. That sounds like a really good thing. Freedom.

Speaker 2

12:19

It certainly seemed that way initially, but as the guidebook points out, this shift to self-service BI, while empowering in many ways, also introduced some new challenges.

Speaker 1

12:30

Okay, so there's a but coming.

Speaker 2

12:31

Yeah.

Speaker 1

12:31

What were some of the downsides of this self service revolution?

Speaker 2

12:34

Well, one of the biggest issues was the rise of metric drift.

Speaker 1

12:38

Okay.

Speaker 2

12:39

With no centralized definition or governance around key metrics, you know, different departments started calculating things in slightly different ways, leading to inconsistencies and making it difficult to really get a clear picture of what was actually happening.

Speaker 1

12:56

Right, so again, like everyone speaking a different language.

Speaker 2

12:59

Exactly. Yeah, it's like everyone having their own version of the truth, right, which doesn't sound like a recipe for good decision making.

Speaker 1

13:05

Oh no, it doesn't. So is there something else? Where do we go from there?

Speaker 2

13:09

Right? So that's where the third stage in the evolution of the data analyst comes in. And this is where you know, tools like Looker and Holistics enter the scene with their emphasis on data modeling layers and a more balanced approach to self service.

Speaker 1

13:26

Okay, so data modeling layers, I think you've mentioned this before. Can you just explain what those are and why they're so important?

Speaker 2

13:31

Right. A data modeling layer is basically an abstraction layer that sits on top of your raw data, and it allows you to define business logic, calculations, relationships, and standardized definitions for all your key metrics in a centralized location.
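Editor's sketch: one very small way to picture "definitions in a centralized location" is a single dictionary of metric expressions that every report reuses. This is not how Looker or Holistics implement their modeling layers; the `METRICS` dictionary, the `build_metric_query` helper, and the `orders` table are all hypothetical names made up for illustration.

```python
import sqlite3

# Hypothetical central definitions: each metric is defined exactly once.
METRICS = {
    "total_revenue": "SUM(amount)",
    "order_count": "COUNT(*)",
    "average_order_value": "SUM(amount) * 1.0 / COUNT(*)",
}


def build_metric_query(metric: str, group_by: str) -> str:
    """Build a report query from a shared metric definition.

    group_by must be a trusted column name; it is interpolated directly.
    """
    expression = METRICS[metric]
    return (
        f"SELECT {group_by}, {expression} AS {metric} "
        f"FROM orders GROUP BY {group_by} ORDER BY {group_by}"
    )


# Assumption: in-memory SQLite with a made-up orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, channel TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("EU", "web", 100.0), ("EU", "store", 50.0), ("US", "web", 70.0)],
)

# Two different reports reuse the same definition of total_revenue.
revenue_by_region = conn.execute(
    build_metric_query("total_revenue", "region")
).fetchall()
revenue_by_channel = conn.execute(
    build_metric_query("total_revenue", "channel")
).fetchall()
print(revenue_by_region, revenue_by_channel)
```

If the definition of revenue changes, it changes in one place and both reports pick it up, which is the protection against metric drift discussed in this conversation.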

Speaker 1

13:47

So rather than just having, like, the business logic baked into every individual report or dashboard.

Speaker 2

13:54

Exactly. Yeah, you're basically creating a central repository of all those definitions and calculations that can be reused across the organization.

Speaker 1

14:01

It's like a blueprint for your data.

Speaker 2

14:02

Exactly, So everyone is working from the same set of plans. And this not only helps to prevent metric drift, but it also makes it much easier for business users to actually explore the data independently without you know, getting lost in the technical complexities.

Speaker 1

14:18

Yeah, you know, and there's this great anecdote in the book about a CEO who needed specific data and how a data modeling layer empowered them to be able to get that data themselves without having to go through the data team.

Speaker 2

14:33

Yeah, that's a great anecdote. It illustrates how this can really bridge the gap, you know, yeah, between technical and non technical users and make data accessible to everyone. You know, so instead of having to rely on the data team for every single request, business users can actually explore the data themselves, okay, armed with the confidence that they're working with accurate and consistent definitions.

Speaker 1

14:55

Yeah, it seems like a win-win. The data analysts don't have to just focus on those mundane reporting tasks, and business users are able to kind of answer their own questions.

Speaker 2

15:06

Exactly. And you know, the guidebook highlights the key benefits of this approach: increased self-service analytics, more efficient use of your data team's resources, and a well-documented and consistent layer of data knowledge that everyone can access.

Speaker 1

15:23

Okay, so data modeling layers, a very important piece of the puzzle for sure. So let's talk about data modeling itself a little bit more. You sent me some really interesting stuff on Kimball's dimensional data modeling, which I think is a very important framework in this field.

Speaker 2

15:39

Absolutely. Kimball's dimensional data modeling, it's a classic for a reason. You know, even in today's cloud-first world, the core concepts are incredibly relevant.

Speaker 1

15:49

Okay.

Speaker 2

15:50

It's like learning the fundamentals of music theory. You know, you can always add your own flair later on, but those basics, they're essential.

Speaker 1

16:00

Right. So what are some of those core concepts, those essentials?

Speaker 2

16:01

Well, one of the core concepts is this idea of a data model itself. So a data model is essentially an abstract representation of your business data.

Speaker 1

16:11

Okay.

Speaker 2

16:11

It defines the entities, the attributes, the relationships that are important for your analysis. It's kind of like creating a map, you know.

Speaker 1

16:19

Okay, so it's like our map of our data. Okay, so what else do we need to navigate this terrain?

Speaker 2

16:25

Well, another key concept is what's called relationship mapping.

Speaker 1

16:28

Okay.

Speaker 2

16:28

So, just like in a relational database, you need to define how different data models relate to each other. So, for example, a customer model might have a relationship with an orders model okay, right, which allows you to analyze customer purchasing behavior.

Speaker 1

16:43

So it's like connecting the dots between different data points.

Speaker 2

16:45

Exactly, yeah, to reveal that bigger picture.

Speaker 1

16:48

Okay.

Speaker 2

16:49

Then there's what's called custom field logic, which allows you to create calculated fields within your models. For example, you might define total revenue as the sum of all sales transactions.

Speaker 1

17:03

Right, so we're going beyond the raw data. We're adding calculations to it.

Speaker 2

17:05

Exactly. Yeah, you're adding layers of meaning and calculations to make it more relevant to your specific business needs. And then finally, there's this idea of models built on top of other models. And this allows you to create increasingly sophisticated and nuanced representations of your data.

Speaker 1

17:26

Okay.

Speaker 2

17:26

So, for example, you could combine a customer model with a product model okay, and create what's called a customer segmentation model okay, right, which allows you to group customers based on their purchasing patterns.

Speaker 1

17:37

So we're building like a hierarchy of models.

Speaker 2

17:40

Exactly, Yeah, each one adding more you know, granularity and insight.

Speaker 1

17:44

Okay. So, and this brings us to the star of the show, which I think you mentioned earlier, the Star schema. Yes, the Star schema, So tell me about that.

Speaker 2

17:54

It's a classic for a reason, okay. And it's it's basically a specific way of organizing your data for analysis that makes it incredibly efficient for querying and reporting. It's called a star schema because visually it kind of resembles a star. You have a central fact table and that's surrounded by multiple dimension tables.

Speaker 1

18:14

Okay. So the fact table is the heart of the star exactly, and then the dimensions are like the points radiating outward, like what goes into each of those?

Speaker 2

18:22

Right? So the fact table contains the core metrics you want to analyze, okay. Often these are numerical values like sales figures or website visits, you know, customer interactions. It's like the what of your analysis okay, the what right, and the dimension tables provide the context okay, and descriptive information about those facts okay, you know, like the who, what, where, when and why of your data.

Speaker 1

18:49

Okay. So, for example, if we're looking at, you know, sales data, our fact table might contain, like, date, product ID, quantity sold, price. And then our dimension tables would be things like the customer, the product, you know, the location, and things like that.

Speaker 2

19:07

And by linking these tables together, you can start to answer those, you know, complex questions about your data, like which products are selling best in which regions? Or what are the demographics of our highest spending customers.
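Editor's sketch: the sales example from this exchange can be laid out as a tiny star schema, with one fact table joined to two dimension tables to answer "which products are selling best in which regions". The table names (`fact_sales`, `dim_product`, `dim_location`), columns, and rows below are made up for illustration, and SQLite stands in for the warehouse.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the warehouse.
conn = sqlite3.connect(":memory:")

# Dimension tables: descriptive context about products and locations.
conn.execute("CREATE TABLE dim_product (product_id INTEGER, product_name TEXT)")
conn.execute("CREATE TABLE dim_location (location_id INTEGER, region TEXT)")
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Espresso"), (2, "Latte")])
conn.executemany("INSERT INTO dim_location VALUES (?, ?)",
                 [(10, "North"), (20, "South")])

# Fact table: one row per product sold per location per day (the grain).
conn.execute(
    "CREATE TABLE fact_sales (sale_date TEXT, product_id INTEGER, "
    "location_id INTEGER, quantity_sold INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
    [
        ("2024-05-01", 1, 10, 3, 2.5),
        ("2024-05-01", 2, 10, 1, 4.0),
        ("2024-05-02", 1, 20, 2, 2.5),
        ("2024-05-02", 2, 20, 5, 4.0),
    ],
)

# Join facts to dimensions: revenue per product within each region.
best_sellers = conn.execute(
    """
    SELECT l.region, p.product_name,
           SUM(f.quantity_sold * f.price) AS revenue
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_location AS l ON l.location_id = f.location_id
    GROUP BY l.region, p.product_name
    ORDER BY l.region, revenue DESC
    """
).fetchall()
print(best_sellers)
```

The fact table only holds keys and numbers; everything descriptive lives in the dimensions, so adding a new way to slice the data usually means adding or extending a dimension rather than reshaping the facts.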

Speaker 1

19:19

So we're adding layers of detail.

Speaker 2

19:22

Exactly. Yeah. And the guidebook does a great job of, you know, walking us through a practical example of designing a star schema using the case of modeling data from a point-of-sale system, and they emphasize the importance of choosing the right grain, which refers to the level of detail that's captured in that fact table. And getting this right, it's really crucial for efficient and accurate analysis.

Speaker 1

19:49

Yeah, because if you get it wrong, it can lead to all kinds of problems, right, like performance issues, data redundancy, and even inaccurate results.

Speaker 2

19:58

Exactly. Yeah. It's like, you know, trying to build a house with bricks that are too big or too small. It's just not going to work well.

Speaker 1

20:04

Okay. So, like they use the analogy of tracking website traffic, right, and if your grain is too coarse, you might only be tracking visits at you know, the website level, but you're missing details about like the page views right exactly. But if it's too fine, you're tracking every single click and mouse movement.

Speaker 2

20:26

Yeah, and you end up with this massive, unwieldy data set.

Speaker 1

20:30

Okay, so it's really important to find that sweet spot, so you get the information you need without overwhelming yourself with data you don't need. Right. Now, you know, I know the world of data warehousing has changed a lot since Kimball first introduced this, I mean, especially with cloud data warehouses and their power and affordability.

Speaker 2

20:49

Absolutely, and the guidebook emphasizes that, you know, while Kimball's principles are still really relevant, some of those specific techniques that he talked about can be adapted or even bypassed in the context of a modern cloud data warehouse.

Speaker 1

21:04

Okay, so there are some things you can skip. You know, like they gave the example of inventory management, and traditionally, if you wanted to track inventory levels over time, you'd need these complex snapshot fact tables. But with a modern data warehouse, a lot of the time you can just work with the raw data.

Speaker 2

21:21

Exactly. Yeah, and that's because of the sheer processing power and storage capacity of these modern cloud data warehouses. It allows us to be much more agile and iterative in our approach to data modeling. You know, we can load the raw data first and then experiment with different modeling techniques within the warehouse itself.

Speaker 1

21:42

Okay, so you don't have to, like, figure everything out upfront, you can just experiment.

Speaker 2

21:45

Exactly, yeah, it's much more flexible.

Speaker 1

21:47

I see, okay. And they also talked about how features like table partitioning can simplify how you handle slowly changing dimensions, or SCDs.

Speaker 2

21:57

Yeah, for sure.

Speaker 1

21:58

Which used to require, you know, these complex modeling techniques.

Speaker 2

22:00

Right, right. So table partitioning is basically a really powerful technique. It allows you to divide a large table into smaller, more manageable chunks based on criteria like date ranges. And this can really improve query performance and make it a lot easier to manage that historical data.

Speaker 1

22:21

So instead of having, like, this one massive table for all of our customer data, we might partition it by year. So if we need data from a specific year, it's much faster to get to.

Speaker 2

22:31

Yeah, precisely. And then when it comes to SCDs, which track changes to dimension attributes over time, table partitioning can actually be used to store different versions of a dimension in separate partitions, so you can easily track those historical changes without having to resort to those complex modeling techniques.

Speaker 1

22:54

So I mean, with these cloud data warehouses, it seems like there's just so much more flexibility.

Speaker 2

22:58

Yeah, absolutely, we're no longer limited by those constraints of the old traditional on premise data warehouses. So yeah, we can experiment more freely, iterate more quickly, and adapt our models as our business needs evolve.

Speaker 1

23:10

Okay. And you know, they have a really great case study about their own company, Holistics, and they talk about how they implemented a snowcloud analytics system, right, and how they took this very iterative approach to data modeling.

Speaker 2

23:28

It's a great example of putting theory into practice. You know, they didn't try to model everything up front. They just started with a basic model and then iteratively refined it based on how people were actually using the data.

Speaker 1

23:41

So the key takeaway is, let usage guide your modeling decisions. You know, don't try to over-engineer things from the start. Start simple and see how people are interacting with the data, and then adjust accordingly.

Speaker 2

23:53

And they highlight some key principles that guided their process, like embedding business logic in the data models themselves rather than in individual queries, and aiming for self-serve analytics by really empowering those business users to explore the data independently.

Speaker 1

24:10

So we're creating a system that's both, you know, robust and flexible. It ensures consistency, but also allows for exploration.

Speaker 2

24:17

Yeah, finding that balance.

Speaker 1

24:19

Okay. And this brings us to this fascinating concept of the arc of adoption, right, which describes the typical stages of data usage evolution within an organization.

Speaker 2

24:30

Yeah.

Speaker 1

24:30

It's a really great framework for understanding, like, how data-driven thinking takes root and matures.

Speaker 2

24:36

Exactly, and understanding where your organization sits on this arc can help you make much more informed decisions.

Speaker 1

24:42

Okay, So tell me about this arc of adoption. I mean, you know, we've already talked about the fact that, like in those early stages, there's this heavy reliance on you know, spreadsheets and ad hoc analysis, right, So, what are some of the telltale signs that a company's in this phase and what are some of the challenges they might face.

Speaker 2

24:59

Well, in this initial phase, the data is often scattered across various sources, and there's this heavy reliance on manual processes to extract, you know, basic insights. Think about those late nights you spend cobbling together reports

25:13

in Excel. Oh yeah, I've been there. Trying to make sense of data from different departments and systems, and you know, it's functional, but it's so time consuming and prone to errors, and it just doesn't scale well as your organization grows.

Speaker 1

25:28

Like trying to build a skyscraper with hand tools.

Speaker 2

25:30

Exactly. Yeah, you might be able to lay a few bricks, but you're not going to get very far.

Speaker 1

25:32

Right, right.

Speaker 2

25:34

And the guidebook it highlights some of the key challenges of this stage, like data inconsistency, lack of standardization. You know, it's really difficult to collaborate, and there's a very limited ability to answer those more complex business questions.

Speaker 1

25:49

It's a lot of frustration from both you know, the people putting together the reports and then the decision makers who are trying to use that information.

Speaker 2

25:56

For sure, and this is often the point where companies realize, okay, we need a more robust and scalable solution, and they begin to invest in BI tools, which kind of moves them to that second stage of the arc of adoption, which is characterized by those, you know, static reports and dashboards.

26:15

So this is where we start to see, you know, those colorful charts and graphs that executives love, right? And dashboards, they provide this more centralized and visual way to track those key metrics, so it makes it easier to monitor progress and identify trends, and BI tools play a much more prominent role at this stage.

Speaker 1

26:39

Right, they start automating those reporting processes.

Speaker 2

26:42

Exactly. Yeah, and they make data more accessible to a wider audience.

Speaker 1

26:46

Okay, so it's like, you know, we've gone from that hand drawn map to the GPS.

Speaker 2

26:51

That's a great way to put it.

Speaker 1

26:52

Yeah, we get a better picture of, you know, where we are and where we're going. But we're still kind of on a predetermined route.

Speaker 2

26:58

Right. And dashboards, you know, they provide valuable insights, but they don't necessarily empower users to explore the data freely or ask those what-if questions that often lead to those really groundbreaking discoveries.

Speaker 1

27:13

And that brings us to, like, you know, the holy grail of data maturity, right, self service analytics, exactly, the third stage of this arc of adoption. And this is where our users have the ability to access, analyze, and visualize data independently without having to go to the data team for every single request, exactly.

Speaker 2

27:35

Self service analytics really represents this shift from a centralized, IT-driven model to a much more decentralized, user-driven approach. It's like, you know, having a fully interactive map: you can zoom in and out, explore different routes, discover, like, those hidden gems, you know, and even create your own personalized maps.

Speaker 1

27:57

So incredibly empowering for those users. Yeah, but I know that the guidebook cautions that just giving people BI tools isn't enough, no, not at all, to really, you know, experience self service analytics.

Speaker 2

28:08

Yeah, simply giving everyone access to data without the right foundation, it's it's a recipe for disaster. Okay, you know, it's it's like giving someone like a really powerful sports car, okay, without any driving lessons or you know, knowledge of traffic laws.

Speaker 1

28:24

Right, you know, they might be able to get it moving, but it's probably going to end in a crash.

Speaker 2

28:28

Yeah, that's a that's a scary thought. So what are those essential ingredients?

Speaker 1

28:32

Yeah, so the guidebook they highlight three key pillars data literacy, data governance, and a robust data modeling layer.

Speaker 2

28:40

Okay, let's break those down a little bit.

Speaker 1

28:42

Yeah, so data literacy basically means that users need to understand how to interpret and work with data, how to ask the right questions okay, and then how to draw you know, meaningful conclusions from the insights they discovered. Okay.

Speaker 2

28:57

So it's not just about having access to the data, it's knowing what to do with it.

Speaker 1

29:01

Yeah, exactly. It's about being able to actually speak the language of data. Data governance ensures that the data is accurate, consistent, and trustworthy.

Speaker 2

29:13

Okay.

Speaker 1

29:13

It involves establishing processes for data quality management, defining those clear roles and responsibilities for data stewardship, and then implementing those security measures to protect sensitive information. So we're making sure everyone's using the data responsibly, exactly.

Speaker 2

29:29

Yeah, it's about creating that framework that ensures everyone is using the data responsibly and ethically.
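To make the data quality side of governance a bit more concrete, here is a minimal Python sketch of the kind of automated check a data team might run. It is not taken from the guidebook: the `check_quality` function, the field names, and the sample customer records are all hypothetical, assumed purely for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical record shape: a dict of field name -> value (None means missing).
Record = Dict[str, Optional[object]]


def check_quality(records: List[Record], key: str, required: List[str]) -> List[str]:
    """Return human-readable problems: missing required fields and duplicate keys."""
    problems: List[str] = []
    seen = set()
    for i, rec in enumerate(records):
        # Rule 1 (assumed): every required field must be present and non-null.
        for field in required:
            if rec.get(field) is None:
                problems.append(f"row {i}: missing {field}")
        # Rule 2 (assumed): the key field must be unique across records.
        k = rec.get(key)
        if k is not None:
            if k in seen:
                problems.append(f"row {i}: duplicate {key}={k}")
            seen.add(k)
    return problems


# Made-up sample data, for illustration only.
customers = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 1, "email": "c@example.com"},
]

report = check_quality(customers, key="customer_id", required=["customer_id", "email"])
```

A report like this could be reviewed by whoever is responsible for that dataset before it feeds any dashboards; real governance also covers ownership and security, which this sketch does not attempt to model.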

Speaker 1

29:35

Okay. And then finally, the data modeling layer, which I think we've talked about a lot already, but that's the foundation really that makes this all possible.

Speaker 2

29:43

Yeah. Absolutely. And by defining those business logics and those calculations, you know, those relationships in that centralized location, we're creating this single source of truth that everyone in the organization can actually trust.
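As a rough illustration of what defining business logic in one centralized location can look like, here is a minimal Python sketch. It is an assumption-laden toy, not the guidebook's method or any BI tool's API: the `Metric` class, the `METRICS` registry, the `evaluate` helper, the refund rule, and the sample orders are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]  # hypothetical shape of one order record


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[List[Row]], float]


def _net_revenue(rows: List[Row]) -> float:
    # Business rule (assumed for illustration): revenue excludes refunds.
    return sum(r["amount"] - r["refunded"] for r in rows)


def _average_order_value(rows: List[Row]) -> float:
    # Built on net revenue so the two metrics cannot drift apart.
    return _net_revenue(rows) / len(rows) if rows else 0.0


# The single, central place where metric definitions live.
METRICS: Dict[str, Metric] = {
    m.name: m
    for m in (
        Metric("net_revenue", "Order amount minus refunds", _net_revenue),
        Metric("average_order_value", "Net revenue per order", _average_order_value),
    )
}


def evaluate(metric_name: str, rows: List[Row]) -> float:
    """Look up a metric by name so every report uses the same definition."""
    return METRICS[metric_name].compute(rows)


# Made-up sample data, for illustration only.
orders = [
    {"amount": 100.0, "refunded": 0.0},
    {"amount": 50.0, "refunded": 50.0},
    {"amount": 150.0, "refunded": 30.0},
]

net = evaluate("net_revenue", orders)
aov = evaluate("average_order_value", orders)
```

The point of the design is that a dashboard or an ad hoc analysis asks for `net_revenue` by name rather than re-deriving it, so a change to the refund rule happens in one place and reaches every consumer.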

Speaker 1

30:00

So the modeling layer is our guide exactly.

Speaker 2

30:03

Yeah. It ensures that everyone's you know, speaking the same language and interpreting the data consistently.

Speaker 1

30:08

Okay, So without that things could go very wrong.

Speaker 2

30:11

Oh yeah, absolutely. Without a robust data modeling layer, you know, self service analytics can really quickly descend into chaos.

Speaker 1

30:19

Okay.

Speaker 2

30:19

You know you have users potentially creating their own definitions, their own calculations, okay, which can lead to all sorts of inconsistencies and inaccurate reporting.

Speaker 1

30:29

Okay. So you know, they make it very clear that that self service analytics is a journey. It's not like you just arrived there.

Speaker 2

30:35

No, absolutely, yeah, it's it's an ongoing process of learning, adapting, and refining. And as organizations become more data driven, the demands on that data team they also increase.

Speaker 1

30:48

Yeah we talked about that earlier.

Speaker 2

30:50

Right, and the guidebook delves into the challenges of scaling your BI tools and processes to match the growing data needs of the organization.

Speaker 1

31:00

Yeah, you know how that team can be overwhelmed: too many requests for reports and dashboards, ad hoc analyses.

Speaker 2

31:08

Right, Yeah, it's like that small kitchen staff trying to cater a banquet for you know, hundreds of guests. It's just not sustainable in the long run.

Speaker 1

31:17

Right, it's not. So how do we prevent that? What are some of those strategies?

Speaker 2

31:20

So scalability becomes really really crucial here, and the guidebook highlights several strategies for scaling your BI infrastructure, things like choosing the right BI tools that can handle those increasing data volumes and that user concurrency, you know, implementing a really robust data governance process okay to streamline those workflows, and then fostering that culture of data literacy throughout the entire organization.

Speaker 1

31:44

Okay, so working smarter, not harder, exactly.

Speaker 2

31:47

Yeah, you know, investing in those right tools, processes, and skills really helps those data teams keep pace with those growing demands and ensure that the organization can continue to make those really data driven decisions as it scales.

Speaker 1

32:02

Okay, So technology is obviously, you know, a very important part of this, but it's really the people that are driving that success.

Speaker 2

32:08

Yeah, absolutely, you know, technology is an essential enabler, but it's it's ultimately the people who drive that data driven success. So, you know, fostering that culture of data literacy, empowering those business users and then investing in the development of your data professionals. Those are all crucial for building a truly data driven organization.

Speaker 1

32:29

Okay. So the guidebook concludes by emphasizing that this world of business intelligence, I mean it's constantly changing.

Speaker 2

32:36

Oh yeah, absolutely, always evolving.

Speaker 1

32:38

New tools, new technologies, new approaches. I mean, what works today might not work tomorrow. So, you know, we've got to stay curious, experiment, yeah, definitely, and continue to learn and adapt.

Speaker 2

32:52

You know, it's like surfing. You need to be aware of those changing tides, adjust your balance, and, you know, stay ahead of the curve to really avoid wiping out.

Speaker 1

33:03

Okay, I like that. So what does this mean for our listeners, Well, you know.

Speaker 2

33:07

Building a modern analytics system, it's not just about choosing those right tools, it's about really understanding those underlying principles, you know, adapting them to your organization's specific needs and growth, okay, and then embracing that culture of continuous learning and improvement. So this deep dive has really equipped you with the knowledge to navigate this really exciting and rapidly evolving field. But the journey doesn't end.

Speaker 1

33:36

Here, no, it doesn't.

Speaker 2

33:37

So keep exploring, keep experimenting and keep pushing the boundaries of what's possible with data.

Speaker 1

33:43

It really is, and it can help you make those decisions about your BI strategy. So let's go back to these stages and unpack them a little bit more. You know, we know that in the early stages, companies rely heavily on spreadsheets and ad hoc analysis. What are some of the telltale signs that a company's in this phase, and what are some of the challenges they might face?

Speaker 2

34:02

Yeah, so in this initial phase, data is often, like you said, scattered across different sources, and there's a really heavy reliance on those manual processes to extract, you know,

34:12

just those basic insights. Think of those late nights, like you said, spent cobbling together reports in Excel, trying to make sense of data from different departments or systems, and you know, it's a functional approach, but it's incredibly time consuming, prone to errors, and it just doesn't scale well as the organization grows.

Speaker 1

34:29

It's like trying to build a skyscraper.

Speaker 2

34:30

With hand tools, exactly. Yeah, you might be able to lay a few bricks, but you're not going to get very far. And the guidebook highlights some of those key challenges of this stage, you know, data inconsistency, lack of standardization, difficulty in collaborating, and a very limited ability to answer those complex business questions.

Speaker 1

34:49

Frustration from both you know, the people who are putting together the reports and the decision makers who are trying to use it exactly.

Speaker 2

34:56

And this is often the point where companies realize, okay, we need a more robust and scalable solution, so they start to invest in BI tools, which kind of moves them to that second stage of the arc of adoption, okay, which is characterized by those, you know, static reports and dashboards. So this is where we start to see those, you know, colorful charts and graphs that executives love.

Speaker 1

35:17

Right and dashboards. They provide a more centralized and visual way to track those key metrics, making it easier to monitor progress, identify trends, and BI tools play a much more prominent role at this stage, automating those reporting processes and making data more accessible to a wider audience. So we've gone from that hand drawn map to a GPS system. We've got a clearer view of where we are and where we're going, but we're still following that pre determined route.

Speaker 2

35:43

That's a great way to put it. Yeah, dashboards, you know, they provide valuable insights, but they don't necessarily empower users to explore the data freely or ask those what if questions that often lead to those you know, groundbreaking discoveries.

Speaker 1

35:57

Right, and that brings us to the holy grail, self service analytics exactly. Yeah, the third stage of this arc of adoption. Yeah, and this is where our users have the ability to access, analyze, and visualize data independently without having to go to the data team for every request.

Speaker 2

36:12

Exactly. Self service analytics represents a shift from that centralized, IT-driven model to a much more decentralized, user-driven approach. It's like having a fully interactive map. You can zoom in and out, explore different routes, discover those hidden gems, you know, and even create your own personalized maps.

Speaker 1

36:30

Okay, so very empowering for the user. But I know the guidebook cautions that just giving people BI tools isn't enough to actually be able to do this.

Speaker 2

36:38

No, not at all. Simply giving everyone access to data without the right foundation is a recipe for disaster. It's like giving someone a really powerful sports car, you know, without any driving lessons or knowledge of traffic laws. They might be able to get it moving, but it's probably going to end in a crash.

Speaker 1

36:55

That's a pretty scary thought. So what are those essential ingredients for self service analytics?

Speaker 2

36:59

Yes, so the guidebook highlights three key pillars data literacy, data governance, and a robust data modeling layer. So data literacy basically means that users need to understand how to interpret and work with data, how to ask the right questions, and then how to draw you know, meaningful conclusions from the insights they discover.

Speaker 1

37:16

So it's not just about having access to the data, it's knowing what to do with it.

Speaker 2

37:20

Yeah, exactly, it's about being able to speak the language of data. Data governance ensures that the data is accurate, consistent, and trustworthy. It involves establishing those processes for data quality management, defining clear roles and responsibilities for data stewardship, and then implementing those security measures to protect sensitive information, but.

Speaker 1

37:42

Making sure everyone's using the data responsibly and.

Speaker 2

37:45

Exactly, yeah, it's about creating a framework to make sure that happens. And then finally, you know, we have that data modeling layer and This is that foundation that makes self service analytics possible. By defining those business logics and calculations, you know, those relationships in a centralized location, we're creating that single source of truth that everyone in the organization can trust.

Speaker 1

38:07

So that modeling layer is our guide. It ensures that everyone's speaking the same language and interpreting the data consistently, exactly.

Speaker 2

38:14

And without that robust data modeling layer, you know, self service analytics can really descend into chaos where you have users potentially creating their own definitions, their own calculations, which leads to inconsistencies and inaccurate reporting.

Speaker 1

38:30

So it sounds like this is really a journey. It's not a.

Speaker 2

38:32

Destination, absolutely, yeah, it's an ongoing process of learning, adapting, and refining. And as organizations become more data driven, the demands on the data team also increase, and the guidebook really delves into the challenges of scaling your BI tools and processes to match those, you know, growing data needs of the organization.

Speaker 1

38:52

Right, so the data team can easily become overwhelmed: too many requests for reports, dashboards, ad hoc analyses. How do we prevent that?

Speaker 2

38:59

What are some strategies? So scalability becomes really crucial here, and the guidebook highlights several strategies for scaling your BI infrastructure. You know, things like choosing the right BI tools that can handle those increasing data volumes and that user concurrency, implementing a really robust data governance process to streamline workflows, and then fostering a culture of data literacy throughout the.

Speaker 1

39:21

Organization, so working smarter, not harder, investing in the right tools, processes, skills to help that data team keep pace with those growing demands, exactly.

Speaker 2

39:30

Yeah, you know, investing in those right tools, those processes, and those skills really helps those data teams keep pace with those growing demands and ensure that the organization can continue to make data driven decisions as it scales.

Speaker 1

39:42

Okay, so technology is a really important enabler, but at the end of the day, it's really the people that are going to be driving data driven.

Speaker 2

39:49

Success, right. Yeah, absolutely, technology is an essential enabler, but it's ultimately the people who drive that data driven success. So fostering that culture of data literacy, empowering those business users, and investing in that development of data professionals, those are all crucial for building a truly data driven organization.

Speaker 1

40:08

And the guidebook concludes by saying that this world of business intelligence is constantly evolving with new tools and technologies. What works today might be outdated tomorrow, so it's really essential to stay curious, experiment, and continue to learn and adapt.

Speaker 2

40:24

You know, It's like surfing. You need to be aware of those changing tides, adjust your balance, and stay ahead of the curve to avoid wiping out.

Speaker 1

40:31

Yeah, I like that analogy. So what does all this mean for listeners?

Speaker 2

40:34

Well, building a modern analytics system, it's not just about choosing the right tools, it's about really understanding those underlying principles, adapting them to your organization's needs and growth, and embracing that culture of continuous learning and improvement.

Speaker 1

40:48

This deep dive has really given you the knowledge to navigate this exciting and rapidly evolving field. But remember it doesn't end here. Keep exploring, keep experimenting, and keep pushing the boundaries of what's possible with data.

Transcript source: Provided by creator in RSS feed