BA Bites - 10 Things Every BA Must Know About Modern Data Management

Feb 27, 2025 • 22 min

Episode description

🚀 BA Bites - 10 Things Every BA Must Know About Modern Data Management 🚀

Data is the lifeblood of business decision-making, but how can Business Analysts navigate the ever-evolving landscape of data fabrics, analytics models, and automated pipelines?

In this episode of The Better Business Analyst Podcast, we break down the 10 key things you need to know to get started with modern data management. We cover:

✅ The Data Management Lifecycle
✅ Data Fabrics & Pipelines
✅ Levels of Data Quality
✅ AI-Powered Analytics
✅ Data Storytelling & Business Value

If you're a BA looking to enhance your data skills and stay ahead of the curve, this episode is for you! 🎧 Tune in now and take your data game to the next level.

#BusinessAnalysis #DataManagement #Analytics #DataDriven #BAPodcast

Transcript

Hi everybody, and welcome back to the Better Business Analysis podcast with your host, Benjamin Walsh. Today we're diving into a topic that every modern BA needs to understand. That's right, this BA Bites episode will be focused on modern data management and analytics. The Better Business Analysis Institute presents the Better Business Analysis Podcast with Benjamin Walsh. Data is the backbone of decision making.

It's the backbone of AI and machine learning, and as BAs we need to know how to work with it, analyse it and ensure its quality. But with new technology like data fabrics, modern analytical methods, and automated pipelines, how do we keep up? Well, don't worry, I've got you covered. In today's episode, we will break down the 10 things you need to know to get started with modern data management as a business analyst.

So the first thing we need to really talk about, number one, is that data isn't just structured anymore. OK? So if you're in the world of, I guess, relational databases, which were new when I started in IT, or you are around cleaning data and putting it in spreadsheets and so forth, the world is 10 steps ahead of you. Once upon a time data was mostly structured and there were neat rows and columns in a database. Think Excel if you haven't

worked in databases before. But now we have structured, semi-structured and unstructured data. Before, we were trying to get all of that to be structured so we could use it. But now techniques have kind of caught up, and we are also closer to the source. What I mean by that is text, images, internet of things (IoT) sensor data, social media interactions. We need to be able to deal with all of those things.

And to be honest, you can't structure all that information and keep up with the structuring in time to use it. So we need to deal better with all three of the formats we just talked about: structured, semi-structured and unstructured data.

And as BAs, we need to understand where data comes from and how to work with these, I guess, diverse formats in terms of collecting it, capturing it, processing it, storing it and transforming it, so we can get it into a form in which it is consumable for whatever use case we have. That might be back into data pipelines or systems. It might be a Power BI report, so business intelligence reporting. It could be integration with other systems.

It could be back to consumers. So there are lots of different consumption use cases, some of those are structured, unstructured and semi-structured, and their formats differ as well. That leads me into number two: the data management life cycle. It is critical. Data doesn't just appear and disappear. It follows a life cycle. And some of these are quite different. Gartner has one. There are some data governance forums that have a good life cycle. You would have heard a version

of this. Here's one: ingestion, storage, processing, transformation, analytics and disposal. But there are different ways of talking about that. I said consumption, which I prefer; analytics is one type of consumption. So there is a really standard data management life cycle, and you can argue about whether or not data goes in a straight line, which it doesn't, or whether or not it can go back through those processes. But there are fundamental

building blocks. There's about five of them. And even with different terminologies and different organizations, they're really consistent. OK. And I would take a Gartner approach here and just not get involved in organizational distractions. Usually your government or best practice in private industry has already defined these

things. Now, knowing where data is in the life cycle, in those five or so steps, helps BAs define requirements and align stakeholders on expectations of data quality, what they might need to do to collect the data, or what state it might be in. And I'm experiencing this right now with a very key client, and expectations, I tell you, are all over the place. OK. And people don't realize we need to invest either money, time, process change or engagement in order to enrich your data. Number three is

important. And this is changing the game. So even if you are maybe a data architect or a data analyst working in a more traditional data warehousing environment, you need to know number three, which is data fabrics, OK? It's weaving data through pipelines. Traditionally data was managed in silos. OK, so think of different blocks, maybe one block per data management life cycle step or

per application or per use case. So maybe you've got data in Salesforce, maybe you've got some data in a data warehouse, maybe you've got data in CRM Dynamics 365, maybe you've got it in SQL databases, maybe you've got it in spreadsheets, maybe you have it in survey forms. This is a typical organization, right?

And so there is a way that we structure what we call subject area data or entity data, to have a bit of an idea about where we should store things and why we use applications and whatnot. And that's leading the way in terms of application design. But we also need to be aware that there are simply very few places that are able to work in one monolithic system like an ERP, SAP for example, and have that be

their only system. A lot of people move to that, but that has its own constraints. And so there's this acceptance that we're not always going to be in control of, I guess, the ecosystem of our data: what our customers use, what our data consumers want to use, the technical landscape that we are exposed to. So we need to be able to connect to this environment. And so therefore you need to use

a data fabric to do that. That's a new approach that integrates data across systems into a unified architecture. The architecture is still unified at the high level, both business and technical, and it allows probably real time access and better analytics. Now, I've said probably real time access; real time actually costs money. And when we say real time, it could be within a day. We're not talking about

microseconds here. And things do take a little while to process through if you want them to be in the right state. So when we say real time, just be careful with that term as a BA. Now, this is something that BAs need to advocate for when discussing modern data strategies. And I am currently writing a paper about this: to accept multi-cloud, maybe on-premise and cloud

solutions, transitional states. And we need to really think about data fabrics as a solution there, as opposed to consolidation. Number four, and we've touched on this with data fabrics. OK. One thing about data fabrics: there is a product called Fabric, Microsoft Fabric, which we'll get to in a minute, but it isn't the only product out there. Number four is data pipelines. OK. So think about these as pipes in your house where you need water to go.

That's a good analogy. And they connect to the main pipe, which is also connected to other infrastructure that provides water to your house. These pipelines are also like factory assembly lines, right? So you could think about them as a factory line, or as water being routed through a pipe, and maybe it changes from fresh water to dirty water to hot water. Data moves through various stages, OK? And we need to, effectively, extract data.

And there's a broader, kind of high-level abstraction of the data management life cycle that covers a few steps, like collecting and capturing the data and maybe getting it into the state you want. Then we've got transformation and then we've got load. This is an old term, and we call it ETL or ELT depending on which way around you do the loading and the transformation. Now understanding how these

pipelines work. So these are the technical capabilities needed to meet the life cycle we talked about, OK. I'm going to say that again: the data flows through the data management life cycle, OK. That is more conceptual, covering both business and technical capabilities. But under the hood, if you like, in the piece that connects our business layer down to our technology solution, we have these steps, which are broadly now referred to as the data management steps.

And historically, they were talked about in terms of ETL. Understanding how those pipelines work, in either the new world data management life cycle or the ETL world, which are kind of one and the same, just with different terminology and groupings, will help you as a BA ensure that data is processed correctly and is useful for decision making, right? So I'll give you an example. I have collected data in a survey. I've surveyed all my customers about a new product that I have launched.

Now that product might be a web product, it might be on my website, and that might be integrated with my CRM solution, which is, say, HubSpot. Now I may have a greater architecture than just those components, but let's just keep it simple here. I may have sent out a, I guess, SurveyMonkey survey, and I've integrated that with HubSpot, and I surface it maybe in the product I've got on the

website. So when they use the product, a survey pops up, which happens to be something different, which is SurveyMonkey. And when they capture the feedback, it goes back into HubSpot, right? So my data moves around. Now in that case, we are collecting data through SurveyMonkey. We're actually collecting it there and we're capturing it in HubSpot. We might be transforming it in HubSpot. We may be connecting it with other information from, for example, the website and the

product that we're using. And then we might be, say, loading that into reporting tables in Power BI, for example. And so we need to think about what the state of the data is in all those different steps. Another way of looking at those steps is to look at them in data management life cycle terms, which I prefer, and to think about something called the medallion model, where we kind of classify our data as bronze, silver and gold in terms

of its usefulness. And so as it moves through the data management life cycle and gets closer to consumption, it gets better, until it's in a gold form, OK. And that's another way that you can look at data in a modern way. So you may hear those terms.
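
As a rough illustration of the bronze, silver and gold idea, here is a minimal Python sketch using the survey example. Everything in it is an assumption made for illustration: the field names, the sample rows and the `to_silver` and `to_gold` helpers are hypothetical, and real medallion implementations on a data platform will look quite different.

```python
# Hypothetical raw survey rows, as they might land from a survey tool (bronze).
BRONZE = [
    {"customer_id": "C1", "rating": "5", "feature": " Search "},
    {"customer_id": "C2", "rating": "n/a", "feature": "Export"},
    {"customer_id": "C1", "rating": "5", "feature": " Search "},  # duplicate submission
    {"customer_id": "C3", "rating": "3", "feature": "search"},
]


def to_silver(bronze_rows):
    """Clean and validate: drop unparseable ratings and exact duplicates, normalise text."""
    seen = set()
    silver_rows = []
    for row in bronze_rows:
        if not row["rating"].isdigit():
            continue  # rating cannot be read as a whole number, so skip the row
        cleaned = (row["customer_id"], int(row["rating"]), row["feature"].strip().lower())
        if cleaned in seen:
            continue  # same customer, rating and feature already captured
        seen.add(cleaned)
        silver_rows.append(
            {"customer_id": cleaned[0], "rating": cleaned[1], "feature": cleaned[2]}
        )
    return silver_rows


def to_gold(silver_rows):
    """Aggregate into a consumption-ready view: average rating per feature."""
    totals = {}
    for row in silver_rows:
        count, total = totals.get(row["feature"], (0, 0))
        totals[row["feature"]] = (count + 1, total + row["rating"])
    return {feature: total / count for feature, (count, total) in totals.items()}


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
```

The point of the sketch is that each layer carries an implied quality level: bronze is whatever was captured, silver has been cleaned and de-duplicated, and gold is shaped for a specific consumer such as a report.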

And that's much better than this kind of ETL process, because ETL doesn't really allow you to know the quality, or it doesn't give a tag of quality along the way, which is the most important thing for most organizations. Number five is that data quality has levels, as we just talked about. And you can look at these not just through this gold layer model, the medallion model, but through six dimensions. Poor data leads to poor insight, so we need to be really careful about that.

And so one is accuracy. How accurate is the data? And the trick to making sure that it is accurate is to focus on its capture. So making sure you capture it in an accurate way with validation, OK. And you don't want to build in a whole lot of validation checks because that might take a long time.

There's completeness. So what data do we need from different sources to add to the picture to know that our product, the feedback we've got through SurveyMonkey and the product itself on the website come together to give us a complete picture? We need it to be consistent. So we need to collect it again and again and again through multiple different time periods, maybe different customer segments in order to compare it. We also need to factor in

timeliness. So if you've collected data from last year and you're making a decision this year, it's just not good enough. A lot of the solutions that we use traditionally take a long time to process and a lot of effort. And so we need to use these new modern techniques to be able to, like we said, real time it. But what we mean by that is just get it in a more timely fashion. So within the period in which you need to make the decision.

So if that's within a day, then you need to get it within a day. If you need it within a quarter, you need it within the quarter, OK. And that's the whole process of getting it: collecting it and capturing it, you know, accessing it, transforming it, storing it and getting it ready for consumption. So there's quite a lot going on there. The data needs to be valid: truth, if you like. OK, so we need truth to the data. It needs to actually be true.

For example, if you're involved in statistics, you'll know all about data quality and, you know, surveys and the margin there and all the rest of it. We can't make decisions based on a small data set generally, right? We can make assumptions. And so that's why we say big data, because we need it to be valid and to be accurate. And the last bit is we need some uniqueness. What we mean by that is we don't want duplicate information coming from different sources having

different versions of the truth. That's why we talk about single source of truth, which is the most used phrase in IT ever. What we mean is a single view of the truth, not source, OK, because you will have multiple sources. So you can kill that term whenever you hear it and say that term is old school. There are multiple sources, and multiple sources are good, by the way. What we need to make sure of is that we have one view of the truth, right?
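
Some of these dimensions can be scored mechanically. The following is a minimal Python sketch, assuming hypothetical survey responses with made-up field names, a 1 to 5 rating scale and a 90-day freshness window. It only covers completeness, validity, uniqueness and timeliness; accuracy and consistency usually need a reference source or repeated collections to compare against, so they are left out.

```python
from datetime import date

# Hypothetical survey responses; the field names and values are illustrative only.
RESPONSES = [
    {"id": 1, "email": "a@example.com", "rating": 4, "collected": date(2025, 2, 20)},
    {"id": 2, "email": None, "rating": 9, "collected": date(2025, 2, 21)},
    {"id": 3, "email": "a@example.com", "rating": 5, "collected": date(2024, 1, 5)},
]


def quality_report(rows, as_of, max_age_days):
    """Score a batch of rows from 0 to 1 on four data quality dimensions."""
    total = len(rows)
    # Completeness: every field in the row has a value.
    complete = sum(1 for r in rows if all(v is not None for v in r.values()))
    # Validity: rating falls inside the assumed 1 to 5 scale.
    valid = sum(1 for r in rows if 1 <= r["rating"] <= 5)
    # Uniqueness: share of known emails that are distinct.
    emails = [r["email"] for r in rows if r["email"] is not None]
    # Timeliness: collected within the window needed for the decision.
    timely = sum(1 for r in rows if (as_of - r["collected"]).days <= max_age_days)
    return {
        "completeness": complete / total,
        "validity": valid / total,
        "uniqueness": len(set(emails)) / len(emails) if emails else 1.0,
        "timeliness": timely / total,
    }


report = quality_report(RESPONSES, as_of=date(2025, 2, 27), max_age_days=90)
print(report)
```

The freshness window is the part to agree with stakeholders: it should match the period in which the decision needs to be made, whether that is a day or a quarter.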

No data conversation would be where it needs to be without adding a bit of boringness to the conversation. And that boringness comes in two very important areas, which some people love and I find boring. But do you know what? I'll tell you how I get around being bored out of my brain when I dip into this. And that is governance and compliance. So we have regulations, we have government policies, we have internal policies. And data governance isn't optional.

OK, you have to do it. BAs need to work with compliance teams. We need to understand privacy acts. We need to ensure that data is handled responsibly, securely and ethically. Now, if this doesn't blow your trumpet, like for me, then there are so many good models out there.

The trick is don't come up with your own. Look, see what best practice is and adopt it. OK, and then if you need to massage it, you can, but I would assume that every, I don't know, education government department in the world has very similar governance across it. You need to keep internal governance from being overbaked, 'cause that's where bureaucracy can kill good outcomes. So you need to apply data governance. You can't ignore it.

And so how I deal with something that I don't enjoy as much, like some vegetables I don't like, is you eat them first, right? Get them done first, and then get on to the stuff you do enjoy, which might be improving outcomes through the data you've got and insights. OK, so that's number six. And if we move on to number seven, we've touched on this quite a bit lately, and that is AI and machine learning. OK?

They're driving insights. But it's so important to realise that if you do not have great data, your AI and machine learning are a waste of time. So this is a prerequisite for using your own data to make informed decisions. Data is not just about spreadsheets and dashboards. And yeah, sometimes they're really good, but AI powered analytics can uncover patterns that you can't. It can predict trends, it can automate decision making.

And BAs need to understand how to interpret those insights, communicate them effectively and explain why, maybe based on the data that's been consumed by these tools, those insights might be different to what was expected. And remember that, even though you want to use them and get these outcomes, you may have pushed for these tools early, when your maturity is low. Every CIO and CTO in the world is pushing for these tools.

If your data is crappy, your insights are going to be crappy. Number eight: this is so important. This is where we need to make sure that data is not owned by digital or IT per se, OK? This is not an ivory tower exercise. Number eight is self-service analytics, OK? And it empowers teams. There is a lot of kit out there in the data space. There are a lot of tools you could use, and they really need to be selected based on your environment. You need to choose the right tool for the right job and the

right environment. Business unit users don't want to wait for IT to generate a report anymore. Not only that, your business users might be data analysts, or they may require data analyst capabilities, and they may be BAs outside of digital. They need access to the data, they need access to pipelines, they need access to continually improve these, and they need access to run their own jobs. So what tool are you going to use as an interface layer?

So they don't have to be data engineers, but you've set it up so they can build on top of that infrastructure. Again, the data fabrics, you know, pipelines, visibility, modern analytical tools, right? They offer self-service capabilities, meaning anyone can access and visualize data. So that's starting at the analytics end. The analytics tools are now exposing the data pipeline so you can see where the data came from and maybe know why you're getting the insights you're getting.

BAs should help design intuitive interfaces and ensure that stakeholders get the right insights and know why that data point is the way it is. Number nine: data storytelling is a must-have skill. Facts don't drive decisions, stories actually do. And BAs need to go above and beyond charts and numbers and craft compelling narratives around data. The collection, the ecosystem, the application framework, the

customer journey, to make insights clear and actionable for stakeholders. And number 10: always align data strategy with business goals. There is no point in having great AI, pipelines and massaged data if it's not going to be used.

If we go back to our HubSpot example, if we're surveying customers on the features they enjoyed about our product through SurveyMonkey, but we're never going to use that to actually make a change to our product because our product strategy doesn't incorporate, I guess, ad hoc customer feedback from the website, then don't do it. What a waste of time. At the end of the day, data management isn't just about technology, it's about business

value. Every data initiative should go back to strategic goals or investment objectives and a bit of a business case model, and explain whether that's reducing cost, increasing efficiency or improving customer experience. Why do we need to invest in this data project? I've heard horror stories of IT or data teams spending millions building data products that no one wants to use. So what you might find is two

things in that case. One, you haven't captured requirements and you're not meeting objectives, so your user base, whether internal users or your customers, are not getting what they asked for or what they want. And there might be another thing, another insight that I've experienced. Sometimes people want to fish for themselves.

So in a modern data analytical world, we need our users to be able to fish on top of these tools, in a secure, you know, pond stocked with fish, with the rod that we give them. But it is no longer IT's job to own data. I will see you next week.

Transcript source: Provided by creator in RSS feed