¶ Intro / Opening
Speaking of reliability, a podcast with good friends talking with you about reliability engineering topics. Welcome to Speaking of Reliability. This is Fred Schenkelberg. And this is Diana Deene. Hey Diana, good talking to you again. Yeah.
¶ The Data Dilemma: Too Much or Too Little?
It's one of the questions I got, and I want to dive right into it because I think it's an interesting topic. I got a question this week that said, you know, as a reliability engineer in a company, do you have enough data to do your job?
And I think there's a couple aspects of that and I wanted to chat with you about it. You've obviously been in companies where you're looking for data sets or trying to answer a question where, if I had the data, I could analyze it and get some insight. Have you ever run into a time when you just didn't have enough or, vice versa, had way too much? Yes, of course. Yeah. That's a really good question. And it comes up a lot.
With reliability kinds of questions, there was always not enough data because it was something new and unproven, like we're doing something at the edge of what we understood and knew. So we had to do more testing just to find out what was happening, how things would react. And sometimes you have to be strategic with designing those tests because they can be expensive.
So yeah, there's a lot of times where there's just no data and you have to create a test and then really justify the expense, which I think is a common story with reliability engineers. And then other times people had so much data that it took us a couple of conversations to really understand what kind of data it was that we wanted. So yes, I guess I've run the gamut: not enough, too much, and what's our real question? Yes.
¶ Accessibility and Siloed Information
Yeah, yeah, I get that. I find, you know, in working with different companies, they say, we're trying to estimate what our field reliability is. Okay. Do you keep track of customer phone calls and complaints or returns? Oh, of course. All right, so what's the problem? Use that. Well, we don't have access to that. Yeah.
All right. Let's go stand at somebody's desk until they give us the key or sanitize it for us or whatever. You know, we're trying to figure out how we're going to improve an existing product that the sales team and marketing and everybody else says, well, it's not good enough, make it better. And we're thinking, well, make what better? Nobody's told us anything about what's failing or how it fails or all that stuff.
Or even how many are out there. And it was, well, that's in a different department. We don't talk to those people. Yeah, I never really accepted that answer. No. If it was within the same company, yeah. No, you gotta fix that.
Yeah, it's our data. Yeah. We need to look at that. And then there's also the other scenario if it's a big company. Somebody's off in the corner doing their job and collecting data and information and plotting charts and sharing it with their managers, and it goes nowhere else. Yeah. And then when you finally stand at that person's desk and say, okay, what's this data? We need this information. And they share their report and you're like, wow, this is gold.
Where has this been?
¶ Untrained Analysis and Data Quality
Yeah. Why are you hiding away over here? I've also found, though, I was in a company where the R&D team had a really good reliability group, and they had test data and some access to field data. They often complained that the customer service group didn't really understand how to analyze the stuff. Somebody handed them a software package and said, here, just stick it in here and make a plot.
And they did. They had no concept of failure mechanisms or why things would bend or change or how to fit it. They weren't even statisticians. They were just handed this software package and told, here, make a plot. And the reliability team was just, we don't have time to teach these guys how to do it. That's not our job. But then it was a battle between two departments: we want the raw data. Well, you can't have the personally identifying stuff.
That's fine. Give me the serial number. When was it made? When was it not? And then lining up manufacturing data, you know, in some organizations you get a pretty good idea when it was installed or first turned on, that type of stuff. And then others would have, well, they gave us a phone call and they got a return authorization, and then it got some failure analysis stuff.
And every one of those topics I mentioned was in a different database and configured differently, and a good number of them were hand-filled fields. So it was really error prone and not consistent and everything else. I spent more time with more data sets just cleaning them up.
You know, you get a data set, you drop it in a program and it says, we can't analyze it because there's some text in here. And invariably there's an O instead of a zero somewhere in your 10,000 lines of data. Where is it? And I don't know it's an O. It might be an I instead of a one. Yeah, I remember back in the day I was working with a professor and a grad student. I was a student aide, I think. But they were stuck on this program.
And I don't know how many pages, like if they printed it out, it was like two hundred and fifty pages. Yeah. And it wasn't working. And they ended up finding out that it was a misplaced period somewhere. Yeah. In all that stuff.
¶ Cleaning Data and System Integration
Yeah, what I've learned over the years is never underestimate how much time it takes to clean the data, to sort it out. You know, you look down the column and they're all four-digit numbers and one of them's got twenty digits. What's going on here? Is that part of just the data collection, or was that three sets of data all put into one cell?
And little things like that can really throw an analysis off, and some programs won't even let it load. You know, you can't even start. Yeah, so there's a few things there. There's the data quality itself: what is being collected, and is what is being collected correct? And then you mentioned all these disparate programs that are not talking to each other, so that people really can't make decisions with the information that they've got.
Yeah. And I'm not gonna go down the AI rabbit hole, but... Hmm. Yeah. Well, we'll just fill in the data that we think should be there. Well no, but like getting systems to talk, you know, that's something that it could do. And then maybe being, you know, like an intern helper to help look at what you've got, to highlight any problem areas, could be a use for that. Well, I wondered, yeah, that's a good idea, because looking for text in a numerical column,
sometimes you just can't even see it. And you don't know what text is in there that you're looking for. So that makes it more complicated. I remember Excel at one point: by default, numbers were right-justified and text was left-justified. But if the column was
formatted to be numeric and it was always on the right, then you lose that capability. At some point you just have to kinda look at the data. I'm wondering if AI could highlight the cells that have text in them, not numerical data. Yeah. Might be a useful thing. Yeah, you should be able to do that.
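As an illustration of that idea, flagging the cells in a numeric column that contain text does not need AI at all. The sketch below is one minimal way to do it in plain Python, assuming the data has already been read into a list of dictionaries; the function name `find_non_numeric`, the column name `hours`, and the sample rows are all hypothetical.

```python
def find_non_numeric(rows, column):
    """Return (row_number, value) for every cell in `column` that does not parse as a number."""
    suspects = []
    for row_number, row in enumerate(rows, start=1):
        value = row[column].strip()
        try:
            float(value)  # succeeds for "1200", "87.5", "1e3"; fails for "95O"
        except ValueError:
            suspects.append((row_number, value))
    return suspects


# Hypothetical sample: hours-to-failure with two hand-entry slips.
rows = [
    {"serial": "A001", "hours": "1200"},
    {"serial": "A002", "hours": "95O"},   # letter O typed instead of zero
    {"serial": "A003", "hours": "87.5"},
    {"serial": "A004", "hours": "I40"},   # letter I typed instead of one
]

suspects = find_non_numeric(rows, "hours")
for row_number, value in suspects:
    print(f"row {row_number}: {value!r} is not a number")
```

A check like this only finds cells that fail to parse. It will not catch a value that is a valid number but wrong, such as the twenty-digit entry in a column of four-digit numbers, so a separate range or length check would still be needed.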
Yeah. The amount of time it takes to just find the data and then say, all right, we have a serial number over here, but we changed it after this period of time, and then it goes to this system and we lose the ability to sort because of the changes and the way it was recorded. Or the manufacturing people have it by week of manufacture, but everybody else has it by hour of installation, and which one fits with what?
There's a lot of puzzling the first time you go through this and get some reliability data and show that it's useful. Then organizations go, oh, that is useful. Let's make this a bit simpler. And then you get assigned an IT team that really doesn't want to talk to you. Look at that.
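To make the puzzle of lining up records concrete, here is a rough sketch of joining manufacturing records kept by build week with installation records kept by timestamp, using the serial number as the key. It assumes the serial number is stable across both systems, which the discussion above notes is not always true. The field names (`serial`, `build_week`, `installed_at`) and the sample records are hypothetical.

```python
from datetime import datetime


def iso_week(timestamp):
    """Convert an ISO 8601 timestamp string to a (year, week) pair."""
    year, week, _ = datetime.fromisoformat(timestamp).isocalendar()
    return (year, week)


def join_on_serial(manufacturing, installs):
    """Pair each install with its build week; collect serials with no build record."""
    built = {rec["serial"]: rec["build_week"] for rec in manufacturing}
    matched, unmatched = [], []
    for rec in installs:
        serial = rec["serial"]
        if serial in built:
            matched.append((serial, built[serial], iso_week(rec["installed_at"])))
        else:
            unmatched.append(serial)
    return matched, unmatched


# Hypothetical records from two systems that do not talk to each other.
manufacturing = [{"serial": "A001", "build_week": (2023, 10)}]
installs = [
    {"serial": "A001", "installed_at": "2023-04-03T14:00:00"},
    {"serial": "B777", "installed_at": "2023-04-05T09:30:00"},  # no build record
]

matched, unmatched = join_on_serial(manufacturing, installs)
print(matched)
print(unmatched)
```

The list of unmatched serial numbers is often as informative as the matches, since it points to where the numbering scheme changed or where a system is missing records.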
¶ Misaligned Data Collection Purposes
But the data quality is the biggest part, I think. I mean, there's data that's collected for when a customer calls and is returning a product and they get credit for it. So there's financial data, there's, you know, where do we send the check or where do we send the replacement part? Well, what I find is a good number of those teams, the better ones do it really well, but the others just go, all right, we did it. We replaced it.
And you and I would walk in and go, why? What was wrong with it? Yeah. Yeah, that kind of thing. Okay, we're not getting that. How about we work on it so we get like one in five or one in a hundred units to come back to us, and we get to do failure analysis to see what's going on and what we can do with it.
It's rare that I find the data is rich enough at the points of collection, because customer service has got a different purpose, right? Primarily to satisfy the customer, get them the replacement. They're not the failure analysis team. They're not the design team. And the same with finance folks, you know, with a warranty claim. Even if the product never comes back, they send out a replacement and it gets charged and they track dollars, not number of units. Yeah.
¶ Management Oversight and Data Relevance
Yeah, this is reminding me, in the medical device world, and for anyone that has a quality management system, mm-hmm, this all kind of falls under management responsibility. There's supposed to be oversight of what information is created and how it's reviewed to make sure that things are functioning properly. But then, sometimes, with reliability engineering information,
how would you say, you're down a level in the hierarchy for management review. You know, yeah, you're working on things that are important to the business and the company, but maybe management doesn't have eyes on all the nuances of it. So even if you are working in a quality management system, the data that you're getting may not be useful for reliability engineering purposes unless you kind of advocate for that. And that can be hard.
¶ Design Challenges and Vendor Data
But yeah, it really can be. And there's so many sources of data available that we may or may not need to tap into. I mean, you're mentioning, like, we need to run the company. So how much money are we making? What's the profitability? What's our warranty expense? What's our manufacturing cost?
How many units are we shipping per week, per month, whatever? Are we on target for all these various things? Those are generally not discrete enough or detailed enough data for us in the reliability world to really do a decent analysis and impact decisions on warranty policy or design considerations. But it gets worse when we're in the design for reliability part because, like you said earlier,
we might not know about this new technology or this new setup or whatever, and now we're relying on testing and simulations and stuff like that. So we might get ten points of data and have to do an analysis. So sometimes we have way more than we ever could deal with and have spent half our month just cleaning it up so we could get a handle on it and connect all the dots. And then we have great insights, yet
the first time through it is always painful. And then other times there's such a scarcity of data, and the vendor is saying, well, we don't test it that way. We don't use this product the way you guys are doing it. So you're on your own. Thanks, guys.
¶ Proactive Data Collection Strategies
You know, the best time to address that, I think, if you can, is before the product is released to market. So you've done the product development stuff. You've identified the essential design outputs, things that are critical to safety and critical to maintenance, whatever it is that you want to measure. And you can link that to the outputs of your product. And then you can coordinate with the customer service department to collect that kind of data
if they get a complaint. You can also have, you know, when you release the product to the field, a certain timeframe where you're monitoring everything, and be specific about the data that you're collecting. Yeah. I remember working on a project where we
approached the complaint person. This was somebody who was responsible for taking all the complaint information and trending products for product performance, not necessarily money and stuff, but product performance and failures and that kind of thing, trying to draw some trends and analyze how the product's performing so management could decide: does this need a design change? Do we need to look into this further? Do we need to pull this product?
And he was really excited about us working with him, coming off of this product design, to actually gather the data that we would use, 'cause he wanted the information to be useful for decisions. And so he really welcomed that collaboration and that relationship. But it's just something that
I guess you have to foster, and I don't know that your typical procedures and processes are going to handle something like that. I think it requires some foresight from the team to reach out and work together.
¶ Guiding Teams for Critical Data
Yeah, but I think there's a handful of places where the design team, who's struggled with getting this thing to work and has played around with prototypes, has learned a ton about the product and where, I want to say where the skeletons are, but where the areas are that we really need to pay attention to because we're not sure we got it right.
You know, and the better teams will say, all right, manufacturing, you can test whether it's all wired up correctly and the green light comes on when you turn it on. But pay attention to this, this, and this. And we used to call it critical to quality. Yeah. Mm-hmm. And then we'd feed it forward to the manufacturing team, saying,
you know, these other things can vary a bunch. That's okay. But this, this, and this really have to be on. Or we're concerned about it. We really don't know if it's gonna cause a problem or not. But it gives them a heads up: these are things to look for. And they might be performance pieces, they might be variabilities or whatever it is. But we found that going to the customer service team
was like the story you just told. If we go to them as the design team saying, here's the top part of our FMEA that we did not address, and here's some that we addressed, but we're not exactly sure if we solved it or not. So here's like 10 or 15 things we expect you may see. And if you do, we really want to know about it. And so for the first three months or however many units or whatever, if anybody mentions this widget number 16 associated problem or whatever it is,
immediately get the product back. Just replace the product. No questions asked. We need to troubleshoot and tear that one down and understand what went wrong. Or, you know, run this diagnostic and send it to us. It's kind of coaching the customer service team, and to some extent the manufacturing team: here's these essential pieces of info we need in order to improve this design and get it manufacturing stable and stuff like that. And
I found that it worked best when the customer service and the field service teams were in-house, in the company, in part because they were wearing the same branded shirts that we were. Yeah. That didn't always mean they were working to the same purpose and, you know, all these other things, but it was much easier. When it was an outsourced team,
they did not want to hear it. It was, I have to do 20 calls an hour and this takes too much time. It's not in the contract. We're not going to do it. And so we had to go back and change the contract with them and do stuff like that. Yeah. Which takes time.
¶ Historical Data Collection Pitfalls
Now, one of the things I found, one last story on data collection. It was a story by, oh, I'm drawing a blank on his name. He was the original author of Practical Reliability Engineering, O'Connor. Okay. I've drawn a blank on his first name for some reason. But anyway, Patrick, Patrick O'Connor, and he told the story that he was a radio repairman in, I think it was the fifties or something like that, early on, when you could repair radios. Yeah. And he said
they were given, you know, a stack of radios and they had to repair them. And the idea was to get as many radios repaired per day as you could to feed back into the military, wherever they were being used. And then there was a stack of paperwork, and you couldn't leave that day until you finished all the paperwork.
And you were supposed to do it while you were doing the repair. But the bench they had didn't have room for the radio and your equipment and a piece of paper. So it was kind of a stack at the end, and you'd say, oh, I fixed eighteen radios today, so let me get eighteen sheets of paper. And you fill them out. And he noticed that they had a checklist of, you know, what was the problem in the radio? And it was, you know, bad power supply, bad this, bad whatever. And
they were doing shotgun repairs to do it fast. Just, here's the likely things that these radios would do, unless there's anything visibly obvious. They just replaced like eight or nine pieces, saw if it worked, and then they replaced another nine things and saw if it worked, pretty much rebuilding them. And after 18 of them, when the paperwork's between you and getting dinner, you need to fill them out
as quick as you can. And so I remembered that story when I was working with a team that was in a similar situation. They were repairing workstations, I think it was. And I noticed, just observing them doing the paperwork, and they were doing it with the repair because I'm from the company and I'm standing there watching them, that they would fill out the serial number, fill out this and everything else, and then they would click or check a box,
almost always the same box. And it was right under the last piece of information they were required to put in. So it was the closest one. I didn't say anything at the time, but I changed the form and reordered the list of what was wrong with the product. And sure enough, the one right under the last form field that they had to fill in was checked like 80% of the time. So all of a sudden you had a new root cause.
You know, so it was, let me look at your spares. What components are you actually using? Which was better, but they were doing so much shotgun replacement, they would replace all kinds of stuff, so even that wasn't terribly useful. It was way better than counting on the technician to tell you what was really going wrong.
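The reordered-form check in that story can be expressed as a small tally. The sketch below is only an illustration with made-up records, not the data from the story: each record is a (checkbox position, cause label) pair, and the cause labels and counts are hypothetical. If the same position dominates before and after the reorder while the reported cause changes, the form layout is driving the answer rather than the failure.

```python
from collections import Counter


def top_share(values):
    """Return the most common value and its share of all entries."""
    counts = Counter(values)
    value, count = counts.most_common(1)[0]
    return value, count / len(values)


# Hypothetical records: (checkbox position on the form, cause label at that position).
# Position 1 is the box right under the last required field.
old_form = [(1, "power supply")] * 8 + [(3, "tuner")] * 2
new_form = [(1, "tuner")] * 8 + [(3, "power supply")] * 2

old_pos, old_share = top_share([pos for pos, _ in old_form])
new_pos, new_share = top_share([pos for pos, _ in new_form])
old_cause, _ = top_share([cause for _, cause in old_form])
new_cause, _ = top_share([cause for _, cause in new_form])

# Same position wins on both layouts, but the "root cause" it reports has changed.
position_driven = old_pos == new_pos and old_cause != new_cause
print(f"top position: {old_pos} ({old_share:.0%}) then {new_pos} ({new_share:.0%})")
print(f"top cause: {old_cause!r} then {new_cause!r}; position driven: {position_driven}")
```

A real check would want enough forms under each layout to rule out chance, and it says nothing about what actually failed, which is why the story moves on to looking at spares consumption.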
¶ Usability and Explaining Data's Value
Yeah. Yeah, that's one of those things. I always feel like even if you're designing a form or a checklist for somebody, you almost have to think through it like usability engineering. They are your customer because you're asking them to fill out your form, and your form is the design. And
just think about the quality of the output you're gonna get from it from that point of view. That always helped me when I was trying to create those kinds of checklists or visual guidance documents or whatever for manufacturing. Exactly. You're creating something for them to use. Yeah. Yeah, and it's gotta be useful for them. It's, yeah, I need the data. And I find that, especially when it's in company and you're talking to fellow employees, it's like,
if we get good data, you'll see fewer failures. And then you can do, you know, real repairs without burning through so many parts just because of time pressure. You won't have as much time pressure. And then one of them said, but if we're not busy, then the boss will fire half of us. And I'm like, yeah, okay. Let's go change the attitude of a few people, you know, that kind of thing. Because the value of good data on field returns and field failures and failure analysis
way outweighs the extra half hour it takes to get it. But if your budget is just turn them around and get the parts back out in the field, that's not visible to that person, to that manager. So sometimes data collection runs into the silo objectives, where the objectives are for that group and what focus they have.
And I also find that if you work with folks, saying this is why we're collecting it, this is what we're using it for, this is how we're using it, it helps a lot. It doesn't solve everything because there's some systemic things that battle that. I totally agree and I'm on board with that.
You don't have to explain reliability engineering to people, because we're a little crazy anyway. But yeah, if there's an understanding of why this is important, what we're using it for, and how it can impact you, that's a big part of getting people on board with your data collection ideas. And they may have a different idea that's easier for them and better for you too. Yep, yep, yeah. It's a big part of it too.
¶ Data Overload and Nirvana
I walked into one factory and their main piece of equipment was a bottling machine, and depending on what it was, it would bottle like a thousand bottles an hour or something like that. It was a mechanical marvel. It was just mesmerizing to watch this thing filling these bottles with juice of some sort. It must have had 50 sensors on it.
And they were all collecting any jams, any misfills, any caps not being put on, just over and over and over again. It was terabytes of data coming off this machine per day. And so of course I said, does anybody actually ever look at that? And they all kind of shook their heads. No. We paid extra for it, but we don't have time to look at stuff like that. Okay. Okay. Never mind. So it can go one way or the other.
There is data available. Sometimes it takes some work and creativity to get to it. And being very clear what you're looking for and why you're looking for it can help a lot. Sometimes you'd be overwhelmed with data that becomes available, and it's a mixed blessing 'cause then you gotta figure out how to put it all together and deal with it. And how good is that data? All those factors around any kind of information we deal with matter. So we gotta think through it all.
Down to and including, in some cases, measurement systems. You know, how well does this actually make that measurement? And are we just chasing our tail because it's just a random number generator? Yeah, we didn't even really touch on that in this talk. Yeah. But if you've, you know, walked into your reliability job and they had great data and it's all trustworthy and you've got great information to share with other people,
I think that's one that you should hang out in and enjoy. Just stay there. Yeah. Just stay there and enjoy it.
¶ Concluding Thoughts and Listener Feedback
Let us know if you're in Nirvana kind of land for that. Or what kind of struggles you've had. We mentioned a handful from our experience, but I'm quite sure our listeners have probably seen a few other dilemmas and problems that have come up. But let us know. Head over to accendoreliability.com slash go slash SOR. There's a couple of ways to get in touch with us. And Diana and I and the other hosts of the show are available through LinkedIn and our about pages.
And this was prompted by a question: do we have enough data? Yeah, sometimes, sometimes not, sometimes it's hard to answer that question. You know, but what's your take? Are you doing well with the data streams and availability for what you need to do? We'd love to hear from you. Well, thanks for taking this one on, Diana. I knew you'd had experience dealing with data in one form or another, so I knew it would be a good discussion.
It's always fun. But yeah, I've never attained Nirvana. It can happen. I've been in a couple of organizations and they had beautiful systems to collect failure data and share it. It probably just took a lot of work from somebody, a group of people, to make it happen that way. Good for them. And so anyway, we'll talk to you again soon. Thanks so much, Diana. Okay, thank you. Thanks for listening to Speaking of Reliability.
We invite you to join the conversation. If you have a question or a topic that you think we should discuss in a future show, please let us know. You can find a comment box below the episode show notes, or just leave a note as part of a review on iTunes.
