SGEM#448: More than A Feeling – Gestalt vs CDT for Predicting Sepsis

Jul 27, 2024 · 30 min

Episode description

Reference: Knack et al. Early Physician Gestalt Versus Usual Screening Tools for the Prediction of Sepsis in Critically Ill Emergency Patients. Ann Emerg Med 2024.
Date: July 25, 2024
Guest Skeptic: Dr. Justin Morgenstern is an emergency physician and the creator of the #FOAMed project called www.First10EM.com.
Case: Your hospital is running Morbidity and Mortality (M&M) […]

Transcript

Welcome to the Skeptics' Guide to Emergency Medicine. Meet 'em, greet 'em, treat 'em, and street 'em. Today's date is July 25, 2024, and I'm your skeptical host, Ken Milne. The title of today's podcast is More than a Feeling: Gestalt versus Clinical Decision Tools for Predicting Sepsis. And our guest skeptic is Dr. Justin Morgenstern. He is an emergency physician and the creator of that excellent #FOAMed site called First10EM. Welcome back to the SGEM, Justin. Man, Ken, it has been far, far too long.

We used to get through this, like, almost every month, and now it's been... I think, like, almost two years, but I'm very glad to be back on the SGEM where I belong. Oh, I don't know if you belong here, but we are very happy to have you. But, you know, you're in a super busy stage of life. You've got a little one, you know, who's just passed the toddler stage,

so that sucks up a lot of time, but you're also spending a lot of time being, you know, a frontline clinician, and I mentioned it earlier, you've got that great website First10EM. What else do you do besides just throwing up blog posts? Yeah. I mean, we're trying to be a little bit better, me and Casey Parker, who's been doing this a lot longer, on the Broome Docs podcast. We're trying to do a monthly journal

club. So if people like the SGEM and just need even more nerd every month, we should have some of that, and I'm trying to get it up on YouTube as well. So lots more of this critical appraisal content, if people can't get quite enough with just one episode a week; that may not be enough critical appraisal. Yeah. If you need even more nerd. But I read your blog and I listen to your show with Casey. Hey, Casey, always great to hear your voice. What a great guy down there in Australia.

But when this particular publication on clinical decision rules for sepsis came across my feed, I automatically thought of Justin, and how much you love clinical decision rules. Yeah. I might have become a little bit infamous. There might be some long Twitter threads after I gave a fairly infamous talk lambasting decision rules. The short version is that we

tend to get a little carried away. Maybe we start using some of these rules well before we have good evidence, and I worry that could be hurting our patients. So I have to admit this paper caught my eye as well, and I was really glad that you reached out and prompted me to actually get down and read it. Alright. Well, why don't you start the show with a case? So your hospital is running morbidity and mortality rounds, always a fun time of the month.

After a few cases in which the diagnosis of sepsis might have been identified late and therefore antibiotics were delayed, they were planning on instituting a mandatory screening tool at triage, and one of the main purposes of the meeting is to compare the available tools, things like qSOFA and MEWS. And as the local evidence based medicine nerd, they ask for your opinion on the evidence. You must be so popular at

committee meetings. I know that my popularity just shot up as soon as I was recognized as the local EBM expert. You can tell it's a fictitious case, because actually they never invite the EBM nerd. Yeah. No, no. Please don't tell them Ken's here. Okay. Well, sepsis is a life-threatening organ dysfunction caused by a dysregulated host response to infection. And it's a medical emergency that requires prompt recognition and treatment to improve patient outcomes.

And we've covered this a number of times on the SGEM, talking about, yeah, early recognition, source control, early application of appropriate antibiotics. And the rest I'm a little bit more skeptical about, so I'll put links to those previous shows in the show notes. Yeah, and there's a ton of emphasis on identifying sepsis early, with the whole idea being that early intervention will save lives.

However, you don't want to downplay the importance of sepsis, but despite this really strong push for early antibiotics, the evidence of a benefit there is probably lacking. There is observational data that's widely cited to suggest that early completion of sepsis bundles improves outcomes, but considering that doctors would never purposely delay antibiotics in a patient who we know has sepsis, all of that data is going to be severely limited by multiple confounders.

Oh, indeed, my friend, and there was actually a randomized control trial done in the prehospital setting, where they enrolled over 2,500 patients, so this wasn't a small RCT. And the patients were randomized into two groups. One group got ceftriaxone, 2 grams IV, so this is not a straw man comparison. They gave 2 grams of IV ceftriaxone in the ambulance, or usual care, which meant fluids and some supplemental oxygen, until arrival to the emergency department.

Now for that RCT, the primary outcome was reported as no statistical difference in mortality at one month; it was 8% in both groups, despite antibiotics being given more than an hour and a half (96 minutes) earlier.

Now, all the secondary outcomes, like mortality at three months, misdiagnosis, hospital length of stay, intensive care unit admission rate, intensive care unit length of stay, and quality of life, also did not show a statistical difference between the intervention group, which was, you know, very early antibiotics, and the control group. And we covered that paper on SGEM#207.

Yeah. I think that data is really important because it tells us that early identification of sepsis might not be as important as is sometimes stated in these guidelines. Now, getting the diagnosis of sepsis, getting the correct diagnosis, is clearly important for our patients. And as good as we all are, no clinician is perfect.

Acknowledging our imperfections, many have suggested that screening tools or decision tools might increase our accuracy when we're screening for sepsis, and many such tools exist, such as the systemic inflammatory response syndrome or SIRS tool; the sequential organ failure assessment, which you might know as the SOFA, along with the quick SOFA, the qSOFA; and there's also the Modified Early Warning Score, or MEWS.

Unfortunately, Justin, this enthusiasm for decision instruments often outstrips the evidence. For a decision instrument to benefit patients, it needs to have more than just high sensitivity, i.e., picking up all the true positives. It needs to change physician practice, and not just change practice, but change practice for the better. It needs to be better than our current clinical practice, or just plain old gestalt.

There is an article published in AEM with an author list that includes the who's who of clinical decision rules, people like Jeff Kline, you may know him from the PERC rule; Nathan Kuppermann from the PECARN group; and, of course, my BFF Chris Carpenter. And that document tells us, quote, before widespread implementation, clinical decision rules should be compared to clinical judgment, end quote. Yeah. I think this is a big problem.

We get really focused on just things like sensitivity and specificity, but unfortunately, most of our rules haven't cleared this very basic evidentiary hurdle. There's a classic study by Dave Schriger, published in Annals of Emergency Medicine, that looked at all the studies in Annals of Emergency Medicine from 1998 to 2015, and they found that only 11% of studies compared clinical decision aids to clinical judgment.

And in those that were compared to clinical judgment, which may be biased towards the best rules, physician judgment was superior in 29%, equivalent or mixed in 46%, and the decision aid only outperformed clinical judgment in 10% of papers, or a

total of two trials. There's a very similar study by Sanders in 2015 that concludes that clinical decision rules are, quote, rarely superior to clinical judgment, end quote, and there's generally a trade-off between the proportion classified as not having disease and the proportion of missed diagnoses. Therefore, before widespread acceptance and use of sepsis tools like qSOFA or the MEWS, we really need to see comparisons to clinical judgment, to see if they outperform clinicians.

That brings us to the current study, which aims at comparing a number of these tools to clinical judgment in the emergency department setting. So, Justin, what's the clinical question we're going to try to answer on this SGEM podcast?

So they ask: what is the accuracy of standardized screening tools and a machine learning model to predict a hospital discharge diagnosis of sepsis, compared with physician gestalt in the hyperacute period immediately after patient presentation, among undifferentiated patients with critical illness in the emergency department? And what's the reference?

This is Knack et al., Early Physician Gestalt Versus Usual Screening Tools for the Prediction of Sepsis in Critically Ill Emergency Patients, Annals of Emergency Medicine, 2024. Alright. Let's run through the PICO. What was the population in this cohort? So these are critically ill adult patients, 18 years and older. They're undifferentiated medical patients presenting to a specialized four-bed resuscitation area in the emergency department.

And they excluded patients who had trauma, or had obvious causes of illness, defined as, you know, they had a cardiac arrest, a STEMI, a suspected stroke, patients in active labor, and they also excluded patients being transferred from outside facilities. What was the intervention? So faculty emergency physicians were asked: what is the likelihood that this patient has sepsis, and asked to rate the likelihood on a scale of

0 to 100. They asked at 15 and 60 minutes after the patient's presentation. Yeah. And to calculate statistics, they decided, you know what? If somebody said it was more than a 50% chance, that was consistent with the diagnosis of sepsis. But it's not clear if that's really a good assumption, and I think we'll probably talk a little bit more about that in the nerdy section. What did they compare the intervention to? Yeah.

So this physician gestalt was compared to SIRS, SOFA, qSOFA, MEWS, and a regression machine learning model using something called LASSO, least absolute shrinkage and selection operator, for the people really interested, for their variable selection. Oh, that makes me think of the show we just did with Rebecca and Dennis on life lessons from Ted Lasso. But this is a different lasso. This is least absolute shrinkage and selection operator. Yeah. Wow.
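For listeners who want a concrete picture of what a LASSO model does, here is a minimal sketch of L1-penalized logistic regression used for variable selection. This is not the authors' actual model; the feature names and simulated data are entirely hypothetical.

```python
# Minimal sketch of LASSO-style variable selection on simulated data
# (hypothetical features -- NOT the model or dataset from Knack et al.).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Hypothetical triage variables.
features = ["heart_rate", "resp_rate", "temp", "sbp", "wbc", "lactate"]
X = rng.normal(size=(500, len(features)))
# Simulated binary sepsis outcome driven by only two of the six variables.
y = (0.9 * X[:, 0] + 1.2 * X[:, 5] + rng.normal(size=500) > 1.0).astype(int)

# The L1 (LASSO) penalty shrinks uninformative coefficients to exactly zero,
# which is why it is used for variable selection.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)

for name, coef in zip(features, model.coef_[0]):
    print(f"{name:>10}: {coef:+.3f}")  # zeroed coefficients were "deselected"
```

The point of the penalty is that variables contributing little to the prediction drop out of the model entirely, rather than hanging around with small, noisy weights.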

Okay. Let's go through the outcomes. What was the primary outcome? The final diagnosis of sepsis based on ICD-10 codes at discharge. Oh, that triggers something that might not be exactly... I'm sure we'll get to it in the nerdy section. Alright. And what type of study was this? This is a single-center prospective observational study.

So the authors' conclusions were, quote, among adults presenting to the emergency department with an undifferentiated critical illness, physician gestalt in the first 15 minutes of the encounter outperformed other screening methods in identifying sepsis, end quote. Alright. Let's go through the quality checklist for observational studies. The first question: do you think the study addressed a clearly focused issue? Yes, it did. And did the authors use an appropriate method to answer their question?

Yes. Although we might have questions about the value of the question as it was constructed. Do you think the cohort was recruited in an acceptable way? Yes. With regards to the exposure, do you think it was accurately measured to minimize bias? I'm going to say no. For clinical judgment, I think the Hawthorne effect is a concern, and for the decision tools, they were only calculated retrospectively based on chart review. Was the outcome accurately measured to minimize bias?

No, as we're going to discuss a little bit more in the talk nerdy section. There really isn't a good gold standard for sepsis, so they used ICD-10 codes at the time of discharge, but that leaves us with a lot of uncertainty. Do you think the authors identified all important confounding factors? That's really hard to know, so I had to give it an unsure. Yeah. That's a difficult one in these observational studies. Was the follow-up of subjects complete enough? Yes. How precise are the results?

The confidence intervals are tight enough to believe. Do you believe the results then? I do. Now, you're practicing in the Toronto area. Do you think that these results can be applied to your local population in the GTA, that's the Greater Toronto Area, or what the rest of Canada calls the centre of the universe?

Yeah, that's what everyone refers to it as. I give it an unsure. I will talk about it below, but there are some questions about generalizability here, given this study only looked at critically ill patients, and this took place in what sounds like a really great academic emergency department that had rapid access to labs and other tools that most of us might not have. Yeah, rapid access to labs. That's quick. Alright. Do the results of this study fit with other available evidence?

Yes, I think so. The sensitivity and specificity of these various sepsis decision tools have never been very good at all, so these findings are not surprising, and furthermore, very few decision tools have ever been shown to be better than clinical judgment. So these results are exactly what we should expect. And the final question: where did the funding come from? Do the authors have any conflicts of interest, those types of things?

The authors do not report any specific funding sources and declared no conflicts of interest. Alright. So let's go through the results. They included almost 2,500 patients. The median age was 53 years, 60% were male, and 11% of the total cohort was ultimately diagnosed with sepsis. Most physician judgments, and what I mean by most is 94%, were completed by an attending physician, with 6% being completed by trainees or residents.

They were missing a lot of data for their screening tools. And then when it came to the median visual analog scale, or VAS, score in patients with sepsis, and remember that went from 0 to 100, it was 81, as compared to a score of 8 for those patients that they felt didn't have sepsis. What was the key result, Justin? So physician gestalt was better than all of the decision tools, both

at 15 and 60 minutes. It's hard to give exact numbers because they're all area under the curve values; we'll probably put a table in the show notes here, but I actually think that the differences are big enough to not just be statistically significant, they also look clinically significant to me. Yeah. So when it came to that primary outcome, it was the sepsis diagnosis estimated on that 0 to 100 visual analog scale, and looking at the area under the curve

at two different time frames. So really early on in the presentation, you know, the first 15 minutes, it's sort of like, do I know what's going on? Or did going out to 60 minutes really help? And clinical judgment was 0.90 for area under the curve at 15 minutes and 0.86 at 60 minutes. Do you want to point out any of those clinical decision tools or instruments, how did they perform? Yeah. So the most commonly suggested one now would be qSOFA, which was 0.67 and 0.71.

That's a pretty big gap. SIRS was our classic one. We knew it was never great, but SIRS was 0.67 and 0.64. And even the full SOFA, which has all sorts of extra variables, was 0.67 and 0.70. So a pretty big gap from the 0.90 or 0.86 that we saw with just the clinician. Yeah. So you're way up there with physician gestalt, at, you know, about 0.9, so fairly high, and then that really drops down into the 0.6 range somewhere in there. So there is a real gap.
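If the area under the ROC curve is unfamiliar, here is a toy illustration, on simulated numbers rather than the study's data, of how an AUROC is computed from a 0 to 100 gestalt score versus a coarse decision-tool score:

```python
# Toy AUROC comparison on simulated data -- NOT the Knack et al. dataset.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
sepsis = rng.random(n) < 0.11  # roughly the 11% prevalence seen in the cohort

# Simulated 0-100 gestalt VAS: septic patients tend to be scored much higher.
vas = np.where(sepsis, rng.normal(80, 15, n), rng.normal(10, 15, n)).clip(0, 100)

# Simulated coarse 0-3 decision-tool score that only weakly tracks sepsis.
tool = (sepsis * 1.2 + rng.normal(1.0, 1.0, n)).round().clip(0, 3)

# AUROC is threshold-free: it summarizes discrimination across every cutoff.
print("Gestalt AUROC:", round(roc_auc_score(sepsis, vas), 2))
print("Tool AUROC:   ", round(roc_auc_score(sepsis, tool), 2))
```

An AUROC of 0.5 is a coin flip and 1.0 is perfect discrimination, which is why the gap between roughly 0.9 for gestalt and the mid-0.6s for the tools is so striking.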

So let's talk a little nerdy about that. Okay, my friend. Yes. As much as I love the results of the study, there are a number of issues that we need to consider. So the first that I had listed is the lack of a gold standard. Like, really, what is the true definition of sepsis? And do we even have a good gold standard? In this study, what I might refer to as the fool's gold standard was the chart containing an ICD-10 code of sepsis at the time of discharge. But how many of these patients truly had

sepsis in the emergency department? And maybe more importantly, not all sepsis is created equal. Right? I might care a lot about identifying septic shock or severe sepsis. But if the patient falls outside of those more severe categories, do I even care about a few hours of delayed diagnosis? I'm not sure. And finally, I think discharge diagnosis specifically is a poor gold standard because it's quite possible that patients could develop

sepsis later in their hospital stay. So imagine a patient with intestinal ischemia as their initial presentation. Even if that patient later develops sepsis, we've done that patient no good by labeling them as sepsis in the first hour in the emergency department if that means that we missed their dying intestines. In fact, they provide us with a table of the 10 patients who were ultimately diagnosed with sepsis but, quote unquote, missed by the

initial clinician. The exact case I just mentioned, a patient with intestinal ischemia and zero SIRS criteria, is represented as a miss. But perhaps most importantly, antibiotics were given to every single patient who was considered a miss. So that really makes you wonder about the definition of, quote, sepsis that they used. And the second point we wanted to talk about was that visual analog scale score, that 0 to 100. They asked physicians to rate the chances of sepsis on that 0 to 100 scale.

And that's a reasonable question for research purposes, but it's entirely unclear what these numbers really mean for clinical care. I mean, if the patient has a score of 60, a 60% chance of sepsis, do you empirically treat as sepsis or do you wait for more information? So where is that tipping point? I mean, some physicians are risk averse, so they might start giving antibiotics if there's a 20% chance, a 30% chance. And other physicians are quite risk tolerant, and they may say, you know, I think it needs to be at least an 80% chance before I start antibiotics. In this study, for their statistics, they decided anything above 50% meant that the patient had sepsis, but they didn't ask the clinicians for their interpretation. Would the treating physician agree if you said, well, I gave them a score of 55? Does that mean that you really think that they have sepsis and you're going to give them antibiotics? It is possible that they were going to give antibiotics to patients even if they thought there was a 20% chance of sepsis. And that would make that 50% cutoff somewhat meaningless.
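To make the arbitrariness of that cutoff concrete, here is a small sketch on simulated data, with entirely hypothetical numbers, showing how sensitivity and specificity swing as you move the dichotomization point on a 0 to 100 scale:

```python
# How the choice of VAS cutoff changes test characteristics (simulated data,
# not the study's results).
import numpy as np

rng = np.random.default_rng(1)
n = 1000
sepsis = rng.random(n) < 0.11
vas = np.where(sepsis, rng.normal(70, 20, n), rng.normal(15, 20, n)).clip(0, 100)

for cutoff in (20, 50, 80):
    pred = vas > cutoff                       # "call it sepsis" above the cutoff
    sens = (pred & sepsis).sum() / sepsis.sum()
    spec = (~pred & ~sepsis).sum() / (~sepsis).sum()
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

A risk-averse clinician effectively operates at the low cutoff and a risk-tolerant one at the high cutoff, so a single 50% line tells you little about what anyone would actually do.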

Therefore, although this is a theoretically interesting question, a much more important question is how gestalt compared to the rules in terms of clinical action. In other words, the question we want answered is: based on your gestalt, are you going to empirically treat them as if this was sepsis, at whatever your risk threshold is? It's such an important point, not just for decision tools but for all of our tests. We get so focused on accuracy, but what we really care about is: what are the actions driven by those tests? Point three in our talk nerdy section is a couple of points about generalizability.

First of all, this study only looked at critically ill patients in a really specialized resuscitation area. As I think the study demonstrates, we are very good at identifying and treating sepsis in patients who look like they need an ICU the moment they arrive to the emergency department. But the much more difficult group of patients are those who present atypically, who maybe don't look so bad on arrival, but get sicker in the 24 hours

after admission. It's possible, but obviously completely unproven, that objective tools or artificial intelligence could identify risk factors that clinicians are currently overlooking in these harder-to-diagnose patients. It's also worth noting that this specific hospital functions unlike most of our emergency departments, anywhere I've worked at least. Labs were reported on these patients less than 15 minutes after

patient arrival to the emergency department. Whoa, whoa, whoa Justin. 15 minutes? You need to, like, give people a warning, you know, people are driving and listening to this. Okay? So I hope no accidents were created.

Yeah. Very, very special. So I think these clinicians might have had more data points going into their clinical judgment than most of us would have. That being said, the scores all need a lot of data points as well, and even in this data-rich environment, most of these patients could not have their sepsis scores calculated, at least in the first 60 minutes. Yeah. There was a lot of missing data, so they couldn't calculate that on the full 2,500 patients.

Now the fourth point, you brought it up earlier, and this is about the Hawthorne effect. These clinicians were specifically being asked about sepsis. It's almost like priming the pump. The simple act of asking might influence their estimate. For example, you might have left the room of a patient who is in shock with a working diagnosis of pulmonary embolism.

But when asked, hey, do you think this person could also have sepsis, or maybe sepsis is the cause and should be considered on the differential, that could upgrade your judgment and make you add antibiotics. If the research assistants weren't present, it is possible that the clinicians could have missed more cases, because remember, there were research assistants around. The decision tools would not have suffered from this same Hawthorne effect bias, at least not to the same magnitude.

And another point, we'll call it 4B, just because you like to fit everything into five talk nerdy points. Did you notice that? Yeah. I can always find a couple because, you know, I've got to have five. It's separate, but we'll call it 4B. So another potential bias is the straw person decision rule comparison.

So the authors talk about this a bit in their limitations section, but the comparison of SIRS or SOFA to clinician gestalt here is somewhat nonsensical, because these scores on their own don't diagnose sepsis. A diagnosis of sepsis requires a positive SIRS or SOFA plus a clinical diagnosis of infection. In other words, the definition of sepsis always requires clinical judgment, no matter what bean counters looking at data retrospectively want to say.

Therefore, these tools probably function better in clinical practice, when they're combined with clinical judgment, than they did in this study, making it a bit of a false comparison. Alright. And the fifth, yes, there were only five points. The fifth and final nerdy point we wanted to make was about clinical decision tools calculated retrospectively.

There are major questions about the accuracy of the clinical decision tool results, given that the scores were only calculated looking backwards, based on a chart review. Very few patients had enough data to completely calculate these scores. At 15 minutes, although 100% of patients had enough information for a qSOFA, only 59% could have a MEWS calculated, 7% for SIRS, and 2% for the full SOFA score. The numbers remained similarly low by one hour.

And remember, this is when they're getting some of this lab data back within 15 minutes. And then, not a talk nerdy point because this would take us over five, but a bit more of a philosophical question is, you know, why are we so obsessed with decision

tools? There's been an absolute explosion in the number of decision tools available in emergency medicine over the last couple of decades, especially since I started my career, and as is demonstrated in this study, almost none of them have outperformed basic clinical judgment. That's one of the reasons I wrote that sort of provocative blog post called Decision Rules Are Ruining Medicine. Dun dun dun. Clickbait. I'm sure that got... Well, I know it got a lot of responses, that

specific post. Alright. That's enough. That's enough nerd. Let's comment on the authors' conclusions and compare them to the SGEM conclusions. So although there are some limitations that reduce our certainty, we agree with the conclusion that physician judgment appears to outperform the available screening tools in the early identification of sepsis in critically ill patients. So what's the bottom line, Justin?

Despite the many limitations of the human mind, we should not underestimate the accuracy of physician judgment. We are highly trained and run algorithms that are likely more complex than the average decision tool. For the diagnosis of sepsis, you are going to have to rely on your training rather than any specific decision tool. And how about resolving the case that you presented?

It wasn't popular, but as the EBM nerd in the group, you talk the hospital administration off the cliff, explaining the problems with using these retrospective scores to shape patient care, and discuss the many limitations of the available screening tools.

Thanks to this new publication, you're able to emphasize the accuracy of physician judgment, and so instead of forcing already overburdened triage nurses into completing sepsis screening tools as well, you just empower them to call physicians to the bedside for any patient that they're worried about. How are you going to apply this clinically then? Yeah. So I think physician judgment is our best tool for early identification of sepsis. Our AI overlords will have to wait for another day.

Alright. And then you're going to have to translate this down to the side of the gurney, to the patient. So what are

you going to say to them? Yeah. I find, in general, these critically ill patients usually aren't all that interested in how I'm making the diagnosis, as long as I'm getting the correct diagnosis. But I'll tell them that, based on my judgment, I think an infection is the most likely cause of their illness, but that it's also important to keep one's mind open, and that I'll keep reassessing them frequently to ensure that they are improving.

Alright. Time to announce the keener contest winner. Last week's winner was Corey Haskamp. He knew the French obstetrician François Chaussier is credited with introducing the bag valve mask device in 1780, and it was for the resuscitation of neonates. I really like the question that you have for this week, though. What I really want to know is who's willing to put in the effort and go to MDCalc, because our question this week is: how many clinical decision rules are listed on

the website MDCalc? So get out your calculators, go count, and let Ken know. Yeah. Maybe they have a clinical decision rule on MDCalc for that. Maybe you can just put it into MDCalc and say, okay, MDCalc, what is the number of rules that you guys actually have? I like the website; it's very well put together. But if you know the answer to how many clinical decision rules are now listed on the website MDCalc,

then send me an email with "keener" in the subject line. My email address is thesgem@gmail.com. Well, thanks, Justin, for coming back; it's always great to get together with a fellow nerd. Yeah. It's been great to be back chatting with you; hopefully it won't take quite so long to organize our next chat. Well, if we can ever cross-pollinate, I would love the opportunity, even with a little guest spot; if I could sit in on a roundup with you and Casey, I would really enjoy that.

Oh, we can make that happen for sure. Alright. Until that happens, can you read the SGEM tagline? Remember to be skeptical of anything you learn, even if you heard it on the Skeptics' Guide to Emergency Medicine. Talk to everyone next week.
