Why I Believe in SOTA Models Over Custom Ones - podcast episode cover

Why I Believe in SOTA Models Over Custom Ones

Mar 11, 20262 min
--:--
--:--
Download Metacast podcast app
Listen to this episode in Metacast mobile app
Don't just listen to podcasts. Learn from them with transcripts, summaries, and chapters for every episode. Skim, search, and bookmark insights. Learn more

Episode description

I think the future is cheaper and Open Source SOTA models combined with context, not custom, narrow models.

Become a Member: https://danielmiessler.com/upgrade

See omnystudio.com/listener for privacy information.

Transcript

S1

I'm not completely sure I'm right about this, but I've never been a big believer in training custom models. I've also never believed in fine tuning going all the way back to 2023. My intuition has always pushed me towards the best state of the art model possible, combined with context management. I just finally crystallized my reasoning around this.

Anytime you think you're using a small model for a small task, there's usually a whole lot more going into a given decision than just that individual area of expertise. For example, labeling emails, writing reports, processing security events, searching for threats on a network. On one hand, I think these are specialized, but the fact is, the smarter and more experienced a human is who has this expertise, the

better job they're going to do. This is because most specialized tasks still benefit from the general life experience of the person doing the execution. This is why I think the future is not a whole bunch of extremely small, specialized models throughout the enterprise. I think what's far more likely is more of an opus, sonnet haiku model, where the best of the best just keeps coming down in price,

including going into open source. And those smaller models are used in conjunction with context to perform all the different tasks in an organization at much lower cost. But I think they'll still be extremely general models, not tiny and narrow custom ones. I think the Tldr here is when you think you're doing a narrow task, that narrow task is actually benefiting from a ton of general experience. And I think this applies to humans, and I think it

also applies to models. I'm not completely convinced of this. I'm about 70% sure. But yeah, I think this is the way it's going to go.

Transcript source: Provided by creator in RSS feed: download file
For the best experience, listen in Metacast app for iOS or Android